Tuesday, February 24, 2009

DNA fractals

Some days ago, I had a talk with I, who suggested that I use some mathematical approaches to compare artificial sequences with real ones (one of my projects). Besides mutual information and SVM classification systems, he told me about the fractals present in the sequences. The method is called the "Chaos game for DNA" (EMBOSS has a program for it named chaos) and the algorithm is very simple:
  1. use a square in which each corner represents a "letter" (A, G, C, T),
  2. start in the middle of the square and read the sequence one letter at a time,
  3. for each letter, add a point halfway between the previous point and the corner for that letter,
  4. continue until you reach the end of the sequence.
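
The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the code EMBOSS uses: the function name `chaos_game`, the assignment of letters to corners, and the choice to skip any character other than A, C, G and T are all my assumptions.

```python
# Corner assignment is an assumption: the steps above only say that
# each corner of the square stands for one letter.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def chaos_game(sequence, corners=CORNERS):
    """Return the list of (x, y) points visited while reading the sequence."""
    x, y = 0.5, 0.5  # start in the middle of the square
    points = []
    for base in sequence.upper():
        if base not in corners:
            continue  # assumption: skip N and other ambiguity codes
        cx, cy = corners[base]
        # move halfway from the previous point towards the letter's corner
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in chaos_game("ACGT"):
        print(point)
```

Plotting the returned points as a scatter plot (for example with matplotlib, if it is installed) gives the picture for a whole sequence.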

This method was first described in the 90's. Now that much more genome information is available, we can continue exploring this characteristic of DNA sequences. Some applications include alignment-free comparison and other iterative systems which reproduce this effect.

After I plotted some sequences, I asked me to test other species to see the differences, but as G concluded, the pattern is present because genomic sequence has low CpG content (that is, "CG" dinucleotides are poorly represented; the biological implication is DNA regulation through methylation).
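
One way to put a number on "low CpG content" is the usual observed/expected CpG ratio: the count of "CG" dinucleotides, multiplied by the sequence length and divided by the product of the C and G counts. The sketch below is my own illustration (the function name `cpg_obs_exp` is hypothetical, and returning 0.0 when there is no C or no G is an assumption to avoid dividing by zero). A value well below 1 means "CG" appears less often than the base composition alone would predict.

```python
def cpg_obs_exp(sequence):
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G)."""
    seq = sequence.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cg = seq.count("CG")  # "CG" cannot overlap itself, so count() finds them all
    if c == 0 or g == 0:
        return 0.0  # assumption: report 0.0 when the ratio is undefined
    return cg * n / (c * g)


if __name__ == "__main__":
    print(cpg_obs_exp("ACGTCGAA"))
```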

We are still discussing whether DNA can be called "fractal" and whether this characteristic can be used to scan sequences for functional/non-functional parts or to gain some other insight.

I tweeted some thoughts about this, like "fractals are beauty". Marcha replied: "fractals are evil", so "evil is beauty".
