antaRNA GC Target Value Handling
An RNA sequence S of length L is composed of nucleotides from the alphabet Σ={A,C,G,U}.
The GC content of the sequence is determined by the characters it contains.
Each G or C contributes a weight of 1/L to the GC content, so a
sequence of length L can only attain a discrete set of GC content values, namely the multiples of 1/L.
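The discrete set of GC values can be made concrete with a small sketch. The function name `achievable_gc_values` is a hypothetical helper for illustration and is not part of antaRNA; GC content is expressed in percent, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def achievable_gc_values(length):
    """Return every GC content (in percent) a sequence of this length can take.

    Hypothetical helper for illustration, not an antaRNA function.
    """
    # Contribution of a single G or C nucleotide, in percent.
    step = Fraction(100, length)
    # Zero to `length` GC nucleotides are possible.
    return [k * step for k in range(length + 1)]


print([float(v) for v in achievable_gc_values(8)])
# [0.0, 12.5, 25.0, 37.5, 50.0, 62.5, 75.0, 87.5, 100.0]
```

For L = 8, for example, a requested GC content of 40% is not in this list, while 37.5% and 50% are.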
In the design process, this discreteness can lead to situations in which the requested GC value can never be reached by
a sequence of the given length. This produces artificial GC distances, even though the underlying optimization
yields GC values that neighbor the requested GC content.
When a requested GC content cannot be achieved because of this sequence length constraint, antaRNA
accepts the achievable GC values adjacent to the requested GC value as valid target GC values.
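Whether a requested GC value is reachable for a given length can be checked with a divisibility test. The following is a minimal sketch, assuming the target GC is given in percent; `is_achievable` is a hypothetical name and does not reflect antaRNA's actual code.

```python
from fractions import Fraction


def is_achievable(tgc_percent, length):
    """Check whether a sequence of this length can hit the target GC exactly.

    Hypothetical helper for illustration; tgc_percent is the target GC in percent.
    """
    # Single-nucleotide contribution to the GC content, in percent.
    step = Fraction(100, length)
    # Go through str() so that a value such as 12.5 becomes the exact fraction 25/2.
    target = Fraction(str(tgc_percent))
    # The target is reachable only if it is an integer multiple of the step.
    return target % step == 0


print(is_achievable(50, 8))  # True: 50 = 4 * 12.5
print(is_achievable(50, 7))  # False: 50 lies between 3 * 100/7 and 4 * 100/7
```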
Case
tGC \bmod \left(\frac{1}{L}\cdot 100\right) = 0
The requested GC value is an integer multiple of the single-nucleotide contribution
to the GC content, so a sequence with length L can achieve it exactly.
Case
tGC \bmod \left(\frac{1}{L}\cdot 100\right) \neq 0
The requested GC value is not an integer multiple of the single-nucleotide contribution
to the GC content, so a sequence with length L can NOT achieve it exactly.
The adjacent achievable GC values are then treated as legitimate target
GC values, and a sequence whose GC value matches one of them meets the target.
tGC \rightarrow [tGC - \delta^{-}_{GC}, tGC + \delta^{+}_{GC}]
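Here \delta^{-}_{GC} and \delta^{+}_{GC} denote the distances from tGC to the nearest achievable GC values below and above it.
Sequences whose GC content falls within this interval are assigned a GC distance of 0.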
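The interval rule can be sketched as follows. Both function names are hypothetical and this is not antaRNA's implementation. The interval bounds are taken as the nearest achievable GC values below and above the target. How the distance is measured outside the interval is not stated here, so the sketch assumes the distance to the nearest interval bound.

```python
import math
from fractions import Fraction


def target_interval(tgc_percent, length):
    """Return (lower, upper): nearest achievable GC values around the target, in percent."""
    step = Fraction(100, length)         # single-nucleotide contribution
    tgc = Fraction(str(tgc_percent))     # exact representation of the target
    lower = math.floor(tgc / step) * step
    upper = math.ceil(tgc / step) * step
    return lower, upper


def gc_distance(sequence, tgc_percent):
    """GC distance in percent; 0 if the sequence GC lies inside the target interval.

    Assumption: outside the interval, the distance to the nearest bound is used.
    Assumes an upper-case sequence over A, C, G, U.
    """
    length = len(sequence)
    gc_count = sequence.count("G") + sequence.count("C")
    gc = Fraction(100 * gc_count, length)
    lower, upper = target_interval(tgc_percent, length)
    if lower <= gc <= upper:
        return Fraction(0)
    return min(abs(gc - lower), abs(gc - upper))


# L = 7 and tGC = 50: the interval is [300/7, 400/7], roughly [42.9, 57.1].
print(gc_distance("GCGAUAU", 50))  # 0      (3 of 7 nucleotides are G or C)
print(gc_distance("GCAUAUA", 50))  # 100/7  (2 of 7, one nucleotide below the interval)
```

When the target is exactly achievable, both bounds coincide with tGC and the interval collapses to a single value.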
Robert Kleinkauf 05/2015