antaRNA - Ant Colony Optimized RNA Sequence Design
Digest
antaRNA applies the principle of ant colony optimization (ACO) to the problem of inverse folding an RNA structure, i.e. finding a sequence that folds into that structure.
Besides the structure constraint, antaRNA supports sequence constraints and lets the user specify a target GC value.
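To make the two non-structural constraints concrete, here is a minimal sketch of how a candidate sequence could be checked against them. It is not taken from antaRNA itself: the function names gc_fraction and matches_wildcards are hypothetical, and the wild-card check is simplified to treat only N as "any nucleotide" rather than handling a full wild-card alphabet.

```python
def gc_fraction(sequence):
    """Return the fraction of G and C nucleotides in a sequence (0.0 to 1.0)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    gc_count = sum(1 for base in seq if base in "GC")
    return gc_count / float(len(seq))


def matches_wildcards(sequence, constraint):
    """Check a sequence against a constraint string of equal length.

    Simplifying assumption: 'N' matches any nucleotide, every other
    constraint character must match the sequence character exactly.
    """
    if len(sequence) != len(constraint):
        return False
    pairs = zip(sequence.upper(), constraint.upper())
    return all(c == "N" or c == s for s, c in pairs)


if __name__ == "__main__":
    candidate = "GGCCAAUU"
    print(gc_fraction(candidate))
    print(matches_wildcards(candidate, "GNNNNNNU"))
```

A target GC value of 50% corresponds to a fraction of 0.5 in this sketch.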
Requirements & Installation
Requirements
To use antaRNA, the program RNAfold of the ViennaRNA Package version 2.1.3 is required.
It needs to be listed in the PATH variable of the system.
To use pseudoknot structure constraints, pKiss_mfe must be installed on the machine such that regular calls of it can be executed.
Python must be installed, at least version 2.7.3.
You need some non-standard Python libraries to execute antaRNA. So far it is only:
numpy
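The requirements above can be verified before the first run. The following is a small sketch, not part of antaRNA: the helper names find_on_path and check_requirements are hypothetical, and the PATH lookup assumes a Unix-like system where executables carry no file extension.

```python
import os
import sys


def find_on_path(program):
    """Return the full path of an executable found on PATH, or None."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def check_requirements(programs=("RNAfold", "pKiss_mfe")):
    """Return a dict mapping each requirement to True (met) or False."""
    report = {"python>=2.7.3": sys.version_info[:3] >= (2, 7, 3)}
    for program in programs:
        report[program] = find_on_path(program) is not None
    try:
        import numpy  # noqa: F401, only the import itself is tested
        report["numpy"] = True
    except ImportError:
        report["numpy"] = False
    return report


if __name__ == "__main__":
    for name, ok in sorted(check_requirements().items()):
        print("%-15s %s" % (name, "found" if ok else "MISSING"))
```

pKiss_mfe is only needed for pseudoknot constraints, so a missing entry for it matters only if that option is used.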
Installation
Download the provided Python file of antaRNA and save it to your favorite place.
Once it is downloaded and all dependencies are installed, you can execute antaRNA from the shell.
Optionally, you can also add the program to your PYTHONPATH, so that you can use antaRNA from within Python or call its functions from other Python scripts.
Example of generating an instance with sequence wild-card constraints and a desired target GC content of 50%:
In order to use the pseudoknot option within antaRNA, the flags -p and -pkPar have to be used.
Example of generating an instance with sequence wild-card constraints and a desired target GC content of 50%, using pKiss as folding prediction and the parametrized parameters for pseudoknots:
A regular call of antaRNA produces output in classical FASTA format: a header and the output sequence of the program.
The option -v, however, produces a three-line verbose output: the first line adds some statistics about the run and the quality of the result, the second line lists the solution structure, and the third line lists the solution sequence:
Line 1
>antaRNA_0
Identifier within the batch
Cstr:....(((((...)))))...(((((...)))))...
applied structure constraint
Cseq:NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
applied sequence constraint
Alpha:1.0
applied edge pheromone weight contribution
Beta:1.0
applied edge path length contribution
tGC:0.5
applied target GC constraint (here: 50%)
ER:0.2
applied pheromone evaporation rate
Struct_CT:0.5
applied scoring structure distance weight
GC_CT:5.0
applied scoring GC distance weight
Seq_CT:1.0
applied scoring sequence distance weight
Ants:4
used number of 'best-out-of' 10 ants
Resets:0/5
used number/allowed number of terrain resets
AntsTC:50
used ants within the termination criterion
CC:130
used ants within the convergence criterion
IP:s
solution improvement method, here: score based method
BSS:0
best solution since (x) resets
LP:0
# detected lonely base pair situations within the constraint
ds:0.0
structural distance of the designed sequence towards its target
dGC:0.0
GC distance of the designed sequence towards its target
GC:50.0
actual GC value of the sequence
dseq:0.0
sequence distance of the designed sequence towards its target
L:36
length of the constraint system
Time:0.183250188828
time spent within the ant hive; be aware that this also includes system idling time and might not be accurate
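The fields of the verbose header line can be read into a dictionary for further processing. The following sketch is an illustration only: the function name parse_verbose_header is hypothetical, and the field separator is an assumption (the list above does not state how the fields are delimited), so it is exposed as a parameter and defaults to "|". The sample header is assembled from the values shown above.

```python
def parse_verbose_header(header, separator="|"):
    """Split a verbose header line into (identifier, {key: value}).

    ASSUMPTION: fields are joined by `separator` and each field after
    the identifier has the form KEY:VALUE. Values are kept as strings.
    """
    fields = header.strip().lstrip(">").split(separator)
    identifier = fields[0]
    values = {}
    for field in fields[1:]:
        key, _, value = field.partition(":")
        values[key] = value
    return identifier, values


# Sample header built from the example values listed above
# (the "|" separator is assumed, not confirmed).
sample = (">antaRNA_0|Cstr:....(((((...)))))...(((((...)))))..."
          "|tGC:0.5|GC:50.0|Resets:0/5|L:36")

if __name__ == "__main__":
    name, stats = parse_verbose_header(sample)
    print(name)
    print(float(stats["GC"]), stats["Resets"], int(stats["L"]))
```

Values are left as strings because some fields, such as Resets or IP, are not plain numbers; convert individual fields as needed.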
Robert Kleinkauf, Torsten Houwaart, Rolf Backofen and Martin Mann
antaRNA - Multi-Objective Inverse Folding of Pseudoknot RNA using Ant-Colony Optimization
Submitted July 2015