Computes exact pattern matchings (EPM) between two RNA sequences.
#include <exact_matcher.hh>
Public Member Functions
ExactMatcher (const Sequence &seqA_, const Sequence &seqB_, const RnaData &rna_dataA_, const RnaData &rna_dataB_, const ArcMatches &arc_matches_, const SparseTraceController &sparse_trace_controller_, PatternPairMap &foundEPMs_, int alpha_1_, int alpha_2_, int alpha_3_, score_t difference_to_opt_score_, score_t min_score_, long int max_number_of_EPMs_, bool inexact_struct_match_, score_t struct_mismatch_score_, bool apply_filter_, bool verbose_)
  Constructor.
void compute_arcmatch_score ()
void test_arcmatch_score ()
  For debugging.
void trace_EPMs (bool suboptimal)
  Computes the traceback and traces all EPMs.
Computes exact pattern matchings (EPM) between two RNA sequences.
There are two tracebacks: the heuristic one and the suboptimal one. The heuristic traceback computes one EPM ending at a given position in the matrix F, provided that the EPM is maximally extended there. The suboptimal traceback enumerates either all suboptimal EPMs whose score lies within a given difference of the optimal score (--diff-to-opt-score) or up to a given number of EPMs (--number-of-EPMs).
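The selection criteria of the suboptimal traceback can be illustrated with a small, self-contained sketch. It is not the LocARNA implementation: the real traceback enumerates EPMs while walking the matrices, whereas this sketch filters an already computed list. The type `EpmCandidate` and the function `select_suboptimal` are hypothetical names introduced for illustration, and applying the score difference, the minimal score and the maximal number together is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a traced EPM; only the score matters here.
struct EpmCandidate {
    int id;
    long score;
};

// Keeps the candidates that score at least min_score and lie within
// diff_to_opt of the best score, best first, truncated to max_number.
// Assumption: all three limits are applied together.
std::vector<EpmCandidate> select_suboptimal(std::vector<EpmCandidate> candidates,
                                            long diff_to_opt,
                                            long min_score,
                                            std::size_t max_number) {
    assert(diff_to_opt >= 0);
    if (candidates.empty()) return candidates;

    // Sort by descending score; stable so that ties keep their input order.
    std::stable_sort(candidates.begin(), candidates.end(),
                     [](const EpmCandidate &a, const EpmCandidate &b) {
                         return a.score > b.score;
                     });

    const long optimal = candidates.front().score;
    std::vector<EpmCandidate> kept;
    for (const EpmCandidate &c : candidates) {
        if (kept.size() >= max_number) break;        // count limit reached
        if (c.score < min_score) break;              // all later scores are lower
        if (optimal - c.score > diff_to_opt) break;  // too far from the optimum
        kept.push_back(c);
    }
    return kept;
}
```

Sorting first lets each limit act as an early exit, since every later candidate scores no better than the current one.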
The list of all traced EPMs can either be written to a file or handed over to the PatternPairMap for chaining, which finds the best set of EPMs that can simultaneously be part of an alignment.
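The idea behind chaining can be sketched as a small dynamic program. This is a simplified illustration, not the chaining algorithm used by LocARNA: it assumes each EPM is reduced to a bounding box in both sequences and that two EPMs are compatible only if one lies strictly before the other in both sequences, which ignores nested (structure-local) arrangements. The names `EpmBox` and `best_chain_score` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical simplification: an EPM as a bounding box in sequences A and B
// (inclusive positions, start <= end) together with its score.
struct EpmBox {
    int startA, endA;
    int startB, endB;
    long score;
};

// Returns the best total score of a chain of EPMs in which every EPM lies
// strictly after its predecessor in both sequences. The empty chain scores 0.
long best_chain_score(std::vector<EpmBox> epms) {
    // A predecessor always has a smaller endA, so sorting by endA guarantees
    // that all possible predecessors of epms[i] come before index i.
    std::sort(epms.begin(), epms.end(),
              [](const EpmBox &a, const EpmBox &b) { return a.endA < b.endA; });

    std::vector<long> best(epms.size(), 0);  // best[i]: best chain ending in epms[i]
    long overall = 0;
    for (std::size_t i = 0; i < epms.size(); ++i) {
        best[i] = epms[i].score;
        for (std::size_t j = 0; j < i; ++j) {
            const bool precedes = epms[j].endA < epms[i].startA &&
                                  epms[j].endB < epms[i].startB;
            if (precedes) {
                best[i] = std::max(best[i], best[j] + epms[i].score);
            }
        }
        overall = std::max(overall, best[i]);
    }
    return overall;
}
```

The quadratic loop keeps the sketch short; it only conveys why a set of EPMs must be order-consistent in both sequences to fit into one alignment.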
LocARNA::ExactMatcher::ExactMatcher(
    const Sequence &seqA_,
    const Sequence &seqB_,
    const RnaData &rna_dataA_,
    const RnaData &rna_dataB_,
    const ArcMatches &arc_matches_,
    const SparseTraceController &sparse_trace_controller_,
    PatternPairMap &foundEPMs_,
    int alpha_1_,
    int alpha_2_,
    int alpha_3_,
    score_t difference_to_opt_score_,
    score_t min_score_,
    long int max_number_of_EPMs_,
    bool inexact_struct_match_,
    score_t struct_mismatch_score_,
    bool apply_filter_,
    bool verbose_
)
Constructor.
seqA_ | sequence A |
seqB_ | sequence B |
rna_dataA_ | sparsified data of RNA ensemble for sequence A |
rna_dataB_ | sparsified data of RNA ensemble for sequence B |
arc_matches_ | arc matches |
sparse_trace_controller_ | trace controller combined with the sparsification mapper |
foundEPMs_ | pattern pair map to store the list of all EPMs |
alpha_1_ | sequential weight |
alpha_2_ | structural weight |
alpha_3_ | stacking weight |
difference_to_opt_score_ | all EPMs with a score difference not more than difference_to_opt_score_ from the optimal score are traced |
min_score_ | the minimal score of an EPM that is traced |
max_number_of_EPMs_ | maximal number of EPMs for the suboptimal traceback |
inexact_struct_match_ | whether to allow inexact structure matches |
struct_mismatch_score_ | the mismatch score for two nucleotides in an arcmatch (only used if inexact_struct_match_ is set) |
apply_filter_ | whether to apply an additional filter when allowing inexact structure matches |
verbose_ | whether to write additional information |
void LocARNA::ExactMatcher::compute_arcmatch_score()
Fills matrix D (i.e., computes all arc match scores) by filling matrices L, G_A and LR.
void LocARNA::ExactMatcher::trace_EPMs(bool suboptimal)
computes the traceback and traces all EPMs
suboptimal | whether to compute the suboptimal or heuristic traceback |