Maintains the relevant arc matches and their scores.
#include <arc_matches.hh>
Classes | |
class | lex_greater_left_ends |
class | tuple5 |
Public Types | |
typedef std::vector< int > ::size_type | size_type |
size | |
typedef BasePairs__Arc | Arc |
arc | |
typedef ArcMatchVec::const_iterator | const_iterator |
const iterator over arc matches | |
Public Member Functions | |
ArcMatches (const Sequence &seqA_, const Sequence &seqB_, const std::string &arcmatch_scores_file, int probability_scale, size_type max_length_diff, size_type max_diff_at_am, const MatchController &trace_controller, const AnchorConstraints &constraints) | |
construct with explicit arc match score list | |
ArcMatches (const RnaData &rnadataA, const RnaData &rnadataB, double min_prob, size_type max_length_diff, size_type max_diff_at_am, const MatchController &trace_controller, const AnchorConstraints &constraints) | |
construct from single base pair probabilities. | |
~ArcMatches () | |
clean up base pair objects | |
void | read_arcmatch_scores (const std::string &arcmatch_scores_file, int probability_scale) |
Reads scores for arc matches. | |
void | write_arcmatch_scores (const std::string &arcmatch_scores_file, const Scoring &scoring) const |
const BasePairs & | get_base_pairsA () const |
returns the base pairs object for RNA A | |
const BasePairs & | get_base_pairsB () const |
returns the base pairs object for RNA B | |
bool | explicit_scores () const |
void | make_scores_explicit (const Scoring &scoring) |
Make arcmatch scores explicit. | |
score_t | get_score (const ArcMatch &am) const |
size_type | num_arc_matches () const |
total number of arc matches | |
const ArcMatch & | arcmatch (size_type idx) const |
get arc match by its index | |
const ArcMatchIdxVec & | common_right_end_list (size_type i, size_type j) const |
list of all arc matches that share the common right end (i,j) | |
const ArcMatchIdxVec & | common_left_end_list (size_type i, size_type j) const |
list of all arc matches that share the common left end (i,j) | |
void | get_max_right_ends (size_type al, size_type bl, size_type *max_ar, size_type *max_br, bool no_lonely_pairs) const |
get the maximal right ends of any arc match with left ends (al,bl). | |
void | get_min_right_ends (size_type al, size_type bl, size_type *min_ar, size_type *min_br) const |
bool | exists_inner_arc_match (const ArcMatch &am) const |
const ArcMatch & | inner_arc_match (const ArcMatch &am) const |
void | sort_right_adjacency_lists () |
const_iterator | begin () const |
begin of arc matches vector | |
const_iterator | end () const |
end of arc matches vector | |
Protected Member Functions | |
bool | is_valid_arcmatch (const Arc &arcA, const Arc &arcB) const |
void | init_inner_arc_matchs () |
initialize the vector of inner arc match indices | |
Protected Attributes | |
size_type | lenA |
length of sequence A | |
size_type | lenB |
length of sequence B | |
BasePairs * | bpsA |
base pairs of RNA A | |
BasePairs * | bpsB |
base pairs of RNA B | |
size_type | max_length_diff |
for max-diff-am heuristics | |
size_type | max_diff_at_am |
for max diff at arc matches heuristics | |
const MatchController & | match_controller |
const AnchorConstraints & | constraints |
for constraints | |
bool | maintain_explicit_scores |
ArcMatchVec | arc_matches_vec |
vector of all maintained arc matches | |
size_type | number_of_arcmatches |
std::vector< score_t > | scores |
vector of scores (of arc matches with the same index) | |
Matrix< ArcMatchIdxVec > | common_right_end_lists |
Matrix< ArcMatchIdxVec > | common_left_end_lists |
ArcMatchIdxVec | inner_arcmatch_idxs |
vector of indices of inner arc matches |
Maintains the relevant arc matches and their scores.
It works as an interface between the source of arc match scoring information, i.e. dot plots of the single sequences or an explicit listing of all arc matches, and the use of this information in the class Scoring and in the alignment algorithm. For the latter purpose, it can generate the necessary objects of class BasePairs for iterating over the base pairs in the single structures.
ArcMatches knows about all arc matches that shall be considered for the alignment. Each arc match gets an index in the range 0..(number of arc matches - 1). The index is used to access the arc match score, the single arcs of the match, the matrix D, ...
The object offers iteration over (1) all arc matches and (2) all arc matches that share left ends or right ends (i,j). During iteration the index of the current arc match is known, so one never needs to infer the arc match index from the arc ends or arc indices. (This could be done in constant time using an O(n^2) lookup table.)
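The index-based access described above can be illustrated with a small stand-alone model. This is only a sketch under assumptions: `ToyArc`, `ToyArcMatch`, `ToyArcMatches` and `sum_scores_at_right_end` are hypothetical names, the adjacency lists are kept in a `std::map` rather than a `Matrix`, and nothing here is taken from the LocARNA implementation. It only mirrors the member names listed in this reference.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the LocARNA classes.
struct ToyArc { std::size_t left, right; };
struct ToyArcMatch { ToyArc a, b; std::size_t idx; };

class ToyArcMatches {
public:
    using size_type = std::size_t;
    using IdxVec = std::vector<size_type>;

    // Register an arc match; its index is its position in the vector.
    size_type add(ToyArc a, ToyArc b, int score) {
        const size_type idx = matches_.size();
        matches_.push_back({a, b, idx});
        scores_.push_back(score);
        right_[std::make_pair(a.right, b.right)].push_back(idx);
        left_[std::make_pair(a.left, b.left)].push_back(idx);
        return idx;
    }

    size_type num_arc_matches() const { return matches_.size(); }

    const ToyArcMatch &arcmatch(size_type idx) const {
        assert(idx < matches_.size());
        return matches_[idx];
    }

    // The score lives at the same index as the arc match.
    int get_score(const ToyArcMatch &am) const { return scores_[am.idx]; }

    const IdxVec &common_right_end_list(size_type i, size_type j) const {
        return lookup(right_, i, j);
    }
    const IdxVec &common_left_end_list(size_type i, size_type j) const {
        return lookup(left_, i, j);
    }

private:
    using EndMap = std::map<std::pair<size_type, size_type>, IdxVec>;

    // Return the list stored for (i,j), or an empty list if there is none.
    static const IdxVec &lookup(const EndMap &m, size_type i, size_type j) {
        static const IdxVec empty;
        auto it = m.find(std::make_pair(i, j));
        return it == m.end() ? empty : it->second;
    }

    std::vector<ToyArcMatch> matches_;
    std::vector<int> scores_;
    EndMap right_, left_;
};

// Sum the scores of all arc matches that share the right ends (i,j).
// The index of each match is known during iteration, so no reverse lookup is needed.
int sum_scores_at_right_end(const ToyArcMatches &ams, std::size_t i, std::size_t j) {
    int total = 0;
    for (std::size_t idx : ams.common_right_end_list(i, j)) {
        total += ams.get_score(ams.arcmatch(idx));
    }
    return total;
}
```

The point of the design is that the adjacency lists store indices only, so scores and any per-arc-match tables can be addressed directly while iterating.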
LocARNA::ArcMatches::ArcMatches | ( | const Sequence & | seqA_, |
const Sequence & | seqB_, | ||
const std::string & | arcmatch_scores_file, | ||
int | probability_scale, | ||
size_type | max_length_diff, | ||
size_type | max_diff_at_am, | ||
const MatchController & | trace_controller, | ||
const AnchorConstraints & | constraints | ||
) |
construct with explicit arc match score list
construct from sequence names and an explicit list of all scored arc matches together with their scores.
seqA_ | sequence A |
seqB_ | sequence B |
arcmatch_scores_file | file containing arc match scores |
probability_scale | if >=0 read probabilities and multiply them by probability_scale |
max_length_diff | accept arc matches only up to maximal length difference |
trace_controller | accept only arc matches allowed by the trace controller |
constraints | accept only arc matches allowed by the constraints |
LocARNA::ArcMatches::ArcMatches | ( | const RnaData & | rnadataA, |
const RnaData & | rnadataB, | ||
double | min_prob, | ||
size_type | max_length_diff, | ||
size_type | max_diff_at_am, | ||
const MatchController & | trace_controller, | ||
const AnchorConstraints & | constraints | ||
) |
construct from single base pair probabilities.
In this case, the object filters for relevant base pairs/arcs by min_prob. Registers constraints and heuristics and then calls read_arcmatch_scores. Constructs BasePairs objects for each single object and registers them. Generates adjacency lists of arc matches for internal use and sorts them. Lists contain only valid arc matches according to constraints and heuristics (see is_valid_arcmatch()). The constructed object does not explicitly represent/maintain the scores of arc matches.
rnadataA | data for RNA A |
rnadataB | data for RNA B |
min_prob | consider only arcs with this minimal probability |
max_length_diff | consider arc matches only up to maximal length difference |
trace_controller | consider only arc matches allowed by the trace controller |
constraints | consider only arc matches allowed by the constraints |
bool LocARNA::ArcMatches::exists_inner_arc_match | ( | const ArcMatch & | am | ) | const [inline] |
whether there is an inner (valid) arc match for the arc with the given index
bool LocARNA::ArcMatches::explicit_scores | ( | ) | const [inline] |
true, if arc match scores are explicit (because they are read in from a list)
void LocARNA::ArcMatches::get_max_right_ends | ( | size_type | al, |
size_type | bl, | ||
size_type * | max_ar, | ||
size_type * | max_br, | ||
bool | no_lonely_pairs | ||
) | const |
get the maximal right ends of any arc match with left ends (al,bl).
al | left end in sequence A | |
bl | left end in sequence B | |
[in,out] | max_ar | maximal right end in sequence A |
[in,out] | max_br | maximal right end in sequence B |
no_lonely_pairs | whether in lonely pair mode |
Determine minimal max_ar and max_br such that there is no arc match with left ends al, bl and larger right ends ar>max_ar or br>max_br. In lonely pair mode, consider only arc matches that have an immediately enclosing arc match.
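The following sketch shows one way to read this description. It rests on several assumptions: arc matches are given as a flat list of hypothetical `ToyMatch` tuples (al, ar, bl, br), the caller initializes the in/out parameters, and "immediately enclosing" is taken to mean the arc match (al-1, ar+1) with (bl-1, br+1). None of this is confirmed by the reference, and the real member function works on the internal adjacency lists instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <tuple>
#include <vector>

using size_type = std::size_t;

// Hypothetical flat representation of an arc match: (al, ar, bl, br).
using ToyMatch = std::tuple<size_type, size_type, size_type, size_type>;

// Raise *max_ar and *max_br to the largest right ends of any arc match
// with left ends (al, bl). Both are in/out parameters set by the caller.
void toy_get_max_right_ends(const std::vector<ToyMatch> &matches,
                            size_type al, size_type bl,
                            size_type *max_ar, size_type *max_br,
                            bool no_lonely_pairs) {
    assert(max_ar != nullptr && max_br != nullptr);
    const std::set<ToyMatch> all(matches.begin(), matches.end());

    for (const ToyMatch &m : matches) {
        size_type mal, mar, mbl, mbr;
        std::tie(mal, mar, mbl, mbr) = m;
        if (mal != al || mbl != bl) continue;

        if (no_lonely_pairs) {
            // Assumption: the immediately enclosing arc match pairs
            // (al-1, ar+1) in A with (bl-1, br+1) in B.
            const bool enclosed =
                al > 0 && bl > 0 &&
                all.count(ToyMatch(al - 1, mar + 1, bl - 1, mbr + 1)) > 0;
            if (!enclosed) continue;
        }

        *max_ar = std::max(*max_ar, mar);
        *max_br = std::max(*max_br, mbr);
    }
}
```

Because the parameters are in/out, a caller would typically start from small values (for example the left ends themselves) and let the function raise them.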
void LocARNA::ArcMatches::get_min_right_ends | ( | size_type | al, |
size_type | bl, | ||
size_type * | min_ar, | ||
size_type * | min_br | ||
) | const |
get the minimal right ends of any arc match with left ends (al,bl). Precondition: min_ar and min_br are initialized with the largest possible values. Returns the result in the out parameters min_ar and min_br.
score_t LocARNA::ArcMatches::get_score | ( | const ArcMatch & | am | ) | const [inline] |
get the score of an arc match
am | arc match |
const ArcMatch& LocARNA::ArcMatches::inner_arc_match | ( | const ArcMatch & | am | ) | const [inline] |
index of inner arc match for the arc with the given index. Call only if there is such an arc match
bool LocARNA::ArcMatches::is_valid_arcmatch | ( | const Arc & | arcA, |
const Arc & | arcB | ||
) | const [protected] |
decide according to constraints and heuristics whether an arc match is valid.
arcA | arc (i,j) in first sequence |
arcB | arc (k,l) in second sequence |
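As an illustration of one such heuristic, the sketch below checks the maximal length difference between two arcs. It is an assumption that the length of an arc is measured as right - left + 1 and that the difference is compared with `<=`. `ToyArc`, `toy_arc_length` and `toy_within_length_diff` are hypothetical names, and the real is_valid_arcmatch() additionally consults the match controller, the constraints and max_diff_at_am, which are not modeled here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an arc (i,j); not the LocARNA Arc class.
struct ToyArc { std::size_t left, right; };

// Number of positions spanned by the arc, ends included (assumed measure).
std::size_t toy_arc_length(ToyArc arc) {
    assert(arc.left <= arc.right);
    return arc.right - arc.left + 1;
}

// True if the two arcs differ in length by at most max_length_diff.
bool toy_within_length_diff(ToyArc arcA, ToyArc arcB, std::size_t max_length_diff) {
    const std::size_t lenA = toy_arc_length(arcA);
    const std::size_t lenB = toy_arc_length(arcB);
    // Unsigned-safe absolute difference.
    const std::size_t diff = lenA > lenB ? lenA - lenB : lenB - lenA;
    return diff <= max_length_diff;
}
```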
void LocARNA::ArcMatches::make_scores_explicit | ( | const Scoring & | scoring | ) |
Make arcmatch scores explicit.
scoring | A scoring object |
void LocARNA::ArcMatches::read_arcmatch_scores | ( | const std::string & | arcmatch_scores_file, |
int | probability_scale | ||
) |
Reads scores for arc matches.
Reads from a file a list of entries i j k l score, where score is the score for matching arcs (i,j) and (k,l).
arcmatch_scores_file | file containing arc match scores |
probability_scale | if >=0 read probabilities and multiply them by probability_scale |
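A minimal sketch of parsing this list format follows. It makes several assumptions: entries are whitespace-separated with one per line, malformed lines are skipped, and scaled probabilities are rounded to the nearest integer. The names `ToyScoredMatch` and `toy_read_arcmatch_scores` are hypothetical, and the sketch reads from a stream rather than a file name so that it stays self-contained.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one line "i j k l score".
struct ToyScoredMatch {
    std::size_t i, j, k, l;
    long score;
};

// Read all entries from the stream. If probability_scale >= 0, the last
// column is treated as a probability and multiplied by probability_scale;
// otherwise it is taken as the score itself.
std::vector<ToyScoredMatch> toy_read_arcmatch_scores(std::istream &in,
                                                     int probability_scale) {
    std::vector<ToyScoredMatch> result;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        ToyScoredMatch m{};
        double value = 0.0;
        if (!(fields >> m.i >> m.j >> m.k >> m.l >> value)) {
            continue;  // assumption: silently skip malformed lines
        }
        // Assumption: round to the nearest integer score.
        m.score = probability_scale >= 0
                      ? std::lround(value * probability_scale)
                      : std::lround(value);
        result.push_back(m);
    }
    return result;
}
```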
void LocARNA::ArcMatches::sort_right_adjacency_lists | ( | ) |
sort the lists of arc matches with common right ends in "common_right_end_lists" by their left ends in lexicographically descending order. Additionally, generate a data structure for optimized traversal.
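The ordering of one such list can be sketched with `std::sort` and a comparator on the left ends. The comparator is written in the spirit of the nested class lex_greater_left_ends listed above, but `ToyLeftEnds`, `ToyLexGreaterLeftEnds` and `toy_sort_right_adjacency_list` are hypothetical names and the real class may differ. The additional traversal data structure is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical record holding only the left ends (al, bl) of an arc match.
struct ToyLeftEnds { std::size_t al, bl; };

// Orders arc match indices so that larger (al, bl) pairs come first.
struct ToyLexGreaterLeftEnds {
    const std::vector<ToyLeftEnds> &matches;

    bool operator()(std::size_t x, std::size_t y) const {
        assert(x < matches.size() && y < matches.size());
        // std::pair compares lexicographically: first al, then bl.
        return std::make_pair(matches[x].al, matches[x].bl) >
               std::make_pair(matches[y].al, matches[y].bl);
    }
};

// Sort one list of arc match indices by left ends, descending.
void toy_sort_right_adjacency_list(const std::vector<ToyLeftEnds> &matches,
                                   std::vector<std::size_t> &idxs) {
    std::sort(idxs.begin(), idxs.end(), ToyLexGreaterLeftEnds{matches});
}
```

Sorting the index lists rather than the arc matches themselves keeps the indices stable, so scores stored under the same index remain valid.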
void LocARNA::ArcMatches::write_arcmatch_scores | ( | const std::string & | arcmatch_scores_file, |
const Scoring & | scoring | ||
) | const |
write arc match scores to a file (this is useful after the scores are generated from base pair probabilities)
Matrix<ArcMatchIdxVec> LocARNA::ArcMatches::common_left_end_lists [protected] |
for each (i,j) maintain a vector of the indices of the arc matches that share the common left end (i,j)
Matrix<ArcMatchIdxVec> LocARNA::ArcMatches::common_right_end_lists [protected] |
for each (i,j) maintain a vector of the indices of the arc matches that share the common right end (i,j)
bool LocARNA::ArcMatches::maintain_explicit_scores [protected] |
whether scores are maintained explicitly or computed from pair probabilities
const MatchController& LocARNA::ArcMatches::match_controller [protected] |
allowed alignment traces by max-diff heuristics
size_type LocARNA::ArcMatches::number_of_arcmatches [protected] |
the number of valid arc matches