Constraint-based Methods in Bioinformatics
Constraints are an extremely successful technology that has a vast application
area throughout informatics. Due to our expertise in both areas,
constraint technology and bioinformatics, we are especially
interested in the application of constraints to
biological problems.
In the area of Bioinformatics, constraints already were succesfully applied to
problems related to
- Recognition, analysis, and organization of DNA sequences
- Biological systems simulations (for metabolic or regulatory networks)
- Prediction of the spatial conformation of a biological polymer, given its sequence of monomers (in particular for proteins and RNA)
All these problems can be naturally formalized using constraints over finite
domains or intervals of reals.
Recently, we showed how the fundamental bioinformatics problem of
sequence alignment can be solved by modern inference based constraint methods.
The in-depth study of constrained-based protein structure prediction by our
group promoted the development of new search strategies and, in particular, general symmetry breaking.
Constrained alignment using constraint-based inference
In this area, we investigate the application of constraint techniques to the
alignment problem. Our focus is
to allow for a more declarative form in order to incorporate biological
knowledge.
Recently, we succeeded in constructing a constraint-model
for sequence alignment that can be efficiently evaluated using the
inference-based constraint solving strategy of Cluster Tree Elimination.
At the same time the model can be extended by additional constraints
(in order to incorporate biolgical prior knowledge) and is still
efficiently solved.
Current Status
Hitherto, our work results in a general framework for constrained alignment
where almost arbitrary constraints can be imposed and even combined
arbitrarily. Possible constraints range from all the constraints that were
discussed for sequence alignment before up to complex structural constraints.
A web server for constrained alignment is in preparation.
Contributing group members
Main Publications
Funding
Related Research Topics
General Symmetry Breaking
In constraint programming as well as in bioinformatics, solving search
problems that contain lots of symmetries is a frequent task. (For example, in
protein folding, symmetries are rotations, reflections and translation
of a conformation). These symmetries blow up the search tree and search
times. We first encountered the problem of breaking the symmetries in protein
structure prediction. In this project we are interested in general schemes
for symmetry breaking and identifying interesting sub-classes of symmetries.
Furthermore we are interested in applying symmetry breaking to problems
of bioinformatics.
Current Status
We have developed the first general method for breaking arbitrary
symmetries. This method is based on dynamic
insertion of symmetry breaking constraints during constraint-based search.
Breaking a 90 degree rotation in 4-queens.
Besides handling geometrical symmetries and permutation symmetries,
we identified the subclass of partial symmetries and showed how the
approach can break them too.
Contributing group members
Main Publications
Related Research Topics
Dynamic decomposition during search
The counting of solutions of a given Constraint Satisfaction Problem (CSP)
is much more complex than deciding on satisfiability and gained
much interest in the last time. The solving of a CSP is in general
NP-complete, whereby counting all solutions is even harder and in the
complexity class #P.
Standard solving methods in constraint programming like Depth-First Search (DFS)
combined with constraint propagation are well suited for determining one solution, but
leave room for saving redundant work when counting all solutions. Here, we present
a method that is especially tailored for this case.
Basically, our new method dynamically decomposes the constraint (sub-)problems
that emerge during the search into independent partial problems along connected components
of the problem’s associated constraint graph. Separate counting in the partial
problems still allows to infer the number of solutions of the complete problem.
This technique is known as AND/OR-search and was introduced by Dechter et al.
Instead of statically exploiting only properties of the initial constraint graph, dynamic
strategies analyze the emerging constraint graphs during the search and employ
their features. We believe this is a major advantage in many constraint problems. In
particular, if the initial constraint network is very dense (as in our structure prediction
problem), static methods don’t make an impact.
Current Status
We have integrated AND/OR-search into a full features constraint
programming system utilizing the benefits of global constraints and
dynamic search heuristics. It was implemented using the free constraint
programming library
Gecode
and will be part of the official release.
Contributing group members
Main Publications
Funding
- Partially supported by REWERSE.
- Partially supported by EMBIO.
Related Research Topics