Constraint-based Methods in Bioinformatics

Constraints are an extremely successful technology that has a vast application area throughout informatics. Due to our expertise in both areas, constraint technology and bioinformatics, we are especially interested in the application of constraints to biological problems. In the area of Bioinformatics, constraints already were succesfully applied to problems related to

Recognition, analysis, and organization of DNA sequences
Biological systems simulations (for metabolic or regulatory networks)
Prediction of the spatial conformation of a biological polymer, given its sequence of monomers (in particular for proteins and RNA)

All these problems can be naturally formalized using constraints over finite domains or intervals of reals.

Recently, we showed how the fundamental bioinformatics problem of sequence alignment can be solved by modern inference based constraint methods. The in-depth study of constrained-based protein structure prediction by our group promoted the development of new search strategies and, in particular, general symmetry breaking.

Constrained alignment using constraint-based inference

In this area, we investigate the application of constraint techniques to the alignment problem. Our focus is to allow for a more declarative form in order to incorporate biological knowledge.

Recently, we succeeded in constructing a constraint-model for sequence alignment that can be efficiently evaluated using the inference-based constraint solving strategy of Cluster Tree Elimination. At the same time the model can be extended by additional constraints (in order to incorporate biolgical prior knowledge) and is still efficiently solved.

Current Status

Hitherto, our work results in a general framework for constrained alignment where almost arbitrary constraints can be imposed and even combined arbitrarily. Possible constraints range from all the constraints that were discussed for sequence alignment before up to complex structural constraints.

A web server for constrained alignment is in preparation.

Contributing group members

Sebastian Will (contact: )
Anke Busch
Rolf Backofen

Main Publications

Sebastian Will, Anke Busch, and Rolf Backofen. Efficient constraint-based sequence alignment by cluster tree elimination. In Proceedings of the Workshop on Constraint Based Methods in Bioinformatics (WCB05), pages 66-74, 2005.

Funding

Partially supported by REWERSE.

General Symmetry Breaking

In constraint programming as well as in bioinformatics, solving search problems that contain lots of symmetries is a frequent task. (For example, in protein folding, symmetries are rotations, reflections and translation of a conformation). These symmetries blow up the search tree and search times. We first encountered the problem of breaking the symmetries in protein structure prediction. In this project we are interested in general schemes for symmetry breaking and identifying interesting sub-classes of symmetries. Furthermore we are interested in applying symmetry breaking to problems of bioinformatics.

Current Status

We have developed the first general method for breaking arbitrary symmetries. This method is based on dynamic insertion of symmetry breaking constraints during constraint-based search.

Breaking a 90 degree rotation in 4-queens.

Besides handling geometrical symmetries and permutation symmetries, we identified the subclass of partial symmetries and showed how the approach can break them too.

Contributing group members

Sebastian Will (contact: )
Rolf Backofen

Main Publications

Rolf Backofen and Sebastian Will. Excluding symmetries in constraint-based search. Constraints, 7 no. 3 pp. 333-349, 2002.
Sebastian Will and Rolf Backofen. Breaking of partial symmetries in the photo and alignment problem. In Proceedings of the Third International Workshop on Symmetry in Constraint Satisfaction Problems (SymCon 2003), pages 187-194, 2003.

Dynamic decomposition during search

The counting of solutions of a given Constraint Satisfaction Problem (CSP) is much more complex than deciding on satisfiability and gained much interest in the last time. The solving of a CSP is in general NP-complete, whereby counting all solutions is even harder and in the complexity class #P.
Standard solving methods in constraint programming like Depth-First Search (DFS) combined with constraint propagation are well suited for determining one solution, but leave room for saving redundant work when counting all solutions. Here, we present a method that is especially tailored for this case.

DDFS example

Basically, our new method dynamically decomposes the constraint (sub-)problems that emerge during the search into independent partial problems along connected components of the problem’s associated constraint graph. Separate counting in the partial problems still allows to infer the number of solutions of the complete problem. This technique is known as AND/OR-search and was introduced by Dechter et al.
Instead of statically exploiting only properties of the initial constraint graph, dynamic strategies analyze the emerging constraint graphs during the search and employ their features. We believe this is a major advantage in many constraint problems. In particular, if the initial constraint network is very dense (as in our structure prediction problem), static methods don’t make an impact.

Current Status

We have integrated AND/OR-search into a full features constraint programming system utilizing the benefits of global constraints and dynamic search heuristics. It was implemented using the free constraint programming library Gecode and will be part of the official release.

Contributing group members

Sebastian Will (contact: )
Martin Mann (contact: )

Main Publications

Martin Mann, Guido Tack and Sebastian Will. Decomposition During Search for Propagation-Based Constraint Solvers. Technical Report, University Freiburg, 2008.
Stephane Zampelli, Martin Mann, Yves Deville and Rolf Backofen. Decomposition Techniques for Subgraph Matching. Technical Report, Universite catholique de Louvain, 2008.
Sebastian Will and Martin Mann. Counting protein structures by dfs with dynamic decomposition. In Proc. of the Workshop on Constraint Based Methods for Bioinformatics, page6, 2006.

Funding

Partially supported by REWERSE.
Partially supported by EMBIO.

Constraint-based Methods in Bioinformatics

Constrained alignment using constraint-based inference

Current Status

Contributing group members

Main Publications

Funding

Related Research Topics

General Symmetry Breaking

Current Status

Contributing group members

Main Publications

Related Research Topics

Dynamic decomposition during search

Current Status

Contributing group members

Main Publications

Funding

Related Research Topics