RNAscClust is a pipeline for clustering a set of structured RNAs while taking their structural conservation into account. The aim of RNAscClust is to aid the discovery of families and classes of ncRNAs.
The input to RNAscClust is a set of multiple structural alignments of RNA sequences. Each alignment contains an RNA sequence from a species of interest structurally aligned to homologous sequences. RNAscClust computes minimum free-energy structures for each sequence from the species of interest using conserved base pairs as prior information for the folding. The sequences originating from the organism of interest are then clustered using a graph kernel-based strategy, which identifies common structural features.
The source code is available as a tarball.
Latest release: RNAscClust 1.1.1
Previous releases can be found under: Releases/
Instructions on installation and usage of the source package can be found in the file README.md included in the downloaded tarball.
RNAscClust is available as a Docker container on Docker Hub. Using the Docker container, one can set up the pipeline in a few minutes without having to install any of the dependencies by hand. The Docker container enables the user to easily reproduce all Figures and Tables shown in the Results section of the RNAscClust paper (see reference below under Publication) by executing a short sequence of command line instructions. The RNAscClust Docker container supports multi-core execution.
Alternatively, the pipeline can be installed on your machine by following the instructions in the Installation section, preferably on a computing platform that supports the SGE computer cluster system.
A Docker client with the necessary user permissions is required for running the Docker image. Then execute:
docker pull mmiladi/rnascclust:latest
docker run -it -h dockersgeserver mmiladi/rnascclust:latest
On startup, the Docker container prints examples of how to run the pipeline and the evaluations.
Executing the following series of commands reproduces all Figures and Tables shown in the Results section of the RNAscClust paper (see reference below under Publication):
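If the pull or run step fails, a common cause is a missing Docker client or insufficient permission to reach the Docker daemon. The following is a minimal sketch of a pre-flight check. It is not part of RNAscClust, and the function name check_docker is hypothetical. It relies only on the standard `docker info` command.

```shell
# Hypothetical helper (not shipped with RNAscClust): verifies that a Docker
# client is installed and that the current user can reach the Docker daemon.
check_docker() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker client not found in PATH" >&2
        return 1
    fi
    # 'docker info' contacts the daemon; it fails if the daemon is not
    # running or the user lacks permission to access it.
    if ! docker info >/dev/null 2>&1; then
        echo "docker client found, but the daemon is not reachable" >&2
        return 2
    fi
    echo "docker is ready"
}
```

One possible use is to run check_docker first and to proceed with the docker pull command only if it succeeds.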
# Inside a terminal of the host system:
docker pull mmiladi/rnascclust:latest
docker run -it -v "$(pwd)/cluster_evaluation:/cluster_evaluation" -h dockersgeserver mmiladi/rnascclust:latest
# Inside the docker image:
cd /; bash /rnascclust/bin/clustering/run_clustering_docker.sh >cluster_evaluation/run_clustering_docker.log 2>&1
After execution of the clustering, which takes ~2 hours, the directory cluster_evaluation contains .pdf Figures and .txt Tables following the naming in the manuscript.
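To check that the run produced its outputs, the mounted directory can be inspected from the host system. A minimal sketch follows. The helper name list_results is hypothetical, and it assumes only that the Figures are written as .pdf files and the Tables as .txt files somewhere below cluster_evaluation. The exact file names and directory layout are not specified here, so the search is recursive.

```shell
# Hypothetical helper: prints the .pdf and .txt files found below the given
# results directory (default: ./cluster_evaluation), sorted by path.
list_results() {
    results_dir="${1:-cluster_evaluation}"
    if [ ! -d "$results_dir" ]; then
        echo "results directory not found: $results_dir" >&2
        return 1
    fi
    # Search recursively, because the layout inside the directory is not
    # specified here; the .log file written by the run is not matched.
    find "$results_dir" -type f \( -name '*.pdf' -o -name '*.txt' \) | sort
}
```

While the clustering is still running, progress can be followed from the host with tail -f cluster_evaluation/run_clustering_docker.log, since the log is written into the mounted directory.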
Datasets used in the paper for benchmarking can be downloaded from here.
The software is available under the GNU General Public License, version 3 (GPLv3).
Milad Miladi*, Alexander Junge*, Fabrizio Costa, Stefan E. Seemann, Jakob Hull Havgaard, Jan Gorodkin, and Rolf Backofen. RNAscClust: clustering RNA sequences using structure conservation and graph based motifs. Bioinformatics 33, no. 14 (2017): 2089-2096.
*these authors contributed equally to this work
RNAscClust is developed by the Chair for Bioinformatics, University of Freiburg and the Center for non-coding RNA in Technology and Health (RTH), University of Copenhagen.
For scientific questions, please contact:
For technical questions, please contact: