Welcome to the OpenCB project

OpenCB provides advanced open-source software for the analysis of high-throughput genomic data. In recent years, advances in high-throughput sequencing technologies have produced data at an unprecedented scale, and bioinformatics struggles to store and analyse such vast amounts of biological data. Biology has entered the big data era and faces new challenges in the storage, analysis, search, sharing and visualization of data, all of which require new solutions.

OpenCB is organized into three subprojects according to the problem each one addresses. This organization also helps development, because each subproject uses a single programming language and set of computing technologies chosen to give the best performance for its problem. The subprojects are:

  • The High-Performance Genomic (HPG) data analysis project uses High-Performance Computing (HPC) technologies to speed up algorithms and tools for data analysis. It is being implemented in the C programming language, in incremental steps that involve analysing algorithms and parallel computing technologies to find which ones fit each problem best.
  • The cloud computing and distributed databases project takes advantage of the large storage and computing capacity of these systems to let researchers store and share data, and query it in real time via RESTful web services. The APIs of cloud environments and NoSQL distributed databases, such as Amazon AWS, are mainly written in Java, and so is this project.
  • The big data visualization project uses new web standards such as HTML5, SVG and CSS3 to meet the need to visualize big data in biology. Current experiments in biology are typically conducted with NGS technologies that generate terabytes of data, and visualization is a crucial part of the analysis. By making efficient use of web technologies and networks, data can be browsed remotely, without any plugin or data download.

OpenCB is an open initiative of the Computational Biology unit at the Institute of Computational Medicine. All projects under OpenCB are hosted on GitHub and released under the GPLv2 license. All computational scientists, biologists, biostatisticians and bioinformaticians are encouraged to contribute code, feedback or issue reports.

Some OpenCB applications...

So far OpenCB can be used for:

  • Next-Generation Sequencing (NGS) short and long read alignment: we have implemented HPG Aligner, a highly sensitive and fast NGS read aligner for both DNA and RNA-seq experiments. Any number of mismatches and INDELs is allowed. It uses HPC technologies to speed up alignments. More information at HPG Aligner.
  • Genomic variant functional annotation: genomic variants from NGS experiments are exported in VCF files, which can be annotated to filter which variants may have a functional or regulatory effect. A cloud-based annotation tool has been developed so that researchers have a very fast tool that is always up to date. More information is available on the subproject website at HPG Variant Effect.
  • GWAS for genotyping analysis: some GWAS tests, such as association, TDT and epistasis, have been implemented, and many others are in progress. These tests are implemented using HPC technologies to exploit data parallelism. This is part of HPG Variant GWAS.
  • NGS data and genome browser: a web-based genomic data browser has been implemented, so hundreds of GB of NGS data can be browsed remotely or locally. Visit Genome Maps for more information.
  • … many more coming and being documented
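
As a rough illustration of the kind of association test mentioned in the GWAS item above, the following sketch computes a basic allelic chi-square test for a single variant from case/control allele counts. It is not taken from HPG Variant GWAS: the function names and the example counts are hypothetical, and Python is used here only for brevity, whereas the HPG tools are written in C.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2
    table of allele counts: rows are cases/controls, columns are
    alternate/reference alleles. Hypothetical helper for illustration."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_sums)

    # A table with an empty row or column carries no association signal.
    if total == 0 or 0 in row_sums or 0 in col_sums:
        return 0.0

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


def chi_square_p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of
    freedom, using the identity P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # Made-up allele counts for one variant, for demonstration only.
    stat = allelic_chi_square(case_alt=30, case_ref=10, ctrl_alt=10, ctrl_ref=30)
    print(stat, chi_square_p_value_1df(stat))
```

Each variant is tested independently of the others, which is what makes this kind of analysis a natural fit for the data parallelism mentioned above.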
home.txt · Last modified: 2013/03/05 22:44 by imedina