Small open reading frames (sORFs) can be defined as open reading frames of at most 300 nucleotides (100 amino acids). Although sORFs are inherent to all genomes, they have historically been ignored in gene annotation studies on the assumption that they lack coding potential. Their exclusion emerged as a side effect of the development of gene prediction tools in bioinformatics, genomics and proteomics, which tried to reduce the noise imposed by technological limitations. However, recent scientific breakthroughs have revealed the coding potential of several sORFs with clinical significance, indicating their importance 1, 2, 4. In particular, the advent of ribosome profiling 5 (RIBO-seq), a next-generation deep sequencing technique that provides a genome-wide snapshot of the translating machinery in a cell, has provided evidence of translation in sORFs. The value and importance of sORFs are becoming widely recognized 6, 7, and ribosome profiling data are becoming more abundant. A public repository for sORFs, providing information from various tools and metrics, therefore seems a necessity to aid functional research in the micropeptide field.
With this in mind, we would like to introduce sORFs.org, a public repository for sORFs. Its main purpose is to allow researchers to examine individual sORFs or to search on several criteria for further large-scale studies. Data from different sources, both experimental and in silico (based on various bioinformatics tools), are collected. sORFs.org currently holds 3551506 sORFs across three species (human, mouse and fruit fly), derived from multiple RIBO-seq experiments, and is expanding as more data become available. Available datasets can be inspected HERE.
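The length-based definition of an sORF given above can be illustrated with a minimal sketch. This is not the pipeline used by sORFs.org: it assumes a canonical ATG start codon, scans only the three forward-strand frames, and counts the 300-nucleotide limit over the coding part (start codon included, stop codon excluded). The function name `find_sorfs` and the constants are hypothetical.

```python
START = "ATG"                    # assumption: canonical start codon only
STOPS = {"TAA", "TAG", "TGA"}    # standard stop codons
MAX_SORF_NT = 300                # 300 nucleotides = 100 amino acids


def find_sorfs(seq, max_nt=MAX_SORF_NT):
    """Return (start, end, orf) tuples for forward-strand ORFs whose coding
    length (stop codon excluded) is at most max_nt nucleotides.

    start is 0-based; end is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        start = None
        # Walk the frame codon by codon.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i                      # first in-frame start codon
            elif start is not None and codon in STOPS:
                if i - start <= max_nt:        # coding length check
                    hits.append((start, i + 3, seq[start:i + 3]))
                start = None                   # look for the next ORF
    return sorted(hits)


if __name__ == "__main__":
    for start, end, orf in find_sorfs("CCATGAAATGA"):
        print(start, end, orf)
```

Real sORF detection from RIBO-seq data relies on evidence of translation rather than sequence alone, so a scan like this only enumerates candidates.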
Two query interfaces were developed for sORFs.org. The default query interface excels at the quick lookup of sORFs but offers limited query possibilities. For example, it is well suited to looking up sORFs that contain a specific sequence pattern. A tutorial regarding the default query interface can be found HERE.
For advanced querying and export options, a BioMart query interface is implemented. BioMart allows users to filter, view and export data according to their needs. A tutorial regarding the BioMart query interface can be found HERE.
Relevant data and/or papers can be submitted by completing the form on the submit page, found HERE. Submitted data will be manually curated and included if relevant. All contributions are highly appreciated and will be credited accordingly.
Suggestions, questions or remarks are gratefully received through the contact form located HERE.
An update article on sORFs.org has been published, available here: An update on sORFs.org: a repository of small ORFs identified by ribosome profiling
If you wish to acknowledge sORFs.org in your publication, please cite:
Over the course of the last two years, our lab has optimized this technique and the corresponding data analysis. A special peak-calling algorithm was created that makes indirect use of the underlying CpG density and therefore yields far more relevant information than applying, e.g., ChIP-seq peak-calling algorithms.
On this website, you’ll find current and former releases. In addition, a genome browser is available to study our human methylome by browsing to genes and genomic regions.