What do the gene set counts mean?


This is a gene set table:

Gene set table


Gene set statistics are split into two categories, Gene and Transcript, which provide quick summary counts of the content of a specific gene set. Because alternative splicing is prevalent in eukaryotic genomes, a single gene can encode multiple transcripts, so the Gene and Transcript counts are likely to differ. Each of these categories is further split into Protein_coding (those Genes/Transcripts which, via mRNAs, are translated into polypeptides) and Other, which contains non-coding RNA genes and pseudogenes.
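The relationship between the four counts can be illustrated with a minimal Python sketch. It is not the pipeline VectorBase uses to build the table. The gene and transcript IDs, the biotype labels, and the function names are hypothetical. The sketch also assumes that a gene falls into the same category as its transcripts.

```python
from collections import Counter

# Hypothetical records: (gene_id, transcript_id, biotype).
# IDs and biotype labels are invented for illustration only.
TRANSCRIPTS = [
    ("GENE0001", "GENE0001-RA", "protein_coding"),
    ("GENE0001", "GENE0001-RB", "protein_coding"),  # alternative splice form
    ("GENE0002", "GENE0002-RA", "protein_coding"),
    ("GENE0003", "GENE0003-RA", "tRNA"),
    ("GENE0004", "GENE0004-RA", "pseudogene"),
]


def category(biotype):
    """Map a biotype to one of the two table categories."""
    return "Protein_coding" if biotype == "protein_coding" else "Other"


def gene_set_counts(transcripts):
    """Return (gene_counts, transcript_counts), each keyed by category."""
    transcript_counts = Counter()
    genes_by_category = {"Protein_coding": set(), "Other": set()}
    for gene_id, _transcript_id, biotype in transcripts:
        cat = category(biotype)
        transcript_counts[cat] += 1          # every transcript is counted
        genes_by_category[cat].add(gene_id)  # each gene is counted once
    gene_counts = {cat: len(ids) for cat, ids in genes_by_category.items()}
    return gene_counts, dict(transcript_counts)


gene_counts, transcript_counts = gene_set_counts(TRANSCRIPTS)
print("Gene:", gene_counts)
print("Transcript:", transcript_counts)
```

In this toy data the Protein_coding Transcript count exceeds the Protein_coding Gene count because one gene has two splice forms, while the Other counts are equal.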

The 'Other' category is everything that is not an mRNA. These include (but are not limited to) the types listed below:

Note that there are many more Rfam families, but those classified as motifs (e.g. a SECIS element motif, RF00031) are filtered out by the VectorBase/Ensembl ncRNA gene prediction pipeline. For all non-coding RNA genes except tRNA and rRNA genes, models are predicted by aligning genomic sequence against Rfam sequences. Rfam makes its annotations available for editing in the online encyclopedia Wikipedia.

Daub, J., Gardner, P. P., Tate, J., Ramsköld, D., Manske, M., Scott, W. G., … Bateman, A. (2008). The RNA WikiProject: Community annotation of RNA families. RNA, 14(12), 2462–2464. http://doi.org/10.1261/rna.1200508

Two other FAQs have related information that may also be of interest to you: