UniGene - BioHackersNet

A UniGene contig (cluster) database provided by NCBI, built by running EstClustering on all ESTs for each species (i.e., dbEST) and removing the redundant fragments.

UniGene build procedure

Screen GenBank sequences for contaminants, RepetitiveSequence, and low-complexity regions
Clustering procedure (anchored clusters)
1. build clusters of Genes and mRNAs
2. Add ESTs to previous clusters (MegaBlast)
3. ESTs that join two clusters of genes/mRNAs are discarded
4. Any resulting cluster without a polyadenylation signal or two 3' ESTs is discarded
Ensure that 5' and 3' ESTs from the same clone belong to the same cluster
ESTs that have not been clustered are reprocessed at a lower level of stringency. ESTs added during this step are called guest members.
Clusters of size 1 are compared against the rest of the clusters at a lower level of stringency and merged with the cluster containing the most similar sequence.
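
The EST assignment steps above (steps 2 and 3, plus the lower-stringency pass for guest members) can be sketched roughly as follows. This is only a toy illustration, not the actual UniGene pipeline: the longest shared substring stands in for a MegaBlast hit, the thresholds `strict=12` and `relaxed=8` are arbitrary, and the names `Cluster`, `longest_shared`, and `assign_ests` are hypothetical. The 3' anchoring filter (step 4) and the merging of size-1 clusters are left out, and guest members are assumed not to recruit further ESTs.

```python
from dataclasses import dataclass, field


def longest_shared(a, b):
    """Length of the longest substring common to a and b (toy stand-in for MegaBlast)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


@dataclass
class Cluster:
    anchors: list                                # gene/mRNA sequences seeding the cluster
    ests: list = field(default_factory=list)     # ESTs added at high stringency
    guests: list = field(default_factory=list)   # ESTs added at lower stringency

    def score(self, seq):
        # Assumption: guest members are not used to recruit further ESTs.
        return max(longest_shared(seq, m) for m in self.anchors + self.ests)


def assign_ests(clusters, ests, strict=12, relaxed=8):
    """Add ESTs to anchored clusters; return (discarded, unclustered)."""
    discarded, leftover, unclustered = [], [], []
    for est in ests:
        hits = [c for c in clusters if c.score(est) >= strict]
        if len(hits) == 1:
            hits[0].ests.append(est)
        elif hits:
            discarded.append(est)      # would join two gene/mRNA clusters
        else:
            leftover.append(est)
    for est in leftover:               # second pass at lower stringency
        best = max(clusters, key=lambda c: c.score(est))
        if best.score(est) >= relaxed:
            best.guests.append(est)    # guest member
        else:
            unclustered.append(est)
    return discarded, unclustered


# Made-up sequences, for illustration only.
gene_a = "AAAACCCCGGGGTTTTACGTACGTAGCTAGCT"
gene_b = "TTGACTGACCATGCATGGATCCGATTACAGGA"
est1 = gene_a[4:28]                    # clean EST from gene_a
chim = gene_a[:16] + gene_b[:16]       # bridges both clusters
guest = gene_b[10:20] + "NNNNNNNN"     # only a weak match to gene_b
junk = "GAGAGAGAGAGAGAGA"              # matches nothing

clusters = [Cluster([gene_a]), Cluster([gene_b])]
discarded, unclustered = assign_ests(clusters, [est1, chim, guest, junk])
```

In this toy run, `chim` is discarded because it hits both anchored clusters, `guest` joins the second cluster only in the relaxed pass, and `junk` stays unclustered.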

Reference: 9382993

As of Fri May 3 2002, there are 104,024 UniGene clusters for Human. (See also http://www.ncbi.nlm.nih.gov/UniGene/Hs.stats.shtml)

GeneCards is a database that annotates each of the known genes among the UniGene clusters.