A [[Database]] of UniGene contigs (clusters), built by applying EstClustering to all [[EST]]s that [[NCBI]] provides for each species (that is, [[dbEST]]) and removing redundant fragments.

http://www.ncbi.nlm.nih.gov/UniGene

UniGene build procedure
 1. Screen for contaminants, RepetitiveSequence, and low-complexity regions in GenBank.
 1. [[Clustering]] procedure (anchored clusters)
 1. Build clusters of [[Gene]]s and m[[RNA]]s.
 1. Add [[EST]]s to the previous clusters (MegaBlast).
 1. [[EST]]s that join two clusters of genes/mRNAs are discarded.
 1. Any resulting cluster without a polyadenylation signal or two 3' ESTs is discarded.
 1. Ensure that 5' and 3' ESTs from the same clone belong to the same cluster.
 1. ESTs that have not been clustered are reprocessed at a lower level of stringency. ESTs added during this step are called guest members.
 1. Clusters of size 1 are compared against the rest of the clusters at a lower level of stringency and merged with the cluster containing the most similar sequence.

----
Reference: PubMed:9382993

As of Fri May 3 2002, there are 104,024 UniGene entries for Human. (See also http://www.ncbi.nlm.nih.gov/UniGene/Hs.stats.shtml)

GeneCards is a database that annotates each of the known [[Gene]]s in UniGene.

SeeAlso: GeneLynx

----
CategoryDatabase
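
The anchored-clustering steps of the UniGene build procedure on this page can be illustrated with a small Python sketch. It is not NCBI's implementation: the `similarity` function is a toy k-mer overlap score standing in for MegaBlast, the literal `AATAAA` check is a simplified polyadenylation-signal test, and all names (`Est`, `Cluster`, `build_clusters`, `threshold`) are hypothetical. It covers only adding ESTs to gene/mRNA anchors, discarding ESTs that join two clusters, and dropping clusters that have neither a polyadenylation signal nor two 3' ESTs; clone pairing and the lower-stringency passes are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Est:
    name: str
    seq: str
    end: str  # "5'" or "3'"


@dataclass
class Cluster:
    anchor: str                      # gene/mRNA sequence that seeds the cluster
    has_polya_signal: bool = False
    members: list = field(default_factory=list)


def kmers(seq, k=8):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(query, target, k=8):
    """Toy stand-in for MegaBlast: fraction of query k-mers found in target."""
    q = kmers(query, k)
    return len(q & kmers(target, k)) / len(q) if q else 0.0


def build_clusters(anchors, ests, threshold=0.8):
    """Assign ESTs to anchored clusters; return (kept clusters, unclustered ESTs)."""
    # Assumption: a literal AATAAA in the anchor counts as a polyadenylation signal.
    clusters = {
        name: Cluster(anchor=seq, has_polya_signal="AATAAA" in seq)
        for name, seq in anchors.items()
    }
    unclustered = []
    for est in ests:
        hits = [n for n, c in clusters.items()
                if similarity(est.seq, c.anchor) >= threshold]
        if len(hits) == 1:
            clusters[hits[0]].members.append(est)
        elif not hits:
            unclustered.append(est)  # candidates for a lower-stringency pass
        # ESTs that hit two or more anchors would join clusters: discard them.
    kept = {
        name: c for name, c in clusters.items()
        if c.has_polya_signal or sum(e.end == "3'" for e in c.members) >= 2
    }
    return kept, unclustered


if __name__ == "__main__":
    gene_a = "ATGGCGTACGTTAGCCTAGGCTTAACGGATCCGTAATAAAGCT"
    ests = [Est("est1", gene_a[:20], "5'"), Est("est2", "C" * 20, "3'")]
    kept, leftover = build_clusters({"geneA": gene_a}, ests)
    print(sorted(kept), [e.name for e in leftover])
```

In a real pipeline the ESTs returned as unclustered would feed the lower-stringency reprocessing step, where they would be added as guest members.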