Various statistics about human genes, computed from Gene and Gene Ontology data.

Data sources

  • Gene: NCBI gene ftp - Homo_sapiens_gene_info.gz (2012-06-28)
  • GO: Gene ontology site - GO.obo (2012-06-28)
  • GOA: NCBI gene ftp - gene2go (2012-06-28)

Count

  • Gene: Gene.objects.count() -> 43,484
  • GO: GOTerm.objects.count() -> 35,847
  • GOA: GOAssociation.objects.count() -> 190,477

Gene category

  • protein-coding: 20,258
  • rRNA: 478
  • snRNA: 96
  • unknown: 2,954
  • scRNA: 5
  • pseudo: 12,671
  • other: 817
  • miscRNA: 5,219
  • tRNA: 599
  • snoRNA: 391
  • ncRNA: 1
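
A per-category tally like the one above can also be taken straight from the gene_info file, without a database. The following is a minimal sketch, assuming the tab-separated gene_info layout in which type_of_gene is the tenth column and comment lines start with "#". The function name count_gene_types and the sample rows are made up for illustration and are not part of the original analysis.

```python
from collections import Counter

# Assumed 0-based position of type_of_gene in a tab-separated gene_info row.
TYPE_OF_GENE_COL = 9


def count_gene_types(lines):
    """Count records per type_of_gene, skipping blank and comment lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        counts[fields[TYPE_OF_GENE_COL]] += 1
    return counts


# Hypothetical rows (not real records), truncated to the first ten columns:
# tax_id, GeneID, Symbol, LocusTag, Synonyms, dbXrefs, chromosome,
# map_location, description, type_of_gene
SAMPLE_LINES = [
    "#Format: tax_id GeneID Symbol ...",
    "\t".join(["9606", "1", "GENE_A", "-", "-", "-", "19", "-", "-", "protein-coding"]),
    "\t".join(["9606", "2", "GENE_B", "-", "-", "-", "12", "-", "-", "protein-coding"]),
    "\t".join(["9606", "3", "GENE_C", "-", "-", "-", "X", "-", "-", "pseudo"]),
]

sample_counts = count_gene_types(SAMPLE_LINES)
```

With the real file, the same function could be fed an open handle, for example gzip.open(path, "rt") in Python 3, since it only iterates over lines.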

Number of genes per chromosome

    >>> from collections import defaultdict
    >>> from operator import itemgetter
    >>> # Gene is the Django model used for the counts above
    >>> chrs = defaultdict(int)
    >>> for gene in Gene.objects.all():
    ...     chrs[gene.chromosome] += 1  # tally genes by chromosome label
    ...
    >>> sorted(chrs.items(), key=itemgetter(0))
    [(u'-', 458),
     (u'1', 4001),
     (u'10', 1606),
     (u'10|19|3', 1),
     (u'11', 2514),
     (u'12', 1960),
     (u'12|Un', 1),
     (u'13', 1119),
     (u'13|Un', 1),
     (u'14', 1704),
     (u'15', 1458),
     (u'16', 1546),
     (u'17', 2062),
     (u'17|Un', 1),
     (u'18', 686),
     (u'18|Un', 2),
     (u'19', 2267),
     (u'2', 2795),
     (u'20', 1024),
     (u'21', 543),
     (u'22', 1004),
     (u'2|Un', 1),
     (u'3', 2303),
     (u'3|11', 1),
     (u'3|Un', 5),
     (u'4', 1694),
     (u'5', 1900),
     (u'6', 2437),
     (u'7', 2248),
     (u'8', 1574),
     (u'9', 1756),
     (u'MT', 74),
     (u'Un', 205),
     (u'X', 2015),
     (u'X|Y', 33),
     (u'Y', 485)]
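
The output above is sorted as plain strings, so '10' comes before '2'. A minimal sketch of a karyotype-style ordering follows, using the counts copied from that output. The helper chromosome_sort_key and the choice to place X, Y, MT, Un and '-' after the numbered chromosomes are assumptions made for illustration, not part of the original session. The sketch also checks that the per-chromosome counts add up to the 43,484 genes reported by Gene.objects.count().

```python
# Per-chromosome gene counts, copied from the shell output above.
CHR_COUNTS = {
    "-": 458, "1": 4001, "10": 1606, "10|19|3": 1, "11": 2514, "12": 1960,
    "12|Un": 1, "13": 1119, "13|Un": 1, "14": 1704, "15": 1458, "16": 1546,
    "17": 2062, "17|Un": 1, "18": 686, "18|Un": 2, "19": 2267, "2": 2795,
    "20": 1024, "21": 543, "22": 1004, "2|Un": 1, "3": 2303, "3|11": 1,
    "3|Un": 5, "4": 1694, "5": 1900, "6": 2437, "7": 2248, "8": 1574,
    "9": 1756, "MT": 74, "Un": 205, "X": 2015, "X|Y": 33, "Y": 485,
}

# Assumed ordering for non-numeric labels; anything unlisted sorts last.
SPECIAL_ORDER = {"X": 1, "Y": 2, "MT": 3, "Un": 4, "-": 5}


def chromosome_sort_key(name):
    """Sort numbered chromosomes numerically, then the special labels.

    Ambiguous labels such as '3|Un' are grouped by their first component
    and placed right after the plain label ('3').
    """
    first = name.split("|")[0]
    if first.isdigit():
        return (0, int(first), name)
    return (1, SPECIAL_ORDER.get(first, 6), name)


ordered = sorted(CHR_COUNTS.items(), key=lambda item: chromosome_sort_key(item[0]))
total = sum(CHR_COUNTS.values())

for name, count in ordered:
    print("%-8s %5d" % (name, count))
print("total    %5d" % total)
```

Grouping an ambiguous label under its first component is only one possible convention; such records could equally be reported in a separate bucket.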

HumanGeneStatistics (last edited 2012-07-08 16:08:53 by 221)
