#format python
"""Gene Ontology ID retriever using Entrez and a GenBank ID --[cyppi]"""
from Bio import GenBank


class getGoid:
    """Get the GO IDs (goid) for a GenBank ID (gid).

    This class fetches the GenBank document for the given gid, parses it,
    and collects the goid(s) found in the feature notes. If the document
    contains no goid, or if an error occurs, an empty list is returned.

    Usage:
        go = getGoid()
        golist = go.getgoid(gid='some gid', data='protein')

    data defaults to 'protein', so for a protein gid it can be omitted:
        golist = go.getgoid('some gid')

    For a gid that refers to DNA data, pass data explicitly:
        golist = go.getgoid(gid='some gid', data='nucleotide')
    """

    def __init__(self):
        # NOTE: GenBank.NCBIDictionary exists only in old Biopython releases;
        # later releases dropped it in favor of Bio.Entrez.
        self.fpser = GenBank.FeatureParser()
        self.ncbi = GenBank.NCBIDictionary(database='protein', format='genbank',
                                           parser=self.fpser)
        self.ncbj = GenBank.NCBIDictionary(database='nucleotide', format='genbank',
                                           parser=self.fpser)
        # Per-instance result list (a class-level list would be shared).
        self.golist = []

    def _collect(self, source, gid):
        """Fetch gid from the given NCBI dictionary and gather its goids."""
        self.golist = []  # start fresh so results from earlier calls do not leak
        try:
            data = source.get(gid)
            for feature in data.features:
                if 'note' in feature.qualifiers:
                    self.getgoidfromnote(feature.qualifiers['note'])
        except Exception:
            # Any fetch or parse failure yields an empty result, as documented.
            self.golist = []

    def getProteinGo(self, gid):
        self._collect(self.ncbi, gid)

    def getDNAGo(self, gid):
        self._collect(self.ncbj, gid)

    def getgoidfromnote(self, note):
        """Append every goid found in a list of note strings to self.golist."""
        # The list is not reset here, so goids from all features accumulate.
        for text in note:
            text = text.replace('\n', ' ')
            # Each chunk after a 'goid' marker starts with a separator
            # character followed by the 7-character GO ID.
            for chunk in text.split('goid')[1:]:
                goid = chunk[1:8]
                if goid.strip():  # skip blank IDs
                    self.golist.append(goid)

    def getgoid(self, gid, data='protein'):
        self.golist = []  # an unknown data type returns an empty list
        if data == 'protein':
            self.getProteinGo(gid)
        elif data == 'nucleotide':
            self.getDNAGo(gid)
        return self.golist


if __name__ == '__main__':
    go = getGoid()
    go.getgoid('6755536')
    for item in go.golist:
        print(item)