KB={'P31994': {
    'ID': {
        'ProtID':'FCG2B',
        'SpecID':'HUMAN',
        'Status':'Reviewed',
        'Length':310},
    'AC': ['P31994', 'A6H8N3', 'O95649', 'Q53X85', 'Q5VXA9', 'Q8NIA1'],
    'DT': {
        'Integrated': {
            'Date':'01-JUL-1993',
            'KB':'UniProtKB/Swiss-Prot'},
        'Sequenced': {
            'Date':'30-MAY-2000',
            'Version':2},
        'Entry': {
            'Date':'11-DEC-2013',
            'Version':158}},
    ...}
When a section can have multiple entries, it is stored in a list, like AC. One would access a particular item, or an item nested inside a list entry, like this:
KB['P31994']['AC'][0]
KB['P31994']['DE']['AltName'][1]['Short']
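Since not every entry carries every section, direct indexing like the above can raise a KeyError or IndexError. Here is a minimal sketch of a tolerant lookup. The helper name dig is my own invention for illustration, and the KB below is a trimmed copy that holds only fields shown above.

```python
def dig(record, *path, default=None):
    """Follow keys and list indexes through nested dicts/lists.

    Returns `default` as soon as any step is missing.
    (Hypothetical helper, not part of any UniProt tooling.)
    """
    current = record
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Trimmed copy of the structure above; the other sections are left out.
KB = {'P31994': {
    'ID': {'ProtID': 'FCG2B', 'SpecID': 'HUMAN',
           'Status': 'Reviewed', 'Length': 310},
    'AC': ['P31994', 'A6H8N3', 'O95649', 'Q53X85', 'Q5VXA9', 'Q8NIA1'],
}}

primary_ac = dig(KB, 'P31994', 'AC', 0)                  # first accession
secondary_acs = dig(KB, 'P31994', 'AC', default=[])[1:]  # the remaining ones
# 'DE' is absent from this trimmed copy, so the lookup falls back to None.
missing = dig(KB, 'P31994', 'DE', 'AltName', 1, 'Short')
```

The trade-off is that a typo in a key silently yields the default instead of an error, so plain indexing is still the better choice when a field is known to exist.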
I can provide the structure of the document and a description of the various fields in a similar way. Much of this is documentation with IDs into other systems. Things change over time, so some of the documentation is out of date or otherwise unavailable. A relatively new line type, RX, has some issues. I tracked down how to use the document ID:
import webbrowser
#The UniProt referenced database names used by RX records
#append with publication ID
UniProtPubDB={'MEDLINE':'', # you need a cross reference to pubmed id.
'PubMed':'http://www.ncbi.nlm.nih.gov/pubmed/',
'DOI':'http://dx.doi.org/',
'AGRICOLA':'' # requires login
}
webbrowser.open_new(UniProtPubDB['PubMed']+'2531080')
The MEDLINE UI search has been deprecated, so one would need a cross-reference to a PubMed ID. AGRICOLA requires a login ID. I've only looked at four proteins, and none have either MEDLINE or AGRICOLA references. The sample is too small to draw conclusions, but the MEDLINE UI isn't of much use.
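To make the point concrete, here is a rough sketch that turns an RX line into URLs and skips any database without a usable base URL. It assumes the RX line has the form "RX   Name=id; Name=id;". The function names are mine, and the DOI and MEDLINE values in the usage lines are made-up placeholders, not real references.

```python
import re

# Same table as above: an empty string marks a database with no usable URL.
UNIPROT_PUB_DB = {
    'MEDLINE': '',    # needs a cross-reference to a PubMed ID
    'PubMed': 'http://www.ncbi.nlm.nih.gov/pubmed/',
    'DOI': 'http://dx.doi.org/',
    'AGRICOLA': '',   # requires login
}


def parse_rx(line):
    """Split an RX line into {database: id}.

    Assumes each reference looks like 'Name=id;' followed by
    whitespace or the end of the line.
    """
    body = line[2:] if line.startswith('RX') else line
    return dict(re.findall(r'(\w+)=(.+?);(?:\s|$)', body.strip()))


def pub_urls(line):
    """Return {database: url} for the references that can be linked."""
    return {db: UNIPROT_PUB_DB[db] + pub_id
            for db, pub_id in parse_rx(line).items()
            if UNIPROT_PUB_DB.get(db)}


# Placeholder DOI and MEDLINE IDs, for illustration only.
urls = pub_urls('RX   PubMed=2531080; DOI=10.1000/example;')
medline_only = pub_urls('RX   MEDLINE=00000000;')
```

Here urls holds a PubMed link and a DOI link, while medline_only comes back empty, which matches the observation that a MEDLINE UI on its own leads nowhere.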
So much for today.