EEEB GU4055
1. Review notebook assignments: NumPy, pandas, BLAST and Homology.
2. Discuss the assigned reading: OrthoDB and databases.
What do we mean by orthology and paralogy?
What do we mean by orthology and paralogy? Hemoglobin example.
Question [1]: Using orthology to transfer a gene's functional annotation from one organism to another can be inaccurate in some instances. What kinds of things do you think could lead to these inaccuracies? In other words, why is the function of a gene in a model organism not always the same as the function of its ortholog in the organism being compared to it?
BLAST is an algorithm for comparing sequences; the NCBI BLAST service uses it to search the NCBI databases. The NCBI Entrez tools provide an API to these databases. These tools have facilitated the development of more complex resources such as OrthoDB.
import requests

# search term
term = "FOXP2[GENE] AND Mammalia[ORGN] AND phylogenetic study[PROP]"

# make a request to esearch
res = requests.get(
    url="https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
    params={
        "db": "nucleotide",
        "term": term,
        "sort": "Organism Name",
        "retmode": "text",
        "retmax": "20",
        "tool": "genomics-course",
        "email": "de2356@columbia.edu",
    },
)
# returns a response object
res = requests.get(...)
# the URL is built from the params arguments to requests.get()
res.url
'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=FOXP2%5BGENE%5D+AND+Mammalia%5BORGN%5D+AND+phylogenetic+study%5BPROP%5D&sort=Organism+Name&retmode=text&retmax=20&tool=genomics-course&email=de2356%40columbia.edu'
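The percent-encoding in that URL (for example `[` becoming `%5B` and spaces becoming `+`) can be reproduced without making a network request. The sketch below uses `urllib.parse.urlencode` from the standard library as a stand-in for what `requests` does with `params`; the variable names `base`, `params`, and `query` are illustrative only.

```python
from urllib.parse import urlencode

base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# same key/value pairs as the params dict passed to requests.get()
params = {
    "db": "nucleotide",
    "term": "FOXP2[GENE] AND Mammalia[ORGN] AND phylogenetic study[PROP]",
    "sort": "Organism Name",
    "retmode": "text",
    "retmax": "20",
    "tool": "genomics-course",
    "email": "de2356@columbia.edu",
}

# urlencode joins the pairs with '&' and escapes reserved characters
query = urlencode(params)
print(base + "?" + query)
```

Reading the printed URL next to the `params` dict is a quick way to check that a search term was encoded the way you intended.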
# returns a response object
res = requests.get(...)
# the text attribute stores the returned response body as a string (XML by default for esearch)
res.text
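The string returned by esearch still has to be parsed to get at the record IDs. Below is a minimal sketch, assuming the response is esearch XML with the matching IDs listed under an `IdList` element. The `sample_xml` string is a hypothetical stand-in for `res.text` (its IDs are placeholders, not real records), and `parse_esearch_ids` is a helper name invented for this example.

```python
import xml.etree.ElementTree as ET

# hypothetical stand-in for res.text; the IDs are placeholders
sample_xml = """<eSearchResult>
  <Count>3</Count>
  <RetMax>3</RetMax>
  <RetStart>0</RetStart>
  <IdList>
    <Id>111</Id>
    <Id>222</Id>
    <Id>333</Id>
  </IdList>
</eSearchResult>"""


def parse_esearch_ids(xml_text):
    """Return the record IDs found in an esearch XML response as strings."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall("./IdList/Id")]


ids = parse_esearch_ids(sample_xml)
print(ids)
```

With a live response you would call `parse_esearch_ids(res.text)` instead; the resulting IDs are what a follow-up Entrez request would take as input.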
A practical guide to inferring phylogenetic trees.