This notebook corresponds to chapter 4 of the official Python tutorial: https://docs.python.org/3/tutorial/.
By the end of this exercise you should:
A function performs a task based on a particular input. Functions are the bread and butter of any programming language. We have already used many functions that are built in to the objects we have interacted with. For example, we saw that string objects have functions to capitalize letters, add spacing, or query their length. Similarly, list objects have functions to search for elements or to sort them.
The next step in our journey is to begin writing our own functions. This is only an introduction, as we will continue over time to learn many new ways to write more advanced functions.
In Python, functions are defined using the keyword def. Optionally, we can have the function return a result by ending it with a return statement. This is not required, but it is usually desirable if we want to assign the result of the function to a variable.
## a simple function to add 100 to the input object
def myfunc(x):
    return x + 100
## let's run our function on an integer
myfunc(200)
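To see what happens when the return statement is left out, here is a minimal sketch. The names with_return and no_return are hypothetical and used only for illustration. A function without a return statement still hands something back when called: the special value None.

```python
def with_return(x):
    "adds 100 to x and returns the result"
    return x + 100

def no_return(x):
    "adds 100 to x but never returns the result"
    # the sum is computed and then discarded
    x + 100

result1 = with_return(200)
result2 = no_return(200)
print(result1)  # 300
print(result2)  # None
```

This is why a return statement is usually what we want when the result will be stored in a variable.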
So the basic elements of a function include an input variable and a return value. The next important thing is to add some documentation to our function. This explains what the function is for and lets other users know how to use it. A documentation string, or docstring, should be entered as a string on the first line of the function body, directly after the def line.
def myfunc2(x):
    "This function adds 100 to an int or float and returns the result"
    return x + 100
myfunc2(300.3)
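Python stores the docstring on the function itself, so users can read it back later. The sketch below redefines myfunc2 so that the block runs on its own, then reads the text from the function's __doc__ attribute. Calling help(myfunc2) in a notebook shows the same text in a formatted view.

```python
def myfunc2(x):
    "This function adds 100 to an int or float and returns the result"
    return x + 100

# the docstring is available as an attribute of the function
doc = myfunc2.__doc__
print(doc)
```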
Of course we often want to write functions that take multiple inputs. This is easy.
def sumfunc1(arg1, arg2):
    "returns the sum of two input args"
    return arg1 + arg2
sumfunc1(10, 20)
Let's write a function that will calculate the frequency of each base in a DNA string or genome. In addition to the docstring, which is intended for the user to see, you can also add comments to the function code to remind yourself what each part of the code is doing. The function below includes many comments describing its detailed action.
def base_frequency(string):
    "returns the frequency of A, C, G, and T as a list"
    # create an empty list to store results
    freqs = []
    # get the total length of the input string
    slen = len(string)
    # iterate over each letter in A,C,G,T
    for base in "ACGT":
        # count the letter's occurrence in the input string
        # divided by the total length of the input string
        frequency = string.count(base) / slen
        # store the measured frequency in the result list
        freqs.append(frequency)
    # return the result list
    return freqs
# test the function
base_frequency("ACACTGATCGACGAGCTAGCTAGCTAGCTGAC")
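One case worth thinking about is an empty input string: its length is zero, so dividing by the length raises a ZeroDivisionError. The sketch below shows one possible way to guard against that. The name base_frequency_safe is hypothetical, and returning zeros for an empty string is an assumption made for illustration; raising an error with a clear message would be an equally reasonable choice.

```python
def base_frequency_safe(string):
    "returns the frequency of A, C, G, and T as a list (zeros if empty)"
    slen = len(string)
    # an empty string has no bases, so avoid dividing by zero
    if slen == 0:
        return [0.0, 0.0, 0.0, 0.0]
    # same calculation as before, written as a list comprehension
    return [string.count(base) / slen for base in "ACGT"]

empty_result = base_frequency_safe("")
normal_result = base_frequency_safe("AACG")
print(empty_result)
print(normal_result)
```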
The task above can be accomplished in many ways. There is no single way to count the frequency of an element in a list. Some approaches are faster than others, but a good rule of thumb is to make your code as readable and comprehensible as possible. This is the best way to avoid mistakes.
Below is an alternative implementation of our base_frequency() function, which I name base_frequency2(). It returns the same result, though the code runs in a slightly different way.
def base_frequency2(string):
    "returns the frequency of A, C, G, and T in order"
    slen = len(string)
    freqA = string.count("A") / slen
    freqC = string.count("C") / slen
    freqG = string.count("G") / slen
    freqT = string.count("T") / slen
    return [freqA, freqC, freqG, freqT]
# test the function
base_frequency2("ACACTGATCGACGAGCTAGCTAGCTAGCTGAC")
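As a further illustration that the same task can be done in several ways, here is a sketch that uses Counter from the standard library's collections module. The name base_frequency3 is hypothetical. Counter tallies every character in one pass, and looking up a base that never appears gives 0 rather than an error.

```python
from collections import Counter

def base_frequency3(string):
    "returns the frequency of A, C, G, and T as a list"
    # tally how many times each character occurs
    counts = Counter(string)
    slen = len(string)
    # look up each base in order and divide by the total length
    return [counts[base] / slen for base in "ACGT"]

freqs = base_frequency3("ACACTGATCGACGAGCTAGCTAGCTAGCTGAC")
print(freqs)
# for a string containing only A, C, G, and T the frequencies sum to 1
print(sum(freqs))
```

Checking that the frequencies sum to 1 is a quick way to catch mistakes such as counting the same base twice.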
Reading code and functions written by others is a very useful exercise: it teaches common techniques and gives you practice working out what the code is trying to accomplish and how it goes about it. As an example, try to understand the function below and answer the questions that follow the demonstrated example of the function.
def mystery_function(string):
    "no hint on this one"
    # code block 1
    ag = 0
    ct = 0
    # code block 2
    for element in string:
        if element in ["A", "G"]:
            ag += 1
        elif element in ["C", "T"]:
            ct += 1
    # code block 3
    freq_ag = ag / len(string)
    freq_ct = ct / len(string)
    return [freq_ag, freq_ct]
# test the function
mystery_function("ACACTGATCGACGAGCTAGCTAGCTAGCTGAC")
You can optionally read chapter 6.2 if you wish, but otherwise we will just discuss it here, because I think it covers too many irrelevant details. This chapter introduces the Python standard library and what it means to import a library. The take-home message is that Python includes a large library of packages that can be accessed by importing them. We will learn about several common packages in the next few weeks. Let's learn about one of these packages now by using it: the random library.
import random
# draw a random integer between 0 and 3 (both endpoints included)
random.randint(0, 3)
# draw 10 random numbers between 0 and 3
[random.randint(0, 3) for i in range(10)]
# draw a random element from an iterable
random.choice("Columbia University")
# draw 10 random elements from an iterable
[random.choice("Columbia University") for i in range(10)]
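Because these draws are random, rerunning a cell gives different results each time. When repeatable results are needed, the random library lets you set a seed first. The sketch below is a minimal illustration; the seed value 123 is arbitrary.

```python
import random

# setting the same seed before each draw reproduces the same sequence
random.seed(123)
first = [random.randint(0, 3) for i in range(10)]

random.seed(123)
second = [random.randint(0, 3) for i in range(10)]

print(first == second)  # True
```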
# one way
def random_dna1(length):
    return "".join(random.choice("ACTG") for i in range(length))
# another way
def random_dna2(length):
    random_string = ""
    for i in range(length):
        random_string += random.choice("ACTG")
    return random_string
random_dna1(20)
random_dna2(20)
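A third possibility, sketched here under the hypothetical name random_dna3, uses random.choices (available in Python 3.6 and later). It draws several elements at once, with replacement, through its k argument, so no explicit loop is needed.

```python
import random

def random_dna3(length):
    "returns a random DNA string of the given length"
    # draw `length` bases with replacement and join them into one string
    return "".join(random.choices("ACGT", k=length))

seq = random_dna3(20)
print(seq)
```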