This notebook corresponds with chapters 1 and 3 of the official Python tutorial: https://docs.python.org/3/tutorial/. You are welcome to read chapter 2 as well, but it is mostly about how to open, install, and run Python. Since we have Python running interactively in Jupyter, the details of starting a Python interpreter from chapter 2 are not so important. The challenges in this notebook are meant to reinforce the material from the readings. Feel free to also use this notebook as a scratch pad in which to write and test code from the readings.
By the end of this exercise you should:
One of the simplest and most common uses of Python involves operations on numeric values to perform mathematical operations. Addition, subtraction, division, and most other operators act as you would expect using any calculator. Parentheses can be used to group the order of operations.
This is our first execution of Python code, and so we will also spend some time learning about how the results of executed code are dealt with. In particular, we will introduce the concept of variables.
A variable is a named object that is used to store a value. In the first code block below we perform mathematical operations on a number of integer values. This code block does not include any variables. In the second code block we create a new variable called x by assigning the value 3 to it. In the third code block below we substitute the variable x into the code to perform the same operation as in the first code block. Because the value of the variable x is 3, the result of executing the third code block is 25.0, just like in the first code block (the result is a float because the / operator returns a float).
# you can perform math operations in Python
(3 / 3) + (3 * 5) + (3 ** 2)

# create a new variable named x with the integer value 3
x = 3

# substitute named variables to represent a value or object
(x / 3) + (x * 5) + (x ** 2)
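As a small sketch of how grouping and the choice of division operator affect a result, the cell below compares a few expressions. The variable names (`ungrouped`, `grouped`, `true_division`, `floor_division`) are only illustrative and are not used elsewhere in this notebook.

```python
# parentheses change the order in which operations are evaluated
ungrouped = 3 + 3 * 5      # multiplication happens first
grouped = (3 + 3) * 5      # addition happens first

# the / operator always produces a float in Python 3,
# while // performs floor division and keeps integers as integers
true_division = 3 / 3
floor_division = 3 // 3

print(ungrouped, grouped)
print(true_division, floor_division)
```

This is why the expression in the cells above evaluates to a float even though every input is an integer.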
Notice how in the cells above the first and third blocks returned a result that was printed below the cell, while the second block did not return anything. This is because in the second code block we stored the result in a variable by using the = operator. By default, a value is returned if it is not stored in a variable, meaning that it is simply shown in the output of the code cell.
To reiterate: the object on the last line of a code cell will be returned when the cell is executed if it is not stored to a variable. This behavior may be preferred in some situations, such as in the first code cell above, where we simply wanted to see the result of the mathematical operation. In many cases, though, it is useful to store values in variables so that they can be reused in other operations.
# the value of x will be returned (shown)
x

# the value in the cell will not be shown because it is stored to p
p = x - 3
Notice above that when a value is returned it is shown in the output cell next to the red signature Out[N]:. This indicates that a value was returned (not stored to a variable).
The print() function is a more explicit and broadly useful way to view the value of an object or variable. It writes the value to a special output stream called stdout (standard output), which is the standard location for printing values that you want a user to be able to see. In a Jupyter notebook, text printed to stdout shows up in the output area below an executed code cell.
A key difference between viewing a returned value and a printed value is that a code cell will only return a value that is on the final executed line in a code block. By contrast, the
print() function can be called on any line within a block of code to print the value of an object at any time. This makes it very useful as a tool for checking the value of variables during your code execution (a process called debugging).
# the print() function sends the value to the stdout of a cell
print(x)
This example shows how returned values only include the final line of a code block. It does not show a returned value for the two lines before the final line. In the next code cell the
print() function shows the value on all three lines.
# a returned value is only the final line
x
x + 3
x + 6

print(x)
print(x + 3)
y = x + 10
print(y)

y = 30
z = 5.5
print(y / z)
In the example above you have already used two different types of objects in Python, an Integer and a Float (decimal value). We'll learn about object Types in more detail now.
For most everyday calculations there is little practical difference between integers and floats (particularly in Python 3 as opposed to Python 2), except when you get down to the details of how they are stored in memory. Integers are whole numbers and floats store floating-point decimal values. The two types can be compared or combined in mathematical operations.
# assigns an integer value
y0 = 3

# assigns a float value
y1 = float(5.0)

# assigns an integer value
y2 = int(5)

# return whether the two variables are equal
y1 == y2
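To make the distinction between the two numeric types concrete, here is a minimal sketch that inspects them with the built-in `type()` function. The variable names (`whole`, `decimal`, `total`, `truncated`) are illustrative assumptions, not names used elsewhere in this notebook.

```python
whole = 5
decimal = 5.0

# type() reports the type of an object
print(type(whole))
print(type(decimal))

# the values compare as equal even though the types differ
print(whole == decimal)

# mixing the two types in an operation produces a float
total = whole + decimal
print(total, type(total))

# int() drops the decimal part of a float (it does not round)
truncated = int(5.9)
print(truncated)
```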
A boolean type is a simple True or False statement. For example, you just saw above that the returned value of the comparison we performed was a value of
True. That's a boolean. This type is used when comparing objects or values. Binary statements of this type are very common in programming so expect that you will see boolean types very often.
## True can be stored as True or as 1
x = True
y = 1
x == y

## False can be stored as False or as 0
x = False
y = 0
x == y
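The cells above work because booleans in Python behave like the integers 1 and 0. The short sketch below illustrates this, along with the built-in `bool()` function for converting other values; the name `is_larger` is only illustrative.

```python
# comparisons produce boolean values
is_larger = 10 > 3
print(is_larger, type(is_larger))

# booleans behave like the integers 1 and 0 in arithmetic
print(True + True)
print(True == 1, False == 0)

# bool() converts other values: zero and empty objects are False
print(bool(0), bool(7), bool(""), bool("orange"))
```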
As you can see above, we used the = operator to assign values to a variable and the == operator to ask whether two variables were equal. There are several other comparison operators available in addition to ==.
x = 10
y = 3
z = "orange"

print(x > y)
print(x >= y)
print(y < x)
print(x == z)
print(z != y)
Not everything can be compared, though. For example, asking whether "orange" is greater than 3 does not make any sense. When you do this Python will raise an error. It is important to be aware of the type of each of your variables. We expect the code below to raise an error, so just go ahead and run it. We will describe how to interpret and deal with error messages in more detail later.
print(z > y)
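The error raised by the comparison above is a TypeError. As a hedged preview of error handling, which is described in more detail later, the sketch below catches that error with a try/except statement so the rest of the cell can keep running. The `result` variable is an illustrative name.

```python
y = 3
z = "orange"

try:
    result = z > y
except TypeError as err:
    # Python refuses to order a str against an int
    result = None
    print("comparison failed:", err)

# equality tests between different types are allowed and simply return False
print(z == y)
```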
A "string" is the name used in Python for words, sentences, or paragraphs of text that are joined together. It is one of the most basic data types and one that Python is very good at dealing with. In fact, the ease with which Python can be used to manipulate text is one of the primary reasons it has become such a popular language for both scientific programming and web development.
Let's work with a string representation of a sequence of DNA. A string is created by wrapping any text in single or double quotes.
# note that here we use double quotes
dna = "ACGCAGACGATTTGATGATGAGCATCGACTAGCTACACAAAGACTCAGGGCATATA"

# note that here we use single quotes. You can use either one.
dna = 'ACGCAGACGATTTGATGATGAGCATCGACTAGCTACACAAAGACTCAGGGCATATA'
Another difference between using the print() function and the return value of a string is that when you use print, special characters in the text will be rendered. This is particularly apparent for newline characters, which are used to represent line breaks, as well as many other types of characters like tabs. In the example below the string includes the special characters "\t", which is used to represent a tab, and "\n", which represents a line break.
# return the string
mystring = "hello\tworld\nhello world"
mystring

# print the string
print(mystring)
hello   world
hello world
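To see the difference between the stored characters and their rendered form side by side, the sketch below uses the built-in `repr()` function, which produces the same escaped form that a notebook shows for a returned string. The `lines` variable is an illustrative name.

```python
mystring = "hello\tworld\nhello world"

# repr() shows the escape sequences instead of rendering them
print(repr(mystring))

# the tab and the newline each count as a single character
print(len(mystring))

# splitlines() breaks the text apart at the newline
lines = mystring.splitlines()
print(lines)
```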
A string is an indexed datatype that is immutable. This means that we can select portions of the text using indexed numbering, but we cannot change/mutate individual elements of it.
The example below selects the elements of the string from index 5 up to, but not including, index 15. Because indexing starts at 0, this corresponds to the 6th through the 15th character.
# return an indexed portion of the dna string
dna[5:15]
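The following minimal sketch spells out how the slice boundaries work, using the same DNA string as above. The names `first` and `segment` are illustrative.

```python
dna = "ACGCAGACGATTTGATGATGAGCATCGACTAGCTACACAAAGACTCAGGGCATATA"

# indexing starts at 0, so dna[0] is the first character
first = dna[0]

# a slice includes the start index but stops before the end index
segment = dna[5:15]

print(first)
print(segment, len(segment))
```

A slice from 5 to 15 therefore always contains 15 - 5 = 10 characters.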
Python is called an object-oriented programming language, which refers to the fact that everything in Python is an object. What does that mean? Well, it means that everything you interact with has a hidden structure within it that it uses to store its values. In addition, objects typically have built-in functions associated with them that are designed to interact with their data. We'll learn more about functions soon.
This is one of the most exciting things about using Python in jupyter, which uses an interactive version of Python, called IPython. In this framework it is really easy to access and see all of the attributes and functions associated with an object.
This can be done by typing a variable name followed by a dot and then, while your cursor is still sitting after the dot, pressing the <tab> key on your keyboard. The animated GIF below shows an example of this. A pop-up shows a list of functions associated with the string object.
Functions always end with a set of parentheses. By placing your cursor inside of the parentheses at the end of a function and holding shift and then pressing tab you can pull up a help menu with instructions on how to use the function. This is also shown in the GIF below. This is really useful. We will revisit it again later.
It is interesting that the string variable has a function called .lower() that returns a copy of its value in lower case. But doesn't that seem a little idiosyncratic? How could you have known that this function exists out of the thousands of functions in Python?
The answer is in two parts. First, it is something that you learn over time. As you read more Python code and see this function used repeatedly you will eventually memorize that this and other functions exist. But second, Python actually helps you to learn about these functions by way of its object-oriented design. The
.lower() function is attached to a string object because it is meant to be used on a string object. The association of functions with the object types that they are meant to be used on is a way in which the language itself provides tips to you while you use it.
If you look back at one of our integer variables that we created earlier, like x, you will notice that it has a different set of functions associated with it (which you can see by putting a dot after the variable and using tab-completion). The
.lower() function is not among the functions associated with x because it doesn't make sense to convert the value of an Integer object to lower case. It is only something that can be done to Strings.
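Outside of tab-completion, the built-in `dir()` function gives a similar listing of the names attached to an object. The sketch below uses it to confirm that `.lower()` belongs to strings but not to integers; the variable names (`string_methods`, `int_methods`, `seq`, `lowered`) are illustrative assumptions.

```python
# dir() lists the names attached to an object, much like tab-completion;
# names starting with an underscore are filtered out here for readability
string_methods = [name for name in dir("ACGT") if not name.startswith("_")]
int_methods = [name for name in dir(10) if not name.startswith("_")]

print("lower" in string_methods)
print("lower" in int_methods)

# .lower() returns a new string; the original is left unchanged
seq = "ACGT"
lowered = seq.lower()
print(seq, lowered)
```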
Again, using the interactivity of Python is useful here. As mentioned earlier, functions always have a set of parentheses at the end. This is because some functions require additional arguments about how they should be executed, and these arguments are passed to the function by entering them in the parentheses.
As mentioned earlier, you can find more information about a function by placing your cursor inside the parentheses, holding the shift key down, and then pressing tab.
In the example below we use a function that takes an argument. When we provide a string as an argument to the .split() function it separates the object's string value into multiple smaller strings that are delimited by the input string argument. For example, the string value stored in dna can be split on the substring "TTT" to yield two strings: the text before TTT and the text after TTT. The cell below splits on "CG" instead, which yields four strings. Try it below.
dnalist = dna.split("CG")
print(dnalist)
['A', 'CAGA', 'ATTTGATGATGAGCAT', 'ACTAGCTACACAAAGACTCAGGGCATATA']
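As a complementary sketch, the cell below performs the split on "TTT" described in the text and then reverses it with the string `.join()` function. The names `pieces` and `rejoined` are illustrative.

```python
dna = "ACGCAGACGATTTGATGATGAGCATCGACTAGCTACACAAAGACTCAGGGCATATA"

# "TTT" occurs once in this sequence, so the result has two pieces
pieces = dna.split("TTT")
print(pieces)

# join() reverses the split by placing the delimiter between the pieces
rejoined = "TTT".join(pieces)
print(rejoined == dna)
```

Note that the delimiter itself is removed from the pieces, which is why it has to be supplied again when joining.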
One of the most flexible and useful data objects in Python is the list. Lists are containers that can store any other type of data object; they can even store other lists. Lists are represented by values inside of square brackets. Seem familiar? That's right, we created a list above when we split the string into multiple objects. The returned value was a list containing multiple strings.
# create a list
letters1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# another way to create a list
letters2 = list("abcdefg")
['a', 'b', 'c', 'd', 'e', 'f', 'g']
['a', 'b', 'c', 'd', 'e', 'f', 'g']
# test that the two lists are identical
letters2 == letters1
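To illustrate that a list can hold any type of object, including other lists, here is a minimal sketch. The list `mixed` and its contents are made up for illustration.

```python
# a list can mix types and can contain other lists
mixed = [3, 5.5, "orange", True, ["a", "b"]]

print(len(mixed))

# index into the nested list with a second pair of brackets
print(mixed[4][1])

# lists are compared element by element
print(list("abc") == ["a", "b", "c"])
```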
A list can be indexed just like a string. A big difference, however, is that lists are mutable, meaning that we can replace individual elements of a list without having to create a new variable. This is shown below (we expect an error to be raised in one of the examples).
# index a string
dna[5:15]

# *try* to mutate part of a string (this won't work)
dna[5] = "T"

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-3a1d80b5d241> in <module>
      1 # *try* to mutate part of a string (this won't work)
----> 2 dna[5] = "T"

TypeError: 'str' object does not support item assignment
# make a list of DNA
dnalist = list(dna)

# index the dna list from 5 to 15
dnalist[5:15]
['G', 'A', 'C', 'G', 'A', 'T', 'T', 'T', 'G', 'A']
# mutate part of the list
dnalist[5] = "T"

# print the list from 5 to 15 to show it changed relative to above
dnalist[5:15]
['T', 'A', 'C', 'G', 'A', 'T', 'T', 'T', 'G', 'A']
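Because strings are immutable, a changed version of a string has to be built as a new object. The sketch below shows two possible ways to do this; the variable names (`edited`, `bases`, `edited_from_list`) are illustrative, and other approaches exist.

```python
dna = "ACGCAGACGATTTGATGATGAGCATCGACTAGCTACACAAAGACTCAGGGCATATA"

# option 1: build a new string from slices around the position to change
edited = dna[:5] + "T" + dna[6:]

# option 2: convert to a list, mutate it, then join it back into a string
bases = list(dna)
bases[5] = "T"
edited_from_list = "".join(bases)

print(edited[5:15])
print(edited == edited_from_list)
print(dna[5:15])  # the original string is unchanged
```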
Again, just like strings, lists are also objects in Python, and as such they have functions that can be used to operate on them. You can see all of the functions associated with a list by using tab-completion after the object as described earlier.
# example: count how many "A" are in the list
dnalist.count("A")
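A few other list functions are sketched below on a short, made-up list of bases (the name `bases` and its contents are illustrative). Note that `.append()` changes the list in place rather than returning a new one.

```python
bases = list("ACGCAGACGA")

# count() returns how many times a value occurs
print(bases.count("A"))

# index() returns the position of the first match
print(bases.index("G"))

# append() adds an element to the end, changing the list in place
bases.append("T")
print(len(bases), bases[-1])
```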
fiveprime = dna[:10]
threeprime = dna[-10:]
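The two slices above leave out one of the indices and use a negative index. As a brief sketch of what that means, assuming the same DNA string as earlier in the notebook:

```python
dna = "ACGCAGACGATTTGATGATGAGCATCGACTAGCTACACAAAGACTCAGGGCATATA"

# leaving out the start index means "from the beginning"
fiveprime = dna[:10]

# a negative start index counts back from the end of the string,
# and leaving out the end index means "through the end"
threeprime = dna[-10:]

print(fiveprime)
print(threeprime)
print(dna[-1])  # the last character
```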