Notebook 2.1: Conditionals

This notebook corresponds to chapters 3 and 4 of the official Python tutorial (https://docs.python.org/3/tutorial/). Please read those chapters before completing this notebook.

Learning objectives:

By the end of this exercise you should:

  1. Be able to write conditional Python clauses.
  2. Become familiar with Python functions.
  3. Understand the use of indentation in structuring Python code.

Serial objects (using 'for' statements)

In the first notebook we learned about several Python object types: int, float, bool, str, and list. Of these, str and list differ from the rest in that they are iterable. This means that the objects have a built-in structure that allows us to look at each of their elements in order.

The for statement in Python can be used to iterate over iterable objects. In each iteration it assigns the next value from the iterable object to a loop variable. Below, we use the variable name element to store the value in each iteration, but you can name this variable whatever you want.

In [1]:
# create a list (a type of iterable object)
mylist = list('lists-are-iterable')

# iterate over elements in mylist printing each one
for element in mylist:
    print(element)
l
i
s
t
s
-
a
r
e
-
i
t
e
r
a
b
l
e
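
The same pattern works for a list holding any kind of value, not only characters. The sketch below is illustrative only: the names lengths, value, and total and the numbers in the list are made up for this example.

```python
# a list of integers (hypothetical example values)
lengths = [12, 8, 33]

# visit each value in order and keep a running total
total = 0
for value in lengths:
    total = total + value

print(total)  # prints 53
```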

Indentation

If you're familiar with other languages like R you may be asking: where are the curly brackets or other characters that show where the for loop starts and ends? Python does not use brackets for this. Instead, it uses indentation. Indenting the line under the for clause by one level (four spaces by convention) tells Python that this line belongs to the loop and should be repeated in each iteration.

In [2]:
# create a dna string (also iterable)
dna = "AACTCGCTAAAG"

# iterate over the string operating on each element
for element in dna:
    print(element)
A
A
C
T
C
G
C
T
A
A
A
G
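
To see where a loop ends, compare an indented line with one that returns to the left margin. This is a minimal sketch; the short dna string and the counter named inside are assumptions made for illustration.

```python
dna = "ACG"
inside = 0

for element in dna:
    # indented: these two lines run once per character
    print("inside the loop:", element)
    inside += 1

# not indented: runs a single time, after the loop has finished
print("loop body ran", inside, "times")
```

The final print statement executes only once because it is not indented under the for clause.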

Conditionals

A loop over elements is a natural place to insert a conditional statement so that you operate on only a subset of the elements. This can be done using if statements. Again, indentation shows that the line following the if statement should be executed only if the condition evaluates to True. Example:

In [3]:
# create a dna string
dna = "AACTCGCTAAAG"

# iterate over the string operating on each element
for element in dna:
    
    # apply the conditional
    if element == "A":
        
        # the code only reaches here if the conditional returned True
        print(element)
A
A
A
A
A
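
A condition does not have to be an equality test. As a hedged sketch, the example below uses the in operator to count every G or C in the same string; the variable name gc_count is hypothetical.

```python
dna = "AACTCGCTAAAG"

# start a counter at zero
gc_count = 0

for element in dna:
    # `in` is True when element matches either character in "GC"
    if element in "GC":
        gc_count += 1

print(gc_count)  # prints 5
```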

Indexing with iteration

You can select the elements of an iterable by iterating over the iterable itself, as we did above. Another useful approach is to iterate over a range of integers and use each integer to index into the iterable object. Remember that indexing selects a single element (e.g., dnalist[5]) and slicing selects a subset of elements (e.g., dnalist[5:15]) from a string or list.

Below is an example of iterating through an iterable, and iterating through an index.

In [4]:
# an iterable string
dna = "AACCTTGG"

# iterating through the iterable itself
for letter in dna:
    print(letter)
A
A
C
C
T
T
G
G
In [5]:
# iterating through an index (this list covers only the first six positions)
for i in [0, 1, 2, 3, 4, 5]:
    print(dna[i])
A
A
C
C
T
T

The range and len functions

The range type is a highly efficient sequence object for iterating over numeric values. It has the form range(start, stop, step), where start and step are optional and the stop value itself is excluded. It returns an object that generates numbers on the fly as they are requested. This makes it memory efficient: if you ask for a billion numbers it does not build them all ahead of time but produces each one only when it is needed. range is important because it is often used together with sequence-type objects to step through their indices.

Another useful function that is often used in tandem with range is len, which stands for length. It measures the length of an object such as a string or list, meaning it tells you how many elements are inside it. One way these two functions are combined is to ask range for a sequence of integers that is the same length as the object. This allows you to iterate over every index of that object.

In the example below we iterate over every element in dna by calling len(dna), which returns an integer, and wrapping that inside range, which returns a sequence of indices covering that entire length.

In [6]:
# example calling len() function on dna object
len(dna)
Out[6]:
8
In [7]:
# example calling the range() function
for idx in range(5):
    print(idx)
0
1
2
3
4
In [8]:
# iterate over the full range of the dna string
for i in range(len(dna)):
    print(i, dna[i])
0 A
1 A
2 C
3 C
4 T
5 T
6 G
7 G
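
The cells above call range with a single argument. As a small sketch of the full range(start, stop, step) form, the example below visits every second position of dna; the variable name evens is introduced here only for illustration.

```python
dna = "AACCTTGG"

# start at 0, stop before len(dna), and move forward 2 at a time
evens = list(range(0, len(dna), 2))
print(evens)  # prints [0, 2, 4, 6]

# use the same range to index every second letter
for i in range(0, len(dna), 2):
    print(i, dna[i])
```

Wrapping range in list is only needed to display the values; the for loop can consume the range object directly.
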
Action: Follow the instructions to complete this multipart action to receive full points: (1) Create a new list object called dnalist made up of string objects, in any number and order, of the letters A, C, T, and G. (2) Iterate over the length of dnalist to select each element using indexing. (3) As each indexed value is selected, use a conditional statement to select only elements that match the value "A". (4) If the value matches "A", replace the indexed value with a lowercase version of the letter. (5) Print the final modified dnalist object to show that the values have been modified. Hint: Make sure you create the dnalist variable as a list object and not as a string object, because only a list object is mutable (i.e., individual indexed values can be replaced with others).
In [58]:
dnalist = list("GCATCGATCGACTAGCATCGAT")

for idx in range(len(dnalist)):
    if dnalist[idx] == 'A':
        dnalist[idx] = dnalist[idx].lower()

print(dnalist)
['G', 'C', 'a', 'T', 'C', 'G', 'a', 'T', 'C', 'G', 'a', 'C', 'T', 'a', 'G', 'C', 'a', 'T', 'C', 'G', 'a', 'T']

If / Else clauses

A natural progression from writing if statements is to also add an operation to run when the condition is False. This can be done using else. To write more complex statements, we can also add elif, which means else if.

In [59]:
mylist = ['a', 'b', 'c', 'd', 'e']
for letter in mylist:
    if letter == 'a':
        print('lower case a', letter)
    elif letter == 'b':
        print('upper case b', letter.upper())
    else:
        print("some other letter")
lower case a a
upper case b B
some other letter
some other letter
some other letter
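
The branches are tested from top to bottom, and only the first one whose condition is True runs. The sketch below illustrates this by sorting bases into groups; the sequence and the counter names are made up for this example, and "N" is used here as a stand-in for an unrecognized character.

```python
# hypothetical sequence; "N" stands in for an unrecognized base
seq = "AGCTN"

purines = 0
pyrimidines = 0
other = 0

for base in seq:
    if base in "AG":
        # tested first
        purines += 1
    elif base in "CT":
        # tested only if the first condition was False
        pyrimidines += 1
    else:
        # reached only if every condition above was False
        other += 1

print(purines, pyrimidines, other)  # prints 2 2 1
```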

Lists as stacks

A common use of lists is to store objects that have been passed through some type of filter process. Lists are nice for this because you can start with an empty list and sequentially add objects to it to build it up. Example below.

In [60]:
vowels = []
for item in "abcdefghijklmnopqrstuvwxyz":
    if item in "aeiou":
        vowels.append(item)
        
print(vowels)
['a', 'e', 'i', 'o', 'u']
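
The same build-up pattern can store something other than the items themselves. As an illustrative sketch, the example below collects the index of every "T" in a string; the name positions is hypothetical.

```python
dna = "AACTCGCTAAAG"

# start with an empty list and add to it as matches are found
positions = []
for i in range(len(dna)):
    if dna[i] == "T":
        positions.append(i)

print(positions)  # prints [3, 7]
```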

List comprehension

A more compact way to build a list from a for loop, optionally filtered by a conditional statement, is to use a syntax called list comprehension. This is essentially a way of rewriting a multi-line for-loop statement as a single line. The point of list comprehension is to make your code more compact and easier to read. It may look a little funny at first, but once you become familiar with the list-comprehension syntax it can be very elegant.

In [61]:
vowels = [i for i in "abcdefghi" if i in "aeiou"]
vowels
Out[61]:
['a', 'e', 'i']
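
To make the correspondence concrete, the sketch below writes the same filter both ways and checks that the results match. The names letters, vowels_loop, vowels_comp, and upper_vowels are assumptions made for illustration.

```python
letters = "abcdefghi"

# multi-line version: loop, test, append
vowels_loop = []
for i in letters:
    if i in "aeiou":
        vowels_loop.append(i)

# comprehension version: [expression for item in iterable if condition]
vowels_comp = [i for i in letters if i in "aeiou"]

print(vowels_loop == vowels_comp)  # prints True

# the leading expression can also transform each item that is kept
upper_vowels = [i.upper() for i in letters if i in "aeiou"]
print(upper_vowels)  # prints ['A', 'E', 'I']
```
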
Action: Write code in the cell below to count the number of differences between the variables dna1 and dna2. Hint: create an integer variable set to 0 and add 1 to it each time you observe a difference between the two strings. Iterate through both strings, comparing the items at each position. You may find that using the `range` function and indexing the strings is easiest. See section 4.3 of the tutorial.
In [62]:
dna1 = "AACTCGCTAAAGCCTCGCGGATCGATAAGCTAG"
dna2 = "AAGTCGCTAAAGCAACGCGGAACGATAACCTGG"
In [63]:
count = 0
for i in range(len(dna1)):
    if dna1[i] != dna2[i]:
        count += 1
        
print(count)
6
Action: Save this notebook and download as HTML to submit to courseworks.