Notebook 3.1: Python dictionaries

Learning objectives:

By the end of this notebook you should be able to:

  1. Recognize the use cases for Python dictionaries.
  2. Create and use Python dictionaries.

Introduction to Python dictionaries

Dictionaries are one of the most useful object types in Python. They map keys to values, and provide a fast and efficient way of creating look-up tables. A simple example would be mapping names to phone numbers or addresses. In genomics, we might map sample names to DNA sequences. Using the dictionary, we can query a key (e.g., a person's name) and it will return the value associated with that key (e.g., an address).

Dictionaries are fast, and they are flexible: they can store values of different types and hold any number of items. Once you master dictionaries you'll find yourself using them all the time.
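
As a minimal sketch of the look-up table idea described above, the cell below maps sample names to DNA sequences and then queries one key. The sample names and sequences are made up for illustration, and the curly bracket syntax is explained in the next section.

```python
# hypothetical sample names mapped to made-up DNA sequences
sequences = {
    'sample-1': 'ACGTTGCA',
    'sample-2': 'ACGTTGGA',
    'sample-3': 'ACCTTGCA',
}

# query a key to get back the value stored under it
seq = sequences['sample-2']
print(seq)
```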

A simple example

You can create a dictionary object by using either the dict() function, or by enclosing dictionary data inside of curly brackets. Both examples are shown below. The second form is more commonly used so I will use that in all following examples. In the curly bracket format keys are matched with values by a colon, and key/value pairs are separated by commas.

In [1]:
# make a dict from a list of key,val pairs
d1 = dict([('key1', 'val1'), ('key2', 'val2')])

# make a dict using the simpler curly bracket format
d2 = {'key1': 'val1', 'key2': 'val2'}
In [2]:
# return the dictionary
d2
{'key1': 'val1', 'key2': 'val2'}

Query a dictionary value

To query a dictionary you provide a key to the dictionary as an index (in square brackets), and it will return the matching value.

In [3]:
# query the value stored under 'key1'
d2['key1']
'val1'
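
As a rough sketch, the cell below queries a small dictionary by key and then shows two common ways of dealing with a key that is not present: testing with `in`, and the `.get()` method, which returns a default value instead of raising a KeyError. The dictionary `d` is redefined here so the example runs on its own, and 'key3' is chosen only because it is absent.

```python
d = {'key1': 'val1', 'key2': 'val2'}

# indexing with an existing key returns its value
print(d['key1'])

# indexing with a missing key raises a KeyError, so check first...
if 'key3' in d:
    print(d['key3'])
else:
    print('key3 is not in the dictionary')

# ...or use .get(), which returns a default instead of raising an error
missing = d.get('key3', 'no value')
print(missing)
```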

Common use case

A common way to work with dictionaries is to start with an empty dictionary at the beginning of an iteration (e.g., a for-loop) and to fill the dictionary as you iterate over the elements of a list. Dictionaries are useful for this because you can quickly query whether an element that you visit in the iteration is already in the dictionary or not. Let's consider an example where we use a dictionary as a counter. We'll store the items we are counting as keys, and their counts (integers) as values.

In the example below we iterate over a list of random integers and apply a conditional if/else statement to either create a new key/value pair in the dictionary, or to increment the value if the key is already in the dictionary.

In [4]:
import random

integer_list = [random.randint(0, 10) for i in range(1000)]

counter = {}

for item in integer_list:
    if item not in counter:
        counter[item] = 1
    else:
        counter[item] += 1
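
The same counting pattern can be written more compactly. The sketch below is not part of the original example; it shows two alternatives, the dictionary `.get()` method with a default of 0 and `collections.Counter` from the standard library, and checks that both give the same counts. The fixed seed is arbitrary and is there only so that the result is repeatable.

```python
import random
from collections import Counter

random.seed(123)  # arbitrary seed, only for repeatable results
integer_list = [random.randint(0, 10) for i in range(1000)]

# alternative 1: .get() returns 0 when the key is not yet present
counter_get = {}
for item in integer_list:
    counter_get[item] = counter_get.get(item, 0) + 1

# alternative 2: Counter does the counting for us
counter_auto = Counter(integer_list)

# both approaches produce the same key/value pairs
print(counter_get == dict(counter_auto))
```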

The resulting counter dictionary

The code above iterated over every element in a list of 1000 random integers selected in the range 0-10 (inclusive), and counted how many times each occurred. In other words, we created a histogram.

Below we can return the dictionary and see that it shows a number of keys and their mapped values. The results are not sorted, which makes them harder to read. In the next cell, we instead query the keys in the order we wish to see them, so that the results are displayed in order.

In [5]:
# return the dictionary results
counter
{3: 101,
 7: 92,
 2: 93,
 0: 80,
 5: 90,
 10: 99,
 4: 89,
 6: 73,
 1: 100,
 9: 98,
 8: 85}
In [6]:
# return dictionary results in a queried order

# iterate over the keys in the dictionary (integers 0-10)
for i in range(11):
    # print the key and value
    print(i, counter[i])
0 80
1 100
2 93
3 101
4 89
5 90
6 73
7 92
8 85
9 98
10 99
In [7]:
# another way to do the same thing

# iterate over the keys in sorted order (here 0-10)
for key in sorted(counter.keys()):
    # print the key and value
    print(key, counter[key])
0 80
1 100
2 93
3 101
4 89
5 90
6 73
7 92
8 85
9 98
10 99
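
Sorting by key is one option; a related sketch is to order the results by count instead. The example below uses a few key/value pairs copied from the output above (not the full dictionary) and the built-in `sorted()` function with a `key` argument that picks out the value from each (key, value) pair.

```python
# a few key/value pairs copied from the output above
counts = {3: 101, 7: 92, 2: 93, 0: 80}

# sort (key, value) pairs by value, largest first
by_count = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)

for key, value in by_count:
    print(key, value)
```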

Interpreting code

Action [1]: In a code cell below describe what is happening on each line of the code by writing a comment above each line of code where I have written "# comment:". If you get stuck, try asking for help in the chatroom.
In [9]:
# comment: import a package 
import random

# comment: create a list of 1000 integers between 0-10
integer_list = [random.randint(0, 10) for i in range(1000)]

# comment: create an empty dictionary
counter = {}

# comment: iterate over items in the integer list
for item in integer_list:
    # comment: check whether the 'item' integer is not yet in the 'counter' dictionary keys
    if item not in counter:
        # comment: add the item as a new key with a count of 1
        counter[item] = 1
    # comment: otherwise the integer is already in the dictionary keys
    else:
        # comment: increase the value stored for this key by 1
        counter[item] += 1

Dictionary attributes/features

Like other objects in Python, dictionaries have a number of methods and attributes associated with them that you can access by placing a dot after the dictionary name, and typing [tab]. Let's create an example below of a dictionary that stores lists as values. Below we explain the .keys(), .items(), and .values() methods of dictionaries, which can be used to return its data.

In [10]:
# lists of names and data
individuals = ['sample-1', 'sample-2', 'sample-3', 'sample-4']
trait1 = [56, 76, 22, 21]
trait2 = ['green', 'green', 'red', 'red']
trait3 = ['angry', 'docile', 'angry', 'docile']

# create a dictionary mapping multiple traits to each individual
datadict = {}
for i in range(4):
    datadict[individuals[i]] = [trait1[i], trait2[i], trait3[i]]
In [11]:
## show the dictionary data
datadict
{'sample-1': [56, 'green', 'angry'],
 'sample-2': [76, 'green', 'docile'],
 'sample-3': [22, 'red', 'angry'],
 'sample-4': [21, 'red', 'docile']}
In [12]:
## .items() returns key,val pairs as tuples
for item in datadict.items():
    print(item)
('sample-1', [56, 'green', 'angry'])
('sample-2', [76, 'green', 'docile'])
('sample-3', [22, 'red', 'angry'])
('sample-4', [21, 'red', 'docile'])
In [13]:
## .keys() returns just the keys
for key in datadict.keys():
    print(key)
sample-1
sample-2
sample-3
sample-4
In [14]:
## .values() returns just the values
for val in datadict.values():
    print(val)
[56, 'green', 'angry']
[76, 'green', 'docile']
[22, 'red', 'angry']
[21, 'red', 'docile']
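
A common pattern that follows from `.items()` is to unpack each tuple into two names in the for-loop itself. The sketch below rebuilds a smaller version of the dictionary above so that it runs on its own; the variable names `name`, `traits`, and `first_traits` are just illustrative.

```python
datadict = {
    'sample-1': [56, 'green', 'angry'],
    'sample-2': [76, 'green', 'docile'],
}

# collect the first trait of every sample
first_traits = []

# unpack each (key, value) tuple into two variables
for name, traits in datadict.items():
    # traits is a list, so it can be indexed to get a single trait
    print(name, traits[0])
    first_traits.append(traits[0])
```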

List and dictionary comprehensions

Just as you can create lists using list comprehension, you can create dictionaries using dictionary comprehension. This is often a more concise way to write code than a for-loop. The list format can be thought of as: [append this thing as we iterate through each thing from a container of things]. The dictionary format is the same idea with curly brackets and a key: value pair in place of the single thing.

In [15]:
# list-comprehension example for list objects
newlist = [i for i in range(10)]
newlist
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [16]:
# dictionary comprehension from a list of lists
ddict = {i: j for (i, j) in [['a', 1], ['b', 2], ['c', 3]]}
ddict
{'a': 1, 'b': 2, 'c': 3}
In [17]:
# another example using the Python function 'zip'
keys = ['a', 'b', 'c']
vals = [1, 2, 3]
{i: j for (i, j) in zip(keys, vals)}
{'a': 1, 'b': 2, 'c': 3}
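
Two small variations on the examples above, offered as a sketch: `dict()` accepts the output of `zip` directly, and a comprehension can include an `if` clause to keep only some items. The names `zipped` and `filtered` and the threshold of 1 are arbitrary choices for illustration.

```python
keys = ['a', 'b', 'c']
vals = [1, 2, 3]

# dict() can build the dictionary from zipped pairs directly
zipped = dict(zip(keys, vals))

# a comprehension can filter items with an if clause
filtered = {k: v for (k, v) in zip(keys, vals) if v > 1}

print(zipped)
print(filtered)
```
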
Action [2]: Using either a for-loop or a comprehension, create your own dictionary object that is filled with whatever kinds of key/value pairs you can think of.
In [19]:
dd = {'a': 3, 'b': 4, 'c': 5}
dd
{'a': 3, 'b': 4, 'c': 5}
Save and download this notebook as HTML to upload to courseworks.