Dictionaries are one of the most useful object types in Python. They provide a mapping between (key, value) pairs, and represent a fast and efficient way of creating look-up tables. A simple example use for a dictionary would be something like mapping names to phone numbers or addresses. In genomics, we might map sample names to DNA sequences. Using the dictionary we could then query a key (e.g., a person's name) and it will return the value associated with that key (e.g., an address).
Dictionaries are fast and flexible, and can store data of different types and sizes. Once you master dictionaries you'll find yourself using them all the time.
You can create a dictionary object by using either the dict() function, or by enclosing dictionary data inside of curly brackets. Both examples are shown below. The second form is more commonly used so I will use that in all following examples. In the curly bracket format keys are matched with values by a colon, and key/value pairs are separated by commas.
# make a dict from a list of key,val pairs
d1 = dict([('key1', 'val1'), ('key2', 'val2')])
# make a dict using the simpler curly bracket format
d2 = {'key1': 'val1', 'key2': 'val2'}
# return the dictionary
d2
To query a dictionary you provide a key to the dictionary as an index (in square brackets), and it will return the matching value.
d2['key1']
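Querying a key that is not in the dictionary raises an error, so it helps to know how to check for a key first. The sketch below illustrates this using the genomics example mentioned earlier. The sample names and sequences are made up for illustration, and the fallback string passed to .get() is an arbitrary choice.

```python
# hypothetical sample names mapped to made-up DNA sequences
seqs = {'sample-1': 'ACGTTGCA', 'sample-2': 'ACGTAGCA'}

# indexing with a key that exists returns its value
first = seqs['sample-1']

# indexing with a missing key raises a KeyError
try:
    seqs['sample-9']
    missing_raised = False
except KeyError:
    missing_raised = True

# 'in' tests whether a key is present without raising an error
has_sample2 = 'sample-2' in seqs

# .get() returns a default value instead of raising for a missing key
fallback = seqs.get('sample-9', 'NNNNNNNN')

print(first, missing_raised, has_sample2, fallback)
```

Use square brackets when a missing key should be treated as a mistake, and .get() when a default value is acceptable.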
A common way to work with dictionaries is to start with an empty dictionary at the beginning of an iteration (e.g., a for-loop) and to fill elements of the dictionary as you iterate over the elements of a list. Dictionaries are useful for this because you can quickly query whether an element that you visit in the iteration is already in the dictionary or not. Let's consider an example where we use a dictionary as a counter. We'll store the items we are counting as keys, and their counts (integers) as values.
In the example below we iterate over a list of random integers and apply a conditional if/else statement to either create a new key/value pair in the dictionary, or to increment the value if the key is already in the dictionary.
import random
integer_list = [random.randint(0, 10) for i in range(1000)]
counter = {}
for item in integer_list:
    if item not in counter:
        counter[item] = 1
    else:
        counter[item] += 1
counter
The code above iterated over every element in a list of 1000 random integers selected in the range 0-10 (random.randint includes both endpoints), and counted how many times each occurred. In other words, we created a histogram.
Below we can return the dictionary and see that it shows a number of keys and their mapped values. The results are not sorted and are not very easy to read. In the next cell, we instead query the keys in the order we wish to see them, to display the results more clearly.
# return the dictionary results
counter
# return dictionary results in a queried order
# iterate over the keys in the dictionary (integers 0-10)
for i in range(11):
    # print the key and value
    print(i, counter[i])
# another way to do the same thing
# iterate over the sorted keys, which we know are 0-10
for key in sorted(counter.keys()):
    # print the key and value
    print(key, counter[key])
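Indexing a counter with a fixed range of keys assumes that every key was actually counted. With 1000 random draws that is very likely, but not guaranteed, and a missing key would raise a KeyError. A minimal sketch of a safer ordered display is shown below. It uses a short, made-up list so that the result is reproducible, and the names values, counts, and ordered are introduced here only for illustration.

```python
# a small, fixed list of integers (made up for illustration)
values = [3, 1, 3, 0, 3, 1]

# count occurrences with the same if/else pattern used above
counts = {}
for item in values:
    if item not in counts:
        counts[item] = 1
    else:
        counts[item] += 1

# visit keys 0-4 in order; .get() returns 0 for keys that were never counted
ordered = [(key, counts.get(key, 0)) for key in range(5)]
for key, count in ordered:
    print(key, count)
```

Here the integers 2 and 4 never occur in the list, so they are reported with a count of 0 rather than causing an error.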
# comment: import a package
import random
# comment: create a list of 1000 random integers in the range 0-10
integer_list = [random.randint(0, 10) for i in range(1000)]
# comment: create an empty dictionary
counter = {}
# comment: iterate over items in the integer list
for item in integer_list:
    # comment: check whether the 'item' integer is not yet in the 'counter' dictionary keys
    if item not in counter:
        # comment: first time this integer is seen, so set its count to 1
        counter[item] = 1
    # comment: integer is already in dictionary keys
    else:
        # comment: if item is already in dictionary then value increases by 1
        counter[item] += 1
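The counting pattern above can also be written more compactly. As a sketch, the block below compares the explicit if/else loop with two alternatives: the dictionary .get() method with a default of 0, and Counter from the standard library collections module. The seed value is an arbitrary choice added here only to make the run reproducible, and the names counter_get and counter_std are introduced for illustration.

```python
import random
from collections import Counter

# fix the seed (arbitrary value) so the random list is reproducible
random.seed(123)
integer_list = [random.randint(0, 10) for i in range(1000)]

# the explicit if/else loop
counter = {}
for item in integer_list:
    if item not in counter:
        counter[item] = 1
    else:
        counter[item] += 1

# the same count using .get() with a default of 0 for unseen keys
counter_get = {}
for item in integer_list:
    counter_get[item] = counter_get.get(item, 0) + 1

# Counter is a dictionary subclass that counts the items of an iterable
counter_std = Counter(integer_list)

# all three approaches produce the same key/value pairs
print(counter == counter_get == counter_std)
```

The explicit loop is the easiest to read when learning, while Counter is convenient once the pattern is familiar.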
Like other objects in Python, dictionaries have a number of methods and attributes associated with them that you can access by placing a dot after the dictionary name and pressing the Tab key. Let's create an example below of a dictionary that stores lists as values. Below we explain the .keys(), .items(), and .values() methods of dictionaries, which can be used to return its data.
# lists of names and data
individuals = ['sample-1', 'sample-2', 'sample-3', 'sample-4']
trait1 = [56, 76, 22, 21]
trait2 = ['green', 'green', 'red', 'red']
trait3 = ['angry', 'docile', 'angry', 'docile']
# create a dictionary mapping multiple traits to each individual
datadict = {}
for i in range(4):
    datadict[individuals[i]] = [trait1[i], trait2[i], trait3[i]]
## show the dictionary data
datadict
## .items() returns key,val pairs as tuples
for item in datadict.items():
    print(item)
## .keys() returns just the keys
for key in datadict.keys():
    print(key)
## .values() returns just the values
for val in datadict.values():
    print(val)
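Because .items() returns (key, value) tuples, each tuple can be unpacked into two names directly in the for-loop. The sketch below rebuilds the same sample data by hand so that it runs on its own. The names name, traits, and red_samples are introduced here only for illustration.

```python
# the same data as above, written out directly
datadict = {
    'sample-1': [56, 'green', 'angry'],
    'sample-2': [76, 'green', 'docile'],
    'sample-3': [22, 'red', 'angry'],
    'sample-4': [21, 'red', 'docile'],
}

# unpack each (key, value) tuple into two names
for name, traits in datadict.items():
    # print the sample name and its first trait
    print(name, traits[0])

# collect the names of samples whose second trait is 'red'
red_samples = []
for name, traits in datadict.items():
    if traits[1] == 'red':
        red_samples.append(name)
print(red_samples)
```

Unpacking avoids indexing into the tuple with item[0] and item[1], which tends to make the loop body easier to read.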
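Just as lists can be created using a list comprehension, dictionaries can be created using a dictionary comprehension. This is often simply a more concise way to write code than a for-loop. The format of a list comprehension can be thought of as: [append this thing as we iterate through each thing from a container of things]. A dictionary comprehension follows the same format, but uses curly brackets and a key: value pair in place of the single thing.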
# list-comprehension example for list objects
newlist = [i for i in range(10)]
newlist
# dictionary comprehension from a list of lists
ddict = {i: j for (i, j) in [['a', 1], ['b', 2], ['c', 3]]}
ddict
# another example using the Python function 'zip'
keys = ['a', 'b', 'c']
vals = [1, 2, 3]
{i: j for (i, j) in zip(keys, vals)}
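Two small variations on the zip example may be useful. As a sketch, the block below shows that dict() can build the same dictionary from zipped pairs directly, and that a comprehension can end with an if condition to keep only some pairs. The names comp, direct, and odd_only are introduced here only for illustration.

```python
keys = ['a', 'b', 'c']
vals = [1, 2, 3]

# dictionary comprehension over the zipped (key, value) pairs
comp = {k: v for (k, v) in zip(keys, vals)}

# dict() accepts an iterable of (key, value) pairs directly
direct = dict(zip(keys, vals))

# an 'if' at the end of the comprehension filters which pairs are kept
odd_only = {k: v for (k, v) in zip(keys, vals) if v % 2 == 1}

print(comp, direct, odd_only)
```

When no filtering or transformation is needed, dict(zip(keys, vals)) is the shorter form. The comprehension is the better fit when keys or values need to be changed or filtered along the way.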
dd = {'a': 3, 'b': 4, 'c': 5}
dd