Learning to think functionally¶
As with any language, there are often many possible ways to communicate the same information by using different words or modifying their order, and one way is not necessarily better than another. In programming, our goal is to provide a set of logical statements to accomplish a defined task. This can usually be done with different types of objects or routines in your code. We could evaluate different approaches based on their speed, reliability, and readability, but in the end we are usually most interested in the question of whether or not it works.
Now that you have learned the basic building blocks of the Python programming language, you can begin to write useful functions for accomplishing tasks. But how does one learn to think functionally? In other words, how do you train your brain to approach problems in terms of the Python functions that would solve them?
Kludge¶
The term kludge refers to an assemblage of parts that together are sufficient to accomplish a task, even if the solution is far from perfect. Kludging your way through computing problems is a common way to learn to code. Step 1 is to get a working solution, and step 2 is to find ways to improve it. While this approach will eventually help you to improve, it is not necessarily the fastest way to learn. Instead, you can now take advantage of the tremendous resources that are available to you through the internet to find and learn from existing examples of code. In the context of learning to speak a language, you are now ready to study abroad; to learn through practice and observation of how others use the language.
Copying code¶
So how do we learn to improve code? Use your social network. Copy code from examples on the internet. The best places to find solutions are usually either Stack Overflow or GitHub. The former is a question-and-answer site where people post questions and others respond and vote on the best answers. Reading these posts is a great way to learn why some coding examples are better than others.
To find solutions in this way, try googling terms such as "{my question} Python stack overflow". For example, I found a useful page by searching "n choose k Python stack overflow". One user asks for the fastest way to compute the number of ways to choose k items from a collection of n items. The responses include custom code, standard library functions, and third-party libraries (e.g., scipy). You can choose among these ideas for the one that works best for your purposes.
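To make the comparison concrete, here is a minimal sketch of two of the kinds of answers you might find for this question: a hand-written function and the standard library call `math.comb` (available in Python 3.8 and later). The function name `n_choose_k` is a hypothetical one chosen for illustration and is not taken from any particular post.

```python
import math


def n_choose_k(n, k):
    """Return the number of ways to choose k items from n items."""
    # there is no way to choose a negative number of items, or more than n
    if k < 0 or k > n:
        return 0
    # use the symmetry C(n, k) == C(n, n - k) to shorten the loop
    k = min(k, n - k)
    result = 1
    for i in range(1, k + 1):
        # after each step, result equals C(n - k + i, i), so the
        # integer division is always exact
        result = result * (n - k + i) // i
    return result


# the custom function and the standard library should agree
print(n_choose_k(5, 2), math.comb(5, 2))
```

Both approaches give the same answer, so the choice comes down to readability and which Python versions you need to support. Comparing them side by side is a good exercise in judging why one posted answer might be preferred over another.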
GitHub is not as easy to search for code snippets that are accompanied by an explanation, but it is a great place to find examples of code being used in practice. Find a library that you really enjoy using and explore its repository on GitHub. Try to understand how it is organized, and how the main classes and functions work. We will go through some examples of this in class in the following weeks.
Thinking atomically¶
As we've discussed several times, it is better practice to break up the operations in your code into many small functions, rather than one large function. This will allow you to debug your code more easily when it contains errors, and to write cleaner and more readable code. Writing functions in this way can also help you to conceptualize problems, by breaking them into many smaller parts for which you can write simpler solutions.
Thinking functionally¶
As an example, let's imagine our coding task is to search a database of biomedical publications in PDF format and to count all mentions of any gene names that exist in a Human Genome database. Before you write any code for this task, you should start by creating an outline of the atomized tasks involved. For example, we could break this problem into a number of functions where each accomplishes one of the following tasks:
- extract gene names from a biomedical database into a dict object mapping each name to an initial count of 0.
- check for special formatting of gene names and attempt to standardize (e.g., make all uppercase).
- extract text from PDFs into a dictionary mapping article name to contents.
- parse article text into list of words and format to match gene names (e.g., all uppercase).
- iterate over articles, counting occurrences of all gene names among the word list of each.
- format the count results dictionary into a table and print to a file or stdout.
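The outline above can be sketched as a handful of small functions. This is only an illustration under stated assumptions: all function names are hypothetical, the gene names and article text are made-up sample data, and the PDF extraction step is skipped (it would require a third-party library), so the articles are supplied directly as a dictionary mapping article name to text.

```python
import re


def standardize(name):
    """Standardize a gene name or word (here: strip whitespace, uppercase)."""
    return name.strip().upper()


def init_gene_counts(gene_names):
    """Map each standardized gene name to an initial count of 0."""
    return {standardize(name): 0 for name in gene_names}


def tokenize(text):
    """Split article text into a list of standardized words."""
    return [standardize(word) for word in re.findall(r"[A-Za-z0-9-]+", text)]


def count_genes(articles, gene_names):
    """Count occurrences of each gene name across all articles."""
    counts = init_gene_counts(gene_names)
    for text in articles.values():
        for word in tokenize(text):
            if word in counts:
                counts[word] += 1
    return counts


def format_table(counts):
    """Format the counts dictionary as a tab-separated table."""
    lines = ["gene\tcount"]
    for gene in sorted(counts):
        lines.append(f"{gene}\t{counts[gene]}")
    return "\n".join(lines)


# made-up sample data standing in for text extracted from PDFs
articles = {
    "paper1": "Mutations in brca1 and TP53 are common; BRCA1 is well studied.",
    "paper2": "We found no effect of tp53.",
}
counts = count_genes(articles, ["BRCA1", "TP53", "EGFR"])
print(format_table(counts))
```

Because each function does one thing, each can be tested on its own (for example, checking that `tokenize` handles punctuation) before the pieces are chained together.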
Now that we have identified the main tasks involved, we can begin to develop these functions in order, testing each one individually to verify that it is working properly, and then testing them in combination, where the output of one serves as input to another. As you will learn in tutorial 7.4, the object-oriented design of Python, and in particular the use of class objects, can help with developing code in this style.
Proceed to the next tutorial.