2.1. Shell, variables, and PATH¶

Operating systems¶

An operating system (OS) is the core software that manages hardware and software resources on a computer. It includes the kernel, whose core function is to mediate communication between hardware and software, and a core set of utilities (programs). Examples of operating systems include Linux, macOS, and Windows. Apple and Microsoft typically maintain only a few versions of macOS and Windows at a time. By contrast, there are many different distributions of Linux. These all use the Linux kernel, but differ in the core software that they come with and in their default settings.

For this reason, the instructions we provide will sometimes differ slightly between Linux and macOS, and occasionally between Linux distributions as well. One of the most commonly used Linux distributions is Ubuntu, and this is the distribution you will likely encounter when using WSL2 or an HPC cluster. Here we will introduce the terminal, or shell, which is a software tool used to call other programs using commands. The command below prints the type of operating system you are running.

echo $OSTYPE

linux-gnu
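Because instructions can differ between Linux and macOS, it can be handy to branch on the value of $OSTYPE. The following is a minimal sketch, not part of the course material: describe_os is a hypothetical helper name, and the patterns assume that Linux values start with "linux" (as in the output above) and macOS values start with "darwin".

```shell
# describe_os is a hypothetical helper used only for illustration.
# It classifies an OSTYPE-style string by its prefix.
describe_os() {
    case "$1" in
        linux*)  echo "Linux" ;;
        darwin*) echo "macOS" ;;
        *)       echo "other" ;;
    esac
}

# classify the system this shell is running on
describe_os "$OSTYPE"
```

Passing the value as an argument, rather than reading $OSTYPE inside the function, makes it easy to try the function with other values, e.g. describe_os darwin23.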

System software utilities¶

Your computer has a lot of software tools installed. Of course this includes those you see on your desktop, like email and calendar, but it also includes many smaller and simpler tools that you likely don't know about. This includes some of the core utilities that we've begun to learn about, such as cd, ls, and echo. Most of these are stored as executable files in folders near the root (/) of your filesystem, such as /bin or /usr/bin (a few, such as cd, are instead built into the shell itself). These folders are protected by your system such that they can only be edited by users with administrative privileges. This makes sense: you don't want to accidentally change these programs and break them, or your system would be kaput.

Let's use the list program ls to list all of the programs in our /bin folder. Here I'm showing only the first few lines of the output, as it contains hundreds of tools.

# see all the program names in /bin/
ls -l /bin/

total 659832
-rwxr-xr-x 1 root root        55744 Apr  5  2024 '['
-rwxr-xr-x 1 root root        14640 Mar 31  2024  411toppm
-rwxr-xr-x 1 root root        18744 Jul 18  2024  aa-enabled
-rwxr-xr-x 1 root root        18744 Jul 18  2024  aa-exec
-rwxr-xr-x 1 root root        18736 Jul 18  2024  aa-features-abi
-rwxr-xr-x 1 root root        22912 Apr  7  2024  aconnect
-rwxr-xr-x 1 root root         1622 Nov 30 13:21  acpidbg
-rwxr-xr-x 1 root root        16422 Aug 15 04:26  add-apt-repository
-rwxr-xr-x 1 root root        14720 Aug  8 22:33  addpart
lrwxrwxrwx 1 root root           26 Aug  7 06:15  addr2line -> x86_64-linux-gnu-addr2line
-rwxr-xr-x 1 root root        43552 Mar 31  2024  afm2pl
-rwxr-xr-x 1 root root        53824 Mar 31  2024  afm2tfm
-rwxr-xr-x 1 root root       158576 Apr  8  2024  airscan-discover
...

If you scrolled down this list you would see that it even includes a program called ls. That's right, that file is the ls program executable itself. Note that although the program is located within the /bin folder, we were able to call ls from our shell without having to type the full path to its location. This is because the software tools in /bin are all accessible by default in the shell. They are in our PATH (more on this below). However, we can also call programs from /bin by writing the full path to the executable files. The command below should yield the same output as the one above.

# call the ls program from /bin/
/bin/ls -l /bin/
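As a small follow-up sketch (assuming, as in the listing above, that ls lives at /bin/ls on your system), you can ask the shell whether that file is executable and count how many entries /bin contains. The variable name n_programs is just an illustrative choice.

```shell
# test whether /bin/ls exists and is marked as executable
if [ -x /bin/ls ]; then
    echo "/bin/ls is executable"
fi

# count the entries in /bin (one name per line is piped to wc)
n_programs="$(/bin/ls /bin | wc -l)"
echo "entries in /bin: $n_programs"
```

The count will differ from one computer to another, depending on what software is installed.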

The Shell/Terminal¶

A shell provides an interface for users to interact with the operating system. It is primarily used to execute commands, run scripts, and automate tasks. The shell itself is a program. The most common shell programs are bash and zsh, but there are many alternative options. You can call the command below to find the name of your shell. Take note, we will use this information later.

echo $SHELL

/bin/bash

Shell variables¶

Most computer languages use variables as a way of assigning an object to a name that can be used to reference it over and over again. Usually the = operator is used to assign variables (e.g., x = 3), but even this differs among languages (e.g., you may be familiar with the syntax x <- 3 to assign variables in R). Variables can also be used in a shell or terminal. This is an example of the shell being used as a scripting language. Shell scripting languages are not general-purpose programming languages in the way that C or Python are, but they provide a minimal framework for flow control to automate commands, including assigning variables. Note that in the shell there must be no spaces around the = when assigning a variable. The example below is a shell script that stores a variable and then prints it by calling the echo command.

# store a variable
x=3

# retrieve the variable's value using $
echo $x
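A slightly longer sketch of the same idea is shown here. The variable names count, name, and message are arbitrary choices for illustration. It shows two habits that avoid common surprises: wrapping values in double quotes, and using braces to mark where a variable name ends.

```shell
# assign variables: no spaces are allowed around '='
count=3
name="shell"

# braces mark where each variable name ends, so the trailing "s"
# is treated as literal text and not as part of the name
message="${count} ${name}s"

# double quotes keep the value together as a single argument
echo "$message"
```

This prints 3 shells. Without the braces, $names would be read as a different (unset) variable.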

Default shell variables¶

As we learned in lecture, your shell stores a lot of information as variables behind the scenes. This includes variables that tell it how to display colors and text in the terminal, how to keep track of its own history, how to generally behave, and importantly, where to look for software or files that can be executed without specifying their path, and found with tab auto-completion. We just saw some examples of these variables above when we queried the $OSTYPE and $SHELL variables to learn about our system. These are variables that are automatically loaded and assigned when your terminal is opened. Even this behavior can be modified, as we will learn more about below.

The `PATH` variable¶

The PATH variable is a colon-separated list of filepaths describing the order in which your shell searches directories for executable binaries -- programs that can be called by name from the command line. This is referred to as a variable because it is a value that is temporarily stored to the name PATH when you open a terminal. Use echo in your shell to print the value of your $PATH variable. A simple one might look like the example below:

# show the $PATH variable where software is searched
echo $PATH

/usr/local/bin:/usr/bin:/usr/sbin:/sbin

The example above includes only the locations of system-wide software -- none of the paths point into a user space, i.e., a directory descended from /home/username on Linux or /Users/username on macOS. By contrast, and as we'll learn more about in the coming weeks, we will often wish to install 'local' software within our own user spaces. This software will be completely isolated from the system software, which we definitely do not want to mess up. You can see in my PATH below that I have additional filepaths, including ~/miniconda3/bin and ~/gems/bin (notice I'm using ~ as shorthand for my home directory, /home/deren), that point to directories located inside my home directory where additional programs are stored.

/home/deren/gems/bin:/home/deren/miniconda3/bin:/home/deren/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
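A long PATH like the one above is easier to read with one directory per line. Here is a minimal sketch that does this by replacing each colon with a newline; list_path_dirs is a hypothetical helper name, not a built-in command.

```shell
# list_path_dirs is a hypothetical helper used only for illustration.
# It prints each directory of a colon-separated list on its own line.
list_path_dirs() {
    printf '%s\n' "$1" | tr ':' '\n'
}

# show the directories in your own PATH, in the order they are searched
list_path_dirs "$PATH"
```

The directories are printed from top to bottom in the same order that they appear from left to right in PATH.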

The order of `PATH`¶

Your PATH variable can be modified, and this can be very useful for specifying a particular version of a program when multiple versions are available. A key example of this is the program python. You can call python from the command line to execute Python scripts. Your system typically comes pre-installed with a version of Python that is used to run system-wide software. We generally do not want to use this version of Python for scientific programming, because we cannot easily update or change that version without affecting our entire operating system. So, instead, we will install a separate version of Python within our userspace (somewhere within your home directory) and modify PATH so that the shell finds this one before the system-wide version. Do not try to do this just yet; we will go through the process of installing Python later. For now, let's focus on how we know which version is being used.

You can find the filepath to a program that is located in your PATH by using the which or whereis commands. This will help to make clear how the PATH variable works. It is ordered. The first directory in the PATH is searched first, and then the next one, and then the next one. If python occurs in the first directory, then this version will supersede any programs with the name python occurring in directories that are listed later in the PATH variable. The which command will show you the program that is found first in your PATH, whereas the whereis command searches a set of standard system locations, in addition to your PATH, and lists the binaries, source files, and manual pages that match the target name. This is why its output can include directories that are not in your PATH. I will show the results from my laptop, since I have multiple versions of python installed.

which python

/home/deren/miniconda3/bin/python

whereis python

python: /usr/bin/python3.8 /usr/lib/python3.8 /usr/lib/python3.9 /usr/lib/python2.7 /etc/python3.8 /usr/local/lib/python3.8 /usr/include/python3.8 /home/deren/miniconda3/bin/python3.8-config /home/deren/miniconda3/bin/python3.8 /home/deren/miniconda3/bin/python
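The effect of PATH order can be seen safely with a throwaway example. This is a sketch under stated assumptions: the program name greet and the temporary directory are hypothetical and exist only for this demonstration, and the change to PATH lasts only for the current shell session.

```shell
# create a temporary directory holding a tiny script named "greet"
# ("greet" is a made-up program name used only for this demo)
demo_dir="$(mktemp -d)"
printf '#!/bin/sh\necho "hello from demo"\n' > "$demo_dir/greet"
chmod +x "$demo_dir/greet"

# put the new directory at the FRONT of PATH, for this session only
PATH="$demo_dir:$PATH"

# the shell now finds greet by name; command -v prints its location
command -v greet
greet
```

Because the new directory was placed first, a program in it would take priority over any program with the same name later in PATH. Closing the terminal discards the change. In bash and zsh you can also run type -a python to list every match for a name in the order your PATH would find them.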

Dotfiles (preferences)¶

Dotfiles are hidden files, generally located in your home directory (e.g., /home/deren or /Users/deren), whose filenames start with a dot (.) as the first character. By default, your shell and operating system typically hide these files from your view, since you don't need to interact with them unless you are a hacker.

Hacking these dotfiles can be really useful. They are files for setting preferences, and thus modifying them can make for a more pleasurable experience when interacting with your terminal. Here we will edit one or more dotfiles to add some basic customization to your terminal. You are welcome to do further customizations on your own (we provide some tips below). Your shell uses these dotfiles by reading and loading the variables in them when the terminal is started. By editing the variables in the files we can customize the behavior of the shell.
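To see how hiding works, here is a small sketch that uses a scratch directory rather than your real home directory. The directory and the filenames .hidden_prefs and visible.txt are made up for this illustration. The -a option tells ls to show all entries, including those that start with a dot.

```shell
# make a scratch directory with one hidden and one visible file
demo_home="$(mktemp -d)"
touch "$demo_home/.hidden_prefs" "$demo_home/visible.txt"

# a plain ls skips names that begin with a dot
ls "$demo_home"

# ls -a also shows dotfiles (plus the special entries . and ..)
ls -a "$demo_home"
```

Running ls -a ~ does the same for your own home directory, where you should find your shell's dotfile.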

First, it is important to be sure which type of shell you have, since this will determine which dotfile we need to edit. If you are in a bash terminal then the file is .bashrc; if you are in a zsh terminal then it is .zshrc.
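The mapping from shell to dotfile can be written down as a short sketch. The function name dotfile_for_shell is hypothetical, and it only covers the two shells discussed here, matching on the end of the path stored in $SHELL.

```shell
# dotfile_for_shell is a hypothetical helper used only for illustration.
# Given the path to a shell program, it prints the matching dotfile.
dotfile_for_shell() {
    case "$1" in
        */bash) echo "$HOME/.bashrc" ;;
        */zsh)  echo "$HOME/.zshrc" ;;
        *)      echo "unknown" ;;
    esac
}

# look up the dotfile for the shell named in $SHELL
dotfile_for_shell "$SHELL"
```

If this prints unknown, your $SHELL names a shell other than bash or zsh, and its dotfile will have a different name.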

We want to be careful when editing this file, since a bad configuration can prevent your terminal from opening properly. If this happens, do not worry, it can be fixed by opening the dotfile with a different system utility, like a graphical text editor, and changing it back to its default settings. For this reason, let's start by creating a backup copy of our dotfile by using the cp command. When typing the command below, be sure to specify the appropriate dotfile for your system -- you can ensure this by typing only the beginning of the filename and pressing tab to complete the filepath automatically. If the name does not autocomplete then you have already started to write the path incorrectly. Take note of where you are located when typing this (e.g., first run pwd) and remember that the dotfiles are located in your user home directory.

# ensure I am in my HOME, and then create a backup of my .bashrc
cd $HOME
cp .bashrc .bashrc_backup
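It is worth confirming that a backup really matches the original before you start editing. The sketch below wraps the copy and the check in a hypothetical helper named backup_file, and tries it on a scratch file rather than on your real dotfile. The cmp -s command compares two files silently and succeeds only if they are identical.

```shell
# backup_file is a hypothetical helper used only for illustration.
# It copies a file to <name>_backup and then verifies the copy.
backup_file() {
    cp "$1" "${1}_backup" && cmp -s "$1" "${1}_backup"
}

# try it on a scratch file instead of a real dotfile
scratch="$(mktemp)"
echo "alias ll='ls -l'" > "$scratch"
backup_file "$scratch" && echo "backup verified"
```

If an edit to your dotfile ever goes wrong, copying in the other direction (e.g., cp .bashrc_backup .bashrc) restores the saved version.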

The `nano` text-editor¶

To edit a dotfile, or any file, we will want to use a text editor. When working in a terminal it can be very useful and efficient to use a terminal-based text editor -- a program that opens right inside of the shell and can be used to edit files. A very simple text editor that is usually installed by default is called nano. You can open a file by typing nano filename. Watch the video below for a short tutorial on using nano. Note that it uses special key-bindings to accomplish some tasks, such as saving and exiting, where you hold down the control key and press another key at the same time.

Editing your dotfile¶

Alert

Complete the tasks in this section to earn points for the assignment.

For this assessment, use nano to edit your shell's dotfile. We will add an alias, which is a short name that serves as a shortcut for a command that is more tedious to type. One that I find indispensable is the alias ll (the lowercase letter L twice). We can create this alias by adding a line to your dotfile. First, open the dotfile in nano:

Linux/WSL2

# open your shell's dotfile in nano
nano ~/.bashrc

macOS

# open your shell's dotfile in nano
nano ~/.zshrc

You will see a lot of text in this file. These are the default preference settings as well as many comments that are intended to direct you towards where to make edits when you are making changes to this file. Scroll down until you find a section that defines some aliases. Add the line below to this file and then save and exit.

alias ll='ls -l'

After you have made the edit, you must then re-load the dotfile, which can be done by either closing and re-opening the terminal, or by calling the following:

Linux/WSL2

# re-load your shell's dotfile
source ~/.bashrc

macOS

# re-load your shell's dotfile
source ~/.zshrc

Now that the alias is created and loaded, typing ll should give the same result as typing ls -l, which lists the items in a directory in long format with one item per line.

# call our new alias 
ll
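If ll does not behave as expected, you can ask the shell to print the alias definition. The sketch below defines the alias directly in the current shell, which is what the line in your dotfile does each time a terminal starts, and then prints it. The exact formatting of the printed definition differs slightly between shells.

```shell
# define the alias in the current shell session
alias ll='ls -l'

# print the definition of this one alias
alias ll
```

Calling alias with no arguments lists every alias that is currently defined, and unalias ll removes this one from the current session.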

If you get stuck in this section, try watching the videos below for step by step instructions on customizing terminals.

Optional: further customize your terminal¶

You do not need to follow all of the instructions in these videos. Make any changes that you find useful, and feel free to search for other tutorials or videos for tips. Just take note that you must follow the instructions for the shell type that you have (e.g., bash versus zsh).

That's it, you can move on to the next tutorial. There is no required submission for this tutorial.