Skip to content

Learning objectives

By the end of this session you should: - Have a new github repository for your mini-project formatted as a python module. -

Defining the mini-project package structure

Create a new mini-project repository

Go to your github account and create a new repository called mini-project. Clone the repository to your local computer in the hacks directory, and then continue populating your new repo with the following directory structure and files.

mini-project/
├── mini-project/
│   ├── __init__.py
|   |   __main__.py
│   └── module.py
├── README.md
└── setup.py

Populate the README.md

Fill in the basic information about your mini-project in the README.md file.

  • Write the name of your program as a level-1 header
  • Briefly describe your mini-project as best you can below the header as a text.
    • Include a brief desription of the input data and the expected output.
  • Write a level-3 header called "Installation"
  • Write a paragraph below this describing how developers can install the program locally to help work on the code. This should include instructions like the following:
conda install [list dependencies here...] -c conda-forge ...

git clone [myproject-link]
cd ./myproject-name
pip install -e .

Test to make sure your code is installable, even if the Python modules are mostly empty for now.

Populate the setup.py

Copy the format of your setup.py file from your hack-9-python repository (the darwinday example module). You will need to change some of the values to match your current project's information.

Populate __init__.py

Do the same for __init__.py, in this case at the moment you might only have the version info.

Populate __main__.py

Copy the text for __main__.py from the hack-9-python directory. You will probably have to remove or modify some things initially, but the structure of this file will be very useful, i.e. the parse_command_line() and main() functions will be useful whatever your mini-project is.

NB: __main__.py should be used for parsing command line and setting up and calling the functions in your module.py file. There shouldn't be much 'heavy lifting' in this file, as the majority of your functional code should be in external modules organized as classes (if needed).

A couple further useful directories

I also like to create directories at the top level of my package called notebooks and example_data for holding jupyter notebooks I use either for prototyping or testing, and as a place to store example data for using as a test-case with your tool (a very good practice).

Begin working on your mini-project

For the remainder of this session start working on your mini-project. This can take a couple of forms:

  • Compiling a toy example dataset: It is useful to have an example dataset that is structured in the same way as the data you anticipate manipulating, so you might spend some time creating (or finding) an appropriate example dataset. This dataset should be small enough to run fast, but have enough variation to meaningfully demonstrate the functions of your code.
  • Prototyping: I will often start a project with a new juptyer notebook called project-name-dev.ipynb in which I will prototype code to get it working before copy/pasting it into a class or function within my package. You can start prototyping to develop the functions of your code.
  • Defining needed CLI arguments: It's also sometimes helpful to start with the argparse arguments you anticipate your program will need, and fill these in.