Code Organization in Python

Modular design
Style guidelines
Documentation

Modular Design

Modular design in software engineering is a methodology that involves breaking down a system into smaller, independent modules, each responsible for specific tasks.

It is a first step towards achieving separation of concerns.

"Separation of concerns is a principle used in programming to separate an application into units, with minimal overlapping between the functions of the individual units."

Modular Design

The benefits of a modular design are multiple:

  • Efficient Navigation
  • Testing
  • Onboarding
  • Reusability
  • Extensibility
Modular Design

Anatomy of a Python Library

Python allows modular design via 2 key components:

  • Modules
  • Packages
myproj/
├── mylib/
│   ├── __init__.py
│   ├── core.py
│   └── utils/
│       ├── __init__.py
│       └── helpers.py
├── tests/
├── docs/
├── main.py
└── README.md

Python Modules

A module is "the basic unit of code reusability in Python" [Docs].

There are 2 types of modules:

  • Pure modules (written in Python)
  • Extension modules (written in low-level languages such as C/C++)

⚠️ Extension modules are an advanced topic and are not covered in this course. We will focus on pure modules only.

Python Modules

For the purpose of this course, a module is just a file with a .py extension containing Python code.

# arithmetic.py
def add(a, b):
    return a + b

def multiply(a, b):
    return a * b

The name of the module is the name of the file (without the extension), and its content (variables, functions, classes, ...) can be easily accessed by importing it.

⚠️ A module can contain executable statements as well as definitions. These statements are executed on import (only the first time the module is imported).
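The behaviour described above can be checked with a minimal sketch. The module name greeting_demo and its content are hypothetical, written to a temporary directory only for this illustration; the sketch counts how many times the module body runs when the module is imported twice.

```python
import contextlib
import io
import sys
import tempfile
from pathlib import Path

# Write a throwaway module (hypothetical name) whose body prints a message.
module_dir = tempfile.mkdtemp()
Path(module_dir, "greeting_demo.py").write_text(
    'print("executing greeting_demo")\n'
    'MESSAGE = "hello"\n'
)

# Make the temporary directory importable for this process only.
sys.path.insert(0, module_dir)

captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    import greeting_demo  # first import: the module body is executed
    import greeting_demo  # second import: the cached module is reused

# Number of times the module body actually ran.
run_count = captured.getvalue().count("executing greeting_demo")
print(run_count)
```

The second import finds the module already registered in sys.modules, so its statements are not executed again.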

Python Modules - How to import

There are 3 ways to import definitions from a module:

  1. By importing the module
    >>> import arithmetic
    >>> arithmetic.add(3,5)
    8
    >>> arithmetic.multiply(3,5)
    15
    

    This will add the module name to the current namespace, meaning that its content can be accessed by using the module name as a prefix.

    💡 This avoids confusion when two different modules have definitions with the same name!

Python Modules - How to import
  2. By importing some definitions from the module:

    >>> from arithmetic import add, multiply
    >>> add(3,5)
    8
    >>> multiply(3,5)
    15
    

    💡 This adds the selected names to the current namespace. Thus, it avoids repeated long prefixes that might make the code less readable.

    ⚠️ You might hide something you have already defined if it has the same name!

Python Modules - How to import
  3. By importing all the definitions from the module:

    >>> from arithmetic import *
    >>> add(3,5)
    8
    >>> multiply(3,5)
    15
    

    This adds all names except those beginning with an underscore _ to the current namespace.

    ❌ BAD: You have no control over what you are importing and what you might overwrite!
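The trade-offs between the three import styles can be illustrated with a small sketch. It uses the standard library modules math and cmath, which both define a function called sqrt; they are chosen here only as an example and are not part of the original slides. The star import is run inside a scratch dictionary so that the imported names can be inspected.

```python
import math
import cmath

# Style 1: module prefixes keep the two same-named functions apart.
real_root = math.sqrt(4)
complex_root = cmath.sqrt(-4)

# Style 2: the second import silently rebinds the name `sqrt`,
# hiding the version imported from math.
from math import sqrt
from cmath import sqrt
shadowed = sqrt(4)

# Style 3: run a star import inside a scratch namespace and list what arrived.
scratch = {}
exec("from math import *", scratch)
star_names = [name for name in scratch if name != "__builtins__"]

print(real_root, complex_root, shadowed)
print(len(star_names), "names imported by the star import")
```

After the two from-imports, sqrt refers to the cmath version, so the result is a complex number even for a positive argument. None of the names collected by the star import begins with an underscore.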

Python Modules - How to import

It is also possible to import objects or whole modules with alternative names:

>>> from arithmetic import multiply as mul
>>> mul(3,5)
15

💡 This is very handy when importing frequently used modules. In certain cases it is a de facto standard. For example:

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
Python Modules

Modules can import from other modules:

# area.py
from arithmetic import multiply

PI = 3.14

def circle(radius):
    return multiply(PI, multiply(radius, radius))

def square(side):
    return multiply(side, side)

>>> import area
>>> area.circle(2)
12.56
>>> area.square(2)
4
Python Modules

Modules can be executed as scripts. This can be useful for testing purposes.

As with an import, all the statements it contains are executed, but its __name__ attribute is set to "__main__".

💡 It is good practice to enclose the main routine in a guard of the form:

if __name__ == "__main__":

This prevents the code from being executed when the module is imported.

Python Modules

For example we can add to area.py the following:

# area.py
if __name__ == "__main__":
    print(circle(1))

The guarded code runs only when area.py is executed as a script:

$ python3 -c "import area"
$ python3 area.py
3.14
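The same comparison can be reproduced programmatically. The sketch below is only an illustration: it writes a hypothetical module guard_demo.py to a temporary directory, then runs it once through an import and once as a script, each in a fresh interpreter started with subprocess.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical module: one unguarded statement and one guarded main routine.
work_dir = tempfile.mkdtemp()
Path(work_dir, "guard_demo.py").write_text(
    'print("body runs with __name__ =", __name__)\n'
    'if __name__ == "__main__":\n'
    '    print("main routine")\n'
)

# Import the module in a fresh interpreter (equivalent to python3 -c "import guard_demo").
as_import = subprocess.run(
    [sys.executable, "-c", "import guard_demo"],
    cwd=work_dir, capture_output=True, text=True,
)

# Execute the same file as a script (equivalent to python3 guard_demo.py).
as_script = subprocess.run(
    [sys.executable, "guard_demo.py"],
    cwd=work_dir, capture_output=True, text=True,
)

print(as_import.stdout, end="")
print(as_script.stdout, end="")
```

The unguarded statement runs in both cases, while the guarded one runs only in the script execution.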
Python Modules - What can I import?

Python comes equipped with an extensive standard library of modules that are always available. These provide access to system functionality such as file I/O (os, sys, ...) or standardized solutions for many common problems (math, datetime, ...).

💡 For more details consult the Python Standard Library.

On top of the standard library, there is a vast collection of external modules that can be downloaded and installed from a remote repository (topic for tomorrow), or that are already stored on the local file system.

⚠️ Not all the .py files stored on the local file system can be imported directly.

Python Modules - What can I import?

When a module is imported, the interpreter:

  1. searches for a built-in module with that name;
  2. if not found, it searches in:
    • the input script's directory (or the current directory);
    • the directories specified in the environment variable PYTHONPATH;
    • the installation-dependent default locations (by convention including a site-packages directory).
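To see where the interpreter would find a given module, one option is a small helper like the one sketched below. The function name describe_origin is hypothetical; it relies on sys.builtin_module_names and importlib.util.find_spec from the standard library, and the module names queried are only examples.

```python
import importlib.util
import sys


def describe_origin(name):
    """Return where the top-level module `name` would be loaded from."""
    # Built-in modules are compiled into the interpreter itself.
    if name in sys.builtin_module_names:
        return "built-in"
    # Otherwise ask the import machinery to search the import locations.
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    return spec.origin


for module_name in ("sys", "json", "a_module_name_that_should_not_exist"):
    print(module_name, "->", describe_origin(module_name))
```

For a module found on the file system, the reported origin is the path of the file that would be executed on import.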
Python Modules - What can I import?

The complete list of the import locations is accessible via the sys.path variable:

import sys
print(sys.path)
['', '/usr/lib/python310.zip', '/usr/lib/python3.10', ...

⚠️ sys.path can be modified at run time but it is better to use PYTHONPATH instead.

Python Modules - What can I import?

🤔 How to make sure a module can be imported?

  • it can be placed in the same directory as the main Python script;
  • its location can be added to the PYTHONPATH environment variable before running the program;
  • it can be placed in one of the default directories.
Python Modules - What can I import?
$ export PYTHONPATH="/path/to/module/"
$ python3 -c "import sys; print(sys.path)"

['', '/path/to/module', '/usr/lib/python310.zip', '/usr/lib/python3.10', ...

or, to modify it only for a single execution,

$ PYTHONPATH="/path/to/module/" python3 -c "import sys; print(sys.path)"

['', '/path/to/module', '/usr/lib/python310.zip', '/usr/lib/python3.10', ...
Python Modules - What can I import?

💡 PYTHONPATH is a list of pathnames separated by colons (:) on Unix-like systems (semicolons on Windows). When it is not empty, it is recommended to prepend the new locations:

PYTHONPATH="/first/path/:/second/path/:$PYTHONPATH"
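A portable way to prepend a location, sketched here under the assumption that the program is launched from another Python process, is to build the variable with os.pathsep, which holds the separator used by the current platform. The temporary directory stands in for a real module location.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for a directory that contains importable modules.
extra_dir = tempfile.mkdtemp()

# Prepend the new location, keeping any existing PYTHONPATH entries after it.
env = dict(os.environ)
existing = env.get("PYTHONPATH", "")
env["PYTHONPATH"] = extra_dir + (os.pathsep + existing if existing else "")

# Start a fresh interpreter with the modified environment and read its sys.path.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
    env=env, capture_output=True, text=True,
)
child_path = result.stdout.splitlines()
print(child_path)
```

The modified variable affects only the child process; the environment of the calling process is left unchanged.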

Python Packages

A package is a Python module which can contain other modules or, recursively, other packages.

Packages make it possible to organise multiple modules hierarchically using dot notation, such as package.subpackage.module.

⚠️ In this part of the course the term "package" refers to an "import package". This should not be confused with a "distribution package", covered tomorrow. For more details, read this discussion.

Python Packages

A package is just a directory containing a special Python file: the __init__.py.

The content of __init__.py is executed when the package (or any of its submodules or subpackages) is imported.

💡 __init__.py might be empty.

⚠️ From Python version 3.3 it is also possible to define a package by omitting the __init__.py file. This creates a particular kind of package, a namespace package, which has various implications and should be avoided in normal use cases.

Python Packages

__init__.py might contain code to:

  • Define package-level constants or variables (e.g. the version).
    __version__ = "1.2.3"
    
  • Import and expose specific modules or symbols from the package.
    from .module1 import some_function
    
    __all__ = ["some_function"]
    
  • Perform initialization tasks when the package is imported.
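The first two uses can be tried out with a minimal sketch. The package name initdemo_pkg is hypothetical, and the package is created in a temporary directory only for this illustration; module1 and some_function follow the names used in the snippet above.

```python
import sys
import tempfile
from pathlib import Path

# Build a tiny throwaway package (hypothetical name) on disk.
root = Path(tempfile.mkdtemp())
pkg = root / "initdemo_pkg"
pkg.mkdir()
(pkg / "module1.py").write_text(
    "def some_function():\n"
    "    return 'called some_function'\n"
)
(pkg / "__init__.py").write_text(
    '__version__ = "1.2.3"\n'              # package-level constant
    "from .module1 import some_function\n"  # expose a symbol at package level
    '__all__ = ["some_function"]\n'
)

# Make the directory that contains the package importable.
sys.path.insert(0, str(root))
import initdemo_pkg

print(initdemo_pkg.__version__)
print(initdemo_pkg.some_function())
```

Because __init__.py re-exports some_function, users of the package do not need to know that it lives in module1.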
Python Packages

Following the previous example, we organise the two modules area and arithmetic in two subpackages, geometry and maths, of the mypkg package:

mypkg/
├── geometry/
│   ├── __init__.py
│   └── area.py
├── maths/
│   ├── __init__.py
│   └── arithmetic.py
└── __init__.py

💡 The __init__.py file appears in the package directory mypkg and in all the subpackages maths and geometry.

Python Packages

If the directory containing mypkg is in sys.path, objects from arithmetic can be imported with:

from mypkg.maths.arithmetic import multiply

At the first import, the above will:

  1. load the mypkg package and execute its __init__.py;
  2. load the maths subpackage and execute its __init__.py;
  3. load the arithmetic module and execute it;
  4. add multiply to the current namespace.
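The order of the steps above can be observed with a rough sketch that rebuilds part of the mypkg layout in a temporary directory. The print statements placed in each file are an addition for this illustration only.

```python
import contextlib
import io
import sys
import tempfile
from pathlib import Path

# Recreate the maths branch of mypkg; each file announces when it is executed.
root = Path(tempfile.mkdtemp())
files = {
    "mypkg/__init__.py": 'print("mypkg/__init__.py")\n',
    "mypkg/maths/__init__.py": 'print("mypkg/maths/__init__.py")\n',
    "mypkg/maths/arithmetic.py": (
        'print("mypkg/maths/arithmetic.py")\n'
        "def multiply(a, b):\n"
        "    return a * b\n"
    ),
}
for relative_path, source in files.items():
    path = root / relative_path
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(source)

# Make the directory that contains mypkg importable.
sys.path.insert(0, str(root))

log = io.StringIO()
with contextlib.redirect_stdout(log):
    from mypkg.maths.arithmetic import multiply  # first import: files are executed
    from mypkg.maths.arithmetic import multiply  # repeated import: nothing runs again

execution_order = log.getvalue().splitlines()
print(execution_order)
print(multiply(3, 5))
```

Each file is listed exactly once, from the outermost package to the module, which matches the sequence described in the list.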
Python Packages

⚠️ area.py contains intra-package references. Its import statement from arithmetic has to be modified, as this module is now referenced as mypkg.maths.arithmetic.

❌ arithmetic not in sys.path

from arithmetic import multiply

✅ absolute imports to submodules

from mypkg.maths.arithmetic import multiply

✅ relative imports to submodules

from ..maths.arithmetic import multiply

🤔 Relative imports might break when the module is executed as a script.
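This failure mode can be reproduced with a short sketch. It rebuilds a reduced version of mypkg in a temporary directory (area.py contains only the square function plus a main guard, an assumption made to keep the example short) and runs area.py in two ways: directly as a script, and through the interpreter's -m option, which runs a module by its dotted name so that the package context is known.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Reduced copy of the mypkg layout; area.py uses a relative import.
root = Path(tempfile.mkdtemp())
files = {
    "mypkg/__init__.py": "",
    "mypkg/maths/__init__.py": "",
    "mypkg/maths/arithmetic.py": "def multiply(a, b):\n    return a * b\n",
    "mypkg/geometry/__init__.py": "",
    "mypkg/geometry/area.py": (
        "from ..maths.arithmetic import multiply\n"
        "\n"
        "def square(side):\n"
        "    return multiply(side, side)\n"
        "\n"
        'if __name__ == "__main__":\n'
        "    print(square(2))\n"
    ),
}
for relative_path, source in files.items():
    path = root / relative_path
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(source)

# Run the file directly: the interpreter does not know it belongs to a package.
as_script = subprocess.run(
    [sys.executable, str(root / "mypkg" / "geometry" / "area.py")],
    cwd=root, capture_output=True, text=True,
)

# Run it as a module from the directory that contains mypkg.
as_module = subprocess.run(
    [sys.executable, "-m", "mypkg.geometry.area"],
    cwd=root, capture_output=True, text=True,
)

print("script exit code:", as_script.returncode)
print("module output:", as_module.stdout.strip())
```

The direct execution stops with an ImportError raised by the relative import, while the -m execution resolves it and prints the result.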

In Python, definitions can be saved in a file so that they can be used later in a script or in an interactive instance of the interpreter. Such a file is called a module.