Script parametrization

Script parametrization

Why parameterize Python scripts?

  • Avoid hardcoding
    • Changing file paths or experiment parameters shouldn’t require code edits
  • Easier automation
    • Run the same script with different inputs or configs
    • Ideal for running various data workflows and CI/CD pipelines
  • Cleaner, more maintainable code
    • Separates logic from configuration
    • Easier to test, debug, and extend the codebase
  • More user-friendly
    • Simple CLI options
    • Non-developers can run scripts too
Script parametrization
  • Parameterizing Python scripts by providing command-line arguments

    python main.py --input_path input.csv --output_path output.csv --verbose
    
  • Various ways to provide command-line arguments

    • Via argparse module
      • Part of Python's standard library, no need to install it
    • Via click package
      • Third-party package, installed via pip
    • Via typer package
      • Third-party package, installed via pip
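As a minimal sketch of what a main.py accepting the example command could look like, the block below uses argparse from the standard library. The helper names build_parser and main, the description text, and the help strings are assumptions made for illustration; only the three option names come from the command shown above.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three options used in the example command."""
    parser = argparse.ArgumentParser(description="Process a CSV file")
    parser.add_argument("--input_path", required=True, help="CSV file to read")
    parser.add_argument("--output_path", required=True, help="CSV file to write")
    # store_true turns --verbose into a boolean flag that defaults to False
    parser.add_argument("--verbose", action="store_true", help="Print progress messages")
    return parser


def main(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    # argv=None makes argparse read the real command line (sys.argv[1:])
    args = build_parser().parse_args(argv)
    if args.verbose:
        print(f"Reading {args.input_path}, writing {args.output_path}")
    # The actual processing logic would go here.
    return args


# Passing an explicit list mimics the command line shown above.
args = main(["--input_path", "input.csv", "--output_path", "output.csv", "--verbose"])
```

Accepting an optional argv list keeps the script easy to call from tests or other Python code, while a plain main() call still reads the real command line.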
Providing CLI arguments via argparse module

Providing CLI arguments via argparse module

  • Advantages:
    • No external dependencies
    • Well suited for simple scripts
  • Disadvantages:
    • Setting up arguments is more verbose than with the alternatives
    • Lacks modern conveniences such as type-hint-based validation, richly formatted help, or shell autocompletion
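With argparse, conversion and validation have to be requested per argument through the type= parameter. The sketch below illustrates that; the --limit option, the positive_int helper, and the example path are hypothetical and not part of the housing CLI used on the following slides.

```python
import argparse
from pathlib import Path


def positive_int(text: str) -> int:
    """Convert text to an int and reject values below 1."""
    value = int(text)  # a ValueError here is reported by argparse as a usage error
    if value < 1:
        raise argparse.ArgumentTypeError(f"{text!r} is not a positive integer")
    return value


parser = argparse.ArgumentParser(description="Type conversion demo")
# type= accepts any callable that turns the raw string into a value
parser.add_argument("--file", type=Path, required=True, help="Path to the dataset file")
parser.add_argument("--limit", type=positive_int, default=10, help="Maximum number of rows")

args = parser.parse_args(["--file", "data/example.csv", "--limit", "25"])
print(args.file.name, args.limit)  # prints: example.csv 25
```

An invalid value such as --limit 0 makes argparse print a usage error and exit with status 2.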
Providing CLI arguments via argparse module

Simple CLI via argparse module

import argparse

from src.dataloaders import SFDataset, LADataset

def run_stats(city: str, file: str):
    dataset = SFDataset.load_data(file) if city == "sf" else LADataset.load_data(file)
    dataset.run_analysis_pipeline()

def main():
    parser = argparse.ArgumentParser(description="Housing CLI")
    parser.add_argument("--city", choices=["sf", "la"], required=True, help="City name")
    parser.add_argument("--file", required=True, help="Path to the dataset file")
    args = parser.parse_args()

    run_stats(args.city, args.file)

if __name__ == "__main__":
    main()
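One possible variation, sketched under assumptions: a lookup table can drive both the accepted --city values and the choice of loader, so the two cannot drift apart. The SFDataset and LADataset classes below are empty stand-ins for the ones in src.dataloaders, which are not shown here, and DATASETS, build_parser and load_dataset are hypothetical names.

```python
import argparse


class SFDataset:
    """Stand-in for src.dataloaders.SFDataset (hypothetical, illustration only)."""

    @classmethod
    def load_data(cls, file: str) -> "SFDataset":
        return cls()


class LADataset:
    """Stand-in for src.dataloaders.LADataset (hypothetical, illustration only)."""

    @classmethod
    def load_data(cls, file: str) -> "LADataset":
        return cls()


# One table drives both the valid --city values and the loader lookup.
DATASETS = {"sf": SFDataset, "la": LADataset}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Housing CLI")
    parser.add_argument("--city", choices=sorted(DATASETS), required=True, help="City name")
    parser.add_argument("--file", required=True, help="Path to the dataset file")
    return parser


def load_dataset(city: str, file: str):
    # An unknown city raises KeyError instead of silently falling back to LA.
    return DATASETS[city].load_data(file)


args = build_parser().parse_args(["--city", "la", "--file", "data/example.csv"])
dataset = load_dataset(args.city, args.file)
print(type(dataset).__name__)  # prints: LADataset
```

Supporting another city would then only require a new entry in the table.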
Providing CLI arguments via argparse module
  • Help page:

    > python main_argparse.py --help
    usage: main_argparse.py [-h] --city {sf,la} --file FILE
    
    Housing CLI
    
    options:
    -h, --help      show this help message and exit
    --city {sf,la}  City name
    --file FILE     Path to the dataset file
    
  • Running our data processing pipeline for "SF" city:

    > python main_argparse.py --city sf \
    --file "data/official_housing_sales_records_SF.csv"
    
    SFDataset Average Price: 6884266.153846154
    - Near main road: 6917845.168539326
    - Furnished: 7205102.05882353
    - Near main road & furnished: 7230625.9701492535
    
Providing CLI arguments via argparse module

CLI with subcommands via argparse module

import argparse

from src.dataloaders import LADataset, SFDataset

def run_stats(city: str, file: str):
    dataset = SFDataset.load_data(file) if city == "sf" else LADataset.load_data(file)
    dataset.run_analysis_pipeline()


def average_filtered_price(city: str, file: str, key: str, value: str):
    dataset = SFDataset.load_data(file) if city == "sf" else LADataset.load_data(file)
    filtered = dataset.filter_data(dataset.records, key, value)
    avg_price = dataset._calculate_average(filtered)
    print(f"Average price for {len(filtered)} records where {key} = {value}: {avg_price}")


def main():
    parser = argparse.ArgumentParser(description="Housing CLI")
    subparsers = parser.add_subparsers(dest="command", required=True)


    stats_parser = subparsers.add_parser("stats", help="Show statistics for the dataset")
    stats_parser.add_argument("--city", choices=["sf", "la"], required=True, help="City name")
    stats_parser.add_argument("--file", required=True, help="Path to the dataset file")


    filter_parser = subparsers.add_parser("average-by-filter",
            help="Compute average house price for records matching a filter",)
    filter_parser.add_argument("--city", choices=["sf", "la"], required=True, help="City name")
    filter_parser.add_argument("--file", required=True, help="Path to the dataset file")
    filter_parser.add_argument("--key", required=True, help="Field to filter by (e.g., 'mainroad')")
    filter_parser.add_argument("--value", required=True, help="Value to match for the given field")


    args = parser.parse_args()

    if args.command == "stats":
        run_stats(args.city, args.file)
    elif args.command == "average-by-filter":
        average_filtered_price(args.city, args.file, args.key, args.value)

if __name__ == "__main__":
    main()
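The repeated --city and --file definitions and the if/elif dispatch can be reduced with two standard argparse features: parent parsers (parents=) and set_defaults(func=...). The following is a sketch only; the handlers are simplified stand-ins that return strings instead of loading data, and their signatures differ from the functions above.

```python
import argparse


def run_stats(args: argparse.Namespace) -> str:
    # Stand-in handler: the real one would load the dataset and run the pipeline.
    return f"stats for {args.city} from {args.file}"


def average_filtered_price(args: argparse.Namespace) -> str:
    # Stand-in handler: the real one would filter the records and average the prices.
    return f"average for {args.city} where {args.key} = {args.value}"


def build_parser() -> argparse.ArgumentParser:
    # Shared options live on a parent parser; add_help=False avoids a duplicate -h.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("--city", choices=["sf", "la"], required=True, help="City name")
    common.add_argument("--file", required=True, help="Path to the dataset file")

    parser = argparse.ArgumentParser(description="Housing CLI")
    subparsers = parser.add_subparsers(dest="command", required=True)

    stats_parser = subparsers.add_parser(
        "stats", parents=[common], help="Show statistics for the dataset")
    stats_parser.set_defaults(func=run_stats)

    filter_parser = subparsers.add_parser(
        "average-by-filter", parents=[common],
        help="Compute average house price for records matching a filter")
    filter_parser.add_argument("--key", required=True, help="Field to filter by")
    filter_parser.add_argument("--value", required=True, help="Value to match")
    filter_parser.set_defaults(func=average_filtered_price)
    return parser


args = build_parser().parse_args([
    "average-by-filter", "--city", "sf", "--file", "data/example.csv",
    "--key", "furnishingstatus", "--value", "furnished",
])
# Each sub-command stored its handler in args.func, so no if/elif chain is needed.
print(args.func(args))  # prints: average for sf where furnishingstatus = furnished
```

Adding a sub-command then means registering one more parser and one more handler, without touching the dispatch code.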
Providing CLI arguments via argparse module
  • General help page:

    > python main_argparse.py --help
    usage: main_argparse.py [-h] {stats,average-by-filter} ...
    
    Housing CLI
    
    positional arguments:
    {stats,average-by-filter}
        stats               Show statistics for the dataset
        average-by-filter   Compute average house price for records matching a filter
    
    options:
    -h, --help            show this help message and exit
    
Providing CLI arguments via argparse module
  • Help for sub-commands needs to be run separately:
    • For stats:

      > python main_argparse.py stats --help
      usage: main_argparse.py stats [-h] --city {sf,la} --file FILE
      
      options:
      -h, --help      show this help message and exit
      --city {sf,la}  City name
      --file FILE     Path to the dataset file
      
    • For average-by-filter:

      > python main_argparse.py average-by-filter --help
      usage: main_argparse.py average-by-filter [-h] --city {sf,la} --file FILE --key KEY --value VALUE
      
      options:
      -h, --help      show this help message and exit
      --city {sf,la}  City name
      --file FILE     Path to the dataset file
      --key KEY       Field to filter by (e.g., 'mainroad')
      --value VALUE   Value to match for the given field
      
Providing CLI arguments via argparse module
  • Running the sub-commands:
    • Running our data processing pipeline for "SF" city:

      > python main_argparse.py stats --city sf \
      --file "data/official_housing_sales_records_SF.csv"
      
      SFDataset Average Price: 6884266.153846154
      - Near main road: 6917845.168539326
      - Furnished: 7205102.05882353
      - Near main road & furnished: 7230625.9701492535
      
    • Getting the average price for the specified city & records matching a filter

      > python main_argparse.py average-by-filter --city sf \
      --file "data/official_housing_sales_records_SF.csv" --key "furnishingstatus" \
      --value "furnished"
      
      Average price for 68 records where furnishingstatus = furnished: 7205102.05882353
      
Providing CLI arguments via click package

Providing CLI arguments via click package

  • Advantages:

    • Decorators for simplicity:
      • Arguments and commands are defined using decorators, reducing boilerplate.
    • Generates rich help outputs.
    • Built-in support for complex argument types (e.g. file paths, enums, ranges)
    • Creation of nested commands is much more straightforward and more scalable than with argparse
  • Disadvantages:

    • External dependency, needs to be installed via pip:
      pip install click
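To illustrate the points about built-in argument types and nested commands, here is a small sketch that assumes click is installed. The report group, the summary command, and the --top option are hypothetical; click.Choice, click.IntRange, and click.testing.CliRunner are part of click itself.

```python
import click
from click.testing import CliRunner


@click.group()
def cli():
    """Housing CLI (hypothetical layout for illustration)."""


@cli.group()
def report():
    """Commands that summarize a dataset."""


@report.command()
@click.option("--city", type=click.Choice(["sf", "la"]), required=True, help="City name")
@click.option("--top", type=click.IntRange(1, 100), default=10, show_default=True,
              help="Number of rows to show")
def summary(city, top):
    """Print which summary would be produced."""
    click.echo(f"Top {top} records for {city}")


# CliRunner invokes the command in-process, which is convenient for tests.
runner = CliRunner()
result = runner.invoke(cli, ["report", "summary", "--city", "sf", "--top", "5"])
print(result.output, end="")  # prints: Top 5 records for sf
```

Groups can be nested by decorating a function with the parent group's group() method, and a value outside the IntRange is rejected by click before the function body runs.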
      
Providing CLI arguments via click package

Simple CLI via click package

import click
from src.dataloaders import SFDataset, LADataset

def run_stats(city: str, file: str):
    dataset = SFDataset.load_data(file) if city == "sf" else LADataset.load_data(file)
    dataset.run_analysis_pipeline()

@click.command(help="Show statistics for the dataset")
@click.option("--city", type=click.Choice(["sf", "la"]), required=True, help="City name: 'sf' or 'la'")
@click.option("--file", type=click.Path(exists=True), required=True, help="Path to the dataset file")
def stats(city, file):
    run_stats(city, file)

if __name__ == "__main__":
    stats()
Providing CLI arguments via click package
  • Help page:

    >  python main_click.py --help
    Usage: main_click.py [OPTIONS]
    
    Show statistics for the dataset
    
    Options:
    --city [sf|la]  City name: 'sf' or 'la'  [required]
    --file PATH     Path to the dataset file  [required]
    --help          Show this message and exit.
    
  • Running our data processing pipeline for "SF" city:

    > python main_click.py --city sf \
    --file "data/official_housing_sales_records_SF.csv"
    
    SFDataset Average Price: 6884266.153846154
    - Near main road: 6917845.168539326
    - Furnished: 7205102.05882353
    - Near main road & furnished: 7230625.9701492535
    
Providing CLI arguments via click package

CLI with subcommands via click package

import click
from src.dataloaders import SFDataset, LADataset

def run_stats(city: str, file: str):
    dataset = SFDataset.load_data(file) if city == "sf" else LADataset.load_data(file)
    dataset.run_analysis_pipeline()

def average_filtered_price(city: str, file: str, key: str, value: str):
    dataset = SFDataset.load_data(file) if city == "sf" else LADataset.load_data(file)
    filtered = dataset.filter_data(dataset.records, key, value)
    avg_price = dataset._calculate_average(filtered)
    print(f"Average price for {len(filtered)} records where {key} = {value}: {avg_price}")

@click.group()
def cli():
    pass

@cli.command(help="Show statistics for the dataset")
@click.option("--city", type=click.Choice(["sf", "la"]), required=True, help="City name")
@click.option("--file", type=click.Path(exists=True), required=True, help="Path to the dataset file")
def stats(city, file):
    run_stats(city, file)

@cli.command("average-by-filter", help="Compute average house price for records matching a filter")
@click.option("--city", type=click.Choice(["sf", "la"]), required=True, help="City name")
@click.option("--file", type=click.Path(exists=True), required=True, help="Path to the dataset file")
@click.option("--key", required=True, help="Field to filter by (e.g., 'mainroad')")
@click.option("--value", required=True, help="Value to match for the given field")
# Named average_by_filter to avoid shadowing the built-in filter()
def average_by_filter(city, file, key, value):
    average_filtered_price(city, file, key, value)

if __name__ == "__main__":
    cli()
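Both commands above repeat the --city and --file options. One common way to share them, sketched here under assumptions, is a small helper decorator that applies click.option for each shared option. The name dataset_options is hypothetical, the command bodies only echo their inputs, and click.Path() is used without exists=True so the example runs without a data file.

```python
import click
from click.testing import CliRunner


def dataset_options(func):
    """Attach the shared --city and --file options to a command function."""
    # Applied in reverse so that --city is attached last, as with stacked decorators.
    func = click.option("--file", type=click.Path(), required=True,
                        help="Path to the dataset file")(func)
    func = click.option("--city", type=click.Choice(["sf", "la"]), required=True,
                        help="City name")(func)
    return func


@click.group()
def cli():
    """Housing CLI."""


@cli.command(help="Show statistics for the dataset")
@dataset_options
def stats(city, file):
    click.echo(f"stats: city={city} file={file}")


@cli.command("average-by-filter",
             help="Compute average house price for records matching a filter")
@dataset_options
@click.option("--key", required=True, help="Field to filter by")
@click.option("--value", required=True, help="Value to match for the given field")
def average_by_filter(city, file, key, value):
    click.echo(f"average: city={city} {key}={value}")


runner = CliRunner()
result = runner.invoke(cli, ["stats", "--city", "la", "--file", "data/example.csv"])
print(result.output, end="")  # prints: stats: city=la file=data/example.csv
```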
Providing CLI arguments via click package

  • argparse module
import argparse

# run_stats and average_filtered_price are defined as on the previous slides

def main():
    parser = argparse.ArgumentParser(description="Housing CLI")
    subparsers = parser.add_subparsers(dest="command", required=True)

    stats_parser = subparsers.add_parser("stats", help="Show ..")
    stats_parser.add_argument("--city", choices=["sf", "la"])
    stats_parser.add_argument("--file")

    filter_parser = subparsers.add_parser("average-by-filter", help="Compute ...")
    filter_parser.add_argument("--city", choices=["sf", "la"])
    filter_parser.add_argument("--file")
    filter_parser.add_argument("--key")
    filter_parser.add_argument("--value")

    args = parser.parse_args()

    if args.command == "stats":
        run_stats(args.city, args.file)
    elif args.command == "average-by-filter":
        average_filtered_price(args.city, args.file, args.key, args.value)

if __name__ == "__main__":
    main()

  • click package
import click

# run_stats and average_filtered_price are defined as on the previous slides

@click.group()
def cli():
    pass

@cli.command(help="Show ..")
@click.option("--city", type=click.Choice(["sf", "la"]))
@click.option("--file", type=click.Path(exists=True))
def stats(city, file):
    run_stats(city, file)

@cli.command("average-by-filter", help="Compute ..")
@click.option("--city", type=click.Choice(["sf", "la"]))
@click.option("--file", type=click.Path(exists=True))
@click.option("--key")
@click.option("--value")
def average_by_filter(city, file, key, value):
    average_filtered_price(city, file, key, value)

if __name__ == "__main__":
    cli()
Providing CLI arguments via click package
  • General help page:
> python main_click.py --help
Usage: main_click.py [OPTIONS] COMMAND [ARGS]...

Options:
--help  Show this message and exit.

Commands:
average-by-filter  Compute average house price for records matching a filter
stats              Show statistics for the dataset
Providing CLI arguments via click package

General help page:

  • argparse module
> python main_argparse.py --help
usage: main_argparse.py [-h] {stats,average-by-filter} ...

Housing CLI

positional arguments:
  {stats,average-by-filter}
    stats               Show statistics for the dataset
    average-by-filter   Compute average house price for records
                        matching a filter

options:
  -h, --help            show this help message and exit
  • click package
> python main_click.py --help
Usage: main_click.py [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  average-by-filter  Compute average house price for...
  stats              Show statistics for the dataset
Providing CLI arguments via click package
  • Help for sub-commands needs to be run separately:
    • For stats:

      > python main_click.py stats --help
      Usage: main_click.py stats [OPTIONS]
      
      Show statistics for the dataset
      
      Options:
      --city [sf|la]  City name  [required]
      --file PATH     Path to the dataset file  [required]
      --help          Show this message and exit.
      
    • For average-by-filter:

      > python main_click.py average-by-filter --help
      Usage: main_click.py average-by-filter [OPTIONS]
      
          Compute average house price for records matching a filter
      
      Options:
      --city [sf|la]  City name  [required]
      --file PATH     Path to the dataset file  [required]
      --key TEXT      Field to filter by (e.g., 'mainroad')  [required]
      --value TEXT    Value to match for the given field  [required]
      --help          Show this message and exit.
      
Providing CLI arguments via click package

Help for sub-commands needs to be run separately:

argparse module

> python main_argparse.py average-by-filter --help
usage: main_argparse.py average-by-filter [-h] --city {sf,la} --file FILE
                                          --key KEY --value VALUE

options:
  -h, --help      show this help message and exit
  --city {sf,la}  City name
  --file FILE     Path to the dataset file
  --key KEY       Field to filter by (e.g., 'mainroad')
  --value VALUE   Value to match for the given field

click package

> python main_click.py average-by-filter --help
Usage: main_click.py average-by-filter [OPTIONS]

  Compute average house price for records matching a filter

Options:
  --city [sf|la]  City name  [required]
  --file PATH     Path to the dataset file  [required]
  --key TEXT      Field to filter by (e.g., 'mainroad')  [required]
  --value TEXT    Value to match for the given field  [required]
  --help          Show this message and exit.
Providing CLI arguments via click package
  • Running the sub-commands:
    • Running our data processing pipeline for "SF" city:

      > python main_click.py stats --city sf \
      --file "data/official_housing_sales_records_SF.csv"
      
      SFDataset Average Price: 6884266.153846154
      - Near main road: 6917845.168539326
      - Furnished: 7205102.05882353
      - Near main road & furnished: 7230625.9701492535
      
    • Getting the average price for the specified city & records matching a filter

      > python main_click.py average-by-filter --city sf \
      --file "data/official_housing_sales_records_SF.csv" --key "furnishingstatus" \
      --value "furnished"
      
      Average price for 68 records where furnishingstatus = furnished: 7205102.05882353
      
Providing CLI arguments via typer package

Providing CLI arguments via typer package

  • typer is built on top of click

  • Advantages:

    • Type hint integration:
      • Uses Python's type hints for automatic validation and help message generation.
    • Minimizes code duplication.
    • Provides out-of-the-box shell autocompletion.
  • Disadvantages:

    • External dependency, needs to be installed via pip:
      pip install typer
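As a sketch of the type-hint integration mentioned above, assuming typer is installed: the annotations on the function parameters decide how each option is parsed and validated. The --limit and --verbose options are hypothetical additions for illustration, and the command body only echoes its inputs.

```python
from enum import Enum

import typer
from typer.testing import CliRunner

app = typer.Typer()


class City(str, Enum):
    sf = "sf"
    la = "la"


@app.command()
def stats(
    city: City = typer.Option(..., help="City name"),
    # The int annotation makes typer convert the value; min=1 rejects smaller numbers.
    limit: int = typer.Option(10, min=1, help="Maximum number of rows"),
    # A bool annotation produces a --verbose / --no-verbose flag pair.
    verbose: bool = typer.Option(False, help="Print extra details"),
):
    typer.echo(f"city={city.value} limit={limit} verbose={verbose}")


# With a single command, typer runs it directly without a sub-command name.
runner = CliRunner()
result = runner.invoke(app, ["--city", "sf", "--limit", "3", "--verbose"])
print(result.output, end="")
```

A value that does not match its annotation, such as an unknown city or a non-numeric limit, is rejected with a usage error before the function body runs.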
      
Providing CLI arguments via typer package

Simple CLI via typer package

import typer
from enum import Enum
from pathlib import Path
from src.dataloaders import SFDataset, LADataset

app = typer.Typer()

class City(str, Enum):
    sf = "sf"
    la = "la"

def run_stats(city: str, file: str):
    dataset = SFDataset.load_data(file) if city == "sf" else LADataset.load_data(file)
    dataset.run_analysis_pipeline()

@app.command(help="Show statistics for the dataset")
def stats(
    city: City = typer.Option(..., help="City name: 'sf' or 'la'"),
    file: Path = typer.Option(..., exists=True, readable=True, help="Path to the dataset file")
):
    run_stats(city.value, str(file))


if __name__ == "__main__":
    app()
Providing CLI arguments via typer package
  • Help page:

    >  python main_typer.py --help
    Usage: main_typer.py [OPTIONS]
    
    Show statistics for the dataset
    
    
    ╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────╮
    │ *  --city                      [sf|la]  City name: 'sf' or 'la' [default: None] [required]                   │
    │ *  --file                      PATH     Path to the dataset file [default: None] [required]                  │
    │    --install-completion                 Install completion for the current shell.                            │
    │    --show-completion                    Show completion for the current shell, to copy it or customize the   │
    │                                         installation.                                                        │
    │    --help                               Show this message and exit.                                          │
    ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
    
    
  • Running our data processing pipeline for "SF" city:

    > python main_typer.py --city sf \
    --file "data/official_housing_sales_records_SF.csv"
    
    SFDataset Average Price: 6884266.153846154
    - Near main road: 6917845.168539326
    - Furnished: 7205102.05882353
    - Near main road & furnished: 7230625.9701492535
    
Providing CLI arguments via typer package

CLI with subcommands via typer package

from enum import Enum
from pathlib import Path

import typer
from src.dataloaders import LADataset, SFDataset

app = typer.Typer()


class City(str, Enum):
    sf = "sf"
    la = "la"


def run_stats(city: str, file: str):
    dataset = SFDataset.load_data(file) if city == "sf" else LADataset.load_data(file)
    dataset.run_analysis_pipeline()


def average_filtered_price(city: str, file: str, key: str, value: str):
    dataset = SFDataset.load_data(file) if city == "sf" else LADataset.load_data(file)
    filtered = dataset.filter_data(dataset.records, key, value)
    avg_price = dataset._calculate_average(filtered)
    print(f"Average price for {len(filtered)} records where {key} = {value}: {avg_price}")


@app.command(help="Show statistics for the dataset")
def stats(
    city: City = typer.Option(..., help="City name: 'sf' or 'la'"),
    file: Path = typer.Option(..., exists=True, readable=True, help="Path to the dataset file"),
):
    run_stats(city.value, str(file))


@app.command(help="Compute average house price for records matching a filter")
def average_by_filter(
    city: City = typer.Option(..., help="City name: 'sf' or 'la'"),
    file: Path = typer.Option(..., exists=True, readable=True, help="Path to the dataset file"),
    key: str = typer.Option(..., help="Field to filter by (e.g., 'mainroad')"),
    value: str = typer.Option(..., help="Value to match for the given field"),
):
    average_filtered_price(city.value, str(file), key, value)


if __name__ == "__main__":
    app()
Providing CLI arguments via typer package

  • click package
import click

# run_stats and average_filtered_price are defined as on the previous slides

@click.group()
def cli():
    pass

@cli.command(help="Show ..")
@click.option("--city", type=click.Choice(["sf", "la"]))
@click.option("--file", type=click.Path(exists=True))
def stats(city, file):
    run_stats(city, file)

@cli.command("average-by-filter", help="Compute ..")
@click.option("--city", type=click.Choice(["sf", "la"]))
@click.option("--file", type=click.Path(exists=True))
@click.option("--key")
@click.option("--value")
def average_by_filter(city, file, key, value):
    average_filtered_price(city, file, key, value)

if __name__ == "__main__":
    cli()


  • typer package
import typer
from enum import Enum
from pathlib import Path

# run_stats and average_filtered_price are defined as on the previous slides

app = typer.Typer()

class City(str, Enum):
    sf = "sf"
    la = "la"

@app.command(help="Show ...")
def stats(
    city: City = typer.Option(...),
    file: Path = typer.Option(..., exists=True, readable=True),
):
    run_stats(city.value, str(file))

@app.command(help="Compute ...")
def average_by_filter(
    city: City = typer.Option(...),
    file: Path = typer.Option(..., exists=True, readable=True),
    key: str = typer.Option(...),
    value: str = typer.Option(...),
):
    average_filtered_price(city.value, str(file), key, value)

if __name__ == "__main__":
    app()


Providing CLI arguments via typer package
  • General help page:

    >  python main_typer.py --help
    
    Usage: main_typer.py [OPTIONS] COMMAND [ARGS]...
    
    ╭─ Options ──────────────────────────────────────────────────────────────────────────────────────╮
    │ --install-completion          Install completion for the current shell.                        │
    │ --show-completion             Show completion for the current shell, to copy it or customize   │
    │                               the installation.                                                │
    │ --help                        Show this message and exit.                                      │
    ╰────────────────────────────────────────────────────────────────────────────────────────────────╯
    ╭─ Commands ─────────────────────────────────────────────────────────────────────────────────────╮
    │ stats               Show statistics for the dataset                                            │
    │ average-by-filter   Compute average house price for records matching a filter                  │
    ╰────────────────────────────────────────────────────────────────────────────────────────────────╯
    
    
Providing CLI arguments via typer package

General help page:

  • click package
> python main_click.py --help
Usage: main_click.py [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  average-by-filter  Compute average house price for records...
  stats              Show statistics for the dataset
  • typer package
>  python main_typer.py --help

 Usage: main_typer.py [OPTIONS] COMMAND [ARGS]...

╭─ Options ─────────────────────────────────────────────────────────╮
│ --install-completion          Install completion for the current  │
│                               shell.                              │
│ --show-completion             Show completion for the current     │
│                               shell, to copy it or customize the  │
│                               installation.                       │
│ --help                        Show this message and exit.         │
╰───────────────────────────────────────────────────────────────────╯
╭─ Commands ────────────────────────────────────────────────────────╮
│ stats               Show statistics for the dataset               │
│ average-by-filter   Compute average house price for records       │
│                     matching a filter                             │
╰───────────────────────────────────────────────────────────────────╯

Providing CLI arguments via typer package
  • Help for sub-commands needs to be run separately:
    • For stats:

      > python main_typer.py stats --help
      Usage: main_typer.py stats [OPTIONS]
      
      Show statistics for the dataset
      
      ╭─ Options ─────────────────────────────────────────────────────────────────────╮
      │ *  --city        [sf|la]  City name: 'sf' or 'la' [default: None] [required]  │
      │ *  --file        PATH     Path to the dataset file [default: None] [required] │
      │    --help                 Show this message and exit.                         │
      ╰───────────────────────────────────────────────────────────────────────────────╯
      
    • For average-by-filter:

      > python main_typer.py average-by-filter --help
      Usage: main_typer.py average-by-filter [OPTIONS]
      
      Compute average house price for records matching a filter
      
      ╭─ Options ────────────────────────────────────────────────────────────────────────────────────╮
      │ *  --city         [sf|la]  City name: 'sf' or 'la' [default: None] [required]                │
      │ *  --file         PATH     Path to the dataset file [default: None] [required]               │
      │ *  --key          TEXT     Field to filter by (e.g., 'mainroad') [default: None] [required]  │
      │ *  --value        TEXT     Value to match for the given field [default: None] [required]     │
      │    --help                  Show this message and exit.                                       │
      ╰──────────────────────────────────────────────────────────────────────────────────────────────╯
      
Providing CLI arguments via typer package

Help for sub-commands:

  • click package
> python main_click.py average-by-filter --help
Usage: main_click.py average-by-filter [OPTIONS]

    Compute average house price for records matching a filter

Options:
--city [sf|la]  City name  [required]
--file PATH     Path to the dataset file  [required]
--key TEXT      Field to filter by (e.g., 'mainroad')  [required]
--value TEXT    Value to match for the given field  [required]
--help          Show this message and exit.
  • typer package
> python main_typer.py average-by-filter --help
Usage: main_typer.py average-by-filter [OPTIONS]

Compute average house price for records matching a filter


╭─ Options ─────────────────────────────────────────────────────────╮
│ *  --city         [sf|la]  City name: 'sf' or 'la'                │
│                            [default: None]                        │
│                            [required]                             │
│ *  --file         PATH     Path to the dataset file               │
│                            [default: None]                        │
│                            [required]                             │
│ *  --key          TEXT     Field to filter by (e.g., 'mainroad')  │
│                            [default: None]                        │
│                            [required]                             │
│ *  --value        TEXT     Value to match for the given field     │
│                            [default: None]                        │
│                            [required]                             │
│    --help                  Show this message and exit.            │
╰───────────────────────────────────────────────────────────────────╯
Providing CLI arguments via typer package
  • Running the sub-commands:
    • Running our data processing pipeline for "SF" city:

      > python main_typer.py stats --city sf \
      --file "data/official_housing_sales_records_SF.csv"
      
      SFDataset Average Price: 6884266.153846154
      - Near main road: 6917845.168539326
      - Furnished: 7205102.05882353
      - Near main road & furnished: 7230625.9701492535
      
    • Getting the average price for the specified city & records matching a filter

      > python main_typer.py average-by-filter --city sf \
      --file "data/official_housing_sales_records_SF.csv" --key "furnishingstatus" \
      --value "furnished"
      
      Average price for 68 records where furnishingstatus = furnished: 7205102.05882353
      
Conclusions / Take‑Home Messages

Take‑Home Messages

  • Parameterize scripts

    • CLI‑driven workflows → automation, reproducibility & easy integration with pipelines.
    • Separate logic from configuration → no hard‑coded paths/parameters
    • Provide clear help strings & validation.
  • CLI libraries

    • argparse: built‑in, minimal, more boilerplate.
    • click: decorator style, richer help, external dependency.
    • typer: type‑hint integration, auto‑completion, built on click.
  • Which CLI library to use

    • Small, dependency‑free scripts → argparse
    • Growing CLIs with sub‑commands → click
    • Modern, type‑safe tools → typer

</div>
