Quantlane tech blog

How to start type-checking a large Python codebase

Static typing is a great tool for discovering bugs long before the code reaches production. Using it in new code is an obvious choice for us. However, introducing it into older codebases that predate mypy's maturity was not an easy journey. If you're in the same boat, read on about the strategy we used to cover a lot of critical code with mypy checks.

Sharpen your tools: how to configure mypy before you start

The default mypy configuration is quite lenient. If you don't start with settings slightly stricter than the defaults, you may pick up bad habits that will come back to bite you later.

Ensure full coverage

You want consistent quality of type annotations in the code you're going to check (which won't be the entire big codebase right from the start – more on that below). Adding these settings to mypy.ini will immediately raise the bar:

disallow_untyped_calls = True
disallow_untyped_defs = True
disallow_incomplete_defs = True
disallow_untyped_decorators = True
check_untyped_defs = True

Refer to the "Untyped definitions and calls" section of mypy's documentation for details.
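To make these flags concrete, here is a minimal sketch of the kind of code each one targets. The function names are made up for illustration, and the comments describe what mypy would be expected to report with the settings above.

```python
from typing import List


def untyped(a, b):
    # No annotations at all: disallow_untyped_defs reports this definition.
    # check_untyped_defs makes mypy check the body anyway.
    return a + b


def incomplete(a: int, b):
    # Only partly annotated: disallow_incomplete_defs (and
    # disallow_untyped_defs) report this definition.
    return a + b


def total(prices: List[float]) -> float:
    # Fully annotated: this definition passes the checks above.
    return sum(prices)


def report() -> float:
    # Calling an unannotated function from annotated code is what
    # disallow_untyped_calls reports.
    return untyped(1.5, 2.5)
```

All four functions run fine; the point is that only `total` would pass the stricter configuration.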

Restrict dynamic typing (a little bit)

These options are not strictly necessary and might be difficult to enable in a large codebase. You may give them a try and see how much work it would be to update your code and annotations to conform.

# Reports generics without explicit type parameters, e.g. `x: list`
disallow_any_generics = True
disallow_subclassing_any = True
# Applies to functions not declared to return Any
warn_return_any = True

Refer to the "Disallow dynamic typing" section of mypy's documentation for details and some even tougher options you can try.
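As a rough illustration of two of these options, the sketch below shows code they would flag next to a version that should pass. The function names and the "port" key are hypothetical. disallow_subclassing_any is not shown because it needs a base class imported from an untyped module.

```python
import json
from typing import List


def count_items(items: List) -> int:
    # Bare `List` is implicitly List[Any]; disallow_any_generics reports it.
    return len(items)


def count_names(names: List[str]) -> int:
    # Explicit type parameter: accepted.
    return len(names)


def load_port(raw: str) -> int:
    # json.loads() is annotated as returning Any, so returning its result
    # from a function declared to return int triggers warn_return_any.
    return json.loads(raw)["port"]


def load_port_checked(raw: str) -> int:
    # Narrowing the value at runtime gives mypy a concrete type to return.
    port = json.loads(raw)["port"]
    if not isinstance(port, int):
        raise TypeError("port must be an integer")
    return port
```

The checked variant costs a few extra lines, but it turns a silent Any into a value that is verified both by mypy and at runtime.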

Know exactly what you're doing

These options are a no-brainer. They alert you to cases where you're using mypy features incorrectly, or where mypy is doing something other than what you think it's doing.

warn_redundant_casts = True
warn_unused_ignores = True
warn_unused_configs = True
show_error_codes = True
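A small sketch of what these warnings look like in practice. The functions are hypothetical, and `as_number` is deliberately wrong so that there is a real error to ignore.

```python
from typing import List, cast


def first(values: List[int]) -> int:
    # values[0] is already an int, so this cast does nothing;
    # warn_redundant_casts reports it.
    return cast(int, values[0])


def double(value: int) -> int:
    # There is no error on this line to suppress, so warn_unused_ignores
    # reports the comment as unused.
    return value * 2  # type: ignore


def as_number(text: str) -> int:
    # Deliberately wrong, for illustration: with show_error_codes mypy labels
    # this error [return-value], so the ignore can name that one code
    # instead of silencing everything on the line.
    return text  # type: ignore[return-value]
```

Naming the error code in the ignore comment keeps the suppression narrow: if a different error appears on the same line later, mypy will still report it.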

Example mypy.ini file putting all of this together

This is a template file we use for new projects. Feel free to reuse it or adapt to your needs 🙂

[mypy]
; Ensure full coverage
disallow_untyped_calls = True
disallow_untyped_defs = True
disallow_incomplete_defs = True
disallow_untyped_decorators = True
check_untyped_defs = True

; Restrict dynamic typing
disallow_any_generics = True
disallow_subclassing_any = True
warn_return_any = True

; Know exactly what you're doing
warn_redundant_casts = True
warn_unused_ignores = True
warn_unused_configs = True
warn_unreachable = True
show_error_codes = True

; Explicit is better than implicit
no_implicit_optional = True

[mypy-*.tests.*]
; pytest decorators are not typed
disallow_untyped_decorators = False
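The template also enables two options not discussed above, warn_unreachable and no_implicit_optional. A minimal sketch of what they affect, using made-up function names:

```python
from typing import Optional


def greet(name: Optional[str] = None) -> str:
    # With no_implicit_optional, a None default does not silently turn
    # `name: str` into Optional[str]; the Optional has to be spelled out.
    if name is None:
        return "Hello!"
    return f"Hello, {name}!"


def describe(count: int) -> str:
    if isinstance(count, int):
        return f"{count} items"
    # `count` is declared as int, so mypy considers this line unreachable;
    # warn_unreachable reports it.
    return "unknown"
```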

Divide & conquer

Covering a large codebase with static type checking is a huge amount of work, so you want to break it up into manageable chunks. You will be working on gradual coverage: only small parts of your code will pass mypy checks in the beginning, and you will be expanding that ground over time – when and if you see the benefits justifying the costs.

In practice, we have several 'gradual coverage initiatives' that have been slowly progressing for years, and the cold hard truth is that they will never really be done, because at some point the diminishing returns no longer justify the effort. And this is fine: static typing is not meant to be a perfectionist exercise in dotting the i's and crossing the t's for its own sake. It is one of the tools that should help us find bugs early on, but not the only one. For example, old, self-contained code that doesn't change much and is covered by automated tests will not benefit much from improved static type checks.

That said, adhering to a high standard of typing coverage is easy when writing new code, and we take advantage of that to always improve our understanding and coverage.

The salami method: opt-in by default

Begin with opt-in: only modules explicitly listed in the mypy invocation are checked. You can even start with just one module.

$ mypy models/ lib/cache/ dev/tools/manage.py

Once you get mypy passing on your first small set of modules, defend your progress: add that mypy invocation as a non-optional step to your CI pipeline. That ensures that your small island of correctly typed code doesn't disappear.
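One possible way to wire this into CI is a small wrapper script that keeps the opt-in list in a single place. This is only a sketch: the helper names are hypothetical, and the paths are the ones from the example invocation above.

```python
import subprocess
import sys
from typing import List, Sequence

# The opt-in list from the example invocation above.
CHECKED_PATHS = ["models/", "lib/cache/", "dev/tools/manage.py"]


def build_mypy_command(paths: Sequence[str]) -> List[str]:
    # Run mypy as a module of the current interpreter so the CI job uses
    # the environment the project's dependencies are installed in.
    return [sys.executable, "-m", "mypy", *paths]


def run_type_check(paths: Sequence[str] = CHECKED_PATHS) -> int:
    # mypy exits with a non-zero status when it finds errors; returning
    # that status lets the CI job fail.
    completed = subprocess.run(build_mypy_command(paths))
    return completed.returncode
```

A CI step could then end with `sys.exit(run_type_check())`, and growing coverage means adding a path to `CHECKED_PATHS`.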

Your job is then to gradually grow that list of covered modules. Adding some modules will be a piece of cake best done with your afternoon coffee when you want to kick back and do some easy work. Other modules will require days of concerted effort.

This can even be fun. Once, a group of typing enthusiasts from our developer team organised a mini-mypy-hackathon. We stayed at the office overnight, and apart from eating a lot of pizza and drinking too much beer, we managed to cover a substantial part of our largest application. This included a lot of learning about type theory, and a mypy plugin got written, too!

When getting modules on your opt-in list to pass, you will be dealing with imports of modules that are not yet ready to be checked. You can tell mypy to not worry about those with these options in your mypy.ini:

ignore_missing_imports = True
follow_imports = silent

These broad relaxations can hide a lot of typing issues, so remember to remove them as soon as you can. It is a good idea to read up on how mypy follows imports when you use these options.

Turning the tables: opt-out by default

At some point, you want to switch from 'only explicitly listed modules are checked' to 'everything apart from some explicit exceptions is checked'. Of course, there might be dozens of exceptions in the beginning. While your mypy invocation will look pristine...

$ mypy

...your mypy.ini might contain a huge list of exceptions:

[mypy-lib.math.*]
ignore_errors = True
[mypy-controllers.utils]
ignore_errors = True
...

At this point, you're guaranteed that any newly added modules will be checked by default, which is a big win. With that, you now work to gradually reduce that list of exceptions. Just like with growing the opt-in list, this might be a long-term effort.
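If the initial list of exceptions is long, it can be generated rather than typed by hand. The helper below is a hypothetical sketch that renders one section per module pattern in the format shown above; how you collect the patterns (for example, from a first mypy run) is up to you.

```python
from typing import Iterable


def render_ignore_sections(module_patterns: Iterable[str]) -> str:
    # One [mypy-<pattern>] section per module that is not ready yet,
    # each switching off error reporting for that module only.
    sections = [
        f"[mypy-{pattern}]\nignore_errors = True\n"
        for pattern in sorted(set(module_patterns))
    ]
    return "\n".join(sections)


print(render_ignore_sections(["lib.math.*", "controllers.utils"]))
```

Sorting and de-duplicating the patterns keeps the generated block stable, so removing an exception later shows up as a small, readable diff.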

Including your own packages in the type check

If you write your own Python package and annotate its code, you should know that by default mypy will not use those type annotations when you import the package in other projects. To tell mypy that a package contains valid type hints, you need to add a py.typed marker file (see PEP 561) and make sure it is shipped with the package:

$ touch your_package/py.typed

# setup.py
from setuptools import setup

setup(
        ...,
        package_data = {
                'your_package': ['py.typed'],
        },
        ...,
)

This is simple enough, but you need to do it to benefit from type hints in packages.
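Because the marker file is easy to forget, a quick check can help. The following is a sketch under the assumption that your packages live as directories with an `__init__.py` directly under one source root; the function name is hypothetical.

```python
from pathlib import Path
from typing import List


def packages_missing_marker(source_root: Path) -> List[str]:
    # A top-level package is any directory directly under source_root
    # that contains an __init__.py.
    missing = []
    for init_file in sorted(source_root.glob("*/__init__.py")):
        package_dir = init_file.parent
        if not (package_dir / "py.typed").is_file():
            missing.append(package_dir.name)
    return missing
```

Calling `packages_missing_marker(Path("."))` from the repository root would list the packages that still need a py.typed file. It does not verify that the file is also included in package_data.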

Third-party packages

A growing number of packages have adopted type hints, so mypy will be able to check that you're using them correctly. For those that are still not typed, you have a few options:

  • Write stub files for them. Refer to mypy documentation on stubs.

  • Ignore all untyped third-party packages. This is rather heavy-handed and rarely needed.

    ignore_missing_imports = True
    follow_imports = silent
    
  • Explicitly ignore just those that don't have type hints:

    [mypy-package.to.ignore]
    ignore_missing_imports = True
    follow_imports = silent
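For the first option, a stub file is ordinary Python syntax with the function bodies left out. Here is a minimal sketch of a stub for a hypothetical untyped package called `fastcsv`; the package, its class, and its function are invented for illustration.

```python
# Contents of a hypothetical stubs/fastcsv/__init__.pyi
from typing import Iterator, List


class Reader:
    def __init__(self, path: str, delimiter: str = ...) -> None: ...
    def __iter__(self) -> Iterator[List[str]]: ...
    def close(self) -> None: ...


def open_reader(path: str, delimiter: str = ...) -> Reader: ...
```

Assuming the stubs live in a `stubs` directory next to your code, setting `mypy_path = stubs` in mypy.ini is one way to let mypy find them. You only need to describe the parts of the package you actually use.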
    

More about advanced static typing

You may also be interested in my EuroPython Basel talk, which starts with the topic of this article and continues with some advanced static typing features that may come in handy when annotating complex code.


Slides are available at qntln.github.io/europython2019.

Written by Vita Smid on December 21, 2021.
Vita Smid

Hi, my name is Vita. I co-founded Quantlane together with a few stock traders in 2014. I developed the first version of our Python trading platform from the ground up. After our engineering team started to grow I became Quantlane's CTO. This was quite new for me, as prior to this I had been working as a freelance developer and had a degree in financial mathematics. In 2022, I left the company to seek new adventures.