Python Packaging (2024)

Contents: sysconfig, Project Structure, Setup Script, Distribution Archives, Distribution

The process of bundling Python code into a format that eases distribution and sharing.

sysconfig

First up, I find it helps to get a concrete understanding of how the specific Python distribution I’m working with is configured. Of particular interest are the various system paths that will be searched for package dependencies. The built-in sysconfig module neatly manages and surfaces this information.

    python -m sysconfig

On a Windows system: ...
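The same information that python -m sysconfig prints can also be read from code. The following is a minimal sketch: sysconfig.get_paths(), sysconfig.get_platform() and sysconfig.get_python_version() are standard library calls, while the helper name install_paths is only an illustration and not part of the original post.

```python
import sysconfig


def install_paths() -> dict[str, str]:
    """Return the install paths of the default scheme for this interpreter."""
    return sysconfig.get_paths()


if __name__ == "__main__":
    print("platform:", sysconfig.get_platform())
    print("python version:", sysconfig.get_python_version())
    # Keys include 'stdlib', 'purelib', 'platlib', 'scripts' and 'data'.
    for name, path in sorted(install_paths().items()):
        print(f"{name:12} {path}")
```

The purelib and platlib entries are the site-packages locations that installers such as pip normally target, so they are usually the first thing to check when a dependency cannot be found.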

December 29, 2023 · 3 min

Python PR Checklist

A modified version of the excellent original checklist by Paul Wolf.

General

- Code is blackened with black
- ruff has been run with no errors
- mypy has been run with no errors
- Function complexity problems have been resolved using the default complexity index of flake8.
- Important core code can be loaded easily in IPython or ipdb.
- There is no dead code
- Comprehensions or generator expressions are used in place of for loops where appropriate
- Comprehensions and generator expressions produce state but they do not have side effects within the expression.
- Use zip(), any(), all(), filter(), etc. instead of for loops where appropriate
- Functions that take mutable variables as parameters and mutate them don’t return these variables. They return None.
- Return immutable copies of mutable types instead of mutating the instances themselves when mutable types are passed as parameters with the intention of returning a mutated version of that variable.
- Avoid method cascading on objects with methods that return self.
- Function and method parameters never use empty collection or sequence instances like list [] or dict {} as defaults. Instead they must use None to indicate missing input
- Variables in a function body are initialised with empty sequences or collections by callables, list(), dict(), instead of [], {}, etc.
- Always use the Final type hint for class instance parameters that will not change.
- Context-dependent variables are not unnecessarily passed between functions or methods
- View functions either implement the business rules the view is responsible for, or they pass data downstream to have this done by services and receive non-context-dependent data back.
- View functions don’t pass request to called functions
- Functions, including class methods, don’t have too many local parameters or instance variables. In particular, a class’ __init__() should not have too many parameters.
- Profiling code is minimal
- Logging is the minimum required for production use
- There are no home-brewed solutions for things that already exist in the PSL (Python standard library)

n00b habits

- A bare except clause. Python uses exceptions such as KeyboardInterrupt and SystemExit to signal interrupts and interpreter exit, and a bare except swallows them. Don’t do this.
- Mutable default arguments such as def foo(bar=[]). Defaults are evaluated when the function is defined, not when it is run, which results in all function calls sharing the same instance of bar.
- Checking types for equality using ==, for example type(p) == tuple. Due to inheritance this is not desirable, as it pins to a concrete type and excludes its descendants. In other words, it breaks the Liskov substitution principle. Instead use isinstance(p, tuple).
- Explicit bool or length checks, such as if bool(x) or if len(x) > 0, are redundant, as Python has sane truthy evaluation.
- Use of range over the for in idiom. If you really need the index, always use enumerate.
- Not using items() on a dict: for k, v in dict.items()
- Using time.time to measure code performance. Use time.perf_counter instead.
- Using print statements over logging
- Using import * will normally litter the namespace with variables. Don’t be lazy, be specific: from itertools import count

Imports and modules

- Imports are sorted by isort or according to some standard that is consistent within the team
- Import packages or modules to qualify the use of functions or classes so that unqualified function calls can be assumed to be to functions in the current module

Documentation

- Modules have docstrings
- Classes have docstrings unless their purpose is immediately obvious
- Methods and functions have docstrings
- Comments and docstrings add non-obvious and helpful information that is not already present in the naming of functions and variables

General Complexity

- Functions are as complex as they need to be but no more (as defined by flake8’s default complexity threshold)
- Classes have only as many methods as required and have a simple hierarchy

Context Freedom

- All important functionality can be loaded easily in ipython without having to construct dummy requests, etc.
- All important functionality can be loaded in pdb (or a variant, ipdb, etc.)

Types

- Use immutable types (tuple, frozenset, Enum, etc.) over mutable types whenever possible

Functions

- Functions are pure wherever possible, i.e. they take input and provide a return value with no side-effects or reliance on hidden state.

Modules

- Module level variables do not take context-dependent values like connection clients to remote systems unless the client is used immediately for another module level variable and not used again

Classes

- Every class has a single well-defined purpose. That is, the class does not mix up different tasks, like remote state acquisition, web sockets notification, data formatting, etc.
- Classes manage state and do not just represent the encapsulation of behaviour
- All methods access either cls or self in the body. If a method does not access cls or self, it should be a function at module level.
- @classmethod is used in preference to @staticmethod, but only if the method body accesses cls; otherwise the method should be a module level function.
- Constants are declared at module level, not in methods or at class level
- Constants are always upper case
- Abstract classes are derived from abc: from abc import ABC
- Abstract methods use the @abstractmethod decorator
- Abstract class properties use both @abstractmethod and @property decorators
- Classes do not use multiple inheritance
- Classes do not use mixins (use composition instead) except in rare cases
- Class names do not use the word “Base” to signal they are the single ancestor, like “BaseWhatever”
- Decorators are not used to replace classes as a design pattern
- __init__() does not define too many local variables. Use the Parameter Consolidation pattern instead.
- A factory class or function at module level is used for complex class construction (see Design Patterns) to achieve composition
- Classes are not dynamically created from strings except where forward reference requires this

Design Patterns

- Do not use designs that cause a typical Python developer to have to learn new semantics that are unexpected in Python
- Classes primarily use composition in preference to inheritance
- Beyond a very small number of simple variables, a class’ purpose is to acquire state for another class, or it uses another class to acquire state, in particular if the state is from a remote service.
- If you use the Context Parameter pattern, it is critical that the state of the context does not change after calling its __init__(), i.e. it should be immutable
- If a class’ purpose is to represent an external integration, you probably want numerous classes to compose the service: RemoteDataClient, DomainManager, ContextManager, Factory, NotificationController, DomainResponse, DataFormatter and so on.
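Two of the checklist items, avoiding mutable default arguments and returning a copy rather than mutating an input, can be sketched as follows. The function names append_shared and with_item are made up for illustration and do not come from the original checklist.

```python
def append_shared(item, bucket=[]):
    """Anti-pattern: the default list is created once, when the function is defined."""
    bucket.append(item)
    return bucket


def with_item(item, items=None):
    """Preferred: None marks missing input and the caller's data is never mutated.

    Returns a new tuple holding the existing items followed by `item`.
    """
    existing = tuple(items) if items is not None else tuple()
    return existing + (item,)


if __name__ == "__main__":
    first = append_shared(1)
    second = append_shared(2)
    # Both names point at the single default list shared by every call.
    print(first is second)

    source = [1]
    print(with_item(2, source), source)  # source is left untouched
```

Returning a tuple is one way to satisfy the "return immutable copies" rule; a frozenset or a frozen dataclass would serve the same purpose for other shapes of data.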

October 30, 2023 · 5 min

Async Python

Background

Using asyncio will not make your code multi-threaded. That is, it will not cause multiple Python instructions to be executed at the same time, and it will not in any way allow you to sidestep the so-called “global interpreter lock” (GIL). Some processes are CPU-bound: they consist of a series of instructions which need to be executed one after another until the result has been computed. Most of their time is spent making heavy use of the processor. ...
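The claim that asyncio gives concurrency without extra threads can be checked with a small experiment. This is a sketch under stated assumptions: asyncio.sleep stands in for real I/O, and the coroutine name fake_io is hypothetical.

```python
import asyncio
import threading
import time


async def fake_io(name: str, delay: float) -> tuple[str, int]:
    """Pretend to wait on I/O, then report which thread resumed the coroutine."""
    await asyncio.sleep(delay)
    return name, threading.get_ident()


async def main() -> tuple[list[tuple[str, int]], float]:
    start = time.perf_counter()
    # The three waits overlap, but all of them are driven by one event loop.
    results = await asyncio.gather(
        fake_io("a", 0.2), fake_io("b", 0.2), fake_io("c", 0.2)
    )
    return list(results), time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    thread_ids = {thread_id for _, thread_id in results}
    print(f"{len(thread_ids)} thread(s), {elapsed:.2f}s elapsed")
```

The waits overlap because each coroutine hands control back to the event loop while it sleeps. Replacing asyncio.sleep with a CPU-heavy loop would make the tasks run one after another, since nothing would yield to the loop.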

August 9, 2023 · 9 min

Python Type Annotations

Start with the docs and the Type hints cheat sheet.

Topics for consideration:

- syntax shorthands, e.g. | for Union or Optional
- Self
- If you are using the typing library, there is an abstract type class provided for asynchronous context managers, AsyncContextManager[T], where T is the type of the object which will be bound by the as clause of the async with statement.
- mypy
- If you are using typing, there is an abstract class Awaitable which is generic, so that Awaitable[R] for some type R means anything which is awaitable, and when used in an await statement will return something of type R. ...
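A short sketch may help tie these annotations together. All function names here (fetch_number, double, open_resource, use, describe) are invented for illustration. typing.Awaitable and typing.AsyncContextManager are the names the text refers to; newer Python versions also expose them as collections.abc.Awaitable and contextlib.AbstractAsyncContextManager.

```python
from __future__ import annotations

import asyncio
from contextlib import asynccontextmanager
from typing import AsyncContextManager, AsyncIterator, Awaitable


async def fetch_number() -> int:
    return 42


async def double(value: Awaitable[int]) -> int:
    """Accept anything awaitable that yields an int when awaited."""
    return 2 * await value


@asynccontextmanager
async def open_resource(name: str) -> AsyncIterator[str]:
    # The yielded str is what the `as` clause of `async with` binds.
    yield f"resource:{name}"


async def use(manager: AsyncContextManager[str]) -> str:
    async with manager as handle:
        return handle


def describe(value: int | None) -> str:
    """`int | None` is the shorthand for Optional[int]."""
    return "missing" if value is None else str(value)


if __name__ == "__main__":
    print(asyncio.run(double(fetch_number())))
    print(asyncio.run(use(open_resource("db"))))
    print(describe(None))
```

A coroutine object such as fetch_number() satisfies Awaitable[int], which is why it can be passed straight into double without being awaited first.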

August 9, 2023 · 1 min

Python Standard Libraries

An important part of becoming “good” at a language is becoming familiar with its library ecosystem. The official Python Standard Library reference manual rocks.

| Module | Category | Description |
| --- | --- | --- |
| argparse | | functions for parsing command line arguments |
| atexit | | allows you to register functions for your program to call when it exits |
| bisect | | bisection algorithms for searching and inserting into sorted lists |
| calendar | | a number of date-related functions |
| codecs | | functions for encoding and decoding data |
| collections | | a variety of useful data structures |
| concurrent | | asynchronous computation |
| copy | | functions for copying data |
| csv | | functions for reading and writing CSV files |
| datetime | | classes for handling dates and times |
| fileinput | file access | iterate over lines from multiple files or input streams |
| fnmatch | | functions for matching Unix-style filename patterns |
| glob | | functions for matching Unix-style path patterns |
| io | | functions for handling I/O streams, and StringIO, which allows you to treat strings as files |
| json | | functions for reading and writing data in JSON format |
| logging | | access to Python’s own built-in logging functionality |
| multiprocessing | | allows you to run multiple subprocesses, while providing an API that makes them look like threads |
| operator | | functions implementing the basic Python operators, to use instead of writing your own lambda expressions |
| os | swiss army knife | access to basic OS functions |
| pprint | data types | data pretty printer |
| random | | functions for generating pseudorandom numbers |
| re | | regular expression functionality |
| sched | | an event scheduler without using multithreading |
| select | | access to the select() and poll() functions for creating event loops |
| shutil | file access | access to high-level file functions |
| signal | | functions for handling POSIX signals |
| tempfile | file access | functions for creating temporary files and directories |
| threading | | access to high-level threading functionality |
| urllib | | provides functions for handling and parsing URLs |
| uuid | | allows you to generate Universally Unique Identifiers (UUIDs) |

August 4, 2023 · 2 min

Objects in Python

Contents: Special methods (dunders): Foundational, Iterators, Comparable classes, Serializable classes, Classes with computed attributes, Classes that are callable, Classes that act like sets, Classes that act like dictionaries, Classes that act like numbers, Classes that can be used in a with block, Esoteric behavior; Design Patterns

As I learn more about Python’s idioms, I reflect on its unique approach to object-based programming. In combination with duck typing, its approach to objects feels disturbingly flexible. ...

August 3, 2023 · 9 min

Testing in Python

There are many ways to write unit tests in Python.

unittest

Here the focus is living off the land with built-in unittest. unittest is both a framework and test runner, meaning it can execute your tests and return the results. In order to write unittest tests, you must:

- Write your tests as methods within classes
- These TestCase classes must subclass unittest.TestCase
- Names of test functions must begin with test_
- Import the code to be tested
- Use a series of built-in assertion methods

Basic example

    import unittest


    class TestStringMethods(unittest.TestCase):

        def test_upper(self):
            self.assertEqual('foo'.upper(), 'FOO')

        def test_isupper(self):
            self.assertTrue('FOO'.isupper())
            self.assertFalse('Foo'.isupper())

        def test_split(self):
            s = 'hello world'
            self.assertEqual(s.split(), ['hello', 'world'])
            # check that s.split fails when the separator is not a string
            with self.assertRaises(TypeError):
                s.split(2)


    if __name__ == '__main__':
        unittest.main()

Assertions

The TestCase class provides several assert methods to check for and report failures. ...
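As a rough illustration of the assert methods mentioned above, the sketch below exercises a few that the basic example does not use. The class name TestAssertionSampler and the helper run_sampler are hypothetical; the assertion methods themselves are part of unittest.TestCase.

```python
import unittest


class TestAssertionSampler(unittest.TestCase):
    def test_membership(self):
        self.assertIn("b", ["a", "b"])
        self.assertNotIn("z", "abc")

    def test_floats(self):
        # Compares after rounding the difference to 7 decimal places by default.
        self.assertAlmostEqual(0.1 + 0.2, 0.3)

    def test_identity_and_type(self):
        self.assertIsNone({}.get("missing"))
        self.assertIsInstance((1, 2), tuple)

    def test_raises(self):
        with self.assertRaises(ZeroDivisionError):
            1 / 0


def run_sampler() -> unittest.TestResult:
    """Load and run the sampler without going through unittest.main()."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAssertionSampler)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_sampler()
```

Running the suite through a loader and runner, rather than unittest.main(), keeps the interpreter alive afterwards, which is handy in a REPL or notebook.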

August 3, 2023 · 2 min

Python quick reference

Forked from 101t/python-cheatsheet.md

RTFM

- The Python Standard Library
- Built-in Functions
- Python Enhancement Proposals (PEPs)

The Zen of Python is never far away in the REPL:

    import this

Contents

- Getting started: CPython, Easter eggs, Import paths, venv
- Collections: List, Dictionary, Set, Tuple, Range, Enumerate, Iterator, Generator
- Functions: Functions, Modules
- Types: Type, String, Regular_Exp, Format, Numbers, Combinatorics, Datetime
- Syntax: Args, Splat, Inline, Closure, Decorator, Class, Duck_Type, Enum, Exception
- System: Exit, Print, Input, Command_Line_Arguments, Open, Path, OS_Commands
- Data: JSON, Pickle, CSV, SQLite, Bytes, Struct, Array, Memory_View, Deque
- Advanced: Threading, Operator, Introspection, Metaprogramming, Eval, Coroutines
- Libraries: Progress_Bar, Plot, Table, Curses, Logging, Scraping, Web, Profile, NumPy
- Packaging and Tools: Real app, Bytecode disassembler, Poetry, Gems, Resources

CPython

Most distros lag behind the latest releases of Python. It’s quite a pleasant experience to just build CPython from source, as per the docs:

    sudo apt update
    sudo apt install build-essential gdb lcov pkg-config \
        libbz2-dev libffi-dev libgdbm-dev libgdbm-compat-dev liblzma-dev \
        libncurses5-dev libreadline6-dev libsqlite3-dev libssl-dev \
        lzma lzma-dev tk-dev uuid-dev zlib1g-dev
    ./configure
    sudo make
    sudo make install
    sudo python3 -m pip install --upgrade pip setuptools wheel

Easter eggs

    import this
    import antigravity
    from __future__ import braces

Import paths

When importing modules, Python relies on a list of paths to know where to look for the module. This list is stored in the sys.path variable. ...
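To see the import search list described above, sys.path can be inspected and extended at runtime. This is a minimal sketch: the helper names search_path_entries and prepend_search_path are made up for illustration, and editing sys.path by hand is usually a last resort compared with installing a package into the environment.

```python
import sys
from pathlib import Path


def search_path_entries() -> list[Path]:
    """Return the non-empty sys.path entries, in lookup order."""
    return [Path(entry) for entry in sys.path if entry]


def prepend_search_path(directory: str) -> None:
    """Put a directory at the front of sys.path if it is not already listed."""
    entry = str(Path(directory).resolve())
    if entry not in sys.path:
        sys.path.insert(0, entry)


if __name__ == "__main__":
    for entry in search_path_entries():
        print(entry)
```

Entries are searched from first to last, so a directory inserted at index 0 takes precedence over the standard library and site-packages for any module name it contains.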

June 5, 2022 · 50 min