Python PR Checklist

A modified version of the excellent original checklist by Paul Wolf.

General

- Code is blackened with black
- ruff has been run with no errors
- mypy has been run with no errors
- Function complexity problems have been resolved using the default complexity index of flake8
- Important core code can be loaded easily in IPython or ipdb
- There is no dead code
- Comprehensions or generator expressions are used in place of for loops where appropriate
- Comprehensions and generator expressions produce state but do not have side effects within the expression
- Use zip(), any(), all(), filter(), etc. instead of for loops where appropriate
- Functions that take mutable variables as parameters and mutate them don’t return these variables. They return None.
- Return immutable copies of mutable types instead of mutating the instances themselves when mutable types are passed as parameters with the intention of returning a mutated version of that variable
- Avoid method cascading on objects with methods that return self
- Function and method parameters never use empty collection or sequence instances like list [] or dict {} as defaults. Instead they must use None to indicate missing input.
- Variables in a function body are initialised with empty sequences or collections by callables, list(), dict(), instead of [], {}, etc.
- Always use the Final type hint for class instance parameters that will not change
- Context-dependent variables are not unnecessarily passed between functions or methods
- View functions either implement the business rules the view is responsible for, or pass data downstream to have this done by services and receive non-context-dependent data back
- View functions don’t pass request to called functions
- Functions, including class methods, don’t have too many local parameters or instance variables. In particular, a class’s __init__() should not have too many parameters.
- Profiling code is minimal
- Logging is the minimum required for production use
- There are no home-brewed solutions for things that already exist in the PSL (Python standard library)

n00b habits

- Bare except clauses. Python uses exceptions to flag system-level interrupts such as Ctrl-C (KeyboardInterrupt) and interpreter exit (SystemExit), and a bare except swallows them. Don’t do this.
- Mutable default arguments such as def foo(bar=[]). Defaults are evaluated when the function is defined, not when it is run, so all function calls share the same instance of bar.
- Checking types for equality using ==, such as type(p) == tuple. This pins the check to a concrete type and excludes its descendants, which goes against the Liskov substitution principle. Use isinstance(p, tuple) instead.
- Explicit bool or length checks, such as if bool(x) or if len(x) > 0. These are redundant, as Python has sane truthy evaluation.
- Use of range over the for in idiom. If you really need the index, always use enumerate.
- Not using items() on a dict: for k, v in dict.items()
- Using time.time to measure code performance. Use time.perf_counter instead.
- Using print statements over logging
- Using import *, which will normally litter the namespace with variables. Don’t be lazy, be specific:
    from itertools import count

Imports and modules

- Imports are sorted by isort or according to some standard that is consistent within the team
- Import packages or modules to qualify the use of functions or classes so that unqualified function calls can be assumed to be to functions in the current module

Documentation

- Modules have docstrings
- Classes have docstrings unless their purpose is immediately obvious
- Methods and functions have docstrings
- Comments and docstrings add non-obvious and helpful information that is not already present in the naming of functions and variables

General Complexity

- Functions are as complex as they need to be but no more (as defined by flake8’s default complexity threshold)
- Classes have only as many methods as required and have a simple hierarchy

Context Freedom

- All important functionality can be loaded easily in IPython without having to construct dummy requests, etc.
- All important functionality can be loaded in pdb (or a variant, ipdb, etc.)

Types

- Use immutable types (tuple, frozenset, Enum, etc.) over mutable types whenever possible

Functions

- Functions are pure wherever possible, i.e. they take input and provide a return value with no side effects or reliance on hidden state

Modules

- Module level variables do not take context-dependent values like connection clients to remote systems unless the client is used immediately for another module level variable and not used again

Classes

- Every class has a single well-defined purpose. That is, the class does not mix up different tasks, like remote state acquisition, web sockets notification, data formatting, etc.
- Classes manage state and do not just represent the encapsulation of behaviour
- All methods access either cls or self in the body. If a method does not access cls or self, it should be a function at module level.
- @classmethod is used in preference to @staticmethod, but only if the method body accesses cls; otherwise the method should be a module level function
- Constants are declared at module level, not in methods or at class level
- Constants are always upper case
- Abstract classes are derived from abc: from abc import ABC
- Abstract methods use the @abstractmethod decorator
- Abstract class properties use both @abstractmethod and @property decorators
- Classes do not use multiple inheritance
- Classes do not use mixins (use composition instead) except in rare cases
- Class names do not use the word “Base” to signal they are the single ancestor, like “BaseWhatever”
- Decorators are not used to replace classes as a design pattern
- __init__() does not define too many local variables. Use the Parameter Consolidation pattern instead.
- A factory class or function at module level is used for complex class construction (see Design Patterns) to achieve composition
- Classes are not dynamically created from strings except where forward reference requires this

Design Patterns

- Do not use designs that cause a typical Python developer to have to learn new semantics that are unexpected in Python
- Classes primarily use composition in preference to inheritance
- Beyond a very small number of simple variables, a class’s purpose is to acquire state for another class, or it uses another class to acquire state, in particular if the state is from a remote service
- If you use the Context Parameter pattern, it is critical that the state of the context does not change after calling its __init__(), i.e. it should be immutable
- If a class’s purpose is to represent an external integration, you probably want numerous classes to compose the service: RemoteDataClient, DomainManager, ContextManager, Factory, NotificationController, DomainResponse, DataFormatter and so on
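A few of the checklist items are easier to see in code. The following is a minimal sketch, not part of the original checklist: the function names append_shared and append_fresh and the Celsius class are hypothetical, chosen only to contrast a mutable default argument with a None default, and type equality with isinstance().

```python
from typing import List, Optional


def append_shared(item: int, bucket: List[int] = []) -> List[int]:
    """Anti-pattern: the default list is created once, when the function is defined."""
    bucket.append(item)
    return bucket


def append_fresh(item: int, bucket: Optional[List[int]] = None) -> List[int]:
    """Preferred: None signals missing input and the caller's list is never mutated."""
    result = list() if bucket is None else list(bucket)
    result.append(item)
    return result


class Celsius(float):
    """Hypothetical float subclass used to compare the two kinds of type check."""


# Both calls fall back to the same default list object.
first = append_shared(1)
second = append_shared(2)
print(first is second, first)

# Each call builds its own list.
print(append_fresh(1), append_fresh(2))

reading = Celsius(21.5)
print(type(reading) == float)       # False: pinned to the concrete type
print(isinstance(reading, float))   # True: descendants are accepted
```

The None default costs one extra line in the function body, but it keeps state from leaking between calls.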

October 30, 2023 · 5 min

Async Python

Background

Using asyncio will not make your code multi-threaded. That is, it will not cause multiple Python instructions to be executed at the same time, and it will not in any way allow you to sidestep the so-called “global interpreter lock” (GIL). Some processes are CPU-bound: they consist of a series of instructions which need to be executed one after another until the result has been computed. Most of their time is spent making heavy use of the processor. ...
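The single-threaded claim above can be checked directly. This is a small sketch under stated assumptions: fake_io is a hypothetical coroutine that stands in for a network call by sleeping, and the thread identifier is recorded only to show where each task ran.

```python
import asyncio
import threading


async def fake_io(name: str, delay: float):
    """Pretend to wait on I/O, then report which thread resumed this coroutine."""
    await asyncio.sleep(delay)  # hands control back to the event loop while waiting
    return name, threading.get_ident()


async def main():
    # The three waits overlap, but every coroutine is resumed by the same event loop.
    return await asyncio.gather(
        fake_io("a", 0.05),
        fake_io("b", 0.05),
        fake_io("c", 0.05),
    )


results = asyncio.run(main())
names = [name for name, _ in results]
thread_ids = {thread_id for _, thread_id in results}
print(f"{names} ran on {len(thread_ids)} thread(s)")
```

All three tasks report the thread that called asyncio.run(), which is why asyncio helps with waiting on I/O but does nothing for CPU-bound work.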

August 9, 2023 · 9 min

Python Type Annotations

Start with the docs and the Type hints cheat sheet.

Topics for consideration:

- syntax shorthands, e.g. | for Union or Optional
- Self
- If you are using the typing library then there is an abstract generic class provided for asynchronous context managers, AsyncContextManager[T], where T is the type of the object which will be bound by the as clause of the async with statement.
- mypy
- If you are using typing then there is an abstract class Awaitable which is generic, so that Awaitable[R] for some type R means anything which is awaitable, and when used in an await statement will return something of type R. ...
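To make the Awaitable[R] and AsyncContextManager[T] descriptions concrete, here is a minimal sketch. FakeConnection, open_connection, fetch_answer and double are hypothetical names invented for illustration; only the typing and contextlib names come from the standard library.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncContextManager, AsyncIterator, Awaitable, Optional


class FakeConnection:
    """Hypothetical resource standing in for a real client."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.closed = False


@asynccontextmanager
async def _connect(name: str) -> AsyncIterator[FakeConnection]:
    conn = FakeConnection(name)
    try:
        yield conn
    finally:
        conn.closed = True


def open_connection(name: str) -> AsyncContextManager[FakeConnection]:
    # T is FakeConnection: the object bound by "as" in the async with statement.
    return _connect(name)


async def fetch_answer() -> int:
    await asyncio.sleep(0)
    return 42


async def double(pending: Awaitable[int]) -> int:
    # Awaiting an Awaitable[int] produces an int.
    return 2 * await pending


def label(conn: Optional[FakeConnection]) -> str:
    # Optional[X] can be spelled "X | None" on Python 3.10+.
    return "none" if conn is None else conn.name


async def main():
    doubled = await double(fetch_answer())  # a coroutine object is an Awaitable[int]
    async with open_connection("demo") as conn:
        open_inside = not conn.closed
    return doubled, label(conn), open_inside, conn.closed


doubled, conn_label, open_inside, closed_after = asyncio.run(main())
print(doubled, conn_label, open_inside, closed_after)
```

Running mypy over a file like this is a quick way to confirm that the annotations line up with what the code actually awaits and binds.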

August 9, 2023 · 1 min

Python Standard Libraries

An important part of becoming “good” at a language is becoming familiar with its library ecosystem. The official Python Standard Library reference manual rocks.

| Module | Category | Description |
| --- | --- | --- |
| argparse | | functions for parsing command line arguments |
| atexit | | allows you to register functions for your program to call when it exits |
| bisect | | bisection algorithms for searching and inserting into sorted lists |
| calendar | | a number of date-related functions |
| codecs | | functions for encoding and decoding data |
| collections | | a variety of useful data structures |
| concurrent | | asynchronous computation |
| copy | | functions for copying data |
| csv | | functions for reading and writing CSV files |
| datetime | | classes for handling dates and times |
| fileinput | file access | iterate over lines from multiple files or input streams |
| fnmatch | | functions for matching Unix-style filename patterns |
| glob | | functions for matching Unix-style path patterns |
| io | | functions for handling I/O streams and StringIO, which allows you to treat strings as files |
| json | | functions for reading and writing data in JSON format |
| logging | | access to Python’s own built-in logging functionality |
| multiprocessing | | allows you to run multiple subprocesses, while providing an API that makes them look like threads |
| operator | | functions implementing the basic Python operators, instead of writing your own lambda expressions |
| os | | swiss army knife access to basic OS functions |
| pprint | data types | data pretty printer |
| random | | functions for generating pseudorandom numbers |
| re | | regular expression functionality |
| sched | | an event scheduler without using multithreading |
| select | | access to the select() and poll() functions for creating event loops |
| shutil | file access | access to high-level file functions |
| signal | | functions for handling POSIX signals |
| tempfile | file access | functions for creating temporary files and directories |
| threading | | access to high-level threading functionality |
| urllib | | provides functions for handling and parsing URLs |
| uuid | | allows you to generate Universally Unique Identifiers (UUIDs) |
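As a taste of the modules listed above, here is a short sketch that uses three of them together. The sample data (scores, people and the counted string) is made up purely for illustration.

```python
import bisect
import operator
from collections import Counter

# bisect: keep a list sorted as items arrive, and search it without a full scan.
scores = [10, 30, 50]
bisect.insort(scores, 40)
position = bisect.bisect_left(scores, 30)  # index of the first element >= 30

# operator: itemgetter replaces a hand-written lambda such as lambda p: p[1].
people = [("ada", 36), ("linus", 28), ("grace", 45)]
by_age = sorted(people, key=operator.itemgetter(1))

# collections: Counter tallies hashable items.
letter_counts = Counter("standard library")

print(scores, position)
print(by_age)
print(letter_counts.most_common(2))
```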

August 4, 2023 · 2 min

Objects in Python

- Special methods (dunders)
  - Foundational
  - Iterators
  - Comparable classes
  - Serializable classes
  - Classes with computed attributes
  - Classes that are callable
  - Classes that act like sets
  - Classes that act like dictionaries
  - Classes that act like numbers
  - Classes that can be used in a with block
  - Esoteric behavior
- Design Patterns

As I learn more about Python’s idioms, I reflect on its unique approach to object-based programming. In combination with duck typing, its approach to objects feels disturbingly flexible. ...
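Two of the categories in the outline above, comparable classes and classes usable in a with block, can be sketched in a few lines. Version and Countdown are hypothetical classes written for this illustration, not examples taken from the post.

```python
from functools import total_ordering


@total_ordering
class Version:
    """Comparable class: __eq__ and __lt__ are enough, total_ordering derives the rest."""

    def __init__(self, major: int, minor: int) -> None:
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        return hash(self._key())

    def __repr__(self):
        return f"Version({self.major}, {self.minor})"


class Countdown:
    """Iterable that also works as a context manager."""

    def __init__(self, start: int) -> None:
        self.start = start
        self.active = False

    def __enter__(self):
        self.active = True
        return self

    def __exit__(self, exc_type, exc, traceback):
        self.active = False
        return False  # do not suppress exceptions

    def __iter__(self):
        return iter(range(self.start, 0, -1))


releases = sorted([Version(3, 11), Version(3, 9), Version(3, 10)])
with Countdown(3) as countdown:
    ticks = list(countdown)

print(releases)
print(ticks, countdown.active)
```

Nothing here inherits from a special base class: sorted(), with and list() only look for the relevant dunder methods, which is duck typing at work.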

August 3, 2023 · 9 min

Testing in Python

There are many ways to write unit tests in Python.

unittest

Here the focus is living off the land with the built-in unittest. unittest is both a framework and a test runner, meaning it can execute your tests and return the results. In order to write unittest tests, you must:

- Write your tests as methods within classes
- These TestCase classes must subclass unittest.TestCase
- Names of test functions must begin with test_
- Import the code to be tested
- Use a series of built-in assertion methods

Basic example

    import unittest


    class TestStringMethods(unittest.TestCase):
        def test_upper(self):
            self.assertEqual('foo'.upper(), 'FOO')

        def test_isupper(self):
            self.assertTrue('FOO'.isupper())
            self.assertFalse('Foo'.isupper())

        def test_split(self):
            s = 'hello world'
            self.assertEqual(s.split(), ['hello', 'world'])
            # check that s.split fails when the separator is not a string
            with self.assertRaises(TypeError):
                s.split(2)


    if __name__ == '__main__':
        unittest.main()

Assertions

The TestCase class provides several assert methods to check for and report failures. ...
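A further sketch of a few more assertion methods, run programmatically rather than through unittest.main(). The function under test, parse_port, is hypothetical and exists only to give the assertions something to check.

```python
import unittest


def parse_port(value: str) -> int:
    """Hypothetical function under test: parse a TCP port number from a string."""
    port = int(value)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


class TestParsePort(unittest.TestCase):
    def test_valid(self):
        self.assertEqual(parse_port("8080"), 8080)

    def test_returns_int(self):
        self.assertIsInstance(parse_port("22"), int)

    def test_out_of_range(self):
        with self.assertRaises(ValueError):
            parse_port("70000")

    def test_many_inputs(self):
        # subTest reports each failing input separately instead of stopping at the first.
        for raw, expected in [("1", 1), ("443", 443), ("65535", 65535)]:
            with self.subTest(raw=raw):
                self.assertEqual(parse_port(raw), expected)


# Load and run the tests in-process, which is handy inside a REPL or notebook.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestParsePort)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```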

August 3, 2023 · 2 min

Digital Forensics

It’s semester 2 2023 and time for my final subject in the UNSW Cyber Security Masters course, digital forensics, run by Seth Enoka. I got to venture deep into Windows internals, including core Windows memory structures, subsystems such as prefetch and shimcache, NTFS file system internals and mechanisms including MFT analysis, and much more. All this analysis was conducted using the following Linux analysis tools:

Tools

| Tool | Description |
| --- | --- |
| Yara | A pattern-matching tool used in malware research and forensic analysis to identify and classify files based on defined rules and signatures. |
| Volatility 2 & 3 | Open-source memory forensics frameworks used to extract and analyze digital artifacts from volatile memory (RAM) in a memory dump to investigate cyber incidents and malware. |
| Volatility USNParser Plugin | A Volatility plugin specifically designed to parse and extract information from the USN journal on Windows systems, aiding in file activity analysis. |
| SCCA Tools | Tools from libscca for reading Windows Prefetch files (SCCA is the signature of the prefetch file format), which record evidence of program execution. |
| ESEDB Tools | These tools provide access to Extensible Storage Engine (ESE) Database files, commonly used in Windows applications, for analysis and recovery purposes. |
| analyzeMFT | A tool used in digital forensics to parse and analyze the Master File Table (MFT) entries from NTFS filesystems, revealing information about files and directories. |
| Oletools | A collection of Python-based tools for analyzing and extracting data from OLE (Object Linking and Embedding) files, such as Microsoft Office documents, often used in malware analysis. |
| Wireshark | A widely-used network protocol analyzer that captures and inspects data packets on a network, helping with network troubleshooting, security analysis, and protocol reverse engineering. |
| The Sleuth Kit (TSK) | An open-source digital forensic toolkit that includes various CLI tools (mmls, fls, icat) for file system analysis and data recovery from different operating systems. |
| Plaso | An open-source Python-based tool used for super timeline creation and log analysis, helping to reconstruct events and activities from various data sources for forensic investigations. |
| Advanced Forensics Format Library (afflib) Tools | Tools for working with the Advanced Forensics Format (AFF), an extensible open file format used in computer forensics to store disk images and related metadata. |
| wxHexEditor | A hexadecimal editor with a graphical user interface, used for low-level data inspection and editing in forensic analysis and reverse engineering. |
| Gnumeric | A spreadsheet application, similar to Microsoft Excel, used for data analysis and visualization, including data manipulation and statistical functions. |
| Personal Folder File Tools (pfftools) | Tools designed to work with Personal Folder File (PFF) formats, commonly used by Microsoft Outlook to store emails, calendars, and other personal data. These tools aid in email forensics and analysis. |

Resources

- Windows shellbags
- 8 timestamps on an NTFS file system: an attacker can fairly easily mutate 4 of them, but it is hard to convincingly adjust them at the nanosecond level
- Eric Zimmerman’s Windows Forensics Tools
- SANS Hunt Evil Poster: Knowing what’s normal on a Windows host helps cut through the noise to quickly locate potential malware. Use this information as a reference to know what’s normal in Windows and to focus your attention on the outliers.
- MITRE ATT&CK
- MITRE ATT&CK for ICS
- Cyber Kill Chain
- Industrial Cyber Kill Chain
- Locard’s Exchange Principle
- NIST Guide to Forensics in Incident Response
- Dragos Threat Groups
- Crowdstrike Adversary Groups
- Diamond Model for Intrusion Analysis
- The Four Types of Threat Detection
- Volatility v2.4 cheat sheet

Module 0 - Intro

Locard’s Principle (Edmond Locard, aka the Sherlock Holmes of France) ...

July 22, 2023 · 7 min

Python 3.11

Cool new features in 3.11.

Performance

1.2x faster generally, thanks to an adaptive interpreter (PEP 659) that optimises bytecode based on observed behaviour and usage. Take for example the LOAD_ATTR instruction, which under 3.11 can be replaced by LOAD_ATTR_ADAPTIVE. This in turn is replaced by the most optimised instruction for what is being done, such as:

- LOAD_ATTR_INSTANCE_VALUE
- LOAD_ATTR_MODULE
- LOAD_ATTR_SLOT

Disassembling some code:

    def feet_to_meters(feet):
        return 0.3048 * feet


    for feet in (1.0, 10.0, 100.0, 1000.0, 2000.0, 3000.0, 4000.0):
        print(f"{feet:7.1f} feet = {feet_to_meters(feet):7.1f} meters")

    import dis

    dis.dis(feet_to_meters, adaptive=True)
    #   1           0 RESUME                   0
    #
    #   2           2 LOAD_CONST               1 (0.3048)
    #               4 LOAD_FAST                0 (feet)
    #               6 BINARY_OP                5 (*)
    #              10 RETURN_VALUE

However, when the interpreter is given something more concrete to work with, it’s able to optimise. For example, outside the loop context when given a float, floating point instructions are put to work: ...

July 17, 2023 · 2 min

Information Assurance

Kicking off the 2023 university year, I continue my journey into the Cybersecurity Masters program with the unit Information Assurance and Security, run by Michael McGarity and Huadong Mo. It provides students with a deep understanding of the technical, management and organisational aspects of Information Assurance within a holistic legal and social framework. The course is essentially modelled on the CISSP certification, and dives into the following subjects:

- make a realistic assessment of the needs for information security in an organisation
- discuss the implications of security decisions on the organisation’s information systems
- understand the principles of writing secure code
- show an understanding of database and network security issues
- demonstrate an understanding of encryption techniques
- understand foundations of the tools and techniques in computer forensics
- show an appreciation of the commercial, legal and social context in which IT security is implemented
- apply knowledge gained to business and technical IA scenarios

Intro

Not a one-size-fits-all approach. There are too many factors and seemingly chaotic variables, such as risk appetites, country legislation, the business vertical (mining vs banking vs government), accreditation frameworks that apply to certain industries, tolerances, technology limitations, and so on. ...

March 4, 2023 · 4 min

Vue

A bunch of (scattered) tips and resources as I experiment with Vue.

- Basics: General wisdom, Anatomy, Event handling, Watchers, Computed props
- Components: Components, Props, Lifecycle hooks, Emitting events, Slots
- Fetching Data: Calling APIs in hooks, Unique identifiers
- Styling Components: Global vs scoped styles, CSS modules, CSS v-bind
- Composition API: Composition API, Reactive references, script setup, Composables
- Routing and Deployment: Vue Router, History, Dynamic routes, Deployment
- Advanced: Pre-processors, Pinia State Management

Overview

What is Vue? An open-source model–view–viewmodel front end JavaScript framework for building user interfaces and single-page applications, created by Evan You.

Helpful resources:

- Read the official docs
- Examples
- Vue cheat sheet
- Awesome Vue
- Vue.js devtools
- Volar VSCode extension
- Built-in Directives

General wisdom

- It’s best to stick to conventions of the web and use camelCase in your script and kebab-case in your template
- Don’t pass functions as props, instead emit events
- Props couple components to each other; for broad or deep cross-cutting state, level up to state management

Test data sources:

- JSON Placeholder
- PokeAPI

Anatomy

Here is a bare-bones Vue app. There are literally 3 blocks, for script, template (markup) and style: ...

March 2, 2023 · 15 min