Elasticsearch Engineer 8.1

Revised 2024 edition based on Elasticsearch 8.1. Recently I had the opportunity to attend the latest revision of the 4-day Elasticsearch engineer course, which I did in person about 5 years ago in Sydney. Elasticsearch has often been an integral part of the data solutions I’ve been involved with and I’m quite fond of it. This time round the course only runs in a virtual classroom format (using strigo.io) with our awesome trainers Krishna Shah and Kiju Kim. ...

June 2, 2024 · 60 min

Nerd Gems 💎

This is a list of valuable (to me) developer resources that I’ve managed to stumble across (books, courses, friends and fellow programmers, hacker news, lobste.rs, university). Architecture AI and ML Books and Reading Lists C Compilers and Interpreters Cloud Containers Cheat sheets Databases Developer culture Diagramming dotfiles Git Golang gRPC Hardware Humanities Jobs Kubernetes Linux Message queues Mongo Monitoring Networking Open source Python Rust Security Shell systemd Text wrangling Talks Tasks Terminal Testing Text, Encoding and Serialization Vim Web Writing Architecture .NET Microservices: Architecture for Containerized .NET Applications a fantastic resource for working with the modern .NET stack (post 2022) Communicating Sequential Processes Tony Hoare’s seminal 1977 paper on concurrency and CSP Why Segment Went Back to a Monolith microservices come with serious tradeoffs All software sucks complexity is the bane of all software, simplicity is the most important quality Designing Actor-Based Software with Hugh McKee an approach to building scalable software systems Queueing: An interactive study of queueing strategies an interactive journey to understand common queueing strategies for handling HTTP requests. AI and ML Andrej Karpathy on The spelled-out intro to neural networks and backpropagation: building micrograd a 2.5 hour step-by-step spelled-out explanation of backpropagation and training of neural networks. It only assumes basic knowledge of Python and a vague recollection of calculus from high school. 
A Beginner’s Guide to Vector Embeddings Books and Reading Lists Arjan Codes Books learn-anything/books A Programmer’s Reading List: 100 Articles I Enjoyed 1-50 C Easy Makefile a Makefile boilerplate to hit the ground running Handmade Hero an educational series by Casey Muratori that teaches low-level game programming techniques by example Eskil Steenberg on How I program in C Compilers and Interpreters Crafting Interpreters by Robert Nystrom Ever wanted to make your own programming language or wondered how they are designed and built? If so, this book is for you. You should make a new programming language Cloud mcm Minimal Configuration Manager Packer a tool for building images for cloud platforms, virtual machines, containers and more from a single source configuration. CloudBoost a complete serverless platform for your app. The Google Cloud Developer’s Cheat Sheet every product in the Google Cloud family described in under 4 words Ask HN: Is Your Company Sticking to On-Premise Servers? Why? Using AWS CodeBuild to Execute Administrative Tasks What Is Amazon Resource Name (ARN)? more to the humble ARN than you think arn:partition:service:region:account-id:resource Containers Building Docker Images - Best Practices The Docker Handbook 25 Basic Docker Commands for Beginners Setting the Record Straight: containers vs. Zones vs. Jails vs. 
VMs Docker Security Best Practices tools and methods to help secure Docker Kubernetes Workshop tons of details for getting started 10 Most Common Mistakes When Using Kubernetes lens kube IDE kubeseal how to safely store secrets in git if you want to use them in k8s Container Technologies at Coinbase great history on how the industry got to containers, and why Kubernetes isn’t used A Practical Introduction to Container Security Webtop full desktop environments in officially supported flavors accessible via any modern web browser Cheat sheets Linux Commands - A practical reference an amazing cheat sheet, quick reference The Ultimate List of SANS Cheat Sheets when it comes to quality cyber-security training and certs SANS is world leading. They have an amazing collection of thoughtful and useful cheat sheets from topics such as Writing Tips for IT Professionals, Windows to Unix Cheat Sheet, to using pieces of software such as nmap, netcat, burp. It’s a treasure trove! Lenny Zeltser’s IT and Information Security Cheat Sheets speaking of thoughtful cheat sheets, lots of wisdom here Databases Things I Wished More Developers Knew About Databases SQL Coding Standards PostgreSQL Course: A Curious Moon learn PostgreSQL the way the pros do: on the job and under pressure. You’ll assume the role of interim DBA at aerospace startup Red:4, exploring data from the Cassini mission! Developer culture Eric S Raymond talking about The Cathedral and the Bazaar The Problem with Vibe Coding The Post-Developer Era Lessons Learned in 35 Years of Making Software What To Code inspiration and ideas Why the developers who use Rust love it so much Why we’re leaving the cloud You Are Not Google if you’re using a technology that originated at a large company, but your use case is very different, it’s unlikely that you arrived there deliberately; no, it’s more likely you got there through a ritualistic belief that imitating the giants would bring the same riches. 
GitLab’s Guide to All-Remote the remote manifesto, tips and tricks and remote resources Why we at $FAMOUS_COMPANY Switched to $HYPED_TECHNOLOGY Habbits of High-Functioning Software Teams characteristics and habits of the highest-performing dev teams A Taxonomy of Tech Debt Diagramming Excalidraw beautiful web based diagrams PlantText PlantUML (text) based diagram generator Taking ASCII Drawings Seriously: How Programmers Diagram Code dotfiles HexDSL LukeSmithxyz uoou Git Better Git configuration links and resources on configuring & using git Automate Repetitive Tasks with Custom git Commands how to write custom git commands Golang Everyday Golang LearnGo: A Large Collection of Go Examples, Exercises, and Quizzes Writing Go CLIs With Just Enough Architecture Getting Hands-On with io_uring from Go Diving Into Go by Building a CLI Application Immutability Patterns in Go Writing An Interpreter In Go gRPC gRPC - Best Practices Hardware Backblaze hard drive stats Build an 8-bit CPU by Ben Eater a programmable 8-bit computer from scratch on breadboards using only simple logic gates nand2tetris a distilled version of the book The Elements of Computing Systems, By Noam Nisan and Shimon Schocken (MIT Press), contains all the project materials and tools necessary for building a general-purpose computer system and a modern software hierarchy from the ground up Humanities The Chomsky List A definitive guide to Noam Chomsky: 10 books to get you started RATM reading list Jobs Inspired corp Kubernetes 7 Mind-Blowing Kubernetes Hacks The guide to kubectl I never had The Pros of On-Prem Kubernetes with Justin Garrison Languages Crafting Interpreters by Robert Nystrom Ever wanted to make your own programming language or wondered how they are designed and built? If so, this book is for you. 
Linux Linux Commands - A practical reference an amazing cheat sheet, quick reference 16 Linux server monitoring commands you really need to know Best 15 Unix Command Line Tools An In-Depth Guide to iptables covers pretty much every angle of iptables, from basic rules to NAT’ing to protocols and interfaces. mdadm Cheat Sheet practical commands when running software raid on Linux Async IO on Linux: select, poll, and epoll thorough write-up on ‘select’, ‘poll’ and ’epoll’ system calls, and how to measure them. The first 5 things to do when your Linux server keels over including hardware troubleshooting, checking the running state of applications How io_uring and eBPF Will Revolutionize Programming in Linux well explained history of Linux syscalls and their limitations, and how io_uring is a game changer by allowing async I/O via a pub/sub model bashtop gamified TUI resource monitor that shows usage and stats for processor, memory, disks, network and processes Time on Unix how time and localization works on Unix Tmux for mere mortals good defaults, modifying the keybindings to boost usability Tips for cleaning up a Linux server low hanging disk space fruit, like removing old kernels, pruning unused Docker space, clearing logs Shell productivity tips and tricks faster command line tips Message queues Postgres Message Queue - PGMQ lightweight message queue, like AWS SQS and RSMQ but on Postgres Mongo Quick reference cards Aggregation pipeline quick reference Monitoring Zabbix whatfiles logs the files programs CRUD, also traces new processes logtop reads stdin, can sort on any field and is updated in realtime Networking PacketLife Cheat Sheets The Packet Pioneer Chris Greer on TCP Fundamentals Part 1 TCP/IP Explained with Wireshark 59 Linux Networking commands and scripts the ultimate network tools goto list. 
Introduction to tcpdump and wireshark hping3 send arbitrary TCP/IP packets to network hosts Setting up a Linux mail server linker∙d dynamic linker for microservices, taking care of the communication work needed to interact with distributed services, including routing, load balancing, and retrying. Manually Throttle the Bandwidth of a Linux Network Interface introduction to the tc tool for bandwidth shaping. connbeat agent that monitors TCP connection metadata and ships the data to Kafka or Elasticsearch, or an HTTP endpoint The Ultimate PCAP all protocols in a single PCAP What Every Developer Should Know About TCP SSH Tips & Tricks 2FA, securely forwarding agents, quitting from stuck sessions and using mosh or tmux High Availability Load Balancers with Maglev CloudFlare on their load balancing stack, BGP, Maglev connection scheduling, IPVS, UDP encapsulation for faster delivery Networking for Game Programmers: UDP vs TCP Open source Google Open Source 2000+ OSS projects managed by Google NSA on GitHub Python Interactive Python Type Challenges packse: Python packaging scenarios Python Design Patterns Inside the Python Virtual Machine Full Speed Python from Superior School of Technology of Setúbal Intermediate Python Ruff: Internals of a Rust-backed Python linter-formatter - Part 1 A Guide to Python’s Weak References Using weakref Module A Complete Guide to Pytest Fixtures Rust 100 Exercises To Learn Rust teaches Rust’s core concepts, one exercise at a time. You’ll learn about Rust’s syntax, its type system, its standard library, and its ecosystem. Security OST2.FYI OpenSecurityTraining2’s mission is to provide the world’s deepest and best cybersecurity training. That our classes are free is just a bonus! The Ultimate List of SANS Cheat Sheets when it comes to quality cyber-security training and certs SANS is world leading. 
They have an amazing collection of thoughtful and useful cheat sheets from topics such as Writing Tips for IT Professionals, Windows to Unix Cheat Sheet, to using pieces of software such as nmap, netcat, burp. It’s a treasure trove! Lenny Zeltser’s IT and Information Security Cheat Sheets speaking of thoughtful cheat sheets, lots of wisdom here Linux reverse engineering 101 collection of resources for linux reverse engineering. Explain like I’m 5: Kerberos OAuth 2.0 Security Best Current Practices SSHHeatmap script that generates a heatmap of IPs that made failed SSH login attempts using /var/log/auth.log psst Paper-based Secret Sharing Technique Shell Byobu multiplexer, enhanced profiles, convenient keybindings, configuration utilities, and toggle-able system status notifications for screen and tmux Makeself a self-extracting archiving tool for Unix systems, in 100% shell script 5 Types Of ZSH Aliases You Should Know alias suffixes & global aliases, plus other neat tricks Bash aliases you can’t live without systemd Why I Prefer systemd Timers Over Cron journal-triggerd runs trigger on systemd’s journal messages How to automatically execute shell script at startup boot on systemd Text wrangling CyberChef the ultimate open-source (by GCHQ) text wrangler you’ll ever need, life changing desed beautiful TUI that provides users with comfortable interface and practical debugger, used to step through complex sed scripts sed One Liners huge collection of useful sed examples xsv CLI for indexing, slicing, analyzing, splitting and joining CSV files Talks Rich Hickey on Simple Made Easy Mike Acton on Data-orientated Design Jonathan Blow on Programming Aesthetics learned from making independent games Eskil Steenberg on How I program in C Rich Hickey on Hammock Driven Development Brian Will on Why OOP is Bad Abner Coimbre on What Programming is Never About Scott Meyers on CPU Caches and Why You Care Jeff and Casey Show on The Evils of Non-native Programming Jeff and Casey’s 
Guide to Becoming a Bigger Programmer Hadi Hariri on The Silver Bullet Syndrome Bryan Cantrill on Fork Yeah! The Rise and Development of illumos Rob Pike on Concurrency Is Not Parallelism James Mickens on JavaScript Liz Rice on Containers From Scratch James Mickens on Why Do Keynote Speakers Keep Suggesting That Improving Security Is Possible? Tasks Learn Makefiles Abusing Makefiles for fun and profit Terminal Terminal Text Effects visual effects applied to text in the terminal Testing Smocker simple HTTP mock server, uses YAML to define mocks and responses MockServer for any system you integrate with via HTTP or HTTPS MockServer can be used as: a mock configured to return specific responses for different requests, a proxy recording and optionally modifying requests and responses or as both a proxy for some requests and a mock for other requests at the same time Text, Encoding and Serialization The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) Illustrated jq tutorial jq is a lightweight and flexible command-line JSON processor Cap’n Proto Google Protocol Buffers Vim A Flexible Minimalist Neovim for 2024 A Case for Using Neovim Without Plugins Web The Concise TypeScript Book HTML5 UP makes spiffy HTML5 site templates that are HTML5 + CSS3, customizable and 100% free under the Creative Commons How I built a modern website in 2021 Certbot automatically use Let’s Encrypt certificates Ask HN: Is There Still a Place for Native Desktop Apps? topngx parse and aggregate statistics from NGINX access logs Writing Dungeons and Dragons taught me how to write alt text

April 2, 2017 · 11 min

Kustomize

Kustomize is built into kubectl with -k. Great samples on kubernetes.io/docs. Kustomize provides a template-free way to customize Kubernetes manifests. Contents: Generating resources Setting cross cutting fields Composing and customizing resources Composing Customizing Patches Images Replacements Reference In a nutshell, Kustomize provides 3 key features: generating resources from other sources, setting cross-cutting fields for resources, and composing and customizing collections of resources. Generating resources To generate a ConfigMap from an .env file, add an entry to the envs list in configMapGenerator. Kustomize supports other formats such as .properties. ...

May 3, 2024 · 3 min

make

A small make orientation guide. make is a versatile task runner, its core competency is in creating files from other files Make essentials Equal signs Built-in variables Phony targets C specifics Custom variables Implicit variables Example Makefiles Make essentials make generates files from other files, using recipes, the syntax is as follows. Please note, thanks to POSIX standardisation the recipe MUST be indented with a tab (not spaces): target_file: prerequisite_file1 prerequisite_file2 shell command to build target_file (MUST be indented with tabs, not spaces) another shell command (these commands are called the "recipe") Unless you specify otherwise, Make assumes that the target (target_file above) and prerequisites (prerequisite_file1 and prerequisite_file2) are actual files or directories. You can ask Make to build a target from the command line like this: ...
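As a rough illustration of the "files from other files" idea, the sketch below mimics in Python the staleness check that make is generally understood to perform: a target is rebuilt when it is missing or older than any of its prerequisites. The function name needs_rebuild and the file names in the usage note are hypothetical, and real make does considerably more (implicit rules, phony targets, chains of dependencies).

```python
from pathlib import Path


def needs_rebuild(target: Path, prerequisites: list[Path]) -> bool:
    """Return True if target is missing or older than any prerequisite.

    A simplified model of make's decision, based only on modification times.
    """
    if not target.exists():
        # Nothing has been built yet, so the recipe has to run
        return True
    target_mtime = target.stat().st_mtime
    # Any prerequisite modified after the target makes the target stale
    return any(p.stat().st_mtime > target_mtime for p in prerequisites)
```

For example, needs_rebuild(Path("target_file"), [Path("prerequisite_file1"), Path("prerequisite_file2")]) answers the question make asks before deciding whether to run the recipe for target_file.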

October 9, 2016 · 7 min

Python Packaging (2024)

sysconfig Project Structure Setup Script Distribution Archives Distribution sysconfig Packaging is the process of bundling Python code into a format that eases distribution and sharing. First up, I find it helps to get a concrete understanding of how the specific Python distro I’m working with is configured. Of particular interest are the various system paths that will be visited for package dependencies. The built-in sysconfig module neatly manages and surfaces this information. python -m sysconfig On a Windows system: ...
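The same information that python -m sysconfig prints can also be read from code. A minimal sketch, using only the standard library; which path names to show is my own choice for illustration, and the values will differ per machine and per virtual environment.

```python
import sysconfig

# Install paths for the active interpreter (a dict of name -> directory)
paths = sysconfig.get_paths()

# A few of the entries that matter most when hunting for installed packages
for name in ("stdlib", "purelib", "platlib", "scripts"):
    print(f"{name:8} {paths[name]}")

# Platform tag and major.minor version, as used in distribution file names
print(sysconfig.get_platform(), sysconfig.get_python_version())
```

purelib and platlib are the site-packages locations for pure-Python and platform-specific packages respectively, so they are usually the first thing to check when an import resolves to an unexpected place.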

December 29, 2023 · 3 min

Kubernetes Certified Administrator (CKA) 2024

CKA topics Kubernetes in a nutshell Lab environment kubeadm init sample output Building kubernetes clusters Networking kubeadm kubectl Contexts Resources CKA topics Cluster Architecture, Installation & Configuration: How to set up and configure a Kubernetes cluster, including how to install and configure a Kubernetes cluster using kubeadm, how to upgrade your cluster version, how to backup and restore an etcd cluster, and how to configure a pod to use secrets Workloads & Scheduling: How to deploy a Kubernetes application, create daemonsets, scale the application, configure health checks, use multi-container pods, and use config maps and secrets in a pod. You’ll also need to know how to expose your application using services Services & Networking: How to expose applications within the cluster or outside the cluster, how to manage networking policies, and how to configure ingress controllers Storage: How to create and configure persistent volumes, how to create and configure persistent volume claims, and how to expand persistent volumes Troubleshooting: How to troubleshoot common issues in a Kubernetes environment, including how to diagnose and resolve issues with pods, nodes, and network traffic Kubernetes in a nutshell Control plane management components that mother-hen nodes and pods. Key components: ...

December 22, 2023 · 7 min

Python PR Checklist

A modified version of the excellent original checklist by Paul Wolf General Code is blackened with black ruff has been run with no errors mypy has been run with no errors Function complexity problems have been resolved using the default complexity index of flake8. Important core code can be loaded in iPython, ipdb easily. There is no dead code Comprehensions or generator expressions are used in place of for loops where appropriate Comprehensions and generator expressions produce state but they do not have side effects within the expression. Use zip(), any(), all(), filter(), etc. instead of for loops where appropriate Functions that take as parameters and mutate mutable variables don’t return these variables. They return None. Return immutable copies of mutable types instead of mutating the instances themselves when mutable types are passed as parameters with the intention of returning a mutated version of that variable. Avoid method cascading on objects with methods that return self. Function and method parameters never use empty collection or sequence instances like list [] or dict {}. Instead they must use None to indicate missing input Variables in a function body are initialised with empty sequences or collections by callables, list(), dict(), instead of [], {}, etc. Always use the Final type hint for class instance parameters that will not change. Context-dependent variables are not unnecessarily passed between functions or methods View functions either implement the business rules the view is responsible for or pass data downstream to have this done by services and receive non-context dependent data back. View functions don’t pass request to called functions Functions including class methods don’t have too many local parameters or instance variables. Especially a class’ __init__() should not have too many parameters. 
Profiling code is minimal Logging is the minimum required for production use There are no home-brewed solutions for things that already exist in the PSL (python standard library) n00b habits Bare except clause: Python uses exceptions to flag system-level interrupts such as KeyboardInterrupt (Ctrl-C) and SystemExit, and a bare except swallows them. Don’t do this. Mutable default arguments such as def foo(bar=[]) are created when the function is defined, not when it’s run, so all function calls share the same instance of bar. Checking for type equality using ==. Due to inheritance this is not desirable, as it pins to a concrete type and excludes its descendants, which goes against the Liskov substitution principle. Use isinstance(p, tuple) instead. Explicit bool or length checks, such as if bool(x) or if len(x) > 0, are redundant, as Python has sane truthy evaluation. Use of range over the for in idiom. If you really need the index, always use enumerate. Not using items() on a dict (for k, v in dict.items()). Using time.time to measure code performance. Use time.perf_counter instead. Using print statements over logging. Using import * will normally litter the namespace with variables. Don’t be lazy, be specific: 
from itertools import count Imports and modules Imports are sorted by isort or according to some standard that is consistent within the team Import packages or modules to qualify the use of functions or classes so that unqualified function calls can be assumed to be to functions in the current module Documentation Modules have docstrings Classes have docstrings unless their purpose is immediately obvious Methods and functions have docstrings Comments and docstrings add non-obvious and helpful information that is not already present in the naming of functions and variables General Complexity Functions are as complex as they need to be but no more (as defined by flake8’s default complexity threshold) Classes have only as many methods as required and have a simple hierarchy Context Freedom All important functionality can be loaded easily in ipython without having to construct dummy requests, etc. All important functionality can be loaded in pdb (or a variant, ipdb, etc.) Types Use immutable types (tuple, frozenset, Enum, etc.) over mutable types whenever possible Functions Functions are pure wherever possible, i.e. they take input and provide a return value with no side-effects or reliance on hidden state. Modules Module level variables do not take context-dependent values like connection clients to remote systems unless the client is used immediately for another module level variable and not used again Classes Every class has a single well-defined purpose. That is, the class does not mix up different tasks, like remote state acquisition, web sockets notification, data formatting, etc. Classes manage state and do not just represent the encapsulation of behaviour All methods access either cls or self in the body. If a method does not access cls or self, it should be a function at module level. @classmethod is used in preference to @staticmethod but only if the method body accesses cls otherwise the method should be a module level function. 
Constants are declared at module level not in methods or class level Constants are always upper case Abstract classes are derived from abc: from abc import ABC Abstract methods use the @abstractmethod decorator Abstract class properties use both @abstractmethod and @property decorators Classes do not use multiple inheritance Classes do not use mixins (use composition instead) except in rare cases Class names do not use the word “Base” to signal they are the single ancestor, like “BaseWhatever” Decorators are not used to replace classes as a design pattern __init__() does not define too many local variables. Use the Parameter Consolidation pattern instead. A factory class or function at module level is used for complex class construction (see Design Patterns) to achieve composition Classes are not dynamically created from strings except where forward reference requires this Design Patterns Do not use designs that cause a typical Python developer to have to learn new semantics that are unexpected in Python Classes primarily use composition in preference to inheritance Beyond a very small number of simple variables, a class’ purpose is to acquire state for another class or it uses another class to acquire state in particular if the state is from a remote service. If you use the Context Parameter pattern, it is critical that the state of the context does not change after calling its __init__(), i.e. it should be immutable If a class’ purpose is to represent an external integration, you probably want numerous classes to compose the service: RemoteDataClient, DomainManager, ContextManager, Factory, NotificationController, DomainResponse, DataFormatter and so on.
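A few of the checklist items above are easier to see in code. The sketch below is my own illustration rather than part of the original checklist; the function names add_item_shared and add_item_fresh are hypothetical. It shows the shared mutable default, the None-default alternative, isinstance over type comparison, enumerate, and time.perf_counter.

```python
import time


def add_item_shared(item, items=[]):
    # Anti-pattern: this default list is created once, at definition time,
    # so every call that omits `items` appends to the same object
    items.append(item)
    return items


def add_item_fresh(item, items=None):
    # None marks missing input; the caller's sequence is copied, never mutated
    result = list() if items is None else list(items)
    result.append(item)
    return result


shared_first = add_item_shared(1)
shared_second = add_item_shared(2)  # the same list object as shared_first
fresh_first = add_item_fresh(1)
fresh_second = add_item_fresh(2)    # an independent list

# isinstance accepts subclasses, whereas type(p) == tuple would reject them
point = (1, 2)
is_tuple = isinstance(point, tuple)

# enumerate supplies the index without range(len(...))
labels = [f"{index}:{name}" for index, name in enumerate(["a", "b"])]

# perf_counter is a monotonic, high-resolution clock intended for timing
start = time.perf_counter()
sum(range(10_000))
elapsed = time.perf_counter() - start
```

After this runs, shared_first and shared_second are one and the same list holding both items, while fresh_first and fresh_second each hold a single item.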

October 30, 2023 · 5 min

Kinesis 360 Pro keyboard

Kinesis is a company based near Seattle that offers computer keyboards with ergonomic designs as alternatives to the traditional keyboard design. The most widely known of these is the contoured Advantage line, which features recessed keys in two bucket-like hollows to allow the user’s fingers to reach keys with less effort. The Advantage 360 line was released in 2022 and is still insanely popular and challenging to get hold of. The pro edition allows you to customise the firmware, which is ZMK based. Kinesis have outsourced the actual job of compiling the firmware to GitHub Actions. ...

February 26, 2023 · 4 min

Arch Linux

After witnessing insane minimalism paired with a tiler (tiling window manager), I knew it was time to make the pilgrimage to Arch Linux. Some characteristics that make Arch unique: The Arch Way embodies the principles behind Arch Linux: simplicity, modernity, pragmatism, user centrality and versatility. Forces one to build the system up by hand. This encourages you to question the role of each component of the system, and available options to satisfy that component (e.g. the terminal emulator). The result is a highly tailored and minimal system that meets precisely your needs. Practical and pragmatic documentation. The Arch Wiki is the gold standard when it comes to documentation. The Arch User Repository (AUR) is a treasure chest of pre-packaged useful recent software. Somehow every program I’ve ever needed has been available on AUR. Rolling upgrades. Arch was born in 2001, when Canadian programmer Judd Vinet, inspired by the elegance of systems such as Slackware and the BSDs, set out to build his own distro based on a similar ethos. The first formal release, 0.1, dropped on March 11, 2002. ...

April 6, 2019 · 19 min

Async Python

Background Using asyncio will not make your code multi-threaded. That is, it will not cause multiple Python instructions to be executed at the same time, and it will not in any way allow you to sidestep the so-called “global interpreter lock” (GIL). Some processes are CPU-bound: they consist of a series of instructions which need to be executed one after another until the result has been computed. Most of their time is spent making heavy use of the processor. ...
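The point that asyncio is not multi-threading can be checked directly. In this minimal sketch (the coroutine name fetch and the delays are made up for illustration, with asyncio.sleep standing in for an I/O wait), three coroutines overlap their waiting, yet each reports the same thread id.

```python
import asyncio
import threading
import time


async def fetch(name: str, delay: float) -> tuple[str, int]:
    # Awaiting hands control back to the event loop while this task waits
    await asyncio.sleep(delay)
    # Record which thread resumed the coroutine
    return name, threading.get_ident()


async def main() -> list[tuple[str, int]]:
    # gather schedules all three coroutines on the one event loop
    return await asyncio.gather(fetch("a", 0.1), fetch("b", 0.1), fetch("c", 0.1))


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

# A single element here means every coroutine ran on the same thread
thread_ids = {thread_id for _, thread_id in results}
print(f"threads used: {len(thread_ids)}, elapsed: {elapsed:.2f}s")
```

The waits overlap, so the total time should sit near the longest single delay rather than the sum of all three. Swapping asyncio.sleep for a CPU-bound loop would remove that benefit, since only one coroutine executes Python code at any moment.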

August 9, 2023 · 9 min