Entity Resolution and the Instability Problem

The classic entity resolution gotcha: the thing that looks like a primary key (e.g. Senzing’s entity ID) is actually a volatile cluster ID that can legitimately change as the engine learns. Senzing explicitly says their resolved entity ID is not a globally unique persistent identifier and that it’s just an identifier for a grouping that may be transient. (senzing.zendesk.com) ...

December 2, 2025 · 23 min

Entity Resolution with Senzing and the .NET SDK

Entity resolution is the process of identifying and linking records that refer to the same real-world entity across different data sources, even when the records contain variations, errors, or incomplete information. ...

September 19, 2025 · 15 min

Time Dimension Populate Script

Here is a very simple T-SQL script that will flesh out a time dimension for use with a SQL Server Analysis Services (SSAS) cube; it can easily be adapted to work with other vendor implementations. The AdventureWorks DW provides a nice reference implementation for a time dimension. Unfortunately, it provides no guidance on actually populating the dimension. This script provides a repeatable, configurable way of building out a similar implementation. ...

July 21, 2011 · 2 min

Managing Database Evolution

On a new client’s site the other day, I observed that the more companies I work for, the deeper my knowledge of effective work practices becomes. In other words, over time you see things that work well, and things that don’t. I’m talking about simple practices that, when applied to teams, result in higher-quality and/or more efficient software. Databases and their associated artefacts (functions, triggers, message broker queues and so on) should be managed and versioned. Again, a simple problem with a simple solution, but one that tends to be practiced poorly in the real world. ...

June 5, 2010 · 2 min