martinuke0's Blog

---
title: "Why Conflict-Free Replicated Data Types Enable Seamless Collaborative Editing"
date: "2026-05-18T18:00:56.182"
draft: false
tags: ["CRDT", "collaborative-editing", "distributed-systems", "real-time", "consistency"]
description: "Explore how Conflict‑Free Replicated Data Types (CRDTs) power real‑time collaborative editors, guaranteeing convergence without central coordination."
summary: "This article explains the theory behind CRDTs, their practical benefits for collaborative editing, and real‑world libraries that make seamless co‑authoring possible."
showToc: true
TocOpen: false
cover:
  image: "/images/covers/2026-05-18-python-illustration-of-a-gcounter-merge.svg"
  alt: "Illustration of multiple users editing the same document simultaneously."
  caption: ""
  relative: false
---

TL;DR — CRDTs let every participant edit a shared document locally and later merge changes automatically, guaranteeing that all replicas converge to the same state without complex conflict resolution or a single point of failure.

Collaborative editing tools such as Google Docs, Notion, or Figma feel effortless because users can type, move objects, or comment at the same time without stepping on each other’s toes. Under the hood, many of these products rely on Conflict‑Free Replicated Data Types (CRDTs), a family of mathematically proven data structures that ensure eventual consistency across distributed replicas. This post demystifies CRDTs, explains why they are a natural fit for real‑time collaboration, and showcases concrete libraries you can adopt today.

The Core Challenge of Real‑Time Collaboration

When multiple users edit the same piece of data concurrently, the system must answer three questions:

1. What should the final state look like?
2. How do we merge divergent edits without losing user intent?
3. Can we do this with low latency and without a central bottleneck?

Traditional client‑server models solve #1 by routing every operation through a single authoritative server. This approach guarantees a total order of operations but introduces latency, limits offline work, and creates a single point of failure. Operational Transformation (OT), the algorithm behind early Google Docs, tries to reorder operations on the fly, but it requires complex transformation functions and a tightly coupled server to maintain causality.

CRDTs sidestep these pitfalls by designing data structures that commute: the result of applying operation A then B is identical to applying B then A. Because of this property, replicas can apply operations in any order, and once all operations have been delivered, every replica converges to the same state automatically.

What Are CRDTs? A Primer

A Conflict‑Free Replicated Data Type is a data structure equipped with two essential ingredients:

Ingredient	Description
Merge Function	A deterministic, associative, commutative, and idempotent function (`merge`) that combines two replica states into a new state.
Monotonic Updates	Operations that only add information (or remove in a way that can be undone) so that state never regresses.

Because merge is *commutative* (a ⊔ b = b ⊔ a) and *associative* (a ⊔ (b ⊔ c) = (a ⊔ b) ⊔ c), the order of delivery does not matter. Idempotence (a ⊔ a = a) guarantees that duplicate messages have no effect, which is crucial over unreliable networks.
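The three merge laws can be checked mechanically. The sketch below uses plain set union as the merge function of a grow-only set; the replica contents are made-up sample data, and the helper name `merge` is chosen here only for illustration.

```python
from itertools import permutations

def merge(a: frozenset, b: frozenset) -> frozenset:
    """Join (⊔) of two grow-only sets: plain set union."""
    return a | b

# Hypothetical replica states, used only for illustration
a = frozenset({"x"})
b = frozenset({"y", "z"})
c = frozenset({"x", "z"})

commutative = merge(a, b) == merge(b, a)
associative = merge(a, merge(b, c)) == merge(merge(a, b), c)
idempotent = merge(a, a) == a

# Merge the three states in every possible order and collect the outcomes
results = set()
for order in permutations([a, b, c]):
    state = frozenset()
    for replica_state in order:
        state = merge(state, replica_state)
    results.add(state)

print(commutative, associative, idempotent, len(results))  # True True True 1
```

Because all six delivery orders collapse into a single outcome, no replica needs to know in which order the others merged.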

Two families of CRDTs exist:

**State‑based (a.k.a. Convergent) CRDTs** – replicas periodically exchange their entire state and run merge.
**Operation‑based (a.k.a. Commutative) CRDTs** – replicas broadcast operations that are guaranteed to commute.

Both families achieve the same eventual consistency guarantee, but they differ in bandwidth usage and implementation complexity.
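To make the difference concrete, here is a minimal sketch of the same counter in both styles. The function names, replica names, and operation format are assumptions made for this example; the operation-based half additionally assumes that every operation is delivered exactly once, which real systems have to ensure separately.

```python
import random

# State-based: each replica keeps a per-replica count and ships its whole state.
def merge_state(a: dict, b: dict) -> dict:
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}

state_a = {"A": 2}   # replica A incremented twice
state_b = {"B": 3}   # replica B incremented three times
state_total = sum(merge_state(state_a, state_b).values())

# Operation-based: replicas broadcast individual "add n" operations.
# Additions commute, so any delivery order yields the same total,
# provided each operation is applied exactly once.
ops = [("A", 1), ("A", 1), ("B", 1), ("B", 1), ("B", 1)]

def apply_ops(operations) -> int:
    total = 0
    for _replica, amount in operations:
        total += amount
    return total

shuffled = ops[:]
random.shuffle(shuffled)   # simulate an arbitrary delivery order
op_total = apply_ops(shuffled)

print(state_total, op_total)  # 5 5
```

The state-based version sends more data per message but survives duplicates, while the operation-based version sends less but leans on the delivery layer.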

Example: A Grow‑Only Counter (G‑Counter)

A G‑Counter is the simplest state‑based CRDT. Each replica maintains a map from replica ID to a non‑negative integer. Incrementing the counter locally adds 1 to the replica’s own entry. Merging two counters takes the element‑wise maximum.

# Python illustration of a G‑Counter merge
def merge(counter_a, counter_b):
    merged = {}
    for replica in set(counter_a) | set(counter_b):
        merged[replica] = max(counter_a.get(replica, 0), counter_b.get(replica, 0))
    return merged

If replica A has {A: 3, B: 1} and replica B has {A: 2, B: 4}, the merge yields {A: 3, B: 4}. No matter which replica receives the other’s state first, the final result is the same, illustrating the core CRDT property.
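The same example can be wrapped in a small class with local increments and a total value. This is a sketch rather than a production implementation, and the class and method names (`GCounter`, `increment`, `value`) are chosen here for illustration.

```python
class GCounter:
    """Grow-only counter: one non-negative entry per replica."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        # The counter's value is the sum of all per-replica entries
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise maximum, identical to the merge function above
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)

# Reproduce the states from the text
a = GCounter("A")
b = GCounter("B")
a.counts = {"A": 3, "B": 1}
b.counts = {"A": 2, "B": 4}

a.merge(b)
b.merge(a)
b.merge(a)  # a duplicate delivery changes nothing

print(a.value(), b.value(), a.counts == b.counts)  # 7 7 True
```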

CRDT Types Relevant to Text Editing

Text editing demands more expressive structures than a simple counter. The most widely used CRDTs for collaborative text are:

CRDT	Core Idea	Typical Use
RGA (Replicated Growable Array)	Maintains a linked list of atoms with unique identifiers; insertion is performed by referencing the predecessor’s ID.	Linear text buffers, ordered lists.
WOOT (WithOut Operational Transform)	Stores characters with tombstones for deletions, allowing inserts and deletes to commute.	Early research prototypes.
LSEQ	Uses a variable‑length identifier space to keep identifiers short even after many insertions.	Research editors (Yjs itself uses a different algorithm, YATA).
Automerge List	Combines a JSON‑like tree with per‑field timestamps; resolves concurrent edits by deterministic rules.	General purpose document models.

All these CRDTs share a position‑identifier scheme: each character (or atom) gets a globally unique ID that encodes its logical position. When two users insert at the same logical location, the IDs are ordered by a deterministic tie‑breaker (e.g., replica ID). Deletions are modeled as *tombstones* or as *remove‑sets* that are merged with the insert set.

Insertion Example with LSEQ

// JavaScript snippet: two Yjs replicas insert concurrently at position 0.
// Note: Yjs orders characters with its own YATA algorithm rather than LSEQ,
// but the principle of unique, comparable identifiers is the same.
import * as Y from 'yjs'

const docA = new Y.Doc()
const docB = new Y.Doc()

// User A inserts "Hello"; every character receives a unique ID
docA.getText('shared').insert(0, "Hello")

// User B concurrently inserts "World" at position 0
docB.getText('shared').insert(0, "World")

// Exchange updates in both directions so each replica sees both inserts
Y.applyUpdate(docB, Y.encodeStateAsUpdate(docA))
Y.applyUpdate(docA, Y.encodeStateAsUpdate(docB))

Because the identifiers are comparable, the final order after merging both operations is deterministic: either "WorldHello" or "HelloWorld" depending on the tie‑breaker, but all replicas agree on the same ordering.
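To see how a deterministic tie-breaker produces the same order everywhere, here is a simplified, insert-only sketch in the style of RGA from the table above. It is not the algorithm of Yjs or any other library: the class `RgaReplica` and its methods are hypothetical, deletions are left out, and the code assumes each sender's operations arrive in order and exactly once.

```python
class RgaReplica:
    """Toy RGA-style text replica (insert-only, hypothetical API)."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.clock = 0    # Lamport clock
        self.items = []   # [(element_id, char)], element_id = (clock, replica_id)

    def text(self) -> str:
        return "".join(char for _id, char in self.items)

    def insert(self, index: int, string: str) -> list:
        """Insert locally and return the operations to broadcast."""
        ops = []
        ref = self.items[index - 1][0] if index > 0 else None
        for char in string:
            self.clock += 1
            op = ((self.clock, self.replica_id), ref, char)
            self.apply(op)
            ops.append(op)
            ref = op[0]   # the next character goes right after this one
        return ops

    def apply(self, op) -> None:
        """Integrate a local or remote insert operation."""
        new_id, ref, char = op
        self.clock = max(self.clock, new_id[0])
        pos = 0
        if ref is not None:
            pos = next(i for i, (eid, _c) in enumerate(self.items) if eid == ref) + 1
        # Tie-breaker: skip over concurrent elements that carry a larger ID
        while pos < len(self.items) and self.items[pos][0] > new_id:
            pos += 1
        self.items.insert(pos, (new_id, char))

a, b = RgaReplica("A"), RgaReplica("B")
ops_a = a.insert(0, "Hello")   # user A types at position 0
ops_b = b.insert(0, "World")   # user B concurrently types at position 0

for op in ops_b:
    a.apply(op)
for op in ops_a:
    b.apply(op)

print(a.text(), b.text())  # WorldHello WorldHello
```

In this sketch the larger replica ID wins the tie, so "World" lands first on both replicas; a different tie-breaker would give "HelloWorld", but again identically everywhere.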

How CRDTs Eliminate Central Coordination

1. Local‑First Editing

Each client maintains a full replica of the document and can apply edits immediately, without waiting for network acknowledgment. This optimistic approach yields sub‑millisecond latency, the hallmark of modern collaborative tools.

2. Asynchronous Propagation

Changes are packaged as either state deltas (state‑based) or operation messages (operation‑based) and sent over any transport (WebSocket, peer‑to‑peer, or even email). Because merges are commutative and idempotent, the system tolerates out‑of‑order delivery, duplicate messages, and temporary partitions; a dropped packet only delays convergence until the update is resent or a later state sync covers it.
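A small simulation can make the out-of-order tolerance concrete. The sketch below replays the same set of state messages in several random orders, with some messages delivered twice, and checks that every run ends in the same state. The message contents and the `deliver` helper are invented for this example.

```python
import random

def merge(a: dict, b: dict) -> dict:
    """Element-wise maximum, as in the G-Counter merge above."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}

# Hypothetical states broadcast by three replicas over time
messages = [{"A": 1}, {"A": 2}, {"B": 1}, {"C": 1}, {"A": 2, "C": 2}]

def deliver(msgs: list, seed: int) -> dict:
    rng = random.Random(seed)
    inbox = msgs + rng.sample(msgs, k=2)   # two messages arrive twice
    rng.shuffle(inbox)                     # arbitrary arrival order
    state = {}
    for msg in inbox:
        state = merge(state, msg)
    return state

finals = [deliver(messages, seed) for seed in range(5)]
all_equal = all(f == finals[0] for f in finals)

print(sorted(finals[0].items()), all_equal)  # [('A', 2), ('B', 1), ('C', 2)] True
```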

3. Automatic Convergence

When a replica receives a remote update, it runs the deterministic merge function. If two users edit the same character simultaneously, the tie‑breaker (often the replica ID) decides the order. The important guarantee is no data loss: every insertion is preserved somewhere in the final state, and deletions are represented as a remove‑set that is also merged.
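One simple way to model a merged remove-set is a two-phase set, in which additions and removals are both grow-only sets combined by union. The sketch below is illustrative only: the class name and the character IDs are assumptions, and real text CRDTs attach this idea to ordered elements rather than a plain set.

```python
class TwoPhaseSet:
    """Add-set plus remove-set; both only grow and both merge by union."""

    def __init__(self):
        self.added = set()
        self.removed = set()

    def add(self, element) -> None:
        self.added.add(element)

    def remove(self, element) -> None:
        # Only elements this replica has seen can be removed
        if element in self.added:
            self.removed.add(element)

    def merge(self, other: "TwoPhaseSet") -> None:
        self.added |= other.added
        self.removed |= other.removed

    def elements(self) -> set:
        return self.added - self.removed

# Hypothetical character IDs of the form (replica, counter)
a, b = TwoPhaseSet(), TwoPhaseSet()
a.add(("A", 1))
a.add(("A", 2))
b.merge(a)            # B catches up with A

b.remove(("A", 1))    # B deletes a character...
a.add(("A", 3))       # ...while A concurrently inserts another

a.merge(b)
b.merge(a)

print(sorted(a.elements()), a.elements() == b.elements())
# [('A', 2), ('A', 3)] True
```

Nothing is ever discarded from `added`, which is why the insertion history survives the merge, and also why tombstones eventually need garbage collection.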

4. No Central Locking or Version Vectors

Operational Transformation requires a central server to maintain a history buffer and to compute transformation functions for each incoming operation. CRDTs avoid this overhead entirely. Some implementations still use version vectors for garbage collection (e.g., to prune tombstones), but these vectors are local and never block user interaction.
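For readers unfamiliar with version vectors, the following sketch shows the three operations that matter for garbage collection: merging, comparing, and computing the point that every replica has already seen. The function names and sample vectors are assumptions for illustration, not the API of any particular library.

```python
def vv_merge(a: dict, b: dict) -> dict:
    """Element-wise maximum of two version vectors."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}

def dominates(a: dict, b: dict) -> bool:
    """True if `a` has seen everything `b` has seen."""
    return all(a.get(r, 0) >= n for r, n in b.items())

def concurrent(a: dict, b: dict) -> bool:
    """Neither vector has seen all of the other's updates."""
    return not dominates(a, b) and not dominates(b, a)

def stable_cut(vectors: list) -> dict:
    """Element-wise minimum: updates at or below it were seen by every replica."""
    replicas = set().union(*vectors)
    return {r: min(v.get(r, 0) for v in vectors) for r in replicas}

vv_a = {"A": 3, "B": 1}
vv_b = {"A": 2, "B": 4}

print(concurrent(vv_a, vv_b))                        # True
print(sorted(vv_merge(vv_a, vv_b).items()))          # [('A', 3), ('B', 4)]
print(sorted(stable_cut([vv_a, vv_b]).items()))      # [('A', 2), ('B', 1)]
```

Under these assumptions, a tombstone created by an update at or below the stable cut is known to every replica and could be pruned safely.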

Real‑World Libraries and Projects

Library	Language	CRDT Model	Notable Use‑Cases
Automerge	JavaScript / TypeScript	Operation‑based JSON CRDT	Code editors, collaborative notebooks
Yjs	JavaScript / TypeScript	Operation‑based (YATA for text)	Figma‑like vector editors, Liveblocks
Delta State Replicated Data Types (Δ‑CRDTs)	Go, Erlang	Delta‑state‑based	Key‑value stores
Logoot	Java, C#	Operation‑based	Early research

These libraries expose high‑level APIs (`ytext.insert`, `automerge.change`) that hide the identifier gymnastics, allowing developers to focus on UI and business logic.

Example: Building a Minimal Collaborative Notepad with Yjs

# Install dependencies
npm install yjs y-websocket

// index.js – a tiny collaborative textarea
import * as Y from 'yjs'
import { WebsocketProvider } from 'y-websocket'

const ydoc = new Y.Doc()
const provider = new WebsocketProvider('wss://demos.yjs.dev', 'my-notepad', ydoc)
const ytext = ydoc.getText('shared')

const textarea = document.getElementById('editor')
textarea.value = ytext.toString()

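// Apply local changes to the CRDT as a minimal diff, so that concurrent
// remote edits outside the changed range are not deleted and re-inserted
textarea.addEventListener('input', () => {
  const oldText = ytext.toString()
  const newText = textarea.value
  // Length of the common prefix
  let start = 0
  while (start < oldText.length && start < newText.length &&
         oldText[start] === newText[start]) start++
  // Length of the common suffix that does not overlap the prefix
  let end = 0
  while (end < oldText.length - start && end < newText.length - start &&
         oldText[oldText.length - 1 - end] === newText[newText.length - 1 - end]) end++
  ydoc.transact(() => {
    ytext.delete(start, oldText.length - start - end)
    ytext.insert(start, newText.slice(start, newText.length - end))
  })
})

// Reflect remote changes in the UI (local edits are already on screen)
ytext.observe(event => {
  if (!event.transaction.local) textarea.value = ytext.toString()
})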

The snippet demonstrates local‑first editing (the textarea updates instantly) while the Yjs provider synchronizes changes over a WebSocket server. No explicit conflict resolution code is required; Yjs guarantees convergence.

Performance, Scalability, and Trade‑offs

While CRDTs remove many architectural headaches, they introduce new considerations:

Aspect	Benefit	Potential Cost
Bandwidth	Operation‑based CRDTs send tiny deltas (often < 100 bytes).	State‑based CRDTs may need to ship full state on reconnect, which can be large for big documents.
Memory	Tombstones keep deleted elements forever unless garbage‑collected.	Uncollected tombstones can bloat the data structure, especially in long‑running sessions.
Complexity of Identifiers	LSEQ, Logoot, and similar schemes keep identifiers short, even after many inserts.	Early CRDTs (e.g., WOOT) produce identifiers that grow linearly, hurting performance.
Undo/Redo	Some CRDTs support inverse operations making undo straightforward.	Others require additional metadata or a separate undo stack.
Security	End‑to‑end encryption works because the data model is deterministic.	Conflict‑resolution logic must be shared; exposing internal IDs may leak collaboration patterns.

Developers often choose a library that already implements efficient garbage collection (e.g., Yjs’s encoding of deletes) and that fits the latency budget of their product. For a typical web‑based text editor, operation‑based CRDTs with compact identifiers keep each update small, so local edits appear immediately and the delay before collaborators see them is dominated by the network round trip rather than by merge cost.
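The bandwidth trade-off in the table can be reduced by shipping only what changed since the last sync, loosely in the spirit of delta-state CRDTs. The sketch below is a deliberate simplification: `delta_since` and the replica names are hypothetical, and real implementations also track what each peer has acknowledged.

```python
def merge(a: dict, b: dict) -> dict:
    """Element-wise maximum, as in the G-Counter merge above."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}

def delta_since(current: dict, last_sent: dict) -> dict:
    """Entries that grew since the last sync (the 'delta')."""
    return {r: n for r, n in current.items() if n > last_sent.get(r, 0)}

# A counter with 100 entries, fully synchronised with a remote peer
local = {f"replica-{i}": i for i in range(1, 101)}
last_sent = dict(local)
remote = dict(local)

local["replica-7"] += 1                 # one local increment

delta = delta_since(local, last_sent)   # ship 1 entry instead of 100
remote = merge(remote, delta)

print(delta, remote == local, len(delta), len(local))
# {'replica-7': 8} True 1 100
```

Because the delta is itself a valid counter state, the receiver applies it with the ordinary merge function and no special code path.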

When to Prefer CRDTs Over Operational Transformation

Scenario	CRDT Preferred	Reason
Offline First Apps	✅	Users can continue editing offline; state sync occurs later without conflict.
Peer‑to‑Peer Collaboration	✅	No central server needed; each node runs the same merge logic.
Highly Dynamic Topologies (e.g., IoT)	✅	Devices join/leave arbitrarily; CRDTs tolerate churn.
Strict Regulatory Auditing	❌ (OT may be easier)	OT’s linear history can be simpler to audit; CRDTs produce a partial order that may need extra tooling.
Very Large Binary Blobs	❌ (CRDTs excel at structured data)	Merging large binary files often requires custom strategies beyond CRDTs.

In practice, many modern products blend both approaches: they use CRDTs for the core document model and fall back to OT‑style transform for legacy compatibility or for specific data types that are easier to express with OT.

Key Takeaways

CRDTs guarantee eventual consistency by using commutative, associative, and idempotent merge functions, eliminating the need for a central authority.
Local‑first editing provides instant feedback, while asynchronous propagation ensures all participants eventually see the same document.
**State‑based vs. operation‑based** CRDTs trade bandwidth for simplicity; most web editors favor operation‑based models with compact identifiers.
Real‑world libraries like Yjs, Automerge, and Delta‑CRDT frameworks abstract the mathematics, letting developers build collaborative features with a few lines of code.
Performance considerations (tombstone growth, identifier size) are mitigated by modern algorithms (LSEQ, RGA) and built‑in garbage collection.
**Choosing CRDTs** makes sense for offline‑first, peer‑to‑peer, or highly distributed scenarios, while OT may still be useful for audit‑heavy or binary‑heavy workloads.
