Table of Contents
- Introduction
- Fundamentals of Feature Stores
- Why Redis Is a Strong Candidate
- Go: The Language for High‑Performance Services
- Architectural Blueprint
- Designing a Redis Schema for Feature Data
- Ingestion Pipeline in Go
- Serving Features at Scale
- Scaling Redis: Clustering, Sharding, and HA
- Observability & Monitoring
- Testing and Benchmarking
- Real‑World Case Study: E‑Commerce Recommendations
- Conclusion
- Resources
Introduction
Feature stores have emerged as the backbone of modern machine‑learning (ML) pipelines. They enable teams to store, version, and serve engineered features both offline (for batch training) and online (for real‑time inference). In a microservice‑centric architecture, each service may need to fetch dozens of features per request, often under strict latency budgets (sub‑10 ms) while the system processes thousands of requests per second.
Achieving this level of performance is not trivial. It requires a datastore that can:
- Persist feature values with minimal write latency.
- Scale horizontally to handle spikes in traffic.
- Expose low‑latency read paths that can be called from many concurrent goroutines.
- Provide strong consistency guarantees where needed (e.g., when a feature is updated just before it is consumed).
Redis, with its rich data structures, in‑memory speed, and mature clustering capabilities, is a natural fit for the online side of a feature store. Coupled with Go’s lightweight concurrency model and efficient networking stack, you can build microservices that ingest, update, and serve features at hundreds of thousands of operations per second.
This article walks through the end‑to‑end design of a realtime feature store built on Redis and Go. We’ll explore architectural decisions, schema design, ingestion pipelines, serving patterns, scaling strategies, observability, and testing. By the end, you should have a concrete blueprint you can adapt to your own high‑throughput microservice environment.
Fundamentals of Feature Stores
What Is a Feature Store?
A feature store is a centralized repository that:
- Ingests raw event data (clicks, sensor readings, logs) and transforms it into features.
- Version‑controls those features so that the same definition can be used for both training and serving.
- Caches the latest feature values for low‑latency retrieval during inference.
Think of it as a “database for ML features” rather than a generic key‑value store. It bridges the gap between data engineering pipelines and model serving layers.
Real‑Time vs. Batch Features
| Dimension | Batch Features | Real‑Time Features |
|---|---|---|
| Update Frequency | Hours‑to‑days | Seconds‑to‑milliseconds |
| Latency Requirement | Minutes–hours | < 10 ms (often < 5 ms) |
| Typical Storage | Data lake / warehouse | In‑memory store (Redis, Aerospike) |
| Use Cases | Model training, offline analytics | Online inference, personalization, fraud detection |
A robust system typically combines both: batch pipelines populate long‑term aggregates, while a real‑time component captures the freshest signals.
Core Requirements for a Real‑Time Feature Store
- Low Write Latency – Features are often generated by streaming pipelines (Kafka, Kinesis) and must be persisted within a few milliseconds.
- High Read Throughput – A single inference request may need 10–30 features; a service serving 10 k RPS translates to 100 k–300 k reads per second.
- Strong Consistency – Some features (e.g., “account status”) must reflect the latest state instantly.
- Scalable Architecture – Ability to add nodes without downtime.
- Observability – Metrics, tracing, and alerts for latency, error rates, and resource usage.
Why Redis Is a Strong Candidate
Redis is more than a simple key‑value cache; it offers a versatile set of data structures and operational primitives that align closely with feature‑store needs.
1. In‑Memory Speed with Persistence Options
- Memory‑level reads are sub‑microsecond. Even with network overhead, round‑trip latencies stay within a few milliseconds.
- AOF (Append‑Only File) and RDB snapshots give durability. You can configure every‑write AOF for strict durability or every‑second AOF for a performance‑first approach.
2. Rich Data Structures
| Structure | Ideal Use Case for Features |
|---|---|
| Hash | Storing a compact set of feature values per entity (user:1234 -> {age:30, country:US, churn_score:0.12}) |
| Sorted Set (ZSET) | Time‑series of event‑driven features (e.g., click counts per minute) |
| Stream | Ingestion pipeline that guarantees ordering and replayability |
| Bitmap | Boolean flags for huge cardinalities (e.g., feature “has_seen_promo”) |
| Lua Scripts | Atomic read‑modify‑write, multi‑key fetches, and complex calculations without round‑trips |
3. Horizontal Scaling with Redis Cluster
- Sharding across 16384 hash slots enables linear scaling.
- Replica nodes provide read‑scaling and HA.
- Automatic failover via Redis Sentinel or native cluster mechanisms.
4. Pub/Sub & Keyspace Notifications
- Enable event‑driven invalidation of downstream caches.
- Useful for feature version propagation across microservices.
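Keyspace notifications are disabled by default, so they must be switched on explicitly. A minimal sketch (database 0 and the flag set `Kh` — keyspace channel plus hash commands — are illustrative choices):

```shell
# Enable keyspace-channel (K) events for hash commands (h); add g (generic)
# or x (expired) as needed for your invalidation strategy.
redis-cli CONFIG SET notify-keyspace-events Kh

# Subscribers then receive the event name on a per-key channel, e.g.:
#   channel: __keyspace@0__:user:1234   message: "hset"
```

Because the flag set is a runtime config, it can also be pinned in redis.conf so it survives restarts.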
5. Ecosystem & Tooling
- Mature client libraries for Go (github.com/redis/go-redis/v9).
- Integration with Prometheus, Grafana, and OpenTelemetry.
- CLI (redis-cli) for rapid debugging.
Collectively, these attributes make Redis a battle‑tested platform for the online layer of a feature store.
Go: The Language for High‑Performance Services
Go (Golang) has become the de‑facto language for building cloud‑native microservices. Its design choices align perfectly with the demands of a realtime feature store.
Concurrency Made Simple
- Goroutines are lightweight (≈2 KB stack) and can spawn millions of concurrent workers.
- Channels provide a clean way to pipeline data (e.g., ingest → transform → write).
Strong Standard Library
- net/http (with built‑in HTTP/2 support) for high‑performance APIs.
- context for request‑scoped cancellation and deadlines—critical for low‑latency services.
Static Binaries & Fast Startup
- Deploy as a single binary, ideal for containerized environments (Docker, Kubernetes).
- Fast cold start times (< 50 ms) keep pod scaling responsive.
Rich Ecosystem for Redis
- go-redis (github.com/redis/go-redis/v9) offers pipelining, cluster support, and Lua script execution.
- Integration with OpenTelemetry (go.opentelemetry.io/otel) for distributed tracing.
Example: Simple Redis Client in Go
package main
import (
"context"
"log"
"time"
"github.com/go-redis/redis/v9"
)
func main() {
ctx := context.Background()
rdb := redis.NewClusterClient(&redis.ClusterOptions{
Addrs: []string{
"redis-0:6379",
"redis-1:6379",
"redis-2:6379",
},
ReadOnly: true, // Enable read‑only replica usage
})
// Ping to ensure connectivity
if err := rdb.Ping(ctx).Err(); err != nil {
log.Fatalf("cannot connect to redis: %v", err)
}
// Example: set a feature hash for a user
key := "user:1234"
_, err := rdb.HSet(ctx, key, map[string]interface{}{
"age": 29,
"country": "US",
"churn_score": 0.07,
"last_active": time.Now().Unix(),
}).Result()
if err != nil {
log.Fatalf("failed to HSet: %v", err)
}
log.Println("feature set successfully")
}
The snippet demonstrates cluster connection, context‑aware calls, and a hash write—the core building block for a feature store.
Architectural Blueprint
Below is a high‑level view of how the components interact. While a diagram would be ideal, we’ll describe the flow textually.
- Event Sources – Kafka topics, Kinesis streams, or HTTP webhook endpoints produce raw events (clicks, transactions, sensor readings).
- Ingestion Service (Go) – Consumes events, performs feature transformations, and writes results to Redis using pipelines.
- Redis Cluster – Stores feature values in hashes, sorted sets, or streams. Replicas serve read traffic.
- Feature‑Serving API (Go) – Exposes HTTP/gRPC endpoints for downstream ML inference services. Retrieves required features in bulk, optionally using Lua scripts for atomicity.
- Cache Layer (Optional) – Local in‑process LRU caches or side‑car caches to reduce hot‑key pressure.
- Observability Stack – Prometheus scrapes Redis and Go metrics; OpenTelemetry collects traces from ingestion and serving services.
- Management & Ops – Kubernetes handles pod scaling, Helm charts provision Redis clusters, and Sentinel monitors health.
[Event Sources] → (Kafka) → [Ingestion Service] → (Pipelined writes) → [Redis Cluster] ↔ (Replica reads) ↔ [Feature‑Serving API] → (Inference Service)
Key design principles:
- Decouple ingestion from serving – Each can scale independently.
- Prefer write‑heavy pipelines – Batch multiple feature updates into a single network round‑trip.
- Read‑optimised data structures – Use hashes for point lookups and Lua scripts for multi‑key fetches.
- Stateless Go services – Enable horizontal scaling via Kubernetes Horizontal Pod Autoscaler (HPA).
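As one illustration of the last principle, a minimal HPA manifest for the serving API (the Deployment name feature-api, the replica bounds, and the 70 % CPU target are all assumptions to adapt):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: feature-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: feature-api   # the stateless Go serving service
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```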
Designing a Redis Schema for Feature Data
A well‑thought‑out schema determines both memory efficiency and query speed. Below are common patterns.
1. Entity‑Centric Hashes
Store all features for a given entity (user, device, product) in a single hash.
Key: user:{user_id}
Fields:
age → integer
country → string
churn_score → float
last_active → unix timestamp
premium_flag → 0/1
Pros: O(1) read for all features with HMGET or HGETALL.
Cons: The hash grows with the number of features; very large hashes lose Redis’s compact listpack encoding, and each field value (a string) is capped at 512 MB—in practice you should stay far below that.
2. Time‑Series with Sorted Sets
When a feature is a temporal aggregate (e.g., clicks per minute), a sorted set provides ordering.
Key: clicks:user:{user_id}
Members: <timestamp> → <click_count>
Score: Unix epoch in seconds (or milliseconds)
You can query the last N minutes with ZRANGE and compute rolling windows on the client or via Lua.
3. Feature Versioning
To avoid stale reads after a model upgrade, embed a version token:
Key: feature_version:{model_name}
Value: integer (e.g., 42)
When fetching features, prepend the version to the entity key:
user:42:{user_id}
If the version increments, downstream services automatically request the new hash.
4. TTL & Expiration
Real‑time features often become obsolete quickly (e.g., session‑level attributes). Use EXPIRE to automatically evict them:
rdb.Expire(ctx, key, 10*time.Minute)
5. Example Schema Definition (Go)
type FeatureSet struct {
	UserID      int64
	Age         int
	Country     string
	ChurnScore  float64
	LastActive  int64 // Unix timestamp
	PremiumFlag bool
}

// boolToInt encodes booleans as 0/1, Redis's usual flag convention.
func boolToInt(b bool) int {
	if b {
		return 1
	}
	return 0
}

// Serialize into a Redis hash map
func (f *FeatureSet) ToMap() map[string]interface{} {
	return map[string]interface{}{
		"age":          f.Age,
		"country":      f.Country,
		"churn_score":  f.ChurnScore,
		"last_active":  f.LastActive,
		"premium_flag": boolToInt(f.PremiumFlag),
	}
}
Ingestion Pipeline in Go
The ingestion service must process high‑volume streams while keeping latency low. Below is a pattern that leverages Go’s concurrency primitives and Redis pipelining.
1. Worker Pool Architecture
- Reader Goroutine – Pulls messages from Kafka (or any source) and pushes them onto a buffered channel.
- N Worker Goroutines – Each reads from the channel, transforms the raw payload into a FeatureSet, and appends the write command to its own Redis pipeline.
- Periodic Flush – Each worker executes its pipeline (Exec) when the batch fills or a ticker fires, and records metrics.
2. Code Example
package main
import (
	"context"
	"fmt"
	"log"
	"sync"
	"time"

	"github.com/redis/go-redis/v9"
	"github.com/segmentio/kafka-go"
)
const (
workerCount = 32
pipelineSize = 500 // number of commands per Exec
flushInterval = 5 * time.Millisecond
)
type rawEvent struct {
Key string // e.g., "user:1234"
Value []byte // raw JSON payload
}
// Global Redis client (cluster)
var rdb *redis.ClusterClient
func initRedis() {
rdb = redis.NewClusterClient(&redis.ClusterOptions{
Addrs: []string{"redis-0:6379", "redis-1:6379", "redis-2:6379"},
})
}
// Transform raw JSON into FeatureSet (simplified)
func parseEvent(e rawEvent) (*FeatureSet, error) {
// In practice use json.Unmarshal and proper error handling
// Here we assume a deterministic mapping for brevity
return &FeatureSet{
UserID: 1234,
Age: 29,
Country: "US",
ChurnScore: 0.07,
LastActive: time.Now().Unix(),
PremiumFlag: false,
}, nil
}
func main() {
ctx := context.Background()
initRedis()
defer rdb.Close()
// 1️⃣ Kafka reader (simplified)
kafkaReader := kafka.NewReader(kafka.ReaderConfig{
Brokers: []string{"kafka:9092"},
Topic: "user-events",
GroupID: "feature-ingest",
})
defer kafkaReader.Close()
// Buffered channel for events
events := make(chan rawEvent, 10000)
// 2️⃣ Launch workers
var wg sync.WaitGroup
for i := 0; i < workerCount; i++ {
wg.Add(1)
go func(id int) {
defer wg.Done()
pipeline := rdb.Pipeline()
counter := 0
flushTicker := time.NewTicker(flushInterval)
defer flushTicker.Stop()
for {
select {
case ev, ok := <-events:
if !ok {
// drain remaining commands
if counter > 0 {
_, _ = pipeline.Exec(ctx)
}
return
}
feat, err := parseEvent(ev)
if err != nil {
log.Printf("worker %d: parse error: %v", id, err)
continue
}
// Queue a HSET command in the pipeline
pipeline.HSet(ctx, ev.Key, feat.ToMap())
counter++
if counter >= pipelineSize {
_, _ = pipeline.Exec(ctx)
counter = 0
}
case <-flushTicker.C:
if counter > 0 {
_, _ = pipeline.Exec(ctx)
counter = 0
}
}
}
}(i)
}
// 3️⃣ Producer: read from Kafka and push to channel
go func() {
for {
m, err := kafkaReader.ReadMessage(ctx)
if err != nil {
log.Printf("kafka read error: %v", err)
continue
}
events <- rawEvent{
Key: fmt.Sprintf("user:%s", m.Key),
Value: m.Value,
}
}
}()
// Graceful shutdown handling omitted for brevity
wg.Wait()
}
Explanation of key ideas:
- Pipeline Batching: pipelineSize controls the number of commands per network round‑trip. With a batch of 500 HSETs, the per‑command latency drops dramatically.
- Flush Interval: Guarantees that even low‑traffic periods don’t stall writes indefinitely.
- Worker Count: Tuned based on CPU cores and expected event rate. Each worker holds its own pipeline to avoid contention.
- Back‑pressure: The buffered channel (events) smooths spikes; if it fills, Kafka’s consumer will naturally block, providing back‑pressure to upstream producers.
Serving Features at Scale
Once features are stored, the serving layer must retrieve them quickly and atomically. Below are patterns to achieve sub‑10 ms response times.
1. Bulk Reads with HMGET / HGETALL
If a request needs all features for an entity, HGETALL returns the entire hash in a single round‑trip.
func getAllFeatures(ctx context.Context, userID int64) (map[string]string, error) {
key := fmt.Sprintf("user:%d", userID)
return rdb.HGetAll(ctx, key).Result()
}
2. Multi‑Entity Fetch with Pipelining
When an inference request requires features for multiple users (e.g., batch scoring), pipeline the reads:
func batchGetFeatures(ctx context.Context, ids []int64) ([]map[string]string, error) {
pipe := rdb.Pipeline()
cmds := make([]*redis.MapStringStringCmd, len(ids))
for i, id := range ids {
key := fmt.Sprintf("user:%d", id)
cmds[i] = pipe.HGetAll(ctx, key)
}
_, err := pipe.Exec(ctx)
if err != nil && err != redis.Nil {
return nil, err
}
results := make([]map[string]string, len(ids))
for i, cmd := range cmds {
results[i], _ = cmd.Result()
}
return results, nil
}
3. Atomic Multi‑Key Reads via Lua
Sometimes you need a consistent snapshot across several keys (e.g., user features + session flags). A Lua script can fetch all required fields atomically, eliminating race conditions. Note that in Redis Cluster, every key a single script invocation touches must hash to the same slot—use hash tags (covered in the scaling section) to co‑locate them.
-- get_features.lua
-- NOTE: index the result by position; Lua tables with string keys
-- convert to an empty Redis reply.
local result = {}
for i, key in ipairs(KEYS) do
  result[i] = redis.call('HGETALL', key)
end
return result
Go wrapper:
var getFeaturesScript = redis.NewScript(`
local result = {}
for i, key in ipairs(KEYS) do
  result[i] = redis.call('HGETALL', key)
end
return result
`)

func getFeaturesAtomic(ctx context.Context, keys []string) (map[string]map[string]string, error) {
	raw, err := getFeaturesScript.Run(ctx, rdb, keys).Result()
	if err != nil {
		return nil, err
	}
	// The reply is one flat field/value array per key, in KEYS order.
	replies, ok := raw.([]interface{})
	if !ok {
		return nil, fmt.Errorf("unexpected reply type %T", raw)
	}
	parsed := make(map[string]map[string]string, len(keys))
	for i, rep := range replies {
		fields, _ := rep.([]interface{})
		m := make(map[string]string, len(fields)/2)
		for j := 0; j+1 < len(fields); j += 2 {
			m[fmt.Sprint(fields[j])] = fmt.Sprint(fields[j+1])
		}
		parsed[keys[i]] = m
	}
	return parsed, nil
}
4. Read‑Only Replica Routing
Redis Cluster allows clients to read from replicas by setting the ReadOnly flag. This offloads read traffic from the primary nodes, improving throughput.
rdb := redis.NewClusterClient(&redis.ClusterOptions{
Addrs: []string{"redis-0:6379", "redis-1:6379"},
ReadOnly: true, // use replicas for GET/HRANGE
})
5. Caching Hot Features Locally
For ultra‑low latency, you may embed an in‑process LRU cache (e.g., github.com/hashicorp/golang-lru). Cache warm‑up can be done on startup or via a background refresh task.
cache, _ := lru.New(10000) // store up to 10k entries
func getFeatureWithCache(ctx context.Context, userID int64) (FeatureSet, error) {
if v, ok := cache.Get(userID); ok {
return v.(FeatureSet), nil
}
// fallback to Redis
raw, err := rdb.HGetAll(ctx, fmt.Sprintf("user:%d", userID)).Result()
if err != nil {
return FeatureSet{}, err
}
// convert raw map to FeatureSet (omitted)
feat := mapToFeatureSet(raw)
cache.Add(userID, feat)
return feat, nil
}
Performance tip: Keep the cache size modest to avoid memory pressure on the service container; rely on Redis for the authoritative source.
Scaling Redis: Clustering, Sharding, and HA
A single Redis node caps at a few hundred thousand ops/sec depending on hardware. To support millions of operations per second, you need a proper cluster layout.
1. Redis Cluster Basics
- Hash Slots: 0‑16383, automatically distributed across master nodes.
- Masters + Replicas: Typical production setups use 3 master nodes with 1‑2 replicas each for fault tolerance.
- Cross‑Slot Operations: Lua scripts that touch multiple keys must ensure those keys reside on the same slot; use hash tags to co‑locate them (CLUSTER KEYSLOT can verify the mapping).
2. Sharding Strategy for Feature Keys
Because our keys follow a predictable pattern (user:{id}), we can use hash tags to force co-location when needed.
Key: user:{1234}:v42
Everything inside {} is hashed to the same slot. This is useful when a Lua script needs both a user hash and a version hash.
3. Adding Nodes Without Downtime
Redis Cluster supports online rebalancing:
# Add a new master
redis-cli --cluster add-node new-node:6379 existing-node:6379
# Reshard 1/4 of the slots (4096) to the new master
# (--cluster-from/--cluster-to take node IDs, listed by CLUSTER NODES)
redis-cli --cluster reshard existing-node:6379 --cluster-from <source-node-id> --cluster-to <new-node-id> --cluster-slots 4096
Automation tools (e.g., Redis Operator for Kubernetes) handle these steps automatically.
4. High Availability with Sentinel vs. Native Cluster
- Sentinel works with a single‑master topology, providing failover but no sharding.
- Native Cluster provides both sharding and HA. For most feature‑store use cases, native clustering is recommended.
5. Persistence Trade‑offs
| Persistence Mode | Durability | Write Latency | Memory Overhead |
|---|---|---|---|
| RDB Snapshot (every 5 min) | Low (data loss up to 5 min) | Minimal | Fork copy‑on‑write spikes |
| AOF (every‑second) | Medium (≤1 s loss) | Slight increase (fsync every sec) | Slight |
| AOF (every‑write) | High (no loss) | Higher (fsync per write) | None |
| No persistence | None | Lowest | None |
For a real‑time feature store that can tolerate a few seconds of loss (features are regenerated from streams), AOF every‑second strikes a good balance.
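The every-second setting corresponds to this redis.conf fragment (a sketch; AOF rewrite thresholds are left at defaults and should be tuned separately):

```conf
appendonly yes
appendfsync everysec   # fsync once per second (≤1 s loss window)
# appendfsync always   # strict durability, at the cost of write latency
```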
6. Resource Sizing Example
| Metric | Approx. Requirement |
|---|---|
| CPU | 2‑4 vCPU per master for 500 k ops/s |
| Memory | 2 GB per master (depends on feature cardinality) |
| Network | 10 Gbps NIC for multi‑node cluster |
| Disk (AOF) | High‑throughput SSD, ~200 MB/s write |
Use Redis INFO and Prometheus alerts to monitor instantaneous_ops_per_sec, used_memory, and aof_current_rewrite_time_sec.
Observability & Monitoring
A production feature store must expose metrics, traces, and logs to detect latency spikes or data inconsistencies.
1. Redis Metrics Exporter
The widely used redis_exporter (oliver006/redis_exporter) exposes Prometheus metrics including:
- redis_commands_processed_total
- redis_instantaneous_ops_per_sec
- redis_keyspace_hits_total / redis_keyspace_misses_total
- redis_latency_seconds (via the LATENCY command)
Run as a sidecar or DaemonSet:
apiVersion: v1
kind: Service
metadata:
name: redis-exporter
spec:
selector:
app: redis
ports:
- port: 9121
targetPort: 9121
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: redis-exporter
spec:
replicas: 1
selector:
matchLabels:
app: redis-exporter
template:
metadata:
labels:
app: redis-exporter
spec:
containers:
- name: exporter
image: oliver006/redis_exporter:latest
args: ["--redis.addr=redis-cluster:6379"]
ports:
- containerPort: 9121
2. Go Service Metrics
Using Prometheus client:
var (
ingestionLatency = prometheus.NewHistogramVec(prometheus.HistogramOpts{
Namespace: "feature_store",
Subsystem: "ingestion",
Name: "batch_latency_seconds",
Buckets: prometheus.ExponentialBuckets(0.001, 2, 10),
}, []string{"status"})
)
func init() {
prometheus.MustRegister(ingestionLatency)
}
Record latency around pipeline.Exec.
3. Distributed Tracing
Instrument both ingestion and serving services with OpenTelemetry:
tracer := otel.Tracer("feature-store-ingestion")
ctx, span := tracer.Start(ctx, "processEvent")
defer span.End()
// add attributes
span.SetAttributes(attribute.String("user_id", fmt.Sprint(feat.UserID)))
Export traces to Jaeger or Tempo for end‑to‑end latency visualization.
4. Logging Practices
- Use structured JSON logs (zap or zerolog).
- Include request IDs (X-Request-ID header) to correlate logs across services.
- Log Redis errors at WARN level, but pipeline failures at ERROR.
Testing and Benchmarking
Before rolling out to production, validate that your design meets the required throughput and latency.
1. Load Testing with k6
import http from 'k6/http';
import { check, sleep } from 'k6';
export const options = {
stages: [
{ duration: '30s', target: 100 }, // ramp up
{ duration: '2m', target: 1000 }, // sustain 1k RPS
{ duration: '30s', target: 0 }, // ramp down
],
};
export default function () {
const res = http.get('http://feature-api:8080/v1/features?user_id=1234');
check(res, { 'status 200': (r) => r.status === 200 });
sleep(0.01);
}
Collect response time percentiles and ensure the 95th percentile stays below the latency SLA (e.g., 8 ms).
2. Go Benchmarks for Redis Pipelines
func BenchmarkPipelineWrite(b *testing.B) {
ctx := context.Background()
pipelineSize := 1000
pipe := rdb.Pipeline()
for i := 0; i < b.N; i++ {
for j := 0; j < pipelineSize; j++ {
key := fmt.Sprintf("user:%d", j)
pipe.HSet(ctx, key, map[string]interface{}{
"age": 30,
"country": "US",
"churn_score": 0.12,
})
}
_, err := pipe.Exec(ctx)
if err != nil && err != redis.Nil {
b.Fatalf("pipeline exec error: %v", err)
}
}
}
Run with go test -bench=. -benchmem and observe ops per second. Fine‑tune pipelineSize and workerCount based on the results.
3. Failure Injection
- Use Chaos Mesh or Pumba to simulate node loss.
- Verify that the Go client automatically retries and that the cluster re‑elects masters without downtime.
4. Consistency Checks
After a bulk write, read back a random sample and compare against the source payload. Automate this as part of a CI pipeline.
Real‑World Case Study: E‑Commerce Recommendations
Scenario
An online marketplace wants to power personalized product recommendations in real time. The model consumes:
- User profile features (age, location, loyalty tier)
- Session activity (last 5 clicks, time‑on‑site)
- Recent purchase history (last 3 orders)
The inference service must respond within 5 ms for ≈30 k RPS during peak traffic.
Architecture Deployed
| Component | Tech Stack | Scale |
|---|---|---|
| Event Ingestion | Kafka (3 partitions) → Go microservice | 200 k events/s |
| Feature Store | Redis Cluster (3 masters, 2 replicas) | 2 GB RAM per master |
| Serving API | Go gRPC server (2 vCPU, 4 GB RAM) | 30 k RPS |
| Monitoring | Prometheus + Grafana + Jaeger | – |
| CI/CD | GitHub Actions, Argo CD | – |
Key Implementation Details
- Hash‑based Entity Store: user:{id}:v5, where v5 is the model version.
- Session Features in Sorted Set: session:{id} with score = timestamp, member = "click:{product_id}".
- Batch Reads: The gRPC server receives a batch of 10 user IDs, pipelines HMGET for profile hashes and ZRANGE for session sets, achieving ~3 ms total latency.
- Lua Script for Atomic Snapshot: A single script fetched profile + last 5 clicks in one server‑side execution, eliminating race windows.
- Cache Warm‑up: At service start, the top 10k active users are pre‑loaded into an LRU cache (size 20k) to guarantee sub‑2 ms responses for the hottest traffic.
Performance Numbers
| Metric | Value |
|---|---|
| Ingestion throughput | 250 k HSETs / sec (pipeline size 1000) |
| Serving latency (p99) | 4.8 ms |
| Redis CPU utilization | 78 % on masters |
| Autoscaling events | 2 scale‑outs per day during flash‑sale spikes |
The system demonstrated linear scaling: adding a fourth master increased write throughput by ~30 % and reduced read latency by ~15 % under load.
Conclusion
Scaling a realtime feature store for high‑throughput microservices is a multi‑disciplinary challenge that blends data modeling, systems engineering, and Go programming expertise. By leveraging:
- Redis’s in‑memory speed, rich data structures, and native clustering,
- Go’s lightweight concurrency, strong typing, and robust tooling, and
- Thoughtful schema design, pipelined ingestion, atomic reads, and observability,
you can build a platform that delivers sub‑10 ms latency at hundreds of thousands of operations per second. The patterns explored—entity‑centric hashes, time‑series sorted sets, Lua‑based atomic fetches, and worker‑pool pipelines—provide a solid foundation that can be adapted to any domain, from recommendation engines to fraud detection.
Remember that operational excellence—monitoring, automated failover, and rigorous testing—are as crucial as the code itself. With the blueprint and examples in this article, you’re equipped to design, implement, and operate a production‑grade realtime feature store that scales alongside your microservice ecosystem.
Happy coding, and may your features always be fresh! 🚀
Resources
- Redis Documentation – Data Structures & Clustering
- Go Redis Client – github.com/redis/go-redis/v9
- Feature Store Overview – Google Cloud Blog
- OpenTelemetry for Go
- Prometheus Redis Exporter