Table of Contents

  1. Introduction
  2. Fundamentals of Feature Stores
  3. Why Redis Is a Strong Candidate
  4. Go: The Language for High‑Performance Services
  5. Architectural Blueprint
  6. Designing a Redis Schema for Feature Data
  7. Ingestion Pipeline in Go
  8. Serving Features at Scale
  9. Scaling Redis: Clustering, Sharding, and HA
  10. Observability & Monitoring
  11. Testing and Benchmarking
  12. Real‑World Case Study: E‑Commerce Recommendations
  13. Conclusion
  14. Resources

Introduction

Feature stores have emerged as the backbone of modern machine‑learning (ML) pipelines. They enable teams to store, version, and serve engineered features both offline (for batch training) and online (for real‑time inference). In a microservice‑centric architecture, each service may need to fetch dozens of features per request, often under strict latency budgets (sub‑10 ms) while the system processes thousands of requests per second.

Achieving this level of performance is not trivial. It requires a datastore that can:

  • Persist feature values with minimal write latency.
  • Scale horizontally to handle spikes in traffic.
  • Expose low‑latency read paths that can be called from many concurrent goroutines.
  • Provide strong consistency guarantees where needed (e.g., when a feature is updated just before it is consumed).

Redis, with its rich data structures, in‑memory speed, and mature clustering capabilities, is a natural fit for the online side of a feature store. Coupled with Go’s lightweight concurrency model and efficient networking stack, you can build microservices that ingest, update, and serve features at hundreds of thousands of operations per second.

This article walks through the end‑to‑end design of a real‑time feature store built on Redis and Go. We’ll explore architectural decisions, schema design, ingestion pipelines, serving patterns, scaling strategies, observability, and testing. By the end, you should have a concrete blueprint you can adapt to your own high‑throughput microservice environment.


Fundamentals of Feature Stores

What Is a Feature Store?

A feature store is a centralized repository that:

  1. Ingests raw event data (clicks, sensor readings, logs) and transforms it into features.
  2. Version‑controls those features so that the same definition can be used for both training and serving.
  3. Caches the latest feature values for low‑latency retrieval during inference.

Think of it as a “database for ML features” rather than a generic key‑value store. It bridges the gap between data engineering pipelines and model serving layers.

Real‑Time vs. Batch Features

Dimension            Batch Features                     Real‑Time Features
Update Frequency     Hours to days                      Seconds to milliseconds
Latency Requirement  Minutes to hours                   < 10 ms (often < 5 ms)
Typical Storage      Data lake / warehouse              In‑memory store (Redis, Aerospike)
Use Cases            Model training, offline analytics  Online inference, personalization, fraud detection

A robust system typically combines both: batch pipelines populate long‑term aggregates, while a real‑time component captures the freshest signals.

Core Requirements for a Real‑Time Feature Store

  • Low Write Latency – Features are often generated by streaming pipelines (Kafka, Kinesis) and must be persisted within a few milliseconds.
  • High Read Throughput – A single inference request may need 10–30 features; a service serving 10 k RPS translates to 100 k–300 k reads per second.
  • Strong Consistency – Some features (e.g., “account status”) must reflect the latest state instantly.
  • Scalable Architecture – Ability to add nodes without downtime.
  • Observability – Metrics, tracing, and alerts for latency, error rates, and resource usage.

Why Redis Is a Strong Candidate

Redis is more than a simple key‑value cache; it offers a versatile set of data structures and operational primitives that align closely with feature‑store needs.

1. In‑Memory Speed with Persistence Options

  • Memory‑level reads are sub‑microsecond. Even with network overhead, round‑trip latencies stay within a few milliseconds.
  • AOF (Append‑Only File) and RDB snapshots give durability. You can configure every‑write AOF for strict durability or every‑second AOF for a performance‑first approach.

2. Rich Data Structures

Structure          Ideal Use Case for Features
Hash               A compact set of feature values per entity
                   (user:1234 -> {age:30, country:US, churn_score:0.12})
Sorted Set (ZSET)  Time‑series of event‑driven features (e.g., click counts per minute)
Stream             Ingestion pipeline that guarantees ordering and replayability
Bitmap             Boolean flags over huge cardinalities (e.g., "has_seen_promo")
Lua Scripts        Atomic read‑modify‑write, multi‑key fetches, and complex calculations without extra round‑trips

3. Horizontal Scaling with Redis Cluster

  • Sharding across 16384 hash slots enables linear scaling.
  • Replica nodes provide read‑scaling and HA.
  • Automatic failover via Redis Sentinel or native cluster mechanisms.

4. Pub/Sub & Keyspace Notifications

  • Enable event‑driven invalidation of downstream caches.
  • Useful for feature version propagation across microservices.

5. Ecosystem & Tooling

  • Mature client libraries for Go (github.com/redis/go-redis/v9).
  • Integration with Prometheus, Grafana, and OpenTelemetry.
  • CLI (redis-cli) for rapid debugging.

Collectively, these attributes make Redis a battle‑tested platform for the online layer of a feature store.


Go: The Language for High‑Performance Services

Go (Golang) has become the de facto language for building cloud‑native microservices. Its design choices align well with the demands of a real‑time feature store.

Concurrency Made Simple

  • Goroutines are lightweight (≈2 KB initial stack), making it practical to run millions of concurrent workers.
  • Channels provide a clean way to pipeline data (e.g., ingest → transform → write).

Strong Standard Library

  • net/http and net/http2 for high‑performance APIs.
  • context for request‑scoped cancellation and deadlines—critical for low‑latency services.

Static Binaries & Fast Startup

  • Deploy as a single binary, ideal for containerized environments (Docker, Kubernetes).
  • Fast cold start times (< 50 ms) keep pod scaling responsive.

Rich Ecosystem for Redis

  • github.com/redis/go-redis/v9 offers pipelining, cluster support, and Lua script execution.
  • Integration with OpenTelemetry (go.opentelemetry.io/otel) for distributed tracing.

Example: Simple Redis Client in Go

package main

import (
    "context"
    "log"
    "time"

    "github.com/go-redis/redis/v9"
)

func main() {
    ctx := context.Background()
    rdb := redis.NewClusterClient(&redis.ClusterOptions{
        Addrs: []string{
            "redis-0:6379",
            "redis-1:6379",
            "redis-2:6379",
        },
        ReadOnly: true, // Enable read‑only replica usage
    })

    // Ping to ensure connectivity
    if err := rdb.Ping(ctx).Err(); err != nil {
        log.Fatalf("cannot connect to redis: %v", err)
    }

    // Example: set a feature hash for a user
    key := "user:1234"
    _, err := rdb.HSet(ctx, key, map[string]interface{}{
        "age":          29,
        "country":      "US",
        "churn_score":  0.07,
        "last_active": time.Now().Unix(),
    }).Result()
    if err != nil {
        log.Fatalf("failed to HSet: %v", err)
    }

    log.Println("feature set successfully")
}

The snippet demonstrates cluster connection, context‑aware calls, and a hash write—the core building block for a feature store.


Architectural Blueprint

Below is a high‑level view of how the components interact. While a diagram would be ideal, we’ll describe the flow textually.

  1. Event Sources – Kafka topics, Kinesis streams, or HTTP webhook endpoints produce raw events (clicks, transactions, sensor readings).
  2. Ingestion Service (Go) – Consumes events, performs feature transformations, and writes results to Redis using pipelines.
  3. Redis Cluster – Stores feature values in hashes, sorted sets, or streams. Replicas serve read traffic.
  4. Feature‑Serving API (Go) – Exposes HTTP/gRPC endpoints for downstream ML inference services. Retrieves required features in bulk, optionally using Lua scripts for atomicity.
  5. Cache Layer (Optional) – Local in‑process LRU caches or side‑car caches (e.g., Redis Edge) to reduce hot‑key pressure.
  6. Observability Stack – Prometheus scrapes Redis and Go metrics; OpenTelemetry collects traces from ingestion and serving services.
  7. Management & Ops – Kubernetes handles pod scaling, Helm charts provision Redis clusters, and Sentinel monitors health.

[Event Sources] → (Kafka) → [Ingestion Service] → (pipelined writes) → [Redis Cluster] ↔ (replica reads) ↔ [Feature‑Serving API] → (Inference Service)

Key design principles:

  • Decouple ingestion from serving – Each can scale independently.
  • Prefer write‑heavy pipelines – Batch multiple feature updates into a single network round‑trip.
  • Read‑optimised data structures – Use hashes for point lookups and Lua scripts for multi‑key fetches.
  • Stateless Go services – Enable horizontal scaling via Kubernetes Horizontal Pod Autoscaler (HPA).

Designing a Redis Schema for Feature Data

A well‑thought-out schema determines both memory efficiency and query speed. Below are common patterns.

1. Entity‑Centric Hashes

Store all features for a given entity (user, device, product) in a single hash.

Key: user:{user_id}
Fields:
  age            → integer
  country        → string
  churn_score    → float
  last_active    → unix timestamp
  premium_flag   → 0/1

Pros: O(1) read for all features with HMGET or HGETALL.
Cons: The hash grows with the number of features, increasing memory per key and HGETALL latency; individual field values are capped at 512 MB (a limit you should stay far below in practice).

2. Time‑Series with Sorted Sets

When a feature is a temporal aggregate (e.g., clicks per minute), a sorted set provides ordering.

Key:    clicks:user:{user_id}
Member: a unique event ID (or "<minute_bucket>:<count>" for pre‑aggregated buckets)
Score:  Unix epoch in seconds (or milliseconds)

You can query the last N minutes with ZRANGE and compute rolling windows on the client or via Lua.
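The aggregation itself is plain arithmetic once the sorted set has been read back. A stdlib‑only sketch (the Bucket type is ours, standing in for parsed member/score pairs from ZRANGEBYSCORE):

```go
// Bucket stands in for one parsed (member, score) pair:
// Ts is the bucket start (Unix seconds), Count the clicks in that bucket.
type Bucket struct {
	Ts    int64
	Count int64
}

// rollingSum totals clicks whose bucket falls inside [now-windowSec, now].
func rollingSum(buckets []Bucket, now, windowSec int64) int64 {
	cutoff := now - windowSec
	var total int64
	for _, b := range buckets {
		if b.Ts >= cutoff && b.Ts <= now {
			total += b.Count
		}
	}
	return total
}
```

Moving this loop into a Lua script instead trades client CPU for server CPU; either way the logic is the same.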

3. Feature Versioning

To avoid stale reads after a model upgrade, embed a version token:

Key: feature_version:{model_name}
Value: integer (e.g., 42)

When fetching features, embed the version in the entity key:

user:42:{user_id}

If the version increments, downstream services automatically request the new hash.
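A small helper pair makes the versioned‑key convention explicit (the function names are illustrative):

```go
import (
	"fmt"
	"strconv"
)

// versionedKey embeds the active model version into the entity key,
// matching the user:<version>:<user_id> layout above.
func versionedKey(version, userID int64) string {
	return fmt.Sprintf("user:%d:%d", version, userID)
}

// currentVersion parses the value stored at feature_version:{model_name}.
func currentVersion(raw string) (int64, error) {
	return strconv.ParseInt(raw, 10, 64)
}
```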

4. TTL & Expiration

Real‑time features often become obsolete quickly (e.g., session‑level attributes). Use EXPIRE to automatically evict them:

rdb.Expire(ctx, key, 10*time.Minute)

5. Example Schema Definition (Go)

type FeatureSet struct {
    UserID      int64
    Age         int
    Country     string
    ChurnScore  float64
    LastActive  int64 // Unix timestamp
    PremiumFlag bool
}

// boolToInt stores booleans as 0/1, since Redis hashes have no native bool type.
func boolToInt(b bool) int {
    if b {
        return 1
    }
    return 0
}

// ToMap serializes the struct into a map suitable for HSET.
func (f *FeatureSet) ToMap() map[string]interface{} {
    return map[string]interface{}{
        "age":          f.Age,
        "country":      f.Country,
        "churn_score":  f.ChurnScore,
        "last_active":  f.LastActive,
        "premium_flag": boolToInt(f.PremiumFlag),
    }
}

Ingestion Pipeline in Go

The ingestion service must process high‑volume streams while keeping latency low. Below is a pattern that leverages Go’s concurrency primitives and Redis pipelining.

1. Worker Pool Architecture

  • Reader Goroutine – Pulls messages from Kafka (or any source) and pushes them onto a buffered channel.
  • N Worker Goroutines – Each reads from the channel, transforms the raw payload into a FeatureSet, and appends the write command to a Redis pipeline.
  • Flusher Goroutine – Periodically executes the pipeline (Exec) and records metrics.

2. Code Example

package main

import (
    "context"
    "fmt"
    "log"
    "sync"
    "time"

    "github.com/redis/go-redis/v9"
    "github.com/segmentio/kafka-go"
)

const (
    workerCount   = 32
    pipelineSize  = 500 // number of commands per Exec
    flushInterval = 5 * time.Millisecond
)

type rawEvent struct {
    Key   string // e.g., "user:1234"
    Value []byte // raw JSON payload
}

// Global Redis client (cluster)
var rdb *redis.ClusterClient

func initRedis() {
    rdb = redis.NewClusterClient(&redis.ClusterOptions{
        Addrs: []string{"redis-0:6379", "redis-1:6379", "redis-2:6379"},
    })
}

// Transform raw JSON into FeatureSet (simplified)
func parseEvent(e rawEvent) (*FeatureSet, error) {
    // In practice use json.Unmarshal and proper error handling
    // Here we assume a deterministic mapping for brevity
    return &FeatureSet{
        UserID:      1234,
        Age:         29,
        Country:     "US",
        ChurnScore:  0.07,
        LastActive:  time.Now().Unix(),
        PremiumFlag: false,
    }, nil
}

func main() {
    ctx := context.Background()
    initRedis()
    defer rdb.Close()

    // 1️⃣ Kafka reader (simplified)
    kafkaReader := kafka.NewReader(kafka.ReaderConfig{
        Brokers: []string{"kafka:9092"},
        Topic:   "user-events",
        GroupID: "feature-ingest",
    })
    defer kafkaReader.Close()

    // Buffered channel for events
    events := make(chan rawEvent, 10000)

    // 2️⃣ Launch workers
    var wg sync.WaitGroup
    for i := 0; i < workerCount; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            pipeline := rdb.Pipeline()
            counter := 0
            flushTicker := time.NewTicker(flushInterval)
            defer flushTicker.Stop()

            for {
                select {
                case ev, ok := <-events:
                    if !ok {
                        // drain remaining commands
                        if counter > 0 {
                            _, _ = pipeline.Exec(ctx)
                        }
                        return
                    }
                    feat, err := parseEvent(ev)
                    if err != nil {
                        log.Printf("worker %d: parse error: %v", id, err)
                        continue
                    }
                    // Queue a HSET command in the pipeline
                    pipeline.HSet(ctx, ev.Key, feat.ToMap())
                    counter++
                    if counter >= pipelineSize {
                        _, _ = pipeline.Exec(ctx)
                        counter = 0
                    }

                case <-flushTicker.C:
                    if counter > 0 {
                        _, _ = pipeline.Exec(ctx)
                        counter = 0
                    }
                }
            }
        }(i)
    }

    // 3️⃣ Producer: read from Kafka and push to channel
    go func() {
        for {
            m, err := kafkaReader.ReadMessage(ctx)
            if err != nil {
                log.Printf("kafka read error: %v", err)
                continue
            }
            events <- rawEvent{
                Key:   fmt.Sprintf("user:%s", m.Key),
                Value: m.Value,
            }
        }
    }()

    // Graceful shutdown handling omitted for brevity
    wg.Wait()
}

Explanation of key ideas:

  • Pipeline Batching: pipelineSize controls the number of commands per network round‑trip. With a batch of 500 HSETs, the latency per command drops dramatically.
  • Flush Interval: Guarantees that even low‑traffic periods don’t stall writes indefinitely.
  • Worker Count: Tuned based on CPU cores and expected event rate. Each worker holds its own pipeline to avoid contention.
  • Back‑pressure: The buffered channel (events) smooths spikes; if it fills, Kafka’s consumer will naturally block, providing back‑pressure to upstream producers.

Serving Features at Scale

Once features are stored, the serving layer must retrieve them quickly and atomically. Below are patterns to achieve sub‑10 ms response times.

1. Bulk Reads with HMGET / HGETALL

If a request needs all features for an entity, HGETALL returns the entire hash in a single round‑trip.

func getAllFeatures(ctx context.Context, userID int64) (map[string]string, error) {
    key := fmt.Sprintf("user:%d", userID)
    return rdb.HGetAll(ctx, key).Result()
}

2. Multi‑Entity Fetch with Pipelining

When an inference request requires features for multiple users (e.g., batch scoring), pipeline the reads:

func batchGetFeatures(ctx context.Context, ids []int64) ([]map[string]string, error) {
    pipe := rdb.Pipeline()
    cmds := make([]*redis.MapStringStringCmd, len(ids))

    for i, id := range ids {
        key := fmt.Sprintf("user:%d", id)
        cmds[i] = pipe.HGetAll(ctx, key)
    }

    _, err := pipe.Exec(ctx)
    if err != nil && err != redis.Nil {
        return nil, err
    }

    results := make([]map[string]string, len(ids))
    for i, cmd := range cmds {
        results[i], _ = cmd.Result()
    }
    return results, nil
}

3. Atomic Multi‑Key Reads via Lua

Sometimes you need a consistent snapshot across several keys (e.g., user features + session flags). A Lua script can fetch all required fields atomically, eliminating race conditions.

-- get_features.lua
local result = {}
for i, key in ipairs(KEYS) do
    result[key] = redis.call('HGETALL', key)
end
return result

Go wrapper:

var getFeaturesScript = redis.NewScript(`
local result = {}
for i, key in ipairs(KEYS) do
    result[key] = redis.call('HGETALL', key)
end
return result
`)

func getFeaturesAtomic(ctx context.Context, keys []string) (map[string]map[string]string, error) {
    // Note: in cluster mode all keys must hash to the same slot (use hash tags).
    raw, err := getFeaturesScript.Run(ctx, rdb, keys).Result()
    if err != nil {
        return nil, err
    }
    // Parse raw reply (nested arrays) into Go maps – omitted for brevity
    parsed := make(map[string]map[string]string)
    _ = raw // e.g., walk the nested []interface{} reply here
    return parsed, nil
}

4. Read‑Only Replica Routing

Redis Cluster allows clients to read from replicas by setting the ReadOnly flag. This offloads read traffic from the primary nodes, improving throughput.

rdb := redis.NewClusterClient(&redis.ClusterOptions{
    Addrs:    []string{"redis-0:6379", "redis-1:6379"},
    ReadOnly: true, // route read commands to replica nodes
})

5. Caching Hot Features Locally

For ultra‑low latency, you may embed an in‑process LRU cache (e.g., github.com/hashicorp/golang-lru). Cache warm‑up can be done on startup or via a background refresh task.

cache, _ := lru.New(10000) // store up to 10k entries

func getFeatureWithCache(ctx context.Context, userID int64) (FeatureSet, error) {
    if v, ok := cache.Get(userID); ok {
        return v.(FeatureSet), nil
    }
    // fallback to Redis
    raw, err := rdb.HGetAll(ctx, fmt.Sprintf("user:%d", userID)).Result()
    if err != nil {
        return FeatureSet{}, err
    }
    // convert raw map to FeatureSet (omitted)
    feat := mapToFeatureSet(raw)
    cache.Add(userID, feat)
    return feat, nil
}

Performance tip: Keep the cache size modest to avoid memory pressure on the service container; rely on Redis for the authoritative source.


Scaling Redis: Clustering, Sharding, and HA

A single Redis node caps at a few hundred thousand ops/sec depending on hardware. To support millions of operations per second, you need a proper cluster layout.

1. Redis Cluster Basics

  • Hash Slots: 0‑16383, automatically distributed across master nodes.
  • Masters + Replicas: Typical production setups use 3 master nodes with 1‑2 replicas each for fault tolerance.
  • Cross‑Slot Operations: Lua scripts that touch multiple keys must ensure those keys reside on the same slot; use hash tags to co‑locate them, and CLUSTER KEYSLOT to verify placement.

2. Sharding Strategy for Feature Keys

Because our keys follow a predictable pattern (user:{id}), we can use hash tags to force co-location when needed.

Key: user:{1234}:v42

Everything inside {} is hashed to the same slot. This is useful when a Lua script needs both a user hash and a version hash.
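The hash‑tag rule is easy to mirror client‑side when you need to predict co‑location. This helper follows the Redis Cluster convention: only the text between the first '{' and the next '}' is hashed, and a missing or empty tag falls back to the whole key:

```go
import "strings"

// hashTag returns the portion of the key that Redis Cluster hashes.
// Keys with the same non-empty tag are guaranteed to land on the same slot.
func hashTag(key string) string {
	start := strings.IndexByte(key, '{')
	if start == -1 {
		return key // no tag: the whole key is hashed
	}
	end := strings.IndexByte(key[start+1:], '}')
	if end <= 0 {
		return key // no closing '}' or empty "{}": the whole key is hashed
	}
	return key[start+1 : start+1+end]
}
```

For example, hashTag("user:{1234}:v42") and hashTag("session:{1234}") both yield "1234", so those two keys share a slot and can appear together in one Lua script.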

3. Adding Nodes Without Downtime

Redis Cluster supports online rebalancing:

# Add a new master
redis-cli --cluster add-node new-node:6379 existing-node:6379

# Reshard 1/4 of the slots to the new master
# (node IDs come from `redis-cli --cluster check existing-node:6379`)
redis-cli --cluster reshard existing-node:6379 --cluster-from <source-node-id> --cluster-to <new-node-id> --cluster-slots 4096

Automation tools (e.g., Redis Operator for Kubernetes) handle these steps automatically.

4. High Availability with Sentinel vs. Native Cluster

  • Sentinel works with a single‑master topology, providing failover but no sharding.
  • Native Cluster provides both sharding and HA. For most feature‑store use cases, native clustering is recommended.

5. Persistence Trade‑offs

Persistence Mode            Durability                  Write Latency             Memory Overhead
RDB snapshot (every 5 min)  Low (up to 5 min of loss)   Minimal                   Fork copy‑on‑write during snapshot
AOF (every second)          Medium (≤ 1 s of loss)      Slight (fsync every sec)  Slight
AOF (every write)           High (no loss)              Higher (fsync per write)  None
No persistence              None                        Lowest                    None

For a real‑time feature store that can tolerate a few seconds of loss (features are regenerated from streams), AOF every‑second strikes a good balance.
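In redis.conf, that balance corresponds roughly to the following settings (the values are illustrative starting points, not tuned recommendations):

```
# redis.conf – durability settings for a feature store that tolerates ~1 s of loss
appendonly yes
appendfsync everysec

# Optional coarse RDB snapshot alongside the AOF, as a backup of last resort
save 300 10
```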

6. Resource Sizing Example

Metric      Approx. Requirement
CPU         2–4 vCPU per master for 500 k ops/s
Memory      2 GB per master (depends on feature cardinality)
Network     10 Gbps NIC for a multi‑node cluster
Disk (AOF)  High‑throughput SSD, ~200 MB/s write

Use Redis INFO and Prometheus alerts to monitor instantaneous_ops_per_sec, used_memory, and aof_current_rewrite_time_sec.


Observability & Monitoring

A production feature store must expose metrics, traces, and logs to detect latency spikes or data inconsistencies.

1. Redis Metrics Exporter

The official redis_exporter (Prometheus) scrapes:

  • redis_commands_processed_total
  • redis_instantaneous_ops_per_sec
  • redis_keyspace_hits_total / redis_keyspace_misses_total
  • redis_latency_seconds (via LATENCY command)

Run as a sidecar or DaemonSet:

apiVersion: v1
kind: Service
metadata:
  name: redis-exporter
spec:
  selector:
    app: redis
  ports:
    - port: 9121
      targetPort: 9121
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-exporter
  template:
    metadata:
      labels:
        app: redis-exporter
    spec:
      containers:
        - name: exporter
          image: oliver006/redis_exporter:latest
          args: ["--redis.addr=redis-cluster:6379"]
          ports:
            - containerPort: 9121

2. Go Service Metrics

Using Prometheus client:

var (
    ingestionLatency = prometheus.NewHistogramVec(prometheus.HistogramOpts{
        Namespace: "feature_store",
        Subsystem: "ingestion",
        Name:      "batch_latency_seconds",
        Buckets:   prometheus.ExponentialBuckets(0.001, 2, 10),
    }, []string{"status"})
)

func init() {
    prometheus.MustRegister(ingestionLatency)
}

Record latency around pipeline.Exec.
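One dependency‑free way to do that is a small timing wrapper that hands the duration and a status label to whatever observer you register; in the real service, observe would be ingestionLatency.WithLabelValues(status).Observe:

```go
import "time"

// timed runs fn, then reports its wall-clock duration in seconds and a
// status label ("ok" / "error") to the supplied observer.
func timed(fn func() error, observe func(seconds float64, status string)) error {
	start := time.Now()
	err := fn()
	status := "ok"
	if err != nil {
		status = "error"
	}
	observe(time.Since(start).Seconds(), status)
	return err
}

// Usage (sketch):
//   err := timed(func() error { _, e := pipeline.Exec(ctx); return e },
//       func(s float64, st string) { ingestionLatency.WithLabelValues(st).Observe(s) })
```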

3. Distributed Tracing

Instrument both ingestion and serving services with OpenTelemetry:

tracer := otel.Tracer("feature-store-ingestion")
ctx, span := tracer.Start(ctx, "processEvent")
defer span.End()
// add attributes
span.SetAttributes(attribute.String("user_id", fmt.Sprint(feat.UserID)))

Export traces to Jaeger or Tempo for end‑to‑end latency visualization.

4. Logging Practices

  • Use structured JSON logs (zap or zerolog).
  • Include request IDs (X-Request-ID header) to correlate logs across services.
  • Log redis errors at WARN level, but pipeline failures at ERROR.
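If you prefer to stay in the standard library, log/slog (Go 1.21+) produces structured JSON and can carry the request ID on every line. A minimal sketch; the helper names are ours:

```go
import (
	"bytes"
	"io"
	"log/slog"
)

// newLogger returns a JSON logger whose every line carries the request ID,
// so log output can be correlated via the X-Request-ID header across services.
func newLogger(w io.Writer, requestID string) *slog.Logger {
	return slog.New(slog.NewJSONHandler(w, nil)).With("request_id", requestID)
}

// logToString is a demo helper: emits one WARN line into a buffer and returns it.
func logToString(requestID, msg string) string {
	var buf bytes.Buffer
	newLogger(&buf, requestID).Warn(msg, "key", "user:1234")
	return buf.String()
}
```

In a service you would pass os.Stdout (or a rotating file writer) instead of the buffer.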

Testing and Benchmarking

Before rolling out to production, validate that your design meets the required throughput and latency.

1. Load Testing with k6

import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '30s', target: 100 }, // ramp up
    { duration: '2m', target: 1000 }, // sustain 1k RPS
    { duration: '30s', target: 0 },   // ramp down
  ],
};

export default function () {
  const res = http.get('http://feature-api:8080/v1/features?user_id=1234');
  check(res, { 'status 200': (r) => r.status === 200 });
  sleep(0.01);
}

Collect response time percentiles and ensure the 95th percentile stays below the latency SLA (e.g., 8 ms).

2. Go Benchmarks for Redis Pipelines

func BenchmarkPipelineWrite(b *testing.B) {
    ctx := context.Background()
    pipelineSize := 1000
    pipe := rdb.Pipeline()
    for i := 0; i < b.N; i++ {
        for j := 0; j < pipelineSize; j++ {
            key := fmt.Sprintf("user:%d", j)
            pipe.HSet(ctx, key, map[string]interface{}{
                "age":          30,
                "country":      "US",
                "churn_score": 0.12,
            })
        }
        _, err := pipe.Exec(ctx)
        if err != nil && err != redis.Nil {
            b.Fatalf("pipeline exec error: %v", err)
        }
    }
}

Run with go test -bench=. -benchmem and observe ops per second. Fine‑tune pipelineSize and workerCount based on the results.

3. Failure Injection

  • Use Chaos Mesh or Pumba to simulate node loss.
  • Verify that the Go client automatically retries and that the cluster re‑elects masters without downtime.

4. Consistency Checks

After a bulk write, read back a random sample and compare against the source payload. Automate this as part of a CI pipeline.
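The read‑back comparison reduces to a field‑by‑field diff. A stdlib sketch (HGETALL returns strings, so both sides are map[string]string):

```go
import "fmt"

// diffFeatures compares the source payload with what was read back from
// Redis and returns human-readable descriptions of mismatched fields.
func diffFeatures(source, stored map[string]string) []string {
	var mismatched []string
	for field, want := range source {
		if got, ok := stored[field]; !ok || got != want {
			mismatched = append(mismatched, fmt.Sprintf("%s: want %q, got %q", field, want, stored[field]))
		}
	}
	return mismatched
}
```

An empty result means the sample round‑tripped cleanly; a non‑empty one should fail the CI job with the offending fields in the output.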


Real‑World Case Study: E‑Commerce Recommendations

Scenario

An online marketplace wants to power personalized product recommendations in real time. The model consumes:

  • User profile features (age, location, loyalty tier)
  • Session activity (last 5 clicks, time‑on‑site)
  • Recent purchase history (last 3 orders)

The inference service must respond within 5 ms for ≈30 k RPS during peak traffic.

Architecture Deployed

Component        Tech Stack                              Scale
Event Ingestion  Kafka (3 partitions) → Go microservice  200 k events/s
Feature Store    Redis Cluster (3 masters, 2 replicas)   2 GB RAM per master
Serving API      Go gRPC server (2 vCPU, 4 GB RAM)       30 k RPS
Monitoring       Prometheus + Grafana + Jaeger           n/a
CI/CD            GitHub Actions, Argo CD                 n/a

Key Implementation Details

  1. Hash‑based Entity Store: user:{id}:v5 where v5 is the model version.
  2. Session Features in Sorted Set: session:{id} with score = timestamp, member = "click:{product_id}".
  3. Batch Reads: The gRPC server receives a batch of 10 user IDs, pipelines HMGET for profile hashes and ZRANGE for session sets, achieving ~3 ms total latency.
  4. Lua Script for Atomic Snapshot: A single script fetched profile + last 5 clicks in one server‑side execution, eliminating race windows.
  5. Cache Warm‑up: At service start, top‑10k active users are pre‑loaded into an LRU cache (size 20k) to guarantee sub‑2 ms response for the hottest traffic.

Performance Numbers

Metric                 Value
Ingestion throughput   250 k HSETs/s (pipeline size 1000)
Serving latency (p99)  4.8 ms
Redis CPU utilization  78 % on masters
Autoscaling events     2 scale‑outs per day during flash‑sale spikes

The system demonstrated linear scaling: adding a fourth master increased write throughput by ~30 % and reduced read latency by ~15 % under load.


Conclusion

Scaling a real‑time feature store for high‑throughput microservices is a multi‑disciplinary challenge that blends data modeling, systems engineering, and Go programming expertise. By leveraging:

  • Redis’s in‑memory speed, rich data structures, and native clustering,
  • Go’s lightweight concurrency, strong typing, and robust tooling, and
  • Thoughtful schema design, pipelined ingestion, atomic reads, and observability,

you can build a platform that delivers sub‑10 ms latency at hundreds of thousands of operations per second. The patterns explored—entity‑centric hashes, time‑series sorted sets, Lua‑based atomic fetches, and worker‑pool pipelines—provide a solid foundation that can be adapted to any domain, from recommendation engines to fraud detection.

Remember that operational excellence—monitoring, automated failover, and rigorous testing—is as crucial as the code itself. With the blueprint and examples in this article, you’re equipped to design, implement, and operate a production‑grade real‑time feature store that scales alongside your microservice ecosystem.

Happy coding, and may your features always be fresh! 🚀


Resources