Table of Contents
- Introduction
- Fundamentals of Feature Stores
- Why Redis Is a Strong Candidate
- Go: The Language for High‑Performance Services
- Architectural Blueprint
- Designing a Redis Schema for Feature Data
- Ingestion Pipeline in Go
- Serving Features at Scale
- Scaling Redis: Clustering, Sharding, and HA
- Observability & Monitoring
- Testing and Benchmarking
- Real‑World Case Study: E‑Commerce Recommendations
- Conclusion
- Resources
Introduction
Feature stores have emerged as the backbone of modern machine‑learning (ML) pipelines. They enable teams to store, version, and serve engineered features both offline (for batch training) and online (for real‑time inference). In a microservice‑centric architecture, each service may need to fetch dozens of features per request, often under strict latency budgets (sub‑10 ms) while the system processes thousands of requests per second.
Achieving this level of performance is not trivial. It requires a datastore that can:
- Persist feature values with minimal write latency.
- Scale horizontally to handle spikes in traffic.
- Expose low‑latency read paths that can be called from many concurrent goroutines.
- Provide strong consistency guarantees where needed (e.g., when a feature is updated just before it is consumed).
Redis, with its rich data structures, in‑memory speed, and mature clustering capabilities, is a natural fit for the online side of a feature store. Coupled with Go’s lightweight concurrency model and efficient networking stack, you can build microservices that ingest, update, and serve features at hundreds of thousands of operations per second.
This article walks through the end‑to‑end design of a realtime feature store built on Redis and Go. We’ll explore architectural decisions, schema design, ingestion pipelines, serving patterns, scaling strategies, observability, and testing. By the end, you should have a concrete blueprint you can adapt to your own high‑throughput microservice environment.
Fundamentals of Feature Stores
What Is a Feature Store?
A feature store is a centralized repository that:
- Ingests raw event data (clicks, sensor readings, logs) and transforms it into features.
- Version‑controls those features so that the same definition can be used for both training and serving.
- Caches the latest feature values for low‑latency retrieval during inference.
Think of it as a “database for ML features” rather than a generic key‑value store. It bridges the gap between data engineering pipelines and model serving layers.
Real‑Time vs. Batch Features
| Dimension | Batch Features | Real‑Time Features |
|---|---|---|
| Update Frequency | Hours‑to‑days | Seconds‑to‑milliseconds |
| Latency Requirement | Minutes–hours | < 10 ms (often < 5 ms) |
| Typical Storage | Data lake / warehouse | In‑memory store (Redis, Aerospike) |
| Use Cases | Model training, offline analytics | Online inference, personalization, fraud detection |
A robust system typically combines both: batch pipelines populate long‑term aggregates, while a real‑time component captures the freshest signals.
Core Requirements for a Real‑Time Feature Store
- Low Write Latency – Features are often generated by streaming pipelines (Kafka, Kinesis) and must be persisted within a few milliseconds.
- High Read Throughput – A single inference request may need 10–30 features; a service serving 10 k RPS translates to 100 k–300 k reads per second.
- Strong Consistency – Some features (e.g., “account status”) must reflect the latest state instantly.
- Scalable Architecture – Ability to add nodes without downtime.
- Observability – Metrics, tracing, and alerts for latency, error rates, and resource usage.
Why Redis Is a Strong Candidate
Redis is more than a simple key‑value cache; it offers a versatile set of data structures and operational primitives that align closely with feature‑store needs.
1. In‑Memory Speed with Persistence Options
- Memory‑level reads are sub‑microsecond. Even with network overhead, round‑trip latencies stay within a few milliseconds.
- AOF (Append‑Only File) and RDB snapshots give durability. You can configure every‑write AOF for strict durability or every‑second AOF for a performance‑first approach.
2. Rich Data Structures
| Structure | Ideal Use Case for Features |
|---|---|
| Hash | Storing a compact set of feature values per entity (user:1234 -> {age:30, country:US, churn_score:0.12}) |
| Sorted Set (ZSET) | Time‑series of event‑driven features (e.g., click counts per minute) |
| Stream | Ingestion pipeline that guarantees ordering and replayability |
| Bitmap | Boolean flags for huge cardinalities (e.g., feature “has_seen_promo”) |
| Lua Scripts | Atomic read‑modify‑write, multi‑key fetches, and complex calculations without round‑trips |
3. Horizontal Scaling with Redis Cluster
- Sharding across 16384 hash slots enables linear scaling.
- Replica nodes provide read‑scaling and HA.
- Automatic failover via Redis Sentinel or native cluster mechanisms.
4. Pub/Sub & Keyspace Notifications
- Enable event‑driven invalidation of downstream caches.
- Useful for feature version propagation across microservices.
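Keyspace notifications are disabled by default, so they must be switched on explicitly. A minimal sketch (database 0 and the flag set `Kh` — keyspace channel plus hash commands — are illustrative choices):

```shell
# Enable keyspace-channel (K) events for hash commands (h); add g (generic)
# or x (expired) as needed for your invalidation strategy.
redis-cli CONFIG SET notify-keyspace-events Kh

# Subscribers then receive the event name on a per-key channel, e.g.:
#   channel: __keyspace@0__:user:1234   message: "hset"
```

Because the flag set is a runtime config, it can also be pinned in redis.conf so it survives restarts.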
5. Ecosystem & Tooling
- Mature client libraries for Go (github.com/redis/go-redis/v9).
- Integration with Prometheus, Grafana, and OpenTelemetry.
- CLI (redis-cli) for rapid debugging.
Collectively, these attributes make Redis a battle‑tested platform for the online layer of a feature store.
Go: The Language for High‑Performance Services
Go (Golang) has become the de‑facto language for building cloud‑native microservices. Its design choices align perfectly with the demands of a realtime feature store.
Concurrency Made Simple
- Goroutines are lightweight (≈2 KB stack) and can spawn millions of concurrent workers.
- Channels provide a clean way to pipeline data (e.g., ingest → transform → write).
Strong Standard Library
- net/http (with built‑in HTTP/2 support) for high‑performance APIs.
- context for request‑scoped cancellation and deadlines—critical for low‑latency services.
Static Binaries & Fast Startup
- Deploy as a single binary, ideal for containerized environments (Docker, Kubernetes).
- Fast cold start times (< 50 ms) keep pod scaling responsive.
Rich Ecosystem for Redis
- go-redis (github.com/redis/go-redis/v9) offers pipelining, cluster support, and Lua script execution.
- Integration with OpenTelemetry (go.opentelemetry.io/otel) for distributed tracing.
Example: Simple Redis Client in Go
package main
import (
"context"
"log"
"time"
"github.com/go-redis/redis/v9"
)
func main() {
ctx := context.Background()
rdb := redis.NewClusterClient(&redis.ClusterOptions{
Addrs: []string{
"redis-0:6379",
"redis-1:6379",
"redis-2:6379",
},
ReadOnly: true, // Enable read‑only replica usage
})
// Ping to ensure connectivity
if err := rdb.Ping(ctx).Err(); err != nil {
log.Fatalf("cannot connect to redis: %v", err)
}
// Example: set a feature hash for a user
key := "user:1234"
_, err := rdb.HSet(ctx, key, map[string]interface{}{
"age": 29,
"country": "US",
"churn_score": 0.07,
"last_active": time.Now().Unix(),
}).Result()
if err != nil {
log.Fatalf("failed to HSet: %v", err)
}
log.Println("feature set successfully")
}
The snippet demonstrates cluster connection, context‑aware calls, and a hash write—the core building block for a feature store.
Architectural Blueprint
Below is a high‑level view of how the components interact. While a diagram would be ideal, we’ll describe the flow textually.
- Event Sources – Kafka topics, Kinesis streams, or HTTP webhook endpoints produce raw events (clicks, transactions, sensor readings).
- Ingestion Service (Go) – Consumes events, performs feature transformations, and writes results to Redis using pipelines.
- Redis Cluster – Stores feature values in hashes, sorted sets, or streams. Replicas serve read traffic.
- Feature‑Serving API (Go) – Exposes HTTP/gRPC endpoints for downstream ML inference services. Retrieves required features in bulk, optionally using Lua scripts for atomicity.
- Cache Layer (Optional) – Local in‑process LRU caches or side‑car caches to reduce hot‑key pressure.
- Observability Stack – Prometheus scrapes Redis and Go metrics; OpenTelemetry collects traces from ingestion and serving services.
- Management & Ops – Kubernetes handles pod scaling, Helm charts provision Redis clusters, and Sentinel monitors health.
[Event Sources] → (Kafka) → [Ingestion Service] → (Pipelined writes) → [Redis Cluster] ↔ (Replica reads) ↔ [Feature‑Serving API] → (Inference Service)
Key design principles:
- Decouple ingestion from serving – Each can scale independently.
- Prefer write‑heavy pipelines – Batch multiple feature updates into a single network round‑trip.
- Read‑optimised data structures – Use hashes for point lookups and Lua scripts for multi‑key fetches.
- Stateless Go services – Enable horizontal scaling via Kubernetes Horizontal Pod Autoscaler (HPA).
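As one illustration of the last principle, a minimal HPA manifest for the serving API (the Deployment name feature-api, the replica bounds, and the 70 % CPU target are all assumptions to adapt):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: feature-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: feature-api   # the stateless Go serving service
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```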
Designing a Redis Schema for Feature Data
A well‑thought‑out schema determines both memory efficiency and query speed. Below are common patterns.
1. Entity‑Centric Hashes
Store all features for a given entity (user, device, product) in a single hash.
Key: user:{user_id}
Fields:
age → integer
country → string
churn_score → float
last_active → unix timestamp
premium_flag → 0/1
Pros: O(1) read for all features with HMGET or HGETALL.
Cons: The hash grows with the number of features; very large hashes lose Redis’s compact listpack encoding, and each field value (a string) is capped at 512 MB—in practice you should stay far below that.
2. Time‑Series with Sorted Sets
When a feature is a temporal aggregate (e.g., clicks per minute), a sorted set provides ordering.
Key: clicks:user:{user_id}
Members: <timestamp> → <click_count>
Score: Unix epoch in seconds (or milliseconds)
You can query the last N minutes with ZRANGE and compute rolling windows on the client or via Lua.
3. Feature Versioning
To avoid stale reads after a model upgrade, embed a version token:
Key: feature_version:{model_name}
Value: integer (e.g., 42)
When fetching features, prepend the version to the entity key:
user:42:{user_id}
If the version increments, downstream services automatically request the new hash.
4. TTL & Expiration
Real‑time features often become obsolete quickly (e.g., session‑level attributes). Use EXPIRE to automatically evict them:
rdb.Expire(ctx, key, 10*time.Minute)
5. Example Schema Definition (Go)
type FeatureSet struct {
	UserID      int64
	Age         int
	Country     string
	ChurnScore  float64
	LastActive  int64 // Unix timestamp
	PremiumFlag bool
}

// boolToInt encodes booleans as 0/1, Redis's usual flag convention.
func boolToInt(b bool) int {
	if b {
		return 1
	}
	return 0
}

// Serialize into a Redis hash map
func (f *FeatureSet) ToMap() map[string]interface{} {
	return map[string]interface{}{
		"age":          f.Age,
		"country":      f.Country,
		"churn_score":  f.ChurnScore,
		"last_active":  f.LastActive,
		"premium_flag": boolToInt(f.PremiumFlag),
	}
}
Ingestion Pipeline in Go
The ingestion service must process high‑volume streams while keeping latency low. Below is a pattern that leverages Go’s concurrency primitives and Redis pipelining.
1. Worker Pool Architecture
- Reader Goroutine – Pulls messages from Kafka (or any source) and pushes them onto a buffered channel.
- N Worker Goroutines – Each reads from the channel, transforms the raw payload into a FeatureSet, and appends the write command to its own Redis pipeline.
- Periodic Flush – Each worker executes its pipeline (Exec) when the batch fills or a ticker fires, and records metrics.
2. Code Example
package main
import (
	"context"
	"fmt"
	"log"
	"sync"
	"time"

	"github.com/redis/go-redis/v9"
	"github.com/segmentio/kafka-go"
)
const (
workerCount = 32
pipelineSize = 500 // number of commands per Exec
flushInterval = 5 * time.Millisecond
)
type rawEvent struct {
Key string // e.g., "user:1234"
Value []byte // raw JSON payload
}
// Global Redis client (cluster)
var rdb *redis.ClusterClient
func initRedis() {
rdb = redis.NewClusterClient(&redis.ClusterOptions{
Addrs: []string{"redis-0:6379", "redis-1:6379", "redis-2:6379"},
})
}
// Transform raw JSON into FeatureSet (simplified)
func parseEvent(e rawEvent) (*FeatureSet, error) {
// In practice use json.Unmarshal and proper error handling
// Here we assume a deterministic mapping for brevity
return &FeatureSet{
UserID: 1234,
Age: 29,
Country: "US",
ChurnScore: 0.07,
LastActive: time.Now().Unix(),
PremiumFlag: false,
}, nil
}
func main() {
ctx := context.Background()
initRedis()
defer rdb.Close()
// 1️⃣ Kafka reader (simplified)
kafkaReader := kafka.NewReader(kafka.ReaderConfig{
Brokers: []string{"kafka:9092"},
Topic: "user-events",
GroupID: "feature-ingest",
})
defer kafkaReader.Close()
// Buffered channel for events
events := make(chan rawEvent, 10000)
// 2️⃣ Launch workers
var wg sync.WaitGroup
for i := 0; i < workerCount; i++ {
wg.Add(1)
go func(id int) {
defer wg.Done()
pipeline := rdb.Pipeline()
counter := 0
flushTicker := time.NewTicker(flushInterval)
defer flushTicker.Stop()
for {
select {
case ev, ok := <-events:
if !ok {
// drain remaining commands
if counter > 0 {
_, _ = pipeline.Exec(ctx)
}
return
}
feat, err := parseEvent(ev)
if err != nil {
log.Printf("worker %d: parse error: %v", id, err)
continue
}
// Queue a HSET command in the pipeline
pipeline.HSet(ctx, ev.Key, feat.ToMap())
counter++
if counter >= pipelineSize {
_, _ = pipeline.Exec(ctx)
counter = 0
}
case <-flushTicker.C:
if counter > 0 {
_, _ = pipeline.Exec(ctx)
counter = 0
}
}
}
}(i)
}
// 3️⃣ Producer: read from Kafka and push to channel
go func() {
for {
m, err := kafkaReader.ReadMessage(ctx)
if err != nil {
log.Printf("kafka read error: %v", err)
continue
}
events <- rawEvent{
Key: fmt.Sprintf("user:%s", m.Key),
Value: m.Value,
}
}
}()
// Graceful shutdown handling omitted for brevity
wg.Wait()
}
Explanation of key ideas:
- Pipeline Batching: pipelineSize controls the number of commands per network round‑trip. With a batch of 500 HSETs, the per‑command latency drops dramatically.
- Flush Interval: Guarantees that even low‑traffic periods don’t stall writes indefinitely.
- Worker Count: Tuned based on CPU cores and expected event rate. Each worker holds its own pipeline to avoid contention.
- Back‑pressure: The buffered channel (events) smooths spikes; if it fills, Kafka’s consumer will naturally block, providing back‑pressure to upstream producers.
Serving Features at Scale
Once features are stored, the serving layer must retrieve them quickly and atomically. Below are patterns to achieve sub‑10 ms response times.
1. Bulk Reads with HMGET / HGETALL
If a request needs all features for an entity, HGETALL returns the entire hash in a single round‑trip.
func getAllFeatures(ctx context.Context, userID int64) (map[string]string, error) {
key := fmt.Sprintf("user:%d", userID)
return rdb.HGetAll(ctx, key).Result()
}
2. Multi‑Entity Fetch with Pipelining
When an inference request requires features for multiple users (e.g., batch scoring), pipeline the reads:
func batchGetFeatures(ctx context.Context, ids []int64) ([]map[string]string, error) {
pipe := rdb.Pipeline()
cmds := make([]*redis.MapStringStringCmd, len(ids))
for i, id := range ids {
key := fmt.Sprintf("user:%d", id)
cmds[i] = pipe.HGetAll(ctx, key)
}
_, err := pipe.Exec(ctx)
if err != nil && err != redis.Nil {
return nil, err
}
results := make([]map[string]string, len(ids))
for i, cmd := range cmds {
results[i], _ = cmd.Result()
}
return results, nil
}
3. Atomic Multi‑Key Reads via Lua
Sometimes you need a consistent snapshot across several keys (e.g., user features + session flags). A Lua script can fetch all required fields atomically, eliminating race conditions. Note that in Redis Cluster, every key a single script invocation touches must hash to the same slot—use hash tags (covered in the scaling section) to co‑locate them.
-- get_features.lua
-- NOTE: index the result by position; Lua tables with string keys
-- convert to an empty Redis reply.
local result = {}
for i, key in ipairs(KEYS) do
  result[i] = redis.call('HGETALL', key)
end
return result
Go wrapper:
var getFeaturesScript = redis.NewScript(`
local result = {}
for i, key in ipairs(KEYS) do
  result[i] = redis.call('HGETALL', key)
end
return result
`)

func getFeaturesAtomic(ctx context.Context, keys []string) (map[string]map[string]string, error) {
	raw, err := getFeaturesScript.Run(ctx, rdb, keys).Result()
	if err != nil {
		return nil, err
	}
	// The reply is one flat field/value array per key, in KEYS order.
	replies, ok := raw.([]interface{})
	if !ok {
		return nil, fmt.Errorf("unexpected reply type %T", raw)
	}
	parsed := make(map[string]map[string]string, len(keys))
	for i, rep := range replies {
		fields, _ := rep.([]interface{})
		m := make(map[string]string, len(fields)/2)
		for j := 0; j+1 < len(fields); j += 2 {
			m[fmt.Sprint(fields[j])] = fmt.Sprint(fields[j+1])
		}
		parsed[keys[i]] = m
	}
	return parsed, nil
}
4. Read‑Only Replica Routing
Redis Cluster allows clients to read from replicas by setting the ReadOnly flag. This offloads read traffic from the primary nodes, improving throughput.
rdb := redis.NewClusterClient(&redis.ClusterOptions{
Addrs: []string{"redis-0:6379", "redis-1:6379"},
ReadOnly: true, // use replicas for GET/HRANGE
})
5. Caching Hot Features Locally
For ultra‑low latency, you may embed an in‑process LRU cache (e.g., github.com/hashicorp/golang-lru). Cache warm‑up can be done on startup or via a background refresh task.
cache, _ := lru.New(10000) // store up to 10k entries
func getFeatureWithCache(ctx context.Context, userID int64) (FeatureSet, error) {
if v, ok := cache.Get(userID); ok {
return v.(FeatureSet), nil
}
// fallback to Redis
raw, err := rdb.HGetAll(ctx, fmt.Sprintf("user:%d", userID)).Result()
if err != nil {
return FeatureSet{}, err
}
// convert raw map to FeatureSet (omitted)
feat := mapToFeatureSet(raw)
cache.Add(userID, feat)
return feat, nil
}
Performance tip: Keep the cache size modest to avoid memory pressure on the service container; rely on Redis for the authoritative source.
Scaling Redis: Clustering, Sharding, and HA
A single Redis node caps at a few hundred thousand ops/sec depending on hardware. To support millions of operations per second, you need a proper cluster layout.
1. Redis Cluster Basics
- Hash Slots: 0‑16383, automatically distributed across master nodes.
- Masters + Replicas: Typical production setups use 3 master nodes with 1‑2 replicas each for fault tolerance.
- Cross‑Slot Operations: Lua scripts that touch multiple keys must ensure those keys reside on the same slot; use hash tags to co‑locate them (CLUSTER KEYSLOT can verify the mapping).
2. Sharding Strategy for Feature Keys
Because our keys follow a predictable pattern (user:{id}), we can use hash tags to force co-location when needed.
Key: user:{1234}:v42
Everything inside {} is hashed to the same slot. This is useful when a Lua script needs both a user hash and a version hash.
3. Adding Nodes Without Downtime
Redis Cluster supports online rebalancing:
# Add a new master
redis-cli --cluster add-node new-node:6379 existing-node:6379
# Reshard 1/4 of the slots (4096) to the new master
# (--cluster-from/--cluster-to take node IDs, listed by CLUSTER NODES)
redis-cli --cluster reshard existing-node:6379 --cluster-from <source-node-id> --cluster-to <new-node-id> --cluster-slots 4096
Automation tools (e.g., Redis Operator for Kubernetes) handle these steps automatically.
4. High Availability with Sentinel vs. Native Cluster
- Sentinel works with a single‑master topology, providing failover but no sharding.
- Native Cluster provides both sharding and HA. For most feature‑store use cases, native clustering is recommended.
5. Persistence Trade‑offs
| Persistence Mode | Durability | Write Latency | Memory Overhead |
|---|---|---|---|
| RDB Snapshot (every 5 min) | Low (data loss up to 5 min) | Minimal | Fork copy‑on‑write spikes |
| AOF (every‑second) | Medium (≤1 s loss) | Slight increase (fsync every sec) | Slight |
| AOF (every‑write) | High (no loss) | Higher (fsync per write) | None |
| No persistence | None | Lowest | None |
For a real‑time feature store that can tolerate a few seconds of loss (features are regenerated from streams), AOF every‑second strikes a good balance.
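The every-second setting corresponds to this redis.conf fragment (a sketch; AOF rewrite thresholds are left at defaults and should be tuned separately):

```conf
appendonly yes
appendfsync everysec   # fsync once per second (≤1 s loss window)
# appendfsync always   # strict durability, at the cost of write latency
```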
6. Resource Sizing Example
| Metric | Approx. Requirement |
|---|---|
| CPU | 2‑4 vCPU per master for 500 k ops/s |
| Memory | 2 GB per master (depends on feature cardinality) |
| Network | 10 Gbps NIC for multi‑node cluster |
| Disk (AOF) | High‑throughput SSD, ~200 MB/s write |
Use Redis INFO and Prometheus alerts to monitor instantaneous_ops_per_sec, used_memory, and aof_current_rewrite_time_sec.
Observability & Monitoring
A production feature store must expose metrics, traces, and logs to detect latency spikes or data inconsistencies.
1. Redis Metrics Exporter
The widely used redis_exporter (oliver006/redis_exporter) exposes Prometheus metrics including:
- redis_commands_processed_total
- redis_instantaneous_ops_per_sec
- redis_keyspace_hits_total / redis_keyspace_misses_total
- redis_latency_seconds (via the LATENCY command)
Run as a sidecar or DaemonSet:
apiVersion: v1
kind: Service
metadata:
name: redis-exporter
spec:
selector:
app: redis
ports:
- port: 9121
targetPort: 9121
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: redis-exporter
spec:
replicas: 1
selector:
matchLabels:
app: redis-exporter
template:
metadata:
labels:
app: redis-exporter
spec:
containers:
- name: exporter
image: oliver006/redis_exporter:latest
args: ["--redis.addr=redis-cluster:6379"]
ports:
- containerPort: 9121
2. Go Service Metrics
Using Prometheus client:
var (
ingestionLatency = prometheus.NewHistogramVec(prometheus.HistogramOpts{
Namespace: "feature_store",
Subsystem: "ingestion",
Name: "batch_latency_seconds",
Buckets: prometheus.ExponentialBuckets(0.001, 2, 10),
}, []string{"status"})
)
func init() {
prometheus.MustRegister(ingestionLatency)
}
Record latency around pipeline.Exec.
3. Distributed Tracing
Instrument both ingestion and serving services with OpenTelemetry:
tracer := otel.Tracer("feature-store-ingestion")
ctx, span := tracer.Start(ctx, "processEvent")
defer span.End()
// add attributes
span.SetAttributes(attribute.String("user_id", fmt.Sprint(feat.UserID)))
Export traces to Jaeger or Tempo for end‑to‑end latency visualization.
4. Logging Practices
- Use structured JSON logs (zap or zerolog).
- Include request IDs (X-Request-ID header) to correlate logs across services.
- Log Redis errors at WARN level, but pipeline failures at ERROR.
Testing and Benchmarking
Before rolling out to production, validate that your design meets the required throughput and latency.
1. Load Testing with k6
import http from 'k6/http';
import { check, sleep } from 'k6';
export const options = {
stages: [
{ duration: '30s', target: 100 }, // ramp up
{ duration: '2m', target: 1000 }, // sustain 1k RPS
{ duration: '30s', target: 0 }, // ramp down
],
};
export default function () {
const res = http.get('http://feature-api:8080/v1/features?user_id=1234');
check(res, { 'status 200': (r) => r.status === 200 });
sleep(0.01);
}
Collect response time percentiles and ensure the 95th percentile stays below the latency SLA (e.g., 8 ms).
2. Go Benchmarks for Redis Pipelines
func BenchmarkPipelineWrite(b *testing.B) {
ctx := context.Background()
pipelineSize := 1000
pipe := rdb.Pipeline()
for i := 0; i < b.N; i++ {
for j := 0; j < pipelineSize; j++ {
key := fmt.Sprintf("user:%d", j)
pipe.HSet(ctx, key, map[string]interface{}{
"age": 30,
"country": "US",
"churn_score": 0.12,
})
}
_, err := pipe.Exec(ctx)
if err != nil && err != redis.Nil {
b.Fatalf("pipeline exec error: %v", err)
}
}
}
Run with go test -bench=. -benchmem and observe ops per second. Fine‑tune pipelineSize and workerCount based on the results.
3. Failure Injection
- Use Chaos Mesh or Pumba to simulate node loss.
- Verify that the Go client automatically retries and that the cluster re‑elects masters without downtime.
4. Consistency Checks
After a bulk write, read back a random sample and compare against the source payload. Automate this as part of a CI pipeline.
Real‑World Case Study: E‑Commerce Recommendations
Scenario
An online marketplace wants to power personalized product recommendations in real time. The model consumes:
- User profile features (age, location, loyalty tier)
- Session activity (last 5 clicks, time‑on‑site)
- Recent purchase history (last 3 orders)
The inference service must respond within 5 ms for ≈30 k RPS during peak traffic.
Architecture Deployed
| Component | Tech Stack | Scale |
|---|---|---|
| Event Ingestion | Kafka (3 partitions) → Go microservice | 200 k events/s |
| Feature Store | Redis Cluster (3 masters, 2 replicas) | 2 GB RAM per master |
| Serving API | Go gRPC server (2 vCPU, 4 GB RAM) | 30 k RPS |
| Monitoring | Prometheus + Grafana + Jaeger | – |
| CI/CD | GitHub Actions, Argo CD | – |
Key Implementation Details
- Hash‑based Entity Store: user:{id}:v5, where v5 is the model version.
- Session Features in Sorted Set: session:{id} with score = timestamp, member = "click:{product_id}".
- Batch Reads: The gRPC server receives a batch of 10 user IDs, pipelines HMGET for profile hashes and ZRANGE for session sets, achieving ~3 ms total latency.
- Lua Script for Atomic Snapshot: A single script fetched profile + last 5 clicks in one server‑side execution, eliminating race windows.
- Cache Warm‑up: At service start, the top 10k active users are pre‑loaded into an LRU cache (size 20k) to guarantee sub‑2 ms responses for the hottest traffic.
Performance Numbers
| Metric | Value |
|---|---|
| Ingestion throughput | 250 k HSETs / sec (pipeline size 1000) |
| Serving latency (p99) | 4.8 ms |
| Redis CPU utilization | 78 % on masters |
| Autoscaling events | 2 scale‑outs per day during flash‑sale spikes |
The system demonstrated linear scaling: adding a fourth master increased write throughput by ~30 % and reduced read latency by ~15 % under load.
Conclusion
Scaling a realtime feature store for high‑throughput microservices is a multi‑disciplinary challenge that blends data modeling, systems engineering, and Go programming expertise. By leveraging:
- Redis’s in‑memory speed, rich data structures, and native clustering,
- Go’s lightweight concurrency, strong typing, and robust tooling, and
- Thoughtful schema design, pipelined ingestion, atomic reads, and observability,
you can build a platform that delivers sub‑10 ms latency at hundreds of thousands of operations per second. The patterns explored—entity‑centric hashes, time‑series sorted sets, Lua‑based atomic fetches, and worker‑pool pipelines—provide a solid foundation that can be adapted to any domain, from recommendation engines to fraud detection.
Remember that operational excellence—monitoring, automated failover, and rigorous testing—are as crucial as the code itself. With the blueprint and examples in this article, you’re equipped to design, implement, and operate a production‑grade realtime feature store that scales alongside your microservice ecosystem.
Happy coding, and may your features always be fresh! 🚀
Resources
- Redis Documentation – Data Structures & Clustering
- Go Redis Client – github.com/redis/go-redis/v9
- Feature Store Overview – Google Cloud Blog
- OpenTelemetry for Go
- Prometheus Redis Exporter