TL;DR: Both the JVM and the .NET CLR rely on a generational heap to keep most allocations cheap and most pauses short. Understanding young‑generation collection, promotion thresholds, and the concrete tuning knobs helps you reduce tail latency and memory footprint in production services.
Modern back‑end services run for weeks or months under heavy load, yet most of the objects they allocate live for only a few milliseconds. Generational garbage collection (GC) exploits that pattern, but the knobs and observability signals differ between the HotSpot/OpenJDK world and the .NET (CoreCLR) ecosystem. This post unpacks the theory, dives into the concrete implementations, and shows you how to profile, tune, and monitor generational GC in the two most common enterprise runtimes.
Generational GC Basics
Why “generational”?
Empirical studies of real‑world workloads (e.g., web servers, stream processors) show that most objects die young—often within a handful of allocations. By separating “young” objects from “old” ones, a collector can:
- Run frequent, cheap collections on a small region (the young generation).
- Avoid scanning the entire heap during most pauses, keeping latency low.
- Promote long‑lived objects to an old generation that is collected less often, using more aggressive algorithms (mark‑sweep, compacting) only when necessary.
Both the JVM and .NET adopt this model, but they expose different region sizes, promotion policies, and collection algorithms.
Core terminology
| Term | JVM | .NET |
|---|---|---|
| Young generation | Eden + 2 Survivor spaces (S0, S1) | Gen 0 and Gen 1 (the ephemeral generations) |
| Old generation | Tenured (old) generation | Gen 2, plus the Large Object Heap (LOH), which is collected together with Gen 2 |
| Promotion | Copy from Eden → Survivor → Tenured once an object's age reaches the tenuring threshold | Survivors of a Gen 0 GC move to Gen 1; survivors of a Gen 1 GC move to Gen 2 |
| Pause type | Young (minor) vs Full; both stop the world in the classic collectors | Gen 0/Gen 1 (ephemeral) vs Gen 2 (full; can run as a background GC) |
Understanding how each runtime maps these concepts to concrete memory regions is the first step toward effective tuning.
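The promotion row of the table can be made concrete with a toy model. The sketch below is not how HotSpot or the CLR is implemented; the class `PromotionModel`, its fields, and the single `tenuringThreshold` parameter are hypothetical names chosen only to illustrate age-based promotion.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of age-based promotion; purely illustrative, not a real collector. */
final class PromotionModel {
    static final class Obj {
        int age;            // number of young collections survived so far
        final boolean live; // whether the object is still reachable
        Obj(boolean live) { this.live = live; }
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    private final int tenuringThreshold;

    PromotionModel(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    /** New objects always start in the young generation. */
    void allocate(boolean live) {
        young.add(new Obj(live));
    }

    /** One simulated young collection: drop dead objects, age survivors, promote old-enough ones. */
    void youngCollect() {
        List<Obj> survivors = new ArrayList<>();
        for (Obj o : young) {
            if (!o.live) {
                continue; // unreachable: reclaimed without touching the old generation
            }
            o.age++;
            if (o.age >= tenuringThreshold) {
                old.add(o);       // promoted
            } else {
                survivors.add(o); // stays young for another cycle
            }
        }
        young.clear();
        young.addAll(survivors);
    }
}
```

With a threshold of 2, a live object stays in the young generation after the first collection and moves to the old generation after the second, while dead objects never reach the old generation at all.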
JVM Implementation
Heap layout in HotSpot
HotSpot splits the heap into three logical regions:
- Eden – where new objects are allocated.
- Survivor spaces (S0, S1) – two equally sized spaces that alternate as the “from” and “to” space during copying collection.
- Tenured (old) generation – holds promoted objects.
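The size of each region is configurable via flags such as -Xmn, -XX:SurvivorRatio, and -XX:NewRatio. The Serial and Parallel Scavenge collectors use this contiguous layout directly. G1 keeps the same logical roles but assigns them to fixed‑size regions that need not be contiguous, and ZGC does not use Eden or survivor spaces at all (it only became generational in JDK 21).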
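To see how a running JVM actually lays out these regions, the standard `java.lang.management` API can list the heap memory pools. This is a minimal sketch; the class name `HeapPools` is an assumption, and the pool names it prints depend on the collector in use (for example "G1 Eden Space" versus "PS Eden Space").

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class HeapPools {
    /** Returns one line per heap memory pool with its current usage. */
    static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache, and other non-heap pools
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            lines.add(String.format("%-25s used=%d KiB committed=%d KiB",
                    pool.getName(), usage.getUsed() / 1024, usage.getCommitted() / 1024));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeHeapPools().forEach(System.out::println);
    }
}
```

Running it with different collector flags is a quick way to confirm which logical regions a given collector exposes.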
Example: JVM GC log snippet (Unified Logging)
[0.123s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 512M->128M(1024M) 12.34ms
[0.456s][info][gc] GC(1) Pause Full (System.gc()) 128M->64M(1024M) 45.67ms
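The log tells you which kind of collection ran (Pause Young vs Pause Full), its cause, the heap occupancy before and after (with the current heap capacity in parentheses), and the pause duration. JDK Flight Recorder together with JDK Mission Control can show the same events on a timeline, while jcmd <pid> GC.heap_info prints a point‑in‑time snapshot of the heap layout.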
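As a rough illustration of reading these lines programmatically, the sketch below parses the two "Pause" line shapes shown in the snippet. It assumes exactly that layout (megabyte units, `info` level, `gc` tag only) and is not a general unified-logging parser; `GcPauseLine` is a hypothetical class name.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative parser for the "Pause Young" / "Pause Full" lines shown above. */
final class GcPauseLine {
    // [uptime][info][gc] GC(id) Pause <kind> ... <before>M-><after>M(<capacity>M) <millis>ms
    private static final Pattern PAUSE = Pattern.compile(
            "^\\[(\\d+\\.\\d+)s\\]\\[info\\]\\[gc\\] GC\\((\\d+)\\) Pause (Young|Full).* "
            + "(\\d+)M->(\\d+)M\\((\\d+)M\\) (\\d+\\.\\d+)ms$");

    final String kind;     // "Young" or "Full"
    final int beforeMb;    // heap occupancy before the collection
    final int afterMb;     // heap occupancy after the collection
    final int capacityMb;  // current heap capacity
    final double pauseMs;  // pause duration in milliseconds

    private GcPauseLine(String kind, int beforeMb, int afterMb, int capacityMb, double pauseMs) {
        this.kind = kind;
        this.beforeMb = beforeMb;
        this.afterMb = afterMb;
        this.capacityMb = capacityMb;
        this.pauseMs = pauseMs;
    }

    /** Returns the parsed pause, or empty if the line is not a recognised pause line. */
    static Optional<GcPauseLine> parse(String line) {
        Matcher m = PAUSE.matcher(line.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new GcPauseLine(
                m.group(3),
                Integer.parseInt(m.group(4)),
                Integer.parseInt(m.group(5)),
                Integer.parseInt(m.group(6)),
                Double.parseDouble(m.group(7))));
    }
}
```

Collecting `pauseMs` per `kind` over a whole log is usually enough to spot whether young or full collections dominate the pause budget.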
Parallel Scavenge vs G1
| Feature | Parallel Scavenge (PS) | G1 (Garbage‑First) |
|---|---|---|
| Young collection | Stop‑the‑world, parallel copying | Stop‑the‑world, parallel evacuation of young regions |
| Old collection | Stop‑the‑world parallel mark‑compact (full GC) | Concurrent marking followed by incremental mixed GCs; full GC only as a fallback |
| Tuning knobs | -XX:ParallelGCThreads, -XX:SurvivorRatio | -XX:MaxGCPauseMillis, -XX:InitiatingHeapOccupancyPercent |
| Typical use case | Throughput‑oriented services and batch jobs that tolerate occasional long pauses | Large heaps (>8 GB) where latency spikes must be bounded |
When to choose G1: If you need sub‑100 ms pause targets on a heap larger than 8 GB, G1’s predictive pause model lets you state the goal directly with -XX:MaxGCPauseMillis. Treat it as a soft target that G1 tries to meet, not a guaranteed SLA.
When PS shines: Small‑to‑medium heaps (<4 GB) where throughput is the primary metric; PS often delivers higher throughput because it avoids the extra bookkeeping of G1.
Tuning the young generation
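- Set a target young size – -Xmn (absolute) or -XX:NewRatio (relative). A rule of thumb: young ≈ 1/3 of total heap for latency‑sensitive services. With G1, consider leaving the young size unset, because fixing it stops G1 from resizing the young generation to meet its pause goal.
- Adjust survivor ratio – -XX:SurvivorRatio=8 makes Eden eight times the size of each survivor space, so Eden gets ~80 % of the young generation and each survivor ~10 %.
- Control promotion threshold – -XX:MaxTenuringThreshold=6 caps the number of young GCs an object can survive before promotion at six; HotSpot may promote earlier when the survivor space fills up. Raise it if short‑lived objects are promoted prematurely and inflate old‑gen occupancy; lower it if repeatedly copying long‑lived objects between survivor spaces hurts throughput.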
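The survivor-ratio arithmetic is easy to get wrong, so here is a small worked sketch. It assumes the usual definition (SurvivorRatio = Eden size / one survivor size, with two survivor spaces) and ignores the alignment HotSpot applies to real region sizes; `YoungGenSizing` is a hypothetical helper name.

```java
/** Back-of-the-envelope young-generation sizing; real JVM sizes are aligned and may differ slightly. */
final class YoungGenSizing {
    /** Size of each survivor space: young / (ratio + 2), since Eden counts as `ratio` parts. */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    /** Eden takes whatever the two survivor spaces leave over. */
    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long young = 2L * 1024 * 1024 * 1024; // -Xmn2g
        int ratio = 8;                        // -XX:SurvivorRatio=8
        System.out.printf("eden=%d MiB, each survivor=%d MiB%n",
                edenBytes(young, ratio) / (1024 * 1024),
                survivorBytes(young, ratio) / (1024 * 1024));
    }
}
```

For -Xmn2g and -XX:SurvivorRatio=8 this gives roughly 1.6 GiB of Eden and about 205 MiB per survivor space, which is the 80/10/10 split described in the list.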
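Example: JVM startup flags for a latency‑critical microservice (G1 selected explicitly; note that -Xmn pins the young size, so G1 will not resize it to chase the pause goal)
java -XX:+UseG1GC \
  -Xms8g -Xmx8g \
  -Xmn2g \
  -XX:SurvivorRatio=8 \
  -XX:MaxTenuringThreshold=4 \
  -XX:ParallelGCThreads=8 \
  -XX:ConcGCThreads=2 \
  -XX:MaxGCPauseMillis=50 \
  -jar myservice.jar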
.NET CLR Implementation
Memory segments in .NET Core
The .NET runtime divides managed memory into three generations:
| Generation | Typical size | Collection trigger |
|---|---|---|
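| Gen 0 | Small; sized dynamically from an allocation budget | Gen 0 allocation budget is exhausted |
| Gen 1 | Small; acts as a buffer between Gen 0 and Gen 2 | Objects promoted from Gen 0 exceed the Gen 1 budget |
| Gen 2 | Remainder of the small object heap; the LOH (objects of 85,000 bytes or more) is a separate heap collected along with it | Gen 2 or LOH budget is exceeded, the machine or container is under memory pressure, or the heap approaches GCHeapHardLimit |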
The Large Object Heap (LOH) is a special region that is collected only during full (Gen 2) GCs and is not compacted by default. To mitigate fragmentation, you can request a one‑time compaction by setting GCSettings.LargeObjectHeapCompactionMode to GCLargeObjectHeapCompactionMode.CompactOnce before the next blocking full GC.
Server vs Workstation GC
- Workstation GC – optimized for low‑latency desktop apps; uses a single background thread for concurrent collections.
- Server GC – scales with CPU count; each logical processor gets its own GC thread, yielding higher throughput on multi‑core servers.
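The mode is fixed at process start: set System.GC.Server in runtimeconfig.json, <ServerGarbageCollector>true</ServerGarbageCollector> in the project file, or the DOTNET_gcServer environment variable (<gcServer enabled="true"/> in app.config applies to .NET Framework). At runtime, GCSettings.IsServerGC reports which mode is active; GCSettings.LatencyMode adjusts latency behavior but does not switch between Workstation and Server GC.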
Example: runtimeconfig.json enabling Server GC with an 8 GB heap hard limit
{
  "runtimeOptions": {
    "configProperties": {
      "System.GC.Server": true,
      "System.GC.HeapHardLimit": 8589934592
    }
  }
}
Tuning the ephemeral segment
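Gen 0 does not have a fixed size: the runtime computes an allocation budget for it, based largely on CPU cache size and observed survival rates, and triggers a Gen 0 GC when that budget is exhausted. GC settings can also be supplied as environment variables with the DOTNET_ prefix (COMPlus_ on older runtimes), for example DOTNET_GCHeapCount or DOTNET_GCgen0size (the minimum Gen 0 budget); their numeric values are interpreted as hexadecimal. In practice, most production teams adjust GCHeapHardLimit (total heap) and let the runtime compute Gen 0 size automatically.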
Example: PowerShell script to set environment variables for a container
$env:DOTNET_GCHeapHardLimit = "0x200000000"  # 8 GB; the value is read as hexadecimal
$env:DOTNET_GCHeapCount = "4"                # number of Server GC heaps (also hexadecimal)
$env:DOTNET_gcServer = "1"
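The .NET documentation describes the numeric GC environment variables as hexadecimal values, so a byte count such as 8 GB has to be converted before it is exported. The helper below is a minimal sketch of that conversion for a deployment script or config generator; `GcEnvHex` and `toGcHex` are hypothetical names, not part of any .NET or JVM tooling.

```java
/** Converts a byte count into the hexadecimal form expected by the .NET GC environment variables. */
final class GcEnvHex {
    static String toGcHex(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("byte count must not be negative");
        }
        return "0x" + Long.toHexString(bytes);
    }

    public static void main(String[] args) {
        long eightGiB = 8L * 1024 * 1024 * 1024;
        // Prints a line that can be pasted into an env file
        System.out.println("DOTNET_GCHeapHardLimit=" + toGcHex(eightGiB));
    }
}
```

Passing a plain decimal string instead would be read as a much larger hexadecimal number, which is an easy mistake to make when the limit is copied from a JSON config.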
Observability
- dotnet-counters: dotnet-counters monitor -p <pid> System.Runtime shows gen-0-gc-count, gen-1-gc-count, gen-2-gc-count, and heap sizes.
- EventPipe: Capture Microsoft-Windows-DotNETRuntime events for GC start/end timestamps.
- PerfView: Visualizes GC pauses, allocation stacks, and LOH fragmentation.
Architecture Patterns in Production
1. “GC‑Friendly” allocation discipline
- Object pooling for frequently reused large objects (e.g., buffers of 85,000 bytes or more, which .NET allocates on the LOH) reduces pressure on the LOH.
- Avoid long‑lived mutable collections in hot paths; instead, use immutable snapshots that can be reclaimed quickly.
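- Prefer Span<T>/Memory<T> over allocating fresh byte[] copies when working with I/O buffers; both are views that can wrap arrays rented from ArrayPool<T>, Span<T> can also wrap stack memory (stackalloc), and slicing them does not allocate.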
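The pooling idea in the list above is runtime-agnostic. As a rough JVM-side counterpart to renting from ArrayPool<T>, here is a minimal fixed-size buffer pool; `BufferPool`, `rent`, and `giveBack` are hypothetical names, and a production pool would also need to consider clearing buffers and bounding memory per size class.

```java
import java.util.ArrayDeque;

/** Minimal thread-safe pool of equally sized byte buffers; a sketch, not a tuned implementation. */
final class BufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private final int bufferSize;
    private final int maxPooled;

    BufferPool(int bufferSize, int maxPooled) {
        this.bufferSize = bufferSize;
        this.maxPooled = maxPooled;
    }

    /** Reuses a pooled buffer when one is available, otherwise allocates a new one. */
    synchronized byte[] rent() {
        byte[] buffer = free.pollFirst();
        return buffer != null ? buffer : new byte[bufferSize];
    }

    /** Returns a buffer to the pool; foreign or surplus buffers are simply left to the GC. */
    synchronized void giveBack(byte[] buffer) {
        if (buffer != null && buffer.length == bufferSize && free.size() < maxPooled) {
            free.addFirst(buffer);
        }
    }

    synchronized int pooledCount() {
        return free.size();
    }
}
```

Because pooled buffers are long-lived by design, they settle in the old generation once and stop contributing to young-generation churn, which is the effect the pattern is after.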
2. Multi‑process isolation
Running a high‑throughput API alongside a batch job in the same JVM or .NET process can cause GC interference: the batch job’s allocation spikes trigger full GCs that stall the API. The pattern is to split workloads into separate processes or containers, each with its own heap size and GC configuration.
3. Adaptive pause budgeting
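Only the JVM exposes an explicit pause‑time goal (-XX:MaxGCPauseMillis). .NET has no equivalent target; you shape pauses indirectly through the GC flavor (Server vs Workstation, background GC), the latency mode, and heap limits such as GCHeapHardLimit. In a CI/CD pipeline, you can: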
- Run a load test (e.g., with k6 or Locust).
- Extract the 95th‑percentile GC pause from GC logs (-Xlog:gc) or a JFR recording on the JVM, or from a dotnet-trace capture on .NET.
- Feed the result back into a configuration-as-code file for the next deployment.
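The percentile step in the loop above can be sketched as follows. This uses the nearest-rank definition of a percentile, which is one of several common conventions; `PausePercentile` is a hypothetical helper, and the pause samples are assumed to have been extracted from logs or traces already.

```java
import java.util.Arrays;

/** Nearest-rank percentile over GC pause samples (milliseconds). */
final class PausePercentile {
    static double percentile(double[] pausesMs, int p) {
        if (pausesMs.length == 0) {
            throw new IllegalArgumentException("no pause samples");
        }
        if (p < 1 || p > 100) {
            throw new IllegalArgumentException("p must be between 1 and 100");
        }
        double[] sorted = pausesMs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (p * sorted.length + 99) / 100; // integer form of ceil(p * n / 100)
        return sorted[rank - 1];
    }

    /** True when the p95 pause stays within the given budget. */
    static boolean withinBudget(double[] pausesMs, double budgetMs) {
        return percentile(pausesMs, 95) <= budgetMs;
    }
}
```

A pipeline step could fail the build or open a tuning ticket when `withinBudget` returns false for the load-test run.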
4. Hybrid collector usage (JVM)
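For workloads that have both low‑latency request paths and occasional bulk processing, run each tier in its own JVM and pick the collector per process: -XX:+UseG1GC for the service tier and -XX:+UseParallelGC for the batch tier. The collector is chosen at JVM startup and cannot be switched at runtime. The experimental no‑op collector (-XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC) never reclaims memory, so it only suits short‑lived or benchmark processes whose total allocation fits in the heap (e.g., pure streaming of pre‑allocated buffers); the JVM exits with an OutOfMemoryError once the heap is exhausted.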
Monitoring and Tuning in Production
| Metric | Source | Typical alert threshold |
|---|---|---|
| Young GC pause time | GC logs, jstat -gc / dotnet-counters, dotnet-trace | 95th percentile > 30 ms (JVM) or > 20 ms (.NET) |
| Old generation occupancy | jcmd GC.heap_info / dotnet-counters | > 75 % of total heap |
| Promotion rate (objects/sec) | GC logs (-Xlog:gc+promotion) | Sudden spikes > 10 % of allocation rate |
| LOH fragmentation | dotnet-counters (System.Runtime: loh-size, gc-fragmentation) / PerfView | > 30 % free space in LOH |
| GC CPU % | GC logs or JFR (JVM) / dotnet-counters time-in-gc (.NET) | > 10 % sustained |
Dashboard example (Grafana): Plot jvm_gc_pause_seconds_sum (Prometheus metric) alongside request latency (http_request_duration_seconds). Correlating spikes often reveals a missing tuning knob (e.g., too small young generation).
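For a quick in-process approximation of GC overhead on the JVM, the standard `GarbageCollectorMXBean` exposes accumulated collection time. The sketch below reports that time as a share of JVM uptime; it measures elapsed time rather than CPU time, and for concurrent collectors the reported time may include concurrent phases, so treat it as a coarse signal. `GcOverhead` is a hypothetical class name.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Approximate share of JVM uptime spent in garbage collection. */
final class GcOverhead {
    static double gcTimePercent() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // accumulated milliseconds, or -1 if undefined
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : 100.0 * gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        System.out.printf("GC time: %.2f %% of uptime%n", gcTimePercent());
    }
}
```

Sampling this value periodically and exporting the delta between samples gives a sustained-overhead series that can sit next to the request-latency panel.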
Automated tuning with JDK Flight Recorder (JFR)
jcmd <pid> JFR.start settings=profile \
  duration=5m \
  filename=myservice.jfr
The recorded file can be analyzed in JDK Mission Control, where the garbage collection views show allocation rate, promotion failures, and heap pressure. Findings from the recording can then drive a CI step that adjusts -XX:MaxTenuringThreshold, though the decision rule is something you have to define yourself.
.NET GC tuning via GCSettings.LatencyMode
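using System;
using System.Runtime;

class Program {
    static void Main() {
        // Remember the current mode so it can be restored afterwards
        GCLatencyMode previous = GCSettings.LatencyMode;
        try {
            // LowLatency suppresses blocking Gen 2 GCs (Workstation GC only)
            GCSettings.LatencyMode = GCLatencyMode.LowLatency;
            ProcessCriticalWork();
        } finally {
            // Always revert, even if the critical section throws
            GCSettings.LatencyMode = previous;
        }
    }

    // Placeholder for the latency-critical section
    static void ProcessCriticalWork() => Console.WriteLine("critical work done");
}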
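Low‑latency mode suppresses blocking Gen 2 collections, but Gen 0 and Gen 1 collections still run, so the window is not pause‑free, and the heap can grow because old garbage is not reclaimed. It applies to Workstation GC only; GCLatencyMode.SustainedLowLatency is the alternative for longer periods and also works with Server GC. Use it sparingly (e.g., around a single transaction) and revert immediately.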
Key Takeaways
- Generational GC isolates short‑lived objects, keeping most collections cheap and pause‑times predictable.
- In the JVM, the young generation consists of Eden + two survivor spaces; tuning -Xmn, -XX:SurvivorRatio, and -XX:MaxTenuringThreshold directly impacts promotion churn.
- .NET’s ephemeral generations (Gen 0 and Gen 1) are sized automatically, but you can influence heap limits and request LOH compaction to reduce fragmentation.
- Choose G1 for large heaps with strict latency goals, and Parallel Scavenge for smaller, throughput‑focused services.
- Use Server GC on multi‑core .NET services, and set environment variables (COMPlus_…) for fine‑grained control in containers.
- Adopt production patterns: object pooling, workload isolation, adaptive pause budgeting, and continuous observability via JFR, dotnet‑counters, and Grafana dashboards.
Further Reading
- Java HotSpot VM Garbage Collection Tuning Guide
- .NET Runtime Garbage Collection Overview
- Understanding G1 GC in Java 11