First published Mar 5, 2026 · Updated June 19, 2026 · SRE & Observability · 10 min read
Observability in Rust applications must balance telemetry granularity against performance overhead. By using the `tracing` ecosystem, non-blocking appenders, and the OpenTelemetry Protocol (OTLP), startups can monitor production systems with less than 2% CPU overhead. This guide details how to build and initialize a production-ready telemetry stack.
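The standard logging approach in Rust relies on the `log` crate, but production monitoring requires structured logs and distributed traces. The `tracing` crate, maintained by the Tokio project, provides a diagnostic framework based on spans (which represent periods of time with a beginning and an end) and events (which represent single moments in time). The key design elements for low-overhead tracing covered in this guide are non-blocking log writers, runtime level filtering, and batched OTLP export.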
To initialize telemetry, developers combine multiple layers (JSON logs on stdout and OTLP trace export) into a single subscriber registry; Prometheus metrics are configured separately, as described later in this guide. To stay under the 2% CPU overhead target, it is crucial to use a non-blocking writer. The `tracing-appender` crate provides one: it spawns a dedicated worker thread that consumes log lines from a bounded in-memory queue, so stdout writes never block the Tokio executor threads. If the queue fills up, the default configuration drops new lines rather than blocking the caller. Below is the production-grade initialization pattern in Rust:
use opentelemetry_otlp::WithExportConfig;
use tracing_appender::non_blocking::WorkerGuard;
use tracing_subscriber::{prelude::*, EnvFilter, Registry};

// Note: this uses the `opentelemetry_otlp::new_pipeline()` builder, where
// `install_batch` returns a tracer. Newer opentelemetry-otlp releases changed
// this API, so check it against the versions you pin.
// Call this from inside a Tokio runtime: the batch exporter spawns a task on it.
fn init_telemetry() -> WorkerGuard {
    // 1. Configure non-blocking console appender
    let (non_blocking_writer, guard) = tracing_appender::non_blocking(std::io::stdout());
    let console_layer = tracing_subscriber::fmt::layer()
        .with_writer(non_blocking_writer)
        .json(); // Use JSON for production ingestion (requires the `json` feature)

    // 2. Configure EnvFilter for runtime log-level filtering (reads RUST_LOG)
    let filter = EnvFilter::try_from_default_env()
        .unwrap_or_else(|_| EnvFilter::new("info,tokio=warn"));

    // 3. Configure HTTP OTLP Exporter pipeline
    let otlp_exporter = opentelemetry_otlp::new_exporter()
        .http()
        .with_endpoint("http://otel-collector:4318/v1/traces");
    let tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(otlp_exporter)
        .install_batch(opentelemetry_sdk::runtime::Tokio)
        .expect("Failed to initialize OTel pipeline");
    let telemetry_layer = tracing_opentelemetry::layer().with_tracer(tracer);

    // 4. Register layers with the global tracing registry
    Registry::default()
        .with(filter)
        .with(console_layer)
        .with(telemetry_layer)
        .init();

    // Return the guard so it remains alive in main() and flushes buffer on exit
    guard
}
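This setup moves stdout writes off the application threads. Events are still formatted on the thread that emits them; only the write itself happens on the background worker. The returned `WorkerGuard` must be held in the main scope of the application, bound to a named variable such as `_guard` (a bare `let _ = init_telemetry();` drops it immediately). When the guard is dropped, the remaining buffered logs are flushed to stdout and the background worker shuts down, so dropping it early means later log lines may never be written. Additionally, the default `EnvFilter` directive `info,tokio=warn` ignores high-frequency `tokio` events unless their severity is warning or above. To quiet gRPC/HTTP instrumentation the same way, add directives for those crates' targets (for example `hyper=warn` or `tonic=warn`).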
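To make the role of the guard concrete, here is a minimal sketch of the worker-thread-plus-guard pattern using only the standard library. It is a toy model, not the `tracing-appender` implementation: `QueueWriter`, `FlushGuard`, `SharedBuf`, and this `non_blocking` function are hypothetical names introduced for illustration, and the in-memory sink stands in for stdout so the result can be inspected.

```rust
use std::io::Write;
use std::sync::mpsc::{sync_channel, SyncSender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Messages sent from application threads to the writer thread.
enum Msg {
    Line(Vec<u8>),
    Shutdown,
}

/// Hypothetical stand-in for the non-blocking writer handle.
#[derive(Clone)]
struct QueueWriter {
    tx: SyncSender<Msg>,
}

impl QueueWriter {
    /// Enqueue without blocking. Returns false when the line is dropped
    /// because the queue is full or the worker has already shut down.
    fn write_line(&self, line: &str) -> bool {
        self.tx.try_send(Msg::Line(line.as_bytes().to_vec())).is_ok()
    }
}

/// Hypothetical stand-in for a worker guard: dropping it drains the
/// queue and stops the worker thread.
struct FlushGuard {
    tx: SyncSender<Msg>,
    worker: Option<JoinHandle<()>>,
}

impl Drop for FlushGuard {
    fn drop(&mut self) {
        // Blocking send: the shutdown marker is queued behind every line
        // accepted so far, so those lines are written before the worker exits.
        let _ = self.tx.send(Msg::Shutdown);
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}

fn non_blocking<W: Write + Send + 'static>(mut sink: W, capacity: usize) -> (QueueWriter, FlushGuard) {
    let (tx, rx) = sync_channel::<Msg>(capacity);
    let worker = thread::spawn(move || {
        for msg in rx {
            match msg {
                Msg::Line(bytes) => {
                    let _ = sink.write_all(&bytes);
                }
                Msg::Shutdown => break,
            }
        }
        let _ = sink.flush();
    });
    (QueueWriter { tx: tx.clone() }, FlushGuard { tx, worker: Some(worker) })
}

/// In-memory sink so the example can inspect what the worker wrote.
#[derive(Clone, Default)]
struct SharedBuf(Arc<Mutex<Vec<u8>>>);

impl Write for SharedBuf {
    fn write(&mut self, data: &[u8]) -> std::io::Result<usize> {
        self.0.lock().unwrap().extend_from_slice(data);
        Ok(data.len())
    }
    fn flush(&mut self) -> std::io::Result<()> {
        Ok(())
    }
}

fn demo() -> (String, bool) {
    let buf = SharedBuf::default();
    let (writer, guard) = non_blocking(buf.clone(), 1024);
    writer.write_line("request started\n");
    writer.write_line("request finished\n");
    drop(guard); // drains both queued lines, then stops the worker
    let accepted_after_drop = writer.write_line("lost\n");
    let written = String::from_utf8(buf.0.lock().unwrap().clone()).unwrap();
    (written, accepted_after_drop)
}
```

In this sketch, `demo()` returns both queued lines plus `false` for the write attempted after the guard was dropped, which is the failure mode to avoid by keeping the real guard alive until the end of `main`.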
For high-frequency telemetry (like requests per second or latency percentiles), emitting a log event per data point introduces too much disk and network I/O overhead. Instead, developers can use the `metrics` or `opentelemetry` crates to aggregate numeric state in memory and serve it to a Prometheus scraper through a `/metrics` HTTP endpoint hosted on an isolated port.
In Rust, this is typically set up using the `metrics-exporter-prometheus` crate. This exporter installs a global in-memory recorder that maintains counters and histogram buckets. An auxiliary HTTP listener, built with `axum` or `hyper`, exposes these metrics. Below is a code block showing the initialization of the Prometheus recorder and the HTTP scrape handler:
use axum::{routing::get, Router};
use metrics_exporter_prometheus::PrometheusBuilder;
use std::net::SocketAddr;

async fn setup_metrics_server() {
    // 1. Build the Prometheus recorder and get a handle
    let builder = PrometheusBuilder::new();
    let handle = builder
        .install_recorder()
        .expect("Failed to install Prometheus recorder");

    // 2. Define the Router with the scraping endpoint
    let app = Router::new()
        .route("/metrics", get(move || {
            let handle = handle.clone();
            // Render the current metrics in the Prometheus text format
            async move { handle.render() }
        }));

    // 3. Bind to an isolated admin port
    // Note: 0.0.0.0 listens on every interface; restrict access at the network level
    let addr = SocketAddr::from(([0, 0, 0, 0], 9000));
    let listener = tokio::net::TcpListener::bind(addr)
        .await
        .expect("Failed to bind metrics listener");

    // Spawn the metrics HTTP server in the background
    tokio::spawn(async move {
        axum::serve(listener, app)
            .await
            .expect("Metrics server terminated unexpectedly");
    });
}
Serving metrics on port 9000 rather than the public application port keeps the telemetry endpoint separate from user traffic, but the port choice alone does not secure it. The listener above binds to `0.0.0.0`, so access still has to be restricted with firewall rules, a network policy, or by binding to an internal interface. A Prometheus server scrapes this port at a regular interval (e.g., every 15 seconds) to collect the request histograms. Because each record call only updates thread-safe atomic counters in memory, this registry incurs less than 50 nanoseconds of latency overhead per call, making it safe for microsecond-sensitive code paths.
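The idea of recording into atomic counters and rendering them only at scrape time can be sketched with the standard library alone. The example below is an illustrative model, not the internals of `metrics-exporter-prometheus`: `LatencyHistogram`, the bucket bounds, and the metric name `http_request_duration_ms` are all assumptions made for the example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Upper bounds (in milliseconds) of the latency buckets; illustrative values.
const BOUNDS_MS: [u64; 4] = [5, 25, 100, 500];

/// Hypothetical in-memory histogram: one atomic counter per bucket, plus an
/// overflow (+Inf) bucket, a running sum, and a total count.
#[derive(Default)]
struct LatencyHistogram {
    buckets: [AtomicU64; 5],
    sum_ms: AtomicU64,
    count: AtomicU64,
}

impl LatencyHistogram {
    /// Record one observation: three relaxed atomic adds, no locks, no I/O.
    fn record(&self, latency_ms: u64) {
        let idx = BOUNDS_MS
            .iter()
            .position(|&bound| latency_ms <= bound)
            .unwrap_or(BOUNDS_MS.len());
        self.buckets[idx].fetch_add(1, Ordering::Relaxed);
        self.sum_ms.fetch_add(latency_ms, Ordering::Relaxed);
        self.count.fetch_add(1, Ordering::Relaxed);
    }

    /// Render in the Prometheus text exposition format, where each bucket
    /// reports the cumulative count of observations at or below its bound.
    fn render(&self, name: &str) -> String {
        let mut out = format!("# TYPE {name} histogram\n");
        let mut cumulative = 0;
        for (i, bound) in BOUNDS_MS.iter().enumerate() {
            cumulative += self.buckets[i].load(Ordering::Relaxed);
            out.push_str(&format!("{name}_bucket{{le=\"{bound}\"}} {cumulative}\n"));
        }
        cumulative += self.buckets[BOUNDS_MS.len()].load(Ordering::Relaxed);
        out.push_str(&format!("{name}_bucket{{le=\"+Inf\"}} {cumulative}\n"));
        out.push_str(&format!("{name}_sum {}\n", self.sum_ms.load(Ordering::Relaxed)));
        out.push_str(&format!("{name}_count {}\n", self.count.load(Ordering::Relaxed)));
        out
    }
}

fn demo() -> String {
    let hist = LatencyHistogram::default();
    // Scoped threads let several workers share the histogram by reference.
    std::thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| {
                for latency in [3, 12, 80, 900] {
                    hist.record(latency);
                }
            });
        }
    });
    hist.render("http_request_duration_ms")
}
```

The design point is that the hot path (`record`) never formats text or touches the network; that cost is paid once per scrape in `render`, regardless of how many requests were recorded in between.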
We help tech teams configure low-overhead tracing, Prometheus metrics dashboards, and automated alert systems. Book a free call.