Cloud Trace is a distributed tracing service that collects latency data from applications and provides near real-time performance insights. For the ACE exam, this topic appears in about 5-8% of questions, often integrated with Cloud Monitoring and Cloud Logging. This chapter covers how Trace works, how to configure it, how to analyze trace data, and how to troubleshoot latency issues using the Trace console and API.
Imagine you run a package delivery company and want to understand why some deliveries are slow. Each truck has a GPS tracker that logs every stop, detour, and delay. When a package is delayed, you can replay its entire route: the time it left the warehouse, the duration at each sorting facility, the minutes spent in traffic, and the exact moment it was handed to the customer. This is Cloud Trace. It collects latency data (spans) from each service (truck) as a request travels through your distributed system (delivery route). Each span records a specific operation (e.g., database query, API call) with start and end timestamps, labels, and metadata. The trace is the complete journey of a single request (package). By analyzing traces, you can pinpoint which service is the bottleneck (which sorting facility is slow) and optimize it. Without Trace, you'd only see the total delivery time and guess which leg caused the delay.
What is Cloud Trace and Why It Exists
Cloud Trace is a managed distributed tracing service that captures end-to-end latency data for requests as they travel through a distributed system. It is part of Google Cloud's operations suite (formerly Stackdriver). The primary goal is to help developers and SREs identify performance bottlenecks, debug latency issues, and optimize application performance. Unlike simple logging or metrics, Trace provides a causal view of a request's path across services, showing exactly where time is spent.
How Cloud Trace Works Internally
Cloud Trace uses the concept of spans and traces. A trace represents the entire journey of a single request (e.g., an HTTP request from a user to a web server that calls a backend service and a database). A span is a single unit of work within a trace, such as a database query, a function call, or an HTTP request to another service. Each span has:
A span ID (unique within the trace)
A trace ID (shared by all spans in the trace)
A parent span ID (to establish hierarchy)
Start and end timestamps (nanosecond precision)
Labels (key-value pairs for metadata like HTTP method, status code, etc.)
Span kind (e.g., CLIENT, SERVER, PRODUCER, CONSUMER)
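The fields listed above can be pictured with a small in-memory model. This is only a sketch: the `Span` class, its field names, and the sample trace are illustrative assumptions, not the Cloud Trace API schema. It groups spans by parent span ID to recover the hierarchy of one trace.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Illustrative span record; field names are assumptions, not the API schema."""
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]  # None marks the root span
    name: str
    start_ns: int
    end_ns: int
    kind: str = "SERVER"
    labels: dict = field(default_factory=dict)

    @property
    def duration_ms(self) -> float:
        return (self.end_ns - self.start_ns) / 1_000_000


def children_by_parent(spans):
    """Map each parent span ID to its child spans, ordered by start time."""
    tree = {}
    for span in sorted(spans, key=lambda s: s.start_ns):
        tree.setdefault(span.parent_span_id, []).append(span)
    return tree


def render(spans):
    """Return indented lines showing the span hierarchy of one trace."""
    tree = children_by_parent(spans)
    lines = []

    def walk(parent_id, depth):
        for span in tree.get(parent_id, []):
            lines.append(f"{'  ' * depth}{span.name} ({span.duration_ms:.0f} ms)")
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


TRACE_ID = "4bf92f3577b34da6a3ce929d0e0e4736"  # made-up 32-character hex ID
spans = [
    Span(TRACE_ID, "1", None, "GET /checkout", 0, 120_000_000),
    Span(TRACE_ID, "2", "1", "cart.get", 5_000_000, 25_000_000, kind="CLIENT"),
    Span(TRACE_ID, "3", "1", "db.query", 30_000_000, 110_000_000, kind="CLIENT",
         labels={"db.system": "postgresql"}),
]
print("\n".join(render(spans)))
# GET /checkout (120 ms)
#   cart.get (20 ms)
#   db.query (80 ms)
```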
The tracing mechanism relies on context propagation. When a request enters the system (e.g., at a load balancer or application frontend), a trace ID is generated. This ID is passed along to every downstream service via HTTP headers (e.g., X-Cloud-Trace-Context). Each service creates its own spans, linking them to the parent via the parent span ID. The spans are then sent asynchronously to the Cloud Trace backend, where they are aggregated and stored.
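To make the propagation step concrete, here is a minimal sketch that parses an incoming X-Cloud-Trace-Context value and builds the header for a downstream call. It assumes the documented `TRACE_ID/SPAN_ID;o=OPTIONS` layout (32 hex characters, a decimal span ID, and `o=1` meaning "trace this request"); the `TraceContext`, `parse`, and `outgoing_headers` names are hypothetical helpers, and real applications would normally let an OpenTelemetry propagator do this.

```python
import re
from typing import NamedTuple, Optional

HEADER = "X-Cloud-Trace-Context"
_PATTERN = re.compile(r"^([0-9a-fA-F]{32})/(\d+)(?:;o=(\d))?$")


class TraceContext(NamedTuple):
    trace_id: str   # 32 hex characters, shared by every span in the trace
    span_id: int    # decimal ID of the caller's span (parent of the next span)
    sampled: bool   # True when the o=1 option asks downstream services to trace


def parse(value: str) -> Optional[TraceContext]:
    """Parse a header value; return None when it is malformed."""
    match = _PATTERN.match(value.strip())
    if match is None:
        return None
    trace_id, span_id, options = match.groups()
    return TraceContext(trace_id.lower(), int(span_id), options == "1")


def outgoing_headers(ctx: TraceContext, current_span_id: int) -> dict:
    """Headers for a downstream call: same trace ID, current span as the parent."""
    flag = 1 if ctx.sampled else 0
    return {HEADER: f"{ctx.trace_id}/{current_span_id};o={flag}"}


incoming = parse("105445aa7843bc8bf206b12000100000/1;o=1")
print(outgoing_headers(incoming, current_span_id=42))
```

The trace ID stays constant across hops while the span ID changes, which is what lets the backend stitch spans from different services into one trace.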
Key Components, Values, Defaults, and Timers
Trace Sampling: By default, Cloud Trace uses rate-based sampling of 1 request per second per instance, which caps trace volume and cost. You can instead configure probability-based sampling with a ratio between 0 and 1 (for example, 0.1 traces 10% of requests and 1 traces every request).
Span Limits: A single trace can have up to 256 spans. If more spans are generated, additional spans are dropped.
Trace Retention: Traces are retained for 30 days. This period is not configurable, so data you need for longer must be exported.
Export: Traces can be exported to Cloud Storage or BigQuery for long-term analysis.
API: Cloud Trace has both a gRPC API and a REST API. The v2 API is the current version.
Integration: Cloud Trace integrates with Cloud Monitoring (for latency metrics like trace/span/response_latencies) and Cloud Logging (to correlate logs with traces using the trace ID).
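The rate-based default described under Trace Sampling can be pictured as a simple per-instance limiter. The sketch below is an assumption about the general idea, not Cloud Trace's actual implementation; `RateLimitedSampler` is a hypothetical class, and the clock is injectable so the behavior can be shown deterministically.

```python
import time


class RateLimitedSampler:
    """Samples at most `max_per_second` requests per second on this instance."""

    def __init__(self, max_per_second: float = 1.0, clock=time.monotonic):
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._next_allowed = clock()

    def should_sample(self) -> bool:
        now = self._clock()
        if now >= self._next_allowed:
            # Trace this request and block further sampling for one interval.
            self._next_allowed = now + self._interval
            return True
        return False


# Drive the sampler with a fake clock: six requests arriving over ~2 seconds.
fake_now = [0.0]
sampler = RateLimitedSampler(1.0, clock=lambda: fake_now[0])
decisions = []
for arrival in (0.0, 0.2, 0.9, 1.0, 1.5, 2.1):
    fake_now[0] = arrival
    decisions.append(sampler.should_sample())
print(decisions)  # [True, False, False, True, False, True]
```

Note how the traced count depends on elapsed time rather than on traffic volume, which is why a rate-based sampler can miss a rare slow request on a busy instance.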
Configuration and Verification Commands
To enable Cloud Trace for a project, you need to enable the Cloud Trace API:
    gcloud services enable cloudtrace.googleapis.com
To view traces in the console: Navigate to Trace > Trace List in the Cloud Console.
To query traces using the API:
    gcloud alpha trace traces list --project=my-project --limit=10
To create a custom span in your application (using Python OpenTelemetry):
    from opentelemetry import trace

    tracer = trace.get_tracer(__name__)
    with tracer.start_as_current_span("my_custom_operation") as span:
        span.set_attribute("my_key", "my_value")
        # do work
To verify that Cloud Trace is collecting data, check the Trace List for any traces. If no traces appear, ensure the application is instrumented and the API is enabled.
How Cloud Trace Interacts with Related Technologies
Cloud Monitoring: Trace latency data is automatically exported as metrics. You can create alerting policies based on latency thresholds (e.g., p99 latency > 500ms).
Cloud Logging: Log entries can include the trace ID (logging.googleapis.com/trace). This allows you to jump from a log entry to the corresponding trace.
Cloud Profiler: While Trace shows latency distribution across requests, Profiler shows CPU and memory consumption per function.
Error Reporting: Errors logged with the trace ID can be linked to the trace where the error occurred.
App Engine, GKE, Cloud Run: These services can auto-instrument traces with minimal configuration (e.g., App Engine standard environment automatically sends trace data).
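For the Cloud Logging correlation mentioned above, a structured JSON log line might be built as follows. This is a hedged sketch: it assumes the application writes JSON logs that a Google Cloud logging agent or managed runtime picks up, and `log_entry`, the project ID, and the IDs shown are made up for illustration.

```python
import json


def log_entry(message, project_id, trace_id, span_id=None, sampled=True,
              severity="INFO"):
    """Build a JSON log line carrying the special trace-correlation fields."""
    entry = {
        "severity": severity,
        "message": message,
        # Full resource name of the trace this log line belongs to.
        "logging.googleapis.com/trace": f"projects/{project_id}/traces/{trace_id}",
        "logging.googleapis.com/trace_sampled": sampled,
    }
    if span_id is not None:
        entry["logging.googleapis.com/spanId"] = span_id
    return json.dumps(entry)


line = log_entry(
    "payment authorized",
    project_id="my-project",                      # hypothetical project
    trace_id="105445aa7843bc8bf206b12000100000",  # made-up trace ID
    span_id="000000000000004a",
)
print(line)
```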
Advanced Topics: Trace Filtering and Analysis
The Trace console provides a Trace List view where you can filter traces by:
- Time range (up to 30 days)
- Latency (e.g., traces slower than 1 second)
- HTTP method
- URL
- Status code
- Custom labels (e.g., user_id)
You can also view latency distributions to see percentiles (p50, p95, p99) of request latencies over time. The Trace Details page shows a Gantt chart of spans, allowing you to drill into each span's latency and labels.
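To see why percentiles matter more than averages here, the following sketch computes p50, p95, and p99 with the nearest-rank method over a handful of made-up latencies. The console may use a different interpolation; the sample values are purely illustrative.

```python
import math


def percentile(latencies_ms, p):
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    if not latencies_ms:
        raise ValueError("no samples")
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Ten request latencies in milliseconds, one of them a slow outlier.
latencies_ms = [120, 95, 110, 130, 105, 2400, 115, 100, 125, 90]

for p in (50, 95, 99):
    print(f"p{p} = {percentile(latencies_ms, p)} ms")
# p50 = 110 ms
# p95 = 2400 ms
# p99 = 2400 ms
```

The median looks healthy while the tail percentiles expose the single slow request, which is the kind of trace worth opening in the details view.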
Cost Considerations
Cloud Trace charges based on the volume of ingested spans. The first 2.5 million spans per month are free. After that, the price is $0.20 per million spans. Sampling is crucial to control costs.
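Because billing follows ingested span volume, a rough estimate of that volume under different sampling ratios is a useful planning step. The sketch below is back-of-the-envelope arithmetic only; the traffic figures (200 requests per second, 8 spans per trace) are hypothetical, and a 30-day month is assumed.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month, an approximation


def monthly_spans(requests_per_second, sample_ratio, spans_per_trace):
    """Estimate spans ingested per month for a steady request rate."""
    return round(requests_per_second * sample_ratio * spans_per_trace * SECONDS_PER_MONTH)


# Hypothetical service: 200 requests/second, 8 spans per traced request.
for ratio in (1.0, 0.1, 0.01):
    print(f"ratio {ratio}: {monthly_spans(200, ratio, 8):,} spans/month")
# ratio 1.0: 4,147,200,000 spans/month
# ratio 0.1: 414,720,000 spans/month
# ratio 0.01: 41,472,000 spans/month
```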
Enable Cloud Trace API
First, ensure the Cloud Trace API is enabled for your project. Use `gcloud services enable cloudtrace.googleapis.com`. Without this, trace data cannot be sent to Google Cloud. Verify with `gcloud services list --enabled | grep cloudtrace`.
Instrument Your Application
Add tracing instrumentation to your application code using OpenTelemetry SDKs or Google's Cloud Trace libraries. For example, in Python, install `opentelemetry-api`, `opentelemetry-sdk`, and `opentelemetry-exporter-gcp-trace`. Configure the exporter to send spans to the Cloud Trace API. With the relevant OpenTelemetry instrumentation libraries installed, incoming and outgoing HTTP requests and gRPC calls are captured automatically; custom operations need spans that you create yourself.
Configure Sampling Rate
Set a sampling rate to control how many requests are traced. Use environment variables or code configuration. For example, in OpenTelemetry, set `OTEL_TRACES_SAMPLER=parentbased_traceidratio` and `OTEL_TRACES_SAMPLER_ARG=0.1` for 10% sampling. The default rate-based sampling (1 req/sec) may miss occasional slowdowns; adjust based on traffic volume and cost tolerance.
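The idea behind `parentbased_traceidratio` can be sketched in a few lines: a root request is sampled when its trace ID falls below a threshold derived from the ratio, and every downstream service simply follows the parent's decision. This is a simplified illustration of the concept, not the OpenTelemetry SDK's code; both function names are hypothetical.

```python
def ratio_decision(trace_id_hex: str, ratio: float) -> bool:
    """Sample when the low 64 bits of the trace ID fall below ratio * 2**64."""
    low_64 = int(trace_id_hex, 16) & (2**64 - 1)
    return low_64 < int(ratio * 2**64)


def parent_based_decision(trace_id_hex, ratio, parent_sampled=None):
    """Follow the parent's decision if there is one; otherwise apply the ratio."""
    if parent_sampled is not None:
        return parent_sampled
    return ratio_decision(trace_id_hex, ratio)


low_id = "0" * 31 + "1"   # low 64 bits are tiny, so it is sampled at 10%
high_id = "f" * 32        # low 64 bits are the maximum, so it is not

print(parent_based_decision(low_id, 0.1))                        # True
print(parent_based_decision(high_id, 0.1))                       # False
print(parent_based_decision(high_id, 0.1, parent_sampled=True))  # True
```

Deriving the decision from the trace ID means every service reaches the same verdict for the same request, so traces are either complete or absent rather than partially recorded.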
Deploy and Generate Traffic
Deploy the instrumented application to Google Cloud (e.g., Compute Engine, GKE, Cloud Run). Generate real or synthetic traffic to trigger requests. As requests flow, spans are created and exported asynchronously to Cloud Trace. The exporter batches spans and sends them every few seconds (default 5 seconds) or when the buffer is full.
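The batching behavior described above (flush on a timer or when the buffer fills) might look roughly like this. It is a simplified sketch: real exporters flush from a background thread, whereas this hypothetical `SpanBatcher` only checks the timer when a span is added, and the batch size and interval are illustrative values.

```python
import time


class SpanBatcher:
    """Buffers spans and hands them to `export` in batches."""

    def __init__(self, export, max_batch_size=512, flush_interval_s=5.0,
                 clock=time.monotonic):
        self._export = export
        self._max = max_batch_size
        self._interval = flush_interval_s
        self._clock = clock
        self._buffer = []
        self._last_flush = clock()

    def add(self, span):
        self._buffer.append(span)
        buffer_full = len(self._buffer) >= self._max
        interval_elapsed = self._clock() - self._last_flush >= self._interval
        if buffer_full or interval_elapsed:
            self.flush()

    def flush(self):
        if self._buffer:
            self._export(list(self._buffer))
            self._buffer.clear()
        self._last_flush = self._clock()


# Fake clock and an in-memory "backend" so the behavior is deterministic.
now = [0.0]
exported = []
batcher = SpanBatcher(exported.append, max_batch_size=3, flush_interval_s=5.0,
                      clock=lambda: now[0])
for name in ("a", "b", "c"):   # the third span fills the buffer
    batcher.add(name)
now[0] = 6.0
batcher.add("d")               # more than 5 s since the last flush
print(exported)                # [['a', 'b', 'c'], ['d']]
```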
View and Analyze Traces
In the Cloud Console, navigate to Trace > Trace List. Here you see a list of traces with their latency, method, and URL. Click on a trace to see a waterfall view of spans. Identify slow spans by their duration. Use filters to isolate problematic requests (e.g., status=500, latency>2s). Create a latency distribution chart to monitor p50/p95/p99 over time.
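When reading a waterfall, a span's total duration includes the time spent in its children, so the slowest-looking span is not always the culprit. A minimal sketch of "self time" (duration minus direct children) is shown below; it assumes children run sequentially without overlapping, and the span dictionaries are made-up data rather than an API response.

```python
def self_times(spans):
    """Return {span_id: self_time_ms}, i.e. duration minus direct children's durations.

    Assumes a span's children do not overlap one another.
    """
    child_total = {}
    for span in spans:
        if span["parent"] is not None:
            duration = span["end_ms"] - span["start_ms"]
            child_total[span["parent"]] = child_total.get(span["parent"], 0) + duration
    return {
        span["id"]: (span["end_ms"] - span["start_ms"]) - child_total.get(span["id"], 0)
        for span in spans
    }


spans = [
    {"id": "GET /orders", "parent": None, "start_ms": 0, "end_ms": 900},
    {"id": "auth.verify", "parent": "GET /orders", "start_ms": 10, "end_ms": 60},
    {"id": "orders.list", "parent": "GET /orders", "start_ms": 70, "end_ms": 880},
    {"id": "db.query", "parent": "orders.list", "start_ms": 90, "end_ms": 850},
]

times = self_times(spans)
bottleneck = max(times, key=times.get)
print(times)
print("bottleneck:", bottleneck)  # bottleneck: db.query
```

Here the root span lasts 900 ms but spends only 40 ms in its own code; the database query accounts for most of the latency.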
Scenario 1: E-commerce Checkout Latency
A large e-commerce platform noticed that checkout requests were slow during peak hours. They deployed Cloud Trace across their microservices: frontend, cart service, payment service, and inventory service. Trace revealed that the payment service was making two redundant calls to a third-party fraud detection API, each taking 500ms sequentially. By parallelizing these calls, they reduced checkout latency by 40%. The team set up a Cloud Monitoring alert on p99 latency of the checkout trace > 2s. They also used Trace's latency distribution to track improvements after deployment. Misconfiguration: Initially, they had set sampling to 0.001 (0.1%) which was too low to catch the issue. After increasing to 0.1 (10%), they quickly identified the bottleneck.
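The fix in this scenario, running two independent calls concurrently instead of one after the other, can be sketched with `asyncio`. The `fraud_check` coroutine is a hypothetical stand-in for the third-party API, and the delay is scaled down from the 500 ms in the scenario so the example runs quickly.

```python
import asyncio
import time


async def fraud_check(delay_s):
    """Stand-in for a call to a third-party API (hypothetical)."""
    await asyncio.sleep(delay_s)
    return "ok"


async def sequential(delay_s):
    """Two calls back to back: total time is roughly the sum of both."""
    start = time.perf_counter()
    await fraud_check(delay_s)
    await fraud_check(delay_s)
    return time.perf_counter() - start


async def parallel(delay_s):
    """Two calls at once: total time is roughly the slower of the two."""
    start = time.perf_counter()
    await asyncio.gather(fraud_check(delay_s), fraud_check(delay_s))
    return time.perf_counter() - start


DELAY_S = 0.05  # scaled down from the 500 ms per call in the scenario
seq = asyncio.run(sequential(DELAY_S))
par = asyncio.run(parallel(DELAY_S))
print(f"sequential {seq * 1000:.0f} ms, parallel {par * 1000:.0f} ms")
```

In a trace, the sequential version shows two child spans laid end to end, while the parallel version shows them stacked over the same time window.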
Scenario 2: Microservice Dependency Mapping
A fintech startup used Cloud Trace to map dependencies between 50+ microservices. They discovered an unexpected cascading call: the user service was calling the notification service on every login, which then called the email service, adding 300ms to login latency. They moved the notification to an asynchronous queue. Trace's span hierarchy made it easy to see the call chain. They also used Trace's ability to export to BigQuery to run custom analytics on latency trends by service version. Common pitfall: They forgot to propagate the trace context when using a message queue; they had to manually inject the trace ID into the message headers.
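The pitfall in this scenario, losing trace context at a message queue, is usually fixed by carrying the context in message attributes. The sketch below uses an in-process `queue.Queue` as a stand-in for a real broker; the attribute name and the `publish`/`consume` helpers are assumptions chosen for illustration.

```python
import queue

TRACE_ATTR = "x-cloud-trace-context"  # attribute name chosen for this sketch


def publish(q, payload, trace_id, span_id, sampled=True):
    """Attach the producer's trace context to the message before enqueueing it."""
    flag = 1 if sampled else 0
    attributes = {TRACE_ATTR: f"{trace_id}/{span_id};o={flag}"}
    q.put({"data": payload, "attributes": attributes})


def consume(q):
    """Read a message and recover the context the consumer's span should join."""
    message = q.get()
    value = message["attributes"].get(TRACE_ATTR)
    if value is None:
        return message["data"], None  # no context: the consumer starts a new trace
    ids, _, options = value.partition(";o=")
    trace_id, _, parent_span_id = ids.partition("/")
    context = {
        "trace_id": trace_id,
        "parent_span_id": parent_span_id,
        "sampled": options == "1",
    }
    return message["data"], context


q = queue.Queue()
publish(q, "send welcome email", "105445aa7843bc8bf206b12000100000", "42")
data, context = consume(q)
print(data, context)
```

Without the attribute, the consumer's spans would appear as a separate trace and the login-to-email call chain would be invisible.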
Scenario 3: Serverless Cold Start Debugging
A company running Cloud Run functions experienced intermittent high latencies. Using Cloud Trace, they saw that some spans had a long initial delay (cold start) before the function handler. By comparing traces with and without cold start, they determined that the cold start added 1-2 seconds. They mitigated by setting a minimum number of instances (min-instances) to keep functions warm. Trace also helped them identify that a dependency (a large library) was causing slow initialization. They optimized by lazy-loading the library. Performance consideration: Trace itself adds a small overhead (about 1-5% additional latency) due to span creation and export; they had to balance granularity with overhead.
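The lazy-loading mitigation mentioned here can be sketched as follows: the heavy dependency is imported on first use rather than at module load, so it no longer contributes to cold-start time for requests that never need it. The standard-library `decimal` module stands in for the large library, and `heavy_library` and `handler` are hypothetical names.

```python
import functools
import importlib


@functools.lru_cache(maxsize=None)
def heavy_library():
    """Import the dependency on first use and cache it for later calls."""
    # `decimal` is only a stand-in for a slow-to-import package.
    return importlib.import_module("decimal")


def handler(amount: str) -> str:
    """Request handler that pays the import cost only when it runs."""
    decimal = heavy_library()
    return str(decimal.Decimal(amount) * 2)


print(handler("1.5"))  # 3.0
```

The trade-off is that the first request to touch the library absorbs the import time, which would show up in its trace as a longer span.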
What the ACE Exam Tests
Objective 4.1: Use Cloud Trace to analyze latency. The exam expects you to know how to enable Trace, instrument applications, view traces, and interpret span data.
Common Scenarios: Identify the slowest span in a trace, understand sampling rates, and know how to correlate traces with logs.
Most Common Wrong Answers
"Cloud Trace automatically traces all requests without any configuration." Reality: You must enable the API and instrument your application. Only App Engine standard environment has auto-instrumentation for some languages.
"Sampling rate of 1 means trace 1% of requests." Reality: Sampling rate is a number between 0 and 1, where 1 means 100% of requests. The default is rate-based (1 req/sec), not probability-based.
"Traces are stored indefinitely." Reality: Traces are retained for 30 days; export them if you need longer storage.
"Cloud Trace can only trace HTTP requests." Reality: It can trace gRPC, custom operations, and any instrumented code.
Specific Numbers and Terms
Default sampling: 1 request per second per instance (rate-based).
Maximum spans per trace: 256.
Retention: 30 days default.
Free tier: 2.5 million spans per month.
Header: X-Cloud-Trace-Context.
API version: v2.
Edge Cases
When using multiple projects, trace IDs may collide; use a common project for trace aggregation.
Traces from different projects can be viewed in a single project by configuring cross-project tracing.
If spans exceed 256, excess spans are silently dropped; you won't see the full trace.
Cloud Trace does not support distributed tracing across on-premises and cloud unless you use a hybrid tracing solution.
How to Eliminate Wrong Answers
If a question mentions "automatic tracing without code changes," look for App Engine or GKE with Istio (which can auto-inject sidecars). Otherwise, instrumentation is required.
If a question asks about reducing trace data volume, think sampling, not retention reduction.
If a question asks about correlating logs and traces, the answer involves the trace ID in log entries.
Cloud Trace captures end-to-end latency using spans and traces, with a default rate-based sampling of 1 request per second per instance.
Maximum spans per trace is 256; excess spans are dropped silently.
Traces are retained for 30 days by default; can be exported to BigQuery or Cloud Storage for longer retention.
Instrumentation is required for most services; use OpenTelemetry SDKs or Cloud Trace libraries.
The trace context is propagated via the 'X-Cloud-Trace-Context' HTTP header.
Cloud Trace integrates with Cloud Monitoring and Cloud Logging for correlation.
Free tier includes 2.5 million spans per month; beyond that, cost is $0.20 per million spans.
To view traces, enable the Cloud Trace API and navigate to Trace > Trace List in the Cloud Console.
These come up on the exam all the time. Here's how to tell them apart.
Cloud Trace
Captures end-to-end request path with causal relationship between spans.
Provides per-span latency with nanosecond precision.
Allows drill-down into individual traces to identify root cause.
Requires instrumentation to propagate trace context.
Retains trace data for up to 30 days.
Cloud Monitoring (Metrics)
Aggregates metrics like request count, latency percentiles, error rate.
Provides time-series data for dashboards and alerts.
Does not show individual request details or causal chains.
Can be collected without code changes for many services (e.g., Compute Engine, GKE).
Metrics can be retained for up to 6 weeks (standard) or longer with custom retention.
Mistake
Cloud Trace automatically traces all Google Cloud services without any setup.
Correct
Only App Engine standard environment and GKE with Istio provide some auto-instrumentation. For most services (Compute Engine, Cloud Run, Cloud Functions), you must manually instrument your application code using OpenTelemetry or Cloud Trace libraries.
Mistake
A sampling rate of 1 means 1% of requests are traced.
Correct
A probability-based sampling rate is a number between 0 and 1, where 1 means 100% of requests and 0.1 means 10%. This applies only when you configure a probability-based sampler; the default is rate-based (1 request per second per instance).
Mistake
Cloud Trace stores traces forever.
Correct
Traces are retained for 30 days. For long-term storage, export traces to BigQuery or Cloud Storage.
Mistake
Cloud Trace only works with HTTP/HTTPS requests.
Correct
Cloud Trace supports any instrumented operation, including gRPC, database queries, custom functions, and message queue operations. The span kind can be CLIENT, SERVER, PRODUCER, or CONSUMER.
Mistake
You can trace all requests in a high-traffic system without incurring cost.
Correct
The free tier covers 2.5 million spans per month. Beyond that, you pay $0.20 per million spans. Tracing 100% of requests in a high-traffic system can become expensive. Use sampling to control costs.
Enable the Cloud Trace API using `gcloud services enable cloudtrace.googleapis.com`. Then instrument your application with OpenTelemetry or Cloud Trace client libraries. For App Engine standard environment, auto-instrumentation is available for Python, Java, Go, and PHP.
Cloud Trace supports rate-based sampling (default: 1 request per second per instance) and probability-based sampling (set a float between 0 and 1). You configure sampling in your application code or via environment variables (e.g., `OTEL_TRACES_SAMPLER`). Sampling controls cost and volume.
Yes, you can configure cross-project tracing by setting the `goog-client-trace-project` header or using a service account that has permissions to write traces to a central project. The central project must have the Cloud Trace API enabled.
Include the trace ID in your log entries using the `logging.googleapis.com/trace` field. The trace ID is available from the span context. In Cloud Logging, you can then click the trace ID to open the trace in Cloud Trace.
Cloud Trace provides detailed, per-request latency breakdowns (spans) and causal relationships. Cloud Monitoring provides aggregated metrics (e.g., p99 latency, request count) and alerting. They complement each other: Monitoring alerts on high latency, and Trace helps you find the cause.
Traces are stored for 30 days. For longer retention, export traces to BigQuery or Cloud Storage.
Only the first 256 spans are retained; additional spans are silently dropped. This is a hard limit. If you need more spans, consider breaking the trace into multiple traces or using a higher-level abstraction.