ACE · Chapter 99 of 101 · Objective 4.1

Cloud Trace for Latency Analysis

Cloud Trace is a distributed tracing service that collects latency data from applications and provides near real-time performance insights. For the ACE exam, this topic appears in about 5-8% of questions, often integrated with Cloud Monitoring and Cloud Logging. This chapter covers how Trace works, how to configure it, how to analyze trace data, and how to troubleshoot latency issues using the Trace console and API.

Updated Jul 20, 2026

Reviewed by Johnson Ajibi · Senior Network & Security Engineer · MSc IT Security

Package Delivery with GPS Tracking Logs

Picture yourself as the owner of a package delivery company, determined to understand why certain deliveries are slow. Each truck has a GPS tracker that logs every stop, detour, and delay. When a package is delayed, you can replay its entire route: the time it left the warehouse, the duration at each sorting facility, the minutes spent in traffic, and the exact moment it was handed to the customer. This is Cloud Trace. It collects latency data (spans) from each service (truck) as a request travels through your distributed system (delivery route). Each span records a specific operation (e.g., database query, API call) with start and end timestamps, labels, and metadata. The trace is the complete journey of a single request (package). By analyzing traces, you can pinpoint which service is the bottleneck (which sorting facility is slow) and optimize it. Without Trace, you'd only see the total delivery time and guess which leg caused the delay.

How It Actually Works

What is Cloud Trace and Why It Exists

Cloud Trace is a managed distributed tracing service that captures end-to-end latency data for requests as they travel through a distributed system. It is part of Google Cloud's operations suite (formerly Stackdriver). The primary goal is to help developers and SREs identify performance bottlenecks, debug latency issues, and optimize application performance. Unlike simple logging or metrics, Trace provides a causal view of a request's path across services, showing exactly where time is spent.

How Cloud Trace Works Internally

Cloud Trace uses the concept of spans and traces. A trace represents the entire journey of a single request (e.g., an HTTP request from a user to a web server that calls a backend service and a database). A span is a single unit of work within a trace, such as a database query, a function call, or an HTTP request to another service. Each span has:

A span ID (unique within the trace)

A trace ID (shared by all spans in the trace)

A parent span ID (to establish hierarchy)

Start and end timestamps (nanosecond precision)

Labels (key-value pairs for metadata like HTTP method, status code, etc.)

Span kind (e.g., CLIENT, SERVER, PRODUCER, CONSUMER)
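The fields listed above can be modeled in a few lines. This is only an illustrative sketch: the `Span` dataclass and the `children_of` helper are hypothetical names, not part of any Cloud Trace client library, and the sample span values are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    """Hypothetical in-memory model of the span fields listed above."""
    trace_id: str                   # shared by every span in the trace
    span_id: str                    # unique within the trace
    parent_span_id: Optional[str]   # None marks the root span
    name: str
    start_ns: int                   # start timestamp in nanoseconds
    end_ns: int                     # end timestamp in nanoseconds
    kind: str = "SERVER"
    labels: Dict[str, str] = field(default_factory=dict)

    @property
    def duration_ms(self) -> float:
        return (self.end_ns - self.start_ns) / 1_000_000


def children_of(spans: List[Span], parent_id: Optional[str]) -> List[Span]:
    """Return the direct children of a span, ordered by start time."""
    kids = [s for s in spans if s.parent_span_id == parent_id]
    return sorted(kids, key=lambda s: s.start_ns)


# Made-up example trace: one root span with two child calls.
spans = [
    Span("t1", "a", None, "GET /checkout", 0, 120_000_000),
    Span("t1", "c", "a", "db-query", 45_000_000, 110_000_000, kind="CLIENT"),
    Span("t1", "b", "a", "cart-service", 5_000_000, 40_000_000, kind="CLIENT"),
]
root = children_of(spans, None)[0]
print(root.name, root.duration_ms)
print([s.name for s in children_of(spans, root.span_id)])
```

The parent span ID is what turns a flat list of spans into the hierarchy the console draws.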

The tracing mechanism relies on context propagation. When a request enters the system (e.g., at a load balancer or application frontend), a trace ID is generated. This ID is passed along to every downstream service via HTTP headers (e.g., X-Cloud-Trace-Context). Each service creates its own spans, linking them to the parent via the parent span ID. The spans are then sent asynchronously to the Cloud Trace backend, where they are aggregated and stored.
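To make the propagation step concrete, the sketch below parses and rebuilds an X-Cloud-Trace-Context value, assuming the commonly documented `TRACE_ID/SPAN_ID;o=OPTIONS` layout (32 hex characters, a decimal span ID, and `o=1` meaning sampled). The function and class names are hypothetical; in practice a tracing library handles this for you.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: TRACE_ID/SPAN_ID;o=OPTIONS (span ID and options optional).
_HEADER_RE = re.compile(
    r"^(?P<trace_id>[0-9a-fA-F]{32})(?:/(?P<span_id>\d+))?(?:;o=(?P<options>\d+))?$"
)


class TraceContext(NamedTuple):
    trace_id: str
    span_id: Optional[int]
    sampled: bool


def parse_trace_header(value: str) -> Optional[TraceContext]:
    """Parse a header value; return None if it does not match the layout."""
    match = _HEADER_RE.match(value.strip())
    if match is None:
        return None
    span = match.group("span_id")
    options = match.group("options")
    return TraceContext(
        trace_id=match.group("trace_id").lower(),
        span_id=int(span) if span is not None else None,
        sampled=options is not None and (int(options) & 1) == 1,
    )


def build_trace_header(ctx: TraceContext, current_span_id: int) -> str:
    """Header for an outgoing call: same trace ID, the caller's span as parent."""
    return f"{ctx.trace_id}/{current_span_id};o={1 if ctx.sampled else 0}"


incoming = parse_trace_header("105445aa7843bc8bf206b12000100000/1;o=1")
outgoing = build_trace_header(incoming, current_span_id=42)
print(outgoing)
```

The trace ID stays constant across every hop; only the span ID changes as each service starts its own span.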

Key Components, Values, Defaults, and Timers

Trace Sampling: By default, Cloud Trace uses rate-based sampling of 1 request per second per instance, which caps trace volume and cost. Alternatively, you can configure probability-based sampling with a ratio between 0 and 1, where 0.1 traces 10% of requests and 1 traces all of them.
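A rate-based sampler can be pictured as a counter that resets every second. The sketch below is a simplified illustration, not Cloud Trace's actual implementation; `RateLimitedSampler` is a hypothetical class, and a fake clock is injected so the demo is deterministic.

```python
import time
from typing import Callable


class RateLimitedSampler:
    """Allow at most `max_per_second` sampled requests per one-second window."""

    def __init__(self, max_per_second: int = 1,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_per_second = max_per_second
        self._clock = clock
        self._window_start = clock()
        self._count = 0

    def should_sample(self) -> bool:
        now = self._clock()
        # Start a new window once a full second has elapsed.
        if now - self._window_start >= 1.0:
            self._window_start = now
            self._count = 0
        if self._count < self.max_per_second:
            self._count += 1
            return True
        return False


# Deterministic demo: six requests arriving at the given times (seconds).
fake_time = [0.0]
sampler = RateLimitedSampler(max_per_second=1, clock=lambda: fake_time[0])
decisions = []
for t in (0.0, 0.2, 0.9, 1.0, 1.5, 2.1):
    fake_time[0] = t
    decisions.append(sampler.should_sample())
print(decisions)
```

Only the first request in each one-second window is traced, which is why a rate-based default can miss an occasional slow request on a busy instance.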

Span Limits: A single trace can have up to 256 spans. If more spans are generated, additional spans are dropped.

Trace Retention: Traces are retained for 30 days by default. You can adjust retention using retention policies (minimum 7 days, maximum 30 days).

Export: Traces can be exported to Cloud Storage or BigQuery for long-term analysis.

API: Cloud Trace has both a gRPC API and a REST API. The v2 API is the current version.

Integration: Cloud Trace integrates with Cloud Monitoring (for latency metrics like trace/span/response_latencies) and Cloud Logging (to correlate logs with traces using the trace ID).

Configuration and Verification Commands

To enable Cloud Trace for a project, you need to enable the Cloud Trace API:

gcloud services enable cloudtrace.googleapis.com

To view traces in the console: Navigate to Trace > Trace List in the Cloud Console.

To query traces using the API:

gcloud alpha trace traces list --project=my-project --limit=10

To create a custom span in your application (using Python OpenTelemetry):

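from opentelemetry import trace
# Obtain a tracer named after the current module.
tracer = trace.get_tracer(__name__)
# The span starts on entry to the block and ends when the block exits.
with tracer.start_as_current_span("my_custom_operation") as span:
    # Attach a key-value attribute to the span.
    span.set_attribute("my_key", "my_value")
    # do work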

To verify that Cloud Trace is collecting data, check the Trace List for any traces. If no traces appear, ensure the application is instrumented and the API is enabled.

How Cloud Trace Interacts with Related Technologies

Cloud Monitoring: Trace latency data is automatically exported as metrics. You can create alerting policies based on latency thresholds (e.g., p99 latency > 500ms).

Cloud Logging: Log entries can include the trace ID (logging.googleapis.com/trace). This allows you to jump from a log entry to the corresponding trace.

Cloud Profiler: While Trace shows latency distribution across requests, Profiler shows CPU and memory consumption per function.

Error Reporting: Errors logged with the trace ID can be linked to the trace where the error occurred.

App Engine, GKE, Cloud Run: These services can auto-instrument traces with minimal configuration (e.g., App Engine standard environment automatically sends trace data).
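For the Cloud Logging item above, the sketch below shows one way to build a structured JSON log line that carries the trace ID. It assumes the log is written as JSON to standard output on a platform that parses structured logs; `structured_log` is a hypothetical helper, and the project and trace IDs are placeholders.

```python
import json
from typing import Optional


def structured_log(message: str, project_id: str, trace_id: str,
                   span_id: Optional[str] = None, severity: str = "INFO") -> str:
    """Return a JSON log line whose trace field points at a Cloud Trace trace."""
    entry = {
        "severity": severity,
        "message": message,
        # The trace is referenced by its full resource name.
        "logging.googleapis.com/trace": f"projects/{project_id}/traces/{trace_id}",
    }
    if span_id is not None:
        entry["logging.googleapis.com/spanId"] = span_id
    return json.dumps(entry)


line = structured_log(
    "payment authorized",
    project_id="my-project",
    trace_id="105445aa7843bc8bf206b12000100000",
)
print(line)
```

With the trace field populated, a log entry and the trace it belongs to can be opened from each other in the console.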

Advanced Topics: Trace Filtering and Analysis

The Trace console provides a Trace List view where you can filter traces by:

Time range (up to 30 days)

Latency (e.g., traces slower than 1 second)

HTTP method

URL

Status code

Custom labels (e.g., user_id)

You can also view latency distributions to see percentiles (p50, p95, p99) of request latencies over time. The Trace Details page shows a Gantt chart of spans, allowing you to drill into each span's latency and labels.
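To see what the percentiles mean, the sketch below computes them from a small list of request latencies using the nearest-rank method. The `percentile` helper and the sample values are illustrative assumptions; the console may use a different estimation method.

```python
import math
from typing import List


def percentile(latencies_ms: List[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up sample: nine fast requests and one very slow outlier.
latencies = [120, 95, 110, 105, 2300, 130, 98, 101, 115, 125]
for p in (50, 95, 99):
    print(f"p{p} = {percentile(latencies, p)} ms")
```

A single slow request barely moves p50 but dominates p95 and p99, which is why tail percentiles are the usual alerting signal.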

Cost Considerations

Cloud Trace charges based on the volume of ingested spans. The first 250,000 spans per month per project are free. After that, it's $0.10 per 250,000 spans. Sampling is crucial to control costs.
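A quick back-of-the-envelope calculation shows why sampling matters. The sketch below uses only the figures quoted in this section and assumes charges are prorated linearly; `estimate_monthly_cost` is a hypothetical helper, and current pricing should be confirmed against the official pricing page.

```python
FREE_SPANS_PER_MONTH = 250_000   # free tier figure quoted in this section
BLOCK_SIZE = 250_000             # billing block size quoted in this section
PRICE_PER_BLOCK_USD = 0.10       # price per block quoted in this section


def estimate_monthly_cost(requests_per_month: int, spans_per_request: int,
                          sampling_ratio: float) -> float:
    """Estimate the monthly charge in USD, assuming linear proration."""
    ingested = requests_per_month * spans_per_request * sampling_ratio
    billable = max(0.0, ingested - FREE_SPANS_PER_MONTH)
    return round(billable / BLOCK_SIZE * PRICE_PER_BLOCK_USD, 2)


# Made-up workload: 100 million requests per month, 10 spans per request.
full = estimate_monthly_cost(100_000_000, 10, sampling_ratio=1.0)
sampled = estimate_monthly_cost(100_000_000, 10, sampling_ratio=0.1)
print(full, sampled)
```

Under these assumptions, dropping from 100% to 10% sampling cuts the estimate by roughly a factor of ten.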

Walk-Through

Enable Cloud Trace API

First, ensure the Cloud Trace API is enabled for your project. Use `gcloud services enable cloudtrace.googleapis.com`. Without this, trace data cannot be sent to Google Cloud. Verify with `gcloud services list --enabled | grep cloudtrace`.

Instrument Your Application

Add tracing instrumentation to your application code using OpenTelemetry SDKs or Google's Cloud Trace libraries. For example, in Python, install `opentelemetry-api`, `opentelemetry-sdk`, and `opentelemetry-exporter-gcp-trace`. Configure the exporter to send spans to the Cloud Trace API. With the matching OpenTelemetry instrumentation packages installed, incoming and outgoing HTTP requests and gRPC calls are captured automatically; custom operations are captured by creating spans in your code.

Configure Sampling Rate

Set a sampling rate to control how many requests are traced. Use environment variables or code configuration. For example, in OpenTelemetry, set `OTEL_TRACES_SAMPLER=parentbased_traceidratio` and `OTEL_TRACES_SAMPLER_ARG=0.1` for 10% sampling. The default rate-based sampling (1 req/sec) may miss occasional slowdowns; adjust based on traffic volume and cost tolerance.
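The idea behind a parent-based, trace-ID-ratio sampler can be sketched in plain Python. This is a simplified illustration in the spirit of the OpenTelemetry sampler, not its actual source; the function names are hypothetical.

```python
from typing import Optional

_MASK_64 = (1 << 64) - 1


def ratio_sample(trace_id_hex: str, ratio: float) -> bool:
    """Deterministic decision: sample when the low 64 bits fall below ratio * 2**64."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    low_bits = int(trace_id_hex, 16) & _MASK_64
    return low_bits < round(ratio * (1 << 64))


def parent_based_sample(trace_id_hex: str, ratio: float,
                        parent_sampled: Optional[bool]) -> bool:
    """Follow the caller's decision when one exists; otherwise apply the ratio."""
    if parent_sampled is not None:
        return parent_sampled
    return ratio_sample(trace_id_hex, ratio)


low_id = "0" * 31 + "1"   # low 64 bits near the bottom of the range
high_id = "f" * 32        # low 64 bits at the top of the range
print(ratio_sample(low_id, 0.1))
print(ratio_sample(high_id, 0.1))
print(parent_based_sample(high_id, 0.1, parent_sampled=True))
```

Because the decision depends only on the trace ID, every service that sees the same trace reaches the same answer, and the parent-based rule keeps a trace from being sampled in one service and dropped in the next.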

Deploy and Generate Traffic

Deploy the instrumented application to Google Cloud (e.g., Compute Engine, GKE, Cloud Run). Generate real or synthetic traffic to trigger requests. As requests flow, spans are created and exported asynchronously to Cloud Trace. The exporter batches spans and sends them every few seconds (default 5 seconds) or when the buffer is full.

View and Analyze Traces

In the Cloud Console, navigate to Trace > Trace List. Here you see a list of traces with their latency, method, and URL. Click on a trace to see a waterfall view of spans. Identify slow spans by their duration. Use filters to isolate problematic requests (e.g., status=500, latency>2s). Create a latency distribution chart to monitor p50/p95/p99 over time.
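Reading a waterfall by eye amounts to finding the span with the most time not explained by its children. The sketch below does that for a made-up trace; the `Span` tuple and `self_times` helper are hypothetical, and the calculation assumes a span's children do not overlap in time.

```python
from typing import Dict, List, NamedTuple, Optional


class Span(NamedTuple):
    span_id: str
    parent_id: Optional[str]
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def self_times(spans: List[Span]) -> Dict[str, int]:
    """Map span name to its duration minus the summed duration of its direct children."""
    child_total: Dict[str, int] = {}
    for s in spans:
        if s.parent_id is not None:
            child_total[s.parent_id] = child_total.get(s.parent_id, 0) + s.duration_ms
    return {s.name: s.duration_ms - child_total.get(s.span_id, 0) for s in spans}


# Made-up checkout trace with two sequential calls to a fraud API.
trace_spans = [
    Span("a", None, "GET /checkout", 0, 1200),
    Span("b", "a", "cart-service", 10, 110),
    Span("c", "a", "payment-service", 120, 1150),
    Span("d", "c", "fraud-api call 1", 130, 630),
    Span("e", "c", "fraud-api call 2", 640, 1100),
]
times = self_times(trace_spans)
slowest = max(times, key=times.get)
print(slowest, times[slowest])
```

The root span has the longest total duration, but almost all of it is spent waiting on children, so self time is the better pointer to the real bottleneck.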

What This Looks Like on the Job

Scenario 1: E-commerce Checkout Latency

A large e-commerce platform noticed that checkout requests were slow during peak hours. They deployed Cloud Trace across their microservices: frontend, cart service, payment service, and inventory service. Trace revealed that the payment service was making two redundant calls to a third-party fraud detection API, each taking 500ms sequentially. By parallelizing these calls, they reduced checkout latency by 40%. The team set up a Cloud Monitoring alert on p99 latency of the checkout trace > 2s. They also used Trace's latency distribution to track improvements after deployment. Misconfiguration: Initially, they had set sampling to 0.001 (0.1%) which was too low to catch the issue. After increasing to 0.1 (10%), they quickly identified the bottleneck.

Scenario 2: Microservice Dependency Mapping

A fintech startup used Cloud Trace to map dependencies between 50+ microservices. They discovered an unexpected cascading call: the user service was calling the notification service on every login, which then called the email service, adding 300ms to login latency. They moved the notification to an asynchronous queue. Trace's span hierarchy made it easy to see the call chain. They also used Trace's ability to export to BigQuery to run custom analytics on latency trends by service version. Common pitfall: They forgot to propagate the trace context when using a message queue; they had to manually inject the trace ID into the message headers.

Scenario 3: Serverless Cold Start Debugging

A company running Cloud Run functions experienced intermittent high latencies. Using Cloud Trace, they saw that some spans had a long initial delay (cold start) before the function handler. By comparing traces with and without cold start, they determined that the cold start added 1-2 seconds. They mitigated by setting a minimum number of instances (min-instances) to keep functions warm. Trace also helped them identify that a dependency (a large library) was causing slow initialization. They optimized by lazy-loading the library. Performance consideration: Trace itself adds a small overhead (about 1-5% additional latency) due to span creation and export; they had to balance granularity with overhead.

How ACE Actually Tests This

What the ACE Exam Tests

Objective 4.1: Use Cloud Trace to analyze latency. The exam expects you to know how to enable Trace, instrument applications, view traces, and interpret span data.

Common Scenarios: Identify the slowest span in a trace, understand sampling rates, and know how to correlate traces with logs.

Most Common Wrong Answers

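"Cloud Trace automatically traces all requests without any configuration." Reality: The API must be enabled, and most workloads need instrumentation. Some managed platforms, such as App Engine standard environment, send trace data for incoming requests automatically, but custom spans and calls to downstream services still require instrumentation.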

"Sampling rate of 1 means trace 1% of requests." Reality: Sampling rate is a number between 0 and 1, where 1 means 100% of requests. The default is rate-based (1 req/sec), not probability-based.

"Traces are stored indefinitely." Reality: Default retention is 30 days, configurable between 7 and 30 days.

"Cloud Trace can only trace HTTP requests." Reality: It can trace gRPC, custom operations, and any instrumented code.

Specific Numbers and Terms

Default sampling: 1 request per second per instance (rate-based).

Maximum spans per trace: 256.

Retention: 30 days default.

Free tier: 250,000 spans per month.

Header: X-Cloud-Trace-Context.

API version: v2.

Edge Cases

When using multiple projects, trace IDs may collide; use a common project for trace aggregation.

Traces from different projects can be viewed in a single project by configuring cross-project tracing.

If spans exceed 256, excess spans are silently dropped; you won't see the full trace.

Cloud Trace does not support distributed tracing across on-premises and cloud unless you use a hybrid tracing solution.

How to Eliminate Wrong Answers

If a question mentions "automatic tracing without code changes," look for App Engine or GKE with Istio (which can auto-inject sidecars). Otherwise, instrumentation is required.

If a question asks about reducing trace data volume, think sampling, not retention reduction.

If a question asks about correlating logs and traces, the answer involves the trace ID in log entries.

Key Takeaways

Cloud Trace captures end-to-end latency using spans and traces, with a default rate-based sampling of 1 request per second per instance.

Maximum spans per trace is 256; excess spans are dropped silently.

Traces are retained for 30 days by default; can be exported to BigQuery or Cloud Storage for longer retention.

Instrumentation is required for most services; use OpenTelemetry SDKs or Cloud Trace libraries.

The trace context is propagated via the 'X-Cloud-Trace-Context' HTTP header.

Cloud Trace integrates with Cloud Monitoring and Cloud Logging for correlation.

Free tier includes 250,000 spans per month; beyond that, cost is $0.10 per 250,000 spans.

To view traces, enable the Cloud Trace API and navigate to Trace > Trace List in the Cloud Console.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Cloud Trace

Captures end-to-end request path with causal relationship between spans.

Provides per-span latency with nanosecond precision.

Allows drill-down into individual traces to identify root cause.

Requires instrumentation to propagate trace context.

Retains trace data for up to 30 days.

Cloud Monitoring (Metrics)

Aggregates metrics like request count, latency percentiles, error rate.

Provides time-series data for dashboards and alerts.

Does not show individual request details or causal chains.

Can be collected without code changes for many services (e.g., Compute Engine, GKE).

Metrics can be retained for up to 6 weeks (standard) or longer with custom retention.

Watch Out for These

Mistake

Cloud Trace automatically traces all Google Cloud services without any setup.

Correct

Automatic trace data is limited to certain managed environments, such as App Engine standard environment and GKE with Istio, and usually covers only incoming requests. For workloads on Compute Engine, and for custom spans on any platform, you must instrument your application code using OpenTelemetry or Cloud Trace libraries.

Mistake

A sampling rate of 1 means 1% of requests are traced.

Correct

A probability-based sampling rate is a fraction between 0 and 1, not a percentage: 0.1 means 10% of requests and 1 means 100%. Note also that the default is rate-based (1 request per second per instance), not probability-based.

Mistake

Cloud Trace stores traces forever.

Correct

Traces are retained for 30 days by default. You can configure retention between 7 and 30 days. For long-term storage, export traces to BigQuery or Cloud Storage.

Mistake

Cloud Trace only works with HTTP/HTTPS requests.

Correct

Cloud Trace supports any instrumented operation, including gRPC, database queries, custom functions, and message queue operations. The span kind can be CLIENT, SERVER, PRODUCER, or CONSUMER.

Mistake

You can trace all requests in a high-traffic system without incurring cost.

Correct

The free tier covers 250,000 spans per month. Beyond that, you pay $0.10 per 250,000 spans. Tracing 100% of requests in a high-traffic system can become expensive. Use sampling to control costs.

Frequently Asked Questions

How do I enable Cloud Trace for my project?

Enable the Cloud Trace API using `gcloud services enable cloudtrace.googleapis.com`. Then instrument your application with OpenTelemetry or Cloud Trace client libraries. For App Engine standard environment, auto-instrumentation is available for Python, Java, Go, and PHP.

How does Cloud Trace sampling work?

Cloud Trace supports rate-based sampling (default: 1 request per second per instance) and probability-based sampling (set a float between 0 and 1). You configure sampling in your application code or via environment variables (e.g., `OTEL_TRACES_SAMPLER`). Sampling controls cost and volume.

Can I view traces from multiple projects in one place?

Yes. Have each application write its spans to a central project by setting that project's ID in the exporter configuration and running under a service account that has permission to write traces there (for example, the Cloud Trace Agent role). The central project must have the Cloud Trace API enabled.

How do I correlate logs with traces?

Include the trace ID in your log entries using the `logging.googleapis.com/trace` field. The trace ID is available from the span context. In Cloud Logging, you can then click the trace ID to open the trace in Cloud Trace.

What is the difference between Cloud Trace and Cloud Monitoring?

Cloud Trace provides detailed, per-request latency breakdowns (spans) and causal relationships. Cloud Monitoring provides aggregated metrics (e.g., p99 latency, request count) and alerting. They complement each other: Monitoring alerts on high latency, and Trace helps you find the cause.

How long are traces stored?

By default, traces are stored for 30 days. You can change the retention period to between 7 and 30 days using the Cloud Console or API. For longer retention, export traces to BigQuery or Cloud Storage.

What happens if a trace has more than 256 spans?

Only the first 256 spans are retained; additional spans are silently dropped. This is a hard limit. If you need more spans, consider breaking the trace into multiple traces or using a higher-level abstraction.
