ACE · Chapter 89 of 101 · Objective 2.3

Cloud Spanner Replication and TrueTime

This chapter covers Cloud Spanner replication and the TrueTime API that makes it possible. As a Google Associate Cloud Engineer, you must understand how Spanner combines horizontal scalability with strong consistency across global regions, a capability that exam questions frequently test. Approximately 10-15% of exam questions touch on Spanner's architecture, replication, and TrueTime, making this a high-yield topic for the ACE exam. By the end of this chapter, you will understand the internal mechanisms of TrueTime, how Spanner uses it for consistent reads and writes, and the specific configurations and trade-offs you must know for the exam.

Updated May 31, 2026

Global Clock Network for Transaction Ordering

Imagine a global bank with branches in New York, London, and Tokyo, all processing wire transfers for the same customer accounts. To prevent double-spending, the bank installs a network of atomic clocks in each branch, synchronized by GPS to within 7 milliseconds of each other. Every transaction is timestamped with the local clock reading, and the bank's central system uses these timestamps to order transactions globally. Because two timestamps less than 7 ms apart cannot be ordered reliably, a branch that has just stamped a transaction pauses until the 7 ms uncertainty window has fully elapsed before confirming it. This wait ensures that even if one branch's clock is slightly ahead, the system never incorrectly reverses the order. The key is that the clocks are not perfectly synchronized (they have a bounded uncertainty), but the system uses that bound to guarantee consistency. This is how Google's TrueTime works for Spanner: servers synchronize against GPS receivers and atomic clocks in their data center, each commit timestamp is followed by a commit wait that covers the clock uncertainty, and transactions are globally ordered by these timestamps. The system never 'guesses' which transaction happened first; it waits until the uncertainty intervals can no longer overlap.

How It Actually Works

What is Cloud Spanner and Why Does It Exist?

Cloud Spanner is Google's fully managed, horizontally scalable, globally distributed relational database service. It was among the first databases to combine the benefits of relational database structure (ACID transactions, SQL) with non-relational horizontal scalability, all while providing strong consistency across regions. Traditional relational databases can scale vertically (by adding more power to a single machine) but eventually hit limits. Many NoSQL databases scale horizontally but give up ACID transactions or strong consistency. Spanner solves this by combining synchronous replication, Paxos-based consensus, and a global time service called TrueTime.

For the ACE exam, you need to know that Spanner is designed for mission-critical applications that require both strong consistency and global scale, such as financial trading platforms, global supply chain management, and large-scale advertising systems.

How TrueTime Works Internally

TrueTime is the key innovation that makes Spanner's global consistency possible. It is a globally distributed time service that reports the current time as an interval with bounded uncertainty rather than as a single value. Google data centers run time master machines equipped with GPS receivers or atomic clocks, and Spanner servers synchronize their local clocks against them. GPS provides absolute time with high accuracy, while atomic clocks keep time accurately if GPS becomes unavailable. The uncertainty, known as the "TrueTime error bound" (ε), is typically 1-7 milliseconds. The system guarantees that the actual time of an event lies within the interval [t_earliest, t_latest] returned by the TrueTime API.

The TrueTime API exposes two core functions:
- TT.now(): Returns an interval [earliest, latest] representing the current time uncertainty.
- TT.after(t) and TT.before(t): Return true if the time t is definitely in the past or definitely in the future, respectively.

When a write transaction occurs, Spanner assigns a commit timestamp s that is guaranteed to be greater than all previously assigned timestamps. To ensure this, Spanner uses the following mechanism:
1. The leader replica for a Paxos group calls TT.now() to get an interval [e, l].
2. It selects a commit timestamp s that is no earlier than l (the latest possible current time), so the true current time cannot already be past s.
3. Before making the commit visible, the leader waits until TT.after(s) returns true, meaning s is now definitely in the past. This wait is called the "commit wait" and is on the order of the uncertainty bound ε (a few milliseconds).
4. Once the wait completes, any transaction that starts afterward is guaranteed a timestamp later than s, which yields a consistent global order.
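The commit wait can be sketched with a toy clock. This is a minimal simulation, not Spanner's implementation: `SimulatedTrueTime`, `TTInterval`, and `commit_with_wait` are hypothetical names, the error bound is fixed, time is advanced by hand instead of sleeping, and the timestamp is assumed to be chosen exactly at the upper bound of the interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float
    latest: float


class SimulatedTrueTime:
    """Toy TrueTime: a manually advanced clock with a fixed error bound (ms)."""

    def __init__(self, epsilon_ms: float, start_ms: float = 0.0):
        self.epsilon_ms = epsilon_ms
        self._true_now_ms = start_ms  # hidden "absolute" time, unknown to callers

    def advance(self, delta_ms: float) -> None:
        self._true_now_ms += delta_ms

    def now(self) -> TTInterval:
        # The true time is guaranteed to lie inside this interval.
        return TTInterval(self._true_now_ms - self.epsilon_ms,
                          self._true_now_ms + self.epsilon_ms)

    def after(self, t_ms: float) -> bool:
        """True only when t is definitely in the past."""
        return self.now().earliest > t_ms

    def before(self, t_ms: float) -> bool:
        """True only when t is definitely in the future."""
        return self.now().latest < t_ms


def commit_with_wait(tt: SimulatedTrueTime, tick_ms: float = 1.0):
    """Pick s at the upper bound of TT.now(), then wait until TT.after(s).

    Returns (s, waited_ms). Advancing the toy clock stands in for sleeping.
    """
    s = tt.now().latest
    waited_ms = 0.0
    while not tt.after(s):
        tt.advance(tick_ms)
        waited_ms += tick_ms
    return s, waited_ms


tt = SimulatedTrueTime(epsilon_ms=4.0, start_ms=100.0)
s, waited_ms = commit_with_wait(tt)
print(s, waited_ms)
```

In this toy model the wait comes out at roughly twice the error bound, because the timestamp sits at the upper edge of the interval and the lower edge then has to move past it. Real wait times depend on Spanner internals that this sketch does not model.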

Key Components and Defaults

Paxos Groups: Spanner partitions data into tablets, each replicated across multiple zones using the Paxos consensus algorithm. Each Paxos group has a leader and multiple followers. Writes are committed only after a majority of replicas acknowledge.

Replication Factor: By default, Spanner uses synchronous replication with a minimum of 3 replicas (one per zone in a multi-zone instance). You can configure up to 7 replicas for higher read throughput.

Node Count: Each Spanner node provides up to 2 TB of storage and 10,000 queries per second (QPS) of read throughput (or 2,000 write QPS). The exam may test that you need to scale nodes based on performance needs.

Regions and Zones: Spanner instances can span multiple regions. Each region can have one or more zones. The number of zones must be odd for Paxos majority (e.g., 3, 5, 7).

Read Types: Spanner offers strong reads (read at a timestamp that is guaranteed to be up-to-date) and stale reads (read at a timestamp in the past, typically up to 15 seconds old, to reduce latency).

Commit Timestamp: For writes, the commit timestamp is assigned by the leader and must be in the future relative to TrueTime. The commit wait ensures that all replicas see the timestamp as consistent.

Configuration and Verification Commands

To create a Spanner instance with replication across three regions:

gcloud spanner instances create test-instance \
    --config=nam3 \
    --description="Test Instance" \
    --nodes=2

The --config flag specifies the regional or multi-regional configuration. For example, nam3 is a multi-region configuration covering US regions.

To view instance configuration details:

gcloud spanner instances describe test-instance

To list available configurations:

gcloud spanner instance-configs list

To create a database in the instance:

gcloud spanner databases create mydb --instance=test-instance

To run a query with strong consistency:

SELECT * FROM MyTable WHERE Key = 'value';

By default, reads are strong. To use stale reads, you set a timestamp bound:

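// Stale reads are typically set in the client libraries through a timestamp bound.
// Java example (client is an existing DatabaseClient, statement an existing Statement):
import com.google.cloud.spanner.ResultSet;
import com.google.cloud.spanner.TimestampBound;
import java.util.concurrent.TimeUnit;

// Read the data exactly as it was 10 seconds ago.
TimestampBound bound = TimestampBound.ofExactStaleness(10, TimeUnit.SECONDS);
try (ResultSet rs = client.singleUse(bound).executeQuery(statement)) {
    while (rs.next()) { /* process each row */ }
}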

How TrueTime Interacts with Related Technologies

TrueTime is not just for Spanner; it is part of Google's global infrastructure. However, for the ACE exam, focus on Spanner's use of TrueTime for:
- Global Snapshot Isolation: Spanner provides externally consistent reads and writes. A read at a timestamp t sees all transactions that committed before t and none that committed after. TrueTime ensures that the timestamp t is globally meaningful.
- Paxos Leader Lease: The Paxos leader uses TrueTime to manage leases. The leader holds a lease that expires at a certain time; it uses TrueTime to ensure it does not act as leader after the lease expires, preventing split-brain scenarios.
- Distributed Transactions: For transactions spanning multiple Paxos groups (e.g., across different tables or partitions), Spanner uses two-phase commit (2PC) with TrueTime timestamps to ensure atomicity and consistency across groups.
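The snapshot-read rule (a read at timestamp t sees what committed up to t and nothing later) can be illustrated with a small multi-version cell. This is a sketch under assumptions: `VersionedCell` is a hypothetical name, timestamps are plain numbers, and a version committed exactly at t is treated as visible.

```python
import bisect


class VersionedCell:
    """Keeps every committed (timestamp, value) pair for one key, in timestamp order."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def commit(self, timestamp, value):
        # Commit timestamps are assumed to be strictly increasing,
        # which is the property the commit wait is meant to provide.
        if self._timestamps and timestamp <= self._timestamps[-1]:
            raise ValueError("commit timestamps must increase")
        self._timestamps.append(timestamp)
        self._values.append(value)

    def read_at(self, read_timestamp):
        """Return the newest value committed at or before read_timestamp, else None."""
        idx = bisect.bisect_right(self._timestamps, read_timestamp)
        return self._values[idx - 1] if idx else None


cell = VersionedCell()
cell.commit(10, "balance=100")
cell.commit(20, "balance=70")
print(cell.read_at(15))  # the version committed at 10
print(cell.read_at(25))  # the version committed at 20
```

Because every reader that picks the same timestamp gets the same answer, any replica holding the versions up to t could serve the read, which is the idea behind stale reads.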

Common Exam Traps

Misunderstanding Commit Wait: Candidates often think the commit wait is to wait for replicas to acknowledge. Actually, it is to wait for TrueTime uncertainty to pass so that the timestamp is globally consistent.

Confusing Strong vs. Stale Reads: Strong reads always return the latest data but may incur higher latency. Stale reads can return data up to 15 seconds old (default) but are faster. The exam tests when to use each.

Scaling Nodes: Adding nodes increases throughput and storage proportionally. However, replication factor is independent of node count. Adding nodes does not change the number of replicas per shard.

Multi-Region Configurations: The exam may ask which multi-region configuration to use for global consistency. The answer is to use a multi-region instance (e.g., nam3, eur3, asia1) rather than multiple single-region instances, because Spanner handles replication automatically within the multi-region config.

Walk-Through

1. Client sends write request

A client application initiates a write transaction by sending a SQL DML statement (e.g., INSERT, UPDATE) to a Spanner database. The request is routed to the nearest zone where the data resides, based on the primary key of the affected rows. Spanner's front-end servers parse the request and identify the Paxos group responsible for the data. The request is forwarded to the leader replica of that Paxos group. If the leader is in a different region, the request incurs cross-region latency. The client does not need to know which replica is the leader; Spanner handles this transparently.

2. Leader assigns commit timestamp

The Paxos leader calls `TT.now()` to obtain a TrueTime interval `[e, l]`. It then selects a commit timestamp `s` that is greater than `l` (the latest possible current time). This ensures that `s` is in the future relative to the TrueTime uncertainty. The leader logs the transaction and its timestamp `s` in its write-ahead log (WAL). The timestamp `s` is used to order this transaction globally. The leader also prepares to send the write to all follower replicas as part of the Paxos protocol.

3. Paxos consensus phase

The leader sends a Paxos accept request containing the write value and timestamp `s` to all follower replicas. Each follower replica verifies the request and acknowledges. The leader waits for acknowledgments from a majority of replicas (including itself). Once a majority acknowledges, the write is durably replicated within the Paxos group. The leader does not reply to the client yet; the commit wait in the next step comes first. This step ensures durability and consistency across replicas.
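The majority rule in this step is simple arithmetic. The helper names below are hypothetical and the sketch only counts acknowledgments; it does not implement Paxos itself.

```python
def majority(replica_count: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replica_count // 2 + 1


def is_committed(acks: int, replica_count: int) -> bool:
    """A write is replicated once a majority (leader included) has acknowledged it."""
    return acks >= majority(replica_count)


def tolerated_failures(replica_count: int) -> int:
    """How many replicas can be unavailable while a majority can still form."""
    return replica_count - majority(replica_count)


for n in (3, 5, 7):
    print(n, majority(n), tolerated_failures(n))
```

With 3 replicas, 2 acknowledgments are enough and 1 replica may be down; with 5 replicas, 3 are needed and 2 may be down.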

4. Commit wait for TrueTime

After receiving majority acknowledgment, the leader does not immediately return success to the client. Instead, it waits until `TT.after(s)` returns true, meaning `s` is definitely in the past according to TrueTime. This wait is typically a few milliseconds, on the order of the uncertainty bound (ε). During this wait, the leader does not allow any other transaction to commit with a timestamp earlier than `s`. This ensures that any subsequent read at a timestamp >= `s` will see this write, and no other transaction can claim a timestamp that conflicts. The commit wait is critical for external consistency.

5. Client receives confirmation

Once the commit wait completes, the leader sends a final commit acknowledgment to the client. The client now knows the transaction is durably stored and globally ordered. The client can proceed with subsequent operations. If the client performs a strong read immediately after, it will see the new write because the read timestamp will be >= `s`. If the client uses a stale read, it may not see the write if the staleness bound is older than `s`. The entire process—from request to confirmation—typically takes tens of milliseconds for cross-region writes.

What This Looks Like on the Job

Scenario 1: Global Financial Trading Platform

A large investment bank uses Spanner to maintain a real-time order book for equities traded across multiple exchanges in New York, London, and Tokyo. The bank requires ACID transactions to ensure that buy and sell orders are matched without double-spending or lost updates. Their first design ran separate instances on the nam3 (US), eur3 (Europe), and asia1 (Asia) configurations, but separate instances do not share transactions, and the bank needs global consistency across all three regions. To achieve this, they use a single Spanner instance spanning all three regions with 9 replicas (3 per region). The write latency is approximately 50-100 ms due to cross-continent Paxos communication. They use strong reads for order matching and stale reads for reporting (with 5-second staleness). The bank monitors the TrueTime uncertainty metric in Cloud Monitoring to ensure it stays below 7 ms. Under the earlier design with separate instances, they experienced inconsistent order books because writes in one region were not visible in another until application-level replication caught up, violating the requirement for strong consistency.

Scenario 2: Global Advertising System

A major ad tech company uses Spanner to store ad impressions and clicks for real-time bidding. They need to deduplicate events and calculate budgets across multiple data centers. They deploy a Spanner multi-region instance with 5 replicas (2 in us-east1, 2 in us-west1, 1 in us-central1). They use stale reads for ad serving (to minimize latency) with a staleness of 10 seconds, because showing an ad that is up to 10 seconds behind is acceptable. For budget updates, they use strong reads to ensure they don't overspend. They scale nodes based on peak QPS: during Black Friday, they increase from 10 nodes to 50 nodes to handle 500,000 QPS. The common pitfall is underestimating the node count: each node handles only 10,000 QPS reads, so they must calculate nodes = (peak QPS) / 10,000. Failure to scale leads to throttling and increased latency.
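The node arithmetic in this scenario can be sketched as follows. It uses this chapter's per-node figures as constants (check the current Spanner documentation before relying on them), and `nodes_needed` is a hypothetical helper. Treating read, write, and storage needs as independent limits and taking the largest is a simplifying assumption.

```python
import math

# Per-node planning figures as quoted in this chapter (assumptions, not live quotas).
READ_QPS_PER_NODE = 10_000
WRITE_QPS_PER_NODE = 2_000
STORAGE_TB_PER_NODE = 2


def nodes_needed(peak_read_qps=0, peak_write_qps=0, storage_tb=0):
    """Return the node count required by the most demanding dimension (at least 1)."""
    by_reads = math.ceil(peak_read_qps / READ_QPS_PER_NODE)
    by_writes = math.ceil(peak_write_qps / WRITE_QPS_PER_NODE)
    by_storage = math.ceil(storage_tb / STORAGE_TB_PER_NODE)
    return max(1, by_reads, by_writes, by_storage)


# Black Friday peak from the scenario: 500,000 read QPS.
print(nodes_needed(peak_read_qps=500_000))
# Storage can dominate: 20,000 read QPS but 30 TB of data.
print(nodes_needed(peak_read_qps=20_000, storage_tb=30))
```

The first call reproduces the scenario's 50 nodes. The second shows why capacity planning takes the higher of the storage and throughput requirements.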

Scenario 3: Global Gaming Leaderboard

A gaming company uses Spanner to maintain a global leaderboard for a multiplayer game. They use Spanner's interleaved tables to store player profiles and scores. They need strong consistency to ensure that the leaderboard accurately reflects the latest scores. They deploy a single-region Spanner instance in us-central1 because most players are in North America. For players in other regions, they use Cloud CDN to cache leaderboard results with a Time-to-Live (TTL) of 30 seconds. They use stale reads for the cached data and strong reads only when a player submits a new score. The problem they encountered: when they initially used stale reads for score submission, they sometimes overwrote a higher score with a lower one because the stale read returned an older score. They fixed this by using strong reads for the update operation. The exam tests this scenario: always use strong reads for read-modify-write operations.
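The lost-update problem in this scenario can be reproduced with a toy in-memory store. `ScoreStore` and `submit_score` are hypothetical names, timestamps are plain integers, and locking is ignored, so this is only an illustration of why the read before a write has to be current.

```python
class ScoreStore:
    """Toy versioned store for one player's score; not a Spanner client."""

    def __init__(self):
        self._history = []  # (commit_ts, score) pairs in commit order

    def write(self, commit_ts, score):
        self._history.append((commit_ts, score))

    def strong_read(self):
        """Latest committed score, or 0 if none."""
        return self._history[-1][1] if self._history else 0

    def stale_read(self, read_ts):
        """Newest score committed at or before read_ts, or 0 if none."""
        visible = [score for ts, score in self._history if ts <= read_ts]
        return visible[-1] if visible else 0


def submit_score(store, now, new_score, staleness=0):
    """Keep the higher score. staleness > 0 simulates basing the decision on a stale read."""
    if staleness:
        current = store.stale_read(now - staleness)
    else:
        current = store.strong_read()
    if new_score > current:
        store.write(now, new_score)


buggy = ScoreStore()
buggy.write(100, 900)                            # high score committed at t=100
submit_score(buggy, 105, 500, staleness=10)      # reads as of t=95, sees nothing
print(buggy.strong_read())                       # the lower score has replaced 900

fixed = ScoreStore()
fixed.write(100, 900)
submit_score(fixed, 105, 500)                    # strong read sees 900, so no write
print(fixed.strong_read())
```

The stale path decides using data from before the high score existed, so the lower score wins. The strong path compares against the latest committed value and leaves it in place.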

How ACE Actually Tests This

What the ACE Exam Tests

The ACE exam objectives (2.3 Planning Solutions) specifically test your ability to:

Explain the benefits of Spanner's global replication and how TrueTime enables external consistency.

Choose the appropriate Spanner configuration (single-region vs. multi-region) based on consistency and latency requirements.

Understand the relationship between nodes, storage, and throughput.

Identify when to use strong vs. stale reads.

Recognize the commit wait mechanism and its purpose.

Common Wrong Answers and Why Candidates Choose Them

1. "Spanner uses eventual consistency" – This is wrong because Spanner provides strong consistency via synchronous replication. Candidates confuse Spanner with NoSQL databases like Bigtable or Cassandra.

2. "The commit wait is to wait for all replicas to acknowledge" – While Paxos waits for a majority, the commit wait is specifically for TrueTime to ensure the timestamp is in the past. Candidates conflate the two waits.

3. "Adding more nodes increases replication factor" – Node count and replication factor are independent. Replication factor is set by the number of zones in the instance configuration. Candidates think more nodes = more replicas.

4. "Stale reads always return data within 15 seconds" – The default staleness is 15 seconds, but you can configure it to be lower (e.g., 1 second) or higher. The exam may test that stale reads can be as fresh as 1 second.

Specific Numbers and Terms to Memorize

TrueTime uncertainty bound: typically 1-7 ms.

Commit wait duration: equal to the uncertainty bound (ε).

Default replication factor: 3 (one per zone in a multi-zone instance).

Maximum replicas: 7.

Node throughput: 10,000 QPS reads, 2,000 QPS writes per node.

Node storage: 2 TB per node.

Stale read default max staleness: 15 seconds (adjustable).

Multi-region configurations: nam3 (US), eur3 (Europe), asia1 (Asia).

Edge Cases and Exceptions

Read-your-writes consistency: Spanner guarantees that a write followed by a strong read from the same client will see the write. However, if you use stale reads, you might not see your own write until the staleness bound expires.

Global transactions: Transactions that span multiple Paxos groups use two-phase commit. The coordinator group may be in a different region, increasing latency.

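Instance deletion: Deleting a Spanner instance permanently deletes the databases it contains, so treat it as irreversible. If a database has deletion protection enabled, you must disable that protection before the instance can be deleted.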

Backup and restore: Spanner supports point-in-time recovery with a version retention period (default 1 hour, max 7 days).

How to Eliminate Wrong Answers

If the question mentions "eventual consistency" for Spanner, eliminate that option immediately.

If the question asks why Spanner can have global strong consistency, the answer must involve TrueTime and commit wait.

For capacity planning, calculate nodes based on the higher of storage or throughput requirements.

If the question is about read types, look for clues: if the application can tolerate some staleness (e.g., reporting), choose stale reads; if it must see the latest data (e.g., financial transactions), choose strong reads.

Key Takeaways

TrueTime provides globally synchronized timestamps with a bounded uncertainty of 1-7 ms, enabling external consistency.

The commit wait (equal to TrueTime uncertainty) ensures that the commit timestamp is in the past before confirming a write.

Spanner uses synchronous replication with Paxos; a majority of replicas must acknowledge a write before it is committed.

Default replication factor is 3 (one per zone); maximum is 7.

Each node provides 2 TB storage and 10,000 QPS read throughput (or 2,000 QPS write throughput).

Strong reads always return the latest data; stale reads can be up to 15 seconds old (configurable).

Multi-region instances (nam3, eur3, asia1) provide global strong consistency but higher write latency.

Adding nodes increases capacity but does not change replication factor.

The TrueTime API functions are TT.now(), TT.after(), and TT.before().

Spanner combines SQL, ACID transactions, and horizontal scalability with global strong consistency, and was among the first databases to do so.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Single-Region Spanner Instance

All replicas in zones within one geographic region (e.g., us-central1).

Lower write latency (typically 5-10 ms) because Paxos communication is within region.

Read latency is low for clients in the same region.

Provides strong consistency and high availability within that region.

Suitable for applications with a regional user base.

Multi-Region Spanner Instance

Replicas spread across multiple geographic regions (e.g., US, Europe, Asia).

Higher write latency (typically 50-100 ms) due to cross-region Paxos.

Read latency is low for clients in any region that has a replica.

Provides strong consistency and high availability globally.

Suitable for global applications requiring strong consistency across continents.

Strong Read (default)

Always returns the most recently committed data.

May incur higher latency because it must contact the leader replica.

Guarantees read-your-writes consistency.

Suitable for financial transactions, real-time updates.

Uses the current timestamp from TrueTime.

Stale Read

May return data that is up to a configurable staleness (default 15 seconds).

Can be served by any replica, reducing latency.

Does not guarantee read-your-writes; might not see recent writes.

Suitable for reporting, dashboards, cache refreshes.

Uses a timestamp in the past, typically via `TimestampBound.ofExactStaleness()`.

Watch Out for These

Mistake

Cloud Spanner uses eventual consistency like many NoSQL databases.

Correct

Spanner provides strong external consistency via synchronous replication and TrueTime. Writes are committed only after a majority of replicas acknowledge, and reads always return the latest committed data unless stale reads are explicitly used.

Mistake

The commit wait is used to wait for all replicas to acknowledge the write.

Correct

The commit wait is to ensure the commit timestamp is in the past relative to TrueTime's uncertainty bound. The Paxos consensus already ensures a majority of replicas acknowledged before the commit wait begins.

Mistake

Adding nodes increases the replication factor of Spanner.

Correct

Replication factor is determined by the number of zones in the instance configuration (e.g., 3 zones = 3 replicas). Adding nodes increases storage and throughput capacity but does not change the number of replicas per shard.

Mistake

Stale reads always return data that is at most 15 seconds old.

Correct

The default maximum staleness is 15 seconds, but you can configure a lower value (e.g., 1 second) or use exact staleness. The staleness bound is adjustable per query.

Mistake

Spanner can be used for single-region deployments without replication.

Correct

Even single-region Spanner instances use multiple zones (minimum 3) for high availability. Replication is inherent to Spanner's architecture; you cannot have a single replica.

Frequently Asked Questions

What is the difference between strong and stale reads in Spanner?

Strong reads always return the most recently committed data and guarantee read-your-writes consistency. They must contact the leader replica, which can incur higher latency. Stale reads return data that is up to a configurable amount of time in the past (default 15 seconds) and can be served by any replica, reducing latency. Use strong reads for critical operations like financial transactions, and stale reads for reporting or dashboards where slight staleness is acceptable.

How does TrueTime ensure global consistency in Spanner?

TrueTime provides a time interval `[earliest, latest]` with a bounded uncertainty (1-7 ms). When a transaction commits, Spanner assigns a timestamp `s` that is greater than `latest`, then waits until `TT.after(s)` returns true (the commit wait). This ensures that `s` is in the past relative to the uncertainty, so all subsequent transactions will have timestamps greater than `s`. This guarantees a global ordering of transactions, preventing anomalies like lost updates.

What is the default replication factor for a Spanner instance?

The default replication factor is 3, which means each shard (split) is replicated across three zones. You can configure up to 7 replicas by choosing an instance configuration with more zones. The replication factor is determined by the number of zones in the instance configuration, not by the number of nodes.

How do I scale a Spanner instance for higher throughput?

You scale Spanner by adding nodes. Each node provides 10,000 QPS read throughput and 2,000 QPS write throughput, along with 2 TB of storage. You can add nodes using the `gcloud spanner instances update` command or the Google Cloud Console. Scaling is automatic in terms of rebalancing data across nodes. There is no downtime during scaling.

Can I use Spanner for a single-region application?

Yes, you can create a single-region Spanner instance (e.g., `regional-us-central1`). Even in a single region, Spanner uses multiple zones (typically 3) for high availability. This provides strong consistency and low latency within that region, but does not replicate data to other regions. For global applications, use a multi-region instance.

What happens if the TrueTime uncertainty becomes large?

If the uncertainty bound increases (e.g., due to GPS signal loss or atomic clock drift), Spanner may increase the commit wait duration accordingly. This can increase write latency. In extreme cases, if uncertainty exceeds a threshold, Spanner may reject writes to maintain consistency. Monitoring TrueTime uncertainty via Cloud Monitoring is recommended.

How does Spanner handle read-your-writes consistency?

Spanner guarantees read-your-writes consistency for strong reads. If a client writes data and then performs a strong read from the same client session, the read will see the write. This is because the strong read uses a timestamp greater than or equal to the commit timestamp of the write. Stale reads may not see the write if the staleness bound is older than the commit timestamp.
