How computers synchronize clocks

date
Aug 2, 2024
outer_link
slug
how-computers-sync-clocks
status
Published
tags
CS Theory
summary
Computers have their own physical clocks to track time, but they deviate from each other. How do they synchronize back?
type
Post

Background

Computers generally have their own physical clocks to track time. In distributed computing, these clock values are crucial to coordinate and synchronize distributed computations. However, these values usually deviate from one another due to clock drifts. This is because, different computers count time at slightly different rates known as the drift rate.
Drift rate is the change in offset between a clock and a perfect reference clock per unit of time measured by the reference clock.
💡
Drift rate of quartz crystal is about 10-6

Christian's Method

The Christian's method is a rather straightforward way to synchronize clocks. Suppose we have 2 computers A & B.
notion image
  • A sends a request to B asking for it's time at t1.
  • B replies with it's time as t.
  • A receives B’s reply at t2 and records the total round trip time. The round trip time Tround = t2 - t1.
  • A now calculates B’s time as t + Tround/2.
  • A sets it's clock as this time and the 2 clocks are synchronized.
This method is quite intuitive, however A’s assumption is quite naive. It assumes that the roundtrip of requests are equal on both trips. So it took an average. In reality, the 2 request trips (A → B, B → A) could be very different. So the actual time of B might not be accurately calculated. It also assumes that the time spent by B to process A’s request is instant.

Berkeley Algorithm

Proposed by Gusella & Zetti for synchronizing a group of computers running Berkeley UNIX, this method is an extension of the Christian's method. Instead of 2 computers synchronizing themselves independently, why not have a group of computers synchronize together? That way the roundtrip time will get closer to actually being an average. This is how it works:
  • The master periodically polls the slaves
  • Slaves send their clock readings back to master
  • Master evaluates the local times of slaves by observing roundtrip times (similar to Cristian’s method)
  • Master averages the values obtained (including its own clock reading)
  • Master sends the amount for adjustment to each slave whose clock requires adjustment
notion image
Taking the average clock reading cancels out individual clocks’ tendencies to run fast or slow. Also, the master only sends the adjustment and not the actual time. This is done to reduce further uncertainty as the message from the master will take some time to reach the slaves.
💡
Empirical results: 15 computers on LAN (roundtrip time ~ 10 ms; clock drifts ~ 20 µs/s) synchronized to accuracy of 20-25 ms

Network Time Protocol

Christian's method and the Berkeley Algorithm still have a centralized design. That's not very scalable when we are talking about the internet. So, we have NTP.
Here, we have a network of servers structured hierarchically into a synchronization subnet. Primary servers are stratum 1 which are directly connected to a time source (e.g a radio clock receiving UTC). Stratum 2 servers synchronize with stratum 1 servers. But stratum 3 servers synchronize with stratum 2 servers thereby being more scalable.
notion image
Primary servers (stratum 1) are connected directly to a time source (e.g., radio clock receiving UTC). Stratum 2 servers are synchronized with primary servers. Stratum 3 servers are synchronized with stratum 2 servers.
This is a lot more fault tolerant because if a server fails, the other servers reconfigure themselves to reach a working server. If a stratum 1 server’s UTC source fails, it is downgraded to stratum 2. If a stratum 2’s primary source fails, it simply synchronizes with a different stratum 1 server.
💡
Empirical results: synchronization accuracy, 1ms over LAN, tens of ms over Internet

Are physical times really necessary?

The thing is, physical clocks in a distributed system can never be synchronized perfectly as proved by Lamport [1978]. Therefore, no matter how good a synchronization is, we can never use physical time to order arbitrary pair of events in a distributed system.
But we really don’t need the actual physical time to order events. Instead, we can use a scheme similar to physical causality to order events. This is where logical clocks come in. That is a whole other topic for another post.

Mohamed Irfan © 2025