Design a proximity service

A proximity service answers the question behind every "restaurants near me" tap. Given a point on the map and a radius, it returns the businesses inside that circle, ranked sensibly, and it does so fast enough that the results feel attached to the gesture rather than fetched from somewhere far away. It is the engine under Yelp's nearby tab, a retailer's store locator, and the place results in a maps app. The workload has a distinctive shape that makes it a favorite interview question. Reads outnumber writes by several orders of magnitude, the data tolerates being slightly stale because businesses move rarely, and the central technical obstacle is that ordinary database indexes order one dimension while the query lives in two. A candidate who can explain why a B-tree struggles here, and who can then build a spatial index from first principles, demonstrates exactly the kind of reasoning interviewers want to see.

The walkthrough below runs the question the way I would pace it in a real interview, beginning with scope, moving to the numbers, then spending most of the time on the indexing problem that dominates the design, and closing with the operational story of replicas, caches, and shards. At each fork I will name the alternative the chosen path was weighed against, because the choosing is where the design actually lives.

Scope and requirements

The functional core is a single query, namely returning the businesses within a user-chosen radius of a latitude and longitude, with optional filters such as category and a ranking that blends distance with rating. Business owners need to add, edit, and remove listings, and users need a detail view for a single business. I would confirm with the interviewer that full-text search, reviews, and photos belong to other services, and that this one owns the geospatial lookup alone. Drawing that boundary early matters, because each neighboring feature pulls the design in a different direction, and a forty-minute conversation cannot afford to chase text relevance and photo storage while the spatial index goes unexplained.

The non-functional requirements are friendly compared to most designs. Searches should return in a couple hundred milliseconds at the 99th percentile because they sit on the critical path of a consumer app, where someone has just tapped a button and is watching the screen with a thumb hovering. The system should be highly available, since a search tab that errors out is indistinguishable from a broken app to the person holding the phone. It can also be eventually consistent, which means readers may briefly observe data that lags the latest writes, and here that lag is harmless. If a newly added cafe takes a few minutes or even a day to appear in search results, nobody is hurt, and that looseness is worth saying out loud because it unlocks aggressive caching and periodic index rebuilds later in the design. Reads dwarf writes, since millions of people search while only a trickle of owners edit listings on any given day, and a read-heavy workload that also tolerates staleness is about the kindest combination a designer can be handed.

Sizing the problem

Concrete numbers keep the design grounded, so I would propose a set and invite the interviewer to adjust them. Assume 100 million businesses and 100 million daily users who each perform about five searches a day. Multiplying gives 500 million searches per day, and since a day holds 86,400 seconds, the average load is 500 million divided by 86,400, which comes out near 5,800 queries per second. I would round to 5,000 average for clean arithmetic and plan for a peak around three times that, or 15,000 queries per second, because consumer search traffic bunches around lunchtime and evenings rather than spreading evenly across the clock. Writes are a rounding error by comparison. Even if one percent of all businesses were edited on a busy day, that is 1 million updates spread over 86,400 seconds, or about 12 writes per second, a load a laptop could absorb without noticing. The gap between 15,000 reads and 12 writes is the most load-bearing fact in the whole design, because it licenses every read-side optimization that follows.

Storage is equally tame. A business record with name, coordinates, category, rating, address, and hours fits in about 1 KB, so 100 million businesses occupy roughly 100 GB, which a single replicated relational database handles without ceremony. The geospatial index is smaller still, because one entry per business holding an 8-byte ID plus a short cell identifier comes to under 2 GB of raw data and only a few GB once structural overhead is counted. That observation, that the entire index fits comfortably in the memory of one machine, will shape the architecture more than any other number here, since it converts what looks like a sharding problem into a replication problem, and replicating a small read-only structure is one of the easiest moves in distributed systems.

The API

The read path is one endpoint, and the write path is conventional CRUD on listings, CRUD being the create, read, update, and delete quartet that nearly every storage-backed service exposes. Pagination uses an opaque token so the server can encode where the previous page ended without exposing internals, which leaves the service free to change its ranking machinery later without breaking clients that saved old tokens.

GET /v1/search/nearby?lat=37.7749&lng=-122.4194&radius=500&category=cafe
→ 200 { "total": 312,
        "businesses": [ { "id": 8123, "name": "Blue Door Coffee",
                          "distance_m": 120, "rating": 4.6 }, ... ],
        "next_page_token": "eyJvZmZzZXQiOjIw..." }

GET    /v1/businesses/{id}        → 200 { full business detail }
POST   /v1/businesses             → 201 (owner adds a listing)
PUT    /v1/businesses/{id}        → 200 (edit; index updated asynchronously)
DELETE /v1/businesses/{id}        → 204

The data model

The source of truth is a plain relational table, because business records are small, relational tooling is mature, and nothing about the data demands anything exotic. A document database would store the same records without complaint, but it would buy nothing here and would give up the comfortable transactional update path that owner edits want, so the unexciting choice wins. The spatial structure lives beside the table as a mapping from cell identifiers to business IDs, which can be a second table at small scale or an in-memory structure at large scale, and the next sections explain what those cells are and why they exist at all.

CREATE TABLE businesses (
  id          BIGINT PRIMARY KEY,
  name        TEXT NOT NULL,
  lat         DOUBLE PRECISION NOT NULL,
  lng         DOUBLE PRECISION NOT NULL,
  category    TEXT,
  rating      REAL,
  address     TEXT,
  updated_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE geo_index (
  geohash      CHAR(6) NOT NULL,        -- cell of about 1.2 km by 0.6 km
  business_id  BIGINT NOT NULL REFERENCES businesses(id),
  PRIMARY KEY (geohash, business_id)
);

The high-level architecture

A load balancer fronts a stateless search service, stateless meaning any node can serve any request because nothing is remembered between calls, so nodes can be added or replaced freely. Searches hit a geospatial index that maps cells to business IDs, the candidate IDs are hydrated from the business database, hydration being the step that expands bare IDs into full records, and hot cell results are cached in front of everything. Because the index is only a few GB, it is replicated wholesale to several read-only nodes rather than sharded, and the database feeds those replicas through incremental updates or periodic rebuilds, which the loose freshness requirement permits. Every box on the read path exists to keep that path short, and the only asynchronous machinery in the system is the dashed rebuild loop that refreshes the index from the source of truth.

Searches resolve cells against an index replica, hydrate business details from the database, and lean on a cache for hot cells. The dashed path is the asynchronous index rebuild, which the loose freshness requirement makes safe to run on a relaxed schedule.

Why a one-dimensional index fails

The instinctive first answer is to index latitude and longitude with ordinary B-tree indexes, which are the balanced tree structures relational databases use to answer range queries over a single ordered column, and the instinct fails in an instructive way. A B-tree gives fast answers to a question like all rows with latitude between 37.770 and 37.780, because latitude is one ordered dimension and the matching rows sit contiguously in the tree. A radius query, though, is a conjunction of two ranges, one in latitude and one in longitude, and each range taken alone matches an enormous stripe of the planet.

Working through the arithmetic makes the failure concrete. A 500 m radius spans about 0.009 degrees of latitude, so the latitude predicate selects a thin band that nonetheless circles the entire globe. If businesses were spread evenly over the 180 degrees of latitude, the band would hold 100 million times 0.009 divided by 180, about 5,000 rows. In reality businesses cluster in cities, and a band passing through one metro area also passes through every other city sitting at the same latitude anywhere on Earth, so a band through Seoul also sweeps Madrid and Philadelphia, and the count lands in the tens or hundreds of thousands. The longitude band behaves the same way running north to south. Since the database can use only one of the two indexes to drive the scan, it pulls one huge stripe and filters it row by row against the other condition to find perhaps a hundred true matches. A composite index over both columns does not rescue the plan either, because a composite B-tree orders by its first column and only then by its second, so the longitude condition still cannot narrow the scan inside the latitude band. Doing tens of thousands of wasted row inspections per query, 15,000 times per second, is not a plan, and an operator would watch it fail as a database with saturated CPUs while users stare at spinners. The fix is an index whose keys encode both dimensions at once, which is exactly what geohashes and quadtrees provide.

Geohash, quadtree, and the neighbor problem

A geohash is a way of folding two dimensions into one sortable string, and the encoding is simple enough to derive at the whiteboard. Halve the longitude range and write a 1 if the point falls in the upper half, halve again, and repeat; do the same for latitude; then interleave the two bit streams, longitude bit first, and encode every five bits as one character of a base32 alphabet. The decisive property is that a prefix names a rectangle. Every point whose geohash starts with 9q8yy lies in one specific cell, and longer prefixes name smaller cells nested inside it, so the cells form a hierarchy and a string prefix match becomes a geometric containment test. The standard sizes are worth memorizing because they let you pick a precision instantly. Four characters covers about 39 km by 20 km, five characters about 4.9 km by 4.9 km, six characters about 1.2 km by 0.6 km, and seven characters about 150 m by 150 m. For a 500 m radius search, six characters is the natural fit, and the geo index table above is built on exactly that idea, with rows keyed by 6-character cell, which a B-tree handles perfectly because the cell ID is one ordered dimension.

One failure mode needs calling out before it bites. Two points 10 m apart can sit on opposite sides of a cell boundary and share no useful prefix at all, and a query point near a cell edge has part of its radius hanging into adjacent cells, so searching only the home cell silently drops results near every boundary. From the user's side the symptom is quietly absurd, since a cafe visibly across the street fails to appear in a 500 m search while one four blocks away shows up fine. The fix is mechanical rather than clever, namely computing the 8 neighboring cells, which costs only a little bit arithmetic, and querying all 9 together. With 6-character cells, the 3 by 3 block covers roughly 3.7 km by 1.8 km, comfortably containing any 500 m circle no matter where the point sits inside the home cell.

The query point sits near the edge of its home cell, so the radius spills into the neighbors. Fetching the 3 by 3 block guarantees coverage, and businesses inside fetched cells but outside the circle are dropped later by the exact distance check.

The quadtree is the other classic answer, and it adapts to density instead of using fixed cells. A quadtree is a tree that starts with one node covering the whole map and recursively splits any node into four equal quadrants until each leaf contains fewer than some threshold k businesses, so leaves are tiny in Manhattan and enormous over Montana, and every query descends to a leaf sized appropriately for its surroundings. The build arithmetic is reassuring. With k of 100, the structure needs about 100 million divided by 100, or 1 million leaves; a quadtree with 1 million leaves has about a third of a million internal nodes, and at tens of bytes per node plus 8 bytes per stored business ID the whole tree lands near 1 to 2 GB, which one server builds in minutes and holds in memory easily. Updating a quadtree in place is fiddlier than upserting a geohash row, because removals can leave leaves underfull and demand rebalancing logic that is easy to get subtly wrong, which is why many teams rebuild the tree on a schedule instead of mutating it. Google's S2 library is the production-grade refinement of these ideas, projecting the sphere onto cube faces and ordering cells along a Hilbert curve, which is a space-filling path that preserves locality better than geohash interleaving, and Uber's H3 does something similar with hexagons, whose neighbors all sit at uniform distances and so behave more predictably for ring queries. In an interview, geohash is the easiest to defend end to end, with S2 or H3 named as the libraries you would adopt rather than rebuild.

The query path, ranking, and caching

A search now runs in four steps. The service first converts the query point and radius into a cell cover, meaning the home cell plus its 8 neighbors at the chosen precision. It then fetches candidate business IDs for those cells from an index replica, computes the exact distance from the query point to each candidate using the haversine formula, which is the standard great-circle distance over the Earth's surface, and discards anything outside the radius. Finally it ranks the survivors, typically by a blend of distance and rating, before hydrating the top page from the database or the object cache. Candidate counts stay small because 9 cells at 6-character precision cover only a few square kilometers, which even in a dense downtown holds a few thousand businesses, and a few thousand haversine evaluations cost well under a millisecond of CPU, so the geometry is never the slow part.

A latency budget makes the 200 ms target feel earned rather than asserted. Allow 50 ms for the network round trip between the phone and the data center, a couple of milliseconds inside the load balancer and service code, about 5 ms for nine cell lookups against an in-memory index replica, under 1 ms for haversine filtering and ranking, and 10 to 20 ms to hydrate twenty business records from the object cache with an occasional fall-through to a database replica. The sum lands near 80 ms, which leaves over 100 ms of headroom for a cache miss, a retried connection, or a garbage collection pause before the 99th percentile target is threatened. When a request does blow the budget, the culprit is almost always hydration falling through every cache layer onto a busy database, which is exactly why the cell cache and the object cache exist as two separate lines of defense.

Pagination works best as a cursor rather than an offset. An offset asks the server to skip the first n results, which silently repeats or drops entries whenever the ranking shifts between page fetches. A cursor instead encodes the rank score and ID of the last returned business inside the page token, and the next request re-runs the cheap cell query and resumes after that position, which stays correct even if the underlying data drifts slightly between pages. Caching attacks the read load at two levels. Whole cell results, keyed by cell ID plus the filter set, get a time to live of a few minutes, which collapses the many users searching the same neighborhood into one index lookup, while individual business objects are cached by ID with a longer lifetime and invalidated on edit. Because popular areas are searched constantly, hit rates on the cell cache run high, and the staleness window is exactly the freshness the requirements already conceded, so the cache costs the product nothing it had not already agreed to pay.

Writes follow the quiet path. An edit lands in the business database as the source of truth, and the index absorbs it either incrementally, which for a geohash table is a delete plus an insert of one row, or through a periodic rebuild, where a job walks the table, constructs a fresh index or quadtree, and swaps replicas one at a time behind the load balancer so searches never see a half-built structure. At 12 writes per second, either approach is loafing, and the decision reduces to whether the team prefers the bookkeeping of incremental updates or the operational simplicity of rebuilding from scratch and swapping.

Scaling, failures, and operations

Each tier scales by its own rule. The search service is stateless and scales horizontally, and 15,000 queries per second spread over a couple dozen nodes is unremarkable for processes doing memory lookups and light arithmetic. The index scales by replication rather than sharding, because a few GB fits in memory everywhere and adding a replica adds read throughput linearly; replicas register with the load balancer and the search tier round-robins across them. The database carries 12 writes per second plus whatever read traffic the caches let through, so a primary with a couple of replicas is enough for a long time, and nothing in this tier requires technology more exotic than what a small team already runs.

Sharding deserves a paragraph because interviewers push on it even when the numbers say replicate. The tempting scheme is to shard the index by region, giving each shard a contiguous slice of the world so a query touches exactly one shard, but geography is brutally skewed, since the shard holding Tokyo or Manhattan would see orders of magnitude more traffic than the shard holding open ocean, and redrawing region boundaries under live load is the kind of operational misery that pages people at night. The alternative is to shard by hashing the business ID, so every shard holds a thin slice of every cell. A query then scatter-gathers across all shards and merges the partial results, which makes tail latency the latency of the slowest shard but spreads load perfectly evenly and never needs rebalancing by geography. At this data size the right answer is to replicate the whole index and avoid the choice entirely, while saying that hash-plus-scatter-gather is the scheme you would reach for if the index grew tenfold and stopped fitting in one machine's memory.

The failure stories are mild, and walking through them shows why. When a search node dies, the load balancer notices failed health checks within seconds and stops routing to it, so users see at most a few failed requests that retries absorb. When an index replica dies, query capacity drops until a replacement rebuilds itself from a database snapshot in minutes, and during that window latency creeps upward rather than anything going dark. Losing the cache cluster is the loudest event, because index traffic jumps until the cache rewarms, and request coalescing, where concurrent misses on the same cell wait on one fill instead of stampeding the index, keeps that surge polite. A database failover is the only moment of real care, and since search reads come from the index and the caches, a brief write outage on listings is nearly invisible, with owners seeing a failed edit they can retry while searchers notice nothing at all. Operationally the metrics that matter are search latency percentiles, cell cache hit rate, index replica staleness, and a skew report of the hottest cells, which warns you early if one neighborhood is heating beyond what a single replica set handles.

Follow-up questions

Why not use PostGIS or Elasticsearch geo queries instead of building an index? At moderate scale you should, and saying so is good judgment rather than a dodge. Those tools implement the same cell and tree machinery described above behind a query language, and they serve millions of rows comfortably. The custom path earns its keep only when query volume outgrows what a database extension serves cheaply, and the interview value lies in showing you understand what those tools are doing inside when you reach for them.
What geohash precision do you pick, and what if the radius changes? The cell size should roughly match the radius, which means six characters for searches of a few hundred meters, five for a few kilometers, and four for tens of kilometers. A 5 km search at 6-character precision would need dozens of cells to cover the circle, so the service instead picks the precision where the home cell plus its neighbors covers the circle in 9 fetches, keeping the fan-out constant regardless of radius.
How do results near a cell boundary stay correct? Every query fetches the 8 neighbor cells along with the home cell, because a point near an edge has part of its radius in adjacent cells and two close points can share no geohash prefix at all. The exact haversine filter then trims the overfetch, so correctness comes from the union of cells and precision comes from the distance check, each doing the job the other cannot.
Geohash table or quadtree? The geohash table is simpler to update and trivially durable because it is just rows in a database, while the quadtree adapts cell size to density and answers variable-radius queries gracefully but is easier to rebuild than to mutate in place. Both fit in memory at this scale, so the deciding factor is operational simplicity, which favors the geohash, and that is the answer I would defend unless the interviewer pressed hard on density skew.
What breaks first at 10x? Read volume feels the pressure first, and the design absorbs it by adding index replicas and cache nodes, which is exactly why keeping the index small and replicable was the central early decision. The business database and the write path barely notice 120 writes per second, so the storage tier holds without structural change.
How would you rank beyond distance? Distance becomes one feature among several, joined by rating, popularity, and whether the business is open right now, scored either with a hand-tuned formula or a lightweight learned model. The boundary worth defending is that candidate generation stays purely geometric so the index remains generic, with ranking layered on top where it can evolve independently.

References

Redis, Geospatial data type documentation, a production geohash-based index.
Google, S2 Geometry Library, spherical cells ordered along a Hilbert curve.
Uber Engineering, H3: Uber's Hexagonal Hierarchical Spatial Index (2018).
Xu, System Design Interview, Volume 2 (2022), chapter on proximity service.
Kleppmann, Designing Data-Intensive Applications (2017), on indexes and replication.