Design a social media platform

A social media platform lets people publish short posts with text and images, follow other people, and read a timeline of what the people they follow have published, with likes and reposts layered on top as the engagement currency. Hundreds of millions of people open such an app many times a day, and interviewers love the question because it packs a read-heavy serving problem, a write amplification problem, a counting problem, and a media delivery problem into one prompt, leaving the candidate to decide which parts deserve the deep dives. Spreading attention evenly is the classic mistake here, since the user service and the media path are well-trodden ground while the timeline machinery is where designs genuinely differ. The walkthrough below scopes the product, sizes it with worked numbers, decomposes it into services, and then spends its depth on the three places the design actually strains, namely reposts, counters, and timeline delivery.

Scope and requirements

I would pin the feature set early, because each feature on the list drags an architecture behind it. Users create posts containing text up to a small limit and optionally a few images. They follow each other, and following is one-directional, a choice that matters because a mutual friendship model forces two users' rows to agree on every change while a one-way follow is a single fact owned by the follower. They can like a post and can repost it, meaning share another person's post to their own followers with the original attributed, and there are two read surfaces, a home timeline showing recent posts from everyone the user follows and a profile timeline showing one user's own posts. Direct messages, search, and ranked recommendations are real systems of their own, so I would explicitly leave them out and note that the timeline design leaves a slot where ranking can plug in later, since a ranker consumes exactly the candidate set the timeline already assembles.

The non-functional requirements drive the architecture more than the features do. The system is overwhelmingly read-heavy, so home timeline loads must feel instant, ideally under 200 milliseconds end to end, while a new post may take a few seconds to reach all followers because humans cannot perceive that lag in a social context. The platform must stay available through node failures, and it can accept eventual consistency, which means a reader may briefly see slightly stale data that converges shortly after writes stop, for almost everything. The one exception is that a user must see their own post immediately after publishing it, a property called read-your-writes consistency, because nothing erodes trust faster than an app that appears to eat your words, and a user who refreshes and finds their post missing will post it again and create a duplicate. That carve-out is cheap to grant by serving the author's own recent posts from the primary store or a session cache rather than from a replica.

Sizing the problem

Concrete numbers turn the design from adjectives into engineering. Assume 200 million daily active users and that on average 2 posts arrive per 10 users per day, which is 0.2 posts per user per day and therefore 40 million posts per day. A day holds 86,400 seconds, so dividing the two gives an average near 460 posts per second, and a peak factor of three, which is typical for products with strong evening peaks, puts the write path around 1,400 posts per second. Timeline reads run about 100 times the write rate in this kind of product, since most sessions are pure scrolling, which gives 4 billion timeline requests per day, an average of roughly 46,000 reads per second, and peaks near 140,000 per second. That two-and-a-half-orders-of-magnitude gap between reads and writes is the single most important number in the design, because it says the system should do extra work at write time to make reads cheap, and nearly every choice that follows descends from it.

Storage follows the same discipline. A post row with text, author, timestamps, and counters fits in about 1 KB, so 40 million posts per day is 40 GB per day and roughly 15 TB per year, which is modest for a sharded store. Media dominates everything else, because if a quarter of posts carry an image averaging 300 KB, that is 10 million images and 3 TB per day, about 1.1 PB per year, which is why images live in an object store behind a CDN rather than in any database, and why the databases never see an image byte. Fanout is the hidden multiplier in the write path. With an average of 200 followers per user, delivering 40 million posts into followers' timelines means 8 billion timeline insertions per day, an average near 93,000 small writes per second, and that number rather than the 460 posts per second is what the fanout tier must be built to absorb.

The API

The external surface is a small set of authenticated JSON endpoints plus a separate upload path for media. For uploads the client first asks for a presigned URL, which is a short-lived URL carrying permission to upload one object directly to storage, so image bytes never pass through the application servers and a burst of large uploads cannot starve the request path that serves timelines. Pagination uses cursors rather than page numbers, where a cursor is an opaque token encoding the last post ID the client saw, because offset pagination re-scans the skipped rows on every request and duplicates or drops items whenever new posts arrive between pages, which on a fast-moving timeline is every request.

POST /api/v1/posts
{ "text": "shipping day", "media_ids": ["m-81f2"] }
→ 201 { "post_id": "p-99021", "created_at": "2025-06-03T17:20:11Z" }

POST /api/v1/posts/p-99021/like        → 204
POST /api/v1/posts/p-99021/repost      → 201 { "post_id": "p-99544" }
POST /api/v1/users/u-417/follow        → 204

GET  /api/v1/timeline/home?cursor=p-98712&limit=20
→ 200 { "posts": [ ... ], "next_cursor": "p-97608" }

POST /api/v1/media/uploads             → 201 { "media_id": "m-81f2",
                                               "upload_url": "https://..." }

The data model

Three tables carry the social core, and the repost deliberately reuses the posts table, so a repost is simply a lightweight post whose repost_of column points at the original, with no text of its own in the simple variant. The alternative would be a separate reposts join table, and it loses because every consumer of timelines would then have to read two tables and interleave the results, while the single-table model lets reposts flow through fanout, pagination, and caching as ordinary posts. Follows and likes are plain relationship rows, partitioned in practice by their first column so one user's relationships live together and the common questions, who do I follow and did I like this, each touch a single partition.

CREATE TABLE posts (
  post_id     BIGINT PRIMARY KEY,        -- time-ordered ID (Snowflake style)
  author_id   BIGINT NOT NULL,
  text        VARCHAR(500),
  media_ids   JSONB,
  repost_of   BIGINT,                    -- NULL for original posts
  created_at  TIMESTAMPTZ NOT NULL,
  deleted_at  TIMESTAMPTZ
);

CREATE TABLE follows (
  follower_id BIGINT NOT NULL,
  followee_id BIGINT NOT NULL,
  created_at  TIMESTAMPTZ NOT NULL,
  PRIMARY KEY (follower_id, followee_id)
);

CREATE TABLE likes (
  user_id    BIGINT NOT NULL,
  post_id    BIGINT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL,
  PRIMARY KEY (user_id, post_id)
);

Two details earn comment. Post IDs should be time-ordered but not minted from one central counter, and the Snowflake scheme, which packs a timestamp, a machine ID, and a sequence number into one 64-bit integer, gives every application server the ability to create sortable unique IDs locally. A central auto-increment sequence would put one database in every write path and cap the platform at that node's throughput, while random UUIDs would sort uselessly, and timelines want newest-first order, so a time-prefixed ID makes ID order and time order the same thing and turns every pagination cursor into a simple comparison. The second detail is that the like and repost counts rendered on every post are conspicuously absent from these tables as live counters, because maintaining them naively is one of the failure modes this design treats at length below.

The high-level architecture

The decomposition follows the data, so a user service owns profiles and auth, a social graph service owns follows, a post service owns post rows, a media path of object store plus CDN owns image bytes, a fanout tier turns each new post into timeline entries, and a timeline service assembles home timelines at read time. Splitting along data ownership rather than along product features means each store can shard on the key its own access pattern wants, which is the property that makes everything in the later sections possible. The fanout tier is asynchronous behind a queue so a publish never waits on 200 cache writes, and the queue also decouples failure, since a slow cache cluster shows up as queue depth rather than as publish errors in users' hands. The timeline cache holds per-user lists of post IDs rather than full posts, which keeps each post body stored once and referenced many times, makes a cached timeline a few kilobytes instead of a megabyte, and means an edit to a post never has to chase fanned-out copies.

Writes flow along the top, where the post service persists the post and enqueues fanout, and workers fetch follower lists and push post IDs into per-user timeline caches that the timeline service reads and then hydrates from the post store. Dashed arrows are asynchronous.

Reposts as lightweight posts

Modeling the repost as a post row with a repost_of pointer buys three things at once. Reposts flow through the exact same fanout pipeline as originals, they paginate and sort in timelines with no special casing, and the original's content lives in one place so an edit or takedown of the original is reflected everywhere. At render time the timeline service hydrates the repost, sees the pointer, fetches the original in the same batched read, and the client displays the original's content under the reposter's attribution line. Hydration here means converting stored IDs into full display objects by looking them up in a cache or store, and the batching matters because a 20-item timeline containing a handful of reposts still costs one round trip to the post cache rather than one per pointer chased.

The user taps repost (1) and the post service writes a lightweight post whose repost_of field references the original (2). The original's repost counter increments through the counter service (3) while a fanout job is enqueued (4), and workers push the new post ID into followers' timeline caches (5). At render time the cached ID is hydrated, fetching the repost row and the original it points to (6).

Deletion is the case that proves the model. When an original is deleted, the row gets a deleted_at tombstone rather than being removed, and every repost pointing at it hydrates to a placeholder saying the post is unavailable on the next render, with no need to chase down thousands of repost rows or fanned-out timeline entries at delete time. The author who deletes sees the post vanish immediately, a follower mid-scroll sees the placeholder appear on their next refresh, and both experiences are correct under the consistency contract. Counters need symmetric care, so a repost increments the original's repost count, deleting the repost decrements it, and deleting the original freezes its counters and removes the original from circulation while the repost shells decay naturally. The one rule worth stating out loud is that the platform never edits other users' timelines synchronously on delete, because tombstone checks at hydration achieve the same effect for the cost of one cache field.

Counting likes without melting a row

The naive implementation runs UPDATE posts SET like_count = like_count + 1 WHERE post_id = ? on every like, and it fails precisely on the posts that matter. A relational database serializes writes to a single row through that row's lock, the mechanism that keeps concurrent updates from corrupting each other, and a hot row sustains perhaps 500 to 1,000 such updates per second before queueing dominates, while a viral post can attract 10,000 likes per second. At that rate the lock queue grows without bound, latencies climb past client timeouts, and the failure spreads as retries amplify the load, all while the rest of the database sits idle. The operator watching the dashboard sees one shard pinned by a single row, which is the least fixable shape of overload there is, because no amount of added hardware splits a row.

The standard fix is to spread and batch. Sharded counters split one logical counter into N rows, each increment picks a shard at random, and the displayed value is the sum, so 16 shards turn 10,000 increments per second into about 625 per shard, inside what a row can take. A counting service goes further by absorbing increments in memory and flushing batched deltas every few hundred milliseconds, and the arithmetic is satisfying, since 10,000 likes per second flushed every 500 milliseconds across 16 shards becomes 32 database writes per second carrying delta values, roughly a 300-fold reduction in write traffic. The displayed count consequently lags by up to the flush interval, which is invisible in a number already past thousands. If a counting node dies between flushes its unflushed deltas are lost, and that too is recoverable, because the like rows themselves still land in the likes table and a background pass can recompute any counter from that ground truth. Like state for the viewing user, the question of whether I already liked this post, reads from the likes table keyed by user and never from the counter, since that answer has to be exactly right for the heart icon to behave.

Timeline delivery and the celebrity exception

Fanout on write means computing each user's home timeline at publish time by pushing the new post ID into every follower's cached timeline, and the sizing numbers say the average case is cheap, since 200 followers per post is 200 writes of a few dozen bytes each, roughly 93,000 cache appends per second platform-wide, which a sharded cache cluster absorbs without drama. Reads then become a single cache fetch, which is exactly the trade the 100-to-1 read ratio wants. The alternative, fanout on read, would assemble each timeline on demand by querying every followed account's recent posts, and at 46,000 reads per second against an average of 200 followed accounts that is over 9 million author-timeline lookups per second, which is the wrong side of the arithmetic by two orders of magnitude. Framing the comparison this way also foreshadows why the answer flips for celebrity authors, whose follower counts move the multiplication in the other direction.

Celebrities break the write-side plan because one post by an account with 100 million followers is 100 million cache writes, and at even a million appends per second that single post occupies the fanout tier for 100 seconds while ordinary posts queue behind it. The hybrid answer fans out for normal users but skips fanout for accounts above a follower threshold, on the order of one hundred thousand followers or more, and instead merges at read time. Under that scheme the timeline service pulls the user's precomputed timeline from cache, separately fetches recent posts from the handful of celebrities the user follows, where those author lists are few and intensely cacheable, and merges the two streams by post ID time order before hydrating. Each user follows few celebrities, so the merge adds a couple of cache reads per timeline load, and the system buys itself out of unbounded write amplification by paying a small constant read tax the read path can easily afford.

Viral moments create the mirror-image hot spot on the read side. A post everyone is hydrating lives on one post-cache shard, and 100,000 reads per second against one key can saturate that node's network interface before its CPU works hard at all. The mitigations are replication and salting. Hot keys get detected by a simple frequency counter and replicated to several cache nodes with clients picking one at random, or the key is salted, meaning copies are written under derived keys like p-99021#3 so reads spread across shards by construction, and because the post body is immutable the copies cannot drift apart. The same trick applies to the celebrity author lists during a news spike, which is exactly when both hot spots tend to arrive together.

A latency budget for the home timeline

The 200 millisecond target only means something once it is broken into parts. Roughly 50 milliseconds go to the network between phone and edge, which is bought from the CDN provider rather than engineered, and perhaps 5 milliseconds go to the gateway for authentication and routing. The timeline service then reads the cached ID list in a millisecond or two, fetches the followed-celebrity lists in a couple more, and merges the streams in microseconds because both are already sorted by time-ordered ID. Hydration is the expensive step, since 20 posts may live on several cache shards, so the service issues those lookups in parallel and pays the latency of the slowest shard rather than the sum, typically under 10 milliseconds with a longer tail when a shard is rehashing. Counter reads and response serialization add a few milliseconds more, which puts the server-side total near 25 milliseconds and leaves comfortable room under the budget for the bad days.

The budget earns its keep when something misses. A timeline cache miss, which mostly happens for dormant users whose entries were evicted, forces the slow path of reading the follow list and querying recent posts per author, and that can cost a few hundred milliseconds. The product answer is to serve whatever stale list survives and rebuild in the background, while the operational answer is to size eviction so misses concentrate on people who have not opened the app in weeks, for whom one slower load is a fair toll for returning. Watching the miss rate is therefore not just a cache health check but a direct preview of how many users will feel the slow path tomorrow.

Scaling, failures, and operations

Each tier scales on its own axis. The stateless services scale horizontally behind load balancers, and the post store shards by post ID and replicates each shard, with reads happily served from replicas because staleness of seconds is acceptable everywhere except the author's view of their own posts. The graph store shards by user ID so one user's follow list is one partition read, while the timeline cache shards by user ID across a cluster sized by the arithmetic above, where 200 million users with an 800-entry ID list at 8 bytes each comes to about 1.3 TB, which is 30 to 40 large cache nodes. The object store and CDN are bought rather than built, because durability and edge distribution are commodity problems with mature vendors. The fanout queue is the system's shock absorber, since a publish spike becomes queue depth rather than write failures, and the visible symptom is that posts take longer to appear in followers' timelines, which the product tolerates by design and operations watches as a single lag metric.

Failures map to degraded freshness rather than errors when the design is working. A dead fanout worker's jobs get re-delivered by the queue, and because appending a post ID to a timeline twice is repaired by de-duplication at read, at-least-once delivery is safe and nobody has to attempt the much harder exactly-once variant. A lost timeline cache node costs its users a rebuild, either lazily on next read by querying followed accounts, which is expensive but rare, or by replaying recent fanout jobs from the queue's retained history. A post store shard failover pauses writes to that ID range for the seconds a replica needs to promote, during which a few publishers see a retry spinner and nearly everyone else notices nothing. The metrics that matter operationally are fanout queue lag, cache hit rates, and the p99 of timeline assembly, and the worst real-world day is a global event when posting rate, like rate, and read rate all spike together, which is why every tier carries headroom rather than exact-fit capacity.

Follow-up questions

What happens when the original of a repost is deleted? The original row gets a tombstone, and every repost hydrates to an unavailable placeholder on its next render. Nothing chases repost rows or fanned-out entries synchronously, because hydration-time checks give the same result for the cost of one cached field, and the placeholder preserves thread context for anyone who replied before the deletion.
Why not store like_count directly on the posts row? A single row sustains maybe a thousand locked updates per second while viral posts attract ten thousand likes per second, so the hot row queues, times out, and drags retry storms in behind it. Sharded counters with batched flushes cut the write rate by orders of magnitude, with 10,000 per second becoming 32 batched writes per second in the worked example, and the likes table remains the recoverable ground truth.
How do you pick the celebrity threshold? The threshold falls out of equating the two costs, since fanout pays follower-count writes once per post while read-merge pays one extra fetch on every timeline load of every follower forever. Accounts around one hundred thousand followers are where fanout latency and queue occupancy stop being worth it, and the cutoff should be a tunable the operators can move rather than a constant in code.
How do timelines avoid unbounded memory? Cached timelines are capped at around 800 post IDs because almost nobody pages past that, and deeper history falls back to a slower query path against the post store via the follow list. The cap is also what makes the cache sizing arithmetic land at 1.3 TB instead of growing without bound.
What if a fanout worker dies mid-job? The queue re-delivers the job, and appends are idempotent in effect because duplicate IDs are dropped at read time. At-least-once delivery plus idempotent application is the standard recipe, and it costs far less than trying to build exactly-once delivery into the queue itself.
How do you survive a viral post hammering one cache shard? Detection comes first, by counting key frequencies in the cache clients, and the response is to replicate the hot key across several nodes or salt it into derived keys so reads spread by construction. The post body is immutable, so the copies cannot diverge in any way that matters.

References

Bronson et al., TAO: Facebook's Distributed Data Store for the Social Graph (USENIX ATC 2013).
Krikorian, Timelines at Scale (QCon talk, InfoQ), on Twitter's hybrid fanout.
Kleppmann, Designing Data-Intensive Applications (2017), on the Twitter home timeline example and partitioning.
Instagram Engineering, Sharding and IDs at Instagram, on time-ordered distributed IDs.
Xu, System Design Interview, Volume 1 (2020), chapter on news feed systems.