From TinyURL to Bitly (and beyond): designing a smarter URL shortener
Principles and practice of system design
One rainy afternoon, John, Peter, and Andy huddled around a cluttered desk, frustrated by the clunky URL shortener they were using. They needed to share a long web link for their new project, but the legacy shortener provided only a basic alias and almost no insight into who clicked it. "Why can’t we get more info? It’s 2025, and all we have is a click count!" Peter groaned. John added, "The old tools just shrink links – no smarts, no real analytics". Andy, the visionary of the trio, suddenly lit up with an idea. “Links aren’t just connectors - they’re real-time signals of attention”, he exclaimed, articulating their new vision. In that moment, the three inventors realized they could build something better: a smarter URL shortening system with built-in analytics to understand those “signals of attention” in real time.
They imagined a platform where every short link not only redirects users seamlessly, but also feeds into live metrics – showing which links are getting attention, when, and how. The turning point had arrived. Instead of tolerating limited features, John, Peter, and Andy decided to design their own URL shortener from the ground up. It would be fast, fun, and informative. With excitement brewing, they grabbed a whiteboard to start mapping out their idea. The journey from frustration to inspiration had begun, and next, they would translate their vision into a concrete plan using domain-driven design principles.
John, Peter, and Andy have set the stage: they want a URL shortener that’s more than just a link alias. Now they’ll dive into designing it step by step, starting with understanding the domain of URL shortening and link analytics.
To tackle the design, Andy suggests using an Event Storming workshop – a playful, visual way to map out the domain before writing any code. Armed with sticky notes (both physical and digital), the trio gathers around a whiteboard to brainstorm how their URL shortener should work. Event Storming encourages them to focus on domain events first – significant things that happen in their system – and to derive other elements like commands and actors from those events.1 It’s an approach that feels like storytelling: perfect for our inventors.
The scope 🎯
They start by clearly defining the scope of their domain (🎯). John says, "Let’s keep it focused: we’re building a core URL shortening service with analytics, not an entire social network or ad platform". In other words, their system will handle creating short links, redirecting users, and tracking clicks – nothing more. With the scope nailed down, they move on to identify the key happenings in this domain.
Modeling the domain
Domain events
🟠 Orange sticky notes
These are the noteworthy things that occur in the life of a link. The team jots down events in past-tense language (a convention in event modeling).
ShortLinkCreated – when a new short URL is generated for a long URL.
ShortLinkAccessed – when a user clicks a short URL (i.e. a redirect happens).
(Optional) ShortLinkExpired – if links can expire, this event would fire when a link passes its expiration date.
(Optional) AnalyticsUpdated – when click data for a link is processed (this might be a result of a batch job or on each click).
Andy explains that these domain events are the heart of their story – each event is a fact that something happened in the system. For example, ShortLinkCreated means “a new short link was generated” – it’s an event the system can record and react to. ShortLinkAccessed means “someone was redirected through the short link”, which is crucial for analytics. By listing events, they’re essentially writing down the story of how their system will be used.
Commands
🟢 Green sticky notes
Next, they consider what initiates those events. Commands are like instructions or actions in the system, often triggered by an actor (user or system).
CreateShortLink – a command when a user requests to shorten a URL. This will result (if successful) in a ShortLinkCreated event.
AccessShortLink – a command when a user (or browser) requests a redirect via a short URL. This would lead to a ShortLinkAccessed event (if the code exists).
RecordClick – a command (possibly internal) to log a click for analytics. This might be triggered automatically when ShortLinkAccessed occurs.
(If expiring links) ExpireLink – a command that marks a link as expired (maybe via a scheduled job or trigger when time is up), resulting in ShortLinkExpired event.
By identifying commands, they clarify what actions the system performs in response to user input or other triggers. For instance, CreateShortLink is invoked by a user inputting a URL to shorten; AccessShortLink is invoked when someone hits the short URL in their browser. Each command leads to one or more events (e.g., AccessShortLink → ShortLinkAccessed, plus maybe an Analytics event).
Actors
🔵 Blue sticky notes
Who or what triggers those commands? The team lists the actors – the people or external systems interacting with the shortener.
Link Creator – this is the user who shortens a URL (could be John, Peter, Andy, or any user of the service). They trigger the CreateShortLink command.
Link Visitor – anyone who clicks on a short link. In practice, it’s a web user or their browser triggering the AccessShortLink command by visiting the short URL.
System/Timer – the system itself might act as an actor for scheduled tasks, like expiring links or running analytics jobs (if any automated policy triggers commands).
For simplicity, John notes that the same person might be both creator and visitor at different times, but it’s useful to distinguish their roles. The actors clarify who initiates each part of the process: the creator drives the shortening, the visitor drives the redirect, and occasionally the system’s own background process might drive things like expiration.
Policies
🟡 Yellow sticky notes
Policies represent business rules or decisions that happen in response to events. They are like automatic if-then rules the system should enforce, often leading to new commands. The team discusses a few.
Unique Code Policy: Ensure that each short code is unique. If a generated code collides (unlikely with a good algorithm, but possible), the system should detect the conflict (a kind of hotspot concern) and perhaps generate a new code or fail the operation. This policy might trigger a retry of CreateShortLink with a different code strategy.
Link Expiration Policy: If they allow expiring links, when a ShortLinkExpired event occurs (or when the expiration time is reached), the system should prevent future redirects for that link. This could be implemented by marking the link inactive. A policy might automatically issue an ExpireLink command when the current time passes a link’s expiration timestamp.
Analytics Policy: Every time a ShortLinkAccessed event happens, a policy could dictate: “record this event to analytics”. In practice, this could mean publishing a RecordClick command to the analytics system whenever a redirect occurs. The team decides this will be done asynchronously so as not to slow down the redirect (more on that later).
Policies ensure the system’s business rules are consistently applied. Andy points out that policies often act as glue: an event in one part of the system triggers a command in the same or another part. For example, their analytics policy will listen for the ShortLinkAccessed event and then invoke the RecordClick action in the Analytics component.
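To make the “policy as glue” idea concrete, here is a minimal Python sketch (an illustration, not something from the trio’s whiteboard): a pure function that reacts to a ShortLinkAccessed event by producing a RecordClick command. The dataclass fields and names are assumptions chosen for the example.
# Minimal sketch of the Analytics Policy: ShortLinkAccessed event -> RecordClick command.
# Field names are illustrative assumptions.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class ShortLinkAccessed:      # domain event: a fact that happened
    short_code: str
    accessed_at: datetime

@dataclass(frozen=True)
class RecordClick:            # command handed to the Analytics context
    short_code: str
    clicked_at: datetime

def analytics_policy(event: ShortLinkAccessed) -> RecordClick:
    """Whenever a short link is accessed, record the click."""
    return RecordClick(short_code=event.short_code, clicked_at=event.accessed_at)

event = ShortLinkAccessed("abc123", datetime.now(timezone.utc))
print(analytics_policy(event))    # RecordClick(short_code='abc123', clicked_at=...)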
Aggregates / Entities
🟣 Purple sticky notes
Aggregates are clusters of related entities that handle commands and produce events. In simpler terms, an aggregate is like a conceptual object in the domain that has an internal state and logic. The team identifies the following.
ShortLink aggregate – representing the short link itself as a business entity. It holds data like the short code, original URL, creation date, expiration date, and perhaps a count of clicks. The ShortLink aggregate’s behavior: it can handle commands like CreateShortLink (assigning a code, saving the mapping) and maybe ExpireLink (mark itself expired). It produces events like ShortLinkCreated and ShortLinkExpired.
LinkAnalytics aggregate – representing the analytical info for a short link. This might be in a separate bounded context (see below). It could handle the RecordClick command and update click counts or other metrics, producing perhaps an AnalyticsUpdated event or updating a read model. In practice, LinkAnalytics might not be a single object but a process that tallies events for each link. For modeling, they treat it as an aggregate responsible for reacting to each ShortLinkAccessed event for a given link (e.g., incrementing a counter for that link).
They note that the ShortLink entity belongs to the core shortening domain, while LinkAnalytics belongs to the analytics domain. Each aggregate will ensure business rules within its boundary. For instance, the ShortLink aggregate can enforce the Unique Code Policy when creating a link (e.g., by checking the code’s uniqueness in storage or by using a generation strategy that guarantees uniqueness).
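To give a feel for what an aggregate might look like in code, here is a hedged Python sketch of the ShortLink aggregate handling CreateShortLink and ExpireLink and recording the events it produces. The field names and the dictionary-based event representation are assumptions for illustration, not a prescribed implementation.
# Illustrative ShortLink aggregate: handles commands, records the domain events it emits.
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

@dataclass
class ShortLink:
    short_code: str
    original_url: str
    created_at: datetime
    expires_at: Optional[datetime] = None
    expired: bool = False
    events: List[dict] = field(default_factory=list)   # events produced by this aggregate

    @classmethod
    def create(cls, short_code: str, original_url: str,
               expires_at: Optional[datetime] = None) -> "ShortLink":
        """Handle CreateShortLink: build the aggregate and emit ShortLinkCreated."""
        link = cls(short_code, original_url, datetime.now(timezone.utc), expires_at)
        link.events.append({"type": "ShortLinkCreated", "shortCode": short_code})
        return link

    def expire(self) -> None:
        """Handle ExpireLink: mark the link inactive and emit ShortLinkExpired."""
        if not self.expired:
            self.expired = True
            self.events.append({"type": "ShortLinkExpired", "shortCode": self.short_code})

link = ShortLink.create("abc123", "https://example.com/very/long/path")
link.expire()
print([e["type"] for e in link.events])    # ['ShortLinkCreated', 'ShortLinkExpired']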
Hotspots
🔴 Red sticky notes
As with any design, there are open questions and potential problem areas. The team tags these as red “hotspot” notes – things to watch out for or design carefully.
Scalability of Code Generation: Will generating unique short codes become a bottleneck? If they use a single counter or algorithm, can it handle many requests in parallel without collisions?2 They highlight the need for a robust key generation strategy (we’ll revisit this in Storage & Data Modeling).
High Read Traffic: A short URL might get extremely popular (imagine a viral link) and receive hundreds of hits per second. How will the system handle that load on the redirect path? This foreshadows caching and load balancing needs to meet latency NFRs.
Abuse and Security: URL shorteners can be misused to hide malicious links. The team doesn’t want to forget about security – e.g., they might need a way to scan or blacklist dangerous URLs (perhaps using an external service). This is a hotspot for later consideration so that “smart” doesn’t become “scam” – they add a note to incorporate safety measures (like not shortening known bad domains).
Data Privacy/GDPR: Since they plan to collect analytics, they need to consider what user data is collected and stored. This might be beyond a basic design, but Andy notes it as a point: if they log IP addresses or locations, privacy laws could apply. They agree to keep personally identifiable info out of their initial design (just aggregate counts).
Calling out these hotspots early helps them remember to address these points in the design. As Andy says, “Red notes mean no surprises later!”
Bounded contexts
🧩 Puzzle pieces
Finally, the team groups their domain into logical bounded contexts – essentially sub-domains that can be designed and understood independently. Given their discussions, two main contexts emerge.
Shortening Context (Core ShortLink Domain): This includes everything about creating and managing short URLs themselves. The ShortLink aggregate and related events (ShortLinkCreated, ShortLinkExpired) live here. It’s all about the link mapping and lifecycle.
Analytics Context (Link Analytics Domain): This includes tracking and reporting on link usage. The LinkAnalytics aggregate and event (ShortLinkAccessed) live here. This context is all about collecting click events and producing stats (click counts, etc.) for each link.
These two contexts will interact via the domain events and policies. For example, the Shortening context produces a ShortLinkAccessed event whenever a redirect happens; the Analytics context listens for that and updates the stats. By separating them, John, Peter, and Andy make it clear that they could even implement or scale them independently (perhaps as separate services, which they indeed plan to do). It also follows the single-responsibility principle: one context focuses on link management, the other on data crunching.
At the end of their Event Storming session, the whiteboard is a rainbow of sticky notes, but the story makes sense from end to end. They’ve mapped out how a user’s actions flow through the system as events and commands, and how different parts of the system will interact. Peter sketches a quick summary table to capture their domain model at a glance.
With the domain now clearly modeled, the trio is ready to think about how to meet various quality goals of their system.
Non-functional requirements
Non-functional requirements are also known as NFRs.
Before jumping into architecture, John insists they list the non-functional requirements – the crucial “-ilities” and qualities that their URL shortener must have. These NFRs will heavily influence their design decisions. Together, they come up with the following key NFRs (and note how each will shape the system).
Latency (Speed). Both the URL shortening and the redirection should be very fast – ideally just a few milliseconds for the backend processing. Users shouldn’t notice any delay when clicking a short link. This implies they need efficient lookups (likely in-memory caching for hot links) and a lightweight redirect service. Also, any heavy processing (like logging analytics) should be done asynchronously so it doesn’t slow down the critical path of redirecting a user. In summary: keep the hot path lean and speedy.3
Scalability. The system should handle growth in traffic gracefully. Today it might be a few thousand links, but what if it becomes as popular as Bitly? (For context, Bitly shortens about 600 million links per month.4) The design should scale out horizontally – e.g., multiple instances of services behind load balancers – to handle increasing load. Components like databases should be chosen for scalability (e.g., using a NoSQL store that can handle billions of key lookups). The read-heavy nature (many more redirects than creations) means special attention to scaling reads (caches, replicas) is needed.
High availability. Short links might be embedded in marketing campaigns, emails, or documentation all over the web – if our service goes down, all those links break. So the system must be highly available (the trio targets at least 99.9%, which still allows roughly 43 minutes of downtime per month). This will require redundancy at every level: multiple service instances (so one can fail and others carry on), replicated databases, perhaps even multi-datacenter deployment for disaster recovery. The design should avoid single points of failure. Peter quips, "If one server goes down, no one should notice".
Durability. Once a short URL is created, it should ideally work for years (unless intentionally expired). Data must be stored reliably so that even if servers restart or new versions are deployed, the mappings aren’t lost. Backups or replicas of the database help ensure no data loss. This also influences the choice of storage – they need a database that is stable and can preserve millions (or billions) of records over time without corruption.
Observability. Andy, having been burned by past projects with poor logging, insists on building observability in from day one. This means they need monitoring, logging, and tracing in their system. Each component should emit metrics (like how many redirects per second, how long lookups take, etc.), and they should have logs for key events (e.g. link created, user redirected) for debugging. If something goes wrong – say, redirects slow down or a spike of errors occurs – observability will help them pinpoint the issue. They note that using centralized logging and monitoring dashboards would be wise once the system is running in production.
Security. Though it’s a “toy” project for now, they cannot ignore security. They decide on a few basic rules: all communications should be over HTTPS (no plaintext http). They should guard against malicious inputs – e.g., validating that a long URL is a proper URL and possibly not on a blacklist of known bad domains (to mitigate phishing use-cases). Rate limiting may be needed to prevent someone from spamming the service with thousands of link creation requests or brute-forcing short codes. In analytics, they’ll avoid storing personal data (IP addresses could be considered personal in some jurisdictions) or at least provide a way to anonymize or not store it by default.
Maintainability and extensibility. Since this is a project they plan to learn from and possibly extend, the design should be modular. If tomorrow they want to add a new feature (say, QR code generation for each short link, or user accounts to track one’s links), the system’s structure should accommodate that without requiring a complete rewrite. This reinforces their decision to separate contexts and use a microservice-style approach – it’s easier to expand one piece (or add a new microservice) than to tinker with a single monolith handling everything.
After discussing these NFRs, the trio feels confident that these goals will guide their architecture. In fact, they start imagining components and how to meet these needs (e.g., a cache for latency, a cluster of servers for availability, a scalable DB for growth). Before moving on, they summarize the NFRs and their design implications.
With clear NFR goals, John, Peter, and Andy proceed to size the system and then sketch the architecture that fulfills both the functional story and these quality needs.
Back-of-the-envelope estimations
Before finalizing the design, Andy suggests doing some quick back-of-the-envelope calculations. “Let’s get a feel for the scale of the system – how many links, how much traffic, how much data storage we’re talking about”, he says. This will help validate if their chosen components can handle the load and highlight any potential bottlenecks. They make some assumptions for their scenario, keeping numbers simple but realistic:
Short links volume
Suppose their service becomes moderately popular and handles about 30 million new short links per month.5 That’s roughly 1 million links per day. It’s ambitious but within reach for a global service (for context, this is smaller than Bitly’s 600M/month, but still significant). Over one year, that would accumulate to ~365 million links created.
Redirect traffic
Typically, reading (redirecting) is far more frequent than writing (creating links). They assume each short link is accessed about 100 times on average (over its lifetime or within a certain period). That gives a read:write ratio of around 100:1 – a common assumption for URL shortener usage. With 1 million new links a day, that implies about 100 million redirect events per day system-wide (spread across all links). For the calculations, they assume an even traffic distribution to get averages, then apply a peak factor to account for uneven usage and spikes.
Requests per second (RPS)
Using the above numbers:
Writes (shorten requests): 1,000,000 per day ≈ 11.6 per second on average. Rounding up, ~12 writes/sec on average. If traffic peaks at, say, 10× typical load (morning news cycles or big events spur more link sharing), peak write rate could be ~120 writes/sec. This is easily handled by a few servers, but the database design needs to allow bursts of writes.
Reads (redirects): At 100:1 ratio, 100 million redirects per day is ~1,157 per second on average. Rounding, ~1,200 reads/sec average. Under 10× peak conditions, that’s ~12,000 redirects/sec at peak. 12k/sec is quite high – definitely need caching and a scalable datastore to handle that! It reinforces the earlier plan: most of these 12k lookups should be served from an in-memory cache (especially for the hottest links) to meet the latency and throughput goals. If 90% of requests hit cache, the database would only directly see ~1,200/sec, which is much more manageable.
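A quick script re-derives these traffic figures (the 100:1 ratio and the 10× peak factor are the assumptions stated above):
# Back-of-the-envelope traffic check, re-deriving the numbers above.
SECONDS_PER_DAY = 24 * 60 * 60        # 86,400
links_per_day = 1_000_000             # new short links per day
read_write_ratio = 100                # ~100 redirects per created link
peak_factor = 10                      # assumed peak-to-average multiplier

avg_writes = links_per_day / SECONDS_PER_DAY
avg_reads = avg_writes * read_write_ratio

print(f"avg writes/sec: {avg_writes:.1f}  (peak ~{avg_writes * peak_factor:.0f})")
print(f"avg reads/sec:  {avg_reads:.0f}   (peak ~{avg_reads * peak_factor:.0f})")
# avg writes/sec: 11.6  (peak ~116)
# avg reads/sec:  1157   (peak ~11574)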
Storage needs
What about data storage for all these links and their click records?
Link metadata storage: Each short link entry will store a short code and the long URL (plus some metadata like creation date, etc.). If we assume an average original URL length of ~100 characters (some URLs are longer, some shorter), plus, say, ~7 characters for the short code, and a few timestamps or counters, each record might be on the order of ~127 bytes (this matches an example calculation: 7 + 100 + 8 + 8 + 4 ≈ 127 bytes). At 365 million links/year, that’s roughly 46.4 GB per year of link data. Over 5 years, if growth is steady, it could be ~232 GB. If the system is successful and lives many years, this is significant but not astronomical – a modern distributed database can handle hundreds of gigabytes or more, especially when partitioned across nodes.
Click logs storage: Now, if they log every redirect (click) event for analytics (rather than just aggregating counts), the data is much larger. Each click log might contain a few pieces of info: at least a timestamp, the short link identifier, maybe the user’s IP or user-agent if they wanted to derive geo or platform stats. Let’s assume a minimal log entry of ~50 bytes (which is conservative; including more info could make it 100-200 bytes, but they’ll start simple). If there are ~100 million clicks per day (from the earlier assumption), that’s 100e6 * 50 bytes = 5e9 bytes = 5 GB per day of raw click data. Over a year, that’s ~1.8 TB – clearly a huge amount if we literally kept every click. In practice, they might not keep all raw events indefinitely; they could summarize or sample data after some time. But it’s clear the analytics part needs careful consideration for storage. Perhaps they’ll only keep detailed logs for a rolling window (say, 90 days), and maintain aggregated counts long-term. For the design, they note the storage for click events is a big factor for the Analytics context. They might use specialized storage (like a time-series database or big data warehouse) if full logs are retained. For now, they record this as a constraint and will propose an initial approach (e.g., store raw events for a period, and always update a counter in the ShortLink record or a summary table for quick lookup of total clicks).
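The storage arithmetic can be sanity-checked the same way (the record sizes are the assumptions stated above):
# Storage sanity check, using the per-record sizes assumed above.
GB, TB = 1e9, 1e12

link_record_bytes = 7 + 100 + 8 + 8 + 4     # code + URL + timestamps + counter = 127 B
links_per_year = 365_000_000
print(f"link data per year:  {links_per_year * link_record_bytes / GB:.1f} GB")        # ~46.4 GB

click_event_bytes = 50                      # minimal click log entry
clicks_per_day = 100_000_000
print(f"click logs per day:  {clicks_per_day * click_event_bytes / GB:.1f} GB")        # ~5.0 GB
print(f"click logs per year: {clicks_per_day * click_event_bytes * 365 / TB:.1f} TB")  # ~1.8 TB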
Code Space (Short Code Capacity): How many unique short codes are possible with their chosen format? They plan to use Base62 encoding (characters 0-9, A-Z, a-z) for short codes, as it’s widely used for URL shorteners. Each character in a code can have 62 possibilities. If they use a fixed length:
With 6 characters, total combinations = 62^6 ≈ 56.8 billion unique codes. That’s tens of billions – likely enough for many years of operation even at high volume. (For example, 1 billion codes used would be only ~1.7% of this space.)
With 7 characters, combinations = 62^7 ≈ 3.5 trillion unique codes (practically inexhaustible for this application).
They decide to start with 6-character codes by default, which gives a huge headroom. If the unlikely day comes that 56 billion links are not enough, they could move to 7-character codes (some systems do increase code length over time). For vanity or custom aliases, they might allow variable lengths anyway.
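The capacity numbers themselves take one line to verify:
# Verifying the short-code capacity figures.
for length in (6, 7):
    print(f"{length} chars: {62 ** length:,} unique codes")
# 6 chars: 56,800,235,584 unique codes      (~56.8 billion)
# 7 chars: 3,521,614,606,208 unique codes   (~3.5 trillion)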
To sum up these rough numbers, Peter prepares a table so they can reference it during design.
These estimates reassure the team that their design can work with the right choices. For instance, 12k requests/sec peak can be handled with a combination of load balancing, caching, and a scalable database (and it’s within what many modern systems handle). The storage numbers show that a few hundred GB per year for links is fine, but the analytics events are the bigger beast – which justifies having a separate context/service for analytics that could use big-data solutions if needed.
With scale estimates in mind, John, Peter, and Andy proceed to sketch the high-level architecture, ensuring it can meet the throughput and storage demands.
Architecture and components
Armed with domain knowledge and NFRs, the trio now designs the architecture of their system. They choose a modular, microservices-inspired architecture – each bounded context from earlier will be realized as a separate component (service), and supporting components (like databases and caches) will be included. They describe the architecture in terms of the C4 model’s Context and Container levels:
Context level
System Context
At the highest level, their URL shortening system sits between users and the Internet. Users (or applications) will interact with the system through a simple web interface or API. For example, a user sends a URL to be shortened, and the system returns a short link; when a user clicks a short link, the system redirects them to the original URL. The system also internally communicates with a data store and possibly external services (for things like link preview, but those are optional). In short, from an outsider’s perspective, the system is a black box that takes long URLs and gives short ones, and takes short URL hits and gives back redirects.
Container level
Services and storage inside the system
Within the system, they identify several components (containers) and the relationships between them. The main components are derived from the bounded contexts and responsibilities they identified.
ShortLink Service is responsible for creating short links. This corresponds to the core Shortening context. It exposes an API endpoint for users to submit a long URL (and possibly desired custom alias, etc.) and returns the generated short link. It handles generating a unique short code (coordinating with the storage or a key generator) and storing the mapping. It might also handle link management features like expiration (e.g., not allowing a link after a certain date). Essentially, all write operations (creating links, maybe deleting or expiring links) go through this service.
Redirect Service is responsible for handling incoming clicks on short URLs. This is a lightweight, performance-critical component. Its job is: given a short code (from an incoming HTTP request), quickly look up the corresponding long URL, and redirect the user there (HTTP 301 redirect). This service primarily performs read operations on the data (fetching the mapping), and must do so with low latency. It heavily utilizes the cache and database. After redirecting the user, it also triggers the logging of that event for analytics (likely by emitting a message or calling the Analytics service asynchronously). We can think of this as part of the Shortening context as well, but split out for scalability. In practice, ShortLink Service and Redirect Service might be deployed separately so each can scale according to its traffic profile (redirects being much higher QPS than creations). In some designs, they could even be one service that does both tasks (and simply internally route read vs write requests), but splitting them is cleaner for our needs.
Analytics Service – in the Analytics context, this service receives events about link accesses and updates usage statistics. It might consume messages like “Link X was clicked at time T” and then update databases that store analytics info. This could include incrementing click counters, logging the event to a datastore, computing aggregate stats like daily click counts, etc. It could also expose an API (for example, to query stats: “how many clicks did my link get?”). Importantly, the Analytics service runs mostly in the background – it doesn’t interfere with the user’s redirect flow. If it runs a few seconds behind real-time, that’s okay, as long as eventually the stats are updated. This service ensures the heavy lifting of analytics (which involves a large volume of writes, as every click turns into a write in an analytics store) is handled separately from the fast redirect logic.
Storage - Link Database is a database to store the core mapping of short codes to original URLs (and related metadata). This is critical persistent storage for the Shortening context. It will be a key-value style store: given a short code (key) we need to find the long URL (value). We also need to support inserting new mappings at high speed. The team leans towards a NoSQL solution here (like a distributed key-value store or wide-column DB) because it’s inherently a simple access pattern and needs to scale to high reads/writes. Cassandra or DynamoDB, for instance, are built for this kind of workload (billions of key-value lookups) and can scale horizontally. However, they note an SQL solution could work initially (like a single MySQL instance or cluster), but would require sharding and careful management at high scale. We’ll discuss this choice in the Storage section. This database will hold perhaps billions of records eventually, so it must be chosen and configured with that in mind.
Storage - Analytics Database is a separate storage for analytics data. This could be as simple as a table that keeps counts per link, or as elaborate as a big data pipeline into Hadoop or a time-series database for detailed logs. For the initial design, they plan to have a basic implementation: maybe a SQL table or a NoSQL store that accumulates counters (like each short code with a total click count, last accessed time, etc.) and possibly a log of recent access events for detailed info (e.g., storing the last N accesses or daily aggregates). This database can be optimized for write-heavy operations (since every click generates a write). They might use an append-only log model for events, or an aggregate model to just increment counters. The key point is to keep it separate from the main link mapping DB, to not overload the primary store with analytics writes.
Cache is an in-memory cache (like Redis) that stores popular short code mappings in memory for ultra-fast lookups. It matters most in the early phases, when a slower relational database such as PostgreSQL backs the system. Technology evolution: the cache is essential for the MVP on PostgreSQL (SQL lookups need acceleration) and becomes less critical after migrating to a NoSQL store like Cassandra or DynamoDB, which already provides fast key-value reads. Given that 20% of links might receive 80% of traffic, caching remains valuable for the hottest links even with fast NoSQL, but the performance gap is smaller. At production scale with NoSQL, the cache serves primarily as protection against traffic spikes rather than as a latency optimization.
Message Queue / Event Bus – this is a communication backbone between services for asynchronous processing. Specifically, when the Redirect Service registers a click, instead of synchronously calling the Analytics Service (which could slow down the redirect), it will push a message to a queue or publish an event (e.g., LinkAccessed event with details) to a message broker. The Analytics Service will consume from this queue. This decouples the two services – the redirect flow just drops a message and forgets it (very quick), and the analytics system processes it independently. Technology evolution: Start with Redis Pub/Sub for MVP simplicity, then migrate to RabbitMQ for reliable message delivery, and finally to Kafka for high-scale streaming (thousands of events per second) with persistence and partitioning capabilities. The team opts for a simple mental model: an "Analytics Events" queue that reliably delivers click events to Analytics Service. For production scale, they would leverage Kafka's partitioning and topic capabilities.
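To show how cheap the fire-and-forget step is, here is a hedged sketch of what the Redirect Service’s publish could look like in the MVP variant on Redis Pub/Sub, using the redis-py client. The channel name "analytics-events" and the payload fields are illustrative assumptions:
# Sketch of the fire-and-forget publish done by the Redirect Service (MVP on Redis Pub/Sub).
import json
from datetime import datetime, timezone
import redis   # redis-py client

r = redis.Redis(host="localhost", port=6379)

def publish_click_event(short_code: str, user_agent: str, referrer: str) -> None:
    event = {
        "eventType": "SHORT_LINK_ACCESSED",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "data": {"shortCode": short_code, "userAgent": user_agent, "referrer": referrer},
    }
    # Publishing takes microseconds; the Analytics Service subscribes to this channel.
    r.publish("analytics-events", json.dumps(event))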
Load Balancer is not a service per se, but they acknowledge that clients will connect to the system via a load balancer that distributes requests across multiple instances of the ShortLink Service or Redirect Service. This is standard for high availability and scaling. Users will just use a single hostname (like api.sho.rt for the API or sho.rt for redirects), and the load balancer will route each request to one of many backend servers.
Bringing it all together, they envision the following flow (mentally a kind of architecture diagram in words):
Shortening (Write path): A user hits the ShortLink Service (through an API endpoint) to shorten a URL. The service validates the input, then either uses an internal Key Generator or the database to get a unique short code. It then stores the mapping in the Link DB (and also updates the Cache with this new entry for fast access). It returns the short URL to the user. (We will detail this in the API section with a sequence).
Redirection (Read path): A user clicks a short URL (e.g., http://sho.rt/abc123). This goes to the Redirect Service. The Redirect Service extracts the code (abc123) and first checks the Cache: if the mapping is cached, great – it gets the long URL immediately. If not, it queries the Link DB for the code. Once it obtains the long URL, it responds to the user with an HTTP redirect to that URL. Meanwhile, it produces a message onto the Analytics Queue saying “abc123 was accessed at time X”. The user’s browser is now going to the long URL destination.
Analytics processing: The Analytics Service, listening on the queue, picks up the “abc123 accessed” message. It then updates the Analytics DB – for example, incrementing the click count for abc123, and perhaps logging the event (timestamp, etc.) if detailed logs are stored. This update is completely separate from the user’s request; even if the Analytics DB is a bit slow, it doesn’t affect the user experience. The analytics data is now available for queries or reports.
Andy grins as he realizes the power of their modular design. "Each of these components can be built and scaled independently. If suddenly traffic spikes and 12,000 RPS becomes 50,000 RPS, we can add more instances of Redirect Service and perhaps more cache servers to handle it, without necessarily changing the ShortLink Service".
"Exactly!" Peter adds, "Write traffic might not spike as much as read traffic. Conversely, if someone launches a marketing campaign that creates a million links in one hour, we can scale out the ShortLink Service and ensure the Key Generation and DB can handle the write burst, without affecting redirect serving".
John nods appreciatively, "It's like having specialized teams in a company - each service has one job and does it really well".
Andy connects the dots back to their earlier work: "Look how beautifully this maps to our domain contexts! The ShortLink and Redirect Services together cover the Shortening context, with Redirect Service focusing on query side and ShortLink Service on command side - it's almost like a natural CQRS split".
"And the Analytics Service is clearly the Analytics context", Peter observes. "The events and messages that flow between them - like the LinkAccessed event - are exactly what we identified in domain modeling".
John realizes the bigger picture: "This is essentially an event-driven microservices architecture built on the DDD foundation we laid. By starting with events and DDD, we naturally arrived at a decoupled design where those events are now facilitating communication between services".
Finally, Peter draws up a table of the major components and their roles, to ensure nothing is forgotten.
Now that the architecture and responsibilities are set, the team can define how these components communicate and what APIs they expose.
Communication and APIs
With the components defined, John, Peter, and Andy discuss how these pieces talk to each other and to the outside world. They identify two primary interaction patterns: synchronous HTTP requests (for user-facing actions) and asynchronous messaging (for internal events like analytics). They also outline the key APIs each service will expose.
Inter-service communication: synchronous versus asynchronous
The ShortLink Service and Redirect Service will be accessed by users (or external clients) via synchronous HTTP requests. For example, a user’s browser or frontend app calls the ShortLink Service’s API to shorten a URL and waits for the result; similarly, when a user clicks a link, their browser makes an HTTP GET to the short URL and the Redirect Service synchronously returns a redirect response. These are real-time interactions.
Between Redirect Service and Analytics Service, they prefer asynchronous communication. As decided, the Redirect Service will not wait for the Analytics Service when logging a click. Instead, it will put a message on a queue (fire-and-forget). The Analytics Service will pick it up independently. This way, a hiccup or slowdown in analytics does not affect the user-facing redirect. The message queue acts as a buffer and decoupling point.
There might also be a case for asynchronous work within the ShortLink Service: e.g., if they implement a feature to check the long URL’s health (maybe scan for malware or fetch a preview), they could do that in the background. But for now, they assume the shorten API will just store the link. Any heavy lifting (like maybe generating a QR code image) could also be offloaded asynchronously if needed.
The team sketches two sequence diagrams in words to clarify the flows.
URL shortening flow
This is the flow when someone wants to shorten a new URL via the ShortLink Service API.
Request: The client (could be a web frontend or command-line call) sends an HTTP POST request to the ShortLink Service, e.g. POST /api/shorten with a JSON body containing the long URL (and maybe optional parameters like a custom alias or expiry time).
Validate & Generate: The ShortLink Service receives the request. Peter narrates: “Our service first validates the URL format (to ensure it’s a proper URL)”. Then it either generates a new short code or reserves one. How? In a simple design, it could use an auto-increment ID from the database or a separate ID generator. For example, perhaps the service has an internal counter or gets the next sequence number from the DB and encodes it in Base62. Alternatively, it might generate a random 6-character string and check the database for collisions (and retry if a collision occurs). They lean towards the sequential ID approach for its uniqueness guarantee and simplicity. Let’s say it generates code abc123.
Store Mapping: The service then stores the new mapping (abc123 -> longURL) in the Link Database. This might be a simple INSERT into a table or a put to a NoSQL store. If the DB confirms success, we have a new short link. (If there’s a conflict on the code – extremely unlikely with sequential IDs – it would retry with a new code.)
Update Cache: The ShortLink Service can also update the Cache: since this is a newly created hot link (the user will likely use it soon), caching it now saves a DB lookup on first use. It can set cache[abc123] = longURL. This step is optional but helpful for performance.
Respond: The ShortLink Service returns an HTTP 201 Created response with the short link info. For example, it might return JSON like { "shortCode": "abc123", "shortUrl": "https://sho.rt/abc123" }. The user/app can now use this shortUrl as they please.
At this point, John holds up the imaginary short URL and smiles: they have successfully handled the write path.
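Condensed into code, the write path might look roughly like the Python sketch below, with dictionaries standing in for the Link DB and the cache and a local counter standing in for the ID generator; every name here is illustrative:
# The write path in miniature: validate, generate a code, store, cache, respond.
import itertools
from urllib.parse import urlparse

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
link_db, cache = {}, {}                       # stand-ins for the Link DB and Redis
id_counter = itertools.count(1)               # stand-in for a DB sequence / Redis counter

def to_base62(n: int) -> str:
    code = ""
    while n:
        n, rem = divmod(n, 62)
        code = ALPHABET[rem] + code
    return code or "0"

def shorten(original_url: str) -> dict:
    parsed = urlparse(original_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("invalid URL")                     # -> 400 Bad Request
    code = to_base62(next(id_counter))                      # sequential ID -> Base62 code
    link_db[code] = original_url                            # store the mapping
    cache[code] = original_url                              # warm the cache for first use
    return {"shortCode": code, "shortUrl": f"https://sho.rt/{code}"}   # -> 201 Created

print(shorten("https://www.example.com/very/long/path"))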
URL redirect flow
This is the user-facing part when someone clicks on a short link.
Request: The user clicks https://sho.rt/abc123 (for example). This results in a browser hitting the Redirect Service. In terms of HTTP, it's a GET request to the path /abc123 on the short domain. No payload, just the code in the URL.
Lookup (Cache first): The Redirect Service processes the request. It extracts the code abc123 from the URL. First, it checks the Cache (since an in-memory lookup is fastest). If abc123 is in cache, it gets the long URL immediately. Let's say it finds that abc123 maps to http://example.com/big/page?foo=bar. If the code was not in cache (cache miss), the service will query the Link Database: e.g., SELECT or get the record for abc123. It retrieves the long URL from the DB. (If the DB doesn't have it, that means an unknown code – the service would then return a 404 Not Found error. But assume it's valid.) After a cache miss and DB hit, the service should also populate the cache for next time: store abc123 -> longURL in cache with an appropriate TTL (maybe a few hours or days, or if links don't expire, it could cache longer).
Redirect Response: Now with the original URL in hand, the Redirect Service sends back an HTTP redirect. Specifically, it returns a 301 Moved Permanently (or 302 Found, but 301 is typical for permanent short links) with the Location header set to the long URL. The user's browser immediately receives this and then makes a new request to that long URL. From the user's perspective, they clicked a link and got taken to the final destination – it's almost instant.
Emit Event (Fire-and-forget): Simultaneously with the redirect response, the Redirect Service asynchronously emits an event or message to record this action. It places a message into the Analytics Queue with details like { shortCode: "abc123", timestamp: now, userAgent: "...", referrer: "..." }. This step is very fast (queuing a message takes microseconds) and doesn't delay the response to the user.
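The read path can be sketched the same way: cache-aside lookup, database fallback, a 301 response, and a queued event. Dictionaries and a list stand in for the cache, the Link DB, and the message broker; this is an illustration under those assumptions, not the final service:
# The read path in miniature: cache first, DB fallback, 301 plus a queued click event.
from datetime import datetime, timezone

cache = {}                                    # hot code -> long URL
link_db = {"abc123": "http://example.com/big/page?foo=bar"}
analytics_queue = []                          # stand-in for the message broker

def handle_redirect(short_code: str, user_agent: str = "", referrer: str = ""):
    long_url = cache.get(short_code)          # 1. check the cache
    if long_url is None:
        long_url = link_db.get(short_code)    # 2. fall back to the Link DB
        if long_url is None:
            return 404, None                  # unknown code -> Not Found
        cache[short_code] = long_url          # populate the cache for next time
    analytics_queue.append({                  # 3. fire-and-forget click event
        "shortCode": short_code,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "userAgent": user_agent,
        "referrer": referrer,
    })
    return 301, long_url                      # 4. redirect with Location = long_url

print(handle_redirect("abc123"))              # (301, 'http://example.com/big/page?foo=bar')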
Analytics processing flow
This happens behind the scenes, completely decoupled from the user redirect.
Event Consumption: The Analytics Service, which is listening on the queue, picks up the message for abc123. This happens independently of the redirect flow – it could be milliseconds later or even minutes if the system is under load.
Data Storage: The Analytics Service processes the event and stores it in the Analytics DB (Cassandra). This might include:
Inserting a new event record with full details (timestamp, referrer, user agent, etc.)
Updating pre-computed aggregates (daily click counts, popular referrers, etc.)
Maintaining time-series data for analytics dashboards
Optional Analytics Queries: Later, when a user or owner of the link requests analytics (say via an API or frontend), the Analytics Service would read from the Analytics DB. For example, GET /api/links/abc123/stats might return {"shortCode":"abc123","clickCount":42,"lastAccess":"2025-07-26T09:40:00Z"}, etc. This is separate from both the redirect flow and the event processing.
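On the consuming side, a hedged sketch of the Analytics Service: it drains click events and maintains simple aggregates. A dict stands in for the Analytics DB; in production this loop would consume from the broker and write to Cassandra:
# Sketch of the Analytics Service consumer: drain click events, keep simple aggregates.
from collections import defaultdict

analytics_queue = [
    {"shortCode": "abc123", "timestamp": "2025-07-26T09:45:00Z", "referrer": "twitter.com"},
    {"shortCode": "abc123", "timestamp": "2025-07-26T09:46:10Z", "referrer": "direct"},
]
click_counts = defaultdict(int)                          # per-code totals
referrer_counts = defaultdict(lambda: defaultdict(int))  # per-code referrer breakdown

def process_events(queue):
    while queue:
        event = queue.pop(0)                  # in production: consume from the broker
        code = event["shortCode"]
        click_counts[code] += 1               # update pre-computed aggregates
        referrer_counts[code][event.get("referrer", "direct")] += 1
        # a raw event row would also be inserted into the ClickEvents table here

process_events(analytics_queue)
print(dict(click_counts))                     # {'abc123': 2}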
These flows illustrate a system where components talk efficiently: shortlink creation and redirection are synchronous to serve the user quickly, while analytics and other heavy tasks are async.
Now, they define the core APIs/endpoints for the system. The external APIs (what a user or client uses) are crucial to design clearly. To bridge the gap between design and implementation, they include concrete examples of what the HTTP requests and responses would look like.
API specifications with examples
Create a new short URL
POST /api/shorten
Request example
POST /api/shorten HTTP/1.1
Host: api.sho.rt
Content-Type: application/json
Accept: application/json
{
"originalUrl": "https://www.example.com/very/long/path/to/some/resource?param1=value1¶m2=value2&utm_source=newsletter",
"customCode": null,
"expiresAt": null
}
Successful response
HTTP/1.1 201 Created
Content-Type: application/json
Location: https://sho.rt/abc123
{
"success": true,
"data": {
"shortCode": "abc123",
"shortUrl": "https://sho.rt/abc123",
"originalUrl": "https://www.example.com/very/long/path/to/some/resource?param1=value1¶m2=value2&utm_source=newsletter",
"createdAt": "2025-07-26T09:15:30.123Z",
"expiresAt": null
},
"meta": {
"requestId": "req_789xyz",
"timestamp": "2025-07-26T09:15:30.123Z"
}
}
Error response examples
# Invalid URL format
HTTP/1.1 400 Bad Request
Content-Type: application/json
{
"success": false,
"error": {
"code": "INVALID_URL",
"message": "The provided URL is not valid",
"details": "URL must start with http:// or https://"
},
"meta": {
"requestId": "req_456def",
"timestamp": "2025-07-26T09:16:45.789Z"
}
}
# Custom code already taken
HTTP/1.1 409 Conflict
Content-Type: application/json
{
"success": false,
"error": {
"code": "CODE_ALREADY_EXISTS",
"message": "The custom code 'my-link' is already taken",
"details": "Please choose a different custom code"
}
}
Redirect to original URL
GET /{shortCode}
Request example
GET /abc123 HTTP/1.1
Host: sho.rt
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
Successful response (301 Redirect)
HTTP/1.1 301 Moved Permanently
Location: https://www.example.com/very/long/path/to/some/resource?param1=value1&param2=value2&utm_source=newsletter
Cache-Control: public, max-age=300
X-Short-Code: abc123
X-Redirect-Count: 1
<!DOCTYPE html>
<html>
<head>
<title>Redirecting...</title>
</head>
<body>
<p>If you are not redirected automatically, <a href="https://www.example.com/very/long/path/to/some/resource?param1=value1&param2=value2&utm_source=newsletter">click here</a>.</p>
</body>
</html>
Error response (404 Not Found)
HTTP/1.1 404 Not Found
Content-Type: application/json
{
"success": false,
"error": {
"code": "LINK_NOT_FOUND",
"message": "The short link 'xyz789' was not found",
"details": "The link may have expired or never existed"
}
}
Retrieve analytics
GET /api/links/{code}/stats
Request example
GET /api/links/abc123/stats HTTP/1.1
Host: api.sho.rt
Accept: application/json
Authorization: Bearer optional-api-key-for-future
Successful response (200 OK)
HTTP/1.1 200 OK
Content-Type: application/json
{
"success": true,
"data": {
"shortCode": "abc123",
"shortUrl": "https://sho.rt/abc123",
"originalUrl": "https://www.example.com/very/long/path/to/some/resource?param1=value1¶m2=value2&utm_source=newsletter",
"createdAt": "2025-07-26T09:15:30.123Z",
"expiresAt": null,
"stats": {
"totalClicks": 847,
"uniqueClicks": 623,
"lastClickAt": "2025-07-26T14:22:15.456Z",
"clicksByDay": [
{
"date": "2025-07-26",
"clicks": 45
},
{
"date": "2025-07-25",
"clicks": 128
}
],
"topReferrers": [
{
"referrer": "twitter.com",
"clicks": 234
},
{
"referrer": "linkedin.com",
"clicks": 156
},
{
"referrer": "direct",
"clicks": 89
}
],
"platformBreakdown": {
"mobile": 456,
"desktop": 321,
"tablet": 70
}
}
},
"meta": {
"requestId": "req_analytics_123",
"timestamp": "2025-07-26T15:30:45.789Z"
}
}
Internal event messages
The team also documents the internal message format for analytics events.
Analytics Event Message
{
"eventType": "SHORT_LINK_ACCESSED",
"eventId": "evt_abc123_1690364567890",
"timestamp": "2025-07-26T14:22:15.456Z",
"data": {
"shortCode": "abc123",
"userAgent": "Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like Mac OS X) AppleWebKit/605.1.15",
"referrer": "https://witter.com/someuser/status/123456789",
"ipAddress": "192.168.1.100", // optionally hashed for privacy
"country": "US", // derived from IP, optional
"platform": "mobile", // derived from user agent
"timestamp": "2025-07-26T14:22:15.456Z"
},
"metadata": {
"serviceVersion": "1.2.3",
"requestId": "req_redirect_456"
}
}
John notes that these concrete examples make it much easier to understand what the actual implementation would look like. "Now when we start coding, we won't have to guess what the JSON structure should be", he says. Peter adds that the error responses show they've thought about edge cases, while Andy appreciates that the analytics response demonstrates the rich data they can provide to users.
They consider if there should be an API to delete or expire a link manually, or to list a user’s created links. Those features might require authentication and a user account concept, which is outside their current scope (they assumed no accounts). So they set those aside. For now, the above three endpoints cover the main use cases: create a link, use a link, and view link stats.
Additionally, internal communications are not exposed as public API, but for completeness:
The Redirect Service will have access to read from the Link DB (via an internal DAO or perhaps an internal service call if one chose that route – but here probably direct DB/cache access).
The Analytics Service consumes messages from the queue rather than an HTTP API from Redirect Service.
The ShortLink Service might call a key-generation module or DB function internally.
They compile a quick summary of the main endpoints.
Storage and data modeling
When it comes to storing data for their system, John, Peter, and Andy carefully consider what technologies to use and how to structure the data. The system has two main sets of data to worry about: the Short Links (mappings) and the Click/Analytics data. They discuss the pros and cons of using a SQL relational database vs a NoSQL database for each, and design a basic schema for their needs.
Choosing SQL vs NoSQL
For the short link mapping, the access pattern is simple: given a short code (exact match), find the long URL (and metadata), or insert a new mapping. This is essentially a key-value lookup. There’s no need for complex relational joins or transactions across multiple tables (each query is typically just one key lookup or insert). A relational DB can handle this kind of workload, but it is a natural fit for a NoSQL key-value or document store, which scales horizontally with ease.
A NoSQL (key-value store) like Amazon DynamoDB or Apache Cassandra is very suitable here. It can handle a massive number of reads and writes with low latency, by distributing data across many nodes. Each short code lookup is independent, so it partitions well (e.g., by hash of the code). NoSQL also often has a flexible schema, but here the data items are quite uniform.
A SQL database (e.g. MySQL, PostgreSQL) would provide strong consistency easily and the convenience of ACID transactions (like ensuring no duplicate short code on insert via a primary key constraint). However, at huge scale, a single SQL instance could become a bottleneck – you’d need to shard it (split data across multiple DB servers by code ranges or something) which adds complexity. Also, the size of data (potentially billions of rows) might outgrow a single instance’s capacity or memory for indexing. That said, an advantage of SQL is simplicity of queries (if someday they needed a join or more complex query, it can be easier).
They recall guidance: since the data isn’t strongly relational (it’s basically one big table of mappings), a NoSQL solution is generally favored for scalability and high throughput. Many real URL shorteners at scale have used NoSQL or bespoke solutions for the core mapping. For example, Bitly initially used sharded MySQL, while newer systems often reach for NoSQL for ease of scale.
For analytics data, Andy emphasizes their core value proposition: "Remember, we want to provide rich analytics - that's what differentiates us from basic shorteners". The team decides on a comprehensive approach.
Analytics architecture decision
They choose detailed event storage to support the rich analytics features shown in their API examples (referrer tracking, platform breakdown, time-series data).
Implementation Strategy
Separate Analytics Database: A dedicated NoSQL store (Cassandra) optimized for high-volume writes and time-series analytics workloads
Event-first approach: Store every click as a detailed event with timestamp, referrer, user agent, etc.
Real-time aggregation: The Analytics Service processes events and maintains both:
Raw event logs for detailed queries
Pre-computed aggregates for fast dashboard queries
Peter points out, "This gives us maximum flexibility - we can answer any analytics question our users might have, like 'Show me clicks from Twitter vs LinkedIn over the past week'".
Storage decisions finalized
Link Database evolution: PostgreSQL (MVP with heavy Redis caching) → Cassandra/DynamoDB (production with minimal caching needed)
Analytics Database (Cassandra) for event storage and aggregated statistics - optimized for high-volume writes, time-series data, and analytics workloads
Caching strategy: Critical with PostgreSQL (slow SQL), less critical with NoSQL (fast key-value lookups)
Message Queue (Redis Pub/Sub → RabbitMQ → Kafka evolution) serves as the event transport; once they reach Kafka, it also doubles as a retained event log that can be replayed for reprocessing if needed
They also note: if strong consistency on writes is needed (to ensure two users can’t create the same custom alias simultaneously, for instance), a relational DB or at least a consistent hashing mechanism might be needed. But if they use a single key generation approach, collisions are unlikely. Cassandra is eventually consistent by default, but can be tuned (quorum writes etc.) – for something like a unique short code, they would make it the primary key and rely on the DB to reject duplicates (in Cassandra a plain insert is an upsert, so enforcing this requires a lightweight transaction, i.e. INSERT ... IF NOT EXISTS). DynamoDB similarly needs a conditional write (attribute_not_exists on the key) to guarantee uniqueness.
Next, they outline a simple schema / data model for the main entities.
ShortLink (Mapping) Entity
This represents one shortened link record. In a SQL schema, it could be a table ShortLinks:
ShortLinks(
short_code VARCHAR(10) PRIMARY KEY, -- "abc123" etc. primary key ensures uniqueness
original_url TEXT NOT NULL, -- the long URL
created_at DATETIME NOT NULL,
expires_at DATETIME NULL, -- optional expiration date
click_count BIGINT DEFAULT 0, -- number of times accessed (for quick stats)
created_by VARCHAR(50) NULL -- if we had user info; not used now
)
In a NoSQL context, if using Cassandra, they might define a table like:
CREATE TABLE ShortLinks (
short_code text PRIMARY KEY,
original_url text,
created_at timestamp,
expires_at timestamp,
click_count counter
);
Though Cassandra’s counter would be separate in reality; they might just update a regular column for simplicity.
The primary key being the short_code means lookups by code are very fast. This table can be partitioned by short_code. (Codes derived from sequential IDs and Base62-encoded are not in strictly numeric order, but new codes do share similar prefixes; the team might shuffle or salt the codes a bit to avoid write hotspots, but for now they assume an even distribution.)
They decide not to store user-identifying info since there are no accounts. If later they add accounts, a user_id could be added to link records to know ownership.
Click Event Entity
Since they chose detailed event storage for rich analytics, John sketches out the event schema. Cassandra is the preferred choice for analytics due to its superior write performance and time-series optimization.
CREATE TABLE ClickEvents (
short_code text,
event_date text, -- "2025-07-26" for partitioning
event_time timestamp,
referrer text,
user_agent text,
ip_hash text, -- hashed IP for privacy
country text,
platform text,
browser text,
session_id text,
PRIMARY KEY ((short_code, event_date), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);
This design partitions by short_code AND date to prevent hot partitions while keeping related events together for efficient queries.
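For example, a per-day query for one link stays on a single partition and returns events newest-first thanks to the clustering order; a query the Analytics Service might run (illustrative) could look like this:
-- Fetch one link's clicks for a given day, newest first (a single-partition read).
SELECT event_time, referrer, platform
FROM ClickEvents
WHERE short_code = 'abc123'
  AND event_date = '2025-07-26'
LIMIT 100;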
Pre-computed aggregates
The Analytics Service also maintains aggregated statistics for fast dashboard queries.
// Daily aggregates collection
{
shortCode: "abc123",
date: "2025-07-26",
totalClicks: 1247,
uniqueClicks: 892,
topReferrers: {
"twitter.com": 456,
"linkedin.com": 234,
"direct": 89
},
platformBreakdown: {
"mobile": 678,
"desktop": 321,
"tablet": 67
}
}
Andy notes, "This dual approach gives us the best of both worlds - detailed events for deep analysis and fast aggregates for real-time dashboards".
Key generation strategy
Peter brings up the critical hotspot they identified earlier: "How exactly are we going to generate unique short codes at scale?" The team discusses the approach carefully.
Andy notes, "We need something that's guaranteed to be unique and won't cause retries or collisions that slow us down". They settle on a sequential ID + Base62 encoding strategy:
Central ID Generator: Maintain an atomic counter that produces sequential integers (1, 2, 3, 4, ...)
Base62 Encoding: Convert each integer to Base62 (0-9, A-Z, a-z) to create short codes
ID 1 → "1"
ID 62 → "10"
ID 3844 → "100"
ID 238328 → "1000" (their example code "abc123" would correspond to an ID of roughly 33.5 billion)
Implementation details
Start with a Redis counter using INCR command (atomic and fast)
For higher scale: Distributed ID blocks - each ShortLink Service instance reserves ranges (e.g., Instance 1 gets 1-1000, Instance 2 gets 1001-2000)
This eliminates the central bottleneck while maintaining uniqueness
Can handle their estimated 120 writes/sec easily, with room to scale to thousands/sec
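A small Python sketch of this scheme, combining a Redis counter (INCR for one ID at a time, INCRBY to reserve a block per instance) with Base62 encoding via the redis-py client; the key name shortlink:id is an assumption:
# Sketch of sequential ID + Base62 code generation backed by a Redis counter.
import redis

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def to_base62(n: int) -> str:
    code = ""
    while n:
        n, rem = divmod(n, 62)
        code = ALPHABET[rem] + code
    return code or "0"

r = redis.Redis(host="localhost", port=6379)

def next_code() -> str:
    return to_base62(r.incr("shortlink:id"))      # atomic: one unique ID per call

def reserve_block(size: int = 1000) -> range:
    end = r.incrby("shortlink:id", size)          # reserve IDs [end-size+1, end] for this instance
    return range(end - size + 1, end + 1)

# to_base62 reproduces the examples above: 1 -> "1", 62 -> "10", 3844 -> "100", 238328 -> "1000".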
Why not alternatives
Random/UUID approach: Risk of collisions that require retries
Hash-based codes: Same long URL would always get same short code (may or may not be desired)
Pre-generated pools: Complex to implement and manage
John adds, "The sequential approach gives us predictable performance and zero collision risk. We can always optimize the distribution later if needed".
They note these as potential improvements if needed. For now, one instance can handle it.
Sharding and partitioning
They consider how to scale the database if it grows.
With Cassandra or DynamoDB, data is automatically partitioned by hashing the key, so records are spread across nodes by short_code. One caveat: Base62-encoded sequential IDs still increase lexicographically ("000001", "000002", ...), so freshly created codes share a prefix and could concentrate writes on a few partitions. Common tricks are to hash or salt the ID before deriving the code, to store records under a random shard prefix that never appears in the public code, or to split data into multiple tables by ID modulo. They decide not to go deeper here and assume the NoSQL store's hash partitioning spreads the load well enough.
If SQL were used, they might shard by ID range (e.g., IDs 1-100M on shard 1, 100M-200M on shard 2, and so on). That requires a way to route a given code to the right database, either by encoding the shard in the code itself or by maintaining a map of ranges.
They also consider read replicas (if using SQL) to offload redirect reads while keeping a single writer, but replicating a high write volume can be tricky; a multi-master NoSQL setup might simply be easier.
"That's the beauty of NoSQL", Andy points out. "Cassandra handles sharding automatically, so we don't have to become database administrators overnight". Peter nods in agreement, "Less operational complexity means more time focusing on features our users actually want".
Data modeling summary
They summarize the two main data entities as follows.
ShortLink Data: stored in ShortLinks store (NoSQL table). Keyed by short_code. Contains original_url and metadata (creation time, optional expiry, click count). This is used in both creation (insert) and redirect (get by key) flows. Example entry:
short_code = "abc123", original_url = "http://example.com/very/long/path", created_at = 2025-07-26 09:00:00, expires_at = null, click_count = 42
Click/Analytics Data: stored either as raw events or as aggregated counts. Initially, they will use the click_count in ShortLinks for total clicks. In addition, they might keep a LinkEvents store partitioned by short_code, where each entry is one event with a timestamp – a wide-column layout in which every event becomes a new row under a composite key. Alternatively, a separate LinkStats table could hold daily aggregates, e.g., for code "abc123" and date "2025-07-26", count = 10. They settle on one simple approach: a Clicks table where each row is one click (complete for the design, even though raw events would need roll-ups at high scale). This demonstrates capturing every event for analytics.
John quickly writes down an example of a Click event entry (stored in Cassandra):
-- INSERT into ClickEvents table:
INSERT INTO ClickEvents (
short_code, event_date, event_time,
user_agent, referrer, ip_hash, platform
) VALUES (
'abc123', '2025-07-26', '2025-07-26 09:45:00',
'Mozilla/5.0 ...', 'https://twitter.com/...',
'sha256_hash', 'mobile'
);
This shows how they could store extra info for analytics like user agent or referrer to see where clicks come from. They note storing user agent allows determining device type, browser, etc., but it’s a lot of data. This might be overkill for MVP, but it’s something they could do if offering rich analytics.
John, ever the organizer, suggests they capture their key design decisions in a table. "This way, when we're implementing next week and second-guessing ourselves, we can remember why we made these choices", he says with a grin.
Alternative design approaches
Having settled on their microservices + event-driven design, the team takes a moment to explore other valid architectural patterns. Andy emphasizes, "Our solution isn't the only way – understanding alternatives helps us appreciate our choices and prepares us for different requirements".
Event sourcing for analytics
What it is
Instead of storing just the current state (like a click count), event sourcing stores every single event that happened to an entity. For analytics, this means keeping an immutable log of every click event.
How it would work in our system
Traditional Approach:
ShortLinks table: {code: "abc123", click_count: 847}
Event Sourcing Approach:
Event Store: [
{type: "ShortLinkCreated", code: "abc123", url: "...", timestamp: "..."},
{type: "ShortLinkAccessed", code: "abc123", timestamp: "2025-07-26T10:00:00Z"},
{type: "ShortLinkAccessed", code: "abc123", timestamp: "2025-07-26T10:05:00Z"},
// ... 847 more access events
]
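Reading the current state then becomes a fold over the log. A minimal projection sketch, assuming the events are plain dicts shaped like the ones above:
from collections import Counter

def project_click_counts(events: list[dict]) -> Counter:
    """Rebuild current click counts by replaying the event log."""
    counts = Counter()
    for event in events:
        if event["type"] == "ShortLinkAccessed":
            counts[event["code"]] += 1
    return counts

# project_click_counts(event_store)["abc123"]  # -> 847
Answering "how many clicks did this link have last Tuesday?" is the same fold with a timestamp filter.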
Benefits
Complete audit trail: You can see exactly what happened and when
Time travel: Rebuild state at any point in history ("How many clicks did this link have last Tuesday?")
Flexible analytics: Derive new metrics without losing historical data
Debugging: Replay events to understand system behavior
Drawbacks
Storage overhead: Every event is stored forever (or with retention policies)
Complexity: Reading current state requires event replay or maintaining projections
Performance: Queries might be slower if not properly optimized
When to consider
If rich analytics, compliance/auditing, or debugging capabilities are crucial.
CQRS
Command Query Responsibility Segregation
What it is
Separate the models for writing data (commands) from reading data (queries). Our current design has hints of CQRS already.
How it applies
Write Side (Commands):
- ShortLink Service handles "CreateShortLink" command
- Optimized for consistency, validation, business rules
Read Side (Queries):
- Redirect Service handles "GetOriginalUrl" query
- Optimized for speed, caching, read replicas
- Analytics Service provides "GetLinkStats" query
Full CQRS implementation might look like
Write database: Normalized, transactional (PostgreSQL)
Read database: Denormalized, fast lookups (Redis, DynamoDB)
Synchronization: Events or CDC (Change Data Capture) keep read side updated
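To make the split concrete, here is a toy sketch with in-memory dicts standing in for the two stores (PostgreSQL on the write side, Redis/DynamoDB on the read side); all names here are illustrative:
write_db: dict[str, dict] = {}  # normalized "write side" (would be PostgreSQL)
read_db: dict[str, str] = {}    # denormalized "read side" (would be Redis/DynamoDB)

def handle_create_short_link(code: str, url: str) -> dict:
    """Command: validate and persist, then emit an event for the read side."""
    if code in write_db:
        raise ValueError("code already exists")
    write_db[code] = {"original_url": url, "click_count": 0}
    return {"type": "ShortLinkCreated", "code": code, "url": url}

def project(event: dict) -> None:
    """Projection: keep the read model in sync (eventually consistent)."""
    if event["type"] == "ShortLinkCreated":
        read_db[event["code"]] = event["url"]

def get_original_url(code: str) -> str | None:
    """Query: served entirely from the fast read model."""
    return read_db.get(code)
In a real system the projection would be driven by the event bus or a CDC stream rather than called inline.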
Benefits
Optimized for purpose: Write side for consistency, read side for performance
Independent scaling: Scale reads and writes separately
Different technologies: Use best tool for each job
Drawbacks
Eventual consistency: Read side might lag behind writes
Complexity: Maintaining sync between two data models
Operational overhead: More databases to manage
When to consider
When read/write patterns are very different or when you need extreme performance on both sides.
Monolithic architecture
What it is
Instead of microservices, build everything as a single application.
How it would look
Single URL Shortener Application:
├── Controllers/
│   ├── ShortenController (handles POST /api/shorten)
│   ├── RedirectController (handles GET /{code})
│   └── StatsController (handles GET /api/stats/{code})
├── Services/
│   ├── LinkService (business logic)
│   └── AnalyticsService (click tracking)
├── Data/
│   └── Database (single DB with all tables)
└── Background Jobs/
    └── AnalyticsProcessor (async analytics)
Benefits
Simplicity: Easier to develop, test, and deploy initially
Performance: No network calls between services
Transactions: Easy ACID transactions across all data
Debugging: Simpler to trace requests through the system
Drawbacks
Scaling limitations: Must scale the entire application, not individual parts
Technology constraints: Entire app uses same tech stack
Team coordination: Harder for multiple teams to work independently
Deployment risk: Deploying any change affects the entire system
When to consider
Early stages, small teams, or when simplicity trumps scalability concerns.
Serverless architecture
What it is
Use cloud functions/lambdas instead of always-running services.
AWS implementation example
- API Gateway → Lambda Function (shorten URL)
- CloudFront → Lambda@Edge (redirect service)
- DynamoDB (link storage)
- Kinesis → Lambda (analytics processing)
- S3 + Athena (analytics querying)
Benefits
Auto-scaling: Functions scale automatically with demand
Cost-effective: Pay only for actual usage
No server management: Cloud provider handles infrastructure
Built-in availability: Managed services handle failover
Drawbacks
Cold start latency: Functions may take time to initialize
Vendor lock-in: Tied to cloud provider's services
Limited control: Less control over performance tuning
Complex monitoring: Debugging distributed functions can be challenging
When to consider
Variable traffic patterns, small teams, or cloud-native organizations.
Hybrid approaches
The trio realizes that these patterns can be combined.
Event Sourcing + CQRS. Store all events in an event store, then project them into optimized read models for different queries.
Modular Monolith. Start with a monolith but organized into clear modules that could later be extracted into microservices.
Serverless + Traditional. Use serverless for analytics processing (variable load) but traditional services for redirects (consistent low latency needs).
Design pattern trade-offs summary
Peter creates a comparison table.
Key insight: There's no universally "right" architecture.
The best choice depends on:
Team size and expertise
Performance requirements
Consistency needs
Operational capabilities
Business constraints
Andy concludes, "We chose microservices because we wanted to learn about distributed systems and planned for different scaling needs. But a well-built monolith could serve millions of URLs too!" This exploration helps them appreciate that their design decisions were conscious trade-offs, not absolute truths.
With alternative approaches considered, the team shares some hard-learned lessons before concluding.
Common pitfalls and lessons learned
As the team reflects on their design process, they recall various system design pitfalls they've seen in the past. Andy suggests documenting these as a "what could go wrong" guide to help future builders avoid common mistakes.
The hot partition problem
What could go wrong
Imagine a single short URL goes viral (like a breaking news link). Suddenly, that one link receives millions of hits while others get none. If your database partitions data by short code, one partition becomes overwhelmed while others sit idle.
Partition 1: abc123 (VIRAL LINK) → 100,000 requests/sec 🔥
Partition 2: def456 (normal link) → 10 requests/sec ✅
Partition 3: ghi789 (normal link) → 5 requests/sec ✅
Prevention strategies
Consistent hashing with virtual nodes to distribute load more evenly
Cache aggressively at multiple levels (CDN, application cache, database cache)
Circuit breakers to fail fast when a partition is overwhelmed
Rate limiting per short code to prevent complete system overwhelm
Monitoring to detect hot partitions early
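As one concrete guard from the list above, a per-short-code rate limit can be a fixed-window counter in Redis – a rough sketch (the key naming and limit are illustrative):
import time
import redis

r = redis.Redis(host="localhost", port=6379)

def allow_redirect(short_code: str, limit_per_second: int = 5000) -> bool:
    """Fixed-window rate limit per short code; sheds load once a link goes viral."""
    window = int(time.time())
    key = f"ratelimit:{short_code}:{window}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, 2)  # keep the key just past the 1-second window
    return count <= limit_per_second
Requests over the limit could be served a cached redirect or a lightweight "try again" response instead of hitting the database.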
Real-world lesson
"We learned this the hard way when our system went down during a major sports event because one link got shared by every sports broadcaster simultaneously". - Peter
The thundering herd problem
What could go wrong
Your cache expires for a popular link exactly when 10,000 users click it simultaneously. All 10,000 requests hit your database at once, potentially causing a cascade failure.
Timeline:
11:00:00 - Cache expires for popular link "abc123"
11:00:01 - 10,000 concurrent requests arrive
11:00:01 - All requests miss cache, hit database
11:00:02 - Database overwhelmed, response times spike
11:00:03 - More requests timeout and retry → even more load
11:00:04 - System crashes
Prevention strategies
Probabilistic cache refresh: Refresh cache before expiration with some randomness
Single-flight pattern: Only allow one request to regenerate cache while others wait
Circuit breaker pattern: Fail fast instead of overwhelming downstream systems
Backup cache layers: Use stale data from L2 cache if L1 cache and database fail
Code example of single-flight pattern
class SafeUrlLookup {
  constructor() {
    this.inflightRequests = new Map();
  }

  async getUrl(shortCode) {
    // Check if request already in progress
    if (this.inflightRequests.has(shortCode)) {
      return await this.inflightRequests.get(shortCode);
    }

    // Start new request
    const request = this.fetchFromDatabaseWithCache(shortCode);
    this.inflightRequests.set(shortCode, request);

    try {
      const result = await request;
      return result;
    } finally {
      this.inflightRequests.delete(shortCode);
    }
  }
}
Analytics data explosion
What could go wrong
You decide to log every piece of data about every click: full user agent, IP address, referrer, timestamp, browser resolution, etc. With millions of clicks per day, your analytics database becomes impossibly large and expensive.
Naive approach:
Click record: {
shortCode: "abc123",
timestamp: "2025-07-26T10:30:45.123Z",
userAgent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
ipAddress: "192.168.1.100",
referrer: "https://very-long-social-media-url.com/posts/123456789/shares/comments",
screenResolution: "1920x1080",
language: "en-US,en;q=0.9"
}
Result: 500+ bytes per click × 100M clicks/day = 50GB/day raw data!
Better approaches
Aggregate in real-time: Store daily/hourly summaries instead of raw events
Sampling: Only log detailed data for a percentage of requests
Data retention policies: Keep raw data for 30 days, aggregated data longer
Compress repetitive data: Use lookup tables for user agents, referrers
Smart aggregation example
{
"shortCode": "abc123",
"date": "2025-07-26",
"hourlyStats": {
"10": {"clicks": 1247, "uniqueIPs": 892},
"11": {"clicks": 2156, "uniqueIPs": 1534}
},
"topReferrers": {
"twitter.com": 5432,
"linkedin.com": 3210,
"direct": 1098
},
"platformBreakdown": {
"mobile": 6789,
"desktop": 2951
}
}
The unique ID generation bottleneck
What could go wrong
You use a single database auto-increment ID as your short code generator. As traffic grows, this single writer becomes the bottleneck for all new link creation.
All ShortLink Service instances → Single DB sequence → Bottleneck!
Instance 1 ──┐
Instance 2 ──┼── Database AUTO_INCREMENT ── 💥 (only ~1000 writes/sec max)
Instance 3 ──┘
Better approach (our chosen solution)
Distributed ID blocks with Redis: Each service instance reserves blocks of IDs from a central Redis counter
How it works: Instance 1 gets IDs 1-1000, Instance 2 gets 1001-2000, etc.
Benefits: No single point of failure, can handle thousands of writes/sec
Base62 encoding: Convert sequential IDs to compact short codes (e.g., ID 238328 → "1000")
Snowflake-style ID generation example
class DistributedIdGenerator {
  constructor(machineId, datacenterId) {
    this.machineId = machineId;       // 5 bits (0-31)
    this.datacenterId = datacenterId; // 5 bits (0-31)
    this.sequence = 0;                // 12-bit counter, resets each millisecond
    this.lastTimestamp = 0;
  }

  generateId() {
    let timestamp = Date.now();
    if (timestamp === this.lastTimestamp) {
      this.sequence = (this.sequence + 1) & 4095; // 12 bits
      if (this.sequence === 0) {
        // Sequence exhausted - wait for next millisecond
        while (timestamp <= this.lastTimestamp) {
          timestamp = Date.now();
        }
      }
    } else {
      this.sequence = 0;
    }
    this.lastTimestamp = timestamp;
    // Combine: 41-bit timestamp + 5-bit datacenter + 5-bit machine + 12-bit sequence.
    // BigInt is needed here: a millisecond timestamp overflows JavaScript's 32-bit bitwise operators.
    return (BigInt(timestamp) << 22n) |
      (BigInt(this.datacenterId) << 17n) |
      (BigInt(this.machineId) << 12n) |
      BigInt(this.sequence);
  }
}
Ignoring the cache invalidation problem
What could go wrong
You cache short link mappings but forget to handle updates. A user updates their link's destination URL, but cached redirects still go to the old URL for hours.
User updates abc123: google.com → facebook.com
Cache still has: abc123 → google.com
Result: Users get redirected to wrong destination! 😱
Solutions
Cache TTL: Reasonable expiration times (not too short for performance, not too long for consistency)
Cache invalidation: Actively remove entries when data changes
Versioning: Include version numbers in cache keys
Two-tier caching: Short TTL in application cache, longer TTL in CDN
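A minimal sketch of the invalidation path, assuming a hypothetical link-editing feature (updates are not in the MVP) and asyncpg/Redis clients like the ones used in the MVP code later on:
async def update_destination(conn, redis_client, short_code: str, new_url: str) -> None:
    """Write-through update: change the database first, then drop the stale cache entry."""
    await conn.execute(
        "UPDATE short_links SET original_url = $1 WHERE short_code = $2",
        new_url, short_code,
    )
    # Invalidate instead of overwriting: the next redirect repopulates the cache
    await redis_client.delete(short_code)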
Underestimating operational complexity
What could go wrong
You build a beautiful distributed system but realize you have no idea how to:
Debug a problem spanning 5 microservices
Deploy updates without downtime
Monitor system health holistically
Handle cascading failures
Essential operational practices:
Distributed tracing: Track requests across service boundaries
Centralized logging: All services log to a searchable central system
Health checks: Each service provides /health endpoints
Circuit breakers: Fail fast to prevent cascade failures
Gradual rollouts: Canary deployments and feature flags
Runbook documentation: Step-by-step guides for common issues
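The health-check item in the list above is the cheapest one to start with. A minimal sketch using FastAPI, the framework they later pick for the MVP (a real check would also ping the database and cache):
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
async def health():
    """Liveness/readiness probe used by the load balancer and orchestrator."""
    return {"status": "ok", "service": "redirect-service"}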
Security blindspots
What could go wrong
Open redirects: Attackers create short links to malicious sites
Rate limiting gaps: Attackers overwhelm your system with requests
Analytics data leaks: Storing IP addresses without considering GDPR
No abuse detection: System used to distribute malware or spam
Security checklist
✅ Validate destination URLs (block known malicious domains)
✅ Rate limit by IP address and user (if authenticated)
✅ Monitor for suspicious patterns (same IP creating many links)
✅ Implement HTTPS everywhere
✅ Hash or anonymize analytics data
✅ Provide abuse reporting mechanism
✅ Regular security audits and penetration testing
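The first checklist item can start as a simple guard in the shorten endpoint. A minimal sketch (the blocklist is a placeholder; a real system would consult a URL-reputation service):
from urllib.parse import urlparse

BLOCKED_DOMAINS = {"malware.example", "phishing.example"}  # placeholder blocklist

def is_allowed_destination(url: str) -> bool:
    """Reject non-HTTP(S) schemes and known-bad domains before creating a short link."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    return parsed.hostname not in BLOCKED_DOMAINS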
Key lessons summary
John wraps up their lessons:
Design for failure: Assume everything will break and plan accordingly
Start simple: Build complexity incrementally, not all at once
Monitor everything: You can't fix what you can't see
Test at scale: Performance issues emerge at scale that aren't visible in development
Security first: It's much harder to add security later than to build it in
Operational mindset: A system is only as good as your ability to run and maintain it
"The most important lesson", Andy reflects, "is that system design is about trade-offs, not perfect solutions. Every choice has consequences, and understanding those consequences is what separates good systems from great ones".
Armed with both the design and the wisdom to avoid common mistakes, they conclude their architectural journey.
Implementation roadmap: from MVP to production
Inspired by their comprehensive design, the team creates a practical roadmap for actually building their URL shortener. "Great design means nothing if we can't implement it incrementally", Peter notes. They decide on a phased approach that delivers value early while building toward their full vision.
Phase 1: MVP
Minimum Viable Product, Week 1-2
Goal
Get basic URL shortening working with minimal complexity
Scope
Single monolithic application (Python/FastAPI)
PostgreSQL database with simple schema
Basic REST API for shortening and redirecting
In-memory caching (Redis)
Simple sequential ID generation
What to build
MVP Database Schema
CREATE TABLE short_links (
id SERIAL PRIMARY KEY,
short_code VARCHAR(10) UNIQUE, -- populated right after insert, once the generated id is known
original_url TEXT NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
click_count INTEGER DEFAULT 0
);
CREATE INDEX idx_short_code ON short_links(short_code);
POC implementation example
Proof of Concept
# FastAPI implementation with async support
from fastapi import FastAPI, HTTPException, status
from fastapi.responses import RedirectResponse
from pydantic import BaseModel, HttpUrl
import asyncpg
import redis.asyncio as redis
from urllib.parse import urlparse
import base62

app = FastAPI(title="URL Shortener MVP")

# Database and Redis connections
db_pool = None
redis_client = None


class ShortenRequest(BaseModel):
    originalUrl: HttpUrl


class ShortenResponse(BaseModel):
    shortCode: str
    shortUrl: str


def is_valid_url(url: str) -> bool:
    """Basic URL validation"""
    try:
        result = urlparse(str(url))
        return all([result.scheme, result.netloc])
    except Exception:
        return False


def encode_base62(num: int) -> str:
    """Convert integer to base62 string"""
    return base62.encode(num)


@app.post("/api/shorten", response_model=ShortenResponse)
async def shorten_url(request: ShortenRequest):
    original_url = str(request.originalUrl)
    # Validate URL
    if not is_valid_url(original_url):
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail="Invalid URL"
        )
    # Generate short code from auto-increment ID
    async with db_pool.acquire() as conn:
        row = await conn.fetchrow(
            "INSERT INTO short_links (original_url) VALUES ($1) RETURNING id",
            original_url
        )
        short_code = encode_base62(row['id'])
        await conn.execute(
            "UPDATE short_links SET short_code = $1 WHERE id = $2",
            short_code, row['id']
        )
    # Cache the mapping
    await redis_client.setex(short_code, 3600, original_url)
    return ShortenResponse(
        shortCode=short_code,
        shortUrl=f"https://sho.rt/{short_code}"
    )


@app.get("/{code}")
async def redirect_url(code: str):
    # Check cache first
    original_url = await redis_client.get(code)
    if not original_url:
        # Cache miss - query database
        async with db_pool.acquire() as conn:
            row = await conn.fetchrow(
                "SELECT original_url FROM short_links WHERE short_code = $1",
                code
            )
            if not row:
                raise HTTPException(
                    status_code=status.HTTP_404_NOT_FOUND,
                    detail="Link not found"
                )
            original_url = row['original_url']
            # Update cache and click count
            await redis_client.setex(code, 3600, original_url)
            await conn.execute(
                "UPDATE short_links SET click_count = click_count + 1 WHERE short_code = $1",
                code
            )
    else:
        # Decode bytes to string if needed
        original_url = original_url.decode() if isinstance(original_url, bytes) else original_url
    return RedirectResponse(url=original_url, status_code=301)


@app.on_event("startup")
async def startup():
    global db_pool, redis_client
    # Initialize database connection pool
    db_pool = await asyncpg.create_pool(
        "postgresql://user:password@localhost/urlshortener"
    )
    # Initialize Redis connection
    redis_client = redis.Redis.from_url("redis://localhost:6379")


@app.on_event("shutdown")
async def shutdown():
    await db_pool.close()
    await redis_client.close()
Success criteria
✅ Can shorten URLs and get working redirects
✅ Basic caching reduces database load
✅ Simple analytics (click count)
✅ Handles 100+ concurrent users
✅ < 100ms redirect latency
Phase 2: Scale and reliability
Week 3-4
Goal
Make the system production-ready and more resilient
What to add
Container deployment (Docker)
Load balancer (nginx or cloud load balancer)
Database replication (primary/replica setup)
Proper error handling and input validation
Rate limiting (per IP)
Basic monitoring (health checks, basic metrics)
Architecture evolution
Phase 3: Microservices and analytics
Week 5-8
Goal
Split into microservices and add rich analytics
What to add
Microservices separation: ShortLink Service + Redirect Service
Message queue (Redis Pub/Sub for MVP, evolving to RabbitMQ for reliability)
Analytics Service with detailed click tracking
API Gateway for routing and authentication
Cassandra cluster for analytics (optimized for time-series and high-write workloads)
Rich analytics API with stats endpoints
Target architecture
Getting closer to the original design.
Phase 4: Production scale
Week 9-12
Goal
Handle serious traffic and provide enterprise features
What to add
Distributed caching (Redis Cluster)
NoSQL migration for primary database (Cassandra)
CDN integration for global redirects
Advanced monitoring (Prometheus, Grafana, distributed tracing)
Security hardening (rate limiting, abuse detection, URL validation)
Custom domains support
User accounts and link management
A/B testing for different redirect strategies
Phase 5+: Advanced features
Week 12+
Goal
Differentiate from competitors with unique features
What to add
QR code generation
Link expiration and scheduling
Branded short domains
Advanced analytics (geographic, device, referrer tracking)
Webhooks for events
Bulk URL processing API
Click fraud detection
API rate limiting tiers
and many more …
Conclusion
With the design fully sketched out, John, Peter, and Andy step back to reflect on their journey from a simple idea to a robust system design. What started as frustration with legacy shorteners turned into an exploration of their domain using storytelling and DDD principles, which guided them to an architecture they're confident about. By treating "links as real-time signals of attention", they ensured from the beginning that analytics was a first-class consideration, not an afterthought.
Through Event Storming, they discovered the key domain events (ShortLinkCreated, ShortLinkAccessed, etc.) and defined the system's behavior in a language all three could understand – sticky notes and all. This collaborative modeling helped them agree on what's important (and what's not in scope) early on. It also naturally led to a design where those events became the interface between sub-systems (for example, the Analytics service reacting to access events). In hindsight, starting with the domain story prevented them from jumping straight into coding a quick hacky shortener; instead, they have a thoughtful design that addresses both functional needs and quality attributes.
The resulting system is modular and scalable. By splitting responsibilities (shortening vs redirecting vs analytics) into separate services, each part can be scaled and improved independently. The use of asynchronous messaging for analytics means they can add more consumers or do additional processing (like sending alerts for certain events, or doing geographic lookup on IPs for location stats) without slowing down the core user experience. The focus on NFRs led them to include caching, choose a suitable database, and design for high availability from the start – so the system should be reliable and fast, which are crucial for user trust (no one wants a short link that's slow or down).
They also appreciate how domain-driven design (DDD) concepts like bounded contexts kept the model clean. The "Shortening" context and "Analytics" context have different concerns, and their design reflects that separation (different services, databases, etc.). This will make the system easier to maintain; if one of them later wants to completely revamp how analytics are stored, they can do so without touching the core redirect logic.
📩 Subscribe
If this post clicked with you (pun intended 🔗), hit Subscribe to get the next deep-dive in the System Design series straight to your inbox.
Finally, the team reviews the key design choices made along the way, feeling satisfied with the results.
As they conclude, John imagines how their story-driven design process will translate into actual implementation steps: setting up the databases, writing the services, and testing the flows end-to-end. Peter is already thinking about writing some unit tests for the ID generator and cache logic. Andy, satisfied, notes that by starting with "links are signals of attention", they ended up with a system that not only shortens URLs but also illuminates those signals through analytics.
The trio high-fives, ready to move from design to development. They've designed a smarter URL shortener system that they believe would make even the legacy providers a bit jealous – and they had a fun adventure designing it, too!