
Developer English guide

System Design Vocabulary Every Developer Should Know

April 30, 2026

The key concepts, terms, and tradeoffs that come up in every system design interview. Learn what they mean and how to use them confidently in English.

Scalability

How the system grows to handle more users or data. Interviewers always ask about your approach to scaling — know the difference between horizontal and vertical, and when to use each.

Horizontal scaling (scaling out)

Definition:

Adding more machines to handle increased load. Each machine handles a share of the traffic. Preferred for stateless services.

Vertical scaling (scaling up)

Definition:

Increasing the resources (CPU, RAM, disk) of an existing machine. Simpler to implement but has a hard upper limit and creates a single point of failure.

Sharding

Definition:

Splitting a database into smaller pieces (shards) distributed across multiple servers. Each shard holds a subset of the data. Enables horizontal scaling of databases.
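A minimal sketch of hash-based shard routing in Python (the shard count and key format are illustrative). A stable hash is used instead of Python's built-in `hash`, which is randomized between runs:

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

# The same key always lands on the same shard, so reads
# and writes for one user hit one server.
shard_for("user:42", num_shards=4)
```

Note that naive modulo sharding reshuffles most keys when `num_shards` changes; consistent hashing is the usual fix for that.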

Replication

Definition:

Copying data across multiple nodes. Improves read throughput and adds redundancy. Comes in leader-follower and multi-leader flavors.

Networking and traffic

Components that sit between the user and your servers. Understanding load balancers, proxies, and CDNs is expected for any mid-to-senior system design interview.

Load balancer

Definition:

A component that distributes incoming traffic across multiple servers to prevent any single server from becoming a bottleneck. Can operate at Layer 4 (transport) or Layer 7 (application).
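As a sketch, the simplest distribution strategy is round-robin: each request goes to the next server in turn. The addresses here are made up:

```python
import itertools

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical backends
_rotation = itertools.cycle(servers)

def next_server() -> str:
    """Round-robin: cycle through the pool so no single server
    takes all the traffic."""
    return next(_rotation)
```

Real load balancers add health checks, so a dead server is skipped instead of receiving its share of requests.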

L4 vs L7 load balancer

Definition:

L4 routes traffic based on IP and TCP/UDP port — fast but no content awareness. L7 routes based on HTTP headers, URLs, or cookies — slower but more flexible (e.g. routing /api to one pool and /static to another).
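The routing-by-path idea can be sketched in a few lines of Python; the pool names are illustrative:

```python
def route(path: str) -> str:
    """Toy L7 routing: inspect the URL path (something an L4
    balancer cannot see) and pick a backend pool."""
    if path.startswith("/api"):
        return "api-pool"
    if path.startswith("/static"):
        return "static-pool"
    return "default-pool"
```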

Reverse proxy

Definition:

A server that sits in front of your backend and forwards client requests to it. Provides SSL termination, load balancing, caching, and hides the internal topology from clients.

CDN (Content Delivery Network)

Definition:

A geographically distributed network of servers that cache static content close to users. Reduces latency for images, CSS, JS, and video by serving them from the nearest edge node.

Caching

Caching is one of the most common techniques to improve performance. Know how to introduce it, where to place it, and — crucially — how to keep it consistent.

Cache

Definition:

A fast, temporary storage layer that holds frequently accessed data so you don't have to recompute or re-fetch it. Common tools: Redis, Memcached.
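The most common read pattern is cache-aside: check the cache, fall back to the store on a miss, then populate the cache. A minimal sketch, with a plain dict standing in for Redis and another for the database:

```python
from typing import Optional

cache: dict = {}                # stand-in for Redis/Memcached
database = {"user:1": "Ada"}    # stand-in for a slow data store

def get_user(key: str) -> Optional[str]:
    """Cache-aside read: try the cache first, fall back to the
    store, then cache the result for next time."""
    if key in cache:
        return cache[key]       # cache hit: no database round-trip
    value = database.get(key)
    if value is not None:
        cache[key] = value      # populate on miss
    return value
```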

Cache eviction policy

Definition:

The rule that decides what to remove when the cache is full. LRU (Least Recently Used) evicts the item that hasn't been accessed for the longest time. LFU (Least Frequently Used) evicts the item accessed least often.
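An LRU cache can be sketched with Python's `OrderedDict`, which remembers insertion order and lets you move a key to the end on access:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: on overflow, evict the least recently
    used key."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)   # mark as recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the LRU entry
```

In practice you would reach for `functools.lru_cache` or let Redis handle eviction, but being able to sketch the mechanism is a good interview signal.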

Cache invalidation

Definition:

The process of marking or removing cached data when the underlying source changes. Famously one of the "two hard things in computer science" (alongside naming things): getting it wrong causes stale-data bugs.

Write-through vs write-back

Definition:

Write-through: data is written to cache and storage at the same time — consistent but slower. Write-back: data is written to cache first and synced to storage later — faster but risks data loss on crash.
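The two policies can be contrasted in a short Python sketch, with a dict standing in for durable storage:

```python
class WriteThroughCache:
    """Write-through: cache and backing store are updated in the
    same call, so they never disagree."""
    def __init__(self, store: dict):
        self.cache, self.store = {}, store

    def put(self, key, value):
        self.cache[key] = value
        self.store[key] = value  # synchronous write: slower, but consistent

class WriteBackCache:
    """Write-back: update the cache now, flush dirty keys to the
    store later. Anything unflushed is lost on a crash."""
    def __init__(self, store: dict):
        self.cache, self.store = {}, store
        self.dirty = set()

    def put(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)      # store is stale until flush()

    def flush(self):
        for key in self.dirty:
            self.store[key] = self.cache[key]
        self.dirty.clear()
```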

Data consistency

Distributed systems force you to make explicit choices about consistency. These concepts come up constantly in database and storage design questions.

ACID

Definition:

Properties that guarantee reliable database transactions: Atomicity (all or nothing), Consistency (data stays valid), Isolation (transactions don't interfere), Durability (committed data survives crashes).

CAP theorem

Definition:

A distributed system can only guarantee two of three: Consistency (every read gets the latest write), Availability (every request gets a response), Partition tolerance (the system works despite network splits). In practice, partition tolerance is required, so the real choice is between C and A.

Eventual consistency

Definition:

All nodes will eventually converge to the same state, but reads may return stale data in the short term. Accepted in high-availability systems like DNS and shopping carts.

Strong consistency

Definition:

Any read always returns the result of the most recent write. Requires coordination between nodes, which adds latency. Required for financial transactions and inventory systems.

Reliability and availability

Interviewers expect you to design systems that don't go down. Know how to spot and eliminate single points of failure, and how to handle async workloads.

Single Point of Failure (SPOF)

Definition:

A component whose failure causes the entire system to go down. Interviewers expect you to identify and eliminate SPOFs by adding redundancy.

High availability (HA)

Definition:

A system designed to minimize downtime, typically expressed as a percentage (e.g. 99.9% uptime = ~8.76 hours of downtime per year). Achieved through redundancy, failover, and health checks.
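The arithmetic behind the "nines" is worth being able to do on the spot:

```python
def downtime_hours_per_year(availability: float) -> float:
    """Convert an availability percentage into the downtime
    budget per year (365 days x 24 hours)."""
    return (1 - availability / 100) * 365 * 24

# "Three nines" (99.9%) allows roughly 8.76 hours of downtime a year;
# "four nines" (99.99%) allows under an hour.
downtime_hours_per_year(99.9)
```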

Rate limiting

Definition:

Controlling the number of requests a client can make in a given time window. Protects the system from abuse, DDoS, and noisy neighbours. Common algorithms: token bucket, sliding window.
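A token bucket can be sketched in a few lines: tokens refill continuously at a fixed rate up to a capacity, and each request spends one. This is a minimal single-process version; production limiters track buckets per client, often in Redis:

```python
import time

class TokenBucket:
    """Token bucket rate limiter: refills at `rate` tokens/second
    up to `capacity`; each request spends one token."""
    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to the time elapsed since the last call.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False   # over the limit: reject (HTTP 429)
```

The capacity is what lets clients burst briefly above the steady rate, which is the main advantage over a fixed window.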

Message queue

Definition:

A buffer that decouples producers from consumers. Producers send messages without waiting for consumers to process them. Enables async processing and absorbs traffic spikes. Examples: Kafka, RabbitMQ, SQS.
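The decoupling can be illustrated with Python's in-process `queue.Queue`; a real broker like Kafka adds durability and distribution, but the producer/consumer shape is the same. The `None` sentinel is just a convention for this sketch:

```python
import queue
import threading

q: queue.Queue = queue.Queue()
processed = []

def consumer():
    """Consumer drains the queue at its own pace."""
    while True:
        msg = q.get()
        if msg is None:        # sentinel: no more messages
            break
        processed.append(msg.upper())  # stand-in for real work

worker = threading.Thread(target=consumer)
worker.start()

# The producer enqueues and moves on without waiting
# for the work to be done.
for msg in ["signup", "payment"]:
    q.put(msg)
q.put(None)
worker.join()
```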

Interview vocabulary

The words that separate a junior answer from a senior one. Using these terms correctly — and knowing what they imply — signals that you think in systems.

Tradeoff

Definition:

Accepting a worse outcome in one dimension to get a better outcome in another. Every design decision involves tradeoffs. Using this word correctly signals seniority — always explain what you are giving up and what you gain.

Bottleneck

Definition:

The component that limits the overall performance of the system. Identifying bottlenecks — and explaining how you would relieve them — is the core skill in a system design interview.

Latency vs throughput

Definition:

Latency is the time it takes to complete one operation (e.g. 50ms per request). Throughput is how many operations the system can handle per unit of time (e.g. 10,000 requests/second). Optimising for one often hurts the other.
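The relationship between the two is simple arithmetic, and worth stating aloud in an interview. With the numbers from the definition above (the worker count is illustrative):

```python
latency_s = 0.050   # 50 ms per request
workers = 500       # hypothetical concurrent workers

# One worker finishing a 50 ms request at a time caps out at 20 req/s.
single_throughput = 1 / latency_s

# Throughput scales with concurrency, but per-request latency
# does not improve: 500 workers reach 10,000 req/s at 50 ms each.
total_throughput = workers * single_throughput
```

This is why adding servers raises throughput but does nothing for the latency of an individual request.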

SLA / SLO / SLI

Definition:

SLI (Service Level Indicator): a metric you measure (e.g. error rate). SLO (Service Level Objective): the target for that metric (e.g. error rate < 0.1%). SLA (Service Level Agreement): the contractual commitment to the customer, with penalties if broken.
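The relationship between the three terms is easy to pin down in code. A sketch, using the error-rate example above:

```python
def error_rate_sli(total_requests: int, errors: int) -> float:
    """SLI: the metric you actually measure over a window."""
    return errors / total_requests

SLO = 0.001  # SLO: the internal target (error rate below 0.1%)

def slo_met(total_requests: int, errors: int) -> bool:
    """The SLA would be this same target written into a customer
    contract, with penalties attached when it is missed."""
    return error_rate_sli(total_requests, errors) < SLO
```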

Idempotency

Definition:

An operation is idempotent if calling it multiple times produces the same result as calling it once. Critical for retries and distributed systems — if a payment request is retried due to a network error, you don't want to charge the user twice.
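The standard fix is an idempotency key: the client attaches a unique key to the request, and the server replays the stored result instead of repeating the side effect. A minimal in-memory sketch (the function and field names are illustrative; real systems persist the keys):

```python
processed: dict = {}   # idempotency key -> stored result

def charge(idempotency_key: str, amount: int) -> str:
    """Retrying with the same key returns the original result
    instead of charging the user a second time."""
    if idempotency_key in processed:
        return processed[idempotency_key]   # replay, no new side effect
    receipt = f"charged {amount}"           # the side effect happens once
    processed[idempotency_key] = receipt
    return receipt
```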

Ready to practice your English at work?

Lingua-e has interactive exercises built around real developer conversations: standups, code reviews, retrospectives, and more. Practice until it comes naturally.

Try Lingua-e for free

Written by

Roxana Lafuente

Lingua-e's founder

Roxana Lafuente is a software engineer with 8+ years of experience. At the beginning of her career, even though she had already passed the First Certificate in English, she still froze every time she had to speak up in the daily standup. That was a gap nobody was fixing. After 2,000+ standups, she figured out what actually builds fluency: practice that looks like your real work. She built Lingua-e so other developers wouldn't have to take the long road to feel confident working in an international development environment.