System Design Fundamentals Part 1
- emsuraki5
- Aug 18, 2025
- 5 min read
System Design is the process of defining the architecture, components, modules, interfaces, and data of a system to meet specific functional and non-functional requirements. This involves:
- Choosing the overall structure of the system
- Identifying the components of the system and their relationships
- Deciding how the different components will interact or communicate with each other
- Determining how data will be stored, accessed, and managed within the system

System design also considers non-functional requirements like performance, scalability, reliability, security, and maintainability to ensure the system functions effectively and efficiently.

In this blog, we will look at the first three key characteristics of distributed systems (Scalability, Availability, and Latency & Performance) and understand how they relate to System Design. Whether you are building a small-scale application or an enterprise one, understanding system design allows you to architect solutions that can handle real-world complexities.
Scalability
Scalability is the ability of a system to handle an increasing workload, either by upgrading existing resources to a bigger size (scaling up) or by adding more resources (scaling out). Scaling is important in distributed systems to ensure that the system can effectively manage the growing demands of users, data, and processing power. There are two types of scalability:
a. Vertical Scaling
Also known as scaling up, this involves increasing the capacity of individual nodes within a system, for example by adding more CPUs, memory, or storage. It can improve the performance of a system by allowing it to handle more work on a single node. However, there is a limit to how far a single machine can be upgraded, and relying on one machine can also create a single point of failure.
b. Horizontal Scaling
Also known as scaling out, this involves adding more machines or nodes to a system to distribute the workload evenly. This approach allows the system to handle an increased number of requests without overloading individual nodes. Horizontal scaling is useful because it provides a cost-effective way to manage fluctuating workloads and maintain high availability in a distributed system.
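The trade-off between the two approaches can be illustrated with a toy capacity model (the requests-per-second figures below are made up for illustration): both reach the same total capacity, but only the scaled-out system keeps serving when a node fails.

```python
def capacity(nodes: list[int], failed: set[int] = frozenset()) -> int:
    """Aggregate capacity (req/s, hypothetical) of the surviving nodes."""
    return sum(c for i, c in enumerate(nodes) if i not in failed)

scaled_up = [4000]        # scale up: one big node
scaled_out = [1000] * 4   # scale out: four small nodes

print(capacity(scaled_up))        # 4000
print(capacity(scaled_out))       # 4000
print(capacity(scaled_up, {0}))   # 0    -- single point of failure
print(capacity(scaled_out, {0}))  # 3000 -- degraded, but still serving
```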

Availability
Availability is a measure of how reliable and accessible a system is to its users. In distributed systems, high availability (HA) is crucial as it ensures the system remains operational even when there are failures or increases in demand. This enables businesses to provide uninterrupted services to their end users, regardless of any unforeseen circumstances.
High Availability (HA)
HA is often measured in terms of uptime: the ratio of time that a system is operational to the total time it is supposed to be operational. Achieving it involves minimising planned and unplanned downtime, eliminating single points of failure, and implementing redundant systems and processes. HA also involves guaranteeing that the system can handle increased load and network traffic without compromising its performance. This is crucial in scenarios where there is a rapid increase in the user base or sudden spikes in demand.
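Uptime targets are commonly quoted as "nines". A quick sketch of how much downtime per year each level actually allows:

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def allowed_downtime(availability: float) -> float:
    """Seconds of downtime per year permitted at a given availability ratio."""
    return SECONDS_PER_YEAR * (1 - availability)

for target in (0.99, 0.999, 0.9999):
    hours = allowed_downtime(target) / 3600
    print(f"{target:.2%} availability -> {hours:.2f} hours down per year")
```

So moving from "two nines" to "four nines" shrinks the yearly downtime budget from roughly 87.6 hours to under an hour.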
To achieve HA, organisations use the following strategies:
a. Redundancy and Replication
By duplicating critical components, or entire systems, organisations can ensure that if one fails, the redundant system takes over, so there is no interruption in service. An example is using multiple servers to handle the workload: in case of a hardware failure or system crash, the redundant servers take over, ensuring uninterrupted service to end users.
Replication involves creating multiple copies of data, ensuring that the data remains available even if one copy becomes inaccessible.
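A minimal sketch of the idea, assuming a toy in-memory key-value store that writes every value to three replicas and serves reads from any healthy one:

```python
class ReplicatedStore:
    """Toy key-value store: every write goes to all replicas."""

    def __init__(self, replica_count: int = 3):
        self.replicas = [{} for _ in range(replica_count)]

    def put(self, key, value):
        for replica in self.replicas:  # synchronous write to every copy
            replica[key] = value

    def get(self, key, failed: set = frozenset()):
        for i, replica in enumerate(self.replicas):
            if i not in failed and key in replica:
                return replica[key]    # first healthy replica serves the read
        raise KeyError(key)

store = ReplicatedStore()
store.put("user:1", "Ada")
print(store.get("user:1", failed={0}))  # replica 0 is down, data still served
```

Real replicated stores must also handle writes that only reach some replicas, which is where the consistency models discussed below come in.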
b. Load Balancing
Load balancing involves evenly distributing workloads across multiple servers so that no single server is overwhelmed. Intelligent load balancing algorithms can be utilised by organisations to optimise resource utilisation, prevent bottlenecks, and enhance HA by evenly distributing traffic. This is particularly useful in web applications, where many users access the site simultaneously. By distributing incoming requests across multiple backend servers, load balancers ensure that no single server is overwhelmed, which leads to improved performance and availability.
c. Availability through Distributed Data Storage
Data stored across multiple locations or data centres enhances HA by reducing the risk of data loss or corruption. Distributed data storage systems replicate data across multiple, geographically separated locations, ensuring that data is always available, even in the event of a catastrophic failure at one of the locations.
d. Regular System Maintenance and Updates
By keeping systems up to date with the latest security enhancements, patches, and bug fixes, organisations can mitigate failure risks and vulnerabilities that could compromise system availability. This involves hardware inspections, software updates, and routine checks to make sure that all components are functioning correctly. Organisations can maintain HA and minimise system failures by being proactive and addressing potential issues promptly.
e. Data Consistency Models (Strong, Weak, Eventual)
Consistency models define how a distributed system maintains an up-to-date view of its data across all replicas. Different consistency models provide different trade-offs between availability, performance, and data correctness. Strong consistency ensures all replicas have the same data at all times, at the cost of reduced availability and performance. Weak consistency allows temporary inconsistencies between replicas, with improved availability and performance. Eventual consistency guarantees that all replicas will eventually converge on the same data, providing a balance between consistency, availability, and performance.
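One common way these trade-offs are tuned in practice is quorum replication: with N replicas, if a write must be acknowledged by W of them and a read consults R of them, reads are guaranteed to see the latest write whenever R + W > N, because the read and write sets must overlap. A sketch of that rule:

```python
def is_strongly_consistent(n: int, w: int, r: int) -> bool:
    """True if read and write quorums must overlap (R + W > N),
    so every read touches at least one up-to-date replica."""
    return r + w > n

# With N = 3 replicas:
print(is_strongly_consistent(3, w=3, r=1))  # True: write-all, read-one
print(is_strongly_consistent(3, w=2, r=2))  # True: majority quorums
print(is_strongly_consistent(3, w=1, r=1))  # False: eventual consistency
```

Lowering W and R below the overlap threshold trades correctness guarantees for lower latency and higher availability, which is exactly the spectrum described above.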
f. Health Monitoring and Alerts
Real-time monitoring and automated notification systems ensure organisations can proactively identify and address potential issues before they impact system availability. This involves continuously monitoring system performance, resource utilisation, and other metrics to detect anomalies and potential issues. Alerts are triggered when predefined thresholds are exceeded, allowing IT teams to take immediate action and prevent service disruptions.
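The threshold-based alerting described above can be sketched in a few lines (the metric names and limits here are made up for illustration):

```python
# Hypothetical thresholds: alert when a metric exceeds its limit.
THRESHOLDS = {"cpu_percent": 90, "error_rate": 0.05, "p99_latency_ms": 500}

def check_metrics(metrics: dict) -> list[str]:
    """Return an alert message for every metric above its threshold."""
    return [
        f"ALERT: {name}={value} exceeds {THRESHOLDS[name]}"
        for name, value in metrics.items()
        if name in THRESHOLDS and value > THRESHOLDS[name]
    ]

print(check_metrics({"cpu_percent": 97, "error_rate": 0.01}))
# -> ['ALERT: cpu_percent=97 exceeds 90']
```

Production monitoring stacks add scheduling, deduplication, and notification routing on top of this basic check-against-threshold loop.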
g. Geographic Distribution
This strategy involves deploying system components across multiple geographical locations or data centres. If one location experiences an outage, such as a power cut, end users can still access the system from the other locations. This is particularly important for organisations with a global presence or those that rely heavily on cloud infrastructure.
Latency and Performance
Latency refers to the time it takes for a request to travel from its point of origin to its destination and for a response to be received. Both latency and performance are critical aspects of distributed systems as they directly impact the end users' experience, including the system's ability to handle large amounts of data and traffic. Optimising latency and performance in distributed systems involves factors such as data locality, load balancing, and caching strategies:
a. Data Locality
Data locality refers to the principle of storing and processing data close to the nodes that access it most frequently. This reduces the latency associated with data retrieval and improves overall performance. Techniques to achieve data locality include data partitioning, sharding, and data replication.
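Hash-based sharding, one of the partitioning techniques mentioned above, can be sketched like this: each key maps deterministically to a shard, so related data consistently lives on the node that owns that shard.

```python
import hashlib

def shard_for(key: str, shard_count: int) -> int:
    """Map a key deterministically to one of `shard_count` shards."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % shard_count

# The same key always lands on the same shard, so reads and writes
# for that key can be routed to the node holding its data.
print(shard_for("user:42", 4))
print(shard_for("user:42", 4))  # identical to the line above
```

Note that simple modulo sharding reshuffles most keys when the shard count changes; consistent hashing, mentioned under load balancing, is one way to avoid that.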
b. Load Balancing
As mentioned earlier, load balancers evenly distribute incoming traffic across multiple backend servers, ensuring no single server is overwhelmed. Various load balancing algorithms, e.g. round-robin, least connections, and consistent hashing, can be implemented to achieve efficient load distribution and improved system performance.
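Two of the algorithms mentioned can be sketched in a few lines (the server names are placeholders):

```python
import itertools

servers = ["app-1", "app-2", "app-3"]  # hypothetical backend pool

# Round-robin: hand requests to servers in a fixed rotating order.
rr = itertools.cycle(servers)
print([next(rr) for _ in range(5)])
# -> ['app-1', 'app-2', 'app-3', 'app-1', 'app-2']

# Least connections: pick the server with the fewest active connections.
active = {"app-1": 12, "app-2": 3, "app-3": 7}
print(min(active, key=active.get))  # -> app-2
```

Round-robin is trivial and stateless, while least-connections adapts to uneven request durations at the cost of tracking per-server state.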
c. Caching Strategy
Caching is a technique that temporarily stores frequently accessed data, allowing the system to quickly retrieve it from the cache instead of fetching it from the primary data source. This can significantly reduce latency and improve the performance of distributed systems. Examples include in-memory caching, distributed caching, and content delivery networks (CDNs).
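A minimal in-memory cache with a time-to-live (TTL), as a sketch of the idea: a hit avoids the trip to the primary data source, and expired entries force a refresh so the cache does not serve stale data forever.

```python
import time

class TTLCache:
    """Tiny in-memory cache; entries expire after `ttl` seconds."""

    def __init__(self, ttl: float = 60.0):
        self.ttl = ttl
        self._data = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or time.monotonic() > entry[1]:
            return None  # miss or expired: caller falls back to the database
        return entry[0]

    def set(self, key, value):
        self._data[key] = (value, time.monotonic() + self.ttl)

cache = TTLCache(ttl=30)
cache.set("profile:1", {"name": "Ada"})
print(cache.get("profile:1"))  # served from cache, no database round trip
```

Distributed caches (e.g. Redis or Memcached) apply the same get/set-with-expiry idea across the network, shared by many application servers.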
In this first part of a three-part series on System Design Fundamentals, we discussed the first three key characteristics of distributed systems: Scalability, Availability, and Latency & Performance. I hope it helps you understand the fundamentals better. In the next parts, we will cover the remaining characteristics. See you there.

