Distributed Systems: Difference between revisions

Revision as of 07:48, 6 July 2025

Introduction

Distributed systems are collections of independent computers that cooperate over a network to achieve a common goal. To their users they appear as a single cohesive unit, while each component retains its autonomy. These systems are designed to handle large-scale, complex applications and range from small clusters to vast networks of geographically distributed nodes. Compared with traditional centralized systems, they can improve scalability, reliability, and resource utilization.

History

Distributed systems emerged in the second half of the 20th century as computer networks began to proliferate. Early forms of distributed computing date to the 1960s, when researchers sought to connect multiple computers to process tasks in parallel. The ARPANET, developed in the late 1960s with funding from the U.S. Department of Defense, laid the groundwork for networked communication and the eventual rise of distributed systems.

Throughout the 1970s and 1980s, key advances were made in distributed algorithms and protocols, including the client-server model, which became the foundation for many later distributed applications. Notable contributions from this period include early distributed databases and file systems, along with communication protocols such as TCP/IP.

The 1990s marked a significant milestone with the advent of the World Wide Web, which highlighted the potential of distributed systems to provide services on a global scale. Innovations such as peer-to-peer networks and grid computing emerged during this period, expanding the application of distributed systems beyond traditional boundaries.

With the rise of cloud computing in the mid-2000s, distributed systems gained renewed attention. Companies began using distributed architectures to provide scalable services and applications over the internet. Processing frameworks such as MapReduce and Hadoop, and distributed databases such as Google Bigtable and, later, Amazon DynamoDB, became crucial components for managing vast amounts of data across distributed environments.

Design and Architecture

Distributed systems are characterized by specific architectural patterns and design principles that differentiate them from centralized systems. The design focuses on ensuring consistent performance, fault tolerance, and resource management across multiple nodes. Key elements of distributed system architecture include:

1. Components

Distributed systems typically consist of multiple components, which may include:

**Nodes**: Individual computing devices that participate in the system.
**Middleware**: Software that acts as an intermediary layer, facilitating communication and data exchange between nodes.
**Storage systems**: Solutions that provide distributed data storage and management capabilities.

2. Communication

Effective communication is pivotal in distributed systems. Various communication models are used, including:

**Message Passing**: Nodes communicate by sending and receiving messages.
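**Shared Memory**: Nodes read and write a common address space. In a distributed setting this is usually an abstraction (distributed shared memory) built on top of message passing, and it requires synchronization mechanisms to keep data consistent.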
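The message-passing model can be illustrated with a minimal sketch. It assumes two "nodes" simulated as threads in one process that exchange messages through queues; the names `worker` and `run_exchange`, and the use of `None` as a shutdown signal, are hypothetical choices for this illustration and not part of any particular system.

```python
import queue
import threading


def worker(inbox, outbox):
    """Hypothetical node: receives numbers as messages and replies with their squares."""
    while True:
        message = inbox.get()      # block until a message arrives
        if message is None:        # None is this sketch's shutdown signal
            break
        outbox.put(("result", message * message))


def run_exchange(values):
    """Send each value to the worker node and collect the replies in order."""
    inbox = queue.Queue()
    outbox = queue.Queue()
    node = threading.Thread(target=worker, args=(inbox, outbox))
    node.start()
    for value in values:
        inbox.put(value)           # a send is an enqueue on the receiver's inbox
    replies = [outbox.get()[1] for _ in values]
    inbox.put(None)                # ask the worker to stop
    node.join()
    return replies
```

Calling `run_exchange([1, 2, 3])` returns `[1, 4, 9]`. The two sides share no variables; all coordination happens through the messages themselves, which is the property that lets the same pattern work across a network.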

3. Consistency Models

Maintaining data consistency across distributed nodes is challenging due to the potential for asynchrony and network partitioning. Common consistency models include:

**Strong Consistency**: Guarantees that every read reflects the most recent completed write, so all nodes appear to see the same data at the same time.
**Eventual Consistency**: Allows temporary discrepancies between replicas, with the assurance that, once updates stop arriving, all replicas will converge to the same state.
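Eventual consistency can be sketched with a small example. It assumes a last-writer-wins rule with unique, caller-supplied timestamps, which is only one of several possible conflict-resolution strategies; the `Replica` class and `converge` helper are hypothetical names.

```python
class Replica:
    """A replica storing key -> (timestamp, value); all names are hypothetical."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def write(self, key, value, timestamp):
        current = self.store.get(key)
        # Keep whichever write carries the larger timestamp (last writer wins).
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        """Anti-entropy step: pull every entry the other replica holds."""
        for key, (timestamp, value) in other.store.items():
            self.write(key, value, timestamp)


def converge(replicas):
    """Let every replica pull from every other replica once."""
    for target in replicas:
        for source in replicas:
            if source is not target:
                target.sync_from(source)
```

If replica `a` receives `write("x", "old", 1)` and replica `b` receives `write("x", "new", 2)`, the two disagree until `converge([a, b])` runs, after which both return `"new"`. The window of disagreement is the "temporary discrepancy" the model permits.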

4. Fault Tolerance

Distributed systems must be resilient to component failures. Strategies to achieve fault tolerance include:

**Replication**: Duplicating data across multiple nodes to ensure availability in the case of failures.
**Consensus Algorithms**: Protocols such as Paxos and Raft allow nodes to agree on a value or an ordering of operations despite failures.
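Replication under failures can be illustrated with a quorum-based sketch. Note that this is quorum replication, not an implementation of Paxos or Raft: it assumes `n` in-memory replicas, a write quorum `w` and a read quorum `r` with `w + r > n`, and a single version counter standing in for a real ordering mechanism. `QuorumStore` and its fields are hypothetical names.

```python
class QuorumStore:
    """Toy replicated key-value store with quorum reads and writes."""

    def __init__(self, n, w, r):
        if not (1 <= w <= n and 1 <= r <= n) or w + r <= n:
            raise ValueError("need 1 <= w, r <= n and w + r > n so quorums overlap")
        self.n, self.w, self.r = n, w, r
        self.replicas = [{} for _ in range(n)]  # each maps key -> (version, value)
        self.down = set()                        # indices of failed replicas
        self.version = 0                         # simplification: one global counter

    def _alive(self):
        return [i for i in range(self.n) if i not in self.down]

    def write(self, key, value):
        alive = self._alive()
        if len(alive) < self.w:
            raise RuntimeError("write quorum unavailable")
        self.version += 1
        for i in alive[: self.w]:                # store on the first w reachable replicas
            self.replicas[i][key] = (self.version, value)

    def read(self, key):
        alive = self._alive()
        if len(alive) < self.r:
            raise RuntimeError("read quorum unavailable")
        # Ask the last r reachable replicas and keep the newest version seen.
        answers = [self.replicas[i].get(key, (0, None)) for i in alive[-self.r:]]
        return max(answers, key=lambda answer: answer[0])[1]
```

With `n=3, w=2, r=2`, any two replicas that answer a read must include at least one that took part in the latest write, so the store keeps returning the newest value while a single replica is down. When two replicas fail, reads and writes raise an error instead of returning possibly stale data.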

5. Scalability

Scalability refers to the ability of a system to handle increasing loads by adding more resources. Distributed systems may be designed for:

**Vertical Scaling**: Adding more resources (CPU, memory) to existing nodes.
**Horizontal Scaling**: Adding more nodes to the system, distributing the workload.
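Horizontal scaling can be sketched as spreading keys over a growing set of nodes. The example below assumes simple hash-modulo placement, with hypothetical helper names `node_for` and `load_per_node` and made-up node labels.

```python
import hashlib
from collections import Counter


def node_for(key, nodes):
    """Map a key to a node by hashing it (simple modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def load_per_node(keys, nodes):
    """Count how many keys each node would be responsible for."""
    counts = Counter({node: 0 for node in nodes})
    for key in keys:
        counts[node_for(key, nodes)] += 1
    return dict(counts)
```

Comparing `load_per_node(keys, ["a", "b", "c"])` with `load_per_node(keys, ["a", "b", "c", "d"])` for a large key set shows each node's share shrinking as a node is added. One caveat of modulo placement is that most keys move to a different node when the node count changes; schemes such as consistent hashing are commonly used to limit that movement.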

Usage and Implementation

Distributed systems find application in numerous fields, including cloud computing, data storage and management, web services, and enterprise applications. Below are some prominent implementations and their use cases:

1. Cloud Computing

Cloud platforms such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform are built on distributed systems principles. They enable users to provision resources, deploy applications, and scale services dynamically across multiple geographical locations.

2. Distributed Databases

Distributed databases such as Apache Cassandra, MongoDB, and CockroachDB spread data across multiple nodes to provide scalability and fault tolerance. They offer high availability and can handle large volumes of transactions across distributed nodes.

3. Microservices Architecture

The microservices pattern promotes the development of applications as a suite of small, independent services that communicate over a network. This architecture enhances scalability, as services can be developed, deployed, and scaled independently.

4. Peer-to-Peer Networks

In peer-to-peer (P2P) systems, nodes act as both clients and servers, sharing resources directly with each other. P2P applications include file sharing (e.g., BitTorrent) and cryptocurrency networks (e.g., Bitcoin), which capitalize on the decentralized nature of distributed systems.

5. Big Data Processing

Frameworks such as Apache Hadoop and Apache Spark utilize distributed systems to perform large-scale data processing tasks. These frameworks enable the analysis of massive datasets across clusters of machines, allowing businesses to derive insights and make data-driven decisions.

Real-world Examples

Distributed systems are prevalent in various domains and industries. Here are several notable examples:

1. Google MapReduce

MapReduce is a programming model for processing large datasets in parallel across a cluster. A computation is expressed as a map step that emits key-value pairs and a reduce step that combines all values sharing a key. Google used it to build its web search index, among other large-scale data processing tasks.
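The shape of a MapReduce job can be sketched with the classic word-count example. This is a single-process illustration of the programming model only, not Google's implementation; the function names are hypothetical, and a real framework would run the map and reduce phases on many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between the phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce over a list of documents."""
    pairs = [pair for document in documents for pair in map_phase(document)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

For example, `word_count(["a b a", "b c"])` returns `{"a": 2, "b": 2, "c": 1}`. Because each call to `map_phase` depends only on its own document and each call to `reduce_phase` only on its own key, both phases can be distributed across a cluster without coordination beyond the shuffle.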

2. Amazon DynamoDB

DynamoDB is a fully managed NoSQL database service designed to handle high-traffic workloads while providing low-latency data access. It scales automatically, replicates data across multiple Availability Zones within a region, and supports optional multi-region replication through global tables.

3. Bitcoin Blockchain

The Bitcoin blockchain operates as a distributed ledger in which transactions are recorded across a network of nodes. It uses a proof-of-work consensus mechanism to validate blocks of transactions and maintain the integrity of the ledger.
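The idea behind proof-of-work can be shown with a simplified sketch: find a nonce such that the hash of the data plus the nonce meets a difficulty condition. The version below assumes a single SHA-256 over a text payload and a "leading zero hex digits" rule; Bitcoin itself hashes a binary block header twice and compares against a numeric target, so treat this as an illustration of the principle only. The function names are hypothetical.

```python
import hashlib


def block_hash(data, nonce):
    """Hash the data together with a candidate nonce."""
    payload = f"{data}:{nonce}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mine(data, difficulty):
    """Search for the first nonce whose hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while not block_hash(data, nonce).startswith(prefix):
        nonce += 1
    return nonce


def verify(data, nonce, difficulty):
    """Checking a solution takes one hash, however long the search took."""
    return block_hash(data, nonce).startswith("0" * difficulty)
```

Finding a nonce with `mine("tx", 3)` takes thousands of hash attempts on average, while `verify` confirms the result with a single hash. This asymmetry is what makes rewriting recorded history expensive for an attacker yet cheap for honest nodes to check.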

4. Apache Kafka

Kafka is a distributed streaming platform for building real-time data pipelines and streaming applications. It is designed to handle high-throughput data feeds, which makes it a common component in event-driven and microservices architectures.
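The core abstraction behind such a streaming platform, an append-only log that consumers read at their own pace, can be sketched in a few lines. This is an in-memory illustration and does not use Kafka's client API; `TopicLog`, `append`, and `poll` are hypothetical names chosen for this example.

```python
class TopicLog:
    """In-memory append-only log; a stand-in for one partition of a topic."""

    def __init__(self):
        self.records = []
        self.offsets = {}  # consumer name -> next offset to read

    def append(self, record):
        """Add a record to the end of the log and return its offset."""
        self.records.append(record)
        return len(self.records) - 1

    def poll(self, consumer, max_records=10):
        """Return the next batch for this consumer and advance its offset."""
        start = self.offsets.get(consumer, 0)
        batch = self.records[start:start + max_records]
        self.offsets[consumer] = start + len(batch)
        return batch
```

After appending `"a"`, `"b"`, and `"c"`, a consumer named `"billing"` that polls with `max_records=2` receives `["a", "b"]` and then `["c"]`, while a second consumer named `"audit"` still receives all three records from the start. Records are not removed when read, so independent consumers can process the same feed at different speeds.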

5. Kubernetes

Kubernetes is an orchestration platform for automating the deployment, scaling, and management of containerized applications. It operates in a distributed manner, allowing developers to manage clusters of machines efficiently.

Criticism and Controversies

While distributed systems offer numerous advantages, they are not without challenges and criticisms. Concerns include:

1. Complexity

The design, implementation, and maintenance of distributed systems can be significantly more complex than centralized systems. Debugging and troubleshooting issues can be particularly challenging due to the involvement of multiple components and potential network-related problems.

2. Security Issues

Distributed systems can introduce vulnerabilities, especially when nodes communicate over insecure networks. Ensuring data security, privacy, and integrity across distributed components is an ongoing challenge that must be addressed through robust security mechanisms.

3. Performance Overheads

Communication between distributed nodes can introduce latency, impacting overall system performance. Optimizing data exchange and ensuring efficient communication protocols are critical to mitigating these challenges.

4. Data Consistency Challenges

Achieving strong consistency in distributed systems involves trade-offs with availability and performance, notably in the presence of network partitions. The CAP theorem states that a distributed data store cannot simultaneously guarantee all three of consistency, availability, and partition tolerance. Because network partitions cannot be ruled out in practice, designers must decide whether the system gives up consistency or availability while a partition lasts.
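The choice the CAP theorem forces during a partition can be made concrete with a toy sketch. It assumes two replicas of a single value and a boolean flag that simulates a network partition; the `Node` class, the `"CP"` and `"AP"` mode labels, and `make_pair` are hypothetical names for this illustration.

```python
class Node:
    """One of two replicas of a single value; mode is "CP" or "AP"."""

    def __init__(self, mode):
        self.mode = mode
        self.value = None
        self.peer = None
        self.partitioned = False  # True while the peer is unreachable

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent by refusing the write: availability is lost.
                raise RuntimeError("unavailable: cannot reach peer to replicate")
            # AP: stay available by accepting locally; the replicas now diverge.
            self.value = value
            return
        self.value = value
        self.peer.value = value   # no partition: replicate to the peer


def make_pair(mode):
    """Create two nodes that replicate to each other."""
    a, b = Node(mode), Node(mode)
    a.peer, b.peer = b, a
    return a, b
```

With a pair created by `make_pair("AP")`, a write during a partition succeeds on one node while the other keeps the old value. With `make_pair("CP")`, the same write raises an error and both nodes stay in agreement. Neither mode gets both properties while the partition lasts.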

5. Vendor Lock-In

As organizations adopt cloud-based distributed solutions, they may become dependent on specific vendors, leading to potential lock-in situations where migration to alternative platforms becomes costly and complicated.

Influence and Impact

The evolution of distributed systems has had a profound influence on various fields, shaping technology, infrastructure, and practices both in industry and academia. The impact includes:

1. Evolution of Software Development

The adoption of distributed architectures has driven the transition from monolithic application development to more modular and agile approaches. The emergence of microservices architecture aligns with contemporary development practices that emphasize automation, continuous integration, and continuous deployment.

2. Growth of Cloud Computing

Distributed systems have been instrumental in the rise of cloud computing. Organizations can use cloud-based resources to achieve scalability and flexibility, which can lead to cost savings and improved operational efficiency.

3. Transforming Data Analytics

Distributed systems have transformed the landscape of data analytics by enabling large-scale processing and analysis of big data, allowing organizations to harness insights from vast datasets that were previously infeasible to manage.

4. Innovations in Networking and Infrastructure

The design principles of distributed systems have influenced advancements in network infrastructure, leading to the proliferation of content delivery networks (CDNs), edge computing, and enhanced network protocols that support efficient communication across distributed environments.

5. Academic Research

Distributed systems continue to be a vibrant area of academic research, contributing to advancements in algorithms, protocols, and methodologies that address key challenges such as fault tolerance, consensus, and performance optimization.

References

@@ Line 1: / Line 1: @@
-= Distributed Systems =
+== Introduction ==
-== Introduction ==
+Distributed systems are collections of independent computers that collaborate through the sharing of networked resources to achieve a common goal. They operate as a cohesive unit while maintaining the autonomy of individual components. These systems are designed to handle large-scale, complex applications and can range from small clusters to vast networks of geographically distributed nodes. The significance of distributed systems lies in their ability to improve scalability, reliability, and resource utilization compared to traditional centralized systems.
-A distributed system is a model in which components located on networked computers communicate and coordinate their actions by passing messages. The components interact with one another in order to achieve a common goal, and they appear to the users of the system as a single coherent system. This type of system is characterized by its lack of a shared physical memory and can include hardware and software components that are located in different physical locations. Distributed systems are designed to handle large-scale operations and offer advantages such as increased reliability, scalability, and flexibility.
 == History ==
-The concept of distributed systems dates back to the early 1970s when the need for shared resources and collaboration between multiple computers became apparent. One of the first instances of a distributed system was the development of ARPANET, the precursor to the modern Internet, which connected various research institutions and allowed for the sharing of information and resources.
-In the 1980s and 1990s, advances in network technologies, such as Ethernet and TCP/IP, began to revolutionize the way computers communicated with each other. This era saw the introduction of various distributed computing models and frameworks, including Remote Procedure Calls (RPC), and the emergence of systems like the Andrew File System (AFS) and distributed databases.
+The concept of distributed systems emerged in the late 20th century as computer networks began to proliferate. Early forms of distributed computing can be traced back to the 1960s, when researchers sought to connect multiple computers to process tasks in parallel. The development of the ARPANET in the late 1960s, which was funded by the U.S. Department of Defense, laid the groundwork for networked communication and the eventual rise of distributed systems.
+Throughout the 1970s and 1980s, key advancements were made in distributed algorithms and protocols, including the development of the Client-Server model, which became the foundation for many subsequent distributed applications. Notable contributions include the implementations of distributed databases and file systems, along with the introduction of communication protocols such as TCP/IP.
-The late 1990s and early 2000s introduced the rise of cloud computing, which further popularized distributed systems. Platforms such as Amazon Web Services (AWS) demonstrated the scalability of distributed architectures and provided businesses with unprecedented access to computing resources without needing to invest in physical infrastructure.
+The 1990s marked a significant milestone with the advent of the World Wide Web, which highlighted the potential of distributed systems to provide services on a global scale. Innovations such as peer-to-peer networks and grid computing emerged during this period, expanding the application of distributed systems beyond traditional boundaries.
+With the rise of cloud computing in the early 2000s, distributed systems gained renewed attention. Companies began leveraging distributed architectures to provide scalable services and applications over the internet. Technologies such as MapReduce, Hadoop, and distributed databases like Amazon DynamoDB and Google Bigtable became crucial components in managing vast amounts of data across distributed environments.
 == Design and Architecture ==
-Designing a distributed system involves various challenges unique to its architecture. Key components of distributed system architecture include:
-=== 1. Nodes ===
+Distributed systems are characterized by specific architectural patterns and design principles that differentiate them from centralized systems. The design focuses on ensuring consistent performance, fault tolerance, and resource management across multiple nodes. Key elements of distributed system architecture include:
-Nodes refer to individual computing devices within a distributed system. Each node operates independently but collaborates with other nodes to perform tasks. Nodes can be heterogeneous, including different types of hardware and operating systems.
+=== 1. Components ===
+Distributed systems typically consist of multiple components, which may include:
+* **Nodes**: Individual computing devices that participate in the system.
+* **Middleware**: Software that acts as an intermediary layer, facilitating communication and data exchange between nodes.
+* **Storage systems**: Solutions that provide distributed data storage and management capabilities.
 === 2. Communication ===
-Effective communication is crucial for the functioning of distributed systems. This usually takes place through message passing over a network. Several protocols, such as Message Queuing Telemetry Transport (MQTT) and Advanced Message Queuing Protocol (AMQP), facilitate communication by ensuring reliable message delivery and appropriate handling of communication failures.
+Effective communication is pivotal in distributed systems. Various communication models are used, including:
+* **Message Passing**: Nodes communicate by sending and receiving messages.
+* **Shared Memory**: Nodes share a common memory space, although this requires synchronization mechanisms to ensure data consistency.
-=== 3. Coordination ===
+=== 3. Consistency Models ===
-Coordination among distributed components is necessary for maintaining consistency and synchronization. Algorithms like the Paxos or Raft consensus algorithms provide methods for achieving agreement among nodes in the presence of failures.
+Maintaining data consistency across distributed nodes is challenging due to the potential for asynchrony and network partitioning. Common consistency models include:
+* **Strong Consistency**: Guarantees that all nodes see the same data at the same time.
+* **Eventual Consistency**: Allows for temporary discrepancies, with the assurance that all updates will propagate to all nodes eventually.
-=== 4. Data Management ===
+=== 4. Fault Tolerance ===
-Distributed data management involves ensuring that data is stored across various nodes in a way that is both reliable and accessible. Techniques such as replication, sharding, and partitioning are commonly utilized to enhance data durability and performance.
+Distributed systems must be resilient to component failures. Strategies to achieve fault tolerance include:
+* **Replication**: Duplicating data across multiple nodes to ensure availability in the case of failures.
+* **Consensus Algorithms**: Mechanisms such as Paxos and Raft are used to achieve agreement among nodes despite failures.
-=== 5. Fault Tolerance ===
+=== 5. Scalability ===
-A defining characteristic of distributed systems is their ability to remain operational in the presence of faults. Techniques such as redundancy, checkpointing, and recovery protocols are employed to ensure fault tolerance and disaster recovery.
+Scalability refers to the ability of a system to handle increasing loads by adding more resources. Distributed systems may be designed for:
+* **Vertical Scaling**: Adding more resources (CPU, memory) to existing nodes.
+* **Horizontal Scaling**: Adding more nodes to the system, distributing the workload.
-=== 6. Scalability ===
+== Usage and Implementation ==
-Scalability indicates the ability to manage increases in workload by adding resources, either by scaling up (adding more power to existing machines) or scaling out (adding more machines). Distributed systems are typically designed with horizontal scalability in mind, allowing them to accommodate growing demands seamlessly.
-== Usage and Implementation ==
+Distributed systems find application in numerous fields, including cloud computing, data storage and management, web services, and enterprise applications. Below are some prominent implementations and their use cases:
-Distributed systems have found applications across various domains. Significant areas of usage include:
 === 1. Cloud Computing ===
-Cloud computing platforms deliver distributed resources and services over the internet. They allow users to access virtualized resources, such as storage and processing power, without requiring on-premise infrastructure. Popular cloud services such as AWS, Google Cloud Platform, and Microsoft Azure operate on complex distributed architectures to support their offerings.
+Cloud platforms such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform are built on distributed systems principles. They enable users to provision resources, deploy applications, and scale services dynamically across multiple geographical locations.
 === 2. Distributed Databases ===
-Distributed databases store data across multiple nodes. They provide scalability and fault tolerance, with well-known systems including Apache Cassandra, Amazon DynamoDB, and Google Spanner. These databases often implement complex consistency models and data partitioning strategies to maintain performance and reliability.
+These databases, such as Apache Cassandra, MongoDB, and CockroachDB, leverage distribution to provide scalability and fault tolerance for data storage. They allow for high availability and can handle large volumes of transactions across distributed nodes.
-=== 3. Peer-to-Peer Networks ===
+=== 3. Microservices Architecture ===
-Peer-to-peer (P2P) networks enable direct communication and resource sharing among nodes without requiring an intermediary. P2P file sharing systems such as BitTorrent and cryptocurrency networks like Bitcoin exemplify this structure, where each participant acts as both a client and a server.
+The microservices pattern promotes the development of applications as a suite of small, independent services that communicate over a network. This architecture enhances scalability, as services can be developed, deployed, and scaled independently.
-=== 4. Sensor Networks ===
+=== 4. Peer-to-Peer Networks ===
-Distributed systems are integral to sensor networks that involve numerous sensor nodes collecting and processing data from extensive geographic areas. This technology is often leveraged in applications like environmental monitoring, smart cities, and industrial automation.
+In peer-to-peer (P2P) systems, nodes act as both clients and servers, sharing resources directly with each other. P2P applications include file sharing (e.g., BitTorrent) and cryptocurrency networks (e.g., Bitcoin), which capitalize on the decentralized nature of distributed systems.
-=== 5. Distributed Computing Frameworks ===
+=== 5. Big Data Processing ===
-Frameworks such as Apache Hadoop and Apache Spark enable the processing of large datasets across clusters of computers. They allow developers to build applications that can handle vast data streams and perform complex calculations efficiently.
+Frameworks such as Apache Hadoop and Apache Spark utilize distributed systems to perform large-scale data processing tasks. These frameworks enable the analysis of massive datasets across clusters of machines, allowing businesses to derive insights and make data-driven decisions.
 == Real-world Examples ==
-Numerous real-world examples illustrate the functionality of distributed systems:
-=== 1. Internet of Things (IoT) ===
+Distributed systems are prevalent in various domains and industries. Here are several notable examples:
-The IoT is a network of interconnected devices that communicate and exchange data, forming a distributed system. Smart home devices, wearables, and industrial IoT systems showcase how distributed architectures can enable automation and enhance user experiences.
-=== 2. Netflix ===
+=== 1. Google MapReduce ===
-Netflix employs a sophisticated distributed system architecture to provide streaming services to millions of users. The platform utilizes microservices architecture, which enhances scalability and resilience against failures, ensuring uninterrupted service delivery.
+MapReduce is a programming model designed for processing large datasets with a distributed algorithm on a cluster. Google utilized it to index the web and extract meaningful data, revolutionizing data processing capabilities.
-=== 3. Google Search ===
+=== 2. Amazon DynamoDB ===
-Google's search engine operates on a distributed architecture capable of indexing and searching vast amounts of web data across multiple servers worldwide. Its innovative use of distributed data storage, algorithmic processing, and redundancy ensures rapid and reliable search results.
+DynamoDB is a fully managed NoSQL database service designed to handle high-traffic workloads while providing low latency data access. It scales automatically and offers high availability across multiple regions.
-=== 4. Blockchain ===
+=== 3. Bitcoin Blockchain ===
-Blockchain technology, which underlies cryptocurrencies and other applications, operates as a distributed ledger system. It allows for secure, transparent, and tamper-proof transactions through consensus mechanisms that require agreement among multiple distributed nodes.
+The Bitcoin blockchain operates as a distributed ledger that ensures transactions are securely recorded across a network of nodes. It employs a consensus algorithm known as proof-of-work to validate transactions and maintain the integrity of the ledger.
+=== 4. Apache Kafka ===
+Kafka is a distributed streaming platform that enables the building of real-time data pipelines and streaming applications. It is designed to handle high-throughput data feeds, making it a fundamental component in microservices architectures.
+=== 5. Kubernetes ===
+Kubernetes is an orchestration platform for automating the deployment, scaling, and management of containerized applications. It operates in a distributed manner, allowing developers to manage clusters of machines efficiently.
 == Criticism and Controversies ==
-Distributed systems also face various criticisms and challenges, which may affect their implementation and acceptance:
+While distributed systems offer numerous advantages, they are not without challenges and criticisms. Concerns include:
 === 1. Complexity ===
-The complexity of designing, developing, and maintaining distributed systems can be daunting. Ensuring reliability, consistency, and performance requires sophisticated algorithms and a deep understanding of distributed computing principles. This complexity can lead to increased chances of failure and challenging debugging processes.
+The design, implementation, and maintenance of distributed systems can be significantly more complex than centralized systems. Debugging and troubleshooting issues can be particularly challenging due to the involvement of multiple components and potential network-related problems.
 === 2. Security Issues ===
-Distributed systems are often more vulnerable to security issues than centralized systems. The distributed nature of these systems can expose them to various attack vectors, including Denial-of-Service (DoS) attacks and data interception. Ensuring data security and integrity in a distributed environment remains a significant challenge for developers.
+Distributed systems can introduce vulnerabilities, especially when nodes communicate over insecure networks. Ensuring data security, privacy, and integrity across distributed components is an ongoing challenge that must be addressed through robust security mechanisms.
+=== 3. Performance Overheads ===
+Communication between distributed nodes can introduce latency, impacting overall system performance. Optimizing data exchange and ensuring efficient communication protocols are critical to mitigating these challenges.
-=== 3. Consistency vs. Availability ===
+=== 4. Data Consistency Challenges ===
-Distributed systems often face the trade-off between consistency and availability, known as the CAP theorem. This theorem states that in any distributed system, it is impossible to simultaneously guarantee consistency, availability, and partition tolerance. Designers must make critical decisions regarding these properties based on their specific use cases.
+Achieving strong consistency in distributed systems can lead to trade-offs with availability and performance, notably in the presence of network partitions. The CAP theorem postulates that it's impossible to achieve all three properties (Consistency, Availability, Partition tolerance) simultaneously, necessitating design decisions that can impact system behavior.
-=== 4. Ownership and Governance ===
+=== 5. Vendor Lock-In ===
-In peer-to-peer and decentralized systems, issues regarding ownership, governance, and control can arise. Questions about who owns the data, how decisions are made, and the implications of a decentralized system can lead to controversies surrounding privacy and accountability.
+As organizations adopt cloud-based distributed solutions, they may become dependent on specific vendors, leading to potential lock-in situations where migration to alternative platforms becomes costly and complicated.
 == Influence and Impact ==
-Distributed systems have had a profound influence on computing, leading to:
-=== 1. Advancement of Cloud Technologies ===
+The evolution of distributed systems has had a profound influence on various fields, shaping technology, infrastructure, and practices both in industry and academia. The impact includes:
-The rise of distributed systems has paved the way for cloud technologies that have transformed how businesses operate. Organizations can now access high-quality services with reduced costs, offering scalability and flexibility previously unattainable.
+=== 1. Evolution of Software Development ===
+The adoption of distributed architectures has driven the transition from monolithic application development to more modular and agile approaches. The emergence of microservices architecture aligns with contemporary development paradigms that emphasize automation, continuous integration, and deployment.
+=== 2. Growth of Cloud Computing ===
+The establishment of distributed systems has been instrumental in the rise of cloud computing paradigms. Organizations can leverage cloud-based resources to achieve scalability and flexibility, leading to cost savings and improved operational efficiency.
-=== 2. Innovations in Data Handling ===
+=== 3. Transforming Data Analytics ===
-Distributed systems have driven innovations in big data processing and analytics. With the exponential growth of data generated daily, distributed data management and processing frameworks have become essential for harnessing that data for meaningful insights.
+Distributed systems have transformed the landscape of data analytics by enabling large-scale processing and analysis of big data, allowing organizations to harness insights from vast datasets that were previously infeasible to manage.
-=== 3. Improvement in Resilience and Reliability ===
+=== 4. Innovations in Networking and Infrastructure ===
-The emphasis on fault tolerance and redundancy in distributed systems has led to architectural improvements across various applications, enhancing resilience and ensuring minimal downtime across critical services.
+The design principles of distributed systems have influenced advancements in network infrastructure, leading to the proliferation of content delivery networks (CDNs), edge computing, and enhanced network protocols that support efficient communication across distributed environments.
-=== 4. Decentralization Movement ===
+=== 5. Academic Research ===
-Distributed systems have fueled the decentralization movement in technology. From blockchain solutions to developments in peer-to-peer networks, there is a growing shift away from centralized authority structures, promoting autonomy and privacy for users.
+Distributed systems continue to be a vibrant area of academic research, contributing to advancements in algorithms, protocols, and methodologies that address key challenges such as fault tolerance, consensus, and performance optimization.
-== See Also ==
+== See also ==
-* [[Cloud computing]]
+* [[Cloud Computing]]
-* [[Peer-to-peer network]]
+* [[Microservices]]
-* [[Distributed database]]
+* [[Peer-to-Peer Networks]]
-* [[Consensus algorithm]]
+* [[Blockchain]]
-* [[Microservices architecture]]
+* [[Distributed Databases]]
-* [[Internet of Things]]
+* [[Grid Computing]]
+* [[Consensus Algorithms]]
+* [[CAP Theorem]]
+* [[Fault Tolerance]]
 == References ==
-* [https://www.cloudpronews.com Distributed Systems Overview - CloudPro News]
+* [https://www.darwinsys.com/ Distributed Systems Overview]
-* [https://www.ibm.com/cloud/learn/distributed-systems IBM Cloud - Understanding Distributed Systems]
+* [https://researchgate.net/publication/319208064 Distributed Systems: Principles and Paradigms]
-* [https://www.microsoft.com/en-us/research/publication/distributed-systems-introduction Microsoft Research - Introduction to Distributed Systems]
+* [https://aws.amazon.com/architecture/distributed-systems/ AWS Distributed Services]
-* [https://www.oreilly.com/library/view/distributed-systems/9781491942459/ Designing Data-Intensive Applications: Distributed Systems by Martin Kleppmann]
+* [https://towardsdatascience.com/a-comprehensive-guide-to-distributed-systems-for-practitioners-290b9d3f9b4c Comprehensive Guide to Distributed Systems]
-* [https://cacm.acm.org/magazines/2019/6/233319-distributed-systems-in-the-cloud/fulltext Distributed Systems in the Cloud - Communications of the ACM]
+* [https://cassandra.apache.org/_/index.html Apache Cassandra Official Documentation]
 [[Category:Distributed computing]]
 [[Category:Computer science]]
-[[Category:Information technology]]
+[[Category:Systems architecture]]