Distributed Systems: Difference between revisions

Latest revision as of 09:49, 6 July 2025

Distributed systems is a field of computer science and engineering concerned with collections of independent entities that appear to applications as a single coherent system. These entities are typically computers, or nodes, that communicate and coordinate their actions by passing messages to one another. Unlike centralized systems, where a single node or server performs all processing and serves all clients, distributed systems spread work across multiple interconnected machines, which promotes scalability, robustness, and resource sharing.

Background or History

Distributed systems are not a recent development; their roots reach back to the early decades of computing. The origins of distributed computing can be linked to the ARPANET project of the late 1960s and early 1970s, one of the first packet-switching networks. As the internet evolved and computers became more interconnected, the need for a standardized model of distributed communication became evident. Key theoretical advances, such as Leslie Lamport's work on logical clocks and event ordering in the late 1970s and his later Paxos consensus algorithm (published in 1998), further guided the development of distributed systems.

Throughout the 1980s and 1990s, rapid advancements in networking technologies spurred the evolution of distributed systems research. Notably, the development of remote procedure calls (RPC) allowed programs on one computer to invoke services executed on another machine, giving rise to a range of distributed applications. The rise of client-server architecture marked significant progress, enabling applications to scale by distributing workloads efficiently across numerous clients and servers.

By the turn of the 21st century, grid computing and then cloud computing had emerged, firmly entrenching distributed systems in practical applications across many industries. This new wave of systems made it possible to pool computational resources across expansive networks while addressing problems such as resource management, load balancing, and fault tolerance.

Architecture or Design

Distributed systems are characterized by various architectural models that determine how the components within the system interact with each other. Generally, there are three primary architectural styles for distributed systems: client-server, peer-to-peer, and multi-tier architectures.

Client-Server Architecture

In the client-server model, a dedicated server hosts resources or services that are accessed by multiple client nodes. The clients typically initiate requests that the server processes and responds to. A notable benefit of this model is the centralized management of resources, which simplifies data consistency and security protocols. However, this architecture may face bottlenecks if the server becomes overloaded, negatively impacting performance.
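The request and response cycle described above can be illustrated with a minimal sketch using Python's standard socket module. The function names (serve_once, send_request, demo) and the "echo:" reply format are hypothetical choices made for illustration; a real server would handle many clients, partial reads, and errors.

```python
import socket
import threading


def serve_once(server_sock):
    """Server side: accept one client, read its request, send a response."""
    conn, _addr = server_sock.accept()
    with conn:
        request = conn.recv(1024).decode("utf-8")
        conn.sendall(f"echo: {request}".encode("utf-8"))


def send_request(host, port, message):
    """Client side: initiate the exchange and wait for the server's reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(message.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


def demo():
    """Run one server and one client on the loopback interface."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0 asks the OS for a free port
    server_sock.listen()
    port = server_sock.getsockname()[1]

    worker = threading.Thread(target=serve_once, args=(server_sock,), daemon=True)
    worker.start()
    try:
        return send_request("127.0.0.1", port, "hello")
    finally:
        worker.join(timeout=5)
        server_sock.close()
```

Because every client talks to the same server socket, this single process is also where the bottleneck mentioned above would appear under load.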

Peer-to-Peer Architecture

Peer-to-peer (P2P) systems distribute workloads among participants, allowing nodes to act both as clients and servers. This decentralized approach can improve resource utilization and resilience against failures, as each node can contribute resources to the system. P2P systems are commonly associated with file-sharing protocols and cryptocurrencies, yet they also present challenges such as security vulnerabilities and maintaining data consistency across numerous nodes.

Multi-Tier Architecture

Multi-tier architecture introduces additional layers between clients and servers. In this model, the system is divided into three or more tiers, with each tier responsible for specific functions within the application. Commonly, these tiers include the presentation layer, business logic layer, and data layer. This separation of concerns allows for easier management of the system while promoting scalability and flexibility. Multi-tier architectures are widely utilized in web applications and enterprise software systems.

Communication Mechanisms

Effective communication is a cornerstone of distributed systems, and numerous protocols facilitate interactions among nodes. These mechanisms can be categorized as either synchronous or asynchronous. Synchronous communication requires a node to wait for a response before proceeding, which can hinder system performance when delays occur. Asynchronous communication, by contrast, allows a node to continue processing while a response is pending, which improves efficiency. Messaging protocols such as Message Queuing Telemetry Transport (MQTT) and Advanced Message Queuing Protocol (AMQP), along with the ubiquitous HTTP, are often used to carry these interactions.
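The difference between waiting on each response in turn and keeping several requests in flight can be sketched with Python's asyncio. The fake_request coroutine below is a hypothetical stand-in for a network call (it only sleeps), so the timings are illustrative rather than measurements of any real protocol.

```python
import asyncio
import time


async def fake_request(node, delay):
    """Hypothetical stand-in for a network call: wait `delay` seconds, then reply."""
    await asyncio.sleep(delay)
    return f"reply from {node}"


async def sequential(nodes, delay):
    # Synchronous style: each request must complete before the next one starts.
    return [await fake_request(node, delay) for node in nodes]


async def concurrent(nodes, delay):
    # Asynchronous style: all requests are in flight at the same time.
    return await asyncio.gather(*(fake_request(node, delay) for node in nodes))


def timed(coro_fn, nodes, delay):
    """Run one of the strategies above and return (replies, elapsed_seconds)."""
    start = time.perf_counter()
    replies = asyncio.run(coro_fn(nodes, delay))
    return replies, time.perf_counter() - start
```

With three nodes and a fixed delay, the sequential version takes roughly three delays in total, while the concurrent version takes roughly one; both return the same replies in the same order.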

Implementation or Applications

The implementation of distributed systems spans various domains, including cloud computing, distributed databases, content delivery networks, and microservices architecture.

Cloud Computing

Cloud computing has redefined the allocation of computational resources. It operates on the principles of distributed systems, offering multiple services that can be accessed over the internet. Major cloud service providers, including Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP), maintain large-scale distributed systems that provide computing power, storage, and application services to users worldwide. These platforms leverage the advantages of elasticity and resource pooling, enabling organizations to scale services according to demand.

Distributed Databases

Distributed databases are a critical application of distributed systems. They allow data to be stored across multiple nodes, enhancing both performance and reliability. This architecture supports horizontal scaling, which is essential for handling vast amounts of data. Notable distributed databases include MongoDB, Cassandra, and Amazon DynamoDB, which implement various consistency models to ensure data reliability. The deployment of distributed databases enables seamless data access across different geographical regions, promoting fault tolerance and high availability.
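One common way to spread data across nodes for horizontal scaling is consistent hashing. The sketch below is an assumption-laden illustration of that general technique, not the implementation of any database named above; the HashRing class, the vnodes parameter, and the node names are all hypothetical.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a 64-bit position on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node owns several points so load spreads more evenly.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _node in self._ring]

    def node_for(self, key):
        # A key belongs to the first ring point clockwise from its hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

The property that makes this attractive for scaling out is that adding a node only moves keys onto the new node; keys never shuffle between the existing nodes.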

Content Delivery Networks (CDNs)

CDNs utilize distributed systems to enhance the efficiency and speed of content delivery over the internet. By caching content across numerous geographical locations, CDNs ensure that users experience minimal latency and faster load times. This approach is particularly beneficial for media streaming and online services, where performance is critical. Major CDN providers, such as Akamai and Cloudflare, operate extensive networks of servers that store duplicated content, improving both redundancy and access speed.
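The caching behavior described above can be sketched as a small time-to-live (TTL) cache sitting in front of an origin server. This is a simplified model, not how any particular CDN works; EdgeCache, fetch_from_origin, and the injected clock are hypothetical names introduced for the example.

```python
class EdgeCache:
    """Toy edge cache: serve stored copies until they expire, then refetch."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock):
        self._fetch = fetch_from_origin  # hypothetical callable: path -> content
        self._ttl = ttl_seconds
        self._clock = clock              # injected so expiry is easy to test
        self._store = {}                 # path -> (content, expires_at)
        self.hits = 0
        self.misses = 0

    def get(self, path):
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and now < entry[1]:
            # Fresh copy available locally: no trip to the origin.
            self.hits += 1
            return entry[0]
        # Missing or expired: fetch from the origin and store a fresh copy.
        self.misses += 1
        content = self._fetch(path)
        self._store[path] = (content, now + self._ttl)
        return content
```

In practice the clock argument could be time.monotonic; passing it in keeps the expiry logic deterministic when testing.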

Microservices Architecture

The microservices architectural style emphasizes the development of applications as independent services that can communicate through APIs. This distributed approach facilitates continuous development, deployment, and scaling of software applications. By breaking down a monolithic application into smaller, manageable components, organizations can efficiently allocate resources and enhance productivity. Tools and frameworks, such as Spring Boot and Kubernetes, have emerged to streamline the implementation of microservices-based architectures.

Real-world Examples

Distributed systems have been implemented in various industries, showcasing their versatility and effectiveness in solving complex problems.

Distributed File Systems

Distributed file systems, like Hadoop Distributed File System (HDFS) and Google File System (GFS), exemplify effective storage solutions that distribute data across multiple nodes. These systems ensure high availability and fault tolerance while allowing users to operate on massive datasets distributed across clusters of machines. Organizations frequently employ these systems for big data processing and analytics tasks, taking advantage of their scalability.

Blockchain Technology

Blockchain technology operates on principles of distributed systems, utilizing a decentralized ledger to verify and store transactions across multiple nodes. This architecture underpins cryptocurrencies, such as Bitcoin and Ethereum, enabling peer-to-peer transactions without the need for intermediaries. The consensus mechanisms employed by blockchain networks, including proof of work and proof of stake, ensure data integrity and security while demonstrating the application of distributed systems in fostering trust among participants.
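The idea behind proof of work can be illustrated with a toy example: a node searches for a nonce that makes a block's hash start with a required number of zeros, and any other node can verify the result with a single hash. This sketch is deliberately simplified and is an assumption about how to demonstrate the concept; real networks such as Bitcoin use a binary block header and a numeric target rather than the string format and hex-prefix check used here.

```python
import hashlib


def block_hash(prev_hash, payload, nonce):
    """Hash the block contents together with the candidate nonce."""
    data = f"{prev_hash}|{payload}|{nonce}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def mine(prev_hash, payload, difficulty):
    """Search for a nonce whose hash starts with `difficulty` hex zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while not block_hash(prev_hash, payload, nonce).startswith(prefix):
        nonce += 1
    return nonce


def is_valid(prev_hash, payload, nonce, difficulty):
    """Verification is cheap: one hash, regardless of how long mining took."""
    return block_hash(prev_hash, payload, nonce).startswith("0" * difficulty)
```

Finding the nonce takes many attempts on average, while checking it takes one, and changing the payload changes the hash, so a tampered block would generally need to be mined again.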

Distributed Computing Frameworks

Frameworks like Apache Spark and Apache Flink provide robust platforms for distributed data processing. They enable the execution of complex data analytics tasks across clusters of computers, harnessing their combined computational power. These frameworks support fault tolerance and dynamic scaling, significantly boosting performance and enabling organizations to process large volumes of data in real time.

Industrial IoT Systems

In the domain of the Internet of Things (IoT), distributed systems facilitate the connectivity and coordination of numerous smart devices. Industrial IoT systems employ distributed architectures to gather and analyze data from various sensors and devices, enabling real-time monitoring and decision-making. These applications have proven invaluable in manufacturing, where they enhance operational efficiency and predictive maintenance, reducing downtime and costs.

Criticism or Limitations

Despite their numerous advantages, distributed systems face a host of challenges and limitations that can impact their effectiveness.

Complexity and Debugging

One notable challenge associated with distributed systems is the inherent complexity of designing, implementing, and managing such architectures. As the number of nodes increases, the difficulty of monitoring and troubleshooting also escalates. Issues such as network partitions, data inconsistency, and system failures can arise, often complicating debugging processes. Effective debugging tools and logging mechanisms are essential to mitigate these challenges and ensure system reliability.

Latency and Performance Overheads

Distributed systems can suffer from latency due to the time taken for messages to travel across networks. Additionally, performance overheads may result from the need for coordination among nodes, particularly in tightly coupled systems that require frequent communication. Strategies such as data locality, caching, and making interactions coarser-grained (for example, batching many small requests into fewer large ones) are often employed to minimize latency and optimize performance.
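A back-of-the-envelope cost model shows why fewer, larger interactions help when network round trips dominate. The model and its parameter names (rtt_us, per_item_us) are assumptions made for illustration; it ignores bandwidth limits, queuing, and parallelism.

```python
def total_time_us(items, batch_size, rtt_us, per_item_us):
    """Estimate total time in microseconds to send `items` in batches.

    Each batch costs one network round trip (rtt_us); each item also costs
    a fixed amount of processing time (per_item_us).
    """
    round_trips = (items + batch_size - 1) // batch_size  # ceiling division
    return round_trips * rtt_us + items * per_item_us
```

For example, with a 2,000 microsecond round trip and 10 microseconds of work per item, sending 1,000 items one at a time costs 2,010,000 microseconds under this model, while sending them in batches of 100 costs 30,000 microseconds.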

Security Concerns

Security is a critical concern in distributed systems, as the increased number of nodes and communication pathways provides more potential attack vectors for malicious actors. Ensuring data integrity, confidentiality, and authentication across distributed environments poses significant challenges. Best practices, such as employing encryption, access control, and network segmentation, are vital to safeguard distributed systems against evolving security threats.

Consistency Models

The trade-off between consistency, availability, and partition tolerance, known as the CAP theorem, underscores a major limitation of distributed systems. The theorem states that a distributed system cannot guarantee all three properties at once: when a network partition occurs, the system must give up either consistency or availability. Developers must therefore make informed choices about how to maintain data accuracy, especially when operating under network partitions. Consistency models such as eventual consistency and strong consistency each carry specific benefits and drawbacks suited to different application requirements.
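One concrete knob that some replicated stores expose for this trade-off is quorum sizing, which the text above does not describe and is introduced here only as an illustrative assumption. With N replicas, a write acknowledged by W replicas and a read that consults R replicas are guaranteed to share at least one replica whenever R + W > N. The sketch below states that rule and checks it by brute force; the function names are hypothetical.

```python
from itertools import combinations


def quorums_overlap(n, w, r):
    """Quorum rule: every read set meets every write set iff r + w > n."""
    return r + w > n


def brute_force_overlap(n, w, r):
    """Check the same property by enumerating every possible replica set."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )
```

Choosing smaller R and W lowers latency and keeps the system available when some replicas are unreachable, at the cost of reads that may miss the latest write, which is the kind of choice the consistency models above formalize.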

References

@@ Line 1: / Line 1: @@
-= Distributed Systems =
+'''Distributed Systems''' is a field within computer science and engineering that encompasses a collection of independent entities that appear to applications as a single coherent system. These entities may include multiple computers, or nodes, that communicate and coordinate their actions by passing messages to one another. Contrary to centralized systems, where a single node or server performs all processing and serves all clients, distributed systems leverage the power of multiple interconnected systems, promoting scalability, robustness, and resource sharing.
-== Introduction ==
+== Background or History ==
-A distributed system is a model in which components located on networked computers communicate and coordinate their actions by passing messages. The components interact with one another in order to achieve a common goal, and they appear to the users of the system as a single coherent system. This type of system is characterized by its lack of a shared physical memory and can include hardware and software components that are located in different physical locations. Distributed systems are designed to handle large-scale operations and offer advantages such as increased reliability, scalability, and flexibility.
+Distributed systems are not a recent development; their roots reach back to the early decades of computing. The origins of distributed computing can be linked to the ARPANET project of the late 1960s and early 1970s, one of the first packet-switching networks. As the internet evolved and computers became more interconnected, the need for a standardized model of distributed communication became evident. Key theoretical advances, such as Leslie Lamport's work on logical clocks and event ordering in the late 1970s and his later Paxos consensus algorithm (published in 1998), further guided the development of distributed systems.
-== History ==
+Throughout the 1980s and 1990s, rapid advancements in networking technologies spurred the evolution of distributed systems research. Notably, the development of remote procedure calls (RPC) allowed programs on one computer to invoke services executed on another machine, giving rise to a range of distributed applications. The rise of client-server architecture marked significant progress, enabling applications to scale by distributing workloads efficiently across numerous clients and servers.
-The concept of distributed systems dates back to the early 1970s when the need for shared resources and collaboration between multiple computers became apparent. One of the first instances of a distributed system was the development of ARPANET, the precursor to the modern Internet, which connected various research institutions and allowed for the sharing of information and resources.
-In the 1980s and 1990s, advances in network technologies, such as Ethernet and TCP/IP, began to revolutionize the way computers communicated with each other. This era saw the introduction of various distributed computing models and frameworks, including Remote Procedure Calls (RPC), and the emergence of systems like the Andrew File System (AFS) and distributed databases.
+By the turn of the 21st century, grid computing and cloud computing emerged, firmly entrenching distributed systems in practical applications across various industries. This new wave of distributed systems allowed for leverage of computational resources over expansive networks, effectively addressing problems such as resource management, load balancing, and fault tolerance.
-The late 1990s and early 2000s introduced the rise of cloud computing, which further popularized distributed systems. Platforms such as Amazon Web Services (AWS) demonstrated the scalability of distributed architectures and provided businesses with unprecedented access to computing resources without needing to invest in physical infrastructure.
+== Architecture or Design ==
+Distributed systems are characterized by various architectural models that determine how the components within the system interact with each other. Generally, there are three primary architectural styles for distributed systems: client-server, peer-to-peer, and multi-tier architectures.
-== Design and Architecture ==
+=== Client-Server Architecture ===
-Designing a distributed system involves various challenges unique to its architecture. Key components of distributed system architecture include:
+In the client-server model, a dedicated server hosts resources or services that are accessed by multiple client nodes. The clients typically initiate requests that the server processes and responds to. A notable benefit of this model is the centralized management of resources, which simplifies data consistency and security protocols. However, this architecture may face bottlenecks if the server becomes overloaded, negatively impacting performance.
-=== 1. Nodes ===
+=== Peer-to-Peer Architecture ===
-Nodes refer to individual computing devices within a distributed system. Each node operates independently but collaborates with other nodes to perform tasks. Nodes can be heterogeneous, including different types of hardware and operating systems.
+Peer-to-peer (P2P) systems distribute workloads among participants, allowing nodes to act both as clients and servers. This decentralized approach can improve resource utilization and resilience against failures, as each node can contribute resources to the system. P2P systems are commonly associated with file-sharing protocols and cryptocurrencies, yet they also present challenges such as security vulnerabilities and maintaining data consistency across numerous nodes.
-=== 2. Communication ===
+=== Multi-Tier Architecture ===
-Effective communication is crucial for the functioning of distributed systems. This usually takes place through message passing over a network. Several protocols, such as Message Queuing Telemetry Transport (MQTT) and Advanced Message Queuing Protocol (AMQP), facilitate communication by ensuring reliable message delivery and appropriate handling of communication failures.
+Multi-tier architecture introduces additional layers between clients and servers. In this model, the system is divided into three or more tiers, with each tier responsible for specific functions within the application. Commonly, these tiers include the presentation layer, business logic layer, and data layer. This separation of concerns allows for easier management of the system while promoting scalability and flexibility. Multi-tier architectures are widely utilized in web applications and enterprise software systems.
-=== 3. Coordination ===
+=== Communication Mechanisms ===
-Coordination among distributed components is necessary for maintaining consistency and synchronization. Algorithms like the Paxos or Raft consensus algorithms provide methods for achieving agreement among nodes in the presence of failures.
+Effective communication is a cornerstone of distributed systems, and numerous protocols facilitate interactions among nodes. These mechanisms can be categorized as either synchronous or asynchronous. Synchronous communication requires a node to wait for a response before proceeding, which can hinder system performance when delays occur. Asynchronous communication, by contrast, allows a node to continue processing while a response is pending, which improves efficiency. Messaging protocols such as Message Queuing Telemetry Transport (MQTT) and Advanced Message Queuing Protocol (AMQP), along with the ubiquitous HTTP, are often used to carry these interactions.
-=== 4. Data Management ===
+== Implementation or Applications ==
-Distributed data management involves ensuring that data is stored across various nodes in a way that is both reliable and accessible. Techniques such as replication, sharding, and partitioning are commonly utilized to enhance data durability and performance.
+The implementation of distributed systems spans various domains, including cloud computing, distributed databases, content delivery networks, and microservices architecture.
-=== 5. Fault Tolerance ===
+=== Cloud Computing ===
-A defining characteristic of distributed systems is their ability to remain operational in the presence of faults. Techniques such as redundancy, checkpointing, and recovery protocols are employed to ensure fault tolerance and disaster recovery.
+Cloud computing has redefined the allocation of computational resources. It operates on the principles of distributed systems, offering multiple services that can be accessed over the internet. Major cloud service providers, including Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP), maintain large-scale distributed systems that provide computing power, storage, and application services to users worldwide. These platforms leverage the advantages of elasticity and resource pooling, enabling organizations to scale services according to demand.
-=== 6. Scalability ===
+=== Distributed Databases ===
-Scalability indicates the ability to manage increases in workload by adding resources, either by scaling up (adding more power to existing machines) or scaling out (adding more machines). Distributed systems are typically designed with horizontal scalability in mind, allowing them to accommodate growing demands seamlessly.
+Distributed databases are a critical application of distributed systems. They allow data to be stored across multiple nodes, enhancing both performance and reliability. This architecture supports horizontal scaling, which is essential for handling vast amounts of data. Notable distributed databases include MongoDB, Cassandra, and Amazon DynamoDB, which implement various consistency models to ensure data reliability. The deployment of distributed databases enables seamless data access across different geographical regions, promoting fault tolerance and high availability.
-== Usage and Implementation ==
+=== Content Delivery Networks (CDNs) ===
-Distributed systems have found applications across various domains. Significant areas of usage include:
+CDNs utilize distributed systems to enhance the efficiency and speed of content delivery over the internet. By caching content across numerous geographical locations, CDNs ensure that users experience minimal latency and faster load times. This approach is particularly beneficial for media streaming and online services, where performance is critical. Major CDN providers, such as Akamai and Cloudflare, operate extensive networks of servers that store duplicated content, improving both redundancy and access speed.
-=== 1. Cloud Computing ===
+=== Microservices Architecture ===
-Cloud computing platforms deliver distributed resources and services over the internet. They allow users to access virtualized resources, such as storage and processing power, without requiring on-premise infrastructure. Popular cloud services such as AWS, Google Cloud Platform, and Microsoft Azure operate on complex distributed architectures to support their offerings.
+The microservices architectural style emphasizes the development of applications as independent services that can communicate through APIs. This distributed approach facilitates continuous development, deployment, and scaling of software applications. By breaking down a monolithic application into smaller, manageable components, organizations can efficiently allocate resources and enhance productivity. Tools and frameworks, such as Spring Boot and Kubernetes, have emerged to streamline the implementation of microservices-based architectures.
-=== 2. Distributed Databases ===
-Distributed databases store data across multiple nodes. They provide scalability and fault tolerance, with well-known systems including Apache Cassandra, Amazon DynamoDB, and Google Spanner. These databases often implement complex consistency models and data partitioning strategies to maintain performance and reliability.
-=== 3. Peer-to-Peer Networks ===
-Peer-to-peer (P2P) networks enable direct communication and resource sharing among nodes without requiring an intermediary. P2P file sharing systems such as BitTorrent and cryptocurrency networks like Bitcoin exemplify this structure, where each participant acts as both a client and a server.
-=== 4. Sensor Networks ===
-Distributed systems are integral to sensor networks that involve numerous sensor nodes collecting and processing data from extensive geographic areas. This technology is often leveraged in applications like environmental monitoring, smart cities, and industrial automation.
-=== 5. Distributed Computing Frameworks ===
-Frameworks such as Apache Hadoop and Apache Spark enable the processing of large datasets across clusters of computers. They allow developers to build applications that can handle vast data streams and perform complex calculations efficiently.
 == Real-world Examples ==
-Numerous real-world examples illustrate the functionality of distributed systems:
+Distributed systems have been implemented in various industries, showcasing their versatility and effectiveness in solving complex problems.
-=== 1. Internet of Things (IoT) ===
-The IoT is a network of interconnected devices that communicate and exchange data, forming a distributed system. Smart home devices, wearables, and industrial IoT systems showcase how distributed architectures can enable automation and enhance user experiences.
-=== 2. Netflix ===
-Netflix employs a sophisticated distributed system architecture to provide streaming services to millions of users. The platform utilizes microservices architecture, which enhances scalability and resilience against failures, ensuring uninterrupted service delivery.
-=== 3. Google Search ===
-Google's search engine operates on a distributed architecture capable of indexing and searching vast amounts of web data across multiple servers worldwide. Its innovative use of distributed data storage, algorithmic processing, and redundancy ensures rapid and reliable search results.
-=== 4. Blockchain ===
-Blockchain technology, which underlies cryptocurrencies and other applications, operates as a distributed ledger system. It allows for secure, transparent, and tamper-proof transactions through consensus mechanisms that require agreement among multiple distributed nodes.
-== Criticism and Controversies ==
-Distributed systems also face various criticisms and challenges, which may affect their implementation and acceptance:
-=== 1. Complexity ===
+=== Distributed File Systems ===
-The complexity of designing, developing, and maintaining distributed systems can be daunting. Ensuring reliability, consistency, and performance requires sophisticated algorithms and a deep understanding of distributed computing principles. This complexity can lead to increased chances of failure and challenging debugging processes.
+Distributed file systems, like Hadoop Distributed File System (HDFS) and Google File System (GFS), exemplify effective storage solutions that distribute data across multiple nodes. These systems ensure high availability and fault tolerance while allowing users to operate on massive datasets distributed across clusters of machines. Organizations frequently employ these systems for big data processing and analytics tasks, taking advantage of their scalability.
-=== 2. Security Issues ===
+=== Blockchain Technology ===
-Distributed systems are often more vulnerable to security issues than centralized systems. The distributed nature of these systems can expose them to various attack vectors, including Denial-of-Service (DoS) attacks and data interception. Ensuring data security and integrity in a distributed environment remains a significant challenge for developers.
+Blockchain technology operates on principles of distributed systems, utilizing a decentralized ledger to verify and store transactions across multiple nodes. This architecture underpins cryptocurrencies, such as Bitcoin and Ethereum, enabling peer-to-peer transactions without the need for intermediaries. The consensus mechanisms employed by blockchain networks, including proof of work and proof of stake, ensure data integrity and security while demonstrating the application of distributed systems in fostering trust among participants.
-=== 3. Consistency vs. Availability ===
+=== Distributed Computing Frameworks ===
-Distributed systems often face the trade-off between consistency and availability, known as the CAP theorem. This theorem states that in any distributed system, it is impossible to simultaneously guarantee consistency, availability, and partition tolerance. Designers must make critical decisions regarding these properties based on their specific use cases.
+Frameworks like Apache Spark and Apache Flink provide robust platforms for distributed data processing. They enable the execution of complex data analytics tasks across clusters of computers, harnessing their combined computational power. These frameworks support fault tolerance and dynamic scaling, significantly boosting performance and enabling organizations to process large volumes of data in real time.
-=== 4. Ownership and Governance ===
+=== Industrial IoT Systems ===
-In peer-to-peer and decentralized systems, issues regarding ownership, governance, and control can arise. Questions about who owns the data, how decisions are made, and the implications of a decentralized system can lead to controversies surrounding privacy and accountability.
+In the domain of the Internet of Things (IoT), distributed systems facilitate the connectivity and coordination of numerous smart devices. Industrial IoT systems employ distributed architectures to gather and analyze data from various sensors and devices, enabling real-time monitoring and decision-making. These applications have proven invaluable in manufacturing, where they enhance operational efficiency and predictive maintenance, reducing downtime and costs.
-== Influence and Impact ==
+== Criticism or Limitations ==
-Distributed systems have had a profound influence on computing, leading to:
+Despite their numerous advantages, distributed systems face a host of challenges and limitations that can impact their effectiveness.
-=== 1. Advancement of Cloud Technologies ===
+=== Complexity and Debugging ===
-The rise of distributed systems has paved the way for cloud technologies that have transformed how businesses operate. Organizations can now access high-quality services with reduced costs, offering scalability and flexibility previously unattainable.
+One notable challenge associated with distributed systems is the inherent complexity of designing, implementing, and managing such architectures. As the number of nodes increases, the difficulty of monitoring and troubleshooting also escalates. Issues such as network partitions, data inconsistency, and system failures can arise, often complicating debugging processes. Effective debugging tools and logging mechanisms are essential to mitigate these challenges and ensure system reliability.
-=== 2. Innovations in Data Handling ===
+=== Latency and Performance Overheads ===
-Distributed systems have driven innovations in big data processing and analytics. With the exponential growth of data generated daily, distributed data management and processing frameworks have become essential for harnessing that data for meaningful insights.
+Distributed systems can suffer from latency due to the time taken for messages to travel across networks. Additionally, performance overheads may result from the necessity of coordination among nodes, particularly in tightly-coupled systems that require frequent communication. Strategies such as data locality, caching, and reducing the granularity of interactions are often employed to minimize latency and optimize performance.
-=== 3. Improvement in Resilience and Reliability ===
+=== Security Concerns ===
-The emphasis on fault tolerance and redundancy in distributed systems has led to architectural improvements across various applications, enhancing resilience and ensuring minimal downtime across critical services.
+Security is a critical concern in distributed systems, as the increased number of nodes and communication pathways provides more potential attack vectors for malicious actors. Ensuring data integrity, confidentiality, and authentication across distributed environments poses significant challenges. Best practices, such as employing encryption, access control, and network segmentation, are vital to safeguard distributed systems against evolving security threats.
-=== 4. Decentralization Movement ===
+=== Consistency Models ===
-Distributed systems have fueled the decentralization movement in technology. From blockchain solutions to developments in peer-to-peer networks, there is a growing shift away from centralized authority structures, promoting autonomy and privacy for users.
+The trade-off between consistency, availability, and partition tolerance, known as the CAP theorem, underscores a major limitation of distributed systems. The theorem states that a distributed system cannot guarantee all three properties at once: when a network partition occurs, the system must give up either consistency or availability. Developers must therefore make informed choices about how to maintain data accuracy, especially when operating under network partitions. Consistency models such as eventual consistency and strong consistency each carry specific benefits and drawbacks suited to different application requirements.
-== See Also ==
+== See also ==
-* [[Cloud computing]]
+* [[Cloud Computing]]
-* [[Peer-to-peer network]]
+* [[Microservices]]
-* [[Distributed database]]
+* [[Peer-to-Peer Networking]]
-* [[Consensus algorithm]]
+* [[Distributed Computing]]
-* [[Microservices architecture]]
+* [[Blockchain]]
-* [[Internet of Things]]
 == References ==
-* [https://www.cloudpronews.com Distributed Systems Overview - CloudPro News]
+* [https://aws.amazon.com/ Amazon Web Services]
-* [https://www.ibm.com/cloud/learn/distributed-systems IBM Cloud - Understanding Distributed Systems]
+* [https://azure.microsoft.com/en-us/ Microsoft Azure]
-* [https://www.microsoft.com/en-us/research/publication/distributed-systems-introduction Microsoft Research - Introduction to Distributed Systems]
+* [https://cloud.google.com/ Google Cloud Platform]
-* [https://www.oreilly.com/library/view/distributed-systems/9781491942459/ Designing Data-Intensive Applications: Distributed Systems by Martin Kleppmann]
+* [https://hadoop.apache.org/ Apache Hadoop]
-* [https://cacm.acm.org/magazines/2019/6/233319-distributed-systems-in-the-cloud/fulltext Distributed Systems in the Cloud - Communications of the ACM]
+* [https://www.mongodb.com/ MongoDB]
+* [https://cassandra.apache.org/ Apache Cassandra]
+* [https://blockchain.info/ Blockchain.info]
 [[Category:Distributed computing]]
 [[Category:Computer science]]
-[[Category:Information technology]]
+[[Category:Systems architecture]]