A distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages. The components interact with each other in order to achieve a common goal.
Distributed Systems Principles
A distributed system consists of a collection of autonomous computers, connected through a network and distribution middleware, which enables computers to coordinate their activities and to share the resources of the system, so that users perceive the system as a single, integrated computing facility.
Centralised System Characteristics
· One component with non-autonomous parts
· Component shared by users all the time
· All resources accessible
· Software runs in a single process
· Single Point of control
· Single Point of failure
Distributed System Characteristics
· Multiple autonomous components
· Components are not shared by all users
· Resources may not be accessible
· Software runs in concurrent processes on different processors
· Multiple Points of control
· Multiple Points of failure
Examples of distributed systems and applications of distributed computing include the following:
· telecommunication networks:
o telephone networks and cellular networks,
o computer networks such as the Internet,
o wireless sensor networks,
o routing algorithms;
· network applications:
o the World Wide Web and peer-to-peer networks,
o massively multiplayer online games and virtual reality communities,
o distributed databases and distributed database management systems,
o network file systems,
o distributed information processing systems such as banking systems and airline reservation systems;
· real-time process control:
o aircraft control systems,
o industrial control systems;
· parallel computation:
o scientific computing, including cluster computing, grid computing and various volunteer computing projects,
o distributed rendering in computer graphics.
Certain common characteristics can be used to assess distributed systems:
o Resource sharing
o Openness
o Concurrency
o Scalability
o Fault tolerance
o Transparency
Resource Sharing
o Ability to use any hardware, software or data anywhere in the system.
o A resource manager controls access, provides a naming scheme and controls concurrency.
o A resource sharing model (e.g. client/server or object-based) describes how resources are provided, how they are used, and how provider and user interact with each other.
Openness
o Openness is concerned with extensions and improvements of distributed systems.
o Detailed interfaces of components need to be published.
o New components have to be integrated with existing components.
o Differences in the data representation of interface types on different processors (of different vendors) have to be resolved.
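As a small illustration of the data representation problem, the sketch below uses Python's standard struct module to show how the same four bytes decode to different integers depending on the byte order the receiver assumes. The function names are hypothetical, and agreeing on a fixed wire format (here network byte order) is only one common way to resolve the difference.

```python
import struct

def encode_for_wire(value):
    """Pack an unsigned 32-bit integer in network (big-endian) byte order."""
    return struct.pack("!I", value)

def decode_from_wire(data):
    """Unpack an unsigned 32-bit integer that was sent in network byte order."""
    return struct.unpack("!I", data)[0]

# A little-endian sender writes the integer 1 in its native layout ...
little_endian_bytes = struct.pack("<I", 1)
# ... and a big-endian receiver that reads the raw bytes gets a different number.
misread = struct.unpack(">I", little_endian_bytes)[0]

# If both sides agree on the wire format, the value survives the trip.
roundtrip = decode_from_wire(encode_for_wire(1))
```

Here misread is 16777216 rather than 1, while roundtrip is 1 regardless of the native byte order of either machine.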
Concurrency
o Components in distributed systems are executed in concurrent processes.
o Components access and update shared resources (e.g. variables, databases, device drivers).
o Integrity of the system may be violated if concurrent updates are not coordinated (e.g. lost updates).
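A lost update can be sketched in a few lines of Python. The Account class and both demo functions are hypothetical; the uncoordinated case uses generators to force a fixed interleaving (both clients read before either writes), so the outcome is reproducible, and the coordinated case uses a lock as one possible coordination mechanism.

```python
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit_in_steps(self, amount):
        """Uncoordinated deposit, split into a read step and a write step."""
        seen = self.balance            # step 1: read the shared value
        yield                          # another client may run here
        self.balance = seen + amount   # step 2: write back a stale result

    def deposit(self, amount):
        """Coordinated deposit: read and write happen under one lock."""
        with self._lock:
            self.balance += amount

def lost_update_demo():
    account = Account(100)
    a = account.deposit_in_steps(50)
    b = account.deposit_in_steps(30)
    next(a)          # client A reads 100
    next(b)          # client B reads 100
    next(a, None)    # client A writes 150
    next(b, None)    # client B writes 130, overwriting A's update
    return account.balance

def coordinated_demo():
    account = Account(100)
    threads = [threading.Thread(target=account.deposit, args=(amount,))
               for amount in (50, 30)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance
```

With this interleaving lost_update_demo() returns 130 instead of 180, because the deposit of 50 is overwritten; coordinated_demo() returns 180.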
Scalability
o Adaptation of distributed systems to accommodate more users and to respond faster (this is the hard one).
o Usually done by adding more and/or faster processors.
o Components should not need to be changed when the scale of a system increases.
o Design components to be scalable.
Fault Tolerance
Hardware, software and networks fail!
o Distributed systems must maintain availability even at low levels of hardware/software/network reliability.
o Fault tolerance is achieved by
Transparency
o Distributed systems should be perceived by users and application programmers as a whole rather than as a collection of cooperating components.
o Transparency has different dimensions that were identified by ANSA.
o These represent various properties that distributed systems should have.
Access transparency: Enables local and remote information objects to be accessed using identical operations.
o Example: File system operations in NFS.
o Example: Navigation in the Web.
o Example: SQL Queries
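The idea of identical operations for local and remote objects can be sketched as follows. All class names are hypothetical, and the "remote" side is simulated in-process by passing a message to a fake server, standing in for a real network call.

```python
class LocalFiles:
    """Files held in this process."""
    def __init__(self):
        self._data = {}

    def write(self, path, text):
        self._data[path] = text

    def read(self, path):
        return self._data[path]

class FakeServer:
    """Stands in for a remote machine; receives messages and applies them."""
    def __init__(self):
        self._files = LocalFiles()

    def handle(self, message):
        operation = getattr(self._files, message["op"])
        return operation(*message["args"])

class RemoteFiles:
    """Offers the same operations as LocalFiles, implemented by messages."""
    def __init__(self, server):
        self._server = server

    def write(self, path, text):
        return self._server.handle({"op": "write", "args": [path, text]})

    def read(self, path):
        return self._server.handle({"op": "read", "args": [path]})

def save_and_load(files, path, text):
    """Client code: it cannot tell which kind of object it was given."""
    files.write(path, text)
    return files.read(path)
```

The client function save_and_load works unchanged with either LocalFiles() or RemoteFiles(FakeServer()).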
Location transparency: Enables information objects to be accessed without knowledge of their location.
o Example: File system operations in NFS
o Example: Pages in the Web
o Example: Tables in distributed databases
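One way to picture access without knowledge of location is a name service that maps logical names to hosts. The sketch below is a simplified, in-memory assumption (NameService, fetch and the host dictionary are all hypothetical); the client only ever uses the logical name.

```python
class NameService:
    """Maps a logical object name to the host that currently holds it."""
    def __init__(self):
        self._locations = {}

    def register(self, name, host):
        self._locations[name] = host

    def resolve(self, name):
        return self._locations[name]

def fetch(name, names, hosts):
    """Client-side access by logical name only; the host is looked up."""
    host = names.resolve(name)
    return hosts[host][name]

# Two simulated hosts, each with its own local storage.
hosts = {"host-1": {"report": "Q3 figures"}, "host-2": {}}
names = NameService()
names.register("report", "host-1")
before_move = fetch("report", names, hosts)

# The object moves to another host; only the name service entry changes.
hosts["host-2"]["report"] = hosts["host-1"].pop("report")
names.register("report", "host-2")
after_move = fetch("report", names, hosts)
```

The call fetch("report", names, hosts) is identical before and after the move and returns the same content both times.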
Concurrency transparency: Enables several processes to operate concurrently using shared information objects without interference between them.
o Example: NFS
o Example: Automatic teller machine network
o Example: Database management system
Replication transparency: Enables multiple instances of information objects to be used to increase reliability and performance without knowledge of the replicas by users or application programs.
o Example: Distributed DBMS
o Example: Mirroring Web Pages.
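A minimal sketch of hidden replicas, assuming a hypothetical ReplicatedStore class: clients call put and get as if there were a single copy, while the store writes to every available replica and reads from the first one that has the key. Bringing a failed replica back up to date is deliberately left out.

```python
class ReplicatedStore:
    """Key-value store that keeps several copies behind one interface."""
    def __init__(self, replica_count=3):
        self._replicas = [dict() for _ in range(replica_count)]
        self._available = [True] * replica_count

    def fail(self, index):
        """Simulate the loss of one replica (for demonstration only)."""
        self._available[index] = False

    def put(self, key, value):
        # Write to every replica that is currently available.
        for replica, up in zip(self._replicas, self._available):
            if up:
                replica[key] = value

    def get(self, key):
        # Read from the first available replica that holds the key.
        for replica, up in zip(self._replicas, self._available):
            if up and key in replica:
                return replica[key]
        raise KeyError(key)

store = ReplicatedStore()
store.put("page", "<html>home</html>")
store.fail(0)                 # one copy disappears
page = store.get("page")      # the client still gets its data
```

The client never names a replica, so the number of copies can change without changing client code.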
Failure transparency: Enables the concealment of faults, allowing users and applications to complete their tasks despite the failure of other components.
o Example: Database Management System
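Concealing a transient fault can be as simple as retrying on the caller's behalf. The sketch below is one possible approach, not the only one; call_with_retry and FlakyService are hypothetical names, and the fault is simulated. Blind retries are only safe when the operation is idempotent.

```python
def call_with_retry(operation, attempts=3):
    """Run operation, hiding up to attempts - 1 connection faults.

    Assumes attempts >= 1. The last error is re-raised if every try fails.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except ConnectionError as err:
            last_error = err
    raise last_error

class FlakyService:
    """Simulated component that fails a fixed number of times, then works."""
    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.calls = 0

    def query(self):
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("simulated fault")
        return "42 rows"

service = FlakyService(failures_before_success=2)
result = call_with_retry(service.query)
```

The caller sees only the successful result, although the service was contacted three times.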
Migration transparency: Allows the movement of information objects within a system without affecting the operations of users or application programs.
o Example: NFS
o Example: Web Pages
Performance transparency: Allows the system to be reconfigured to improve performance as loads vary.
o Example: Distributed make.
Scaling transparency: Allows the system and applications to expand in scale without change to the system structure or the application algorithms.
o Example: World-Wide-Web
o Example: Distributed Database
Distributed Systems: Hardware Concepts
Networks of Computers
Multiprocessors and Multicomputers
Private versus shared memory
Bus versus switched interconnection
High degree of node heterogeneity:
_ High-performance parallel systems (multiprocessors as well as multicomputers)
_ High-end PCs and workstations (servers)
_ Simple network computers (offer users only network access)
_ Mobile computers (palmtops, laptops)
High degree of network heterogeneity:
_ Local-area gigabit networks
_ Long-haul, high-latency connections
_ Wide-area switched megabit connections
Distributed Systems: Software Concepts
_ Distributed operating system
_ Network operating system
_ Middleware
Distributed Operating System
_ OS on each computer knows about the other computers
_ OS on different computers generally the same
_ Services are generally (transparently) distributed across computers
Network Operating System
_ Each computer has its own operating system with networking facilities
_ Computers work independently (i.e., they may even have different operating systems)
_ Services are tied to individual nodes (ftp, telnet, WWW)
_ Highly file oriented (basically, processors share only files)
Distributed System (Middleware)
_ OS on each computer need not know about the other computers
_ OS on different computers need not generally be the same
_ Services are generally (transparently) distributed across computers
Need for Middleware
Motivation: Too many networked applications were difficult to integrate:
_ Departments are running different NOSs
_ Integration and interoperability only at level of primitive NOS services
_ Need for federated information systems:
o Setting up enterprise-wide Internet services, making use of existing information systems
o Allowing transactions across different databases
_ Constraint: use the existing operating systems, and treat them as the underlying environment (they provide the basic functionality anyway)
Communication services: Abandon primitive socket-based message passing in favor of:
_ Procedure calls across networks
_ Remote-object method invocation
_ Message-queuing systems
_ Advanced communication streams
_ Event notification service
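To give a feel for the message-queuing style, the sketch below uses Python's standard queue and threading modules as an in-process stand-in for a queuing middleware. The names run_worker and demo are hypothetical; the point is that the sender deposits messages without the receiver having to be running at that moment.

```python
import queue
import threading

def run_worker(inbox, results):
    """Receiver: take messages from the queue until a None sentinel arrives."""
    while True:
        message = inbox.get()
        if message is None:
            break
        results.append(message.upper())

def demo():
    inbox = queue.Queue()
    results = []
    worker = threading.Thread(target=run_worker, args=(inbox, results))

    # The sender enqueues messages before the receiver has even started.
    for message in ("order-1", "order-2"):
        inbox.put(message)

    worker.start()      # the receiver comes up later and drains the queue
    inbox.put(None)     # sentinel: no more messages
    worker.join()
    return results
```

Because the queue is first-in, first-out and there is a single receiver, demo() returns the messages in the order they were sent.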
Information system services: Services that help manage data in a distributed system:
_ Large-scale, system wide naming services
_ Location services for tracking mobile objects
_ Persistent storage facilities
_ Caching and replication
Control services: Services giving applications control over when, where, and how they access data:
_ Distributed transaction processing
_ Code migration
Security services: Services for secure processing and communication:
_ Authentication and authorization services
_ Simple encryption services
_ Auditing service
Comparison of DOS, NOS, and Middleware
Networks of computers are everywhere. The Internet is one, as are the many networks of which it is composed. Mobile phone networks, corporate networks, factory networks, campus networks, home networks, in-car networks – all of these, both separately and in combination, share the essential characteristics that make them relevant subjects for study under the heading distributed systems.
Defining distributed systems in terms of message passing has the following significant consequences:
Concurrency: In a network of computers, concurrent program execution is the norm. I can do my work on my computer while you do your work on yours, sharing resources such as web pages or files when necessary. The capacity of the system to handle shared resources can be increased by adding more resources (for example, computers) to the network. We will describe ways in which this extra capacity can be usefully deployed at many points in this book. The coordination of concurrently executing programs that share resources is also an important and recurring topic.
No global clock: When programs need to cooperate they coordinate their actions by exchanging messages. Close coordination often depends on a shared idea of the time at which the programs’ actions occur. But it turns out that there are limits to the accuracy with which the computers in a network can synchronize their clocks – there is no single global notion of the correct time. This is a direct consequence of the fact that the only communication is by sending messages through a network.
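The limit on synchronization accuracy can be made concrete with a small calculation. The sketch below assumes a simple request/reply exchange in which a client asks a time server for its clock value and guesses that the reply spent half the round-trip time in transit; the function name and the sample numbers are hypothetical.

```python
def estimate_offset(t_send, server_time, t_receive):
    """Estimate how far the local clock lags the server clock.

    t_send and t_receive are read from the local clock; server_time is the
    value the server put in its reply. Returns (offset, uncertainty), where
    the true offset lies within +/- uncertainty of the estimate because the
    reply's actual transit time is unknown (anywhere from 0 to the full
    round trip).
    """
    round_trip = t_receive - t_send
    estimated_server_now = server_time + round_trip / 2
    return estimated_server_now - t_receive, round_trip / 2

# Hypothetical exchange: the local clock is exactly 5.0 s behind the server.
# The request takes 30 ms to arrive, the reply only 10 ms.
offset, uncertainty = estimate_offset(t_send=100.000,
                                      server_time=105.030,
                                      t_receive=100.040)
```

In this exchange the estimate is about 5.010 s, which is 10 ms off the true 5.0 s; the error stays within the 20 ms uncertainty but cannot be removed, since the client has no way to learn the individual message delays.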
Independent failures: All computer systems can fail, and it is the responsibility of system designers to plan for the consequences of possible failures. Distributed systems can fail in new ways. Faults in the network result in the isolation of the computers that are connected to it, but that doesn’t mean that they stop running. In fact, the programs on them may not be able to detect whether the network has failed or has become unusually slow. Similarly, the failure of a computer, or the unexpected termination of a program somewhere in the system (a crash), is not immediately made known to the other components with which it communicates. Each component of the system can fail independently, leaving the others still running.
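The difficulty of telling a failed component from a slow one shows up in any timeout-based monitor. The sketch below is a hypothetical heartbeat monitor with explicit timestamps (so it runs without real clocks or sockets); it can only ever suspect a node, never know that it has crashed.

```python
class HeartbeatMonitor:
    """Tracks when each node was last heard from."""
    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}

    def heartbeat(self, node, now):
        """Record that a message from node arrived at time now."""
        self.last_seen[node] = now

    def suspected(self, now):
        """Nodes that have been silent for longer than the timeout."""
        return sorted(node for node, seen in self.last_seen.items()
                      if now - seen > self.timeout)

monitor = HeartbeatMonitor(timeout=2.0)
monitor.heartbeat("a", now=0.0)
monitor.heartbeat("b", now=0.0)
monitor.heartbeat("a", now=2.5)

suspects_at_3 = monitor.suspected(now=3.0)   # "b" has been silent for 3 s

# A delayed heartbeat from "b" finally arrives: it was slow, not dead.
monitor.heartbeat("b", now=3.5)
suspects_at_4 = monitor.suspected(now=4.0)
```

At time 3.0 the monitor suspects node "b"; at time 4.0 the list is empty again, which shows that the earlier suspicion was a false alarm caused by delay rather than a crash.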