Chapter: Distributed Systems : Introduction

Distributed systems

A distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages. The components interact with each other in order to achieve a common goal.


Distributed systems Principles

A distributed system consists of a collection of autonomous computers, connected through a network and distribution middleware, which enables computers to coordinate their activities and to share the resources of the system, so that users perceive the system as a single, integrated computing facility.

Centralised System Characteristics

· One component with non-autonomous parts

· Component shared by users all the time

· All resources accessible

· Software runs in a single process

· Single Point of control

· Single Point of failure

Distributed System Characteristics

· Multiple autonomous components

· Components are not shared by all users

· Resources may not be accessible

· Software runs in concurrent processes on different processors

· Multiple Points of control

· Multiple Points of failure

Examples of distributed systems and applications of distributed computing include the following:

· telecommunication networks:

o telephone networks and cellular networks,

o computer networks such as the Internet,

o wireless sensor networks,

o routing algorithms;

· network applications:

o World wide web and peer-to-peer networks,

o massively multiplayer online games and virtual reality communities,

o distributed databases and distributed database management systems,

o network file systems,

o distributed information processing systems such as banking systems and airline reservation systems;

· real-time process control:

o aircraft control systems,

o industrial control systems;

· parallel computation:

o scientific computing, including cluster computing, grid computing and various volunteer computing projects,

o distributed rendering in computer graphics.

Common Characteristics

Certain common characteristics can be used to assess distributed systems:

o Resource Sharing

o Openness

o Concurrency

o Scalability

o Fault Tolerance

o Transparency

Resource Sharing

o Ability to use any hardware, software or data anywhere in the system.

o Resource manager controls access, provides naming scheme and controls concurrency.

o Resource sharing model (e.g. client/server or object-based) describes how resources are provided, how they are used, and how provider and user interact with each other.

Openness

o Openness is concerned with extensions and improvements of distributed systems.

o Detailed interfaces of components need to be published.

o New components have to be integrated with existing components.

o Differences in data representation of interface types on different processors (of different vendors) have to be resolved.

Concurrency

Components in distributed systems are executed in concurrent processes.

Components access and update shared resources (e.g. variables, databases, device drivers).

Integrity of the system may be violated if concurrent updates are not coordinated.

o Lost updates: one process overwrites a change made by another, so the earlier change disappears.
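The lost-update problem can be made concrete with a minimal Python sketch. The `Counter` class and both demo functions are hypothetical names introduced only for illustration. The first demo replays by hand one uncoordinated interleaving of two read-modify-write updates; the second coordinates concurrent updates with a lock, which is one possible coordination mechanism among several.

```python
import threading


class Counter:
    """A shared resource updated by read-modify-write (illustrative only)."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()

    def read(self):
        return self.value

    def write(self, value):
        self.value = value

    def add_coordinated(self, amount):
        # The lock turns read-modify-write into one indivisible step.
        with self._lock:
            self.value = self.value + amount


def lost_update_demo():
    """Replay an uncoordinated interleaving of two updates by hand."""
    shared = Counter(100)
    a = shared.read()      # process A reads 100
    b = shared.read()      # process B reads 100 before A has written
    shared.write(a + 10)   # A writes 110
    shared.write(b + 20)   # B writes 120 and overwrites A's update
    return shared.value    # 120, although 130 was intended


def coordinated_demo(workers=8, increments=1000):
    """Run concurrent updates that are coordinated through the lock."""
    shared = Counter(0)

    def work():
        for _ in range(increments):
            shared.add_coordinated(1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.value


if __name__ == "__main__":
    print(lost_update_demo())
    print(coordinated_demo())
```

In the first demo the update of process A is lost because B based its write on a stale read. In the second, every increment survives because no two updates can overlap.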

Scalability

o Adaptation of distributed systems to

o accommodate more users

o respond faster (this is the hard one)

o Usually done by adding more and/or faster processors.

o Components should not need to be changed when scale of a system increases.

o Design components to be scalable

Fault Tolerance

Hardware, software and networks fail!

o Distributed systems must maintain availability even at low levels of hardware/software/network reliability.

o Fault tolerance is achieved by

o recovery

o redundancy
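As a rough illustration of how recovery and redundancy combine, the following Python sketch retries a failed request (recovery) and then falls back to another replica (redundancy). The names `ReplicaDown`, `make_replica` and `call_with_fault_tolerance` are assumptions made for this example, and the replicas are simulated in-process rather than reached over a real network.

```python
class ReplicaDown(Exception):
    """Raised when a (simulated) replica cannot serve a request."""


def make_replica(name, failures_before_success):
    """Return a simulated replica that fails a fixed number of times."""
    state = {"remaining": failures_before_success}

    def handle(request):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise ReplicaDown(name)
        return f"{name} served {request}"

    return handle


def call_with_fault_tolerance(replicas, request, retries_per_replica=2):
    """Try each replica in turn, retrying a few times before moving on."""
    for replica in replicas:                  # redundancy: several providers
        for _ in range(retries_per_replica):  # recovery: retry after a fault
            try:
                return replica(request)
            except ReplicaDown:
                continue
    raise ReplicaDown("all replicas failed")


if __name__ == "__main__":
    replicas = [make_replica("r1", 5), make_replica("r2", 1)]
    print(call_with_fault_tolerance(replicas, "read x"))
```

Here the first replica stays down for both attempts, the second fails once and then answers, so the caller still gets a result. Availability is lost only when every replica exhausts its retries.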

Transparency

o Distributed systems should be perceived by users and application programmers as a whole rather than as a collection of cooperating components.

o Transparency has different dimensions that were identified by ANSA (Advanced Networked Systems Architecture).

o These represent various properties that distributed systems should have.

Access Transparency

Enables local and remote information objects to be accessed using identical operations.

o Example: File system operations in NFS.

o Example: Navigation in the Web.

o Example: SQL Queries
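A minimal Python sketch of access transparency follows. All class and function names (`FileStore`, `LocalStore`, `RemoteStore`, `FakeServer`, `copy_file`) are hypothetical, and the "remote" side is simulated by an in-process object that only accepts messages; a real system such as NFS would send those messages over a network.

```python
from abc import ABC, abstractmethod


class FileStore(ABC):
    """The single interface that callers program against."""

    @abstractmethod
    def read(self, path): ...

    @abstractmethod
    def write(self, path, data): ...


class LocalStore(FileStore):
    """Keeps files in local memory."""

    def __init__(self):
        self._files = {}

    def read(self, path):
        return self._files[path]

    def write(self, path, data):
        self._files[path] = data


class FakeServer:
    """Stand-in for a remote machine; it only understands messages."""

    def __init__(self):
        self._files = {}

    def handle(self, message):
        op, path, data = message
        if op == "write":
            self._files[path] = data
            return None
        if op == "read":
            return self._files[path]
        raise ValueError(f"unknown operation: {op}")


class RemoteStore(FileStore):
    """Turns each operation into a message for the server."""

    def __init__(self, server):
        self._server = server

    def read(self, path):
        return self._server.handle(("read", path, None))

    def write(self, path, data):
        self._server.handle(("write", path, data))


def copy_file(store, src, dst):
    """Application code: identical for local and remote stores."""
    store.write(dst, store.read(src))


if __name__ == "__main__":
    for store in (LocalStore(), RemoteStore(FakeServer())):
        store.write("/a.txt", "hello")
        copy_file(store, "/a.txt", "/b.txt")
        print(type(store).__name__, store.read("/b.txt"))
```

The point of the design is that `copy_file` never checks which kind of store it was given: the same operations work on both.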

Location Transparency

Enables information objects to be accessed without knowledge of their location.

o Example: File system operations in NFS

o Example: Pages in the Web

o Example: Tables in distributed databases
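One common way to obtain location transparency is to put a naming service between clients and the machines that hold the objects. The sketch below assumes such a design; `NameService` and `Cluster` are hypothetical names, and hosts are simulated as dictionaries in one process.

```python
class NameService:
    """Maps object names to the host that currently holds them."""

    def __init__(self):
        self._locations = {}

    def register(self, name, host):
        self._locations[name] = host

    def lookup(self, name):
        return self._locations[name]


class Cluster:
    """Simulated set of hosts; clients address objects by name only."""

    def __init__(self, names):
        self.names = names
        self.hosts = {}  # host -> {object name: value}

    def store(self, host, name, value):
        self.hosts.setdefault(host, {})[name] = value
        self.names.register(name, host)

    def fetch(self, name):
        # The caller never mentions a host; the name service resolves it.
        host = self.names.lookup(name)
        return self.hosts[host][name]

    def move(self, name, new_host):
        # Relocate the object and update its registered location.
        old_host = self.names.lookup(name)
        value = self.hosts[old_host].pop(name)
        self.store(new_host, name, value)


if __name__ == "__main__":
    cluster = Cluster(NameService())
    cluster.store("hostA", "report", "quarterly figures")
    print(cluster.fetch("report"))
    cluster.move("report", "hostB")
    print(cluster.fetch("report"))  # same call, object now lives elsewhere
```

Because clients use only the name, the object can be moved to another host without any change to the client's `fetch` call.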

Concurrency Transparency

Enables several processes to operate concurrently using shared information objects without interference between them.

o Example: NFS

o Example: Automatic teller machine network

o Example: Database management system

Replication Transparency

Enables multiple instances of information objects to be used to increase reliability and performance without knowledge of the replicas by users or application programs.

o Example: Distributed DBMS

o Example: Mirroring Web Pages.

Failure Transparency

o Enables the concealment of faults

o Allows users and applications to complete their tasks despite the failure of other components.

o Example: Database Management System

Migration Transparency

Allows the movement of information objects within a system without affecting the operations of users or application programs.

o Example: NFS

o Example: Web Pages

Performance Transparency

Allows the system to be reconfigured to improve performance as loads vary.

o Example: Distributed make.

Scaling Transparency

Allows the system and applications to expand in scale without change to the system structure or the application algorithms.

o Example: World-Wide-Web

o Example: Distributed Database

Distributed Systems: Hardware Concepts

Multiprocessors

Multicomputers

Networks of Computers

Multiprocessors and Multicomputers

Distinguishing features:

Private versus shared memory

Bus versus switched interconnection

High degree of node heterogeneity:

High-performance parallel systems (multiprocessors as well as multicomputers)

High-end PCs and workstations (servers)

Simple network computers (offer users only network access)

Mobile computers (palmtops, laptops)

Multimedia workstations

High degree of network heterogeneity:

Local-area gigabit networks

Wireless connections

Long-haul, high-latency connections

Wide-area switched megabit connections

Distributed Systems: Software Concepts

_ Distributed operating system

_ Network operating system

_ Middleware

Distributed Operating System

Some characteristics:

_ OS on each computer knows about the other computers

_ OS on different computers generally the same

_ Services are generally (transparently) distributed across computers

Network Operating System

Some characteristics:

_ Each computer has its own operating system with networking facilities

_ Computers work independently (i.e., they may even have different operating systems)

_ Services are tied to individual nodes (ftp, telnet, WWW)

_ Highly file oriented (basically, processors share only files)

Distributed System (Middleware)

Some characteristics:

_ OS on each computer need not know about the other computers

_ OS on different computers need not generally be the same

_ Services are generally (transparently) distributed across computers

Need for Middleware

Motivation: Too many networked applications were difficult to integrate:

_ Departments are running different NOSs

_ Integration and interoperability only at level of primitive NOS services

_ Need for federated information systems:

Combining different databases, but providing a single view to applications

Setting up enterprise-wide Internet services, making use of existing information systems

Allow transactions across different databases

Allow extensibility for future services (e.g., mobility, teleworking, collaborative applications)

_ Constraint: use the existing operating systems, and treat them as the underlying environment (they provided the basic functionality anyway)

Communication services: Abandon primitive socket-based message passing in favor of:

_ Procedure calls across networks

_ Remote-object method invocation

_ Message-queuing systems

_ Advanced communication streams

_ Event notification service

Information system services: Services that help manage data in a distributed system:

_ Large-scale, system-wide naming services

_ Advanced directory services (search engines)

_ Location services for tracking mobile objects

_ Persistent storage facilities

_ Data caching and replication

Control services: Services giving applications control over when, where, and how they access data:

_ Distributed transaction processing

_ Code migration

Security services: Services for secure processing and communication:

_ Authentication and authorization services

_ Simple encryption services

_ Auditing service

Comparison of DOS, NOS, and Middleware

Networks of computers are everywhere. The Internet is one, as are the many networks of which it is composed. Mobile phone networks, corporate networks, factory networks, campus networks, home networks, in-car networks – all of these, both separately and in combination, share the essential characteristics that make them relevant subjects for study under the heading distributed systems.

Defining a distributed system as networked components that coordinate only by passing messages has the following significant consequences:

Concurrency: In a network of computers, concurrent program execution is the norm. I can do my work on my computer while you do your work on yours, sharing resources such as web pages or files when necessary. The capacity of the system to handle shared resources can be increased by adding more resources (for example, computers) to the network. We will describe ways in which this extra capacity can be usefully deployed at many points in this book. The coordination of concurrently executing programs that share resources is also an important and recurring topic.

No global clock: When programs need to cooperate they coordinate their actions by exchanging messages. Close coordination often depends on a shared idea of the time at which the programs’ actions occur. But it turns out that there are limits to the accuracy with which the computers in a network can synchronize their clocks – there is no single global notion of the correct time. This is a direct consequence of the fact that the only communication is by sending messages through a network.
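The limit on synchronization accuracy can be illustrated with a small calculation. The sketch below uses a round-trip estimate in the style usually attributed to Cristian; the technique is not described in these notes and is brought in only as an illustration. The function names, the millisecond units and the assumption that the server's clock shows true time are all choices made for this example.

```python
def estimate_server_time(t_send, t_recv, server_stamp):
    """Estimate the server's clock at the moment the reply arrives.

    t_send and t_recv are read from the client's own clock; server_stamp
    is the server's clock reading carried in the reply message.
    """
    round_trip = t_recv - t_send
    # Assumption: the reply spent half of the round trip in transit.
    estimate = server_stamp + round_trip / 2
    # The reply delay lies between 0 and the full round trip, so the
    # estimate can be off by at most half the round trip.
    uncertainty = round_trip / 2
    return estimate, uncertainty


def simulate_exchange(client_offset_ms, delay_out_ms, delay_back_ms,
                      true_send_ms=100_000):
    """Simulate one request/reply and return (error, uncertainty) in ms."""
    t_send = true_send_ms + client_offset_ms        # client clock is skewed
    server_stamp = true_send_ms + delay_out_ms      # server shows true time
    true_recv = true_send_ms + delay_out_ms + delay_back_ms
    t_recv = true_recv + client_offset_ms
    estimate, uncertainty = estimate_server_time(t_send, t_recv, server_stamp)
    error = estimate - true_recv                    # unknown to the client
    return error, uncertainty


if __name__ == "__main__":
    print(simulate_exchange(500, 20, 20))  # symmetric delays: no error
    print(simulate_exchange(500, 10, 30))  # asymmetric delays: 10 ms error
```

The client can measure the round trip but not how it splits between request and reply, so it cannot remove the remaining error; it can only bound it by half the round-trip time.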

Independent failures: All computer systems can fail, and it is the responsibility of system designers to plan for the consequences of possible failures. Distributed systems can fail in new ways. Faults in the network result in the isolation of the computers that are connected to it, but that doesn’t mean that they stop running. In fact, the programs on them may not be able to detect whether the network has failed or has become unusually slow. Similarly, the failure of a computer, or the unexpected termination of a program somewhere in the system (a crash), is not immediately made known to the other components with which it communicates. Each component of the system can fail independently, leaving the others still running.
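The difficulty of telling a failed component from a slow one shows up in even the simplest failure detector. The following Python sketch assumes a heartbeat scheme with a fixed timeout; `HeartbeatMonitor` is a hypothetical name, and the current time is passed in explicitly so that the example is deterministic.

```python
class HeartbeatMonitor:
    """Suspects a node when no heartbeat has arrived within the timeout."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # node -> time of the most recent heartbeat

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def suspected(self, now):
        # A missing heartbeat only justifies suspicion: the node may have
        # crashed, or its message may simply be delayed in the network.
        return sorted(node for node, seen in self.last_seen.items()
                      if now - seen > self.timeout)


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout=5)
    monitor.heartbeat("a", now=0)
    monitor.heartbeat("b", now=0)
    monitor.heartbeat("a", now=5)
    print(monitor.suspected(now=8))   # "b" is suspected
    monitor.heartbeat("b", now=9)     # b was slow, not crashed
    print(monitor.suspected(now=10))  # nobody is suspected any more
```

The monitor reports node "b" as suspected and later has to withdraw that judgment, which is exactly the ambiguity described above: a timeout cannot distinguish a crash from a slow network.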
