2023-05-13

DS-week2

Data synchronization

Data synchronization refers to the process of ensuring that data in two or more locations or systems is consistent and up-to-date. This is often necessary when data is stored in multiple locations, such as in a distributed database or in different applications that need to share data. Data synchronization can be achieved through various techniques such as replication, mirroring, or using a shared database.

Process synchronization

Process synchronization, on the other hand, refers to the coordination of multiple processes or threads to ensure that they access shared resources in a mutually exclusive and synchronized manner. This is important to prevent conflicts and inconsistencies that can arise when multiple processes or threads try to access and modify the same resource simultaneously.

synchronization is needed when:

Agree on the ordering of events
Access a shared resource

synchronization challenges:

Coordination of actions that depend on communication over a network(networks are not always reliable)

A message will not arrive at its destination before it is sent

If a and b in same process, a occurs before b, then a->b
If a is sent by one process and b received by another process, then a->b
If a->b b->c then a->c (a->b means a happens before b)

If two events happen in different processes that do not exchange messages, then x->y and y->x are both not true. These events are said to be concurrent, which simply means that nothing can be said (or need be said) about when the events happened or which event happened first

Centralised Lock Server and Mutual Exclusion Locks(Mutex)

Two Phase Commit Algorithm

Electing a Coordinator Node

Bully Algorithm
- P sends an ELECTION message to all nodes above
- If no one respond P is the coordinator
- If one of the higher-numbered nodes Q answers, P concedes that it is not the winner, Q begins the election process again until one node eventually wins

Cristian’s Algorithm

The Berkeley Algorithm

Deadlock

Deadlock occurs when four particular conditions hold simultaneously in a system
1. Mutual exclusion: only one process can use that resource at a given time
2. Hold and wait: wait for a process to release something
3. No pre-emption: once a process has acquired a resource, nothing can force it to relinquish that resource; it has to do so voluntarily
4. Circular wait

Dealing with a Deadlock

Prevent
Avoid
Detect

2023-05-13

DS-week1

Why Distributed Systems

To share resources
To bind customers and suupliers
To allow us to do things we could not otherwise do, due to performance, scalability, reliability and availability issues

What is a Distributed System

A distributed system is a collection of autonomous computing elements that appears to its users as a single coherent system

Charateristics of Distributed Systems

Distributed systems can also vary in
- size
- the way in which nodes are interconnected
- the way in which node membership is handled
Although nodes can act independently from one another, they cannot ignore one another
Nodes are programmed to achieve common goals, and they do this by exchanging messages with each other
To appear to users as a single coherent system, distribution transparency is an important goal of distributed systems

Fallacy(谬论) about Distributed Systems

The network is reliable
The network is secure
The network is homogeneous(linux and windows)
The topology does not change
Latency is zero
Bandwidth is infinite
Transport cost is zero
There is one administrator

2023-05-10

ML-week2

KNN

Core idea: it finds the K nearest neighbours to a test data point and assigns the most common label as the label of the test point

pseudo code for classification

function KNN(X_train, y_train, X_test, k):
  distances = compute_distances(X_train, X_test) # calculate distances between training and test data
  nearest_neighbors = get_k_nearest_neighbors(distances, k) # get indices of k nearest neighbors
  labels = y_train[nearest_neighbors] # get labels of nearest neighbors
  return most_common_label(labels) # return most common label among neighbors

Usage: KNN is computationally expensive for large datasets as it calculates the distances between all pairs of data points. It can be useful when the number of classes is small, the data is not very high dimensional and there are enough training samples.

pseudo code of regression

function KNN(X_train, y_train, X_test, k):
  distances = compute_distances(X_train, X_test) # calculate distances between training and test data
  nearest_neighbors = get_k_nearest_neighbors(distances, k) # get indices of k nearest neighbors
  y_pred = average(y_train[nearest_neighbors]) # compute average output value of nearest neighbors
  return y_pred

k-NN behaviour vs. neighbour number and training sample size.

neighbour number k: if k is too small, sensitive to noise, overfitting. Too large, missing important patterns, underfitting
Training sample size: small training set, difficult to generalize new data, result in high variance and overfitting. Large training set, capture the underlying patterns, result in lower variance and better generalisation, but computational cost increases

2023-05-10

ML-week1

Key definitions and concepts in machine learning

Data + Model strategy

Data: experience
Model: a math representation of the relationshio between the input data and the output or prediction

Optimisation

finds the input that can give the minimum(or maximum) output value of a real-valued function

Classification and regression

Classification: supervised learning, goal is to predict a categorical label or class or a given input. The model is trained on labeled data, where each input accociated with a known output class
Regression: supervised learning, goal is to predict a continuous numerical output based on one or more input variablesThe model is trained on labeled data, where each input accociated with a known output value

2023-05-09

DS-week11

Drivers of Modern Distributed Systems

Pervasive(spread throughout蔓延的) networking technology
Ubiquitous(existing at the same time 普及的) computing
multimedia services
utility computing: providing computing resources as a service, without users to invest in

Pervasive Networking

ability to access and connect to networks anytime and anywhere, using a variety of devices and technologies

Ubiquitous Computing

embedding computing devices and technologies into everyday objects and environments
making them unobtrusive(不引人注目的), invisible and seamlessly integrated into our daily lives
supports user mobility, allows user to remain connected to an environment as the user moves about
possible because of the high portability of many of these devices

Multimedia Services多媒体

a range of media types in an integrated manner

Utility Computing效用计算

在这个模型里服务提供商提供客户需要的计算资源和基础设施管理，并根据应用所占用的资源情况进行计费，而不是仅仅按照速率进行收费

Utility Computing —> Clout Computing

cloud is defined as a set of internet-based application, storage and computing services to support users’ needs

Cloud Computing

promotes everything as a service, reducing requirements on users’ devices
Internet of Things(IoT) and Big Data are closely related to cloud computing
networked society
implemented on cluster computer
overall goal is to provide a range of cloud services, including high performance computing capability

Grid Computing

grid focus on high-end data-heavy or computationally expensive applications
cloud is more general

grid is and early example of cloud computing, but cloud computing has developed significantly since then

Key Concept: Virtualisation

2023-05-09

DS-week10

Fault Tolerance

Strongly associated with dependable systems
- availablity: the probability that the system operates correctly at any given moment 能正确运行的概率
- reliability: length of time that it can run continuously without failure 能正确运行的连续市场
- safety: if and when failures occur, the consequences are not catastrophic for the system 如果发生错误，对于系统不是灾难性的
- maintainability: how easily a failed system can be repaired 维修难度
building dependable, fault tolerant systems relates to controlling faults

Types of Failures

crash failure: halts, but working correctly until it halts
omission failure(遗漏): fails to respond to incoming requests
- receive omission: fails to receive
- Send omission: fails to send
Timing failure: Response lies outside a specified time interval
Response failure: response is incorrect
- value failure: value is wrong
- State_transition failure: deviates from the correct flow of control
Arbitrary failure: may produce arbitrary responses at arbitrary times
When arbitrary failures occur, clients should be prepared for the worst

Arbitrary failures are also known as Byzantine failures

Redundancy in Masking Failures

Physical redundancy
Time redundancy: an action is performed if needed, again and again, helpful when faults are transient(短时间的) and intermittent(断断续续的)
Information redundancy: eg. send extra bits
should improve performance, but keeping replicas consistency can damage performance

All-or-Nothing: THe need for Atomicity

Either ALL operations execute or NONE

Isolated Execution - The need for Isolation

ensure that “concurrent” applications do not interfere with each other

Seial Executions

concurrent executions do not interfere with each other if their execution is equavalent to a serial one
- one transfer at a time
- not scalable and very very slow

ACID

atomicity: all or nothing, either completed in its entriety or not at all
consistency: all constraints, rules and relationships are maintained throughout the transaction(kept up-to-date)一致性
isolation: concurrent transactions being executed as if they were the only transactions in the system, this ensures the outcome is not affected by the concurrent execution of other transactions 合流的交易分别看作自己独立的交易确保不会被同时进行的另一个交易干扰
durability: once a transaction is commited, its changes must be permanent and survive any subsequent system failures or errors, this ensures the data remains consistent and reliable

How Transactions are Implemented

Managing multiple simutaneous users
- Concurrency control algorithms
Durability
- Recovery algorithms

Concurrency Control

Acquire locks phase
- get a read lock before read
- get a write lock before write
- Read lock conflict with write lock
- write lock conflict with read and write lock
Release locks phase: where transaction terminates

2023-05-09

DS-week9

Service-Oriented Architectural Style

encapsulated services into independent units
server is a discrete unit of functionality and can be accessed remotely
composition of many different services
模块化
使用中间件来保证软件和系统的灵活交流和整合
each service must offer and interface

2023-05-09

DS-week8

Security in Distributed Systems

Communication between users or processes
Authorization

main method for (1) is secure channel

Methods for (2) are called access control

Relationship between Security and Dependability

dependability involves availability, reliability, safety and maintainability
Confidentiality: information is disclosed only to authorized parties
Integrity: System’s assets can be made only in an authorized way

Security Threats

interception: unaauthorized party gain access to data or service
interruption: service or data becomes unavailable
modification: unauthorized changing of data
Fabrication: additional data are generated that would normally not exist

Security Mechanisms

Encryption: transform data into something an attacker cannot understand
- Symmetri: same secret value(key) used for encryption and decryption
- Asymmetric: different key
Authentication: verify the identity of a user based on secret information(eg. password)
Authorization: check whether a lcient is authorized to perform the action requested
Auditing: trace which clients accessed what and in which way

Cryptography 密码学

suppose S wnats to send message m to R

encrypts into an unintelligible message m’
send m’ to R
R decrypt the received message into m

Secure channels

protects senders and receiver againset interception modification and fabrication
protecting against interception is done by ensuring confidentiality: cannot access by unauthorised parties
protecting against modification and fabrication is done by protocols for mutual (相互的)authentication and message integrity
confidentiality is by encrypting a message before sending it
integrity can be done by digital signatures

在这张图中，Alice用私钥对m加密计作KA-(m),再用Bob的公钥加密计作KB+(m, KA-(m)), Bob使用自己的私钥KB-对外层的公钥KB+解密,再使用Alice的公钥KA+对Aloce的私钥KA-解密

Controller access

protecting it against requests generated by unauthorized subject
enforced by a program called a reference monitor
a reference monitor records which subject may do what and decides where allowed
Referene monitor should be impenetrable(坚不可摧的) by its very nature

2023-05-09

DS-week7

Data Replication

enhance system reliability
improve performance
improve system scalability
keeping data replicas consistent

Reliability: If one replica crashes it could contnue working
- better protection against corrupted data

The Price of Replication

having multiple copies may lead to consistency problems
when and how those modifications need to be carries out deterines the price of replication

Replication for Performance

reduce access time and solve scalability problems

Problems:

keeping multiple copies up to date require more network bandwidth
keeping multiple copies consistent lead to serious scalability problems

Tight consistency

Provides strong guarantees about the ordering and visibility of updates to shared data
updates to shared data are immediately visible to all nodes in the system

Relaxing tight consistency

using a consistency model that allows for some level of inconsistency or latency
global synchronizations are avoided
performance improved

Consistency Models - Assumptions

assuming a data store which multiple processes running on different machines have access
a consistency model is a “contract” between the processes and this data store

Sequential Consistency Model

updates to shared data are applied in the order in which they are generated by the processes in the system
no involvement of time
no reference to the “most recent” write operation
different processes must see the updates to the shared data in the same order
所有的process都需要看到同一个order，如果P2执行了write b那么所有的node都需要同时执行read b，但是如果两个write看起来有先后顺序，如果所有的read操作都是对于相同的变量，那么可以允许

Causal Consistency Model

If b caused by earlier event a, causality requires that everyone first sees a then see b
Operations that are not causally related are said to be concurrent

P2 有一个R(X)a的操作，这就说明W(X)b有可能是基于W(X)a之后的X执行的write操作，所以他俩是causally related，所以在所有的nodes里都需要先看见a再看见b，在P3中先看见了b所以这个不是causally related

Replica management

keeping the replicas consistent
classical optimisation problem

Agreedy Heuristic to Find Location

首先找到总cost最小的网站，然后用这个网站来更新其他网站到这个网站的cost，然后重复(在第一步之后可以去掉这个网站)

2023-05-09

DS-week6

Inter-Process Communication

the way processes on different machines can exchange information
low-level message passing, offered by the underlying network
Two models for communication:
- Remote Procedure Call (RPC)
- Message-Oriented Middleware (MOM)

Remote Procedure Call

aims at hiding most of the intricacies(复杂) of message passing
ideal for client-server applications

Example:

newlist = append(data, dblist)

Append is a RPC, client stub is offered to the calling client, the client send to server and client stub calls receive, blocking itself until reply comes back. When message arrive at server, server’s OS passes it to a server stub, unpack and call server procedure, the server perform and return the results to the server stub

Parameter Passing

parameter marshalling (transforming data from one representation or format to another)
not straightforward
solution is to transform data into a machine and network-independent format making sure that both communicating parties expect the same message data type

Asynchronous RPCS

client will continue without blocking as soon as the server receive the message

Message-Oriented Middleware

message-queuing
persistent asynchronous commuication
sender posts a message in the queue, receiver retrieves the message form the queue
apps communicate by inserting messages in specific queues