Change of
business environment
Centralized
database suffer problems such as:
·
computer workstations
·
network hardware and
software (e.g. NIC)
·
communication media
(e.g. network cable)
·
transaction processor
(DTM -- distributed transaction manager)
·
data processor (LTM --
local transaction manager or local data manager)
·
Distribution
transparency -- treats it as a single logical database
·
Transaction transparency
-- allows a transaction to update data correctly as at a central site
·
Failure transparency --
system will continue to operate in the event of a node failure as other nodes
can substitute the functions
·
Performance transparency
-- perform as if it were a centralized DBMS
·
Heterogeneity
transparency -- support different database models
Allows users to manage a physically dispersed database as if it were
a centralized database
·
Fragmentation
transparency -- user need not know that the database is partitioned
·
Location transparency --
user must specify which fragment but not where these fragments are
·
Local mapping
transparency -- must specify both but not the mapping
Levels of distribution transparency
1.
Global Schema -- a global
view of database schema similar to centralized database
2.
Fragmentation Schema --
a schema to partition the database into logical fragments
3.
Allocation Schema -- a
schema to determine the allocation of fragments to each site, with or without
replication
4.
Local Mapping Schema
5.
DBMS of site n
6.
Local database at site n
To partition the database without regard of physical location of
data:
1.
Completeness -- all data
must be mapped into the fragments
2.
Reconstruction condition
-- always possible to reconstruct global relation from its fragments
3.
Disjointness condition
-- nonoverlapping fragments (except for the primary key [why?]) so that
replication can be controlled explicitly at the allocation level
1.
partitioning the tuples
of a global relation into subsets
1.
subdivision of a global attributes
into groups (e.g. data in different applications)
2.
fragments obtained by
projecting the global relation over each group which should consist of the
primary key [why?]
1.
obtained by applying the
fragmentation operations recursively provided the correctness conditions are
satisfied each time
2.
reconstruction can be
obtained by applying the construction rules in inverse order
1.
Atomicity:
a transaction must be all-or-nothing
2.
Consistency:
a transaction takes the system from one consistenct state to another consistent
state
3.
Isolation:
each transaction must be performed without interference from other
transactions, or the intermediate effects of a transaction must not be visible
to other transactions
4.
Durability:
after a transaction has completed successfully all its effects are saved in
permanent storage
Reference: http://personal.cityu.edu.hk/~ismikeho/dm2/dmchap9.htm
a)
Two
schedules (of operations) are equivalent (serializable) if:
i)
Each
read operation read data item values which are produced by the same write
operations in both schedules
ii)
The
final write operation on each data item is the same in both schedules
b) Two operations are in
conflict if they operate on the same data item, one of them is a write
operation, and they are issued by different transactions
Mechanisms
to perform concurrency control:
1.
2
Phase Locking –
a)
A
transaction applies shared-lock if it reads a resource and is successful only
if the resource is not exclusive-locked
b) A transaction applied
exclusive-lock if it writes/updates a resource and is successful only if the
resource is not locked at all
c)
A trans. is well-formed
if it always locks a data item in shared mode before reading it and always lock
a data item in exclusive mode before writing it
d) A transaction has to wait
and retry if it cannot acquire the necessary locks.
e)
Any
locks held until the transaction commits or aborts is called strict two-phase
locking
f)
A
transaction once starts to unlock does not acquire new locks is a 2-phase-locked transaction
2.
Time
Stamping – a method to ensure the time
of the transaction operations follows the above serializability rule by keeping
the time of the operation with the data.
Inconsistent operations are aborted.
Deadlock
is possible with 2-phase-locking.
1.
2-phase-locking
is still good in distributed systems
2.
Besides
local deadlock, distributed deadlock is possible http://personal.cityu.edu.hk/~ismikeho/dist/clntsvr.htm
2-Phase-Commitment
Protocol can be used:
from: Distributed Databases by S. Ceri, and G. Pelagatti, McGraw Hill 1985