Back to Library
DBMS Indexing & Transactions
PART 4 OF 5

Indexing & Transactions

Master indexing techniques with B-trees and hashing, plus transaction management with ACID properties, serializability, and concurrency control.

← Part 3Part 4 - CurrentPart 5 →

Contents

Indexing & Hashing

1. Index Types

Index TypeKey OrderData OrderDense/Sparse
PrimaryOrderedOrdered (same)Can be both
ClusteringOrderedOrdered (different key)Dense
SecondaryOrderedUnorderedAlways dense

Key Rules

  • Primary Index: Search key = Ordering key of file
  • Secondary Index: Search key ≠ Ordering key
  • Dense Index: Entry for every search key
  • Sparse Index: Entry for some keys (only for ordered files)

2. Dense vs Sparse Index

Dense Index

Index:  [10] [20] [30] [40] [50]
         ↓    ↓    ↓    ↓    ↓
Data:   [10] [20] [30] [40] [50]
  • • Entry for every record
  • • Faster search
  • • More space required
  • • Works on any file

Sparse Index

Index:  [10]      [30]      [50]
         ↓         ↓         ↓
Data:   [10][20] [30][40] [50][60]
  • • Entry for some records
  • • Slower search
  • • Less space required
  • Only for ordered files

3. Bitmap Index

Formula

Size (bytes) = (Number of rows × Number of distinct values) / 8

Example Calculation

Given:

  • • Attribute: Grade (values: A, B, C, D, F)
  • • Distinct values = 5
  • • Bitmap size = 200 bytes

Find: Number of rows

Rows = (Size × 8) / Distinct values
= (200 × 8) / 5
= 320 rows

When to Use Bitmap Index

  • • Low cardinality (few distinct values)
  • • Read-heavy workloads
  • • Data warehousing
  • • Gender, Status, Category columns

SQL Syntax

CREATE BITMAP INDEX index_name 
ON table_name(column_name);

-- Example
CREATE BITMAP INDEX team_name 
ON Player(team);

4. B-Trees & B⁺-Trees

Order Formula

key_size × (p - 1) + pointer_size × p ≤ block_size

Solve for p (order)

Example: Finding Order

Given:

  • • Key size = 8 bytes
  • • Pointer size = 12 bytes
  • • Block size = 512 bytes

Solution:

8(p - 1) + 12p ≤ 512
8p - 8 + 12p ≤ 512
20p ≤ 520
p = 26

B-Tree Properties (Order p)

PropertyFormulaExample (p=17)
Max keys per nodep - 116
Max children per nodep17
Min keys (non-root)⌈(p-1)/2⌉8
Min children (non-root)⌈p/2⌉9

Height Calculation

Height = ⌈logp(N)⌉

where N = number of records, p = order

Example:

Records = 6,250,000, Order = 50
Height = ⌈log₅₀(6,250,000)⌉ = 4
Max nodes to access = 4

B⁺-Tree vs B-Tree

  • B⁺-Tree: All data in leaf nodes, internal nodes for navigation only
  • B-Tree: Data in all nodes
  • B⁺-Tree: Leaf nodes are linked (good for range queries)
  • • Same formulas apply for both

5. Hashing

Hash Function Properties

  • • Maps key → bucket number
  • • Should distribute evenly
  • • Minimize collisions
  • • Deterministic (same input → same output)

Example Hash Functions

h₁(n) = n mod 7

Simple modulo operation

h₂(n) = n² + 1 mod 7

Quadratic hashing

h₃(n) = 3n + 3 mod 7

Linear hashing

Extendable Hashing

  • Global Depth (d): Bits used from hash value
  • Local Depth: Bits for specific bucket
  • Bucket Split: When bucket overflows, increase depth

Example:

Hash: 1001 0110 (binary)
Global depth: 2
Use first 2 bits: 1001 0110
→ Bucket 2

Transactions

6. ACID Properties

A - Atomicity

All or nothing. Either all operations complete or none do.

Transfer $100: Debit AND Credit both execute, or none

C - Consistency

Database must be consistent before and after transaction. May be inconsistent during execution.

Balance ≥ 0 before transaction AND after transaction

I - Isolation

Concurrent transactions appear to execute sequentially. T1 doesn't see T2's partial changes.

Two users withdrawing from same account don't interfere

D - Durability

Once committed, changes are permanent. Survives system crashes.

Power failure after COMMIT won't undo the transaction

7. Serializability

Conflict Serializability

Method: Draw precedence graph and check for cycles

Steps:

  1. 1. Nodes = Transactions
  2. 2. Add edge Ti → Tj if they have conflicting operations where Ti comes first
  3. 3. If cycle exists → NOT serializable
  4. 4. No cycle → Serializable

Conflicting Operations

Operation 1Operation 2Conflict?
r₁(A)w₂(A)✗ Yes
w₁(A)r₂(A)✗ Yes
w₁(A)w₂(A)✗ Yes
r₁(A)r₂(A)✓ No conflict

Example

Schedule S:

w₁(A), r₂(A), w₂(B), r₁(B)

Conflicts:

w₁(A) → r₂(A): T1 → T2
w₂(B) → r₁(B): T2 → T1

Graph:

T1 ⇄ T2 (cycle!)

Result: NOT serializable

8. Two-Phase Locking (2PL)

Two Phases

1. Growing Phase

Acquire locks only. Cannot release any lock.

2. Shrinking Phase

Release locks only. Cannot acquire new locks.

Lock Types

LockPurposeCompatible with S?Compatible with X?
Shared (S)For reading✓ Yes✗ No
Exclusive (X)For writing✗ No✗ No

Key Points

  • • ✅ Guarantees conflict serializability
  • • ❌ Does NOT prevent deadlock
  • Strict 2PL: Release all locks at commit/abort
  • Rigorous 2PL: Release all locks at end only

9. Deadlock

Detection: Wait-For Graph

Draw graph where edge Ti → Tj means Ti waits for Tj

T1 waits for T2
T2 waits for T3
T3 waits for T1
Graph: T1 → T2 → T3 → T1 (cycle!)
→ Deadlock detected

Prevention Methods

Wait-Die

Older transaction waits, younger aborts

Wound-Wait

Older forces younger to abort

Timeout

Abort if waiting too long

Deadlock-Free Protocols

  • ✅ Timestamp Ordering Protocol
  • ✅ Tree Protocol
  • ❌ 2PL (can have deadlock)

10. Schedule Types

TypeRead Uncommitted?Write Uncommitted?Cascading Rollback?
RecoverableYesYesYes
CascadelessCascadelessNoYesNo
StrictNoNoNo

Recoverable Schedule

If Ti reads from Tj, then Tj must commit before Ti commits.

w₁(A), r₂(A), commit₁, commit₂ ✅ Recoverable

Cascadeless Schedule

Transaction reads only committed data. No cascading rollbacks.

w₁(A), commit₁, r₂(A), commit₂ ✅ Cascadeless

Strict Schedule

Neither reads nor writes data until previous transaction commits/aborts.

w₁(A), commit₁, r₂(A), w₂(A), commit₂ ✅ Strict

Relationship

Strict ⊂ Cascadeless ⊂ Recoverable

SQL Transaction Control

COMMIT

BEGIN TRANSACTION;
UPDATE accounts 
SET balance = balance - 1000 
WHERE id = 1;

UPDATE accounts 
SET balance = balance + 1000 
WHERE id = 2;

COMMIT; -- Save all changes

ROLLBACK

BEGIN TRANSACTION;
UPDATE accounts 
SET balance = balance - 1000 
WHERE id = 1;

-- Error occurred!
ROLLBACK; -- Undo all changes

SAVEPOINT

BEGIN TRANSACTION;

UPDATE accounts 
SET balance = balance - 1000 
WHERE id = 1;

SAVEPOINT sp1;

UPDATE accounts 
SET balance = balance + 1000 
WHERE id = 2;

SAVEPOINT sp2;

UPDATE accounts 
SET balance = balance + 500 
WHERE id = 3;

ROLLBACK TO sp1; 
-- Undo changes after sp1

COMMIT; 
-- Only first update is saved

Practice Problems

Problem 1: Find order of B-tree. Key=8 bytes, Pointer=12 bytes, Block=512 bytes

Show Solution
8(p - 1) + 12p ≤ 512
8p - 8 + 12p ≤ 512
20p ≤ 520
p = 26

Order = 26

Problem 2: Is schedule serializable? S: w₁(A), r₂(A), w₂(B), r₁(B)

Show Solution
Conflicts:
w₁(A) → r₂(A): T1 → T2
w₂(B) → r₁(B): T2 → T1

Graph: T1 ⇄ T2 (cycle!)

Result: NOT serializable ❌

Problem 3: Calculate bitmap size. 500 rows, 4 distinct values

Show Solution
Size = (Rows × Distinct values) / 8
     = (500 × 4) / 8
     = 2000 / 8
     = 250 bytes

Problem 4: Is schedule recoverable? S: w₁(A), r₂(A), commit₂, commit₁

Show Solution
T2 reads from T1 (r₂(A) after w₁(A))
T2 commits BEFORE T1 commits

If T1 aborts after T2 commits, T2 read invalid data
Cannot undo T2 (already committed)

Result: NOT recoverable ❌

Quick Formulas Reference

B-Tree Order
key_size × (p-1) + pointer_size × p ≤ block_size
B-Tree Height
height = ⌈logp(n)⌉
Bitmap Size
size = (rows × distinct_values) / 8 bytes
Max Keys per Node
max_keys = p - 1
Min Keys per Node
min_keys = ⌈(p-1)/2⌉
Deadlock-Free Resource Allocation
min_resources = Σ(max_requirement) - (n - 1)

Quick Tips

✓ Sparse index ONLY works on ordered files

✓ 2PL guarantees serializability but not deadlock-free

✓ Bitmap index best for low cardinality columns

✓ Strict ⊂ Cascadeless ⊂ Recoverable

✓ B⁺-tree: All data in leaf nodes, linked for range queries

✓ Timestamp ordering is deadlock-free

Almost There!

You've mastered indexing and transactions! One more part to go: Recovery, Query Optimization, and complete reference materials.