COMP S368 & COMP 4680SED — Final Exam Concept Map 期末考試概念圖

Exam: 2026-05-04 · Consolidated from Units 1, 2/3, 4, 5, 7, Spring MVC, Cloud Native/K8s/Big Data, Cloud Computing (Ppt 1-20), plus the instructor's 2026-05-01 revision lecture
考試日期:2026年5月4日 · 整合所有課程 PPT + 導師 5 月 1 日複習講座重點

1. Networking & System Models 網路與系統模型

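Key points: you must be able to draw the TCP/IP four-layer model and the encapsulation process, explain why the OSI seven-layer model is not used in practice, and tabulate at least three TCP vs UDP differences with application examples (Web = TCP, VoIP = UDP). Socket = IP + Port; Java and Python sockets interoperate.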

OSI 7-Layer vs TCP/IP 4-Layer 七層 vs 四層

OSI | TCP/IP | Examples
Application / Presentation / Session | Application | HTTP, FTP, SMTP, DNS
Transport | Host-to-Host (Transport) | TCP, UDP
Network | Internetwork | IP, ICMP, routing
Data Link / Physical | Network Access | Ethernet, Wi-Fi

Why TCP/IP 4-layer wins in practice: simpler, implementation-friendly, covers OSI's core functions with less overhead. You must be able to draw this.

Use case 用途: OSI — teaching / reference model, network troubleshooting (e.g. "issue is at Layer 7 vs Layer 3"). TCP/IP — every real-world network (the Internet, LAN, mobile data) runs on this 4-layer stack.

Encapsulation (draw this) 封裝過程(必畫)

Application data (HTTP payload)
+ TCP/UDP header → Segment (TCP) / Datagram (UDP)
+ IP header → Packet
+ Ethernet / Wi-Fi header and trailer → Frame
→ Bits on the wire

Sender wraps (encapsulates) top-down; receiver unwraps (decapsulates) bottom-up. Each layer's PDU becomes the next layer's payload.

TCP vs UDP 可靠 vs 快速

Feature | TCP | UDP
Connection | Connection-oriented (3-way handshake) | Connectionless
Reliability | Guaranteed delivery, ACKs, retransmit | Best-effort, no guarantee
Order | Ordered byte stream | May arrive out of order
Speed / overhead | Slower, higher overhead | Fast, low latency
Use for | Web, email, file transfer | VoIP, video streaming, gaming, DNS

Trade-off to memorise: Reliability ↔ Speed.

Socket Programming 套接字編程

A socket = IP address + Port. Bridges application ↔ transport layer.

TCP sequence

Server: bind → listen → accept
Client: connect
read / write streams
close

UDP sequence

sendto / recvfrom
no connection state

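For TCP, the server must already be listening before the client connects; otherwise the connect fails with a connection-refused error (java.net.ConnectException in Java, ConnectionRefusedError in Python). Java and Python sockets interoperate because both speak standard TCP/UDP.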
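The TCP sequence above (bind → listen → accept on the server, connect on the client, then read / write and close) can be sketched in Java roughly as follows. This is a minimal sketch, not the course's own code: the class name EchoDemo, the "echo: " reply format, and the use of port 0 (any free port) on the loopback address are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class EchoDemo {
    // Server: accept one client, echo one line back, then close the connection.
    static void serveOnce(ServerSocket server) throws IOException {
        try (Socket client = server.accept()) {
            writer(client).println("echo: " + reader(client).readLine());
        }
    }

    // Client: connect, send one line, read the one-line reply, close.
    static String ask(int port, String message) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port)) {
            writer(socket).println(message);
            return reader(socket).readLine();
        }
    }

    static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    static PrintWriter writer(Socket s) throws IOException {
        // autoflush = true, so println() pushes the line onto the wire immediately
        return new PrintWriter(new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    // The ServerSocket constructor binds and listens, so the server side exists
    // before the client tries to connect.
    static String roundTrip(String message) {
        try (ServerSocket server = new ServerSocket(0)) {   // port 0 = any free port (assumption)
            Thread serverThread = new Thread(() -> {
                try {
                    serveOnce(server);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.start();
            String reply = ask(server.getLocalPort(), message);
            serverThread.join();
            return reply;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));   // prints "echo: hello"
    }
}
```

A Python client could talk to the same server, since only the bytes on the TCP connection matter, not the language on either end.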

IP header fields worth knowing

  • Source / Destination address
  • TTL: avoids routing loops
  • Header checksum
  • Fragmentation: ID / DF / MF / Offset
  • Protocol (TCP=6, UDP=17)

2. Distributed Architecture & Design 分散式架構與設計

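Key points: a distributed system = multiple computers cooperating so that they look like one. Be able to list the eight characteristics (resource sharing, heterogeneity, openness, security, scalability, fault tolerance, concurrency, transparency) and the four architectures (client-server, multi-server, proxy, P2P). Cloud = NIST 5 characteristics + SPI three service models + four deployment models. Cloud Native = containerisation + microservices + dynamic orchestration + DevOps.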

What is a Distributed System? 咩係分散式系統?

A collection of independent computers that appear to users as a single coherent system. Users consume the integrated service without knowing the components or their interactions.

Motivations: fixed distributed apps (ATM), resource sharing, cost control, incremental growth, fault tolerance. Overheads: software complexity, communication delay, security.

The 8 Key Characteristics 八大特性(必背 + 會舉例)

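Instructor's requirement: do not just memorise the names; explain each one in a sentence and give one concrete example. This is a typical easy-marks short-answer question.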

Characteristic 特性 | Meaning | Concrete example
Resource Sharing 資源共享 | Multiple users / nodes share hardware, software, or data across the network instead of duplicating them locally. | Shared network printer in an office; cloud storage (Dropbox); shared database server accessed by many app servers.
Heterogeneity 異質性 | The system runs across different hardware, operating systems, network types, and programming languages. Standards and middleware hide the differences. | A Java Spring Boot service on Linux calling a Python API on Windows via REST/JSON; neither side cares about the other's stack.
Openness 開放性 | Published, standard interfaces let anyone extend the system or plug in new components. Built on open protocols. | The Internet uses open TCP/UDP/IP/HTTP standards, so any vendor's router or browser interoperates.
Security 安全性 | Data in transit and at rest is protected. Covers Confidentiality, Integrity, Availability (CIA) across an untrusted network. | Online banking uses TLS to encrypt traffic, salted-hashed passwords, and MFA to authenticate users.
Scalability 可擴展性 | System keeps performing as load, data, or number of users grows, typically by adding nodes (horizontal scale). | Kubernetes Horizontal Pod Autoscaler spins up extra pods when CPU load rises; Cassandra scales by adding nodes.
Fault Tolerance 容錯性 | System keeps working when components fail. Achieved by redundancy, replication, retries, failover. | DNS has many redundant servers; if one fails, queries route to others. Database replicas take over if primary dies.
Concurrency 並行性 | Many clients and processes operate at the same time without corrupting shared state. Requires locks, transactions, or optimistic control. | Thousands of shoppers add items to their carts simultaneously; the DB uses row-level locks or OCC so no one's cart is lost.
Transparency 透明性 | The system hides its distributed nature from users. They just see "one service". Forms: Access, Location, Migration, Replication, Concurrency, Failure, Scaling. | When you type google.com you don't know which data center, which server, or which replica handled the request. All hidden.
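Mnemonic: 「共 · 異 · 開 · 安 · 擴 · 容 · 並 · 透」, one character per characteristic: 共 (shared resources), 異 (heterogeneous hardware), 開 (open standards), 安 (secure transmission), 擴 (easy to scale), 容 (tolerates failure), 並 (concurrent processing), 透 (transparent to users).
When answering, remember: one sentence of definition + one example. The more down-to-earth the example (ATM, Google, online banking, Netflix), the easier it is to get full marks.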

Architecture Models 架構模式

Model | Pros | Cons | Use case 用途
Client–Server | Simple, well-understood | Single point of failure, scaling limits | Web apps, email (SMTP/IMAP), DNS, online banking, classic DB (MySQL): small-to-medium load with clear client-server roles
Multi-Server | Fault tolerance, scalability | Load-balancing complexity | Large e-commerce (Amazon), streaming platforms (Netflix, YouTube), search engines (Google): high traffic requiring redundancy + horizontal scale
Proxy-Server | Security, load balancing, protocol conversion | Added complexity, possible SPOF | CDN (Cloudflare), corporate firewall / web proxy, API gateway (Spring Cloud Gateway), reverse proxy (Nginx): need caching / filtering / TLS termination in front of backend
Peer-to-Peer | Robust, no central server | Resource hungry, harder coordination | BitTorrent file sharing, blockchain (Bitcoin, Ethereum), Skype voice, distributed computing (SETI@home): high decentralisation + no single owner
Exam tip: one classic example per model is enough: C/S = online banking; Multi-Server = Netflix; Proxy = Cloudflare CDN; P2P = BitTorrent. The instructor likes to see real-world examples.

Three-Tier Model 三層架構

Presentation — UI, browsers, mobile apps
Application — business logic, services
Database — storage, RDBMS / NoSQL

Benefits: modularity, code reuse, layers upgraded independently. Key to explain Java EE & Spring MVC architectures.

Cloud Computing (NIST) 雲端運算(NIST 定義)

5 essential characteristics

On-demand self-service, broad network access, resource pooling, rapid elasticity, measured service

SPI service models (shared responsibility)

Model | Provider manages | Consumer manages | Example
SaaS | Everything incl. app | End-user config | Google Workspace, M365
PaaS | Infra + platform (OS, middleware) | App + data | Google App Engine, Elastic Beanstalk
IaaS | Physical infra | OS, middleware, apps | EC2, GCE, Azure VM

Deployment models

Private, Community, Public, Hybrid

Cloud-Native Principles 雲原生原則

  • Containerisation — Docker, consistent runtime across envs
  • Microservices — small, independent, single-capability services
  • Dynamic orchestration — Kubernetes auto-scales & load-balances
  • DevOps / CI-CD — automated build, test, deploy; IaC
  • Scalability & resilience — horizontal scaling, self-healing, multi-zone, circuit breakers

Benefits: agility, cost efficiency (pay-per-use), resilience. Risks: complexity, network latency, distributed-tracing overhead.

Kubernetes Essentials K8s 核心概念

Component | Role
Pod | Smallest deployable unit; containers share network/storage
Service | Stable endpoint, load-balances pods (ClusterIP / NodePort / LoadBalancer)
Deployment | Manages pod lifecycle, rolling updates, replica count
Ingress | HTTP(S) routing, TLS termination, virtual hosting

Scaling: HPA (replicas by CPU/metric), VPA (resource requests), Cluster Autoscaler (nodes).
Self-healing: liveness/readiness probes, automatic pod replacement, node failure rescheduling.

Big Data & MapReduce 大數據與 MapReduce

3V: Volume, Variety, Velocity. Extra: Variability, Veracity, Visualisation, Value.

Split data
Map (parallel)
Shuffle
Reduce
Output

Example: Map squares inputs (0,1,2,3,4)→(0,1,4,9,16), Reduce sums → 30. Same mapper/reducer scales from 10 to 100+ CPUs unchanged.
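The squares example can be reproduced inside a single JVM with a parallel stream. This is only an analogy for the map and reduce steps, not a Hadoop job; the class and method names are made up for illustration.

```java
import java.util.stream.IntStream;

class SquareSum {
    static int squareSum(int fromInclusive, int toExclusive) {
        return IntStream.range(fromInclusive, toExclusive)
                .parallel()          // split: the runtime partitions the input across worker threads
                .map(x -> x * x)     // map: square each input independently
                .sum();              // reduce: combine the partial sums
    }

    public static void main(String[] args) {
        System.out.println(squareSum(0, 5));   // 0 + 1 + 4 + 9 + 16 = 30
    }
}
```

The mapper (x -> x * x) and the reducer (sum) never mention how many workers exist, which is the point the notes make about scaling from 10 to 100+ CPUs unchanged.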

3. Concurrency Control 並行控制

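Key points (the instructor will examine these): use SharedDataTest to explain how a race condition happens (steps A/B/C interleaving); use BankAccount to explain Lock + Condition (must use while, not if); distinguish Deadlock / Starvation / Spurious wake-up; and say when to use optimistic vs pessimistic locking.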

Process vs Thread 進程 vs 線程(深入版)

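Core difference: a process is a "separate house": each house has its own address, furniture, and keys. A thread is a "person inside the same house": everyone shares the same floor plan and furniture, but each is doing something different.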

Detailed comparison 詳細對比

Feature | Process 進程 | Thread 線程
Definition | A running instance of a program with its own OS resources | A lightweight execution unit inside a process
Memory space | Independent (isolated virtual address space) | Shared (heap, static, code); each thread has only its own stack + PC register
Creation cost | Heavy: OS allocates memory, PCB, page table (~ms) | Light: only stack + TCB (~µs)
Context switch | Expensive: flush TLB, switch page tables | Cheap: just swap registers + stack pointer
Communication | IPC: pipes, sockets, shared memory, signals, message queues | Direct: read/write shared variables (but need synchronisation!)
Synchronisation cost | Low: processes rarely need to sync directly | High: race conditions, need locks / volatile / atomics
Owns OS resources? | Yes: file handles, sockets, working dir | No: uses the resources owned by its process
Fault isolation | Strong: one crash doesn't kill siblings | Weak: one bad thread (segfault, OOM) kills the whole process
Scheduling | OS kernel scheduler | OS kernel (Java platform threads map 1:1 to OS threads) or user-level
Use case | Browser tabs (Chrome per-site process), microservices, DB server spawning a process per user, high-isolation sandboxes | Web server handling many concurrent requests, Java servlet container, UI doing background work, parallel computation within one app

What each thread actually has vs shares 線程之間共享 / 私有

Per-thread (private):
  • Stack (local variables, method call frames)
  • Program counter (PC): next instruction
  • Registers
  • Thread-local storage (ThreadLocal)

Shared across threads in same process:
  • Heap (objects created with new)
  • Static / class variables
  • Code segment
  • File handles, sockets, environment variables

Why threads > processes for concurrent servers 點解 server 用 thread

  • Handling 10,000 simultaneous HTTP requests as 10,000 processes would crush RAM and CPU.
  • Threads share the cache / DB connection pool / object graph, so setup cost is paid once per process.
  • Cost: all requests run in one address space — one bad request can crash the whole JVM. That's why many servers combine: a small pool of worker processes, each hosting many threads.
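Answer template: start with the "process = house / thread = person in the house" analogy → list the three most important differences (memory, cost, isolation) → give two everyday examples, Chrome (process) and Tomcat (thread) → conclude: "choose processes when isolation is needed; choose threads when throughput and shared state are needed; modern systems usually use both".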

Java Thread States Java 線程六大狀態(深入版)

Defined in java.lang.Thread.State. Every live thread is in exactly one state. Understanding transitions is a classic short-answer question.

State table 狀態表

State | Meaning | How you enter | How you leave
NEW 新建 | Thread object is created but start() hasn't been called yet. It's just a Java object, not yet a real OS thread. | new Thread(...) | call start() → RUNNABLE
RUNNABLE 可運行 | Ready to run. May be actually running on a CPU or waiting for the scheduler to pick it. (Java doesn't distinguish "ready" vs "running"; both are RUNNABLE.) | start() called; or wake from BLOCKED / WAITING / TIMED_WAITING | wait for lock → BLOCKED; call wait() / join() → WAITING; call sleep(n) / timed wait(n) → TIMED_WAITING; finish run() → TERMINATED
BLOCKED 阻塞 | Waiting to acquire an intrinsic lock (synchronized monitor). Another thread currently holds it. | trying to enter a synchronized block when lock is taken | lock becomes available → RUNNABLE
WAITING 等待 | Waiting indefinitely for another thread to signal it. Doesn't hold the CPU. | Object.wait(), Thread.join(), LockSupport.park() | another thread calls notify() / notifyAll(), or target thread finishes (join), or unpark() → RUNNABLE (may still need the lock → BLOCKED briefly)
TIMED_WAITING 超時等待 | Like WAITING, but with a timeout. Automatically wakes up after the time expires. | sleep(ms), wait(ms), join(ms), parkNanos(), tryLock(ms, unit) | time expires, or notified early → RUNNABLE
TERMINATED 終止 | Thread's run() method has returned (normally or via exception). Dead. Cannot restart. | finish run() / uncaught exception | end of life; GC reclaims it later

Full state diagram 完整狀態轉換圖

             new Thread(r)
                  │
                  ▼
              ┌───────┐
              │  NEW  │
              └───┬───┘
                  │ start()
                  ▼
        ┌──────────────────┐
    ┌──►│    RUNNABLE      │◄─── lock acquired     notify/notifyAll
    │   │  (ready/running) │◄────┐                  ┌──────┐
    │   └──┬────┬──────┬───┘     │                  │      │
    │      │    │      │         │                  │      │
    │      │    │      │ sleep(n), wait(n), join(n)│      │
    │      │    │      └───────────────►┌──────────────────┐
    │      │    │                       │ TIMED_WAITING     │
    │      │    │                       └──────┬────────────┘
    │      │    │                              │ timeout / notify
    │      │    │                              └──────────────┐
    │      │    │  wait(), join(), park()                      │
    │      │    └──────────────────►┌──────────┐              │
    │      │                        │ WAITING  │              │
    │      │                        └────┬─────┘              │
    │      │                             │ notify/notifyAll   │
    │      │                             └────────────────────┘
    │      │ wants lock held by someone else
    │      └────────────────►┌──────────┐
    │                        │ BLOCKED  │
    │                        └────┬─────┘
    │                             │ lock released
    └─────────────────────────────┘
                                       run() returns / exception
                                            │
                                            ▼
                                      ┌────────────┐
                                      │ TERMINATED │
                                      └────────────┘

Key distinctions to know cold 必分清楚

Confusion pair | Difference
BLOCKED vs WAITING | BLOCKED = queued for an intrinsic lock; WAITING = explicitly chose to wait (wait() / join()) for a signal or another thread.
WAITING vs TIMED_WAITING | TIMED_WAITING has a timeout → will auto-wake even without a notify.
RUNNABLE vs "running" | Java lumps them together. The OS decides which RUNNABLE thread actually runs on a core right now.
sleep() vs wait() | sleep() keeps the lock (thread still owns it); wait() releases the lock and waits for a notify.

Code example 代碼例子

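// Assumed context: 'lock' is a shared monitor object and 'ready' is a field that
// another thread sets to true (while holding 'lock') before calling lock.notifyAll().
// Without that other thread, t never leaves WAITING and join() never returns.
Thread t = new Thread(() -> {              // state: NEW
  try {
    Thread.sleep(1000);                     // state: TIMED_WAITING
    synchronized (lock) {                    // state: BLOCKED if lock busy
      while (!ready) lock.wait();            // state: WAITING (lock released)
    }
    System.out.println("done");              // state: RUNNABLE again
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();      // restore the interrupt flag instead of swallowing it
  }
});                                          // state: still NEW

t.start();                                   // NEW → RUNNABLE
t.join();                                    // caller enters WAITING until t finishes (throws InterruptedException)
// when run() returns → t is TERMINATED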
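Answer template: draw the six states as rounded boxes → connect them with arrows (must include start(), wait/notify, sleep/timeout, lock contention, and run() finishing) → end with "BLOCKED = could not grab the lock; WAITING = chose to wait; TIMED_WAITING = WAITING with a timeout". Getting these three sentences right earns most of the marks.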
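To watch the states change, Thread.getState() can be polled from another thread. The sketch below is an illustration under assumptions (the class name, the spin-wait helper, and the local monitor object are all made up here); it observes NEW, BLOCKED, WAITING, and TERMINATED in a deterministic order. TIMED_WAITING is left out because catching it reliably depends on timing.

```java
import java.util.ArrayList;
import java.util.List;

class ThreadStateDemo {
    // Spin until the thread is seen in the wanted state (acceptable for a demo only).
    static Thread.State awaitState(Thread t, Thread.State wanted) {
        Thread.State s;
        while ((s = t.getState()) != wanted) {
            Thread.yield();
        }
        return s;
    }

    static List<Thread.State> observe() {
        final Object monitor = new Object();
        final boolean[] released = { false };              // guarded by monitor
        List<Thread.State> seen = new ArrayList<>();

        Thread t = new Thread(() -> {
            synchronized (monitor) {                       // BLOCKED while main holds the monitor
                while (!released[0]) {                     // while, not if: re-check after every wake-up
                    try {
                        monitor.wait();                    // WAITING; the monitor is released meanwhile
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        });
        seen.add(t.getState());                            // NEW: start() not called yet

        synchronized (monitor) {
            t.start();
            seen.add(awaitState(t, Thread.State.BLOCKED)); // t wants the monitor that main holds
        }
        seen.add(awaitState(t, Thread.State.WAITING));     // t got the monitor and called wait()

        synchronized (monitor) {
            released[0] = true;
            monitor.notifyAll();
        }
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        seen.add(t.getState());                            // TERMINATED after run() returns
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observe());   // [NEW, BLOCKED, WAITING, TERMINATED]
    }
}
```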

Race Condition — the classic SharedDataTest 競爭條件 — 經典例題

Two threads both read sum, both write back sum+1 → one update is lost. Increase threads or reps → higher overlap %.

// Non-atomic: UNSAFE
public void increment() {
    int tmp = sum;           // A: read
    tmp = tmp + 1;           // B: modify (another thread can run between any two steps)
    sum = tmp;               // C: write
}

// Fix 1: synchronized method
public synchronized void increment() { sum = sum + 1; }

// Fix 2: explicit lock (finer control)
lock.lock();
try { sum += 1; }
finally { lock.unlock(); }

Key insight: the critical section is read-modify-write on shared state. Mutual exclusion makes it atomic.
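A small experiment makes the lost updates visible. This sketch is not the course's SharedDataTest; the class name and the three counters are assumptions chosen to compare an unprotected field, a synchronized method, and an AtomicInteger side by side.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {
    private int unsafeSum;                                  // plain field: increments can be lost
    private int lockedSum;                                  // only changed inside a synchronized method
    private final AtomicInteger atomicSum = new AtomicInteger();

    private synchronized void lockedIncrement() {
        lockedSum++;
    }

    // Returns {unsafe, locked, atomic} totals after all threads have finished.
    int[] run(int threads, int reps) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int r = 0; r < reps; r++) {
                    unsafeSum++;                            // read-modify-write with no protection
                    lockedIncrement();                      // mutual exclusion
                    atomicSum.incrementAndGet();            // lock-free atomic update
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers) {
                w.join();                                   // join() also makes the workers' writes visible
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { unsafeSum, lockedSum, atomicSum.get() };
    }

    public static void main(String[] args) {
        int[] totals = new CounterDemo().run(4, 100_000);
        System.out.println("unsafe=" + totals[0] + " locked=" + totals[1] + " atomic=" + totals[2]);
    }
}
```

The locked and atomic totals always equal threads × reps. The unsafe total can never exceed that and is often lower, by an amount that varies from run to run.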

Lock + Condition (BankAccount pattern) 鎖 + 條件變量(銀行例題)

Lock lock = new ReentrantLock();
Condition getDeposit = lock.newCondition();

void deposit(int n) {
  lock.lock();
  try {
    balance += n;
    getDeposit.signalAll();   // wake all waiters
  } finally { lock.unlock(); }
}

void withdraw(int n) throws InterruptedException {   // await() can throw InterruptedException
  lock.lock();
  try {
    while (balance < n)     // ⭐ while, not if
      getDeposit.await();      // releases lock + waits
    balance -= n;
  } finally { lock.unlock(); }
}
  • while (not if): protects against spurious wake-ups and multiple waiters.
  • await(): releases lock → waits → re-acquires on wake.
  • signalAll() over signal(): safer when several waiters need different amounts. A single signal() may wake a waiter that still cannot proceed while one that could proceed stays asleep.
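Put together as a complete class, the pattern might look like the sketch below. The balance() accessor, the demo() method, and the amounts (100, 250, 300) are assumptions added so the example can run on its own.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class BankAccount {
    private final Lock lock = new ReentrantLock();
    private final Condition getDeposit = lock.newCondition();
    private int balance;                                   // guarded by lock

    void deposit(int n) {
        lock.lock();
        try {
            balance += n;
            getDeposit.signalAll();                        // wake every waiter; each re-checks its own condition
        } finally {
            lock.unlock();
        }
    }

    void withdraw(int n) throws InterruptedException {
        lock.lock();
        try {
            while (balance < n) {                          // while, not if
                getDeposit.await();                        // releases the lock, waits, re-acquires on wake
            }
            balance -= n;
        } finally {
            lock.unlock();
        }
    }

    int balance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }

    // One withdrawer needs 300; two deposits arrive (100, then 250).
    static int demo() {
        BankAccount account = new BankAccount();
        Thread withdrawer = new Thread(() -> {
            try {
                account.withdraw(300);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        withdrawer.start();
        account.deposit(100);      // if the withdrawer is already waiting, it wakes, sees 100 < 300, waits again
        account.deposit(250);      // 350 >= 300, so the withdrawal can go through
        try {
            withdrawer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return account.balance();
    }

    public static void main(String[] args) {
        System.out.println(demo());   // 350 - 300 = 50
    }
}
```

Whatever order the threads happen to run in, the final balance is 50, because the withdrawal only proceeds once the condition balance >= 300 holds.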

Common Concurrency Bugs 常見並行錯誤

Deadlock

// T1: lock1 → lock2     T2: lock2 → lock1  → DEADLOCK

Fixes: fixed lock ordering, tryLock(timeout), higher-level primitives.
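The first fix, fixed lock ordering, can be sketched as follows. The Account class, its id field, and the rule "lock the lower id first" are assumptions for illustration; any global ordering that every thread follows works the same way.

```java
import java.util.concurrent.locks.ReentrantLock;

class OrderedTransfer {
    static final class Account {
        final int id;
        int balance;                                       // guarded by lock
        final ReentrantLock lock = new ReentrantLock();

        Account(int id, int balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    // Always lock the account with the lower id first, so two opposite
    // transfers can never each hold one lock while waiting for the other.
    static void transfer(Account from, Account to, int amount) {
        Account first = from.id < to.id ? from : to;
        Account second = first == from ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    // T1 moves money a → b while T2 moves money b → a: the classic deadlock setup.
    static int[] demo() {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < 10_000; i++) transfer(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 10_000; i++) transfer(b, a, 1); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { a.balance, b.balance };
    }

    public static void main(String[] args) {
        int[] balances = demo();
        System.out.println(balances[0] + " " + balances[1]);   // 1000 1000
    }
}
```

If transfer() locked from before to instead, T1 would take lock 1 then want lock 2 while T2 takes lock 2 then wants lock 1, which is exactly the deadlock shown above.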

Starvation

Using signal() may always wake the same thread → others starve. Prefer signalAll() / fair locks.

Spurious wake-up

Always re-check the condition in a while after await().

Pessimistic vs Optimistic Control 悲觀鎖 vs 樂觀鎖(深入版)

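In one sentence: pessimistic locking is "assume someone will compete, so lock first"; optimistic locking is "assume nobody will compete, so do the work and validate afterwards". The key is whether you expect conflicts to be frequent or rare.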

Pessimistic Locking 悲觀鎖

Assumes conflicts will happen. Lock the resource before touching it so no one else can interfere.

1. Acquire lock
2. Read data
3. Modify
4. Commit
5. Release lock
-- SQL example: "SELECT ... FOR UPDATE" acquires a row-level exclusive lock
BEGIN;
SELECT balance FROM Account WHERE id=1 FOR UPDATE;  -- others wait
UPDATE Account SET balance = balance - 300 WHERE id=1;
COMMIT;                                                  -- lock released
  • Real-world analogy: a toilet door lock. Once you go in you lock the door, and everyone else has to wait.
  • Guarantees no lost updates or dirty reads.
  • Cost: other transactions block and wait; deadlock possible if multiple locks acquired out of order.

Optimistic Concurrency Control (OCC) 樂觀鎖(無鎖)

Assumes conflicts are rare. Don't lock anything — just work on a local copy, then at commit time check if anyone else modified the data. If yes, rollback and retry.

Three phases:

1. Working: read + modify on local copy, no lock
2. Validation: check version / timestamp still matches
3. Update: if valid, commit; otherwise roll back + retry

Conflict Detection — 2 techniques 衝突檢測方法

Method | How it works | Example
Version number 版本號 | Each row has a version column. Read version at start, on update check it's still the same, then version++. | UPDATE Account SET balance=200, version=4 WHERE id=1 AND version=3. If the affected row count is 0, someone else updated first → retry.
Timestamp 時間戳 | Each transaction gets a start-timestamp. At commit, check no other transaction with a later timestamp has modified the same data. | Used in MVCC databases (PostgreSQL, Oracle).
// JPA: a single @Version field turns on optimistic locking automatically
@Entity
class Account {
  @Id Long id;
  Double balance;
  @Version Long version;    // JPA bumps this on every update
}

// At commit time, if another transaction already bumped the version,
// JPA throws OptimisticLockException — application must catch & retry.
  • Analogy: editing Wikipedia. Nobody stops you while you are editing, but if someone else saved first, you have to refresh and merge before you can save.
  • No blocking → higher throughput when few conflicts.
  • When conflicts do happen, the whole transaction is wasted and must retry.
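The working → validation → update cycle can be imitated in memory with compare-and-swap. This is a sketch of the idea only, not JPA and not a database: the Snapshot class and the attempt counter are assumptions, and compareAndSet stands in for "UPDATE ... WHERE version = ?".

```java
import java.util.concurrent.atomic.AtomicReference;

class OptimisticAccount {
    // Immutable snapshot: a balance plus the version it was written at.
    static final class Snapshot {
        final int balance;
        final long version;

        Snapshot(int balance, long version) {
            this.balance = balance;
            this.version = version;
        }
    }

    private final AtomicReference<Snapshot> state;

    OptimisticAccount(int initialBalance) {
        state = new AtomicReference<>(new Snapshot(initialBalance, 0));
    }

    // Returns how many attempts were needed before the commit succeeded.
    int deposit(int amount) {
        int attempts = 0;
        while (true) {
            attempts++;
            Snapshot read = state.get();                               // working: read without locking
            Snapshot next = new Snapshot(read.balance + amount, read.version + 1);
            if (state.compareAndSet(read, next)) {                     // validation + update in one atomic step
                return attempts;
            }
            // Another thread committed first: discard 'next' and retry from a fresh read.
        }
    }

    Snapshot current() {
        return state.get();
    }

    static Snapshot demo(int threads, int depositsPerThread) {
        OptimisticAccount account = new OptimisticAccount(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int d = 0; d < depositsPerThread; d++) {
                    account.deposit(1);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return account.current();
    }

    public static void main(String[] args) {
        Snapshot s = demo(4, 1000);
        System.out.println("balance=" + s.balance + " version=" + s.version);   // balance=4000 version=4000
    }
}
```

No thread ever blocks, but under heavy contention the retry count grows, which is the wasted work the bullet list above warns about.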

Side-by-side 詳細對比

Aspect | Pessimistic | Optimistic (OCC)
Assumption | Conflicts likely | Conflicts rare
Locking | Lock acquired before read/write | No lock; validate at commit
Blocking? | Yes: others wait | No: everyone works in parallel
Concurrency | Low under contention | High when conflicts rare
Failure mode | Deadlock possible; needs timeout / lock ordering | Rollback + retry on validation failure
Wasted work | None once lock held (always succeeds) | Entire transaction lost on conflict
Detection | Database lock manager | Version number / timestamp
Implementation | SELECT ... FOR UPDATE, LOCK TABLE, pessimistic JPA lock modes | @Version column, MVCC, CAS (Compare-And-Swap)
Best for | Hot write paths, short transactions, money transfer | Read-heavy, long transactions, wiki edits, reporting

When to pick which — decision guide 點揀?

Scenario | Choice | Why
Bank transfer / account debit | Pessimistic | Contention on the same row is likely; must never overdraft
Inventory decrement at Black Friday | Pessimistic | Many buyers compete for last items, so updates need to be serialised
User editing their own profile | Optimistic | Same user rarely conflicts with themselves
Collaborative wiki / document | Optimistic | Most edits are non-overlapping paragraphs; retry is cheap
Reporting / analytics queries | Optimistic (or MVCC snapshot) | Read-only; blocking writers would hurt throughput
Ticket booking (high contention, last seat) | Pessimistic | Rollback-and-retry loops would thrash under heavy conflict
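Exam answer template: "If conflicts are frequent and redoing work is expensive → pessimistic locking (e.g. bank transfer). If conflicts are rare and concurrent throughput matters → optimistic locking (e.g. a user editing their own profile)." Close with the bank or wiki example to secure the marks.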

Lock Types & Granularity 鎖類型與粒度

Type | Concurrent holders | Use
Shared (Read) | Many | Read-only
Exclusive (Write) | One | Read + Write

Granularity: record < page < table < database. Smaller = more concurrency but more lock overhead.
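Java's ReentrantReadWriteLock follows the same shared / exclusive rule, so it makes a convenient in-memory illustration (it is not a database lock manager). The class name and the demo() shape are assumptions.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class SharedExclusiveDemo {
    // Returns {secondReaderGotLock, writerGotLock} while the main thread holds a read lock.
    static boolean[] demo() {
        ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
        boolean[] result = new boolean[2];

        rw.readLock().lock();                              // main thread takes a shared (read) lock
        try {
            Thread other = new Thread(() -> {
                result[0] = rw.readLock().tryLock();       // shared + shared: compatible
                if (result[0]) {
                    rw.readLock().unlock();
                }
                result[1] = rw.writeLock().tryLock();      // exclusive while a reader exists: refused
                if (result[1]) {
                    rw.writeLock().unlock();
                }
            });
            other.start();
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            rw.readLock().unlock();
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("second reader: " + r[0] + ", writer: " + r[1]);   // second reader: true, writer: false
    }
}
```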

Centralised vs Decentralised Coordination 中心化 vs 去中心化

Aspect | Centralised lock | Distributed lock
Mechanism | Single server holds all locks | Lock logic spread across nodes
Pros | Simple, consistent | No SPOF, scales with nodes
Cons | Single point of failure, bottleneck | More messages, complex
Use case | Small cluster, quick POC, apps where correctness > scale (e.g. Redis-based lock, single Zookeeper leader) | Large globally-distributed systems, blockchain, P2P networks, high-availability clusters where no node can be a SPOF

Two-Phase Commit (2PC) 兩階段提交(深入版)

Problem solved: making a transaction atomic across multiple nodes (e.g. bank A and bank B are on different databases). Either all nodes commit, or all abort — never a mix.

Roles 角色

  • Coordinator — one node drives the protocol (e.g. transaction manager).
  • Participants — the other nodes that hold the data being changed.

Phase 1 — Prepare / Voting 階段一:準備投票

  1. Coordinator sends PREPARE to every participant.
  2. Each participant does its local work, writes an undo + redo log, and locks the resources.
  3. Participant replies YES (ready to commit) or NO (must abort).

Phase 2 — Commit / Abort 階段二:提交/中止

  1. If all participants voted YES → coordinator writes COMMIT to its log, sends COMMIT to all.
  2. If any voted NO (or timed out) → coordinator sends ABORT to all.
  3. Participants finalise (release locks, apply or undo changes) and send ACK.
Coordinator                 Participants (P1, P2, ...)
    |  ── PREPARE ─────────►  |  do work, write log, lock
    |                         |
    |  ◄── YES / NO ─────────  |  vote
    |                         |
    |  (all YES?)             |
    |  ── COMMIT or ABORT ──► |  finalise
    |                         |
    |  ◄── ACK ──────────────  |

Why it can block 點解 2PC 會卡死

  • Coordinator crashes after PREPARE but before sending COMMIT/ABORT → participants are stuck holding locks, not knowing the outcome. This is 2PC's famous weakness: it is a blocking protocol.
  • Participant failure after voting YES → on recovery it must consult the log and the coordinator to learn the outcome.
  • Mitigation: add a third phase (3PC) or replace with consensus protocols (Paxos, Raft) or the Saga pattern for long-running business flows.
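The happy path and the abort path of the protocol can be sketched in memory as below. This is a teaching sketch only: the Participant interface and the log strings are assumptions, and it deliberately leaves out what makes real 2PC hard (durable logs, timeouts, and coordinator crashes).

```java
import java.util.ArrayList;
import java.util.List;

class TwoPhaseCommitDemo {
    interface Participant {
        boolean prepare();   // phase 1: do local work, lock resources, vote YES (true) or NO (false)
        void commit();       // phase 2, if everyone voted YES
        void abort();        // phase 2, if anyone voted NO
    }

    // A participant with a fixed vote that records what it was asked to do.
    static final class LoggingParticipant implements Participant {
        private final String name;
        private final boolean vote;
        private final List<String> log;

        LoggingParticipant(String name, boolean vote, List<String> log) {
            this.name = name;
            this.vote = vote;
            this.log = log;
        }

        @Override public boolean prepare() {
            log.add(name + ":PREPARE->" + (vote ? "YES" : "NO"));
            return vote;
        }

        @Override public void commit() { log.add(name + ":COMMIT"); }

        @Override public void abort() { log.add(name + ":ABORT"); }
    }

    // Coordinator: commit only if every participant voted YES.
    static boolean coordinate(List<Participant> participants) {
        boolean allYes = true;
        for (Participant p : participants) {               // phase 1: collect votes
            if (!p.prepare()) {
                allYes = false;
            }
        }
        for (Participant p : participants) {               // phase 2: same decision for everyone
            if (allYes) {
                p.commit();
            } else {
                p.abort();
            }
        }
        return allYes;
    }

    static List<String> demo(boolean... votes) {
        List<String> log = new ArrayList<>();
        List<Participant> participants = new ArrayList<>();
        for (int i = 0; i < votes.length; i++) {
            participants.add(new LoggingParticipant("P" + (i + 1), votes[i], log));
        }
        coordinate(participants);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo(true, true));    // [P1:PREPARE->YES, P2:PREPARE->YES, P1:COMMIT, P2:COMMIT]
        System.out.println(demo(true, false));   // [P1:PREPARE->YES, P2:PREPARE->NO, P1:ABORT, P2:ABORT]
    }
}
```

The blocking problem lives in the gap between the two loops: if the coordinator stopped there, every participant that voted YES would be left holding its locks with no way to learn the decision.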

2PC vs OCC — don't confuse 唔好同樂觀並行撈亂

Aspect | 2PC | OCC
Problem | Atomicity across multiple nodes | Concurrency on a single resource
Scope | Distributed transactions | Local (or distributed) concurrency control
Locks? | Yes: participants lock during Phase 1 | No: validate at commit
Failure mode | Block until coordinator recovers | Rollback + retry
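Exam answer: remember that 2PC solves atomicity across nodes, not concurrency. Describe the whole flow as "Prepare → vote → Commit/Abort", then add "if the coordinator becomes unreachable the participants block, so large systems tend to use the Saga pattern instead" for a complete answer.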

4. Databases & Frameworks 資料庫與框架

Key points: compare SQL vs NoSQL; explain ACID with a bank transfer; compare Java EE vs Spring; memorise the JPA lifecycle (New → Managed → Detached → Removed) and the common annotations (@Entity / @Id / @Column / @ManyToOne...). For Spring MVC you must be able to describe the DispatcherServlet flow. REST vs SOAP = flexible vs strict.

SQL vs NoSQL 關聯式 vs 非關聯式

Aspect | SQL (Relational) | NoSQL
Schema | Fixed, strongly typed | Flexible, schema-less
Data model | Tables, rows, joins | Key-value / Column / Document / Graph
Query | Standard SQL | Vendor-specific APIs (Mongo, Redis, etc.)
Scaling | Vertical, sharding hard | Horizontal by design
Consistency | Strong (ACID) | Often eventual (BASE)
Best for | Complex relations, transactions | Large, unstructured or evolving data
Use case | Banking ledger, ERP, airline booking, HR records: anywhere integrity and joins matter | Social feeds (Twitter timeline), product catalogue, IoT sensor data, real-time analytics, session cache (Redis)

4 NoSQL families: key-value (Redis — session cache), column (Cassandra — time series), document (MongoDB — CMS/profiles), graph (Neo4j — social networks, fraud).

Relational Concepts 關聯式資料庫概念

  • Primary Key — unique, non-null, one per table (Entity Integrity)
  • Foreign Key — matches a PK elsewhere or is NULL (Referential Integrity)
  • Cardinalities: 1:1, 1:N (FK on many side), M:N (junction table)

SQL categories

DDL: CREATE / ALTER / DROP
DML: INSERT / UPDATE / DELETE / SELECT

Transactions & ACID 事務與 ACID 特性(深入版)

A transaction is an atomic unit of work: all or nothing. Boundaries: BEGIN → COMMIT | ROLLBACK. Essential for maintaining correctness in the face of failures and concurrent access.

Mnemonic: ACID = Atomicity · Consistency · Isolation · Durability. A single bank-transfer example is enough to explain all four letters.

Running example: transfer $1000 from account A to account B

BEGIN TRANSACTION;
  UPDATE Account SET balance = balance - 1000 WHERE id = 'A';  -- Step 1: Withdraw
  UPDATE Account SET balance = balance + 1000 WHERE id = 'B';  -- Step 2: Deposit
COMMIT;
Property | What it guarantees | Violation looks like… | How the DB enforces it
A: Atomicity 原子性 ("all or nothing") | All operations in the transaction succeed together, or the transaction is aborted and the database is rolled back to its pre-transaction state. | Server crashes after Step 1 but before Step 2 → A loses $1000, B never gets it. Money vanished. ❌ | Undo log / write-ahead log (WAL). On crash, uncommitted work is rolled back.
C: Consistency 一致性 ("follows the rules") | The DB moves from one valid state to another valid state. All integrity constraints, FK, triggers, and business rules hold at commit time. | Withdraw succeeds even though A's balance would go below 0 (if "balance ≥ 0" is a rule). Rule broken. ❌ | Constraint checks: NOT NULL, UNIQUE, CHECK, FOREIGN KEY. Transaction aborts if any fails.
I: Isolation 隔離性 ("no peeking at each other") | Concurrent transactions don't see each other's uncommitted or partial changes. Each transaction appears to run alone. | Another transaction reads A's balance between Step 1 and Step 2 → sees a total that's $1000 short (dirty read). ❌ | Locks (pessimistic) or versioning / MVCC (optimistic). Isolation levels: Read Uncommitted → Serializable.
D: Durability 持久性 ("once written, never lost") | Once a transaction is committed, its effects survive system crashes, power loss, or restarts. | Commit succeeds, server crashes, on restart the transfer is missing. ❌ | Force-write commit record to disk (fsync) + redo log. Recovery replays the log on restart.

Isolation Anomalies — what "I" prevents 隔離唔夠會出咩問題

Anomaly | Scenario | Prevented by isolation level
Dirty Read 髒讀 | T2 reads data T1 has written but not yet committed; T1 then rolls back → T2 saw imaginary data. | Read Committed or higher
Non-Repeatable Read 不可重複讀 | T1 reads a row, T2 updates and commits, T1 reads it again and gets a different value. | Repeatable Read or higher
Phantom Read 幻讀 | T1 runs the same range query twice; between runs T2 inserts a matching row → T1 sees a new "phantom". | Serializable
Lost Update 丟失更新 | T1 and T2 both read balance=500, both set balance=balance+100, one write is lost. | Serializable, or use locking / @Version

ACID vs BASE 兩套哲學

Aspect | ACID (RDBMS) | BASE (NoSQL)
Philosophy | Strong correctness first | Availability and scale first
Consistency model | Strong: reads always see latest committed | Eventual: reads may be briefly stale, converge later
Scale style | Usually vertical (single node) | Horizontal (many nodes)
Good for | Banking, accounting, inventory | Social feeds, product catalogs, IoT streams
Tied to | CAP → CP | CAP → AP
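Exam answer template: define a transaction → list the four ACID letters with one sentence each → walk through the transfer example letter by letter, saying what a violation would look like → if there is space, add that ACID suits RDBMS and BASE suits NoSQL (the scale vs correctness trade-off).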

Java EE vs Spring Java EE vs Spring 框架

Aspect | Java EE (COMP S368) | Spring (COMP 4680)
Style | Heavyweight spec, container-driven | Lightweight, POJO + IoC
Persistence | JPA (EclipseLink default) + JTA | Spring Data JPA (Hibernate)
Transactions | JTA, @TransactionAttribute | @Transactional
Web | Servlet / JSP / JSF | Spring MVC / Spring Boot
Deploy | WAR/EAR in app server | Executable JAR, embedded Tomcat
Use case | Legacy / large enterprise systems, regulated industries still on WebLogic / WebSphere, apps needing JTA distributed transactions | New microservice projects, cloud-native / Spring Cloud apps, startup-to-enterprise web backends, REST APIs, fast iteration

JPA Essentials JPA 核心

  • @Entity + @Id are mandatory; add @GeneratedValue for auto-PK
  • Class: non-final, public/protected no-arg constructor, usually Serializable
  • Customise mapping: @Table, @Column, @Temporal, @Transient
  • Relationships: @OneToOne, @OneToMany(mappedBy=...), @ManyToOne, @ManyToMany

Entity lifecycle

New → persist() → Managed
Managed → remove() → Removed
Managed → context close → Detached
Detached → merge() → Managed

Spring Data JPA Spring Data JPA(免寫 SQL)

interface CustomerRepository extends JpaRepository<Customer,Long> {
  Optional<Customer> findByCustName(String n);          // derived
  @Query("select c from Customer c where c.email like :e")
  List<Customer> byEmail(@Param("e") String e);          // declared
}
  • save() is an upsert — insert if id is null, update otherwise. Don't mutate the PK.
  • Inherits findById, findAll, existsById, deleteById.
  • JPQL uses entity/field names, not table/column names → vendor-independent.

Spring MVC & the DispatcherServlet Spring MVC 流程(必考)

HTTP request
DispatcherServlet
Handler (@Controller + @RequestMapping)
Model + view name
ViewResolver (JSP / Thymeleaf)
HTML response

Key annotations

  • @Controller / @RestController
  • @GetMapping / @PostMapping
  • @RequestParam / @PathVariable
  • @ModelAttribute
  • @Valid + BindingResult
  • @SessionScope / @ApplicationScope

PRG pattern

POST → return "redirect:/success" → GET. Prevents duplicate submits on refresh; flash attributes carry one-shot messages.

REST vs SOAP REST vs SOAP 服務

Feature | REST | SOAP
Type | Architectural style | Protocol (XML-based)
Standards | Flexible | Strict (WS-* stack)
Data format | JSON, XML, HTML, etc. | XML only
Security | Inherits transport (HTTPS, OAuth) | Built-in WS-Security
Bandwidth | Light | Heavier
State | Stateless | Can be stateful
Best for | Web, mobile, public APIs | Enterprise, strict contracts, banking
Use case | Public API (Twitter, Stripe, GitHub), mobile backends, microservice-to-microservice over HTTP/JSON, Single-Page App backends | Bank-to-bank payment systems, government / healthcare integrations (HL7, SWIFT), legacy B2B EDI with strict WSDL contracts

Messaging (Unit 7) 訊息系統(Unit 7)

Two models

Model | Channel | Fan-out
Point-to-Point | Queue | 1 → 1 (one consumer wins)
Publish/Subscribe | Topic | 1 → N (all subscribers)
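The difference in fan-out can be shown with a single-threaded simulation. This is not JMS or any broker API: the class, the "consumerN" / "subscriberN" labels, and the round-robin rule (standing in for "whichever consumer polls first") are all assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

class MessagingModels {
    // Point-to-point: each message leaves the queue once and goes to exactly one consumer.
    static List<String> pointToPoint(List<String> messages, int consumers) {
        Queue<String> queue = new ArrayDeque<>(messages);
        List<String> deliveries = new ArrayList<>();
        int next = 0;
        while (!queue.isEmpty()) {
            deliveries.add("consumer" + next + "<-" + queue.poll());
            next = (next + 1) % consumers;                 // round-robin stands in for competing consumers
        }
        return deliveries;
    }

    // Publish/subscribe: every subscriber receives its own copy of every message.
    static List<String> publishSubscribe(List<String> messages, int subscribers) {
        List<String> deliveries = new ArrayList<>();
        for (String message : messages) {
            for (int s = 0; s < subscribers; s++) {
                deliveries.add("subscriber" + s + "<-" + message);
            }
        }
        return deliveries;
    }

    public static void main(String[] args) {
        List<String> messages = Arrays.asList("m1", "m2");
        System.out.println(pointToPoint(messages, 2));      // [consumer0<-m1, consumer1<-m2]
        System.out.println(publishSubscribe(messages, 2));  // [subscriber0<-m1, subscriber1<-m1, subscriber0<-m2, subscriber1<-m2]
    }
}
```

Two messages and two receivers give 2 deliveries on the queue but 4 on the topic.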

Brokers / MOM

RabbitMQ (AMQP), Apache Kafka (streaming), ActiveMQ, Redis (Pub/Sub)

Tech used in course

  • JMS — Java API: ConnectionFactory → Connection → Session → Producer/Consumer → Destination (Queue/Topic)
  • Spring Cloud Stream — binder abstraction (@EnableBinding, @StreamListener) — swap RabbitMQ/Kafka without code changes
  • OpenFeign — declarative REST client (@FeignClient), boilerplate-free HTTP calls
  • Hystrix — circuit breaker + fallback (@HystrixCommand(fallbackMethod=...))

Service Coordination — Orchestration vs Choreography 服務協調 — 編排 vs 編舞(深入版)

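Analogy: orchestration is a symphony orchestra, where a conductor tells each player when to come in. Choreography is a flash mob: nobody conducts, and each dancer starts moving on hearing the music.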

Orchestration — centralised workflow 編排(中心化)

A single orchestrator service explicitly calls each participant in sequence, waits for responses, and handles errors / compensations.

Customer → OrderService (orchestrator)
             │
             │ 1. POST /reserve → InventoryService ─► OK
             │ 2. POST /charge  → PaymentService   ─► OK
             │ 3. POST /ship    → ShippingService  ─► OK
             │
             ▼
          Order confirmed
          (If any step fails, orchestrator calls compensating endpoints.)
  • Control: orchestrator knows the whole flow.
  • Tools: Spring Cloud Data Flow (SCDF), Spring Batch, BPMN engines.
  • Use when: flow is well-defined, sequential, and needs tight monitoring (business-critical workflows, regulated processes).
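
The orchestrator's control loop, including the compensation path mentioned in the diagram, might look like the sketch below. It is a plain-Java illustration under stated assumptions: `Step`, `placeOrder` and the step names are hypothetical, and `execute()` merely stands in for a remote call such as POST /reserve.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical orchestrator: runs steps in order and, on failure, compensates
// the steps that already succeeded, newest first.
class OrderOrchestratorDemo {

    interface Step {
        String name();
        boolean execute();    // true = OK; stands in for a remote call
        void compensate();    // undo action, e.g. release stock or refund
    }

    // Builds a fake step that records what happened in a shared log.
    static Step step(String name, boolean succeeds, List<String> log) {
        return new Step() {
            public String name() { return name; }
            public boolean execute() {
                log.add(name + ":" + (succeeds ? "ok" : "failed"));
                return succeeds;
            }
            public void compensate() { log.add(name + ":compensated"); }
        };
    }

    // The orchestrator knows the whole flow and handles errors in one place.
    static boolean placeOrder(List<Step> steps) {
        Deque<Step> done = new ArrayDeque<>();
        for (Step s : steps) {
            if (s.execute()) {
                done.push(s);
            } else {
                while (!done.isEmpty()) done.pop().compensate();
                return false;
            }
        }
        return true;
    }

    static List<String> run(boolean paymentSucceeds) {
        List<String> log = new ArrayList<>();
        boolean confirmed = placeOrder(List.of(
                step("reserve", true, log),
                step("charge", paymentSucceeds, log),
                step("ship", true, log)));
        log.add(confirmed ? "order confirmed" : "order cancelled");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(true));
        System.out.println(run(false));
    }
}
```

When the charge step fails, only the reservation is compensated, because shipping never ran.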

Choreography — event-driven 編舞(去中心化)

No central controller. Services publish events and subscribe to events they care about. The overall flow emerges from these reactions.

RegistrationService publishes "UserCreated" event
              │
     ┌────────┼─────────┬───────────┐
     ▼        ▼         ▼           ▼
 EmailSvc  ProfileSvc  AnalyticsSvc  CRMSvc
 (send     (init      (record       (create
  welcome)  profile)   signup)       lead)

Each subscriber reacts independently. Adding a new subscriber
requires no change to the publisher.
  • Control: distributed — no service knows the full flow.
  • Tools: Spring Cloud Stream, Spring Integration, Kafka, RabbitMQ.
  • Use when: loose coupling matters, new consumers added often, flow is fan-out style.

Side-by-side comparison 詳細對比

Dimension | Orchestration | Choreography
Control | Centralised orchestrator | Decentralised: services react to events
Communication | Synchronous calls (REST/RPC) typical | Asynchronous events over a broker typical
Coupling | Orchestrator knows all participants (tighter) | Services know only the event contract (looser)
Flow visibility | Easy: read the orchestrator's code | Hard: need distributed tracing to reconstruct flow
Error handling | Centralised: one place to retry / compensate | Distributed: each service owns its own retries
Flexibility | Rigid: change = modify orchestrator | Flexible: add subscribers without touching publisher
Bottleneck risk | Orchestrator can become SPOF | No single bottleneck
Typical pattern | Saga (orchestration-based) | Saga (choreography-based), EDA
Spring tooling | SCDF, Spring Batch | Spring Cloud Stream, Spring Integration

When to pick which 點揀?

Scenario | Choose | Why
E-commerce checkout (reserve → charge → ship, strict order) | Orchestration | Order matters; failure needs coordinated rollback (Saga-orchestration)
User signup fan-out (email, profile, CRM, analytics) | Choreography | Subscribers independent; new ones added often
Regulated business workflow (loan approval) | Orchestration | Auditability + strict sequence required
IoT event stream (sensor → multiple analytics pipelines) | Choreography | High throughput, loose coupling, many consumers
Batch data pipeline (ETL with dependencies) | Orchestration | Task dependencies + scheduling (SCDF, Airflow)
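Exam answer structure: start with the definitions (centralised vs decentralised) → use the orchestra / flash-mob analogy → contrast two concrete examples (order processing vs signup notifications) → close with the selection rule: "choose Orchestration when the flow is fixed and needs monitoring; choose Choreography when loose coupling and easy extension matter".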

Circuit Breaker Pattern (detailed) 斷路器模式(深入版)

Problem: in a microservice call chain, if one downstream service becomes slow or fails, upstream callers pile up waiting, consume all threads, and the failure cascades through the whole system.

Solution: wrap remote calls in a "circuit breaker" that fails fast after repeated failures, giving the failing service time to recover and protecting the caller from running out of resources.

Three states & transitions 三個狀態轉換

CLOSED (all requests flow through) ── failures ≥ threshold ─► OPEN (fail-fast, return fallback)
OPEN ── after cooldown ─► HALF-OPEN (allow a probe request)
HALF-OPEN ── probe succeeds ─► CLOSED
HALF-OPEN ── probe fails ─► OPEN

State | Behaviour | Transitions out when…
Closed (normal) | Calls pass through; a counter tracks recent failures. | failure count / rate exceeds threshold → Open
Open (tripped) | Calls are not made. The breaker immediately returns a fallback. Protects the failing service and the caller. | cooldown timer expires → Half-Open
Half-Open (testing) | A limited number of probe calls are allowed through to test recovery. | probe succeeds → Closed; probe fails → Open again

Typical parameters 參數

  • Failure threshold — e.g. trip if ≥ 50% failures in last 20 requests.
  • Request volume — minimum calls before stats are evaluated (don't trip on 1 failure).
  • Timeout / sleep window — how long to stay Open before trying Half-Open (e.g. 30 s).
  • Success threshold — how many probes must succeed in Half-Open to close again.
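
The state machine and parameters above can be sketched in plain Java. This is a deliberately simplified, single-threaded illustration: `MiniCircuitBreaker` is a hypothetical class, it trips on a count of consecutive failures rather than a failure percentage over a request window, and it closes after a single successful probe. The current time is passed in so the behaviour is deterministic.

```java
import java.util.function.Supplier;

// Hypothetical single-threaded circuit breaker; not the Hystrix or Resilience4j implementation.
class MiniCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;     // consecutive failures before tripping
    private final long sleepWindowMillis;   // how long to stay OPEN before probing
    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    MiniCircuitBreaker(int failureThreshold, long sleepWindowMillis) {
        this.failureThreshold = failureThreshold;
        this.sleepWindowMillis = sleepWindowMillis;
    }

    State state() { return state; }

    // 'now' is supplied by the caller (milliseconds) to keep the example deterministic.
    <T> T call(Supplier<T> remote, Supplier<T> fallback, long now) {
        if (state == State.OPEN) {
            if (now - openedAt < sleepWindowMillis) {
                return fallback.get();       // fail fast: the remote call is not made
            }
            state = State.HALF_OPEN;         // cooldown over: let one probe through
        }
        try {
            T result = remote.get();
            state = State.CLOSED;            // success (normal call or probe) closes the breaker
            failures = 0;
            return result;
        } catch (RuntimeException e) {
            failures++;
            if (state == State.HALF_OPEN || failures >= failureThreshold) {
                state = State.OPEN;          // trip, or re-open after a failed probe
                openedAt = now;
            }
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        MiniCircuitBreaker cb = new MiniCircuitBreaker(2, 1000);
        Supplier<String> down = () -> { throw new RuntimeException("payment service down"); };
        Supplier<String> fallback = () -> "deferred";
        System.out.println(cb.call(down, fallback, 0) + " " + cb.state());
        System.out.println(cb.call(down, fallback, 10) + " " + cb.state());
        System.out.println(cb.call(() -> "charged", fallback, 500) + " " + cb.state());
        System.out.println(cb.call(() -> "charged", fallback, 1500) + " " + cb.state());
    }
}
```

The third call in `main` returns the fallback even though the remote side has recovered, because the breaker is still inside its sleep window; the fourth call is the probe that closes it again.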

Hystrix code example Hystrix 代碼

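import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import org.springframework.stereotype.Service;

@Service
class PaymentService {

  // Feign client for the remote payment API (PaymentClient, Order and PaymentResult are application types)
  private final PaymentClient paymentClient;

  PaymentService(PaymentClient paymentClient) {
    this.paymentClient = paymentClient;
  }

  @HystrixCommand(fallbackMethod = "paymentFallback",
    commandProperties = {
      // give up on a single call after 2 s
      @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "2000"),
      // evaluate statistics only once 20 requests have been seen in the window
      @HystrixProperty(name = "circuitBreaker.requestVolumeThreshold", value = "20"),
      // trip to Open when at least 50% of those requests failed
      @HystrixProperty(name = "circuitBreaker.errorThresholdPercentage", value = "50"),
      // stay Open for 30 s before allowing a Half-Open probe
      @HystrixProperty(name = "circuitBreaker.sleepWindowInMilliseconds", value = "30000")
    })
  public PaymentResult charge(Order o) {
    return paymentClient.charge(o);      // remote REST call via Feign
  }

  // Same parameters and return type as charge(); called on failure, timeout, or open circuit
  public PaymentResult paymentFallback(Order o) {
    // graceful degradation: queue for later, or return cached approval
    return PaymentResult.deferred(o.getId());
  }
}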

Circuit Breaker vs other resilience patterns 同其他容錯模式嘅分別

Pattern | Protects against | Mechanism
Circuit Breaker | Cascading failure from a slow / failing downstream | Stop calling after threshold; fallback
Bulkhead | One integration exhausting shared threads | Dedicated thread pool / semaphore per downstream
Retry | Transient failures (network blip) | Re-attempt with exponential backoff
Timeout | Waiting forever on an unresponsive callee | Cut off after N ms
Rate Limiter | Over-loading downstream | Cap requests per second
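Exam answer structure: state the three states and the transition conditions clearly (failures exceed the threshold → Open; cooldown expires → Half-Open; probe succeeds → Closed; probe fails → Open again). Use the scenario "the downstream payment service is down and the upstream order service should not wait forever" as the example. Then bring in Bulkhead / Timeout / Retry to present resilience as a complete toolkit.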
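
The Retry row ("re-attempt with exponential backoff") can be illustrated with a small helper. This is a sketch under assumptions: `retry` and `delayMillis` are hypothetical names, the delay doubles after each failed attempt, and no jitter or retryable-exception filtering is included.

```java
import java.util.function.Supplier;

// Hypothetical retry helper with exponential backoff (no jitter, retries every RuntimeException).
class RetryWithBackoffDemo {

    // Delay before retry number n (1-based): base, 2*base, 4*base, ...
    static long delayMillis(long baseMillis, int retryNumber) {
        return baseMillis << (retryNumber - 1);
    }

    static <T> T retry(Supplier<T> call, int maxAttempts, long baseMillis) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;                                  // remember the failure
                if (attempt < maxAttempts) {
                    sleep(delayMillis(baseMillis, attempt)); // back off before the next try
                }
            }
        }
        throw last;                                        // all attempts failed
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while backing off", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = retry(() -> {
            if (++calls[0] < 3) throw new IllegalStateException("transient network blip");
            return "ok";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Retry suits transient faults only; against a downstream that is genuinely down it adds load, which is why it is usually paired with a circuit breaker and a timeout.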

5. Security 資訊安全

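Key points: know both foundations together: CIA (Confidentiality, Integrity, Availability) and AAA (Authentication / Authorization / Accounting), plus Non-repudiation. Hashing is only safe for passwords when a salt is added. Three categories of defence: physical / educational / deterrent. Zero Trust = "never trust, always verify".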

CIA Triad 資安三大基石

Confidentiality — only authorised can read. Tools: encryption, ACLs.

Integrity — data not altered undetected. Tools: hashes, digital signatures, checksums.

Availability — services usable when needed. Tools: redundancy, DDoS mitigation, backups.

The 3 A's — AAA AAA 三大原則

Beyond CIA, security systems rest on the AAA triad.

A | Question answered | Typical mechanisms
Authentication 認證 | "Who are you?" | Password, token, MFA, biometrics, certificate (X.509), digital signature
Authorization 授權 | "What are you allowed to do?" | RBAC (role-based), ABAC (attribute-based), ACL, OAuth2 scopes, policies
Accounting (Auditing) 審計/記錄 | "What did you actually do, and can we prove it later?" | Audit logs, access logs, billing records, tamper-evident log chains

Bonus principle often grouped with AAA

Non-repudiation 不可否認性 — a party cannot later deny they sent / signed a message. Achieved via digital signatures (private-key signs, anyone can verify with public key).

Real example 實例

Online banking transfer:

  • AuthN: user logs in with password + SMS OTP.
  • AuthZ: system checks the user has role ACCOUNT_OWNER for the source account.
  • Accounting: audit log records user, source, destination, amount, timestamp, IP.
  • Non-repudiation: transaction signed with user's key — cannot deny later.
Mnemonic: "Who are you (AuthN) → what may you do (AuthZ) → what did you do (Accounting) → can you deny it (Non-repudiation)". CIA + AAA = a complete security framework.

Hashing vs Salted Hashing 哈希 vs 加鹽哈希

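Hash = one-way function (e.g. SHA-256). Same input → same hash. Password hashers such as bcrypt are also one-way, but they generate and embed their own salt, so they belong with salted hashing below.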

Plain hash is vulnerable to rainbow tables & precomputed attacks.

Salted hash = hash(password + random_salt), salt stored per user → identical passwords get different hashes.

Aspect | Plain hash | Salted hash
Use case | File integrity check (SHA-256 of a download), Git commit IDs, deduplication; not for passwords | Password storage in user DB, API secret storage, anything where the same input must not produce the same hash across users
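
The formula hash(password + random_salt) can be tried with the JDK alone. The sketch below uses SHA-256 purely to show the effect of the salt; the class and method names are hypothetical, and a production system would normally use a deliberately slow password hasher such as bcrypt or Argon2 instead of a single SHA-256 pass.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Base64;

// Illustrative only: shows why a per-user salt gives identical passwords different hashes.
class SaltedHashDemo {

    // A fresh 16-byte random salt, generated once per user and stored next to the hash.
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // hash(password + salt) with SHA-256, Base64-encoded for storage.
    static String hash(String password, byte[] salt) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(password.getBytes(StandardCharsets.UTF_8));
            byte[] digest = md.digest(salt);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);   // SHA-256 is required on every Java platform
        }
    }

    // Recompute with the stored salt and compare in constant time.
    static boolean verify(String password, byte[] salt, String storedHash) {
        return MessageDigest.isEqual(
                hash(password, salt).getBytes(StandardCharsets.UTF_8),
                storedHash.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] saltAlice = newSalt();
        byte[] saltBob = newSalt();
        String alice = hash("hunter2", saltAlice);
        String bob = hash("hunter2", saltBob);
        System.out.println("same password, same hash? " + alice.equals(bob));
        System.out.println("alice login ok? " + verify("hunter2", saltAlice, alice));
    }
}
```

Because the salts differ, a precomputed table built for unsalted SHA-256 is of no use against either stored value.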

Common Attacks & Defences 常見攻擊與防禦(深入版)

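Exam strategy: for each attack, be able to answer (1) how it happens / works → (2) what happens to the victim → (3) which part of CIA is violated → (4) how to defend. The more concrete the example (bank, phishing email, social engineering), the more marks.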
Attack 攻擊 | How it works | Violates | Defences
Phishing 釣魚 | Attacker sends a fake email / SMS pretending to be a trusted party (bank, IT, boss) and tricks user into clicking a malicious link or handing over credentials. | Confidentiality | User training, anti-phishing filters, SPF / DKIM / DMARC, MFA, URL inspection, report-phish button
SQL Injection SQL 注入 | User input is concatenated into SQL; attacker supplies ' OR '1'='1 or ; DROP TABLE users; -- to break out of the query and read / modify / delete data. | C + I + A | Parameterised / prepared statements, ORM / JPA, input validation, least-privilege DB account, WAF
XSS (Cross-Site Scripting) 跨站腳本 | Attacker injects JavaScript into a page viewed by other users (stored in DB, reflected from URL, or DOM-based). Script runs in victim's browser and steals cookies / session tokens. | Confidentiality, Integrity | Output encoding (HTML-escape), Content-Security-Policy (CSP), HttpOnly cookies, input sanitisation, template engines that escape by default (Thymeleaf)
CSRF (Cross-Site Request Forgery) 跨站請求偽造 | Victim is logged into Site A. Attacker tricks their browser into submitting a request to Site A (e.g. transfer money); the browser attaches the valid session cookie automatically. | Integrity | CSRF tokens, SameSite=Strict cookies, require re-auth for sensitive actions
MITM (Man-in-the-Middle) 中間人攻擊 | Attacker sits between client and server (e.g. on a public Wi-Fi) and intercepts / modifies traffic. Can steal credentials, downgrade protocols, or impersonate either side. | Confidentiality, Integrity | TLS / HTTPS, HSTS, certificate pinning, avoid untrusted Wi-Fi, VPN
DoS / DDoS 拒絕服務 / 分散式 | Attacker floods target with traffic / requests to exhaust CPU, bandwidth, or connection pool so legitimate users can't get service. DDoS uses a botnet for amplified volume. | Availability | Rate limiting, CDN / scrubbing, auto-scaling, WAF, SYN-cookies, traffic filtering (Cloudflare, AWS Shield)
Brute-force / Credential Stuffing 暴力破解 / 撞庫 | Brute-force: try many password guesses. Credential stuffing: replay leaked username-password pairs from other site breaches. | Confidentiality | Strong + salted hashing (bcrypt / Argon2), rate limiting, account lockout, MFA, breach-password detection (haveibeenpwned)
Replay Attack 重放攻擊 | Attacker captures a valid signed / encrypted message and re-sends it later to repeat the effect (e.g. replay an auth token, re-submit a transfer). | Integrity | Nonces, timestamps with short TTL, sequence numbers, one-time tokens
Social Engineering 社交工程 | Attacker manipulates a human (call pretending to be IT, tailgate into building, pretend to be CEO asking for wire transfer) to bypass technical controls. | Confidentiality, Integrity | Awareness training, verify-callbacks, out-of-band approval for sensitive ops, least-privilege, clean-desk policy
Malware / Ransomware 惡意軟件 / 勒索 | Malicious code installed on victim's machine / server to steal, encrypt, or destroy data. Ransomware encrypts files and demands payment. | C + I + A | Anti-virus / EDR, patching, least privilege, air-gapped backups, application allow-listing, user training
Privilege Escalation 權限提升 | Attacker starts with low-privilege access and exploits a vulnerability (misconfig, kernel bug, IDOR) to gain admin/root. | Confidentiality, Integrity | Least-privilege design, patching, audit logs, sudo / RBAC review, separation of duties
Eavesdropping / Sniffing 竊聽 | Attacker passively captures unencrypted network traffic (HTTP, FTP, Telnet) and reads credentials / data. | Confidentiality | Encrypt everything (TLS, SSH, VPN), disable legacy plaintext protocols, WPA3 on Wi-Fi

Example answer — one attack end-to-end 一個完整示範

Q: "Explain phishing and how an organisation defends against it."

Phishing is a social-engineering attack in which the attacker sends a forged message (email, SMS, chat) that looks like it came from a trusted party — typically a bank, IT helpdesk, or colleague — in order to trick the victim into revealing credentials or clicking a malicious link. It primarily breaks the Confidentiality leg of the CIA triad, because user credentials leak to the attacker. Defences combine education (training users to spot suspicious URLs, hover to verify, never reply with passwords), technical controls (SPF / DKIM / DMARC to authenticate senders, anti-phishing email gateways, URL sandboxing), and compensating controls (MFA so a leaked password alone is insufficient, rapid-revocation of compromised sessions, and audit logging for incident response).

Three categories of prevention

  • Physical — locks, access control, CCTV, guards, server room policies.
  • Educational — training users to spot phishing, safe links, password hygiene.
  • Deterrent — visible monitoring, banners, logging to discourage insiders/attackers.

Zero Trust

"Never trust, always verify." No implicit trust from network location. Every request is authenticated, authorised, and encrypted. Plan Zero Trust before designing the architecture, not after.

6. Resilience & Integration Patterns (Unit 7 + Cloud Native)

Circuit Breaker

States: Closed → Open → Half-Open

Stops cascading failures by short-circuiting calls to a failing service, then tests recovery.

Bulkhead

Partition resources (thread pools, connections) so one overloaded component can't sink the rest.

Saga

Long-running transaction as a chain of local transactions; each step has a compensating action to undo on failure. E.g. Order → Payment → Inventory; failures trigger Cancel / Refund / Restock.

Event-Driven Architecture

Components publish events, others subscribe. Loose coupling, async, scalable. Example: "Order Placed" triggers inventory + payment + notification.

Spring Cloud Stack Cheat Sheet

Concern | Tool | Core idea
Externalised config | Spring Cloud Config | Git-backed, dynamic refresh
Service discovery | Eureka | Services register, clients discover
API gateway | Spring Cloud Gateway | Routing, security, rate limiting
Fault tolerance | Hystrix / Resilience4j | Circuit breaker, fallback, timeout
Tracing | Sleuth + Zipkin | Trace/Span IDs across services
Declarative HTTP | OpenFeign | Interface-based REST clients
Messaging | Spring Cloud Stream | Binder over RabbitMQ/Kafka

Exam Strategy — What the Instructor Emphasised

Must be able to DRAW

  • TCP/IP 4-layer model & encapsulation
  • Client-server, multi-server, proxy, P2P, three-tier
  • Race condition timeline (sum interleaving)
  • Lock + Condition state machine (await / signal / signalAll)
  • 2PC coordinator/participant flow
  • Orchestration vs Choreography sequence
  • Circuit breaker states

Must be able to COMPARE (tables)

  • TCP vs UDP
  • SQL vs NoSQL
  • Java EE vs Spring
  • REST vs SOAP
  • Pessimistic vs Optimistic / Centralised vs Decentralised
  • Process vs Thread
  • Orchestration vs Choreography
  • PTP vs Pub/Sub messaging

Worked examples the instructor likes

  • Bank transfer — illustrates transactions, ACID, rollback, lock/condition, race conditions.
  • Joint-account double-withdrawal — motivates concurrency control.
  • SharedDataTest — classic race condition; explain Steps A/B/C interleaving.
  • Order processing — orchestration vs choreography, Saga pattern.
  • Phishing email — educational prevention, digital signature, authentication.
Tip: When a question says "compare" or "discuss," always end with a selection criterion — e.g. "use optimistic for low-conflict workloads; pessimistic when writes clash heavily." Instructors reward showing when to pick each option, not just listing features.

Last-minute checklist

  1. Walk through the six core tutorial examples in your head.
  2. Redraw the key diagrams listed above on paper without notes.
  3. Fill out each comparison table blind.
  4. Explain one annotation per framework: @Entity, @Transactional, @RestController, @FeignClient, @HystrixCommand, @StreamListener.
  5. Rehearse CIA + salted hashing in one breath — security questions are fast marks.
  6. Revisit each unit's Self-Test questions — they hint at the examiner's style.

8. Gap-Fillers (low probability)

Note: These topics (CAP, 2PL, HTTP codes, 12-Factor, Isolation Levels, full annotation list) were not explicitly listed in the instructor's 5/1 revision.

CAP Theorem

In a distributed system, you can guarantee at most two of:

  • Consistency — every read sees the most recent write
  • Availability — every request gets a (non-error) response
  • Partition tolerance — system keeps working despite dropped network messages

In practice, network partitions will happen, so the real trade-off is CP vs AP.

Choice | Examples | Trade-off
CP | HBase, MongoDB (default), ZooKeeper | May refuse requests to stay consistent
AP | Cassandra, DynamoDB, CouchDB | May return stale data to stay up
CA | Traditional RDBMS (single node) | Not partition-tolerant → not truly distributed

ACID vs BASE

ACID (RDBMS) | BASE (NoSQL)
Atomicity | Basically Available
Consistency | Soft state
Isolation | Eventual consistency
Durability | (no fourth term)
Use case: Bank ledger, ERP, inventory: any domain that can't tolerate lost / inconsistent writes. | Use case: Social feeds, "like" counters, product recommendations, IoT telemetry: stale-by-seconds is acceptable.

ACID = strict correctness. BASE = trade consistency for scale & availability.

2PL vs 2PC (don't confuse!)

Aspect | 2PL (Two-Phase Locking) | 2PC (Two-Phase Commit)
Problem solved | Concurrency / isolation | Distributed atomicity
Phases | Growing (acquire) → Shrinking (release) | Prepare → Commit/Abort
Actors | One DB, many transactions | Coordinator + multiple participants
Goal | Serialisable schedules | All nodes commit or all abort
Use case | Inside a single DB engine that schedules concurrent txns with locks (lock-based serialisable isolation; note that the MySQL / PostgreSQL default isolation levels rely on MVCC rather than pure 2PL) | Cross-DB / cross-service txns (classic XA, JTA), e.g. transfer money when A and B are on different databases

2PC flow: Coordinator: "Prepare?" → Participants vote Yes/No → Coordinator: "Commit" or "Abort"

2PC weakness: if coordinator crashes after prepare, participants block.
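
The prepare / vote / decide flow can be sketched as follows. This is a minimal in-memory illustration: `Participant`, `Db` and `runTransaction` are hypothetical names, and the sketch leaves out the durable logging, timeouts and coordinator-crash handling that make real 2PC hard.

```java
import java.util.List;

// Hypothetical in-memory 2PC: phase 1 collects votes, phase 2 broadcasts one decision.
class TwoPhaseCommitDemo {

    interface Participant {
        boolean prepare();   // vote: true = Yes, false = No
        void commit();
        void abort();
    }

    // A fake database node that votes as configured and records the final outcome.
    static class Db implements Participant {
        private final boolean votesYes;
        String outcome = "pending";

        Db(boolean votesYes) { this.votesYes = votesYes; }

        public boolean prepare() { return votesYes; }
        public void commit() { outcome = "committed"; }
        public void abort() { outcome = "aborted"; }
    }

    // The coordinator: commit only if every participant voted Yes.
    static boolean runTransaction(List<? extends Participant> participants) {
        boolean allYes = true;
        for (Participant p : participants) {          // phase 1: prepare
            if (!p.prepare()) {
                allYes = false;
                break;                                // one No is enough to abort
            }
        }
        for (Participant p : participants) {          // phase 2: same decision for everyone
            if (allYes) p.commit(); else p.abort();
        }
        return allYes;
    }

    public static void main(String[] args) {
        Db bankA = new Db(true);
        Db bankB = new Db(false);
        boolean committed = runTransaction(List.of(bankA, bankB));
        System.out.println(committed + " " + bankA.outcome + " " + bankB.outcome);
    }
}
```

A single No vote aborts every participant, which is the all-or-nothing guarantee the table describes.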

Coffman's 4 Deadlock Conditions

All four must hold. Prevent any one → no deadlock.

  1. Mutual exclusion — resources are non-sharable
  2. Hold and wait — thread holds one resource while waiting for another
  3. No preemption — cannot forcibly take a resource back
  4. Circular wait — T1 → T2 → ... → T1 waiting chain

Prevention strategies

  • Acquire locks in a fixed global order (breaks circular wait)
  • tryLock(timeout) — release if can't acquire in time (breaks hold-and-wait)
  • Request all resources up front (breaks hold-and-wait)
  • Use higher-level primitives (semaphores, transactions)
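
The first strategy, a fixed global lock order, can be shown with two accounts transferring in opposite directions. This is a sketch under assumptions: `Account` and `transfer` are hypothetical, and account ids are assumed to be unique so that they define the order.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: locks are always taken in ascending account-id order,
// so two opposite transfers can never form a circular wait.
class LockOrderingDemo {

    static class Account {
        final int id;
        int balance;
        final ReentrantLock lock = new ReentrantLock();

        Account(int id, int balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    static void transfer(Account from, Account to, int amount) {
        // Fixed global order: the lower id is locked first, whatever the transfer direction.
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    // Runs A→B and B→A transfers concurrently and returns the final balances.
    static int[] run(int rounds) {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(b, a, 1); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { a.balance, b.balance };
    }

    public static void main(String[] args) {
        int[] balances = run(10_000);
        System.out.println(balances[0] + " " + balances[1]);
    }
}
```

If each thread instead locked its own `from` account first, the two threads could each hold one lock and wait for the other, which is the circular-wait condition.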

HTTP Status Codes (know the families)

Family | Meaning | Key codes
2xx | Success | 200 OK · 201 Created · 204 No Content
3xx | Redirection | 301 Moved · 302 Found · 304 Not Modified
4xx | Client error | 400 Bad Request · 401 Unauthorized · 403 Forbidden · 404 Not Found · 409 Conflict · 429 Too Many Requests
5xx | Server error | 500 Internal · 502 Bad Gateway · 503 Service Unavailable · 504 Gateway Timeout

HTTP methods and idempotency

Method | Use | Idempotent?
GET | Read | Yes
POST | Create / action | No
PUT | Replace | Yes
PATCH | Partial update | No (usually)
DELETE | Remove | Yes

Spring Annotation Quick-Reference

Annotation | Where | Purpose
@SpringBootApplication | Main class | Scan + auto-config + config
@Controller / @RestController | Web layer | MVC controller / JSON API
@Service / @Component / @Repository | Bean | Stereotype for DI
@Autowired | Field/ctor | Inject dependency
@RequestMapping / @GetMapping / @PostMapping | Method | Route mapping
@PathVariable / @RequestParam / @RequestBody | Param | Bind URL / query / body
@ModelAttribute | Param | Bind form to POJO
@Valid + BindingResult | Param | JSR-303 validation
@Transactional | Method/class | Begin/commit/rollback txn
@Entity / @Id / @GeneratedValue | JPA | Map class → table
@Query / @Param | Repository | Declared JPQL query
@EnableFeignClients + @FeignClient | App / interface | Declarative REST client
@EnableHystrix + @HystrixCommand | App / method | Circuit breaker + fallback
@EnableBinding + @StreamListener | SCS | Bind channels / consume
@ControllerAdvice + @ExceptionHandler | Cross-cutting | Global exception handling

Transaction Isolation Levels

Level | Dirty read | Non-repeatable read | Phantom
Read Uncommitted | Yes | Yes | Yes
Read Committed | No | Yes | Yes
Repeatable Read | No | No | Yes
Serializable | No | No | No

Higher isolation = stronger correctness, lower concurrency.

12-Factor App (Cloud-Native)

The checklist for services that run well on the cloud. Know at least these five:

  • Config in env — no hard-coded secrets
  • Stateless processes — scale horizontally
  • Port binding — export services via ports
  • Disposability — fast startup / graceful shutdown
  • Dev/prod parity — minimise environment drift

9. Mnemonics & Memory Tricks

Security

CIA: Confidentiality, Integrity, Availability.
AAA: Authentication (who are you), Authorization (what may you do), Accounting (what did you do). +1: Non-repudiation (can you deny it).
3 defences — "PED": Physical, Educational, Deterrent.
Salted hash = hash(password + salt). Salt is per user, stored alongside the hash. Two users with the same password get different hashes.

Transactions

ACID: Atomic, Consistent, Isolated, Durable.
BASE: Basically Available, Soft state, Eventual consistency.
2PC: "Prepare, Commit" (coordinator asks, all vote, all commit or all abort).
OCC: Work, Validate, Update. Assumes conflict is rare.
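
The OCC cycle of work, validate, update can be sketched with a versioned snapshot and a compare-and-set. This is an in-memory illustration only: `Versioned` and `Account` are hypothetical classes, and a database would typically do the same check with a version column.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative optimistic concurrency control: no lock is held while working;
// the commit succeeds only if nobody else committed since the snapshot was read.
class OptimisticDemo {

    // Immutable snapshot of the data plus its version number.
    static final class Versioned {
        final int balance;
        final long version;

        Versioned(int balance, long version) {
            this.balance = balance;
            this.version = version;
        }
    }

    static class Account {
        private final AtomicReference<Versioned> state;

        Account(int initialBalance) {
            state = new AtomicReference<>(new Versioned(initialBalance, 0));
        }

        // Work phase starts from a snapshot.
        Versioned read() { return state.get(); }

        // Validate + update in one atomic step: fails if 'seen' is stale.
        boolean commit(Versioned seen, int newBalance) {
            return state.compareAndSet(seen, new Versioned(newBalance, seen.version + 1));
        }

        // Full cycle with retry on conflict; returns how many retries were needed.
        int deposit(int amount) {
            int retries = 0;
            while (true) {
                Versioned seen = read();
                if (commit(seen, seen.balance + amount)) return retries;
                retries++;
            }
        }
    }

    public static void main(String[] args) {
        Account account = new Account(100);
        Versioned txA = account.read();
        Versioned txB = account.read();                 // both transactions read version 0
        System.out.println("A commits: " + account.commit(txA, 150));
        System.out.println("B commits: " + account.commit(txB, 120));
        System.out.println("balance: " + account.read().balance);
    }
}
```

Transaction B fails validation because A already moved the version forward, so B must re-read and retry, which is cheap when conflicts are rare and wasteful when they are frequent.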

Concurrency

Coffman's 4 — "MHNC": Mutual exclusion, Hold-and-wait, No preemption, Circular wait.
Always use while, not if when calling await() — guards against spurious wake-ups and multiple waiters.
Lock order matters — "alphabetical locks" — always acquire A before B.
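
The "while, not if" rule can be seen in a small bounded buffer built on `ReentrantLock` and `Condition`. This is a sketch under assumptions: `BoundedBuffer` and `run` are hypothetical names, and the buffer holds plain integers.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative producer/consumer buffer: every await() sits inside a while loop
// that re-checks the condition after waking up.
class BoundedBufferDemo {

    static class BoundedBuffer {
        private final Queue<Integer> items = new ArrayDeque<>();
        private final int capacity;
        private final ReentrantLock lock = new ReentrantLock();
        private final Condition notFull = lock.newCondition();
        private final Condition notEmpty = lock.newCondition();

        BoundedBuffer(int capacity) { this.capacity = capacity; }

        void put(int x) throws InterruptedException {
            lock.lock();
            try {
                while (items.size() == capacity) notFull.await();  // re-check after every wake-up
                items.add(x);
                notEmpty.signalAll();
            } finally {
                lock.unlock();                                     // always release in finally
            }
        }

        int take() throws InterruptedException {
            lock.lock();
            try {
                while (items.isEmpty()) notEmpty.await();          // guards against spurious wake-ups
                int x = items.remove();
                notFull.signalAll();
                return x;
            } finally {
                lock.unlock();
            }
        }
    }

    // A producer thread puts 1..n; the calling thread takes n items and returns their sum.
    static int run(int n) {
        BoundedBuffer buffer = new BoundedBuffer(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) buffer.put(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < n; i++) sum += buffer.take();
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(100));
    }
}
```

With an `if` instead of `while`, a thread woken by `signalAll()` could proceed after another thread had already emptied or filled the buffer.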

Networking & Big Data

OSI 7 layers — "All People Seem To Need Data Processing" (App, Presentation, Session, Transport, Network, Data Link, Physical).
TCP/IP 4 layers — A-T-I-N: Application, Transport, Internet, Network Access.
TCP = Trust (ordered, reliable). UDP = Urgent (fast, fire-and-forget).
Big Data 3V: Volume, Variety, Velocity. Additional Vs: Veracity, Value, Variability, Visualisation.
MapReduce — "SMSRO": Split → Map → Shuffle → Reduce → Output.
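
The Split → Map → Shuffle → Reduce → Output stages can be traced with an in-memory word count. This is a single-process sketch in plain Java, not Hadoop code: `wordCount` is a hypothetical helper and each input string plays the role of one split.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative word count that mirrors the MapReduce stages in one process.
class WordCountDemo {

    static Map<String, Integer> wordCount(List<String> splits) {
        // Map: every split emits a (word, 1) pair per word.
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String split : splits) {
            for (String word : split.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) pairs.add(Map.entry(word, 1));
            }
        }

        // Shuffle: group all values that share a key (sorted by key, as a TreeMap).
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }

        // Reduce: sum the values of each key.
        Map<String, Integer> output = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            int sum = 0;
            for (int v : group.getValue()) sum += v;
            output.put(group.getKey(), sum);
        }
        return output;                                   // Output
    }

    public static void main(String[] args) {
        // Split: the input already arrives as two chunks.
        System.out.println(wordCount(List.of("to be or", "not to be")));
    }
}
```

In a real cluster the map and reduce steps run on different machines and the shuffle moves data across the network; the data flow is the same.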

Cloud Deployment Models

"Private / Community / Public / Hybrid" — ownership scale. Hybrid = "best of both, more complex".

Service Models

SPI ladder: SaaS (use) → PaaS (build) → IaaS (control). Up the stack = less responsibility, less flexibility.

10. One-Page Cheat Sheet (phone-friendly)

TCP vs UDP | TCP: 3-way handshake, ordered, reliable, slow. Web/email. | UDP: connectionless, fast, lossy. VoIP/games/DNS.
TCP/IP Layers | App → Transport → Internet → Network Access. Encapsulate top-down.
CIA | Confidentiality · Integrity · Availability
ACID | Atomic · Consistent · Isolated · Durable
BASE | Basically Available · Soft state · Eventual consistency
CAP | Pick 2 of Consistency/Availability/Partition-tol. Real: CP vs AP.
Race Condition Fix | synchronized · ReentrantLock + try/finally · atomic classes
Lock + Condition | lock → while(!cond) await() → action → signalAll() → unlock in finally
Deadlock Prevention | Fixed lock order · tryLock(timeout) · request-all-at-once
Pessimistic vs Optimistic | Pess: lock first (high conflict). Opt: validate at commit (low conflict).
2PC | Prepare → vote → Commit/Abort. Blocks if coordinator dies.
SQL vs NoSQL | SQL: schema, ACID, joins. NoSQL: schema-less, BASE, horizontal scale.
NoSQL Families | Key-Value · Column · Document · Graph
Java EE vs Spring | JavaEE: heavy container, JTA. Spring: POJO + IoC, @Transactional.
JPA Lifecycle | New → persist → Managed → remove → Removed. Managed → (close) → Detached → merge → Managed.
Spring MVC Flow | Request → DispatcherServlet → Controller → Model+View → ViewResolver → HTML
REST vs SOAP | REST: JSON, stateless, flexible. SOAP: XML, strict, enterprise security.
Messaging Models | PTP (Queue, 1-1) · Pub/Sub (Topic, 1-N)
Orchestration vs Choreography | Central vs decentralised. Predictable vs flexible.
Resilience Patterns | Circuit Breaker · Bulkhead · Saga · EDA · Retry · Timeout
K8s Core | Pod · Service · Deployment · Ingress · HPA/VPA/CA
Cloud NIST 5 | On-demand · Broad network · Pooling · Elasticity · Measured
SPI Models | SaaS (app) · PaaS (platform) · IaaS (infra)
MapReduce | Split → Map → Shuffle → Reduce → Output
Hashing | One-way. Use salted hash to defeat rainbow tables.
Zero Trust | "Never trust, always verify." Design before architecting.
Process vs Thread | Process: own memory, slow switch. Thread: shared memory, fast switch.
HTTP Status | 2xx ok · 3xx redirect · 4xx client · 5xx server

11. Practice Bank

Drawn from Self-Test 7.1–7.4, Unit 2/3 slide 19, and common exam-style variations on the instructor's emphasised topics.

Networking & Distributed Systems

Q1 Why does TCP/IP use 4 layers when OSI has 7?
OSI's 7 layers are conceptually clean but complex. TCP/IP collapses Application/Presentation/Session into one "Application" layer and Data Link/Physical into "Network Access". Result: simpler, cheaper to implement, and it covers every function OSI handles — which is why real networks run TCP/IP.
Q2 State three key differences between TCP and UDP.
(1) Connection — TCP is connection-oriented (3-way handshake); UDP is connectionless. (2) Reliability — TCP guarantees delivery and ordering via ACKs and retransmits; UDP is best-effort. (3) Overhead — TCP has higher per-packet overhead; UDP is fast and lightweight. Use TCP for web/email/file transfer; UDP for VoIP/streaming/gaming/DNS.
Q3 Explain encapsulation with an example.
Each layer wraps the higher layer's PDU with its own header. Example: an HTTP GET is wrapped by a TCP segment header, then an IP packet header, then an Ethernet frame header/trailer before hitting the wire. The receiver strips headers in reverse. This separation of concerns lets each layer evolve independently.
Q4 List three benefits and three overheads of a distributed system.
Benefits: resource sharing, scalability, fault tolerance. Overheads: software complexity (no central control), communication delay, security (data in transit, key distribution). Also acceptable: heterogeneity, openness, transparency as characteristics.
Q5 Compare client-server and peer-to-peer architectures.
Client-server has clear roles (clients request, servers respond); easy to secure and manage but risks single-point-of-failure and scaling bottlenecks. P2P has no fixed roles — each node is both client and server; robust and decentralised but harder to coordinate and secure. Example: Web (C/S) vs BitTorrent (P2P).

Concurrency

Q6 Define a race condition. Give a fix.
A race condition occurs when multiple threads operate on shared data without synchronisation, so the final result depends on the unpredictable order of execution. Classic example: two threads both read sum, both compute sum+1, both write back — one update is lost. Fix with mutual exclusion: synchronized method, ReentrantLock, or atomic classes like AtomicInteger.
Q7 Why must await() be called inside a while loop, not an if?
Two reasons. (1) Spurious wake-ups: a thread can return from await() without any signal() being called. (2) Multiple waiters: after signalAll() several threads race to reacquire the lock — by the time one wins, another may have already consumed the condition. Re-checking in while guarantees the condition is still true before acting.
Q8 Name Coffman's four deadlock conditions. How do you prevent deadlock?
(1) Mutual exclusion, (2) Hold and wait, (3) No preemption, (4) Circular wait. Breaking any one prevents deadlock. Common strategies: acquire locks in a fixed global order (breaks circular wait); use tryLock(timeout) and back off (breaks hold-and-wait); request all resources at once; or use higher-level primitives (transactions, semaphores).
Q9 When would you choose optimistic concurrency control over pessimistic locking?
When conflicts are rare and read throughput matters more than write throughput. OCC avoids blocking — transactions operate on copies and only validate at commit, so there's no lock contention. Downside: if conflicts do happen, transactions roll back and retry. Pessimistic is better for hot write paths like bank account updates where conflicts are likely.
Q10 Difference between 2PL and 2PC?
2PL (Two-Phase Locking) is a concurrency protocol for a single database: transactions have a growing phase (acquire locks) followed by a shrinking phase (release locks) — guarantees serialisable schedules. 2PC (Two-Phase Commit) is a distributed atomicity protocol: a coordinator asks all participants to Prepare, then based on votes sends Commit or Abort — guarantees all-or-nothing across nodes.
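
The lost-update race from Q6 and one of its fixes can be reproduced with a shared counter. This is a sketch under assumptions: `RaceConditionDemo` and `run` are hypothetical names, the unsafe counter is a plain static field, and how many updates are lost varies from run to run.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative race condition: unsafeSum++ is a read-modify-write that threads can interleave,
// while AtomicInteger performs the same increment atomically.
class RaceConditionDemo {

    static int unsafeSum = 0;   // shared, unsynchronised

    // Returns { unsafe result, atomic result } after all threads finish.
    static int[] run(int threads, int incrementsPerThread) {
        unsafeSum = 0;
        AtomicInteger safeSum = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    unsafeSum++;                  // read, add 1, write back: updates can be lost
                    safeSum.incrementAndGet();    // atomic: no update is lost
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { unsafeSum, safeSum.get() };
    }

    public static void main(String[] args) {
        int[] result = run(4, 100_000);
        System.out.println("expected 400000, unsafe = " + result[0] + ", atomic = " + result[1]);
    }
}
```

The atomic counter always reaches the expected total; the unsafe one usually falls short, and by a different amount each run, which is exactly the "result depends on execution order" symptom.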

Databases & Frameworks

Q11 Contrast SQL and NoSQL. When would you pick each?
SQL has a fixed schema, ACID transactions, and rich joins — best for structured data with strong consistency needs (banking, ERP). NoSQL is schema-less, scales horizontally, and typically offers BASE/eventual consistency — best for large, evolving, or unstructured data (social feeds, product catalogs, IoT logs). Pick SQL for relational integrity; NoSQL for scale and flexibility.
Q12 Explain the ACID properties using a bank transfer.
Atomicity: withdraw + deposit both succeed, or both are rolled back. Consistency: total money before = total after. Isolation: two simultaneous transfers don't see each other's half-done state. Durability: once committed, the transfer survives a server crash.
Q13 List three mandatory requirements for a JPA entity class.
(1) Annotated with @Entity. (2) Has a primary key field annotated with @Id (often with @GeneratedValue). (3) Public or protected no-argument constructor. Bonus: non-final class/fields, typically implements Serializable.
Q14 Why is Spring Data JPA preferred over raw JDBC?
It removes boilerplate — no manual ResultSet traversal, no SQL strings for basic CRUD. Queries derived from method names (findByName) or @Query JPQL are database-independent. Integrates with Spring transactions, DI, and testing. JDBC gives more control but costs a lot more code for the same outcome.
Q15 What does the DispatcherServlet do?
It's the Spring MVC front controller. Every HTTP request goes to it; it consults handler mappings to pick the right controller method, invokes it to get a model and logical view name, hands the view name to a ViewResolver (JSP/Thymeleaf), then renders the template with model data and writes HTML back. This keeps controllers decoupled from the view layer.
Q16 Compare REST and SOAP.
REST is an architectural style over HTTP: stateless, flexible data formats (JSON common), light, easy to consume from web/mobile. SOAP is an XML protocol with strict standards (WSDL, WS-Security), heavier but offers built-in security and formal contracts — preferred in regulated enterprise contexts like banking or government. Pick REST for public/web APIs; SOAP for formal B2B integrations.

Messaging & Integration

Q17 Point-to-point vs Publish/Subscribe messaging?
PTP uses a Queue and delivers each message to exactly one consumer (1:1) — like a task dispatcher. Pub/Sub uses a Topic and broadcasts each message to all current subscribers (1:N) — like a news feed. In JMS both are called "destinations".
Q18 Orchestration vs Choreography — which for a checkout flow with strict sequencing?
Orchestration, because a central controller can enforce the exact order (validate → reserve inventory → charge payment → ship) and handle compensating actions centrally if any step fails. Choreography is better when services should react independently to events without a fixed script, e.g. a user-registration fan-out.
Q19 How does a Circuit Breaker work?
It wraps calls to a downstream service. In Closed state calls go through but failures are counted. When failures exceed a threshold it flips to Open — calls short-circuit to a fallback immediately, sparing the failing service. After a cooldown it moves to Half-Open, letting a few probes through; if they succeed it closes again, otherwise re-opens. Prevents cascading failures.
Q20 What is the Saga pattern?
A long-running distributed transaction broken into local transactions, each with a compensating action to undo it. If any step fails, earlier steps are rolled back by running their compensators (Cancel order, Refund payment, Restock inventory). Keeps services loosely coupled while preserving business atomicity without 2PC.
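
The compensate-in-reverse behaviour can be sketched with a stack of completed steps. The CheckoutSaga class, the Step interface, and the step names are hypothetical; a real saga would call separate services and persist its progress.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative saga runner; step names and the failure flag are hypothetical.
class CheckoutSaga {
    interface Step {
        String name();
        void execute();      // local transaction
        void compensate();   // undo for that local transaction
    }

    // Runs steps in order; on failure, compensates completed steps in reverse.
    static boolean run(List<Step> steps, List<String> log) {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.execute();
                log.add("done:" + step.name());
                completed.push(step);
            } catch (RuntimeException e) {
                log.add("failed:" + step.name());
                while (!completed.isEmpty()) {
                    Step undo = completed.pop();   // most recent step first
                    undo.compensate();
                    log.add("compensated:" + undo.name());
                }
                return false;
            }
        }
        return true;
    }

    // Builds a step that optionally fails, to simulate a broken service.
    static Step step(String name, boolean fails) {
        return new Step() {
            public String name() { return name; }
            public void execute() {
                if (fails) throw new IllegalStateException(name + " failed");
            }
            public void compensate() { /* e.g. refund, restock, cancel */ }
        };
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        boolean ok = run(List.of(step("reserve", false), step("charge", false),
                                 step("ship", true)), log);
        System.out.println(ok + " " + log);
    }
}
```

When the third step fails, the log shows the charge being compensated before the reservation, which mirrors the refund-then-restock order in the answer.
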

Security

Q21 Define each letter of the CIA triad.
Confidentiality — only authorised parties can read the data (encryption, ACLs). Integrity — data is not altered without detection (hashes, digital signatures). Availability — services are reachable and responsive when needed (redundancy, DDoS protection).
Q22 Why is plain hashing not enough? What is salted hashing?
Plain hashes are deterministic, so attackers can precompute huge lookup tables (rainbow tables) that map common passwords to their hashes. Adding a unique random salt per user — hash(password + salt) — makes each user's hash unique even for identical passwords and defeats precomputed tables. The salt is stored with the hash (it's not secret, just unique).
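
A minimal sketch of per-user salting in Java follows. Note the swap: instead of a single hash(password + salt) call it uses PBKDF2 from the standard library, because a deliberately slow function is the usual practice for passwords. The class name SaltedHashDemo, the 16-byte salt, and the 100,000 iteration count are assumptions for illustration.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Illustrative salted password hashing; parameters are example values.
class SaltedHashDemo {

    // A fresh random salt per user; stored next to the hash, not secret.
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Derives a 256-bit hash from the password and the user's salt.
    static byte[] hash(char[] password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, 100_000, 256);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 not available", e);
        }
    }

    // Re-hashes the attempt with the stored salt and compares in constant time.
    static boolean verify(char[] attempt, byte[] salt, byte[] storedHash) {
        return MessageDigest.isEqual(hash(attempt, salt), storedHash);
    }
}
```

Two users with the same password get different salts and therefore different stored hashes, so one precomputed table cannot cover both.
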
Q23 Explain Zero Trust in one sentence.
"Never trust, always verify" — no request is trusted based on network location; every access is authenticated, authorised, and encrypted, whether it originates inside or outside the corporate network.
Q24 Give one example each of physical, educational, and deterrent prevention.
Physical: CCTV, biometric door locks, server-room cages. Educational: phishing-awareness training, password hygiene sessions. Deterrent: visible monitoring banners, audit logging, published penalties for misuse.

Cloud & Big Data

Q25 Name NIST's five essential cloud characteristics.
On-demand self-service, broad network access, resource pooling, rapid elasticity, measured service.
Q26 Difference between IaaS, PaaS, SaaS — give an example of each.
IaaS: provider gives you VMs/storage/network; you manage OS + apps (AWS EC2). PaaS: provider gives you a runtime platform; you only manage code and data (Google App Engine). SaaS: provider gives you the whole application; you just use it (Google Workspace, Microsoft 365).
Q27 What are Kubernetes Pods, Services, and Deployments?
Pod: smallest deployable unit, one or more containers sharing network/storage. Service: stable network endpoint that load-balances traffic to matching pods. Deployment: declarative manager for pod replicas, rolling updates, and desired-state reconciliation.
Q28 Explain MapReduce with a word-count example.
Split input text into chunks across workers. Map emits (word, 1) for every word. Shuffle groups all pairs by key so all "the" tuples land on the same reducer. Reduce sums the values per key → (word, count). The same map/reduce code scales unchanged from 10 to 100+ CPUs.
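
The map, shuffle, and reduce phases can be mimicked on a single JVM with Java streams. This is an analogy rather than the Hadoop API: the class WordCountDemo and the lower-casing and whitespace splitting are assumptions for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Single-JVM analogy of MapReduce word count; not a distributed job.
class WordCountDemo {
    static Map<String, Long> wordCount(List<String> chunks) {
        return chunks.parallelStream()                       // each chunk ~ one worker's split
                .flatMap(chunk -> Arrays.stream(chunk.toLowerCase().split("\\s+")))
                .filter(word -> !word.isEmpty())             // map: emit each word
                .collect(Collectors.groupingBy(              // shuffle: group by key
                        word -> word,
                        TreeMap::new,
                        Collectors.counting()));             // reduce: count per key
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("the cat", "the dog the")));
        // prints {cat=1, dog=1, the=3}
    }
}
```
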
Q29 State the CAP theorem. Give an AP and a CP example.
CAP: in the presence of network partitions, a distributed system can guarantee either Consistency or Availability — not both. CP: HBase, ZooKeeper — refuse requests during partition to stay consistent. AP: Cassandra, DynamoDB — keep serving, risk stale reads. "CA" systems (single-node RDBMS) aren't truly distributed.
Q30 Why is cloud-native better than a monolith for scaling?
Microservices can be scaled independently — only the hot service spawns more instances. Containers give fast, consistent deploys; orchestration (Kubernetes) auto-scales, self-heals, and rolls updates with no downtime. Monoliths have to scale the whole application even if only one feature is under load, and release coupling slows delivery.

12. Essay-Style Worked Answers

These show the structure the instructor rewards: define → compare in table → give example → state selection criterion.

Prompt 1. "Compare pessimistic and optimistic concurrency control. Using a banking example, explain when you would choose each."

Definitions. Pessimistic control assumes conflicts are likely; it acquires locks on resources at the start of a transaction so other transactions wait. Optimistic control (OCC) assumes conflicts are rare; transactions work on local copies and only validate against conflicts at commit time.

Comparison.
Aspect | Pessimistic | Optimistic
Assumption | Conflicts likely | Conflicts rare
Phases | Lock → Use → Release | Work → Validate → Update
Overhead | Blocking, risk of deadlock | Rollbacks on conflict
Throughput | Low under contention | High when low contention
Banking example. In a joint-account withdrawal where two owners may withdraw simultaneously, pessimistic locking is appropriate: a write lock on the balance forces the second withdrawal to wait until the first commits, so it checks the updated balance and the account never goes negative. By contrast, in a monthly statement generation service that mostly reads account snapshots, OCC is preferred: readers don't block each other, and the occasional edit commits cleanly because conflicts are rare.

Selection criterion. Choose pessimistic when write contention is high and correctness is paramount; choose optimistic when workloads are read-heavy or conflicts are statistically rare, to maximise throughput.
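
The Work → Validate → Update cycle of OCC can be sketched with a versioned snapshot and a compare-and-set. This is an in-memory illustration, not database code: OptimisticAccount, Snapshot, and the retry loop are assumptions for the example.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative optimistic concurrency control; names are hypothetical.
class OptimisticAccount {
    // Immutable snapshot: the balance plus a version bumped on every commit.
    static final class Snapshot {
        final long balance;
        final long version;
        Snapshot(long balance, long version) {
            this.balance = balance;
            this.version = version;
        }
    }

    private final AtomicReference<Snapshot> state;

    OptimisticAccount(long openingBalance) {
        state = new AtomicReference<>(new Snapshot(openingBalance, 0));
    }

    long balance() { return state.get().balance; }
    long version() { return state.get().version; }

    // Returns false if funds are insufficient; retries if another commit won.
    boolean withdraw(long amount) {
        while (true) {
            Snapshot seen = state.get();                       // work on a local copy
            if (seen.balance < amount) return false;           // business rule
            Snapshot next = new Snapshot(seen.balance - amount, seen.version + 1);
            if (state.compareAndSet(seen, next)) return true;  // validate + update
            // Validation failed: someone else committed first, so retry.
        }
    }
}
```

No thread ever blocks here; the cost of a conflict is one wasted attempt, which is why the approach suits workloads where conflicts are rare.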

Prompt 2. "Explain how a race condition arises in a multi-threaded program. Use the SharedDataTest example and describe two different fixes."

Definition. A race condition occurs when multiple threads access shared mutable state without proper synchronisation, so the final result depends on the unpredictable interleaving of operations.

Example. SharedDataTest has a Buf object with an increment() method that performs three non-atomic steps: (A) read sum, (B) log, (C) write sum + 1. With two threads running in parallel, one valid interleaving is:
t1: T1 reads 0
t2: T2 reads 0        ← both see the same value
t3: T1 writes 1
t4: T2 writes 1       ← update lost
Together the two increments should have raised sum to 2, but the final value is 1 because T2's write overwrites T1's. The overlap percentage tends to grow with more threads or more iterations.

Fix 1 — implicit mutual exclusion. Declare increment() synchronized; the JVM now grants only one thread at a time the intrinsic monitor on the Buf instance, so A/B/C execute atomically.
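
A minimal sketch of Fix 1 follows. The original SharedDataTest source is not reproduced here, so SyncBuf and runThreads are hypothetical stand-ins for Buf and its test driver.

```java
// Illustrative stand-in for Buf with a synchronized increment().
class SyncBuf {
    private int sum = 0;

    // Only one thread at a time may hold this instance's monitor.
    synchronized void increment() { sum = sum + 1; }

    synchronized int get() { return sum; }

    // Starts the given number of threads, each incrementing the shared buffer.
    static int runThreads(int threads, int iterations) {
        SyncBuf buf = new SyncBuf();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < iterations; j++) buf.increment();
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();                       // wait for every worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return buf.get();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 10_000)); // prints 40000
    }
}
```

Removing the synchronized keyword from increment() would let the lost-update interleaving reappear, so the total could fall below threads × iterations.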

Fix 2 — explicit Lock. Use a ReentrantLock:
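import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
class Buf {
  private final Lock lock = new ReentrantLock();
  private int sum = 0;
  void increment() {
    lock.lock();                 // blocks until this thread owns the lock
    try { sum = sum + 1; }       // read-modify-write runs under the lock
    finally { lock.unlock(); }   // always release, even if an exception is thrown
  }
}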
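Explicit locks add flexibility: timed acquisition via tryLock(timeout, unit), which lets a thread back off instead of waiting forever in a potential deadlock; optional fair ordering (new ReentrantLock(true)) to reduce starvation; and multiple Condition objects for producer/consumer patterns.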
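
Timed acquisition and fair ordering can be sketched as below. TimedLockDemo and tryIncrement are hypothetical names; the timeout value is chosen by the caller.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative use of a fair ReentrantLock with timed acquisition.
class TimedLockDemo {
    private final ReentrantLock lock = new ReentrantLock(true); // true = fair ordering
    private int sum = 0;

    // Returns false instead of blocking forever if the lock stays busy.
    boolean tryIncrement(long timeoutMillis) {
        boolean acquired = false;
        try {
            acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            if (!acquired) return false;        // give up: caller can retry or back off
            sum = sum + 1;
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        } finally {
            if (acquired) lock.unlock();        // release only if we own the lock
        }
    }

    int get() {
        lock.lock();
        try { return sum; }
        finally { lock.unlock(); }
    }
}
```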

Conclusion. Both fixes enforce atomicity of the read-modify-write sequence. synchronized is simpler; ReentrantLock is preferable when you need timed acquisition, condition variables, or fair scheduling.

Prompt 3. "Discuss how a modern e-commerce platform can be built on cloud-native principles. Include microservices, messaging, resilience patterns, and security."

Architecture overview. Decompose the monolith into independently deployable microservices — Catalog, Cart, Order, Payment, Inventory, Shipping, Notification. Each owns its data store (polyglot persistence: relational for Order, document for Catalog, Redis cache for Cart). Services are packaged as Docker containers and orchestrated by Kubernetes, which provides auto-scaling (HPA), rolling updates via Deployments, self-healing with liveness probes, and stable endpoints via Services and Ingress.

Integration. Synchronous calls (e.g. Order ↔ Payment) use REST with OpenFeign for declarative clients. Asynchronous workflows (inventory reservation, email notifications) run over RabbitMQ with Spring Cloud Stream; publishing an "OrderPlaced" event triggers Inventory and Notification services independently — classic choreography, loose coupling.

Coordination. For the multi-step checkout (reserve → charge → ship) we use the Saga pattern. Each step is a local transaction with a compensating action (cancel reservation, refund payment, cancel shipment), avoiding the availability cost of 2PC.

Resilience. Wrap inter-service calls with Hystrix circuit breakers (or Resilience4j, since Hystrix is now in maintenance mode) and fallback methods to prevent cascading failure if Payment is slow. Apply the Bulkhead pattern by giving each downstream integration its own thread pool. Add retries with exponential backoff for transient errors and timeouts on every remote call.
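
Retries with exponential backoff can be sketched in plain Java. RetryDemo and withBackoff are hypothetical names, and the doubling factor is an assumption; production code would usually add jitter and a cap on the delay.

```java
import java.util.function.Supplier;

// Illustrative retry helper; not a library API.
class RetryDemo {
    // Calls the supplier up to maxAttempts times, doubling the delay after each failure.
    static <T> T withBackoff(Supplier<T> call, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) break;   // out of attempts
                try {
                    Thread.sleep(delay);             // wait before the next try
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(ie);
                }
                delay *= 2;                          // exponential backoff
            }
        }
        throw last;                                  // surface the final failure
    }
}
```

Only transient errors should be retried; a call that fails for a permanent reason (for example a declined card) gains nothing from another attempt.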

Security. Apply Zero Trust: mutual TLS between services, token-based authN (JWT/OAuth2) at the API gateway, role-based authZ inside each service. Store passwords as salted hashes. Encrypt data at rest and in transit (Confidentiality), sign events to prevent tampering (Integrity), and run multi-AZ Kubernetes with auto-healing for Availability — the full CIA triad.

Observability. Spring Cloud Sleuth + Zipkin for distributed tracing, centralised logs, and Prometheus/Grafana dashboards.

Conclusion. Cloud-native gives this platform independent scalability of hot paths (Catalog under Black Friday load), fault isolation, and rapid feature delivery. The price is operational complexity, much of which DevOps tooling and Kubernetes automation can absorb.