# Consistency, Transactions, Transactional Memory

Chris Rossbach

### Outline for Today

- Questions?
- Administrivia
  - Have you started the next lab yet?
- Agenda
  - Consistency
  - Transactions
  - Transactional Memory
- Acks: Yoav Cohen for some STM slides

#### Faux Quiz questions

- How are promises and futures related? Since there is disagreement on the nomenclature, don't worry about which is which; just describe what the different objects are and how they function.
- How does HTM resemble or differ from Load-Linked/Store-Conditional?
- What are some pros and cons of HTM vs STM?
- What is Open Nesting? Closed Nesting? Flat Nesting?
- How does 2PL differ from 2PC?
- Define ACID properties: which, if any, of these properties does TM relax?

# Memory Consistency

#### Memory Consistency

- Formal specification of memory semantics
- Statement of how shared memory will behave with multiple CPUs
- Ordering of reads and writes
- Memory Consistency != Cache Coherence
  - Coherence: propagate updates to cached copies
  - Invalidate vs. Update
- Coherence vs. Consistency?
  - **Coherence:** ordering of ops. at a single location
  - **Consistency:** ordering of ops.
at multiple locations

#### Sequential Consistency

- Result of any execution is same as if all operations execute on a uniprocessor
- Operations on each processor are totally ordered in the sequence and respect program order for each processor
- How is this different from coherence?
- Why do modern CPUs not implement SC?
- Requirements: program order, write atomicity

Trying to mimic uniprocessor semantics:

- Memory operations occur:
  - One at a time
  - In program order
- Read returns value of last write

#### Sequential Consistency

- All operations are executed in *some* sequential order
  - each process issues operations in program order
- Any valid interleaving is allowed
- All agree on the same interleaving
- Each process preserves its program order

(a)

| P1: | W(x)a | | |
|-----|-------|-------|-------|
| P2: | W(x)b | | |
| P3: | | R(x)b | R(x)a |
| P4: | | R(x)b | R(x)a |

(b)

| P1: | W(x)a | | |
|-----|-------|-------|-------|
| P2: | W(x)b | | |
| P3: | | R(x)b | R(x)a |
| P4: | | R(x)a | R(x)b |

Is either of these SC?

#### Sequential Consistency: Canonical Example

```
Initially, Flag1 = Flag2 = 0

P1                      P2
Flag1 = 1               Flag2 = 1
if (Flag2 == 0)         if (Flag1 == 0)
    enter CS                enter CS
```

Can both P1 and P2 wind up in the critical section at the same time?

#### Do we need Sequential Consistency?

```
Initially, Flag1 = Flag2 = 0

P1                      P2
Flag1 = 1               Flag2 = 1
if (Flag2 == 0)         if (Flag1 == 0)
    data++                  data++
```

#### Key issue:

- P1 and P2 may not see each other's writes in the same order
- Implication: both in critical section, which is incorrect
- Why would this happen?
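
The flag example above can be tried directly. What follows is a minimal sketch, assuming C11 atomics and POSIX threads; the names `p1`, `p2`, and `sc_trial` are made up for illustration and are not from the slides. With `memory_order_seq_cst` on every access, the two threads should never both read the other's flag as 0.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int flag1, flag2;   /* the two flags from the slide */
static int entered1, entered2;    /* 1 if that thread "entered the CS" */

static void *p1(void *arg) {
    (void)arg;
    atomic_store_explicit(&flag1, 1, memory_order_seq_cst);
    entered1 = (atomic_load_explicit(&flag2, memory_order_seq_cst) == 0);
    return NULL;
}

static void *p2(void *arg) {
    (void)arg;
    atomic_store_explicit(&flag2, 1, memory_order_seq_cst);
    entered2 = (atomic_load_explicit(&flag1, memory_order_seq_cst) == 0);
    return NULL;
}

/* Runs one trial and returns how many threads entered (0, 1, or 2). */
int sc_trial(void) {
    pthread_t a, b;
    int rc;

    atomic_store(&flag1, 0);
    atomic_store(&flag2, 0);
    entered1 = entered2 = 0;

    rc = pthread_create(&a, NULL, p1, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, p2, NULL);
    assert(rc == 0);
    pthread_join(a, NULL);   /* join makes entered1/entered2 safe to read */
    pthread_join(b, NULL);
    return entered1 + entered2;
}
```

Under sequentially consistent ordering `sc_trial()` returns 0 or 1. Changing all four accesses to `memory_order_relaxed` permits a result of 2, which is the "both in critical section" outcome the slide warns about; whether it actually shows up depends on the hardware and timing.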
#### Write Buffers

- $P_0$ write → queue op in write buffer, proceed
- $P_0$ read → look in write buffer
- $P_{x \neq 0}$ read → old value: write buffer hasn't drained

#### Sequential Consistency: Requirements

- Program Order
  - Processor's memory operations must complete in program order
- Write Atomicity
  - Writes to the same location seen by all other CPUs
  - Subsequent reads must not return value of a write until propagated to all
- Write acknowledgements are necessary
  - Cache coherence provides these properties for a cache-only system

#### Disadvantages:

- Difficult to implement!
  - Coherence to (e.g.) write buffers is hard
- Sacrifices many potential optimizations
  - Hardware (cache) and software (compiler)
  - Major performance hit

#### Why Relax Consistency?

- Motivation, originally
  - Allow in-order processors to overlap store latency with other work
  - "Other work" depends on loads, so loads bypass stores using a store queue
- PC (processor consistency), SPARC TSO, IBM/370
  - Just relax the write-to-read ($W \rightarrow R$) program order requirement
- Subsequently
  - Hide latency of one store with latency of other stores
  - Stores to be performed OOO with respect to each other
  - Breaks SC even further
- This led to definition of SPARC PSO/RMO, WO, PowerPC WC, Itanium
- What's the problem with relaxed consistency?
  - Shared memory programs can break if not written for specific cons.
model

#### Relaxed Consistency

- **Program Order** relaxations (different locations)
  - $W \rightarrow R$; $W \rightarrow W$; $R \rightarrow R/W$
- **Write Atomicity** relaxations
  - Read returns another processor's Write early
- Requirement: synchronization primitives for safety
  - Fence, barrier instructions etc.

| Relaxation | $W \rightarrow R$ Order | $W \rightarrow W$ Order | $R \rightarrow RW$ Order | Read Others' Write Early | Read Own Write Early | Safety net |
|-----------------|--------------|--------------|--------------|--------------|--------------|------------------------------|
| SC [16] | | | | | $\checkmark$ | |
| IBM 370 [14] | $\checkmark$ | | | | | serialization instructions |
| TSO [20] | $\checkmark$ | | | | $\checkmark$ | RMW |
| PC [13, 12] | $\checkmark$ | | | $\checkmark$ | $\checkmark$ | RMW |
| PSO [20] | $\checkmark$ | $\checkmark$ | | | $\checkmark$ | RMW, STBAR |
| WO [5] | $\checkmark$ | $\checkmark$ | $\checkmark$ | | $\checkmark$ | synchronization |
| RCsc [13, 12] | $\checkmark$ | $\checkmark$ | $\checkmark$ | | $\checkmark$ | release, acquire, nsync, RMW |
| RCpc [13, 12] | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | release, acquire, nsync, RMW |
| Alpha [19] | $\checkmark$ | $\checkmark$ | $\checkmark$ | | $\checkmark$ | MB, WMB |
| RMO [21] | $\checkmark$ | $\checkmark$ | $\checkmark$ | | $\checkmark$ | various MEMBARs |
| PowerPC [17, 4] | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | SYNC |

x86:

```
static inline void arch_write_lock(arch_rwlock_t *rw)
{
	/* LOCK-prefixed read-modify-write: atomic, and a full barrier on x86 */
	asm volatile(LOCK_PREFIX WRITE_LOCK_SUB(%1) "(%0)\n\t"
		     "jz 1f\n"
		     "call __write_lock_failed\n\t"
		     "1:\n"
		     ::LOCK_PTR_REG (&rw->write), "i" (RW_LOCK_BIAS)
		     : "memory");
}
```

PowerPC:

```
static inline unsigned long __arch_spin_trylock(arch_spinlock_t *lock)
{
	unsigned long tmp, token;

	token = LOCK_TOKEN;
	/* lwarx/stwcx. retry loop, followed by an explicit acquire barrier */
	__asm__ __volatile__(
"1:	" PPC_LWARX(%0,0,%2,1) "\n\
	cmpwi		0,%0,0\n\
	bne-		2f\n\
	stwcx.		%1,0,%2\n\
	bne-		1b\n"
	PPC_ACQUIRE_BARRIER
"2:"
	: "=&r" (tmp)
	: "r" (token), "r" (&lock->slock)
	: "cr0", "memory");

	return tmp;
}
```

#### Some Key Consistency Models

#### **TSO**

- x86
- Stores are totally ordered, reads not
- Differs from PC by allowing early reads of processor's own writes

#### **RC: Release Consistency**

- Key insight: only synchronization references need to be ordered
- Hence, relax memory for all other references
  - Enable high-performance OOO implementation
- Programmer **labels** synchronization references
  - Hardware must carefully order these labeled references
- Labeling schemes:
  - Explicit synchronization ops (acquire/release)
  - Memory fence or memory barrier ops:
    - All preceding ops must finish before following ones begin
- Fence ops drain pipeline

### Transactions and Transactional Memory

- 3 Programming Model
Dimensions:
  - How to specify computation
  - How to specify communication
  - How to specify coordination/control transfer
- Threads, Futures, Events etc.
  - Mostly about how to express control
- Transactions
  - Mostly about how to deal with shared state

#### Transactions

Core issue: multiple updates

#### Canonical examples:

Problems: crash in the middle / visibility of intermediate state

- Modified data in memory/caches
- Even if in-memory data is durable, multiple disk updates

- Want reliable update of two resources (e.g. in two disks, machines...)
  - Move file from A to B
  - Create file (update free list, inode, data block)
  - Bank transfer (move \$100 from my account to VISA account)
  - Move directory from server A to B
- Machines can crash, messages can be lost

Can we use messages? E.g. with retries over unreliable medium to synchronize with guarantees?

No. Not even if all messages get through!

- Two generals on separate mountains
- Can only communicate via messengers
- Messengers can get lost or captured
- Need to coordinate attack
  - attack at same time good, different times bad!

1. General A → General B: let's attack at dawn
2. General B → General A: OK, dawn.
3. General A → General B: Check. Dawn it is.
4. General B → General A: Alright already, dawn.

- Even if all messages are delivered, neither general can assume so: maybe some message didn't get through.
- No solution: one of the few CS impossibility results.
(but can't solve it)

- Solves weaker problem:
  - 2 things will either happen or not
  - not necessarily at the same time
- Core idea: one entity has the power to say yes or no for all
  - Local txn: one final update (TxEND) irrevocably triggers several
- Distributed transactions
  - 2 phase commit
  - One machine has final say for all machines
  - Other machines bound to comply

What is the role of synchronization here?

#### Transactional Programming Model

```
begin transaction;
    x = read("x-values", ....);
    y = read("y-values", ....);
    z = x+y;
    write("z-values", z, ....);
commit transaction;
```

What has changed from previous programming models?

#### **ACID Semantics**

#### What are they?

- Atomic – all updates happen or none do
- Consistent – system invariants maintained across updates
- Isolated – no visibility into partial updates
- Durable – once done, stays done
- Are subsets ever appropriate?
  - When would ACI be useful?
  - ACD?
  - Isolation only?
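
To make "all updates happen or none do" and "invariants maintained" concrete, here is a minimal single-threaded sketch in C. The `bank_t` type and `transfer` function are hypothetical names for illustration. The sketch stages both updates in a private copy and publishes them only if the invariant (no negative balance) holds; it says nothing about isolation between threads or durability across crashes.

```c
#include <assert.h>
#include <stdbool.h>

#define NACCT 2

typedef struct {
    long balance[NACCT];
} bank_t;

/* Moves amount between accounts as one unit: either both balances
   change or neither does. Returns false if the transfer is rejected. */
bool transfer(bank_t *bank, int from, int to, long amount) {
    if (amount < 0)
        return false;

    bank_t shadow = *bank;             /* begin: work on a private copy */
    shadow.balance[from] -= amount;    /* update 1 */
    shadow.balance[to]   += amount;    /* update 2 */

    if (shadow.balance[from] < 0)      /* invariant violated: abort, */
        return false;                  /* the real state was never touched */

    *bank = shadow;                    /* commit: publish both updates */
    return true;
}
```

The total across accounts is the same before and after every call, whether it commits or aborts; that is the kind of system invariant the "C" refers to.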
#### Implementing Transactions

- Key idea: turn multiple updates into a single one
- Many implementation techniques
  - Two-phase locking
  - Timestamp ordering
  - Optimistic Concurrency Control
  - Journaling
  - 2,3-phase commit
  - Speculation-rollback
  - Single global lock
  - Compensating transactions

#### Key problems:

- output commit
- synchronization

#### Single global lock

```
BEGIN_TXN();
    x = read("x-values", ....);
    y = read("y-values", ....);
    z = x+y;
    write("z-values", z, ....);
COMMIT_TXN();
```

```
BEGIN_TXN() {
    LOCK(single-global-lock);
}

COMMIT_TXN() {
    UNLOCK(single-global-lock);
}
```

#### Two-phase locking

- Phase 1: only acquire locks in order
- Phase 2: unlock at commit
- avoids deadlock

```
BEGIN_TXN();
    Lock x, y
    x = x + 1
    y = y - 1
    unlock y, x
COMMIT_TXN();
```

```
BEGIN_TXN() {
    rwset = Union(rset, wset);
    rwset = sort(rwset);
    forall x in rwset
        LOCK(x);
}

COMMIT_TXN() {
    forall x in rwset
        UNLOCK(x);
}
```

Pros/Cons? What happens on failures?

```
A: grab locks
A: modify x, y,
A: unlock y, x
B: grab locks
B: update x, y
B: unlock y, x
B: COMMIT
A: CRASH
```

B commits changes that depend on A's updates

#### Two-phase commit

- N participants agree or don't (atomicity)
- Phase 1: everyone "prepares"
- Phase 2: Master decides and tells everyone to actually commit
- What if the master crashes in the middle?

#### 2PC: Phase 1

1. Coordinator sends REQUEST to all participants
2. Participants receive request and
3. Execute locally
4. Write VOTE\_COMMIT or VOTE\_ABORT to local log
5. Send VOTE\_COMMIT or VOTE\_ABORT to coordinator

Example (move): $C \rightarrow S1$: delete foo from /, $C \rightarrow S2$: add foo to /

```
Failure case:
  S1 writes rm /foo, VOTE_COMMIT to log
  S1 sends VOTE_COMMIT
  S2 decides permission problem
  S2 writes/sends VOTE_ABORT

Success case:
  S1 writes rm /foo, VOTE_COMMIT to log
  S1 sends VOTE_COMMIT
  S2 writes add foo to /
  S2 writes/sends VOTE_COMMIT
```

#### 2PC: Phase 2

- Case 1: receive VOTE\_ABORT or timeout
  - Write GLOBAL\_ABORT to log
  - send GLOBAL\_ABORT to participants
- Case 2: receive VOTE\_COMMIT from all
  - Write GLOBAL\_COMMIT to log
  - send GLOBAL\_COMMIT to participants
- Participants receive decision, write GLOBAL\_\* to log

#### 2PC corner cases

#### Phase 1

1. Coordinator sends REQUEST to all participants
2. (X) Participants receive request and
3. Execute locally
4. Write VOTE\_COMMIT or VOTE\_ABORT to local log
5. Send VOTE\_COMMIT or VOTE\_ABORT to coordinator

#### Phase 2

- (Y) Case 1: receive VOTE\_ABORT or timeout
  - Write GLOBAL\_ABORT to log
  - send GLOBAL\_ABORT to participants
- Case 2: receive VOTE\_COMMIT from all
  - Write GLOBAL\_COMMIT to log
  - send GLOBAL\_COMMIT to participants
- Participants receive decision, write GLOBAL\_\* to log

Questions:

- What if participant crashes at X?
- Coordinator crashes at Y?
- Participant crashes at Z?
- Coordinator crashes at W?

Coordinator crashes at W, never wakes up:

- All nodes block forever!
- Can participants ask each other what happened?
- 2PC: always has risk of indefinite blocking
- Solution: (yes) 3 phase commit!
  - Reliable replacement of crashed "leader"
- 2PC often good enough in practice

#### **Nested Transactions**

- Composition of transactions
  - E.g. interact with multiple organizations, each supporting txns
  - Travel agency: canonical example
- Nesting: view transaction as collection of:
  - actions on unprotected objects
  - protected actions that may be undone or redone
  - real actions that may be deferred but not undone
  - nested transactions that may be undone
- 3 basic flavors:
  - **Flat:** subsume inner transactions
  - **Closed:** subsume w/ partial rollback
  - **Open:** pause transactional context

#### Open Nesting details:

- Nested transaction returns name and parameters of compensating transaction
- Parent includes compensating transaction in log of parent transaction
- Invoke compensating transactions from log if parent transaction aborted
- Consistent, atomic, durable, but not isolated

#### Nesting Semantics Exercise

```
1 BeginTX()
2     X = read(x)
3     Y = read(y)
4     write(x, X+1+Y)
5     BeginTX()
6         Z = read(z) + X + Y
7         write(z, Z)
              ← abort
8     EndTX()
9 EndTX()
```

What if the TX aborts between lines 7 and 8?

- Under flat nesting?
- Under closed nesting?
- Under open nesting?
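
One way to reason about the flat and closed cases in the exercise above is with an undo log. The sketch below is an assumption-laden illustration, not the mechanism of any particular TM system: all names (`tx_begin`, `tx_write`, `tx_abort_closed`, `tx_abort_flat`, `tx_end`) are hypothetical, it is single-threaded, and it does not model open nesting or compensating transactions. Each `tx_begin` records a mark in the log; a closed-nested abort rolls back to the innermost mark, while a flat abort rolls back everything.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOG   64
#define MAX_DEPTH 8

typedef struct {
    int *addr;   /* location that was overwritten */
    int  old;    /* value it held before the write */
} undo_entry;

static undo_entry undo_log[MAX_LOG];
static size_t log_len;            /* number of valid undo entries */
static size_t marks[MAX_DEPTH];   /* log length at each tx_begin */
static size_t depth;              /* current nesting depth */

void tx_begin(void) {
    assert(depth < MAX_DEPTH);
    marks[depth++] = log_len;
}

/* Transactional write: remember the old value, then update in place. */
void tx_write(int *addr, int val) {
    assert(depth > 0 && log_len < MAX_LOG);
    undo_log[log_len].addr = addr;
    undo_log[log_len].old  = *addr;
    log_len++;
    *addr = val;
}

/* Undo writes newest-first until the log shrinks back to mark. */
static void rollback_to(size_t mark) {
    while (log_len > mark) {
        log_len--;
        *undo_log[log_len].addr = undo_log[log_len].old;
    }
}

/* Closed nesting: only the innermost transaction's writes are undone. */
void tx_abort_closed(void) {
    assert(depth > 0);
    rollback_to(marks[--depth]);
}

/* Flat nesting: inner and outer are one transaction, so undo it all. */
void tx_abort_flat(void) {
    rollback_to(0);
    depth = 0;
}

/* Commit: an inner commit merges into its parent; the outermost
   commit discards the undo log. */
void tx_end(void) {
    assert(depth > 0);
    if (--depth == 0)
        log_len = 0;
}
```

Mapped onto the exercise: after a closed-nested abort the write to `z` is undone but the outer write to `x` survives and the outer transaction can continue; after a flat abort both are undone.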
### Transactional Memory: ACI

#### Transactional Memory:

- Make multiple memory accesses atomic
- All or nothing: Atomicity
- No interference: Isolation
- Correctness: Consistency
- No durability, for obvious reasons

#### Keywords:

Commit, Abort, Speculative access, Checkpoint

Lock-based version:

```
remove(list, x) {
  lock(list);
  pos = find(list, x);
  if(pos)
    erase(list, pos);
  unlock(list);
}
```

Transactional version with explicit begin/end:

```
remove(list, x) {
  TXBEGIN();
  pos = find(list, x);
  if(pos)
    erase(list, pos);
  TXEND();
}
```

Transactional version with a language-level atomic block:

```
remove(list, x) {
  atomic {
    pos = find(list, x);
    if(pos)
      erase(list, pos);
  }
}
```

- Transactions: super-awesome
- Transactional Memory: also super-awesome, **but**:
  - Transactions != TM
  - TM is an *implementation technique*
  - Often presented as programmer abstraction
  - Remember Optimistic Concurrency Control

#### A Simple TM

```
pthread_mutex_t g_global_lock;

begin_tx() {
  pthread_mutex_lock(&g_global_lock);
}

end_tx() {
  pthread_mutex_unlock(&g_global_lock);
}

// abort: can't happen
```

```
remove(list, x) {
  begin_tx();
  pos = find(list, x);
  if(pos)
    erase(list, pos);
  end_tx();
}
```

Actually, this works fine... But how can we improve it?

Consider a hash-table

```
thread T1                    thread T2
ht.add( );                   ht.add( );
if(ht.contains( ))           if(ht.contains( ))
    ht.del( );                   ht.del( );
```

```
thread T1                    thread T2
ht.lock();                   ht.lock();
ht.add( );                   ht.add( );
if(ht.contains( ))           if(ht.contains( ))
    ht.del( );                   ht.del( );
ht.unlock();                 ht.unlock();
```

#### Pessimistic concurrency control

#### Optimistic concurrency control

What do we do when the same data is accessed?
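One common answer to that question is to run without holding a lock, remember which version of the data was read, and validate at commit time, retrying on conflict. The Python sketch below illustrates that idea only. `VersionedCell`, `atomic_update`, and `run_demo` are hypothetical names invented for this example, and the short internal lock merely stands in for an atomic commit step.

```python
import threading

class VersionedCell:
    """A shared value plus a version counter (hypothetical helper, not a real TM API)."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self._commit_lock = threading.Lock()  # protects only the short snapshot/commit steps

    def read(self):
        # Take a consistent snapshot of (value, version).
        with self._commit_lock:
            return self.value, self.version

    def try_commit(self, seen_version, new_value):
        # Commit only if nobody else committed since our snapshot.
        with self._commit_lock:
            if self.version != seen_version:
                return False  # conflict detected: the caller must retry
            self.value = new_value
            self.version += 1
            return True

def atomic_update(cell, fn):
    """Optimistically apply fn to the cell, retrying on conflict. Returns the attempt count."""
    attempts = 0
    while True:
        attempts += 1
        value, version = cell.read()              # speculative read
        if cell.try_commit(version, fn(value)):   # validate, then publish
            return attempts

def run_demo(num_threads=4, increments=500):
    cell = VersionedCell(0)

    def worker():
        for _ in range(increments):
            atomic_update(cell, lambda v: v + 1)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.value
```

The work inside `fn` runs outside any lock, so contention only costs a retry; with a pessimistic lock the same work would be serialized even when no conflict occurs.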
#### TM Primer

#### **Key Ideas:**

- Critical sections execute concurrently
- Conflicts are detected dynamically
- If conflict serializability is violated, rollback

#### **Key Abstractions:**

- Primitives
  - xbegin, xend, xabort
- Conflict

$$\varnothing \neq \{W_a\} \cap \{R_b \cup W_b\}$$

- Contention Manager
  - Need flexible policy

#### **Data Versioning**

- Eager Versioning
- Lazy Versioning

#### **Conflict Detection and Resolution**

- Eager Detection (Pessimistic)
- Lazy Detection (Optimistic)

#### **Conflict Detection Granularity**

- Object Granularity
- Word Granularity
- Cache line Granularity

## TM Design Alternatives

- Hardware (HTM)
  - Caches track RW set, HW speculation/checkpoint
- Software (STM)
  - Instrument RW
  - Inherit TX Object

## Hardware Transactional Memory

- Idea: Track read / write sets in HW
  - commit / rollback in hardware as well
  - Cache coherent hardware already manages much of this
- Basic idea: cache == speculative storage
  - HTM ~= smarter cache
- Can support many different TM paradigms
  - Eager, lazy
  - optimistic, pessimistic
- "Small" modification to cache

#### Key ideas

- Checkpoint architectural state
- Caches: 'versioning' for memory
- Change coherence protocol
- Conflict detection in hardware
- 'Commit' transactions if no conflict
- 'Abort' on conflict (or special cond)
- 'Retry' aborted transaction

# Coherence for Conflict Detection and Versioning

- Lines in TMI state are speculative
- Lines in TS, TE have been read
- Invalidations/Upgrades for T\* lines indicate transactional conflicts
- Commit: T\* -> \*
- Abort: T\* -> I, rollback registers

# Case Study: SUN Rock

- Major challenge: diagnosing cause of Transaction aborts
  - Necessary for intelligent scheduling of transactions
  - Also for debugging code
  - debugging the processor architecture / μarchitecture
- Many unexpected causes of aborts
- Rock v1 diagnostics unable to distinguish distinct failure modes

| Mask  | Name  | Description and example cause |
|-------|-------|-------------------------------|
| 0x001 | EXOG  | Exogenous - Intervening code has run: cps register contents are invalid. |
| 0x002 | COH   | Coherence - Conflicting memory operation. |
| 0x004 | TCC   | Trap Instruction - A trap instruction evaluates to "taken". |
| 0x008 | INST  | Unsupported Instruction - Instruction not supported inside transactions. |
| 0x010 | PREC  | Precise Exception - Execution generated a precise exception. |
| 0x020 | ASYNC | Async - Received an asynchronous interrupt. |
| 0x040 | SIZ   | Size - Transaction write set exceeded the size of the store queue. |
| 0x080 | LD    | Load - Cache line in read set evicted by transaction. |
| 0x100 | ST    | Store - Data TLB miss on a store. |
| 0x200 | CTI   | Control transfer - Mispredicted branch. |
| 0x400 | FP    | Floating point - Divide instruction. |
| 0x800 | UCTI  | Unresolved control transfer - branch executed without resolving load on which it depends |

Table 1. cps register: bit definitions and example failure reasons that set them.
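To make the abort-diagnosis problem concrete, here is a small Python sketch that decodes a cps value using the masks from Table 1. The mask/name pairs come from the table; the `should_retry` policy and the choice of which bits count as "persistent" are assumptions made up for illustration, not Rock's documented behavior.

```python
# Mask/name pairs copied from Table 1.
CPS_BITS = [
    (0x001, "EXOG"), (0x002, "COH"), (0x004, "TCC"), (0x008, "INST"),
    (0x010, "PREC"), (0x020, "ASYNC"), (0x040, "SIZ"), (0x080, "LD"),
    (0x100, "ST"), (0x200, "CTI"), (0x400, "FP"), (0x800, "UCTI"),
]

# Hypothetical policy: treat these causes as unlikely to go away on retry.
PERSISTENT_MASK = 0x008 | 0x040  # INST, SIZ

def decode_cps(cps):
    """Return the names of all failure bits set in a cps value, in mask order."""
    return [name for mask, name in CPS_BITS if cps & mask]

def should_retry(cps):
    """Hypothetical scheduler hint: retry unless a persistent cause is present."""
    return (cps & PERSISTENT_MASK) == 0
```

A single abort can set several bits at once, which is part of why diagnosing the real cause is hard: `decode_cps(0x042)` reports both a coherence conflict and a size overflow.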
| Initially | Thread 1 | Thread 2 | Question | Anomaly |
|-----------|----------|----------|----------|---------|
| x == 0 | `atomic { r1 = x; r2 = x; }` | `x = 1;` | Can r1 != r2? | Non-repeatable reads |
| x == 0 | `atomic { r = x; x = r+1; }` | `x = 10;` | Can x == 1? | Lost updates |
| x is even | `atomic { x++; x++; }` | `r = x;` | Can r be odd? | Dirty reads |

#### TM Tricks

#### Lock Elision

- In many data structures, accesses are contention free in the common case
- But need locks for the uncommon case where contention does occur
- For example, double ended queue
- Can replace lock with atomic section, default to lock when needed
- Allows extra parallelism in the average case

#### Lock Elision

```
hashTable.lock()
var = hashTable.lookup(X);
if (!var) hashTable.insert(X);
hashTable.unlock();
```

```
hashTable.lock()
var = hashTable.lookup(Y);
if (!var) hashTable.insert(Y);
hashTable.unlock();
```

Hardware notices the lock instruction sequence!
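The elision idea can be emulated in software to see its shape: try the critical section speculatively without taking the lock, abort if the lock is held or the data changed, and fall back to really acquiring the lock after a few failed attempts. This Python sketch is only an emulation under stated assumptions. Python has no hardware transactions, so the `_commit` lock stands in for the atomic commit that hardware would provide, and conflict detection here covers the whole table rather than individual cache lines. `ElidableTable`, `insert_if_absent`, and `run_demo` are hypothetical names.

```python
import threading

class ElidableTable:
    """Hash table whose lock is skipped in the common, uncontended case (emulation)."""

    def __init__(self):
        self.data = {}
        self.version = 0                 # bumped on every committed change
        self.lock = threading.Lock()     # the lock being elided
        self._commit = threading.Lock()  # stand-in for an atomic hardware commit
        self.elided = 0                  # operations that never took self.lock
        self.fallbacks = 0               # operations that had to take self.lock

    def insert_if_absent(self, key, max_attempts=3):
        for _ in range(max_attempts):
            if self.lock.locked():       # "if (!isUnlocked()) abort"
                continue
            seen = self.version          # speculative reads
            present = key in self.data
            with self._commit:
                if self.lock.locked() or self.version != seen:
                    continue             # conflict: abort this attempt and retry
                if not present:
                    self.data[key] = True
                    self.version += 1
                self.elided += 1
                return not present
        # Too many aborts: default to the real lock.
        with self.lock, self._commit:
            present = key in self.data
            if not present:
                self.data[key] = True
                self.version += 1
            self.fallbacks += 1
            return not present

def run_demo(num_threads=4, keys_per_thread=50):
    table = ElidableTable()

    def worker(tid):
        for k in range(keys_per_thread):
            table.insert_if_absent((tid, k))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table
```

Every operation ends up counted in exactly one of `elided` or `fallbacks`, so the two counters give a rough view of how often the speculative path succeeded.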
#### Lock Elision

Parallel execution:

```
atomic {
  if (!hashTable.isUnlocked()) abort;
  var = hashTable.lookup(X);
  if (!var) hashTable.insert(X);
} orElse ...
```

```
atomic {
  if (!hashTable.isUnlocked()) abort;
  var = hashTable.lookup(Y);
  if (!var) hashTable.insert(Y);
} orElse ...
```

#### Privatization

Privatization may only work correctly in TMs that support strong isolation. (why?)

#### Work Deferral

```
atomic {
  do_lots_of_work();
  update_global_statistics(); // effectively serializes transactions
}
```

```
atomic {
  do_lots_of_work();
  atomic open {
    update_global_statistics();
  }
}
```

```
atomic {
  do_lots_of_work();
  update_local_statistics();
}
atomic {
  update_global_statistics_using_local_statistics();
}
```

System == <threads, memory>

Each memory cell supports 4 operations:

- Write<sup>i</sup>(L,v): thread i writes v to L
- Read<sup>i</sup>(L,v): thread i reads v from L
- LL<sup>i</sup>(L,v): thread i reads v from L, marks L as read by i
- SC<sup>i</sup>(L,v): thread i writes v to L
  - returns *success* if L is marked as read by i.
  - Otherwise it returns *failure*.
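The four operations can be simulated in a few lines of Python to see how LL/SC detects interference. This is a sketch, not a hardware model: the internal `_guard` lock simply makes each simulated operation atomic, and the rule that any successful write clears every thread's mark is an assumption (it is the usual LL/SC behavior, but the list above does not state it). `LLSCCell` and `fetch_and_add` are hypothetical names.

```python
import threading

class LLSCCell:
    """One memory location L supporting Read, Write, LL and SC (simulation)."""

    def __init__(self, value):
        self._value = value
        self._marked = set()             # ids of threads whose LL mark is still valid
        self._guard = threading.Lock()   # makes each simulated operation atomic

    def read(self, tid):
        with self._guard:
            return self._value

    def write(self, tid, v):
        with self._guard:
            self._value = v
            self._marked.clear()         # assumption: a write invalidates all marks

    def ll(self, tid):
        with self._guard:
            self._marked.add(tid)        # mark L as read by thread tid
            return self._value

    def sc(self, tid, v):
        with self._guard:
            if tid not in self._marked:
                return False             # failure: L changed since our LL
            self._value = v
            self._marked.clear()
            return True                  # success

def fetch_and_add(cell, tid, delta):
    """Classic LL/SC retry loop: returns the value seen before the add."""
    while True:
        old = cell.ll(tid)
        if cell.sc(tid, old + delta):
            return old
```

If two threads both LL the cell, only the first SC succeeds; the second thread's mark was cleared, so its SC fails and it must re-read, which is exactly the retry pattern the STM algorithm builds on.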
# STM Design Overview

#### STM Design Overview

- The shared memory (STM object) holds two arrays: Memory and Ownerships
- Each Ownerships entry is a pointer to a thread (Rec object)
- Rec<sub>1</sub>, Rec<sub>2</sub>, ..., Rec<sub>n</sub> each hold: status, version, size, locs[], oldValues[]

# Threads: Rec Objects

```
class Rec {
    boolean stable = false;
    boolean, int status = (false, 0); // can have two values...
    boolean allWritten = false;
    int version = 0;
    int size = 0;
    int locs[] = {null};
    int oldValues[] = {null};
}
```

Rec is short for record. A Rec instance defines the current transaction on a thread.

# Memory: STM Object

```
public class STM {
    int memory[];
    Rec ownerships[];

    public boolean, int[] startTransaction(Rec rec, int[] dataSet) {...};

    private void initialize(Rec rec, int[] dataSet) {...};
    private void transaction(Rec rec, int version, boolean isInitiator) {...};
    private void acquireOwnerships(Rec rec, int version) {...};
    private void releaseOwnerships(Rec rec, int version) {...};
    private void agreeOldValues(Rec rec, int version) {...};
    private void updateMemory(Rec rec, int version, int[] newvalues) {...};
}
```
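How the Rec records and the ownerships array fit together can be sketched in Python. This is a deliberately simplified, single-threaded sketch: it keeps the all-or-nothing ownership acquisition and the record fields, but leaves out LL/SC, versions checks, and helping. The method names `acquire`, `release`, and `run`, the `failed_loc` field, and the use of `None`/`True`/`False` for status are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(eq=False)  # compare records by identity, like object references
class Rec:
    stable: bool = False
    status: Optional[bool] = None   # None = undecided, True = success, False = failure
    failed_loc: int = 0             # location that could not be acquired
    version: int = 0
    locs: List[int] = field(default_factory=list)
    old_values: List[int] = field(default_factory=list)

class STM:
    def __init__(self, size):
        self.memory = [0] * size
        self.ownerships = [None] * size   # each entry is None or the owning Rec

    def acquire(self, rec):
        """Try to own every location in rec.locs; stop at the first one already taken."""
        for loc in rec.locs:
            owner = self.ownerships[loc]
            if owner is None or owner is rec:
                self.ownerships[loc] = rec
            else:
                rec.status, rec.failed_loc = False, loc
                return False
        rec.status = True
        return True

    def release(self, rec):
        """Give back only the locations this record actually owns."""
        for loc in rec.locs:
            if self.ownerships[loc] is rec:
                self.ownerships[loc] = None

    def run(self, rec, calc_new_vals):
        """Acquire, snapshot old values, write new values, release. Returns success."""
        ok = self.acquire(rec)
        if ok:
            rec.old_values = [self.memory[loc] for loc in rec.locs]
            for loc, val in zip(rec.locs, calc_new_vals(rec.old_values)):
                self.memory[loc] = val
        self.release(rec)
        rec.version += 1
        return ok
```

On failure the record remembers which location was taken; in the full algorithm that location is what lets the failing thread find the owner's record and help it finish.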
#### Flow of a transaction

```
public boolean, int[] startTransaction(Rec rec, int[] dataSet) {
    initialize(rec, dataSet);
    rec.stable = true;   // notifies other threads that I can be helped
    transaction(rec, rec.version, true);
    rec.stable = false;
    rec.version++;
    if (rec.status) return (true, rec.oldValues);
    else return false;
}
```

- rec: the thread that executes this transaction.
- dataSet: the locations in memory it needs to own.

```
private void transaction(Rec rec, int version, boolean isInitiator) {
    acquireOwnerships(rec, version); // try to own locations
    (status, failedLoc) = LL(rec.status);
    if (status == null) { // success in acquireOwnerships
        if (version != rec.version) return;
        SC(rec.status, (true, 0));
    }
    (status, failedLoc) = LL(rec.status);
    if (status == true) { // execute the transaction
        agreeOldValues(rec, version);
        int[] newVals = calcNewVals(rec.oldValues);
        updateMemory(rec, version, newVals);
        releaseOwnerships(rec, version);
    }
    else { // failed in acquireOwnerships
        releaseOwnerships(rec, version);
        if (isInitiator) {
            Rec failedTrans = ownerships[failedLoc];
            if (failedTrans == null) return;
            else { // execute the transaction that owns the location you want
                int failedVer = failedTrans.version;
                if (failedTrans.stable) transaction(failedTrans, failedVer, false);
            }
        }
    }
}
```

- rec: the thread that executes this transaction.
- version: serial number of the transaction.
- isInitiator: am I the initiating thread or the helper?
- Helping: another thread owns the locations I need and it hasn't finished its transaction yet, so I go out and execute its transaction in order to help it.

```
private void acquireOwnerships(Rec rec, int version) {
    for (int j=1; j<=rec.size; j++) {
        while (true) {
            int loc = rec.locs[j];
            if (LL(rec.status) != null) return; // transaction completed by some other thread
            Rec owner = LL(ownerships[loc]);
            if (rec.version != version) return;
            if (owner == rec) break; // location is already mine
            if (owner == null) { // acquire location
                if ( SC(rec.status, (null, 0)) ) {
                    if ( SC(ownerships[loc], rec) ) {
                        break;
                    }
                }
            }
            else { // location is taken by someone else
                if ( SC(rec.status, (false, j)) ) return;
            }
        }
    }
}
```

- If I'm not the last one to read this field, it means that another thread is trying to execute this transaction.
- Try to loop until I succeed or until the other thread completes the transaction.

```
// Copy the dataSet to my private space
private void agreeOldValues(Rec rec, int version) {
    for (int j=1; j<=rec.size; j++) {
        int loc = rec.locs[j];
        if ( LL(rec.oldValues[loc]) == null ) { // not yet agreed on
            if (rec.version != version) return;
            SC(rec.oldValues[loc], memory[loc]);
        }
    }
}

// Selectively update the shared memory
private void updateMemory(Rec rec, int version, int[] newValues) {
    for (int j=1; j<=rec.size; j++) {
        int loc = rec.locs[j];
        int oldValue = LL(memory[loc]);
        if (rec.allWritten) return; // work is done
        if (rec.version != version) return;
        if (oldValue != newValues[j])
            SC(memory[loc], newValues[j]);
    }
    if (! LL(rec.allWritten) ) {
        if (rec.version != version) return;
        SC(rec.allWritten, true);
    }
}
```

#### HTM vs. STM

| Hardware | Software |
|----------------------------------------|------------------------------------------|
| Fast (due to hardware operations) | Slow (due to software validation/commit) |
| Light code instrumentation | Heavy code instrumentation |
| HW buffers keep amount of metadata low | Lots of metadata |
| No middleware needed | Runtime library needed |
| Only short transactions allowed (why?) | Large transactions possible |

How would you get the best of both?

# Hybrid-TM

- Best-effort HTM (use STM for long trx)
- Possible conflicts between HW, SW, and HW-SW Trx
  - What kind of conflicts do SW-Trx care about?
  - What kind of conflicts do HW-Trx care about?
- Some initial proposals:
  - HyTM: uses an ownership record per memory location (overhead?)
  - PhTM: HTM-only or (heavy) STM-only, low instrumentation

# Questions?