ISO/IEC JTC1 SC22 WG21 N2745 = 08-0255 - 2008-08-22 (REVISED for POWER 5/5+)
Paul E. McKenney, paulmck@linux.vnet.ibm.com
This document presents an implementation of the proposed C/C++ memory-order model for the POWER 5/5+ family of computer systems, which require either usage restrictions or special code sequences to implement the proposed C/C++ sequentially consistent atomic operations.
The POWER 5/5+ family of computer systems successfully run parallel programs containing atomic operations as long as at least one of the following conditions is met:
Please note that other members of the Power family, for example, Power 6 and Power 7, need not adhere to any of the above conditions.
Operation | POWER 5/5+ Implementation |
---|---|
Load Relaxed | ld |
Load Consume | ld |
Load Acquire | ld; cmp; bc; isync |
Load Seq Cst (POWER5/5+) | hwsync; larx; cmp; bc; isync |
Store Relaxed | st |
Store Release | lwsync; st |
Store Seq Cst | hwsync; st |
Cmpxchg Relaxed,Relaxed (32 bit) | _loop: lwarx; cmp; bc _exit; stwcx.; bc _loop; _exit: |
Cmpxchg Acquire,Relaxed (32 bit) | _loop: lwarx; cmp; bc _exit; stwcx.; bc _loop; isync; _exit: |
Cmpxchg Release,Relaxed (32 bit) | lwsync; _loop: lwarx; cmp; bc _exit; stwcx.; bc _loop; _exit: |
Cmpxchg AcqRel,Relaxed (32 bit) | lwsync; _loop: lwarx; cmp; bc _exit; stwcx.; bc _loop; isync; _exit: |
Cmpxchg SeqCst,Relaxed (32 bit) | hwsync; _loop: lwarx; cmp; bc _exit; stwcx.; bc _loop; isync; _exit: |
Acquire Fence | lwsync |
Release Fence | lwsync |
AcqRel Fence | lwsync |
SeqCst Fence (POWER5/5+) | for (i=0;i<8;i++) { dcbf junk; hwsync; ld junk; } |
The variable junk may be any memory location. It is permissible to use junk as the loop control variable, as long as that loop control variable is assigned to a memory location.
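The “SeqCst Fence (POWER5/5+)” row can be sketched in C++ roughly as follows. This is an illustration only, not a vetted implementation: the function name `seq_cst_fence_power5_sketch`, the choice of a file-scope `volatile long` for junk, and the GCC-style inline assembly are all assumptions, and on non-Power targets the sketch simply substitutes the portable `std::atomic_thread_fence` so that it still compiles and runs.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical scratch location; the text allows junk to be any memory location.
static volatile long junk = 0;

// Sketch of the table's loop: for (i=0;i<8;i++) { dcbf junk; hwsync; ld junk; }
// Returns the number of passes executed.
inline int seq_cst_fence_power5_sketch() {
    int iterations = 0;
    for (int i = 0; i < 8; i++) {
#if defined(__powerpc__) || defined(__powerpc64__)
        // dcbf junk; hwsync ("sync" is the assembler mnemonic for hwsync).
        // Assumes a GCC-compatible compiler.
        __asm__ __volatile__("dcbf 0,%0\n\tsync" : : "r"(&junk) : "memory");
#else
        // Assumption: on other targets, stand in with the portable fence.
        std::atomic_thread_fence(std::memory_order_seq_cst);
#endif
        long observed = junk;  // ld junk (the volatile read forces a real load)
        (void)observed;
        ++iterations;
    }
    // Postcondition: the table row specifies exactly eight passes.
    assert(iterations == 8);
    return iterations;
}
```

A caller would invoke `seq_cst_fence_power5_sketch()` wherever a sequentially consistent fence is needed. Keeping junk `volatile` is one way to make sure the compiler does not elide the load; other approaches are possible.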
It is legitimate (but usually unnecessary) to replace sync, lwsync, and eieio instructions with the code sequence shown above for “SeqCst Fence (POWER5/5+)”.
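For orientation, the source-level operations named in the table correspond to `std::atomic` calls such as those below. This is a minimal sketch in the syntax later standardized in C++11; the names `payload`, `ready`, `counter`, `publish`, `consume`, and `increment_with_cas` are hypothetical, and the comments only point at the table row each call would map to under this document's proposal.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical shared state for a message-passing pattern.
static int payload = 0;
static std::atomic<bool> ready{false};
static std::atomic<int> counter{0};

// Producer side: corresponds to the "Store Release" row (lwsync; st).
inline void publish(int value) {
    payload = value;                               // ordinary store
    ready.store(true, std::memory_order_release);  // publishes payload
}

// Consumer side: corresponds to the "Load Acquire" row (ld; cmp; bc; isync).
// Intended to run on a different thread from publish().
inline int consume() {
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();  // spin until the flag is published
    }
    return payload;  // ordered after the acquire load
}

// Corresponds to the "Cmpxchg AcqRel,Relaxed (32 bit)" row.
// Returns the value of counter after this call's increment.
inline int increment_with_cas() {
    int expected = counter.load(std::memory_order_relaxed);  // "Load Relaxed"
    while (!counter.compare_exchange_weak(
               expected, expected + 1,
               std::memory_order_acq_rel,     // ordering on success
               std::memory_order_relaxed)) {  // ordering on failure
        // On failure, expected now holds the current value; retry.
    }
    return expected + 1;
}
```

Calling `publish(42)` on one thread and `consume()` on another returns 42 on the consumer once the flag becomes visible. The success and failure orderings passed to `compare_exchange_weak` match the two orderings named in the Cmpxchg rows of the table.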