Extrapolating an Exaflop in 2018
Standard technology scaling will not get us there in 2018; compromise using evolutionary technology.
Assumptions for the “compromise guess”
Node Peak Perf
Same node count (64k)
System Power in Compute Chip
Expected based on 30 W for 200 GF today with a 6x technology improvement through 4 technology generations. (Compute-chip power scaling only; I/Os assumed to scale the same way.)
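A rough back-of-the-envelope sketch of this extrapolation, using only the figures stated above (30 W for 200 GF today, an assumed 6x efficiency improvement, the same 64k node count); the per-node peak simply divides 1 exaflop across the nodes.

```python
# Back-of-the-envelope compute-chip power extrapolation from the stated numbers:
# 30 W for 200 GF today, 6x efficiency improvement, same 64k node count.
TARGET_FLOPS = 1.0e18           # 1 exaflop peak
NODES = 64 * 1024               # same node count as the 64k-node baseline

today_gf_per_watt = 200.0 / 30.0            # ~6.7 GF/W today
future_gf_per_watt = today_gf_per_watt * 6  # assumed 6x technology improvement

node_peak_gf = TARGET_FLOPS / 1e9 / NODES   # ~15,300 GF (~15 TF) per node
node_power_w = node_peak_gf / future_gf_per_watt
system_power_mw = node_power_w * NODES / 1e6

print(f"Node peak:            {node_peak_gf / 1000:.1f} TF")
print(f"Node compute power:   {node_power_w:.0f} W")
print(f"System compute power: {system_power_mw:.1f} MW (compute chips only)")
```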
Link Bandwidth (Each unidirectional 3-D link)
Not possible to maintain bandwidth ratio.
Wires per unidirectional 3-D link
A large wire count will eliminate high density and drive links onto cables, where they are ~100x more expensive. Assume 20 Gbps signaling.
Network pins per node
20 Gbps differential signaling assumed. 20 Gbps over copper will be limited to about 12 inches, so optics will be needed for in-rack interconnects (see the pin-count sketch below).
10 Gbps is now possible in both copper and optics.
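A hedged sketch of the wire- and pin-count arithmetic for these rows. The per-link bandwidth target is a hypothetical placeholder; the 20 Gbps signaling rate and differential (2 pins per wire) assumption come from the slide, and a 6-neighbor 3-D torus (12 unidirectional links per node) is assumed.

```python
# Wire/pin accounting for the network links under the stated signaling assumptions.
SIGNALING_GBPS = 20          # per-wire signaling rate (from the slide)
LINK_BW_GBYTES = 100         # HYPOTHETICAL unidirectional 3-D link bandwidth, GB/s
LINKS_PER_NODE = 6 * 2       # assumed 3-D torus: 6 neighbors, send + receive link each

wires_per_link = LINK_BW_GBYTES * 8 / SIGNALING_GBPS   # data wires per unidirectional link
pins_per_node = wires_per_link * LINKS_PER_NODE * 2    # differential signaling: 2 pins per wire

print(f"Wires per unidirectional link: {wires_per_link:.0f}")
print(f"Network pins per node:         {pins_per_node:.0f}")
```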
Power in network
10 mW/Gbps assumed.
Today: 25 mW/Gbps for long-distance copper (greater than 2 feet), counting both ends of one direction; 45 mW/Gbps for optics (both ends, one direction) plus 15 mW/Gbps of electrical.
Future electrical power: links separately optimized for power.
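A small sketch of how these mW/Gbps figures translate into per-node network power. The aggregate per-node bandwidth is a hypothetical placeholder (the same 12 links at 100 GB/s as in the sketch above); the energy costs are the ones stated: 10 mW/Gbps assumed, 25 mW/Gbps copper today, 45 + 15 mW/Gbps optics today.

```python
# Per-node network link power for a HYPOTHETICAL aggregate bandwidth,
# under the energy costs stated on the slide.
NODE_NET_GBPS = 12 * 100 * 8   # HYPOTHETICAL: 12 unidirectional links at 100 GB/s each

def link_power_watts(total_gbps, mw_per_gbps):
    """Link power in watts for an aggregate bandwidth at a given energy cost."""
    return total_gbps * mw_per_gbps / 1000.0

scenarios = [
    ("assumed (10 mW/Gbps)", 10),
    ("copper today (25 mW/Gbps)", 25),
    ("optics today (45 + 15 mW/Gbps)", 60),
]
for label, cost in scenarios:
    print(f"{label:31s}: {link_power_watts(NODE_NET_GBPS, cost):6.0f} W/node")
```

At today's energy costs the network alone would consume a large fraction of the per-node power budget, which is why the external bandwidth/flop ratio cannot be maintained.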
Not possible to maintain external bandwidth/Flop
About 6-7 technology generations with expected eDRAM density improvements
Data pins associated with memory/node
128 data pins
3.2 Gbps per pin
Power in memory I/O (not DRAM)
10 mW/Gbps assumed. Most of the current power is in the address bus.
Future: probably about 15 mW/Gbps, perhaps reaching 10 mW/Gbps (2.5 mW/Gbps is the C*V^2*f switching power for random data on the data pins). Address power is higher.
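A short sketch using only the figures stated above: 128 data pins at 3.2 Gbps each, with I/O energy costs of 10 mW/Gbps (assumed), about 15 mW/Gbps (likely future), and 2.5 mW/Gbps (the C*V^2*f floor for random data on the data pins). Address-bus power is not modeled.

```python
# Memory-interface bandwidth and data-pin I/O power from the stated figures.
DATA_PINS = 128
GBPS_PER_PIN = 3.2

mem_bw_gbps = DATA_PINS * GBPS_PER_PIN     # 409.6 Gbps
mem_bw_gbs = mem_bw_gbps / 8               # 51.2 GB/s of data bandwidth per node

for label, mw_per_gbps in [("assumed", 10), ("likely future", 15), ("C*V^2*f floor", 2.5)]:
    watts = mem_bw_gbps * mw_per_gbps / 1000.0
    print(f"{label:14s} ({mw_per_gbps:>4} mW/Gbps): {watts:5.1f} W/node on data pins")

print(f"Memory data bandwidth per node: {mem_bw_gbs:.1f} GB/s")
```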
QCD CG single iteration time
Fast global sum (2 per iteration)
Hardware offload for messaging (driverless messaging)
~1/20 B/Flop bandwidth (see the sketch below)
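A hedged sketch of what the ~1/20 byte/flop ratio implies per node, assuming the node peak is 1 exaflop spread over the same 64k nodes; whether the ratio refers to network or memory bandwidth, the arithmetic is the same.

```python
# Implied per-node bandwidth at ~1/20 byte per flop of peak,
# with the node peak derived from 1 exaflop over 64k nodes.
NODES = 64 * 1024
node_peak_flops = 1.0e18 / NODES          # ~15.3 TF per node
bytes_per_flop = 1.0 / 20.0               # ~1/20 B/flop from the slide

node_bw_bytes_s = node_peak_flops * bytes_per_flop

print(f"Node peak:                  {node_peak_flops / 1e12:.1f} TF")
print(f"Implied per-node bandwidth: {node_bw_bytes_s / 1e9:.0f} GB/s")
```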
Power and packaging driven
Power, cost and packaging driven