HyperTransport on Opteron
HyperTransport is a high-speed, 16-bit point-to-point bus, providing (at full speed) a bi-directional link with 3.2GB/sec of bandwidth each way, for 6.4GB/sec in total. Any given Opteron processor carries three such links: two for communication with other Opteron processors in the system, and a third for communication with any external services provided by other I/O ASICs in the system.
HyperTransport also supports tunnelling, so that any I/O ASIC attached to the I/O HT link on the Opteron processor can tunnel bus traffic to another serially connected I/O ASIC on that same HT link. As a concrete example, think of a system with AMD's 8131 PCI-X segment bridge connected via HyperTransport, and their 8111 I/O hub connected to the other side of the 8131. The bus traffic travels like so:
Opteron <--16-bit--> AMD 8131 <--8-bit--> AMD 8111
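To sanity-check those numbers: a 16-bit HT link at full speed moves two bytes per transfer, double-pumped off an 800MHz clock. Here's a minimal back-of-the-envelope sketch of that arithmetic (the 800MHz DDR signalling is an assumption taken from the first-generation HyperTransport spec, since it matches the 3.2GB/sec figure quoted above):

    # Rough per-direction bandwidth of one HyperTransport link, assuming
    # first-generation signalling: 800MHz clock, double data rate.
    def ht_bandwidth_gbs(width_bits, clock_mhz=800):
        transfers_per_sec = clock_mhz * 1e6 * 2   # DDR: two transfers per clock
        return (width_bits / 8) * transfers_per_sec / 1e9

    print(ht_bandwidth_gbs(16))  # 3.2 GB/sec each way -> 6.4GB/sec total
    print(ht_bandwidth_gbs(8))   # 1.6 GB/sec each way on the 8-bit 8131 -> 8111 hop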
The HyperTransport bus, along with tunnelling, lets the processor communicate with the 8111 I/O hub via the 8131 segment bridge with very little latency. Everything connected to the 8111 is visible and active as far as the Opteron CPU is concerned. Further 8131s, or any other HyperTransport tunnel ASIC, can also be connected to the far side of the first 8131 if needed. Obviously, bus saturation and latency become limiting factors as bus traffic grows, but the bus is that extensible if need be.
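To make the saturation and latency point concrete, here's an illustrative sketch of a tunnel chain. The 50ns per-hop forwarding delay is purely a made-up placeholder, not a measured figure; the point is the shape of the trade-off, where bandwidth is capped by the narrowest hop and latency grows with every tunnel chip traversed:

    # Toy model of an HT tunnel chain: bandwidth to a downstream device
    # is limited by the narrowest link, latency accumulates per tunnel.
    HOP_LATENCY_NS = 50  # hypothetical forwarding delay per tunnel chip

    def chain_to_device(link_bws_gbs):
        """link_bws_gbs: per-direction bandwidth of each hop, CPU outwards."""
        return min(link_bws_gbs), HOP_LATENCY_NS * len(link_bws_gbs)

    # Opteron -> 8131 (16-bit) -> 8111 (8-bit), as in the diagram above
    bw, latency = chain_to_device([3.2, 1.6])
    print(f"8111 path: at most {bw}GB/sec each way, ~{latency}ns of tunnel latency")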
The two further HT links for CPU-to-CPU communication allow multi-way Opteron systems to be created. Given that the bus is low latency and high bandwidth, adding further Opteron processors to a multi-way system scales performance far better than in any other x86 SMP system. And since the bus for inter-CPU communication resides on the processor itself, with a protocol intelligent enough to allow efficient signalling and tunnelling, you aren't limited by the chipset, and communication doesn't all flow through one piece of silicon.
NUMA lets each CPU talk only to the CPU it needs to, via the shortest HT link path, so you don't need external I/O silicon capable of handling the 51.2GB/sec of potential inter-CPU bandwidth in an 8-way Opteron system. The CPUs do it all, at low latency and high speed, and very little operating system intervention is required for the NUMA abstraction to work, compared to traditional multi-processor systems based around x86 hardware.
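For the curious, the 51.2GB/sec figure falls out of simple arithmetic, assuming an 8-way layout where each CPU uses both of its coherent links and every physical link joins two CPUs (a ring, for the sake of the sketch):

    # Where 51.2GB/sec comes from in an assumed 8-CPU ring:
    # 8 CPUs x 2 coherent links each, every link shared by two CPUs.
    cpus = 8
    links = cpus * 2 // 2        # 8 distinct physical links
    per_link_gbs = 6.4           # both directions of one full-speed link
    print(links * per_link_gbs)  # 51.2 GB/sec aggregate inter-CPU bandwidth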
The last thing worth talking about with regard to HyperTransport on Opteron, which I won't specifically cover in this article due to the hardware on test, is that HyperTransport bus bandwidth can now affect graphics performance. In current systems, with something like the AMD 8151 AGP HyperTransport tunnel, the speed of the bus link to the processor can affect AGP graphics performance. And since HyperTransport can't provide identical latency to all devices on a given bus, it's essential that AGP bridges, or any other graphics bus (PCI Express?) in the future, be placed as the first HyperTransport device on the Opteron's I/O bus for maximum performance.
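As a rough illustration of why placement matters, reusing the same hypothetical per-hop delay as the tunnel sketch earlier: the AGP tunnel sitting first on the chain simply has fewer tunnel chips between it and the CPU.

    # Hypothetical comparison of AGP tunnel placement on the I/O chain
    # (the per-hop figure is a placeholder, not a measurement).
    HOP_LATENCY_NS = 50

    for label, chain in [("8151 first on the chain", ["8151"]),
                         ("8151 behind an 8131",     ["8131", "8151"])]:
        print(f"{label}: ~{len(chain) * HOP_LATENCY_NS}ns to reach the AGP bridge")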
So now that we know about the x86-64 basics, HyperTransport, and all the enhancements to the K7 core that give Opteron/K8 its power when running current code (both 32-bit and 64-bit), let's take a look at a real-life implementation, and a quite special one at that.
Source: http://hexus.net/business/reviews/enterprise/626-amd-opteron/?page=6