Networking in AI

Sun Feb 04 2024


Traditionally, most datacenter traffic flows north-south, using the leaf/ToR (top of rack)-spine architecture (or topology, if that sounds more decent) to direct traffic from inside the datacenter to the external internet and vice versa.
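To make the leaf-spine layout concrete, here is a toy sizing sketch. All port counts below are hypothetical examples chosen for illustration, not figures from any vendor:

```python
# Toy 2-tier leaf-spine fabric sizing (hypothetical port counts).
# Each leaf (ToR) switch connects servers below (downlinks) and every
# spine switch above (uplinks), so any two servers are at most one
# leaf -> spine -> leaf hop apart.

def leaf_spine_capacity(leaves: int, downlinks_per_leaf: int,
                        uplinks_per_leaf: int) -> dict:
    """Return basic sizing numbers for a 2-tier leaf-spine fabric."""
    servers = leaves * downlinks_per_leaf
    # Simplest design: one uplink from each leaf to each spine.
    spines = uplinks_per_leaf
    leaf_spine_links = leaves * uplinks_per_leaf
    # Oversubscription = downlink bandwidth / uplink bandwidth per leaf
    # (assuming equal port speeds on all ports, for simplicity).
    oversubscription = downlinks_per_leaf / uplinks_per_leaf
    return {
        "servers": servers,
        "spines": spines,
        "leaf_spine_links": leaf_spine_links,
        "oversubscription": oversubscription,
    }

# Example: 16 leaves, each with 48 server ports and 8 uplinks,
# giving 768 servers behind 8 spines at 6:1 oversubscription.
print(leaf_spine_capacity(16, 48, 8))
```

The key property is that capacity grows by adding more leaves and spines side by side, rather than by buying ever-bigger core switches.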

Courtesy of FS


Within AI, LLM data needs to be transmitted across servers to reduce latency, creating demand for interconnects that can handle huge east-west traffic, known as the backend network. This backend network comprises 800G switches, 800G optical transceivers, Nvidia InfiniBand across servers, and Nvidia NVLink within a server. Whether it is InfiniBand or Ethernet, the rising demand for higher-spec and greater numbers of switches and transceivers remains unchanged. Some deployments also need NICs (network interface cards). Ethernet chip giants Broadcom and Marvell, datacenter switch leaders Arista, Celestica, and Edgecore, and transceiver leaders Innolight and Finisar should all benefit.
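A rough back-of-the-envelope count shows why this east-west fabric multiplies transceiver demand. The assumptions below (one backend NIC port per GPU, an optical transceiver at each end of every link, each extra fabric tier roughly doubling the link count in a non-blocking fat-tree) are illustrative simplifications, not figures from the vendors named above:

```python
# Back-of-the-envelope transceiver count for an AI backend fabric
# (all numbers are hypothetical assumptions for illustration).

def backend_transceivers(gpus: int, fabric_tiers: int = 2) -> int:
    """Estimate optical transceiver count for a GPU backend network.

    Assumes one backend NIC port per GPU, a non-blocking fat-tree
    where each additional tier roughly doubles the number of links,
    and two transceivers (one per end) on every optical link.
    """
    gpu_to_leaf_links = gpus                    # one port per GPU
    total_links = gpu_to_leaf_links * fabric_tiers
    return total_links * 2                      # two transceivers per link

# Example: a 1,024-GPU cluster with a 2-tier non-blocking fabric
# needs about 4,096 transceivers, i.e. roughly 4 per GPU.
print(backend_transceivers(1024))
```

In practice some short-reach links use copper cables instead of optics, so the real ratio varies, but the multiplier effect per GPU is the point.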

Courtesy of FS

Courtesy of Coherent

How much of a volume increase can they expect? That is the difficult part. One reference point is Marvell's view. Marvell dominates PAM4 DSPs, the main chip in optical transceivers for high-speed switch ports, and it has publicly stated that the ratio of transceivers to GPUs is more than 1. With the current market view of 3.2-4mn Nvidia GPUs in 2024, yet a much higher number of high-speed transceivers according to Innolight, this relationship seems about right.
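Scaling that ratio to the 2024 market view is simple arithmetic. The ratio of 2.5 used below is a hypothetical assumption for illustration, not a figure disclosed by Marvell or Innolight:

```python
# Scaling the ">1 transceiver per GPU" relationship to 2024 market
# estimates. The assumed_ratio is a hypothetical illustration only.

gpu_shipments = (3.2e6, 4.0e6)   # 3.2-4mn Nvidia GPUs in 2024 (market view)
assumed_ratio = 2.5              # hypothetical transceivers per GPU

low, high = (g * assumed_ratio for g in gpu_shipments)
print(f"Implied high-speed transceivers: {low/1e6:.0f}-{high/1e6:.0f}mn")
# -> Implied high-speed transceivers: 8-10mn
```

Any ratio above 1 implies transceiver unit volumes outgrow GPU unit volumes, which is the core of the bull case for the names above.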

All right, that's it for this week. If you have any questions, please feel free to contact us.

Time for an afternoon tea!