04 August 2016
When we create through adaptation, we inevitably have to make tradeoffs. We often sacrifice some efficiency, and sometimes we simply end up with a sub-optimal solution. I believe this is the case with data center networking. Current data center networks evolved from campus and enterprise networks, which are great models when you need a lot of packet editing and control. Large-scale data centers don't need to be so complicated. What they need is simply high-bandwidth, low-latency, simple-to-manage interconnects. At the core of the network, what matters most is moving data from point A to point B as quickly and efficiently as possible.

Data centers have evolved from on-premises, environment-controlled rooms with a single mainframe into robotically controlled warehouses of machines accessed through the cloud. Along with this monumental shift, we have witnessed the evolution of features, as well as of the fundamental needs they fulfill. Two very important legacy features of the modern data center introduce unnecessary complexity into the network. In fact, keeping these functions where they currently reside creates inefficiencies.

The first of these legacy features is packet control. Packet control is essential for handling the logic functions: VLAN, access control, security, and redirecting. The problem is that packet control doesn't need to be performed in every data center switch, and doing so makes switching technologies unnecessarily expensive and inefficient. In fact, it doesn't need to be performed in a switch at all. It's much more efficient to do this at the server-network edge.

The second feature is store-and-forward. This is really one of those cases where people do things a certain way because that's the way it has always been done. Store-and-forward switching is a simple operation that has always been a standard part of campus and data center networks, but it's slow. Each switch must receive and buffer an entire packet before sending it on, which increases both latency and the buffer space needed for each packet. Cut-through switching is much faster: the switch begins forwarding a packet as soon as it has read the destination, eliminating the need for acknowledgement at every hop. Now that we live in a world where networks are 99% reliable, it's unnecessary to verify every packet at each intermediate location. We should be doing smart acknowledgement instead. Once again, it's more efficient to perform this function at the server-network edge.
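
To put rough numbers on the latency difference, here is a minimal back-of-the-envelope sketch. It is an illustration under stated assumptions, not a model of any particular switch: it counts only serialization delay (ignoring propagation, lookup, and queueing time), assumes every link runs at the same speed, and assumes a cut-through switch needs the first 64 bytes of a frame before it can start forwarding. The function names and the 64-byte figure are hypothetical choices made for this example.

```python
def serialization_delay_us(frame_bytes, link_gbps):
    """Time to clock frame_bytes onto a link, in microseconds."""
    return frame_bytes * 8 / (link_gbps * 1000)


def store_and_forward_latency_us(frame_bytes, link_gbps, switches):
    # The sender serializes the frame once, and every switch must
    # receive the whole frame before serializing it again.
    return (switches + 1) * serialization_delay_us(frame_bytes, link_gbps)


def cut_through_latency_us(frame_bytes, link_gbps, switches, header_bytes=64):
    # The full frame is serialized once end to end; each switch only
    # adds the time needed to read the header (assumed 64 bytes here).
    header_delay = serialization_delay_us(header_bytes, link_gbps)
    return serialization_delay_us(frame_bytes, link_gbps) + switches * header_delay


if __name__ == "__main__":
    frame, speed, hops = 1500, 10, 5  # 1500-byte frame, 10 Gb/s links, 5 switches
    print(f"store-and-forward: {store_and_forward_latency_us(frame, speed, hops):.3f} us")
    print(f"cut-through:       {cut_through_latency_us(frame, speed, hops):.3f} us")
```

Under these assumptions, a 1500-byte frame crossing five switches on 10 Gb/s links spends about 7.2 microseconds in serialization with store-and-forward, versus roughly 1.46 microseconds with cut-through. The gap grows with frame size and hop count.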

Just as network reliability improvements should prompt a new evaluation of data center architecture, the evolution of certain server technologies gives us an opportunity to rethink where key features should reside. With dramatic improvements in the semiconductor technologies used to produce NICs, it is now possible to move some of the rich networking features there. The improvements we have seen in NICs over the last five years make it possible to offer additional intelligence and features there, like packet control and buffering, to compensate for the elimination of store-and-forward. These enhancements not only decrease complexity, but they also improve performance, increase scale and reduce costs.

As Ockham's razor states: all things being equal, the simplest solution tends to be the best one. This should be our guiding principle in changing the data center network. The current model includes legacy features that hinder the network's ability to deliver the bandwidth, latency and features we need. The new and improved model embraces simplification and will result in a scalable, error-free, highly efficient data center network infrastructure. Underlying network technology has improved dramatically over the last several decades. It now provides an environment where we can design data center network infrastructure specifically to maximize performance, while still ensuring security, low latency, and availability of required features.