Now when you want to implement a network function using multiple cores, several challenges come along. The first challenge is ensuring that the processing done by distinct cores doesn't interfere: if different packets of the same connection are sent to different cores, then sharing the network function's state across those cores becomes a nightmare. So that's one thing you have to worry about. The other thing is that when multiple cores are participating, you have inter-core communication, and I mentioned these NUMA effects: you want to make sure that the cores employed for a particular network function are on the same NUMA node, so that they can take advantage of locality rather than having to go across sockets. Going across sockets means going from a core on one processor to a core on another processor, and you will be stuck with NUMA effects; those are things you have to worry about. A corollary to that is ensuring that the cores processing packets are on the same NUMA socket as the NIC itself, because the NIC is transferring packets into user-space buffers, and those land in the RAM associated with a particular socket, meaning a particular CPU. The core that is going to process them had better be on the same NUMA node so that the processing is efficient. These are all the challenges you have to deal with when you're trying to implement network functions on multiple cores.

A useful hardware technology for dealing with these complexities is what is called receive side scaling. It's an enabler for multi-core processing, and the idea is quite simple: we're going to use hashing to distribute the incoming packets to individual cores. Now how does this hashing work? The hash function takes the packet's 5-tuple as input. Remember that the 5-tuple of a packet uniquely identifies a connection, because it is the source address, destination address, source port, destination port, and protocol, and that 5-tuple is used to direct an incoming packet to a particular core. Each core is assigned a unique ring buffer to poll, and therefore there is no contention among the threads. I mentioned that dealing with contention is one of the issues with a multi-core implementation, but receive side scaling is already doing this bifurcation of incoming packets into these ring buffers: we are guaranteed that the thread polling one ring buffer is distinct from the thread polling a different ring buffer, and the same ring buffer is never polled by multiple threads, so there's no contention among the threads. And because we are demultiplexing using that 5-tuple, different connections land in different ring buffers and are therefore directed to different cores, so the per-connection state is accessed only by a single core and state management becomes easy. So these are all the challenges I said exist when you're trying to implement network functions on multi-core, and receive side scaling, a hardware technology that DPDK can exploit, makes sure that the problems associated with multi-core processing can be mitigated or even eliminated. That's the nice thing about this.
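To make this concrete, here is a minimal sketch (not from the lecture) of how a DPDK application might turn on receive side scaling and give each worker core its own RX queue. The port number and queue count are assumed values, and the exact constant names (e.g. RTE_ETH_MQ_RX_RSS, RTE_ETH_RSS_IP) vary slightly across DPDK versions.

```c
#include <rte_ethdev.h>

#define PORT_ID   0          /* assumed: a single NIC port */
#define NB_QUEUES 4          /* assumed: one RX queue per worker core */

/* Enable RSS so the NIC hashes each packet's 5-tuple and spreads
 * connections across the RX queues; each queue is polled by exactly
 * one core, so per-connection state stays on a single core. */
static int setup_rss(struct rte_mempool *mbuf_pool)
{
    struct rte_eth_conf port_conf = {
        .rxmode = { .mq_mode = RTE_ETH_MQ_RX_RSS },
        .rx_adv_conf = {
            .rss_conf = {
                .rss_key = NULL,                      /* use the NIC's default hash key */
                .rss_hf  = RTE_ETH_RSS_IP |
                           RTE_ETH_RSS_TCP |
                           RTE_ETH_RSS_UDP,           /* hash over the 5-tuple fields */
            },
        },
    };

    if (rte_eth_dev_configure(PORT_ID, NB_QUEUES, NB_QUEUES, &port_conf) < 0)
        return -1;

    /* One RX/TX ring per queue; place the descriptors on the NIC's NUMA socket. */
    for (uint16_t q = 0; q < NB_QUEUES; q++) {
        if (rte_eth_rx_queue_setup(PORT_ID, q, 1024,
                                   rte_eth_dev_socket_id(PORT_ID),
                                   NULL, mbuf_pool) < 0)
            return -1;
        if (rte_eth_tx_queue_setup(PORT_ID, q, 1024,
                                   rte_eth_dev_socket_id(PORT_ID),
                                   NULL) < 0)
            return -1;
    }
    return rte_eth_dev_start(PORT_ID);
}
```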
So the multi-core support in DPDK allows the admin to specify many things, and this is the hardware-software partnership I was mentioning, made possible by DPDK exploiting the hardware features. In particular, the admin can map a specific RX queue to a specific CPU core. I can say that port 0, RX queue 1 is associated with CPU core 6, and similarly that CPU core 6 is also associated with port 1, transmit queue 2. These are things the admin can specify, and DPDK gives the flexibility to create as many queues as the admin wants (the network administrator is the person I'm thinking about here). Each thread is pinned to a specific core, and that avoids contention. That's how all the challenges that exist in supporting a network function on multiple cores are alleviated by DPDK, through both the hardware technology that's available and the library software that channels a network packet to a particular core. Each thread, or core, runs the same code in order to do this packet processing.

The other thing I mentioned is the NUMA effects, and again DPDK allows the network function to be aware of NUMA effects. What DPDK allows you to do is create the memory pools used for inter-core communication on the same NUMA socket as the cores involved in that communication. It also allows the ring buffers to be allocated on the same socket as the NIC and the cores selected for processing. These are things you can do in DPDK, and remote memory access is therefore minimized. These issues are hairy from a network function implementor's point of view, but the DPDK library alleviates them by making it possible to keep the NIC and the cores processing its packets on the same NUMA node, reducing the amount of remote memory access.
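To make the NUMA point concrete, here is an illustrative sketch (again, not the lecturer's code; PORT_ID, BURST_SIZE, and the queue-to-core assignment are assumptions) of allocating the packet buffer pool on the NIC's socket and launching one polling thread per core with DPDK's lcore API, so every queue is polled by exactly one pinned thread.

```c
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>

#define PORT_ID    0
#define BURST_SIZE 32

/* Per-lcore worker: polls exactly one RX queue, so no two threads ever
 * touch the same ring buffer or the same connection's state. */
static int lcore_rx_loop(void *arg)
{
    uint16_t queue_id = *(uint16_t *)arg;
    struct rte_mbuf *bufs[BURST_SIZE];

    for (;;) {
        uint16_t nb = rte_eth_rx_burst(PORT_ID, queue_id, bufs, BURST_SIZE);
        for (uint16_t i = 0; i < nb; i++) {
            /* ... network-function processing happens on this core only ... */
            rte_pktmbuf_free(bufs[i]);
        }
    }
    return 0;
}

int main(int argc, char **argv)
{
    if (rte_eal_init(argc, argv) < 0)
        return -1;

    /* Allocate packet buffers on the same NUMA socket as the NIC, so the
     * DMA'd packets land in memory local to the cores that process them. */
    int nic_socket = rte_eth_dev_socket_id(PORT_ID);
    struct rte_mempool *pool = rte_pktmbuf_pool_create("mbuf_pool", 8191, 256, 0,
                                                       RTE_MBUF_DEFAULT_BUF_SIZE,
                                                       nic_socket);
    if (pool == NULL)
        return -1;

    /* setup_rss(pool) from the earlier sketch would configure the queues here. */

    /* Launch one worker per lcore; ideally the EAL core list (-l) names only
     * cores on nic_socket so all memory accesses stay NUMA-local. */
    static uint16_t queue_ids[RTE_MAX_LCORE];
    uint16_t q = 0;
    unsigned lcore;
    RTE_LCORE_FOREACH_WORKER(lcore) {
        queue_ids[lcore] = q++;
        rte_eal_remote_launch(lcore_rx_loop, &queue_ids[lcore], lcore);
    }
    rte_eal_mp_wait_lcore();
    return 0;
}
```

The design choice the sketch reflects is the one from the lecture: one queue per core, threads pinned by the EAL, and all buffers on the NIC's socket, so contention and remote memory accesses are avoided by construction.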