See the slides for diagrams and for background on how I/O works in an OS:
https://www.cse.iitb.ac.in/~mythili/virtcc/slides_pdf/07-iovirt.pdf

Techniques for I/O virtualization

  • Device emulation

    • The guest’s I/O operation traps to the VMM; the host OS then performs the operation and the VMM delivers the completion interrupt to the guest
    • Some interrupts are handled by the VMM itself.
  • Virtio optimization

    • Essentially device emulation, but with a ring buffer shared between guest and host: requests are
      collected in the ring, processed, and the results are put back into the ring.
      This pools I/O requests and interrupts and processes them in batches, which gives higher performance.
      Using a ring also reduces the number of copies:
      without ring: DMA → RAM → VMCS → Host RAM
      with ring:    DMA → RING → Host RAM
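
The batching idea behind the shared ring can be sketched with a toy model. This is only an illustration, not the real virtio data structures: the class `SharedRing`, its methods, and the `kicks`/`interrupts` counters are hypothetical names made up for these notes. The point it shows is that several requests cost one guest-to-host notification and one interrupt back.

```python
from collections import deque


class SharedRing:
    """Toy model of a ring shared by guest and host (hypothetical, not the virtio layout)."""

    def __init__(self, size):
        self.size = size
        self.avail = deque()   # requests posted by the guest, not yet processed
        self.used = deque()    # completed requests waiting for the guest
        self.kicks = 0         # guest -> host notifications
        self.interrupts = 0    # host -> guest notifications

    def guest_post(self, request):
        """Guest adds a request to the ring without notifying the host."""
        if len(self.avail) + len(self.used) >= self.size:
            raise BufferError("ring full")
        self.avail.append(request)

    def guest_kick(self):
        """Guest notifies the host once for everything posted so far."""
        self.kicks += 1
        self.host_process()

    def host_process(self):
        """Host drains all pending requests, then raises a single interrupt."""
        processed = 0
        while self.avail:
            request = self.avail.popleft()
            self.used.append((request, "done"))
            processed += 1
        if processed:
            self.interrupts += 1

    def guest_reap(self):
        """Guest collects all completed results from the ring."""
        results = list(self.used)
        self.used.clear()
        return results


ring = SharedRing(size=8)
for i in range(4):
    ring.guest_post(f"write block {i}")
ring.guest_kick()                      # one notification for four requests
results = ring.guest_reap()
print(len(results), ring.kicks, ring.interrupts)  # 4 1 1
```

With plain emulation each of the four requests would trap separately; here the four are pooled behind a single kick and a single interrupt.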
  • Device passthrough or direct I/O (SR-IOV)

    • More efficient than emulation
    • Example:
      SR-IOV (Single Root I/O Virtualization)
      • 1 physical function: host OS
      • many virtual functions: guest OS
      • each virtual function is like a separate NIC, bound to a guest VM
      • so each NIC has its own MAC address, which is used for packet routing
    • So how do packets get routed to the right NIC?
      • The guest’s device driver communicates directly with its virtual function
      • Packets don’t go through the host OS
      • Packets get switched at layer 2 using the MAC address of the VM’s virtual device
      • Data gets DMA’d directly into guest RAM
      • However, interrupts can still cause VM exits. These interrupts can occur for packets sent, packets received, and NIC errors or link changes
    • So why don’t all NICs do this?
      • The guest VM gives the NIC a guest physical address (GPA) as the place to store the DMA data, but since guest memory is virtualized, that address does not match the actual host physical address (HPA).
      • Thus an IOMMU (I/O Memory Management Unit) is needed to convert the GPA to an HPA before the NIC can perform the direct memory access.
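
The MAC-based switching mentioned in the list above can be pictured as a table lookup. A minimal sketch, assuming a made-up table `VF_TABLE` and helper `switch_frame`; the MAC addresses and VM names are invented for illustration, and the real lookup happens in NIC hardware, not in software.

```python
# Hypothetical table: MAC address of each virtual function -> guest VM it is bound to.
VF_TABLE = {
    "52:54:00:00:00:01": "vm1",
    "52:54:00:00:00:02": "vm2",
}


def switch_frame(dst_mac, vf_table):
    """Return the VM whose virtual function owns dst_mac, or None if no VF matches.

    What the hardware does with an unmatched frame is outside this sketch.
    """
    return vf_table.get(dst_mac.lower())


print(switch_frame("52:54:00:00:00:02", VF_TABLE))  # vm2
```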

    Final Flow of DMA in SR-IOV (with IOMMU)

    1. Guest OS allocates a buffer and provides a GPA to the NIC (VF).
    2. NIC receives a network packet and wants to write it to the provided buffer.
    3. NIC tries to perform DMA to the GPA, but it doesn’t know the real memory location (HPA).
    4. IOMMU translates the GPA → HPA, giving the NIC the correct physical memory location.
    5. NIC performs DMA to the correct HPA, storing the packet in memory.
    6. Guest OS reads the packet from its buffer using the same GPA (which the guest maps into its own virtual address space).
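
The six steps above can be walked through in a small simulation. This is a sketch under stated assumptions, not how an IOMMU is actually programmed: `Iommu`, `nic_dma_write`, `guest_read`, the 4 KiB page size, and the page numbers are all hypothetical. For simplicity the guest's read reuses the same GPA → HPA mapping; on real hardware the CPU side would go through the hypervisor's own second-level page tables.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class Iommu:
    """Toy IOMMU: maps guest page numbers to host page numbers."""

    def __init__(self):
        self.pages = {}

    def map_page(self, guest_page, host_page):
        self.pages[guest_page] = host_page

    def translate(self, gpa):
        """Step 4: translate a guest physical address to a host physical address."""
        guest_page, offset = divmod(gpa, PAGE_SIZE)
        if guest_page not in self.pages:
            raise PermissionError(f"DMA fault: GPA {gpa:#x} is not mapped")
        return self.pages[guest_page] * PAGE_SIZE + offset


host_ram = {}  # sparse model of host memory: HPA -> byte value


def nic_dma_write(iommu, gpa, payload):
    """Steps 3-5: the NIC targets a GPA; every access is translated before the write."""
    for i, byte in enumerate(payload):
        host_ram[iommu.translate(gpa + i)] = byte


def guest_read(iommu, gpa, length):
    """Step 6: the guest reads its buffer by GPA (same mapping reused, see lead-in)."""
    return bytes(host_ram[iommu.translate(gpa + i)] for i in range(length))


iommu = Iommu()
iommu.map_page(guest_page=0x10, host_page=0x7A)   # set up by the hypervisor

buffer_gpa = 0x10 * PAGE_SIZE + 0x40              # step 1: guest hands this GPA to the VF
packet = b"hello"                                 # step 2: packet arrives at the NIC
nic_dma_write(iommu, buffer_gpa, packet)          # steps 3-5
hpa = iommu.translate(buffer_gpa)

print(hex(buffer_gpa), hex(hpa))                  # 0x10040 0x7a040
print(guest_read(iommu, buffer_gpa, len(packet))) # b'hello'
```

A DMA to a GPA with no mapping raises `PermissionError` in this model, which stands in for the IOMMU blocking a device from writing outside the memory assigned to its VM.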