Writing a UEFI Network Device Driver
Purpose of this Document
The reader is strongly encouraged to refer to the Supporting Documents, especially the UEFI Specification. It describes all the major APIs in UEFI, including explanations of each of the API entry points. Don't be intimidated by its size (a mere 2168 pages); the PDF is reasonably well organized, indexed, and cross-referenced.
Unfortunately, this specification is not written clearly. Many words are used in unfamiliar ways: for example, the “Simple Network Protocol” describes the API used to implement a network driver, not the network protocol it implements. Other problems arise when attempting to interpret the entry point specifications. As a randomly chosen example, the “InterruptStatus” parameter to the “GetStatus” driver entry point is described as follows: “A pointer to the bit mask of the currently active interrupts (see “Related Definitions”). If this is NULL, the interrupt status will not be read from the device. If this is not NULL, the interrupt status will be read from the device. When the interrupt status is read, it will also be cleared. Clearing the transmit interrupt does not empty the recycled transmit buffer array.”
But what does “If this is NULL, the interrupt status will not be read from the device” mean to the driver writer? It appears to tell the driver not to read the interrupt status from the device, but in practice that's not what it means. It actually means “If this is NULL, don't pass back an interrupt status.” Further, does “When the interrupt status is read, it will also be cleared” mean the interrupt should be acknowledged in the hardware? Is that even appropriate? Examining the code doesn't clarify the intent, as the bit mask is never used, even when it is supplied. Thus the goal of this document is to describe the process of creating a network driver as a narrative, rather than as a specification.
Supporting Documents
This document makes reference to the following documents. The UEFI specifications can be downloaded from the UEFI web site (http://www.uefi.org); a license acknowledgement is required.
Unified Extensible Firmware Interface Specification, version 2.3.1, Errata C. June 27th, 2012.
UEFI Shell Specification, revision 2.0, Errata A. May 22nd, 2012.
Synchronization
UEFI runs in a single-core environment, even if the hardware supports multiple cores. Simple Network Protocol drivers generally run in a polled mode, with device interrupts disabled. This might lead to the assumption that driver methods won't be called recursively. In practice this assumption has proven false, even when the driver has been marked as not supporting multiple concurrent transmissions.
Some of the driver entry points, such as “Transmit”, can return EFI_NOT_READY to indicate that they're busy. Others, like “GetStatus”, could return without performing any action. For these, it would be possible to detect and handle a recursive entry or other synchronization conflict. However, the proper mechanism is to use the Task Priority Services. Each task in UEFI is scheduled at a defined priority. Tasks of one priority may not interrupt currently-running tasks of a higher priority. Thus, if a task raises its task priority level, it can be assured that (1) no tasks of the new priority are currently running, and (2) the task will not be interrupted by a lower priority task. With the Simple Network Protocol, all driver methods are called with the task priority less than or equal to TPL_CALLBACK. Thus raising the task priority to TPL_CALLBACK is sufficient to prevent reentrancy and data synchronization problems. The LAN91x and LAN9118 drivers both use this mechanism.
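The raise-and-restore pattern described above can be sketched as follows. This is a minimal standalone model, not the drivers' actual code: RaiseTpl() and RestoreTpl() are stand-ins for the boot services gBS->RaiseTPL() and gBS->RestoreTPL(), and DriverTransmit(), TxQueued, and TplSeenInside are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Stand-ins for the UEFI Task Priority Services. In a real driver these
   would be gBS->RaiseTPL() and gBS->RestoreTPL(). */
typedef uintptr_t EFI_TPL;
#define TPL_APPLICATION 4
#define TPL_CALLBACK    8

static EFI_TPL CurrentTpl = TPL_APPLICATION;   /* simulated task priority */

static EFI_TPL RaiseTpl(EFI_TPL NewTpl) {
    EFI_TPL Old = CurrentTpl;
    CurrentTpl = NewTpl;
    return Old;                 /* caller must hand this back to RestoreTpl */
}

static void RestoreTpl(EFI_TPL OldTpl) {
    CurrentTpl = OldTpl;
}

/* Hypothetical driver state shared by several driver methods. */
static unsigned TxQueued;
static EFI_TPL  TplSeenInside;

/* Hypothetical driver method: the section between raise and restore is
   the critical region protected against reentry. */
int DriverTransmit(void) {
    EFI_TPL OldTpl = RaiseTpl(TPL_CALLBACK);
    TplSeenInside = CurrentTpl; /* recorded only so the sketch can be checked */
    TxQueued++;                 /* touch shared state safely */
    RestoreTpl(OldTpl);         /* every exit path must restore the old TPL */
    return 0;
}
```

The important design point is that the saved priority is restored on every return path, including error returns; a method that exits with the priority still raised would stall lower-priority tasks.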
UEFI Driver Entry Point
Refer to UEFI Specification §4.7.2, “UEFI Driver Model Example”
A network device driver in UEFI is usually implemented as a DXE (Driver Execution Environment) module. A DXE module has a main entry point which is called after the module has been loaded and relocated; the name of this function is specified in the module's .INF file. Its purpose is roughly equivalent to a Linux device driver's module initialization function (identified in the module_init() macro).
In Linux, most drivers register with the bus driver for the type of bus the device connects to, and the bus driver calls the driver when a potentially matching device is detected. Similarly, most UEFI drivers will register for the protocol that supports that bus (USB, for example) using the EFI Driver Binding Protocol, and the bus protocol calls the driver to notify it of a device instance that the driver may wish to handle. In the case of the network drivers I've been working with (the SMSC LAN91x and LAN9118 families), the device is not connected to a bus with UEFI protocol support. Instead, these drivers directly install the protocol interfaces for their device instances.
A network driver needs to register with one of the network protocols, and both of these drivers install protocol interfaces for the Simple Network Protocol, commonly referred to as “SNP”. To do this, the driver creates a private data structure containing two SNP-defined structures and whatever per-instance data the driver needs to maintain. The first structure, EFI_SIMPLE_NETWORK_PROTOCOL, contains a collection of pointers to driver functions (or methods) that UEFI and applications can call to access the network. Each of these driver methods is defined in Chapter 21.1 of the UEFI Specification. The EFI_SIMPLE_NETWORK_PROTOCOL structure also contains a pointer to the second SNP-defined structure, EFI_SIMPLE_NETWORK_MODE, which is a description of the interface characteristics. Thus a primary function of these drivers' main entry point is to initialize these two structures.
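The private data structure described above might be laid out roughly as follows. This is a sketch under stated assumptions: SNP_MODE and SNP_PROTOCOL are heavily reduced stand-ins for EFI_SIMPLE_NETWORK_MODE and EFI_SIMPLE_NETWORK_PROTOCOL, and NET_DRIVER, its fields, and the signature value are hypothetical. It shows the two embedded structures being linked together and a driver method recovering its private data from the protocol pointer it is handed.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-ins for the SNP-defined structures; the real ones have
   many more fields (see the UEFI Specification). */
typedef struct {
    uint32_t State;
    uint32_t HwAddressSize;
    uint8_t  CurrentAddress[32];
} SNP_MODE;

typedef struct SNP_PROTOCOL SNP_PROTOCOL;
struct SNP_PROTOCOL {
    int     (*Start)(SNP_PROTOCOL *This);  /* one method pointer of many */
    SNP_MODE *Mode;                        /* points at the mode structure */
};

/* Hypothetical per-instance driver data wrapping both structures. */
typedef struct {
    uint32_t     Signature;
    SNP_PROTOCOL Snp;
    SNP_MODE     SnpMode;
    uintptr_t    IoBase;                   /* example of driver-private data */
} NET_DRIVER;

#define NET_DRIVER_SIGNATURE 0x4E455444u
/* Recover the private structure from the protocol pointer passed to a method. */
#define DRIVER_FROM_SNP(p) \
    ((NET_DRIVER *)((char *)(p) - offsetof(NET_DRIVER, Snp)))

static int DriverStart(SNP_PROTOCOL *This) {
    NET_DRIVER *Drv = DRIVER_FROM_SNP(This);
    if (Drv->Signature != NET_DRIVER_SIGNATURE) return -1;  /* not ours */
    Drv->SnpMode.State = 1;                /* Stopped -> Started */
    return 0;
}

/* What the entry point does for each device instance in this sketch. */
void DriverInitStructures(NET_DRIVER *Drv, uintptr_t IoBase) {
    Drv->Signature = NET_DRIVER_SIGNATURE;
    Drv->IoBase = IoBase;
    Drv->SnpMode.State = 0;                /* Stopped */
    Drv->SnpMode.HwAddressSize = 6;        /* Ethernet MAC length */
    Drv->Snp.Start = DriverStart;
    Drv->Snp.Mode = &Drv->SnpMode;
}
```

Embedding both structures in one allocation means every method can reach its per-instance data with pointer arithmetic alone, with the signature acting as a sanity check.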
Officially, a driver entry point function is not permitted to touch the device itself. This makes sense if it is merely registering to handle a class of devices which may or may not yet be connected to the system; the driver's Binding Protocol methods will be called when the device is ready to be accessed. In the case of the LAN91x and LAN9118 drivers, however, the drivers' entry point functions access the devices to obtain the information needed to initialize the SNP structures.
If the driver is to support network booting, it needs to install a second protocol, the Device Path Protocol. This allows UEFI to identify the programmatic path to the device. This is described in Chapter 9.1 of the UEFI Specification.
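For illustration, the tail of such a device path can be sketched as raw bytes: a MAC Address messaging node followed by an End node. This is a hedged sketch, not the drivers' code. The type, sub-type, and length values follow my reading of the UEFI Specification's device path chapter, BuildMacDevicePath() is a hypothetical helper, and a real path would normally begin with one or more nodes describing how the controller itself is reached.

```c
#include <stdint.h>
#include <string.h>

#define MAC_NODE_LEN 37   /* 4-byte header + 32-byte address + 1-byte IfType */
#define END_NODE_LEN 4

/* Writes a MAC Address node and an End node into Out, which must hold at
   least MAC_NODE_LEN + END_NODE_LEN bytes. Returns the bytes written. */
size_t BuildMacDevicePath(uint8_t *Out, const uint8_t Mac[6], uint8_t IfType) {
    memset(Out, 0, MAC_NODE_LEN + END_NODE_LEN);
    Out[0]  = 0x03;           /* Type: Messaging Device Path */
    Out[1]  = 0x0B;           /* Sub-Type: MAC Address */
    Out[2]  = MAC_NODE_LEN;   /* Length, 16-bit little-endian */
    Out[3]  = 0;
    memcpy(&Out[4], Mac, 6);  /* the remaining 26 address bytes stay zero */
    Out[36] = IfType;         /* 1 = Ethernet */
    Out[37] = 0x7F;           /* Type: End of Hardware Device Path */
    Out[38] = 0xFF;           /* Sub-Type: End Entire Device Path */
    Out[39] = END_NODE_LEN;
    Out[40] = 0;
    return MAC_NODE_LEN + END_NODE_LEN;
}
```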
SNP Network Interface Start
Normally, the first driver method called by UEFI is the Start method. The UEFI Specification's description of this function is: “Changes the state of a network interface from ‘stopped’ to ‘started.’” That's all it says. One might think this involves initializing the device, but as will be seen in the next section, that's clearly not the case. My interpretation is that this function would enable the device if it requires enabling, and return a good/bad status. Internally, the “State” field in the Mode structure needs to be advanced from the Stopped to the Started state. Once in the Started state, it appears it would be legal for a different driver to perform the initialization, so very little can or should be done in this method.
SNP Network Interface Stop
This driver method is the inverse of the Start method above. It takes a network interface from the Started state back to the Stopped state.
SNP Network Interface Initialization
Immediately after the driver's Start method is called, the driver's Initialize method is called. This method's purpose is to reset the network interface and prepare it for operation. This is where the initialization that one might think belongs in the Start method really goes. If the device requires the allocation of buffer or descriptor rings, this is where that will be done. In the case of the LAN91x and LAN9118 drivers, the required buffers are resident in the device itself, and no additional buffers or descriptors are required. The drivers initiate soft resets of the devices, request auto-negotiation of link parameters, and finally enable the devices. If no problems are encountered, the interface State is advanced from Started to Initialized; this is the normal “online” state of the interface.
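The state transitions described so far can be sketched as a small state machine. The names here (NET_IF, IfStart, IfInitialize, IfStop, and the STATUS values) are hypothetical stand-ins rather than the real EFI types, and the error returns are assumptions about reasonable behavior, in particular the choice to refuse Stop while the interface is still Initialized.

```c
typedef enum { SnpStopped, SnpStarted, SnpInitialized } SNP_STATE;

/* Hypothetical status codes standing in for EFI_STATUS values. */
typedef enum {
    StatusSuccess, StatusAlreadyStarted, StatusNotStarted, StatusDeviceError
} STATUS;

typedef struct { SNP_STATE State; } NET_IF;

STATUS IfStart(NET_IF *If) {
    if (If->State != SnpStopped) return StatusAlreadyStarted;
    If->State = SnpStarted;            /* little else belongs here */
    return StatusSuccess;
}

STATUS IfInitialize(NET_IF *If) {
    if (If->State == SnpStopped) return StatusNotStarted;
    /* A real driver would soft-reset the device, allocate any rings it
       needs, request auto-negotiation, and enable the device here. */
    If->State = SnpInitialized;        /* the normal "online" state */
    return StatusSuccess;
}

STATUS IfStop(NET_IF *If) {
    if (If->State == SnpStopped) return StatusNotStarted;
    /* Assumption: the interface must leave Initialized before stopping. */
    if (If->State == SnpInitialized) return StatusDeviceError;
    If->State = SnpStopped;
    return StatusSuccess;
}
```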
SNP Network Interface Reset
This driver method appears to be intended to reset a malfunctioning network interface without requiring total reconfiguration. Analysis of the UEFI sources does not reveal any code that calls this method, though presumably a UEFI application could invoke it. No calls to it were observed during testing.
SNP Network Interface Shutdown
This driver method reverses the effect of the Initialize method. It is intended to quiesce the interface so that another driver may initialize it. It's not clear how the UEFI designers expect this to work, though. On successful completion the network interface is placed into the Started state. Normally a call to this method is followed by a call to the Stop method.
SNP Network Interface Receive Filter Configuration
This driver method is called to manage the MAC address filtering performed by the device and/or driver during packet reception. Its operation is rather confusing. The Enable and Disable parameters are interpreted as bitmasks indicating what classes of addresses should and should not be received; these classes are: Unicast, Multicast, Broadcast, Promiscuous, and Promiscuous-Multicast. This method also permits the installation (or replacement) of a list of MAC addresses to be received. Many devices are not capable of filtering specific MAC addresses, while others support only a limited list of addresses to be filtered and received. Thus the UEFI network stack permits the driver to be more permissive than specified, and the incorrectly received packets will be dropped in software.
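The bitmask handling can be sketched as below. The bit values match the EFI_SIMPLE_NETWORK_RECEIVE_* definitions as I understand them; the function names, the -1 error return, and the HW_MCAST_SLOTS limit are hypothetical. Letting Disable win when a bit appears in both masks is my reading of the specification, so treat it as an assumption.

```c
#include <stdint.h>

#define RECEIVE_UNICAST               0x01u
#define RECEIVE_MULTICAST             0x02u
#define RECEIVE_BROADCAST             0x04u
#define RECEIVE_PROMISCUOUS           0x08u
#define RECEIVE_PROMISCUOUS_MULTICAST 0x10u

/* Hypothetical number of exact-match multicast entries the device has. */
#define HW_MCAST_SLOTS 4u

/* Computes the new filter setting. Returns 0 on success, or -1 if a
   requested bit is not in the Supported mask. */
int UpdateReceiveFilters(uint32_t Supported, uint32_t Current,
                         uint32_t Enable, uint32_t Disable,
                         uint32_t *NewSetting) {
    if ((Enable | Disable) & ~Supported) return -1;
    *NewSetting = (Current | Enable) & ~Disable;   /* Disable wins ties */
    return 0;
}

/* What is actually programmed into the device: when the multicast list
   does not fit, receive all multicast and let software drop the excess. */
uint32_t HardwareFilters(uint32_t Setting, unsigned McastCount) {
    if ((Setting & RECEIVE_MULTICAST) && McastCount > HW_MCAST_SLOTS)
        Setting |= RECEIVE_PROMISCUOUS_MULTICAST;
    return Setting;
}
```

Keeping the requested setting separate from the programmed one lets the driver report what the caller asked for while the hardware runs in the more permissive mode.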
SNP Network Station Address Configuration
This driver method allows the caller to set the MAC address for the interface, or to request that the driver reset the MAC address to the permanent (ROM-based) MAC address.
SNP Network Statistics
This driver method allows the caller to retrieve, and optionally reset, the interface statistics. The driver is responsible for collecting the statistics defined in EFI_NETWORK_STATISTICS, and returning a copy of them on request.
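A simplified sketch of the retrieve-and-reset behavior follows. NET_STATS is a hypothetical four-counter subset of EFI_NETWORK_STATISTICS, the function names are invented for illustration, and the real method's buffer-size negotiation is omitted. Taking the snapshot before clearing is an assumption about the intended ordering.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of counters; EFI_NETWORK_STATISTICS defines many more. */
typedef struct {
    uint64_t RxFrames;
    uint64_t RxBytes;
    uint64_t TxFrames;
    uint64_t TxBytes;
} NET_STATS;

static NET_STATS DriverStats;   /* maintained by Transmit and Receive */

/* Called from the transmit path for every packet sent. */
void CountTransmit(uint64_t Bytes) {
    DriverStats.TxFrames++;
    DriverStats.TxBytes += Bytes;
}

/* Copies the counters to *Out when Out is non-NULL, then clears them
   when Reset is nonzero. */
void GetStatistics(int Reset, NET_STATS *Out) {
    if (Out != NULL) *Out = DriverStats;
    if (Reset) memset(&DriverStats, 0, sizeof DriverStats);
}
```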
SNP Multicast Address Translation
RFC 1112 defines the algorithmic translation from an IPv4 multicast address to an Ethernet multicast address; RFC 2464 does the same for IPv6. This driver method is called to perform this translation, thus keeping the higher-level code independent of the layer 2 protocol being used. The LAN91x and LAN9118 drivers each contain functions to implement this driver method. After the drivers were working I discovered that there exists a function in UEFI to perform this translation. I did not attempt to switch over to using it, but I would recommend using the common routine in future development.
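The translation itself is short. The sketch below uses the standard Ethernet mappings: for IPv4, the prefix 01:00:5E followed by the low 23 bits of the group address; for IPv6, the prefix 33:33 followed by the low 32 bits. The function names are hypothetical and are not the drivers' or UEFI's own routines.

```c
#include <stdint.h>

/* IPv4 group address (network byte order) to Ethernet multicast MAC. */
void McastIp4ToMac(const uint8_t Ip[4], uint8_t Mac[6]) {
    Mac[0] = 0x01;
    Mac[1] = 0x00;
    Mac[2] = 0x5E;
    Mac[3] = Ip[1] & 0x7F;   /* only the low 23 bits of the address survive */
    Mac[4] = Ip[2];
    Mac[5] = Ip[3];
}

/* IPv6 group address (network byte order) to Ethernet multicast MAC. */
void McastIp6ToMac(const uint8_t Ip[16], uint8_t Mac[6]) {
    Mac[0] = 0x33;
    Mac[1] = 0x33;
    Mac[2] = Ip[12];         /* low 32 bits of the address */
    Mac[3] = Ip[13];
    Mac[4] = Ip[14];
    Mac[5] = Ip[15];
}
```

For example, 239.255.255.250 maps to 01:00:5E:7F:FF:FA. Because bits are discarded, several IP groups share one MAC address, which is one more reason the upper layers must still filter in software.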
SNP Get Status
The SNP GetStatus method is one of the trickier methods to implement. It is called to poll the driver for media status changes (link up/down) and to handle transmit completions. When called for the latter purpose, the driver method is expected to return the buffer address of a completed transmit packet, with completions reported in FIFO order. This requires the driver to keep track of the original buffer addresses. Although SNP permits the driver to indicate that it cannot handle multiple simultaneous transmissions, in practice this restriction is not enforced and a new transmission may be requested before the status for the previous one has been retrieved. This has been seen where GRUBv2 has queued one packet for transmission, and ARP or the DHCP client queues a second. Thus it is required that the Transmit and GetStatus methods cooperate to maintain a queue that can support at least two simultaneous transmissions and potentially more. The LAN91x and LAN9118 drivers are written to support 15 outstanding transmissions, although the deepest queue depth observed was two packets.
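One way to keep track of the buffer addresses is a small ring buffer shared by Transmit and GetStatus. The sketch below is an illustration, not the drivers' actual data structure: TX_DONE_QUEUE and its functions are hypothetical, and the 16-slot ring is chosen so that 15 entries are usable, matching the depth mentioned above.

```c
#include <stddef.h>

#define TX_QUEUE_SLOTS 16   /* one slot is kept empty, so 15 are usable */

typedef struct {
    void    *Buf[TX_QUEUE_SLOTS];
    unsigned Head;          /* oldest entry, next to hand back */
    unsigned Tail;          /* next free slot */
} TX_DONE_QUEUE;

/* Called by Transmit once a packet has been handed to the device.
   Returns 0 on success or -1 when the queue is full. */
int TxQueuePush(TX_DONE_QUEUE *Q, void *Buffer) {
    unsigned Next = (Q->Tail + 1) % TX_QUEUE_SLOTS;
    if (Next == Q->Head) return -1;
    Q->Buf[Q->Tail] = Buffer;
    Q->Tail = Next;
    return 0;
}

/* Called by GetStatus. Returns the oldest completed buffer address,
   or NULL when nothing is pending. */
void *TxQueuePop(TX_DONE_QUEUE *Q) {
    void *Buffer;
    if (Q->Head == Q->Tail) return NULL;
    Buffer = Q->Buf[Q->Head];
    Q->Head = (Q->Head + 1) % TX_QUEUE_SLOTS;
    return Buffer;
}
```

A full queue is a natural point for Transmit to report that it is busy. Both functions modify shared state, so in a driver they would run with the task priority raised.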
The GetStatus method is also documented to retrieve the interrupt status of the network device. This status is reported as a bitmask with four bits defined: receive, transmit, command, and software. The specification states that when the InterruptStatus parameter is non-NULL, this bitmask is returned and the pending interrupts are cleared. Thus it is possible to dequeue an entry from the transmit complete list without clearing the pending interrupts. The upper-level code appears to call GetStatus passing either InterruptStatus or the TxBuf pointers, but not both at the same time. However, the content of the InterruptStatus bitmask appears to be totally ignored, supporting the hypothesis that this is being done for the benefit of the driver, not upper-level code. The LAN91x and LAN9118 drivers return interrupt status but do not clear the device status as this would interfere with normal driver operations.
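The handling of the two optional output parameters can be sketched as follows. The interrupt bit values match the EFI_SIMPLE_NETWORK_*_INTERRUPT definitions as I understand them. NET_DEV is a hypothetical device model that holds a single completed buffer for brevity, and the choice to report the bits without acknowledging them mirrors the behavior described above for the LAN91x and LAN9118 drivers.

```c
#include <stddef.h>
#include <stdint.h>

#define RECEIVE_INTERRUPT  0x01u
#define TRANSMIT_INTERRUPT 0x02u
#define COMMAND_INTERRUPT  0x04u
#define SOFTWARE_INTERRUPT 0x08u

/* Hypothetical device model for the sketch. */
typedef struct {
    uint32_t Pending;   /* pending causes, already translated to SNP bits */
    void    *DoneBuf;   /* oldest completed transmit buffer, or NULL */
} NET_DEV;

/* Either output pointer may be NULL, meaning "don't pass that item back". */
int GetStatus(NET_DEV *Dev, uint32_t *InterruptStatus, void **TxBuf) {
    if (InterruptStatus != NULL) {
        /* Report the status but leave the device state untouched. */
        *InterruptStatus = Dev->Pending;
    }
    if (TxBuf != NULL) {
        *TxBuf = Dev->DoneBuf;   /* NULL when nothing has completed */
        Dev->DoneBuf = NULL;     /* the completion is consumed by this call */
    }
    return 0;
}
```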
SNP Transmit
Upper-level code calls the driver Transmit method to initiate transmission of a packet on the network. This apparently simple activity is complicated by the strange set of parameters passed in. The Transmit method must accept packets in two different forms. In the first form, the HeaderSize parameter is non-zero. The DestAddr and SrcAddr parameters point to EFI_MAC_ADDRESS structures which contain the destination and source MAC addresses, and the Protocol parameter points to an integer containing the protocol code in host byte order (0x0800 for IPv4). The driver is responsible for assembling the L2 protocol header from these pieces. It appears that space for this header is reserved in the packet buffer, as the actual packet data begins HeaderSize bytes into the buffer provided. Because the LAN91x and LAN9118 drivers transfer data to the device under program control rather than DMA, they do not modify the packet buffer. This is the normal form of transmission.
In the second form, the HeaderSize parameter is zero. The packet buffer contains a fully-formed network packet ready for transmission. The DestAddr, SrcAddr, and Protocol parameters are not used, and likely are NULL. This form is used to transmit ARP and other unusual packet types. The Transmit method is also responsible for updating the network interface statistics related to packet transmission. This requires that the driver examine the content of the packets to determine if they are unicast, multicast, or broadcast packets, as well as checking for relevant error conditions. After a packet has been transmitted, the buffer address must be placed on a completion queue so that the GetStatus method can return it.
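Header assembly for the first form can be sketched as below. BuildEthernetHeader() is a hypothetical helper that writes the 14-byte header into a separate array, leaving the caller's packet buffer untouched. Substituting the interface's current address when SrcAddr is NULL follows my reading of the specification and should be treated as an assumption.

```c
#include <stdint.h>
#include <string.h>

#define ETH_HEADER_SIZE 14

/* Builds an Ethernet header from the pieces passed to Transmit.
   Returns 0 on success or -1 on invalid parameters. */
int BuildEthernetHeader(uint8_t Header[ETH_HEADER_SIZE], size_t HeaderSize,
                        const uint8_t *DestAddr, const uint8_t *SrcAddr,
                        const uint8_t *CurrentAddr, const uint16_t *Protocol) {
    if (HeaderSize != ETH_HEADER_SIZE || DestAddr == NULL || Protocol == NULL)
        return -1;
    if (SrcAddr == NULL) SrcAddr = CurrentAddr;  /* assumption, see lead-in */
    if (SrcAddr == NULL) return -1;
    memcpy(&Header[0], DestAddr, 6);
    memcpy(&Header[6], SrcAddr, 6);
    /* Protocol arrives in host byte order and goes on the wire big-endian. */
    Header[12] = (uint8_t)(*Protocol >> 8);
    Header[13] = (uint8_t)(*Protocol & 0xFF);
    return 0;
}
```

Writing the two protocol bytes explicitly avoids any dependence on the host's endianness. In the second form (HeaderSize of zero) this step is skipped and the buffer is sent as is.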
SNP Receive
Upper-level code calls the driver Receive method to retrieve a packet from the network interface. If one is available, the content of the packet is copied to the caller-provided buffer; the copied packet includes the L2 protocol headers. If no packets have been received, the method returns EFI_NOT_READY; it does not block. If the HeaderSize parameter is non-NULL, the size of the L2 protocol header (ETHER_HEAD) is passed back. If the SrcAddr, DestAddr, or Protocol parameters are non-NULL, the driver copies the appropriate parts of the L2 protocol header to the provided spaces. Note that the Protocol is converted from network to host byte order in the process. The Receive method is also responsible for updating the network interface statistics related to packet reception. This requires that the driver examine the content of the packets to determine if they are unicast, multicast, or broadcast packets, as well as checking for relevant error conditions.
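The header handling on the receive side can be sketched as follows. ParseEthernetHeader() and ClassifyDest() are hypothetical helpers: the first fills in whichever optional outputs the caller supplied, and the second sorts a destination address into the three classes needed for the statistics.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { FrameUnicast, FrameMulticast, FrameBroadcast } FRAME_CLASS;

/* Classifies a destination MAC address for the statistics counters. */
FRAME_CLASS ClassifyDest(const uint8_t Dest[6]) {
    static const uint8_t Bcast[6] = {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
    if (memcmp(Dest, Bcast, 6) == 0) return FrameBroadcast;
    /* The low bit of the first octet marks a group (multicast) address. */
    return (Dest[0] & 0x01) ? FrameMulticast : FrameUnicast;
}

/* Copies out the parts of the Ethernet header the caller asked for.
   Every output pointer is optional. */
void ParseEthernetHeader(const uint8_t *Frame, size_t *HeaderSize,
                         uint8_t *SrcAddr, uint8_t *DestAddr,
                         uint16_t *Protocol) {
    if (HeaderSize != NULL) *HeaderSize = 14;
    if (DestAddr != NULL) memcpy(DestAddr, &Frame[0], 6);
    if (SrcAddr != NULL) memcpy(SrcAddr, &Frame[6], 6);
    /* Network (big-endian) to host byte order. */
    if (Protocol != NULL) *Protocol = (uint16_t)((Frame[12] << 8) | Frame[13]);
}
```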
Flaws in the design
There are flaws apparent in the Simple Network Protocol design. The primary flaw is in how packets are queued for transmission and their completion is reported. The documentation for the Transmit function states:
The Transmit() function performs nonblocking I/O. A caller who wants to perform blocking I/O, should call Transmit(), and then GetStatus() until the transmitted buffer shows up in the recycled transmit buffer.
The flaw here is that if two or more applications are using the network interface, the completions for one application may be consumed by another. This flaw can be seen at work during long TFTP transfers initiated by GRUBv2. During the transfer, ARP and the DHCP client also insert packets into the transmission stream. Each application then polls for completions, and discards those intended for the others. The result is unnecessary delays and retransmissions, as the application that lost the completion notification has to recover from the situation.
LEG/ServerArchitecture/UEFI/UEFI_Network_Driver (last modified 2017-08-17 12:12:52)