|Return to 4.me CVS log||Up to [cvs.NetBSD.org] / src / share / doc / papers / bus_dma|
|File: [cvs.NetBSD.org] / src / share / doc / papers / bus_dma / 4.me (download)
Revision 1.2, Wed Feb 5 00:02:27 2003 UTC (18 years, 2 months ago) by perry
"Utilize" has exactly the same meaning as "use," but it is more difficult to read and understand. Most manuals of English style therefore say that you should use "use".
.\" $NetBSD: 4.me,v 1.2 2003/02/05 00:02:27 perry Exp $ .\" .\" Copyright (c) 1998 Jason R. Thorpe. .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. All advertising materials mentioning features or use of this software .\" must display the following acknowledgements: .\" This product includes software developed for the NetBSD Project .\" by Jason R. Thorpe. .\" 4. The name of the author may not be used to endorse or promote products .\" derived from this software without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR .\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES .\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. .\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, .\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, .\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; .\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED .\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, .\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .sh 1 "Implementation of \fIbus_dma\fB in NetBSD/alpha and NetBSD/i386" .pp This section is a description of the \fIbus_dma\fR implementation in two NetBSD ports, NetBSD/alpha and NetBSD/i386. It is presented as a side-by-side comparison in order to give the reader a better feel for the types of details that are being abstracted by the interface. .sh 2 "Platform requirements" .pp NetBSD/alpha currently supports six implementations of the PCI bus, each of which implement DMA differently. In order to understand the design approach for NetBSD/alpha's fairly complex \fIbus_dma\fR implementation, it is necessary to understand the differences between the bus adapters. While some of these adapters have similar descriptions and features, the software interface to each one is quite different. (In addition to PCI, NetBSD/alpha also supports two TurboChannel DMA implementations on the DEC 3000 models. For simplicity's sake, we will limit the discussion to the PCI and related busses.) .pp The first PCI implementation to be supported by NetBSD/alpha was the DECchip 21071/21072 (APECS)\*#. It is designed to be used with the DECchip 21064 (EV4) and 21064A (EV45) processors. Systems in which this PCI host bus adapter is found include the AlphaStation 200, AlphaStation 400, and AlphaPC 64 systems, as well as some AlphaVME systems. The APECS supports up to two DMA windows, which may be configured for direct-mapped or scatter-gather-mapped operation, and uses host RAM for scatter-gather page tables. .(d .pp \*# Digital Equipment Corporation, .ul DECchip 21071 and DECchip 21072 Core Logic Chipsets Data Sheet, DEC order number EC-QAEMA-TE, November 1994. .)d .pp The second PCI implementation to be supported by NetBSD/alpha was the built-in I/O controller found on the DECchip 21066\*# and DECchip 21068 family of Low Cost Alpha (LCA) processors. This processor family was used in the AXPpci33 and Multia AXP systems, as well as some AlphaVME systems. The LCA supports up to two DMA windows, which may be configured for direct-mapped or scatter-gather-mapped operation, and uses host RAM for scatter-gather page tables. .(d .pp \*# Digital Equipment Corporation, .ul DECchip 21066 Alpha AXP Microprocessor Data Sheet, DEC order number EC-N0617-72, May 1994. .)d .pp The third PCI implementation to be supported by NetBSD/alpha was the DECchip 21171 (ALCOR)\*#, 21172 (ALCOR2), and 21174 (Pyxis)\**. .(f \**While these chipsets are somewhat different from one another, the software interface is similar enough that they share a common device driver in the NetBSD/alpha kernel. .)f These PCI host bus adapters are found in systems based on the DECchip 21164 (EV5), 21164A (EV56), and 21164PC (PCA56) processors, including the AlphaStation 500, AlphaStation 600, and AlphaPC 164, and Digital Personal Workstation. The ALCOR, ALCOR2, and Pyxis support up to four DMA windows, which may be configured for direct-mapped or scatter-gather-mapped operation, and uses host RAM for scatter-gather page tables. .(d .pp \*# Digital Equipment Corporation, .ul DECchip 21171 Core Logic Chipset Technical Reference Manual, DEC order number EC-QE18B-TE, September 1995. .)d .pp The fourth PCI implementation to be supported by NetBSD/alpha was the Digital DWLPA/DWLPB\*#. This is a TurboLaser-to-PCI\** bridge found on AlphaServer 8200 and 8400 systems. The bridge is connected to the TurboLaser system bus via a KFTIA (internal) or KFTHA (external) I/O adapter. The former supports one built-in and one external DWLPx. The latter supports up to four external DWLPxs. Multiple I/O adapters may be present on the TurboLaser system bus. Each DWLPx supports up to four primary PCI busses and has three DMA windows which may be configured for direct-mapped or scatter-gather-mapped DMA. These three windows are shared by all PCI busses attached to the DWLPx. The DWLPx does not use host RAM for scatter-gather page tables. Instead, the DWLPx uses on-board SRAM, which must be shared by all PCI busses attached to the DWLPx. This is because the store-and-forward architecture of these systems would cause latency on DMA page table access to be too high. The DWLPA has 32K of page table SRAM and the DWLPB has 128K. Since the DWLPx can snoop access to the page table SRAM, no explicit scatter-gather TLB invalidation is necessary on this PCI implementation. .(f \**"TurboLaser" is the name of the system bus on the AlphaServer 8200 and 8400 systems. .)f .(d .pp \*# Digital Equipment Corporation, .ul DWLPA and DWLPB PCI Adapter Technical Manual, DEC order number EK-DWLPX-TM, July 1996. .)d .pp The fifth PCI implementation to be supported by NetBSD/alpha was the A12C PCI bus on the Avalon A12 Scalable Parallel Processor\*#. This PCI bus is a secondary I/O bus\**, has only a single PCI slot in mezzanine form-factor, and is used solely for Ethernet I/O. .(d .pp \*# H. Ross Harvey, .ul Avalon A12 Parallel Supercomputer Theory of Operation, Avalon Computer Systems, Inc., October 1997. .)d .(f \**The primary I/O bus on the A12 is a crossbar, which is used to communicate with other nodes in the parallel processor. .)f This PCI bus is not able to directly access host RAM. Instead, devices DMA to and from a 128K SRAM buffer. This is, in essence, a hardware implementation of DMA bouncing. This is not considered a limitation of the architecture given the target application of the A12 system (parallel computation applications which communicate via MPI\** over the crossbar). .(f \**MPI, or the Message Passing Interface, is a standardized API for passing data and control within a parallel program. .)f .pp The sixth PCI implementation to be supported by NetBSD/alpha was the MCPCIA MCBUS-to-PCI bridge found on the AlphaServer 4100 (Rawhide) systems. The Rawhide architecture is made up of a "horse" (the central backplane) and two "saddles" (primary PCI bus adapters on either side of the backplane). The saddles may also contain EISA bus adapters. Each MCPCIA has four DMA windows which may be configured for direct-mapped or scatter-gather-mapped operation, and uses host RAM for scatter-gather page tables. .pp In sharp contrast to the Alpha, the i386 platform has a very simple PCI implementation; the PCI bus is capable of addressing the entire 32-bit physical address space of the PC architecture, and, in general, all PCI host bus adapters are software compatible. The i386 platform also has WYSIWYG DMA, so no window translations are necessary. The i386 platform, however, must contend with DMA bouncing on the ISA bus, due to ISA's 24-bit address limitation and lack of scatter-gather-mapped DMA. .sh 2 "Data structures" .pp The DMA tags used by NetBSD/alpha and NetBSD/i386 are very similar. Both contain thirteen function pointers for the thirteen functional methods in the \fIbus_dma\fR interface. The NetBSD/alpha DMA tag, however, also has a function pointer used to obtain the DMA tag for children of the primary I/O bus and an opaque cookie to be interpreted by the low-level implementation of these methods. .pp The opaque cookie used by NetBSD/alpha's DMA tag is a pointer to the chipset's statically-allocated state information. This state information includes one or more \fIalpha_sgmap\fR structures. The \fIalpha_sgmap\fR contains all of the state information for a single DMA window to perform scatter-gather-mapped DMA, including pointers to the scatter-gather page table, the \fIextent map\fR\** that manages the page table, and the DMA window base. .(f \**An \fIextent map\fR is a data structure which manages an arbitrary number range, providing several resource allocation primitives. NetBSD has a general-purpose extent map manager which is used by several kernel subsystems. .)f .pp The DMA map structure contains all of the parameters used to create the map. (This is a fairly standard practice among all current implementations of the \fIbus_dma\fR interface.) In addition to the creation parameters, the two implementations contain additional state variables specific to their particular DMA quirks. For example, the NetBSD/alpha DMA map contains several state variables related to scatter-gather-mapped DMA. The i386 port's DMA map, on the other hand, contains a pointer to a map-specific cookie. This cookie holds state information for ISA DMA bouncing. This state is stored in a separate cookie because DMA bouncing is far less common on the i386 then scatter-gather-mapped DMA is on the Alpha, since the Alpha must also do scatter-gather-mapped DMA for PCI if the system has a large amount of physical memory. .pp In both the NetBSD/alpha and NetBSD/i386 \fIbus_dma\fR implementations, the DMA segment structure contains only the public members defined by the interface. .sh 2 "Code structure" .pp Both the NetBSD/alpha and NetBSD/i386 \fIbus_dma\fR implementations use a simple inheritance scheme for code reuse. This is achieved by allowing the chipset- or bus-specific code layers (i.e. the "master" layers) to assemble the DMA tag. When the tag is assembled, the master layer inserts its own methods in the function pointer slots where special handling at that layer is required. For those methods which do not require special handling, the slots are initialized with pointers to common code. .pp The Alpha \fIbus_dma\fR code is broken down into four basic categories: chipset-specific code, code that implements common direct-mapped operations, code that implements common scatter-gather-mapped operations, and code that implements operations common to both direct-mapped and scatter-gather-mapped DMA. Some of the common functions are not called directly via the tag's function switch. These functions are helper functions, and are for use only by chipset front-ends. An example of such a helper is the set of common direct-mapped DMA load functions. These functions take all of the same arguments as the interface-defined methods, plus an extra argument: the DMA window's base DMA address. .pp The i386 \fIbus_dma\fR implementation, on the other hand, is broken down into three basic categories: common implementations of \fIbus_dma\fR methods, common helper functions, and ISA DMA front-ends\**. .(f \**ISA is currently the only bus supported by NetBSD/i386 with special DMA requirements. This may change in future versions of the system. .)f All of the common interface methods may be called directly from the DMA tag's function switch. Both the PCI and EISA DMA tags use this feature; they provide no bus-specific DMA methods. The ISA DMA front-ends provide support for DMA bouncing if the system has more than 16MB of physical memory. If the system has 16MB of physical memory or less, no DMA bouncing is required, and the ISA DMA front-ends simply redirect the \fIbus_dma\fR function calls to the common implementation. .sh 2 "Autoconfiguration" .pp The NetBSD kernel's autoconfiguration system employs a depth-first traversal of the nodes (devices) in the device tree. This process is started by machine-dependent code telling the machine-independent autoconfiguration framework that it has "found" the root "bus". In the two platforms described here, this root bus, called \fImainbus\fR, is a virtual device; it does not directly correspond to any physical bus in the system. The device driver for \fImainbus\fR is implemented in machine-dependent code. This driver's responsibility is to configure the primary I/O bus or busses. .pp In NetBSD/alpha, the chipset which implements the primary I/O bus is considered to be the primary I/O bus by the \fImainbus\fR layer. Platform-specific code specifies the name of the chipset, and the \fImainbus\fR driver configures it by "finding" it. When the chipset's device driver is attached, it initializes its DMA windows and data structures. Once this is complete, it "finds" the primary PCI bus or busses logically attached to the chipset, and passes the DMA tag for these busses down to the PCI bus device driver. This driver in turn finds and configures each device on the PCI bus, and so on. .pp In the event that the PCI bus driver encounters a PCI-to-PCI bridge (PPB), the DMA tag is passed unchanged to the PPB device driver, which in turn passes it to the secondary PCI bus instance attached to the other side of the bridge. However, intervention by machine-dependent code is required if the PCI bus driver encounters a bridge to a different bus type, such as EISA or ISA; this bus may require a different DMA tag. For this reason, all PCI-to-<other bus> bridge (PCxB) drivers are implemented in machine-dependent code. While the PCxB drivers could be implemented in machine-independent code using machine-dependent hooks to obtain DMA tags, this is not done as the secondary bus may require special machine-dependent interrupt setup and routing. Once all of the call-backs to handle the machine-dependent bus transition details were implemented, the amount of code that would be shared would hardly be worth the effort. .pp When a device driver is associated with a particular hardware device that the bus driver has found, it is given several pieces of information needed to initialize and communicate with the device. One of these pieces of information is the DMA tag. If the driver wishes to perform DMA, it must remember this tag, which, as noted previously, is used in every call to the \fIbus_dma\fR interface. .pp While the procedure for configuring busses and devices is essentially identical to the NetBSD/alpha case, NetBSD/i386 configures the primary I/O busses quite differently. The PC platform was designed from the ground up around the ISA bus. EISA and PCI are, in many ways, very similar to ISA from a device driver's perspective. All three have the concept of I/O-mapped\** and memory-mapped space. The hardware and firmware in PCs typically map these busses in such a way that initialization of the bus's adapter by operating system software is not necessary. For this reason, it is possible to consider PCI, EISA, and ISA to all be primary I/O busses, from the autoconfiguration perspective. .(f \**I/O-mapped space is accessed with special instructions on Intel processors. .)f .pp The NetBSD/i386 \fImainbus\fR driver configures the primary I/O busses in order of descending priority: PCI first, then EISA, and finally, ISA. The \fImainbus\fR driver has direct access to each bus's DMA tags, and passes them down to the I/O bus directly. In the case of EISA and ISA, the \fImainbus\fR layer only attempts to configure these busses if they were not found during the PCI bus configuration phase; NetBSD/i386, as a matter of correctness, identifies PCI-to-EISA (PCEB) and PCI-to-ISA (PCIB) bridges, and assigns autoconfiguration nodes in the device tree to them. The EISA and ISA busses are logically attached to these nodes, in a way very similar to that of NetBSD/alpha. The bridge drivers also have direct access to the bus's DMA tags, and pass them down to the I/O bus accordingly. .sh 2 "Example of underlying operation" .pp This subsection describes the operation of the machine-dependent code which implements the \fIbus_dma\fR interface as used by a device driver for a hypothetical DES encryption card. While this is not the original application of \fIbus_dma\fR, it provides an example which is much easier to understand; the application for which the interface was developed is a high-performance hierarchical mass storage system, the details of which are overwhelming. .pp Not all of the details of a NetBSD device driver will be described here, but rather only those details which are important within the scope of DMA. .pp For the purpose of our example, the card comes in both PCI and ISA models. Since we describe two platforms, there are four permutations of actual examples. They will be tagged with the following indicators: .(b .b [Alpha/ISA] [Alpha/PCI] [i386/ISA] [i386/PCI] .r .)b .pp We will assume that the \fB[i386/ISA]\fR platform has more than 16MB of RAM, so transfers might have to be bounced if DMA-safe memory is not used explicitly. We will also assume that the direct-mapped DMA window on the \fB[Alpha/PCI]\fR platform is capable of addressing all of system RAM. .pp Please note that in the description of map synchronization, only cases which require special handing will be described. In both the \fB[Alpha/ISA]\fR and \fB[Alpha/PCI]\fR cases, all synchronizations cause the CPU's write buffer to be drained using the Alpha's \fImb\fR\*# instruction. All synchronizations in the \fB[i386/PCI]\fR case are no-ops, as are synchronizations of DMA-safe memory in the \fB[i386/ISA]\fR case. .(d .pp \*# Richard L. Sites and Richard T. Witek, .ul Alpha AXP Architecture Reference Manual, Second Edition, Digital Press, 1995. .)d .sh 3 "Hardware overview" .pp The card is a bus master, and operates by reading a fixed-length command block via DMA. There are three commands: \fBSET KEY\fR, \fBENCRYPT\fR, and \fBDECRYPT\fR. Commands are initiated by filling in the command block, and writing the DMA address of the command block to the card's \fIdmaAddr\fR register. The command block contains 6 32-bit words: \fIcbCommand\fR, \fIcbStatus\fR, \fIcbInAddr\fR, \fIcbInCount\fR, \fIcbOutAddr\fR, and \fIcbOutCount\fR. The \fIcbInAddr\fR and \fIcbOutAddr\fR members are the DMA addresses of software scatter-gather lists used by the card's DMA engine. The \fIcbInCount\fR and \fIcbOutCount\fR members are the number of scatter-gather entries in their respective lists. Each scatter-gather entry contains a DMA address and a length, both 32-bit words. .pp When the card processes a request, it reads the command block via DMA. It then examines the command block to determine which action to take. In the case of all three supported commands, it reads the input scatter-gather list at DMA address \fIcbInAddr\fR for length \fIcbInCount * 8\fR. It then switches the input to the appropriate processing engine. In the case of the \fBSET KEY\fR command, the scatter-gather list is used to DMA the DES key into SRAM located on the card. For all other commands, the input is directed at the pipelined DES engine, switched into either encrypt or decrypt mode. The DES engine then reads the output scatter-gather list specified by \fIcbOutAddr\fR for \fIcbOutCount * 8\fR bytes. Once the DES engine has all of the DMA addresses, it then begins the cycle of input-process-output until all data has been consumed. Once any command is finished, a status word is written to \fIcbStatus\fR, and an interrupt is delivered to the host. The driver software must read this word to determine if the command completed successfully. .sh 3 "Device driver overview" .pp The device driver for this DES card provides \fIopen()\fR, \fIclose()\fR, and \fIioctl()\fR entry points. The driver uses DMA to the user address space for high performance. When a user issues a request via the ioctl corresponding to the requested operation, the driver places it on a work queue. The \fIioctl()\fR system call returns immediately, allowing the application to run or block via \fIsigsuspend()\fR. If the card is currently idle, the driver immediately issues the command to the card. When the job is finished, the card interrupts, and the driver notifies the user that the request has completed via the \fBSIGIO\fR signal. If there are more jobs on the work queue, the next job is removed from the queue and started, until there are no more jobs. .sh 3 "Driver initialization" .pp When the driver instance is created (attached), it must create and initialize the data structures necessary for operation. This driver uses multiple DMA maps: one for the control structures (control block and scatter-gather lists), and many for data submitted by user requests. The data maps are kept in the driver job queue entries, which are created when jobs are submitted. .pp Next the driver must allocate DMA-safe memory for the control structures. The driver will allocate three pages of memory via \fIbus_dmamem_alloc\fR(). For simplicity, the driver will request a single memory segment. For all platforms and busses in this example, this operation simply calls a function in the virtual memory system that allocates memory with the requested constraints. In the \fB[i386/ISA]\fR case, the ISA layer inserts itself into the call graph to specify a range of 0 - 16MB. All other cases simply specify the entire present physical memory range. .pp A small piece of this memory will be used for the command block. The rest of the memory will be divided evenly between the two scatter-gather lists. This memory is then mapped into kernel virtual address space using \fIbus_dmamem_map()\fR with the \fBBUS_DMA_COHERENT\fR flag, and the kernel pointers to the three structures are initialized. When the memory is mapped on the i386, the \fBBUS_DMA_COHERENT\fR flag causes the cache-inhibit bits to be set in the PTEs. No special handing of this flag is required on the Alpha. However, in the Alpha case, since there is only a single segment, the memory is mapped via the Alpha's direct-mapped kernel segment; no use of kernel virtual address space is required. .pp Finally, the driver loads the control structure DMA map by passing the kernel virtual address of the memory to \fIbus_dmamap_load()\fR. To make it easier to start transactions, the driver caches the DMA addresses of the various control structures (by adding their offsets to the memory's DMA address). In all cases, the underlying load function steps through each page in the virtual address range, extracting the physical address from the pmap module and compacting the segments where possible. Since the memory was allocated as a single segment, it maps to a single DMA segment. .sh 3 "Example transaction" .pp Let's suppose that the user has already set the key, and now wishes to use it to encrypt a data buffer. The calling program packages up the request, providing a pointer to the input buffer, output buffer, and status word, all in user space, and issues the "encrypt buffer" ioctl. .pp Upon entry into the kernel, the driver locks the user's buffer to prevent the data from being paged out while the DMA is in progress. A job queue entry is allocated, and two DMA maps are created for the job queue entry, one for the input buffer and one for the output buffer. In all cases, this allocates the standard DMA map structure. In the \fB[i386/ISA]\fR case, an ISA DMA cookie for each map is also allocated. .pp Once the queue entry has been allocated, it must be initialized. The first step in this process is to load the DMA maps for the input and output buffers. Since this process is essentially identical for input and output, only the actions for the input buffer's map are described here. .pp On \fB[Alpha/PCI]\fR and \fB[i386/PCI]\fR, the underlying code traverses the user's buffer, extracting the physical addresses for each page. For \fB[Alpha/PCI]\fR, the DMA window base is added to this address. The address and length of the segment are placed into the map's DMA segment list. Segments are concatenated when possible. .pp On \fB[Alpha/ISA]\fR, a very similar process occurs. However, rather than placing the physical addresses into the map's segment list, some scatter-gather-mapped DMA address space is allocated and the addresses placed into the corresponding page table entries. Once this process is complete, a single DMA segment is placed in the map's segment list, indicating the beginning of the scatter-gather-mapped area. .pp The \fB[i386/ISA]\fR case also traverses the user's buffer, but twice. In the first pass, the buffer is checked to ensure that it does not have any pages above the 16MB threshold. If it does not, then the procedure is identical to the \fB[i386/PCI]\fR case. However, for the sake of example, the buffer has pages outside the threshold so the transfer must be bounced. At this point, a bounce buffer is allocated. Since we are still in the process's context, this allocation may block. A pointer to the bounce buffer is stored in the ISA DMA cookie, and the physical address of the bounce buffer is placed in the map's segment list. .pp The next order of business is to enqueue or begin the transfer. To keep the example simple, we will assume that no other transfers are pending. The first step in this process is to initialize the control block with the cached DMA addresses of the card's scatter-gather lists. These lists are also initialized with the contents of the DMA maps' segment list. Before we tell the card to begin transferring data, we must synchronize the DMA maps. .pp The first map to be synchronized is the input buffer map. This is a \fBPREWRITE\fR operation. In the \fB[i386/ISA]\fR case, the user's buffer is copied from the user's address space into the bounce buffer\**. .(f \**This is not currently implemented, as it required substantial changes to the virtual memory system. This is because the \fIcopyin()\fR and \fIcopyout()\fR functions only operate on the current process's context, which may not be available at the time of the bounce. Those changes to the virtual memory system have now been made, so support for bouncing to and from user space will appear in a future release of NetBSD. Support for bouncing from kernel space is currently supported, however. .)f The next map to be synchronized is the output buffer map. This is a \fBPREREAD\fR operation. Finally, the control map is synchronized. Since the status will be read back from the control block after the transaction is complete, this synchronization is a \fBPREREAD|PREWRITE\fR. .pp At this point the DMA transaction may occur. The card is started by writing the cached DMA address of the control block into the card's \fIdmaAddr\fR register. The driver returns to user space, and the process waits for the signal indicating that the transaction has completed. .pp Once the transaction has completed, the card interrupts the host. The interrupt handler is now responsible for finishing the DMA sequence and notifying the requesting process that the operation is complete. .pp The first task to perform is to synchronize the input buffer map. This is a \fBPOSTWRITE\fR. Next we synchronize the output buffer map. This is a \fBPOSTREAD\fR. In the \fB[i386/ISA]\fR case, the contents of the output bounce buffer are copied to the user's buffer\**. Finally, we synchronize the control map. This is a \fBPOSTREAD|POSTWRITE\fR. .(f \**The same caveat applies here as to the \fB[i386/ISA]\fR \fBPREWRITE\fR case for the input map. .)f .pp Now that the DMA maps have been synchronized, they must be unloaded. In the \fB[Alpha/PCI]\fR and \fB[i386/PCI]\fR cases, there are no resources to be freed; the mapping is simply marked invalid. In the \fB[Alpha/ISA]\fR case, the scatter-gather-mapped DMA resources are released. In the \fB[i386/ISA]\fR case, the bounce buffer is freed. .pp Since the user's buffer is no longer in use, it is unlocked by the device driver. Now the process may be signaled that I/O has completed. The last task to perform is to destroy the input and output buffer DMA maps and the job queue entry.