[BACK]Return to 4.me CVS log [TXT][DIR] Up to [cvs.NetBSD.org] / src / share / doc / papers / bus_dma

File: [cvs.NetBSD.org] / src / share / doc / papers / bus_dma / 4.me (download)

Revision 1.2, Wed Feb 5 00:02:27 2003 UTC (18 years, 2 months ago) by perry
Branch: MAIN
CVS Tags: yamt-pf42-baseX, yamt-pf42-base4, yamt-pf42-base3, yamt-pf42-base2, yamt-pf42-base, yamt-pf42, yamt-pagecache-tag8, yamt-pagecache-base9, yamt-pagecache-base8, yamt-pagecache-base7, yamt-pagecache-base6, yamt-pagecache-base5, yamt-pagecache-base4, yamt-pagecache-base3, yamt-pagecache-base2, yamt-pagecache-base, yamt-pagecache, wrstuden-revivesa-base-3, wrstuden-revivesa-base-2, wrstuden-revivesa-base-1, wrstuden-revivesa-base, wrstuden-revivesa, wrstuden-fixsa-newbase, wrstuden-fixsa-base-1, wrstuden-fixsa-base, wrstuden-fixsa, tls-maxphys-base, tls-maxphys, tls-earlyentropy-base, tls-earlyentropy, riastradh-xf86-video-intel-2-7-1-pre-2-21-15, riastradh-drm2-base3, riastradh-drm2-base2, riastradh-drm2-base1, riastradh-drm2-base, riastradh-drm2, prg-localcount2-base3, prg-localcount2-base2, prg-localcount2-base1, prg-localcount2-base, prg-localcount2, phil-wifi-base, phil-wifi-20200421, phil-wifi-20200411, phil-wifi-20200406, phil-wifi-20191119, phil-wifi-20190609, phil-wifi, pgoyette-localcount-base, pgoyette-localcount-20170426, pgoyette-localcount-20170320, pgoyette-localcount-20170107, pgoyette-localcount-20161104, pgoyette-localcount-20160806, pgoyette-localcount-20160726, pgoyette-localcount, pgoyette-compat-merge-20190127, pgoyette-compat-base, pgoyette-compat-20190127, pgoyette-compat-20190118, pgoyette-compat-1226, pgoyette-compat-1126, pgoyette-compat-1020, pgoyette-compat-0930, pgoyette-compat-0906, pgoyette-compat-0728, pgoyette-compat-0625, pgoyette-compat-0521, pgoyette-compat-0502, pgoyette-compat-0422, pgoyette-compat-0415, pgoyette-compat-0407, pgoyette-compat-0330, pgoyette-compat-0322, pgoyette-compat-0315, pgoyette-compat, perseant-stdc-iso10646-base, perseant-stdc-iso10646, netbsd-9-base, netbsd-9-1-RELEASE, netbsd-9-0-RELEASE, netbsd-9-0-RC2, netbsd-9-0-RC1, netbsd-9, netbsd-8-base, netbsd-8-2-RELEASE, netbsd-8-1-RELEASE, netbsd-8-1-RC1, netbsd-8-0-RELEASE, netbsd-8-0-RC2, netbsd-8-0-RC1, netbsd-8, netbsd-7-nhusb-base-20170116, netbsd-7-nhusb-base, netbsd-7-nhusb, netbsd-7-base, netbsd-7-2-RELEASE, netbsd-7-1-RELEASE, netbsd-7-1-RC2, netbsd-7-1-RC1, netbsd-7-1-2-RELEASE, netbsd-7-1-1-RELEASE, netbsd-7-1, netbsd-7-0-RELEASE, netbsd-7-0-RC3, netbsd-7-0-RC2, netbsd-7-0-RC1, netbsd-7-0-2-RELEASE, netbsd-7-0-1-RELEASE, netbsd-7-0, netbsd-7, netbsd-6-base, netbsd-6-1-RELEASE, netbsd-6-1-RC4, netbsd-6-1-RC3, netbsd-6-1-RC2, netbsd-6-1-RC1, netbsd-6-1-5-RELEASE, netbsd-6-1-4-RELEASE, netbsd-6-1-3-RELEASE, netbsd-6-1-2-RELEASE, netbsd-6-1-1-RELEASE, netbsd-6-1, netbsd-6-0-RELEASE, netbsd-6-0-RC2, netbsd-6-0-RC1, netbsd-6-0-6-RELEASE, netbsd-6-0-5-RELEASE, netbsd-6-0-4-RELEASE, netbsd-6-0-3-RELEASE, netbsd-6-0-2-RELEASE, netbsd-6-0-1-RELEASE, netbsd-6-0, netbsd-6, netbsd-5-base, netbsd-5-2-RELEASE, netbsd-5-2-RC1, netbsd-5-2-3-RELEASE, netbsd-5-2-2-RELEASE, netbsd-5-2-1-RELEASE, netbsd-5-2, netbsd-5-1-RELEASE, netbsd-5-1-RC4, netbsd-5-1-RC3, netbsd-5-1-RC2, netbsd-5-1-RC1, netbsd-5-1-5-RELEASE, netbsd-5-1-4-RELEASE, netbsd-5-1-3-RELEASE, netbsd-5-1-2-RELEASE, netbsd-5-1-1-RELEASE, netbsd-5-1, netbsd-5-0-RELEASE, netbsd-5-0-RC4, netbsd-5-0-RC3, netbsd-5-0-RC2, netbsd-5-0-RC1, netbsd-5-0-2-RELEASE, netbsd-5-0-1-RELEASE, netbsd-5-0, netbsd-5, netbsd-4-base, netbsd-4-0-RELEASE, netbsd-4-0-RC5, netbsd-4-0-RC4, netbsd-4-0-RC3, netbsd-4-0-RC2, netbsd-4-0-RC1, netbsd-4-0-1-RELEASE, netbsd-4-0, netbsd-4, netbsd-3-base, netbsd-3-1-RELEASE, netbsd-3-1-RC4, netbsd-3-1-RC3, netbsd-3-1-RC2, netbsd-3-1-RC1, netbsd-3-1-1-RELEASE, netbsd-3-1, netbsd-3-0-RELEASE, netbsd-3-0-RC6, netbsd-3-0-RC5, netbsd-3-0-RC4, netbsd-3-0-RC3, netbsd-3-0-RC2, netbsd-3-0-RC1, netbsd-3-0-3-RELEASE, netbsd-3-0-2-RELEASE, netbsd-3-0-1-RELEASE, netbsd-3-0, netbsd-3, netbsd-2-base, netbsd-2-1-RELEASE, netbsd-2-1-RC6, netbsd-2-1-RC5, netbsd-2-1-RC4, netbsd-2-1-RC3, netbsd-2-1-RC2, netbsd-2-1-RC1, netbsd-2-1, netbsd-2-0-base, netbsd-2-0-RELEASE, netbsd-2-0-RC5, netbsd-2-0-RC4, netbsd-2-0-RC3, netbsd-2-0-RC2, netbsd-2-0-RC1, netbsd-2-0-3-RELEASE, netbsd-2-0-2-RELEASE, netbsd-2-0-1-RELEASE, netbsd-2-0, netbsd-2, mjf-devfs2-base, mjf-devfs2, matt-premerge-20091211, matt-nb8-mediatek-base, matt-nb8-mediatek, matt-nb6-plus-nbase, matt-nb6-plus-base, matt-nb6-plus, matt-nb5-pq3-base, matt-nb5-pq3, matt-nb5-mips64-u2-k2-k4-k7-k8-k9, matt-nb5-mips64-u1-k1-k5, matt-nb5-mips64-premerge-20101231, matt-nb5-mips64-premerge-20091211, matt-nb5-mips64-k15, matt-nb5-mips64, matt-nb4-mips64-k7-u2a-k9b, matt-mips64-premerge-20101231, matt-mips64-base2, matt-mips64-base, matt-mips64, matt-armv6-prevmlocking, matt-armv6-nbase, matt-armv6-base, matt-armv6, localcount-20160914, keiichi-mipv6-nbase, keiichi-mipv6-base, keiichi-mipv6, jym-xensuspend-nbase, jym-xensuspend-base, jym-xensuspend, is-mlppp-base, is-mlppp, hpcarm-cleanup-nbase, hpcarm-cleanup-base, hpcarm-cleanup, cube-autoconf-base, cube-autoconf, cherry-xenmp-base, cherry-xenmp, bouyer-socketcan-base1, bouyer-socketcan-base, bouyer-socketcan, bouyer-quota2-nbase, bouyer-quota2-base, bouyer-quota2, agc-symver-base, agc-symver, abandoned-netbsd-4-base, abandoned-netbsd-4, HEAD
Changes since 1.1: +2 -2 lines

"Utilize" has exactly the same meaning as "use," but it is more
difficult to read and understand. Most manuals of English style
therefore say that you should use "use".

.\"	$NetBSD: 4.me,v 1.2 2003/02/05 00:02:27 perry Exp $
.\"
.\" Copyright (c) 1998 Jason R. Thorpe.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgements:
.\"	This product includes software developed for the NetBSD Project
.\"	by Jason R. Thorpe.
.\" 4. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.sh 1 "Implementation of \fIbus_dma\fB in NetBSD/alpha and NetBSD/i386"
.pp
This section is a description of the \fIbus_dma\fR implementation in
two NetBSD ports, NetBSD/alpha and NetBSD/i386.  It is presented as
a side-by-side comparison in order to give the reader a better feel
for the types of details that are being abstracted by the interface.
.sh 2 "Platform requirements"
.pp
NetBSD/alpha currently supports six implementations of the PCI bus, each
of which implement DMA differently.
In order to understand the design approach for NetBSD/alpha's
fairly complex \fIbus_dma\fR implementation, it is necessary to
understand the differences between the bus adapters.  While some of
these adapters have similar descriptions and features, the software
interface to each one is quite different.  (In addition to PCI,
NetBSD/alpha also supports two TurboChannel DMA implementations on
the DEC 3000 models.  For simplicity's sake, we will limit
the discussion to the PCI and related busses.)
.pp
The first PCI implementation to be supported by NetBSD/alpha was
the DECchip 21071/21072 (APECS)\*#.
It is designed
to be used with the DECchip 21064 (EV4) and 21064A (EV45) processors.
Systems in which this PCI host
bus adapter is found include the AlphaStation 200, AlphaStation 400,
and AlphaPC 64 systems, as well as some AlphaVME systems.
The APECS supports up to two DMA windows, which may be configured for
direct-mapped or scatter-gather-mapped operation, and uses host RAM for
scatter-gather page tables.
.(d
.pp
\*# Digital Equipment Corporation,
.ul
DECchip 21071 and DECchip 21072 Core Logic Chipsets Data Sheet,
DEC order number EC-QAEMA-TE, November 1994.
.)d
.pp
The second PCI implementation to be supported by NetBSD/alpha was
the built-in I/O controller found on the DECchip 21066\*# and DECchip 21068
family of Low Cost Alpha (LCA) processors.  This processor family was used
in the AXPpci33 and Multia AXP systems, as well as some AlphaVME
systems.  The LCA supports up to two DMA windows, which may be configured
for direct-mapped or scatter-gather-mapped operation, and uses host
RAM for scatter-gather page tables.
.(d
.pp
\*# Digital Equipment Corporation,
.ul
DECchip 21066 Alpha AXP Microprocessor Data Sheet,
DEC order number EC-N0617-72, May 1994.
.)d
.pp
The third PCI implementation to be supported by NetBSD/alpha was
the DECchip 21171 (ALCOR)\*#, 21172 (ALCOR2), and 21174 (Pyxis)\**.
.(f
\**While these chipsets are somewhat different from one another, the
software interface is similar enough that they share a common device
driver in the NetBSD/alpha kernel.
.)f
These PCI host bus adapters are found in systems based on the
DECchip 21164 (EV5), 21164A (EV56), and 21164PC (PCA56) processors,
including the AlphaStation 500, AlphaStation 600, and AlphaPC 164,
and Digital Personal Workstation.  The ALCOR, ALCOR2, and Pyxis
support up to four DMA windows, which may be configured for direct-mapped
or scatter-gather-mapped operation, and uses host RAM for scatter-gather
page tables.
.(d
.pp
\*# Digital Equipment Corporation,
.ul
DECchip 21171 Core Logic Chipset Technical Reference Manual,
DEC order number EC-QE18B-TE, September 1995.
.)d
.pp
The fourth PCI implementation to be supported by NetBSD/alpha was
the Digital DWLPA/DWLPB\*#.  This is a TurboLaser-to-PCI\** bridge found on
AlphaServer 8200 and 8400 systems.  The bridge is connected to the TurboLaser
system bus via a KFTIA (internal) or KFTHA (external) I/O adapter.  The
former supports one built-in and one external DWLPx.  The latter supports up
to four external DWLPxs.  Multiple I/O adapters may be present
on the TurboLaser system bus.  Each DWLPx supports up to four primary PCI
busses and has three DMA windows which may be configured for
direct-mapped or scatter-gather-mapped DMA.  These three windows are
shared by all PCI busses attached to the DWLPx.  The DWLPx does not use
host RAM for scatter-gather page tables.  Instead, the DWLPx uses on-board
SRAM, which must be shared by all PCI busses attached to the DWLPx.
This is because the store-and-forward architecture of these systems
would cause latency on DMA page table access to be too high.  The DWLPA
has 32K of page table SRAM and the DWLPB has 128K.  Since the DWLPx can
snoop access to the page table SRAM, no explicit scatter-gather TLB
invalidation is necessary on this PCI implementation.
.(f
\**"TurboLaser" is the name of the system bus on the AlphaServer 8200
and 8400 systems.
.)f
.(d
.pp
\*# Digital Equipment Corporation,
.ul
DWLPA and DWLPB PCI Adapter Technical Manual,
DEC order number EK-DWLPX-TM, July 1996.
.)d
.pp
The fifth PCI implementation to be supported by NetBSD/alpha was
the A12C PCI bus on the Avalon A12 Scalable Parallel Processor\*#.  This
PCI bus is a secondary I/O bus\**, has only a single PCI slot in
mezzanine form-factor, and is used solely for Ethernet I/O.
.(d
.pp
\*# H. Ross Harvey,
.ul
Avalon A12 Parallel Supercomputer Theory of Operation,
Avalon Computer Systems, Inc., October 1997.
.)d
.(f
\**The primary I/O bus on the A12 is a crossbar, which is used to
communicate with other nodes in the parallel processor.
.)f
This PCI bus is not able to directly access host RAM.  Instead, devices
DMA to and from a 128K SRAM buffer.  This is, in essence, a hardware
implementation of DMA bouncing.  This is not considered a limitation
of the architecture given the target application of the A12 system
(parallel computation applications which communicate via MPI\**
over the crossbar).
.(f
\**MPI, or the Message Passing Interface, is a standardized API for
passing data and control within a parallel program.
.)f
.pp
The sixth PCI implementation to be supported by NetBSD/alpha was
the MCPCIA MCBUS-to-PCI bridge found on the AlphaServer 4100 (Rawhide)
systems.  The Rawhide architecture is made up of a "horse" (the
central backplane) and two "saddles" (primary PCI bus adapters on
either side of the backplane).  The saddles may also contain EISA
bus adapters.  Each MCPCIA has four DMA windows which may be configured
for direct-mapped or scatter-gather-mapped operation, and uses host
RAM for scatter-gather page tables.
.pp
In sharp contrast to the Alpha, the i386 platform has a very simple
PCI implementation; the PCI bus is capable of addressing the entire
32-bit physical address space of the PC architecture, and, in general,
all PCI host bus adapters are software compatible.  The i386 platform
also has WYSIWYG DMA, so no window translations are necessary.  The
i386 platform, however, must contend with DMA bouncing on the ISA bus,
due to ISA's 24-bit address limitation and lack of scatter-gather-mapped
DMA.
.sh 2 "Data structures"
.pp
The DMA tags used by NetBSD/alpha and NetBSD/i386 are very similar.  Both
contain thirteen function pointers for the thirteen functional methods in
the \fIbus_dma\fR interface.  The NetBSD/alpha DMA tag, however, also has
a function pointer used to obtain the DMA tag for children of the primary
I/O bus and an opaque cookie to be interpreted by the low-level
implementation of these methods.
.pp
The opaque cookie used by NetBSD/alpha's DMA tag is a pointer to the
chipset's statically-allocated state information.  This state information
includes one or more \fIalpha_sgmap\fR structures.  The \fIalpha_sgmap\fR
contains all of the state information for a single DMA window to perform
scatter-gather-mapped DMA, including pointers to the scatter-gather
page table, the \fIextent map\fR\** that manages the page table, and the
DMA window base.
.(f
\**An \fIextent map\fR is a data structure which manages an arbitrary
number range, providing several resource allocation primitives.  NetBSD
has a general-purpose extent map manager which is used by several kernel
subsystems.
.)f
.pp
The DMA map structure contains all of the parameters used to create
the map.  (This is a fairly standard practice among all current
implementations of the \fIbus_dma\fR interface.)  In addition to
the creation parameters, the two implementations contain additional
state variables specific to their particular DMA quirks.  For example,
the NetBSD/alpha DMA map contains several state variables
related to scatter-gather-mapped DMA.  The i386 port's DMA map,
on the other hand, contains a pointer to a map-specific cookie.  This
cookie holds state information for ISA DMA bouncing.  This state is stored
in a separate cookie because DMA bouncing is far less common on the i386
then scatter-gather-mapped DMA is on the Alpha, since the Alpha must
also do scatter-gather-mapped DMA for PCI if the system has a large
amount of physical memory.
.pp
In both the NetBSD/alpha and NetBSD/i386 \fIbus_dma\fR implementations,
the DMA segment structure contains only the public members defined
by the interface.
.sh 2 "Code structure"
.pp
Both the NetBSD/alpha and NetBSD/i386 \fIbus_dma\fR implementations
use a simple inheritance scheme for code reuse.  This is achieved
by allowing the chipset- or bus-specific code layers (i.e. the "master"
layers) to assemble the DMA tag.  When the tag is assembled, the master
layer inserts its own methods in the function pointer slots where special
handling at that layer is required.  For those methods which do not require
special handling, the slots are initialized with pointers to common code.
.pp
The Alpha \fIbus_dma\fR code is broken down into four basic categories:
chipset-specific code, code that implements common direct-mapped
operations, code that implements common scatter-gather-mapped operations,
and code that implements operations common to both direct-mapped and
scatter-gather-mapped DMA.  Some of the common functions are not
called directly via the tag's function switch.  These functions
are helper functions, and are for use only by chipset front-ends.
An example of such a helper is the set of common direct-mapped
DMA load functions.  These functions take all of the same arguments
as the interface-defined methods, plus an extra argument: the DMA
window's base DMA address.
.pp
The i386 \fIbus_dma\fR implementation, on the other hand, is broken down
into three basic categories: common implementations of \fIbus_dma\fR
methods, common helper functions, and ISA DMA front-ends\**.
.(f
\**ISA is currently the only bus supported by NetBSD/i386 with
special DMA requirements.  This may change in future versions of the
system.
.)f
All of the common interface methods may be called directly from the
DMA tag's function switch.  Both the PCI and EISA DMA tags use
this feature; they provide no bus-specific DMA methods.  The ISA DMA
front-ends provide support for DMA bouncing if the system has more
than 16MB of physical memory.  If the system has 16MB of physical
memory or less, no DMA bouncing is required, and the ISA DMA front-ends
simply redirect the \fIbus_dma\fR function calls to the common
implementation.
.sh 2 "Autoconfiguration"
.pp
The NetBSD kernel's autoconfiguration system employs a depth-first traversal
of the nodes (devices) in the device tree.  This process is started
by machine-dependent code telling the machine-independent
autoconfiguration framework that it has "found" the root "bus".  In the
two platforms described here, this root bus, called \fImainbus\fR, is a
virtual device; it does not directly correspond to any physical bus in
the system.  The device driver for \fImainbus\fR is implemented in
machine-dependent code.  This driver's responsibility is to configure
the primary I/O bus or busses.
.pp
In NetBSD/alpha, the chipset which implements the primary I/O bus
is considered to be the primary I/O bus by the \fImainbus\fR layer.
Platform-specific code specifies the name of the chipset, and the
\fImainbus\fR driver configures it by "finding" it.  When the chipset's
device driver is attached, it initializes its DMA windows and data
structures.  Once this is complete, it "finds" the primary PCI bus or busses
logically attached to the chipset, and passes the DMA tag for these busses
down to the PCI bus device driver.  This driver in turn finds and configures
each device on the PCI bus, and so on.
.pp
In the event that the PCI bus driver encounters a PCI-to-PCI bridge (PPB),
the DMA tag is passed unchanged to the PPB device driver, which in turn
passes it to the secondary PCI bus instance attached to the other
side of the bridge.  However, intervention by machine-dependent code
is required if the PCI bus driver encounters a bridge to a different bus
type, such as EISA or ISA; this bus may require a different DMA tag.
For this reason, all PCI-to-<other bus> bridge (PCxB) drivers are
implemented in machine-dependent code.  While the PCxB drivers could
be implemented in machine-independent code using machine-dependent
hooks to obtain DMA tags, this is not done as the secondary bus may
require special machine-dependent interrupt setup and routing.  Once all
of the call-backs to handle the machine-dependent bus transition details
were implemented, the amount of code that would be shared would hardly be
worth the effort.
.pp
When a device driver is associated with a particular hardware device
that the bus driver has found, it is given several pieces of information
needed to initialize and communicate with the device.  One of these
pieces of information is the DMA tag.  If the driver wishes to perform
DMA, it must remember this tag, which, as noted previously, is used in
every call to the \fIbus_dma\fR interface.
.pp
While the procedure for configuring busses and devices is essentially
identical to the NetBSD/alpha case, NetBSD/i386 configures the primary
I/O busses quite differently.  The PC platform was designed from the
ground up around the ISA bus.  EISA and PCI are, in many ways, very
similar to ISA from a device driver's perspective.  All three have the
concept of I/O-mapped\** and memory-mapped space.  The hardware and firmware
in PCs typically map these busses in such a way that initialization of
the bus's adapter by operating system software is not necessary.  For
this reason, it is possible to consider PCI, EISA, and ISA to all be
primary I/O busses, from the autoconfiguration perspective.
.(f
\**I/O-mapped space is accessed with special instructions on Intel
processors.
.)f
.pp
The NetBSD/i386 \fImainbus\fR driver configures the primary I/O busses
in order of descending priority: PCI first, then EISA, and finally, ISA.
The \fImainbus\fR driver has direct access to each bus's DMA tags, and
passes them down to the I/O bus directly.  In the case of EISA and ISA,
the \fImainbus\fR layer only attempts to configure these busses if they
were not found during the PCI bus configuration phase; NetBSD/i386, as
a matter of correctness, identifies PCI-to-EISA (PCEB) and PCI-to-ISA (PCIB)
bridges, and assigns autoconfiguration nodes in the device tree to them.
The EISA and ISA busses are logically attached to these nodes, in a way
very similar to that of NetBSD/alpha.  The bridge drivers also have direct
access to the bus's DMA tags, and pass them down to the I/O bus accordingly.
.sh 2 "Example of underlying operation"
.pp
This subsection describes the operation of the machine-dependent code
which implements the \fIbus_dma\fR interface as used by a device driver
for a hypothetical DES encryption card.  While this is not the 
original application of \fIbus_dma\fR, it provides an example which is much
easier to understand; the application for which the interface was developed
is a high-performance hierarchical mass storage system, the details of which
are overwhelming.
.pp
Not all of the details of a NetBSD device driver will be described here,
but rather only those details which are important within the scope of DMA.
.pp
For the purpose of our example, the card comes in both PCI and ISA
models.  Since we describe two platforms, there are four permutations
of actual examples.  They will be tagged with the following indicators:
.(b
.b
[Alpha/ISA]
[Alpha/PCI]
[i386/ISA]
[i386/PCI]
.r
.)b
.pp
We will assume that the \fB[i386/ISA]\fR platform has more than 16MB
of RAM, so transfers might have to be bounced if DMA-safe memory is not
used explicitly.  We will also assume that the direct-mapped DMA window
on the \fB[Alpha/PCI]\fR platform is capable of addressing all of system
RAM.
.pp
Please note that in the description of map synchronization, only cases
which require special handing will be described.  In both the
\fB[Alpha/ISA]\fR and \fB[Alpha/PCI]\fR cases, all synchronizations
cause the CPU's write buffer to be drained using the Alpha's \fImb\fR\*#
instruction.  All synchronizations in the \fB[i386/PCI]\fR case are
no-ops, as are synchronizations of DMA-safe memory in the \fB[i386/ISA]\fR
case.
.(d
.pp
\*# Richard L. Sites and Richard T. Witek,
.ul
Alpha AXP Architecture Reference Manual, Second Edition,
Digital Press, 1995.
.)d
.sh 3 "Hardware overview"
.pp
The card is a bus master,
and operates by reading a fixed-length command block via DMA.  There are
three commands: \fBSET KEY\fR, \fBENCRYPT\fR, and \fBDECRYPT\fR.  Commands
are initiated by filling in the command block, and writing the DMA address
of the command block to the card's \fIdmaAddr\fR register.  The command
block contains 6 32-bit words: \fIcbCommand\fR, \fIcbStatus\fR,
\fIcbInAddr\fR, \fIcbInCount\fR, \fIcbOutAddr\fR, and \fIcbOutCount\fR.
The \fIcbInAddr\fR and \fIcbOutAddr\fR members are the DMA addresses of
software scatter-gather lists used by the card's DMA engine.  The
\fIcbInCount\fR and \fIcbOutCount\fR members are the number of
scatter-gather entries in their respective lists.  Each scatter-gather
entry contains a DMA address and a length, both 32-bit
words.
.pp
When the card processes a request, it reads the command block via DMA.
It then examines the command block to determine which action to take.
In the case of all three supported commands, it reads the input
scatter-gather list at DMA address \fIcbInAddr\fR for length
\fIcbInCount * 8\fR.  It then switches the input to the appropriate
processing engine.  In the case of the \fBSET KEY\fR command, the
scatter-gather list is used to DMA the DES key into SRAM located
on the card.  For all other commands, the input is directed at the
pipelined DES engine, switched into either encrypt or decrypt mode.  The
DES engine then reads the output scatter-gather list specified by
\fIcbOutAddr\fR for \fIcbOutCount * 8\fR bytes.  Once the DES engine
has all of the DMA addresses, it then begins the cycle of
input-process-output until all data has been consumed.  Once any command
is finished, a status word is written to \fIcbStatus\fR, and an interrupt
is delivered to the host.  The driver software must read this word to
determine if the command completed successfully.
.sh 3 "Device driver overview"
.pp
The device driver for this DES card provides \fIopen()\fR, \fIclose()\fR,
and \fIioctl()\fR entry points.  The driver uses DMA to the user
address space for high performance.  When a user issues a request via the
ioctl corresponding to the requested operation, the driver places it
on a work queue.  The \fIioctl()\fR system call returns immediately,
allowing the application to run or block via \fIsigsuspend()\fR.  If
the card is currently idle, the driver immediately issues the command
to the card.  When the job is finished, the card interrupts, and the
driver notifies the user that the request has completed via the
\fBSIGIO\fR signal.  If there are more jobs on the work queue, the
next job is removed from the queue and started, until there are no
more jobs.
.sh 3 "Driver initialization"
.pp
When the driver instance is created (attached), it must create and initialize
the data structures necessary for operation.  This driver uses multiple
DMA maps: one for the control structures (control block and scatter-gather
lists), and many for data submitted by user requests.  The data maps are
kept in the driver job queue entries, which are created when jobs are
submitted.
.pp
Next the driver must allocate DMA-safe memory for the control structures.
The driver will allocate three pages of memory via \fIbus_dmamem_alloc\fR().
For simplicity, the driver will request a single memory segment.
For all platforms and busses in this example, this operation simply calls
a function in the virtual memory system that allocates memory with
the requested constraints.  In the \fB[i386/ISA]\fR case, the ISA layer
inserts itself into the call graph to specify a range of 0 - 16MB.
All other cases simply specify the entire present physical memory range.
.pp
A small piece of this memory will be used for the command block.  The rest
of the memory will be divided evenly between the two scatter-gather lists.
This memory is then mapped into kernel virtual address space using
\fIbus_dmamem_map()\fR with the \fBBUS_DMA_COHERENT\fR flag, and the
kernel pointers to the three structures are initialized.  When the memory
is mapped on the i386, the \fBBUS_DMA_COHERENT\fR flag causes the
cache-inhibit bits to be set in the PTEs.  No special handing of this
flag is required on the Alpha.  However, in the Alpha case, since there
is only a single segment, the memory is mapped via the Alpha's
direct-mapped kernel segment; no use of kernel virtual address space is
required.
.pp
Finally, the driver loads the control structure DMA map by passing the
kernel virtual address of the memory to \fIbus_dmamap_load()\fR.  To
make it easier to start transactions, the driver caches the DMA addresses
of the various control structures (by adding their offsets to the memory's
DMA address).  In all cases, the underlying load function steps through
each page in the virtual address range, extracting the physical address
from the pmap module and compacting the segments where possible.  Since
the memory was allocated as a single segment, it maps to a single DMA
segment.
.sh 3 "Example transaction"
.pp
Let's suppose that the user has already set the key, and now wishes to
use it to encrypt a data buffer.  The calling program packages up the
request, providing a pointer to the input buffer, output buffer, and
status word, all in user space, and issues the "encrypt buffer" ioctl.
.pp
Upon entry into the kernel, the driver locks the user's buffer
to prevent the data from being paged out
while the DMA is in progress.  A job queue entry is allocated, and
two DMA maps are created for the job queue entry, one for the input buffer
and one for the output buffer.  In all cases, this allocates the standard
DMA map structure.  In the \fB[i386/ISA]\fR case, an ISA DMA cookie for
each map is also allocated.
.pp
Once the queue entry has been allocated, it must be initialized.  The first
step in this process is to load the DMA maps for the input and output
buffers.  Since this process is essentially identical for input and output,
only the actions for the input buffer's map are described here.
.pp
On \fB[Alpha/PCI]\fR and \fB[i386/PCI]\fR, the underlying code traverses
the user's buffer, extracting the physical addresses for each page.  For
\fB[Alpha/PCI]\fR, the DMA window base is added to this address.  The
address and length of the segment are placed into the map's DMA segment
list.  Segments are concatenated when possible.
.pp
On \fB[Alpha/ISA]\fR, a very similar process occurs.  However, rather than
placing the physical addresses into the map's segment list, some
scatter-gather-mapped DMA address space is allocated and the addresses
placed into the corresponding page table entries.  Once this process is
complete, a single DMA segment is placed in the map's segment list,
indicating the beginning of the scatter-gather-mapped area.
.pp
The \fB[i386/ISA]\fR case also traverses the user's buffer, but twice.
In the first pass, the buffer is checked to ensure that it does not
have any pages above the 16MB threshold.  If it does not, then the
procedure is identical to the
\fB[i386/PCI]\fR case.  However, for the sake of example, the buffer has
pages outside the threshold so the transfer must be bounced.  At this
point, a bounce buffer is allocated.  Since we are still in the process's
context, this allocation may block.  A pointer to the bounce buffer is
stored in the ISA DMA cookie, and the physical address of the bounce buffer
is placed in the map's segment list.
.pp
The next order of business is to enqueue or begin the transfer.  To keep
the example simple, we will assume that no other transfers are pending.
The first step in this process is to initialize the control block with
the cached DMA addresses of the card's scatter-gather lists.  These
lists are also initialized with the contents of the DMA maps' segment list.
Before we tell the card to begin transferring data, we must synchronize
the DMA maps.
.pp
The first map to be synchronized is the input buffer map.  This is a
\fBPREWRITE\fR operation.  In the \fB[i386/ISA]\fR case, the user's buffer
is copied from the user's address space into the bounce buffer\**.
.(f
\**This is not currently implemented, as it required substantial changes
to the virtual memory system.  This is because the \fIcopyin()\fR and
\fIcopyout()\fR functions only operate on the current process's context,
which may not be available at the time of the bounce.  Those changes to
the virtual memory system have now been made, so support for bouncing to
and from user space will appear in a future release of NetBSD.  Support
for bouncing from kernel space is currently supported, however.
.)f
The next map to be synchronized is the output buffer map.  This is a
\fBPREREAD\fR operation.
Finally, the control map is synchronized.  Since the status will be
read back from the control block after the transaction is complete,
this synchronization is a \fBPREREAD|PREWRITE\fR.
.pp
At this point the DMA transaction may occur.  The card is started by
writing the cached DMA address of the control block into the card's
\fIdmaAddr\fR register.  The driver returns to user space, and the
process waits for the signal indicating that the transaction has
completed.
.pp
Once the transaction has completed, the card interrupts the host.  The
interrupt handler is now responsible for finishing the DMA sequence
and notifying the requesting process that the operation is complete.
.pp
The first task to perform is to synchronize the input buffer map.  This
is a \fBPOSTWRITE\fR.  Next we synchronize the output buffer map.  This
is a \fBPOSTREAD\fR.  In the \fB[i386/ISA]\fR case, the contents of the
output bounce buffer are copied to the user's buffer\**.
Finally, we synchronize the control map.  This is a \fBPOSTREAD|POSTWRITE\fR.
.(f
\**The same caveat applies here as to the \fB[i386/ISA]\fR
\fBPREWRITE\fR case for the input map.
.)f
.pp
Now that the DMA maps have been synchronized, they must be unloaded.
In the \fB[Alpha/PCI]\fR and \fB[i386/PCI]\fR cases, there are no resources
to be freed; the mapping is simply marked invalid.  In the \fB[Alpha/ISA]\fR
case, the scatter-gather-mapped DMA resources are released.  In the
\fB[i386/ISA]\fR case, the bounce buffer is freed.
.pp
Since the user's buffer is no longer in use, it is unlocked by the device
driver.  Now the process may be signaled that I/O has completed.  The
last task to perform is to destroy the input and output buffer DMA maps
and the job queue entry.