<?xml version="1.0"?>
<!DOCTYPE webpage
PUBLIC "-//NetBSD//DTD Website-based NetBSD Extension//EN"
"http://www.NetBSD.org/share/xml/website-netbsd.dtd">
<webpage id="docs-kernel-vfork">
<config param="desc" value="Why implement traditional vfork()"/>
<config param="cvstag" value="$NetBSD: vfork.xml,v 1.1 2007/06/09 11:33:47 dsieger Exp $"/>
<config param="rcsdate" value="$Date: 2007/06/09 11:33:47 $"/>
<head>
<!-- Copyright (c) 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED. -->
<title>NetBSD Documentation: Why implement traditional vfork()</title>
</head>
<sect1 role="toc">
<sect2 id="vfork">
<sect3 id="intro">
<title>Introduction</title>
<para><function>vfork()</function> is designed for the specific case where
the child will <function>exec()</function> another program, and the parent can
block until this happens. A traditional <function>fork()</function> had to
duplicate all the pages of the parent process in the child, which is a
significant overhead.</para>
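<para>A minimal sketch of the pattern described above, not code taken from
the NetBSD tree. The helper name <function>spawn_and_wait()</function> and the
exit code 127 for a failed <function>exec()</function> are assumptions made for
illustration. The child does nothing except <function>exec()</function> or
<function>_exit()</function>, which is the only use
<function>vfork()</function> is meant for.</para>
<programlisting><![CDATA[
```c
/* Assumption: some libcs (e.g. glibc in strict -std= modes) only declare
 * vfork() when this is defined; it should be harmless elsewhere. */
#define _DEFAULT_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] (searched in PATH) and wait for it.
 * Returns the program's exit status, 127 if exec failed, or -1 on error.
 */
int spawn_and_wait(char *const argv[])
{
    int status;
    pid_t pid = vfork();

    if (pid == -1)
        return -1;              /* vfork() itself failed */

    if (pid == 0) {
        /* Child: may be running in the parent's address space, so it
         * must not touch anything; it only execs or exits. */
        execvp(argv[0], argv);
        _exit(127);             /* only reached if exec failed */
    }

    /* Parent: resumes once the child has exec'd or exited. */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```
]]></programlisting>
<para>For example, calling it with the argument vector
<literal>{ "sh", "-c", "exit 3", NULL }</literal> should return 3 on a system
with a POSIX shell in the path.</para>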
<para>
The Mach VM system added Copy On Write (COW), which made
<function>fork()</function> much cheaper, and in 4.4BSD, <function>vfork()</function>
was made synonymous with <function>fork()</function>. After NetBSD 1.3, a
traditional <function>vfork()</function> was reimplemented.</para>
<para>Considerable effort went into making COW better in <ulink
url="uvm.html"><emphasis role="bold">UVM</emphasis></ulink>, but an
address space-sharing <function>vfork()</function> <emphasis>still</emphasis>
turns out to be a win. It shaves several seconds off a build of libc
on a 200MHz PPro.</para>
</sect3>
<sect3 id="4bsd-vfork-cow">
<title><function>vfork()</function>/<function>exec()</function> using the 4.4BSD
<function>vfork()</function> and COW</title>
<itemizedlist>
<listitem><para>Traverse parent's vm_map, marking the writable portions of the
address space COW. This means invoking the pmap, modifying PTEs,
and flushing the TLB.</para></listitem>
<listitem><para>Create a vm_map for the child, copy the parent's vm_map entries
into the child's vm_map. Optionally, invoke the pmap to copy
PTEs from the parent's page tables into the child's page tables.</para></listitem>
<listitem><para>Block parent.</para></listitem>
<listitem><para>Child runs. If PTEs were <emphasis>not</emphasis> copied, take page fault to get
a physical mapping for the text page at the current program counter.</para></listitem>
<listitem><para>Child execs, unmapping the entire address space that was just
created and creating a new one. This implies that the parent's
vm_map has to be traversed to mark the COW portions not-COW.</para></listitem>
<listitem><para>Unblock parent.</para></listitem>
<listitem><para>Parent runs, takes page fault when modifying previously R/W
data that was marked R/O for COW. No data is copied at this
time.</para></listitem>
</itemizedlist>
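<para>The bookkeeping in the steps above exists so that a write in one process
never becomes visible in the other. A minimal sketch of that observable
guarantee, using plain <function>fork()</function>; the helper name
<function>child_write_is_private()</function> and the variable
<varname>counter</varname> are hypothetical. It does not show the kernel side,
only the behaviour a program can rely on.</para>
<programlisting><![CDATA[
```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lives in a writable data page; on a COW implementation that page is
 * shared read-only between parent and child right after fork(). */
static volatile int counter = 42;

/*
 * Hypothetical helper: returns 1 if a write made by the child is not
 * visible in the parent, 0 if it is, and -1 on error.
 */
int child_write_is_private(void)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: this write goes to the child's own copy of the page. */
        counter = 99;
        _exit(counter == 99 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    /* Parent: its copy must still hold the original value. */
    return counter == 42;
}
```
]]></programlisting>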
</sect3>
<sect3 id="3bsd-vfork">
<title>The 3.0BSD/NetBSD <function>vfork()</function>, using address space
sharing</title>
<itemizedlist>
<listitem><para>Take reference to parent's vmspace structure.</para></listitem>
<listitem><para>Block parent.</para></listitem>
<listitem><para>Child runs. No page faults occur because the parent's page
tables are being used, and the PTEs are already valid.</para></listitem>
<listitem><para>Child execs, drops the reference it held to the parent's
vmspace structure, and creates a new vmspace of its own.</para></listitem>
<listitem><para>Unblock parent.</para></listitem>
<listitem><para>Parent runs. (No page faults occur because the parent's
vm_map was not modified.)</para></listitem>
</itemizedlist>
<para>
So, in the case where you're going to fork and then exec, the
vmspace-sharing approach is clearly faster. Even if your COW algorithms
are good, you still have to do a lot more work than in the
vmspace-sharing case!
</para>
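<para>One rough way to check this on a given system is to time both calls
around the same <function>exec()</function>. The sketch below is an assumption
about how such a measurement could look, not the benchmark behind the figures
quoted earlier. <function>time_spawns()</function> is a hypothetical helper, and
it assumes a <command>true</command> program is available in the path.</para>
<programlisting><![CDATA[
```c
/* Assumption: some libcs (e.g. glibc in strict -std= modes) only declare
 * vfork() and clock_gettime() when this is defined. */
#define _DEFAULT_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: spawn "true" `iterations` times, using vfork()
 * if use_vfork is non-zero and fork() otherwise, waiting for each child.
 * Returns the elapsed wall-clock time in seconds, or -1.0 on error.
 */
double time_spawns(int use_vfork, int iterations)
{
    char *const argv[] = { "true", NULL };
    struct timespec start, end;

    if (clock_gettime(CLOCK_MONOTONIC, &start) == -1)
        return -1.0;

    for (int i = 0; i < iterations; i++) {
        pid_t pid = use_vfork ? vfork() : fork();

        if (pid == -1)
            return -1.0;
        if (pid == 0) {
            /* Child: exec immediately, or exit if that fails. */
            execvp(argv[0], argv);
            _exit(127);
        }
        if (waitpid(pid, NULL, 0) == -1)
            return -1.0;
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) == -1)
        return -1.0;
    return (double)(end.tv_sec - start.tv_sec)
        + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```
]]></programlisting>
<para>Comparing <literal>time_spawns(0, n)</literal> with
<literal>time_spawns(1, n)</literal> for the same <literal>n</literal> gives a
first impression. The gap can be expected to depend on how much writable
address space the parent has mapped, since that is what the COW steps have to
walk.</para>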
</sect3>
</sect2>
</sect1>
<parentsec url="./" text="NetBSD Documentation: Kernel"/>
</webpage>