Annotation of src/sys/netinet6/IMPLEMENTATION, Revision 1.14
1.14 ! itojun 1: $NetBSD: IMPLEMENTATION,v 1.13 2000/06/10 08:21:11 itojun Exp $
1.2 thorpej 2:
1.1 itojun 3: # NOTE: this is from original KAME distribution.
4: # Some portion of this document is not applicable to the code merged into
1.5 itojun 5: # NetBSD-current (for example, section 5). Check sys/netinet6/TODO as well.
1.1 itojun 6:
7: Implementation Note
8:
9: KAME Project
10: http://www.kame.net/
1.14 ! itojun 11: KAME Date: 2000/06/12 09:29:16
1.1 itojun 12:
13: 1. IPv6
14:
15: 1.1 Conformance
16:
17: The KAME kit conforms, or tries to conform, to the latest set of IPv6
18: specifications. For future reference we list some of the relevant documents
19: below (NOTE: this is not a complete list - this is too hard to maintain...).
20: For details please refer to the specific chapter in this document, RFCs,
21: manpages that come with KAME, or comments in the source code.
22:
1.5 itojun 23: Conformance tests have been performed on past and latest KAME STABLE kit,
1.3 itojun 24: at TAHI project. Results can be viewed at http://www.tahi.org/report/KAME/.
1.1 itojun 25: We also attended Univ. of New Hampshire IOL tests (http://www.iol.unh.edu/)
26: in the past, with our past snapshots.
27:
28: RFC1639: FTP Operation Over Big Address Records (FOOBAR)
29: * RFC2428 is preferred over RFC1639. ftp clients will first try RFC2428,
30: then RFC1639 if that fails.
1.3 itojun 31: RFC1886: DNS Extensions to support IPv6
1.1 itojun 32: RFC1933: Transition Mechanisms for IPv6 Hosts and Routers
33: * IPv4 compatible address is not supported.
1.3 itojun 34: * automatic tunneling (4.3) is not supported.
1.1 itojun 35: * "gif" interface implements IPv[46]-over-IPv[46] tunnel in a generic way,
36: and it covers "configured tunnel" described in the spec.
37: See 1.5 in this document for details.
38: RFC1981: Path MTU Discovery for IPv6
39: RFC2080: RIPng for IPv6
40: * KAME-supplied route6d, bgpd and hroute6d support this.
41: RFC2283: Multiprotocol Extensions for BGP-4
42: * so-called "BGP4+".
43: * KAME-supplied bgpd supports this.
44: RFC2292: Advanced Sockets API for IPv6
1.3 itojun 45: * For supported library functions/kernel APIs, see sys/netinet6/ADVAPI.
1.1 itojun 46: RFC2362: Protocol Independent Multicast-Sparse Mode (PIM-SM)
47: * RFC2362 defines packet formats for PIM-SM. draft-ietf-pim-ipv6-01.txt
48: is written based on this.
49: RFC2373: IPv6 Addressing Architecture
50: * KAME supports node required addresses, and conforms to the scope
51: requirement.
52: RFC2374: An IPv6 Aggregatable Global Unicast Address Format
53: * KAME supports 64-bit length of Interface ID.
54: RFC2375: IPv6 Multicast Address Assignments
55: * Userland applications use the well-known addresses assigned in the RFC.
56: RFC2428: FTP Extensions for IPv6 and NATs
57: * RFC2428 is preferred over RFC1639. ftp clients will first try RFC2428,
58: then RFC1639 if that fails.
59: RFC2460: IPv6 specification
60: RFC2461: Neighbor discovery for IPv6
61: * See 1.2 in this document for details.
62: RFC2462: IPv6 Stateless Address Autoconfiguration
63: * See 1.4 in this document for details.
64: RFC2463: ICMPv6 for IPv6 specification
65: * See 1.8 in this document for details.
66: RFC2464: Transmission of IPv6 Packets over Ethernet Networks
1.3 itojun 67: RFC2465: MIB for IPv6: Textual Conventions and General Group
68: * Necessary statistics are gathered by the kernel. Actual IPv6 MIB
69: support is provided as patchkit for ucd-snmp.
70: RFC2466: MIB for IPv6: ICMPv6 group
71: * Necessary statistics are gathered by the kernel. Actual IPv6 MIB
72: support is provided as patchkit for ucd-snmp.
1.1 itojun 73: RFC2467: Transmission of IPv6 Packets over FDDI Networks
74: RFC2472: IPv6 over PPP
75: RFC2492: IPv6 over ATM Networks
76: * only PVC is supported.
1.3 itojun 77: RFC2497: Transmission of IPv6 packet over ARCnet Networks
1.1 itojun 78: RFC2545: Use of BGP-4 Multiprotocol Extensions for IPv6 Inter-Domain Routing
79: RFC2553: Basic Socket Interface Extensions for IPv6
80: * IPv4 mapped address (3.7) and special behavior of IPv6 wildcard bind
81: socket (3.8) are,
1.3 itojun 82: - supported on KAME/FreeBSD3x,
83: - supported on KAME/NetBSD,
1.4 itojun 84: - supported on KAME/BSDI4,
85: - not supported on KAME/FreeBSD228, KAME/OpenBSD and KAME/BSDI3.
1.1 itojun 86: see 1.12 in this document for details.
1.3 itojun 87: RFC2675: IPv6 Jumbograms
88: * See 1.7 in this document for details.
89: RFC2710: Multicast Listener Discovery for IPv6
90: RFC2711: IPv6 router alert option
1.5 itojun 91: RFC2732: Format for Literal IPv6 Addresses in URL's
92: * The spec is implemented in programs that handle URLs
93: (like freebsd ftpio(3) and fetch(1), or netbsd ftp(1))
1.12 itojun 94: draft-ietf-ipngwg-router-renum-10: Router renumbering for IPv6
1.10 itojun 95: draft-ietf-ipngwg-icmp-name-lookups-05: IPv6 Name Lookups Through ICMP
1.12 itojun 96: draft-ietf-pim-ipv6-03.txt: PIM for IPv6
1.3 itojun 97: * pim6dd implements dense mode. pim6sd implements sparse mode.
1.12 itojun 98: draft-ietf-dhc-dhcpv6-15.txt: DHCPv6
99: draft-ietf-dhc-dhcpv6exts-12.txt: Extensions for DHCPv6
1.3 itojun 100: * kame/dhcp6 has test implementation, which will not be compiled in
101: default compilation.
1.8 itojun 102: draft-itojun-ipv6-tcp-to-anycast-00.txt:
1.1 itojun 103: Disconnecting TCP connection toward IPv6 anycast address
1.11 itojun 104: draft-ietf-ipngwg-scopedaddr-format-01.txt:
1.3 itojun 105: An Extension of Format for IPv6 Scoped Addresses
1.12 itojun 106: draft-ietf-ngtrans-tcpudp-relay-01.txt:
1.5 itojun 107: An IPv6-to-IPv4 transport relay translator
108: * FAITH tcp relay translator (faithd) implements this. See 3.1 for more
109: details.
1.13 itojun 110: draft-ietf-ngtrans-6to4-06.txt:
1.12 itojun 111: Connection of IPv6 Domains via IPv4 Clouds without Explicit Tunnels
112: * "stf" interface implements it. Be sure to read the next item before
113: configuring it, there are security issues.
114: http://playground.iijlab.net/i-d/draft-itojun-ipv6-transition-abuse-00.txt:
115: Possible abuse against IPv6 transition technologies
116: * KAME does not implement RFC1933 automatic tunnel.
117: * "stf" interface implements some address filters. Refer to stf(4)
118: for details. Since there's no way to make 6to4 interface 100% secure,
119: we do not include "stf" interface into GENERIC.v6 compilation.
120: * kame/openbsd completely disables IPv4 mapped address support.
121: * kame/netbsd makes IPv4 mapped address support off by default.
1.13 itojun 122: * See section 12.6 and 14 for more details.
1.1 itojun 123:
124: 1.2 Neighbor Discovery
125:
126: Neighbor Discovery is fairly stable. Currently Address Resolution,
127: Duplicate Address Detection, and Neighbor Unreachability Detection
1.9 itojun 128: are supported. In the near future we will add a command for transmitting
129: Unsolicited Neighbor Advertisements as an admin tool.
1.1 itojun 130:
1.5 itojun 131: Duplicate Address Detection (DAD) is performed when an IPv6 address
132: is assigned to a network interface, or the network interface is enabled
133: (ifconfig up). It is documented in RFC2462 5.4.
1.1 itojun 134: If DAD fails, the address is marked "duplicated" and a message is
135: written to syslog (and usually to the console). The "duplicated" mark
1.3 itojun 136: can be checked with ifconfig. It is the administrator's responsibility to check
1.5 itojun 137: for and recover from DAD failures. We may try to improve failure recovery
138: in future KAME code.
139: The DAD procedure may not be effective on certain network interfaces/drivers.
140: If a network driver needs a long initialization time (this is common with
141: wireless network interfaces), and the driver mistakenly raises
142: IFF_RUNNING before it becomes ready, the DAD code will try to transmit
143: DAD probes to the not-yet-ready driver and the packets will not go out
144: from the interface. In such cases, the network driver should be corrected.
1.1 itojun 145:
1.5 itojun 146: Some network drivers loop multicast packets back to themselves,
1.1 itojun 147: even if instructed not to do so (especially in promiscuous mode).
148: In such cases DAD may fail, because the DAD engine sees an inbound NS packet
149: (actually from the node itself) and considers it a sign of a duplicate.
1.3 itojun 150: You may want to look at the #if condition marked "heuristics" in
151: sys/netinet6/nd6_nbr.c:nd6_dad_timer() as a workaround (note that the code
152: fragment in the "heuristics" section is not spec conformant).
1.1 itojun 153:
154: Neighbor Discovery specification (RFC2461) does not talk about neighbor
155: cache handling in the following cases:
1.3 itojun 156: (1) when there was no neighbor cache entry, node received unsolicited
157: RS/NS/NA/redirect packet without link-layer address
1.1 itojun 158: (2) neighbor cache handling on medium without link-layer address
159: (we need a neighbor cache entry for IsRouter bit)
160: For (1), we implemented workaround based on discussions on IETF ipngwg mailing
161: list. For more details, see the comments in the source code and email
162: thread started from (IPng 7155), dated Feb 6 1999.
163:
164: The IPv6 on-link determination rule (RFC2461) is quite different from assumptions
1.5 itojun 165: in BSD IPv4 network code. To implement the behavior in RFC2461 section 5.2
166: (when the default router list is empty), the kernel needs to know the default
167: outgoing interface. To configure the default outgoing interface, use
168: commands like "ndp -I de0" as root. Note that the spec misuses the words
169: "host" and "node" in several places in the section.
1.1 itojun 170:
171: To avoid possible DoS attacks and infinite loops, KAME stack will accept
172: only 10 options on ND packet. Therefore, if you have 20 prefix options
173: attached to RA, only the first 10 prefixes will be recognized.
174: If this troubles you, please contact KAME team and/or modify
1.3 itojun 175: nd6_maxndopt in sys/netinet6/nd6.c. If there are high demands we may
176: provide sysctl knob for the variable.
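
The effect of the option limit can be illustrated with a small userland
sketch. This is not the KAME kernel code: the function name, the constant
name and the error handling are assumptions made for illustration. It only
relies on the ND option layout from RFC2461 (one octet of type, one octet of
length counted in units of 8 octets, and a length of zero being invalid).

```python
import struct

# Mirrors the limit of 10 described above; the constant name is illustrative.
MAX_ND_OPTIONS = 10


def parse_nd_options(buf, max_options=MAX_ND_OPTIONS):
    """Return up to max_options (type, data) pairs from an ND option area."""
    options = []
    offset = 0
    while offset + 2 <= len(buf) and len(options) < max_options:
        opt_type, opt_len = struct.unpack_from("!BB", buf, offset)
        if opt_len == 0:
            # A zero length would never advance the offset (infinite loop).
            raise ValueError("ND option with zero length")
        size = opt_len * 8  # the length field counts units of 8 octets
        if offset + size > len(buf):
            raise ValueError("ND option runs past the end of the packet")
        options.append((opt_type, buf[offset + 2:offset + size]))
        offset += size
    return options
```

With 20 prefix information options in the buffer, the sketch returns only
the first 10 and silently ignores the rest, which is the behavior described
above for RA processing.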
1.9 itojun 177:
178: Proxy Neighbor Advertisement support is implemented in the kernel.
179: You can configure it by using the following command:
180: # ndp -s fe80:1::1234 0:1:2:3:4:5 proxy
181: You need to fill in scope index into the address - see 1.3.3.
182: There are certain limitations, though:
183: - It does not send unsolicited multicast NA on configuration. This is MAY
184: behavior in RFC2461.
185: - It does not add random delay before transmission of solicited NA. This is
186: SHOULD behavior in RFC2461.
187: - We cannot configure proxy NDP for off-link address. The target address for
188: proxying must be link-local address, or must be in prefixes configured to
189: node which does proxy NDP.
190: - RFC2461 is unclear about whether it is legal for a host to perform proxy ND.
191: We do not prohibit hosts from doing proxy ND, but it is of very limited
192: use.
1.1 itojun 193:
1.12 itojun 194: Starting mid March 2000, we support Neighbor Unreachability Detection (NUD)
195: on p2p interfaces, including tunnel interfaces (gif). NUD is turned on by
196: default. Before March 2000 KAME stack did not perform NUD on p2p interfaces.
197: If the change raises any interoperability issues, you can turn NUD off/on
198: on a per-interface basis. Use "ndp -i interface -nud" to turn it off.
199: Consult ndp(8) for details.
200:
1.1 itojun 201: 1.3 Scope Index
202:
1.5 itojun 203: IPv6 uses scoped addresses. It is therefore very important to
1.1 itojun 204: specify scope index (interface index for link-local address, or
205: site index for site-local address) with an IPv6 address. Without
1.5 itojun 206: scope index, a scoped IPv6 address is ambiguous to the kernel, and
207: the kernel will not be able to determine the outbound interface for a
208: packet. KAME code tries to address the issue in several ways.
209:
1.6 itojun 210: Site-local address is very vaguely defined in the specs, and both the specification
211: and KAME code need many improvements to enable its actual use.
212: For example, it is still very unclear how we define a site, or how we resolve
213: hostnames in a site. There is work underway to define the behavior of routers
214: at site borders; however, we have almost no code for site boundary node support
215: (neither forwarding nor routing) and we bet almost no one has.
216: For now, we recommend that you use global addresses for experiments -
217: there are way too many pitfalls if you use site-local addresses.
218:
1.5 itojun 219: 1.3.1 Kernel internal
220:
221: In the kernel, the interface index for a link-local scope address is
222: embedded into the 2nd 16bit-word (the 3rd and 4th bytes) in the IPv6
223: address.
1.3 itojun 224: For example, you may see something like:
1.1 itojun 225: fe80:1::200:f8ff:fe01:6317
226: in the routing table and interface address structure (struct
1.5 itojun 227: in6_ifaddr). The address above is a link-local unicast address
1.3 itojun 228: which belongs to a network interface whose interface index is 1.
229: The embedded index enables us to identify IPv6 link local
1.1 itojun 230: addresses over multiple interfaces effectively and with only a
231: little code change.
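
The embedding can be pictured with the following userland sketch. It is
only an illustration of the byte layout described above, not kernel code;
the helper names embed_ifindex() and extract_ifindex() are hypothetical.
The sketch assumes the index is stored in network byte order in the 3rd and
4th bytes, and it treats fe80::/10 unicast and scope-2 multicast addresses
as link-local scope.

```python
import ipaddress


def _has_link_local_scope(raw):
    """True for fe80::/10 unicast or multicast with scope field 2."""
    unicast = raw[0] == 0xFE and (raw[1] & 0xC0) == 0x80
    multicast = raw[0] == 0xFF and (raw[1] & 0x0F) == 0x02
    return unicast or multicast


def embed_ifindex(addr, ifindex):
    """Return the kernel-internal form with ifindex in bytes 3 and 4."""
    raw = bytearray(ipaddress.IPv6Address(addr).packed)
    if not _has_link_local_scope(raw):
        raise ValueError("not a link-local scope address")
    if not 0 < ifindex <= 0xFFFF:
        raise ValueError("interface index does not fit in 16 bits")
    raw[2:4] = ifindex.to_bytes(2, "big")
    return str(ipaddress.IPv6Address(bytes(raw)))


def extract_ifindex(embedded):
    """Split the kernel-internal form into (standard form, ifindex)."""
    raw = bytearray(ipaddress.IPv6Address(embedded).packed)
    if not _has_link_local_scope(raw):
        raise ValueError("not a link-local scope address")
    ifindex = int.from_bytes(raw[2:4], "big")
    raw[2:4] = b"\x00\x00"
    return str(ipaddress.IPv6Address(bytes(raw))), ifindex
```

For example, embed_ifindex("fe80::200:f8ff:fe01:6317", 1) yields the
address shown above, and extract_ifindex() reverses the operation.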
1.5 itojun 232:
233: 1.3.2 Interaction with API
234:
235: Ordinary userland applications should use the advanced API (RFC2292)
236: to specify scope index, or interface index. For a similar purpose,
237: the sin6_scope_id member in the sockaddr_in6 structure is defined in
238: RFC2553. However, the semantics of sin6_scope_id are rather vague.
239: If you care about portability of your application, we suggest that you
240: use the advanced API rather than sin6_scope_id.
241:
1.1 itojun 242: Routing daemons and configuration programs, like route6d and
243: ifconfig, will need to manipulate the "embedded" scope index.
244: These programs use routing sockets and ioctls (like SIOCGIFADDR_IN6)
1.3 itojun 245: and the kernel API will return IPv6 addresses with 2nd 16bit-word
1.1 itojun 246: filled in. The APIs are for manipulating kernel internal structure.
247: Programs that use these APIs have to be prepared for differences
248: in kernels anyway.
249:
1.5 itojun 250: getaddrinfo(3) and getnameinfo(3) are modified to support extended numeric
1.11 itojun 251: IPv6 syntax, as documented in draft-ietf-ipngwg-scopedaddr-format-01.txt.
1.5 itojun 252: You can specify outgoing link, by using name of the outgoing interface
1.8 itojun 253: like "fe80::1%ne0". This way you will be able to specify link-local scoped
1.5 itojun 254: address without much trouble.
255: To use this extension in your program, you'll need to use getaddrinfo(3),
256: and getnameinfo(3) with NI_WITHSCOPEID.
257: The implementation currently assumes 1-to-1 relationship between a link and an
258: interface, which is stronger than what IPv6 specs say.
259: Other APIs like inet_pton(3) or getipnodebyname(3) are inherently unfriendly
260: with scoped addresses, since they are unable to annotate addresses with
261: scope identifier.
262:
263: 1.3.3 Interaction with users (command line)
264:
265: Some of the userland tools support extended numeric IPv6 syntax, as
1.11 itojun 266: documented in draft-ietf-ipngwg-scopedaddr-format-01.txt. In this case,
1.5 itojun 267: you can specify outgoing link, by using name of the outgoing interface like
1.8 itojun 268: "fe80::1%ne0".
1.5 itojun 269:
1.1 itojun 270: When you specify a scoped address on the command line, NEVER write the
271: embedded form (such as ff02:1::1 or fe80:2::fedc). This is not supposed
272: to work. Always use the standard form, like ff02::1 or fe80::fedc, with a
273: command line option for specifying the interface (like "ping6 -I ne0 ff02::1").
274: In general, if a command does not have a command line option to specify the
275: outgoing interface, that command is not ready to accept scoped addresses.
276: This may seem contrary to IPv6's premise of supporting the "dentist office"
277: situation. We believe that the specifications need some improvements here.
278:
1.5 itojun 279: The only exception to the above rule is when you configure the routing table
1.8 itojun 280: manually with route(8) or ndp(8). The gateway portion of an IPv6 routing entry must
281: be a link-local address (otherwise ICMPv6 redirect will not work), and in this
1.5 itojun 282: case you'll need to configure it by putting the interface index into the address:
283: # route add -inet6 default fe80:2::9876:5432:1234:5678
284: (when interface index for outgoing interface = 2)
285: To avoid configuration mistakes, we suggest that you run dynamic routing instead
286: (like route6d(8)).
1.3 itojun 287:
1.1 itojun 288: 1.4 Plug and Play
289:
290: The KAME kit implements most of the IPv6 stateless address
291: autoconfiguration in the kernel.
292: Neighbor Discovery functions are implemented in the kernel as a whole.
293: Router Advertisement (RA) input for hosts is implemented in the
294: kernel. Router Solicitation (RS) output for endhosts, RS input
295: for routers, and RA output for routers are implemented in the
296: userland.
297:
1.3 itojun 298: 1.4.1 Assignment of link-local, and special addresses
299:
1.6 itojun 300: An IPv6 link-local address is generated from the IEEE802 address (ethernet MAC
1.3 itojun 301: address). Each interface is assigned an IPv6 link-local address
302: automatically when the interface comes up (IFF_UP). Also, a direct route
303: for the link-local address is added to the routing table.
304:
305: Here is an output of netstat command:
1.1 itojun 306:
307: Internet6:
308: Destination Gateway Flags Netif Expire
1.8 itojun 309: fe80::%ed0/64 link#1 UC ed0
310: fe80::%ep0/64 link#2 UC ep0
1.1 itojun 311:
1.3 itojun 312: Interfaces that have no IEEE802 address (pseudo interfaces like tunnel
313: interfaces, or ppp interfaces) will borrow an IEEE802 address from other
314: interfaces, such as ethernet interfaces, whenever possible.
315: If there is no IEEE802 hardware attached, a last-resort pseudorandom value,
316: derived from MD5(hostname), will be used as the source of the link-local address.
317: If it is not suitable for your usage, you will need to configure the
318: link-local address manually.
319:
320: If an interface is not capable of handling IPv6 (such as lack of multicast
321: support), link-local address will not be assigned to that interface.
322: See section 2 for details.
323:
1.1 itojun 324: Each interface joins the solicited-node multicast address and the
1.3 itojun 325: link-local all-nodes multicast address (e.g. ff02::1:ff01:6317
326: and ff02::1, respectively, on the link the interface is attached to).
327: In addition to a link-local address, the loopback address (::1) will be
328: assigned to the loopback interface. Also, ::1/128 and ff01::/32 are
329: automatically added to the routing table, and the loopback interface joins
330: node-local multicast group ff01::1.
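
As an illustration of the address derivation in this section, the sketch
below builds a link-local address from an ethernet MAC address using the
EUI-64 based interface identifier from RFC2464 (flip the universal/local
bit, insert ff:fe in the middle), and then derives the solicited-node
multicast address from the low 24 bits as in RFC2373. The function names
are hypothetical and the MAC address 00:00:f8:01:63:17 is simply the one
that corresponds to the example address used in 1.3.1. The MD5(hostname)
fallback is not covered.

```python
import ipaddress


def link_local_from_mac(mac):
    """fe80::/64 plus the modified EUI-64 interface identifier."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit and insert ff:fe in the middle.
    ifid = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + ifid)


def solicited_node_multicast(addr):
    """ff02::1:ffXX:XXXX, where XX:XXXX are the low 24 bits of addr."""
    low24 = ipaddress.IPv6Address(addr).packed[-3:]
    prefix = ipaddress.IPv6Address("ff02::1:ff00:0").packed[:13]
    return ipaddress.IPv6Address(prefix + low24)
```

Because only the low 24 bits are used, the link-local address and any
autoconfigured global address that share the same interface identifier map
to the same solicited-node multicast group.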
331:
332: 1.4.2 Stateless address autoconfiguration on hosts
333:
334: In the IPv6 specification, nodes are separated into two categories:
335: routers and hosts. Routers forward packets addressed to others; hosts do
336: not forward packets. net.inet6.ip6.forwarding defines whether this
337: node is a router or a host (router if it is 1, host if it is 0).
338:
1.5 itojun 339: It is NOT recommended to change net.inet6.ip6.forwarding while the node
340: is in operation. IPv6 specification defines behavior for "host" and "router"
341: quite differently, and switching from one to another can cause serious
342: troubles. It is recommended to configure the variable at bootstrap time only.
343:
344: The first step in stateless address configuration is Duplicate Address
345: Detection (DAD). See 1.2 for more detail on DAD.
346:
1.3 itojun 347: When a host hears Router Advertisement from the router, a host may
348: autoconfigure itself by stateless address autoconfiguration.
349: This behavior can be controlled by net.inet6.ip6.accept_rtadv
350: (host autoconfigures itself if it is set to 1).
351: By autoconfiguration, network address prefix for the receiving interface
352: (usually global address prefix) is added. Default route is also configured.
353: Routers periodically generate Router Advertisement packets. To request
354: an adjacent router to generate RA packet, a host can transmit Router
355: Solicitation. To generate a RS packet at any time, use the "rtsol" command.
356: "rtsold" daemon is also available. "rtsold" generates Router Solicitation
1.1 itojun 357: whenever necessary, and it works great for nomadic usage (notebooks/laptops).
358: If one wishes to ignore Router Advertisements, use sysctl to set
359: net.inet6.ip6.accept_rtadv to 0.
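
To make the address formation concrete: per RFC2462 5.5.3, a host combines
an advertised prefix with its interface identifier. The sketch below is a
userland illustration only. The function name is hypothetical, the prefix
is an example value, and the sketch assumes a 64-bit prefix and takes the
interface identifier from the low 64 bits of the link-local address.

```python
import ipaddress


def autoconf_address(prefix, link_local):
    """Combine an advertised /64 prefix with the interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch assumes a 64-bit prefix")
    # The interface identifier is the low 64 bits of the link-local address.
    ifid = int(ipaddress.IPv6Address(link_local)) & ((1 << 64) - 1)
    return ipaddress.IPv6Address(int(net.network_address) | ifid)
```

With the prefix 3ffe:501:808:1::/64 and the link-local address
fe80::200:f8ff:fe01:6317, the result is 3ffe:501:808:1:200:f8ff:fe01:6317.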
360:
361: To generate Router Advertisement from a router, use the "rtadvd" daemon.
362:
1.3 itojun 363: Note that the IPv6 specification assumes the following items, and nonconforming
364: cases are left unspecified:
365: - Only hosts will listen to router advertisements
366: - Hosts have a single network interface (except loopback)
367: Therefore, it is unwise to enable net.inet6.ip6.accept_rtadv on routers
368: or multi-interface hosts. A misconfigured node can behave strangely
369: (KAME code allows nonconforming configurations, for those who would like
370: to do some experiments).
371:
372: To summarize the sysctl knobs:
373:     accept_rtadv  forwarding  role of the node
374:     ---           ---         ---
375:     0             0           host (to be manually configured)
376:     0             1           router
377:     1             0           autoconfigured host
378:                               (spec assumes that host has single
379:                               interface only, autoconfigured host with
380:                               multiple interfaces is out-of-scope)
381:     1             1           invalid, or experimental
382:                               (out-of-scope of spec)
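
The same summary, expressed as a lookup that a configuration checker could
use. This is a hypothetical helper written for illustration, not part of
any KAME tool; the two arguments stand for the values of
net.inet6.ip6.accept_rtadv and net.inet6.ip6.forwarding.

```python
# (accept_rtadv, forwarding) -> role of the node, as in the table above.
ROLES = {
    (0, 0): "host (to be manually configured)",
    (0, 1): "router",
    (1, 0): "autoconfigured host",
    (1, 1): "invalid, or experimental",
}


def node_role(accept_rtadv, forwarding):
    """Return the role implied by the two sysctl values (each 0 or 1)."""
    return ROLES[(accept_rtadv, forwarding)]
```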
383:
1.1 itojun 384: RFC2462 has validation rule against incoming RA prefix information option,
385: in 5.5.3 (e). This is to protect hosts from malicious (or misconfigured)
386: routers that advertise very short prefix lifetime.
387: There was an update from Jim Bound to ipngwg mailing list (look
388: for "(ipng 6712)" in the archive) and KAME implements Jim's update.
389:
390: See 1.2 in the document for relationship between DAD and autoconfiguration.
391:
1.3 itojun 392: 1.4.3 DHCPv6
393:
394: We supply a tiny DHCPv6 server/client in kame/dhcp6. However, the
395: implementation is very premature (for example, this does NOT
396: implement address lease/release), and it is not in default compilation
397: tree. If you want to do some experiment, compile it on your own.
398:
399: DHCPv6 and autoconfiguration also need more work. "Managed" and "Other"
400: bits in RA have no special effect on the stateful autoconfiguration procedure
401: in the DHCPv6 client program (the "Managed" bit actually prevents stateless
402: autoconfiguration, but no special action will be taken for the DHCPv6 client).
1.1 itojun 403:
404: 1.5 Generic tunnel interface
405:
406: GIF (Generic InterFace) is a pseudo interface for configured tunnel.
407: Details are described in gif(4) manpage.
408: Currently
409: v6 in v6
410: v6 in v4
411: v4 in v6
412: v4 in v4
413: are available. Use "gifconfig" to assign physical (outer) source
414: and destination address to gif interfaces.
415: Configuration that uses same address family for inner and outer IP
416: header (v4 in v4, or v6 in v6) is dangerous. It is very easy to
417: configure interfaces and routing tables to perform infinite level
418: of tunneling. Please be warned.
419:
420: gif can be configured to be ECN-friendly. See 4.5 for ECN-friendliness
421: of tunnels, and gif(4) manpage for how to configure.
422:
1.3 itojun 423: If you would like to configure an IPv4-in-IPv6 tunnel with gif interface,
1.5 itojun 424: read gif(4) carefully. You may need to remove IPv6 link-local address
1.3 itojun 425: automatically assigned to the gif interface.
426:
1.1 itojun 427: 1.6 Source Address Selection
428:
1.13 itojun 429: KAME's source address selection takes care of the following
430: conditions:
431: - address scope
432: - prefix matching against the destination
433: - outgoing interface
434: - whether an address is deprecated
435:
436: Roughly speaking, the selection policy is as follows:
437: - always use an address that belongs to the same scope zone as the
438: destination.
439: - addresses that have equal or larger scope than the scope of the
440: destination are preferred.
441: - if multiple addresses have the equal scope, one which is longest
442: prefix matching against the destination is preferred.
443: - a deprecated address is not used in new communications if an
444: alternate (non-deprecated) address is available and has sufficient
445: scope.
446: - if none of above conditions tie-breaks, addresses assigned on the
447: outgoing interface are preferred.
448:
449: For instance, ::1 is selected for ff01::1,
450: fe80::200:f8ff:fe01:6317%ne0 for fe80::2a0:24ff:feab:839b%ne0.
451: To see how longest-matching works, suppose that
1.3 itojun 452: 3ffe:501:808:1:200:f8ff:fe01:6317 and 3ffe:2001:9:124:200:f8ff:fe01:6317
1.13 itojun 453: are given on the outgoing interface. Then the former is chosen as the
454: source for the destination 3ffe:501:800::1. Note that even if all
455: available addresses have smaller scope than the scope of the
456: destination, we choose one anyway. For example, if we have link-local
457: and site-local addresses only, we choose a site-local address for a
458: global destination. If the packet is going to break a site boundary,
459: the boundary router will return an ICMPv6 destination unreachable
460: error with code 2 - beyond scope of source address.
461:
462: The precise description of the algorithm is quite complicated. To
463: describe the algorithm, we introduce the following notation:
464:
465: For a given destination D,
466: samescope(D): A set of addresses that have the same scope as D.
467: largerscope(D): A set of addresses that have a larger scope than D.
468: smallerscope(D): A set of addresses that have a smaller scope than D.
469:
470: For a given set of addresses A,
471: DEP(A): a set of deprecated addresses in A.
472: nonDEP(A): A - DEP(A).
473:
474: Also, the algorithm assumes that the outgoing interface for the
475: destination D is determined. We call the interface "I".
476:
477: The algorithm is as follows. Selection proceeds step by step as
478: described. For example, if an address is selected by item 1, items 2 and
479: later are not considered at all.
480:
481: 0. If there is no address in the same scope zone as D, just give up;
482: the packet will not be sent.
483: 1. If nonDEP(samescope(D)) is not empty,
484: choose a longest matching address against D. If more than one
485: address is longest matching, choose arbitrary one provided that
486: an address on I is always preferred.
487: 2. If nonDEP(largerscope(D)) is not empty,
488: choose an address that has the smallest scope. If more than one
489: address has the smallest scope, choose arbitrary one provided
490: that an address on I is always preferred.
491: 3. If DEP(samescope(D)) is not empty,
492: choose a longest matching address against D. If more than one
493: address is longest matching, choose arbitrary one provided that
494: an address on I is always preferred.
495: 4. If DEP(largerscope(D)) is not empty,
496: choose an address that has the smallest scope. If more than one
497: address has the smallest scope, choose arbitrary one provided
498: that an address on I is always preferred.
499: 5. If nonDEP(smallerscope(D)) is not empty,
500: choose an address that has the largest scope. If more than one
501: address has the largest scope, choose arbitrary one provided
502: that an address on I is always preferred.
503: 6. If DEP(smallerscope(D)) is not empty,
504: choose an address that has the largest scope. If more than one
505: address has the largest scope, choose arbitrary one provided
506: that an address on I is always preferred.
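
Items 1 to 6 can be sketched in userland as follows. This is a simplified
model for illustration, not the kernel implementation: the Candidate type
and all function names are assumptions. Scope zones are not modeled, so
item 0 is reduced to "return None when there is no candidate at all" and
every candidate is assumed to be in the same scope zone as D. Scopes are
compared using the multicast scope values from RFC2373, and ::1 is treated
as node-local.

```python
import ipaddress
from collections import namedtuple

# One candidate source: textual address, interface name, deprecated flag.
Candidate = namedtuple("Candidate", "address ifname deprecated")

# Scope values borrowed from the multicast scope field (RFC2373).
NODE_LOCAL, LINK_LOCAL, SITE_LOCAL, GLOBAL = 0x1, 0x2, 0x5, 0xE


def scope_of(addr):
    raw = ipaddress.IPv6Address(addr).packed
    if raw[0] == 0xFF:                       # multicast: explicit scope field
        return raw[1] & 0x0F
    if raw == ipaddress.IPv6Address("::1").packed:
        return NODE_LOCAL
    if raw[0] == 0xFE and (raw[1] & 0xC0) == 0x80:
        return LINK_LOCAL                    # fe80::/10
    if raw[0] == 0xFE and (raw[1] & 0xC0) == 0xC0:
        return SITE_LOCAL                    # fec0::/10
    return GLOBAL


def match_len(a, b):
    """Number of leading bits shared by two addresses."""
    diff = int(ipaddress.IPv6Address(a)) ^ int(ipaddress.IPv6Address(b))
    return 128 - diff.bit_length()


def select_source(dst, candidates, out_if):
    d = scope_of(dst)
    same = [c for c in candidates if scope_of(c.address) == d]
    larger = [c for c in candidates if scope_of(c.address) > d]
    smaller = [c for c in candidates if scope_of(c.address) < d]

    def dep(pool):
        return [c for c in pool if c.deprecated]

    def nondep(pool):
        return [c for c in pool if not c.deprecated]

    def longest(c):
        return match_len(c.address, dst)

    def smallest(c):
        return -scope_of(c.address)

    def largest(c):
        return scope_of(c.address)

    steps = [                                # items 1..6, in order
        (nondep(same), longest),
        (nondep(larger), smallest),
        (dep(same), longest),
        (dep(larger), smallest),
        (nondep(smaller), largest),
        (dep(smaller), largest),
    ]
    for pool, rank in steps:
        if pool:
            # Ties on rank are broken in favor of addresses on interface I.
            return max(pool, key=lambda c: (rank(c), c.ifname == out_if))
    return None
```

With the two 3ffe: addresses from the example above as candidates and
3ffe:501:800::1 as the destination, the sketch picks
3ffe:501:808:1:200:f8ff:fe01:6317, since it shares 44 leading bits with the
destination against 18 for the other address.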
507:
508: There exists a document about source address selection
509: (draft-ietf-ipngwg-default-addr-select-xx.txt). KAME's algorithm
510: described above takes a similar approach to the document, but there
511: are some differences. See the document for more details.
1.1 itojun 512:
513: There are some cases where we do not use the above rule. One
1.13 itojun 514: example is a connected TCP session, where we use the address kept in the TCP
515: protocol control block (tcb) as the source.
1.1 itojun 516: Another example is the source address for Neighbor Advertisement.
517: Under the spec (RFC2461 7.2.2) the NA's source should be the target
518: address of the corresponding NS. In this case we follow
519: the spec rather than the above longest-match rule.
520:
1.12 itojun 521: If you would like to prohibit the use of deprecated address for some
522: reason, configure net.inet6.ip6.use_deprecated to 0. The issue
523: related to deprecated address is described in RFC2462 5.5.4 (NOTE:
524: there is some debate underway in IETF ipngwg on how to use
1.3 itojun 525: "deprecated" address).
526:
1.1 itojun 527: 1.7 Jumbo Payload
528:
529: KAME supports the Jumbo Payload hop-by-hop option used to send IPv6
530: packets with payloads longer than 65,535 octets. But since currently
531: KAME does not support any physical interface whose MTU is more than
532: 65,535, such payloads can be seen only on the loopback interface (i.e.
533: lo0).
534:
535: If you want to try jumbo payloads, you first have to reconfigure the
536: kernel so that the MTU of the loopback interface is more than 65,535
537: bytes; add the following to the kernel configuration file:
538: options "LARGE_LOMTU" #To test jumbo payload
539: and recompile the new kernel.
540:
541: Then you can test jumbo payloads by the ping6 command with -b and -s
542: options. The -b option must be specified to enlarge the size of the
543: socket buffer and the -s option specifies the length of the packet,
544: which should be more than 65,535. For example, type the following:
545: % ping6 -b 70000 -s 68000 ::1
546:
547: The IPv6 specification requires that the Jumbo Payload option must not
548: be used in a packet that carries a fragment header. If this condition
549: is broken, an ICMPv6 Parameter Problem message must be sent to the
550: sender. KAME kernel follows the specification, but you cannot usually
551: see an ICMPv6 error caused by this requirement.
552:
553: If KAME kernel receives an IPv6 packet, it checks the frame length of
554: the packet and compares it to the length specified in the payload
555: length field of the IPv6 header or in the value of the Jumbo Payload
556: option, if any. If the former is shorter than the latter, KAME kernel
557: discards the packet and increments the statistics. You can see the
558: statistics as output of netstat command with `-s -p ip6' option:
559: % netstat -s -p ip6
560: ip6:
561: (snip)
562: 1 with data size < data length
563:
564: So, KAME kernel does not send an ICMPv6 error unless the erroneous
565: packet is an actual Jumbo Payload, that is, its packet size is more
566: than 65,535 bytes. As described above, KAME kernel currently does not
567: support physical interface with such a huge MTU, so it rarely returns an
568: ICMPv6 error.
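
The length comparison described above can be sketched in userland as
follows. This is an illustration under stated assumptions, not the code in
ip6_input(): the function names are hypothetical, "frame length" is taken
to be the length of the byte string, and only the layout from RFC2460 and
RFC2675 is relied upon (payload length field of 0, a hop-by-hop options
header, and a Jumbo Payload option of type 0xC2 carrying a 32-bit length).

```python
import struct

IPV6_HDRLEN = 40
IPPROTO_HOPOPTS = 0
OPT_PAD1 = 0x00
OPT_JUMBO = 0xC2


def declared_payload_length(packet):
    """Payload length claimed by the headers: ip6_plen or the Jumbo option."""
    if len(packet) < IPV6_HDRLEN:
        raise ValueError("shorter than an IPv6 header")
    plen, nxt = struct.unpack_from("!HB", packet, 4)
    if plen != 0 or nxt != IPPROTO_HOPOPTS or len(packet) < IPV6_HDRLEN + 8:
        return plen
    # Walk the options of the hop-by-hop header looking for the Jumbo option.
    end = min(IPV6_HDRLEN + (packet[IPV6_HDRLEN + 1] + 1) * 8, len(packet))
    off = IPV6_HDRLEN + 2
    while off < end:
        if packet[off] == OPT_PAD1:          # Pad1 is a single octet
            off += 1
            continue
        if off + 2 > end:
            break
        opt_len = packet[off + 1]
        if packet[off] == OPT_JUMBO and opt_len == 4 and off + 6 <= end:
            return struct.unpack_from("!I", packet, off + 2)[0]
        off += 2 + opt_len
    return 0


def is_truncated(packet):
    """True when the received data is shorter than the declared length."""
    return len(packet) - IPV6_HDRLEN < declared_payload_length(packet)
```

A packet for which is_truncated() is true corresponds to the "with data
size < data length" counter shown above.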
569:
570: TCP/UDP over jumbogram is not supported at this moment. This is because
571: we have no medium (other than loopback) to test this. Contact us if you
572: need this.
573:
574: IPsec does not work on jumbograms. This is due to some specification twists
1.3 itojun 575: in supporting AH with jumbograms (AH header size influences payload length,
576: and this makes it real hard to authenticate inbound packet with jumbo payload
577: option as well as AH).
578:
579: There are fundamental issues in *BSD support for jumbograms. We would like to
1.12 itojun 580: address those, but we need more time to finalize the task. To name a few:
581: - mbuf pkthdr.len field is typed as "int" in 4.4BSD, so it cannot hold
1.3 itojun 582: jumbogram with len > 2G on 32bit architecture CPUs. If we would like to
583: support jumbogram properly, the field must be expanded to hold 4G +
584: IPv6 header + link-layer header. Therefore, it must be expanded to at least
585: int64_t (u_int32_t is NOT enough).
586: - We mistakenly use "int" to hold packet length in many places. We need
1.12 itojun 587: to convert them into a larger numeric type. This needs great care, as we may
1.3 itojun 588: experience overflow during packet length computation.
589: - We mistakenly check the ip6_plen field of the IPv6 header for packet payload
590: length in various places. We should be checking mbuf pkthdr.len instead.
591: ip6_input() will perform sanity check on jumbo payload option on input,
592: and we can safely use mbuf pkthdr.len afterwards.
1.12 itojun 593: - TCP code needs careful updates in a bunch of places, of course.
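
The arithmetic behind the first item can be checked with a few lines of C.
This is only an illustration of the size argument; the helper names are
hypothetical, and the link-layer header is left out, which only makes the
real requirement larger.

```c
#include <limits.h>
#include <stdint.h>

#define IP6_HDRLEN 40u    /* fixed IPv6 header size */

/* Largest IPv6 packet a Jumbo Payload option can describe. */
static uint64_t
max_jumbogram_len(void)
{
    /* 32-bit jumbo payload length plus the IPv6 header in front of it. */
    return (uint64_t)UINT32_MAX + IP6_HDRLEN;
}

/* Would a packet length of this size fit into an "int" field? */
static int
fits_in_int(uint64_t len)
{
    return len <= (uint64_t)INT_MAX;
}

/* Would it fit into an unsigned 32-bit field? */
static int
fits_in_u32(uint64_t len)
{
    return len <= (uint64_t)UINT32_MAX;
}
```

The maximum does not fit into either type, which is why a 64-bit field is
needed for the packet header length.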
1.1 itojun 594:
595: 1.8 Loop prevention in header processing
596:
597: IPv6 specification allows arbitrary number of extension headers to
598: be placed onto packets. If we implement IPv6 packet processing
599: code in the way BSD IPv4 code is implemented, kernel stack may
1.3 itojun 600: overflow due to long function call chain. KAME sys/netinet6 code
1.1 itojun 601: is carefully designed to avoid kernel stack overflow. Because of
602: this, KAME sys/netinet6 code defines its own protocol switch
603: structure, as "struct ip6protosw" (see netinet6/ip6protosw.h).
604: IPv4 part (sys/netinet) remains untouched for compatibility.
605: Because of this, if you receive IPsec-over-IPv4 packet with massive
606: number of IPsec headers, kernel stack may blow up. IPsec-over-IPv6 is okay.
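
The idea can be sketched as a loop that asks each handler for the next
header value instead of letting handlers call each other. The sketch below
is a userland illustration only: PROTO_DONE, the handler signature and the
lookup function are hypothetical stand-ins, not the real struct ip6protosw
interface.

```c
#include <stddef.h>
#include <stdint.h>

#define PROTO_DONE 257    /* hypothetical "stop processing" value */

typedef int (*input_fn)(const uint8_t *pkt, size_t len, size_t *offp);

/* Generic extension header: byte 0 = next header, byte 1 = length in
 * 8-octet units, not counting the first 8 octets. */
static int
exthdr_input(const uint8_t *pkt, size_t len, size_t *offp)
{
    size_t off = *offp, hlen;

    if (off > len || len - off < 8)
        return PROTO_DONE;            /* truncated */
    hlen = ((size_t)pkt[off + 1] + 1) * 8;
    if (hlen > len - off)
        return PROTO_DONE;            /* truncated */
    *offp = off + hlen;
    return pkt[off];                  /* tell the caller what comes next */
}

/* Upper-layer protocol: consumes the rest of the packet. */
static int
upper_input(const uint8_t *pkt, size_t len, size_t *offp)
{
    (void)pkt; (void)len; (void)offp;
    return PROTO_DONE;
}

static input_fn
lookup(int nxt)
{
    switch (nxt) {
    case 0: case 43: case 60:         /* hop-by-hop, routing, dest. options */
        return exthdr_input;
    default:
        return upper_input;
    }
}

/* Stack depth stays constant no matter how many headers are chained. */
static int
process_headers(const uint8_t *pkt, size_t len, int first_nxt)
{
    size_t off = 0;
    int nxt = first_nxt, nhandlers = 0;

    while (nxt != PROTO_DONE) {
        nxt = lookup(nxt)(pkt, len, &off);
        nhandlers++;
    }
    return nhandlers;
}
```

Because every handler returns to the loop before the next one runs, a packet
with hundreds of extension headers uses the same amount of stack as a packet
with one.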
607:
608: 1.9 ICMPv6
609:
610: After RFC2463 was published, IETF ipngwg has decided to disallow ICMPv6 error
611: packet against ICMPv6 redirect, to prevent ICMPv6 storm on a network medium.
612: KAME already implements this in the kernel.
613:
614: 1.10 Applications
615:
616: For userland programming, we support IPv6 socket API as specified in
617: RFC2553, RFC2292 and upcoming internet drafts.
618:
619: TCP/UDP over IPv6 is available and quite stable. You can enjoy "telnet",
620: "ftp", "rlogin", "rsh", "ssh", etc. These applications are protocol
621: independent. That is, they automatically choose IPv4 or IPv6
622: according to DNS.
623:
624: 1.11 Kernel Internals
625:
1.3 itojun 626: (*) TCP/UDP part is handled differently between operating system platforms.
627: See 1.12 for details.
1.1 itojun 628:
629: The current KAME has escaped from the IPv4 netinet logic. While
630: ip_forward() calls ip_output(), ip6_forward() directly calls
631: if_output() since routers must not divide IPv6 packets into fragments.
632:
633: ICMPv6 should contain as much of the original packet as possible, up to
634: 1280 bytes. UDP6/IP6 port unreach, for instance, should contain all
635: extension headers and the *unchanged* UDP6 and IP6 headers.
636: So, all IP6 functions except TCP6 never convert network byte
637: order into host byte order, to save the original packet.
638:
639: tcp6_input(), udp6_input() and icmp6_input() can't assume that the IP6
640: header immediately precedes the transport header, because of extension
641: headers. So, in6_cksum() was implemented to handle packets whose IP6
642: header and transport header are not contiguous. Neither TCP/IP6 nor UDP/IP6
643: header structures exist for checksum calculation.
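
To illustrate why no combined header structure is needed, here is a minimal
sketch of an upper-layer checksum that takes the addresses and the transport
data as separate pieces. It is not in6_cksum() itself (which walks mbuf
chains); the function names and the flat-buffer interface are assumptions
made for this example. The pseudo-header layout follows the IPv6
specification: source, destination, 32-bit upper-layer length, three zero
bytes and the next header value.

```c
#include <stddef.h>
#include <stdint.h>

/* Add a byte string to a running sum of big-endian 16-bit words. */
static uint64_t
sum_bytes(uint64_t sum, const uint8_t *p, size_t len)
{
    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)p[0] << 8;   /* pad the odd trailing byte */
    return sum;
}

/* Checksum over pseudo-header plus upper-layer data, wherever the data
 * happens to live relative to the IPv6 header. */
static uint16_t
cksum6(const uint8_t src[16], const uint8_t dst[16], uint8_t nxt,
    const uint8_t *upper, uint32_t upper_len)
{
    uint8_t ph[8];
    uint64_t sum = 0;

    ph[0] = (uint8_t)(upper_len >> 24);
    ph[1] = (uint8_t)(upper_len >> 16);
    ph[2] = (uint8_t)(upper_len >> 8);
    ph[3] = (uint8_t)upper_len;
    ph[4] = ph[5] = ph[6] = 0;
    ph[7] = nxt;

    sum = sum_bytes(sum, src, 16);
    sum = sum_bytes(sum, dst, 16);
    sum = sum_bytes(sum, ph, sizeof(ph));
    sum = sum_bytes(sum, upper, upper_len);

    while (sum >> 16)                 /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Running the function again over data that already contains a correct
checksum yields 0, which is the usual way to verify a received packet.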
644:
645: To process IP6 header, extension headers and transport headers easily,
646: KAME requires network drivers to store packets in one internal mbuf or
647: one or more external mbufs. A typical old driver prepares two
648: internal mbufs for 100 - 208 bytes data, however, KAME's reference
649: implementation stores it in one external mbuf.
650:
651: "netstat -s -p ip6" tells you whether or not your driver conforms to
652: KAME's requirement. In the following example, "cce0" violates the
653: requirement. (For more information, refer to Section 2.)
654:
655: Mbuf statistics:
656: 317 one mbuf
657: two or more mbuf:
658: lo0 = 8
659: cce0 = 10
660: 3282 one ext mbuf
661: 0 two or more ext mbuf
662:
663: Each input function calls IP6_EXTHDR_CHECK in the beginning to check
664: if the region between IP6 and its header is
665: contiguous. IP6_EXTHDR_CHECK calls m_pullup() only if the mbuf has
666: M_LOOP flag, that is, the packet comes from the loopback
667: interface. m_pullup() is never called for packets coming from physical
668: network interfaces.
669:
670: TCP6 reassembly makes use of IP6 header to store reassembly
671: information. IP6 is not supposed to be just before TCP6, so
672: ip6tcpreass structure has a pointer to TCP6 header. Of course, it has
673: also a pointer back to mbuf to avoid m_pullup().
674:
675: Like TCP6, both IP and IP6 reassembly functions never call m_pullup().
676:
677: xxx_ctlinput() calls in_mrejoin() on PRC_IFNEWADDR. We think this is
678: one of 4.4BSD implementation flaws. Since 4.4BSD keeps ia_multiaddrs
679: in in_ifaddr{}, it can't use multicast feature if the interface has no
680: unicast address. So, if an application joins a group on an interface and then
681: all unicast addresses are removed from the interface, the application
682: can't send/receive any multicast packets. Moreover, if a new unicast
683: address is assigned to the interface, in_mrejoin() must be called.
684: KAME's interfaces, however, ALWAYS have one link-local unicast
685: address. These extensions have thus not been implemented in KAME.
686:
687: 1.12 IPv4 mapped address and IPv6 wildcard socket
688:
689: RFC2553 describes IPv4 mapped address (3.7) and special behavior
690: of IPv6 wildcard bind socket (3.8). The spec allows you to:
1.4 itojun 691: - Accept IPv4 connections by AF_INET6 wildcard bind socket.
1.1 itojun 692: - Transmit IPv4 packet over AF_INET6 socket by using special form of
693: the address like ::ffff:10.1.1.1.
1.3 itojun 694: but the spec itself is very complicated and does not specify how the
695: socket layer should behave.
1.4 itojun 696: Here we call the former one "listening side" and the latter one "initiating
697: side", for reference purposes.
1.1 itojun 698:
1.4 itojun 699: Almost all KAME implementations treat tcp/udp port number space separately
1.6 itojun 700: between IPv4 and IPv6. You can perform wildcard bind on both of the address
1.4 itojun 701: families, on the same port.
702:
703: There are some OS-platform differences in KAME code, as we use tcp/udp
704: code from different origins. The following table summarizes the behavior.
705:
706: listening side initiating side
1.6 itojun 707: (AF_INET6 wildcard (connection to ::ffff:10.1.1.1)
1.4 itojun 708: socket gets IPv4 conn.)
709: --- ---
710: KAME/BSDI3 not supported not supported
711: KAME/FreeBSD228 not supported not supported
712: KAME/FreeBSD3x configurable supported
713: default: enabled
714: KAME/NetBSD configurable supported
715: default: disabled
1.12 itojun 716: KAME/BSDI4 enabled supported
1.4 itojun 717: KAME/OpenBSD not supported not supported
1.1 itojun 718:
1.4 itojun 719: The following sections will give you more details, and how you can
1.3 itojun 720: configure the behavior.
1.1 itojun 721:
1.4 itojun 722: Comments on listening side:
723:
724: It looks like RFC2553 says too little about the wildcard bind issue,
1.12 itojun 725: specifically on (1) port space issue, (2) failure mode, (3) relationship
726: between AF_INET/INET6 wildcard bind like ordering constraint, and (4) behavior
727: when conflicting socket is opened/closed. There can be several separate
1.4 itojun 728: interpretations of this RFC which conform to it but behave differently.
729: So, to implement a portable application you should assume nothing
730: about the behavior in the kernel. Using getaddrinfo() is the safest way.
731: Port number space and wildcard bind issues were discussed in detail
732: on ipv6imp mailing list, in mid March 1999 and it looks like there's
733: no concrete consensus (meaning it is up to implementers). You may want to
734: check the mailing list archives.
735: We supply a tool called "bindtest" that explores the behavior of
736: kernel bind(2). The tool will not be compiled by default.
737:
738: If a server application would like to accept IPv4 and IPv6 connections,
739: it should use AF_INET and AF_INET6 socket (you'll need two sockets).
740: Use getaddrinfo() with AI_PASSIVE in ai_flags, and socket(2) and bind(2)
741: to all the addresses returned.
742: By opening multiple sockets, you can accept connections onto the socket with
743: proper address family. IPv4 connections will be accepted by AF_INET socket,
744: and IPv6 connections will be accepted by AF_INET6 socket (NOTE: KAME/BSDI4
745: kernel sometimes violates this - we will fix it).
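
The recommendation above can be sketched as follows. The function name
listen_all() and the MAXSOCK limit are hypothetical. The IPV6_V6ONLY call is
an assumption about newer systems, where the per-socket option discussed in
this section is available under that name; it is guarded by #ifdef and its
result is ignored, so the sketch still builds and runs where it is missing.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <string.h>
#include <unistd.h>

#define MAXSOCK 8    /* hypothetical upper bound for this sketch */

/* Open one listening socket per address returned by getaddrinfo().
 * Returns the number of sockets stored in socks[], or -1 on error. */
static int
listen_all(const char *port, int socks[MAXSOCK])
{
    struct addrinfo hints, *res, *ai;
    int n = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* never hardcode the family */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE;      /* wildcard addresses for bind(2) */
    if (getaddrinfo(NULL, port, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL && n < MAXSOCK; ai = ai->ai_next) {
        int s = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (s < 0)
            continue;                 /* family not supported, try next */
#ifdef IPV6_V6ONLY
        if (ai->ai_family == AF_INET6) {
            int on = 1;
            /* Keep IPv4 traffic on the AF_INET socket (best effort). */
            (void)setsockopt(s, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on));
        }
#endif
        if (bind(s, ai->ai_addr, ai->ai_addrlen) < 0 || listen(s, 5) < 0) {
            close(s);
            continue;
        }
        socks[n++] = s;
    }
    freeaddrinfo(res);
    return n;
}
```

The caller then waits on all returned sockets, for example with select(2),
and calls accept(2) on whichever becomes readable.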
746:
747: If you try to support IPv6 traffic only and would like to reject IPv4
748: traffic, always check the peer address when a connection is made toward
749: AF_INET6 listening socket. If the address is IPv4 mapped address, you may
750: want to reject the connection. You can check the condition by using
751: IN6_IS_ADDR_V4MAPPED() macro. This is one of the reasons the author of
752: the section (itojun) dislikes special behavior of AF_INET6 wildcard bind.
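
A minimal sketch of the check described above, assuming the peer address has
already been obtained from accept(2) or getpeername(2). The function name is
hypothetical; IN6_IS_ADDR_V4MAPPED() is the macro named in the text.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/* Returns 1 if the peer is an IPv4 node seen through an AF_INET6 socket,
 * that is, a peer an IPv6-only service may want to reject. */
static int
peer_is_v4mapped(const struct sockaddr *sa)
{
    const struct sockaddr_in6 *sin6;

    if (sa->sa_family != AF_INET6)
        return 0;
    sin6 = (const struct sockaddr_in6 *)sa;
    return IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr) ? 1 : 0;
}
```

A server would call this right after accept(2) and close the new socket when
it returns 1.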
753:
754: Comments on initiating side:
755:
1.1 itojun 756: Advice to application implementers: to implement a portable IPv6 application
757: (which works on multiple IPv6 kernels), we believe that the following
758: is the key to the success:
1.3 itojun 759: - NEVER hardcode AF_INET nor AF_INET6.
1.1 itojun 760: - Use getaddrinfo() and getnameinfo() throughout the system.
761: Never use gethostby*(), getaddrby*(), inet_*() or getipnodeby*().
762: - If you would like to connect to destination, use getaddrinfo() and try
763: all the destinations returned, like telnet does.
764: - Some IPv6 stacks are shipped with a buggy getaddrinfo(). Ship a minimal
765: working version with your application and use that as last resort.
766:
1.4 itojun 767: If you would like to use AF_INET6 socket for both IPv4 and IPv6 outgoing
768: connection, you will need tweaked implementation in DNS support libraries,
769: as documented in RFC2553 6.1. KAME libinet6 includes the tweak in
770: getipnodebyname(). Note that getipnodebyname() itself is not recommended as
771: it does not handle scoped IPv6 addresses at all. For IPv6 name resolution
772: getaddrinfo() is the preferred API. getaddrinfo() does not implement the
773: tweak.
774:
775: When writing applications that make outgoing connections, the story becomes much
1.6 itojun 776: simpler if you treat AF_INET and AF_INET6 as totally separate address families.
1.4 itojun 777: {set,get}sockopt issues become simpler, and DNS issues become simpler. We do
778: not recommend that you rely upon IPv4 mapped address.
1.3 itojun 779:
780: 1.12.1 KAME/BSDI3 and KAME/FreeBSD228
1.1 itojun 781:
1.4 itojun 782: The platforms do not support IPv4 mapped address at all (both listening side
783: and initiating side). AF_INET6 and AF_INET sockets are totally separated.
1.1 itojun 784:
1.5 itojun 785: Port number space is totally separate between AF_INET and AF_INET6 sockets.
1.1 itojun 786:
1.4 itojun 787: 1.12.2 KAME/FreeBSD3x
1.1 itojun 788:
1.4 itojun 789: KAME/FreeBSD3x uses shared tcp4/6 code (from sys/netinet/tcp*) and shared
790: udp4/6 code (from sys/netinet/udp*). It uses unified inpcb/in6pcb structure.
1.1 itojun 791:
1.4 itojun 792: 1.12.2.1 KAME/FreeBSD3x, listening side
1.1 itojun 793:
1.4 itojun 794: The platform can be configured to support IPv4 mapped address/special
1.12 itojun 795: AF_INET6 wildcard bind (enabled by default). There is no kernel compilation
796: option to disable it. You can enable/disable the behavior with sysctl
797: (per-node), or setsockopt (per-socket).
1.4 itojun 798:
799: Wildcard AF_INET6 socket grabs IPv4 connection if and only if the following
800: conditions are satisfied:
801: - there's no AF_INET socket that matches the IPv4 connection
802: - the AF_INET6 socket is configured to accept IPv4 traffic, i.e.
803: getsockopt(IPV6_BINDV6ONLY) returns 0.
804:
805: (XXX need checking)
806:
1.5 itojun 807: 1.12.2.2 KAME/FreeBSD3x, initiating side
1.4 itojun 808:
1.6 itojun 809: KAME/FreeBSD3x supports outgoing connection to IPv4 mapped address
1.4 itojun 810: (::ffff:10.1.1.1), if the node is configured to accept IPv4 connections
811: by AF_INET6 socket.
1.1 itojun 812:
1.4 itojun 813: (XXX need checking)
1.1 itojun 814:
1.5 itojun 815: 1.12.3 KAME/NetBSD
1.1 itojun 816:
1.3 itojun 817: KAME/NetBSD uses shared tcp4/6 code (from sys/netinet/tcp*) and shared
818: udp4/6 code (from sys/netinet/udp*). The implementation is made differently
819: from KAME/FreeBSD3x. KAME/NetBSD uses separate inpcb/in6pcb structures,
820: while KAME/FreeBSD3x uses merged inpcb structure.
821:
1.5 itojun 822: 1.12.3.1 KAME/NetBSD, listening side
1.4 itojun 823:
824: The platform can be configured to support IPv4 mapped address/special AF_INET6
825: wildcard bind (disabled by default). Kernel behavior can be summarized as
826: follows:
827: - default: special support code will be compiled in, but is disabled by
828: default. It can be controlled by sysctl (net.inet6.ip6.bindv6only),
829: or setsockopt(IPV6_BINDV6ONLY).
830: - add "INET6_BINDV6ONLY": No special support code for AF_INET6 wildcard socket
831: will be compiled in. AF_INET6 sockets and AF_INET sockets are totally
832: separate. The behavior is similar to what is described in 1.12.1.
833:
834: sysctl setting will affect per-socket configuration at in6pcb creation time
835: only. In other words, per-socket configuration will be copied from sysctl
836: configuration at in6pcb creation time. To change per-socket behavior, you
837: must perform setsockopt or reopen the socket. Change in sysctl configuration
838: will not change the behavior of sockets that are already opened.
839:
840: Wildcard AF_INET6 socket grabs IPv4 connection if and only if the following
841: conditions are satisfied:
842: - there's no AF_INET socket that matches the IPv4 connection
843: - the AF_INET6 socket is configured to accept IPv4 traffic, i.e.
844: getsockopt(IPV6_BINDV6ONLY) returns 0.
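
The per-socket control can be sketched as below. The text names the option
IPV6_BINDV6ONLY; as an assumption about other and newer systems, the sketch
falls back to IPV6_V6ONLY when only that name is defined. The V6ONLY_OPT
macro and both function names are hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

#if defined(IPV6_BINDV6ONLY)
#define V6ONLY_OPT IPV6_BINDV6ONLY    /* name used in this document */
#elif defined(IPV6_V6ONLY)
#define V6ONLY_OPT IPV6_V6ONLY        /* assumed equivalent elsewhere */
#endif

/* Set the per-socket flag; on != 0 means "do not grab IPv4 traffic".
 * Returns 0 on success, -1 on failure or if the option is unavailable. */
static int
set_v6only(int s, int on)
{
#ifdef V6ONLY_OPT
    return setsockopt(s, IPPROTO_IPV6, V6ONLY_OPT, &on, sizeof(on));
#else
    (void)s; (void)on;
    return -1;
#endif
}

/* Read the flag back; returns the option value, or -1 on failure. */
static int
get_v6only(int s)
{
#ifdef V6ONLY_OPT
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(s, IPPROTO_IPV6, V6ONLY_OPT, &val, &len) < 0)
        return -1;
    return val;
#else
    (void)s;
    return -1;
#endif
}
```

Since the sysctl value is only copied when the socket is created, an
application that cares about the behavior should set the option explicitly
before bind(2).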
1.12 itojun 845:
846: You cannot bind(2) with IPv4 mapped address. This is a workaround for port
847: number duplication and other twists.
1.4 itojun 848:
1.5 itojun 849: 1.12.3.2 KAME/NetBSD, initiating side
1.4 itojun 850:
851: When you initiate a connection, you can always connect to IPv4 destination
852: over AF_INET6 socket, using IPv4 mapped address destination (::ffff:10.1.1.1).
853: This is enabled independently from the configuration for listening side, and
854: always enabled.
1.3 itojun 855:
1.5 itojun 856: 1.12.4 KAME/BSDI4
1.4 itojun 857:
858: KAME/BSDI4 uses NRL-based TCP/UDP stack and inpcb source code,
1.3 itojun 859: which was derived from NRL IPv6/IPsec stack. I guess it supports IPv4 mapped
860: address and special AF_INET6 wildcard bind. The implementation is, again,
861: different from other KAME/*BSDs.
1.4 itojun 862:
1.5 itojun 863: 1.12.4.1 KAME/BSDI4, listening side
1.4 itojun 864:
865: NRL inpcb layer supports special behavior of AF_INET6 wildcard socket.
1.12 itojun 866: There is no way to disable the behavior.
867:
868: Wildcard AF_INET6 socket grabs IPv4 connection if and only if the following
869: condition is satisfied:
870: - there's no AF_INET socket that matches the IPv4 connection
1.1 itojun 871:
1.5 itojun 872: 1.12.4.2 KAME/BSDI4, initiating side
1.4 itojun 873:
874: KAME/BSDI4 supports connection initiation to IPv4 mapped address
875: (like ::ffff:10.1.1.1).
876:
1.5 itojun 877: 1.12.5 KAME/OpenBSD
1.1 itojun 878:
1.4 itojun 879: KAME/OpenBSD uses NRL-based TCP/UDP stack and inpcb source code,
880: which was derived from NRL IPv6/IPsec stack.
881:
1.5 itojun 882: 1.12.5.1 KAME/OpenBSD, listening side
1.4 itojun 883:
884: KAME/OpenBSD disables special behavior on AF_INET6 wildcard bind for
885: security reasons (if IPv4 traffic toward AF_INET6 wildcard bind is allowed,
886: access control will become much harder). KAME/BSDI4 uses NRL-based TCP/UDP
887: stack as well, however, the behavior is different due to OpenBSD's security
888: policy.
889:
890: As a result the behavior of KAME/OpenBSD is similar to KAME/BSDI3 and
891: KAME/FreeBSD228 (see 1.12.1 for more detail).
892:
1.5 itojun 893: 1.12.5.2 KAME/OpenBSD, initiating side
1.4 itojun 894:
895: KAME/OpenBSD does not support connection initiation to IPv4 mapped address
896: (like ::ffff:10.1.1.1).
897:
1.13 itojun 898: 1.12.6 More issues
899:
900: IPv4 mapped address support adds a big requirement to EVERY userland codebase.
901: Every userland code should check if an AF_INET6 sockaddr contains IPv4
902: mapped address or not. This adds many twists:
903:
904: - Access control code becomes harder to write.
905: For example, if you would like to reject packets from 10.0.0.0/8,
906: you need to reject packets to AF_INET socket from 10.0.0.0/8,
907: and to AF_INET6 socket from ::ffff:10.0.0.0/104.
1.14 ! itojun 908: - If a protocol is defined differently over IPv4 and over IPv6, we need to be
! 909: really careful when we determine which protocol to use.
1.13 itojun 910: For example, with FTP protocol, we can not simply use sa_family to determine
911: FTP command sets. The following example is incorrect:
912: if (sa_family == AF_INET)
913: use EPSV/EPRT or PASV/PORT; /*IPv4*/
914: else if (sa_family == AF_INET6)
915: use EPSV/EPRT or LPSV/LPRT; /*IPv6*/
916: else
917: error;
918: Under SIIT environment, the correct code would be:
919: if (sa_family == AF_INET)
920: use EPSV/EPRT or PASV/PORT; /*IPv4*/
921: else if (sa_family == AF_INET6 && IPv4 mapped address)
922: use EPSV/EPRT or PASV/PORT; /*IPv4 command set on AF_INET6*/
923: else if (sa_family == AF_INET6 && !IPv4 mapped address)
924: use EPSV/EPRT or LPSV/LPRT; /*IPv6*/
925: else
926: error;
1.14 ! itojun 927: It is too much to ask for everybody to be careful like this.
! 928: The problem is, we are not sure if the above code fragment is perfect for
! 929: all situations.
1.13 itojun 930: - By enabling kernel support for IPv4 mapped address (outgoing direction),
931: servers on the kernel can be hosed by IPv6 native packet that has IPv4
932: mapped address in IPv6 header source, and can generate unwanted IPv4 packets.
933: http://playground.iijlab.net/i-d/draft-itojun-ipv6-transition-abuse-00.txt
934: talks more about this scenario.
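
The access control example in the first item (rejecting 10.0.0.0/8) can be
sketched as a single check that covers both address families. The function
name is hypothetical, and real access control code would of course take the
prefix as a parameter.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/* Returns 1 if the peer is in 10.0.0.0/8, whether it shows up as an
 * AF_INET address or as ::ffff:10.0.0.0/104 on an AF_INET6 socket. */
static int
from_net10(const struct sockaddr *sa)
{
    if (sa->sa_family == AF_INET) {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)sa;

        return (ntohl(sin->sin_addr.s_addr) >> 24) == 10;
    }
    if (sa->sa_family == AF_INET6) {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sa;

        /* The embedded IPv4 address starts at byte 12. */
        if (IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr))
            return sin6->sin6_addr.s6_addr[12] == 10;
    }
    return 0;
}
```

Forgetting the second branch is exactly the kind of mistake that IPv4 mapped
address support makes easy.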
935:
936: Due to the above twists, some KAME userland programs have restrictions on
937: the use of IPv4 mapped addresses:
938: - rshd/rlogind do not accept connections from IPv4 mapped address.
939: This is to avoid malicious use of IPv4 mapped address in IPv6 native
940: packet, to bypass source-address based authentication.
941: - ftp/ftpd does not support SIIT environment. IPv4 mapped address will be
942: decoded in userland, and will be passed to AF_INET sockets
943: (SIIT client should pass IPv4 mapped address as is, to AF_INET6 sockets).
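
Decoding an IPv4 mapped address in userland, as mentioned for ftp/ftpd, can
be sketched as follows. This is not the ftp/ftpd code; the function name is
hypothetical. On 4.4BSD-derived systems the sin_len member would need to be
set as well, which is omitted here for portability.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/* Convert ::ffff:a.b.c.d in *sin6 into a plain AF_INET sockaddr.
 * Returns 0 on success, -1 if the address is not IPv4 mapped. */
static int
unmap_v4mapped(const struct sockaddr_in6 *sin6, struct sockaddr_in *sin)
{
    if (!IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr))
        return -1;

    memset(sin, 0, sizeof(*sin));
    sin->sin_family = AF_INET;
    sin->sin_port = sin6->sin6_port;  /* already in network byte order */
    /* The low 32 bits of the IPv6 address are the IPv4 address. */
    memcpy(&sin->sin_addr, &sin6->sin6_addr.s6_addr[12],
        sizeof(sin->sin_addr));
    return 0;
}
```

The resulting sockaddr_in can then be used with an AF_INET socket.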
944:
1.4 itojun 945: 1.13 sockaddr_storage
946:
1.6 itojun 947: When RFC2553 was about to be finalized, there was discussion on how struct
1.4 itojun 948: sockaddr_storage members should be named. One proposal was to prepend "__" to the
949: members (like "__ss_len") as they should not be touched. The other proposal
950: was not to prepend it (like "ss_len") as we need to touch those members
951: directly. There was no clear consensus on it.
952:
953: As a result, RFC2553 defines struct sockaddr_storage as follows:
954: struct sockaddr_storage {
955: u_char __ss_len; /* address length */
956: u_char __ss_family; /* address family */
957: /* and bunch of padding */
958: };
959: In contrast, the XNET draft defines it as follows:
960: struct sockaddr_storage {
961: u_char ss_len; /* address length */
962: u_char ss_family; /* address family */
963: /* and bunch of padding */
964: };
965:
966: In December 1999, it was agreed that RFC2553bis should pick the latter (XNET)
967: definition.
968:
969: KAME kit prior to December 1999 used RFC2553 definition. KAME kit after
970: December 1999 (including December) will conform to XNET definition,
1.6 itojun 971: based on RFC2553bis discussion.
1.4 itojun 972:
973: If you look at multiple IPv6 implementations, you will be able to see
974: both definitions. As a userland programmer, the most portable way of
975: dealing with it is to:
976: (1) ensure ss_family and/or ss_len are available on the platform, by using
977: GNU autoconf,
978: (2) have -Dss_family=__ss_family to unify all occurrences (including header
979: file) into __ss_family, or
980: (3) never touch __ss_family. cast to sockaddr * and use sa_family like:
981: struct sockaddr_storage ss;
982: family = ((struct sockaddr *)&ss)->sa_family;
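
Option (3) can be wrapped in a small helper so that the member names of
struct sockaddr_storage never appear in application code. The helper name is
hypothetical.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/* Read the address family without touching ss_family or __ss_family. */
static int
ss_get_family(const struct sockaddr_storage *ss)
{
    return ((const struct sockaddr *)ss)->sa_family;
}
```

The same helper compiles unchanged against either definition of the
structure.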
1.1 itojun 983:
1.5 itojun 984: 1.14 Invalid addresses on the wire
985:
1.13 itojun 986: Some of IPv6 transition technologies embed IPv4 address into IPv6 address.
987: These specifications themselves are fine, however, there can be certain
988: set of attacks enabled by these specifications. Recent specification
989: documents cover those issues, however, there are already-published RFCs
990: that do not have protection against those (like using source address of
1.5 itojun 991: ::ffff:127.0.0.1 to bypass "reject packet from remote" filter).
992:
1.13 itojun 993: To name a few, these address ranges can be used to hose an IPv6 implementation,
994: or bypass security controls:
995: - IPv4 mapped address that embeds unspecified/multicast/loopback/broadcast
996: IPv4 address (if they are in IPv6 native packet header, they are malicious)
997: ::ffff:0.0.0.0/104 ::ffff:127.0.0.0/104
998: ::ffff:224.0.0.0/100 ::ffff:255.0.0.0/104
999: - 6to4 prefix generated from unspecified/multicast/loopback/broadcast/private
1000: IPv4 address
1001: 2002:0000::/24 2002:7f00::/24 2002:e000::/24
1002: 2002:ff00::/24 2002:0a00::/24 2002:ac10::/28
1003: 2002:c0a8::/32
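
A userland sketch of such a filter is shown below. It is not the KAME kernel
check. The function names are hypothetical, and the test is written in terms
of the embedded IPv4 address (0/8, 127/8, 224/4, 255/8, plus the private
ranges for 6to4), which is an assumption that approximates the prefix list
above rather than reproducing it bit for bit.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Unspecified, loopback, multicast or broadcast IPv4 range? */
static int
bad_embedded_v4(const uint8_t *v4)
{
    return v4[0] == 0 || v4[0] == 127 ||
        (v4[0] & 0xf0) == 0xe0 || v4[0] == 255;
}

/* Private IPv4 range (10/8, 172.16/12, 192.168/16)? */
static int
private_v4(const uint8_t *v4)
{
    return v4[0] == 10 ||
        (v4[0] == 172 && (v4[1] & 0xf0) == 16) ||
        (v4[0] == 192 && v4[1] == 168);
}

/* Returns 1 if the address should not appear in a native IPv6 header. */
static int
suspicious_on_wire(const struct in6_addr *a)
{
    const uint8_t *b = a->s6_addr;

    if (IN6_IS_ADDR_V4MAPPED(a))
        return bad_embedded_v4(&b[12]);
    if (b[0] == 0x20 && b[1] == 0x02)   /* 6to4 prefix, 2002::/16 */
        return bad_embedded_v4(&b[2]) || private_v4(&b[2]);
    return 0;
}
```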
1004:
1005: Also, since KAME does not support RFC1933 auto tunnels, seeing IPv4 compatible
1006: addresses is very rare. You should take caution if you see those on the wire.
1007:
1.5 itojun 1008: KAME code is carefully written to avoid such incidents. More specifically,
1.13 itojun 1009: KAME kernel will reject packets with certain source/destination address in IPv6
1010: base header, or IPv6 routing header. Also, KAME default configuration file
1011: is written carefully, to avoid those attacks.
1012:
1013: http://playground.iijlab.net/i-d/draft-itojun-ipv6-transition-abuse-00.txt
1014: talks more about this.
1.5 itojun 1015:
1.12 itojun 1016: 1.15 Node's required addresses
1017:
1018: RFC2373 section 2.8 talks about required addresses for an IPv6
1019: node. This section talks about how KAME stack manages those required
1020: addresses.
1021:
1022: 1.15.1 Host case
1023:
1024: The following items are automatically assigned to the node (or the node will
1025: automatically join the group), at bootstrap time:
1026: - Loopback address
1027: - All-nodes multicast addresses (ff01::1)
1028:
1029: The following items will be automatically handled when the interface becomes
1030: IFF_UP:
1031: - Its link-local address for each interface
1032: - Solicited-node multicast address for link-local addresses
1033: - Link-local all-nodes multicast address (ff02::1)
1034:
1035: The following items need to be configured manually by ifconfig(8) or prefix(8).
1036: Alternatively, these can be autoconfigured by using stateless address
1037: autoconfiguration.
1038: - Assigned unicast/anycast addresses
1039: - Solicited-Node multicast address for assigned unicast address
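
For reference, the solicited-node multicast address that goes with a unicast
address is formed by appending the low 24 bits of the unicast address to the
prefix ff02::1:ff00:0/104. A minimal sketch, with a hypothetical function
name:

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Compute ff02::1:ffXX:XXXX from the low 24 bits of a unicast address. */
static void
solicited_node(const struct in6_addr *uni, struct in6_addr *mc)
{
    /* First 13 bytes (104 bits) of ff02:0:0:0:0:1:ff00:0. */
    static const uint8_t prefix[13] = {
        0xff, 0x02, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01, 0xff
    };

    memcpy(mc->s6_addr, prefix, sizeof(prefix));
    memcpy(&mc->s6_addr[13], &uni->s6_addr[13], 3);
}
```

For example, fe80::2a0:24ff:feab:cdef maps to ff02::1:ffab:cdef.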
1040:
1041: Users can join groups by using appropriate system calls like setsockopt(2).
1042:
1043: 1.15.2 Router case
1044:
1045: In addition to the above, routers need to handle the following items.
1046:
1047: The following items need to be configured manually by using ifconfig(8).
1048: o The subnet-router anycast addresses for the interfaces it is configured
1049: to act as a router on (prefix::/64)
1050: o All other anycast addresses with which the router has been configured
1051:
1052: The router will join the following multicast group when rtadvd(8) is available
1053: for the interface.
1054: o All-Routers Multicast Addresses (ff02::2)
1055:
1056: Routing daemons will join appropriate multicast groups, as necessary,
1057: like ff02::9 for RIPng.
1058:
1059: Users can join groups by using appropriate system calls like setsockopt(2).
1060:
1.1 itojun 1061: 2. Network Drivers
1062:
1063: KAME requires two items to be added into the standard drivers:
1064:
1065: (1) mbuf clustering requirement. In this stable release, we changed
1066: MINCLSIZE into MHLEN+1 for all the operating systems in order to make
1067: all the drivers behave as we expect.
1068:
1069: (2) multicast. If "ifmcstat" yields no multicast group for an
1070: interface, that interface has to be patched.
1071:
1072: To avoid troubles, we suggest that you comment out the device drivers
1.3 itojun 1073: for unsupported/unnecessary cards, from the kernel configuration file.
1.1 itojun 1074: If you accidentally enable unsupported drivers, some of the userland
1075: tools may not work correctly (routing daemons are a typical example).
1076:
1077: In the following sections, "official support" means that KAME developers
1078: are using that ethernet card/driver frequently.
1079:
1.3 itojun 1080: (NOTE: In the past we required all pcmcia drivers to have a call to
1081: in6_ifattach(). We have no such requirement any more)
1082:
1.1 itojun 1083: 2.1 FreeBSD 2.2.x-RELEASE
1084:
1085: Here is a list of FreeBSD 2.2.x-RELEASE drivers and their conditions:
1086:
1.12 itojun 1087: driver mbuf(1) multicast(2) official support?
1.3 itojun 1088: --- --- --- ---
1.1 itojun 1089: (Ethernet)
1.3 itojun 1090: ar looks ok - -
1091: cnw ok ok yes (*)
1092: ed ok ok yes
1093: ep ok ok yes
1094: fe ok ok yes
1095: sn looks ok - - (*)
1096: vx looks ok - -
1097: wlp ok ok - (*)
1098: xl ok ok yes
1099: zp ok ok -
1.1 itojun 1100: (FDDI)
1.3 itojun 1101: fpa looks ok ? -
1.1 itojun 1102: (ATM)
1.3 itojun 1103: en ok ok yes
1.1 itojun 1104: (Serial)
1.3 itojun 1105: lp ? - not work
1106: sl ? - not work
1107: sr looks ok ok - (**)
1.1 itojun 1108:
1109: You may want to add an invocation of "rtsol" in "/etc/pccard_ether",
1110: if you are using notebook computers and PCMCIA ethernet card.
1111:
1112: (*) These drivers are distributed with PAO (http://www.jp.freebsd.org/PAO/).
1113:
1114: (**) There was a report saying that, if you make sr driver up and down and
1115: then up, the kernel may hang up. We have disabled frame-relay support from
1116: sr driver and after that this looks to be working fine. If you need
1.3 itojun 1117: frame-relay support to come back, please contact KAME developers.
1.1 itojun 1118:
1.3 itojun 1119: 2.2 BSD/OS 3.x
1.1 itojun 1120:
1.3 itojun 1121: The following lists BSD/OS 3.x device drivers and their conditions:
1.1 itojun 1122:
1.12 itojun 1123: driver mbuf(1) multicast(2) official support?
1.3 itojun 1124: --- --- --- ---
1.1 itojun 1125: (Ethernet)
1.3 itojun 1126: cnw ok ok yes
1127: de ok ok -
1128: df ok ok -
1129: eb ok ok -
1130: ef ok ok yes
1131: exp ok ok -
1132: mz ok ok yes
1133: ne ok ok yes
1134: we ok ok -
1.1 itojun 1135: (FDDI)
1.3 itojun 1136: fpa ok ok -
1.1 itojun 1137: (ATM)
1.3 itojun 1138: en maybe ok -
1.1 itojun 1139: (Serial)
1.3 itojun 1140: ntwo ok ok yes
1141: sl ? - not work
1142: appp ? - not work
1.1 itojun 1143:
1144: You may want to use "@insert" directive in /etc/pccard.conf to invoke
1145: "rtsol" command right after dynamic insertion of PCMCIA ethernet cards.
1146:
1147: 2.3 NetBSD
1148:
1149: The following table lists the network drivers we have tried so far.
1150:
1.12 itojun 1151: driver mbuf(1) multicast(2) official support?
1.1 itojun 1152: --- --- --- ---
1153: (Ethernet)
1.5 itojun 1154: awi pcmcia/i386 ok ok -
1155: bah zbus/amiga NG(*)
1156: cnw pcmcia/i386 ok ok yes
1.1 itojun 1157: ep pcmcia/i386 ok ok -
1158: le sbus/sparc ok ok yes
1.5 itojun 1159: ne pci/i386 ok ok yes
1.12 itojun 1160: ne pcmcia/i386 ok ok yes
1.5 itojun 1161: wi pcmcia/i386 ok ok yes
1.1 itojun 1162: (ATM)
1163: en pci/i386 ok ok -
1164:
1.3 itojun 1165: (*) This may need some fix, but I'm not sure what arcnet interfaces assume...
1166:
1.1 itojun 1167: 2.4 FreeBSD 3.x-RELEASE
1168:
1169: Here is a list of FreeBSD 3.x-RELEASE drivers and their conditions:
1170:
1.12 itojun 1171: driver mbuf(1) multicast(2) official support?
1.1 itojun 1172: --- --- --- ---
1173: (Ethernet)
1.12 itojun 1174: cnw ok ok -(*)
1175: ed ? ok -
1176: ep ok ok -
1.3 itojun 1177: fe ok ok yes
1.8 itojun 1178: fxp ?(**)
1.3 itojun 1179: lnc ? ok -
1180: sn ? ? -(*)
1.12 itojun 1181: wi ok ok yes
1.3 itojun 1182: xl ? ok -
1183:
1184: (*) These drivers are distributed with PAO as PAO3
1185: (http://www.jp.freebsd.org/PAO/).
1.8 itojun 1186: (**) there are trouble reports with multicast filter initialization.
1.1 itojun 1187:
1188: More drivers will just simply work on KAME FreeBSD 3.x-RELEASE but have not
1189: been checked yet.
1190:
1.3 itojun 1191: 2.5 OpenBSD 2.x
1192:
1193: Here is a list of OpenBSD 2.x drivers and their conditions:
1194:
1.12 itojun 1195: driver mbuf(1) multicast(2) official support?
1.3 itojun 1196: --- --- --- ---
1197: (Ethernet)
1.12 itojun 1198: de pci/i386 ok ok yes
1199: fxp pci/i386 ?(*)
1.5 itojun 1200: le sbus/sparc ok ok yes
1.3 itojun 1201: ne pci/i386 ok ok yes
1202: ne pcmcia/i386 ok ok yes
1.5 itojun 1203:
1204: (*) There seems to be some problem in the driver, with multicast filter
1205: configuration. This happens with certain revision of chipset on the card.
1.12 itojun 1206: Should be fixed by now by workaround in sys/net/if.c, but still not sure.
1.5 itojun 1207:
1208: 2.6 BSD/OS 4.x
1209:
1210: The following lists BSD/OS 4.x device drivers and their conditions:
1211:
1.12 itojun 1212: driver mbuf(1) multicast(2) official support?
1.5 itojun 1213: --- --- --- ---
1214: (Ethernet)
1215: de ok ok yes
1.12 itojun 1216: exp (*)
1.5 itojun 1217:
1218: You may want to use "@insert" directive in /etc/pccard.conf to invoke
1219: "rtsol" command right after dynamic insertion of PCMCIA ethernet cards.
1.3 itojun 1220:
1.12 itojun 1221: (*) exp driver has a serious conflict with KAME initialization sequence.
1222: A workaround is committed into sys/i386/pci/if_exp.c, and should be okay by now.
1223:
1.1 itojun 1224: 3. Translator
1225:
1226: We categorize IPv4/IPv6 translators into 4 types.
1227:
1228: Translator A --- It is used in the early stage of transition to make
1229: it possible to establish a connection from an IPv6 host in an IPv6
1230: island to an IPv4 host in the IPv4 ocean.
1231:
1232: Translator B --- It is used in the early stage of transition to make
1233: it possible to establish a connection from an IPv4 host in the IPv4
1234: ocean to an IPv6 host in an IPv6 island.
1235:
1236: Translator C --- It is used in the late stage of transition to make it
1237: possible to establish a connection from an IPv4 host in an IPv4 island
1238: to an IPv6 host in the IPv6 ocean.
1239:
1240: Translator D --- It is used in the late stage of transition to make it
1241: possible to establish a connection from an IPv6 host in the IPv6 ocean
1242: to an IPv4 host in an IPv4 island.
1243:
1244: KAME provides a TCP relay translator for category A. This is called
1245: "FAITH". We also provide IP header translator for category A.
1246:
1247: 3.1 FAITH TCP relay translator
1248:
1249: FAITH system uses TCP relay daemon called "faithd" helped by the KAME kernel.
1250: FAITH will reserve an IPv6 address prefix, and relay TCP connection
1251: toward that prefix to IPv4 destination.
1252:
1253: For example, if the reserved IPv6 prefix is 3ffe:0501:0200:ffff::, and
1254: the IPv6 destination for TCP connection is 3ffe:0501:0200:ffff::163.221.202.12,
1255: the connection will be relayed toward IPv4 destination 163.221.202.12.
1256:
1257: destination IPv4 node (163.221.202.12)
1258: ^
1259: | IPv4 tcp toward 163.221.202.12
1260: FAITH-relay dual stack node
1261: ^
1262: | IPv6 TCP toward 3ffe:0501:0200:ffff::163.221.202.12
1263: source IPv6 node
1264:
1265: faithd must be invoked on FAITH-relay dual stack node.
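
The address mapping in the example above can be sketched as follows. This is
not the faithd code. The function name is hypothetical, and the sketch
assumes that the reserved prefix is 96 bits long, so that the low 32 bits of
the IPv6 destination carry the IPv4 destination.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* If dst lies inside the reserved FAITH prefix (first 96 bits equal),
 * copy the embedded IPv4 destination into *v4 and return 0.
 * Otherwise return -1. */
static int
faith_extract_v4(const struct in6_addr *dst, const struct in6_addr *prefix,
    struct in_addr *v4)
{
    if (memcmp(dst->s6_addr, prefix->s6_addr, 12) != 0)
        return -1;
    memcpy(v4, &dst->s6_addr[12], sizeof(*v4));
    return 0;
}
```

With the prefix 3ffe:0501:0200:ffff:: and the destination
3ffe:0501:0200:ffff::163.221.202.12, the function yields 163.221.202.12.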
1266:
1.12 itojun 1267: For more details, consult kame/kame/faithd/README and
1268: draft-ietf-ngtrans-tcpudp-relay-01.txt.
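
The address mapping in the example above can be sketched as follows. This is
only an illustration, not faithd code: the function name is hypothetical, and
the /96 prefix length is an assumption made so that the IPv4 address occupies
exactly the low 32 bits (the text above does not state a prefix length).

```python
import ipaddress

# Assumption: the reserved FAITH prefix is treated as a /96, so the
# embedded IPv4 address is the low 32 bits of the IPv6 destination.
FAITH_PREFIX = ipaddress.IPv6Network("3ffe:0501:0200:ffff::/96")


def faith_ipv4_destination(dst6, prefix=FAITH_PREFIX):
    """Return the IPv4 relay target for dst6, or None if dst6 is
    outside the reserved prefix (hypothetical helper)."""
    addr = ipaddress.IPv6Address(dst6)
    if addr not in prefix:
        return None
    # Keep only the low 32 bits, which carry the IPv4 address.
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


print(faith_ipv4_destination("3ffe:0501:0200:ffff::163.221.202.12"))
```

A destination outside the reserved prefix yields None, which corresponds to
traffic that the relay would not handle.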
1.1 itojun 1269:
1270: 3.2 IPv6-to-IPv4 header translator
1271:
1.8 itojun 1272: # removed since it is not imported to NetBSD-current
1.1 itojun 1273:
1274: 4. IPsec
1275:
1.5 itojun 1276: IPsec is implemented as the following three components.
1.1 itojun 1277:
1278: (1) Policy Management
1279: (2) Key Management
1.5 itojun 1280: (3) AH, ESP and IPComp handling in kernel
1281:
1282: Note that KAME/OpenBSD does NOT include support for KAME IPsec code,
1283: as OpenBSD team has their home-brew IPsec stack and they have no plan
1284: to replace it. IPv6 support for IPsec is, therefore, lacking on KAME/OpenBSD.
1.1 itojun 1285:
1286: 4.1 Policy Management
1287:
1.5 itojun 1288: The kernel implements experimental policy management code. There are two ways
1.12 itojun 1289: to manage security policy. One is to configure per-socket policy using
1.5 itojun 1290: setsockopt(2). In this case, policy configuration is described in
1291: ipsec_set_policy(3). The other is to configure kernel packet filter-based
1292: policy using the PF_KEY interface, via setkey(8).
1293:
1294: Policy entries are matched in order, so the order of entries makes a
1295: difference in behavior.
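
Why the order matters can be shown with a first-match lookup. This is a
minimal sketch under assumptions: the table layout and the function name are
hypothetical, and the kernel SPD matches on more selectors than the two
address ranges used here.

```python
import ipaddress


def match_policy(spd, src, dst):
    """Return the policy of the first entry matching (src, dst),
    or None when no entry matches (hypothetical helper)."""
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    for src_net, dst_net, policy in spd:
        if s in ipaddress.ip_network(src_net) and d in ipaddress.ip_network(dst_net):
            return policy  # first match wins; later entries are ignored
    return None


# A specific bypass entry placed before a general IPsec entry.
spd = [
    ("10.0.0.5/32", "0.0.0.0/0", "none"),
    ("10.0.0.0/24", "0.0.0.0/0", "ipsec esp/transport//use"),
]
print(match_policy(spd, "10.0.0.5", "192.0.2.1"))
print(match_policy(list(reversed(spd)), "10.0.0.5", "192.0.2.1"))
```

With the entries reversed, the general entry shadows the specific one, so the
same packet gets a different policy.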
1.1 itojun 1296:
1297: 4.2 Key Management
1298:
1299: The key management code implemented in this kit (sys/netkey) is a
1300: home-brew PFKEY v2 implementation. This conforms to RFC2367.
1301:
1.5 itojun 1302: The home-brew IKE daemon, "racoon", is included in the kit (kame/kame/racoon,
1303: or usr.sbin/racoon).
1.3 itojun 1304: Basically you'll need to run racoon as a daemon, then set up a policy
1305: to require keys (like ping -P 'out ipsec esp/transport//use').
1306: The kernel will contact the racoon daemon as necessary to exchange keys.
1.1 itojun 1307:
1.11 itojun 1308: In the IKE spec, there is ambiguity about interpretation of a "tunnel" proposal.
1309: For example, if we would like to propose the use of the following packet:
1310:     IP AH ESP IP payload
1.12 itojun 1311: some implementations propose it as "AH transport and ESP tunnel", since
1.11 itojun 1312: this is more logical from a packet construction point of view. Other
1313: implementations propose it as "AH tunnel and ESP tunnel".
1314: Racoon follows the former route.
1315: This raises a real interoperability issue. We hope it will be resolved quickly.
1316:
1.1 itojun 1317: 4.3 AH and ESP handling
1318:
1319: The IPsec module is implemented as "hooks" into the standard IPv4/IPv6
1320: processing. When sending a packet, ip{,6}_output() checks if ESP/AH
1321: processing is required by checking if a matching SPD (Security
1322: Policy Database) entry is found. If ESP/AH is needed,
1323: {esp,ah}{4,6}_output() will be called and the mbuf will be updated
1324: accordingly. When a packet is received, {esp,ah}4_input() will be
1325: called based on protocol number, i.e. (*inetsw[proto])().
1326: {esp,ah}4_input() will decrypt/check authenticity of the packet,
1327: and strip off the daisy-chained header and padding for ESP/AH. It is
1328: safe to strip off the ESP/AH header on packet reception, since we
1329: will never use the received packet in "as is" form.
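
The input path (dispatch by protocol number, then strip the daisy-chained
header) can be modelled roughly as below. This is a toy sketch, not the kernel
code: the table and function names are hypothetical, only AH is modelled, and
the authenticity check is omitted. The AH length rule (payload length field
counts 32-bit words minus 2) is taken from RFC2402.

```python
IPPROTO_TCP = 6
IPPROTO_AH = 51


def ah_input(packet):
    """Strip one AH header; return (next protocol, remaining bytes).
    Authenticity checking is omitted in this sketch."""
    next_proto = packet[0]
    header_len = (packet[1] + 2) * 4  # length field is in 32-bit words, minus 2
    return next_proto, packet[header_len:]


# Stand-in for the protocol switch: protocol number -> input handler.
PROTOSW = {IPPROTO_AH: ah_input}


def deliver(proto, packet):
    """Keep dispatching until no IPsec handler is registered for proto."""
    while proto in PROTOSW:
        proto, packet = PROTOSW[proto](packet)
    return proto, packet


# 24-byte AH header: next header, length, reserved, SPI, sequence, 96bit ICV.
ah = bytes([IPPROTO_TCP, 4, 0, 0]) + bytes(4) + bytes(4) + bytes(12)
print(deliver(IPPROTO_AH, ah + b"tcp segment"))
```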
1330:
1.3 itojun 1331: When ESP/AH is used, the effective TCP4/6 data segment size is reduced by
1332: the extra daisy-chained headers inserted by ESP/AH. Our code takes care of
1333: this case.
1.1 itojun 1334:
1335: Basic crypto functions can be found in directory "sys/crypto". ESP/AH
1336: transform are listed in {esp,ah}_core.c with wrapper functions. If you
1337: wish to add some algorithm, add wrapper function in {esp,ah}_core.c, and
1338: add your crypto algorithm code into sys/crypto.
1339:
1.5 itojun 1340: Tunnel mode works basically fine, but comes with the following restrictions:
1341: - You cannot run a routing daemon across an IPsec tunnel, since we do not
1342:   model IPsec tunnels as pseudo interfaces.
1.1 itojun 1343: - The authentication model for AH tunnel must be revisited. We'll need to
1344:   improve the policy management engine, eventually.
1.5 itojun 1345: - Tunnelling for IPv6 IPsec is still incomplete. This is disabled by default.
1346:   If you need to perform experiments, add "options IPSEC_IPV6FWD" into
1347:   the kernel configuration file. Note that path MTU discovery does not work
1348:   across an IPv6 IPsec tunnel gateway due to insufficient code.
1.1 itojun 1349:
1.11 itojun 1350: The AH specification does not talk much about the "multiple AH on a packet" case.
1351: We incrementally compute the AH checksum, from inside to outside. Also, we
1352: treat the inner AH as immutable.
1353: For example, if we are to create the following packet:
1354:     IP AH1 AH2 AH3 payload
1355: we do it incrementally. As a result, we get crypto checksums like below:
1356:     AH3 has checksum against "IP AH3' payload",
1357:         where AH3' = AH3 with checksum field filled with 0.
1358:     AH2 has checksum against "IP AH2' AH3 payload".
1359:     AH1 has checksum against "IP AH1' AH2 AH3 payload".
1360: Also note that AH3 has the smallest sequence number, and AH1 has the largest
1361: sequence number.
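
The inside-to-outside computation above can be sketched as follows. This is a
toy model under assumptions: headers are plain byte strings (a tag followed by
a 96bit checksum field), HMAC MD5 is picked only as an example, and the
handling of mutable IP header fields is ignored. The function names are
hypothetical.

```python
import hashlib
import hmac

ICV_LEN = 12  # 96bit crypto checksum


def ah_header(tag, icv=bytes(ICV_LEN)):
    """Toy AH header: a tag plus the checksum field (zero-filled by default)."""
    return tag + icv


def add_ah_layers(ip, payload, layers):
    """layers lists (tag, key) from outermost to innermost, e.g. AH1, AH2, AH3.
    Checksums are computed from the innermost AH outward."""
    inner = b""  # already finished inner AH headers, treated as immutable
    for tag, key in reversed(layers):
        covered = ip + ah_header(tag) + inner + payload  # e.g. "IP AH2' AH3 payload"
        icv = hmac.new(key, covered, hashlib.md5).digest()[:ICV_LEN]
        inner = ah_header(tag, icv) + inner
    return ip + inner + payload


layers = [(b"AH1", b"k1"), (b"AH2", b"k2"), (b"AH3", b"k3")]
packet = add_ah_layers(b"IP", b"payload", layers)
print(len(packet))
```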
1362:
1.5 itojun 1363: 4.4 IPComp handling
1364:
1365: IPComp stands for IP payload compression protocol. It is aimed at
1366: payload compression, not header compression like PPP VJ compression.
1367: It may be useful when you are using a slow serial link (say, cell phone)
1368: with a powerful CPU (well, recent notebook PCs are really powerful...).
1369: The protocol design of IPComp is very similar to IPsec, though it was
1370: defined separately from IPsec itself.
1371:
1372: Here are some points to be noted:
1373: - IPComp is treated as part of the IPsec protocol suite, and the SPI and
1374:   CPI spaces are unified. The spec says that there is no relationship
1375:   between the two, so they are assumed to be separate in the specs.
1376: - IPComp association (IPCA) is kept in SAD.
1377: - It is possible to use well-known CPI (CPI=2 for DEFLATE for example),
1378: for outbound/inbound packet, but for indexing purposes one element from
1379: SPI/CPI space will be occupied anyway.
1380: - pfkey is modified to support IPComp. However, there's no official
1381:   SA type number assignment yet. Portability with other IPComp
1382:   stacks is questionable (anyway, who else implements IPComp on UN*X?).
1.11 itojun 1383: - The spec says that IPComp output processing must be performed before AH/ESP
1.5 itojun 1384:   output processing, to achieve a better compression ratio and "stir" the data
1.11 itojun 1385:   stream before encryption. The most meaningful processing order is:
1386:   (1) compress payload by IPComp, (2) encrypt payload by ESP, then (3) attach
1387:   authentication data by AH.
1388:   However, with manual SPD setting, you are able to violate the ordering
1389:   (KAME code is too generic, maybe). Also, it is just okay to use IPComp
1390:   alone, without AH/ESP.
1391: - Though the packet size can be significantly decreased by using IPComp, no
1392:   special consideration is made about path MTU (the spec says nothing about
1.5 itojun 1393:   MTU consideration). IPComp is designed for serial links, not ethernet-like
1394:   media, it seems.
1395: - You can change compression ratio on outbound packet, by changing
1396: deflate_policy in sys/netinet6/ipcomp_core.c. You can also change outbound
1397: history buffer size by changing deflate_window_out in the same source code.
1398: (should it be sysctl accessible, or per-SAD configurable?)
1399: - Tunnel mode IPComp is not working right. A KAME box can generate tunnelled
1400:   IPComp packets, but cannot accept tunnelled IPComp packets.
1401: - You can negotiate IPComp association with racoon IKE daemon.
1402: - KAME code does not attach Adler32 checksum to compressed data.
1403:   See the ipsec wg mailing list discussion in Jan 2000 for details.
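
The points above about compression ratio, history buffer size and the missing
Adler32 checksum can be illustrated with zlib from userland. This is a sketch
under assumptions: it is not the kernel code, the function names are
hypothetical, and mapping deflate_policy and deflate_window_out onto the zlib
"level" and window-bits parameters is a guess made for illustration only.

```python
import zlib


def deflate_raw(data, level=6, window_bits=12):
    """Compress as a raw DEFLATE stream. A negative window size tells
    zlib to omit the zlib header and the Adler32 trailer."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -window_bits)
    return comp.compress(data) + comp.flush()


def inflate_raw(blob, window_bits=15):
    """Decompress a raw DEFLATE stream (no header, no Adler32)."""
    decomp = zlib.decompressobj(-window_bits)
    return decomp.decompress(blob) + decomp.flush()


payload = b"KAME IPComp test payload " * 40
blob = deflate_raw(payload)
print(len(payload), len(blob), inflate_raw(blob) == payload)
```

The receiver's window must be at least as large as the sender's, which is why
the inflate side uses the maximum window here.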
1404:
1405: 4.5 Conformance to RFCs and IDs
1.1 itojun 1406:
1407: The IPsec code in the kernel conforms (or, tries to conform) to the
1408: following standards:
1409: "old IPsec" specification documented in rfc182[5-9].txt
1410: "new IPsec" specification documented in rfc240[1-6].txt, rfc241[01].txt,
1411: rfc2451.txt and draft-mcdonald-simple-ipsec-api-01.txt (draft expired,
1412: but you can take from ftp://ftp.kame.net/pub/internet-drafts/).
1.6 itojun 1413: (NOTE: IKE specifications, rfc240[7-9].txt are implemented in userland,
1.1 itojun 1414: as "racoon" IKE daemon)
1.5 itojun 1415: IPComp:
1416: RFC2393: IP Payload Compression Protocol (IPComp)
1.1 itojun 1417:
1418: Currently supported algorithms are:
1419: old IPsec AH
1420: null crypto checksum (no document, just for debugging)
1421: keyed MD5 with 128bit crypto checksum (rfc1828.txt)
1422: keyed SHA1 with 128bit crypto checksum (no document)
1423: HMAC MD5 with 128bit crypto checksum (rfc2085.txt)
1424: HMAC SHA1 with 128bit crypto checksum (no document)
1425: old IPsec ESP
1426: null encryption (no document, similar to rfc2410.txt)
1427: DES-CBC mode (rfc1829.txt)
1428: new IPsec AH
1429: null crypto checksum (no document, just for debugging)
1430: keyed MD5 with 96bit crypto checksum (no document)
1431: keyed SHA1 with 96bit crypto checksum (no document)
1432: HMAC MD5 with 96bit crypto checksum (rfc2403.txt)
1433: HMAC SHA1 with 96bit crypto checksum (rfc2404.txt)
1434: new IPsec ESP
1435: null encryption (rfc2410.txt)
1436: DES-CBC with derived IV
1437: (draft-ietf-ipsec-ciph-des-derived-01.txt, draft expired)
1438: DES-CBC with explicit IV (rfc2405.txt)
1439: 3DES-CBC with explicit IV (rfc2451.txt)
1440: BLOWFISH CBC (rfc2451.txt)
1441: CAST128 CBC (rfc2451.txt)
1442: RC5 CBC (rfc2451.txt)
1443: each of the above can be combined with:
1444: ESP authentication with HMAC-MD5(96bit)
1445: ESP authentication with HMAC-SHA1(96bit)
1.5 itojun 1446: IPComp
1447: RFC2394: IP Payload Compression Using DEFLATE
1.1 itojun 1448:
1449: The following algorithms are NOT supported:
1450: old IPsec AH
1451: HMAC MD5 with 128bit crypto checksum + 64bit replay prevention
1452: (rfc2085.txt)
1453: keyed SHA1 with 160bit crypto checksum + 32bit padding (rfc1852.txt)
1454:
1.5 itojun 1455: The key/policy management API is based on the following document, with a fair
1456: amount of extensions:
1457: RFC2367: PF_KEY key management API
1.3 itojun 1458:
1.5 itojun 1459: 4.6 ECN consideration on IPsec tunnels
1.1 itojun 1460:
1461: KAME IPsec implements an ECN-friendly IPsec tunnel, described in
1.12 itojun 1462: draft-ietf-ipsec-ecn-02.txt.
1.1 itojun 1463: The normal IPsec tunnel is described in RFC2401. On encapsulation, the
1464: IPv4 TOS field (or IPv6 traffic class field) is copied from the inner
1465: IP header to the outer IP header. On decapsulation the outer IP header
1466: is simply dropped. This decapsulation rule is not compatible
1467: with ECN, since the ECN bit in the outer IP TOS/traffic class field will be
1468: lost.
1469: To make the IPsec tunnel ECN-friendly, we should modify the encapsulation
1470: and decapsulation procedures. This is described in
1.12 itojun 1471: draft-ietf-ipsec-ecn-02.txt, chapter 3.3.
1.1 itojun 1472:
1473: KAME IPsec tunnel implementation can give you three behaviors, by setting
1474: net.inet.ipsec.ecn (or net.inet6.ipsec6.ecn) to some value:
1475: - RFC2401: no consideration for ECN (sysctl value -1)
1476: - ECN forbidden (sysctl value 0)
1477: - ECN allowed (sysctl value 1)
1478: Note that the behavior is configurable per node, not per SA
1.12 itojun 1479: (draft-ietf-ipsec-ecn-02 wants per-SA configuration, but that looks like too
1480: much to me).
1.1 itojun 1481:
1482: The behavior is summarized as follows (see source code for more detail):
1483:
1484:                 encapsulate                     decapsulate
1485:                 ---                             ---
1486: RFC2401         copy all TOS bits               drop TOS bits on outer
1487:                 from inner to outer.            (use inner TOS bits as is)
1488:
1489: ECN forbidden   copy TOS bits except for ECN    drop TOS bits on outer
1490:                 (masked with 0xfc) from inner   (use inner TOS bits as is)
1491:                 to outer. set ECN bits to 0.
1492:
1493: ECN allowed     copy TOS bits except for ECN    use inner TOS bits with some
1494:                 CE (masked with 0xfe) from      change. if outer ECN CE bit
1495:                 inner to outer.                 is 1, enable ECN CE bit on
1496:                 set ECN CE bit to 0.            the inner.
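
The table above can be written down as two small functions. This is a sketch
of the summarized behavior only: the names are hypothetical, the constants
mirror the sysctl values listed earlier, and the real code may perform
additional checks (see the source code for more detail). The CE bit is taken
to be 0x01, which matches the 0xfe mask in the table.

```python
ECN_RFC2401 = -1    # no consideration for ECN
ECN_FORBIDDEN = 0   # ECN forbidden
ECN_ALLOWED = 1     # ECN allowed

ECN_CE = 0x01       # congestion experienced bit within TOS/traffic class


def encap_tos(mode, inner_tos):
    """Outer TOS value to use when encapsulating."""
    if mode == ECN_RFC2401:
        return inner_tos            # copy all TOS bits
    if mode == ECN_FORBIDDEN:
        return inner_tos & 0xFC     # clear both ECN bits
    return inner_tos & 0xFE         # ECN allowed: clear only the CE bit


def decap_tos(mode, outer_tos, inner_tos):
    """Inner TOS value to use after decapsulating."""
    if mode == ECN_ALLOWED and (outer_tos & ECN_CE):
        return inner_tos | ECN_CE   # propagate congestion mark to the inner
    return inner_tos                # outer TOS bits are dropped


for mode in (ECN_RFC2401, ECN_FORBIDDEN, ECN_ALLOWED):
    print(mode, hex(encap_tos(mode, 0xB7)), hex(decap_tos(mode, 0x03, 0x02)))
```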
1497:
1498: The general strategy for configuration is as follows:
1499: - if both IPsec tunnel endpoints are capable of ECN-friendly behavior,
1500:   you'd better configure both ends to "ECN allowed" (sysctl value 1).
1501: - if the other end is very strict about TOS bits, use "RFC2401"
1502:   (sysctl value -1).
1503: - in other cases, use "ECN forbidden" (sysctl value 0).
1504: The default behavior is "ECN forbidden" (sysctl value 0).
1505:
1506: For more information, please refer to:
1.12 itojun 1507: draft-ietf-ipsec-ecn-02.txt
1.1 itojun 1508: RFC2481 (Explicit Congestion Notification)
1509: KAME sys/netinet6/{ah,esp}_input.c
1510:
1511: (Thanks go to Kenjiro Cho <kjc@csl.sony.co.jp> for detailed analysis)
1512:
1.5 itojun 1513: 4.7 Interoperability
1514:
1515: IPsec, IPComp (in kernel) and IKE (in userland as "racoon") have been tested
1516: at several interoperability test events, and are known to interoperate
1517: well with many other implementations. Also, KAME IPsec has quite wide
1518: coverage of the IPsec crypto algorithms documented in RFCs (we do not cover
1519: algorithms with intellectual property issues, though).
1.3 itojun 1520:
1521: Here are (some of) the platforms we have tested IPsec/IKE interoperability
1.5 itojun 1522: with in the past, in no particular order. Note that both ends (KAME and
1523: others) may have modified their implementations, so use the following
1524: list just for reference purposes.
1525: Altiga, Ashley-laurent (vpcom.com), Data Fellows (F-Secure),
1526: BlueSteel, CISCO, Ericsson, ACC, Fitel, FreeS/WAN, HITACHI, IBM
1.8 itojun 1527: AIX, IIJ, Intel, Microsoft WinNT, NAI PGPnet,
1.12 itojun 1528: NIST (linux IPsec + plutoplus), Netscreen, OpenBSD isakmpd, Radguard,
1.8 itojun 1529: RedCreek, Routerware, SSH, Secure Computing, Soliton, Toshiba,
1530: TIS/NAI Gauntlet, VPNet, Yamaha RT100i
1.5 itojun 1531:
1532: Here are (some of) the platforms we have tested IPComp/IKE interoperability
1533: with in the past, in no particular order.
1534: IRE
1535:
1536: 5. ALTQ
1537:
1.8 itojun 1538: # removed since it is not imported to NetBSD-current
1.1 itojun 1539:
1.8 itojun 1540: 6. mobile-ip6
1541:
1542: # removed since it is not imported to NetBSD-current
1.1 itojun 1543:
1.3 itojun 1544: <end of IMPLEMENTATION>