Annotation of src/external/mit/expat/dist/doc/xmlwf.1, Revision 1.1.1.1.22.2
1.1.1.1.22.1 snj 1: '\" -*- coding: us-ascii -*-
2: .if \n(.g .ds T< \\FC
3: .if \n(.g .ds T> \\F[\n[.fam]]
4: .de URL
5: \\$2 \(la\\$1\(ra\\$3
6: ..
7: .if \n(.g .mso www.tmac
8: .TH XMLWF 1 "March 11, 2016" "" ""
1.1 tron 9: .SH NAME
10: xmlwf \- Determines if an XML document is well-formed
11: .SH SYNOPSIS
1.1.1.1.22.1 snj 12: 'nh
13: .fi
14: .ad l
15: \fBxmlwf\fR \kx
16: .if (\nx>(\n(.l/2)) .nr x (\n(.l/5)
17: 'in \n(.iu+\nxu
18: [\fB-s\fR] [\fB-n\fR] [\fB-p\fR] [\fB-x\fR] [\fB-e \fIencoding\fB\fR] [\fB-w\fR] [\fB-d \fIoutput-dir\fB\fR] [\fB-c\fR] [\fB-m\fR] [\fB-r\fR] [\fB-t\fR] [\fB-v\fR] [file ...]
19: 'in \n(.iu-\nxu
20: .ad b
21: 'hy
22: .SH DESCRIPTION
1.1 tron 23: \fBxmlwf\fR uses the Expat library to
1.1.1.1.22.2! snj 24: determine if an XML document is well-formed. It is
1.1 tron 25: non-validating.
26: .PP
27: If you do not specify any files on the command line, and you
28: have a recent version of \fBxmlwf\fR, the
29: input file will be read from standard input.
30: .SH "WELL-FORMED DOCUMENTS"
31: A well-formed document must adhere to the
32: following rules:
33: .TP 0.2i
34: \(bu
1.1.1.1.22.2! snj 35: The file begins with an XML declaration. For instance,
1.1.1.1.22.1 snj 36: \*(T<<?xml version="1.0" standalone="yes"?>\*(T>.
37: \fINOTE:\fR
1.1 tron 38: \fBxmlwf\fR does not currently
39: check for a valid XML declaration.
40: .TP 0.2i
41: \(bu
42: Every element is either an empty-element tag (<tag/>)
43: or a start tag with a corresponding end tag.
44: .TP 0.2i
45: \(bu
1.1.1.1.22.2! snj 46: There is exactly one root element. This element must contain
! 47: all other elements in the document. Only comments, white
1.1 tron 48: space, and processing instructions may come after the close
49: of the root element.
50: .TP 0.2i
51: \(bu
52: All elements nest properly.
53: .TP 0.2i
54: \(bu
55: All attribute values are enclosed in quotes (either single
56: or double).
57: .PP
58: If the document has a DTD, and it strictly complies with that
1.1.1.1.22.1 snj 59: DTD, then the document is also considered \fIvalid\fR.
1.1 tron 60: \fBxmlwf\fR is a non-validating parser --
1.1.1.1.22.2! snj 61: it does not check the DTD. However, it does support
1.1.1.1.22.1 snj 62: external entities (see the \*(T<\fB\-x\fR\*(T> option).
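The rules above can be tried out without xmlwf itself. The following is a minimal sketch, assuming Python's standard xml.parsers.expat binding to the Expat library is available; the helper name check_well_formed is hypothetical and is not part of xmlwf or Expat. Like xmlwf as described above, it does not insist on an XML declaration and does not validate against a DTD.

```python
import xml.parsers.expat


def check_well_formed(data: bytes):
    """Return None if data is well-formed, else a one-line error description.

    Hypothetical helper for illustration; it only mirrors the idea of
    xmlwf's check, not its exact output format.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this buffer as the final chunk of input.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        # str(err) already includes the line and column of the problem.
        return str(err)
    return None


if __name__ == "__main__":
    samples = [
        b"<a><b/></a>",   # nested correctly, single root element
        b"<a><b></a>",    # <b> has no end tag
        b"<a/><b/>",      # more than one root element
        b"<a x=1/>",      # attribute value is not quoted
    ]
    for sample in samples:
        print(sample, "->", check_well_formed(sample) or "well-formed")
```

Each of the last three samples breaks one of the listed rules, so the helper returns an error string for them and None for the first.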
63: .SH OPTIONS
1.1 tron 64: When an option includes an argument, you may specify the argument either
1.1.1.1.22.1 snj 65: separately ("\*(T<\fB\-d\fR\*(T> output") or concatenated with the
66: option ("\*(T<\fB\-d\fR\*(T>output"). \fBxmlwf\fR
1.1 tron 67: supports both.
1.1.1.1.22.2! snj 68: .TP
1.1.1.1.22.1 snj 69: \*(T<\fB\-c\fR\*(T>
1.1 tron 70: If the input file is well-formed and \fBxmlwf\fR
71: doesn't encounter any errors, the input file is simply copied to
72: the output directory unchanged.
1.1.1.1.22.1 snj 73: This implies no namespaces (turns off \*(T<\fB\-n\fR\*(T>) and
74: requires \*(T<\fB\-d\fR\*(T> to specify an output directory.
1.1.1.1.22.2! snj 75: .TP
1.1.1.1.22.1 snj 76: \*(T<\fB\-d output\-dir\fR\*(T>
1.1 tron 77: Specifies a directory to contain transformed
78: representations of the input files.
1.1.1.1.22.1 snj 79: By default, \*(T<\fB\-d\fR\*(T> outputs a canonical representation
1.1 tron 80: (described below).
1.1.1.1.22.1 snj 81: You can select different output formats using \*(T<\fB\-c\fR\*(T>
82: and \*(T<\fB\-m\fR\*(T>.
1.1 tron 83:
84: The output filenames will
85: be exactly the same as the input filenames or "STDIN" if the input is
1.1.1.1.22.2! snj 86: coming from standard input. Therefore, you must be careful that the
1.1 tron 87: output file does not go into the same directory as the input
1.1.1.1.22.2! snj 88: file. Otherwise, \fBxmlwf\fR will truncate the
1.1 tron 89: input file before it generates the output file (just like running
1.1.1.1.22.1 snj 90: \*(T<cat < file > file\*(T> in most shells).
1.1 tron 91:
92: Two structurally equivalent XML documents have a byte-for-byte
93: identical canonical XML representation.
94: Note that ignorable white space is considered significant and
95: is treated equivalently to data.
96: More on canonical XML can be found at
97: http://www.jclark.com/xml/canonxml.html .
1.1.1.1.22.2! snj 98: .TP
1.1.1.1.22.1 snj 99: \*(T<\fB\-e encoding\fR\*(T>
1.1 tron 100: Specifies the character encoding for the document, overriding
1.1.1.1.22.2! snj 101: any document encoding declaration. \fBxmlwf\fR
1.1 tron 102: supports four built-in encodings:
1.1.1.1.22.1 snj 103: \*(T<US\-ASCII\*(T>,
104: \*(T<UTF\-8\*(T>,
105: \*(T<UTF\-16\*(T>, and
106: \*(T<ISO\-8859\-1\*(T>.
107: Also see the \*(T<\fB\-w\fR\*(T> option.
1.1.1.1.22.2! snj 108: .TP
1.1.1.1.22.1 snj 109: \*(T<\fB\-m\fR\*(T>
1.1 tron 110: Outputs an XML file of meta-information that completely
111: describes the input file, including character positions.
1.1.1.1.22.1 snj 112: Requires \*(T<\fB\-d\fR\*(T> to specify an output directory.
1.1.1.1.22.2! snj 113: .TP
1.1.1.1.22.1 snj 114: \*(T<\fB\-n\fR\*(T>
1.1.1.1.22.2! snj 115: Turns on namespace processing.
1.1.1.1.22.1 snj 116: \*(T<\fB\-c\fR\*(T> disables namespaces.
1.1.1.1.22.2! snj 117: .TP
1.1.1.1.22.1 snj 118: \*(T<\fB\-p\fR\*(T>
1.1 tron 119: Tells xmlwf to process external DTDs and parameter
120: entities.
121:
122: Normally \fBxmlwf\fR never parses parameter
1.1.1.1.22.1 snj 123: entities. \*(T<\fB\-p\fR\*(T> tells it to always parse them.
124: \*(T<\fB\-p\fR\*(T> implies \*(T<\fB\-x\fR\*(T>.
1.1.1.1.22.2! snj 125: .TP
1.1.1.1.22.1 snj 126: \*(T<\fB\-r\fR\*(T>
1.1 tron 127: Normally \fBxmlwf\fR memory-maps the XML file
128: before parsing; this can result in faster parsing on many
129: platforms.
1.1.1.1.22.1 snj 130: \*(T<\fB\-r\fR\*(T> turns off memory-mapping and uses normal file
1.1 tron 131: I/O calls instead.
132: Of course, memory-mapping is automatically turned off
133: when reading from standard input.
134:
135: Use of memory-mapping can cause some platforms to report
136: substantially higher memory usage for
137: \fBxmlwf\fR, but this appears to be a matter of
138: the operating system reporting memory in a strange way; there is
139: not a leak in \fBxmlwf\fR.
1.1.1.1.22.2! snj 140: .TP
1.1.1.1.22.1 snj 141: \*(T<\fB\-s\fR\*(T>
1.1 tron 142: Prints an error if the document is not standalone.
143: A document is standalone if it has no external subset and no
144: references to parameter entities.
1.1.1.1.22.2! snj 145: .TP
1.1.1.1.22.1 snj 146: \*(T<\fB\-t\fR\*(T>
1.1.1.1.22.2! snj 147: Turns on timings. This tells Expat to parse the entire file,
1.1 tron 148: but not perform any processing.
149: This gives a fairly accurate idea of the raw speed of Expat itself
150: without client overhead.
1.1.1.1.22.1 snj 151: \*(T<\fB\-t\fR\*(T> turns off most of the output options
152: (\*(T<\fB\-d\fR\*(T>, \*(T<\fB\-m\fR\*(T>, \*(T<\fB\-c\fR\*(T>, ...).
1.1.1.1.22.2! snj 153: .TP
1.1.1.1.22.1 snj 154: \*(T<\fB\-v\fR\*(T>
1.1 tron 155: Prints the version of the Expat library being used, including some
156: information on the compile-time configuration of the library, and
157: then exits.
1.1.1.1.22.2! snj 158: .TP
1.1.1.1.22.1 snj 159: \*(T<\fB\-w\fR\*(T>
1.1 tron 160: Enables support for Windows code pages.
161: Normally, \fBxmlwf\fR will throw an error if it
1.1.1.1.22.2! snj 162: runs across an encoding that it is not equipped to handle itself. With
1.1.1.1.22.1 snj 163: \*(T<\fB\-w\fR\*(T>, xmlwf will try to use a Windows code
164: page. See also \*(T<\fB\-e\fR\*(T>.
1.1.1.1.22.2! snj 165: .TP
1.1.1.1.22.1 snj 166: \*(T<\fB\-x\fR\*(T>
1.1 tron 167: Turns on parsing external entities.
168:
169: Non-validating parsers are not required to resolve external
170: entities, or even expand entities at all.
171: Expat always expands internal entities (?),
172: but external entity parsing must be enabled explicitly.
173:
174: External entities are simply entities that obtain their
175: data from outside the XML file currently being parsed.
176:
177: This is an example of an internal entity:
178:
179: .nf
1.1.1.1.22.2! snj 180:
1.1 tron 181: <!ENTITY vers '1.0.2'>
182: .fi
183:
184: And here are some examples of external entities:
185:
186: .nf
1.1.1.1.22.2! snj 187:
1.1.1.1.22.1 snj 188: <!ENTITY header SYSTEM "header\-&vers;.xml"> (parsed)
1.1 tron 189: <!ENTITY logo SYSTEM "logo.png" NDATA PNG> (unparsed)
190: .fi
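To see internal entity expansion in practice, here is a small sketch, assuming Python's standard xml.parsers.expat binding to Expat. The sample document and the helper name collect_text are made up for illustration. It declares the internal entity from the example above in an internal DTD subset and collects the character data the parser reports. External entities are not covered by this sketch.

```python
import xml.parsers.expat

# Hypothetical sample document: an internal DTD subset declaring the
# internal entity "vers", which is then referenced in element content.
DOC = b"""<!DOCTYPE doc [
  <!ENTITY vers '1.0.2'>
]>
<doc>version &vers;</doc>"""


def collect_text(data: bytes) -> str:
    """Parse data and return all character data, with entities expanded."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # Character data may arrive in several pieces, so gather and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)


if __name__ == "__main__":
    print(collect_text(DOC))
```

The reference &vers; is replaced by its declared value before the text reaches the handler, so no extra option is needed for internal entities.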
1.1.1.1.22.2! snj 191: .TP
1.1.1.1.22.1 snj 192: \*(T<\fB\-\-\fR\*(T>
1.1 tron 193: (Two hyphens.)
1.1.1.1.22.2! snj 194: Terminates the list of options. This is only needed if a filename
! 195: starts with a hyphen. For example:
1.1 tron 196:
197: .nf
1.1.1.1.22.2! snj 198:
1.1.1.1.22.1 snj 199: xmlwf \-\- \-myfile.xml
1.1 tron 200: .fi
201:
202: will run \fBxmlwf\fR on the file
1.1.1.1.22.1 snj 203: \*(T<\fI\-myfile.xml\fR\*(T>.
1.1 tron 204: .PP
205: Older versions of \fBxmlwf\fR do not support
206: reading from standard input.
1.1.1.1.22.1 snj 207: .SH OUTPUT
1.1 tron 208: If an input file is not well-formed,
209: \fBxmlwf\fR prints a single line describing
1.1.1.1.22.2! snj 210: the problem to standard output. If a file is well-formed,
1.1 tron 211: \fBxmlwf\fR outputs nothing.
1.1.1.1.22.1 snj 212: Note that the result code is \fInot\fR set.
213: .SH BUGS
1.1 tron 214: \fBxmlwf\fR returns an exit status of 0 (no error),
1.1.1.1.22.2! snj 215: even if the file is not well-formed. There is no good way for
1.1 tron 216: a program to use \fBxmlwf\fR to quickly
217: check a file -- it must parse \fBxmlwf\fR's
218: standard output.
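A calling program can work around this by treating any text on standard output as a failure. The sketch below is one possible approach in Python, assuming the behavior described in this manual page (one line per problem, nothing for a well-formed file) and an xmlwf binary on the PATH. The function names interpret_output and run_xmlwf are hypothetical.

```python
import subprocess


def interpret_output(stdout: str):
    """Return the non-empty lines xmlwf printed; an empty list means no errors."""
    return [line for line in stdout.splitlines() if line.strip()]


def run_xmlwf(path: str):
    """Run xmlwf on one file and return its error lines.

    Assumes xmlwf is installed and on PATH. The "--" guards against
    filenames that start with a hyphen, as described under OPTIONS.
    The exit status is ignored because, per this page, it is not set.
    """
    result = subprocess.run(
        ["xmlwf", "--", path],
        capture_output=True,
        text=True,
        check=False,
    )
    return interpret_output(result.stdout)


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        problems = run_xmlwf(name)
        print(name, "well-formed" if not problems else problems)
```

Keeping the output interpretation separate from the process call makes the decision logic easy to exercise without xmlwf installed.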
219: .PP
220: The errors should go to standard error, not standard output.
221: .PP
1.1.1.1.22.1 snj 222: There should be a way to get \*(T<\fB\-d\fR\*(T> to send its
1.1 tron 223: output to standard output rather than forcing the user to send
224: it to a file.
225: .PP
226: I have no idea why anyone would want to use the
1.1.1.1.22.1 snj 227: \*(T<\fB\-d\fR\*(T>, \*(T<\fB\-c\fR\*(T>, and
228: \*(T<\fB\-m\fR\*(T> options. If someone could explain it to
1.1 tron 229: me, I'd like to add this information to this manpage.
1.1.1.1.22.1 snj 230: .SH ALTERNATIVES
1.1 tron 231: Here are some XML validators on the web:
232:
233: .nf
1.1.1.1.22.2! snj 234:
1.1.1.1.22.1 snj 235: http://www.hcrc.ed.ac.uk/~richard/xml\-check.html
1.1 tron 236: http://www.stg.brown.edu/service/xmlvalid/
237: http://www.scripting.com/frontier5/xml/code/xmlValidator.html
238: http://www.xml.com/pub/a/tools/ruwf/check.html
239: .fi
240: .SH "SEE ALSO"
241: .nf
1.1.1.1.22.2! snj 242:
1.1 tron 243: The Expat home page: http://www.libexpat.org/
1.1.1.1.22.1 snj 244: The W3 XML specification: http://www.w3.org/TR/REC\-xml
1.1 tron 245: .fi
1.1.1.1.22.1 snj 246: .SH AUTHOR
247: This manual page was written by Scott Bronson <\*(T<bronson@rinspin.com\*(T>> for
1.1.1.1.22.2! snj 248: the Debian GNU/Linux system (but may be used by others). Permission is
1.1 tron 249: granted to copy, distribute and/or modify this document under
250: the terms of the GNU Free Documentation
251: License, Version 1.1.