Annotation of src/sys/arch/hp300/DOC/Debug.tips, Revision 1.2
1.2 ! cgd 1: $NetBSD$
! 2:
1.1 mycroft 3: NOTE: this description applies to the hp300 system with the old BSD
4: virtual memory system. It has not been updated to reflect the new,
5: Mach-derived VM system, but should still be useful.
6: The new system has no fixed-address "u.", but has a fixed mapping
7: for the kernel stack at 0xfff00000.
8:
9: --------------------------------------------------------------------------
10:
11: Some quick notes on the HPBSD VM layout and kernel debugging.
12:
13: Physical memory:
14:
15: Physical memory always ends at the top of the 32 bit address space; i.e. the
16: last addressable byte is at 0xFFFFFFFF. Hence, the start of physical memory
17: varies depending on how much memory is installed. The kernel variable "lowram"
18: contains the starting location of memory as provided by the ROM.
19:
20: The low 128k (I think) of the physical address space is occupied by the ROM.
21: This is accessible via /dev/mem *only* if the kernel is compiled with DEBUG.
22: [ Maybe it should always be accessible? ]
23:
24: Virtual address spaces:
25:
26: The hardware page size is 4096 bytes. The hardware uses a two-level lookup.
27: At the highest level is a one page segment table which maps a page table which
28: maps the address space. Each 4 byte segment table entry (described in
29: hp300/pte.h) contains the page number of a single page of 4 byte page table
30: entries. Each PTE maps a single page of address space. Hence, each STE maps
31: 4Mb of address space and one page containing 1024 STEs is adequate to map the
32: entire 4Gb address space.
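
The two-level lookup described above can be sketched as follows. This is only an illustration of the arithmetic, not code from the kernel; the function and constant names are made up for the example.

```python
# Sketch of how a 32-bit virtual address splits under the two-level scheme:
# 10 bits of segment table index, 10 bits of page table index, 12 bits of
# byte offset. All names here are hypothetical.
PAGE_SHIFT = 12   # 4096-byte pages
SEG_SHIFT = 22    # each STE maps 4Mb (1024 PTEs * 4096 bytes)
INDEX_MASK = 0x3FF  # 1024 entries per table page

def split_va(va):
    """Return (segment index, page index within the segment, byte offset)."""
    seg = (va >> SEG_SHIFT) & INDEX_MASK
    page = (va >> PAGE_SHIFT) & INDEX_MASK
    offset = va & 0xFFF
    return seg, page, offset
```

For instance, split_va(0xFFF00000) selects the last STE (index 0x3FF) and PTE index 0x300 within that segment's page table page.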
33:
34: Both page and segment table entries look similar. Both have the page frame
35: in the upper part and control bits in the lower. This is the opposite of
36: the VAX. It is easy to convert the page frame number in an STE/PTE to a
37: physical address: simply mask out the low 12 bits. For example,
38: if a PTE contains 0xFF880019, the physical memory location mapped starts at
39: 0xFF880000.
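
The same masking can be written out as a small helper when working through many entries by hand. The function names are hypothetical; the meaning of the individual control bits is in hp300/pte.h and is not decoded here.

```python
# Sketch: separate the page frame address from the control bits of an
# STE or PTE. Helper names are made up for this example.
FRAME_MASK = 0xFFFFF000

def frame_address(entry):
    """Physical address of the page named by an STE/PTE."""
    return entry & FRAME_MASK

def control_bits(entry):
    """Low 12 bits of the entry (see hp300/pte.h for their meaning)."""
    return entry & 0xFFF
```

With the PTE from the example above, frame_address(0xFF880019) gives 0xFF880000 and control_bits(0xFF880019) gives 0x019.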
40:
41: Kernel address space:
42:
43: The kernel resides in its own virtual address space independent of all user
44: processes. When the processor is in supervisor mode (i.e. interrupt or
45: exception handling) it uses the kernel virtual mapping. The kernel segment
46: table is called Sysseg and is allocated statically in hp300/locore.s. The
47: kernel page table is called Systab and is also allocated statically in
48: hp300/locore.s and consists of the usual assortment of SYSMAPs.
49: The size of Systab (Syssize) depends on the configured size of the various
50: maps but as currently configured is 9216 PTEs. Both segment and page tables
51: are initialized at bootup in hp300/locore.s. The segment table never changes
52: (except for bits maintained by the hardware). Portions of the page table
53: change as needed. The kernel is mapped into the address space starting at 0.
54:
55: Theoretically, any address in the range 0 to Syssize * 4096 (0x2400000 as
56: currently configured) is valid. However, certain addresses are more common
57: in dumps than others. Those are (for the current configuration):
58:
59: 0 - 0x800000 kernel text and permanent data structures
60: 0x917000 - 0x91a000 u-area; 1st page is user struct, last k-stack
61: 0x1b1b000 - 0x2400000 user page tables, also kmem_alloc()ed data
62:
63: User address space:
64:
65: The user text and data are loaded starting at VA 0. The user's stack starts
66: at 0xFFF00000 and grows toward lower addresses. The pages above the user
67: stack are used by the kernel. From 0xFFF00000 to 0xFFF03000 is the u-area.
68: The 3 PTEs for this range map (read-only) the same memory as does 0x917000
69: to 0x91a000 in the kernel address space. This address range is never used
70: by the kernel, but exists for utilities that assume that the u-area sits
71: above the user stack. The pages from 0xFFF03000 up are not used. They
72: exist so that the user stack is in the same location as in HPUX.
73:
74: The user segment table is allocated along with the page tables from Usrptmap.
75: They are contiguous in kernel VA space with the page tables coming before
76: the segment table. Hence, a process has p_szpt+1 pages allocated starting
77: at kernel VA p_p0br.
78:
79: The user segment table is typically very sparse since each entry maps 4Mb.
80: There are usually only two valid STEs, one at the start mapping the text/data
81: portion of the page table, and one at the end mapping the stack/u-area. For
82: example if the segment table was at 0xFFFFA000 there would be valid entries
83: at 0xFFFFA000 and 0xFFFFAFFC.
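
The location of the STE for any user virtual address follows from the 4Mb-per-entry rule. A minimal sketch, with a hypothetical helper name:

```python
# Sketch: address of the segment table entry covering a virtual address.
# Each STE is 4 bytes and maps 4Mb, so the index is the top 10 bits of VA.
def ste_address(segtab_base, va):
    return segtab_base + ((va >> 22) & 0x3FF) * 4
```

With the segment table at 0xFFFFA000 as in the example above, VA 0 lands on 0xFFFFA000 and the u-area at 0xFFF00000 lands on 0xFFFFAFFC.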
84:
85: Random notes:
86:
87: An important thing to note is that there are no hardware length registers
88: on the HP. This implies that we cannot "pack" data and stack PTEs into the
89: same page table page. Hence, every user page table has at least 2 pages
90: (3 if you count the segment table).
91:
92: The HP maintains the p0br/p0lr and p1br/p1lr PCB fields the same as the
93: VAX even though they have no meaning to the hardware. This also keeps many
94: utilities happy.
95:
96: There is no separate interrupt stack (right now) on the HPs. Interrupt
97: processing is handled on the kernel stack of the "current" process.
98:
99: Following is a list of things you might want to be able to do with a kernel
100: core dump. One thing you should always have is a ps listing from the core
101: file. Just do:
102:
103: ps klaw vmunix.? vmcore.?
104:
105: Exception related panics (i.e. those detected in hp300/trap.c) will dump
106: out various useful information before panicking. If available, you should
107: get this out of the /usr/adm/messages file. Finally, you should be in adb:
108:
109: adb -k vmunix.? vmcore.?
110:
111: Adb -k will allow you to examine the kernel address space more easily.
112: It automatically maps kernel VAs in the range 0 to 0x2400000 to physical
113: addresses. Since the kernel and user address spaces overlap (i.e. both
114: start at 0), adb can't let you examine the address space of the "current"
115: process as it does on the VAX.
116: --------
117:
118: 1. Find out what the current process was at the time of the crash:
119:
120: If you have the dump info from /usr/adm/messages, it should contain the
121: PID of the active process. If you don't have this info you can just look
122: at location "Umap". This is the PTE for the first page of the u-area; i.e.
123: the user structure. Forget about the last 3 hex digits and compare the top
124: 5 to the ADDR column in the ps listing.
125:
126: 2. Locating a process' user structure:
127:
128: Get the ADDR field of the desired process from the ps listing. This is the
129: page frame number of the process' user structure. Tack 3 zeros on to the
130: end to get the physical address. Note that this doesn't give you the kernel
131: stack since it is in a different page than the user-structure and pages of
132: the u-area are not physically contiguous.
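
Steps 1 and 2 are the same conversion run in opposite directions. A sketch, assuming the ADDR column is read as a hex page frame number; the helper names are made up and any values fed to them are whatever your own dump shows:

```python
# Sketch: relate a PTE (such as the one at "Umap") to the ADDR column of
# the ps listing, and ADDR to a physical address. Names are hypothetical.
def addr_from_pte(pte):
    """Drop the last 3 hex digits (control bits), leaving the frame number."""
    return pte >> 12

def physical_from_addr(addr):
    """Tack 3 hex zeros onto a page frame number."""
    return addr << 12
```

If the value at Umap were 0xFF880019, the matching ps ADDR would be ff880, and that process' user structure would start at physical address 0xFF880000.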
133:
134: 3. Locating a process' proc structure:
135:
136: First find the process' user structure as described above. Find the u_procp
137: field at offset 0x200 from the beginning. This gives you the kernel VA of
138: the proc structure.
139:
140: 4. Locating a process' page table:
141:
142: First find the process' user structure as described above. The first part
143: of the user structure is the PCB. The second longword (third field) of the
144: PCB is pcb_ustp, a pointer to the user segment table. This pointer is
145: actually the page frame number. Again adding 3 zeros yields the physical
146: address. You can now use the values in the segment table to locate the
147: page tables. For example, to locate the first page of the text/data part
148: of the page table, use the first STE (longword) in the segment table.
149:
150: 5. Locating a process' kernel stack:
151:
152: First find the process' page table as described above. The kernel stack
153: is near the end of the user address space. So, locate the last entry in the
154: user segment table (base+0xFFC) and use that entry to find the last page of
155: the user page table. Look at the last 256 entries of this page
156: (pagebase+0xC00). The first is the PTE for the user-structure. The second
157: was intended to be a read-only page to protect the user structure from the
158: kernel stack. Currently it is read/write and actually allocated. Hence
159: it can wind up being a second page for the kernel stack. The third is the
160: kernel stack. The last 253 should be zero. Hence, indirecting through the
161: third of these last 256 PTEs will give you the kernel stack page.
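
The byte offsets of the three u-area PTEs within the last page table page follow from the u-area address given earlier (0xFFF00000 to 0xFFF03000). This is a sketch of that arithmetic only; the names are hypothetical.

```python
# Sketch: byte offset, within its page table page, of the PTE mapping a
# given virtual address (1024 four-byte PTEs per page).
def pte_offset_in_page(va):
    return ((va >> 12) & 0x3FF) * 4

UAREA_BASE = 0xFFF00000  # from the "User address space" section

# Offsets of the user-structure PTE, the intended guard page PTE, and the
# kernel stack PTE, in that order.
uarea_pte_offsets = [pte_offset_in_page(UAREA_BASE + i * 4096) for i in range(3)]
```

The kernel stack PTE is the third of these, so it sits 8 bytes past the start of the last 256 entries.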
162:
163: An alternate way to do this is to use the p_addr field of the proc structure
164: which is found as described above. The p_addr field is at offset 0x10 in the
165: proc structure and points to the first of the PTEs mentioned above (i.e. the
166: user structure PTE).
167:
168: 6. Interpreting the info in a "trap type N..." panic:
169:
170: As mentioned, when the kernel crashes out of hp300/trap.c it will dump some
171: useful information. This dates back to the days when I was debugging the
172: exception handling code and had no kernel adb or even kernel crash dump code.
173: "trap type" (decimal) is as defined in hp300/trap.h; it doesn't really
174: correlate with anything useful. "code" (hex) is only useful for MMU
175: (trap type 8) errors. It is the concatenation of the MMU status register
176: (see hp300/cpu.h) in the high 16 bits and the 68020 special status word
177: (see the 020 manual page 6-17) in the low 16. "v" (hex) is the virtual
178: address which caused the fault. "pid" (decimal) is the ID of the process
179: running at the time of the exception. Note that if we panic in an interrupt
180: routine, this process may not be related to the panic. "ps" (hex) is the
181: value of the 68020 status register (see page 1-4 of 020 manual) at the time
182: of the crash. If the 0x2000 bit is on, we were in supervisor (kernel) mode
183: at the time, otherwise we were in user mode. "pc" (hex) is the value of the
184: PC saved on the hardware exception frame. It may *not* be the PC of the
185: instruction causing the fault (see the 020 manual for details). The 0x2000
186: bit of "ps" dictates whether this is a kernel or user VA. "sfc" and "dfc"
187: are the 68020 source/destination function codes. They should always be one.
188: "p0" and "p1" are the VAX-like region registers. They are of the form:
189:
190: <length> '@' <kernel VA>
191:
192: where both are in hex. Following these values is a dump of the processor
193: registers (hex). Check the address registers for values close to "v", the
194: fault address. Most faults are caused by dereferences of bogus pointers.
195: Most such dereferences are the result of 020 instructions using the:
196:
197: <address-register> '@' '(' offset ')'
198:
199: addressing mode. This can help you track down the faulting instruction (since
200: the PC may not point to it). Note that the value of a7 (the stack pointer) is
201: ALWAYS the user SP. This is brain-dead I know. Finally, there is a dump of the
202: stack (user/kernel) at the time of the offense. Before kernel crash dumps,
203: this was very useful.
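
Two of the fields above are simple bit manipulations and can be pulled apart mechanically. The sketch below only splits the values; it does not interpret the MMU status or special status word bits, and the helper names are made up for the example.

```python
# Sketch: split the "code" and "ps" values printed by a trap panic.
def split_code(code):
    """Return (MMU status register, 68020 special status word)."""
    return (code >> 16) & 0xFFFF, code & 0xFFFF

def in_supervisor_mode(ps):
    """True if the 0x2000 bit of the status register was set."""
    return bool(ps & 0x2000)
```

If in_supervisor_mode(ps) is true, treat "pc" as a kernel VA; otherwise it is a user VA.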
204:
205: 7. Converting a kernel virtual address to a physical address:
206:
207: Adb -k already does this for you, but sometimes you want to know what the
208: resulting physical address is rather than what is there. Doing this is
209: simply a matter of indexing into the kernel page table. In theory we would
210: first have to do a lookup in the kernel segment table, but we know that the
211: kernel page table is physically contiguous so this isn't necessary. The
212: base of the system page table is "Sysmap", so to convert an address V just
213: divide the address by 4096 to get the page number, multiply that by 4 (the
214: size of a PTE in bytes) to get a byte offset, and add that to "Sysmap".
215: This gives you the address of the PTE mapping V. You can then get the
216: physical address by masking out the low 12 bits of the contents of that PTE.
217: To wit:
218:
219: *(Sysmap+(VA%1000*4))&fffff000
220:
221: where VA is the virtual address in question.
222:
223: This technique should also work for user virtual addresses if you replace
224: "Sysmap" with the value of the appropriate process' P0BR. This works
225: because a user's page table is *virtually* contiguous in the kernel
226: starting at P0BR, and adb will handle translating the kernel virtual addresses
227: for you.
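
The lookup in step 7 can be written out as ordinary arithmetic. This is a sketch with hypothetical helper names. The adb expression appears to rely on "%" being integer division and on numbers being read as hex, which is an assumption about adb here; the Python version spells both out. The table base passed in stands for the value of "Sysmap" (or a process' P0BR) from your own dump. Carrying the page offset over to the result is an addition for convenience; the text above stops at the page's physical base.

```python
# Sketch: locate the PTE for a virtual address and form the physical
# address from the PTE contents. All names are hypothetical.
def pte_address(table_base, va):
    """Address of the 4-byte PTE mapping va in a contiguous page table."""
    return table_base + (va // 0x1000) * 4

def physical_address(pte, va):
    """Physical page base from the PTE, plus the byte offset within the page."""
    return (pte & 0xFFFFF000) | (va & 0xFFF)
```

For example, with a made-up table base of 0x100000, the PTE for kernel VA 0x917000 is at 0x10245C.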