Up to [cvs.NetBSD.org] / src / sys / dev / nvmm / x86
Default branch: MAIN
Revision 1.85 / (download) - annotate - [select for diffs], Tue Sep 13 20:10:04 2022 UTC (8 months, 2 weeks ago) by riastradh
Branch: MAIN
CVS Tags: netbsd-10-base,
netbsd-10,
bouyer-sunxi-drm-base,
bouyer-sunxi-drm,
HEAD
Changes since 1.84: +76 -9
lines
Diff to previous 1.84 (colored)
nvmm(4): Add suspend/resume support.

New MD nvmm_impl callbacks:
- .suspend_interrupt forces all VMs on all physical CPUs to exit.
- .vcpu_suspend suspends an individual vCPU on a machine.
- .machine_suspend suspends an individual machine.
- .suspend suspends the whole system.
- .resume resumes the whole system.
- .machine_resume resumes an individual machine.
- .vcpu_resume resumes an individual vCPU on a machine.

Suspending nvmm:
1. causes new VM operations (ioctl and close) to block until resumed,
2. uses .suspend_interrupt to interrupt any concurrent operations and force them to return early, and then
3. uses the various suspend callbacks to suspend all vCPUs, machines, and the whole system -- all vCPUs before the machine they're on, and all machines before the system.

Resuming nvmm does the reverse of (3) -- resume system, resume each machine and then the vCPUs on that machine -- and then unblocks operations.

Implemented only for x86-vmx for now:
- suspend_interrupt triggers a TLB IPI to cause VM exits;
- vcpu_suspend issues VMCLEAR to force any in-CPU state to be written to memory;
- machine_suspend does nothing;
- suspend does VMXOFF on all CPUs;
- resume does VMXON on all CPUs;
- machine_resume does nothing; and
- vcpu_resume just marks each vCPU as valid but inactive so subsequent use will clear it and load it with vmptrld.

x86-svm left as an exercise for the reader.
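The ordering described in this message can be illustrated with a small userland sketch. It is not the kernel code: the machine and vCPU counts, the `trace` buffer, and every function name below are hypothetical, and step 1 (blocking new ioctl/close operations) is left out. It only records the order in which the callbacks named above would be invoked.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NMACH 2	/* hypothetical: two machines */
#define NVCPU 2	/* hypothetical: two vCPUs per machine */

char trace[512];	/* records the callback order, for illustration only */

void trace_reset(void)
{
	trace[0] = '\0';
}

/* Append one callback invocation to the trace. */
void note(const char *what, int mach, int vcpu)
{
	char buf[48];

	if (vcpu >= 0)
		snprintf(buf, sizeof(buf), "%s(m%d,v%d) ", what, mach, vcpu);
	else if (mach >= 0)
		snprintf(buf, sizeof(buf), "%s(m%d) ", what, mach);
	else
		snprintf(buf, sizeof(buf), "%s ", what);
	assert(strlen(trace) + strlen(buf) < sizeof(trace));
	strcat(trace, buf);
}

/* Steps 2 and 3 of suspending: interrupt, then vCPUs, machine, system. */
void sketch_suspend(void)
{
	note("suspend_interrupt", -1, -1);
	for (int m = 0; m < NMACH; m++) {
		for (int v = 0; v < NVCPU; v++)
			note("vcpu_suspend", m, v);
		note("machine_suspend", m, -1);
	}
	note("suspend", -1, -1);
}

/* Resuming is the reverse: system, then each machine, then its vCPUs. */
void sketch_resume(void)
{
	note("resume", -1, -1);
	for (int m = 0; m < NMACH; m++) {
		note("machine_resume", m, -1);
		for (int v = 0; v < NVCPU; v++)
			note("vcpu_resume", m, v);
	}
}
```

Calling `sketch_suspend()` and then inspecting `trace` shows every vCPU of a machine being suspended before that machine, and every machine before the system-wide callback.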
Revision 1.84 / (download) - annotate - [select for diffs], Sat Aug 20 23:48:51 2022 UTC (9 months, 1 week ago) by riastradh
Branch: MAIN
Changes since 1.83: +3 -2
lines
Diff to previous 1.83 (colored)
x86: Split most of pmap.h into pmap_private.h or vmparam.h.

This way pmap.h only contains the MD definition of the MI pmap(9) API, which loads of things in the kernel rely on, so changing x86 pmap internals no longer requires recompiling the entire kernel every time.

Callers needing these internals must now use machine/pmap_private.h. Note: This is not x86/pmap_private.h because it contains three parts:
1. CPU-specific (different for i386/amd64) definitions used by...
2. common definitions, including Xenisms like xpmap_ptetomach, further used by...
3. more CPU-specific inlines for pmap_pte_* operations

So {amd64,i386}/pmap_private.h defines 1, includes x86/pmap_private.h for 2, and then defines 3. Maybe we should split that out into a new pmap_pte.h to reduce this trouble.

No functional change intended, other than that some .c files must include machine/pmap_private.h when previously uvm/uvm_pmap.h polluted the namespace with pmap internals.

Note: This migrates part of i386/pmap.h into i386/vmparam.h -- specifically the parts that are needed for several constants defined in vmparam.h:
    VM_MAXUSER_ADDRESS
    VM_MAX_ADDRESS
    VM_MAX_KERNEL_ADDRESS
    VM_MIN_KERNEL_ADDRESS

Since i386 needs PDP_SIZE in vmparam.h, I added it there on amd64 too, just to keep things parallel.
Revision 1.83 / (download) - annotate - [select for diffs], Fri May 13 19:34:47 2022 UTC (12 months, 2 weeks ago) by tnn
Branch: MAIN
Changes since 1.82: +2 -4
lines
Diff to previous 1.82 (colored)
nvmm_x86_vmx.c: remove an #ifdef DIAGNOSTIC, it is wrong since r1.66
Revision 1.81.2.1 / (download) - annotate - [select for diffs], Sat Apr 3 22:28:45 2021 UTC (2 years, 1 month ago) by thorpej
Branch: thorpej-futex
Changes since 1.81: +3 -3
lines
Diff to previous 1.81 (colored) next main 1.82 (colored)
Sync with HEAD.
Revision 1.81.4.1 / (download) - annotate - [select for diffs], Sat Apr 3 21:44:51 2021 UTC (2 years, 1 month ago) by thorpej
Branch: thorpej-cfargs
Changes since 1.81: +3 -3
lines
Diff to previous 1.81 (colored) next main 1.82 (colored)
Sync with HEAD.
Revision 1.82 / (download) - annotate - [select for diffs], Fri Mar 26 15:59:53 2021 UTC (2 years, 2 months ago) by reinoud
Branch: MAIN
CVS Tags: thorpej-i2c-spi-conf2-base,
thorpej-i2c-spi-conf2,
thorpej-i2c-spi-conf-base,
thorpej-i2c-spi-conf,
thorpej-futex2-base,
thorpej-futex2,
thorpej-futex-base,
thorpej-cfargs2-base,
thorpej-cfargs2,
thorpej-cfargs-base,
cjep_sun2x-base1,
cjep_sun2x-base,
cjep_sun2x,
cjep_staticlib_x-base1,
cjep_staticlib_x-base,
cjep_staticlib_x
Changes since 1.81: +3 -3
lines
Diff to previous 1.81 (colored)
Implement nvmm_vcpu::stop, a race-free exit from nvmm_vcpu_run() without signals. This introduces a new kernel and userland NVMM version indicating this support. Patch by Kamil Rytarowski <kamil@netbsd.org> and committed on his request.
Revision 1.81 / (download) - annotate - [select for diffs], Sat Oct 24 07:14:30 2020 UTC (2 years, 7 months ago) by mgorny
Branch: MAIN
Branch point for: thorpej-futex,
thorpej-cfargs
Changes since 1.80: +6 -4
lines
Diff to previous 1.80 (colored)
Issue 64-bit versions of *XSAVE* for 64-bit amd64 programs

When calling FXSAVE, XSAVE, FXRSTOR, ... for 64-bit programs on amd64 use the 64-suffixed variant in order to include the complete FIP/FDP registers in the x87 area.

The difference between the two variants is that the FXSAVE64 (new) variant represents FIP/FDP as 64-bit fields (union fp_addr.fa_64), while the legacy FXSAVE variant uses split fields: 32-bit offset, 16-bit segment and 16-bit reserved field (union fp_addr.fa_32). The latter implies that the actual addresses are truncated to 32 bits which is insufficient in modern programs.

The change is applied only to 64-bit programs on amd64. Plain i386 and compat32 continue using plain FXSAVE. Similarly, NVMM is not changed as I am not familiar with that code.

This is a potentially breaking change. However, I don't think it likely to actually break anything because the data provided by the old variant were not meaningful (because of the truncated pointer).
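To make the truncation concrete, here is a minimal sketch of the two views of the FIP/FDP slot. The union below is modeled only on the description in this message; the type name and the member names inside `fa_32` are assumptions, and the real definition in the NetBSD headers may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout following the commit message, not the real header. */
union fp_addr_sketch {
	uint64_t fa_64;			/* FXSAVE64 view: full 64-bit pointer */
	struct {
		uint32_t fa_off;	/* legacy view: 32-bit offset */
		uint16_t fa_seg;	/* 16-bit segment */
		uint16_t fa_rsvd;	/* 16-bit reserved field */
	} fa_32;
};

/*
 * What survives of a 64-bit instruction pointer when it is stored
 * through the legacy split view: only the low 32 bits.
 */
uint64_t legacy_visible_fip(uint64_t fip)
{
	union fp_addr_sketch a;

	assert(sizeof(a) == sizeof(uint64_t));
	memset(&a, 0, sizeof(a));
	a.fa_32.fa_off = (uint32_t)fip;
	return a.fa_32.fa_off;
}
```

For a typical user-space address such as 0x00007f1234567890, the legacy view keeps only 0x34567890, which is why the old data were not meaningful for 64-bit programs.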
Revision 1.36.2.15 / (download) - annotate - [select for diffs], Sun Sep 13 11:56:44 2020 UTC (2 years, 8 months ago) by martin
Branch: netbsd-9
CVS Tags: netbsd-9-3-RELEASE,
netbsd-9-2-RELEASE,
netbsd-9-1-RELEASE
Changes since 1.36.2.14: +70 -12
lines
Diff to previous 1.36.2.14 (colored) to branchpoint 1.36 (colored) next main 1.37 (colored)
Pull up following revision(s) (requested by maxv in ticket #1078):
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.73
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.73
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.74

nvmm-x86-vmx: improve the handling of CR4
- Filter out certain features we don't want the guest to enable. This is for general correctness, and future-proofness.
- Flush the guest TLB when certain flags change.

nvmm-x86: improve the handling of RFLAGS.RF
- When injecting certain exceptions, set RF. For us to have an up-to-date view of RFLAGS, we commit the state before the event.
- When advancing RIP, clear RF.
Revision 1.36.2.14 / (download) - annotate - [select for diffs], Sun Sep 13 11:54:10 2020 UTC (2 years, 8 months ago) by martin
Branch: netbsd-9
Changes since 1.36.2.13: +20 -4
lines
Diff to previous 1.36.2.13 (colored) to branchpoint 1.36 (colored)
Pull up following revision(s) (requested by maxv in ticket #1077):
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.68
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.74
    sys/dev/nvmm/x86/nvmm_x86.c: revision 1.16

Improve emulation of MSR_IA32_ARCH_CAPABILITIES: publish only the *_NO bits. Initially they were the only ones there, but Intel then added other bits we aren't interested in, and they must be filtered out.

nvmm-x86-svm: improve the handling of MSR_EFER
Intercept reads of it as well, just to mask EFER_SVME, which the guest doesn't need to see.

nvmm-x86: improve the CPUID emulation
- Mask DTES64, DS_CPL, CID, SDBG, xTPR, PN.
- B10, B20 and IA64 do not exist, so just remove them.
Revision 1.80 / (download) - annotate - [select for diffs], Tue Sep 8 17:02:03 2020 UTC (2 years, 8 months ago) by maxv
Branch: MAIN
Changes since 1.79: +4 -7
lines
Diff to previous 1.79 (colored)
nvmm-x86: avoid hogging behavior observed recently

When the FPU code got rewritten in NetBSD, the dependency on IPL_HIGH was eliminated, and I took _vcpu_guest_fpu_enter() out of the VCPU loop since there was no need to be in the splhigh window.

Later, the code was switched to use the kernel FPU API, an API that works at IPL_VM, not at IPL_NONE.

These two changes mean that the whole VCPU loop is now executing at IPL_VM, which is not desired, because it introduces a delay in interrupt processing on the host in certain cases.

Fix this by putting _vcpu_guest_fpu_enter() back inside the VCPU loop.
Revision 1.79 / (download) - annotate - [select for diffs], Tue Sep 8 17:00:07 2020 UTC (2 years, 8 months ago) by maxv
Branch: MAIN
Changes since 1.78: +43 -23
lines
Diff to previous 1.78 (colored)
nvmm-x86-vmx: improve the handling of CR0
- CR0_ET is hard-wired to 1 in the cpu, so force CR0_ET to 1 in the shadow.
- Clarify.
Revision 1.78 / (download) - annotate - [select for diffs], Sun Sep 6 02:18:53 2020 UTC (2 years, 8 months ago) by riastradh
Branch: MAIN
Changes since 1.77: +4 -3
lines
Diff to previous 1.77 (colored)
Fix fallout from previous uvm.h cleanup.
- pmap(9) needs uvm/uvm_extern.h.
- x86/pmap.h is not usable on its own; it is only usable if included via uvm/uvm_extern.h (-> uvm/uvm_pmap.h -> machine/pmap.h).
- Make nvmm.h and nvmm_internal.h standalone.
Revision 1.77 / (download) - annotate - [select for diffs], Sat Sep 5 16:30:11 2020 UTC (2 years, 8 months ago) by riastradh
Branch: MAIN
Changes since 1.76: +3 -4
lines
Diff to previous 1.76 (colored)
Round of uvm.h cleanup.

The poorly named uvm.h is generally supposed to be for uvm-internal users only.
- Narrow it to files that actually need it -- mostly files that need to query whether curlwp is the pagedaemon, which should maybe be exposed by an external header.
- Use uvm_extern.h where feasible and uvm_*.h for things not exposed by it. We should split up uvm_extern.h but this will serve for now to reduce the uvm.h dependencies.
- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use UVMHIST(ubchist), since ubchist is declared in uvm.h but the reference evaporates if UVMHIST is not defined, so we reduce header file dependencies.
- Make uvm_device.h and uvm_swap.h independently includable while here.

ok chs@
Revision 1.76 / (download) - annotate - [select for diffs], Sat Sep 5 07:22:26 2020 UTC (2 years, 8 months ago) by maxv
Branch: MAIN
Changes since 1.75: +15 -16
lines
Diff to previous 1.75 (colored)
nvmm: update copyright headers
Revision 1.36.2.13 / (download) - annotate - [select for diffs], Fri Sep 4 18:53:43 2020 UTC (2 years, 8 months ago) by martin
Branch: netbsd-9
Changes since 1.36.2.12: +15 -8
lines
Diff to previous 1.36.2.12 (colored) to branchpoint 1.36 (colored)
Pull up following revision(s) (requested by maxv in ticket #1076):
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.75
    sys/arch/x86/include/specialreg.h: revision 1.172
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.72

nvmm-x86-vmx: fix detection of the BIOS lock
If it's locked, ensure it's locked with VMX enabled. If it's not locked, then lock it ourselves with VMX enabled.
Should fix NetBSD PR/55596.

- Add a few more CPUID flags.
- nvmm-x86-svm: check the SVM revision
  Only revision 1 exists, but check it, for future-proofness.
Revision 1.75 / (download) - annotate - [select for diffs], Fri Sep 4 17:07:33 2020 UTC (2 years, 8 months ago) by maxv
Branch: MAIN
Changes since 1.74: +22 -9
lines
Diff to previous 1.74 (colored)
nvmm-x86-vmx: improve the handling of CR0
- Flush the guest TLB when certain CR0 bits change.
- If the guest updates a static bit in CR0, then reflect the change in VMCS_CR0_SHADOW, for the guest to get the illusion that the change was applied. The "real" CR0 static bits remain unchanged.
- In vmx_vcpu_{g,s}et_state(), take VMCS_CR0_SHADOW into account.
- Slightly modify the CR4 handling code, just for more symmetry with CR0.
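The shadow mechanism described in the second bullet can be sketched in plain C. This is an illustration under assumptions, not the driver's code: the struct, the function names, and the choice of ET, NW and CD as the "static" bits are hypothetical here (the bit positions themselves are the architectural ones).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CR0_ET_SK	UINT64_C(0x00000010)	/* bit 4 */
#define CR0_NW_SK	UINT64_C(0x20000000)	/* bit 29 */
#define CR0_CD_SK	UINT64_C(0x40000000)	/* bit 30 */

/* Assumption: which bits count as "static" is chosen for illustration. */
#define CR0_STATIC_SK	(CR0_ET_SK | CR0_NW_SK | CR0_CD_SK)

struct cr0_sketch {
	uint64_t real;		/* value the CPU actually runs with */
	uint64_t shadow;	/* value the guest believes it wrote */
};

/* Guest write: static bits go to the shadow only, the rest is applied. */
void guest_write_cr0(struct cr0_sketch *s, uint64_t val)
{
	assert(s != NULL);
	s->shadow = (s->shadow & ~CR0_STATIC_SK) | (val & CR0_STATIC_SK);
	s->real = (s->real & CR0_STATIC_SK) | (val & ~CR0_STATIC_SK);
}

/* Guest read: static bits come from the shadow, the rest from real CR0. */
uint64_t guest_read_cr0(const struct cr0_sketch *s)
{
	assert(s != NULL);
	return (s->real & ~CR0_STATIC_SK) | (s->shadow & CR0_STATIC_SK);
}
```

After a guest write that sets CD, `guest_read_cr0()` reports CD as set while `real` still has it clear, which is the "illusion" the message refers to.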
Revision 1.36.2.12 / (download) - annotate - [select for diffs], Sat Aug 29 17:00:28 2020 UTC (2 years, 8 months ago) by martin
Branch: netbsd-9
Changes since 1.36.2.11: +3 -3
lines
Diff to previous 1.36.2.11 (colored) to branchpoint 1.36 (colored)
Pull up following revision(s) (requested by maxv in ticket #1068):
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.71
    sys/dev/nvmm/nvmm.c: revision 1.34
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.72
    sys/dev/nvmm/nvmm.c: revision 1.35
    sys/dev/nvmm/nvmm.c: revision 1.36
    sys/dev/nvmm/x86/nvmm_x86_svmfunc.S: revision 1.5
    sys/dev/nvmm/nvmm.c: revision 1.37
    sys/dev/nvmm/x86/nvmm_x86_vmxfunc.S: revision 1.5
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.70
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.68
    sys/dev/nvmm/x86/nvmm_x86.c: revision 1.15
    sys/dev/nvmm/nvmm_ioctl.h: revision 1.10

Micro-optimize: use pushq instead of pushw. To avoid LCP stalls and unaligned stack accesses.

nvmm-x86: also flush the guest TLB when CR4.{PCIDE,SMEP} changes

nvmm: localify a variable that doesn't need to be global

nvmm: use relaxed atomics to read nmachines

nvmm-x86-svm: dedup code

nvmm-x86: hide more CPUID flags, mostly related to perf monitors

nvmm: misc improvements
- use mach->ncpus to get the number of vcpus, now that we have it
- don't forget to decrement mach->ncpus when a machine gets killed
- add more __predict_false()

nvmm-x86-svm: don't forget to intercept INVD
INVD executed in the guest can be dangerous for the host, due to CPU caches being flushed without write-back.

nvmm: slightly clarify

nvmm: explicitly include atomic.h
Revision 1.36.2.11 / (download) - annotate - [select for diffs], Wed Aug 26 17:55:49 2020 UTC (2 years, 9 months ago) by martin
Branch: netbsd-9
Changes since 1.36.2.10: +43 -9
lines
Diff to previous 1.36.2.10 (colored) to branchpoint 1.36 (colored)
Pull up following revision(s) (requested by maxv in ticket #1058):
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.70
    sys/dev/nvmm/x86/nvmm_x86.h: revision 1.19
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.69
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.71
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.69
    sys/dev/nvmm/x86/nvmm_x86.c: revision 1.11
    sys/dev/nvmm/x86/nvmm_x86.c: revision 1.12
    sys/dev/nvmm/x86/nvmm_x86.c: revision 1.13
    sys/dev/nvmm/x86/nvmm_x86.c: revision 1.14

Improve the CPUID emulation:
- Hide SGX*, PKU, WAITPKG, and SKINIT, because they are not supported.
- Hide HLE and RTM, part of TSX. Because TSX is just too buggy and we cannot guarantee that it remains enabled in the guest (if for example the host disables TSX while the guest is running). Nobody wants this crap anyway, so bye-bye.
- Advertise FSREP_MOV, because no reason to hide it.

Hide OSPKE. NFC since the host never uses PKU, but still.

Improve the CPUID emulation on nvmm-intel:
- Limit the highest extended leaf.
- Limit 0x00000007 to ECX=0, for future-proofness.

nvmm-x86-svm: improve the CPUID emulation
Limit the hypervisor range, and properly handle each basic leaf until 0xD.

nvmm-x86: advertise the SERIALIZE instruction, available on future CPUs

nvmm-x86: improve the CPUID emulation
- x86-svm: explicitly handle 0x80000007 and 0x80000008. The latter contains extended features we must filter out. Apply the same in x86-vmx for symmetry.
- x86-svm: explicitly handle extended leaves until 0x8000001F, and truncate to it.
Revision 1.74 / (download) - annotate - [select for diffs], Wed Aug 26 16:32:02 2020 UTC (2 years, 9 months ago) by maxv
Branch: MAIN
Changes since 1.73: +32 -8
lines
Diff to previous 1.73 (colored)
nvmm-x86: improve the handling of RFLAGS.RF
- When injecting certain exceptions, set RF. For us to have an up-to-date view of RFLAGS, we commit the state before the event.
- When advancing RIP, clear RF.
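The two rules can be written as a pair of small helpers. This is a sketch only: the function names are hypothetical, and because the message does not say which exceptions qualify, that decision is passed in by the caller as a flag instead of being guessed here. RF is bit 16 of RFLAGS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RFLAGS_RF_SK	UINT64_C(0x10000)	/* resume flag, bit 16 */

/*
 * Rule 1: when injecting an exception that calls for it, set RF.
 * Which exceptions qualify is left to the caller (not stated in the log).
 */
uint64_t rflags_on_inject(uint64_t rflags, bool exception_wants_rf)
{
	return exception_wants_rf ? (rflags | RFLAGS_RF_SK) : rflags;
}

/* Rule 2: when RIP is advanced past an instruction, clear RF. */
uint64_t rflags_on_advance_rip(uint64_t rflags)
{
	uint64_t out = rflags & ~RFLAGS_RF_SK;

	assert((out & RFLAGS_RF_SK) == 0);
	return out;
}
```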
Revision 1.73 / (download) - annotate - [select for diffs], Wed Aug 26 16:30:50 2020 UTC (2 years, 9 months ago) by maxv
Branch: MAIN
Changes since 1.72: +40 -6
lines
Diff to previous 1.72 (colored)
nvmm-x86-vmx: improve the handling of CR4
- Filter out certain features we don't want the guest to enable. This is for general correctness, and future-proofness.
- Flush the guest TLB when certain flags change.
Revision 1.72 / (download) - annotate - [select for diffs], Sat Aug 22 11:01:10 2020 UTC (2 years, 9 months ago) by maxv
Branch: MAIN
Changes since 1.71: +15 -8
lines
Diff to previous 1.71 (colored)
nvmm-x86-vmx: fix detection of the BIOS lock

If it's locked, ensure it's locked with VMX enabled. If it's not locked, then lock it ourselves with VMX enabled.

Should fix NetBSD PR/55596.
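The decision described here can be sketched as a pure function over the value of the IA32_FEATURE_CONTROL MSR. The bit positions used (lock in bit 0, VMX-outside-SMX enable in bit 2) are the architectural ones; the function name, the macro names, and the return convention are assumptions for illustration, and the real driver reads and writes the MSR on each CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FEATCTL_LOCK_SK		UINT64_C(0x1)	/* bit 0: MSR is locked */
#define FEATCTL_VMX_EN_SK	UINT64_C(0x4)	/* bit 2: VMX outside SMX */

/*
 * Returns 0 if VMX is usable and stores in *newval the value the MSR
 * should hold afterwards; returns -1 if the firmware locked the MSR
 * with VMX disabled, in which case nothing can be done.
 */
int feature_control_check(uint64_t msr, uint64_t *newval)
{
	assert(newval != NULL);

	if (msr & FEATCTL_LOCK_SK) {
		/* Locked: acceptable only if locked with VMX enabled. */
		if ((msr & FEATCTL_VMX_EN_SK) == 0)
			return -1;
		*newval = msr;
		return 0;
	}

	/* Not locked: lock it ourselves, with VMX enabled. */
	*newval = msr | FEATCTL_VMX_EN_SK | FEATCTL_LOCK_SK;
	return 0;
}
```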
Revision 1.71 / (download) - annotate - [select for diffs], Thu Aug 20 11:09:56 2020 UTC (2 years, 9 months ago) by maxv
Branch: MAIN
Changes since 1.70: +12 -2
lines
Diff to previous 1.70 (colored)
nvmm-x86: improve the CPUID emulation
- x86-svm: explicitly handle 0x80000007 and 0x80000008. The latter contains extended features we must filter out. Apply the same in x86-vmx for symmetry.
- x86-svm: explicitly handle extended leaves until 0x8000001F, and truncate to it.
Revision 1.70 / (download) - annotate - [select for diffs], Tue Aug 18 17:03:10 2020 UTC (2 years, 9 months ago) by maxv
Branch: MAIN
Changes since 1.69: +3 -3
lines
Diff to previous 1.69 (colored)
nvmm-x86: also flush the guest TLB when CR4.{PCIDE,SMEP} changes
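A common way to express "flush when one of these flags changes" is to XOR the old and new register values and test the result against a mask. The sketch below does that for the two bits named in this message; the helper and macro names are hypothetical, and whatever other CR4 bits the driver already tracked (the "also" implies there are some) are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR4_PCIDE_SK	UINT64_C(0x00020000)	/* bit 17 */
#define CR4_SMEP_SK	UINT64_C(0x00100000)	/* bit 20 */

/* Only the two bits named in the log message; other bits are omitted. */
#define CR4_TLB_FLUSH_MASK_SK	(CR4_PCIDE_SK | CR4_SMEP_SK)

/* True if the transition old -> new toggles a flush-relevant bit. */
bool cr4_change_needs_tlb_flush(uint64_t old_cr4, uint64_t new_cr4)
{
	uint64_t changed = old_cr4 ^ new_cr4;

	assert((CR4_TLB_FLUSH_MASK_SK & CR4_PCIDE_SK) != 0);
	return (changed & CR4_TLB_FLUSH_MASK_SK) != 0;
}
```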
Revision 1.36.2.10 / (download) - annotate - [select for diffs], Tue Aug 18 09:29:52 2020 UTC (2 years, 9 months ago) by martin
Branch: netbsd-9
Changes since 1.36.2.9: +29 -2
lines
Diff to previous 1.36.2.9 (colored) to branchpoint 1.36 (colored)
Pull up following revision(s) (requested by maxv in ticket #1055):
    sys/dev/nvmm/nvmm.h: revision 1.13
    sys/dev/nvmm/nvmm.h: revision 1.14
    sys/dev/nvmm/nvmm.c: revision 1.33
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.67
    sys/dev/nvmm/nvmm_internal.h: revision 1.17
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.67
    sys/dev/nvmm/x86/nvmm_x86.c: revision 1.10

Put the few x86-specific structures under #ifdef __x86_64__, for clarity.

Make it easier to understand what's going on, no functional change.

Add new field definitions.

Add new field definitions, and intercept everything, for future-proofness.

Add CTASSERT.
Revision 1.69 / (download) - annotate - [select for diffs], Tue Aug 11 15:31:51 2020 UTC (2 years, 9 months ago) by maxv
Branch: MAIN
Changes since 1.68: +31 -7
lines
Diff to previous 1.68 (colored)
Improve the CPUID emulation on nvmm-intel:
- Limit the highest extended leaf.
- Limit 0x00000007 to ECX=0, for future-proofness.
Revision 1.68 / (download) - annotate - [select for diffs], Tue Aug 11 15:27:46 2020 UTC (2 years, 9 months ago) by maxv
Branch: MAIN
Changes since 1.67: +20 -4
lines
Diff to previous 1.67 (colored)
Improve emulation of MSR_IA32_ARCH_CAPABILITIES: publish only the *_NO bits. Initially they were the only ones there, but Intel then added other bits we aren't interested in, and they must be filtered out.
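Publishing only the *_NO bits amounts to masking the host value with an allow-list. The sketch below uses the bit positions from Intel's documentation of IA32_ARCH_CAPABILITIES for the bits defined at the time; the macro and function names are hypothetical, and the exact set of bits the driver lets through is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* "Not affected" bits: safe to show to the guest. */
#define ARCHCAP_RDCL_NO_SK		UINT64_C(0x001)	/* bit 0 */
#define ARCHCAP_SSB_NO_SK		UINT64_C(0x010)	/* bit 4 */
#define ARCHCAP_MDS_NO_SK		UINT64_C(0x020)	/* bit 5 */
#define ARCHCAP_IF_PSCHANGE_MC_NO_SK	UINT64_C(0x040)	/* bit 6 */
#define ARCHCAP_TAA_NO_SK		UINT64_C(0x100)	/* bit 8 */

/* Other bits (IBRS_ALL, RSBA, SKIP_L1DFL_VMENTRY, TSX_CTRL) are dropped. */
#define ARCHCAP_GUEST_MASK_SK \
	(ARCHCAP_RDCL_NO_SK | ARCHCAP_SSB_NO_SK | ARCHCAP_MDS_NO_SK | \
	 ARCHCAP_IF_PSCHANGE_MC_NO_SK | ARCHCAP_TAA_NO_SK)

/* Value returned to the guest when it reads the MSR. */
uint64_t arch_capabilities_for_guest(uint64_t host_val)
{
	uint64_t guest_val = host_val & ARCHCAP_GUEST_MASK_SK;

	assert((guest_val & ~ARCHCAP_GUEST_MASK_SK) == 0);
	return guest_val;
}
```

An allow-list has the property the message asks for: any bit Intel defines later is hidden by default until someone adds it to the mask.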
Revision 1.67 / (download) - annotate - [select for diffs], Wed Aug 5 15:20:09 2020 UTC (2 years, 9 months ago) by maxv
Branch: MAIN
Changes since 1.66: +29 -2
lines
Diff to previous 1.66 (colored)
Add new field definitions.
Revision 1.36.2.9 / (download) - annotate - [select for diffs], Wed Aug 5 15:18:24 2020 UTC (2 years, 9 months ago) by martin
Branch: netbsd-9
Changes since 1.36.2.8: +7 -13
lines
Diff to previous 1.36.2.8 (colored) to branchpoint 1.36 (colored)
Pull up following revision(s) (requested by maxv in ticket #1041):
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.66
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.50
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.66
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.46
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.49
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.55
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.56

pg->phys_addr > VM_PAGE_TO_PHYS(pg)

Explicitly cast pointers to uintptr_t before casting to enums. They are not necessarily the same size.

Don't cast pointers to bool, check for NULL instead.

vmx_vmptrst(): only used when DIAGNOSTIC

Simplify, remove unnecessary #ifdef DIAGNOSTIC around KASSERTs.

Use ULL, to make it clear we are unsigned.
Revision 1.66 / (download) - annotate - [select for diffs], Wed Aug 5 10:20:50 2020 UTC (2 years, 9 months ago) by maxv
Branch: MAIN
Changes since 1.65: +3 -11
lines
Diff to previous 1.65 (colored)
Simplify, remove unnecessary #ifdef DIAGNOSTIC around KASSERTs.
Revision 1.36.2.8 / (download) - annotate - [select for diffs], Sun Aug 2 08:49:08 2020 UTC (2 years, 9 months ago) by martin
Branch: netbsd-9
Changes since 1.36.2.7: +5 -10
lines
Diff to previous 1.36.2.7 (colored) to branchpoint 1.36 (colored)
Pull up following revision(s) (requested by maxv in ticket #1032):
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.60 (patch)
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.61 (patch)
    sys/dev/nvmm/nvmm.c: revision 1.30
    sys/dev/nvmm/nvmm.c: revision 1.31
    sys/dev/nvmm/nvmm.c: revision 1.32
    sys/dev/nvmm/nvmm_internal.h: revision 1.15
    sys/dev/nvmm/nvmm_internal.h: revision 1.16
    sys/dev/nvmm/files.nvmm: revision 1.3
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.62 (patch)
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.63 (patch)
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.59 (patch)
    sys/modules/nvmm/nvmm.ioconf: revision 1.2

Gather the conditions to return from the VCPU loops in nvmm_return_needed(), and use it in nvmm_do_vcpu_run() as well. This fixes two undesired behaviors:
- When a VM initializes, the many nested page faults that need processing could cause the calling thread to occupy the CPU too much if we're unlucky and are only getting repeated nested page faults thousands of times in a row.
- When the emulator calls nvmm_vcpu_run() and immediately sends a signal to stop the VCPU, it's better to check signals earlier and leave right away, rather than doing a round of VCPU run that could increase the time spent by the emulator waiting for the return.

style

Register NVMM as an actual pseudo-device. Without PMF handler, to explicitly disallow ACPI suspend if NVMM is running.
Should fix PR/55406.

Print the backend name when attaching.
Revision 1.65 / (download) - annotate - [select for diffs], Sun Jul 19 06:56:09 2020 UTC (2 years, 10 months ago) by maxv
Branch: MAIN
Changes since 1.64: +4 -3
lines
Diff to previous 1.64 (colored)
Switch to fpu_kern_enter/leave, to prevent clobbering, now that the kernel itself uses the fpu.
Revision 1.64 / (download) - annotate - [select for diffs], Sun Jul 19 06:36:37 2020 UTC (2 years, 10 months ago) by maxv
Branch: MAIN
Changes since 1.63: +17 -5
lines
Diff to previous 1.63 (colored)
The TLB flush IPIs do not respect the IPL, so enforcing IPL_HIGH has no effect. Disable interrupts earlier instead. This prevents a possible race against such IPIs.
Revision 1.63 / (download) - annotate - [select for diffs], Sat Jul 18 20:56:53 2020 UTC (2 years, 10 months ago) by maxv
Branch: MAIN
Changes since 1.62: +4 -8
lines
Diff to previous 1.62 (colored)
Now that the IDT is per-CPU, it must be saved/restored on each CPU independently.
Revision 1.62 / (download) - annotate - [select for diffs], Tue Jul 14 00:45:53 2020 UTC (2 years, 10 months ago) by yamaguchi
Branch: MAIN
Changes since 1.61: +8 -4
lines
Diff to previous 1.61 (colored)
Introduce per-cpu IDTs

This is realized by following modifications:
- Add IDT pages and its allocation maps for each cpu in "struct cpu_info"
- Load per-cpu IDTs at cpu_init_idt(struct cpu_info*)
- Copy the IDT entries for cpu0 to other CPUs at attach
  - These are, for example, exceptions, db, system calls, etc.

And, added a kernel option named PCPU_IDT to enable the feature.
Revision 1.61 / (download) - annotate - [select for diffs], Fri Jul 3 16:09:54 2020 UTC (2 years, 10 months ago) by maxv
Branch: MAIN
Changes since 1.60: +3 -2
lines
Diff to previous 1.60 (colored)
Print the backend name when attaching.
Revision 1.60 / (download) - annotate - [select for diffs], Thu Jun 18 16:31:15 2020 UTC (2 years, 11 months ago) by maxv
Branch: MAIN
Changes since 1.59: +3 -3
lines
Diff to previous 1.59 (colored)
style
Revision 1.59 / (download) - annotate - [select for diffs], Sun May 24 08:08:49 2020 UTC (3 years ago) by maxv
Branch: MAIN
Changes since 1.58: +3 -6
lines
Diff to previous 1.58 (colored)
Gather the conditions to return from the VCPU loops in nvmm_return_needed(), and use it in nvmm_do_vcpu_run() as well. This fixes two undesired behaviors:
- When a VM initializes, the many nested page faults that need processing could cause the calling thread to occupy the CPU too much if we're unlucky and are only getting repeated nested page faults thousands of times in a row.
- When the emulator calls nvmm_vcpu_run() and immediately sends a signal to stop the VCPU, it's better to check signals earlier and leave right away, rather than doing a round of VCPU run that could increase the time spent by the emulator waiting for the return.
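The idea of one shared predicate checked before every round of the loop can be shown with a toy model. Nothing below is the kernel's code: the struct, its fields, and both function names are hypothetical, and the real conditions gathered in nvmm_return_needed() are not listed in this log. The model only demonstrates the two behaviors the message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side state consulted by the predicate. */
struct host_sketch {
	bool signal_pending;	/* emulator already signalled the thread */
	int preempt_at;		/* round at which the CPU is wanted back; -1 = never */
};

/* Single place that decides whether the loop must go back to userland. */
bool return_needed_sketch(const struct host_sketch *h, int round)
{
	assert(h != NULL);
	return h->signal_pending ||
	    (h->preempt_at >= 0 && round >= h->preempt_at);
}

/*
 * Toy VCPU loop: each round stands for one nested page fault handled
 * in the kernel.  Returns how many rounds ran before leaving.
 */
int vcpu_loop_sketch(const struct host_sketch *h, int pending_faults)
{
	int round = 0;

	while (round < pending_faults) {
		if (return_needed_sketch(h, round))
			break;	/* checked before running, not after */
		round++;
	}
	return round;
}
```

With a signal already pending the loop runs zero rounds, and a long run of faults is cut short as soon as the predicate reports that the CPU is wanted elsewhere.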
Revision 1.36.2.7 / (download) - annotate - [select for diffs], Thu May 21 10:52:58 2020 UTC (3 years ago) by martin
Branch: netbsd-9
Changes since 1.36.2.6: +100 -17
lines
Diff to previous 1.36.2.6 (colored) to branchpoint 1.36 (colored)
Pull up following revision(s) (requested by maxv in ticket #919):
    sys/dev/nvmm/x86/nvmm_x86.c: revision 1.9
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.60
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.61
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.56
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.57
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.58
    sys/dev/nvmm/nvmm.c: revision 1.29

Improve the CPUID emulation of basic leaves:
- Hide DCA and PQM, they cannot be used in guests.
- On Intel, explicitly handle each basic leaf until 0x16.
- On AMD, explicitly handle each basic leaf until 0x0D.

Respect the convention for the hypervisor information: return the highest hypervisor leaf in 0x40000000.EAX.

Improve the CPUID emulation on nvmm-intel: limit the highest basic and hypervisor leaves.

Complete rev1.26: reset nvmm_impl to NULL in nvmm_fini().
Revision 1.58 / (download) - annotate - [select for diffs], Thu May 21 07:36:16 2020 UTC (3 years ago) by maxv
Branch: MAIN
Changes since 1.57: +45 -10
lines
Diff to previous 1.57 (colored)
Improve the CPUID emulation on nvmm-intel: limit the highest basic and hypervisor leaves.
Revision 1.36.2.6 / (download) - annotate - [select for diffs], Wed May 13 12:21:56 2020 UTC (3 years ago) by martin
Branch: netbsd-9
Changes since 1.36.2.5: +59 -6
lines
Diff to previous 1.36.2.5 (colored) to branchpoint 1.36 (colored)
Pull up following revision(s) (requested by maxv in ticket #898):
    sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.59
    sys/dev/nvmm/nvmm_internal.h: revision 1.14
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.53
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.54
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.55
    sys/dev/nvmm/nvmm.c: revision 1.27
    sys/dev/nvmm/nvmm.c: revision 1.28

When the identification fails, print the reason.

If we were processing a software int/excp, and got a VMEXIT in the middle, we must also reflect the instruction length, otherwise the next VMENTER fails and Qemu shuts the guest down.

On Intel CPUs, CPUID leaf 0xB, too, provides topology information, so filter it correctly, to avoid inconsistencies if the host has SMT. This fixes HaikuOS which fetches SMT information from there and would panic because of the inconsistencies.
Revision 1.57 / (download) - annotate - [select for diffs], Sun May 10 06:24:16 2020 UTC (3 years ago) by maxv
Branch: MAIN
Changes since 1.56: +5 -2
lines
Diff to previous 1.56 (colored)
Respect the convention for the hypervisor information: return the highest hypervisor leaf in 0x40000000.EAX.
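The convention is that a guest reads leaf 0x40000000 and takes EAX as the upper bound of the hypervisor leaf range. A minimal sketch of both sides follows; the struct, the function names, and the particular maximum leaf value are hypothetical, and the vendor signature normally returned in EBX/ECX/EDX is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HV_LEAF_BASE_SK	UINT32_C(0x40000000)
#define HV_LEAF_MAX_SK	UINT32_C(0x40000001)	/* hypothetical highest leaf */

struct cpuid_regs_sketch {
	uint32_t eax, ebx, ecx, edx;
};

/* Hypervisor side: leaf 0x40000000 reports the highest hypervisor leaf. */
void hv_cpuid_sketch(uint32_t leaf, struct cpuid_regs_sketch *r)
{
	assert(r != NULL);
	memset(r, 0, sizeof(*r));
	if (leaf == HV_LEAF_BASE_SK)
		r->eax = HV_LEAF_MAX_SK;
	/* Other leaves are left zeroed in this sketch. */
}

/* Guest side: a leaf is worth querying only if it is within the range. */
bool guest_hv_leaf_valid(uint32_t leaf)
{
	struct cpuid_regs_sketch r;

	hv_cpuid_sketch(HV_LEAF_BASE_SK, &r);
	return leaf >= HV_LEAF_BASE_SK && leaf <= r.eax;
}
```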
Revision 1.56 / (download) - annotate - [select for diffs], Sat May 9 16:18:57 2020 UTC (3 years ago) by maxv
Branch: MAIN
Changes since 1.55: +54 -9
lines
Diff to previous 1.55 (colored)
Improve the CPUID emulation of basic leaves:
- Hide DCA and PQM, they cannot be used in guests.
- On Intel, explicitly handle each basic leaf until 0x16.
- On AMD, explicitly handle each basic leaf until 0x0D.
Revision 1.55 / (download) - annotate - [select for diffs], Sat May 9 08:39:07 2020 UTC (3 years ago) by maxv
Branch: MAIN
Changes since 1.54: +34 -4
lines
Diff to previous 1.54 (colored)
On Intel CPUs, CPUID leaf 0xB, too, provides topology information, so filter it correctly, to avoid inconsistencies if the host has SMT. This fixes HaikuOS which fetches SMT information from there and would panic because of the inconsistencies.
Revision 1.54 / (download) - annotate - [select for diffs], Thu Apr 30 16:56:23 2020 UTC (3 years ago) by maxv
Branch: MAIN
Changes since 1.53: +12 -4
lines
Diff to previous 1.53 (colored)
If we were processing a software int/excp, and got a VMEXIT in the middle, we must also reflect the instruction length, otherwise the next VMENTER fails and Qemu shuts the guest down.
Revision 1.53 / (download) - annotate - [select for diffs], Thu Apr 30 16:50:17 2020 UTC (3 years ago) by maxv
Branch: MAIN
Changes since 1.52: +17 -2
lines
Diff to previous 1.52 (colored)
When the identification fails, print the reason.
Revision 1.35.2.3 / (download) - annotate - [select for diffs], Mon Apr 13 08:04:25 2020 UTC (3 years, 1 month ago) by martin
Branch: phil-wifi
Changes since 1.35.2.2: +320 -195
lines
Diff to previous 1.35.2.2 (colored) to branchpoint 1.35 (colored) next main 1.36 (colored)
Mostly merge changes from HEAD up to 20200411
Revision 1.52 / (download) - annotate - [select for diffs], Sun Mar 22 00:16:16 2020 UTC (3 years, 2 months ago) by ad
Branch: MAIN
CVS Tags: phil-wifi-20200421,
phil-wifi-20200411,
phil-wifi-20200406,
bouyer-xenpvh-base2,
bouyer-xenpvh-base1,
bouyer-xenpvh-base,
bouyer-xenpvh
Changes since 1.51: +3 -3
lines
Diff to previous 1.51 (colored)
x86 pmap:
- Give pmap_remove_all() its own version of pmap_remove_ptes() that on native x86 does the bare minimum needed to clear out PTPs. Cuts ~4% sys time on 'build.sh release' for me.
- pmap_sync_pv(): there's no need to issue a redundant TLB shootdown. The caller waits for the competing operation to finish.
- Bring 'options TLBSTATS' up to date.
Revision 1.51 / (download) - annotate - [select for diffs], Sat Mar 14 18:08:39 2020 UTC (3 years, 2 months ago) by ad
Branch: MAIN
Changes since 1.50: +4 -7
lines
Diff to previous 1.50 (colored)
- Hide the details of SPCF_SHOULDYIELD and related behind a couple of small functions: preempt_point() and preempt_needed().
- preempt(): if the LWP has exceeded its timeslice in kernel, strip it of any priority boost gained earlier from blocking.
Revision 1.50 / (download) - annotate - [select for diffs], Thu Mar 12 13:01:59 2020 UTC (3 years, 2 months ago) by tnn
Branch: MAIN
Changes since 1.49: +4 -2
lines
Diff to previous 1.49 (colored)
vmx_vmptrst(): only used when DIAGNOSTIC
Revision 1.46.2.2 / (download) - annotate - [select for diffs], Sat Feb 29 20:19:09 2020 UTC (3 years, 2 months ago) by ad
Branch: ad-namecache
Changes since 1.46.2.1: +3 -3
lines
Diff to previous 1.46.2.1 (colored) to branchpoint 1.46 (colored) next main 1.47 (colored)
Sync with head.
Revision 1.49 / (download) - annotate - [select for diffs], Fri Feb 21 00:26:22 2020 UTC (3 years, 3 months ago) by joerg
Branch: MAIN
CVS Tags: is-mlppp-base,
is-mlppp,
ad-namecache-base3
Changes since 1.48: +3 -3
lines
Diff to previous 1.48 (colored)
Explicitly cast pointers to uintptr_t before casting to enums. They are not necessarily the same size. Don't cast pointers to bool, check for NULL instead.
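Both points can be shown in a few lines. The enum, its values, and the function names are hypothetical; the pattern is the general C idiom the message refers to, in which a small integer is smuggled through a `void *` argument and recovered on the other side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enum carried through an opaque pointer-sized cookie. */
enum op_kind_sketch {
	OP_KIND_ENABLE = 1,
	OP_KIND_DISABLE = 2
};

/*
 * Go through uintptr_t first: a pointer and an enum are not
 * necessarily the same size, so a direct cast may be flagged.
 */
enum op_kind_sketch op_kind_from_cookie(void *cookie)
{
	uintptr_t v = (uintptr_t)cookie;

	assert(v == OP_KIND_ENABLE || v == OP_KIND_DISABLE);
	return (enum op_kind_sketch)v;
}

/* Compare against NULL explicitly instead of casting the pointer to bool. */
bool cookie_present(const void *cookie)
{
	return cookie != NULL;
}
```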
Revision 1.36.2.5 / (download) - annotate - [select for diffs], Mon Feb 10 19:05:05 2020 UTC (3 years, 3 months ago) by martin
Branch: netbsd-9
CVS Tags: netbsd-9-0-RELEASE
Changes since 1.36.2.4: +3 -3
lines
Diff to previous 1.36.2.4 (colored) to branchpoint 1.36 (colored)
Pull up following revision(s) (requested by maxv in ticket #688):
    share/man/man4/nvmm.4: revision 1.5
    lib/libnvmm/libnvmm.3: revision 1.26
    sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.47

Mmh, as noted in PR/54847, this should be uint64_t, not uint16_t. Harmless because we use only the two lowest bits anyway. I believe this could be caught by KUBSAN; time to do another round of NVMM+K_SAN testing.

Reference nvmmctl(8).
Revision 1.46.2.1 / (download) - annotate - [select for diffs], Fri Jan 17 21:47:31 2020 UTC (3 years, 4 months ago) by ad
Branch: ad-namecache
Changes since 1.46: +4 -4
lines
Diff to previous 1.46 (colored)
Sync with head.
Revision 1.48 / (download) - annotate - [select for diffs], Thu Jan 9 16:27:57 2020 UTC (3 years, 4 months ago) by maxv
Branch: MAIN
CVS Tags: ad-namecache-base2,
ad-namecache-base1
Changes since 1.47: +3 -3
lines
Diff to previous 1.47 (colored)
Registering the host's CR0 is done outside of the VCPU loop, so it must be cleared because it is also cleared inside the loop. Not clearing it could trigger DNAs on VMEXITs, because STTS/CLTS are still here as part of debugging since my FPU overhaul.
Revision 1.47 / (download) - annotate - [select for diffs], Thu Jan 9 16:20:12 2020 UTC (3 years, 4 months ago) by maxv
Branch: MAIN
Changes since 1.46: +3 -3
lines
Diff to previous 1.46 (colored)
Mmh, as noted in PR/54847, this should be uint64_t, not uint16_t. Harmless because we use only the two lowest bits anyway. I believe this could be caught by KUBSAN; time to do another round of NVMM+K_SAN testing.
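A small sketch of why the too-narrow type was harmless here: if only bits 0 and 1 are consumed, truncating a 64-bit value to 16 bits first gives the same answer. The log does not name the variable involved, so the helpers below are hypothetical.

```c
#include <stdint.h>

/* Hypothetical: reads the two lowest bits through the too-small type. */
unsigned
demo_low_bits_u16(uint64_t field)
{
	uint16_t narrow = (uint16_t)field;	/* drops bits 16..63 */

	return narrow & 0x3;
}

/* Hypothetical: reads the two lowest bits through the correct type. */
unsigned
demo_low_bits_u64(uint64_t field)
{
	return (unsigned)(field & 0x3);
}
```

Any consumer that looked above bit 15 would see the two versions diverge, which is why the type was still worth fixing.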
Revision 1.46 / (download) - annotate - [select for diffs], Tue Dec 10 18:06:50 2019 UTC (3 years, 5 months ago) by ad
Branch: MAIN
CVS Tags: ad-namecache-base
Branch point for: ad-namecache
Changes since 1.45: +3 -3
lines
Diff to previous 1.45 (colored)
Replace pg->phys_addr with VM_PAGE_TO_PHYS(pg).
Revision 1.36.2.4 / (download) - annotate - [select for diffs], Mon Nov 25 16:39:30 2019 UTC (3 years, 6 months ago) by martin
Branch: netbsd-9
CVS Tags: netbsd-9-0-RC2,
netbsd-9-0-RC1
Changes since 1.36.2.3: +14 -3
lines
Diff to previous 1.36.2.3 (colored) to branchpoint 1.36 (colored)
Pull up following revision(s) (requested by maxv in ticket #475): tests/lib/libnvmm/h_mem_assist.c: revision 1.18 sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.45 sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.54 Hide XSAVES-specific stuff and the masked extended states. Several improvements. In particular, reduce CS.limit, because Intel CPUs perform strict sanity checks, and the previous (too high) limit caused the VM entry to fail.
Revision 1.45 / (download) - annotate - [select for diffs], Wed Nov 20 10:26:56 2019 UTC (3 years, 6 months ago) by maxv
Branch: MAIN
Changes since 1.44: +14 -3
lines
Diff to previous 1.44 (colored)
Hide XSAVES-specific stuff and the masked extended states.
Revision 1.36.2.3 / (download) - annotate - [select for diffs], Sun Nov 10 12:58:30 2019 UTC (3 years, 6 months ago) by martin
Branch: netbsd-9
Changes since 1.36.2.2: +284 -166
lines
Diff to previous 1.36.2.2 (colored) to branchpoint 1.36 (colored)
Pull up following revision(s) (requested by maxv in ticket #405): usr.sbin/nvmmctl/nvmmctl.8: revision 1.2 lib/libnvmm/libnvmm.3: revision 1.24 sys/dev/nvmm/nvmm.h: revision 1.11 lib/libnvmm/libnvmm.3: revision 1.25 sys/dev/nvmm/x86/nvmm_x86.h: revision 1.16 sys/dev/nvmm/nvmm.h: revision 1.12 sys/dev/nvmm/x86/nvmm_x86.h: revision 1.17 tests/lib/libnvmm/h_mem_assist.c: revision 1.12 sys/dev/nvmm/x86/nvmm_x86.h: revision 1.18 share/mk/bsd.hostprog.mk: revision 1.82 lib/libnvmm/libnvmm.c: revision 1.15 distrib/sets/lists/base/md.amd64: revision 1.281 tests/lib/libnvmm/h_mem_assist.c: revision 1.13 lib/libnvmm/libnvmm.c: revision 1.16 tests/lib/libnvmm/h_mem_assist.c: revision 1.14 lib/libnvmm/libnvmm_x86.c: revision 1.32 lib/libnvmm/libnvmm.c: revision 1.17 tests/lib/libnvmm/h_mem_assist.c: revision 1.15 lib/libnvmm/libnvmm_x86.c: revision 1.33 lib/libnvmm/libnvmm.c: revision 1.18 usr.sbin/nvmmctl/Makefile: revision 1.1 tests/lib/libnvmm/h_mem_assist_asm.S: revision 1.7 tests/lib/libnvmm/h_mem_assist.c: revision 1.16 lib/libnvmm/libnvmm_x86.c: revision 1.34 usr.sbin/nvmmctl/Makefile: revision 1.2 tests/lib/libnvmm/h_mem_assist_asm.S: revision 1.8 tests/lib/libnvmm/h_mem_assist.c: revision 1.17 sys/dev/nvmm/nvmm_internal.h: revision 1.13 lib/libnvmm/libnvmm_x86.c: revision 1.35 lib/libnvmm/libnvmm_x86.c: revision 1.36 usr.sbin/postinstall/postinstall.in: revision 1.8 lib/libnvmm/libnvmm_x86.c: revision 1.37 lib/libnvmm/libnvmm_x86.c: revision 1.38 lib/libnvmm/libnvmm_x86.c: revision 1.39 usr.sbin/Makefile: revision 1.282 lib/libnvmm/nvmm.h: revision 1.13 lib/libnvmm/nvmm.h: revision 1.14 lib/libnvmm/nvmm.h: revision 1.15 sys/dev/nvmm/nvmm.c: revision 1.23 lib/libnvmm/nvmm.h: revision 1.16 sys/dev/nvmm/nvmm.c: revision 1.24 lib/libnvmm/nvmm.h: revision 1.17 sys/dev/nvmm/nvmm.c: revision 1.25 tests/lib/libnvmm/h_io_assist.c: revision 1.9 etc/MAKEDEV.tmpl: revision 1.209 tests/lib/libnvmm/h_io_assist.c: revision 1.10 tests/lib/libnvmm/h_io_assist.c: revision 1.11 
etc/group: revision 1.35 distrib/sets/lists/man/mi: revision 1.1660 sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.40 sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.41 sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.42 sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.43 sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.44 sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.51 sys/dev/nvmm/nvmm_ioctl.h: revision 1.8 sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.52 sys/dev/nvmm/nvmm_ioctl.h: revision 1.9 sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.53 usr.sbin/nvmmctl/nvmmctl.c: revision 1.1 lib/libnvmm/libnvmm.3: revision 1.20 distrib/sets/lists/debug/md.amd64: revision 1.106 lib/libnvmm/libnvmm.3: revision 1.21 lib/libnvmm/libnvmm.3: revision 1.22 usr.sbin/nvmmctl/nvmmctl.8: revision 1.1 lib/libnvmm/libnvmm.3: revision 1.23 Fix incorrect parsing: the R/M field uses a special GPR map when the address size is 16 bits, regardless of the actual operating mode. With this special map there can be two registers referenced at once, and also disp16-only. Implement this special behavior, and add associated tests. While here simplify a few things. With this in place, the Windows 95 installer initializes correctly. Part of PR/54611. add missing initializer Implement XCHG, add associated tests, and add comments to explain. With this in place the Windows 95 installer completes successfully. Part of PR/54611. Improve nvmm_vcpu_dump(). Put back 'default', because llvm apparently doesn't realize that all cases are covered in the switch. Miscellaneous changes in NVMM, to address several inconsistencies and issues in the libnvmm API. - Rename NVMM_CAPABILITY_VERSION to NVMM_KERN_VERSION, and check it in libnvmm. Introduce NVMM_USER_VERSION, for future use. - In libnvmm, open "/dev/nvmm" as read-only and with O_CLOEXEC. This is to avoid sharing the VMs with the children if the process forks. In the NVMM driver, force O_CLOEXEC on open(). 
- Rename the following things for consistency: nvmm_exit* -> nvmm_vcpu_exit* nvmm_event* -> nvmm_vcpu_event* NVMM_EXIT_* -> NVMM_VCPU_EXIT_* NVMM_EVENT_INTERRUPT_HW -> NVMM_VCPU_EVENT_INTR NVMM_EVENT_EXCEPTION -> NVMM_VCPU_EVENT_EXCP Delete NVMM_EVENT_INTERRUPT_SW, unused already. - Slightly reorganize the MI/MD definitions, for internal clarity. - Split NVMM_VCPU_EXIT_MSR in two: NVMM_VCPU_EXIT_{RD,WR}MSR. Also provide separate u.rdmsr and u.wrmsr fields. This is more consistent with the other exit reasons. - Change the types of several variables: event.type enum -> u_int event.vector uint64_t -> uint8_t exit.u.*msr.msr: uint64_t -> uint32_t exit.u.io.type: enum -> bool exit.u.io.seg: int -> int8_t cap.arch.mxcsr_mask: uint64_t -> uint32_t cap.arch.conf_cpuid_maxops: uint64_t -> uint32_t - Delete NVMM_VCPU_EXIT_MWAIT_COND, it is AMD-only and confusing, and we already intercept 'monitor' so it is never armed. - Introduce vmx_exit_insn() for NVMM-Intel, similar to svm_exit_insn(). The 'npc' field wasn't getting filled properly during certain VMEXITs. - Introduce nvmm_vcpu_configure(). Similar to nvmm_machine_configure(), but as its name indicates, the configuration is per-VCPU and not per-VM. Migrate and rename NVMM_MACH_CONF_X86_CPUID to NVMM_VCPU_CONF_CPUID. This becomes per-VCPU, which makes more sense than per-VM. - Extend the NVMM_VCPU_CONF_CPUID conf to allow triggering VMEXITs on specific leaves. Until now we could only mask the leaves. An uint32_t is added in the structure: uint32_t mask:1; uint32_t exit:1; uint32_t rsvd:30; The two first bits select the desired behavior on the leaf. Specifying zero on both resets the leaf to the default behavior. The new NVMM_VCPU_EXIT_CPUID exit reason is added. Three changes in libnvmm: - Add 'mach' and 'vcpu' backpointers in the nvmm_io and nvmm_mem structures. - Rename 'nvmm_callbacks' to 'nvmm_assist_callbacks'. - Rename and migrate NVMM_MACH_CONF_CALLBACKS to NVMM_VCPU_CONF_CALLBACKS, it now becomes per-VCPU. 
Update the libnvmm man page: - Sync the naming with reality. - Replace "relevant" by "desired" and "virtualizer" by "emulator", closer to what I meant. - Add a "VCPU Configuration" section. - Add a "Machine Ownership" section. Add the "nvmm" group, and make nvmm_init() public. Sent to tech-kern@ a few days ago. Use the new PTE naming, and define CR3_FRAME_* separately. No functional change. Add a new VCPU conf option, that allows userland to request VMEXITs after a TPR change. This is supported on all Intel CPUs, and not-too-old AMD CPUs. The reason for wanting this option is that certain OSes (like Win10 64bit) manage interrupt priority in hardware via CR8 directly, and for these OSes, the emulator may want to sync its internal TPR state on each change. Add two new fields in cap.arch, to report the conf capabilities. Report TPR only on Intel for now, not AMD, because I don't have a recent AMD CPU on which to test. Mask CPUID leaf 0x0A on Intel, because we don't want the guest to try (and fail) to probe the PMC MSRs. This avoids "Unexpected WRMSR" warnings in qemu-nvmm. Add PCID support in the guests. This speeds up most 64bit guests, because since Meltdown, everybody uses PCID (including NetBSD). Change the way root_owner works: consider the calling process as root_owner not if it has root privileges, but if the /dev/nvmm device was opened with write permissions. Introduce the undocumented nvmm_root_init() function to achieve that. The goal is to simplify the logic and have more granularity, eg if we want a monitoring agent to access VMs but don't want to give this agent real root access on the system. A few changes: - Use smaller types in struct nvmm_capability. - Use smaller type for nvmm_io.port. - Switch exitstate to a compacted structure. Add nram in struct nvmm_ctl_mach_info. Add nvmmctl, with two commands for now. Macro tidiness. Sort SEE ALSO. should be fork(2), noticed by wiz Add debug entry for newly introduced nvmmctl utility. 
Annotate a covering switch as such to avoid warnings about missing returns. Forgot to put nvmmctl in the "nvmm" group. Add nvmm group.
Revision 1.44 / (download) - annotate - [select for diffs], Mon Oct 28 08:30:49 2019 UTC (3 years, 7 months ago) by maxv
Branch: MAIN
CVS Tags: phil-wifi-20191119
Changes since 1.43: +8 -12
lines
Diff to previous 1.43 (colored)
A few changes: - Use smaller types in struct nvmm_capability. - Use smaller type for nvmm_io.port. - Switch exitstate to a compacted structure.
Revision 1.43 / (download) - annotate - [select for diffs], Sun Oct 27 18:26:54 2019 UTC (3 years, 7 months ago) by maxv
Branch: MAIN
Changes since 1.42: +52 -11
lines
Diff to previous 1.42 (colored)
Add PCID support in the guests. This speeds up most 64bit guests, because since Meltdown, everybody uses PCID (including NetBSD).
Revision 1.42 / (download) - annotate - [select for diffs], Sun Oct 27 11:11:09 2019 UTC (3 years, 7 months ago) by maxv
Branch: MAIN
Changes since 1.41: +8 -2
lines
Diff to previous 1.41 (colored)
Mask CPUID leaf 0x0A on Intel, because we don't want the guest to try (and fail) to probe the PMC MSRs. This avoids "Unexpected WRMSR" warnings in qemu-nvmm.
Revision 1.41 / (download) - annotate - [select for diffs], Sun Oct 27 10:28:55 2019 UTC (3 years, 7 months ago) by maxv
Branch: MAIN
Changes since 1.40: +45 -15
lines
Diff to previous 1.40 (colored)
Add a new VCPU conf option, that allows userland to request VMEXITs after a TPR change. This is supported on all Intel CPUs, and not-too-old AMD CPUs. The reason for wanting this option is that certain OSes (like Win10 64bit) manage interrupt priority in hardware via CR8 directly, and for these OSes, the emulator may want to sync its internal TPR state on each change. Add two new fields in cap.arch, to report the conf capabilities. Report TPR only on Intel for now, not AMD, because I don't have a recent AMD CPU on which to test.
Revision 1.40 / (download) - annotate - [select for diffs], Wed Oct 23 07:01:11 2019 UTC (3 years, 7 months ago) by maxv
Branch: MAIN
Changes since 1.39: +197 -152
lines
Diff to previous 1.39 (colored)
Miscellaneous changes in NVMM, to address several inconsistencies and issues in the libnvmm API. - Rename NVMM_CAPABILITY_VERSION to NVMM_KERN_VERSION, and check it in libnvmm. Introduce NVMM_USER_VERSION, for future use. - In libnvmm, open "/dev/nvmm" as read-only and with O_CLOEXEC. This is to avoid sharing the VMs with the children if the process forks. In the NVMM driver, force O_CLOEXEC on open(). - Rename the following things for consistency: nvmm_exit* -> nvmm_vcpu_exit* nvmm_event* -> nvmm_vcpu_event* NVMM_EXIT_* -> NVMM_VCPU_EXIT_* NVMM_EVENT_INTERRUPT_HW -> NVMM_VCPU_EVENT_INTR NVMM_EVENT_EXCEPTION -> NVMM_VCPU_EVENT_EXCP Delete NVMM_EVENT_INTERRUPT_SW, unused already. - Slightly reorganize the MI/MD definitions, for internal clarity. - Split NVMM_VCPU_EXIT_MSR in two: NVMM_VCPU_EXIT_{RD,WR}MSR. Also provide separate u.rdmsr and u.wrmsr fields. This is more consistent with the other exit reasons. - Change the types of several variables: event.type enum -> u_int event.vector uint64_t -> uint8_t exit.u.*msr.msr: uint64_t -> uint32_t exit.u.io.type: enum -> bool exit.u.io.seg: int -> int8_t cap.arch.mxcsr_mask: uint64_t -> uint32_t cap.arch.conf_cpuid_maxops: uint64_t -> uint32_t - Delete NVMM_VCPU_EXIT_MWAIT_COND, it is AMD-only and confusing, and we already intercept 'monitor' so it is never armed. - Introduce vmx_exit_insn() for NVMM-Intel, similar to svm_exit_insn(). The 'npc' field wasn't getting filled properly during certain VMEXITs. - Introduce nvmm_vcpu_configure(). Similar to nvmm_machine_configure(), but as its name indicates, the configuration is per-VCPU and not per-VM. Migrate and rename NVMM_MACH_CONF_X86_CPUID to NVMM_VCPU_CONF_CPUID. This becomes per-VCPU, which makes more sense than per-VM. - Extend the NVMM_VCPU_CONF_CPUID conf to allow triggering VMEXITs on specific leaves. Until now we could only mask the leaves. 
An uint32_t is added in the structure: uint32_t mask:1; uint32_t exit:1; uint32_t rsvd:30; The two first bits select the desired behavior on the leaf. Specifying zero on both resets the leaf to the default behavior. The new NVMM_VCPU_EXIT_CPUID exit reason is added.
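The per-leaf selector described above can be sketched like this. Only the three bitfields (mask, exit, rsvd) come from the log message; the struct name, the leaf member, the enum and the handling of "both bits set" are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container; only the three bitfields come from the log. */
struct demo_cpuid_conf {
	uint32_t leaf;
	uint32_t mask:1;	/* filter the leaf's output */
	uint32_t exit:1;	/* trigger a VMEXIT on this leaf */
	uint32_t rsvd:30;
};

enum demo_cpuid_behavior {
	DEMO_CPUID_DEFAULT,	/* both bits zero: back to default */
	DEMO_CPUID_MASK,
	DEMO_CPUID_EXIT,
	DEMO_CPUID_INVALID	/* assumption: both bits set is rejected */
};

enum demo_cpuid_behavior
demo_cpuid_behavior(const struct demo_cpuid_conf *conf)
{
	assert(conf != NULL);

	if (conf->mask && conf->exit)
		return DEMO_CPUID_INVALID;
	if (conf->mask)
		return DEMO_CPUID_MASK;
	if (conf->exit)
		return DEMO_CPUID_EXIT;
	return DEMO_CPUID_DEFAULT;
}
```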
Revision 1.39 / (download) - annotate - [select for diffs], Sat Oct 12 06:31:04 2019 UTC (3 years, 7 months ago) by maxv
Branch: MAIN
Changes since 1.38: +8 -14
lines
Diff to previous 1.38 (colored)
Rewrite the FPU code on x86. This greatly simplifies the logic and removes the dependency on IPL_HIGH. NVMM is updated accordingly. Posted on port-amd64 a week ago. Bump the kernel version to 9.99.16.
Revision 1.36.2.2 / (download) - annotate - [select for diffs], Sun Oct 6 11:04:55 2019 UTC (3 years, 7 months ago) by martin
Branch: netbsd-9
Changes since 1.36.2.1: +3 -3
lines
Diff to previous 1.36.2.1 (colored) to branchpoint 1.36 (colored)
Pull up following revision(s) (requested by maxv in ticket #287): sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.38 sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.47 sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.48 sys/dev/nvmm/x86/nvmm_x86_svm.c: revision 1.49 Add definitions for RDPRU, MCOMMIT, GMET and VTE. Fix definition for MWAIT. It should be bit 11, not 12; 12 is the armed version. Switch to the new PTE naming.
Revision 1.38 / (download) - annotate - [select for diffs], Fri Oct 4 12:17:05 2019 UTC (3 years, 7 months ago) by maxv
Branch: MAIN
Changes since 1.37: +3 -3
lines
Diff to previous 1.37 (colored)
Switch to the new PTE naming.
Revision 1.36.2.1 / (download) - annotate - [select for diffs], Tue Sep 24 18:14:59 2019 UTC (3 years, 8 months ago) by martin
Branch: netbsd-9
Changes since 1.36: +11 -11
lines
Diff to previous 1.36 (colored)
Pull up following revision(s) (requested by maxv in ticket #239): sys/dev/nvmm/x86/nvmm_x86_vmx.c: revision 1.37 Always set hwcode on error. Useful for debugging.
Revision 1.37 / (download) - annotate - [select for diffs], Fri Sep 13 14:19:13 2019 UTC (3 years, 8 months ago) by maxv
Branch: MAIN
Changes since 1.36: +11 -11
lines
Diff to previous 1.36 (colored)
Always set hwcode on error. Useful for debugging.
Revision 1.36 / (download) - annotate - [select for diffs], Sun Jun 16 18:30:31 2019 UTC (3 years, 11 months ago) by maxv
Branch: MAIN
CVS Tags: netbsd-9-base
Branch point for: netbsd-9
Changes since 1.35: +5 -2
lines
Diff to previous 1.35 (colored)
Make sure VMX-outside-SMX is allowed. It may not be if the BIOS decided to disable VMX. Seen on an HP laptop, where NVMM would panic because of that.
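For context, the architectural control involved is the IA32_FEATURE_CONTROL MSR: bit 0 is the lock bit, bit 1 enables VMX inside SMX and bit 2 enables VMX outside SMX. The check can be sketched as below; the macro and function names are hypothetical, and how the driver reacts to each case is not stated in the log.

```c
#include <stdint.h>

/* Bit positions of IA32_FEATURE_CONTROL; the names are hypothetical. */
#define DEMO_FEATCTL_LOCK		(1ULL << 0)
#define DEMO_FEATCTL_VMX_IN_SMX		(1ULL << 1)
#define DEMO_FEATCTL_VMX_OUT_SMX	(1ULL << 2)

enum demo_vmx_status {
	DEMO_VMX_OK,		/* locked, VMX outside SMX enabled */
	DEMO_VMX_DISABLED,	/* locked by firmware without the enable bit */
	DEMO_VMX_UNLOCKED	/* not locked yet: software must configure it */
};

enum demo_vmx_status
demo_vmx_status(uint64_t featctl)
{
	if ((featctl & DEMO_FEATCTL_LOCK) == 0)
		return DEMO_VMX_UNLOCKED;
	if ((featctl & DEMO_FEATCTL_VMX_OUT_SMX) == 0)
		return DEMO_VMX_DISABLED;
	return DEMO_VMX_OK;
}
```

The DEMO_VMX_DISABLED case corresponds to the situation in the log: the BIOS locked the MSR with VMX turned off.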
Revision 1.35.2.2 / (download) - annotate - [select for diffs], Mon Jun 10 22:07:14 2019 UTC (3 years, 11 months ago) by christos
Branch: phil-wifi
Changes since 1.35.2.1: +3156 -0
lines
Diff to previous 1.35.2.1 (colored) to branchpoint 1.35 (colored)
Sync with HEAD
Revision 1.35.2.1, Sat May 18 08:55:59 2019 UTC (4 years ago) by christos
Branch: phil-wifi
Changes since 1.35: +0 -3156
lines
FILE REMOVED
file nvmm_x86_vmx.c was added on branch phil-wifi on 2019-06-10 22:07:14 +0000
Revision 1.35 / (download) - annotate - [select for diffs], Sat May 18 08:55:59 2019 UTC (4 years ago) by maxv
Branch: MAIN
CVS Tags: phil-wifi-20190609
Branch point for: phil-wifi
Changes since 1.34: +3 -4
lines
Diff to previous 1.34 (colored)
Now that SVS cannot be disabled at run time, MSR_LSTAR is static, so no need to save it on each VM enter.
Revision 1.34 / (download) - annotate - [select for diffs], Sat May 11 07:31:56 2019 UTC (4 years ago) by maxv
Branch: MAIN
Changes since 1.33: +10 -9
lines
Diff to previous 1.33 (colored)
Rework the machine configuration interface. Provide three ranges in the conf space: <libnvmm:0-100>, <MI:100-200> and <MD:200-...>. Remove nvmm_callbacks_register(), and replace it by the conf op NVMM_MACH_CONF_CALLBACKS, handled by libnvmm. The callbacks are now per-machine, and the emulators should now do: - nvmm_callbacks_register(&cbs); + nvmm_machine_configure(&mach, NVMM_MACH_CONF_CALLBACKS, &cbs); This provides more granularity, for example if the process runs two VMs and wants different callbacks for each.
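The three conf ranges can be pictured as a simple dispatch on the op number. The log writes the ranges as 0-100, 100-200 and 200-..., so the sketch assumes half-open intervals (100 belongs to MI, 200 to MD); the enum and function names are hypothetical.

```c
#include <stdint.h>

enum demo_conf_owner {
	DEMO_CONF_LIBNVMM,	/* handled in libnvmm, never reaches the kernel */
	DEMO_CONF_MI,		/* machine-independent kernel code */
	DEMO_CONF_MD		/* machine-dependent backend */
};

/* Assumption: ranges are [0,100), [100,200) and [200,...). */
enum demo_conf_owner
demo_conf_owner(uint64_t op)
{
	if (op < 100)
		return DEMO_CONF_LIBNVMM;
	if (op < 200)
		return DEMO_CONF_MI;
	return DEMO_CONF_MD;
}
```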
Revision 1.33 / (download) - annotate - [select for diffs], Wed May 1 09:20:21 2019 UTC (4 years ago) by maxv
Branch: MAIN
Changes since 1.32: +49 -31
lines
Diff to previous 1.32 (colored)
Use the comm page to inject events, rather than ioctls, and commit them in vcpu_run. This saves a few syscalls and copyins. For example on Windows 10, moving the mouse from the left to right sides of the screen generates ~500 events, which now don't result in syscalls. The error handling is done in vcpu_run and it is less precise, but this doesn't matter a lot, and will be solved with future NVMM error codes.
Revision 1.32 / (download) - annotate - [select for diffs], Mon Apr 29 18:54:26 2019 UTC (4 years, 1 month ago) by maxv
Branch: MAIN
Changes since 1.31: +7 -27
lines
Diff to previous 1.31 (colored)
Stop taking care of the INT/NMI windows in the kernel, the emulator is supposed to do that itself.
Revision 1.31 / (download) - annotate - [select for diffs], Sun Apr 28 14:22:13 2019 UTC (4 years, 1 month ago) by maxv
Branch: MAIN
Changes since 1.30: +58 -8
lines
Diff to previous 1.30 (colored)
Modify the communication layer between the kernel NVMM driver and libnvmm: introduce a bidirectional "comm page", a page of memory shared between the kernel and userland, and used to transfer data in and out in a more performant manner than ioctls. The comm page contains the VCPU state, plus three flags: - "wanted": the states the kernel must get/set when requested via ioctls - "cached": the states that are in the comm page - "commit": the states the kernel must set in vcpu_run The idea is to avoid performing expensive syscalls, by using the VCPU state cached, either explicitly or speculatively, in the comm page. For example, if the state is cached we do a direct 1->5 with no syscall:

          +---------------------------------------------+
          |                    Qemu                     |
          +---------------------------------------------+
               |                                   ^
               | (0) nvmm_vcpu_getstate            | (6) Done
               |                                   |
               V                                   |
             +---------------------------------------+
             |                libnvmm                |
             +---------------------------------------+
                  |   ^           |               ^
        (1) State |   | (2) No    | (3) Ioctl:    | (5) Ok, state
          cached? |   |           | "please cache | fetched
                  |   |           |  the state"   |
                  V   |           |               |
               +-----------+      |               |
               | Comm Page |------+---------------+
               +-----------+      |
                    ^             |
       (4) "Alright |             V
           babe"    |     +--------+
                    +-----| Kernel |
                          +--------+

The main changes in behavior are: - nvmm_vcpu_getstate(): won't emit a syscall if the state is already cached in the comm page, will just fetch from the comm page directly - nvmm_vcpu_setstate(): won't emit a syscall at all, will just cache the wanted state in the comm page - nvmm_vcpu_run(): will commit the to-be-set state in the comm page, as previously requested by nvmm_vcpu_setstate() In addition to this, the kernel NVMM driver is changed to speculatively cache certain states known to be of interest, so that the future nvmm_vcpu_getstate() calls libnvmm or the emulator will perform will use the comm page rather than expensive syscalls. 
For example, if an I/O VMEXIT occurs, the I/O Assist in libnvmm will want GPRS+SEGS+CRS+MSRS, and now the kernel caches all of that in the comm page before returning to userland. Overall, in a normal run of Windows 10, this saves several millions of syscalls. Eg on a 4CPU Intel with 4VCPUs, booting the Win10 install ISO goes from taking 1min35 to taking 1min16. The libnvmm API is not changed, but the ABI is. If we changed the API it would be possible to save expensive memcpys on libnvmm's side. This will be avoided in a future version. The comm page can also be extended to implement future services.
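The interplay of the three flags can be modelled in a few lines of userland C. This is a toy model under stated assumptions, not the driver's code: the struct layout, the flag values, the function names and the ioctl stand-in are all hypothetical, and demo_run() simply drops the cache where the real driver speculatively re-caches some states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state-group flags. */
#define DEMO_STATE_GPRS	0x1ULL
#define DEMO_STATE_SEGS	0x2ULL

/* Hypothetical comm page: just the three flag words from the log. */
struct demo_comm_page {
	uint64_t state_wanted;	/* what the kernel must fetch on ioctl */
	uint64_t state_cached;	/* what is valid in the comm page */
	uint64_t state_commit;	/* what the kernel must set in vcpu_run */
};

/* Counts simulated syscalls, to make the saving visible. */
static unsigned demo_syscalls;

/* Stand-in for the ioctl: the kernel caches the wanted states. */
void
demo_kernel_cache(struct demo_comm_page *comm)
{
	demo_syscalls++;
	comm->state_cached |= comm->state_wanted;
	comm->state_wanted = 0;
}

/* getstate: no syscall if everything requested is already cached. */
void
demo_getstate(struct demo_comm_page *comm, uint64_t flags)
{
	assert(comm != NULL);
	if ((comm->state_cached & flags) == flags)
		return;
	comm->state_wanted = flags & ~comm->state_cached;
	demo_kernel_cache(comm);
}

/* setstate: never a syscall, only marks the state for commit. */
void
demo_setstate(struct demo_comm_page *comm, uint64_t flags)
{
	assert(comm != NULL);
	comm->state_cached |= flags;
	comm->state_commit |= flags;
}

/* run: one syscall that also commits the pending state. */
void
demo_run(struct demo_comm_page *comm)
{
	assert(comm != NULL);
	demo_syscalls++;
	comm->state_commit = 0;
	comm->state_cached = 0;	/* simplification: guest ran, cache is stale */
}
```

In this model a getstate that follows a setstate of the same group costs no syscall at all, which is the saving the log message describes.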
Revision 1.30 / (download) - annotate - [select for diffs], Sat Apr 27 15:45:21 2019 UTC (4 years, 1 month ago) by maxv
Branch: MAIN
Changes since 1.29: +5 -5
lines
Diff to previous 1.29 (colored)
Reorder the NVMM headers, to make a clear(er) distinction between MI and MD. Also use #defines for the exit reasons rather than an union. No ABI change, and no API change except 'cap->u.{}' renamed to 'cap->arch'.
Revision 1.29 / (download) - annotate - [select for diffs], Sat Apr 27 09:06:18 2019 UTC (4 years, 1 month ago) by maxv
Branch: MAIN
Changes since 1.28: +22 -3
lines
Diff to previous 1.28 (colored)
If guest events were being processed when a #VMEXIT occurred, reschedule the events rather than dismissing them. This can happen for instance when a guest wants to process an exception and an #NPF occurs on the guest IDT. In practice it occurs only when the host swapped out specific guest pages.
Revision 1.28 / (download) - annotate - [select for diffs], Sat Apr 27 08:16:19 2019 UTC (4 years, 1 month ago) by maxv
Branch: MAIN
Changes since 1.27: +174 -119
lines
Diff to previous 1.27 (colored)
Optimize nvmm-intel, use inlined GCC assembly rather than function calls.
Revision 1.27 / (download) - annotate - [select for diffs], Wed Apr 24 18:19:28 2019 UTC (4 years, 1 month ago) by maxv
Branch: MAIN
Changes since 1.26: +10 -3
lines
Diff to previous 1.26 (colored)
Provide the hardware error code for NVMM_EXIT_INVALID, useful when debugging.
Revision 1.26 / (download) - annotate - [select for diffs], Sat Apr 20 08:45:30 2019 UTC (4 years, 1 month ago) by maxv
Branch: MAIN
CVS Tags: isaki-audio2-base,
isaki-audio2
Changes since 1.25: +3 -3
lines
Diff to previous 1.25 (colored)
Ah, take XSAVE into account in ECX too, not just in EBX. Otherwise if the guest relies only on ECX to initialize/copy the FPU state (like NetBSD does), spurious #GPs can be encountered because the bitmap is clobbered.
Revision 1.25 / (download) - annotate - [select for diffs], Sun Apr 7 14:28:50 2019 UTC (4 years, 1 month ago) by maxv
Branch: MAIN
Changes since 1.24: +4 -5
lines
Diff to previous 1.24 (colored)
Invert the filtering priority: now the kernel-managed cpuid leaves are overwritable by the virtualizer. This is useful to virtualizers that want to 100% control every leaf.
Revision 1.24 / (download) - annotate - [select for diffs], Sat Apr 6 11:49:53 2019 UTC (4 years, 1 month ago) by maxv
Branch: MAIN
Changes since 1.23: +18 -14
lines
Diff to previous 1.23 (colored)
Replace the misc[] state by a new compressed nvmm_x64_state_intr structure, which describes the interruptibility state of the guest. Add evt_pending, read-only, that allows the virtualizer to know if an event is pending.
Revision 1.23 / (download) - annotate - [select for diffs], Wed Apr 3 19:10:58 2019 UTC (4 years, 1 month ago) by maxv
Branch: MAIN
Changes since 1.22: +11 -3
lines
Diff to previous 1.22 (colored)
VMX: if PAT is not valid, #GP on WRMSR, rather than crashing the guest.
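For reference, a PAT value is architecturally valid when each of its eight byte-sized entries holds a defined memory type: 0, 1, 4, 5, 6 or 7. Encodings 2, 3 and anything above 7 are reserved. A validity check along those lines might look like the sketch below; the function name is hypothetical and the driver's own helper may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical validator: returns false if any of the eight PAT entries
 * uses a reserved memory-type encoding (2, 3, or greater than 7).
 */
bool
demo_pat_is_valid(uint64_t pat)
{
	for (int i = 0; i < 8; i++) {
		uint8_t type = (uint8_t)(pat >> (i * 8));

		if (type == 2 || type == 3 || type > 7)
			return false;
	}
	return true;
}
```

A WRMSR handler would inject #GP into the guest when such a check fails, instead of handing the value to the hardware.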
Revision 1.22 / (download) - annotate - [select for diffs], Wed Apr 3 18:05:55 2019 UTC (4 years, 1 month ago) by maxv
Branch: MAIN
Changes since 1.21: +6 -3
lines
Diff to previous 1.21 (colored)
Add new VMCS bits.
Revision 1.21 / (download) - annotate - [select for diffs], Wed Apr 3 17:32:58 2019 UTC (4 years, 1 month ago) by maxv
Branch: MAIN
Changes since 1.20: +20 -11
lines
Diff to previous 1.20 (colored)
Add MSR_TSC.
Revision 1.20 / (download) - annotate - [select for diffs], Thu Mar 21 20:21:41 2019 UTC (4 years, 2 months ago) by maxv
Branch: MAIN
Changes since 1.19: +6 -5
lines
Diff to previous 1.19 (colored)
Make it possible for an emulator to set the protection of the guest pages. For some reason I had initially concluded that it wasn't doable; verily it is, so let's do it. The reserved 'flags' argument of nvmm_gpa_map() becomes 'prot' and takes mmap-like protection codes.
Revision 1.19 / (download) - annotate - [select for diffs], Thu Mar 14 20:29:53 2019 UTC (4 years, 2 months ago) by maxv
Branch: MAIN
Changes since 1.18: +67 -8
lines
Diff to previous 1.18 (colored)
Optimize NVMM-Intel: keep the VMCS active on the host CPU, and lazy-switch it on demand only when needed. This allows the CPU to use the cached version of the guest state, rather than the in-memory copy of it. This is much more performant. A VMCS must be active on only one CPU, but one CPU can have several active VMCSs at the same time. We keep track of which CPU each VMCS is active on. When we want to execute a VCPU, we determine whether its VMCS is loaded on another CPU, and if so send an IPI to ask it to unbusy that VMCS. In most cases the VMCS is already active on the current CPU, so we don't have to do anything and can proceed with a fast VMRESUME. We send IPIs with kpreemption enabled but with a bound LWP, because we don't want to get context-switched to the CPU we just sent an IPI to. Overall, with this in place, I see a ~15% performance increase in the guests on NVMM-Intel.
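The lazy-switch decision described above can be sketched as a small state machine. Everything here is hypothetical (names, the "no CPU" sentinel, the returned action); in the real driver the migrate case involves an IPI to the old CPU, which the sketch only records as an outcome.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NO_CPU	(-1)

/* Hypothetical per-VCPU record of where its VMCS is currently active. */
struct demo_vcpu {
	int vmcs_cpu;
};

enum demo_vmcs_action {
	DEMO_VMCS_FAST,		/* already active here: nothing to do */
	DEMO_VMCS_LOAD,		/* active nowhere: just load it here */
	DEMO_VMCS_MIGRATE	/* active elsewhere: clear it there first */
};

enum demo_vmcs_action
demo_vmcs_enter(struct demo_vcpu *vcpu, int curcpu)
{
	enum demo_vmcs_action action;

	assert(vcpu != NULL);

	if (vcpu->vmcs_cpu == curcpu)
		return DEMO_VMCS_FAST;

	if (vcpu->vmcs_cpu == DEMO_NO_CPU)
		action = DEMO_VMCS_LOAD;
	else
		action = DEMO_VMCS_MIGRATE;	/* real code: IPI to old CPU */

	vcpu->vmcs_cpu = curcpu;
	return action;
}
```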
Revision 1.18 / (download) - annotate - [select for diffs], Thu Mar 14 19:26:44 2019 UTC (4 years, 2 months ago) by maxv
Branch: MAIN
Changes since 1.17: +6 -6
lines
Diff to previous 1.17 (colored)
Move a KASSERT, applies to all branches.
Revision 1.17 / (download) - annotate - [select for diffs], Thu Mar 7 15:06:37 2019 UTC (4 years, 2 months ago) by maxv
Branch: MAIN
Changes since 1.16: +40 -14
lines
Diff to previous 1.16 (colored)
Parse EXC_NMI on nvmm-intel, and don't return NVMM_EXIT_INVALID if we received a host NMI, otherwise the guest could get killed if an NMI comes in, typically when the host runs tprof at the same time. Already handled on nvmm-amd.
Revision 1.16 / (download) - annotate - [select for diffs], Sun Mar 3 07:01:09 2019 UTC (4 years, 2 months ago) by maxv
Branch: MAIN
Changes since 1.15: +17 -12
lines
Diff to previous 1.15 (colored)
Choose which CPUID bits to allow, rather than which bits to disallow. This is clearer, and also forward compatible with future CPUs. While here be more consistent when allowing the bits, and sync between nvmm-amd and nvmm-intel. Also make sure to disallow AVX, because the guest state we provide is only x86+SSE. Fixes a CentOS panic when booting on NVMM, reported by Jared McNeill, thanks.
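The difference between the two filtering styles is easy to see on CPUID leaf 1, ECX. The bit positions below are the architectural ones (SSE3 bit 0, SSSE3 bit 9, SSE4.1 bit 19, SSE4.2 bit 20, AVX bit 28); the macro names, the particular allow list and both functions are hypothetical and not the driver's actual masks.

```c
#include <stdint.h>

#define DEMO_CPUID2_SSE3	(1U << 0)
#define DEMO_CPUID2_SSSE3	(1U << 9)
#define DEMO_CPUID2_SSE41	(1U << 19)
#define DEMO_CPUID2_SSE42	(1U << 20)
#define DEMO_CPUID2_AVX		(1U << 28)
/* Bit 30 (RDRAND) stands in for a feature the filter author never listed. */
#define DEMO_CPUID2_UNLISTED	(1U << 30)

#define DEMO_ECX_ALLOW \
	(DEMO_CPUID2_SSE3 | DEMO_CPUID2_SSSE3 | \
	 DEMO_CPUID2_SSE41 | DEMO_CPUID2_SSE42)
#define DEMO_ECX_DENY	DEMO_CPUID2_AVX

/* Allow list: anything not explicitly listed is hidden from the guest. */
uint32_t
demo_filter_allow(uint32_t host_ecx)
{
	return host_ecx & DEMO_ECX_ALLOW;
}

/* Deny list: anything not explicitly listed leaks through to the guest. */
uint32_t
demo_filter_deny(uint32_t host_ecx)
{
	return host_ecx & ~DEMO_ECX_DENY;
}
```

With the allow list, a feature bit introduced by a newer CPU stays hidden until someone decides the guest state can support it, which is the forward compatibility the log refers to.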
Revision 1.15 / (download) - annotate - [select for diffs], Tue Feb 26 12:23:12 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Changes since 1.14: +23 -26
lines
Diff to previous 1.14 (colored)
Change the layout of the SEG state: - Reorder it, to match the CPU encoding. This is the universal order, also used by Qemu. Drop the seg_to_nvmm[] tables. - Compress it. This divides its size by two. - Rename some of its fields, to better match the x86 spec. Also, take S out of Type, this was a NetBSD-ism that was likely confusing to other people.
Revision 1.14 / (download) - annotate - [select for diffs], Sat Feb 23 12:27:00 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Changes since 1.13: +6 -8
lines
Diff to previous 1.13 (colored)
Install the x86 RESET state at VCPU creation time, for convenience, so that the libnvmm users can expect a functional VCPU right away.
Revision 1.13 / (download) - annotate - [select for diffs], Sat Feb 23 10:43:36 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Changes since 1.12: +9 -5
lines
Diff to previous 1.12 (colored)
Add support for CPUs that don't have the EPT_{A,D} bits. On such CPUs, these bits are ignored by the hardware. We don't care about setting them, however, we must always assume they are set. Modify the pmap code to do that. While here, in pmap_ept_remove_pte, don't flush the TLB when it's not needed. Tested on an old Intel Celeron.
Revision 1.12 / (download) - annotate - [select for diffs], Sat Feb 23 08:19:16 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Changes since 1.11: +478 -475
lines
Diff to previous 1.11 (colored)
Reorder the functions, and constify setstate. No functional change.
Revision 1.11 / (download) - annotate - [select for diffs], Fri Feb 22 12:24:34 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Changes since 1.10: +23 -3
lines
Diff to previous 1.10 (colored)
Fix omission: if we receive a guest trap on CR0, and if the original instruction would have resulted in Long Mode being enabled, we need to manually enable Long Mode ourselves. We were already doing that correctly in setstate, but not in the CR0 trap handler. Problem initially reported by Aymeric Vincent; ArchLinux wouldn't boot, now it does and works correctly. While here, add CR0_ET in the CR0 mask, for the associated shadow to be taken into account. Normally this shadow bit shouldn't be necessary, but for now I keep it regardless.
Revision 1.10 / (download) - annotate - [select for diffs], Thu Feb 21 13:25:44 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Changes since 1.9: +19 -19
lines
Diff to previous 1.9 (colored)
Reorder the detection in vmx_ident(), to fix panic on old CPUs. We must read MSR_IA32_VMX_EPT_VPID_CAP _after_ ensuring EPT is there, because if it's not, the rdmsr faults.
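The ordering problem can be shown with a simulated CPU, where reading the capability MSR without EPT support counts as a fault. All names, the simulated CPU struct and the required-bits mask are hypothetical; the point is only that the feature check guards the MSR read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simulated CPU. */
struct demo_cpu {
	bool has_ept;
	uint64_t ept_vpid_cap;
	unsigned faults;	/* counts reads that would have faulted */
};

/* Hypothetical capability bits that the sketch requires. */
#define DEMO_EPT_REQUIRED	0x3ULL

/* Stand-in for rdmsr(MSR_IA32_VMX_EPT_VPID_CAP). */
uint64_t
demo_rdmsr_ept_vpid_cap(struct demo_cpu *cpu)
{
	if (!cpu->has_ept) {
		cpu->faults++;	/* on real hardware this rdmsr faults */
		return 0;
	}
	return cpu->ept_vpid_cap;
}

bool
demo_ident(struct demo_cpu *cpu)
{
	uint64_t cap;

	assert(cpu != NULL);

	/* Check for EPT first; only then is the MSR safe to read. */
	if (!cpu->has_ept)
		return false;

	cap = demo_rdmsr_ept_vpid_cap(cpu);
	return (cap & DEMO_EPT_REQUIRED) == DEMO_EPT_REQUIRED;
}
```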
Revision 1.9 / (download) - annotate - [select for diffs], Thu Feb 21 12:17:52 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Changes since 1.8: +62 -29
lines
Diff to previous 1.8 (colored)
Another locking issue in NVMM: the {svm,vmx}_tlb_flush functions take VCPU mutexes which can sleep, but their context does not allow it. Rewrite the TLB handling code to fix that. It becomes a bit complex. In short, we use a per-VM generation number, which we increase on each TLB flush, before sending a broadcast IPI to everybody. The IPIs cause a #VMEXIT of each VCPU, and each VCPU Loop will synchronize the per-VM gen with a per-VCPU copy, and apply the flushes as needed, lazily. The behavior differs between AMD and Intel; in short, on Intel we don't flush the hTLB (EPT cache) if a context switch of a VCPU occurs, so now, we need to maintain a kcpuset to know which VCPU's hTLBs are active on which hCPU. This creates some redundancy on Intel, ie there are cases where we flush the hTLB several times unnecessarily; but hTLB flushes are very rare, so there is no real performance regression. The thing is lock-less and non-blocking, so it solves our problem.
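The generation-number scheme can be sketched with C11 atomics. This is a userland model only: the struct and function names are hypothetical, the broadcast IPI is reduced to a comment, and the Intel-specific kcpuset tracking is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-VM state: one shared generation counter. */
struct demo_machine {
	_Atomic uint64_t tlb_gen;
};

/* Hypothetical per-VCPU state: the last generation it acted on. */
struct demo_vcpu {
	uint64_t tlb_gen_seen;
	unsigned flushes;
};

void
demo_machine_init(struct demo_machine *mach)
{
	assert(mach != NULL);
	atomic_init(&mach->tlb_gen, 0);
}

/* Requester side: no lock taken, so it is safe where sleeping is not. */
void
demo_tlb_flush_request(struct demo_machine *mach)
{
	atomic_fetch_add(&mach->tlb_gen, 1);
	/* real code: broadcast an IPI here so every VCPU takes a VM exit */
}

/* VCPU loop side: flush lazily if the generation moved since last time. */
void
demo_vcpu_sync(struct demo_machine *mach, struct demo_vcpu *vcpu)
{
	uint64_t gen = atomic_load(&mach->tlb_gen);

	if (vcpu->tlb_gen_seen != gen) {
		vcpu->flushes++;	/* real code: flush the guest TLB */
		vcpu->tlb_gen_seen = gen;
	}
}
```

Several requests that arrive before a VCPU next runs collapse into a single flush, since only the difference in generation is observed.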
Revision 1.8, Thu Feb 21 11:58:04 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Changes since 1.7: +20 -12 lines
Clarify the gTLB code a little.
Revision 1.7, Mon Feb 18 12:17:45 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Changes since 1.6: +12 -19 lines
Ah, finally found you. Fix scheduling bug in NVMM. When processing guest page faults, we were calling uvm_fault with preemption disabled. The thing is, uvm_fault may block, and if it does, we land in sleepq_block which calls mi_switch; so we get switched away while we explicitly asked not to be. From then on things could go really wrong. Fix that by processing such faults in MI, where we have preemption enabled and are allowed to block. A KASSERT in sleepq_block (or before) would have helped.
Revision 1.6, Sat Feb 16 12:40:31 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Changes since 1.5: +25 -11 lines
Improve the FPU detection: hide XSAVES because we're not allowing it, and don't set CPUID2_OSXSAVE if the guest didn't first set CR4_OSXSAVE. With these changes in place, I can boot Windows 10 on NVMM.
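The two CPUID adjustments can be sketched as pure functions over register values. The function names and XSAVES_BIT are hypothetical; CR4_OSXSAVE and CPUID2_OSXSAVE are the names used in the commit message, with their architectural bit positions. How the real handler obtains the host values is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define CR4_OSXSAVE	(1ULL << 18)	/* CR4.OSXSAVE */
#define CPUID2_OSXSAVE	(1U << 27)	/* CPUID.01H:ECX.OSXSAVE */
#define XSAVES_BIT	(1U << 3)	/* CPUID.(EAX=0DH,ECX=1):EAX.XSAVES */

/* ECX of CPUID leaf 1 as the guest should see it. */
static uint32_t
guest_cpuid1_ecx(uint32_t host_ecx, uint64_t guest_cr4)
{
	uint32_t ecx = host_ecx & ~CPUID2_OSXSAVE;

	/* OSXSAVE mirrors the guest's own CR4, not the host's. */
	if (guest_cr4 & CR4_OSXSAVE)
		ecx |= CPUID2_OSXSAVE;

	assert(!!(ecx & CPUID2_OSXSAVE) == !!(guest_cr4 & CR4_OSXSAVE));
	return ecx;
}

/* EAX of CPUID leaf 0xD, sub-leaf 1: hide XSAVES from the guest. */
static uint32_t
guest_cpuid_0d_1_eax(uint32_t host_eax)
{
	return host_eax & ~XSAVES_BIT;
}
```

A guest that has not yet set CR4.OSXSAVE therefore sees OSXSAVE clear even when the host has it set.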
Revision 1.5, Sat Feb 16 12:05:30 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Changes since 1.4: +20 -2 lines
Handle MSR_MISC_ENABLE on NVMM-Intel (Intel-specific).
Revision 1.4, Fri Feb 15 13:17:05 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Changes since 1.3: +11 -4 lines
Initialize the guest TSC to zero at VCPU creation time, and handle guest writes to MSR_TSC at run time. This is imprecise, because the hardware does not provide a way to preserve the TSC during #VMEXITs, but that's fine enough.
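One way to picture this is with a TSC offset, where the guest sees the host TSC plus a per-VCPU offset. The commit message does not spell out the mechanism, so treat this as an assumption; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offset that makes the guest read guest_value at the moment the host
 * TSC reads host_tsc. Unsigned arithmetic wraps modulo 2^64, so a
 * guest value below the host value still works.
 */
static uint64_t
tsc_offset_for(uint64_t guest_value, uint64_t host_tsc)
{
	uint64_t offset = guest_value - host_tsc;

	assert(host_tsc + offset == guest_value);
	return offset;
}

/* What the guest observes when the host TSC reads host_tsc. */
static uint64_t
guest_tsc_read(uint64_t host_tsc, uint64_t offset)
{
	return host_tsc + offset;
}
```

VCPU creation corresponds to tsc_offset_for(0, host_tsc), and a guest write to MSR_TSC to tsc_offset_for(value, host_tsc). The host TSC keeps counting while the exit is handled, so the value sampled in the handler is slightly later than the guest's write, which is consistent with the imprecision the commit message mentions.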
Revision 1.3, Thu Feb 14 14:30:20 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Changes since 1.2: +2 -11 lines
Harmonize the handling of the CPL between AMD and Intel. AMD has a separate guest CPL field, because on AMD, the SYSCALL/SYSRET instructions do not force SS.DPL to predefined values. On Intel they do, so the CPL on Intel is just the guest's SS.DPL value. Even though technically possible on AMD, there is no sane reason for a guest kernel to set a non-three SS.DPL; doing that would mess up several common segmentation practices and wouldn't be compatible with Intel. So, force the Intel behavior on AMD, by always setting SS.DPL<=>CPL. Remove the now unused CPL field from nvmm_x64_state::misc[]. This actually increases performance on AMD: to detect interrupt windows the virtualizer has to modify some fields of misc[], and because CPL was there, we had to flush the SEG set of the VMCB cache. Now there is no flush necessary. While here, remove the CPL check for XSETBV on Intel: contrary to AMD, Intel checks the CPL before the intercept, so if we receive an XSETBV VMEXIT, we are certain that it was executed at CPL=0 in the guest. By the way, my check was wrong in the first place: it was reading SS.RPL instead of SS.DPL.
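The difference between SS.RPL and SS.DPL is easy to see in code. This sketch assumes the VMX access-rights layout, where the DPL occupies bits 6:5 of the attribute word, while the RPL is the low two bits of the selector. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* RPL: low two bits of the segment selector. */
static unsigned
selector_rpl(uint16_t selector)
{
	return selector & 0x3;
}

/* DPL: bits 6:5 of the segment attributes (VMX access-rights layout). */
static unsigned
attrib_dpl(uint32_t attrib)
{
	return (attrib >> 5) & 0x3;
}

/* With SS.DPL <=> CPL enforced, the guest CPL is read from SS.DPL. */
static unsigned
guest_cpl(uint32_t ss_attrib)
{
	unsigned cpl = attrib_dpl(ss_attrib);

	assert(cpl <= 3);
	return cpl;
}
```

The two fields live in different places (selector versus attributes), so a check written against the RPL does not look at the value that defines the CPL.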
Revision 1.2, Thu Feb 14 09:37:31 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Changes since 1.1: +10 -6 lines
On AMD, the segments have a simple "present" bit. On Intel however there is an extra "unusable" bit, which has a twisted meaning. We can't just ignore this bit, because when unset, the CPU performs extra checks on the other attributes, which may cause VMENTRY to fail and the guest to be killed. Typically, on Qemu, some guests like Windows XP trigger two consecutive getstate+setstate calls, and while processing them, we end up wrongfully removing the "unusable" bits that were previously set. Fix that by forcing "unusable = !present". Each hypervisor I could check does something different, but this seems to be the least problematic solution for now. While here, the fields of vmx_guest_segs are VMX indexes, so they should be uint64_t (no functional change).
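The "unusable = !present" rule amounts to a small normalization of the segment attributes. The sketch below assumes the VMX access-rights layout (present at bit 7, unusable at bit 16); the macro and function names are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define SEG_ATTRIB_P		(1U << 7)	/* segment present */
#define SEG_ATTRIB_UNUSABLE	(1U << 16)	/* VMX "unusable" bit */

/*
 * Force unusable = !present, so that a getstate followed by a setstate
 * cannot leave a non-present segment marked usable.
 */
static uint32_t
seg_attrib_fix_unusable(uint32_t attrib)
{
	if (attrib & SEG_ATTRIB_P)
		attrib &= ~SEG_ATTRIB_UNUSABLE;
	else
		attrib |= SEG_ATTRIB_UNUSABLE;

	assert(!(attrib & SEG_ATTRIB_P) == !!(attrib & SEG_ATTRIB_UNUSABLE));
	return attrib;
}
```

Applying this on every setstate makes the result independent of whether the caller preserved the unusable bit, which is the failure mode the commit message describes.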
Revision 1.1, Wed Feb 13 16:03:16 2019 UTC (4 years, 3 months ago) by maxv
Branch: MAIN
Add Intel-VMX support in NVMM. This allows us to run hardware-accelerated VMs on Intel CPUs. Overall this implementation is fast and reliable, I am able to run NetBSD VMs with many VCPUs on a quad-core Intel i5.

NVMM-Intel applies several optimizations already present in NVMM-AMD, and has a code structure similar to it. No change was needed in the NVMM MI frontend, or in libnvmm.

Some differences exist against AMD:

 - On Intel the ASID space is big, so we don't fall back to a shared ASID when there are more VCPUs executing than available ASIDs in the host, contrary to AMD. There are enough ASIDs for the maximum number of VCPUs supported by NVMM.
 - On Intel there are two TLBs we need to take care of, one for the host (EPT) and one for the guest (VPID). Changes in EPT paging flush the host TLB, changes to the guest mode flush the guest TLB.
 - On Intel there is no easy way to set/fetch the VTPR, so we intercept reads/writes to CR8 and maintain a software TPR, that we give to the virtualizer as if it was the effective TPR in the guest.
 - On Intel, because of SVS, the host CR4 and LSTAR are not static, so we're forced to save them on each VMENTRY.
 - There is extra Intel weirdness we need to take care of, for example the reserved bits in CR0 and CR4 when accesses trap.

While this implementation is functional and can already run many OSes, we likely have a problem on 32bit-PAE guests, because they require special care on Intel CPUs, and currently we don't handle that correctly; such guests may misbehave for now (without altering the host stability). I expect to fix that soon.