path: root/arch/x86/include/asm/vmx.h
Commit message (Author, Date, Files changed, Lines -/+)
* KVM: x86: Add Intel PT virtualization work mode (Chao Peng, 2018-12-21, 1 file, -0/+8)
    Intel Processor Trace virtualization can work in one of two possible modes:
    a. System-Wide mode (default): When the host configures Intel PT to collect trace packets of the entire system, it can leave the relevant VMX controls clear to allow VMX-specific packets to provide information across VMX transitions. The KVM guest is not aware of this feature in this mode, and both host and KVM guest traces are output to the host buffer.
    b. Host-Guest mode: The host can configure trace-packet generation while in VMX non-root operation for guests and in root operation for normal native execution. Intel PT is exposed to the KVM guest in this mode, and the trace is output to the respective buffers of host and guest. In this mode, the PT state is saved and disabled before VM-entry and restored after VM-exit when tracing a virtual machine.
    Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
    Signed-off-by: Luwei Kang <luwei.kang@intel.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
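    A minimal sketch of how the two modes could be represented on the KVM side; the enum name and values here are illustrative assumptions, not the exact in-tree definitions.
        /* Illustrative sketch of the two Intel PT virtualization modes. */
        enum pt_mode {
                PT_MODE_SYSTEM = 0,       /* host-wide tracing; PT hidden from guests */
                PT_MODE_HOST_GUEST = 1,   /* per-domain tracing; PT state swapped on VM-entry/exit */
        };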
* KVM: nVMX: Unrestricted guest mode requires EPT (Jim Mattson, 2018-11-27, 1 file, -0/+1)
    As specified in Intel's SDM, do not allow the L1 hypervisor to launch an L2 guest with the VM-execution controls for "unrestricted guest" or "mode-based execute control for EPT" set and the VM-execution control for "enable EPT" clear. Note that the VM-execution control for "mode-based execute control for EPT" is not yet virtualized by kvm.
    Reported-by: Andrew Thornton <andrewth@google.com>
    Signed-off-by: Jim Mattson <jmattson@google.com>
    Reviewed-by: Peter Shier <pshier@google.com>
    Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
    Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
    Reviewed-by: Liran Alon <liran.alon@oracle.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
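    A rough sketch of the added consistency check, using the secondary execution-control names from vmx.h; the helper name and exact placement are assumptions, not the real kernel code.
        /* Reject a vmcs12 that sets "unrestricted guest" or "mode-based
         * execute control for EPT" without also enabling EPT. */
        static bool nested_secondary_ctls_valid(u32 secondary_exec)
        {
                if ((secondary_exec & (SECONDARY_EXEC_UNRESTRICTED_GUEST |
                                       SECONDARY_EXEC_MODE_BASED_EPT_EXEC)) &&
                    !(secondary_exec & SECONDARY_EXEC_ENABLE_EPT))
                        return false;
                return true;
        }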
* KVM/x86: Use assembly instruction mnemonics instead of .byte streams (Uros Bizjak, 2018-10-17, 1 file, -13/+0)
    Recently the minimum required version of binutils was changed to 2.20, which supports all VMX instruction mnemonics. The patch removes all .byte #defines and uses real instruction mnemonics instead. The compiler is now able to pass a memory operand to the instruction, so there is no need for a memory clobber anymore. Also, the compiler adds the CC register clobber automatically to all extended asm clauses, so the patch also removes the explicit CC clobber.
    The immediate benefit of the patch is the removal of many unnecessary register moves, resulting in 1434 saved bytes in vmx.o:
          text    data     bss     dec     hex filename
        151257   18246    8500  178003   2b753 vmx.o
        152691   18246    8500  179437   2bced vmx-old.o
    Some examples of improvement include removal of unneeded moves of %rsp to %rax in front of invept and invvpid instructions:
        a57e: b9 01 00 00 00          mov    $0x1,%ecx
        a583: 48 89 04 24             mov    %rax,(%rsp)
        a587: 48 89 e0                mov    %rsp,%rax
        a58a: 48 c7 44 24 08 00 00    movq   $0x0,0x8(%rsp)
        a591: 00 00
        a593: 66 0f 38 80 08          invept (%rax),%rcx
    to:
        a45c: 48 89 04 24             mov    %rax,(%rsp)
        a460: b8 01 00 00 00          mov    $0x1,%eax
        a465: 48 c7 44 24 08 00 00    movq   $0x0,0x8(%rsp)
        a46c: 00 00
        a46e: 66 0f 38 80 04 24       invept (%rsp),%rax
    and the ability to use more optimal registers and memory operands in the instruction:
        8faa: 48 8b 44 24 28          mov    0x28(%rsp),%rax
        8faf: 4c 89 c2                mov    %r8,%rdx
        8fb2: 0f 79 d0                vmwrite %rax,%rdx
    to:
        8e7c: 44 0f 79 44 24 28       vmwrite 0x28(%rsp),%r8
    Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: vmx: Add defines for SGX ENCLS exiting (Sean Christopherson, 2018-08-22, 1 file, -0/+3)
    Hardware support for basic SGX virtualization adds a new execution control (ENCLS_EXITING), VMCS field (ENCLS_EXITING_BITMAP) and exit reason (ENCLS), that enables a VMM to intercept specific ENCLS leaf functions, e.g. to inject faults when the VMM isn't exposing SGX to a VM. When ENCLS_EXITING is enabled, the VMM can set/clear bits in the bitmap to intercept/allow ENCLS leaf functions in non-root, e.g. setting bit 2 in the ENCLS_EXITING_BITMAP will cause ENCLS[EINIT] to VMExit(ENCLS).
    Note: EXIT_REASON_ENCLS was previously added by commit 1f5199927034 ("KVM: VMX: add missing exit reasons").
    Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
    Message-Id: <20180814163334.25724-2-sean.j.christopherson@intel.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
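    For reference, the three additions are most likely along these lines (encodings per the SDM; treat the exact macro spellings as assumptions):
        #define SECONDARY_EXEC_ENCLS_EXITING    0x00008000   /* secondary exec control, bit 15 */
        #define ENCLS_EXITING_BITMAP            0x0000202E   /* 64-bit VMCS control field */
        #define EXIT_REASON_ENCLS               60           /* basic VM-exit reason */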
* x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentryPaolo Bonzini2018-08-051-0/+1
| | | | | | | | | Bit 3 of ARCH_CAPABILITIES tells a hypervisor that L1D flush on vmentry is not needed. Add a new value to enum vmx_l1d_flush_state, which is used either if there is no L1TF bug at all, or if bit 3 is set in ARCH_CAPABILITIES. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
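    A sketch of the resulting decision, using the MSR and feature-bit names from msr-index.h and cpufeatures.h; the helper itself is illustrative rather than the actual mitigation-setup code (requires <asm/msr.h> and <asm/cpufeature.h>).
        static bool l1d_flush_needed_on_vmentry(void)
        {
                u64 caps = 0;

                if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
                        rdmsrl(MSR_IA32_ARCH_CAPABILITIES, caps);

                /* Bit 3: hardware reports that the L1D flush on VM-entry is not needed. */
                return !(caps & ARCH_CAP_SKIP_VMENTRY_L1DFLUSH);
        }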
* Merge 4.18-rc7 into master to pick up the KVM dependency (Thomas Gleixner, 2018-08-05, 1 file, -0/+3)
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * kvm: vmx: Nested VM-entry prereqs for event inj. (Marc Orr, 2018-06-22, 1 file, -0/+3)
    This patch extends the checks done prior to a nested VM entry. Specifically, it extends the check_vmentry_prereqs function with checks for fields relevant to the VM-entry event injection information, as described in the Intel SDM, volume 3.
    This patch is motivated by a syzkaller bug, where a bad VM-entry interruption information field is generated in the VMCS02, which causes the nested VM launch to fail. Then, KVM fails to resume L1. While KVM should be improved to correctly resume L1 execution after a failed nested launch, this change is justified because the existing code to resume L1 is flaky/ad-hoc and the test coverage for resuming L1 is sparse.
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Marc Orr <marcorr@google.com>
    [Removed comment whose parts were describing previous revisions and the rest was obvious from function/variable naming. - Radim]
    Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
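    The shape of the added prereq check on the VM-entry interruption-information field is roughly as below (mask names as in vmx.h; the helper is a simplification, and the full SDM checks on type, vector and error code are omitted).
        static bool vm_entry_intr_info_plausible(u32 intr_info)
        {
                if (!(intr_info & INTR_INFO_VALID_MASK))
                        return true;                      /* nothing is being injected */
                if (intr_info & INTR_INFO_RESVD_BITS_MASK)
                        return false;                     /* reserved bits must be zero */
                /* ... plus per-type checks on vector and error-code bit ... */
                return true;
        }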
* | x86/l1tf: Handle EPT disabled state proper (Thomas Gleixner, 2018-07-13, 1 file, -0/+1)
    If Extended Page Tables (EPT) are disabled or not supported, no L1D flushing is required. The setup function can just avoid setting up the L1D flush for the EPT=n case. Invoke it after the hardware setup has been done and enable_ept has the correct state, and expose the EPT disabled state in the mitigation status as well.
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Jiri Kosina <jkosina@suse.cz>
    Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
    Link: https://lkml.kernel.org/r/20180713142322.612160168@linutronix.de
* | x86/l1tf: Introduce vmx status variable (Thomas Gleixner, 2018-07-13, 1 file, -0/+9)
    Store the effective mitigation of VMX in a status variable and use it to report the VMX state in the l1tf sysfs file.
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Jiri Kosina <jkosina@suse.cz>
    Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
    Link: https://lkml.kernel.org/r/20180713142322.433098358@linutronix.de
* KVM: nVMX: Restore the VMCS12 offsets for v4.0 fields (Jim Mattson, 2018-05-23, 1 file, -0/+2)
    Changing the VMCS12 layout will break save/restore compatibility with older kvm releases once the KVM_{GET,SET}_NESTED_STATE ioctls are accepted upstream. Google has already been using these ioctls for some time, and we implore the community not to disturb the existing layout.
    Move the four most recently added fields to preserve the offsets of the previously defined fields and reserve locations for the vmread and vmwrite bitmaps, which will be used in the virtualization of VMCS shadowing (to improve the performance of double-nesting).
    Signed-off-by: Jim Mattson <jmattson@google.com>
    Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    [Kept the SDM order in vmcs_field_to_offset_table. - Radim]
    Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
* kvm/x86: fix icebp instruction handling (Linus Torvalds, 2018-03-20, 1 file, -0/+1)
    The undocumented 'icebp' instruction (aka 'int1') works pretty much like 'int3' in the absence of in-circuit probing equipment (except, obviously, that it raises #DB instead of raising #BP), and is used by some validation test-suites as such.
    But Andy Lutomirski noticed that his test suite acted differently in kvm than on bare hardware. The reason is that kvm used an inexact test for the icebp instruction: it just assumed that an all-zero VM exit qualification value meant that the VM exit was due to icebp.
    That is not unlike the guess that do_debug() does for the actual exception handling case, but it's purely a heuristic, not an absolute rule. do_debug() does it because it wants to ascribe _some_ reason to the #DB that happened, and an empty %dr6 value means that 'icebp' is the most likely cause and we have no better information.
    But kvm can just do it right, because unlike the do_debug() case, kvm actually sees the real reason for the #DB in the VM-exit interruption information field. So instead of relying on an inexact heuristic, just use the actual VM exit information that says "it was 'icebp'".
    Right now the 'icebp' instruction isn't technically documented by Intel, but that will hopefully change. The special "privileged software exception" information _is_ actually mentioned in the Intel SDM, even though the cause of it isn't enumerated.
    Reported-by: Andy Lutomirski <luto@kernel.org>
    Tested-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
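    The single new define is the type-5 ("privileged software exception") encoding of the interruption-information type field, which a handler can test instead of guessing from an empty exit qualification. A sketch (the helper name is illustrative):
        #define INTR_TYPE_PRIV_SW_EXCEPTION     (5 << 8)   /* ICE breakpoint (icebp/int1) */

        static bool exit_was_icebp(u32 intr_info)
        {
                return (intr_info & INTR_INFO_INTR_TYPE_MASK) == INTR_TYPE_PRIV_SW_EXCEPTION;
        }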
* KVM: VMX: rename RDSEED and RDRAND vmx ctrls to reflect exiting (David Hildenbrand, 2017-10-12, 1 file, -2/+2)
    Let's just name these according to the SDM. This should make it clearer that they are used to enable exiting, and not the feature itself.
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
* KVM: MMU: Add 5 level EPT & Shadow page table support. (Yu Zhang, 2017-08-24, 1 file, -0/+2)
    Extends the shadow paging code, so that 5 level shadow page tables can be constructed if the VM is running in 5 level paging mode.
    Also extends the EPT code, so that a 5 level EPT table can be constructed if maxphysaddr of the VM exceeds 48 bits. Unlike the shadow logic, KVM should still use a 4 level EPT table for a VM whose physical address width is less than 48 bits, even when the VM is running in 5 level paging mode.
    Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
    [Unconditionally reset the MMU context in kvm_cpuid_update. Changing MAXPHYADDR invalidates the reserved bit bitmasks. - Paolo]
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
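    The chosen walk length ends up in bits 5:3 of the EPT pointer (the encoded value is the walk length minus one). With the names from the EPTP cleanup entry below, the relevant encodings look roughly like this:
        #define VMX_EPTP_PWL_MASK       0x38ull
        #define VMX_EPTP_PWL_4          0x18ull    /* (4 - 1) << 3: 4-level EPT walk */
        #define VMX_EPTP_PWL_5          0x20ull    /* (5 - 1) << 3: 5-level EPT walk */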
* KVM: VMX: cleanup EPTP definitions (David Hildenbrand, 2017-08-18, 1 file, -5/+6)
    Don't use shifts, tag them correctly as EPTP and use better matching names (PWL vs. GAW).
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
* KVM: nVMX: Emulate EPTP switching for the L1 hypervisor (Bandan Das, 2017-08-07, 1 file, -0/+6)
    When L2 uses vmfunc, L0 utilizes the associated vmexit to emulate a switching of the ept pointer by reloading the guest MMU.
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Bandan Das <bsd@redhat.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
* KVM: vmx: Enable VMFUNCs (Bandan Das, 2017-08-07, 1 file, -0/+3)
    Enable VMFUNC in the secondary execution controls. This simplifies the changes necessary to expose it to nested hypervisors. VMFUNCs still cause #UD when invoked.
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Bandan Das <bsd@redhat.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
* KVM: nVMX: support RDRAND and RDSEED exiting (Paolo Bonzini, 2017-04-07, 1 file, -0/+2)
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Reviewed-by: Jim Mattson <jmattson@google.com>
    Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
* kvm: nVMX: support EPT accessed/dirty bits (Paolo Bonzini, 2017-04-07, 1 file, -0/+2)
    Now use bit 6 of EPTP to optionally enable A/D bits for EPTP. Another thing to change is that, when EPT accessed and dirty bits are not in use, VMX treats accesses to guest paging structures as data reads. When they are in use (bit 6 of EPTP is set), they are treated as writes and the corresponding EPT dirty bit is set. The MMU didn't know this detail, so this patch adds it.
    We also have to fix up the exit qualification. It may be wrong because KVM sets bit 6 but the guest might not.
    L1 emulates EPT A/D bits using write permissions, so in principle it may be possible for EPT A/D bits to be used by L1 even though not available in hardware. The problem is that guest page-table walks will be treated as reads rather than writes, so they would not cause an EPT violation.
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    [Fixed typo in walk_addr_generic() comment and changed bit clear + conditional-set pattern in handle_ept_violation() to conditional-clear]
    Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
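    Bit 6 of the EPT pointer is the enable bit for accessed/dirty flags. A hedged sketch of the define this relies on (the exact in-tree spelling has varied between VMX_EPT_AD_ENABLE_BIT and VMX_EPTP_AD_ENABLE_BIT):
        #define VMX_EPT_AD_ENABLE_BIT   (1ull << 6)   /* EPTP bit 6: enable EPT A/D flags */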
* kvm: x86: mmu: Rename EPT_VIOLATION_READ/WRITE/INSTR constants (Junaid Shahid, 2017-01-27, 1 file, -6/+6)
    Rename the EPT_VIOLATION_READ/WRITE/INSTR constants to EPT_VIOLATION_ACC_READ/WRITE/INSTR to more clearly indicate that these signify the type of the memory access as opposed to the permissions granted by the PTE.
    Signed-off-by: Junaid Shahid <junaids@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* kvm: x86: mmu: Lockless access tracking for Intel CPUs without EPT A bits. (Junaid Shahid, 2017-01-09, 1 file, -3/+6)
    This change implements lockless access tracking for Intel CPUs without EPT A bits. This is achieved by marking the PTEs as not-present (but not completely clearing them) when clear_flush_young() is called after marking the pages as accessed. When an EPT Violation is generated as a result of the VM accessing those pages, the PTEs are restored to their original values.
    Signed-off-by: Junaid Shahid <junaids@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* kvm: x86: mmu: Do not use bit 63 for tracking special SPTEs (Junaid Shahid, 2017-01-09, 1 file, -2/+7)
    MMIO SPTEs currently set both bits 62 and 63 to distinguish them as special PTEs. However, bit 63 is used as the SVE bit in Intel EPT PTEs. The SVE bit is ignored for misconfigured PTEs but not necessarily for not-Present PTEs. Since MMIO SPTEs use an EPT misconfiguration, using bit 63 for them is acceptable. However, the upcoming fast access tracking feature adds another type of special tracking PTE, which uses not-Present PTEs and hence should not set bit 63.
    In order to use common bits to distinguish both types of special PTEs, we now use only bit 62 as the special bit.
    Signed-off-by: Junaid Shahid <junaids@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* kvm: x86: mmu: Use symbolic constants for EPT Violation Exit Qualifications (Junaid Shahid, 2017-01-09, 1 file, -0/+16)
    This change adds some symbolic constants for VM Exit Qualifications related to EPT Violations and updates handle_ept_violation() to use these constants instead of hard-coded numbers.
    Signed-off-by: Junaid Shahid <junaids@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
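    The exit-qualification bits being named are the low bits the SDM defines for EPT violations; shown here with the ACC_ prefix introduced by the rename entry above, as a sketch of the final shape:
        #define EPT_VIOLATION_ACC_READ_BIT      0   /* access was a data read */
        #define EPT_VIOLATION_ACC_WRITE_BIT     1   /* access was a data write */
        #define EPT_VIOLATION_ACC_INSTR_BIT     2   /* access was an instruction fetch */
        #define EPT_VIOLATION_READABLE_BIT      3   /* the GPA was readable */
        #define EPT_VIOLATION_WRITABLE_BIT      4   /* the GPA was writable */
        #define EPT_VIOLATION_EXECUTABLE_BIT    5   /* the GPA was executable */
        #define EPT_VIOLATION_ACC_READ          (1 << EPT_VIOLATION_ACC_READ_BIT)
        #define EPT_VIOLATION_ACC_WRITE         (1 << EPT_VIOLATION_ACC_WRITE_BIT)
        #define EPT_VIOLATION_ACC_INSTR         (1 << EPT_VIOLATION_ACC_INSTR_BIT)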
* KVM: nVMX: support restore of VMX capability MSRs (David Matlack, 2016-12-08, 1 file, -0/+31)
    The VMX capability MSRs advertise the set of features the KVM virtual CPU can support. This set of features varies across different host CPUs and KVM versions. This patch aims to address both sources of differences, allowing VMs to be migrated across CPUs and KVM versions without guest-visible changes to these MSRs. Note that cross-KVM-version migration is only supported from this point forward.
    When the VMX capability MSRs are restored, they are audited to check that the set of features advertised is a subset of what KVM and the CPU support.
    Since the VMX capability MSRs are read-only, they do not need to be on the default MSR save/restore lists. The userspace hypervisor can set the values of these MSRs or read them from KVM at VCPU creation time, and restore the same value after every save/restore.
    Signed-off-by: David Matlack <dmatlack@google.com>
    Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
* KVM: VMX: clean up declaration of VPID/EPT invalidation types (Jan Dakinevich, 2016-11-22, 1 file, -1/+4)
    - Remove VMX_EPT_EXTENT_INDIVIDUAL_ADDR, since there is no such type of EPT invalidation
    - Add missing VPID type names
    Signed-off-by: Jan Dakinevich <jan.dakinevich@gmail.com>
    Tested-by: Ladi Prosek <lprosek@redhat.com>
    Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
* KVM: nVMX: support descriptor table exits (Paolo Bonzini, 2016-11-02, 1 file, -0/+1)
    These are never used by the host, but they can still be reflected to the guest.
    Tested-by: Ladi Prosek <lprosek@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Revert "KVM: x86: add pcommit support"Dan Williams2016-07-231-1/+0
| | | | | | | | | | | | | This reverts commit 8b3e34e46aca9b6d349b331cd9cf71ccbdc91b2e. Given the deprecation of the pcommit instruction, the relevant VMX features and CPUID bits are not going to be rolled into the SDM. Remove their usage from KVM. Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* KVM: VMX: Enable and initialize VMX TSC scaling (Haozhong Zhang, 2015-11-10, 1 file, -0/+3)
    This patch enhances the kvm-intel module to enable VMX TSC scaling and collects information on the TSC scaling ratio during initialization.
    Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: nVMX: emulate the INVVPID instruction (Wanpeng Li, 2015-10-16, 1 file, -0/+1)
    Add the INVVPID instruction emulation.
    Reviewed-by: Wincy Van <fanwenyi0529@gmail.com>
    Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: x86: add pcommit support (Xiao Guangrong, 2015-10-01, 1 file, -1/+1)
    Pass the PCOMMIT CPU feature to the guest to enable the PCOMMIT instruction.
    Currently we do not intercept the pcommit instruction for the L1 guest, and we allow L1 to intercept this instruction for L2 if, as required by the spec, L1 can enumerate the PCOMMIT instruction via CPUID:
    | IA32_VMX_PROCBASED_CTLS2[53] (which enumerates support for the
    | 1-setting of PCOMMIT exiting) is always the same as
    | CPUID.07H:EBX.PCOMMIT[bit 22]. Thus, software can set PCOMMIT exiting
    | to 1 if and only if the PCOMMIT instruction is enumerated via CPUID
    The spec can be found at https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
    Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* x86/kvm: Rename VMX's segment access rights defines (Andy Lutomirski, 2015-08-15, 1 file, -23/+23)
    VMX encodes access rights differently from LAR, and the latter is most likely what x86 people think of when they think of "access rights". Rename them to avoid confusion.
    Cc: kvm@vger.kernel.org
    Signed-off-by: Andy Lutomirski <luto@kernel.org>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* kvm/x86: add support for MONITOR_TRAP_FLAG (Mihai Donțu, 2015-07-23, 1 file, -0/+1)
    Allow a nested hypervisor to single step its guests.
    Signed-off-by: Mihai Donțu <mihai.dontu@gmail.com>
    [Fix overlong line. - Paolo]
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: VMX: Add PML support in VMX (Kai Huang, 2015-01-30, 1 file, -0/+4)
    This patch adds PML support in VMX. A new module parameter 'enable_pml' is added to allow the user to enable/disable it manually.
    Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
    Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
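    The VMX pieces PML builds on, with encodings from the SDM (the macro spellings are a best-effort rendering of the header additions):
        #define SECONDARY_EXEC_ENABLE_PML       0x00020000   /* secondary exec control, bit 17 */
        #define PML_ADDRESS                     0x0000200E   /* 64-bit control: PML buffer address */
        #define GUEST_PML_INDEX                 0x00000812   /* 16-bit guest-state: current PML index */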
* kvm: x86: handle XSAVES vmcs and vmexit (Wanpeng Li, 2014-12-05, 1 file, -0/+2)
    Initialize the XSS exit bitmap. It is zero, so there should be no XSAVES or XRSTORS exits.
    Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
    Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* kvm: x86: Add kvm_x86_ops hook that enables XSAVES for guest (Wanpeng Li, 2014-12-05, 1 file, -0/+1)
    Expose the XSAVES feature to the guest if the kvm_x86_ops say it is available.
    Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: nVMX: Fix returned value of MSR_IA32_VMX_PROCBASED_CTLS (Jan Kiszka, 2014-06-19, 1 file, -0/+3)
    The SDM says bits 1, 4-6, 8, 13-16, and 26 have to be set.
    Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
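    Spelled out, bits 1, 4-6, 8, 13-16 and 26 add up to a single must-be-1 mask, which the header expresses as one constant:
        /* 0x2 | 0x70 | 0x100 | 0x1e000 | 0x4000000 */
        #define CPU_BASED_ALWAYSON_WITHOUT_TRUE_MSR     0x0401e172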
* KVM: x86: Fix constant value of VM_{EXIT_SAVE,ENTRY_LOAD}_DEBUG_CONTROLS (Jan Kiszka, 2014-06-19, 1 file, -2/+2)
    The spec says those controls are at bit position 2, which makes the mask value 4. The impact of this mistake is effectively zero as we only use them to ensure that these features are set at position 2 (or, previously, 1) in MSR_IA32_VMX_{EXIT,ENTRY}_CTLS - which is and will always be true according to the spec.
    Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
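    The corrected values, i.e. bit position 2 expressed as a mask:
        #define VM_EXIT_SAVE_DEBUG_CONTROLS     0x00000004   /* bit 2 -> mask 4 (old define used bit 1) */
        #define VM_ENTRY_LOAD_DEBUG_CONTROLS    0x00000004   /* bit 2 -> mask 4 (old define used bit 1) */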
* KVM: x86: Intel MPX vmx and msr handle (Liu, Jinsong, 2014-02-24, 1 file, -0/+4)
    This patch handles the vmx and msr parts of the Intel MPX feature.
    Signed-off-by: Xudong Hao <xudong.hao@intel.com>
    Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: nVMX: Add support for activity state HLT (Jan Kiszka, 2013-12-12, 1 file, -0/+1)
    We can easily emulate the HLT activity state for L1: If it decides that L2 shall be halted on entry, just invoke the normal emulation of halt after switching to L2. We do not depend on specific host features to provide this, so we can expose the capability unconditionally.
    Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* nEPT: Nested INVEPT (Nadav Har'El, 2013-08-07, 1 file, -0/+2)
    If we let L1 use EPT, we should probably also support the INVEPT instruction.
    In our current nested EPT implementation, when L1 changes its EPT table for L2 (i.e., EPT12), L0 modifies the shadow EPT table (EPT02), and in the course of this modification already calls INVEPT. But if the last level of the shadow page is unsync, not all of L1's changes to EPT12 are intercepted, which means roots need to be synced when L1 calls INVEPT. Global INVEPT should not be different, since roots are synced by kvm_mmu_load() each time EPTP02 changes.
    Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
    Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
    Signed-off-by: Jun Nakajima <jun.nakajima@intel.com>
    Signed-off-by: Xinhao Xu <xinhao.xu@intel.com>
    Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
    Signed-off-by: Gleb Natapov <gleb@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: nVMX: Shadow-vmcs control fields/bits (Abel Gordon, 2013-04-22, 1 file, -0/+3)
    Add definitions for all the vmcs control fields/bits required to enable vmcs-shadowing.
    Signed-off-by: Abel Gordon <abelg@il.ibm.com>
    Reviewed-by: Orit Wasserman <owasserm@redhat.com>
    Signed-off-by: Gleb Natapov <gleb@redhat.com>
* KVM: VMX: Check the posted interrupt capability (Yang Zhang, 2013-04-16, 1 file, -0/+4)
    Detect the posted interrupt feature. If it exists, then set it in vmcs_config.
    Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
    Reviewed-by: Gleb Natapov <gleb@redhat.com>
    Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
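    The capability detection revolves around these encodings (values per the SDM; the macro names approximate the header):
        #define PIN_BASED_POSTED_INTR           0x00000080   /* pin-based control, bit 7 */
        #define POSTED_INTR_NV                  0x00000002   /* 16-bit control: notification vector */
        #define POSTED_INTR_DESC_ADDR           0x00002016   /* 64-bit control: descriptor address */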
* KVM: nVMX: Add preemption timer support (Jan Kiszka, 2013-03-14, 1 file, -0/+3)
    Provided the host has this feature, it's straightforward to offer it to the guest as well. We just need to load the timer value on L2 entry if the feature was enabled by L1 and watch out for the corresponding exit reason.
    Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    Signed-off-by: Gleb Natapov <gleb@redhat.com>
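    Encodings involved in loading the timer on L2 entry and catching the corresponding exit (values per the SDM; names approximate the header):
        #define PIN_BASED_VMX_PREEMPTION_TIMER          0x00000040   /* pin-based control, bit 6 */
        #define VMX_PREEMPTION_TIMER_VALUE              0x0000482E   /* 32-bit guest-state field */
        #define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER       0x00400000   /* VM-exit control, bit 22 */
        #define EXIT_REASON_PREEMPTION_TIMER            52           /* timer expired in non-root mode */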
* KVM: nVMX: Provide EFER.LMA saving support (Jan Kiszka, 2013-03-14, 1 file, -0/+2)
    We will need EFER.LMA saving to provide unrestricted guest mode. All that is missing for this is picking up EFER.LMA from VM_ENTRY_CONTROLS on L2->L1 switches. If the host does not support EFER.LMA saving, no change is performed; otherwise we properly emulate for L1 what the hardware does for L0. Advertise the support, depending on the host feature.
    Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    Signed-off-by: Gleb Natapov <gleb@redhat.com>
* KVM: nVMX: Clean up and fix pin-based execution controls (Jan Kiszka, 2013-03-13, 1 file, -0/+2)
    Only interrupt and NMI exiting are mandatory for KVM to work and can thus be exposed to the guest unconditionally; virtual NMI exiting is optional, so we must not advertise it unless the host supports it. Introduce the symbolic constant PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR at this chance.
    Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    Signed-off-by: Gleb Natapov <gleb@redhat.com>
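    For reference, the default1 class of the pin-based controls covers bits 1, 2 and 4, which gives the constant introduced here:
        /* 0x2 | 0x4 | 0x10 */
        #define PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR     0x00000016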
* KVM: nVMX: Fix content of MSR_IA32_VMX_ENTRY/EXIT_CTLS (Jan Kiszka, 2013-03-07, 1 file, -0/+4)
    Properly set those bits to 1 that the spec demands in case bit 55 of VMX_BASIC is 0 - like in our case.
    Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
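    With bit 55 of IA32_VMX_BASIC clear, the default1 bits reported in the low halves of these MSRs work out to roughly the following constants:
        #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR       0x00036dff   /* bits 0-8, 10-11, 13-14, 16-17 */
        #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR      0x000011ff   /* bits 0-8 and 12 */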
* Merge tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm (Linus Torvalds, 2013-02-24, 1 file, -3/+15)
    Pull KVM updates from Marcelo Tosatti:
      "KVM updates for the 3.9 merge window, including x86 real mode emulation fixes, stronger memory slot interface restrictions, mmu_lock spinlock hold time reduction, improved handling of large page faults on shadow, initial APICv HW acceleration support, s390 channel IO based virtio, amongst others"
    * tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
      Revert "KVM: MMU: lazily drop large spte"
      x86: pvclock kvm: align allocation size to page size
      KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr
      x86 emulator: fix parity calculation for AAD instruction
      KVM: PPC: BookE: Handle alignment interrupts
      booke: Added DBCR4 SPR number
      KVM: PPC: booke: Allow multiple exception types
      KVM: PPC: booke: use vcpu reference from thread_struct
      KVM: Remove user_alloc from struct kvm_memory_slot
      KVM: VMX: disable apicv by default
      KVM: s390: Fix handling of iscs.
      KVM: MMU: cleanup __direct_map
      KVM: MMU: remove pt_access in mmu_set_spte
      KVM: MMU: cleanup mapping-level
      KVM: MMU: lazily drop large spte
      KVM: VMX: cleanup vmx_set_cr0().
      KVM: VMX: add missing exit names to VMX_EXIT_REASONS array
      KVM: VMX: disable SMEP feature when guest is in non-paging mode
      KVM: Remove duplicate text in api.txt
      Revert "KVM: MMU: split kvm_mmu_free_page"
      ...
| * KVM: VMX: add missing exit names to VMX_EXIT_REASONS array (Gleb Natapov, 2013-02-05, 1 file, -1/+6)
    Signed-off-by: Gleb Natapov <gleb@redhat.com>
    Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * x86, apicv: add virtual interrupt delivery support (Yang Zhang, 2013-01-29, 1 file, -0/+11)
    Virtual interrupt delivery avoids the need for KVM to inject vAPIC interrupts manually; this is fully taken care of by the hardware. It requires some special awareness in the existing interrupt injection path:
    - For a pending interrupt, instead of direct injection, we may need to update architecture-specific indicators before resuming to the guest.
    - A pending interrupt that is masked by the ISR should also be considered in the above update action, since hardware will decide when to inject it at the right time. The current has_interrupt and get_interrupt only return a valid vector from the injection p.o.v.
    Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
    Signed-off-by: Kevin Tian <kevin.tian@intel.com>
    Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
    Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * x86, apicv: add virtual x2apic support (Yang Zhang, 2013-01-29, 1 file, -0/+1)
    Basically, to benefit from apicv we need to enable virtualized x2apic mode. Currently, we only enable it when the guest is really using x2apic.
    Also, clear the MSR bitmap for the corresponding x2apic MSRs when the guest has enabled x2apic:
    0x800 - 0x8ff: no read intercept for apicv register virtualization, except APIC ID and TMCCT, which need software's assistance to get the right value.
    Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
    Signed-off-by: Kevin Tian <kevin.tian@intel.com>
    Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
    Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * x86, apicv: add APICv register virtualization support (Yang Zhang, 2013-01-29, 1 file, -0/+2)
    - APIC read doesn't cause VM-Exit
    - APIC write becomes trap-like
    Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
    Signed-off-by: Kevin Tian <kevin.tian@intel.com>
    Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
    Signed-off-by: Gleb Natapov <gleb@redhat.com>