Message ID | cover.1694421911.git.haibo1.xu@intel.com |
---|---|
Headers | show |
Series | RISCV: Add kvm Sstc timer selftests | expand |
On Thu, Sep 14, 2023 at 4:52 PM Andrew Jones <ajones@ventanamicro.com> wrote: > > On Thu, Sep 14, 2023 at 09:36:59AM +0800, Haibo Xu wrote: > > Signed-off-by: Haibo Xu <haibo1.xu@intel.com> > > --- > > tools/testing/selftests/kvm/include/riscv/processor.h | 8 +------- > > 1 file changed, 1 insertion(+), 7 deletions(-) > > > > diff --git a/tools/testing/selftests/kvm/include/riscv/processor.h b/tools/testing/selftests/kvm/include/riscv/processor.h > > index 5b62a3d2aa9b..67766baed4f7 100644 > > --- a/tools/testing/selftests/kvm/include/riscv/processor.h > > +++ b/tools/testing/selftests/kvm/include/riscv/processor.h > > @@ -8,6 +8,7 @@ > > #define SELFTEST_KVM_PROCESSOR_H > > > > #include "kvm_util.h" > > +#include <asm/csr.h> > > #include <linux/stringify.h> > > nit: Usually we try to keep the order of our includes separated into five > categories, listed below, where each category is sorted alphabetically. Of > course any dependencies the includes have on each other need to be > considered too. > > <library-includes-without-a-subdir> > <library-includes-with-subdir> > <linux/...> > <asm/...> > "local-includes" > It's very helpful. Thanks for sharing it! Will fix it in v4. > > > > static inline uint64_t __kvm_reg_id(uint64_t type, uint64_t idx, > > @@ -95,13 +96,6 @@ static inline uint64_t __kvm_reg_id(uint64_t type, uint64_t idx, > > #define PGTBL_PAGE_SIZE PGTBL_L0_BLOCK_SIZE > > #define PGTBL_PAGE_SIZE_SHIFT PGTBL_L0_BLOCK_SHIFT > > > > -#define SATP_PPN _AC(0x00000FFFFFFFFFFF, UL) > > -#define SATP_MODE_39 _AC(0x8000000000000000, UL) > > -#define SATP_MODE_48 _AC(0x9000000000000000, UL) > > -#define SATP_ASID_BITS 16 > > -#define SATP_ASID_SHIFT 44 > > -#define SATP_ASID_MASK _AC(0xFFFF, UL) > > - > > #define SBI_EXT_EXPERIMENTAL_START 0x08000000 > > #define SBI_EXT_EXPERIMENTAL_END 0x08FFFFFF > > > > -- > > 2.34.1 > > > > Assuming the CONFIG_64BIT patch will come before this, then > > Reviewed-by: Andrew Jones <ajones@ventanamicro.com> > > Thanks, > drew
On Thu, Sep 14, 2023 at 5:52 PM Andrew Jones <ajones@ventanamicro.com> wrote: > > On Thu, Sep 14, 2023 at 09:37:03AM +0800, Haibo Xu wrote: > > Add a KVM selftests to validate the Sstc timer functionality. > > The test was ported from arm64 arch timer test. > > I just tried this test out. Running it over and over again on QEMU I see > it works sometimes, but it frequently fails with the > GUEST_ASSERT_EQ(config_iter + 1, irq_iter) assert and at least once I > also saw the __GUEST_ASSERT(xcnt >= cmp) assert. > Good catch! I can also reproduce this issue and it is a common problem for both arm64 and riscv because it also happens in a arm64 Qemu VM. It seems like a synchronization issue between host and guest shared variables. Will double check the test code. > Thanks, > drew
On Thu, Sep 14, 2023 at 6:15 PM Andrew Jones <ajones@ventanamicro.com> wrote: > > On Thu, Sep 14, 2023 at 10:52:15AM +0100, Conor Dooley wrote: > > On Thu, Sep 14, 2023 at 11:36:01AM +0200, Andrew Jones wrote: > > > > +static inline void cpu_relax(void) > > > > +{ > > > > +#ifdef __riscv_zihintpause > > > > + asm volatile("pause" ::: "memory"); > > > > +#else > > > > + /* Encoding of the pause instruction */ > > > > + asm volatile(".4byte 0x100000F" ::: "memory"); > > > > +#endif > > > > +} > > > > > > cpu_relax() should go to include/riscv/processor.h > > > > Can the one from asm/vdso/processor.h be reused, or are there special > > considerations preventing that? > > We'd need to copy it into tools/arch/riscv/include/asm, but it could be > done. Hmm, now that I look at it, I see we're missing the barrier() call > in this kvm selftests version. > Will reuse the one from asm/vdso/processor.h and copy it to tools/arch/riscv/include/asm. > Thanks, > drew
On Fri, Sep 15, 2023 at 2:21 PM Haibo Xu <xiaobo55x@gmail.com> wrote: > > On Thu, Sep 14, 2023 at 5:52 PM Andrew Jones <ajones@ventanamicro.com> wrote: > > > > On Thu, Sep 14, 2023 at 09:37:03AM +0800, Haibo Xu wrote: > > > Add a KVM selftests to validate the Sstc timer functionality. > > > The test was ported from arm64 arch timer test. > > > > I just tried this test out. Running it over and over again on QEMU I see > > it works sometimes, but it frequently fails with the > > GUEST_ASSERT_EQ(config_iter + 1, irq_iter) assert and at least once I > > also saw the __GUEST_ASSERT(xcnt >= cmp) assert. > > > > Good catch! > > I can also reproduce this issue and it is a common problem for both > arm64 and riscv because it also happens in a arm64 Qemu VM. > > It seems like a synchronization issue between host and guest shared > variables. Will double check the test code. > > > Thanks, > > drew Hi Andrew, After several rounds of regression testing, some findings: 1. The intermittent failure also happened on ARM64 Qemu VM, and even in the initial arch_timer commit(4959d8650e9f4). 2. it didn't happen on a ARM64 HW(but a different failure occured during stress test) 3. The failure have a close relationship with TIMER_TEST_ERR_MARGIN_US(default 100), and after increasing the macro to 300, the failure couldn't reproduced in 1000 loops stress test in RISC-V Qemu VM So my suggestion is we can expose the TIMER_TEST_ERR_MARGIN_US parameter as an arch_timer test arg parameter and tune it based on a specific test environment. What's your opinion? Regards, Haibo
On Mon, Dec 04, 2023 at 10:42:24AM +0800, Haibo Xu wrote: > On Fri, Sep 15, 2023 at 2:21 PM Haibo Xu <xiaobo55x@gmail.com> wrote: > > > > On Thu, Sep 14, 2023 at 5:52 PM Andrew Jones <ajones@ventanamicro.com> wrote: > > > > > > On Thu, Sep 14, 2023 at 09:37:03AM +0800, Haibo Xu wrote: > > > > Add a KVM selftests to validate the Sstc timer functionality. > > > > The test was ported from arm64 arch timer test. > > > > > > I just tried this test out. Running it over and over again on QEMU I see > > > it works sometimes, but it frequently fails with the > > > GUEST_ASSERT_EQ(config_iter + 1, irq_iter) assert and at least once I > > > also saw the __GUEST_ASSERT(xcnt >= cmp) assert. > > > > > > > Good catch! > > > > I can also reproduce this issue and it is a common problem for both > > arm64 and riscv because it also happens in a arm64 Qemu VM. > > > > It seems like a synchronization issue between host and guest shared > > variables. Will double check the test code. > > > > > Thanks, > > > drew > > Hi Andrew, > > After several rounds of regression testing, some findings: > 1. The intermittent failure also happened on ARM64 Qemu VM, and even > in the initial arch_timer commit(4959d8650e9f4). > 2. it didn't happen on a ARM64 HW(but a different failure occured > during stress test) > 3. The failure have a close relationship with > TIMER_TEST_ERR_MARGIN_US(default 100), and after increasing > the macro to 300, the failure couldn't reproduced in 1000 loops > stress test in RISC-V Qemu VM > > So my suggestion is we can expose the TIMER_TEST_ERR_MARGIN_US > parameter as an arch_timer test arg parameter > and tune it based on a specific test environment. > > What's your opinion? The concept of "timeout for an interrupt to arrive" is always going to leave us exposed to random failures. Your suggestion of making the timeout user configurable is probably the best we can do. I would suggest also adding more descriptive failure text and a hint about trying to adjust the timeout. Or, one thing we do in kvm-unit-tests, is to reduce typical delays while allowing expected delays to be longer by looping over a shorter delay and a non-fatal check, i.e. pass = false; for (i = 0; i < 10; i++) { udelay(100); if (check(...)) { pass = true; break; } } assert(pass); We could try that approach here too. Thanks, drew
On Mon, Dec 4, 2023 at 7:32 PM Andrew Jones <ajones@ventanamicro.com> wrote: > > On Mon, Dec 04, 2023 at 10:42:24AM +0800, Haibo Xu wrote: > > On Fri, Sep 15, 2023 at 2:21 PM Haibo Xu <xiaobo55x@gmail.com> wrote: > > > > > > On Thu, Sep 14, 2023 at 5:52 PM Andrew Jones <ajones@ventanamicro.com> wrote: > > > > > > > > On Thu, Sep 14, 2023 at 09:37:03AM +0800, Haibo Xu wrote: > > > > > Add a KVM selftests to validate the Sstc timer functionality. > > > > > The test was ported from arm64 arch timer test. > > > > > > > > I just tried this test out. Running it over and over again on QEMU I see > > > > it works sometimes, but it frequently fails with the > > > > GUEST_ASSERT_EQ(config_iter + 1, irq_iter) assert and at least once I > > > > also saw the __GUEST_ASSERT(xcnt >= cmp) assert. > > > > > > > > > > Good catch! > > > > > > I can also reproduce this issue and it is a common problem for both > > > arm64 and riscv because it also happens in a arm64 Qemu VM. > > > > > > It seems like a synchronization issue between host and guest shared > > > variables. Will double check the test code. > > > > > > > Thanks, > > > > drew > > > > Hi Andrew, > > > > After several rounds of regression testing, some findings: > > 1. The intermittent failure also happened on ARM64 Qemu VM, and even > > in the initial arch_timer commit(4959d8650e9f4). > > 2. it didn't happen on a ARM64 HW(but a different failure occured > > during stress test) > > 3. The failure have a close relationship with > > TIMER_TEST_ERR_MARGIN_US(default 100), and after increasing > > the macro to 300, the failure couldn't reproduced in 1000 loops > > stress test in RISC-V Qemu VM > > > > So my suggestion is we can expose the TIMER_TEST_ERR_MARGIN_US > > parameter as an arch_timer test arg parameter > > and tune it based on a specific test environment. > > > > What's your opinion? > > The concept of "timeout for an interrupt to arrive" is always going to > leave us exposed to random failures. Your suggestion of making the > timeout user configurable is probably the best we can do. I would > suggest also adding more descriptive failure text and a hint about > trying to adjust the timeout. > > Or, one thing we do in kvm-unit-tests, is to reduce typical delays while > allowing expected delays to be longer by looping over a shorter delay and > a non-fatal check, i.e. > > pass = false; > for (i = 0; i < 10; i++) { > udelay(100); > if (check(...)) { > pass = true; > break; > } > } > assert(pass); > > We could try that approach here too. > > Thanks, > drew Thanks for the feedback, I will send out patch set v4 soon!