Message ID | 20240702132830.213384-1-aconole@redhat.com |
---|---|
Series | selftests: openvswitch: Address some flakes in the CI environment |
On Tue, Jul 02, 2024 at 09:28:28AM -0400, Aaron Conole wrote:
> We found that since some tests rely on the TCP SYN timeouts to cause flow
> misses, the default test suite timeout of 45 seconds is quick to be
> exceeded. Bump the timeout to 15 minutes.
>
> Signed-off-by: Aaron Conole <aconole@redhat.com>

Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org>

FWIW, locally I had been using a timeout of 720s.
So 900 seems entirely reasonable to me.

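For reference, the figures above line up as 15 minutes = 900 seconds, against the 45-second default and the 720 seconds used locally. The kselftest runner picks up a per-directory `settings` file containing a `timeout=` line; the sketch below only derives that line and is not taken from the patch itself. The variable names are illustrative.

```shell
# Illustrative sketch: derive the settings line for a 15 minute timeout.
# Variable names here are hypothetical, not from the patch.
timeout_minutes=15
timeout_seconds=$((timeout_minutes * 60))

# kselftest reads "timeout=<seconds>" from a per-directory settings file.
settings_line="timeout=${timeout_seconds}"
echo "$settings_line"
```
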
On Tue, Jul 02, 2024 at 09:28:30AM -0400, Aaron Conole wrote:
> The openvswitch selftest is difficult to debug for anyone that isn't
> directly familiar with the openvswitch module and the specifics of the
> test cases. Many times when something fails, the debug log will be
> sparsely populated and it takes some time to understand where a failure
> occurred.
>
> Increase the amount of details logged to the debug log by trapping all
> 'info' logs, and all 'ovs_sbx' commands.
>
> Signed-off-by: Aaron Conole <aconole@redhat.com>

Reviewed-by: Simon Horman <horms@kernel.org>

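As a rough illustration of the idea described above (record every informational message and every command that is run in the debug log), a minimal sketch might look like the following. The names `DEBUG_LOG`, `log_info` and `run_logged` are hypothetical and are not the helpers used in the selftest; the actual patch changes the existing `info` and `ovs_sbx` functions.

```shell
# Hypothetical sketch, not the selftest's real helpers.
# DEBUG_LOG defaults to a temporary file if the caller did not set one.
DEBUG_LOG="${DEBUG_LOG:-$(mktemp)}"

# Append a timestamped message to the debug log.
log_info() {
	echo "$(date +'[%m-%d %H:%M:%S]') $*" >> "$DEBUG_LOG"
}

# Log the command line, run it with output captured in the debug log,
# then log and return its exit status.
run_logged() {
	local rc=0
	log_info "run cmd: $*"
	"$@" >> "$DEBUG_LOG" 2>&1 || rc=$?
	log_info "exit status: $rc"
	return "$rc"
}
```

With something like this, a failing step leaves behind the exact command that ran, its output, and its exit status, so the point of failure can be found without re-running the test by hand.
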
On Tue, 2 Jul 2024 09:28:27 -0400 Aaron Conole wrote:
> These patches aim to make using the openvswitch testsuite more reliable.
> These should address the major sources of flakiness in the openvswitch
> test suite allowing the CI infrastructure to exercise the openvswitch
> module for patch series. There should be no change for users who simply
> run the tests (except that patch 3/3 does make some of the debugging a bit
> easier by making some output more verbose).

Hi Aaron!

The results look solid on normal builds now, but with a debug kernel
the test is failing consistently:

https://netdev.bots.linux.dev/contest.html?executor=vmksft-net-dbg&test=openvswitch-sh

Jakub Kicinski <kuba@kernel.org> writes:

> On Tue, 2 Jul 2024 09:28:27 -0400 Aaron Conole wrote:
>> These patches aim to make using the openvswitch testsuite more reliable.
>> These should address the major sources of flakiness in the openvswitch
>> test suite allowing the CI infrastructure to exercise the openvswitch
>> module for patch series. There should be no change for users who simply
>> run the tests (except that patch 3/3 does make some of the debugging a bit
>> easier by making some output more verbose).
>
> Hi Aaron!
>
> The results look solid on normal builds now, but with a debug kernel
> the test is failing consistently:
>
> https://netdev.bots.linux.dev/contest.html?executor=vmksft-net-dbg&test=openvswitch-sh

Yes - it shows a test case issue with the upcall and psample tests.

Adrian and I discussed the correct approach would be using a wait_for
instead of just sleeping, because it seems the dbg environment might be
too racy. I think he is working on a follow up to submit after the
psample work gets merged - we were hoping not to hold that patch series
up with more potential conflicts or merge issues if that's okay.

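To make the "wait_for instead of just sleeping" idea concrete, here is a minimal sketch of a polling helper, assuming bash and a `sleep` that accepts fractional seconds. The function name, its arguments and the 0.1 second poll interval are assumptions for illustration; the follow-up patch mentioned above may look quite different.

```shell
# wait_for TIMEOUT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-run COMMAND until it succeeds or the deadline
# passes. Returns 0 on success and 1 on timeout.
wait_for() {
	local timeout="$1"; shift
	local deadline=$(( $(date +%s) + timeout ))

	while ! "$@"; do
		# Give up once the wall-clock deadline has been reached.
		if [ "$(date +%s)" -ge "$deadline" ]; then
			return 1
		fi
		sleep 0.1
	done
	return 0
}
```

A call such as `wait_for 10 grep -q "some pattern" "$logfile"` (pattern and file name are placeholders) returns as soon as the condition holds on a fast machine, while a slow debug kernel gets up to the full timeout, which a fixed `sleep` cannot offer.
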
On Fri, 05 Jul 2024 09:49:12 -0400 Aaron Conole wrote:
> > The results look solid on normal builds now, but with a debug kernel
> > the test is failing consistently:
> >
> > https://netdev.bots.linux.dev/contest.html?executor=vmksft-net-dbg&test=openvswitch-sh
>
> Yes - it shows a test case issue with the upcall and psample tests.
>
> Adrian and I discussed the correct approach would be using a wait_for
> instead of just sleeping, because it seems the dbg environment might be
> too racy. I think he is working on a follow up to submit after the
> psample work gets merged - we were hoping not to hold that patch series
> up with more potential conflicts or merge issues if that's okay.

Makes sense, thanks!

On Fri, Jul 05, 2024 at 09:49:12AM GMT, Aaron Conole wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
>
> > On Tue, 2 Jul 2024 09:28:27 -0400 Aaron Conole wrote:
> >> These patches aim to make using the openvswitch testsuite more reliable.
> >> These should address the major sources of flakiness in the openvswitch
> >> test suite allowing the CI infrastructure to exercise the openvswitch
> >> module for patch series. There should be no change for users who simply
> >> run the tests (except that patch 3/3 does make some of the debugging a bit
> >> easier by making some output more verbose).
> >
> > Hi Aaron!
> >
> > The results look solid on normal builds now, but with a debug kernel
> > the test is failing consistently:
> >
> > https://netdev.bots.linux.dev/contest.html?executor=vmksft-net-dbg&test=openvswitch-sh
>
> Yes - it shows a test case issue with the upcall and psample tests.
>
> Adrian and I discussed the correct approach would be using a wait_for
> instead of just sleeping, because it seems the dbg environment might be
> too racy. I think he is working on a follow up to submit after the
> psample work gets merged - we were hoping not to hold that patch series
> up with more potential conflicts or merge issues if that's okay.
>

Yes. I am working on a patch to solve the failures in slow systems.

Thanks.
Adrián