| Field | Value |
|---|---|
| Message ID | 20250507170901.151548-1-ebiggers@kernel.org |
| State | New |
| Series | crypto: arm64/sha256 - fix build when CONFIG_PREEMPT_VOLUNTARY=y |
On Thu, May 08, 2025 at 01:12:28PM +0200, Ard Biesheuvel wrote:
> On Wed, 7 May 2025 at 19:09, Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > From: Eric Biggers <ebiggers@google.com>
> >
> > Fix the build of sha256-ce.S when CONFIG_PREEMPT_VOLUNTARY=y by passing
> > the correct label to the cond_yield macro.  Also adjust the code to
> > execute only one branch instruction when CONFIG_PREEMPT_VOLUNTARY=n.
> >
> > Fixes: 6e36be511d28 ("crypto: arm64/sha256 - implement library instead of shash")
> > Reported-by: kernel test robot <lkp@intel.com>
> > Closes: https://lore.kernel.org/oe-kbuild-all/202505071811.yYpLUbav-lkp@intel.com/
> > Signed-off-by: Eric Biggers <ebiggers@google.com>
> > ---
> >  arch/arm64/lib/crypto/sha256-ce.S | 7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm64/lib/crypto/sha256-ce.S b/arch/arm64/lib/crypto/sha256-ce.S
> > index a8461d6dad634..f3e21c6d87d2e 100644
> > --- a/arch/arm64/lib/crypto/sha256-ce.S
> > +++ b/arch/arm64/lib/crypto/sha256-ce.S
> > @@ -121,14 +121,15 @@ CPU_LE( rev32           v19.16b, v19.16b        )
> >
> >         /* update state */
> >         add             dgav.4s, dgav.4s, dg0v.4s
> >         add             dgbv.4s, dgbv.4s, dg1v.4s
> >
> > +       /* return early if voluntary preemption is needed */
> > +       cond_yield      1f, x5, x6
> > +
>
> This will yield needlessly when the condition hits during the final iteration.
>
> >         /* handled all input blocks? */
> > -       cbz             x2, 1f
> > -       cond_yield      3f, x5, x6
> > -       b               0b
> > +       cbnz            x2, 0b

cond_yield doesn't actually yield, though.  It just checks whether
yielding is needed.  So the behavior is the same: on the last iteration
this function returns 0 (i.e. 0 blocks remaining), regardless of whether
it gets to the end by jumping there due to TSK_TI_PREEMPT being set or
by falling through after seeing nblocks == 0.

We could keep the nblocks == 0 check first, but the cond_yield check is
lightweight and it's probably better to avoid the extra branch
instruction on every other iteration.

- Eric
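To make the control flow under discussion concrete, here is a rough C model of the assembly loop as it looks after the patch. This is an illustrative sketch only: cond_yield is really an assembler macro, and the helper names (process_one_block, preemption_needed, sha256_ce_model) are hypothetical stand-ins, not kernel functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the SHA-256 rounds and for the
 * cond_yield check; stubbed out so the sketch compiles. */
static void process_one_block(uint32_t state[8], const uint8_t *block)
{
        (void)state; (void)block;
}

static bool preemption_needed(void)
{
        return false;
}

/*
 * Rough model of the asm loop after the patch.  Returns the number of
 * blocks still unprocessed: 0 when the input was fully consumed,
 * nonzero when the routine bailed out early for voluntary preemption.
 */
static size_t sha256_ce_model(uint32_t state[8], const uint8_t *data,
                              size_t nblocks)
{
        do {                                    /* label 0: */
                process_one_block(state, data);
                data += 64;                     /* SHA256_BLOCK_SIZE */
                nblocks--;
                if (preemption_needed())        /* cond_yield 1f, x5, x6 */
                        break;                  /* jump to label 1 */
        } while (nblocks != 0);                 /* cbnz x2, 0b */

        /* label 1: the asm stores the state and returns x2 */
        return nblocks;
}
```

On the final iteration nblocks is already 0 when the cond_yield check runs, so breaking out early and falling through return the same value, which is Eric's point: the early-exit path costs nothing extra when the work is already done.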
```diff
diff --git a/arch/arm64/lib/crypto/sha256-ce.S b/arch/arm64/lib/crypto/sha256-ce.S
index a8461d6dad634..f3e21c6d87d2e 100644
--- a/arch/arm64/lib/crypto/sha256-ce.S
+++ b/arch/arm64/lib/crypto/sha256-ce.S
@@ -121,14 +121,15 @@ CPU_LE( rev32           v19.16b, v19.16b        )
 
         /* update state */
         add             dgav.4s, dgav.4s, dg0v.4s
         add             dgbv.4s, dgbv.4s, dg1v.4s
 
+        /* return early if voluntary preemption is needed */
+        cond_yield      1f, x5, x6
+
         /* handled all input blocks? */
-        cbz             x2, 1f
-        cond_yield      3f, x5, x6
-        b               0b
+        cbnz            x2, 0b
 
         /* store new state */
 1:      st1             {dgav.4s, dgbv.4s}, [x0]
         mov             x0, x2
         ret
```
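For context on why returning the remaining block count is enough to implement voluntary preemption: the C glue code re-invokes the assembly routine until it reports 0 blocks left, ending the kernel-mode NEON section in between, and that is where the scheduler can actually preempt. The sketch below assumes that pattern; sha256_blocks_sketch and the asm entry-point name are illustrative rather than copied from the kernel source, while kernel_neon_begin()/kernel_neon_end() are the real arm64 APIs for claiming the NEON unit.

```c
#include <stddef.h>
#include <stdint.h>

/* Real arm64 kernel APIs, declared here so the sketch is self-contained. */
void kernel_neon_begin(void);
void kernel_neon_end(void);

/* The asm entry point; returns how many blocks remain unprocessed.
 * (Name illustrative.) */
size_t sha256_ce_transform(uint32_t state[8], const uint8_t *data,
                           size_t nblocks);

/* Illustrative caller-side loop: keep calling the asm routine until it
 * has consumed everything, dropping the NEON section (and with it the
 * preempt-disabled region) whenever it returns early. */
static void sha256_blocks_sketch(uint32_t state[8], const uint8_t *data,
                                 size_t nblocks)
{
        do {
                size_t rem;

                kernel_neon_begin();    /* disables preemption */
                rem = sha256_ce_transform(state, data, nblocks);
                kernel_neon_end();      /* preemption can happen here */

                data += (nblocks - rem) * 64;   /* 64 == SHA256_BLOCK_SIZE */
                nblocks = rem;
        } while (nblocks);
}
```

This also suggests why the stale 3f label broke only the CONFIG_PREEMPT_VOLUNTARY=y build: per the commit message, the cond_yield macro takes effect in that configuration, so the reference to the nonexistent label was presumably never assembled otherwise.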