From patchwork Tue Jan 10 21:00:51 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 640911
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Cc: Carlos O'Donell
Subject: [PATCH v6 02/17] Parameterize OP_T_THRES from memcopy.h
Date: Tue, 10 Jan 2023 18:00:51 -0300
Message-Id: <20230110210106.1457686-3-adhemerval.zanella@linaro.org>
In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
List-Id: Libc-alpha mailing list
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella

From: Richard Henderson

Move OP_T_THRES out of memcopy.h into its own header, string-opthr.h,
and adjust each architecture that redefines it.

Checked with a build and check with run-built-tests=no for all major
Linux ABIs.

Co-authored-by: Adhemerval Zanella
Reviewed-by: Carlos O'Donell
---
 string/memcmp.c                            |  3 ---
 sysdeps/generic/memcopy.h                  |  4 +---
 sysdeps/generic/string-opthr.h             | 25 ++++++++++++++++++++++
 sysdeps/i386/memcopy.h                     |  3 ---
 sysdeps/i386/string-opthr.h                | 25 ++++++++++++++++++++++
 sysdeps/m68k/memcopy.h                     |  3 ---
 sysdeps/powerpc/powerpc32/power4/memcopy.h |  5 -----
 7 files changed, 51 insertions(+), 17 deletions(-)
 create mode 100644 sysdeps/generic/string-opthr.h
 create mode 100644 sysdeps/i386/string-opthr.h

diff --git a/string/memcmp.c b/string/memcmp.c
index ea0fa03e1c..047ca4f98e 100644
--- a/string/memcmp.c
+++ b/string/memcmp.c
@@ -48,9 +48,6 @@
    and store. Must be an unsigned type. */
 # define OPSIZ (sizeof (op_t))
 
-/* Threshold value for when to enter the unrolled loops. */
-# define OP_T_THRES 16
-
 /* Type to use for unaligned operations. */
 typedef unsigned char byte;
 
diff --git a/sysdeps/generic/memcopy.h b/sysdeps/generic/memcopy.h
index b5ffa4d114..e9b3f227b2 100644
--- a/sysdeps/generic/memcopy.h
+++ b/sysdeps/generic/memcopy.h
@@ -57,6 +57,7 @@
 
 /* Type to use for aligned memory operations. */
 #include
+#include
 #define OPSIZ (sizeof (op_t))
 
 /* Type to use for unaligned operations. */
@@ -188,9 +189,6 @@ extern void _wordcopy_bwd_dest_aligned (long int, long int, size_t)
 
 #endif
 
-/* Threshold value for when to enter the unrolled loops. */
-#define OP_T_THRES 16
-
 /* Set to 1 if memcpy is safe to use for forward-copying memmove with
    overlapping addresses. This is 0 by default because memcpy implementations
    are generally not safe for overlapping addresses. */
diff --git a/sysdeps/generic/string-opthr.h b/sysdeps/generic/string-opthr.h
new file mode 100644
index 0000000000..6f10a98edd
--- /dev/null
+++ b/sysdeps/generic/string-opthr.h
@@ -0,0 +1,25 @@
+/* Define a threshold for word access. Generic version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef _STRING_OPTHR_H
+#define _STRING_OPTHR_H 1
+
+/* Threshold value for when to enter the unrolled loops. */
+#define OP_T_THRES 16
+
+#endif /* string-opthr.h */
diff --git a/sysdeps/i386/memcopy.h b/sysdeps/i386/memcopy.h
index 4f82689b84..1aa7c3a850 100644
--- a/sysdeps/i386/memcopy.h
+++ b/sysdeps/i386/memcopy.h
@@ -18,9 +18,6 @@
 
 #include
 
-#undef OP_T_THRES
-#define OP_T_THRES 8
-
 #undef BYTE_COPY_FWD
 #define BYTE_COPY_FWD(dst_bp, src_bp, nbytes) \
   do { \
diff --git a/sysdeps/i386/string-opthr.h b/sysdeps/i386/string-opthr.h
new file mode 100644
index 0000000000..ed3e4b2ddb
--- /dev/null
+++ b/sysdeps/i386/string-opthr.h
@@ -0,0 +1,25 @@
+/* Define a threshold for word access. i386 version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef I386_STRING_OPTHR_H
+#define I386_STRING_OPTHR_H 1
+
+/* Threshold value for when to enter the unrolled loops. */
+#define OP_T_THRES 8
+
+#endif /* I386_STRING_OPTHR_H */
diff --git a/sysdeps/m68k/memcopy.h b/sysdeps/m68k/memcopy.h
index accd81c1c3..610577071d 100644
--- a/sysdeps/m68k/memcopy.h
+++ b/sysdeps/m68k/memcopy.h
@@ -20,9 +20,6 @@
 
 #if defined(__mc68020__) || defined(mc68020)
 
-#undef OP_T_THRES
-#define OP_T_THRES 16
-
 /* WORD_COPY_FWD and WORD_COPY_BWD are not symmetric on the 68020,
    because of its weird instruction overlap characteristics. */
 
diff --git a/sysdeps/powerpc/powerpc32/power4/memcopy.h b/sysdeps/powerpc/powerpc32/power4/memcopy.h
index 384f33b029..872157e485 100644
--- a/sysdeps/powerpc/powerpc32/power4/memcopy.h
+++ b/sysdeps/powerpc/powerpc32/power4/memcopy.h
@@ -50,11 +50,6 @@
 
    [I fail to understand. I feel stupid. --roland]
 */
-
-/* Threshold value for when to enter the unrolled loops. */
-#undef OP_T_THRES
-#define OP_T_THRES 16
-
 /* Copy exactly NBYTES bytes from SRC_BP to DST_BP,
    without any assumptions about alignment of the pointers. */
 #undef BYTE_COPY_FWD

From patchwork Tue Jan 10 21:00:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 640896
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Cc: Adhemerval Zanella Netto
Subject: [PATCH v6 03/17] Add string-maskoff.h generic header
Date: Tue, 10 Jan 2023 18:00:52 -0300
Message-Id: <20230110210106.1457686-4-adhemerval.zanella@linaro.org>
In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
List-Id: Libc-alpha mailing list
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella

From: Adhemerval Zanella Netto

Macros to operate on unaligned access for string operations:

- create_mask: create a mask based on the pointer alignment that sets
  up non-zero bytes before the beginning of the word, so a following
  operation (such as find zero) can ignore these bytes.
- check_mask: return the mask WORD shifted based on the S_INT address
  value, to ignore values not present in the aligned word read.
- repeat_bytes: set up a word with each byte being c_in.
- highbit_mask: based on a mask created by create_mask, mask off the
  high bit of each byte in the mask.
- word_containing: return the address of the op_t word containing the
  address P.

These macros are meant to be used in optimized vectorized string
implementations.
---
 sysdeps/generic/string-maskoff.h | 85 ++++++++++++++++++++++++++++++++
 1 file changed, 85 insertions(+)
 create mode 100644 sysdeps/generic/string-maskoff.h

diff --git a/sysdeps/generic/string-maskoff.h b/sysdeps/generic/string-maskoff.h
new file mode 100644
index 0000000000..6ad4e0b6f9
--- /dev/null
+++ b/sysdeps/generic/string-maskoff.h
@@ -0,0 +1,85 @@
+/* Mask off bits. Generic C version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef _STRING_MASKOFF_H
+#define _STRING_MASKOFF_H 1
+
+#include
+#include
+#include
+#include
+#include
+
+/* Provide a mask based on the pointer alignment that sets up non-zero
+   bytes before the beginning of the word. It is used to mask off
+   undesirable bits from an aligned read from an unaligned pointer.
+   For instance, on a 64 bits machine with a pointer alignment of
+   3 the function returns 0x0000000000ffffff for LE and 0xffffff0000000000
+   for BE (meaning to mask off the initial 3 bytes). */
+static __always_inline op_t
+create_mask (uintptr_t i)
+{
+  i = i % sizeof (op_t);
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    return ~(((op_t)-1) << (i * CHAR_BIT));
+  else
+    return ~(((op_t)-1) >> (i * CHAR_BIT));
+}
+
+/* Return the mask WORD shifted based on S_INT address value, to ignore
+   values not presented in the aligned word read. */
+static __always_inline op_t
+check_mask (op_t word, uintptr_t s_int)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    return word >> (CHAR_BIT * (s_int % sizeof (s_int)));
+  else
+    return word << (CHAR_BIT * (s_int % sizeof (s_int)));
+}
+
+/* Setup an word with each byte being c_in. For instance, on a 64 bits
+   machine with input as 0xce the functions returns 0xcececececececece. */
+static __always_inline op_t
+repeat_bytes (unsigned char c_in)
+{
+  return ((op_t)-1 / 0xff) * c_in;
+}
+
+/* Based on mask created by 'create_mask', mask off the high bit of each
+   byte in the mask. It is used to mask off undesirable bits from an
+   aligned read from an unaligned pointer, and also taking care to avoid
+   match possible bytes meant to be matched. For instance, on a 64 bits
+   machine with a mask created from a pointer with an alignment of 3
+   (0x0000000000ffffff) the function returns 0x7f7f7f0000000000 for BE
+   and 0x00000000007f7f7f for LE. */
+static __always_inline op_t
+highbit_mask (op_t m)
+{
+  return m & repeat_bytes (0x7f);
+}
+
+/* Return the address of the op_t word containing the address P. For
+   instance on address 0x0011223344556677 and op_t with size of 8,
+   it returns 0x0011223344556670. */
+static __always_inline op_t *
+word_containing (char const *p)
+{
+  return (op_t *) ((uintptr_t) p & -sizeof(p));
+}
+
+#endif /* _STRING_MASKOFF_H */

From patchwork Tue Jan 10 21:00:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 640899
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Cc: Adhemerval Zanella Netto
Subject: [PATCH v6 04/17] Add string vectorized find and detection functions
Date: Tue, 10 Jan 2023 18:00:53 -0300
Message-Id: <20230110210106.1457686-5-adhemerval.zanella@linaro.org>
In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
List-Id: Libc-alpha mailing list
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella

From: Adhemerval Zanella Netto

This patch adds generic string find and detection functions meant to be
used in generic vectorized string implementations. The idea is to
decompose the basic string operations so each architecture can
reimplement them if it provides any specialized hardware instruction.

The 'string-fza.h' provides zero byte detection functions
(find_zero_low, find_zero_all, find_eq_low, find_eq_all,
find_zero_eq_low, find_zero_eq_all, find_zero_ne_low, and
find_zero_ne_all). They are used by the functions provided by both
'string-fzb.h' and 'string-fzi.h'.

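For illustration only (this is not part of the patch): a minimal
standalone sketch of how find_zero_low and find_zero_all differ. It
assumes a 64-bit op_t, with uint64_t standing in for the real type, and
restates repeat_bytes locally; check_examples is a hypothetical helper
added just for this example. In the worked word, bytes 2 and 4 (counted
from the least significant byte) are zero. find_zero_all flags exactly
those two, while find_zero_low also flags byte 3, because the borrow out
of byte 2 propagates upward. That is why only the lowest flagged byte of
find_zero_low is meaningful.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for the op_t type used by the patch.  */
typedef uint64_t op_t;

/* Replicate C_IN into every byte of a word.  */
static inline op_t
repeat_bytes (unsigned char c_in)
{
  return ((op_t) -1 / 0xff) * c_in;
}

/* Non-zero if any byte of X is zero; only the lowest flagged byte is
   reliable, because a borrow can flag bytes above a zero byte.  */
static inline op_t
find_zero_low (op_t x)
{
  op_t lsb = repeat_bytes (0x01);
  op_t msb = repeat_bytes (0x80);
  return (x - lsb) & ~x & msb;
}

/* Sets 0x80 in exactly those bytes of X that are zero.  */
static inline op_t
find_zero_all (op_t x)
{
  op_t m = repeat_bytes (0x7f);
  return ~(((x & m) + m) | x | m);
}

/* Hypothetical helper: worked example for one word.  */
static inline void
check_examples (void)
{
  op_t x = UINT64_C (0x1122330001005566);

  /* Bytes 2, 3 and 4 are flagged; byte 3 (value 0x01) is a false
     positive caused by the borrow from byte 2.  */
  assert (find_zero_low (x) == UINT64_C (0x0000008080800000));

  /* Only the two bytes that are really zero are flagged.  */
  assert (find_zero_all (x) == UINT64_C (0x0000008000800000));
}
```

Calling check_examples from a test program exercises both functions. An
architecture with a dedicated byte-compare instruction would presumably
replace these bodies while keeping the same contract.
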
The 'string-fzb.h' provides boolean zero byte detection with the functions: - has_zero: determine if any byte within a word is zero. - has_eq: determine byte equality between two words. - has_zero_eq: determine if any byte within a word is zero along with byte equality between two words. The 'string-fzi.h' provides zero byte detection along with its positions: - index_first_zero: return index of first zero byte within a word. - index_first_eq: return index of first byte different between two words. - index_first_zero_eq: return index of first zero byte within a word or first byte different between two words. - index_first_zero_ne: return index of first zero byte within a word or first byte equal between two words. - index_last_zero: return index of last zero byte within a word. - index_last_eq: return index of last byte different between two words. Co-authored-by: Richard Henderson --- sysdeps/generic/string-extbyte.h | 37 +++++++++ sysdeps/generic/string-fza.h | 102 +++++++++++++++++++++++ sysdeps/generic/string-fzb.h | 49 +++++++++++ sysdeps/generic/string-fzi.h | 135 +++++++++++++++++++++++++++++++ sysdeps/generic/string-maskoff.h | 9 +-- 5 files changed, 327 insertions(+), 5 deletions(-) create mode 100644 sysdeps/generic/string-extbyte.h create mode 100644 sysdeps/generic/string-fza.h create mode 100644 sysdeps/generic/string-fzb.h create mode 100644 sysdeps/generic/string-fzi.h diff --git a/sysdeps/generic/string-extbyte.h b/sysdeps/generic/string-extbyte.h new file mode 100644 index 0000000000..38b4674dca --- /dev/null +++ b/sysdeps/generic/string-extbyte.h @@ -0,0 +1,37 @@ +/* Extract by from memory word. Generic C version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef _STRING_EXTBYTE_H +#define _STRING_EXTBYTE_H 1 + +#include +#include +#include + +/* Extract the byte at index IDX from word X, with index 0 being the + least significant byte. */ +static __always_inline unsigned char +extractbyte (op_t x, unsigned int idx) +{ + if (__BYTE_ORDER == __LITTLE_ENDIAN) + return x >> (idx * CHAR_BIT); + else + return x >> (sizeof (x) - 1 - idx) * CHAR_BIT; +} + +#endif /* _STRING_EXTBYTE_H */ diff --git a/sysdeps/generic/string-fza.h b/sysdeps/generic/string-fza.h new file mode 100644 index 0000000000..1ec51f7f3c --- /dev/null +++ b/sysdeps/generic/string-fza.h @@ -0,0 +1,102 @@ +/* Basic zero byte detection. Generic C version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. 
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _STRING_FZA_H
+#define _STRING_FZA_H 1
+
+#include
+#include
+#include
+
+/* This function returns non-zero if any byte in X is zero.
+   More specifically, at least one bit set within the least significant
+   byte that was zero; other bytes within the word are indeterminate.  */
+static __always_inline op_t
+find_zero_low (op_t x)
+{
+  /* This expression comes from
+       https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord
+     Subtracting 1 sets 0x80 in a byte that was 0; anding ~x clears
+     0x80 in a byte that was >= 128; anding 0x80 isolates that test bit.  */
+  op_t lsb = repeat_bytes (0x01);
+  op_t msb = repeat_bytes (0x80);
+  return (x - lsb) & ~x & msb;
+}
+
+/* This function returns at least one bit set within every byte of X that
+   is zero.  The result is exact in that, unlike find_zero_low, all bytes
+   are determinate.  This is usually used for finding the index of the
+   most significant byte that was zero.  */
+static __always_inline op_t
+find_zero_all (op_t x)
+{
+  /* For each byte, find not-zero by
+     (0) And 0x7f so that we cannot carry between bytes,
+     (1) Add 0x7f so that non-zero carries into 0x80,
+     (2) Or in the original byte (which might have had 0x80 set).
+     Then invert and mask such that 0x80 is set iff that byte was zero.  */
+  op_t m = repeat_bytes (0x7f);
+  return ~(((x & m) + m) | x | m);
+}
+
+/* With similar caveats, identify bytes that are equal between X1 and X2.  */
+static __always_inline op_t
+find_eq_low (op_t x1, op_t x2)
+{
+  return find_zero_low (x1 ^ x2);
+}
+
+static __always_inline op_t
+find_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1 ^ x2);
+}
+
+/* With similar caveats, identify zero bytes in X1 and bytes that are
+   equal between X1 and X2.  */
+static __always_inline op_t
+find_zero_eq_low (op_t x1, op_t x2)
+{
+  return find_zero_low (x1) | find_zero_low (x1 ^ x2);
+}
+
+static __always_inline op_t
+find_zero_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1) | find_zero_all (x1 ^ x2);
+}
+
+/* With similar caveats, identify zero bytes in X1 and bytes that are
+   not equal between X1 and X2.  */
+static __always_inline op_t
+find_zero_ne_low (op_t x1, op_t x2)
+{
+  return (~find_zero_eq_low (x1, x2)) + 1;
+}
+
+static __always_inline op_t
+find_zero_ne_all (op_t x1, op_t x2)
+{
+  op_t m = repeat_bytes (0x7f);
+  op_t eq = x1 ^ x2;
+  op_t nz1 = ((x1 & m) + m) | x1;
+  op_t ne2 = ((eq & m) + m) | eq;
+  return (ne2 | ~nz1) & ~m;
+}
+
+#endif /* _STRING_FZA_H */
diff --git a/sysdeps/generic/string-fzb.h b/sysdeps/generic/string-fzb.h
new file mode 100644
index 0000000000..42de500d67
--- /dev/null
+++ b/sysdeps/generic/string-fzb.h
@@ -0,0 +1,49 @@
+/* Zero byte detection, boolean.  Generic C version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _STRING_FZB_H
+#define _STRING_FZB_H 1
+
+#include
+#include
+
+/* Determine if any byte within X is zero.  This is a pure boolean test.  */
+
+static __always_inline _Bool
+has_zero (op_t x)
+{
+  return find_zero_low (x) != 0;
+}
+
+/* Likewise, but for byte equality between X1 and X2.  */
+
+static __always_inline _Bool
+has_eq (op_t x1, op_t x2)
+{
+  return find_eq_low (x1, x2) != 0;
+}
+
+/* Likewise, but for zeros in X1 and equal bytes between X1 and X2.  */
+
+static __always_inline _Bool
+has_zero_eq (op_t x1, op_t x2)
+{
+  return find_zero_eq_low (x1, x2);
+}
+
+#endif /* _STRING_FZB_H */
diff --git a/sysdeps/generic/string-fzi.h b/sysdeps/generic/string-fzi.h
new file mode 100644
index 0000000000..b1fd4d34b3
--- /dev/null
+++ b/sysdeps/generic/string-fzi.h
@@ -0,0 +1,135 @@
+/* Zero byte detection; indexes.  Generic C version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _STRING_FZI_H
+#define _STRING_FZI_H 1
+
+#include
+#include
+#include
+
+static __always_inline int
+clz (op_t c)
+{
+  if (sizeof (op_t) == sizeof (unsigned long))
+    return __builtin_clzl (c);
+  else
+    return __builtin_clzll (c);
+}
+
+static __always_inline int
+ctz (op_t c)
+{
+  if (sizeof (op_t) == sizeof (unsigned long))
+    return __builtin_ctzl (c);
+  else
+    return __builtin_ctzll (c);
+}
+
+/* A subroutine for the index_zero functions.  Given a test word C, return
+   the (memory order) index of the first byte (in memory order) that is
+   non-zero.  */
+static __always_inline unsigned int
+index_first (op_t c)
+{
+  int r;
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    r = ctz (c);
+  else
+    r = clz (c);
+  return r / CHAR_BIT;
+}
+
+/* Similarly, but return the (memory order) index of the last byte that is
+   non-zero.  */
+static __always_inline unsigned int
+index_last (op_t c)
+{
+  int r;
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    r = clz (c);
+  else
+    r = ctz (c);
+  return sizeof (op_t) - 1 - (r / CHAR_BIT);
+}
+
+/* Given a word X that is known to contain a zero byte, return the index of
+   the first such within the word in memory order.  */
+static __always_inline unsigned int
+index_first_zero (op_t x)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x = find_zero_low (x);
+  else
+    x = find_zero_all (x);
+  return index_first (x);
+}
+
+/* Similarly, but perform the search for byte equality between X1 and X2.  */
+static __always_inline unsigned int
+index_first_eq (op_t x1, op_t x2)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x1 = find_eq_low (x1, x2);
+  else
+    x1 = find_eq_all (x1, x2);
+  return index_first (x1);
+}
+
+/* Similarly, but perform the search for zero within X1 or equality between
+   X1 and X2.  */
+static __always_inline unsigned int
+index_first_zero_eq (op_t x1, op_t x2)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x1 = find_zero_eq_low (x1, x2);
+  else
+    x1 = find_zero_eq_all (x1, x2);
+  return index_first (x1);
+}
+
+/* Similarly, but perform the search for zero within X1 or inequality between
+   X1 and X2.  */
+static __always_inline unsigned int
+index_first_zero_ne (op_t x1, op_t x2)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x1 = find_zero_ne_low (x1, x2);
+  else
+    x1 = find_zero_ne_all (x1, x2);
+  return index_first (x1);
+}
+
+/* Similarly, but search for the last zero within X.  */
+static __always_inline unsigned int
+index_last_zero (op_t x)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x = find_zero_all (x);
+  else
+    x = find_zero_low (x);
+  return index_last (x);
+}
+
+static __always_inline unsigned int
+index_last_eq (op_t x1, op_t x2)
+{
+  return index_last_zero (x1 ^ x2);
+}
+
+#endif /* _STRING_FZI_H */
diff --git a/sysdeps/generic/string-maskoff.h b/sysdeps/generic/string-maskoff.h
index 6ad4e0b6f9..16d3cc2ddb 100644
--- a/sysdeps/generic/string-maskoff.h
+++ b/sysdeps/generic/string-maskoff.h
@@ -23,7 +23,6 @@
 #include
 #include
 #include
-#include
 
 /* Provide a mask based on the pointer alignment that sets up non-zero
    bytes before the beginning of the word.  It is used to mask off
@@ -44,12 +43,12 @@ create_mask (uintptr_t i)
 /* Return the mask WORD shifted based on S_INT address value, to ignore
    values not presented in the aligned word read.  */
 static __always_inline op_t
-check_mask (op_t word, uintptr_t s_int)
+check_mask (op_t word, uintptr_t s)
 {
   if (__BYTE_ORDER == __LITTLE_ENDIAN)
-    return word >> (CHAR_BIT * (s_int % sizeof (s_int)));
+    return word >> (CHAR_BIT * (s % sizeof (op_t)));
   else
-    return word << (CHAR_BIT * (s_int % sizeof (s_int)));
+    return word << (CHAR_BIT * (s % sizeof (op_t)));
 }
 
 /* Setup an word with each byte being c_in.
For instance, on a 64 bits @@ -79,7 +78,7 @@ highbit_mask (op_t m) static __always_inline op_t * word_containing (char const *p) { - return (op_t *) ((uintptr_t) p & -sizeof(p)); + return (op_t *) ((op_t) p & -sizeof(p)); } #endif /* _STRING_MASKOFF_H */ From patchwork Tue Jan 10 21:00:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 640897 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp2915825pvb; Tue, 10 Jan 2023 13:01:58 -0800 (PST) X-Google-Smtp-Source: AMrXdXtv1aTs49a5evlWkKSmWHSAn3fbcr5Jn7jg2uqgNhGv3TYOM1Zmyntbov5qpKiVFnObKFC8 X-Received: by 2002:a17:906:f84d:b0:7c1:28a7:f79b with SMTP id ks13-20020a170906f84d00b007c128a7f79bmr49916219ejb.59.1673384518264; Tue, 10 Jan 2023 13:01:58 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673384518; cv=none; d=google.com; s=arc-20160816; b=B33jgXYM7XF/E1Ra7Bj0b3aTZvW1fblGKN2P8+4xaelK+0zUgH6Ufliue5TO67YdbR 5DRfTnHAHK7/im35P2/Jq6eKS4TQkrPMGqgdZlV6cyLq+mHx3eRg1u9yZ2o0kOFVdDVB fsAFKXQOAzIv0cKd6RQfqT0wSRa4SDrbU782hgg4WR4UOeIaluKSLsN47V816jI7O6yP bpzakTCx0A0RXwjRPrxIVaUDTCvkY82aXIAncnK2Bmo4c7eHJizl4xs4kPYTxCc7iS81 0TOkogp+rMuXxkOb6FkZrhKydu67qOTSiWuo5Y8Sy5ISkbdIdqRktRAuu+XhELfrMemY SfvQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=k8tqbHLdBSZ3p68T8t3IP3XZ3ntWWpTLKxx6X1fIzvo=; b=vjdQWCH10pT1CpcvCWLW/vrgN0rDsn2Fj0S0lcT03VEqJxlRHgkmypyViWEvxYbQNF ykafc9cmb+XUr4kP3PCoVmmW2vd083n1hcJMzNajPzUpl1UDZhG+mey6I1ZDdpUVnKr0 na4Dq6k8KnUkJ0LiMCHLFOnlBCB9dNs7Z7ZabKq3Why/6/cgRCwN10YfayovzYmYRH3e qwPLZit15iBjJtgq5t2vV1w4TpzAV5SnAZpl/7xX4soWnrSPQCr8CpKzadPwfexBT9An 
LijQXgddnsJaYXAQyXnV5bNkIz8wa5R3e7z1djASwd3KB6n309NtIxY99pjN9wl/t7Nj AWLA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=EBAK9uLM; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Return-Path: Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id qa9-20020a170907868900b007c10a0c6f44si13198664ejc.623.2023.01.10.13.01.57 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Jan 2023 13:01:58 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=EBAK9uLM; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D0DBE385B507 for ; Tue, 10 Jan 2023 21:01:56 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org D0DBE385B507 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1673384516; bh=k8tqbHLdBSZ3p68T8t3IP3XZ3ntWWpTLKxx6X1fIzvo=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=EBAK9uLMGADQCYHon0A/0Tsxu+SjoRhj3ksbBXlmYX4lao7YD4V4fUHwum0EwyGWc 
Ur64d1n7sd8zZ/VWl64TcU1s983osw4v1eagvOexkjh054JiYozMt6xDlGCwJEyMXS Cvelods5viQnp1q0jiZV0bVxYxLh2t7GEFRC5C7g= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oo1-xc32.google.com (mail-oo1-xc32.google.com [IPv6:2607:f8b0:4864:20::c32]) by sourceware.org (Postfix) with ESMTPS id 0DEB6385B515 for ; Tue, 10 Jan 2023 21:01:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0DEB6385B515 Received: by mail-oo1-xc32.google.com with SMTP id h3-20020a4aa283000000b004ead187bd6eso3541465ool.5 for ; Tue, 10 Jan 2023 13:01:27 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=k8tqbHLdBSZ3p68T8t3IP3XZ3ntWWpTLKxx6X1fIzvo=; b=5QVYY+u/7RE7Z0poV1OGaxXYZQeOsHHkdNgGdHGeWZoSNz7tYqHP/GvQLu3lLTVrBi 1m+23vNP4Bp1qtji5ZU0zYXScqibWWarQEMr/2Hjo9EAFfsAJGPBtvbZ6LXUCZvYRPVp w6QMvI+IrJfaGwgsKVVK5F6TDLTgCjdEQ5Xn+nvJ6blh3HuEVcBWmDqO+sAlSh+BBROP jSdtYWau2nEnIxfxsozj2NDdKo5I1nyT35DNhKme40qYqN8PUFZWZ+zxzBKteaWowye7 Iwtqe4gWfdLoMaLB+OPtJiAQtWTIkZmI04s5F/YwLNn3uNsbtE0PE6f6cwgd9ehF20Ab FYAw== X-Gm-Message-State: AFqh2kpC7JDjw7hUbg4ATe1kVGqZcT9F4+FG+Nu3brfWH73Dv9XrcUFC +KitKhuMIps+ZFZiMBR/TUfjM6Xk+HBehamSuH8= X-Received: by 2002:a4a:c386:0:b0:4ae:590:8087 with SMTP id u6-20020a4ac386000000b004ae05908087mr38950484oop.0.1673384485816; Tue, 10 Jan 2023 13:01:25 -0800 (PST) Received: from mandiga.. 
([2804:1b3:a7c0:a93a:e8a0:dd55:3328:997]) by smtp.gmail.com with ESMTPSA id r5-20020a4a83c5000000b0049ee88e86f9sm6202193oog.10.2023.01.10.13.01.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Jan 2023 13:01:25 -0800 (PST)
To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson
Cc: Adhemerval Zanella Netto
Subject: [PATCH v6 05/17] string: Improve generic strlen
Date: Tue, 10 Jan 2023 18:00:54 -0300
Message-Id: <20230110210106.1457686-6-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
MIME-Version: 1.0
X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella
Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org
Sender: "Libc-alpha"

From: Adhemerval Zanella Netto

The new algorithm has the following key differences:

  - Reads the aligned word that contains the (possibly unaligned) start
    of the string and uses the string-maskoff functions to mask off the
    unwanted leading bytes.  This strategy follows the arch-specific
    optimizations used on powerpc, sparc, and SH.

  - Uses the has_zero and index_first_zero parametrized functions.

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
and powerpc64-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
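
For reviewers who want to try the loop outside of the glibc tree, the
strategy described above can be sketched as a standalone program
fragment.  This is only an illustration under stated assumptions, not
the code in the patch: op_t is assumed to be unsigned long, the GCC and
Clang builtins and the __BYTE_ORDER__ predefined macro are assumed to be
available, and the names sketch_strlen and load_word are hypothetical.
The remaining helpers mirror the ones from string-fza.h, string-fzi.h,
and string-maskoff.h.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: a plain unsigned long stands in for the per-target op_t.  */
typedef unsigned long op_t;

/* Replicate byte C into every byte of a word, e.g. 0x01 -> 0x0101...01.  */
static op_t
repeat_bytes (unsigned char c)
{
  return ((op_t) -1 / 0xff) * c;
}

/* Non-zero iff some byte of X is zero; only the bit that belongs to the
   least significant zero byte is reliable.  */
static op_t
find_zero_low (op_t x)
{
  return (x - repeat_bytes (0x01)) & ~x & repeat_bytes (0x80);
}

/* 0x80 set in exactly those bytes of X that are zero.  */
static op_t
find_zero_all (op_t x)
{
  op_t m = repeat_bytes (0x7f);
  return ~(((x & m) + m) | x | m);
}

static int
has_zero (op_t x)
{
  return find_zero_low (x) != 0;
}

/* Memory-order index of the first zero byte; X must contain one.  */
static unsigned int
index_first_zero (op_t x)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  return __builtin_ctzl (find_zero_low (x)) / CHAR_BIT;
#else
  return __builtin_clzl (find_zero_all (x)) / CHAR_BIT;
#endif
}

/* All-ones in the bytes that precede offset I within its aligned word.  */
static op_t
create_mask (uintptr_t i)
{
  i %= sizeof (op_t);
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  return ~((op_t) -1 << (i * CHAR_BIT));
#else
  return ~((op_t) -1 >> (i * CHAR_BIT));
#endif
}

/* Hypothetical helper: load one word with memcpy to sidestep aliasing.  */
static op_t
load_word (const char *p)
{
  op_t w;
  memcpy (&w, p, sizeof w);
  return w;
}

size_t
sketch_strlen (const char *str)
{
  assert (str != NULL);
  uintptr_t s_int = (uintptr_t) str;
  /* Step back to the aligned word that contains the first byte.  */
  const char *p = str - s_int % sizeof (op_t);
  /* Force the bytes before STR to be non-zero so they cannot match.  */
  op_t word = load_word (p) | create_mask (s_int);
  while (!has_zero (word))
    {
      p += sizeof (op_t);
      word = load_word (p);
    }
  return (size_t) ((p - str) + (ptrdiff_t) index_first_zero (word));
}
```

Note that the sketch, like the word-at-a-time approach in general, reads
whole aligned words and so touches bytes before the start of the string
and after its terminator.  Those reads stay within one aligned word, but
callers of the sketch should only pass buffers where that is acceptable.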
Co-authored-by: Richard Henderson
---
 string/strlen.c         | 90 +++++++++--------------------------------
 sysdeps/s390/strlen-c.c | 10 +++--
 2 files changed, 26 insertions(+), 74 deletions(-)

diff --git a/string/strlen.c b/string/strlen.c
index ee1aae0fff..a69f3343ef 100644
--- a/string/strlen.c
+++ b/string/strlen.c
@@ -17,84 +17,34 @@
 
 #include
 #include
-
-#undef strlen
-
-#ifndef STRLEN
-# define STRLEN strlen
+#include
+#include
+#include
+#include
+#include
+
+#ifdef STRLEN
+# define __strlen STRLEN
 #endif
 
 /* Return the length of the null-terminated string STR.  Scan for
    the null terminator quickly by testing four bytes at a time.  */
 size_t
-STRLEN (const char *str)
+__strlen (const char *str)
 {
-  const char *char_ptr;
-  const unsigned long int *longword_ptr;
-  unsigned long int longword, himagic, lomagic;
-
-  /* Handle the first few characters by reading one character at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary.  */
-  for (char_ptr = str; ((unsigned long int) char_ptr
-                        & (sizeof (longword) - 1)) != 0;
-       ++char_ptr)
-    if (*char_ptr == '\0')
-      return char_ptr - str;
-
-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to 8-byte longwords.  */
+  /* Align pointer to sizeof op_t.  */
+  const uintptr_t s_int = (uintptr_t) str;
+  const op_t *word_ptr = word_containing (str);
 
-  longword_ptr = (unsigned long int *) char_ptr;
+  /* Read and MASK the first word.  */
+  op_t word = *word_ptr | create_mask (s_int);
 
-  /* Computing (longword - lomagic) sets the high bit of any corresponding
-     byte that is either zero or greater than 0x80.  The latter case can be
-     filtered out by computing (~longword & himagic).  The final result
-     will always be non-zero if one of the bytes of longword is zero.  */
-  himagic = 0x80808080L;
-  lomagic = 0x01010101L;
-  if (sizeof (longword) > 4)
-    {
-      /* 64-bit version of the magic.  */
-      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
-      himagic = ((himagic << 16) << 16) | himagic;
-      lomagic = ((lomagic << 16) << 16) | lomagic;
-    }
-  if (sizeof (longword) > 8)
-    abort ();
+  while (! has_zero (word))
+    word = *++word_ptr;
 
-  /* Instead of the traditional loop which tests each character,
-     we will test a longword at a time.  The tricky part is testing
-     if *any of the four* bytes in the longword in question are zero.  */
-  for (;;)
-    {
-      longword = *longword_ptr++;
-
-      if (((longword - lomagic) & ~longword & himagic) != 0)
-        {
-          /* Which of the bytes was the zero?  */
-
-          const char *cp = (const char *) (longword_ptr - 1);
-
-          if (cp[0] == 0)
-            return cp - str;
-          if (cp[1] == 0)
-            return cp - str + 1;
-          if (cp[2] == 0)
-            return cp - str + 2;
-          if (cp[3] == 0)
-            return cp - str + 3;
-          if (sizeof (longword) > 4)
-            {
-              if (cp[4] == 0)
-                return cp - str + 4;
-              if (cp[5] == 0)
-                return cp - str + 5;
-              if (cp[6] == 0)
-                return cp - str + 6;
-              if (cp[7] == 0)
-                return cp - str + 7;
-            }
-        }
-    }
+  return ((const char *) word_ptr) + index_first_zero (word) - str;
 }
+#ifndef STRLEN
+weak_alias (__strlen, strlen)
 libc_hidden_builtin_def (strlen)
+#endif
diff --git a/sysdeps/s390/strlen-c.c b/sysdeps/s390/strlen-c.c
index b829ef2452..0a33a6f8e5 100644
--- a/sysdeps/s390/strlen-c.c
+++ b/sysdeps/s390/strlen-c.c
@@ -21,12 +21,14 @@
 #if HAVE_STRLEN_C
 # if HAVE_STRLEN_IFUNC
 #  define STRLEN STRLEN_C
+# endif
+
+# include
+
+# if HAVE_STRLEN_IFUNC
 #  if defined SHARED && IS_IN (libc)
-#   undef libc_hidden_builtin_def
-#   define libc_hidden_builtin_def(name) \
-  __hidden_ver1 (__strlen_c, __GI_strlen, __strlen_c);
+__hidden_ver1 (__strlen_c, __GI_strlen, __strlen_c);
 #  endif
 # endif
 
-# include
 #endif

From patchwork Tue Jan 10 21:00:55 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 640900
Delivered-To: patch@linaro.org
Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp2916551pvb; Tue, 10 Jan
2023 13:02:44 -0800 (PST) X-Google-Smtp-Source: AMrXdXvfTVOOx9OjAY/boCjdMWSEpCoepdQiO2OvrNtuNzEcG+Ma9/dTtwv4cX5D5nRFw+Q5SiPv X-Received: by 2002:a17:907:874c:b0:7c0:9bc2:a7d6 with SMTP id qo12-20020a170907874c00b007c09bc2a7d6mr57080372ejc.38.1673384563926; Tue, 10 Jan 2023 13:02:43 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673384563; cv=none; d=google.com; s=arc-20160816; b=uF9DCRsyVvGXMTANYn/OyvuB60h9mImtLVOVyiCWeE54ejZxa2e69LDLM1NebUbLpp VMwPNtOSHTFTr4C55DbLLEvHJzI9OXrpd7ntm6f4rq4YXkDdb8+8EDhU0jgrp+c5Sdot f5fk6BKgEkCEh3EvbmwIke2kRoJBvgenBnAYFl2Xsqw80WQQE+BDztq++qip+sI/kmzW rzXXw2kRtyAMhMr8L6h+2Qdm858+qXv87nt9k+RxPk1Rir9vt+l7aJ2AVbhS7H0zC5/v Ctx+5W4AAZ98x91vi/vY/NwzzYXyd2WZajS+GytukGoeQMS8RfYX6L1JpAIPUxXMU25n 1HvA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=lmtxRpsTK2s7EUw/8kxbXxelnWJd50VsJ1oOFlPHVEk=; b=VT6mtlX6h4eIj0cjntkDg0Ba5CSu2bQjZ6jP7H1FDrImSXULZOC1mlCjD86LG6XjAg 3Ujti5NQCCxCZzEki9lJw+NsDv2jahGaQp5TZyiOFt6Z4yFWduPNTYgZHpZcLA8Qrcjz Qlm297ONbcCJcnf2PjPh1z/816on4PHyeJOWkoUqeSoXP/6C9H7PwfeCKBNrJRPueMgq tTwMSU0Q4OsREu/dhlCp7gYUbxXh/ocZCZPVTZKi4oA6QfPiqMGltXludntmTUiyJV5J kUhErkObPZ223/JUKrHPt02oj0FwzIlbU1JytDM7ADrxeJxAwfBJWzuDo6IFi3gamaQm 6jVA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=bDDLMpzk; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Return-Path: Received: from sourceware.org (server2.sourceware.org. 
[8.43.85.97]) by mx.google.com with ESMTPS id he36-20020a1709073da400b007a4feae7adfsi14250629ejc.565.2023.01.10.13.02.43 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Jan 2023 13:02:43 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=bDDLMpzk; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C6301382FAFD for ; Tue, 10 Jan 2023 21:02:42 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org C6301382FAFD DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1673384562; bh=lmtxRpsTK2s7EUw/8kxbXxelnWJd50VsJ1oOFlPHVEk=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=bDDLMpzk5NmbY/N9kmlYO9bmPCmF5atgMTuaPUdJF4EMEX39o5HGeBoyfilzj9TC8 gdKc8LhaAHvP8L9UOwVsUjWrX/HaMZN81afKl9PayFuwGa4luha71Jy4fRsc6iaDhS XWF6mzsgoUxOBDuZHV49Rd6LqMMWeCAJO5hEzym0= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oo1-xc35.google.com (mail-oo1-xc35.google.com [IPv6:2607:f8b0:4864:20::c35]) by sourceware.org (Postfix) with ESMTPS id A0D45385B512 for ; Tue, 10 Jan 2023 21:01:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A0D45385B512 Received: by mail-oo1-xc35.google.com with SMTP id d2-20020a4ab202000000b004ae3035538bso3537099ooo.12 for ; Tue, 10 Jan 2023 13:01:32 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=lmtxRpsTK2s7EUw/8kxbXxelnWJd50VsJ1oOFlPHVEk=; b=Mpgwddwdzjt1odSVA68LfuOB5mJp6KGhZrljYRyejyUs/6Mpjxcg2gsG1lT8bAXIax 2wMmKpQMRIoPcYdjqQYOBVHtb2e6InqO+vW0kBZW71dgOhcCRGJ382DxEBKE9Z+7F79c b1c8vOvBIM3YVdEkIl/7gAup09zL4vsWd0YyQ6s2zX1uSXAKAO5YFJWNxY8ieRiWB831 3dXzmM5HeN0RpPcydxun+kim4wi/kRHLjhE99VvLFm23lMmSmmpiGdhbs12GrH1p2DtF 7voHue3UWeE1Ld9NNga9GXxDRVyo4H/0m1hyz4wE/MLUYgJE1bTLWA0yOvQMZA2qmUwj aAow== X-Gm-Message-State: AFqh2krXTiOy4YD0FcjpVd5S+48ga5muLvtm7F6Ig5lpZ4nlN/Xt0deI PlpKhXvPvE+D4O7XQnJAKQEs1DEtGfdJr/i3PoU= X-Received: by 2002:a05:6820:150f:b0:4d3:f4b2:81ee with SMTP id ay15-20020a056820150f00b004d3f4b281eemr35296242oob.8.1673384488148; Tue, 10 Jan 2023 13:01:28 -0800 (PST) Received: from mandiga.. ([2804:1b3:a7c0:a93a:e8a0:dd55:3328:997]) by smtp.gmail.com with ESMTPSA id r5-20020a4a83c5000000b0049ee88e86f9sm6202193oog.10.2023.01.10.13.01.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Jan 2023 13:01:27 -0800 (PST) To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson Cc: Adhemerval Zanella Netto Subject: [PATCH v6 06/17] string: Improve generic strnlen Date: Tue, 10 Jan 2023 18:00:55 -0300 Message-Id: <20230110210106.1457686-7-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org> References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list 
List-Id: Libc-alpha mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella
Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org
Sender: "Libc-alpha"

From: Adhemerval Zanella Netto

With an optimized memchr, the new strnlen implementation basically
calls memchr and adjusts the resulting pointer value.

It also cleans up the multiple inclusion by leaving the ifunc
implementation to undef the weak_alias and libc_hidden_def.

Co-authored-by: Richard Henderson
---
 string/strnlen.c                              | 137 +-----------------
 sysdeps/i386/i686/multiarch/strnlen-c.c       |  14 +-
 .../power4/multiarch/strnlen-ppc32.c          |  14 +-
 sysdeps/s390/strnlen-c.c                      |  14 +-
 4 files changed, 27 insertions(+), 152 deletions(-)

diff --git a/string/strnlen.c b/string/strnlen.c
index 6ff294eab1..dc23354ec8 100644
--- a/string/strnlen.c
+++ b/string/strnlen.c
@@ -1,10 +1,6 @@
 /* Find the length of STRING, but scan at most MAXLEN characters.
    Copyright (C) 1991-2023 Free Software Foundation, Inc.
 
-   Based on strlen written by Torbjorn Granlund (tege@sics.se),
-   with help from Dan Sahlin (dan@sics.se);
-   commentary by Jim Blandy (jimb@ai.mit.edu).
-
    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public License as
    published by the Free Software Foundation; either version 2.1 of the
@@ -20,7 +16,6 @@
    not, see <https://www.gnu.org/licenses/>.  */
 
 #include
-#include
 
 /* Find the length of S, but scan at most MAXLEN characters.  If no
    '\0' terminator is found in that many characters, return MAXLEN.  */
@@ -32,134 +27,12 @@
 size_t
 __strnlen (const char *str, size_t maxlen)
 {
-  const char *char_ptr, *end_ptr = str + maxlen;
-  const unsigned long int *longword_ptr;
-  unsigned long int longword, himagic, lomagic;
-
-  if (maxlen == 0)
-    return 0;
-
-  if (__glibc_unlikely (end_ptr < str))
-    end_ptr = (const char *) ~0UL;
-
-  /* Handle the first few characters by reading one character at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary.  */
-  for (char_ptr = str; ((unsigned long int) char_ptr
-                        & (sizeof (longword) - 1)) != 0;
-       ++char_ptr)
-    if (*char_ptr == '\0')
-      {
-        if (char_ptr > end_ptr)
-          char_ptr = end_ptr;
-        return char_ptr - str;
-      }
-
-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to 8-byte longwords.  */
-
-  longword_ptr = (unsigned long int *) char_ptr;
-
-  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
-     the "holes."  Note that there is a hole just to the left of
-     each byte, with an extra at the end:
-
-     bits:  01111110 11111110 11111110 11111111
-     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD
-
-     The 1-bits make sure that carries propagate to the next 0-bit.
-     The 0-bits provide holes for carries to fall into.  */
-  himagic = 0x80808080L;
-  lomagic = 0x01010101L;
-  if (sizeof (longword) > 4)
-    {
-      /* 64-bit version of the magic.  */
-      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
-      himagic = ((himagic << 16) << 16) | himagic;
-      lomagic = ((lomagic << 16) << 16) | lomagic;
-    }
-  if (sizeof (longword) > 8)
-    abort ();
-
-  /* Instead of the traditional loop which tests each character,
-     we will test a longword at a time.  The tricky part is testing
-     if *any of the four* bytes in the longword in question are zero.  */
-  while (longword_ptr < (unsigned long int *) end_ptr)
-    {
-      /* We tentatively exit the loop if adding MAGIC_BITS to
-         LONGWORD fails to change any of the hole bits of LONGWORD.
-
-         1) Is this safe?  Will it catch all the zero bytes?
-            Suppose there is a byte with all zeros.  Any carry bits
-            propagating from its left will fall into the hole at its
-            least significant bit and stop.  Since there will be no
-            carry from its most significant bit, the LSB of the
-            byte to the left will be unchanged, and the zero will be
-            detected.
-
-         2) Is this worthwhile?  Will it ignore everything except
-            zero bytes?  Suppose every byte of LONGWORD has a bit set
-            somewhere.  There will be a carry into bit 8.  If bit 8
-            is set, this will carry into bit 16.  If bit 8 is clear,
-            one of bits 9-15 must be set, so there will be a carry
-            into bit 16.  Similarly, there will be a carry into bit
-            24.  If one of bits 24-30 is set, there will be a carry
-            into bit 31, so all of the hole bits will be changed.
-
-            The one misfire occurs when bits 24-30 are clear and bit
-            31 is set; in this case, the hole at bit 31 is not
-            changed.  If we had access to the processor carry flag,
-            we could close this loophole by putting the fourth hole
-            at bit 32!
-
-            So it ignores everything except 128's, when they're aligned
-            properly.  */
-
-      longword = *longword_ptr++;
-
-      if ((longword - lomagic) & himagic)
-        {
-          /* Which of the bytes was the zero?  If none of them were, it was
-             a misfire; continue the search.  */
-
-          const char *cp = (const char *) (longword_ptr - 1);
-
-          char_ptr = cp;
-          if (cp[0] == 0)
-            break;
-          char_ptr = cp + 1;
-          if (cp[1] == 0)
-            break;
-          char_ptr = cp + 2;
-          if (cp[2] == 0)
-            break;
-          char_ptr = cp + 3;
-          if (cp[3] == 0)
-            break;
-          if (sizeof (longword) > 4)
-            {
-              char_ptr = cp + 4;
-              if (cp[4] == 0)
-                break;
-              char_ptr = cp + 5;
-              if (cp[5] == 0)
-                break;
-              char_ptr = cp + 6;
-              if (cp[6] == 0)
-                break;
-              char_ptr = cp + 7;
-              if (cp[7] == 0)
-                break;
-            }
-        }
-      char_ptr = end_ptr;
-    }
-
-  if (char_ptr > end_ptr)
-    char_ptr = end_ptr;
-  return char_ptr - str;
+  const char *found = memchr (str, '\0', maxlen);
+  return found ? found - str : maxlen;
 }
+
 #ifndef STRNLEN
-libc_hidden_def (__strnlen)
 weak_alias (__strnlen, strnlen)
-#endif
+libc_hidden_def (__strnlen)
 libc_hidden_def (strnlen)
+#endif
diff --git a/sysdeps/i386/i686/multiarch/strnlen-c.c b/sysdeps/i386/i686/multiarch/strnlen-c.c
index 351e939a93..beb0350d53 100644
--- a/sysdeps/i386/i686/multiarch/strnlen-c.c
+++ b/sysdeps/i386/i686/multiarch/strnlen-c.c
@@ -1,10 +1,10 @@
 #define STRNLEN __strnlen_ia32
+#include
+
 #ifdef SHARED
-# undef libc_hidden_def
-# define libc_hidden_def(name) \
-    __hidden_ver1 (__strnlen_ia32, __GI_strnlen, __strnlen_ia32); \
-    strong_alias (__strnlen_ia32, __strnlen_ia32_1); \
-    __hidden_ver1 (__strnlen_ia32_1, __GI___strnlen, __strnlen_ia32_1);
+/* Alias for internal symbol to avoid PLT generation, it redirects the
+   libc_hidden_def (__strnlen/strnlen) to default implementation.  */
+__hidden_ver1 (__strnlen_ia32, __GI_strnlen, __strnlen_ia32);
+strong_alias (__strnlen_ia32, __strnlen_ia32_1);
+__hidden_ver1 (__strnlen_ia32_1, __GI___strnlen, __strnlen_ia32_1);
 #endif
-
-#include "string/strnlen.c"
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c b/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c
index 957b9b99e8..2ca1cd7181 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c
@@ -17,12 +17,12 @@
    <https://www.gnu.org/licenses/>.  */
 
 #define STRNLEN __strnlen_ppc
+#include
+
 #ifdef SHARED
-# undef libc_hidden_def
-# define libc_hidden_def(name) \
-    __hidden_ver1 (__strnlen_ppc, __GI_strnlen, __strnlen_ppc); \
-    strong_alias (__strnlen_ppc, __strnlen_ppc_1); \
-    __hidden_ver1 (__strnlen_ppc_1, __GI___strnlen, __strnlen_ppc_1);
+/* Alias for internal symbol to avoid PLT generation, it redirects the
+   libc_hidden_def (__strnlen/strnlen) to default implementation.  */
+__hidden_ver1 (__strnlen_ppc, __GI_strnlen, __strnlen_ppc);
+strong_alias (__strnlen_ppc, __strnlen_ppc_1);
+__hidden_ver1 (__strnlen_ppc_1, __GI___strnlen, __strnlen_ppc_1);
 #endif
-
-#include
diff --git a/sysdeps/s390/strnlen-c.c b/sysdeps/s390/strnlen-c.c
index 172fcc7caa..95156a0ff5 100644
--- a/sysdeps/s390/strnlen-c.c
+++ b/sysdeps/s390/strnlen-c.c
@@ -21,14 +21,16 @@
 #if HAVE_STRNLEN_C
 # if HAVE_STRNLEN_IFUNC
 #  define STRNLEN STRNLEN_C
+# endif
+
+# include
+
+# if HAVE_STRNLEN_IFUNC
 #  if defined SHARED && IS_IN (libc)
-#   undef libc_hidden_def
-#   define libc_hidden_def(name) \
-  __hidden_ver1 (__strnlen_c, __GI_strnlen, __strnlen_c); \
-  strong_alias (__strnlen_c, __strnlen_c_1); \
-  __hidden_ver1 (__strnlen_c_1, __GI___strnlen, __strnlen_c_1);
+__hidden_ver1 (__strnlen_c, __GI_strnlen, __strnlen_c);
+strong_alias (__strnlen_c, __strnlen_c_1);
+__hidden_ver1 (__strnlen_c_1, __GI___strnlen, __strnlen_c_1);
 #  endif
 # endif
 
-# include
 #endif

From patchwork Tue Jan 10 21:00:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 640903
Delivered-To: patch@linaro.org
Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp2916788pvb; Tue, 10 Jan 2023 13:03:08 -0800 (PST)
X-Google-Smtp-Source: AMrXdXuChjxJIR60MCCbTQQvDOmfdRax+rmuIgjcsUNGq2cmgUYe/tlobZWUq3hebfJb8GEFZQHB
X-Received: by 2002:a17:906:700f:b0:7c1:6bd9:571e with SMTP id n15-20020a170906700f00b007c16bd9571emr53656671ejj.13.1673384588017; Tue, 10 Jan 2023 13:03:08 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1673384588; cv=none; d=google.com; s=arc-20160816; b=X7u15dpnMBnLsuImG32RR6qjCn7SOeRq9n6rTsW/orWcOrWfJx0nJpPpz7GBVEUs6D IIX5FOY+1/IDsRJ/rySryBFyei6/09mhvKUf31U/zMFSECua0WK7kqtm2fw9AKOsCxCn SCL1CJOvzMRN7wJqh0dT99Lx2gpOcheTv+0XTdy/iCEKWNcwEdrQ7rVo5COMjn7fJWwS pttW3b1RkcxoWaokGDmOWOBySAnZjkKvlEbE6gwCxdzoHCDxVWb5uXD6/1qGwbLnOxUp
To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson
Cc: Adhemerval Zanella Netto
Subject: [PATCH v6 07/17] string: Improve generic strchr
Date: Tue, 10 Jan 2023 18:00:56 -0300
Message-Id: <20230110210106.1457686-8-adhemerval.zanella@linaro.org>
In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
List-Id:
Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Sender: "Libc-alpha" From: Adhemerval Zanella Netto New algorithm now calls strchrnul. Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu, and powerpc64-linux-gnu by removing the arch-specific assembly implementation and disabling multi-arch (it covers both LE and BE for 64 and 32 bits). --- string/strchr.c | 159 ++-------------------------------------- sysdeps/s390/strchr-c.c | 11 +-- 2 files changed, 14 insertions(+), 156 deletions(-) diff --git a/string/strchr.c b/string/strchr.c index 1572b8b42e..30c3eb10f2 100644 --- a/string/strchr.c +++ b/string/strchr.c @@ -21,165 +21,22 @@ . */ #include -#include #undef strchr +#undef index -#ifndef STRCHR -# define STRCHR strchr +#ifdef STRCHR +# define strchr STRCHR #endif /* Find the first occurrence of C in S. */ char * -STRCHR (const char *s, int c_in) +strchr (const char *s, int c_in) { - const unsigned char *char_ptr; - const unsigned long int *longword_ptr; - unsigned long int longword, magic_bits, charmask; - unsigned char c; - - c = (unsigned char) c_in; - - /* Handle the first few characters by reading one character at a time. - Do this until CHAR_PTR is aligned on a longword boundary. */ - for (char_ptr = (const unsigned char *) s; - ((unsigned long int) char_ptr & (sizeof (longword) - 1)) != 0; - ++char_ptr) - if (*char_ptr == c) - return (void *) char_ptr; - else if (*char_ptr == '\0') - return NULL; - - /* All these elucidatory comments refer to 4-byte longwords, - but the theory applies equally well to 8-byte longwords. */ - - longword_ptr = (unsigned long int *) char_ptr; - - /* Bits 31, 24, 16, and 8 of this number are zero. Call these bits - the "holes." 
Note that there is a hole just to the left of - each byte, with an extra at the end: - - bits: 01111110 11111110 11111110 11111111 - bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD - - The 1-bits make sure that carries propagate to the next 0-bit. - The 0-bits provide holes for carries to fall into. */ - magic_bits = -1; - magic_bits = magic_bits / 0xff * 0xfe << 1 >> 1 | 1; - - /* Set up a longword, each of whose bytes is C. */ - charmask = c | (c << 8); - charmask |= charmask << 16; - if (sizeof (longword) > 4) - /* Do the shift in two steps to avoid a warning if long has 32 bits. */ - charmask |= (charmask << 16) << 16; - if (sizeof (longword) > 8) - abort (); - - /* Instead of the traditional loop which tests each character, - we will test a longword at a time. The tricky part is testing - if *any of the four* bytes in the longword in question are zero. */ - for (;;) - { - /* We tentatively exit the loop if adding MAGIC_BITS to - LONGWORD fails to change any of the hole bits of LONGWORD. - - 1) Is this safe? Will it catch all the zero bytes? - Suppose there is a byte with all zeros. Any carry bits - propagating from its left will fall into the hole at its - least significant bit and stop. Since there will be no - carry from its most significant bit, the LSB of the - byte to the left will be unchanged, and the zero will be - detected. - - 2) Is this worthwhile? Will it ignore everything except - zero bytes? Suppose every byte of LONGWORD has a bit set - somewhere. There will be a carry into bit 8. If bit 8 - is set, this will carry into bit 16. If bit 8 is clear, - one of bits 9-15 must be set, so there will be a carry - into bit 16. Similarly, there will be a carry into bit - 24. If one of bits 24-30 is set, there will be a carry - into bit 31, so all of the hole bits will be changed. - - The one misfire occurs when bits 24-30 are clear and bit - 31 is set; in this case, the hole at bit 31 is not - changed. 
If we had access to the processor carry flag, - we could close this loophole by putting the fourth hole - at bit 32! - - So it ignores everything except 128's, when they're aligned - properly. - - 3) But wait! Aren't we looking for C as well as zero? - Good point. So what we do is XOR LONGWORD with a longword, - each of whose bytes is C. This turns each byte that is C - into a zero. */ - - longword = *longword_ptr++; - - /* Add MAGIC_BITS to LONGWORD. */ - if ((((longword + magic_bits) - - /* Set those bits that were unchanged by the addition. */ - ^ ~longword) - - /* Look at only the hole bits. If any of the hole bits - are unchanged, most likely one of the bytes was a - zero. */ - & ~magic_bits) != 0 - - /* That caught zeroes. Now test for C. */ - || ((((longword ^ charmask) + magic_bits) ^ ~(longword ^ charmask)) - & ~magic_bits) != 0) - { - /* Which of the bytes was C or zero? - If none of them were, it was a misfire; continue the search. */ - - const unsigned char *cp = (const unsigned char *) (longword_ptr - 1); - - if (*cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (sizeof (longword) > 4) - { - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - } - } - } - - return NULL; + char *r = __strchrnul (s, c_in); + return (*(unsigned char *)r == (unsigned char)c_in) ? 
r : NULL; } - -#ifdef weak_alias -# undef index +#ifndef STRCHR weak_alias (strchr, index) -#endif libc_hidden_builtin_def (strchr) +#endif diff --git a/sysdeps/s390/strchr-c.c b/sysdeps/s390/strchr-c.c index c00f2cceea..90822ae0f4 100644 --- a/sysdeps/s390/strchr-c.c +++ b/sysdeps/s390/strchr-c.c @@ -21,13 +21,14 @@ #if HAVE_STRCHR_C # if HAVE_STRCHR_IFUNC # define STRCHR STRCHR_C -# undef weak_alias +# endif + +# include + +# if HAVE_STRCHR_IFUNC # if defined SHARED && IS_IN (libc) -# undef libc_hidden_builtin_def -# define libc_hidden_builtin_def(name) \ - __hidden_ver1 (__strchr_c, __GI_strchr, __strchr_c); +__hidden_ver1 (__strchr_c, __GI_strchr, __strchr_c); # endif # endif -# include #endif From patchwork Tue Jan 10 21:00:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 640906 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp2917250pvb; Tue, 10 Jan 2023 13:03:49 -0800 (PST) X-Google-Smtp-Source: AMrXdXuzk+pGxhMTelPp2LOMmjxIWXMxkEKtrO9ljg4DaSEUO4ctvLEADDHzwZPb65LiB9/h8Y7E X-Received: by 2002:a05:6402:2074:b0:48e:a97e:9f2d with SMTP id bd20-20020a056402207400b0048ea97e9f2dmr23924321edb.11.1673384629158; Tue, 10 Jan 2023 13:03:49 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673384629; cv=none; d=google.com; s=arc-20160816; b=D5lppcmORiWzvBaeYOLLIyacQ7tO5jmcAZ4QEbFUX9SVvRLSpx997Sg6jQWQjOzL0s pDPspBy2sjFSdTblS+OLQgJiGlWM+ySQCZQnBPslVPndQsKwDovAeEb/nBiLR/EhBKRT KqVJgHNoAj36iPsaiEoKXwPxDEspG3GegK/e2hrjmKgvGWywRx2W0g3jnaUpcEz/MHpl FH/pXG2ajtGHFjp7sIVqIr47ASyL5MsG0JzsKWDjeeCpNyJGPBm9mTrkdsevDCL7S6Y8 iRQIfpbSCr0hR9fc7yV1zwEnjIyHCD/lwPbIaVCjwmM3mo2hcVDFHkonmcltKfi2+4xs DKtg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence 
To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson
Cc: Adhemerval Zanella Netto
Subject: [PATCH v6 08/17] string: Improve generic strchrnul
Date: Tue, 10 Jan 2023 18:00:57 -0300
Message-Id: <20230110210106.1457686-9-adhemerval.zanella@linaro.org>
In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
From: Adhemerval Zanella Netto

The new algorithm has the following key differences:

  - Reads the first word unaligned and uses the string-maskoff functions
    to remove unwanted data.  This strategy follows the arch-specific
    optimization used on aarch64 and powerpc.

  - Uses the string-fz{b,i} functions.

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
and powerpc-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
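
For readers unfamiliar with the word-at-a-time idea behind the
string-fz{b,i} helpers, the following is a minimal standalone sketch.
It is not the code in this patch: word_t, repeat_byte, word_has_zero,
word_has_zero_or_eq and sketch_strchrnul are hypothetical names chosen
for illustration, and the head of the string is handled with a plain
byte loop rather than the masked first-word read described above.  It
also assumes that every aligned word overlapping the string is
readable, which ISO C does not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t word_t;	/* Hypothetical stand-in for glibc's op_t.  */

/* Replicate the low byte of C into every byte of a word.  */
static word_t
repeat_byte (int c)
{
  return ((word_t) -1 / 0xff) * (unsigned char) c;
}

/* Nonzero iff at least one byte of X is zero.  */
static int
word_has_zero (word_t x)
{
  const word_t lsb = (word_t) -1 / 0xff;	/* 0x0101...01 */
  const word_t msb = lsb << 7;			/* 0x8080...80 */
  return ((x - lsb) & ~x & msb) != 0;
}

/* Nonzero iff some byte of X is zero or equals the byte repeated in RC.
   XOR turns every byte equal to the searched one into a zero byte.  */
static int
word_has_zero_or_eq (word_t x, word_t rc)
{
  return word_has_zero (x) || word_has_zero (x ^ rc);
}

/* Illustrative strchrnul: byte loop up to alignment, then one word per
   iteration, then a byte loop inside the word that matched.  Assumes the
   aligned words covering the string can be read in full.  */
static char *
sketch_strchrnul (const char *s, int c)
{
  assert (s != NULL);
  unsigned char ch = (unsigned char) c;

  /* Head: advance byte by byte until S is word-aligned.  */
  while ((uintptr_t) s % sizeof (word_t) != 0)
    {
      if (*s == '\0' || (unsigned char) *s == ch)
	return (char *) s;
      s++;
    }

  /* Body: skip whole words that contain neither NUL nor CH.  */
  word_t rc = repeat_byte (c);
  for (;;)
    {
      word_t w;
      memcpy (&w, s, sizeof w);
      if (word_has_zero_or_eq (w, rc))
	break;
      s += sizeof w;
    }

  /* Tail: locate the exact byte inside the matching word.  */
  while (*s != '\0' && (unsigned char) *s != ch)
    s++;
  return (char *) s;
}
```

The final byte loop keeps the sketch independent of endianness; the
patch instead derives the byte index directly from the match mask.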
Co-authored-by: Richard Henderson --- string/strchrnul.c | 153 +++--------------- .../power4/multiarch/strchrnul-ppc32.c | 4 - sysdeps/s390/strchrnul-c.c | 2 - 3 files changed, 22 insertions(+), 137 deletions(-) diff --git a/string/strchrnul.c b/string/strchrnul.c index fa2db4b417..4a34a1be6e 100644 --- a/string/strchrnul.c +++ b/string/strchrnul.c @@ -1,10 +1,5 @@ /* Copyright (C) 1991-2023 Free Software Foundation, Inc. This file is part of the GNU C Library. - Based on strlen implementation by Torbjorn Granlund (tege@sics.se), - with help from Dan Sahlin (dan@sics.se) and - bug fix and commentary by Jim Blandy (jimb@ai.mit.edu); - adaptation to strchr suggested by Dick Karpinski (dick@cca.ucsf.edu), - and implemented by Roland McGrath (roland@ai.mit.edu). The GNU C Library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public @@ -21,146 +16,42 @@ . */ #include -#include #include +#include +#include +#include +#include +#include #undef __strchrnul #undef strchrnul -#ifndef STRCHRNUL -# define STRCHRNUL __strchrnul +#ifdef STRCHRNUL +# define __strchrnul STRCHRNUL #endif /* Find the first occurrence of C in S or the final NUL byte. */ char * -STRCHRNUL (const char *s, int c_in) +__strchrnul (const char *str, int c_in) { - const unsigned char *char_ptr; - const unsigned long int *longword_ptr; - unsigned long int longword, magic_bits, charmask; - unsigned char c; - - c = (unsigned char) c_in; - - /* Handle the first few characters by reading one character at a time. - Do this until CHAR_PTR is aligned on a longword boundary. */ - for (char_ptr = (const unsigned char *) s; - ((unsigned long int) char_ptr & (sizeof (longword) - 1)) != 0; - ++char_ptr) - if (*char_ptr == c || *char_ptr == '\0') - return (void *) char_ptr; - - /* All these elucidatory comments refer to 4-byte longwords, - but the theory applies equally well to 8-byte longwords. 
*/ - - longword_ptr = (unsigned long int *) char_ptr; - - /* Bits 31, 24, 16, and 8 of this number are zero. Call these bits - the "holes." Note that there is a hole just to the left of - each byte, with an extra at the end: - - bits: 01111110 11111110 11111110 11111111 - bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD - - The 1-bits make sure that carries propagate to the next 0-bit. - The 0-bits provide holes for carries to fall into. */ - magic_bits = -1; - magic_bits = magic_bits / 0xff * 0xfe << 1 >> 1 | 1; - - /* Set up a longword, each of whose bytes is C. */ - charmask = c | (c << 8); - charmask |= charmask << 16; - if (sizeof (longword) > 4) - /* Do the shift in two steps to avoid a warning if long has 32 bits. */ - charmask |= (charmask << 16) << 16; - if (sizeof (longword) > 8) - abort (); - - /* Instead of the traditional loop which tests each character, - we will test a longword at a time. The tricky part is testing - if *any of the four* bytes in the longword in question are zero. */ - for (;;) - { - /* We tentatively exit the loop if adding MAGIC_BITS to - LONGWORD fails to change any of the hole bits of LONGWORD. - - 1) Is this safe? Will it catch all the zero bytes? - Suppose there is a byte with all zeros. Any carry bits - propagating from its left will fall into the hole at its - least significant bit and stop. Since there will be no - carry from its most significant bit, the LSB of the - byte to the left will be unchanged, and the zero will be - detected. + op_t repeated_c = repeat_bytes (c_in); - 2) Is this worthwhile? Will it ignore everything except - zero bytes? Suppose every byte of LONGWORD has a bit set - somewhere. There will be a carry into bit 8. If bit 8 - is set, this will carry into bit 16. If bit 8 is clear, - one of bits 9-15 must be set, so there will be a carry - into bit 16. Similarly, there will be a carry into bit - 24. If one of bits 24-30 is set, there will be a carry - into bit 31, so all of the hole bits will be changed. 
+ uintptr_t s_int = (uintptr_t) str; + const op_t *word_ptr = word_containing (str); - The one misfire occurs when bits 24-30 are clear and bit - 31 is set; in this case, the hole at bit 31 is not - changed. If we had access to the processor carry flag, - we could close this loophole by putting the fourth hole - at bit 32! + op_t word = *word_ptr; - So it ignores everything except 128's, when they're aligned - properly. + op_t mask = check_mask (find_zero_eq_all (word, repeated_c), s_int); + if (mask != 0) + return (char *) str + index_first (mask); - 3) But wait! Aren't we looking for C as well as zero? - Good point. So what we do is XOR LONGWORD with a longword, - each of whose bytes is C. This turns each byte that is C - into a zero. */ + do + word = *++word_ptr; + while (! has_zero_eq (word, repeated_c)); - longword = *longword_ptr++; - - /* Add MAGIC_BITS to LONGWORD. */ - if ((((longword + magic_bits) - - /* Set those bits that were unchanged by the addition. */ - ^ ~longword) - - /* Look at only the hole bits. If any of the hole bits - are unchanged, most likely one of the bytes was a - zero. */ - & ~magic_bits) != 0 - - /* That caught zeroes. Now test for C. */ - || ((((longword ^ charmask) + magic_bits) ^ ~(longword ^ charmask)) - & ~magic_bits) != 0) - { - /* Which of the bytes was C or zero? - If none of them were, it was a misfire; continue the search. */ - - const unsigned char *cp = (const unsigned char *) (longword_ptr - 1); - - if (*cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (sizeof (longword) > 4) - { - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - } - } - } - - /* This should never happen. 
*/ - return NULL; + op_t found = index_first_zero_eq (word, repeated_c); + return (char *) word_ptr + found; } - +#ifndef STRCHRNUL weak_alias (__strchrnul, strchrnul) +#endif diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/strchrnul-ppc32.c b/sysdeps/powerpc/powerpc32/power4/multiarch/strchrnul-ppc32.c index 88ce5dfffa..da03ac7c04 100644 --- a/sysdeps/powerpc/powerpc32/power4/multiarch/strchrnul-ppc32.c +++ b/sysdeps/powerpc/powerpc32/power4/multiarch/strchrnul-ppc32.c @@ -19,10 +19,6 @@ #include #define STRCHRNUL __strchrnul_ppc - -#undef weak_alias -#define weak_alias(a,b ) - extern __typeof (strchrnul) __strchrnul_ppc attribute_hidden; #include diff --git a/sysdeps/s390/strchrnul-c.c b/sysdeps/s390/strchrnul-c.c index e1248d1dbf..ff6aa38d4f 100644 --- a/sysdeps/s390/strchrnul-c.c +++ b/sysdeps/s390/strchrnul-c.c @@ -22,8 +22,6 @@ # if HAVE_STRCHRNUL_IFUNC # define STRCHRNUL STRCHRNUL_C # define __strchrnul STRCHRNUL -# undef weak_alias -# define weak_alias(name, alias) # endif # include From patchwork Tue Jan 10 21:00:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 640908 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp2917593pvb; Tue, 10 Jan 2023 13:04:30 -0800 (PST) X-Google-Smtp-Source: AMrXdXvjkaxlRlZWfrxf6fOv3uJ2KgPOjwtblkmTMmg7sFxkJvZo3fQTDE2Wdjg0ZgdKr4DL6t+x X-Received: by 2002:a17:906:d052:b0:7be:e26a:6104 with SMTP id bo18-20020a170906d05200b007bee26a6104mr61715139ejb.52.1673384670079; Tue, 10 Jan 2023 13:04:30 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673384670; cv=none; d=google.com; s=arc-20160816; b=BqYBc7KtSP80KThp1T3b9tfyfkdE5lD0fnJrhT697KccAQgvO5jXp24n130tYaH2Bo DJ17fdANA6Xrj1ZKcVFX94Rz/qJgNMQsDfEtcMhXRRSk4+HkJ64aOjtyNg6+iZvi6/Dc 7KzOO59eRYbh60lf5Mhqdt+AW29stLvBBNug6HymKVPeqvqwZxoDlmy05zPr6mU7DqBe piN6m0kc3DjdL10jD1+El0q/W+hX8Fipd7fyrU5yz70LT6gKXHmsl3/EsshAIqrc2Urb 
gV6yB+J20veiefaPgXTGcVGnFgCdCMmUrPmBWwHNjVgqfzzmYdc/sxU1zqK25f62Z8CS TqiA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=kaHoGHUD+J1FsTiUTsC2T3sn6pJ9jHe9pe1Jcc6cGuk=; b=apdd3zuxiyNz4nOJvUB+454BZ9kX7pi0s7H+5ZwQy6rXHh3GkV9vlBHbmmCEdzbm9P Z4qhDtGRtvB9Cln2XSXU8tDrToYHgekAfOvMbTLAKFy46amdUWjQLtLYilgFGdrpr+bR Lq85tJ8E/cfdZHHCPl8zBfWjrovMVamVb7okBf8Z9bCFLrT7OeqYggSfMlDR6UTzSMYt 8LyxF1qvq6yqaFXxB/JklSf9ylyvNu/DTdf0kjKlAbMI59BXVbxDHsRw+WS9IfOIP7Gy OgHfBP0rsNSQwZt/4AP+rIl4w7NNElMNTNx9XjzPxCaUEwdx+TRIV8a3j4Vb3YCIiIAZ hV1g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=WPDRjIPX; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Return-Path: Received: from sourceware.org (server2.sourceware.org. 
[8.43.85.97]) by mx.google.com with ESMTPS id cs2-20020a0564020c4200b00499bebec458si4782967edb.299.2023.01.10.13.04.29 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Jan 2023 13:04:30 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=WPDRjIPX; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0288B38582A3 for ; Tue, 10 Jan 2023 21:04:29 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0288B38582A3 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1673384669; bh=kaHoGHUD+J1FsTiUTsC2T3sn6pJ9jHe9pe1Jcc6cGuk=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=WPDRjIPXByOw9XNQOWDqZi+2Fl3Mf3AViEacdFhcV+BQQfsq0Bnm+4dU2J0svXsKJ 4ykP7/4GgIoy5+LqrKdCGqwZFiP1UG6I7Vju6vehP4/Kn+4gBNMHI4qbVRrsPr1Zl7 edVrF+zwWhj5YoWu3ylprUxYnpiLJUdU1RrFuqdQ= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oi1-x22e.google.com (mail-oi1-x22e.google.com [IPv6:2607:f8b0:4864:20::22e]) by sourceware.org (Postfix) with ESMTPS id 5DB1B385B527 for ; Tue, 10 Jan 2023 21:01:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5DB1B385B527 Received: by mail-oi1-x22e.google.com with SMTP id o66so11170215oia.6 for ; Tue, 10 Jan 2023 13:01:36 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kaHoGHUD+J1FsTiUTsC2T3sn6pJ9jHe9pe1Jcc6cGuk=; b=zshvzakCCaU0liWpkKWUKg/SBv7U/7kAe3OOFaW6FcBQBxkO70vjWKevsC1m8Pe7QH 9nvQSzXg6fV+OHq0e7dMt/kUn+akiRt//NcqRnlK4Dkf3bXTdSoMzPmmhG+ZEmM1qvpE 66Lf3qDLGkIvOqFf9KAleOQnf25mMe0rwNHgqEVwKMs6FcKDryBxRgO9ct/FHTUPvYVr BMapvqhyb80YReRUXsX9EyCZLiV9skTk8qCHlwsUeiFqNz5ij/iKJyeJV8bVHcvqmrDP E32CBbKRH9GiMYJyP5i0BJZpbbz9ParC/fm5vSe269gS9fQJys7oz0uQtfNRSkgvmNoa odXA== X-Gm-Message-State: AFqh2kq2iA+hDs+8FYLWodyOVze0tL/wtyl5+2EHffAOugQSvkcISbFT gy4WQfSsDwXupsJ9o3YWNGiw7ry62AsMhOXVUdw= X-Received: by 2002:a54:4012:0:b0:360:f6a4:76f2 with SMTP id x18-20020a544012000000b00360f6a476f2mr27223554oie.18.1673384494930; Tue, 10 Jan 2023 13:01:34 -0800 (PST) Received: from mandiga.. ([2804:1b3:a7c0:a93a:e8a0:dd55:3328:997]) by smtp.gmail.com with ESMTPSA id r5-20020a4a83c5000000b0049ee88e86f9sm6202193oog.10.2023.01.10.13.01.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Jan 2023 13:01:34 -0800 (PST) To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson Cc: Adhemerval Zanella Netto Subject: [PATCH v6 09/17] string: Improve generic strcmp Date: Tue, 10 Jan 2023 18:00:58 -0300 Message-Id: <20230110210106.1457686-10-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org> References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing 
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella

From: Adhemerval Zanella Netto

The new generic implementation tries to use word operations along with
the new string-fz{b,i} functions even for inputs with different
alignments (which still uses aligned accesses plus a merge operation to
get a correct word-by-word comparison).

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu, and
powerpc-linux-gnu by removing the arch-specific assembly implementation
and disabling multi-arch (it covers both LE and BE for 64 and 32 bits).

Co-authored-by: Richard Henderson
---
 string/strcmp.c | 119 +++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 103 insertions(+), 16 deletions(-)

diff --git a/string/strcmp.c b/string/strcmp.c
index 053f5a8d2b..fafd967567 100644
--- a/string/strcmp.c
+++ b/string/strcmp.c
@@ -15,33 +15,120 @@
    License along with the GNU C Library; if not, see .  */

+#include
+#include
+#include
+#include
 #include
+#include

-#undef strcmp
-
-#ifndef STRCMP
-# define STRCMP strcmp
+#ifdef STRCMP
+# define strcmp STRCMP
 #endif

+static inline int
+final_cmp (const op_t w1, const op_t w2)
+{
+  /* It cannot use index_first_zero_ne because it must not compare past the
+     final '\0' (and final_cmp is called before the has_zero check).  */
+  for (size_t i = 0; i < sizeof (op_t); i++)
+    {
+      unsigned char c1 = extractbyte (w1, i);
+      unsigned char c2 = extractbyte (w2, i);
+      if (c1 == '\0' || c1 != c2)
+        return c1 - c2;
+    }
+  return 0;
+}
+
+/* Aligned loop: if a difference is found, exit to compare the bytes.  Else
+   if a zero is found we have equal strings.
*/
+static inline int
+strcmp_aligned_loop (const op_t *x1, const op_t *x2, op_t w1)
+{
+  op_t w2 = *x2++;
+
+  while (w1 == w2)
+    {
+      if (has_zero (w1))
+        return 0;
+      w1 = *x1++;
+      w2 = *x2++;
+    }
+
+  return final_cmp (w1, w2);
+}
+
+/* Unaligned loop: align the first partial of P2, with 0xff for the rest of
+   the bytes so that we can also apply the has_zero test to see if we have
+   already reached EOS.  If we have, then we can simply fall through to the
+   final comparison.  */
+static inline int
+strcmp_unaligned_loop (const op_t *x1, const op_t *x2, op_t w1, uintptr_t ofs)
+{
+  op_t w2a = *x2++;
+  uintptr_t sh_1 = ofs * CHAR_BIT;
+  uintptr_t sh_2 = sizeof(op_t) * CHAR_BIT - sh_1;
+
+  op_t w2 = MERGE (w2a, sh_1, (op_t)-1, sh_2);
+  if (!has_zero (w2))
+    {
+      op_t w2b;
+
+      /* Unaligned loop.  The invariant is that W2B, which is "ahead" of W1,
+         does not contain end-of-string.  Therefore it is safe (and necessary)
+         to read another word from each while we do not have a difference.  */
+      while (1)
+        {
+          w2b = *x2++;
+          w2 = MERGE (w2a, sh_1, w2b, sh_2);
+          if (w1 != w2)
+            return final_cmp (w1, w2);
+          if (has_zero (w2b))
+            break;
+          w1 = *x1++;
+          w2a = w2b;
+        }
+
+      /* Zero found in the second partial of P2.  If we had EOS in the aligned
+         word, we have equality.  */
+      if (has_zero (w1))
+        return 0;
+
+      /* Load the final word of P1 and align the final partial of P2.  */
+      w1 = *x1++;
+      w2 = MERGE (w2b, sh_1, 0, sh_2);
+    }
+
+  return final_cmp (w1, w2);
+}
+
 /* Compare S1 and S2, returning less than, equal to or
    greater than zero if S1 is lexicographically less than,
    equal to or greater than S2.  */
 int
-STRCMP (const char *p1, const char *p2)
+strcmp (const char *p1, const char *p2)
 {
-  const unsigned char *s1 = (const unsigned char *) p1;
-  const unsigned char *s2 = (const unsigned char *) p2;
-  unsigned char c1, c2;
-
-  do
+  /* Handle the unaligned bytes of p1 first.
*/
+  uintptr_t n = -(uintptr_t)p1 % sizeof(op_t);
+  for (int i = 0; i < n; ++i)
     {
-      c1 = (unsigned char) *s1++;
-      c2 = (unsigned char) *s2++;
-      if (c1 == '\0')
-        return c1 - c2;
+      unsigned char c1 = *p1++;
+      unsigned char c2 = *p2++;
+      int diff = c1 - c2;
+      if (c1 == '\0' || diff != 0)
+        return diff;
     }
-  while (c1 == c2);

-  return c1 - c2;
+  /* P1 is now aligned to unsigned long.  P2 may or may not be.  */
+  const op_t *x1 = (const op_t *) p1;
+  op_t w1 = *x1++;
+  uintptr_t ofs = (uintptr_t) p2 % sizeof(op_t);
+  return ofs == 0
+    ? strcmp_aligned_loop (x1, (const op_t *)p2, w1)
+    : strcmp_unaligned_loop (x1, (const op_t *)(p2 - ofs), w1, ofs);
 }
+#ifndef STRCMP
 libc_hidden_builtin_def (strcmp)
+#endif

From patchwork Tue Jan 10 21:00:59 2023
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 640909
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Cc: Adhemerval Zanella Netto
Subject: [PATCH v6 10/17] string: Improve generic memchr
Date: Tue, 10 Jan 2023 18:00:59 -0300
Message-Id: <20230110210106.1457686-11-adhemerval.zanella@linaro.org>
In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
List-Id: Libc-alpha mailing list
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella

From: Adhemerval Zanella Netto

The new algorithm has the following key differences:

  - Reads the first word unaligned and uses the string-maskoff function
    to remove unwanted data.  This strategy follows the arch-specific
    optimization used on aarch64 and powerpc.

  - Uses the string-fz{b,i} and string-opthr functions.

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu, and
powerpc64-linux-gnu by removing the arch-specific assembly implementation
and disabling multi-arch (it covers both LE and BE for 64 and 32 bits).
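
For readers unfamiliar with the word-at-a-time search described above, the
following is a minimal, portable sketch of the general idea; it is not the
code in this patch.  The names sketch_repeat_byte, sketch_has_zero,
sketch_memchr and sketch_memchr_selfcheck are hypothetical.  The patch itself
relies on aligned loads and on helpers such as repeat_bytes, has_eq and
index_first_eq, whereas this sketch loads words with memcpy and rescans the
matching word byte by byte, so it assumes nothing about endianness and never
reads outside the buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: replicate byte C into every byte of a word.  */
static inline uintptr_t
sketch_repeat_byte (unsigned char c)
{
  return ((uintptr_t) -1 / 0xff) * c;
}

/* Hypothetical helper: nonzero if and only if some byte of X is zero.  */
static inline uintptr_t
sketch_has_zero (uintptr_t x)
{
  uintptr_t ones = (uintptr_t) -1 / 0xff;   /* 0x01 in every byte.  */
  uintptr_t highs = ones << 7;              /* 0x80 in every byte.  */
  return (x - ones) & ~x & highs;
}

/* Illustration only: same contract as memchr, but simplified.  */
void *
sketch_memchr (const void *s, int c_in, size_t n)
{
  const unsigned char *p = s;
  unsigned char c = (unsigned char) c_in;
  uintptr_t repeated_c = sketch_repeat_byte (c);

  /* Skip whole words that cannot contain C.  XOR with REPEATED_C turns
     every byte equal to C into zero, so a zero-byte test finds a match.  */
  while (n >= sizeof (uintptr_t))
    {
      uintptr_t word;
      memcpy (&word, p, sizeof word);   /* Alignment-safe load.  */
      if (sketch_has_zero (word ^ repeated_c))
        break;
      p += sizeof word;
      n -= sizeof word;
    }

  /* Scan the matching word, or the short tail, one byte at a time.  */
  for (; n > 0; --n, ++p)
    if (*p == c)
      return (void *) p;
  return NULL;
}

/* Small self-check comparing the sketch against the C library.  */
int
sketch_memchr_selfcheck (void)
{
  const char buf[] = "word at a time";
  for (int c = 0; c < 256; c++)
    assert (sketch_memchr (buf, c, sizeof buf) == memchr (buf, c, sizeof buf));
  return 0;
}
```

The byte-wise rescan costs at most one word of extra work per call; the patch
avoids it by computing the match index directly from the comparison mask.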
Co-authored-by: Richard Henderson
---
 string/memchr.c                               | 171 +++++-------------
 .../powerpc32/power4/multiarch/memchr-ppc32.c |  14 +-
 .../powerpc64/multiarch/memchr-ppc64.c        |   9 +-
 3 files changed, 55 insertions(+), 139 deletions(-)

diff --git a/string/memchr.c b/string/memchr.c
index f800d47dce..90ef314372 100644
--- a/string/memchr.c
+++ b/string/memchr.c
@@ -1,10 +1,6 @@
-/* Copyright (C) 1991-2023 Free Software Foundation, Inc.
+/* Scan memory for a character.  Generic version
+   Copyright (C) 1991-2023 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
-   Based on strlen implementation by Torbjorn Granlund (tege@sics.se),
-   with help from Dan Sahlin (dan@sics.se) and
-   commentary by Jim Blandy (jimb@ai.mit.edu);
-   adaptation to memchr suggested by Dick Karpinski (dick@cca.ucsf.edu),
-   and implemented by Roland McGrath (roland@ai.mit.edu).

    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -20,143 +16,76 @@
    License along with the GNU C Library; if not, see .  */

-#ifndef _LIBC
-# include
-#endif
-
+#include
+#include
+#include
+#include
+#include
+#include
 #include
-#include
-
-#include
+#undef memchr

-#undef __memchr
-#ifdef _LIBC
-# undef memchr
+#ifdef MEMCHR
+# define __memchr MEMCHR
 #endif

-#ifndef weak_alias
-# define __memchr memchr
-#endif
-
-#ifndef MEMCHR
-# define MEMCHR __memchr
-#endif
+static inline const char *
+sadd (uintptr_t x, uintptr_t y)
+{
+  uintptr_t ret = INT_ADD_OVERFLOW (x, y) ? (uintptr_t)-1 : x + y;
+  return (const char *)ret;
+}

 /* Search no more than N bytes of S for C.  */
 void *
-MEMCHR (void const *s, int c_in, size_t n)
+__memchr (void const *s, int c_in, size_t n)
 {
-  /* On 32-bit hardware, choosing longword to be a 32-bit unsigned
-     long instead of a 64-bit uintmax_t tends to give better
-     performance.  On 64-bit hardware, unsigned long is generally 64
-     bits already.  Change this typedef to experiment with
-     performance.
*/
-  typedef unsigned long int longword;
-
-  const unsigned char *char_ptr;
-  const longword *longword_ptr;
-  longword repeated_one;
-  longword repeated_c;
-  unsigned char c;
-
-  c = (unsigned char) c_in;
+  if (__glibc_unlikely (n == 0))
+    return NULL;

-  /* Handle the first few bytes by reading one byte at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary.  */
-  for (char_ptr = (const unsigned char *) s;
-       n > 0 && (size_t) char_ptr % sizeof (longword) != 0;
-       --n, ++char_ptr)
-    if (*char_ptr == c)
-      return (void *) char_ptr;
+  /* Read the first word, but munge it so that bytes before the array
+     will not match goal.  */
+  const op_t *word_ptr = word_containing (s);
+  uintptr_t s_int = (uintptr_t) s;

-  longword_ptr = (const longword *) char_ptr;
+  op_t word = *word_ptr;
+  op_t repeated_c = repeat_bytes (c_in);
+  op_t mask = check_mask (find_eq_all (word, repeated_c), s_int);

-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to any size longwords.  */
+  /* Compute the address of the last byte taking into consideration possible
+     overflow.  */
+  const char *lbyte = sadd (s_int, n - 1);
+  /* And also the address of the word containing the last byte.  */
+  const op_t *lword = word_containing (lbyte);

-  /* Compute auxiliary longword values:
-     repeated_one is a value which has a 1 in every byte.
-     repeated_c has c in every byte.  */
-  repeated_one = 0x01010101;
-  repeated_c = c | (c << 8);
-  repeated_c |= repeated_c << 16;
-  if (0xffffffffU < (longword) -1)
+  if (mask != 0)
     {
-      repeated_one |= repeated_one << 31 << 1;
-      repeated_c |= repeated_c << 31 << 1;
-      if (8 < sizeof (longword))
-        {
-          size_t i;
-
-          for (i = 64; i < sizeof (longword) * 8; i *= 2)
-            {
-              repeated_one |= repeated_one << i;
-              repeated_c |= repeated_c << i;
-            }
-        }
+      char *ret = (char *) s + index_first (mask);
+      return (ret <= lbyte) ?
ret : NULL;
     }
+  if (word_ptr == lword)
+    return NULL;

-  /* Instead of the traditional loop which tests each byte, we will test a
-     longword at a time.  The tricky part is testing if *any of the four*
-     bytes in the longword in question are equal to c.  We first use an xor
-     with repeated_c.  This reduces the task to testing whether *any of the
-     four* bytes in longword1 is zero.
-
-     We compute tmp =
-       ((longword1 - repeated_one) & ~longword1) & (repeated_one << 7).
-     That is, we perform the following operations:
-       1. Subtract repeated_one.
-       2. & ~longword1.
-       3. & a mask consisting of 0x80 in every byte.
-     Consider what happens in each byte:
-       - If a byte of longword1 is zero, step 1 and 2 transform it into 0xff,
-         and step 3 transforms it into 0x80.  A carry can also be propagated
-         to more significant bytes.
-       - If a byte of longword1 is nonzero, let its lowest 1 bit be at
-         position k (0 <= k <= 7); so the lowest k bits are 0.  After step 1,
-         the byte ends in a single bit of value 0 and k bits of value 1.
-         After step 2, the result is just k bits of value 1: 2^k - 1.  After
-         step 3, the result is 0.  And no carry is produced.
-     So, if longword1 has only non-zero bytes, tmp is zero.
-     Whereas if longword1 has a zero byte, call j the position of the least
-     significant zero byte.  Then the result has a zero at positions 0, ...,
-     j-1 and a 0x80 at position j.  We cannot predict the result at the more
-     significant bytes (positions j+1..3), but it does not matter since we
-     already have a non-zero bit at position 8*j+7.
-
-     So, the test whether any byte in longword1 is zero is equivalent to
-     testing whether tmp is nonzero.
*/
-
-  while (n >= sizeof (longword))
+  word = *++word_ptr;
+  while (word_ptr != lword)
     {
-      longword longword1 = *longword_ptr ^ repeated_c;
-
-      if ((((longword1 - repeated_one) & ~longword1)
-           & (repeated_one << 7)) != 0)
-        break;
-      longword_ptr++;
-      n -= sizeof (longword);
+      if (has_eq (word, repeated_c))
+        return (char *) word_ptr + index_first_eq (word, repeated_c);
+      word = *++word_ptr;
     }

-  char_ptr = (const unsigned char *) longword_ptr;
-
-  /* At this point, we know that either n < sizeof (longword), or one of the
-     sizeof (longword) bytes starting at char_ptr is == c.  On little-endian
-     machines, we could determine the first such byte without any further
-     memory accesses, just by looking at the tmp result from the last loop
-     iteration.  But this does not work on big-endian machines.  Choose code
-     that works in both cases.  */
-
-  for (; n > 0; --n, ++char_ptr)
+  if (has_eq (word, repeated_c))
     {
-      if (*char_ptr == c)
-        return (void *) char_ptr;
+      /* We found a match, but it might be in a byte past the end of the
+         array.
*/
+      char *ret = (char *) word_ptr + index_first_eq (word, repeated_c);
+      if (ret <= lbyte)
+        return ret;
     }
-
   return NULL;
 }
-#ifdef weak_alias
+#ifndef MEMCHR
 weak_alias (__memchr, memchr)
-#endif
 libc_hidden_builtin_def (memchr)
+#endif
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/memchr-ppc32.c b/sysdeps/powerpc/powerpc32/power4/multiarch/memchr-ppc32.c
index 39ff84f3f3..a78585650f 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/memchr-ppc32.c
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/memchr-ppc32.c
@@ -18,17 +18,11 @@

 #include

-#define MEMCHR  __memchr_ppc
+extern __typeof (memchr) __memchr_ppc attribute_hidden;

-#undef weak_alias
-#define weak_alias(a, b)
+#define MEMCHR  __memchr_ppc
+#include

 #ifdef SHARED
-# undef libc_hidden_builtin_def
-# define libc_hidden_builtin_def(name) \
-  __hidden_ver1(__memchr_ppc, __GI_memchr, __memchr_ppc);
+__hidden_ver1(__memchr_ppc, __GI_memchr, __memchr_ppc);
 #endif
-
-extern __typeof (memchr) __memchr_ppc attribute_hidden;
-
-#include
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c b/sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c
index 8097df709c..49ba5521fe 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c
@@ -18,14 +18,7 @@

 #include

-#define MEMCHR  __memchr_ppc
-
-#undef weak_alias
-#define weak_alias(a, b)
-
-# undef libc_hidden_builtin_def
-# define libc_hidden_builtin_def(name)
-
 extern __typeof (memchr) __memchr_ppc attribute_hidden;
+#define MEMCHR  __memchr_ppc

 #include

From patchwork Tue Jan 10 21:01:00 2023
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 640898
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Cc: Adhemerval Zanella Netto
Subject: [PATCH v6 11/17] string: Improve generic memrchr
Date: Tue, 10 Jan 2023 18:01:00 -0300
Message-Id: <20230110210106.1457686-12-adhemerval.zanella@linaro.org>
In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
List-Id: Libc-alpha mailing list
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella

From: Adhemerval Zanella Netto

The new algorithm has the following key difference:

  - Uses the string-fz{b,i} functions.

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu, and
powerpc64-linux-gnu by removing the arch-specific assembly implementation
and disabling multi-arch (it covers both LE and BE for 64 and 32 bits).

Co-authored-by: Richard Henderson
---
 string/memrchr.c | 189 ++++++++---------------------------------------
 1 file changed, 32 insertions(+), 157 deletions(-)

diff --git a/string/memrchr.c b/string/memrchr.c
index 18b20ff76a..cc8cfda3ae 100644
--- a/string/memrchr.c
+++ b/string/memrchr.c
@@ -1,11 +1,6 @@
 /* memrchr -- find the last occurrence of a byte in a memory block
    Copyright (C) 1991-2023 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
-   Based on strlen implementation by Torbjorn Granlund (tege@sics.se),
-   with help from Dan Sahlin (dan@sics.se) and
-   commentary by Jim Blandy (jimb@ai.mit.edu);
-   adaptation to memchr suggested by Dick Karpinski (dick@cca.ucsf.edu),
-   and implemented by Roland McGrath (roland@ai.mit.edu).

    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -21,177 +16,57 @@
    License along with the GNU C Library; if not, see .
*/

-#include
-
-#ifdef HAVE_CONFIG_H
-# include
-#endif
-
-#if defined _LIBC
-# include
-# include
-#endif
-
-#if defined HAVE_LIMITS_H || defined _LIBC
-# include
-#endif
-
-#define LONG_MAX_32_BITS 2147483647
-
-#ifndef LONG_MAX
-# define LONG_MAX LONG_MAX_32_BITS
-#endif
-
-#include
+#include
+#include
+#include
+#include
+#include

 #undef __memrchr
 #undef memrchr

-#ifndef weak_alias
-# define __memrchr memrchr
+#ifdef MEMRCHR
+# define __memrchr MEMRCHR
 #endif

-/* Search no more than N bytes of S for C.  */
 void *
-#ifndef MEMRCHR
-__memrchr
-#else
-MEMRCHR
-#endif
-     (const void *s, int c_in, size_t n)
+__memrchr (const void *s, int c_in, size_t n)
 {
-  const unsigned char *char_ptr;
-  const unsigned long int *longword_ptr;
-  unsigned long int longword, magic_bits, charmask;
-  unsigned char c;
-
-  c = (unsigned char) c_in;
-
   /* Handle the last few characters by reading one character at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary.  */
-  for (char_ptr = (const unsigned char *) s + n;
-       n > 0 && ((unsigned long int) char_ptr
-                 & (sizeof (longword) - 1)) != 0;
-       --n)
-    if (*--char_ptr == c)
+     Do this until CHAR_PTR is aligned on a word boundary, or for
+     the entirety of small inputs.  */
+  const unsigned char *char_ptr = (const unsigned char *) (s + n);
+  size_t align = (uintptr_t) char_ptr % sizeof (op_t);
+  if (n < OP_T_THRES || align > n)
+    align = n;
+  for (size_t i = 0; i < align; ++i)
+    if (*--char_ptr == c_in)
       return (void *) char_ptr;

-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to 8-byte longwords.  */
+  const op_t *word_ptr = (const op_t *) char_ptr;
+  n -= align;
+  if (__glibc_unlikely (n == 0))
+    return NULL;

-  longword_ptr = (const unsigned long int *) char_ptr;
+  /* Compute the address of the word containing the initial byte.  */
+  const op_t *lword = word_containing (s);

-  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
-     the "holes."
Note that there is a hole just to the left of
-     each byte, with an extra at the end:
-
-     bits:  01111110 11111110 11111110 11111111
-     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD
-
-     The 1-bits make sure that carries propagate to the next 0-bit.
-     The 0-bits provide holes for carries to fall into.  */
-  magic_bits = -1;
-  magic_bits = magic_bits / 0xff * 0xfe << 1 >> 1 | 1;
-
-  /* Set up a longword, each of whose bytes is C.  */
-  charmask = c | (c << 8);
-  charmask |= charmask << 16;
-#if LONG_MAX > LONG_MAX_32_BITS
-  charmask |= charmask << 32;
-#endif
+  /* Set up a word, each of whose bytes is C.  */
+  op_t repeated_c = repeat_bytes (c_in);

-  /* Instead of the traditional loop which tests each character,
-     we will test a longword at a time.  The tricky part is testing
-     if *any of the four* bytes in the longword in question are zero.  */
-  while (n >= sizeof (longword))
+  while (word_ptr != lword)
     {
-      /* We tentatively exit the loop if adding MAGIC_BITS to
-         LONGWORD fails to change any of the hole bits of LONGWORD.
-
-         1) Is this safe?  Will it catch all the zero bytes?
-         Suppose there is a byte with all zeros.  Any carry bits
-         propagating from its left will fall into the hole at its
-         least significant bit and stop.  Since there will be no
-         carry from its most significant bit, the LSB of the
-         byte to the left will be unchanged, and the zero will be
-         detected.
-
-         2) Is this worthwhile?  Will it ignore everything except
-         zero bytes?  Suppose every byte of LONGWORD has a bit set
-         somewhere.  There will be a carry into bit 8.  If bit 8
-         is set, this will carry into bit 16.  If bit 8 is clear,
-         one of bits 9-15 must be set, so there will be a carry
-         into bit 16.  Similarly, there will be a carry into bit
-         24.  If one of bits 24-30 is set, there will be a carry
-         into bit 31, so all of the hole bits will be changed.
-
-         The one misfire occurs when bits 24-30 are clear and bit
-         31 is set; in this case, the hole at bit 31 is not
-         changed.
If we had access to the processor carry flag,
-         we could close this loophole by putting the fourth hole
-         at bit 32!
-
-         So it ignores everything except 128's, when they're aligned
-         properly.
-
-         3) But wait!  Aren't we looking for C, not zero?
-         Good point.  So what we do is XOR LONGWORD with a longword,
-         each of whose bytes is C.  This turns each byte that is C
-         into a zero.  */
-
-      longword = *--longword_ptr ^ charmask;
-
-      /* Add MAGIC_BITS to LONGWORD.  */
-      if ((((longword + magic_bits)
-
-            /* Set those bits that were unchanged by the addition.  */
-            ^ ~longword)
-
-           /* Look at only the hole bits.  If any of the hole bits
-              are unchanged, most likely one of the bytes was a
-              zero.  */
-           & ~magic_bits) != 0)
+      op_t word = *--word_ptr;
+      if (has_eq (word, repeated_c))
         {
-          /* Which of the bytes was C?  If none of them were, it was
-             a misfire; continue the search.  */
-
-          const unsigned char *cp = (const unsigned char *) longword_ptr;
-
-#if LONG_MAX > 2147483647
-          if (cp[7] == c)
-            return (void *) &cp[7];
-          if (cp[6] == c)
-            return (void *) &cp[6];
-          if (cp[5] == c)
-            return (void *) &cp[5];
-          if (cp[4] == c)
-            return (void *) &cp[4];
-#endif
-          if (cp[3] == c)
-            return (void *) &cp[3];
-          if (cp[2] == c)
-            return (void *) &cp[2];
-          if (cp[1] == c)
-            return (void *) &cp[1];
-          if (cp[0] == c)
-            return (void *) cp;
+          /* We found a match, but it might be in a byte before the start
+             of the array.  */
+          char *ret = (char *) word_ptr + index_last_eq (word, repeated_c);
+          return ret >= (char *) s ?
ret : NULL; } - - n -= sizeof (longword); } - - char_ptr = (const unsigned char *) longword_ptr; - - while (n-- > 0) - { - if (*--char_ptr == c) - return (void *) char_ptr; - } - - return 0; + return NULL; } #ifndef MEMRCHR -# ifdef weak_alias weak_alias (__memrchr, memrchr) -# endif #endif From patchwork Tue Jan 10 21:01:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 640902 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp2916694pvb; Tue, 10 Jan 2023 13:02:59 -0800 (PST) X-Google-Smtp-Source: AMrXdXv6wuZWFrwnNeByc6ZS42ESO2SMHZZuWXPna5QW6c7C0eIJfPoQ4cgJFjfItYsuKwWtrfz6 X-Received: by 2002:a05:6402:5293:b0:497:c96b:4dea with SMTP id en19-20020a056402529300b00497c96b4deamr12927658edb.5.1673384579218; Tue, 10 Jan 2023 13:02:59 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673384579; cv=none; d=google.com; s=arc-20160816; b=weHIDp6tMSdWFRDVi44oIM0ATDINSN6AzAw/zE4MnvCYX/CzdeTvg8dU2XWhkcqlvT mxlkTOPnMNYWVEr1UHt79QXa8Xg9YXOYypxjF+LBvC81uQJLsaX2eh0joeYFm5HDbAQ0 SsKQavnrwDA9YukYuZ5H/za7qYla61pdEW43RliTf7HAE7PLzsRicC3vab/gIcsGYgPK K0WratUxasmWn5RCb78TK5edAxWEUlgJ+EiGkZKDofHuZG+aIU6IZzHABmQU1nYYAsay iBlTJ9E5AxXIcKWbsfJl0DvSnH99TTCLbiLMRo91LiInxyQqYNzOl09tCdwTosp0wNnU xOUA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=uR+tmviJAnIQt3Uy9C/8G52NzH6k6qWZDh0/RH+zyro=; b=yAUE/W7bhP/xnSiR+tMxAdu+laX4H3BJeJyE+1FP//enLaOJQsgXlG8T7Xzg1Iblyf dXXp0xhvmYEg9w7GQRR1WhyQPqyhdYYJH7z/Fh2HS2xEEK7zOZNyTa/II/V9ZnRTaQDh hllJJh3GenyetUCTCNCjjnr7Cps2P80UgQWK/Ekl/S3nz0+6uRc6z2jR8MNj+2s5Q8yh 
([2804:1b3:a7c0:a93a:e8a0:dd55:3328:997]) by smtp.gmail.com with ESMTPSA id r5-20020a4a83c5000000b0049ee88e86f9sm6202193oog.10.2023.01.10.13.01.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Jan 2023 13:01:40 -0800 (PST) To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson Subject: [PATCH v6 12/17] hppa: Add memcopy.h Date: Tue, 10 Jan 2023 18:01:01 -0300 Message-Id: <20230110210106.1457686-13-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org> References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Sender: "Libc-alpha" From: Richard Henderson GCC's combine pass cannot merge (x >> c | y << (32 - c)) into a double-word shift unless (1) the subtract is in the same basic block and (2) the result of the subtract is used exactly once. Neither condition is true for any use of MERGE. By forcing the use of a double-word shift, we not only reduce contention on SAR, but also allow the setting of SAR to be hoisted outside of a loop. Checked on hppa-linux-gnu. 
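For readers without a PA-RISC manual at hand, the equivalence this relies on can be sketched in portable C. This is only an illustration under stated assumptions, not the code in the patch: it assumes a 32-bit op_t, the big-endian form of MERGE ((w0 << shl) | (w1 >> shr)), and shl + shr == 32 with 0 < shr < 32. The helper names merge_two_shifts and merge_double_word are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: the generic big-endian MERGE, written as two
   shifts and an inclusive or.  Assumes shl + shr == 32, 0 < shr < 32.  */
static inline uint32_t
merge_two_shifts (uint32_t w0, int shl, uint32_t w1, int shr)
{
  return (w0 << shl) | (w1 >> shr);
}

/* Hypothetical helper: the same result computed as one right shift of
   the 64-bit concatenation w0:w1, which is what a double-word shift
   instruction provides.  Only shr is needed; shl is implied.  */
static inline uint32_t
merge_double_word (uint32_t w0, uint32_t w1, int shr)
{
  uint64_t pair = ((uint64_t) w0 << 32) | w1;
  return (uint32_t) (pair >> shr);
}
```

Because the shift count appears only once in the second form, it matches the point made above: the shift amount can be set up once, outside the copy loop.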
--- sysdeps/hppa/memcopy.h | 42 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) create mode 100644 sysdeps/hppa/memcopy.h diff --git a/sysdeps/hppa/memcopy.h b/sysdeps/hppa/memcopy.h new file mode 100644 index 0000000000..0d4b4ac435 --- /dev/null +++ b/sysdeps/hppa/memcopy.h @@ -0,0 +1,42 @@ +/* Definitions for memory copy functions, PA-RISC version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#include + +/* Use a single double-word shift instead of two shifts and an ior. + If the uses of MERGE were close to the computation of shl/shr, + the compiler might have been able to create this itself. + But instead that computation is well separated. + + Using an inline function instead of a macro is the easiest way + to ensure that the types are correct. 
*/ + +#undef MERGE + +static __always_inline op_t +MERGE (op_t w0, int shl, op_t w1, int shr) +{ + _Static_assert (OPSIZ == 4 || OPSIZ == 8, "Invalid OPSIZE"); + + op_t res; + if (OPSIZ == 4) + asm ("shrpw %1,%2,%%sar,%0" : "=r"(res) : "r"(w0), "r"(w1), "q"(shr)); + else if (OPSIZ == 8) + asm ("shrpd %1,%2,%%sar,%0" : "=r"(res) : "r"(w0), "r"(w1), "q"(shr)); + return res; +} From patchwork Tue Jan 10 21:01:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 640901 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp2916652pvb; Tue, 10 Jan 2023 13:02:55 -0800 (PST) X-Google-Smtp-Source: AMrXdXvDJPdnBDe/+evkPXne4pUfItMXxZeThsjV5E02Mqr2FeOyJTuoxU7opyKYELjyX3lHAZ+h X-Received: by 2002:a17:907:9d0b:b0:78d:f455:30db with SMTP id kt11-20020a1709079d0b00b0078df45530dbmr62547648ejc.3.1673384575384; Tue, 10 Jan 2023 13:02:55 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673384575; cv=none; d=google.com; s=arc-20160816; b=bHL9wi0f25IaL59+ec0s6yc0e/EUa3quJLNHhJgrEXQTY/A3Q91pvcsL1R5yRzaEml IHvIPSN0PQC6e/o9KcoL0/cCEbaKnoV1YxL163RS7w/WQ4Dr5avGXopfdfH1+Bfme7hl Fv/C8Luhpt3IVc07n3dQtLAorn6WzkbsAQJn0vXys3abQfV4rdqyf/JLLxaPSAH+hM+D pwa9IMtQkf6ThzVcKF0kpOLUgmK5zzDY3u5Zqk2zYMidDxFhazPvhzr2bkDYbIWstJTx ad1PFMzkBuDw4SK6dJoCM90nemtURLHg6HvBdvGlGsrj7MR876JN+vl7lxto4S/GbGzu YD8g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=hWOcUJjSqHCMrNrz/r9938jmbfA0HbRSV45xxOnIXTs=; b=k8ClL0BOtVFkWRF8twnXvRGupTsT3FPclT4shWFb3GwGpCDoZo+HpFb9GcTH069/Wt b3AqEajeD8Jb8XDxLVLAXdon9DJrnuA0eNovf1sGBK9HY6Hvrljqr5oqm/O2PmbuppfG 
([2804:1b3:a7c0:a93a:e8a0:dd55:3328:997]) by smtp.gmail.com with ESMTPSA id r5-20020a4a83c5000000b0049ee88e86f9sm6202193oog.10.2023.01.10.13.01.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Jan 2023 13:01:42 -0800 (PST) To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson Subject: [PATCH v6 13/17] hppa: Add string-fzb.h and string-fzi.h Date: Tue, 10 Jan 2023 18:01:02 -0300 Message-Id: <20230110210106.1457686-14-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org> References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Sender: "Libc-alpha" From: Richard Henderson Use UXOR,SBZ to test for a zero byte within a word. While we can get semi-decent code out of asm-goto, we would do slightly better with a compiler builtin. For index_zero et al, sequential testing of bytes is less expensive than any tricks that involve a count-leading-zeros insn that we don't have. Checked on hppa-linux-gnu. 
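As a rough portable illustration of what these helpers compute, the sketch below uses plain C in place of the HPPA instructions. It is not the code in the patch: the boolean test swaps in the well-known subtract-and-mask trick where the patch uses UXOR,SBZ, and the index functions use ordinary sequential byte tests. It assumes a 32-bit big-endian word, so byte 0 is the most significant. All function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-in for has_zero: nonzero iff some byte of X is 0.
   Portable bit trick, not the UXOR,SBZ sequence used by the patch.  */
static inline int
has_zero_portable (uint32_t x)
{
  return ((x - 0x01010101u) & ~x & 0x80808080u) != 0;
}

/* Byte equality reduces to a zero test on the xor of the two words.  */
static inline int
has_eq_portable (uint32_t x1, uint32_t x2)
{
  return has_zero_portable (x1 ^ x2);
}

/* Index of the first zero byte in memory order, big-endian layout.
   X is assumed to contain at least one zero byte.  */
static inline unsigned int
index_first_zero_be (uint32_t x)
{
  if ((x & 0xff000000u) == 0)
    return 0;
  if ((x & 0x00ff0000u) == 0)
    return 1;
  if ((x & 0x0000ff00u) == 0)
    return 2;
  return 3;
}

/* Index of the last zero byte in memory order, big-endian layout.
   X is assumed to contain at least one zero byte.  */
static inline unsigned int
index_last_zero_be (uint32_t x)
{
  if ((x & 0x000000ffu) == 0)
    return 3;
  if ((x & 0x0000ff00u) == 0)
    return 2;
  if ((x & 0x00ff0000u) == 0)
    return 1;
  return 0;
}
```

The index functions test at most three bytes and need no wide constants beyond the byte masks, which is the trade-off described above for a target without a count-leading-zeros instruction.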
--- sysdeps/hppa/string-fzb.h | 70 +++++++++++++++++++ sysdeps/hppa/string-fzi.h | 139 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 209 insertions(+) create mode 100644 sysdeps/hppa/string-fzb.h create mode 100644 sysdeps/hppa/string-fzi.h diff --git a/sysdeps/hppa/string-fzb.h b/sysdeps/hppa/string-fzb.h new file mode 100644 index 0000000000..865e548492 --- /dev/null +++ b/sysdeps/hppa/string-fzb.h @@ -0,0 +1,70 @@ +/* Zero byte detection, boolean. HPPA version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef _STRING_FZB_H +#define _STRING_FZB_H 1 + +#include +#include + +/* Determine if any byte within X is zero. This is a pure boolean test. */ + +static __always_inline _Bool +has_zero (op_t x) +{ + _Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + + /* It's more useful to expose a control transfer to the compiler + than to expose a proper boolean result. */ + asm goto ("uxor,sbz %%r0,%0,%%r0\n\t" + "b,n %l1" : : "r"(x) : : nbz); + return 1; + nbz: + return 0; +} + +/* Likewise, but for byte equality between X1 and X2. 
*/ + +static __always_inline _Bool +has_eq (op_t x1, op_t x2) +{ + _Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + + asm goto ("uxor,sbz %0,%1,%%r0\n\t" + "b,n %l2" : : "r"(x1), "r"(x2) : : nbz); + return 1; + nbz: + return 0; +} + +/* Likewise, but for zeros in X1 and equal bytes between X1 and X2. */ + +static __always_inline _Bool +has_zero_eq (op_t x1, op_t x2) +{ + _Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + + asm goto ("uxor,sbz %%r0,%0,%%r0\n\t" + "uxor,nbz %0,%1,%%r0\n\t" + "b,n %l2" : : "r"(x1), "r"(x2) : : sbz); + return 0; + sbz: + return 1; +} + +#endif /* _STRING_FZB_H */ diff --git a/sysdeps/hppa/string-fzi.h b/sysdeps/hppa/string-fzi.h new file mode 100644 index 0000000000..1824cbba5a --- /dev/null +++ b/sysdeps/hppa/string-fzi.h @@ -0,0 +1,139 @@ +/* string-fzi.h -- zero byte detection; indexes. HPPA version. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef _STRING_FZI_H +#define _STRING_FZI_H 1 + +#include +#include + +_Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + +static __always_inline unsigned int +index_first (op_t c) +{ + /* Since we have no clz insn, direct tests of the bytes is faster + than loading up the constants to do the masking. 
*/ + if (c & 0xff000000) + return 0; + if (c & 0x00ff0000) + return 1; + if (c & 0x0000ff00) + return 2; + return 3; +} + +/* Given a word X that is known to contain a zero byte, return the + index of the first such within the long in memory order. */ +static __always_inline unsigned int +index_first_zero (op_t x) +{ + unsigned int ret; + + /* Since we have no clz insn, direct tests of the bytes is faster + than loading up the constants to do the masking. */ + asm ("extrw,u,<> %1,23,8,%%r0\n\t" + "ldi 2,%0\n\t" + "extrw,u,<> %1,15,8,%%r0\n\t" + "ldi 1,%0\n\t" + "extrw,u,<> %1,7,8,%%r0\n\t" + "ldi 0,%0" + : "=r"(ret) : "r"(x), "0"(3)); + + return ret; +} + +/* Similarly, but perform the search for byte equality between X1 and X2. */ +static __always_inline unsigned int +index_first_eq (op_t x1, op_t x2) +{ + return index_first_zero (x1 ^ x2); +} + +/* Similarly, but perform the search for zero within X1 or + equality between X1 and X2. */ +static __always_inline unsigned int +index_first_zero_eq (op_t x1, op_t x2) +{ + unsigned int ret; + + /* Since we have no clz insn, direct tests of the bytes is faster + than loading up the constants to do the masking. */ + asm ("extrw,u,= %1,23,8,%%r0\n\t" + "extrw,u,<> %2,23,8,%%r0\n\t" + "ldi 2,%0\n\t" + "extrw,u,= %1,15,8,%%r0\n\t" + "extrw,u,<> %2,15,8,%%r0\n\t" + "ldi 1,%0\n\t" + "extrw,u,= %1,7,8,%%r0\n\t" + "extrw,u,<> %2,7,8,%%r0\n\t" + "ldi 0,%0" + : "=r"(ret) : "r"(x1), "r"(x1 ^ x2), "0"(3)); + + return ret; +} + +/* Similarly, but perform the search for zero within X1 or + inequality between X1 and X2. */ +static __always_inline unsigned int +index_first_zero_ne (op_t x1, op_t x2) +{ + unsigned int ret; + + /* Since we have no clz insn, direct tests of the bytes is faster + than loading up the constants to do the masking. 
*/ + asm ("extrw,u,<> %2,23,8,%%r0\n\t" + "extrw,u,<> %1,23,8,%%r0\n\t" + "ldi 2,%0\n\t" + "extrw,u,<> %2,15,8,%%r0\n\t" + "extrw,u,<> %1,15,8,%%r0\n\t" + "ldi 1,%0\n\t" + "extrw,u,<> %2,7,8,%%r0\n\t" + "extrw,u,<> %1,7,8,%%r0\n\t" + "ldi 0,%0" + : "=r"(ret) : "r"(x1), "r"(x1 ^ x2), "0"(3)); + + return ret; +} + +/* Similarly, but search for the last zero within X. */ +static __always_inline unsigned int +index_last_zero (op_t x) +{ + unsigned int ret; + + /* Since we have no ctz insn, direct tests of the bytes is faster + than loading up the constants to do the masking. */ + asm ("extrw,u,<> %1,15,8,%%r0\n\t" + "ldi 1,%0\n\t" + "extrw,u,<> %1,23,8,%%r0\n\t" + "ldi 2,%0\n\t" + "extrw,u,<> %1,31,8,%%r0\n\t" + "ldi 3,%0" + : "=r"(ret) : "r"(x), "0"(0)); + + return ret; +} + +static __always_inline unsigned int +index_last_eq (op_t x1, op_t x2) +{ + return index_last_zero (x1 ^ x2); +} + +#endif /* _STRING_FZI_H */ From patchwork Tue Jan 10 21:01:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 640904 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp2917078pvb; Tue, 10 Jan 2023 13:03:37 -0800 (PST) X-Google-Smtp-Source: AMrXdXv8A709NobaBIti4OSUdfdny5+wYfe8FTBHMPHLiuCR2aby0B82VGuBa5awJXkGGAYV/7a0 X-Received: by 2002:a05:6402:3608:b0:495:b002:4ba2 with SMTP id el8-20020a056402360800b00495b0024ba2mr17585752edb.3.1673384616815; Tue, 10 Jan 2023 13:03:36 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673384616; cv=none; d=google.com; s=arc-20160816; b=epok3EtxbmYwq3eIlTTTfd9L7QWjWSlAm3PwSdMERZOdWtrP6HIYSJCF/1hb60UxCk sHWiwVyfYA/129nQ6Eg/3xdexAtHAHOeK8qbhtzBWXUsOdXbihEKN3y85tGC0KZBVuxM l949MEciOnFV9FleZpt4pXp0NIOLC4F/5vpAWaCKPnPH2sNFpT24XRtKUsmGMB7XCTmx p1u5ixDEtXf/EgJYkNuNd+HRmCP2AhxSflPTyT+0YHh0XvPDk7Vi7hcYmm4k6JRo/c2g N0tsYVslIgdYwFKkB4koCWjRDnGznACZaktyb4xy0Nw0aeETg1OLPt+x4kh52P7ctbgC W+8A== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=KNElYNRjRhFADtzb7bB0aTQaoOehcuU9896RlIzV08w=; b=5M+lJvkwYGVb5glCnASbUFBw6mMp4axaotf3Q1/ZYTa4rzWcr6ZuBdExWnLS5evvT/ PsLiFSdD7UgHl40WYZqztuwLriEoSFf/u2Aj18Gj4fvuToh1kKxONW5Y8f2di69ZDkWH VZIBK31jIYBfh1ZMc+9PSBeVgBVM+kedx9SqrktzTclL4G4sAdQsGaDEj3/PyF5XaHCH wrK7yovMhtrgxE2beLsD3FThoHmwj6nxI97WmHjVUPTWwspfF6PP9Ge9Wh2kC9gUB3RD 36n3MMEnEuKBDPfZq56QTnmftGGWS7BSwjktVMyra9eRFZnBXb/q+UyAuMu7fWbP1BjR rNqg== X-Gm-Message-State: AFqh2koCL3ivxkL6BFZ3znomMZ76OFSGzPtBWIxFH5woy+tUXwYROiPD ZEw9bYPjqfcAK30Tb7KIG3IzLjYOLggn3v2j630= X-Received: by 2002:a4a:d585:0:b0:4f2:cf1:36ee with SMTP id z5-20020a4ad585000000b004f20cf136eemr2116262oos.1.1673384505428; Tue, 10 Jan 2023 13:01:45 -0800 (PST) Received: from mandiga.. ([2804:1b3:a7c0:a93a:e8a0:dd55:3328:997]) by smtp.gmail.com with ESMTPSA id r5-20020a4a83c5000000b0049ee88e86f9sm6202193oog.10.2023.01.10.13.01.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Jan 2023 13:01:44 -0800 (PST) To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson Subject: [PATCH v6 14/17] alpha: Add string-fzb.h and string-fzi.h Date: Tue, 10 Jan 2023 18:01:03 -0300 Message-Id: <20230110210106.1457686-15-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org> References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org 
X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Sender: "Libc-alpha" From: Richard Henderson While alpha has the more important string functions in assembly, there are still a few for which the generic routines are used. Use the CMPBGE insn, via the builtin, for testing of zeros. Use a simplified expansion of __builtin_ctz when the insn isn't available. Checked on alpha-linux-gnu. --- sysdeps/alpha/string-fzb.h | 52 +++++++++++++++++ sysdeps/alpha/string-fzi.h | 113 +++++++++++++++++++++++++++++++++++++ 2 files changed, 165 insertions(+) create mode 100644 sysdeps/alpha/string-fzb.h create mode 100644 sysdeps/alpha/string-fzi.h diff --git a/sysdeps/alpha/string-fzb.h b/sysdeps/alpha/string-fzb.h new file mode 100644 index 0000000000..e3934ba413 --- /dev/null +++ b/sysdeps/alpha/string-fzb.h @@ -0,0 +1,52 @@ +/* Zero byte detection; boolean. Alpha version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#ifndef _STRING_FZB_H +#define _STRING_FZB_H 1 + +#include +#include + +/* Note that since CMPBGE creates a bit mask rather than a byte mask, + we cannot simply provide a target-specific string-fza.h. */ + +/* Determine if any byte within X is zero. This is a pure boolean test. */ + +static __always_inline _Bool +has_zero (op_t x) +{ + return __builtin_alpha_cmpbge (0, x) != 0; +} + +/* Likewise, but for byte equality between X1 and X2. */ + +static __always_inline _Bool +has_eq (op_t x1, op_t x2) +{ + return has_zero (x1 ^ x2); +} + +/* Likewise, but for zeros in X1 and equal bytes between X1 and X2. */ + +static __always_inline _Bool +has_zero_eq (op_t x1, op_t x2) +{ + return has_zero (x1) | has_eq (x1, x2); +} + +#endif /* _STRING_FZB_H */ diff --git a/sysdeps/alpha/string-fzi.h b/sysdeps/alpha/string-fzi.h new file mode 100644 index 0000000000..bc2f0bdc91 --- /dev/null +++ b/sysdeps/alpha/string-fzi.h @@ -0,0 +1,113 @@ +/* string-fzi.h -- zero byte detection; indices. Alpha version. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef _STRING_FZI_H +#define _STRING_FZI_H + +#include +#include + +/* Note that since CMPBGE creates a bit mask rather than a byte mask, + we cannot simply provide a target-specific string-fza.h. 
*/ + +/* A subroutine for the index_zero functions. Given a bitmask C, + return the index of the first bit set in memory order. */ + +static __always_inline unsigned int +index_first (unsigned long int c) +{ +#ifdef __alpha_cix__ + return __builtin_ctzl (c); +#else + c = c & -c; + return (c & 0xf0 ? 4 : 0) + (c & 0xcc ? 2 : 0) + (c & 0xaa ? 1 : 0); +#endif +} + +/* Similarly, but return the (memory order) index of the last bit + that is non-zero. Note that only the least 8 bits may be nonzero. */ + +static __always_inline unsigned int +index_last (unsigned long int x) +{ +#ifdef __alpha_cix__ + return __builtin_clzl (x) ^ 63; +#else + unsigned r = 0; + if (x & 0xf0) + r += 4; + if (x & (0xc << r)) + r += 2; + if (x & (0x2 << r)) + r += 1; + return r; +#endif +} + +/* Given a word X that is known to contain a zero byte, return the + index of the first such within the word in memory order. */ + +static __always_inline unsigned int +index_first_zero (op_t x) +{ + return index_first (__builtin_alpha_cmpbge (0, x)); +} + +/* Similarly, but perform the test for byte equality between X1 and X2. */ + +static __always_inline unsigned int +index_first_eq (op_t x1, op_t x2) +{ + return index_first_zero (x1 ^ x2); +} + +/* Similarly, but perform the search for zero within X1 or + equality between X1 and X2. */ + +static __always_inline unsigned int +index_first_zero_eq (op_t x1, op_t x2) +{ + return index_first (__builtin_alpha_cmpbge (0, x1) + | __builtin_alpha_cmpbge (0, x1 ^ x2)); +} + +/* Similarly, but perform the search for zero within X1 or + inequality between X1 and X2. */ + +static __always_inline unsigned int +index_first_zero_ne (op_t x1, op_t x2) +{ + return index_first (__builtin_alpha_cmpbge (0, x1) + | (__builtin_alpha_cmpbge (0, x1 ^ x2) ^ 0xFF)); +} + +/* Similarly, but search for the last zero within X. 
*/ + +static __always_inline unsigned int +index_last_zero (op_t x) +{ + return index_last (__builtin_alpha_cmpbge (0, x)); +} + +static __always_inline unsigned int +index_last_eq (op_t x1, op_t x2) +{ + return index_last_zero (x1 ^ x2); +} + +#endif /* _STRING_FZI_H */ From patchwork Tue Jan 10 21:01:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 640905 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp2917125pvb; Tue, 10 Jan 2023 13:03:40 -0800 (PST) X-Google-Smtp-Source: AMrXdXtyGwVQBljpNIO2RmATFKNRHQTaWV99HQSOigxWHsIJ9cGPs1BKowfJZw/K/Wb1xXbPgDyF X-Received: by 2002:aa7:c796:0:b0:46c:aec4:606f with SMTP id n22-20020aa7c796000000b0046caec4606fmr58085101eds.23.1673384620340; Tue, 10 Jan 2023 13:03:40 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673384620; cv=none; d=google.com; s=arc-20160816; b=qBo3BvmqghXrxgMYkBL1VRUFL4S3QPUR/KjT5otlUZ2ZyLlijYlRtXofOSEteVIm58 ELjV0bRx2mmBUVFgU688RZg3fqZK4VJeg+0/riklWnrU3CLyc5+IpHyNJQcfHMUvRtoT kzL7JLk/20HdQ9dfDh0cxML3EN9PXoPLsq/9qH55j2RlywUUuxby1HUu8kC/QY2W3AHh DSqjKvYEQeZDMHQ+h0K4O0PoLH93OikN1lOsF0MKzKsaVHiE7eK/ECnULSLq3OOtR2jx vbhG+lo+B54CpRURQyNyFBzfd1giQg/gQOOY83PXD3lygEqSSK+9SykJI3iv3w66zXBF UFcA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=4ccsKlaV6iEM5HdbbwpX6FgBr0cjdJfigDL2bOGuIrU=; b=oSzWUyoK6B9ka21zcdc+KYbxxMAOYmv5enSoRcADjRmA4L42bz0i31lLOEF+Rpr9M4 NjkpQD1UgZJCNHTE3BYw5L6T/bGDR1Oc5nvRVGnPxXSbrtxevy+f7mP1x5qL7gAahP8m 5J8Igv3oKMND/KZj2kUs8YJWwCvguQ/KFFlry8DNoU3XlQ7o/LhJkXkALWpeqr6/flQV 
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Subject: [PATCH v6 15/17] arm: Add string-fza.h
Date: Tue, 10 Jan 2023 18:01:04 -0300
Message-Id: <20230110210106.1457686-16-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
MIME-Version: 1.0
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella
Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org
Sender: "Libc-alpha"

From: Richard Henderson

While arm has the more important string functions in assembly,
there are still a few generic routines used.

Use the UQSUB8 insn for testing of zeros.

Checked on armv7-linux-gnueabihf
---
 sysdeps/arm/armv6t2/string-fza.h | 70 ++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 sysdeps/arm/armv6t2/string-fza.h

diff --git a/sysdeps/arm/armv6t2/string-fza.h b/sysdeps/arm/armv6t2/string-fza.h
new file mode 100644
index 0000000000..7aa0843325
--- /dev/null
+++ b/sysdeps/arm/armv6t2/string-fza.h
@@ -0,0 +1,70 @@
+/* Zero byte detection; basics.  ARM version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef _STRING_FZA_H
+#define _STRING_FZA_H 1
+
+#include
+#include
+
+/* This function returns at least one bit set within every byte
+   of X that is zero.  */
+
+static __always_inline op_t
+find_zero_all (op_t x)
+{
+  /* Use unsigned saturated subtraction from 1 in each byte.
+     That leaves 1 for every byte that was zero.  */
+  op_t ret, ones = repeat_bytes (0x01);
+  asm ("uqsub8 %0,%1,%2" : "=r"(ret) : "r"(ones), "r"(x));
+  return ret;
+}
+
+/* Identify bytes that are equal between X1 and X2.  */
+
+static __always_inline op_t
+find_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1 ^ x2);
+}
+
+/* Identify zero bytes in X1 or equality between X1 and X2.  */
+
+static __always_inline op_t
+find_zero_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1) | find_zero_all (x1 ^ x2);
+}
+
+/* Identify zero bytes in X1 or inequality between X1 and X2.  */
+
+static __always_inline op_t
+find_zero_ne_all (op_t x1, op_t x2)
+{
+  /* Make use of the fact that we'll already have ONES in a register.  */
+  op_t ones = repeat_bytes (0x01);
+  return find_zero_all (x1) | (find_zero_all (x1 ^ x2) ^ ones);
+}
+
+/* Define the "inexact" versions in terms of the exact versions.  */
+#define find_zero_low     find_zero_all
+#define find_eq_low       find_eq_all
+#define find_zero_eq_low  find_zero_eq_all
+#define find_zero_ne_low  find_zero_ne_all
+
+#endif /* _STRING_FZA_H */

From patchwork Tue Jan 10 21:01:05 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 640907
Delivered-To: patch@linaro.org
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Subject: [PATCH v6 16/17] powerpc: Add string-fza.h
Date: Tue, 10 Jan 2023 18:01:05 -0300
Message-Id: <20230110210106.1457686-17-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
MIME-Version: 1.0
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella
Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org
Sender: "Libc-alpha"

From: Richard Henderson

While ppc has the more important string functions in assembly,
there are still a few generic routines used.

Use the Power 6 CMPB insn for testing of zeros.

Checked on powerpc64le-linux-gnu.
---
 sysdeps/powerpc/string-fza.h | 70 ++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 sysdeps/powerpc/string-fza.h

diff --git a/sysdeps/powerpc/string-fza.h b/sysdeps/powerpc/string-fza.h
new file mode 100644
index 0000000000..5496c9db4b
--- /dev/null
+++ b/sysdeps/powerpc/string-fza.h
@@ -0,0 +1,70 @@
+/* Zero byte detection; basics.  PowerPC version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef _POWERPC_STRING_FZA_H
+#define _POWERPC_STRING_FZA_H 1
+
+/* PowerISA 2.05 (POWER6) provides cmpb instruction.  */
+#ifdef _ARCH_PWR6
+# include
+
+/* This function returns 0xff for each byte that is
+   equal between X1 and X2.  */
+
+static __always_inline op_t
+find_eq_all (op_t x1, op_t x2)
+{
+  op_t ret;
+  asm ("cmpb %0,%1,%2" : "=r"(ret) : "r"(x1), "r"(x2));
+  return ret;
+}
+
+/* This function returns 0xff for each byte that is zero in X.  */
+
+static __always_inline op_t
+find_zero_all (op_t x)
+{
+  return find_eq_all (x, 0);
+}
+
+/* Identify zero bytes in X1 or equality between X1 and X2.  */
+
+static __always_inline op_t
+find_zero_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1) | find_eq_all (x1, x2);
+}
+
+/* Identify zero bytes in X1 or inequality between X1 and X2.  */
+
+static __always_inline op_t
+find_zero_ne_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1) | ~find_eq_all (x1, x2);
+}
+
+/* Define the "inexact" versions in terms of the exact versions.  */
+# define find_zero_low     find_zero_all
+# define find_eq_low       find_eq_all
+# define find_zero_eq_low  find_zero_eq_all
+# define find_zero_ne_low  find_zero_ne_all
+#else
+# include
+#endif /* _ARCH_PWR6 */
+
+#endif /* _POWERPC_STRING_FZA_H */

From patchwork Tue Jan 10 21:01:06 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 640910
Delivered-To: patch@linaro.org
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Cc: Adhemerval Zanella Netto
Subject: [PATCH v6 17/17] sh: Add string-fzb.h
Date: Tue, 10 Jan 2023 18:01:06 -0300
Message-Id: <20230110210106.1457686-18-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org>
MIME-Version: 1.0
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella
Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org
Sender: "Libc-alpha"

From: Adhemerval Zanella Netto

Use the SH cmp/str insn to implement has_{zero,eq,zero_eq}.

Checked on sh4-linux-gnu.
---
 sysdeps/sh/string-fzb.h | 54 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)
 create mode 100644 sysdeps/sh/string-fzb.h

diff --git a/sysdeps/sh/string-fzb.h b/sysdeps/sh/string-fzb.h
new file mode 100644
index 0000000000..0ad19b58c9
--- /dev/null
+++ b/sysdeps/sh/string-fzb.h
@@ -0,0 +1,54 @@
+/* Zero byte detection; boolean.  SH4 version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef STRING_FZB_H
+#define STRING_FZB_H 1
+
+#include
+#include
+
+/* Determine if any byte within X is zero.  This is a pure boolean test.  */
+
+static __always_inline _Bool
+has_zero (op_t x)
+{
+  op_t zero = 0x0, ret;
+  asm volatile ("cmp/str %1,%2\n"
+                "movt %0\n"
+                : "=r" (ret)
+                : "r" (zero), "r" (x));
+  return ret;
+}
+
+/* Likewise, but for byte equality between X1 and X2.  */
+
+static __always_inline _Bool
+has_eq (op_t x1, op_t x2)
+{
+  return has_zero (x1 ^ x2);
+}
+
+/* Likewise, but for zeros in X1 and equal bytes between X1 and X2.  */
+
+static __always_inline _Bool
+has_zero_eq (op_t x1, op_t x2)
+{
+  return has_zero (x1) | has_eq (x1, x2);
+}
+
+#endif /* STRING_FZB_H */
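
Note for readers without SH hardware: cmp/str sets the T bit when any byte
of its two operands is equal, so comparing against zero yields the has_zero
predicate. The sketch below is not part of the patch. It is a portable C
model of the three predicates that could serve as a reference when checking
an arch-specific version. It assumes a 32-bit op_t (as on SH4) and uses the
classic subtract-and-mask test instead of any SH instruction; the model_*
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: op_t is a 32-bit unsigned word, as on SH4.  */
typedef uint32_t op_t;

/* Hypothetical portable model of has_zero: true if any byte of X is zero.
   Subtracting 1 from each byte sets bit 7 of any byte that was zero;
   masking with ~x discards bytes whose bit 7 was already set.  Bytes
   above a zero byte may also be flagged because of the borrow, which
   does not matter for a boolean result.  */
static bool
model_has_zero (op_t x)
{
  const op_t ones = UINT32_C (0x01010101);
  const op_t highs = UINT32_C (0x80808080);
  return ((x - ones) & ~x & highs) != 0;
}

/* Model of has_eq: equal bytes become zero bytes after XOR.  */
static bool
model_has_eq (op_t x1, op_t x2)
{
  return model_has_zero (x1 ^ x2);
}

/* Model of has_zero_eq: a zero byte in X1, or an equal byte pair.  */
static bool
model_has_zero_eq (op_t x1, op_t x2)
{
  return model_has_zero (x1) | model_has_eq (x1, x2);
}

/* Small self-check over hand-picked words.  */
static void
model_self_check (void)
{
  assert (model_has_zero (UINT32_C (0x11002233)));
  assert (!model_has_zero (UINT32_C (0x11223344)));
  assert (!model_has_zero (UINT32_C (0x80808080)));
  assert (model_has_eq (UINT32_C (0x11223344), UINT32_C (0xAA22BBCC)));
  assert (!model_has_eq (UINT32_C (0x11223344), UINT32_C (0x44332211)));
  assert (model_has_zero_eq (UINT32_C (0x11003344), UINT32_C (0x55667788)));
  assert (!model_has_zero_eq (UINT32_C (0x11223344), UINT32_C (0x55667788)));
}
```

Calling model_self_check () from a test driver exercises the model. On a
real SH target the same inputs could be fed to the patch's has_zero,
has_eq and has_zero_eq and the results compared against the model.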