From patchwork Fri Jan 13 18:27:17 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 642036
To: libc-alpha@sourceware.org
Cc: adhemerval.zanella@linaro.org, goldstein.w.n@gmail.com
Subject: [PATCH v8 01/17] Parameterize op_t from memcopy.h
Date: Fri, 13 Jan 2023 08:27:17 -1000
Message-Id: <20230113182733.1268668-2-richard.henderson@linaro.org>
In-Reply-To: <20230113182733.1268668-1-richard.henderson@linaro.org>
References: <20230113182733.1268668-1-richard.henderson@linaro.org>
List-Id: Libc-alpha mailing list
From: Richard Henderson
Reply-To: Richard Henderson

From: Adhemerval Zanella Netto

It moves the op_t definition out to a specific header, adds the
'may_alias' attribute, and cleans up its duplicated definitions.

Checked with a build and check with run-built-tests=no for all major
Linux ABIs.
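
As a rough illustration of why the attribute matters (this sketch is not
part of the patch; word_t, load_word and load_word_matches_memcpy are
hypothetical names, and GCC or Clang is assumed): string routines read
char buffers one word at a time, and without 'may_alias' the compiler is
allowed to assume that a word-typed access never touches char-typed data.

```c
#include <stdalign.h>
#include <string.h>

/* Hypothetical stand-in for op_t: a word type that may alias any other
   object type, as char does.  Assumes GCC or Clang.  */
typedef unsigned long int __attribute__ ((__may_alias__)) word_t;

/* Load one word from a suitably aligned byte buffer through a word_t
   lvalue; the attribute exempts this access from strict-aliasing
   assumptions.  */
static inline word_t
load_word (const char *p)
{
  return *(const word_t *) p;
}

/* Return 1 if the aliasing load agrees with a memcpy-based load.  */
static inline int
load_word_matches_memcpy (void)
{
  alignas (sizeof (unsigned long int)) char buf[sizeof (unsigned long int)];
  unsigned long int expect;

  for (size_t i = 0; i < sizeof buf; i++)
    buf[i] = (char) (i + 1);
  memcpy (&expect, buf, sizeof expect);
  return load_word (buf) == expect;
}
```

The memcpy comparison is only there to show that both loads see the same
bytes; the point of the typedef is that the direct load is well defined.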
Message-Id: <20230111204558.2402155-2-adhemerval.zanella@linaro.org>
Reviewed-by: Richard Henderson
---
 sysdeps/generic/memcopy.h          |  6 ++----
 sysdeps/generic/string-optype.h    | 24 ++++++++++++++++++++++++
 sysdeps/x86_64/x32/string-optype.h | 24 ++++++++++++++++++++++++
 string/memcmp.c                    |  1 -
 4 files changed, 50 insertions(+), 5 deletions(-)
 create mode 100644 sysdeps/generic/string-optype.h
 create mode 100644 sysdeps/x86_64/x32/string-optype.h

diff --git a/sysdeps/generic/memcopy.h b/sysdeps/generic/memcopy.h
index 9f3ffb5d30..b5ffa4d114 100644
--- a/sysdeps/generic/memcopy.h
+++ b/sysdeps/generic/memcopy.h
@@ -55,10 +55,8 @@
    [I fail to understand.  I feel stupid.  --roland]
 */
 
-/* Type to use for aligned memory operations.
-   This should normally be the biggest type supported by a single load
-   and store.  */
-#define op_t unsigned long int
+/* Type to use for aligned memory operations.  */
+#include <string-optype.h>
 #define OPSIZ (sizeof (op_t))
 
 /* Type to use for unaligned operations.  */
diff --git a/sysdeps/generic/string-optype.h b/sysdeps/generic/string-optype.h
new file mode 100644
index 0000000000..42bdd2a145
--- /dev/null
+++ b/sysdeps/generic/string-optype.h
@@ -0,0 +1,24 @@
+/* Define a type to use for word access.  Generic version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _STRING_OPTYPE_H
+#define _STRING_OPTYPE_H 1
+
+typedef unsigned long int __attribute__ ((__may_alias__)) op_t;
+
+#endif /* string-optype.h */
diff --git a/sysdeps/x86_64/x32/string-optype.h b/sysdeps/x86_64/x32/string-optype.h
new file mode 100644
index 0000000000..e7679f934f
--- /dev/null
+++ b/sysdeps/x86_64/x32/string-optype.h
@@ -0,0 +1,24 @@
+/* Define a type to use for word access.  Generic version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _STRING_OPTYPE_H
+#define _STRING_OPTYPE_H 1
+
+typedef unsigned long long int __attribute__ ((__may_alias__)) op_t;
+
+#endif /* string-optype.h */
diff --git a/string/memcmp.c b/string/memcmp.c
index 067b2e6a42..ea0fa03e1c 100644
--- a/string/memcmp.c
+++ b/string/memcmp.c
@@ -46,7 +46,6 @@
 /* Type to use for aligned memory operations.
    This should normally be the biggest type supported by a single load
    and store.  Must be an unsigned type.  */
-# define op_t unsigned long int
 # define OPSIZ (sizeof (op_t))
 
 /* Threshold value for when to enter the unrolled loops.  */

From patchwork Fri Jan 13 18:27:19 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 642041
To: libc-alpha@sourceware.org
Cc: adhemerval.zanella@linaro.org, goldstein.w.n@gmail.com
Subject: [PATCH v8 03/17] Add string-maskoff.h generic header
Date: Fri, 13 Jan 2023 08:27:19 -1000
Message-Id: <20230113182733.1268668-4-richard.henderson@linaro.org>
In-Reply-To: <20230113182733.1268668-1-richard.henderson@linaro.org>
References: <20230113182733.1268668-1-richard.henderson@linaro.org>
List-Id: Libc-alpha mailing list
From: Richard Henderson
Reply-To: Richard Henderson

From: Adhemerval Zanella Netto

Macros to operate on unaligned access for string operations:

  - create_mask: create a mask based on pointer alignment that sets up
    non-zero bytes before the beginning of the word, so a following
    operation (such as find zero) can ignore these bytes.

  - repeat_bytes: set up a word with each byte being c_in.

  - highbit_mask: mask off the high bit of each byte of a mask created
    by create_mask.

  - word_containing: return the address of the op_t word containing the
    address.

These macros are meant to be used on optimized vectorized string
implementations.
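
The four helpers described above can be sketched standalone as follows.
This is only an illustration, not the header added by the patch: word_t
is a hypothetical 64-bit stand-in for op_t, the sketch_ names are made
up, and the byte order test assumes the GCC/Clang predefined macros.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for op_t, fixed at 64 bits so the values in
   the comments below hold on any host.  */
typedef uint64_t word_t;

/* Assumes the GCC/Clang predefined byte-order macros.  */
#define SKETCH_LITTLE_ENDIAN (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)

/* Replicate C into every byte of a word: 0xce -> 0xcececececececece.  */
static inline word_t
sketch_repeat_bytes (unsigned char c)
{
  return ((word_t) -1 / 0xff) * c;
}

/* Mask covering the first (ADDR % 8) bytes of the word in memory
   order; zero when ADDR is already word aligned.  */
static inline word_t
sketch_create_mask (uintptr_t addr)
{
  unsigned int shift = (addr % sizeof (word_t)) * CHAR_BIT;
  if (SKETCH_LITTLE_ENDIAN)
    return ~((word_t) -1 << shift);
  else
    return ~((word_t) -1 >> shift);
}

/* Clear the high bit of every byte of mask M.  */
static inline word_t
sketch_highbit_mask (word_t m)
{
  return m & sketch_repeat_bytes (0x7f);
}

/* Round ADDR down to the start of the word that contains it.  */
static inline uintptr_t
sketch_word_containing (uintptr_t addr)
{
  return addr & -(uintptr_t) sizeof (word_t);
}
```

For an address ending in 3, sketch_create_mask gives 0x0000000000ffffff
on little-endian hosts and 0xffffff0000000000 on big-endian ones, that
is, the three bytes that precede the pointer inside its aligned word.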
Message-Id: <20230111204558.2402155-4-adhemerval.zanella@linaro.org>
---
 sysdeps/generic/string-maskoff.h | 73 ++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)
 create mode 100644 sysdeps/generic/string-maskoff.h

diff --git a/sysdeps/generic/string-maskoff.h b/sysdeps/generic/string-maskoff.h
new file mode 100644
index 0000000000..73edd5ad0f
--- /dev/null
+++ b/sysdeps/generic/string-maskoff.h
@@ -0,0 +1,73 @@
+/* Mask off bits.  Generic C version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _STRING_MASKOFF_H
+#define _STRING_MASKOFF_H 1
+
+#include <endian.h>
+#include <limits.h>
+#include <stdint.h>
+#include <string-optype.h>
+
+/* Provide a mask based on the pointer alignment that sets up non-zero
+   bytes before the beginning of the word.  It is used to mask off
+   undesirable bits from an aligned read from an unaligned pointer.
+   For instance, on a 64-bit machine with a pointer alignment of
+   3 the function returns 0x0000000000ffffff for LE and 0xffffff0000000000
+   for BE (meaning to mask off the initial 3 bytes).  */
+static __always_inline op_t
+create_mask (uintptr_t i)
+{
+  i = i % sizeof (op_t);
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    return ~(((op_t)-1) << (i * CHAR_BIT));
+  else
+    return ~(((op_t)-1) >> (i * CHAR_BIT));
+}
+
+/* Set up a word with each byte being c_in.  For instance, on a 64-bit
+   machine with input as 0xce the function returns 0xcececececececece.  */
+static __always_inline op_t
+repeat_bytes (unsigned char c_in)
+{
+  return ((op_t)-1 / 0xff) * c_in;
+}
+
+/* Based on mask created by 'create_mask', mask off the high bit of each
+   byte in the mask.  It is used to mask off undesirable bits from an
+   aligned read from an unaligned pointer, while also taking care to avoid
+   matching possible bytes meant to be matched.  For instance, on a 64-bit
+   machine with a mask created from a pointer with an alignment of 3
+   (0x0000000000ffffff) the function returns 0x7f7f7f0000000000 for BE
+   and 0x00000000007f7f7f for LE.  */
+static __always_inline op_t
+highbit_mask (op_t m)
+{
+  return m & repeat_bytes (0x7f);
+}
+
+/* Return the address of the op_t word containing the address P.  For
+   instance on address 0x0011223344556677 and op_t with size of 8,
+   it returns 0x0011223344556670.  */
+static __always_inline op_t *
+word_containing (char const *p)
+{
+  return (op_t *) ((uintptr_t) p & -sizeof (op_t));
+}
+
+#endif /* _STRING_MASKOFF_H */

From patchwork Fri Jan 13 18:27:20 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 642037
To: libc-alpha@sourceware.org
Cc: adhemerval.zanella@linaro.org, goldstein.w.n@gmail.com
Subject: [PATCH v8 04/17] Add string vectorized find and detection functions
Date: Fri, 13 Jan 2023 08:27:20 -1000
Message-Id: <20230113182733.1268668-5-richard.henderson@linaro.org>
In-Reply-To: <20230113182733.1268668-1-richard.henderson@linaro.org>
References: <20230113182733.1268668-1-richard.henderson@linaro.org>
List-Id: Libc-alpha mailing list
From: Richard Henderson
Reply-To: Richard Henderson

From: Adhemerval Zanella Netto

This patch adds generic string find and detection functions meant to be
used in generic vectorized string implementations.  The idea is to
decompose the basic string operations so each architecture can
reimplement them if it provides a specialized hardware instruction.

The 'string-fza.h' provides zero byte detection functions:

  - find_zero_low, find_zero_all, find_eq_low, find_eq_all,
    find_zero_eq_low, find_zero_eq_all, find_zero_ne_low, and
    find_zero_ne_all

The 'string-fzb.h' provides boolean zero byte detection functions:

  - has_zero: determine if any byte within a word is zero.
  - has_eq: determine byte equality between two words.
  - has_zero_eq: determine if any byte within a word is zero along with
    byte equality between two words.

The 'string-fzi.h' provides positions for string-fza.h results:

  - index_first_zero: return index of first zero byte within a word.
  - index_first_eq: return index of first byte equal between two words.

The 'string-fzc.h' provides a combined version of fza and fzi:

  - index_first_zero_eq: return index of first zero byte within a word or
    first byte equal between two words.
  - index_first_zero_ne: return index of first zero byte within a word or
    first byte different between two words.
  - index_last_zero: return index of last zero byte within a word.
  - index_last_eq: return index of last byte equal between two words.

Co-authored-by: Richard Henderson
Message-Id: <20230111204558.2402155-5-adhemerval.zanella@linaro.org>
---
 sysdeps/generic/string-extbyte.h |  37 ++++++++++
 sysdeps/generic/string-fza.h     | 116 +++++++++++++++++++++++++++++++
 sysdeps/generic/string-fzb.h     |  49 +++++++++++++
 sysdeps/generic/string-fzc.h     |  91 ++++++++++++++++++++++++
 sysdeps/generic/string-fzi.h     |  71 +++++++++++++++++++
 5 files changed, 364 insertions(+)
 create mode 100644 sysdeps/generic/string-extbyte.h
 create mode 100644 sysdeps/generic/string-fza.h
 create mode 100644 sysdeps/generic/string-fzb.h
 create mode 100644 sysdeps/generic/string-fzc.h
 create mode 100644 sysdeps/generic/string-fzi.h

diff --git a/sysdeps/generic/string-extbyte.h b/sysdeps/generic/string-extbyte.h
new file mode 100644
index 0000000000..38b4674dca
--- /dev/null
+++ b/sysdeps/generic/string-extbyte.h
@@ -0,0 +1,37 @@
+/* Extract byte from memory word.  Generic C version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _STRING_EXTBYTE_H
+#define _STRING_EXTBYTE_H 1
+
+#include <limits.h>
+#include <endian.h>
+#include <string-optype.h>
+
+/* Extract the byte at index IDX from word X, with index 0 being the
+   least significant byte.  */
+static __always_inline unsigned char
+extractbyte (op_t x, unsigned int idx)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    return x >> (idx * CHAR_BIT);
+  else
+    return x >> (sizeof (x) - 1 - idx) * CHAR_BIT;
+}
+
+#endif /* _STRING_EXTBYTE_H */
diff --git a/sysdeps/generic/string-fza.h b/sysdeps/generic/string-fza.h
new file mode 100644
index 0000000000..3ac0111a74
--- /dev/null
+++ b/sysdeps/generic/string-fza.h
@@ -0,0 +1,116 @@
+/* Basic zero byte detection.  Generic C version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _STRING_FZA_H
+#define _STRING_FZA_H 1
+
+#include <limits.h>
+#include <string-optype.h>
+#include <string-maskoff.h>
+
+/* The functions return a byte mask.  */
+typedef op_t find_t;
+
+/* Return the mask WORD shifted based on the address value S, to ignore
+   values not present in the aligned word read.  */
+static __always_inline find_t
+shift_find (find_t word, uintptr_t s)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    return word >> (CHAR_BIT * (s % sizeof (op_t)));
+  else
+    return word << (CHAR_BIT * (s % sizeof (op_t)));
+}
+
+/* This function returns non-zero if any byte in X is zero.
+   More specifically, at least one bit set within the least significant
+   byte that was zero; other bytes within the word are indeterminate.  */
+static __always_inline find_t
+find_zero_low (op_t x)
+{
+  /* This expression comes from
+       https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord
+     Subtracting 1 sets 0x80 in a byte that was 0; anding ~x clears
+     0x80 in a byte that was >= 128; anding 0x80 isolates that test bit.  */
+  op_t lsb = repeat_bytes (0x01);
+  op_t msb = repeat_bytes (0x80);
+  return (x - lsb) & ~x & msb;
+}
+
+/* This function returns at least one bit set within every byte of X that
+   is zero.  The result is exact in that, unlike find_zero_low, all bytes
+   are determinate.  This is usually used for finding the index of the
+   most significant byte that was zero.  */
+static __always_inline find_t
+find_zero_all (op_t x)
+{
+  /* For each byte, find not-zero by
+     (0) And 0x7f so that we cannot carry between bytes,
+     (1) Add 0x7f so that non-zero carries into 0x80,
+     (2) Or in the original byte (which might have had 0x80 set).
+     Then invert and mask such that 0x80 is set iff that byte was zero.  */
+  op_t m = repeat_bytes (0x7f);
+  return ~(((x & m) + m) | x | m);
+}
+
+/* With similar caveats, identify bytes that are equal between X1 and X2.  */
+static __always_inline find_t
+find_eq_low (op_t x1, op_t x2)
+{
+  return find_zero_low (x1 ^ x2);
+}
+
+static __always_inline find_t
+find_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1 ^ x2);
+}
+
+/* With similar caveats, identify zero bytes in X1 and bytes that are
+   equal between X1 and X2.  */
+static __always_inline find_t
+find_zero_eq_low (op_t x1, op_t x2)
+{
+  return find_zero_low (x1) | find_zero_low (x1 ^ x2);
+}
+
+static __always_inline find_t
+find_zero_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1) | find_zero_all (x1 ^ x2);
+}
+
+/* With similar caveats, identify zero bytes in X1 and bytes that are
+   not equal between X1 and X2.  */
+static __always_inline find_t
+find_zero_ne_low (op_t x1, op_t x2)
+{
+  return (~find_zero_eq_low (x1, x2)) + 1;
+}
+
+static __always_inline find_t
+find_zero_ne_all (op_t x1, op_t x2)
+{
+  op_t m = repeat_bytes (0x7f);
+  op_t eq = x1 ^ x2;
+  op_t nz1 = ((x1 & m) + m) | x1;
+  op_t ne2 = ((eq & m) + m) | eq;
+  return (ne2 | ~nz1) & ~m;
+}
+
+#endif /* _STRING_FZA_H */
diff --git a/sysdeps/generic/string-fzb.h b/sysdeps/generic/string-fzb.h
new file mode 100644
index 0000000000..42de500d67
--- /dev/null
+++ b/sysdeps/generic/string-fzb.h
@@ -0,0 +1,49 @@
+/* Zero byte detection, boolean.  Generic C version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _STRING_FZB_H
+#define _STRING_FZB_H 1
+
+#include <endian.h>
+#include <string-fza.h>
+
+/* Determine if any byte within X is zero.  This is a pure boolean test.  */
+
+static __always_inline _Bool
+has_zero (op_t x)
+{
+  return find_zero_low (x) != 0;
+}
+
+/* Likewise, but for byte equality between X1 and X2.  */
+
+static __always_inline _Bool
+has_eq (op_t x1, op_t x2)
+{
+  return find_eq_low (x1, x2) != 0;
+}
+
+/* Likewise, but for zeros in X1 and equal bytes between X1 and X2.  */
+
+static __always_inline _Bool
+has_zero_eq (op_t x1, op_t x2)
+{
+  return find_zero_eq_low (x1, x2);
+}
+
+#endif /* _STRING_FZB_H */
diff --git a/sysdeps/generic/string-fzc.h b/sysdeps/generic/string-fzc.h
new file mode 100644
index 0000000000..f159254535
--- /dev/null
+++ b/sysdeps/generic/string-fzc.h
@@ -0,0 +1,91 @@
+/* Zero byte detection; indexes.  Generic C version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _STRING_FZC_H
+#define _STRING_FZC_H 1
+
+#include <endian.h>
+#include <string-fza.h>
+#include <string-fzi.h>
+
+
+/* Given a word X that is known to contain a zero byte, return the index of
+   the first such within the word in memory order.
*/ +static __always_inline unsigned int +index_first_zero (op_t x) +{ + if (__BYTE_ORDER == __LITTLE_ENDIAN) + x = find_zero_low (x); + else + x = find_zero_all (x); + return index_first (x); +} + +/* Similarly, but perform the search for byte equality between X1 and X2. */ +static __always_inline unsigned int +index_first_eq (op_t x1, op_t x2) +{ + if (__BYTE_ORDER == __LITTLE_ENDIAN) + x1 = find_eq_low (x1, x2); + else + x1 = find_eq_all (x1, x2); + return index_first (x1); +} + +/* Similarly, but perform the search for zero within X1 or equality between + X1 and X2. */ +static __always_inline unsigned int +index_first_zero_eq (op_t x1, op_t x2) +{ + if (__BYTE_ORDER == __LITTLE_ENDIAN) + x1 = find_zero_eq_low (x1, x2); + else + x1 = find_zero_eq_all (x1, x2); + return index_first (x1); +} + +/* Similarly, but perform the search for zero within X1 or inequality between + X1 and X2. */ +static __always_inline unsigned int +index_first_zero_ne (op_t x1, op_t x2) +{ + if (__BYTE_ORDER == __LITTLE_ENDIAN) + x1 = find_zero_ne_low (x1, x2); + else + x1 = find_zero_ne_all (x1, x2); + return index_first (x1); +} + +/* Similarly, but search for the last zero within X. */ +static __always_inline unsigned int +index_last_zero (op_t x) +{ + if (__BYTE_ORDER == __LITTLE_ENDIAN) + x = find_zero_all (x); + else + x = find_zero_low (x); + return index_last (x); +} + +static __always_inline unsigned int +index_last_eq (op_t x1, op_t x2) +{ + return index_last_zero (x1 ^ x2); +} + +#endif /* STRING_FZC_H */ diff --git a/sysdeps/generic/string-fzi.h b/sysdeps/generic/string-fzi.h new file mode 100644 index 0000000000..2deecefc23 --- /dev/null +++ b/sysdeps/generic/string-fzi.h @@ -0,0 +1,71 @@ +/* Zero byte detection; indexes. Generic C version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef _STRING_FZI_H +#define _STRING_FZI_H 1 + +#include +#include +#include + +static __always_inline int +clz (find_t c) +{ + if (sizeof (find_t) == sizeof (unsigned long)) + return __builtin_clzl (c); + else + return __builtin_clzll (c); +} + +static __always_inline int +ctz (find_t c) +{ + if (sizeof (find_t) == sizeof (unsigned long)) + return __builtin_ctzl (c); + else + return __builtin_ctzll (c); +} + +/* A subroutine for the index_zero functions. Given a test word C, return + the (memory order) index of the first byte (in memory order) that is + non-zero. */ +static __always_inline unsigned int +index_first (find_t c) +{ + int r; + if (__BYTE_ORDER == __LITTLE_ENDIAN) + r = ctz (c); + else + r = clz (c); + return r / CHAR_BIT; +} + +/* Similarly, but return the (memory order) index of the last byte that is + non-zero. 
*/ +static __always_inline unsigned int +index_last (find_t c) +{ + int r; + if (__BYTE_ORDER == __LITTLE_ENDIAN) + r = clz (c); + else + r = ctz (c); + return sizeof (find_t) - 1 - (r / CHAR_BIT); +} + +#endif /* STRING_FZI_H */ From patchwork Fri Jan 13 18:27:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 642038 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp357852pvb; Fri, 13 Jan 2023 10:28:08 -0800 (PST) X-Google-Smtp-Source: AMrXdXu31Wi7tXtwGDoujgfTDSxKE32RbmIHxk0NsuDx5milW4x4AxSaIS3RxIkuWrb1eyoUAHpS X-Received: by 2002:a17:907:2d09:b0:86c:988d:178c with SMTP id gs9-20020a1709072d0900b0086c988d178cmr230324ejc.52.1673634488418; Fri, 13 Jan 2023 10:28:08 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673634488; cv=none; d=google.com; s=arc-20160816; b=sDrl+7IZ6PHSCMQJ3dPI8py0Gnq0vh7mDD4v5nljVtBbCbJ6pIuG7C78XzUehvZXga SyHywD5u/W4s5oE+9V8t+RAgxO2xo4lC+jxAzKxHtXUDsDxPTCGB/fr0Un7pqIG7HP3Q 22CGhH43vKBsOIDdyvfKx2GX6joj8xrFcQ1llTFX3PVlGR8Evlq0sxNO80ovzYpnDSaF Snn8/HvgR/o3RUloYIJZ5IlhR47j20e0EDE9SXd9hL7tDoHVjjXg1CLB06qUuHJdg3O6 LVT1ChZ+p90TkBkNJmxHUdvb/bjoAQsqC7gs6rS0G6J7n0j5WX6MIvqbb+cHwjG/kQSV PGTw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=OqHYFL+CdMGKg9rYN0RAWE2xppStfJKEdo6sEDhhdCk=; b=lWLcxCln4+f4ghEYM1qfrlryRhznXfvFU9t63NHlNclaVC/mlgvNEWfSWr88jVS4qq Q8aBqXOu246D2W/Cmgv7dpwBsoXtYjbqgp7rT7nt1S7X8KPY+CeGCa0/yuzynkE/ETaM jCknTNguFjeCONAMhRrNRAx4ugAXEgD4oR+vWWp4dMQr+YA4gWbu3OZIGU+UUVgG7saH u8P2TxoupiA+UpdhRjxZsjeoBfieOnDw/HOgntUpJgqdwrUmX2ooPtFrmpvLH6ksqOjh 
j2Gj1SFdL/+jhaLoeCNcfgp8CK2p2o4QR0le0yJc5p+2kSpjauknoUviCeXgS0GlJDxO Hd6Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=iOkX7Xcw; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Return-Path: Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id dt19-20020a170907729300b007a7f207a1b9si20567567ejc.664.2023.01.13.10.28.08 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 13 Jan 2023 10:28:08 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=iOkX7Xcw; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1048F3881D07 for ; Fri, 13 Jan 2023 18:28:07 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 1048F3881D07 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1673634487; bh=OqHYFL+CdMGKg9rYN0RAWE2xppStfJKEdo6sEDhhdCk=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=iOkX7XcwnAvVBK4DQc8o/ZF4VQkxdVpMiEQaS33O7hJoPGwrj9AdVgEWCUzsPz7r6 
Z7bdLokfwYeakcV+0nueq1qg9Z+Qf/4OEU6+afTK3SpKt3Sn5MEn2hr/buYyvS0sI6 s9xgaDxpbhvdK/Ri7eLCzBM9d95SVXPtGv0lHfhI= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pj1-x1034.google.com (mail-pj1-x1034.google.com [IPv6:2607:f8b0:4864:20::1034]) by sourceware.org (Postfix) with ESMTPS id 0053A38493FF for ; Fri, 13 Jan 2023 18:27:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0053A38493FF Received: by mail-pj1-x1034.google.com with SMTP id o18so2057790pji.1 for ; Fri, 13 Jan 2023 10:27:44 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=OqHYFL+CdMGKg9rYN0RAWE2xppStfJKEdo6sEDhhdCk=; b=YsGlOjTZTsb4HRxf56z0EU+1ecWNKBtaFPWi/HWvbOyerQKFix7E89QjjkslKW8fxj yb9/Y+yC9WDAYSv6TNViK4xr1s/URKZj18/Oqlaoq+XUpobcqr/YTVZNp8SozLEUi4GD +g85Sm7XIRQudvY8XvyQGnlAADy5wTpxxZVJeGD3tdvT/T956rSWa/xfn6L96FzyGNu+ Mojk4iXJBMO9vfTHRRJw3m52MPESCMTlAN6nIU0qK3XuEH9vkwNRjf1e6KaaJ0+L9X6H MSK6jBmPpaJvHnei5ckYQC5fK5sjUp+tr4n+lVnALhx2uU5jsWI9fnN8CYUsF0BjzMjm QgNA== X-Gm-Message-State: AFqh2kqDdiGf9ZVaAiSHgRTpYOyCi6iPKsn1j4v+M5z2qHsPiR2EraQV +QNmE6va4eHn8rHgZAxDHEwpuSpDCC70M/57 X-Received: by 2002:a17:902:6945:b0:194:7536:de5f with SMTP id k5-20020a170902694500b001947536de5fmr1923829plt.56.1673634463990; Fri, 13 Jan 2023 10:27:43 -0800 (PST) Received: from stoup.. (rrcs-173-198-77-218.west.biz.rr.com. 
[173.198.77.218]) by smtp.gmail.com with ESMTPSA id s17-20020a170902c65100b001927ebc40e2sm14443640pls.193.2023.01.13.10.27.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 13 Jan 2023 10:27:43 -0800 (PST) To: libc-alpha@sourceware.org Cc: adhemerval.zanella@linaro.org, goldstein.w.n@gmail.com Subject: [PATCH v8 05/17] string: Improve generic strlen Date: Fri, 13 Jan 2023 08:27:21 -1000 Message-Id: <20230113182733.1268668-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230113182733.1268668-1-richard.henderson@linaro.org> References: <20230113182733.1268668-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Henderson via Libc-alpha From: Richard Henderson Reply-To: Richard Henderson Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Sender: "Libc-alpha"

From: Adhemerval Zanella Netto

The new algorithm has the following key differences:

- Reads the first word from the aligned address that contains the start of the string and uses the string-maskoff functions to mask off the unwanted leading bytes. This strategy follows the arch-specific optimizations used on powerpc, sparc, and SH.

- Uses the parametrized has_zero and index_first_zero functions.

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu, and powerpc64-linux-gnu by removing the arch-specific assembly implementation and disabling multi-arch (this covers both LE and BE for 64 and 32 bits).
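
For readers who want to try the idea outside of libc, here is a minimal standalone sketch of the approach described above. It is not the glibc code: the names word_t, load_word, first_zero_index and sketch_strlen are hypothetical stand-ins for op_t, the aligned load, index_first_zero and __strlen. To stay independent of byte order it masks the leading bytes and locates the zero byte with plain byte copies, where the patch uses create_mask and ctz/clz. Like the real code it reads whole aligned words, so it may touch bytes outside the string within the first and last word; that is assumed to be safe here only because the caller passes a word-aligned, padded buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t word_t;	/* Hypothetical stand-in for glibc's op_t.  */

/* Load one word from P without aliasing problems.  */
word_t
load_word (const unsigned char *p)
{
  word_t w;
  memcpy (&w, p, sizeof w);
  return w;
}

/* Non-zero iff some byte of X is zero (same expression as find_zero_low).  */
int
has_zero (word_t x)
{
  const word_t lsb = (word_t) -1 / 0xff;	/* 0x0101...01 */
  const word_t msb = lsb * 0x80;		/* 0x8080...80 */
  return ((x - lsb) & ~x & msb) != 0;
}

/* Memory-order index of the first zero byte in X, which must contain one.
   The patch computes this with ctz/clz; a byte loop keeps the sketch
   independent of endianness.  */
unsigned int
first_zero_index (word_t x)
{
  unsigned char b[sizeof x];
  assert (has_zero (x));
  memcpy (b, &x, sizeof x);
  unsigned int i = 0;
  while (b[i] != 0)
    i++;
  return i;
}

size_t
sketch_strlen (const char *str)
{
  /* Step back to the aligned word that contains STR.  */
  size_t off = (uintptr_t) str % sizeof (word_t);
  const unsigned char *p = (const unsigned char *) str - off;

  /* Read the first word and force the bytes before STR to be non-zero,
     so that they can never be mistaken for the terminator.  */
  unsigned char first[sizeof (word_t)];
  memcpy (first, p, sizeof first);
  memset (first, 0xff, off);
  word_t w = load_word (first);

  /* Scan one aligned word at a time.  */
  while (!has_zero (w))
    {
      p += sizeof (word_t);
      w = load_word (p);
    }

  return (size_t) ((p + first_zero_index (w)) - (const unsigned char *) str);
}
```

Calling sketch_strlen on any offset into an 8-byte-aligned buffer whose size is a multiple of 8 should give the same result as strlen on that offset.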
Co-authored-by: Richard Henderson Message-Id: <20230111204558.2402155-6-adhemerval.zanella@linaro.org> --- string/strlen.c | 87 +++++++++-------------------------------- sysdeps/s390/strlen-c.c | 12 +++--- 2 files changed, 25 insertions(+), 74 deletions(-) diff --git a/string/strlen.c b/string/strlen.c index ee1aae0fff..9769dc62c3 100644 --- a/string/strlen.c +++ b/string/strlen.c @@ -17,84 +17,33 @@ #include #include +#include +#include +#include +#include -#undef strlen - -#ifndef STRLEN -# define STRLEN strlen +#ifdef STRLEN +# define __strlen STRLEN #endif /* Return the length of the null-terminated string STR. Scan for the null terminator quickly by testing four bytes at a time. */ size_t -STRLEN (const char *str) +__strlen (const char *str) { - const char *char_ptr; - const unsigned long int *longword_ptr; - unsigned long int longword, himagic, lomagic; + /* Align pointer to sizeof op_t. */ + const uintptr_t s_int = (uintptr_t) str; + const op_t *word_ptr = word_containing (str); - /* Handle the first few characters by reading one character at a time. - Do this until CHAR_PTR is aligned on a longword boundary. */ - for (char_ptr = str; ((unsigned long int) char_ptr - & (sizeof (longword) - 1)) != 0; - ++char_ptr) - if (*char_ptr == '\0') - return char_ptr - str; + /* Read and MASK the first word. */ + op_t word = *word_ptr | create_mask (s_int); - /* All these elucidatory comments refer to 4-byte longwords, - but the theory applies equally well to 8-byte longwords. */ + while (! has_zero (word)) + word = *++word_ptr; - longword_ptr = (unsigned long int *) char_ptr; - - /* Computing (longword - lomagic) sets the high bit of any corresponding - byte that is either zero or greater than 0x80. The latter case can be - filtered out by computing (~longword & himagic). The final result - will always be non-zero if one of the bytes of longword is zero. 
*/ - himagic = 0x80808080L; - lomagic = 0x01010101L; - if (sizeof (longword) > 4) - { - /* 64-bit version of the magic. */ - /* Do the shift in two steps to avoid a warning if long has 32 bits. */ - himagic = ((himagic << 16) << 16) | himagic; - lomagic = ((lomagic << 16) << 16) | lomagic; - } - if (sizeof (longword) > 8) - abort (); - - /* Instead of the traditional loop which tests each character, - we will test a longword at a time. The tricky part is testing - if *any of the four* bytes in the longword in question are zero. */ - for (;;) - { - longword = *longword_ptr++; - - if (((longword - lomagic) & ~longword & himagic) != 0) - { - /* Which of the bytes was the zero? */ - - const char *cp = (const char *) (longword_ptr - 1); - - if (cp[0] == 0) - return cp - str; - if (cp[1] == 0) - return cp - str + 1; - if (cp[2] == 0) - return cp - str + 2; - if (cp[3] == 0) - return cp - str + 3; - if (sizeof (longword) > 4) - { - if (cp[4] == 0) - return cp - str + 4; - if (cp[5] == 0) - return cp - str + 5; - if (cp[6] == 0) - return cp - str + 6; - if (cp[7] == 0) - return cp - str + 7; - } - } - } + return ((const char *) word_ptr) + index_first_zero (word) - str; } +#ifndef STRLEN +weak_alias (__strlen, strlen) libc_hidden_builtin_def (strlen) +#endif diff --git a/sysdeps/s390/strlen-c.c b/sysdeps/s390/strlen-c.c index b829ef2452..0a33a6f8e5 100644 --- a/sysdeps/s390/strlen-c.c +++ b/sysdeps/s390/strlen-c.c @@ -21,12 +21,14 @@ #if HAVE_STRLEN_C # if HAVE_STRLEN_IFUNC # define STRLEN STRLEN_C -# if defined SHARED && IS_IN (libc) -# undef libc_hidden_builtin_def -# define libc_hidden_builtin_def(name) \ - __hidden_ver1 (__strlen_c, __GI_strlen, __strlen_c); -# endif # endif # include + +# if HAVE_STRLEN_IFUNC +# if defined SHARED && IS_IN (libc) +__hidden_ver1 (__strlen_c, __GI_strlen, __strlen_c); +# endif +# endif + #endif From patchwork Fri Jan 13 18:27:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 642044 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp358599pvb; Fri, 13 Jan 2023 10:29:57 -0800 (PST) X-Google-Smtp-Source: AMrXdXsfyUOpmrl3OroZmXtJ2ugjch33USRg774ois6ePlbLPsHbcLSzG1lF5RI1Wu4GXtfxt8/y X-Received: by 2002:a17:906:5e41:b0:84d:465e:49b7 with SMTP id b1-20020a1709065e4100b0084d465e49b7mr16799581eju.63.1673634597176; Fri, 13 Jan 2023 10:29:57 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673634597; cv=none; d=google.com; s=arc-20160816; b=SMSoYwXLUQhjo+bMOWdkpnD6PnssgPDYUZz4JCxpEgw4xQkqL+6jzTyHwbvWkAWOVP KSQYvJsFqtC9g3Fk429YQNyM5mWIioPNIRfeza9M44SI7a0/YkOrvCYyMqsfisupJS9s 8l2peQ6ZCB0QJOEL9HsaGY+6lZC1lrODruB2z3/HkxLFOTP6f3yZJanIUrzUVtjplBNd NsHhyMML0MREcGQLPODAzH0eg4UzuQrOczqTbTl1/2RYD9588OJxe2LWU58nZLGa91wj 9HZh5zNWnF/v8mOlyot2ioGmzSTq+Fuw1Q+oLljvLR+Hv7AbPWgMHGi641lLBZwNsLoR sX/Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=rubEpvIyp4P+ohkAK01kF2nuwVO/ToDcQCM/YO3DmBI=; b=qZBRXnyOrkyu2h2NYNzvnGTMqze+eo02pYyYwLVrJ85oUvzee+SJYM66flXaHZ2n1C R2hx5FL20n4OA6mA66kM5RX3LK5HqXE9s1tjb+y0hzlN9lWceTJ4H2JnN5CC9HDSDWm0 wIMeGlg3ygY0JOhFSH+QUaM3f5sgYobQ3vVCdV9T2bKXAVdzL3joVXN7i1hkaicVnfuu qpB/SF+/gitRY0VuFxsBoVP11HSAnAelwXVCRvSr7DMFRpmST2h6TVc8cdn5ui4wMJs1 5f0mqo4rjHdDzuLT1pTTgoEhTQEHXFGRvG3xanaMBG03pDEALgHLkUKWH7wz8D16d4Ka ga9g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=Iu11OtPV; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass 
(p=NONE sp=NONE dis=NONE) header.from=sourceware.org Return-Path: Received: from sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id s5-20020a17090699c500b007c103219025si22672167ejn.825.2023.01.13.10.29.56 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 13 Jan 2023 10:29:57 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=Iu11OtPV; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id EC3B53887F47 for ; Fri, 13 Jan 2023 18:29:55 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org EC3B53887F47 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1673634596; bh=rubEpvIyp4P+ohkAK01kF2nuwVO/ToDcQCM/YO3DmBI=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=Iu11OtPVs/Hn+/ogaY5Zax8T0rnyTMfmEjgzzNhTkYgOTjXgT+H1uasswWb1kh6AU uMPADrEVdTrxJrzm+MxPw06Zy/h9+RLdKrEwCqowZzCrPl9I8wKjbPCTLviGTzof6I t7HwA83hjnlKJpeIcpkcC+8An8Qo/4syWgO5iMI8= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pl1-x634.google.com (mail-pl1-x634.google.com [IPv6:2607:f8b0:4864:20::634]) by sourceware.org (Postfix) with ESMTPS id 8055638543BE for ; Fri, 13 Jan 2023 18:27:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8055638543BE Received: by mail-pl1-x634.google.com with SMTP id w3so24273291ply.3 for ; Fri, 13 Jan 2023 
10:27:46 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=rubEpvIyp4P+ohkAK01kF2nuwVO/ToDcQCM/YO3DmBI=; b=YOnX8dZ9Z+4tsEKtEMEaztC6UupNH4LHicjWTyhl0FhDZDl99DymqVy4BcNqyNB4tA CReg6NzXK+fqBCloY2z7H6vsMtTIx1ZDV7X+6khsz8loceFxNWb2RHXNTI1lwwGFRHfe mkM2s3AvgBI1kb/x8YUbUI5JQCQQ7Jj2RNp7CGFhc9g88GuVRUPfwN7MZ17c1b9wnHDr d1EmovcJiM3WJg+vj/1WG5u45mjkHW7VL0PalZRVdoVCXS5wnwBT/0sCJ3Vr5cgoEAaP gnvpk5y4DCn8e7y55+31d6dnxa3h0rnp4/GWkzSHRLuByeZISmfMIND04F1U9l/YDpz8 91DA== X-Gm-Message-State: AFqh2kpCECCQMergWYSQvGZpWfdjc+qHvWgaltme5KmCuMSBo3ETMWaH vM757AGzUbP8+HUOU4hEHIDBn95H/uXX3WBk X-Received: by 2002:a17:902:d4d1:b0:189:c322:df3a with SMTP id o17-20020a170902d4d100b00189c322df3amr111101707plg.43.1673634465458; Fri, 13 Jan 2023 10:27:45 -0800 (PST) Received: from stoup.. (rrcs-173-198-77-218.west.biz.rr.com. 
[173.198.77.218]) by smtp.gmail.com with ESMTPSA id s17-20020a170902c65100b001927ebc40e2sm14443640pls.193.2023.01.13.10.27.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 13 Jan 2023 10:27:45 -0800 (PST) To: libc-alpha@sourceware.org Cc: adhemerval.zanella@linaro.org, goldstein.w.n@gmail.com Subject: [PATCH v8 06/17] string: Improve generic strnlen Date: Fri, 13 Jan 2023 08:27:22 -1000 Message-Id: <20230113182733.1268668-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230113182733.1268668-1-richard.henderson@linaro.org> References: <20230113182733.1268668-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Henderson via Libc-alpha From: Richard Henderson Reply-To: Richard Henderson Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Sender: "Libc-alpha"

From: Adhemerval Zanella Netto

With an optimized memchr, the new strnlen implementation basically calls memchr and adjusts the resulting pointer value.

It also cleans up the multiple inclusion by leaving the ifunc implementation to undef the weak_alias and libc_hidden_def.
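
The memchr-based approach described above can be tried in isolation with the short sketch below. It mirrors the two-line body that this patch gives __strnlen, but the name sketch_strnlen is hypothetical and the libc aliasing macros (weak_alias, libc_hidden_def) are deliberately left out, so it only illustrates the pointer adjustment.

```c
#include <stddef.h>
#include <string.h>

/* Length of STR, scanning at most MAXLEN bytes.  If no terminator is
   found within MAXLEN bytes, memchr returns NULL and MAXLEN is returned.  */
size_t
sketch_strnlen (const char *str, size_t maxlen)
{
  const char *found = memchr (str, '\0', maxlen);
  return found ? (size_t) (found - str) : maxlen;
}
```

The speed of this version depends entirely on memchr, which is why the commit message presupposes an optimized memchr.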
Co-authored-by: Richard Henderson Message-Id: <20230111204558.2402155-7-adhemerval.zanella@linaro.org> --- string/strnlen.c | 137 +----------------- sysdeps/i386/i686/multiarch/strnlen-c.c | 16 +- .../power4/multiarch/strnlen-ppc32.c | 16 +- sysdeps/s390/strnlen-c.c | 16 +- 4 files changed, 30 insertions(+), 155 deletions(-) diff --git a/string/strnlen.c b/string/strnlen.c index 6ff294eab1..dc23354ec8 100644 --- a/string/strnlen.c +++ b/string/strnlen.c @@ -1,10 +1,6 @@ /* Find the length of STRING, but scan at most MAXLEN characters. Copyright (C) 1991-2023 Free Software Foundation, Inc. - Based on strlen written by Torbjorn Granlund (tege@sics.se), - with help from Dan Sahlin (dan@sics.se); - commentary by Jim Blandy (jimb@ai.mit.edu). - The GNU C Library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the @@ -20,7 +16,6 @@ not, see . */ #include -#include /* Find the length of S, but scan at most MAXLEN characters. If no '\0' terminator is found in that many characters, return MAXLEN. */ @@ -32,134 +27,12 @@ size_t __strnlen (const char *str, size_t maxlen) { - const char *char_ptr, *end_ptr = str + maxlen; - const unsigned long int *longword_ptr; - unsigned long int longword, himagic, lomagic; - - if (maxlen == 0) - return 0; - - if (__glibc_unlikely (end_ptr < str)) - end_ptr = (const char *) ~0UL; - - /* Handle the first few characters by reading one character at a time. - Do this until CHAR_PTR is aligned on a longword boundary. */ - for (char_ptr = str; ((unsigned long int) char_ptr - & (sizeof (longword) - 1)) != 0; - ++char_ptr) - if (*char_ptr == '\0') - { - if (char_ptr > end_ptr) - char_ptr = end_ptr; - return char_ptr - str; - } - - /* All these elucidatory comments refer to 4-byte longwords, - but the theory applies equally well to 8-byte longwords. 
*/ - - longword_ptr = (unsigned long int *) char_ptr; - - /* Bits 31, 24, 16, and 8 of this number are zero. Call these bits - the "holes." Note that there is a hole just to the left of - each byte, with an extra at the end: - - bits: 01111110 11111110 11111110 11111111 - bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD - - The 1-bits make sure that carries propagate to the next 0-bit. - The 0-bits provide holes for carries to fall into. */ - himagic = 0x80808080L; - lomagic = 0x01010101L; - if (sizeof (longword) > 4) - { - /* 64-bit version of the magic. */ - /* Do the shift in two steps to avoid a warning if long has 32 bits. */ - himagic = ((himagic << 16) << 16) | himagic; - lomagic = ((lomagic << 16) << 16) | lomagic; - } - if (sizeof (longword) > 8) - abort (); - - /* Instead of the traditional loop which tests each character, - we will test a longword at a time. The tricky part is testing - if *any of the four* bytes in the longword in question are zero. */ - while (longword_ptr < (unsigned long int *) end_ptr) - { - /* We tentatively exit the loop if adding MAGIC_BITS to - LONGWORD fails to change any of the hole bits of LONGWORD. - - 1) Is this safe? Will it catch all the zero bytes? - Suppose there is a byte with all zeros. Any carry bits - propagating from its left will fall into the hole at its - least significant bit and stop. Since there will be no - carry from its most significant bit, the LSB of the - byte to the left will be unchanged, and the zero will be - detected. - - 2) Is this worthwhile? Will it ignore everything except - zero bytes? Suppose every byte of LONGWORD has a bit set - somewhere. There will be a carry into bit 8. If bit 8 - is set, this will carry into bit 16. If bit 8 is clear, - one of bits 9-15 must be set, so there will be a carry - into bit 16. Similarly, there will be a carry into bit - 24. If one of bits 24-30 is set, there will be a carry - into bit 31, so all of the hole bits will be changed. 
- - The one misfire occurs when bits 24-30 are clear and bit - 31 is set; in this case, the hole at bit 31 is not - changed. If we had access to the processor carry flag, - we could close this loophole by putting the fourth hole - at bit 32! - - So it ignores everything except 128's, when they're aligned - properly. */ - - longword = *longword_ptr++; - - if ((longword - lomagic) & himagic) - { - /* Which of the bytes was the zero? If none of them were, it was - a misfire; continue the search. */ - - const char *cp = (const char *) (longword_ptr - 1); - - char_ptr = cp; - if (cp[0] == 0) - break; - char_ptr = cp + 1; - if (cp[1] == 0) - break; - char_ptr = cp + 2; - if (cp[2] == 0) - break; - char_ptr = cp + 3; - if (cp[3] == 0) - break; - if (sizeof (longword) > 4) - { - char_ptr = cp + 4; - if (cp[4] == 0) - break; - char_ptr = cp + 5; - if (cp[5] == 0) - break; - char_ptr = cp + 6; - if (cp[6] == 0) - break; - char_ptr = cp + 7; - if (cp[7] == 0) - break; - } - } - char_ptr = end_ptr; - } - - if (char_ptr > end_ptr) - char_ptr = end_ptr; - return char_ptr - str; + const char *found = memchr (str, '\0', maxlen); + return found ? 
found - str : maxlen;
 }
+
 #ifndef STRNLEN
-libc_hidden_def (__strnlen)
 weak_alias (__strnlen, strnlen)
-#endif
+libc_hidden_def (__strnlen)
 libc_hidden_def (strnlen)
+#endif
diff --git a/sysdeps/i386/i686/multiarch/strnlen-c.c b/sysdeps/i386/i686/multiarch/strnlen-c.c
index 351e939a93..beb0350d53 100644
--- a/sysdeps/i386/i686/multiarch/strnlen-c.c
+++ b/sysdeps/i386/i686/multiarch/strnlen-c.c
@@ -1,10 +1,10 @@
 #define STRNLEN __strnlen_ia32
-#ifdef SHARED
-# undef libc_hidden_def
-# define libc_hidden_def(name) \
-  __hidden_ver1 (__strnlen_ia32, __GI_strnlen, __strnlen_ia32); \
-  strong_alias (__strnlen_ia32, __strnlen_ia32_1); \
-  __hidden_ver1 (__strnlen_ia32_1, __GI___strnlen, __strnlen_ia32_1);
-#endif
+#include
 
-#include "string/strnlen.c"
+#ifdef SHARED
+/* Alias for internal symbol to avoid PLT generation, it redirects the
+   libc_hidden_def (__strnlen/strlen) to default implementation. */
+__hidden_ver1 (__strnlen_ia32, __GI_strnlen, __strnlen_ia32);
+strong_alias (__strnlen_ia32, __strnlen_ia32_1);
+__hidden_ver1 (__strnlen_ia32_1, __GI___strnlen, __strnlen_ia32_1);
+#endif
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c b/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c
index 957b9b99e8..2ca1cd7181 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c
@@ -17,12 +17,12 @@
    . */
 
 #define STRNLEN __strnlen_ppc
-#ifdef SHARED
-# undef libc_hidden_def
-# define libc_hidden_def(name) \
-  __hidden_ver1 (__strnlen_ppc, __GI_strnlen, __strnlen_ppc); \
-  strong_alias (__strnlen_ppc, __strnlen_ppc_1); \
-  __hidden_ver1 (__strnlen_ppc_1, __GI___strnlen, __strnlen_ppc_1);
-#endif
-
 #include
+
+#ifdef SHARED
+/* Alias for internal symbol to avoid PLT generation, it redirects the
+   libc_hidden_def (__strnlen/strlen) to default implementation. */
+__hidden_ver1 (__strnlen_ppc, __GI_strnlen, __strnlen_ppc); \
+strong_alias (__strnlen_ppc, __strnlen_ppc_1); \
+__hidden_ver1 (__strnlen_ppc_1, __GI___strnlen, __strnlen_ppc_1);
+#endif
diff --git a/sysdeps/s390/strnlen-c.c b/sysdeps/s390/strnlen-c.c
index 172fcc7caa..95156a0ff5 100644
--- a/sysdeps/s390/strnlen-c.c
+++ b/sysdeps/s390/strnlen-c.c
@@ -21,14 +21,16 @@
 #if HAVE_STRNLEN_C
 # if HAVE_STRNLEN_IFUNC
 #  define STRNLEN STRNLEN_C
-#  if defined SHARED && IS_IN (libc)
-#   undef libc_hidden_def
-#   define libc_hidden_def(name) \
-  __hidden_ver1 (__strnlen_c, __GI_strnlen, __strnlen_c); \
-  strong_alias (__strnlen_c, __strnlen_c_1); \
-  __hidden_ver1 (__strnlen_c_1, __GI___strnlen, __strnlen_c_1);
-#  endif
 # endif
 
 # include
+
+# if HAVE_STRNLEN_IFUNC
+#  if defined SHARED && IS_IN (libc)
+__hidden_ver1 (__strnlen_c, __GI_strnlen, __strnlen_c);
+strong_alias (__strnlen_c, __strnlen_c_1);
+__hidden_ver1 (__strnlen_c_1, __GI___strnlen, __strnlen_c_1);
+#  endif
+# endif
+
 #endif

From patchwork Fri Jan 13 18:27:23 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 642040
To: libc-alpha@sourceware.org
Cc: adhemerval.zanella@linaro.org, goldstein.w.n@gmail.com
Subject: [PATCH v8 07/17] string: Improve generic strchr
Date: Fri, 13 Jan 2023 08:27:23 -1000
Message-Id: <20230113182733.1268668-8-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230113182733.1268668-1-richard.henderson@linaro.org>
References: <20230113182733.1268668-1-richard.henderson@linaro.org>
From: Richard Henderson
Reply-To: Richard Henderson

From: Adhemerval Zanella Netto

New algorithm now calls strchrnul.

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
and powerpc64-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
Message-Id: <20230111204558.2402155-8-adhemerval.zanella@linaro.org>
---
 string/strchr.c         | 159 ++--------------------------------------
 sysdeps/s390/strchr-c.c |  13 ++--
 2 files changed, 15 insertions(+), 157 deletions(-)

diff --git a/string/strchr.c b/string/strchr.c
index 1572b8b42e..30c3eb10f2 100644
--- a/string/strchr.c
+++ b/string/strchr.c
@@ -21,165 +21,22 @@
    . */
 
 #include
-#include
 
 #undef strchr
+#undef index
 
-#ifndef STRCHR
-# define STRCHR strchr
+#ifdef STRCHR
+# define strchr STRCHR
 #endif
 
 /* Find the first occurrence of C in S. */
 char *
-STRCHR (const char *s, int c_in)
+strchr (const char *s, int c_in)
 {
-  const unsigned char *char_ptr;
-  const unsigned long int *longword_ptr;
-  unsigned long int longword, magic_bits, charmask;
-  unsigned char c;
-
-  c = (unsigned char) c_in;
-
-  /* Handle the first few characters by reading one character at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary. */
-  for (char_ptr = (const unsigned char *) s;
-       ((unsigned long int) char_ptr & (sizeof (longword) - 1)) != 0;
-       ++char_ptr)
-    if (*char_ptr == c)
-      return (void *) char_ptr;
-    else if (*char_ptr == '\0')
-      return NULL;
-
-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to 8-byte longwords. */
-
-  longword_ptr = (unsigned long int *) char_ptr;
-
-  /* Bits 31, 24, 16, and 8 of this number are zero. Call these bits
-     the "holes." Note that there is a hole just to the left of
-     each byte, with an extra at the end:
-
-     bits:  01111110 11111110 11111110 11111111
-     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD
-
-     The 1-bits make sure that carries propagate to the next 0-bit.
-     The 0-bits provide holes for carries to fall into. */
-  magic_bits = -1;
-  magic_bits = magic_bits / 0xff * 0xfe << 1 >> 1 | 1;
-
-  /* Set up a longword, each of whose bytes is C. */
-  charmask = c | (c << 8);
-  charmask |= charmask << 16;
-  if (sizeof (longword) > 4)
-    /* Do the shift in two steps to avoid a warning if long has 32 bits. */
-    charmask |= (charmask << 16) << 16;
-  if (sizeof (longword) > 8)
-    abort ();
-
-  /* Instead of the traditional loop which tests each character,
-     we will test a longword at a time. The tricky part is testing
-     if *any of the four* bytes in the longword in question are zero. */
-  for (;;)
-    {
-      /* We tentatively exit the loop if adding MAGIC_BITS to
-         LONGWORD fails to change any of the hole bits of LONGWORD.
-
-         1) Is this safe? Will it catch all the zero bytes?
-         Suppose there is a byte with all zeros. Any carry bits
-         propagating from its left will fall into the hole at its
-         least significant bit and stop. Since there will be no
-         carry from its most significant bit, the LSB of the
-         byte to the left will be unchanged, and the zero will be
-         detected.
-
-         2) Is this worthwhile? Will it ignore everything except
-         zero bytes? Suppose every byte of LONGWORD has a bit set
-         somewhere. There will be a carry into bit 8. If bit 8
-         is set, this will carry into bit 16. If bit 8 is clear,
-         one of bits 9-15 must be set, so there will be a carry
-         into bit 16. Similarly, there will be a carry into bit
-         24. If one of bits 24-30 is set, there will be a carry
-         into bit 31, so all of the hole bits will be changed.
-
-         The one misfire occurs when bits 24-30 are clear and bit
-         31 is set; in this case, the hole at bit 31 is not
-         changed. If we had access to the processor carry flag,
-         we could close this loophole by putting the fourth hole
-         at bit 32!
-
-         So it ignores everything except 128's, when they're aligned
-         properly.
-
-         3) But wait! Aren't we looking for C as well as zero?
-         Good point. So what we do is XOR LONGWORD with a longword,
-         each of whose bytes is C. This turns each byte that is C
-         into a zero. */
-
-      longword = *longword_ptr++;
-
-      /* Add MAGIC_BITS to LONGWORD. */
-      if ((((longword + magic_bits)
-
-            /* Set those bits that were unchanged by the addition. */
-            ^ ~longword)
-
-           /* Look at only the hole bits. If any of the hole bits
-              are unchanged, most likely one of the bytes was a
-              zero. */
-           & ~magic_bits) != 0
-
-          /* That caught zeroes. Now test for C. */
-          || ((((longword ^ charmask) + magic_bits) ^ ~(longword ^ charmask))
-              & ~magic_bits) != 0)
-        {
-          /* Which of the bytes was C or zero?
-             If none of them were, it was a misfire; continue the search. */
-
-          const unsigned char *cp = (const unsigned char *) (longword_ptr - 1);
-
-          if (*cp == c)
-            return (char *) cp;
-          else if (*cp == '\0')
-            return NULL;
-          if (*++cp == c)
-            return (char *) cp;
-          else if (*cp == '\0')
-            return NULL;
-          if (*++cp == c)
-            return (char *) cp;
-          else if (*cp == '\0')
-            return NULL;
-          if (*++cp == c)
-            return (char *) cp;
-          else if (*cp == '\0')
-            return NULL;
-          if (sizeof (longword) > 4)
-            {
-              if (*++cp == c)
-                return (char *) cp;
-              else if (*cp == '\0')
-                return NULL;
-              if (*++cp == c)
-                return (char *) cp;
-              else if (*cp == '\0')
-                return NULL;
-              if (*++cp == c)
-                return (char *) cp;
-              else if (*cp == '\0')
-                return NULL;
-              if (*++cp == c)
-                return (char *) cp;
-              else if (*cp == '\0')
-                return NULL;
-            }
-        }
-    }
-
-  return NULL;
+  char *r = __strchrnul (s, c_in);
+  return (*(unsigned char *)r == (unsigned char)c_in) ? r : NULL;
 }
-
-#ifdef weak_alias
-# undef index
+#ifndef STRCHR
 weak_alias (strchr, index)
-#endif
 libc_hidden_builtin_def (strchr)
+#endif
diff --git a/sysdeps/s390/strchr-c.c b/sysdeps/s390/strchr-c.c
index c00f2cceea..90822ae0f4 100644
--- a/sysdeps/s390/strchr-c.c
+++ b/sysdeps/s390/strchr-c.c
@@ -21,13 +21,14 @@
 #if HAVE_STRCHR_C
 # if HAVE_STRCHR_IFUNC
 #  define STRCHR STRCHR_C
-#  undef weak_alias
-#  if defined SHARED && IS_IN (libc)
-#   undef libc_hidden_builtin_def
-#   define libc_hidden_builtin_def(name) \
-  __hidden_ver1 (__strchr_c, __GI_strchr, __strchr_c);
-#  endif
 # endif
 
 # include
+
+# if HAVE_STRCHR_IFUNC
+#  if defined SHARED && IS_IN (libc)
+__hidden_ver1 (__strchr_c, __GI_strchr, __strchr_c);
+#  endif
+# endif
+
 #endif

From patchwork Fri Jan 13 18:27:24 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 642039
To: libc-alpha@sourceware.org
Cc: adhemerval.zanella@linaro.org, goldstein.w.n@gmail.com
Subject: [PATCH v8 08/17] string: Improve generic strchrnul
Date: Fri, 13 Jan 2023 08:27:24 -1000
Message-Id: <20230113182733.1268668-9-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230113182733.1268668-1-richard.henderson@linaro.org>
References: <20230113182733.1268668-1-richard.henderson@linaro.org>
From: Richard Henderson
Reply-To: Richard Henderson

From: Adhemerval Zanella Netto

The new algorithm has the following key differences:

  - Reads the first word unaligned and uses the string-maskoff function
    to remove unwanted data. This strategy follows the arch-specific
    optimization used on aarch64 and powerpc.

  - Uses the string-fz{a,b,c,i} functions.

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
and powerpc-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).

Co-authored-by: Richard Henderson
Message-Id: <20230111204558.2402155-9-adhemerval.zanella@linaro.org>
---
 string/strchrnul.c                     | 154 +++---------------
 .../power4/multiarch/strchrnul-ppc32.c |   4 -
 sysdeps/s390/strchrnul-c.c             |   2 -
 3 files changed, 23 insertions(+), 137 deletions(-)

diff --git a/string/strchrnul.c b/string/strchrnul.c
index fa2db4b417..fdcdffafa5 100644
--- a/string/strchrnul.c
+++ b/string/strchrnul.c
@@ -1,10 +1,5 @@
 /* Copyright (C) 1991-2023 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
-   Based on strlen implementation by Torbjorn Granlund (tege@sics.se),
-   with help from Dan Sahlin (dan@sics.se) and
-   bug fix and commentary by Jim Blandy (jimb@ai.mit.edu);
-   adaptation to strchr suggested by Dick Karpinski (dick@cca.ucsf.edu),
-   and implemented by Roland McGrath (roland@ai.mit.edu).
 
    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -21,146 +16,43 @@
    . */
 
 #include
-#include
 #include
+#include
+#include
+#include
+#include
+#include
+#include
 
 #undef __strchrnul
 #undef strchrnul
 
-#ifndef STRCHRNUL
-# define STRCHRNUL __strchrnul
+#ifdef STRCHRNUL
+# define __strchrnul STRCHRNUL
 #endif
 
 /* Find the first occurrence of C in S or the final NUL byte. */
 char *
-STRCHRNUL (const char *s, int c_in)
+__strchrnul (const char *str, int c_in)
 {
-  const unsigned char *char_ptr;
-  const unsigned long int *longword_ptr;
-  unsigned long int longword, magic_bits, charmask;
-  unsigned char c;
+  op_t repeated_c = repeat_bytes (c_in);
 
-  c = (unsigned char) c_in;
+  uintptr_t s_int = (uintptr_t) str;
+  const op_t *word_ptr = word_containing (str);
 
-  /* Handle the first few characters by reading one character at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary. */
-  for (char_ptr = (const unsigned char *) s;
-       ((unsigned long int) char_ptr & (sizeof (longword) - 1)) != 0;
-       ++char_ptr)
-    if (*char_ptr == c || *char_ptr == '\0')
-      return (void *) char_ptr;
+  op_t word = *word_ptr;
 
-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to 8-byte longwords. */
+  find_t mask = shift_find (find_zero_eq_all (word, repeated_c), s_int);
+  if (mask != 0)
+    return (char *) str + index_first (mask);
 
-  longword_ptr = (unsigned long int *) char_ptr;
+  do
+    word = *++word_ptr;
+  while (! has_zero_eq (word, repeated_c));
 
-  /* Bits 31, 24, 16, and 8 of this number are zero. Call these bits
-     the "holes." Note that there is a hole just to the left of
-     each byte, with an extra at the end:
-
-     bits:  01111110 11111110 11111110 11111111
-     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD
-
-     The 1-bits make sure that carries propagate to the next 0-bit.
-     The 0-bits provide holes for carries to fall into. */
-  magic_bits = -1;
-  magic_bits = magic_bits / 0xff * 0xfe << 1 >> 1 | 1;
-
-  /* Set up a longword, each of whose bytes is C. */
-  charmask = c | (c << 8);
-  charmask |= charmask << 16;
-  if (sizeof (longword) > 4)
-    /* Do the shift in two steps to avoid a warning if long has 32 bits. */
-    charmask |= (charmask << 16) << 16;
-  if (sizeof (longword) > 8)
-    abort ();
-
-  /* Instead of the traditional loop which tests each character,
-     we will test a longword at a time. The tricky part is testing
-     if *any of the four* bytes in the longword in question are zero. */
-  for (;;)
-    {
-      /* We tentatively exit the loop if adding MAGIC_BITS to
-         LONGWORD fails to change any of the hole bits of LONGWORD.
-
-         1) Is this safe? Will it catch all the zero bytes?
-         Suppose there is a byte with all zeros. Any carry bits
-         propagating from its left will fall into the hole at its
-         least significant bit and stop. Since there will be no
-         carry from its most significant bit, the LSB of the
-         byte to the left will be unchanged, and the zero will be
-         detected.
-
-         2) Is this worthwhile? Will it ignore everything except
-         zero bytes? Suppose every byte of LONGWORD has a bit set
-         somewhere. There will be a carry into bit 8. If bit 8
-         is set, this will carry into bit 16. If bit 8 is clear,
-         one of bits 9-15 must be set, so there will be a carry
-         into bit 16. Similarly, there will be a carry into bit
-         24. If one of bits 24-30 is set, there will be a carry
-         into bit 31, so all of the hole bits will be changed.
-
-         The one misfire occurs when bits 24-30 are clear and bit
-         31 is set; in this case, the hole at bit 31 is not
-         changed. If we had access to the processor carry flag,
-         we could close this loophole by putting the fourth hole
-         at bit 32!
-
-         So it ignores everything except 128's, when they're aligned
-         properly.
-
-         3) But wait! Aren't we looking for C as well as zero?
-         Good point. So what we do is XOR LONGWORD with a longword,
-         each of whose bytes is C. This turns each byte that is C
-         into a zero. */
-
-      longword = *longword_ptr++;
-
-      /* Add MAGIC_BITS to LONGWORD. */
-      if ((((longword + magic_bits)
-
-            /* Set those bits that were unchanged by the addition. */
-            ^ ~longword)
-
-           /* Look at only the hole bits. If any of the hole bits
-              are unchanged, most likely one of the bytes was a
-              zero. */
-           & ~magic_bits) != 0
-
-          /* That caught zeroes. Now test for C. */
-          || ((((longword ^ charmask) + magic_bits) ^ ~(longword ^ charmask))
-              & ~magic_bits) != 0)
-        {
-          /* Which of the bytes was C or zero?
-             If none of them were, it was a misfire; continue the search. */
-
-          const unsigned char *cp = (const unsigned char *) (longword_ptr - 1);
-
-          if (*cp == c || *cp == '\0')
-            return (char *) cp;
-          if (*++cp == c || *cp == '\0')
-            return (char *) cp;
-          if (*++cp == c || *cp == '\0')
-            return (char *) cp;
-          if (*++cp == c || *cp == '\0')
-            return (char *) cp;
-          if (sizeof (longword) > 4)
-            {
-              if (*++cp == c || *cp == '\0')
-                return (char *) cp;
-              if (*++cp == c || *cp == '\0')
-                return (char *) cp;
-              if (*++cp == c || *cp == '\0')
-                return (char *) cp;
-              if (*++cp == c || *cp == '\0')
-                return (char *) cp;
-            }
-        }
-    }
-
-  /* This should never happen. */
-  return NULL;
+  op_t found = index_first_zero_eq (word, repeated_c);
+  return (char *) word_ptr + found;
 }
-
+#ifndef STRCHRNUL
 weak_alias (__strchrnul, strchrnul)
+#endif
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/strchrnul-ppc32.c b/sysdeps/powerpc/powerpc32/power4/multiarch/strchrnul-ppc32.c
index 88ce5dfffa..da03ac7c04 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/strchrnul-ppc32.c
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/strchrnul-ppc32.c
@@ -19,10 +19,6 @@
 #include
 
 #define STRCHRNUL __strchrnul_ppc
-
-#undef weak_alias
-#define weak_alias(a,b )
-
 extern __typeof (strchrnul) __strchrnul_ppc attribute_hidden;
 
 #include
diff --git a/sysdeps/s390/strchrnul-c.c b/sysdeps/s390/strchrnul-c.c
index e1248d1dbf..ff6aa38d4f 100644
--- a/sysdeps/s390/strchrnul-c.c
+++ b/sysdeps/s390/strchrnul-c.c
@@ -22,8 +22,6 @@
 # if HAVE_STRCHRNUL_IFUNC
 #  define STRCHRNUL STRCHRNUL_C
 #  define __strchrnul STRCHRNUL
-#  undef weak_alias
-#  define weak_alias(name, alias)
 # endif
 
 # include

From patchwork Fri Jan 13 18:27:25 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 642043
To: libc-alpha@sourceware.org
Cc: adhemerval.zanella@linaro.org, goldstein.w.n@gmail.com
Subject: [PATCH v8 09/17] string: Improve generic strcmp
Date: Fri, 13 Jan 2023 08:27:25 -1000
Message-Id: <20230113182733.1268668-10-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230113182733.1268668-1-richard.henderson@linaro.org>
References: <20230113182733.1268668-1-richard.henderson@linaro.org>
List-Id: Libc-alpha mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-Patchwork-Original-From: Richard Henderson via Libc-alpha
From: Richard Henderson
Reply-To: Richard Henderson
Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org
Sender: "Libc-alpha"

From: Adhemerval Zanella Netto

The new generic implementation tries to use word operations along with
the new string-fzb.h functions even for inputs with different
alignments (it still uses aligned accesses plus a merge operation to
get a correct word-by-word comparison).

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
and powerpc-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).

Co-authored-by: Richard Henderson
Message-Id: <20230111204558.2402155-10-adhemerval.zanella@linaro.org>
---
 string/strcmp.c | 118 +++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 102 insertions(+), 16 deletions(-)

diff --git a/string/strcmp.c b/string/strcmp.c
index 053f5a8d2b..f4861c0c08 100644
--- a/string/strcmp.c
+++ b/string/strcmp.c
@@ -15,33 +15,119 @@
    License along with the GNU C Library; if not, see . */

+#include
+#include
+#include
 #include
+#include

-#undef strcmp
-
-#ifndef STRCMP
-# define STRCMP strcmp
+#ifdef STRCMP
+# define strcmp STRCMP
 #endif

+static inline int
+final_cmp (const op_t w1, const op_t w2)
+{
+  /* It can not use index_first_zero_ne because it must not compare past the
+     final '\0' is present (and final_cmp is called before has_zero check).
+  */
+  for (size_t i = 0; i < sizeof (op_t); i++)
+    {
+      unsigned char c1 = extractbyte (w1, i);
+      unsigned char c2 = extractbyte (w2, i);
+      if (c1 == '\0' || c1 != c2)
+        return c1 - c2;
+    }
+  return 0;
+}
+
+/* Aligned loop: if a difference is found, exit to compare the bytes.  Else
+   if a zero is found we have equal strings.
*/
+static inline int
+strcmp_aligned_loop (const op_t *x1, const op_t *x2, op_t w1)
+{
+  op_t w2 = *x2++;
+
+  while (w1 == w2)
+    {
+      if (has_zero (w1))
+        return 0;
+      w1 = *x1++;
+      w2 = *x2++;
+    }
+
+  return final_cmp (w1, w2);
+}
+
+/* Unaligned loop: align the first partial of P2, with 0xff for the rest of
+   the bytes so that we can also apply the has_zero test to see if we have
+   already reached EOS.  If we have, then we can simply fall through to the
+   final comparison.  */
+static inline int
+strcmp_unaligned_loop (const op_t *x1, const op_t *x2, op_t w1, uintptr_t ofs)
+{
+  op_t w2a = *x2++;
+  uintptr_t sh_1 = ofs * CHAR_BIT;
+  uintptr_t sh_2 = sizeof(op_t) * CHAR_BIT - sh_1;
+
+  op_t w2 = MERGE (w2a, sh_1, (op_t)-1, sh_2);
+  if (!has_zero (w2))
+    {
+      op_t w2b;
+
+      /* Unaligned loop.  The invariant is that W2B, which is "ahead" of W1,
+         does not contain end-of-string.  Therefore it is safe (and necessary)
+         to read another word from each while we do not have a difference.  */
+      while (1)
+        {
+          w2b = *x2++;
+          w2 = MERGE (w2a, sh_1, w2b, sh_2);
+          if (w1 != w2)
+            return final_cmp (w1, w2);
+          if (has_zero (w2b))
+            break;
+          w1 = *x1++;
+          w2a = w2b;
+        }
+
+      /* Zero found in the second partial of P2.  If we had EOS in the aligned
+         word, we have equality.  */
+      if (has_zero (w1))
+        return 0;
+
+      /* Load the final word of P1 and align the final partial of P2.  */
+      w1 = *x1++;
+      w2 = MERGE (w2b, sh_1, 0, sh_2);
+    }
+
+  return final_cmp (w1, w2);
+}
+
 /* Compare S1 and S2, returning less than, equal to or
    greater than zero if S1 is lexicographically less than,
    equal to or greater than S2.  */
 int
-STRCMP (const char *p1, const char *p2)
+strcmp (const char *p1, const char *p2)
 {
-  const unsigned char *s1 = (const unsigned char *) p1;
-  const unsigned char *s2 = (const unsigned char *) p2;
-  unsigned char c1, c2;
-
-  do
+  /* Handle the unaligned bytes of p1 first.
*/
+  uintptr_t n = -(uintptr_t)p1 % sizeof(op_t);
+  for (int i = 0; i < n; ++i)
     {
-      c1 = (unsigned char) *s1++;
-      c2 = (unsigned char) *s2++;
-      if (c1 == '\0')
-        return c1 - c2;
+      unsigned char c1 = *p1++;
+      unsigned char c2 = *p2++;
+      int diff = c1 - c2;
+      if (c1 == '\0' || diff != 0)
+        return diff;
     }
-  while (c1 == c2);

-  return c1 - c2;
+  /* P1 is now aligned to op_t.  P2 may or may not be.  */
+  const op_t *x1 = (const op_t *) p1;
+  op_t w1 = *x1++;
+  uintptr_t ofs = (uintptr_t) p2 % sizeof(op_t);
+  return ofs == 0
+    ? strcmp_aligned_loop (x1, (const op_t *)p2, w1)
+    : strcmp_unaligned_loop (x1, (const op_t *)(p2 - ofs), w1, ofs);
 }
+#ifndef STRCMP
 libc_hidden_builtin_def (strcmp)
+#endif

From patchwork Fri Jan 13 18:27:26 2023
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 642046 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp358918pvb; Fri, 13 Jan 2023 10:30:39 -0800 (PST) X-Google-Smtp-Source: AMrXdXvax99gUArrmQPORBhN/2HUVN6kpVCS51GBTmdKKWOVROvWb3P6B3Uc6QY3TxVPq+P6Q2uz X-Received: by 2002:a17:907:c11:b0:84b:b481:6188 with SMTP id ga17-20020a1709070c1100b0084bb4816188mr5399967ejc.64.1673634639467; Fri, 13 Jan 2023 10:30:39 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673634639; cv=none; d=google.com; s=arc-20160816; b=nGEsXH93YK2ZH8AfOO6uKLhjFk68bzN+CidmPA9l7ogJaTwmVlotDoJBviN+7gh66F NZNw6fMwdGxbBtfN4OehhTPuLtwUTlAzQjvkJxSkadJFk2FyAdl/aDPHHlSMhfb5OPFL aOuHsXKCvIXjaAInpT7i4HAuSyEEOhKBgPnGuz+cCwyMK57qgu69E8M8Jc0YyZ+da6cY 2zllygVgwPdmW/LcSyabrp+DjWo768aK5i/e8AtuOp0jYIknVFvAiGm+XiNAypy1meeg 8tMk4VP4vsfRlWZ2TTLH2DfrTY0rDzOiNBDBgmnJwd4cObs8G3tx/b76DK7hL6yWrKtp TOhw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence
:content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=pB6rHi+5FxKP5urLewo651AKfBcJP7eeKA2p/ih9O9Y=; b=MWCrMpfDzrh55YO1S5VNvvDxyYbaAlJGA2DKGtWsWy60mslW/9cWj5+OVcZQ2FQ6ML dRpFVu6ztCByRi64JO7WralBOelNRnem7cdzI0sw09LPwkBJrgpGrRU16LYnOsqrKC87 m1veZ/YP1h6eooNY0Z8HPPL/xXTpSAV11Piz4sI3S7T8yCeUd34CRzMaRPC7nLqshG4d aLJHB1ltoMO9aTtivX43fPqe8CpCKXjvIJlHxrtIhkYUzk+sT8IRB53aZX6qSeqxEtTN xnqZzCE/TW6Bs6qMeDi9thSjeELbDOU+tnKnfkjq998ZiR/iUfbSmuez0jvLQkTareKc 7zCw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=Vloy+1yo; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Return-Path: Received: from sourceware.org (server2.sourceware.org. 
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id hd38-20020a17090796a600b0086ac03cc7d6si2788822ejc.180.2023.01.13.10.30.39 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 13 Jan 2023 10:30:39 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=Vloy+1yo; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A494C3947410 for ; Fri, 13 Jan 2023 18:30:37 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A494C3947410 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1673634637; bh=pB6rHi+5FxKP5urLewo651AKfBcJP7eeKA2p/ih9O9Y=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=Vloy+1yoGvIwUudtk4PTbFpUJcQsFJ5A/pTK7gvLad4zp7HNJ0QAtL4waXTSPYZIQ J3m1WWdw1UewO+murstyRxXWPRv+d7gdVOsN+CixiOgg9MhkdwuOjNu7otz6fgp7i4 +V/xJq/i4TKIChq3PzU+7AEysxEIauFSqAgptlyM= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pj1-x1034.google.com (mail-pj1-x1034.google.com [IPv6:2607:f8b0:4864:20::1034]) by sourceware.org (Postfix) with ESMTPS id DBF5A383FB9C for ; Fri, 13 Jan 2023 18:27:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DBF5A383FB9C Received: by mail-pj1-x1034.google.com with SMTP id o13so19718738pjg.2 for ; Fri, 13 Jan 2023 10:27:51 -0800 (PST) X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pB6rHi+5FxKP5urLewo651AKfBcJP7eeKA2p/ih9O9Y=; b=uahFrdJnxoHPjzEIB+vcdrqoje8EledsvHfHZPllfq9BxKv/00WUWpOeRuQ+ADs+83 Kbum5No+TTFsl1Ewxxv414MgZh81I4D10Kl+rHmohZjw3jGCcA/HPJZDT7i1PBXoulxn vU4xgKzaul4grxi+arYV7Q9cCnonG7++KiTdrBdOXYF/o/JwgNYaiP9QiJB6fxAvxDiT JFWoXhtN2udYH1ouH6UNcm6Ik2TWkUi+EFojE2VNiWuXDV5nfz8ecs+q1Z7eIZCpwW1T 5szQhK5ux1BsAZ/plygx9O1L2RYhRkZhldEbBkmld7S9jGxzfoAlaSDPJvhoqJ0nsIdo ko/w== X-Gm-Message-State: AFqh2kq7vrlW/K3F4J7B0YjRtNIkE3I0XYA42g9mdZuNvpY35L9unYXX +Y0x9LE0lY3MKl4hvDbdaLOXEt19BE84JBUu X-Received: by 2002:a17:903:26c4:b0:192:f6d0:602e with SMTP id jg4-20020a17090326c400b00192f6d0602emr31781587plb.22.1673634471372; Fri, 13 Jan 2023 10:27:51 -0800 (PST) Received: from stoup.. (rrcs-173-198-77-218.west.biz.rr.com. [173.198.77.218]) by smtp.gmail.com with ESMTPSA id s17-20020a170902c65100b001927ebc40e2sm14443640pls.193.2023.01.13.10.27.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 13 Jan 2023 10:27:50 -0800 (PST) To: libc-alpha@sourceware.org Cc: adhemerval.zanella@linaro.org, goldstein.w.n@gmail.com Subject: [PATCH v8 10/17] string: Improve generic memchr Date: Fri, 13 Jan 2023 08:27:26 -1000 Message-Id: <20230113182733.1268668-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230113182733.1268668-1-richard.henderson@linaro.org> References: <20230113182733.1268668-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org 
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-Patchwork-Original-From: Richard Henderson via Libc-alpha
From: Richard Henderson
Reply-To: Richard Henderson
Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org
Sender: "Libc-alpha"

From: Adhemerval Zanella Netto

The new algorithm has the following key differences:

  - Reads the first word unaligned and uses the string-maskoff functions
    to remove unwanted data. This strategy follows the arch-specific
    optimizations used on aarch64 and powerpc.

  - Uses the string-fz{b,i} and string-opthr functions.

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
and powerpc64-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).

Co-authored-by: Richard Henderson
Message-Id: <20230111204558.2402155-11-adhemerval.zanella@linaro.org>
---
 string/memchr.c                               | 177 ++++++------------
 .../powerpc32/power4/multiarch/memchr-ppc32.c |  16 +-
 .../powerpc64/multiarch/memchr-ppc64.c        |   9 +-
 3 files changed, 59 insertions(+), 143 deletions(-)

diff --git a/string/memchr.c b/string/memchr.c
index f800d47dce..e3264de540 100644
--- a/string/memchr.c
+++ b/string/memchr.c
@@ -1,10 +1,6 @@
-/* Copyright (C) 1991-2023 Free Software Foundation, Inc.
+/* Scan memory for a character.  Generic version
+   Copyright (C) 1991-2023 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
-   Based on strlen implementation by Torbjorn Granlund (tege@sics.se),
-   with help from Dan Sahlin (dan@sics.se) and
-   commentary by Jim Blandy (jimb@ai.mit.edu);
-   adaptation to memchr suggested by Dick Karpinski (dick@cca.ucsf.edu),
-   and implemented by Roland McGrath (roland@ai.mit.edu).

    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -20,143 +16,76 @@
    License along with the GNU C Library; if not, see .
*/

-#ifndef _LIBC
-# include
-#endif
-
+#include
+#include
+#include
+#include
+#include
+#include
+#include
 #include

-#include
+#undef memchr

-#include
-
-#undef __memchr
-#ifdef _LIBC
-# undef memchr
+#ifdef MEMCHR
+# define __memchr MEMCHR
 #endif

-#ifndef weak_alias
-# define __memchr memchr
-#endif
-
-#ifndef MEMCHR
-# define MEMCHR __memchr
-#endif
+static inline const char *
+sadd (uintptr_t x, uintptr_t y)
+{
+  uintptr_t ret = INT_ADD_OVERFLOW (x, y) ? (uintptr_t)-1 : x + y;
+  return (const char *)ret;
+}

 /* Search no more than N bytes of S for C.  */
 void *
-MEMCHR (void const *s, int c_in, size_t n)
+__memchr (void const *s, int c_in, size_t n)
 {
-  /* On 32-bit hardware, choosing longword to be a 32-bit unsigned
-     long instead of a 64-bit uintmax_t tends to give better
-     performance.  On 64-bit hardware, unsigned long is generally 64
-     bits already.  Change this typedef to experiment with
-     performance.  */
-  typedef unsigned long int longword;
+  if (__glibc_unlikely (n == 0))
+    return NULL;

-  const unsigned char *char_ptr;
-  const longword *longword_ptr;
-  longword repeated_one;
-  longword repeated_c;
-  unsigned char c;
+  /* Read the first word, but munge it so that bytes before the array
+     will not match goal.  */
+  const op_t *word_ptr = word_containing (s);
+  uintptr_t s_int = (uintptr_t) s;

-  c = (unsigned char) c_in;
+  op_t word = *word_ptr;
+  op_t repeated_c = repeat_bytes (c_in);
+  /* Compute the address of the last byte taking in consideration possible
+     overflow.  */
+  const char *lbyte = sadd (s_int, n - 1);
+  /* And also the address of the word containing the last byte.  */
+  const op_t *lword = word_containing (lbyte);

-  /* Handle the first few bytes by reading one byte at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary.
*/
-  for (char_ptr = (const unsigned char *) s;
-       n > 0 && (size_t) char_ptr % sizeof (longword) != 0;
-       --n, ++char_ptr)
-    if (*char_ptr == c)
-      return (void *) char_ptr;
-
-  longword_ptr = (const longword *) char_ptr;
-
-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to any size longwords.  */
-
-  /* Compute auxiliary longword values:
-     repeated_one is a value which has a 1 in every byte.
-     repeated_c has c in every byte.  */
-  repeated_one = 0x01010101;
-  repeated_c = c | (c << 8);
-  repeated_c |= repeated_c << 16;
-  if (0xffffffffU < (longword) -1)
+  find_t mask = shift_find (find_eq_all (word, repeated_c), s_int);
+  if (mask != 0)
     {
-      repeated_one |= repeated_one << 31 << 1;
-      repeated_c |= repeated_c << 31 << 1;
-      if (8 < sizeof (longword))
-        {
-          size_t i;
+      char *ret = (char *) s + index_first (mask);
+      return (ret <= lbyte) ? ret : NULL;
+    }
+  if (word_ptr == lword)
+    return NULL;

-          for (i = 64; i < sizeof (longword) * 8; i *= 2)
-            {
-              repeated_one |= repeated_one << i;
-              repeated_c |= repeated_c << i;
-            }
-        }
+  word = *++word_ptr;
+  while (word_ptr != lword)
+    {
+      if (has_eq (word, repeated_c))
+        return (char *) word_ptr + index_first_eq (word, repeated_c);
+      word = *++word_ptr;
     }

-  /* Instead of the traditional loop which tests each byte, we will test a
-     longword at a time.  The tricky part is testing if *any of the four*
-     bytes in the longword in question are equal to c.  We first use an xor
-     with repeated_c.  This reduces the task to testing whether *any of the
-     four* bytes in longword1 is zero.
-
-     We compute tmp =
-       ((longword1 - repeated_one) & ~longword1) & (repeated_one << 7).
-     That is, we perform the following operations:
-       1. Subtract repeated_one.
-       2. & ~longword1.
-       3. & a mask consisting of 0x80 in every byte.
-     Consider what happens in each byte:
-       - If a byte of longword1 is zero, step 1 and 2 transform it into 0xff,
-         and step 3 transforms it into 0x80.
A carry can also be propagated
-         to more significant bytes.
-       - If a byte of longword1 is nonzero, let its lowest 1 bit be at
-         position k (0 <= k <= 7); so the lowest k bits are 0.  After step 1,
-         the byte ends in a single bit of value 0 and k bits of value 1.
-         After step 2, the result is just k bits of value 1: 2^k - 1.  After
-         step 3, the result is 0.  And no carry is produced.
-     So, if longword1 has only non-zero bytes, tmp is zero.
-     Whereas if longword1 has a zero byte, call j the position of the least
-     significant zero byte.  Then the result has a zero at positions 0, ...,
-     j-1 and a 0x80 at position j.  We cannot predict the result at the more
-     significant bytes (positions j+1..3), but it does not matter since we
-     already have a non-zero bit at position 8*j+7.
-
-     So, the test whether any byte in longword1 is zero is equivalent to
-     testing whether tmp is nonzero.  */
-
-  while (n >= sizeof (longword))
+  if (has_eq (word, repeated_c))
     {
-      longword longword1 = *longword_ptr ^ repeated_c;
-
-      if ((((longword1 - repeated_one) & ~longword1)
-           & (repeated_one << 7)) != 0)
-        break;
-      longword_ptr++;
-      n -= sizeof (longword);
+      /* We found a match, but it might be in a byte past the end of the
+         array.  */
+      char *ret = (char *) word_ptr + index_first_eq (word, repeated_c);
+      if (ret <= lbyte)
+        return ret;
     }
-
-  char_ptr = (const unsigned char *) longword_ptr;
-
-  /* At this point, we know that either n < sizeof (longword), or one of the
-     sizeof (longword) bytes starting at char_ptr is == c.  On little-endian
-     machines, we could determine the first such byte without any further
-     memory accesses, just by looking at the tmp result from the last loop
-     iteration.  But this does not work on big-endian machines.  Choose code
-     that works in both cases.
*/
-
-  for (; n > 0; --n, ++char_ptr)
-    {
-      if (*char_ptr == c)
-        return (void *) char_ptr;
-    }
-
   return NULL;
 }
-#ifdef weak_alias
+#ifndef MEMCHR
 weak_alias (__memchr, memchr)
-#endif
 libc_hidden_builtin_def (memchr)
+#endif
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/memchr-ppc32.c b/sysdeps/powerpc/powerpc32/power4/multiarch/memchr-ppc32.c
index 39ff84f3f3..a78585650f 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/memchr-ppc32.c
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/memchr-ppc32.c
@@ -18,17 +18,11 @@

 #include

-#define MEMCHR __memchr_ppc
-
-#undef weak_alias
-#define weak_alias(a, b)
-
-#ifdef SHARED
-# undef libc_hidden_builtin_def
-# define libc_hidden_builtin_def(name) \
-  __hidden_ver1(__memchr_ppc, __GI_memchr, __memchr_ppc);
-#endif
-
 extern __typeof (memchr) __memchr_ppc attribute_hidden;

+#define MEMCHR __memchr_ppc
 #include
+
+#ifdef SHARED
+__hidden_ver1(__memchr_ppc, __GI_memchr, __memchr_ppc);
+#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c b/sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c
index 8097df709c..49ba5521fe 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c
@@ -18,14 +18,7 @@

 #include

-#define MEMCHR __memchr_ppc
-
-#undef weak_alias
-#define weak_alias(a, b)
-
-# undef libc_hidden_builtin_def
-# define libc_hidden_builtin_def(name)
-
 extern __typeof (memchr) __memchr_ppc attribute_hidden;

+#define MEMCHR __memchr_ppc
 #include

From patchwork Fri Jan 13 18:27:27 2023
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 642045 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp358738pvb; Fri, 13 Jan 2023 10:30:17 -0800 (PST) X-Google-Smtp-Source: AMrXdXsSEyeI4ws8ObxNVIjqznoJgkFsnsxItIEZucRak4DiO8sRhEnMAEcSeTQBozkvD4lsAKb8 X-Received: by 2002:a17:906:7749:b0:7c5:fd:4352 with
SMTP id o9-20020a170906774900b007c500fd4352mr88942154ejn.49.1673634616816; Fri, 13 Jan 2023 10:30:16 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673634616; cv=none; d=google.com; s=arc-20160816; b=ZxLnNaD3R9/4bxVdc/b6AtvGy/V6mjaASX11QNoyMsuanVz5q6J7vTFTFAO1BMLS1J Zh2hlAja/uaIGDFXvBTR3QP0uP7N8H+6QmbkFGjROKh2jsoTntuYlba/N6SW8r1TTmVP vSplUx6d7YG9plxXxOJNGNtfAymkiYYytf9MQy1Yk8FSwk2RgDxttoHvheRI2bIoB66J eq3/AUUBb35JK1zZlfXP3gma0nSy+qsLyCCTV+Dp+beuHlx8g/PHSF+3KkmzS9MHYVmH e2sOnlGfVtnLGnsE+pA0cim+dCNC/ChEd7Lidjfh/urqHL55TgaJprMSEmgQDA6XJAQD Z91w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=l6hk+JrzZ13/Egy56cZUiWIGSFWd/CPHLSq15Udmx44=; b=vH7XVjklRJQzwf41kbqP+hYatjEaxQY6/I3FVrBPVenhKK3COrOGQobWlum7u3D5qn gHzLn3IgKdJws9S3+Al6Q04LX31GQ3GHD6dVq7/ZkgzuURslKJnfLprCD9qijoIZtd9/ outbb4XDjeexJ5/pMMFYbeCNPBIMQarhzlkqPn81ayeEbNtcU7a8jeCpmZCJkMrFYU6F 4eaKnumJJR2aC8NLSSiCwlDG3apUeEnLh3qoiWbcrBuqyuo/FtIITMk8PNb0wYurNea8 +fUoJBi9MSYhEm10qm2YIMM/0MYQiJfuKa7LvaDxxBufqPZErnTRPaGeBmiooxj1JxMT lmng== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b="k7qsJWm/"; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Return-Path: Received: from sourceware.org (server2.sourceware.org. 
[8.43.85.97]) by mx.google.com with ESMTPS id wz4-20020a170906fe4400b0084d43e4541csi15147328ejb.720.2023.01.13.10.30.16 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 13 Jan 2023 10:30:16 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b="k7qsJWm/"; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A08793888C50 for ; Fri, 13 Jan 2023 18:30:15 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A08793888C50 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1673634615; bh=l6hk+JrzZ13/Egy56cZUiWIGSFWd/CPHLSq15Udmx44=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=k7qsJWm/37ZEKsXRy0BVSkjKLc3oUt8zZwK25XVx6bsvPbKahGDqNM/ncW0CQKFOX 2nwKvNB3kguOtx300jzyxfvn78w/AIdbdiee4TYxJcOglAPRUMW+c0EjuBG6K/1cOI ykMG4apzGfydayxNPlgc1T/BGGSFbfZDw8SRTScs= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pj1-x102b.google.com (mail-pj1-x102b.google.com [IPv6:2607:f8b0:4864:20::102b]) by sourceware.org (Postfix) with ESMTPS id E5D2438493D8 for ; Fri, 13 Jan 2023 18:27:53 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E5D2438493D8 Received: by mail-pj1-x102b.google.com with SMTP id u1-20020a17090a450100b0022936a63a21so1342832pjg.4 for ; Fri, 13 Jan 2023 10:27:53 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=l6hk+JrzZ13/Egy56cZUiWIGSFWd/CPHLSq15Udmx44=; b=Jpfw8YXd4ZIXI2RzFFVfKN5KTxq4va/z1lTiARzizAiwExJQqN9GGPcJ/o93c1ReEq qLqnJNd6zSBmvQzu4IZW3T7j0UthydeFbht5gu+9Z5NITFtQIQxaHFgs7b8yDqGxvMwK eXlJ+B33EbTI/YP/fZTSLT0lVLETwUw2HQZpCAHoJMQvkf/AaAXoAQDyMsWAK8+6FlQU xTuR2J74lnD7gPOnQPi4BSqFjxdf4QRrDDormhuDl1wh8nkEkNJQGgMgXSd71Q1/CFhD Yj4IqtDouAfrsFpF8pPWjFKBAf+xuwv7LpdbP41mhJVx2cdcqvtD8Tg81bwdGk73IqiC tXoQ== X-Gm-Message-State: AFqh2krUf+iNNKSJNmsBsOOWsGaGt47FmFYMdIHOlU0k72WPVKn3+5rm bL70ZLigYvUo0axGcniD3I+oxNel/KeIaPLk X-Received: by 2002:a17:902:9a98:b0:193:167c:d4b1 with SMTP id w24-20020a1709029a9800b00193167cd4b1mr25433394plp.11.1673634472875; Fri, 13 Jan 2023 10:27:52 -0800 (PST) Received: from stoup.. (rrcs-173-198-77-218.west.biz.rr.com. [173.198.77.218]) by smtp.gmail.com with ESMTPSA id s17-20020a170902c65100b001927ebc40e2sm14443640pls.193.2023.01.13.10.27.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 13 Jan 2023 10:27:52 -0800 (PST) To: libc-alpha@sourceware.org Cc: adhemerval.zanella@linaro.org, goldstein.w.n@gmail.com Subject: [PATCH v8 11/17] string: Improve generic memrchr Date: Fri, 13 Jan 2023 08:27:27 -1000 Message-Id: <20230113182733.1268668-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230113182733.1268668-1-richard.henderson@linaro.org> References: <20230113182733.1268668-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: 
list
List-Id: Libc-alpha mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-Patchwork-Original-From: Richard Henderson via Libc-alpha
From: Richard Henderson
Reply-To: Richard Henderson
Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org
Sender: "Libc-alpha"

From: Adhemerval Zanella Netto

The new algorithm has the following key difference:

  - Uses the string-fz{b,c} functions.

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
and powerpc64-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).

Co-authored-by: Richard Henderson
Message-Id: <20230111204558.2402155-12-adhemerval.zanella@linaro.org>
---
 string/memrchr.c | 190 ++++++++---------------------------------------
 1 file changed, 33 insertions(+), 157 deletions(-)

diff --git a/string/memrchr.c b/string/memrchr.c
index 18b20ff76a..e455db6842 100644
--- a/string/memrchr.c
+++ b/string/memrchr.c
@@ -1,11 +1,6 @@
 /* memrchr -- find the last occurrence of a byte in a memory block
    Copyright (C) 1991-2023 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
-   Based on strlen implementation by Torbjorn Granlund (tege@sics.se),
-   with help from Dan Sahlin (dan@sics.se) and
-   commentary by Jim Blandy (jimb@ai.mit.edu);
-   adaptation to memchr suggested by Dick Karpinski (dick@cca.ucsf.edu),
-   and implemented by Roland McGrath (roland@ai.mit.edu).

    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -21,177 +16,58 @@
    License along with the GNU C Library; if not, see .
*/

-#include
-
-#ifdef HAVE_CONFIG_H
-# include
-#endif
-
-#if defined _LIBC
-# include
-# include
-#endif
-
-#if defined HAVE_LIMITS_H || defined _LIBC
-# include
-#endif
-
-#define LONG_MAX_32_BITS 2147483647
-
-#ifndef LONG_MAX
-# define LONG_MAX LONG_MAX_32_BITS
-#endif
-
-#include
+#include
+#include
+#include
+#include
+#include

 #undef __memrchr
 #undef memrchr

-#ifndef weak_alias
-# define __memrchr memrchr
+#ifdef MEMRCHR
+# define __memrchr MEMRCHR
 #endif

-/* Search no more than N bytes of S for C.  */
 void *
-#ifndef MEMRCHR
-__memrchr
-#else
-MEMRCHR
-#endif
-     (const void *s, int c_in, size_t n)
+__memrchr (const void *s, int c_in, size_t n)
 {
-  const unsigned char *char_ptr;
-  const unsigned long int *longword_ptr;
-  unsigned long int longword, magic_bits, charmask;
-  unsigned char c;
-
-  c = (unsigned char) c_in;
-
   /* Handle the last few characters by reading one character at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary.  */
-  for (char_ptr = (const unsigned char *) s + n;
-       n > 0 && ((unsigned long int) char_ptr
-                 & (sizeof (longword) - 1)) != 0;
-       --n)
-    if (*--char_ptr == c)
+     Do this until CHAR_PTR is aligned on a word boundary, or
+     the entirety of small inputs.  */
+  const unsigned char *char_ptr = (const unsigned char *) (s + n);
+  size_t align = (uintptr_t) char_ptr % sizeof (op_t);
+  if (n < OP_T_THRES || align > n)
+    align = n;
+
+  for (size_t i = 0; i < align; ++i)
+    if (*--char_ptr == c_in)
       return (void *) char_ptr;

-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to 8-byte longwords.  */
+  const op_t *word_ptr = (const op_t *) char_ptr;
+  n -= align;
+  if (__glibc_unlikely (n == 0))
+    return NULL;

-  longword_ptr = (const unsigned long int *) char_ptr;
+  /* Compute the address of the word containing the initial byte.  */
+  const op_t *lword = word_containing (s);

-  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
-     the "holes."
Note that there is a hole just to the left of
-     each byte, with an extra at the end:
+  /* Set up a word, each of whose bytes is C.  */
+  op_t repeated_c = repeat_bytes (c_in);

-     bits:  01111110 11111110 11111110 11111111
-     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD
-
-     The 1-bits make sure that carries propagate to the next 0-bit.
-     The 0-bits provide holes for carries to fall into.  */
-  magic_bits = -1;
-  magic_bits = magic_bits / 0xff * 0xfe << 1 >> 1 | 1;
-
-  /* Set up a longword, each of whose bytes is C.  */
-  charmask = c | (c << 8);
-  charmask |= charmask << 16;
-#if LONG_MAX > LONG_MAX_32_BITS
-  charmask |= charmask << 32;
-#endif
-
-  /* Instead of the traditional loop which tests each character,
-     we will test a longword at a time.  The tricky part is testing
-     if *any of the four* bytes in the longword in question are zero.  */
-  while (n >= sizeof (longword))
+  while (word_ptr != lword)
     {
-      /* We tentatively exit the loop if adding MAGIC_BITS to
-         LONGWORD fails to change any of the hole bits of LONGWORD.
-
-         1) Is this safe?  Will it catch all the zero bytes?
-         Suppose there is a byte with all zeros.  Any carry bits
-         propagating from its left will fall into the hole at its
-         least significant bit and stop.  Since there will be no
-         carry from its most significant bit, the LSB of the
-         byte to the left will be unchanged, and the zero will be
-         detected.
-
-         2) Is this worthwhile?  Will it ignore everything except
-         zero bytes?  Suppose every byte of LONGWORD has a bit set
-         somewhere.  There will be a carry into bit 8.  If bit 8
-         is set, this will carry into bit 16.  If bit 8 is clear,
-         one of bits 9-15 must be set, so there will be a carry
-         into bit 16.  Similarly, there will be a carry into bit
-         24.  If one of bits 24-30 is set, there will be a carry
-         into bit 31, so all of the hole bits will be changed.
-
-         The one misfire occurs when bits 24-30 are clear and bit
-         31 is set; in this case, the hole at bit 31 is not
-         changed.
If we had access to the processor carry flag,
-         we could close this loophole by putting the fourth hole
-         at bit 32!
-
-         So it ignores everything except 128's, when they're aligned
-         properly.
-
-         3) But wait!  Aren't we looking for C, not zero?
-         Good point.  So what we do is XOR LONGWORD with a longword,
-         each of whose bytes is C.  This turns each byte that is C
-         into a zero.  */
-
-      longword = *--longword_ptr ^ charmask;
-
-      /* Add MAGIC_BITS to LONGWORD.  */
-      if ((((longword + magic_bits)
-
-            /* Set those bits that were unchanged by the addition.  */
-            ^ ~longword)
-
-           /* Look at only the hole bits.  If any of the hole bits
-              are unchanged, most likely one of the bytes was a
-              zero.  */
-           & ~magic_bits) != 0)
+      op_t word = *--word_ptr;
+      if (has_eq (word, repeated_c))
         {
-          /* Which of the bytes was C?  If none of them were, it was
-             a misfire; continue the search.  */
-
-          const unsigned char *cp = (const unsigned char *) longword_ptr;
-
-#if LONG_MAX > 2147483647
-          if (cp[7] == c)
-            return (void *) &cp[7];
-          if (cp[6] == c)
-            return (void *) &cp[6];
-          if (cp[5] == c)
-            return (void *) &cp[5];
-          if (cp[4] == c)
-            return (void *) &cp[4];
-#endif
-          if (cp[3] == c)
-            return (void *) &cp[3];
-          if (cp[2] == c)
-            return (void *) &cp[2];
-          if (cp[1] == c)
-            return (void *) &cp[1];
-          if (cp[0] == c)
-            return (void *) cp;
+          /* We found a match, but it might be in a byte past the start
+             of the array.  */
+          char *ret = (char *) word_ptr + index_last_eq (word, repeated_c);
+          return ret >= (char *) s ?
ret : NULL;
         }
-
-      n -= sizeof (longword);
     }
-
-  char_ptr = (const unsigned char *) longword_ptr;
-
-  while (n-- > 0)
-    {
-      if (*--char_ptr == c)
-        return (void *) char_ptr;
-    }
-
-  return 0;
+  return NULL;
 }
 #ifndef MEMRCHR
-# ifdef weak_alias
 weak_alias (__memrchr, memrchr)
-# endif
 #endif

From patchwork Fri Jan 13 18:27:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 642042
To: libc-alpha@sourceware.org
Cc: adhemerval.zanella@linaro.org, goldstein.w.n@gmail.com
Subject: [PATCH v8 17/17] sh: Add string-fzb.h
Date: Fri, 13 Jan 2023 08:27:33 -1000
Message-Id: <20230113182733.1268668-18-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230113182733.1268668-1-richard.henderson@linaro.org>
References: <20230113182733.1268668-1-richard.henderson@linaro.org>
From: Richard Henderson
Reply-To: Richard Henderson

From: Adhemerval Zanella Netto

Use the SH cmp/str on has_{zero,eq,zero_eq}.

Checked on sh4-linux-gnu.
Message-Id: <20230111204558.2402155-18-adhemerval.zanella@linaro.org>
---
 sysdeps/powerpc/{ => power6}/string-fza.h |  0
 sysdeps/sh/string-fzb.h                   | 59 +++++++++++++++++++++++
 2 files changed, 59 insertions(+)
 rename sysdeps/powerpc/{ => power6}/string-fza.h (100%)
 create mode 100644 sysdeps/sh/string-fzb.h

diff --git a/sysdeps/powerpc/string-fza.h b/sysdeps/powerpc/power6/string-fza.h
similarity index 100%
rename from sysdeps/powerpc/string-fza.h
rename to sysdeps/powerpc/power6/string-fza.h
diff --git a/sysdeps/sh/string-fzb.h b/sysdeps/sh/string-fzb.h
new file mode 100644
index 0000000000..5a7868333b
--- /dev/null
+++ b/sysdeps/sh/string-fzb.h
@@ -0,0 +1,59 @@
+/* Zero byte detection; boolean. SH4 version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef STRING_FZB_H
+#define STRING_FZB_H 1
+
+#include
+#include
+
+/* Determine if any bytes within X1 and X2 are equal. */
+
+static __always_inline _Bool
+has_eq (op_t x1, op_t x2)
+{
+  int ret;
+
+  /*
+   * TODO: A compiler builtin for cmp/str would be much better.
+   * It is difficult to use asm goto here, because the range of
+   * bt/bf are quite small.
+   */
+  asm("cmp/str %1,%2\n\tmovt %0"
+      : "=r" (ret) : "r" (x1), "r" (x2) : "t");
+
+  return ret;
+}
+
+/* Determine if any byte within X is zero. */
+
+static __always_inline _Bool
+has_zero (op_t x)
+{
+  return has_eq (x, 0);
+}
+
+/* Likewise, but for zeros in X1 and equal bytes between X1 and X2. */
+
+static __always_inline _Bool
+has_zero_eq (op_t x1, op_t x2)
+{
+  return has_zero (x1) | has_eq (x1, x2);
+}
+
+#endif /* STRING_FZB_H */
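
For readers without an SH toolchain, the three predicates above can be approximated in portable C. The following is only a sketch, not part of the patch: the `_portable` names are hypothetical, `op_t` is assumed to be a 32-bit word (as it would be on SH4), and the well-known "subtract 0x01 per byte" zero-byte test is swapped in for the cmp/str instruction. XOR-ing the two words turns every pair of equal bytes into 0x00, so "any bytes equal" reduces to "any byte zero".

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: op_t is a 32-bit word, as it would be on SH4.  */
typedef uint32_t op_t;

/* Hypothetical portable stand-in for the cmp/str based has_eq:
   nonzero if any byte of X1 equals the byte at the same position
   in X2.  */
static inline _Bool
has_eq_portable (op_t x1, op_t x2)
{
  /* Bytes that are equal in X1 and X2 become 0x00.  */
  op_t x = x1 ^ x2;

  /* Classic zero-byte test: some byte of the result has its top bit
     set iff X contains a zero byte.  The boolean result is exact.  */
  return ((x - 0x01010101u) & ~x & 0x80808080u) != 0;
}

/* Same reductions as in the patch, built on the stand-in above.  */
static inline _Bool
has_zero_portable (op_t x)
{
  return has_eq_portable (x, 0);
}

static inline _Bool
has_zero_eq_portable (op_t x1, op_t x2)
{
  return has_zero_portable (x1) | has_eq_portable (x1, x2);
}
```

For example, has_eq_portable (0x11223344, 0xaa22bbcc) is true because the 0x22 bytes line up, while has_eq_portable (0x11223344, 0x55667788) is false. The inline asm in the patch would be expected to give the same answers in a single compare instruction plus a movt.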