From patchwork Wed Jan 10 12:47:45 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 124084
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Richard Henderson
Subject: [PATCH v3 01/18] Parameterize op_t from memcopy.h
Date: Wed, 10 Jan 2018 10:47:45 -0200
Message-Id: <1515588482-15744-2-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

From: Richard Henderson

This patch moves the op_t definition out to a specific header, adds the
'may_alias' attribute, and cleans up its duplicated definitions.  It also
adds a tilegx32 gmp-mparam.h, similar to x32, so that op_t can be defined
as a long long (from _LONG_LONG_LIMB).

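As an illustration of why the 'may_alias' attribute matters, here is a
minimal sketch that is not part of this patch.  The typedef mirrors the
branch of string-optype.h used when _LONG_LONG_LIMB is not defined;
equal_words is a hypothetical helper invented for this example.  It reads
byte buffers one op_t word at a time, which the attribute is meant to make
safe under GCC's strict-aliasing rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the typedef from string-optype.h (the branch used when
   _LONG_LONG_LIMB is not defined).  */
typedef unsigned long int __attribute__ ((__may_alias__)) op_t;

/* Hypothetical helper: return how many leading op_t-sized words of A and B
   are equal.  Both pointers must be aligned to sizeof (op_t).  */
static size_t
equal_words (const unsigned char *a, const unsigned char *b, size_t nwords)
{
  assert ((uintptr_t) a % sizeof (op_t) == 0);
  assert ((uintptr_t) b % sizeof (op_t) == 0);

  /* Reading the bytes through op_t is valid here because of may_alias.  */
  const op_t *wa = (const op_t *) a;
  const op_t *wb = (const op_t *) b;

  size_t i = 0;
  while (i < nwords && wa[i] == wb[i])
    i++;
  return i;
}
```

For two four-word buffers that first differ somewhere in the third word,
equal_words returns 2.
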
Checked with a build and check with run-built-tests=no for all major
Linux ABIs (alpha, aarch64, arm, hppa, i686, ia64, m68k, microblaze,
mips, mips64, nios2, powerpc, powerpc64le, s390x, sh4, sparc64, tilegx,
and x86_64).

        Richard Henderson
        Adhemerval Zanella

        * sysdeps/generic/string-optype.h: New file.
        * sysdeps/generic/memcopy.h: Include it.
        * string/memcmp.c (op_t): Remove define.
        * sysdeps/tile/memcmp.c (op_t): Likewise.
        * sysdeps/tile/memcopy.h (op_t): Likewise.
        * sysdeps/tile/tilegx32/gmp-mparam.h: New file.
---
 string/memcmp.c                    |  1 -
 sysdeps/generic/memcopy.h          |  7 +++----
 sysdeps/generic/string-optype.h    | 31 +++++++++++++++++++++++++++++++
 sysdeps/tile/memcmp.c              |  1 -
 sysdeps/tile/memcopy.h             |  7 -------
 sysdeps/tile/tilegx32/gmp-mparam.h | 30 ++++++++++++++++++++++++++++++
 6 files changed, 64 insertions(+), 13 deletions(-)
 create mode 100644 sysdeps/generic/string-optype.h
 create mode 100644 sysdeps/tile/tilegx32/gmp-mparam.h

-- 
2.7.4

diff --git a/string/memcmp.c b/string/memcmp.c
index aea5129..4fd2f83 100644
--- a/string/memcmp.c
+++ b/string/memcmp.c
@@ -46,7 +46,6 @@
 /* Type to use for aligned memory operations.
    This should normally be the biggest type supported by a single load
    and store.  Must be an unsigned type.  */
-# define op_t unsigned long int
 # define OPSIZ (sizeof(op_t))
 
 /* Threshold value for when to enter the unrolled loops.  */
diff --git a/sysdeps/generic/memcopy.h b/sysdeps/generic/memcopy.h
index c0d8da3..c7e9cc9 100644
--- a/sysdeps/generic/memcopy.h
+++ b/sysdeps/generic/memcopy.h
@@ -56,10 +56,9 @@
     [I fail to understand.  I feel stupid.  --roland]
 */
 
-/* Type to use for aligned memory operations.
-   This should normally be the biggest type supported by a single load
-   and store.  */
-#define op_t unsigned long int
+/* Type to use for aligned memory operations.  */
+#include
+
 #define OPSIZ (sizeof(op_t))
 
 /* Type to use for unaligned operations.  */
diff --git a/sysdeps/generic/string-optype.h b/sysdeps/generic/string-optype.h
new file mode 100644
index 0000000..1324070
--- /dev/null
+++ b/sysdeps/generic/string-optype.h
@@ -0,0 +1,31 @@
+/* Define a type to use for word access.  Generic version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef STRING_OPTYPE_H
+#define STRING_OPTYPE_H 1
+
+/* Use the existing parameterization from gmp as a default.  */
+#include
+
+#ifdef _LONG_LONG_LIMB
+typedef unsigned long long int __attribute__((__may_alias__)) op_t;
+#else
+typedef unsigned long int __attribute__((__may_alias__)) op_t;
+#endif
+
+#endif /* string-optype.h */
diff --git a/sysdeps/tile/memcmp.c b/sysdeps/tile/memcmp.c
index b7cf00a..89fff57 100644
--- a/sysdeps/tile/memcmp.c
+++ b/sysdeps/tile/memcmp.c
@@ -45,7 +45,6 @@
 /* Type to use for aligned memory operations.
    This should normally be the biggest type supported by a single load
    and store.  Must be an unsigned type.  */
-# define op_t unsigned long int
 # define OPSIZ (sizeof(op_t))
 
 /* Threshold value for when to enter the unrolled loops.  */
diff --git a/sysdeps/tile/memcopy.h b/sysdeps/tile/memcopy.h
index 0c357c1..748f648 100644
--- a/sysdeps/tile/memcopy.h
+++ b/sysdeps/tile/memcopy.h
@@ -22,10 +22,3 @@
 /* The tilegx implementation of memcpy is safe to use for memmove.  */
 #undef MEMCPY_OK_FOR_FWD_MEMMOVE
 #define MEMCPY_OK_FOR_FWD_MEMMOVE 1
-
-/* Support more efficient copying on tilegx32, which supports
-   long long as a native 64-bit type.  */
-#if __WORDSIZE == 32
-# undef op_t
-# define op_t unsigned long long int
-#endif
diff --git a/sysdeps/tile/tilegx32/gmp-mparam.h b/sysdeps/tile/tilegx32/gmp-mparam.h
new file mode 100644
index 0000000..7d1cb98
--- /dev/null
+++ b/sysdeps/tile/tilegx32/gmp-mparam.h
@@ -0,0 +1,30 @@
+/* Compiler/machine parameter header file.  TileGX32 version.
+
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#if defined __GMP_H__ && ! defined _LONG_LONG_LIMB
+#error "Included too late for _LONG_LONG_LIMB to take effect"
+#endif
+
+#define _LONG_LONG_LIMB
+#define BITS_PER_MP_LIMB 64
+#define BYTES_PER_MP_LIMB 8
+#define BITS_PER_LONGINT 32
+#define BITS_PER_INT 32
+#define BITS_PER_SHORTINT 16
+#define BITS_PER_CHAR 8

From patchwork Wed Jan 10 12:47:46 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 124088
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Richard Henderson
Subject: [PATCH v3 02/18] Parameterize OP_T_THRES from memcopy.h
Date: Wed, 10 Jan 2018 10:47:46 -0200
Message-Id: <1515588482-15744-3-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

From: Richard Henderson

This patch moves OP_T_THRES out of memcopy.h to its own header and
adjusts each architecture that redefines it.

Checked with a build and check with run-built-tests=no for all major
Linux ABIs (alpha, aarch64, arm, hppa, i686, ia64, m68k, microblaze,
mips, mips64, nios2, powerpc, powerpc64le, s390x, sh4, sparc64, tilegx,
and x86_64).

        Richard Henderson
        Adhemerval Zanella

        * sysdeps/generic/memcopy.h (OP_T_THRES): Move...
        * sysdeps/generic/string-opthr.h: ... here; new file.
        * sysdeps/i386/memcopy.h (OP_T_THRES): Move...
        * sysdeps/i386/string-opthr.h: ... here; new file.
        * sysdeps/m68k/memcopy.h (OP_T_THRES): Remove.
        * string/memcmp.c (OP_T_THRES): Remove definition.
        * sysdeps/powerpc/powerpc32/power4/memcopy.h (OP_T_THRES): Likewise.
---
 string/memcmp.c                            |  3 ---
 sysdeps/generic/memcopy.h                  |  4 +---
 sysdeps/generic/string-opthr.h             | 25 +++++++++++++++++++++++++
 sysdeps/i386/memcopy.h                     |  3 ---
 sysdeps/i386/string-opthr.h                | 25 +++++++++++++++++++++++++
 sysdeps/m68k/memcopy.h                     |  3 ---
 sysdeps/powerpc/powerpc32/power4/memcopy.h |  5 -----
 7 files changed, 51 insertions(+), 17 deletions(-)
 create mode 100644 sysdeps/generic/string-opthr.h
 create mode 100644 sysdeps/i386/string-opthr.h

-- 
2.7.4

diff --git a/string/memcmp.c b/string/memcmp.c
index 4fd2f83..82ad082 100644
--- a/string/memcmp.c
+++ b/string/memcmp.c
@@ -48,9 +48,6 @@
    and store.  Must be an unsigned type.  */
 # define OPSIZ (sizeof(op_t))
 
-/* Threshold value for when to enter the unrolled loops.  */
-# define OP_T_THRES 16
-
 /* Type to use for unaligned operations.  */
 typedef unsigned char byte;
 
diff --git a/sysdeps/generic/memcopy.h b/sysdeps/generic/memcopy.h
index c7e9cc9..1698379 100644
--- a/sysdeps/generic/memcopy.h
+++ b/sysdeps/generic/memcopy.h
@@ -58,6 +58,7 @@
 
 /* Type to use for aligned memory operations.  */
 #include
+#include
 
 #define OPSIZ (sizeof(op_t))
 
@@ -190,9 +191,6 @@ extern void _wordcopy_bwd_dest_aligned (long int, long int, size_t)
 
 #endif
 
-/* Threshold value for when to enter the unrolled loops.  */
-#define OP_T_THRES 16
-
 /* Set to 1 if memcpy is safe to use for forward-copying memmove with
    overlapping addresses.  This is 0 by default because memcpy implementations
    are generally not safe for overlapping addresses.  */
diff --git a/sysdeps/generic/string-opthr.h b/sysdeps/generic/string-opthr.h
new file mode 100644
index 0000000..17fa627
--- /dev/null
+++ b/sysdeps/generic/string-opthr.h
@@ -0,0 +1,25 @@
+/* Define a threshold for word access.  Generic version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef STRING_OPTHR_H
+#define STRING_OPTHR_H 1
+
+/* Threshold value for when to enter the unrolled loops.  */
+#define OP_T_THRES 16
+
+#endif /* string-opthr.h */
diff --git a/sysdeps/i386/memcopy.h b/sysdeps/i386/memcopy.h
index 12bb39f..28cee47 100644
--- a/sysdeps/i386/memcopy.h
+++ b/sysdeps/i386/memcopy.h
@@ -19,9 +19,6 @@
 
 #include
 
-#undef OP_T_THRES
-#define OP_T_THRES 8
-
 #undef BYTE_COPY_FWD
 #define BYTE_COPY_FWD(dst_bp, src_bp, nbytes) \
   do { \
diff --git a/sysdeps/i386/string-opthr.h b/sysdeps/i386/string-opthr.h
new file mode 100644
index 0000000..ed3e4b2
--- /dev/null
+++ b/sysdeps/i386/string-opthr.h
@@ -0,0 +1,25 @@
+/* Define a threshold for word access.  i386 version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef I386_STRING_OPTHR_H
+#define I386_STRING_OPTHR_H 1
+
+/* Threshold value for when to enter the unrolled loops.  */
+#define OP_T_THRES 8
+
+#endif /* I386_STRING_OPTHR_H */
diff --git a/sysdeps/m68k/memcopy.h b/sysdeps/m68k/memcopy.h
index 58569c6..ee0c5fc 100644
--- a/sysdeps/m68k/memcopy.h
+++ b/sysdeps/m68k/memcopy.h
@@ -21,9 +21,6 @@
 
 #if defined(__mc68020__) || defined(mc68020)
 
-#undef OP_T_THRES
-#define OP_T_THRES 16
-
 /* WORD_COPY_FWD and WORD_COPY_BWD are not symmetric on the 68020,
    because of its weird instruction overlap characteristics.  */
 
diff --git a/sysdeps/powerpc/powerpc32/power4/memcopy.h b/sysdeps/powerpc/powerpc32/power4/memcopy.h
index 8050abc..37ed40b 100644
--- a/sysdeps/powerpc/powerpc32/power4/memcopy.h
+++ b/sysdeps/powerpc/powerpc32/power4/memcopy.h
@@ -51,11 +51,6 @@
     [I fail to understand.  I feel stupid.  --roland]
 */
 
-
-/* Threshold value for when to enter the unrolled loops.  */
-#undef OP_T_THRES
-#define OP_T_THRES 16
-
 /* Copy exactly NBYTES bytes from SRC_BP to DST_BP,
    without any assumptions about alignment of the pointers.  */
 #undef BYTE_COPY_FWD

From patchwork Wed Jan 10 12:47:47 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 124077
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Subject: [PATCH v3 03/18] Add string-maskoff.h generic header
Date: Wed, 10 Jan 2018 10:47:47 -0200
Message-Id: <1515588482-15744-4-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

Inline functions to handle unaligned accesses in string operations:

  - create_mask: create a mask based on pointer alignment that sets up
    non-zero bytes before the beginning of the word, so that a following
    operation (such as find zero) can ignore these bytes.

  - highbit_mask: create a mask with high bit of each byte being 1, and
    the low 7 bits being all the opposite of the input.

These functions are meant to be used in optimized vectorized string
implementations.

        Richard Henderson
        Adhemerval Zanella

        * sysdeps/generic/string-maskoff.h: New file.
---
 sysdeps/generic/string-maskoff.h | 64 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)
 create mode 100644 sysdeps/generic/string-maskoff.h

-- 
2.7.4

diff --git a/sysdeps/generic/string-maskoff.h b/sysdeps/generic/string-maskoff.h
new file mode 100644
index 0000000..6231798
--- /dev/null
+++ b/sysdeps/generic/string-maskoff.h
@@ -0,0 +1,64 @@
+/* Mask off bits.  Generic C version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef STRING_MASKOFF_H
+#define STRING_MASKOFF_H 1
+
+#include
+#include
+#include
+
+/* Provide a mask based on the pointer alignment that sets up non-zero
+   bytes before the beginning of the word.  It is used to mask off
+   undesirable bits from an aligned read from an unaligned pointer.
+   For instance, on a 64 bits machine with a pointer alignment of
+   3 the function returns 0x0000000000ffffff for LE and 0xffffff0000000000
+   (meaning to mask off the initial 3 bytes).  */
+static inline op_t
+create_mask (uintptr_t i)
+{
+  i = i % sizeof (op_t);
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    return ~(((op_t)-1) << (i * CHAR_BIT));
+  else
+    return ~(((op_t)-1) >> (i * CHAR_BIT));
+}
+
+/* Setup an word with each byte being c_in.  For instance, on a 64 bits
+   machine with input as 0xce the functions returns 0xcececececececece.  */
+static inline op_t
+repeat_bytes (unsigned char c_in)
+{
+  return ((op_t)-1 / 0xff) * c_in;
+}
+
+/* Create a mask with high bit of each byte being 1, and the low 7 bits
+   being all the opposite of the input mask.  It is used to mask off
+   undesirable bits from an aligned read from an unaligned pointer,
+   and also taking care to avoid match possible bytes meant to be
+   matched.  For instance, on a 64 bits machine with a pointer alignment
+   of 3 the function returns 0x7f7f7f0000000000 (input meant to
+   be 0xffffff0000000000) for BE and 0x00000000007f7f7f for LE (input
+   meant to be 0x0000000000ffffff).  */
+static inline op_t
+highbit_mask (op_t m)
+{
+  return m & ~repeat_bytes (0x80);
+}
+
+#endif /* STRING_MASKOFF_H */

From patchwork Wed Jan 10 12:47:48 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 124083
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Subject: [PATCH v3 04/18] Add string vectorized find and detection functions
Date: Wed, 10 Jan 2018 10:47:48 -0200
Message-Id: <1515588482-15744-5-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

This patch adds generic string find and detection implementations meant
to be used by the generic vectorized string implementations.  The idea
is to decompose the basic string operations so that each architecture
can reimplement them if it provides any specialized hardware instruction.

The 'string-fza.h' header provides zero byte detection functions
(find_zero_low, find_zero_all, find_eq_low, find_eq_all, find_zero_eq_low,
find_zero_eq_all, find_zero_ne_low, and find_zero_ne_all).
They are used by the functions provided by both 'string-fzb.h' and
'string-fzi.h'.

The 'string-fzb.h' header provides boolean zero byte detection with the
functions:

  - has_zero: determine if any byte within a word is zero.
  - has_eq: determine if any byte is equal between two words.
  - has_zero_eq: determine if any byte within a word is zero or if any
    byte is equal between two words.

The 'string-fzi.h' header provides zero byte detection along with its
position:

  - index_first_zero: return index of first zero byte within a word.
  - index_first_eq: return index of first byte equal between two words.
  - index_first_zero_eq: return index of first zero byte within a word or
    first byte equal between two words.
  - index_first_zero_ne: return index of first zero byte within a word or
    first byte different between two words.
  - index_last_zero: return index of last zero byte within a word.
  - index_last_eq: return index of last byte equal between two words.

Also, to avoid libcalls in the '__builtin_c{t,l}z{l}' calls (which may
degrade performance), inline implementations based on De Bruijn
sequences are added (selected by a configure check).

        Richard Henderson
        Adhemerval Zanella

        * config.h.in (HAVE_BUILTIN_CTZ, HAVE_BUILTIN_CLZ): New defines.
        * configure.ac: Check for __builtin_ctz{l} and __builtin_clz{l}
        with no external dependencies.
        * sysdeps/generic/string-extbyte.h: New file.
        * sysdeps/generic/string-fza.h: Likewise.
        * sysdeps/generic/string-fzb.h: Likewise.
        * sysdeps/generic/string-fzi.h: Likewise.
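
As an illustration of the zero byte detection described above, here is a
minimal standalone sketch.  It is not part of the patch: it assumes a
64-bit op_t (the patch takes op_t from a header not shown in this
message), uses the exact find_zero_all expression for the boolean test
where the patch uses find_zero_low, and replaces the count-trailing-zeros
wrapper with a plain loop.  The name index_first_zero_le is hypothetical
and covers only the little-endian case.

```c
#include <assert.h>   /* only needed for the checks in the note below */
#include <limits.h>
#include <stdint.h>

/* Assumption: a 64-bit word type stands in for the patch's op_t.  */
typedef uint64_t op_t;

/* Set 0x80 in every byte of X that is zero, and nothing else.
   Same expression as find_zero_all in string-fza.h.  */
static inline op_t
find_zero_all (op_t x)
{
  op_t m = ((op_t) -1 / 0xff) * 0x7f;
  return ~(((x & m) + m) | x | m);
}

/* Boolean test: non-zero if any byte of X is zero.  */
static inline int
has_zero (op_t x)
{
  return find_zero_all (x) != 0;
}

/* Hypothetical helper: index of the first zero byte in memory order,
   assuming a little-endian layout (byte 0 is the least significant
   byte).  Returns sizeof (op_t) if no byte is zero.  */
static inline unsigned int
index_first_zero_le (op_t x)
{
  op_t z = find_zero_all (x);
  unsigned int i = 0;
  while (i < sizeof (op_t) && ((z >> (i * CHAR_BIT)) & 0x80) == 0)
    i++;
  return i;
}
```

For the word 0x1122334455007788 the third least significant byte is zero,
so has_zero is true and index_first_zero_le returns 2; a word such as
0x8080808080808080 has no zero byte and is correctly rejected.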
---
 config.h.in                      |   8 ++
 configure                        |  54 ++++++++++
 configure.ac                     |  34 +++++++
 sysdeps/generic/string-extbyte.h |  35 +++++++
 sysdeps/generic/string-fza.h     | 117 +++++++++++++++++++++
 sysdeps/generic/string-fzb.h     |  49 +++++++++
 sysdeps/generic/string-fzi.h     | 215 +++++++++++++++++++++++++++++++++++++++
 7 files changed, 512 insertions(+)
 create mode 100644 sysdeps/generic/string-extbyte.h
 create mode 100644 sysdeps/generic/string-fza.h
 create mode 100644 sysdeps/generic/string-fzb.h
 create mode 100644 sysdeps/generic/string-fzi.h

--
2.7.4

diff --git a/config.h.in b/config.h.in
index d928e7d..03bcfe6 100644
--- a/config.h.in
+++ b/config.h.in
@@ -245,4 +245,12 @@
    in i386 6 argument syscall issue).  */
 #define CAN_USE_REGISTER_ASM_EBP 0
 
+/* If the compiler supports __builtin_ctz{l} without any external
+   dependencies (libgcc for instance).  */
+#define HAVE_BUILTIN_CTZ 0
+
+/* If the compiler supports __builtin_clz{l} without any external
+   dependencies (libgcc for instance).  */
+#define HAVE_BUILTIN_CLZ 0
+
 #endif
diff --git a/configure b/configure
index 7a8bd3f..ff4464f 100755
--- a/configure
+++ b/configure
@@ -6592,6 +6592,60 @@ if test $libc_cv_builtin_trap = yes; then
 
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_ctz{l} with no external dependencies" >&5
+$as_echo_n "checking for __builtin_ctz{l} with no external dependencies... " >&6; }
+if ${libc_cv_builtin_ctz+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  libc_cv_builtin_ctz=yes
+echo 'int foo (unsigned long x) { return __builtin_ctz (x); }' > conftest.c
+if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS -S conftest.c -o conftest.s 1>&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$?
= $ac_status" >&5
+  test $ac_status = 0; }; }; then
+  if grep '__ctz[s,d]i2' conftest.s > /dev/null; then
+    libc_cv_builtin_ctz=no
+  fi
+fi
+rm -f conftest.c conftest.s
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libc_cv_builtin_ctz" >&5
+$as_echo "$libc_cv_builtin_ctz" >&6; }
+if test x$libc_cv_builtin_ctz = xyes; then
+  $as_echo "#define HAVE_BUILTIN_CTZ 1" >>confdefs.h
+
+fi
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_clz{l} with no external dependencies" >&5
+$as_echo_n "checking for __builtin_clz{l} with no external dependencies... " >&6; }
+if ${libc_cv_builtin_clz+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  libc_cv_builtin_clz=yes
+echo 'int foo (unsigned long x) { return __builtin_clz (x); }' > conftest.c
+if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS -S conftest.c -o conftest.s 1>&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }; then
+  if grep '__clz[s,d]i2' conftest.s > /dev/null; then
+    libc_cv_builtin_clz=no
+  fi
+fi
+rm -f conftest.c conftest.s
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libc_cv_builtin_clz" >&5
+$as_echo "$libc_cv_builtin_clz" >&6; }
+if test x$libc_cv_builtin_clz = xyes; then
+  $as_echo "#define HAVE_BUILTIN_CLZ 1" >>confdefs.h
+
+fi
+
 ac_ext=cpp
 ac_cpp='$CXXCPP $CPPFLAGS'
 ac_compile='$CXX -c $CXXFLAGS $CPPFLAGS conftest.$ac_ext >&5'
diff --git a/configure.ac b/configure.ac
index ca1282a..7f9c9f8 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1675,6 +1675,40 @@ if test $libc_cv_builtin_trap = yes; then
   AC_DEFINE([HAVE_BUILTIN_TRAP])
 fi
 
+AC_CACHE_CHECK(for __builtin_ctz{l} with no external dependencies,
+	       libc_cv_builtin_ctz, [dnl
+libc_cv_builtin_ctz=yes
+echo 'int foo (unsigned long x) { return __builtin_ctz (x); }' > conftest.c
+if AC_TRY_COMMAND(${CC-cc} $CFLAGS $CPPFLAGS -S conftest.c -o conftest.s 1>&AS_MESSAGE_LOG_FD);
then
+changequote(,)dnl
+  if grep '__ctz[s,d]i2' conftest.s > /dev/null; then
+    libc_cv_builtin_ctz=no
+  fi
+changequote([,])dnl
+fi
+rm -f conftest.c conftest.s
+])
+if test x$libc_cv_builtin_ctz = xyes; then
+  AC_DEFINE(HAVE_BUILTIN_CTZ)
+fi
+
+AC_CACHE_CHECK(for __builtin_clz{l} with no external dependencies,
+	       libc_cv_builtin_clz, [dnl
+libc_cv_builtin_clz=yes
+echo 'int foo (unsigned long x) { return __builtin_clz (x); }' > conftest.c
+if AC_TRY_COMMAND(${CC-cc} $CFLAGS $CPPFLAGS -S conftest.c -o conftest.s 1>&AS_MESSAGE_LOG_FD); then
+changequote(,)dnl
+  if grep '__clz[s,d]i2' conftest.s > /dev/null; then
+    libc_cv_builtin_clz=no
+  fi
+changequote([,])dnl
+fi
+rm -f conftest.c conftest.s
+])
+if test x$libc_cv_builtin_clz = xyes; then
+  AC_DEFINE(HAVE_BUILTIN_CLZ)
+fi
+
 dnl C++ feature tests.
 AC_LANG_PUSH([C++])
 
diff --git a/sysdeps/generic/string-extbyte.h b/sysdeps/generic/string-extbyte.h
new file mode 100644
index 0000000..69a78ce
--- /dev/null
+++ b/sysdeps/generic/string-extbyte.h
@@ -0,0 +1,35 @@
+/* Extract byte from memory word.  Generic C version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .
*/
+
+#ifndef STRING_EXTBYTE_H
+#define STRING_EXTBYTE_H 1
+
+#include
+#include
+#include
+
+static inline unsigned char
+extractbyte (op_t x, unsigned idx)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    return x >> (idx * CHAR_BIT);
+  else
+    return x >> (sizeof (x) - 1 - idx) * CHAR_BIT;
+}
+
+#endif /* STRING_EXTBYTE_H */
diff --git a/sysdeps/generic/string-fza.h b/sysdeps/generic/string-fza.h
new file mode 100644
index 0000000..ab208bf
--- /dev/null
+++ b/sysdeps/generic/string-fza.h
@@ -0,0 +1,117 @@
+/* Basic zero byte detection.  Generic C version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef STRING_FZA_H
+#define STRING_FZA_H 1
+
+#include
+#include
+
+/* This function returns non-zero if any byte in X is zero.
+   More specifically, at least one bit set within the least significant
+   byte that was zero; other bytes within the word are indeterminate.  */
+
+static inline op_t
+find_zero_low (op_t x)
+{
+  /* This expression comes from
+       https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord
+     Subtracting 1 sets 0x80 in a byte that was 0; anding ~x clears
+     0x80 in a byte that was >= 128; anding 0x80 isolates that test bit.
*/
+  op_t lsb = (op_t)-1 / 0xff;
+  op_t msb = lsb << (CHAR_BIT - 1);
+  return (x - lsb) & ~x & msb;
+}
+
+/* This function returns at least one bit set within every byte of X that
+   is zero.  The result is exact in that, unlike find_zero_low, all bytes
+   are determinate.  This is usually used for finding the index of the
+   most significant byte that was zero.  */
+
+static inline op_t
+find_zero_all (op_t x)
+{
+  /* For each byte, find not-zero by
+     (0) And 0x7f so that we cannot carry between bytes,
+     (1) Add 0x7f so that non-zero carries into 0x80,
+     (2) Or in the original byte (which might have had 0x80 set).
+     Then invert and mask such that 0x80 is set iff that byte was zero.  */
+  op_t m = ((op_t)-1 / 0xff) * 0x7f;
+  return ~(((x & m) + m) | x | m);
+}
+
+/* With similar caveats, identify bytes that are equal between X1 and X2.  */
+
+static inline op_t
+find_eq_low (op_t x1, op_t x2)
+{
+  return find_zero_low (x1 ^ x2);
+}
+
+static inline op_t
+find_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1 ^ x2);
+}
+
+/* With similar caveats, identify zero bytes in X1 and bytes that are
+   equal between X1 and X2.  */
+
+static inline op_t
+find_zero_eq_low (op_t x1, op_t x2)
+{
+  op_t lsb = (op_t)-1 / 0xff;
+  op_t msb = lsb << (CHAR_BIT - 1);
+  op_t eq = x1 ^ x2;
+  return (((x1 - lsb) & ~x1) | ((eq - lsb) & ~eq)) & msb;
+}
+
+static inline op_t
+find_zero_eq_all (op_t x1, op_t x2)
+{
+  op_t m = ((op_t)-1 / 0xff) * 0x7f;
+  op_t eq = x1 ^ x2;
+  op_t c1 = ((x1 & m) + m) | x1;
+  op_t c2 = ((eq & m) + m) | eq;
+  return ~((c1 & c2) | m);
+}
+
+/* With similar caveats, identify zero bytes in X1 and bytes that are
+   not equal between X1 and X2.
*/
+
+static inline op_t
+find_zero_ne_low (op_t x1, op_t x2)
+{
+  op_t m = ((op_t)-1 / 0xff) * 0x7f;
+  op_t eq = x1 ^ x2;
+  op_t nz1 = (x1 + m) | x1;	/* msb set if byte not zero */
+  op_t ne2 = (eq + m) | eq;	/* msb set if byte not equal */
+  return (ne2 | ~nz1) & ~m;	/* msb set if x1 zero or x2 not equal */
+}
+
+static inline op_t
+find_zero_ne_all (op_t x1, op_t x2)
+{
+  op_t m = ((op_t)-1 / 0xff) * 0x7f;
+  op_t eq = x1 ^ x2;
+  op_t nz1 = ((x1 & m) + m) | x1;
+  op_t ne2 = ((eq & m) + m) | eq;
+  return (ne2 | ~nz1) & ~m;
+}
+
+#endif /* STRING_FZA_H */
diff --git a/sysdeps/generic/string-fzb.h b/sysdeps/generic/string-fzb.h
new file mode 100644
index 0000000..d4ab59b
--- /dev/null
+++ b/sysdeps/generic/string-fzb.h
@@ -0,0 +1,49 @@
+/* Zero byte detection, boolean.  Generic C version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef STRING_FZB_H
+#define STRING_FZB_H 1
+
+#include
+#include
+
+/* Determine if any byte within X is zero.  This is a pure boolean test.  */
+
+static inline _Bool
+has_zero (op_t x)
+{
+  return find_zero_low (x) != 0;
+}
+
+/* Likewise, but for byte equality between X1 and X2.  */
+
+static inline _Bool
+has_eq (op_t x1, op_t x2)
+{
+  return find_eq_low (x1, x2) != 0;
+}
+
+/* Likewise, but for zeros in X1 and equal bytes between X1 and X2.
*/
+
+static inline _Bool
+has_zero_eq (op_t x1, op_t x2)
+{
+  return find_zero_eq_low (x1, x2);
+}
+
+#endif /* STRING_FZB_H */
diff --git a/sysdeps/generic/string-fzi.h b/sysdeps/generic/string-fzi.h
new file mode 100644
index 0000000..57101f2
--- /dev/null
+++ b/sysdeps/generic/string-fzi.h
@@ -0,0 +1,215 @@
+/* Zero byte detection; indexes.  Generic C version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef STRING_FZI_H
+#define STRING_FZI_H 1
+
+#include
+#include
+#include
+
+/* An improved bitscan routine, multiplying the De Bruijn sequence with a
+   0-1 mask separated by the least significant one bit of a scanned integer
+   or bitboard [1].
+
+   [1] https://chessprogramming.wikispaces.com/Kim+Walisch  */
+
+static inline unsigned
+index_access (const op_t i)
+{
+  static const char index[] =
+  {
+# if __WORDSIZE == 64
+     0, 47,  1, 56, 48, 27,  2, 60,
+    57, 49, 41, 37, 28, 16,  3, 61,
+    54, 58, 35, 52, 50, 42, 21, 44,
+    38, 32, 29, 23, 17, 11,  4, 62,
+    46, 55, 26, 59, 40, 36, 15, 53,
+    34, 51, 20, 43, 31, 22, 10, 45,
+    25, 39, 14, 33, 19, 30,  9, 24,
+    13, 18,  8, 12,  7,  6,  5, 63
+# else
+     0,  9,  1, 10, 13, 21,  2, 29,
+    11, 14, 16, 18, 22, 25,  3, 30,
+     8, 12, 20, 28, 15, 17, 24,  7,
+    19, 27, 23,  6, 26,  5,  4, 31
+# endif
+  };
+  return index[i];
+}
+
+/* For architectures where __builtin_clz{l} and/or __builtin_ctz{l} use
+   external libcalls (for instance __c{l,t}z{s,d}i2 from libgcc), that is,
+   where HAVE_BUILTIN_CLZ and/or HAVE_BUILTIN_CTZ is 0, the following
+   wrappers provide inline implementations of count leading zeros and
+   count trailing zeros using branchless computation.  */
+
+static inline unsigned
+__ctz (op_t x)
+{
+#if !HAVE_BUILTIN_CTZ
+  op_t i;
+# if __WORDSIZE == 64
+  i = (x ^ (x - 1)) * 0x03F79D71B4CB0A89ull >> 58;
+# else
+  i = (x ^ (x - 1)) * 0x07C4ACDDU >> 27;
+# endif
+  return index_access (i);
+#else
+  if (sizeof (op_t) == sizeof (long))
+    return __builtin_ctzl (x);
+  else
+    return __builtin_ctzll (x);
+#endif
+}
+
+static inline unsigned
+__clz (op_t x)
+{
+#if !HAVE_BUILTIN_CLZ
+  unsigned r;
+  op_t i;
+
+  x |= x >> 1;
+  x |= x >> 2;
+  x |= x >> 4;
+  x |= x >> 8;
+  x |= x >> 16;
+# if __WORDSIZE == 64
+  x |= x >> 32;
+  i = x * 0x03F79D71B4CB0A89ull >> 58;
+# else
+  i = x * 0x07C4ACDDU >> 27;
+# endif
+  r = index_access (i);
+  return r ^ (sizeof (op_t) * CHAR_BIT - 1);
+#else
+  if (sizeof (op_t) == sizeof (long))
+    return __builtin_clzl (x);
+  else
+    return __builtin_clzll (x);
+#endif
+}
+
+/* A subroutine for the index_zero functions.  Given a test word C, return
+   the (memory order) index of the first byte (in memory order) that is
+   non-zero.
*/
+
+static inline unsigned int
+index_first_ (op_t c)
+{
+  _Static_assert (sizeof (op_t) == sizeof (long)
+                  || sizeof (op_t) == sizeof (long long),
+                  "Unhandled word size");
+
+  unsigned r;
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    r = __ctz (c);
+  else
+    r = __clz (c);
+  return r / CHAR_BIT;
+}
+
+/* Similarly, but return the (memory order) index of the last byte
+   that is non-zero.  */
+
+static inline unsigned int
+index_last_ (op_t c)
+{
+  _Static_assert (sizeof (op_t) == sizeof (long)
+                  || sizeof (op_t) == sizeof (long long),
+                  "Unhandled word size");
+
+  unsigned r;
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    r = __clz (c);
+  else
+    r = __ctz (c);
+  return sizeof (op_t) - 1 - (r / CHAR_BIT);
+}
+
+/* Given a word X that is known to contain a zero byte, return the
+   index of the first such within the word in memory order.  */
+
+static inline unsigned int
+index_first_zero (op_t x)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x = find_zero_low (x);
+  else
+    x = find_zero_all (x);
+  return index_first_ (x);
+}
+
+/* Similarly, but perform the search for byte equality between X1 and X2.  */
+
+static inline unsigned int
+index_first_eq (op_t x1, op_t x2)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x1 = find_eq_low (x1, x2);
+  else
+    x1 = find_eq_all (x1, x2);
+  return index_first_ (x1);
+}
+
+/* Similarly, but perform the search for zero within X1 or
+   equality between X1 and X2.  */
+
+static inline unsigned int
+index_first_zero_eq (op_t x1, op_t x2)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x1 = find_zero_eq_low (x1, x2);
+  else
+    x1 = find_zero_eq_all (x1, x2);
+  return index_first_ (x1);
+}
+
+/* Similarly, but perform the search for zero within X1 or
+   inequality between X1 and X2.  */
+
+static inline unsigned int
+index_first_zero_ne (op_t x1, op_t x2)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x1 = find_zero_ne_low (x1, x2);
+  else
+    x1 = find_zero_ne_all (x1, x2);
+  return index_first_ (x1);
+}
+
+/* Similarly, but search for the last zero within X.
*/
+
+static inline unsigned int
+index_last_zero (op_t x)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x = find_zero_all (x);
+  else
+    x = find_zero_low (x);
+  return index_last_ (x);
+}
+
+static inline unsigned int
+index_last_eq (op_t x1, op_t x2)
+{
+  return index_last_zero (x1 ^ x2);
+}
+
+#endif /* STRING_FZI_H */

From patchwork Wed Jan 10 12:47:49 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 124078
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Richard Henderson
Subject: [PATCH v3 05/18] string: Improve generic strlen
Date: Wed, 10 Jan 2018 10:47:49 -0200
Message-Id: <1515588482-15744-6-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References:
 <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

From: Richard Henderson

The new algorithm has the following key differences:

  - Reads the first word from the aligned address at or before the start
    of the string and uses the string-maskoff functions to mask off the
    unwanted leading bytes.  This strategy follows the assembly-optimized
    implementations for powerpc, sparc, and SH.

  - Uses the has_zero and index_first_zero parametrized functions.

Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu, and
sparcv9-linux-gnu by removing the arch-specific assembly implementation
and disabling multi-arch (it covers both LE and BE for 64 and 32 bits).

        [BZ #5806]
        * string/strlen.c: Use them.
---
 string/strlen.c | 83 +++++++++++----------------------------------------------
 1 file changed, 15 insertions(+), 68 deletions(-)

--
2.7.4

diff --git a/string/strlen.c b/string/strlen.c
index 8ce1318..6bd0ed9 100644
--- a/string/strlen.c
+++ b/string/strlen.c
@@ -20,6 +20,11 @@
 
 #include
 #include
+#include
+#include
+#include
+#include
+#include
 
 #undef strlen
 
@@ -32,78 +37,20 @@
 size_t
 STRLEN (const char *str)
 {
-  const char *char_ptr;
-  const unsigned long int *longword_ptr;
-  unsigned long int longword, himagic, lomagic;
+  /* Align pointer to sizeof op_t.  */
+  const uintptr_t s_int = (uintptr_t) str;
+  const op_t *word_ptr = (const op_t*) (s_int & -sizeof (op_t));
 
-  /* Handle the first few characters by reading one character at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary.  */
-  for (char_ptr = str; ((unsigned long int) char_ptr
-                        & (sizeof (longword) - 1)) != 0;
-       ++char_ptr)
-    if (*char_ptr == '\0')
-      return char_ptr - str;
+  /* Read and MASK the first word. */
+  op_t word = *word_ptr | create_mask (s_int);
 
-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to 8-byte longwords.  */
-
-  longword_ptr = (unsigned long int *) char_ptr;
-
-  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
-     the "holes."
Note that there is a hole just to the left of
-     each byte, with an extra at the end:
-
-     bits:  01111110 11111110 11111110 11111111
-     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD
-
-     The 1-bits make sure that carries propagate to the next 0-bit.
-     The 0-bits provide holes for carries to fall into.  */
-  himagic = 0x80808080L;
-  lomagic = 0x01010101L;
-  if (sizeof (longword) > 4)
+  while (1)
     {
-      /* 64-bit version of the magic.  */
-      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
-      himagic = ((himagic << 16) << 16) | himagic;
-      lomagic = ((lomagic << 16) << 16) | lomagic;
+      if (has_zero (word))
+        break;
+      word = *++word_ptr;
     }
-  if (sizeof (longword) > 8)
-    abort ();
 
-  /* Instead of the traditional loop which tests each character,
-     we will test a longword at a time.  The tricky part is testing
-     if *any of the four* bytes in the longword in question are zero.  */
-  for (;;)
-    {
-      longword = *longword_ptr++;
-
-      if (((longword - lomagic) & ~longword & himagic) != 0)
-        {
-          /* Which of the bytes was the zero?  If none of them were, it was
-             a misfire; continue the search.
*/ - - const char *cp = (const char *) (longword_ptr - 1); - - if (cp[0] == 0) - return cp - str; - if (cp[1] == 0) - return cp - str + 1; - if (cp[2] == 0) - return cp - str + 2; - if (cp[3] == 0) - return cp - str + 3; - if (sizeof (longword) > 4) - { - if (cp[4] == 0) - return cp - str + 4; - if (cp[5] == 0) - return cp - str + 5; - if (cp[6] == 0) - return cp - str + 6; - if (cp[7] == 0) - return cp - str + 7; - } - } - } + return ((const char *) word_ptr) + index_first_zero (word) - str; } libc_hidden_builtin_def (strlen) From patchwork Wed Jan 10 12:47:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 124090 Delivered-To: patch@linaro.org Received: by 10.140.22.227 with SMTP id 90csp5236498qgn; Wed, 10 Jan 2018 04:50:58 -0800 (PST) X-Google-Smtp-Source: ACJfBovFiwk645wxSkvVKukwlAWj3Z1z1f9dikNVMhg2GPXgZlqfyJyj3aQNqoxozlZM4Rlo+mGc X-Received: by 10.84.218.69 with SMTP id f5mr6572831plm.431.1515588658857; Wed, 10 Jan 2018 04:50:58 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1515588658; cv=none; d=google.com; s=arc-20160816; b=T7FvcLu6a2RwCfP2WP7t9lbhhzj9s/kCegx3WJlrfy2KizMluzxThIr0j5o/1droYA p2oSDqjSgHj1fSu7B2uCyGSJ+AYL1xBrBqAT9noAxqAT3qIH5E9f/pZIaGR+dKgOy/Ig MZVUk7JJfCs2YPY6GGxF/YJVv7FQWCMmqJn025MRWMKH40rspk43ly+gy1IXEsAY+lw4 XdQVQ4N7a2k62ls3e9CSG67kLIK5PZsRd1a+jWfqiSM4NgMOvH6/o9SQMijahgLP3t8O 4fJAvPvut1FHfLh7FcWYcmg6OMZHUx/6u2t1HPYrgpOA4IFXzTVPay+lRE7a6e+XPQzf /gKA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=references:in-reply-to:message-id:date:subject:cc:to:from :delivered-to:sender:list-help:list-post:list-archive:list-subscribe :list-unsubscribe:list-id:precedence:mailing-list:dkim-signature :domainkey-signature:arc-authentication-results; bh=ZAGQXyp3oa0jH6OpVTFTW783F01OZriu/gm0F37XlL0=; b=flq42qAfBEuHr6/bPc+lJu+L3HPAI4zMTUxytfgLmsBOESSL7MauvdnfSvPg8rsqC7 
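The word-at-a-time technique described in the strlen message above can be sketched as follows. This is only an illustration under stated assumptions, not the glibc code: `sketch_strlen`, `load_word`, and the byte-wise `create_mask` and `index_first_zero` are hypothetical stand-ins (the real helpers presumably use shifts and bit-scan operations chosen per endianness), and `op_t` is assumed to be `unsigned long int`. The sketch also assumes the caller's buffer is word-aligned and padded to a multiple of the word size, because reading a whole aligned word can touch bytes outside the string, which ISO C does not otherwise permit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: op_t is one machine word; glibc defines its own op_t.  */
typedef unsigned long int op_t;

/* Load one word from an aligned address without aliasing problems.  */
static op_t
load_word (const char *p)
{
  op_t w;
  memcpy (&w, p, sizeof w);
  return w;
}

/* 0xff in each byte that lies before S_INT inside its aligned word
   (built in memory order, so it is endian-independent).  */
static op_t
create_mask (uintptr_t s_int)
{
  unsigned char m[sizeof (op_t)] = { 0 };
  op_t w;
  memset (m, 0xff, s_int % sizeof (op_t));
  memcpy (&w, m, sizeof w);
  return w;
}

/* Nonzero iff at least one byte of X is zero.  */
static int
has_zero (op_t x)
{
  op_t lsb = (op_t) -1 / 0xff;  /* 0x0101...01 */
  op_t msb = lsb << 7;          /* 0x8080...80 */
  return ((x - lsb) & ~x & msb) != 0;
}

/* Memory-order index of the first zero byte; X must contain one.  */
static size_t
index_first_zero (op_t x)
{
  unsigned char b[sizeof (op_t)];
  size_t i = 0;
  memcpy (b, &x, sizeof x);
  while (b[i] != 0)
    ++i;
  return i;
}

static size_t
sketch_strlen (const char *str)
{
  uintptr_t s_int = (uintptr_t) str;
  /* Round the pointer down to the start of its aligned word.  */
  const char *word_ptr = str - s_int % sizeof (op_t);
  /* Force the bytes before STR to be nonzero so they cannot match.  */
  op_t word = load_word (word_ptr) | create_mask (s_int);

  while (!has_zero (word))
    {
      word_ptr += sizeof (op_t);
      word = load_word (word_ptr);
    }
  return (size_t) (word_ptr + index_first_zero (word) - str);
}
```

Masking the leading bytes to 0xff is what lets the loop start at an aligned address without a separate byte-by-byte prologue.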
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Richard Henderson
Subject: [PATCH v3 06/18] string: Improve generic memchr
Date: Wed, 10 Jan 2018 10:47:50 -0200
Message-Id: <1515588482-15744-7-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

From: Richard Henderson

The new algorithm has the following key differences:

  - Reads the first word unaligned and uses the string-maskoff function
    to remove unwanted data.  This strategy follows the assembly-optimized
    implementations for aarch64, powerpc, and tile.

  - Uses the string-fz{b,i} and string-opthr functions.

Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu, and
sparcv9-linux-gnu by removing the arch-specific assembly implementation
and disabling multi-arch (this covers both LE and BE for 64 and 32 bits).

	[BZ #5806]
	* string/memchr.c: Use string-fzb.h, string-fzi.h, string-opthr.h.
---
 string/memchr.c | 157 +++++++++++++++-----------------------------------------
 1 file changed, 40 insertions(+), 117 deletions(-)

-- 
2.7.4

diff --git a/string/memchr.c b/string/memchr.c
index c4e21b8..ae3fd93 100644
--- a/string/memchr.c
+++ b/string/memchr.c
@@ -20,24 +20,16 @@
    License along with the GNU C Library; if not, see
    .  */
 
-#ifndef _LIBC
-# include
-#endif
-
 #include
-
 #include
+#include
+#include
+#include
+#include
+#include
+#include
 
-#include
-
-#undef __memchr
-#ifdef _LIBC
-# undef memchr
-#endif
-
-#ifndef weak_alias
-# define __memchr memchr
-#endif
+#undef memchr
 
 #ifndef MEMCHR
 # define MEMCHR __memchr
@@ -47,116 +39,47 @@
 void *
 MEMCHR (void const *s, int c_in, size_t n)
 {
-  /* On 32-bit hardware, choosing longword to be a 32-bit unsigned
-     long instead of a 64-bit uintmax_t tends to give better
-     performance.  On 64-bit hardware, unsigned long is generally 64
-     bits already.  Change this typedef to experiment with
-     performance.  */
-  typedef unsigned long int longword;
-
-  const unsigned char *char_ptr;
-  const longword *longword_ptr;
-  longword repeated_one;
-  longword repeated_c;
-  unsigned char c;
-
-  c = (unsigned char) c_in;
-
-  /* Handle the first few bytes by reading one byte at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary.  */
-  for (char_ptr = (const unsigned char *) s;
-       n > 0 && (size_t) char_ptr % sizeof (longword) != 0;
-       --n, ++char_ptr)
-    if (*char_ptr == c)
-      return (void *) char_ptr;
-
-  longword_ptr = (const longword *) char_ptr;
-
-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to any size longwords.  */
-
-  /* Compute auxiliary longword values:
-     repeated_one is a value which has a 1 in every byte.
-     repeated_c has c in every byte.  */
-  repeated_one = 0x01010101;
-  repeated_c = c | (c << 8);
-  repeated_c |= repeated_c << 16;
-  if (0xffffffffU < (longword) -1)
-    {
-      repeated_one |= repeated_one << 31 << 1;
-      repeated_c |= repeated_c << 31 << 1;
-      if (8 < sizeof (longword))
-        {
-          size_t i;
-
-          for (i = 64; i < sizeof (longword) * 8; i *= 2)
-            {
-              repeated_one |= repeated_one << i;
-              repeated_c |= repeated_c << i;
-            }
-        }
-    }
+  const op_t *word_ptr, *lword;
+  op_t repeated_c, before_mask, word;
+  const char *lbyte;
+  char *ret;
+  uintptr_t s_int;
 
-  /* Instead of the traditional loop which tests each byte, we will test a
-     longword at a time.  The tricky part is testing if *any of the four*
-     bytes in the longword in question are equal to c.  We first use an xor
-     with repeated_c.  This reduces the task to testing whether *any of the
-     four* bytes in longword1 is zero.
-
-     We compute tmp =
-       ((longword1 - repeated_one) & ~longword1) & (repeated_one << 7).
-     That is, we perform the following operations:
-       1. Subtract repeated_one.
-       2. & ~longword1.
-       3. & a mask consisting of 0x80 in every byte.
-     Consider what happens in each byte:
-       - If a byte of longword1 is zero, step 1 and 2 transform it into 0xff,
-         and step 3 transforms it into 0x80.  A carry can also be propagated
-         to more significant bytes.
-       - If a byte of longword1 is nonzero, let its lowest 1 bit be at
-         position k (0 <= k <= 7); so the lowest k bits are 0.  After step 1,
-         the byte ends in a single bit of value 0 and k bits of value 1.
-         After step 2, the result is just k bits of value 1: 2^k - 1.  After
-         step 3, the result is 0.  And no carry is produced.
-     So, if longword1 has only non-zero bytes, tmp is zero.
-     Whereas if longword1 has a zero byte, call j the position of the least
-     significant zero byte.  Then the result has a zero at positions 0, ...,
-     j-1 and a 0x80 at position j.  We cannot predict the result at the more
-     significant bytes (positions j+1..3), but it does not matter since we
-     already have a non-zero bit at position 8*j+7.
-
-     So, the test whether any byte in longword1 is zero is equivalent to
-     testing whether tmp is nonzero.  */
-
-  while (n >= sizeof (longword))
-    {
-      longword longword1 = *longword_ptr ^ repeated_c;
-      if ((((longword1 - repeated_one) & ~longword1)
-           & (repeated_one << 7)) != 0)
-        break;
-      longword_ptr++;
-      n -= sizeof (longword);
-    }
+  if (__glibc_unlikely (n == 0))
+    return NULL;
+
+  s_int = (uintptr_t) s;
+  word_ptr = (const op_t*) (s_int & -sizeof (op_t));
 
-  char_ptr = (const unsigned char *) longword_ptr;
+  /* Set up a word, each of whose bytes is C.  */
+  repeated_c = repeat_bytes (c_in);
+  before_mask = create_mask (s_int);
 
-  /* At this point, we know that either n < sizeof (longword), or one of the
-     sizeof (longword) bytes starting at char_ptr is == c.  On little-endian
-     machines, we could determine the first such byte without any further
-     memory accesses, just by looking at the tmp result from the last loop
-     iteration.  But this does not work on big-endian machines.  Choose code
-     that works in both cases.  */
+  /* Compute the address of the last byte taking in consideration possible
+     overflow.  */
+  uintptr_t lbyte_int = s_int + n - 1;
+  lbyte_int |= -(lbyte_int < s_int);
+  lbyte = (const char *) lbyte_int;
 
-  for (; n > 0; --n, ++char_ptr)
+  /* Compute the address of the word containing the last byte. */
+  lword = (const op_t *) ((uintptr_t) lbyte & -sizeof (op_t));
+
+  /* Read the first word, but munge it so that bytes before the array
+     will not match goal.  */
+  word = (*word_ptr | before_mask) ^ (repeated_c & before_mask);
+
+  while (has_eq (word, repeated_c) == 0)
     {
-      if (*char_ptr == c)
-        return (void *) char_ptr;
+      if (word_ptr == lword)
+        return NULL;
+      word = *++word_ptr;
     }
 
-  return NULL;
+  /* We found a match, but it might be in a byte past the end
+     of the array.  */
+  ret = (char *) word_ptr + index_first_eq (word, repeated_c);
+  return (ret <= lbyte) ? ret : NULL;
 }
-#ifdef weak_alias
 weak_alias (__memchr, memchr)
-#endif
 libc_hidden_builtin_def (memchr)
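The three ideas in the memchr patch above (munging the first word, saturating the last-byte address, and rejecting a match that lies past the end) can be exercised with the following sketch. It is an illustration under assumptions, not the glibc implementation: `sketch_memchr` and `load_word` are hypothetical names, the helpers are written byte-wise so they are endian-independent, `op_t` is assumed to be `unsigned long int`, and the buffer is assumed to be word-aligned and padded to a multiple of the word size so that whole-word loads stay in bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: op_t is one machine word; glibc defines its own op_t.  */
typedef unsigned long int op_t;

static op_t
repeat_bytes (unsigned char c)
{
  return ((op_t) -1 / 0xff) * c;        /* C copied into every byte */
}

static op_t
load_word (const unsigned char *p)
{
  op_t w;
  memcpy (&w, p, sizeof w);
  return w;
}

/* 0xff in each byte that precedes S_INT inside its aligned word.  */
static op_t
create_mask (uintptr_t s_int)
{
  unsigned char m[sizeof (op_t)] = { 0 };
  op_t w;
  memset (m, 0xff, s_int % sizeof (op_t));
  memcpy (&w, m, sizeof w);
  return w;
}

/* Nonzero iff some byte of X equals the corresponding byte of Y.  */
static int
has_eq (op_t x, op_t y)
{
  op_t z = x ^ y;                       /* equal bytes become zero */
  op_t lsb = repeat_bytes (0x01);
  return ((z - lsb) & ~z & (lsb << 7)) != 0;
}

/* Memory-order index of the first equal byte; one must exist.  */
static size_t
index_first_eq (op_t x, op_t y)
{
  unsigned char a[sizeof (op_t)], b[sizeof (op_t)];
  size_t i = 0;
  memcpy (a, &x, sizeof x);
  memcpy (b, &y, sizeof y);
  while (a[i] != b[i])
    ++i;
  return i;
}

static void *
sketch_memchr (const void *s, int c_in, size_t n)
{
  if (n == 0)
    return NULL;

  uintptr_t s_int = (uintptr_t) s;
  const unsigned char *word_ptr
    = (const unsigned char *) s - s_int % sizeof (op_t);
  op_t repeated_c = repeat_bytes ((unsigned char) c_in);
  op_t before_mask = create_mask (s_int);

  /* Address of the last byte, saturated to all-ones if the sum wraps.  */
  uintptr_t lbyte_int = s_int + n - 1;
  lbyte_int |= -(uintptr_t) (lbyte_int < s_int);
  uintptr_t lword_int = lbyte_int - lbyte_int % sizeof (op_t);

  /* Bytes before S become ~C, which can never compare equal to C.  */
  op_t word = (load_word (word_ptr) | before_mask)
              ^ (repeated_c & before_mask);

  while (!has_eq (word, repeated_c))
    {
      if ((uintptr_t) word_ptr == lword_int)
        return NULL;
      word_ptr += sizeof (op_t);
      word = load_word (word_ptr);
    }

  /* The match may lie in the last word but past the end of the array.  */
  const unsigned char *ret = word_ptr + index_first_eq (word, repeated_c);
  return (uintptr_t) ret <= lbyte_int ? (void *) ret : NULL;
}
```

The saturation step matters for callers that pass a very large `n` (for example `(size_t) -1`): without it the wrapped end address would compare below the start and a valid match would be rejected.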
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Richard Henderson
Subject: [PATCH v3 07/18] string: Improve generic memrchr
Date: Wed, 10 Jan 2018 10:47:51 -0200
Message-Id: <1515588482-15744-8-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

From: Richard Henderson

The new algorithm has the following key difference:

  - Uses the string-fz{b,i} functions.

It also cleans up the multiple inclusion by leaving it to the ifunc
implementations to undefine weak_alias.

Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu, and
sparcv9-linux-gnu by removing the arch-specific assembly implementation
and disabling multi-arch (this covers both LE and BE for 64 and 32 bits).

	Richard Henderson
	Adhemerval Zanella

	[BZ #5806]
	* string/memrchr.c: Use string-fzb.h, string-fzi.h.
	* sysdeps/i386/i686/multiarch/memrchr-c.c: Redefine weak_alias.
	* sysdeps/s390/multiarch/memrchr-c.c: Likewise.
---
 string/memrchr.c                        | 193 +++++++-------------------------
 sysdeps/i386/i686/multiarch/memrchr-c.c |   2 +
 sysdeps/s390/multiarch/memrchr-c.c      |   2 +
 3 files changed, 44 insertions(+), 153 deletions(-)

-- 
2.7.4

diff --git a/string/memrchr.c b/string/memrchr.c
index 191b89a..5ae9c81 100644
--- a/string/memrchr.c
+++ b/string/memrchr.c
@@ -21,177 +21,64 @@
    License along with the GNU C Library; if not, see
    .  */
 
-#include
-
-#ifdef HAVE_CONFIG_H
-# include
-#endif
-
-#if defined _LIBC
-# include
-# include
-#endif
-
-#if defined HAVE_LIMITS_H || defined _LIBC
-# include
-#endif
-
-#define LONG_MAX_32_BITS 2147483647
-
-#ifndef LONG_MAX
-# define LONG_MAX LONG_MAX_32_BITS
-#endif
-
-#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
 
 #undef __memrchr
 #undef memrchr
 
-#ifndef weak_alias
-# define __memrchr memrchr
+#ifndef MEMRCHR
+# define MEMRCHR __memrchr
 #endif
 
-/* Search no more than N bytes of S for C.  */
 void *
-#ifndef MEMRCHR
-__memrchr
-#else
-MEMRCHR
-#endif
-     (const void *s, int c_in, size_t n)
+MEMRCHR (const void *s, int c_in, size_t n)
 {
-  const unsigned char *char_ptr;
-  const unsigned long int *longword_ptr;
-  unsigned long int longword, magic_bits, charmask;
-  unsigned char c;
-
-  c = (unsigned char) c_in;
+  uintptr_t s_int = (uintptr_t) s;
+  uintptr_t lbyte_int = s_int + n;
 
   /* Handle the last few characters by reading one character at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary.  */
-  for (char_ptr = (const unsigned char *) s + n;
-       n > 0 && ((unsigned long int) char_ptr
-                 & (sizeof (longword) - 1)) != 0;
-       --n)
-    if (*--char_ptr == c)
+     Do this until CHAR_PTR is aligned on a word boundary, or
+     the entirety of small inputs.  */
+  const unsigned char *char_ptr = (const unsigned char *) lbyte_int;
+  size_t align = lbyte_int % sizeof (op_t);
+  if (n < OP_T_THRES || align > n)
+    align = n;
+  for (size_t i = 0; i < align; ++i)
+    if (*--char_ptr == c_in)
       return (void *) char_ptr;
 
-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to 8-byte longwords.  */
-
-  longword_ptr = (const unsigned long int *) char_ptr;
-
-  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
-     the "holes."  Note that there is a hole just to the left of
-     each byte, with an extra at the end:
-
-     bits:  01111110 11111110 11111110 11111111
-     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD
-
-     The 1-bits make sure that carries propagate to the next 0-bit.
-     The 0-bits provide holes for carries to fall into.  */
-  magic_bits = -1;
-  magic_bits = magic_bits / 0xff * 0xfe << 1 >> 1 | 1;
+  const op_t *word_ptr = (const op_t *) char_ptr;
+  n -= align;
+  if (__glibc_unlikely (n == 0))
+    return NULL;
 
-  /* Set up a longword, each of whose bytes is C.  */
-  charmask = c | (c << 8);
-  charmask |= charmask << 16;
-#if LONG_MAX > LONG_MAX_32_BITS
-  charmask |= charmask << 32;
-#endif
-
-  /* Instead of the traditional loop which tests each character,
-     we will test a longword at a time.  The tricky part is testing
-     if *any of the four* bytes in the longword in question are zero.  */
-  while (n >= sizeof (longword))
-    {
-      /* We tentatively exit the loop if adding MAGIC_BITS to
-         LONGWORD fails to change any of the hole bits of LONGWORD.
-
-         1) Is this safe?  Will it catch all the zero bytes?
-         Suppose there is a byte with all zeros.  Any carry bits
-         propagating from its left will fall into the hole at its
-         least significant bit and stop.  Since there will be no
-         carry from its most significant bit, the LSB of the
-         byte to the left will be unchanged, and the zero will be
-         detected.
-
-         2) Is this worthwhile?  Will it ignore everything except
-         zero bytes?  Suppose every byte of LONGWORD has a bit set
-         somewhere.  There will be a carry into bit 8.  If bit 8
-         is set, this will carry into bit 16.  If bit 8 is clear,
-         one of bits 9-15 must be set, so there will be a carry
-         into bit 16.  Similarly, there will be a carry into bit
-         24.  If one of bits 24-30 is set, there will be a carry
-         into bit 31, so all of the hole bits will be changed.
-
-         The one misfire occurs when bits 24-30 are clear and bit
-         31 is set; in this case, the hole at bit 31 is not
-         changed.  If we had access to the processor carry flag,
-         we could close this loophole by putting the fourth hole
-         at bit 32!
-
-         So it ignores everything except 128's, when they're aligned
-         properly.
-
-         3) But wait!  Aren't we looking for C, not zero?
-         Good point.  So what we do is XOR LONGWORD with a longword,
-         each of whose bytes is C.  This turns each byte that is C
-         into a zero.  */
-
-      longword = *--longword_ptr ^ charmask;
+  /* Compute the address of the word containing the initial byte. */
+  const op_t *lword = (const op_t *) (s_int & -sizeof (op_t));
 
-      /* Add MAGIC_BITS to LONGWORD.  */
-      if ((((longword + magic_bits)
+  /* Set up a word, each of whose bytes is C.  */
+  op_t repeated_c = repeat_bytes (c_in);
 
-            /* Set those bits that were unchanged by the addition.  */
-            ^ ~longword)
+  char *ret;
+  op_t word;
 
-           /* Look at only the hole bits.  If any of the hole bits
-              are unchanged, most likely one of the bytes was a
-              zero.  */
-           & ~magic_bits) != 0)
-        {
-          /* Which of the bytes was C?  If none of them were, it was
-             a misfire; continue the search.  */
-
-          const unsigned char *cp = (const unsigned char *) longword_ptr;
-
-#if LONG_MAX > 2147483647
-          if (cp[7] == c)
-            return (void *) &cp[7];
-          if (cp[6] == c)
-            return (void *) &cp[6];
-          if (cp[5] == c)
-            return (void *) &cp[5];
-          if (cp[4] == c)
-            return (void *) &cp[4];
-#endif
-          if (cp[3] == c)
-            return (void *) &cp[3];
-          if (cp[2] == c)
-            return (void *) &cp[2];
-          if (cp[1] == c)
-            return (void *) &cp[1];
-          if (cp[0] == c)
-            return (void *) cp;
-        }
-
-      n -= sizeof (longword);
-    }
-
-  char_ptr = (const unsigned char *) longword_ptr;
-
-  while (n-- > 0)
+  while (word_ptr != lword)
     {
-      if (*--char_ptr == c)
-        return (void *) char_ptr;
+      word = *--word_ptr;
+      if (has_eq (word, repeated_c))
+        goto found;
     }
+  return NULL;
 
-  return 0;
+found:
+  /* We found a match, but it might be in a byte past the start
+     of the array.  */
+  ret = (char *) word_ptr + index_last_eq (word, repeated_c);
+  return (ret >= (char*) s) ? ret : NULL;
 }
-#ifndef MEMRCHR
-# ifdef weak_alias
 weak_alias (__memrchr, memrchr)
-# endif
-#endif
diff --git a/sysdeps/i386/i686/multiarch/memrchr-c.c b/sysdeps/i386/i686/multiarch/memrchr-c.c
index ef7bbbe..23c937b 100644
--- a/sysdeps/i386/i686/multiarch/memrchr-c.c
+++ b/sysdeps/i386/i686/multiarch/memrchr-c.c
@@ -1,5 +1,7 @@
 #if IS_IN (libc)
 # define MEMRCHR  __memrchr_ia32
+# undef weak_alias
+# define weak_alias(a,b)
 # include
 extern void *__memrchr_ia32 (const void *, int, size_t);
 #endif
diff --git a/sysdeps/s390/multiarch/memrchr-c.c b/sysdeps/s390/multiarch/memrchr-c.c
index 1e3c914..d7e59a4 100644
--- a/sysdeps/s390/multiarch/memrchr-c.c
+++ b/sysdeps/s390/multiarch/memrchr-c.c
@@ -18,6 +18,8 @@
 
 #if defined HAVE_S390_VX_ASM_SUPPORT && IS_IN (libc)
 # define MEMRCHR  __memrchr_c
+# undef weak_alias
+# define weak_alias(a,b)
 
 # include
 extern __typeof (__memrchr) __memrchr_c;
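The backward scan in the memrchr patch above has two phases: a byte-wise pass over the unaligned tail (or over the whole input when it is short), then a word-wise pass down to the word containing the first byte. A sketch of that control flow follows; it is only an approximation under assumptions. `sketch_memrchr`, `load_word`, and `SKETCH_THRES` are hypothetical names (the latter stands in for `OP_T_THRES`, whose real value is not shown in the patch), `index_last_eq` is written byte-wise, `op_t` is assumed to be `unsigned long int`, and the buffer is assumed to be word-aligned and padded to a multiple of the word size. Unlike the posted hunk, the sketch converts `c_in` to `unsigned char` before the byte comparisons, on the assumption that the usual memrchr contract applies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: op_t is one machine word; glibc defines its own op_t.  */
typedef unsigned long int op_t;

/* Hypothetical stand-in for OP_T_THRES: below this, scan byte-wise.  */
enum { SKETCH_THRES = 16 };

static op_t
repeat_bytes (unsigned char c)
{
  return ((op_t) -1 / 0xff) * c;        /* C copied into every byte */
}

static op_t
load_word (const unsigned char *p)
{
  op_t w;
  memcpy (&w, p, sizeof w);
  return w;
}

/* Nonzero iff some byte of X equals the corresponding byte of Y.  */
static int
has_eq (op_t x, op_t y)
{
  op_t z = x ^ y;
  op_t lsb = repeat_bytes (0x01);
  return ((z - lsb) & ~z & (lsb << 7)) != 0;
}

/* Memory-order index of the last equal byte; one must exist.  */
static size_t
index_last_eq (op_t x, op_t y)
{
  unsigned char a[sizeof (op_t)], b[sizeof (op_t)];
  size_t i = sizeof (op_t) - 1;
  memcpy (a, &x, sizeof x);
  memcpy (b, &y, sizeof y);
  while (a[i] != b[i])
    --i;
  return i;
}

static void *
sketch_memrchr (const void *s, int c_in, size_t n)
{
  unsigned char c = (unsigned char) c_in;
  uintptr_t s_int = (uintptr_t) s;
  const unsigned char *p = (const unsigned char *) s + n; /* one past end */

  /* Byte-wise tail: until P is word-aligned, or all of a small input.  */
  size_t align = (uintptr_t) p % sizeof (op_t);
  if (n < SKETCH_THRES || align > n)
    align = n;
  for (size_t i = 0; i < align; ++i)
    if (*--p == c)
      return (void *) p;
  n -= align;
  if (n == 0)
    return NULL;

  /* Address of the aligned word that contains the first byte.  */
  uintptr_t lword_int = s_int - s_int % sizeof (op_t);
  op_t repeated_c = repeat_bytes (c);

  while ((uintptr_t) p != lword_int)
    {
      p -= sizeof (op_t);
      op_t word = load_word (p);
      if (has_eq (word, repeated_c))
        {
          /* The match may lie in the first word but before S.  */
          const unsigned char *ret = p + index_last_eq (word, repeated_c);
          return ret >= (const unsigned char *) s ? (void *) ret : NULL;
        }
    }
  return NULL;
}
```

Because the scan runs backward, only the first word of the array can contain bytes outside the range, so a single lower-bound check on the result is enough.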
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Subject: [PATCH v3 08/18] string: Improve generic strnlen
Date: Wed, 10 Jan 2018 10:47:52 -0200
Message-Id: <1515588482-15744-9-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

With an optimized memchr, the new strnlen implementation basically calls
memchr and adjusts the resulting pointer value.

It also cleans up the multiple inclusion by leaving it to the ifunc
implementations to undefine weak_alias and libc_hidden_def.

	Richard Henderson
	Adhemerval Zanella

	[BZ #5806]
	* string/strnlen.c: Rewrite in terms of memchr.
	* sysdeps/i386/i686/multiarch/strnlen-c.c: Redefine weak_alias
	and libc_hidden_def.
	* sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c:
	Likewise.
	* sysdeps/s390/multiarch/strnlen-c.c: Likewise.
---
 string/strnlen.c                                   | 139 ++-------------------
 sysdeps/i386/i686/multiarch/strnlen-c.c            |  19 +--
 .../powerpc32/power4/multiarch/strnlen-ppc32.c     |  19 +--
 sysdeps/s390/multiarch/strnlen-c.c                 |  18 ++-
 4 files changed, 43 insertions(+), 152 deletions(-)

-- 
2.7.4

diff --git a/string/strnlen.c b/string/strnlen.c
index c2ce1eb..a3ec6af 100644
--- a/string/strnlen.c
+++ b/string/strnlen.c
@@ -21,146 +21,21 @@
    not, see .  */
 
 #include
-#include
 
 /* Find the length of S, but scan at most MAXLEN characters.  If no
    '\0' terminator is found in that many characters, return MAXLEN.  */
 
-#ifdef STRNLEN
-# define __strnlen STRNLEN
+#ifndef STRNLEN
+# define STRNLEN __strnlen
 #endif
 
 size_t
-__strnlen (const char *str, size_t maxlen)
+STRNLEN (const char *str, size_t maxlen)
 {
-  const char *char_ptr, *end_ptr = str + maxlen;
-  const unsigned long int *longword_ptr;
-  unsigned long int longword, himagic, lomagic;
-
-  if (maxlen == 0)
-    return 0;
-
-  if (__glibc_unlikely (end_ptr < str))
-    end_ptr = (const char *) ~0UL;
-
-  /* Handle the first few characters by reading one character at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary.  */
-  for (char_ptr = str; ((unsigned long int) char_ptr
-                        & (sizeof (longword) - 1)) != 0;
-       ++char_ptr)
-    if (*char_ptr == '\0')
-      {
-        if (char_ptr > end_ptr)
-          char_ptr = end_ptr;
-        return char_ptr - str;
-      }
-
-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to 8-byte longwords.  */
-
-  longword_ptr = (unsigned long int *) char_ptr;
-
-  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
-     the "holes."  Note that there is a hole just to the left of
-     each byte, with an extra at the end:
-
-     bits:  01111110 11111110 11111110 11111111
-     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD
-
-     The 1-bits make sure that carries propagate to the next 0-bit.
-     The 0-bits provide holes for carries to fall into.  */
-  himagic = 0x80808080L;
-  lomagic = 0x01010101L;
-  if (sizeof (longword) > 4)
-    {
-      /* 64-bit version of the magic.  */
-      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
-      himagic = ((himagic << 16) << 16) | himagic;
-      lomagic = ((lomagic << 16) << 16) | lomagic;
-    }
-  if (sizeof (longword) > 8)
-    abort ();
-
-  /* Instead of the traditional loop which tests each character,
-     we will test a longword at a time.  The tricky part is testing
-     if *any of the four* bytes in the longword in question are zero.  */
-  while (longword_ptr < (unsigned long int *) end_ptr)
-    {
-      /* We tentatively exit the loop if adding MAGIC_BITS to
-         LONGWORD fails to change any of the hole bits of LONGWORD.
-
-         1) Is this safe?  Will it catch all the zero bytes?
-         Suppose there is a byte with all zeros.  Any carry bits
-         propagating from its left will fall into the hole at its
-         least significant bit and stop.  Since there will be no
-         carry from its most significant bit, the LSB of the
-         byte to the left will be unchanged, and the zero will be
-         detected.
-
-         2) Is this worthwhile?  Will it ignore everything except
-         zero bytes?  Suppose every byte of LONGWORD has a bit set
-         somewhere.  There will be a carry into bit 8.  If bit 8
-         is set, this will carry into bit 16.  If bit 8 is clear,
-         one of bits 9-15 must be set, so there will be a carry
-         into bit 16.  Similarly, there will be a carry into bit
-         24.  If one of bits 24-30 is set, there will be a carry
-         into bit 31, so all of the hole bits will be changed.
-
-         The one misfire occurs when bits 24-30 are clear and bit
-         31 is set; in this case, the hole at bit 31 is not
-         changed.  If we had access to the processor carry flag,
-         we could close this loophole by putting the fourth hole
-         at bit 32!
-
-         So it ignores everything except 128's, when they're aligned
-         properly.  */
-
-      longword = *longword_ptr++;
-
-      if ((longword - lomagic) & himagic)
-        {
-          /* Which of the bytes was the zero?
If none of them were, it was - a misfire; continue the search. */ - - const char *cp = (const char *) (longword_ptr - 1); - - char_ptr = cp; - if (cp[0] == 0) - break; - char_ptr = cp + 1; - if (cp[1] == 0) - break; - char_ptr = cp + 2; - if (cp[2] == 0) - break; - char_ptr = cp + 3; - if (cp[3] == 0) - break; - if (sizeof (longword) > 4) - { - char_ptr = cp + 4; - if (cp[4] == 0) - break; - char_ptr = cp + 5; - if (cp[5] == 0) - break; - char_ptr = cp + 6; - if (cp[6] == 0) - break; - char_ptr = cp + 7; - if (cp[7] == 0) - break; - } - } - char_ptr = end_ptr; - } - - if (char_ptr > end_ptr) - char_ptr = end_ptr; - return char_ptr - str; + const char *found = memchr (str, '\0', maxlen); + return found ? found - str : maxlen; } -#ifndef STRNLEN -libc_hidden_def (__strnlen) + weak_alias (__strnlen, strnlen) -#endif +libc_hidden_def (__strnlen) libc_hidden_def (strnlen) diff --git a/sysdeps/i386/i686/multiarch/strnlen-c.c b/sysdeps/i386/i686/multiarch/strnlen-c.c index 351e939..bfbf811 100644 --- a/sysdeps/i386/i686/multiarch/strnlen-c.c +++ b/sysdeps/i386/i686/multiarch/strnlen-c.c @@ -1,10 +1,15 @@ #define STRNLEN __strnlen_ia32 +#undef weak_alias +#define weak_alias(a,b) +#undef libc_hidden_def +#define libc_hidden_def(a) + +#include + #ifdef SHARED -# undef libc_hidden_def -# define libc_hidden_def(name) \ - __hidden_ver1 (__strnlen_ia32, __GI_strnlen, __strnlen_ia32); \ - strong_alias (__strnlen_ia32, __strnlen_ia32_1); \ - __hidden_ver1 (__strnlen_ia32_1, __GI___strnlen, __strnlen_ia32_1); +/* Alias for internal symbol to avoid PLT generation, it redirects the + libc_hidden_def (__strnlen/strlen) to default implementation. 
*/ +__hidden_ver1 (__strnlen_ia32, __GI_strnlen, __strnlen_ia32); +strong_alias (__strnlen_ia32, __strnlen_ia32_1); +__hidden_ver1 (__strnlen_ia32_1, __GI___strnlen, __strnlen_ia32_1); #endif - -#include "string/strnlen.c" diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c b/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c index df940d3..e2ccd21 100644 --- a/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c +++ b/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c @@ -17,12 +17,17 @@ . */ #define STRNLEN __strnlen_ppc -#ifdef SHARED -# undef libc_hidden_def -# define libc_hidden_def(name) \ - __hidden_ver1 (__strnlen_ppc, __GI_strnlen, __strnlen_ppc); \ - strong_alias (__strnlen_ppc, __strnlen_ppc_1); \ - __hidden_ver1 (__strnlen_ppc_1, __GI___strnlen, __strnlen_ppc_1); -#endif +#undef weak_alias +#define weak_alias(a,b) +#undef libc_hidden_def +#define libc_hidden_def(a) #include + +#ifdef SHARED +/* Alias for internal symbol to avoid PLT generation, it redirects the + libc_hidden_def (__strnlen/strlen) to default implementation. 
*/ +__hidden_ver1 (__strnlen_ppc, __GI_strnlen, __strnlen_ppc); \ +strong_alias (__strnlen_ppc, __strnlen_ppc_1); \ +__hidden_ver1 (__strnlen_ppc_1, __GI___strnlen, __strnlen_ppc_1); +#endif diff --git a/sysdeps/s390/multiarch/strnlen-c.c b/sysdeps/s390/multiarch/strnlen-c.c index 353e83e..f77f59d 100644 --- a/sysdeps/s390/multiarch/strnlen-c.c +++ b/sysdeps/s390/multiarch/strnlen-c.c @@ -18,13 +18,19 @@ #if defined HAVE_S390_VX_ASM_SUPPORT && IS_IN (libc) # define STRNLEN __strnlen_c +# undef weak_alias +# define weak_alias(a,b) +# undef libc_hidden_def +# define libc_hidden_def(a) + +# include + # ifdef SHARED -# undef libc_hidden_def -# define libc_hidden_def(name) \ - __hidden_ver1 (__strnlen_c, __GI_strnlen, __strnlen_c); \ - strong_alias (__strnlen_c, __strnlen_c_1); \ - __hidden_ver1 (__strnlen_c_1, __GI___strnlen, __strnlen_c_1); +/* Alias for internal symbol to avoid PLT generation, it redirects the + libc_hidden_def (__strnlen/strlen) to default implementation. */ +__hidden_ver1 (__strnlen_c, __GI_strnlen, __strnlen_c); +strong_alias (__strnlen_c, __strnlen_c_1); +__hidden_ver1 (__strnlen_c_1, __GI___strnlen, __strnlen_c_1); # endif /* SHARED */ -# include #endif /* HAVE_S390_VX_ASM_SUPPORT && IS_IN (libc) */ From patchwork Wed Jan 10 12:47:53 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 124081 Delivered-To: patch@linaro.org Received: by 10.140.22.227 with SMTP id 90csp5235187qgn; Wed, 10 Jan 2018 04:49:28 -0800 (PST) X-Google-Smtp-Source: ACJfBotNhOAEmTmkxJFFGOtLlW5C5yQ2r4xrfeG1mwfeXGGdxCYLMKlxdqZmvyCCayv8Lh5Dl+i8 X-Received: by 10.99.190.76 with SMTP id g12mr348358pgo.235.1515588568241; Wed, 10 Jan 2018 04:49:28 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1515588568; cv=none; d=google.com; s=arc-20160816; b=vviZACbn/X3jaCW/mkSViKZsFEVtXB/jeAK2LnawlBCAVVs7QTXQW+hdg1XiQI4S+L dGcsK5PtNyDq9f0OsW6dfnBJRo9VnU8MRMp2j7FHrVC8THIvDKVE9hdh7rYQpiquyf8N 
vL94Rn52YIl1hMnHhZzETgkegAECPIqbqYE2ATkTrkl+x/7aeDOrpKTI8fy7P8D9rHFx gkduJmuwakfyLXzsPyz8HiC5RXwRVSLa657Vim/bknWEVhnSnWWhce/y6rwJrDmcDqBP /Zw8bpDgZkh2iOCRDr9GXN3O447vmApct1QCt/L5ZKQMrSu4A1uu5m65PrwTHDTUByF1 D38A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=references:in-reply-to:message-id:date:subject:cc:to:from :delivered-to:sender:list-help:list-post:list-archive:list-subscribe :list-unsubscribe:list-id:precedence:mailing-list:dkim-signature :domainkey-signature:arc-authentication-results; bh=eKKt9EJ4kSe1azG9c+topdoxtYd46kh2jlXinfk4sHM=; b=meDb2A5jRCJfuJA5BNMtRH1ks+8IOfDxk4iTGppiAbXGEA0HwJqjBfwurJxYR7QeLT htop3Q7ufpkMo7Fm95ftB4ZDHs6at3/LtefTQRPlBAscz7lK93JuaQu5mn95NFMAlAYl Wwd2ZRDI2rqjSpmNef7yImIX7bOBCMLXc1jxst2ZrLDx8oyxjqI3LZC7U5aB/ZbmqlRL o6GqfS23UKCCyt2sfaISZcBV2GsaeIc8SEf09KZdRRZmL/cnN0Pa9QXxx8Nq9vnDOFMo qCLS3nIrGq2+Lr8xOi6QHr1ibaTD/vOz/z9hbLM0PCcO9yFfJNhh/Z/JjFQ8FjfF9LEh eOdQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=qkDFQ7Nt; spf=pass (google.com: domain of libc-alpha-return-89003-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=libc-alpha-return-89003-patch=linaro.org@sourceware.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from sourceware.org (server1.sourceware.org. 
[209.132.180.131]) by mx.google.com with ESMTPS id l30si12100394plg.123.2018.01.10.04.49.27 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 10 Jan 2018 04:49:28 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-return-89003-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=qkDFQ7Nt; spf=pass (google.com: domain of libc-alpha-return-89003-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=libc-alpha-return-89003-patch=linaro.org@sourceware.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to :references; q=dns; s=default; b=OCKanqoEs3cxEda4fbaPolS7cNVxZWL t9pYS7bhswSpTc2A+mfy+JsTYy1nMf6mR0NjUYJg5JYqYutXfTMeKM8mDmq3o+P/ 15KAtXE73Rgvkk9gwiDAdwPpw9H0VBbAwIv5REvt/EBDripWjH6vY1eqr7r4RIB8 VgYcZGB3sA9M= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to :references; s=default; bh=0+xZ5zEEM2yrOnTQkI74ckcrwLg=; b=qkDFQ 7Nt8zYKK4xuMjxmCL50S0Gbm88v5EtCRtyR0EYaW4aIUJSf/TlU8Q2l3l5Jts2nM QnkgIkvEFWlULJdg/BlXzp4FwwWlynXo8bXGoqAFNYFcMG5lckanCv2rvFoFTptH RVzIkyKGF0NZomHPN6bRRro1ZtRHT58fG+U34g= Received: (qmail 129824 invoked by alias); 10 Jan 2018 12:48:43 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 124476 invoked by uid 89); 10 Jan 2018 12:48:33 -0000 
Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.9 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=2430 X-HELO: mail-qk0-f170.google.com X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=eKKt9EJ4kSe1azG9c+topdoxtYd46kh2jlXinfk4sHM=; b=T8MtKd2F5zsiLvokUNVNkQFFeDCjzYpkhhycvTEiAvD7klrxu4nDceZLbt58oMvI2K w2WOYDAdbYLkerqJaYijMM98u1x0/NB8hus88OhVkc1BXzBBmF/112F6/4VAnhszWWQQ +N+IvjmR+OdkHKtiT40r05+yFoRFqfq8zDl2mj2d5X3d9CCcP3WwspRaNmxxTqwDQZ5/ K/3ELW9AuXFraI5ZfpP4UIGcsXRERIiVUs46TcfSRpMsJchhYokxdWhceZ/SHzyWcEap pj15y30UAJetat82xVD5GcoYkCOhad0+QxcIkhl5lGR+jJQ6VIxRtHdRit2aLKzwE7TT uK/g== X-Gm-Message-State: AKwxytcgQ5+fzE5wak88fRcjByOEM70m3HuNU+rur6bshjpbb7Tvjy4A haVCXjuDzzWWa4s5V3jIqIG6wHBAwgI= X-Received: by 10.55.214.75 with SMTP id t72mr27405753qki.42.1515588502984; Wed, 10 Jan 2018 04:48:22 -0800 (PST) From: Adhemerval Zanella To: libc-alpha@sourceware.org Cc: Richard Henderson Subject: [PATCH v3 09/18] string: Improve generic strchr Date: Wed, 10 Jan 2018 10:47:53 -0200 Message-Id: <1515588482-15744-10-git-send-email-adhemerval.zanella@linaro.org> In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org> References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org> From: Richard Henderson The new algorithm has the following key differences: - Reads the first word from the aligned-down address and uses the string-maskoff functions to mask off the bytes that precede the string. This strategy follows the assembly-optimized implementations for aarch64 and powerpc. - Uses the string-fz{b,i} and string-extbyte functions. Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu, and sparcv9-linux-gnu by removing the arch-specific assembly implementations and disabling multi-arch (this covers both LE and BE for 64 and 32 bits).
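The word-at-a-time test described above can be illustrated in isolation. The following is a minimal sketch, not the glibc code: word_t, repeat_byte, zero_byte_mask, zero_or_eq_mask, first_marked_byte and find_in_word are hypothetical names standing in for op_t and the string-fz{b,i} helpers, and the zero-byte formula is one common exact variant that may differ from what the headers in this series use. It loads the word with memcpy and scans the mask in memory order, so it depends on neither alignment nor endianness. Masking of the bytes that precede the string is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: a 64-bit word stands in for glibc's op_t.  */
typedef uint64_t word_t;

/* Replicate byte C into every byte of a word.  */
static word_t
repeat_byte (unsigned char c)
{
  return (word_t) c * (~(word_t) 0 / 0xff);
}

/* Return a word with 0x80 in every byte position where X holds a zero
   byte, and 0 elsewhere.  Adding 0x7f to the low seven bits of a byte
   never carries into the next byte, so there are no false positives.  */
static word_t
zero_byte_mask (word_t x)
{
  const word_t low7 = repeat_byte (0x7f);
  return ~(((x & low7) + low7) | x | low7);
}

/* Mark the bytes of X that are zero or equal to the byte repeated in
   REPEATED_C.  XOR turns every byte equal to C into zero.  */
static word_t
zero_or_eq_mask (word_t x, word_t repeated_c)
{
  return zero_byte_mask (x) | zero_byte_mask (x ^ repeated_c);
}

/* Index, in memory order, of the first marked byte, or -1 if none.
   Copying the mask out byte by byte avoids any endianness assumption.  */
static int
first_marked_byte (word_t mask)
{
  unsigned char bytes[sizeof mask];
  memcpy (bytes, &mask, sizeof mask);
  for (size_t i = 0; i < sizeof mask; i++)
    if (bytes[i] != 0)
      return (int) i;
  return -1;
}

/* Examine one word's worth of bytes at BUF and return the index of the
   first byte that is NUL or equal to C, or -1 if there is none.  */
static int
find_in_word (const unsigned char *buf, unsigned char c)
{
  word_t word;
  assert (buf != NULL);
  memcpy (&word, buf, sizeof word);
  return first_marked_byte (zero_or_eq_mask (word, repeat_byte (c)));
}
```

For the eight bytes "abc\0defg", find_in_word returns 2 for 'c' and 3 for a character that does not occur, because the NUL byte is reached first. A strchr-style caller then has to check which of the two conditions matched, whereas a strchrnul-style caller can return the position directly.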
Richard Henderson Adhemerval Zanella [BZ #5806] * string/strchr.c: Use string-fzb.h, string-fzi.h, string-extbyte.h. * sysdeps/s390/multiarch/strchr-c.c: Redefine weak_alias. --- string/strchr.c | 166 +++++++------------------------------- sysdeps/s390/multiarch/strchr-c.c | 1 + 2 files changed, 29 insertions(+), 138 deletions(-) -- 2.7.4 diff --git a/string/strchr.c b/string/strchr.c index a63fdfc..ee8ed5c 100644 --- a/string/strchr.c +++ b/string/strchr.c @@ -22,8 +22,15 @@ #include #include +#include +#include +#include +#include +#include +#include #undef strchr +#undef index #ifndef STRCHR # define STRCHR strchr @@ -33,153 +40,36 @@ char * STRCHR (const char *s, int c_in) { - const unsigned char *char_ptr; - const unsigned long int *longword_ptr; - unsigned long int longword, magic_bits, charmask; - unsigned char c; + const op_t *word_ptr; + op_t found, word; - c = (unsigned char) c_in; + /* Set up a word, each of whose bytes is C. */ + unsigned char c = (unsigned char) c_in; + op_t repeated_c = repeat_bytes (c_in); - /* Handle the first few characters by reading one character at a time. - Do this until CHAR_PTR is aligned on a longword boundary. */ - for (char_ptr = (const unsigned char *) s; - ((unsigned long int) char_ptr & (sizeof (longword) - 1)) != 0; - ++char_ptr) - if (*char_ptr == c) - return (void *) char_ptr; - else if (*char_ptr == '\0') - return NULL; + /* Align the input address to op_t. */ + uintptr_t s_int = (uintptr_t) s; + word_ptr = (op_t*) (s_int & -sizeof (op_t)); - /* All these elucidatory comments refer to 4-byte longwords, - but the theory applies equally well to 8-byte longwords. */ + /* Read the first aligned word, but force bytes before the string to + match neither zero nor goal (we make sure the high bit of each byte + is 1, and the low 7 bits are all the opposite of the goal byte). 
*/ + op_t bmask = create_mask (s_int); + word = (*word_ptr | bmask) ^ (repeated_c & highbit_mask (bmask)); - longword_ptr = (unsigned long int *) char_ptr; - - /* Bits 31, 24, 16, and 8 of this number are zero. Call these bits - the "holes." Note that there is a hole just to the left of - each byte, with an extra at the end: - - bits: 01111110 11111110 11111110 11111111 - bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD - - The 1-bits make sure that carries propagate to the next 0-bit. - The 0-bits provide holes for carries to fall into. */ - magic_bits = -1; - magic_bits = magic_bits / 0xff * 0xfe << 1 >> 1 | 1; - - /* Set up a longword, each of whose bytes is C. */ - charmask = c | (c << 8); - charmask |= charmask << 16; - if (sizeof (longword) > 4) - /* Do the shift in two steps to avoid a warning if long has 32 bits. */ - charmask |= (charmask << 16) << 16; - if (sizeof (longword) > 8) - abort (); - - /* Instead of the traditional loop which tests each character, - we will test a longword at a time. The tricky part is testing - if *any of the four* bytes in the longword in question are zero. */ - for (;;) + while (1) { - /* We tentatively exit the loop if adding MAGIC_BITS to - LONGWORD fails to change any of the hole bits of LONGWORD. - - 1) Is this safe? Will it catch all the zero bytes? - Suppose there is a byte with all zeros. Any carry bits - propagating from its left will fall into the hole at its - least significant bit and stop. Since there will be no - carry from its most significant bit, the LSB of the - byte to the left will be unchanged, and the zero will be - detected. - - 2) Is this worthwhile? Will it ignore everything except - zero bytes? Suppose every byte of LONGWORD has a bit set - somewhere. There will be a carry into bit 8. If bit 8 - is set, this will carry into bit 16. If bit 8 is clear, - one of bits 9-15 must be set, so there will be a carry - into bit 16. Similarly, there will be a carry into bit - 24. 
If one of bits 24-30 is set, there will be a carry - into bit 31, so all of the hole bits will be changed. - - The one misfire occurs when bits 24-30 are clear and bit - 31 is set; in this case, the hole at bit 31 is not - changed. If we had access to the processor carry flag, - we could close this loophole by putting the fourth hole - at bit 32! - - So it ignores everything except 128's, when they're aligned - properly. - - 3) But wait! Aren't we looking for C as well as zero? - Good point. So what we do is XOR LONGWORD with a longword, - each of whose bytes is C. This turns each byte that is C - into a zero. */ - - longword = *longword_ptr++; - - /* Add MAGIC_BITS to LONGWORD. */ - if ((((longword + magic_bits) - - /* Set those bits that were unchanged by the addition. */ - ^ ~longword) - - /* Look at only the hole bits. If any of the hole bits - are unchanged, most likely one of the bytes was a - zero. */ - & ~magic_bits) != 0 || - - /* That caught zeroes. Now test for C. */ - ((((longword ^ charmask) + magic_bits) ^ ~(longword ^ charmask)) - & ~magic_bits) != 0) - { - /* Which of the bytes was C or zero? - If none of them were, it was a misfire; continue the search. 
*/ - - const unsigned char *cp = (const unsigned char *) (longword_ptr - 1); - - if (*cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (sizeof (longword) > 4) - { - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - } - } + if (has_zero_eq (word, repeated_c)) + break; + word = *++word_ptr; } + found = index_first_zero_eq (word, repeated_c); + + if (extractbyte (word, found) == c) + return (char *) (word_ptr) + found; return NULL; } -#ifdef weak_alias -# undef index weak_alias (strchr, index) -#endif libc_hidden_builtin_def (strchr) diff --git a/sysdeps/s390/multiarch/strchr-c.c b/sysdeps/s390/multiarch/strchr-c.c index 606cb56..e91ef94 100644 --- a/sysdeps/s390/multiarch/strchr-c.c +++ b/sysdeps/s390/multiarch/strchr-c.c @@ -19,6 +19,7 @@ #if defined HAVE_S390_VX_ASM_SUPPORT && IS_IN (libc) # define STRCHR __strchr_c # undef weak_alias +# define weak_alias(a, b) # ifdef SHARED # undef libc_hidden_builtin_def # define libc_hidden_builtin_def(name) \ From patchwork Wed Jan 10 12:47:54 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 124095 Delivered-To: patch@linaro.org Received: by 10.140.22.227 with SMTP id 90csp5237259qgn; Wed, 10 Jan 2018 04:51:46 -0800 (PST) X-Google-Smtp-Source: ACJfBotEhhl+5pURTf/keAEfZXK22HnyPZ9VLd4y2wNRdT2eIzB3mJbG4oGJ6Or9NfOUYQUwGhNS X-Received: by 10.101.85.15 with SMTP id f15mr3805184pgr.153.1515588706908; Wed, 10 
Jan 2018 04:51:46 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1515588706; cv=none; d=google.com; s=arc-20160816; b=sUWnZUEQ/VsG76d3arfkdVK7so6l4ZRh7FIuZCpByT4vG97CWvd6aJnjcgvYqA5UPy V3SS+r0EGmO6Z7bKhVnqCVafawM4Htw/tMv7IfjcE8sGiGc8Pf38Yuuy+fItYlbn4TVT jL4HdfXE4DUG9MNi/fISgXRRVVUqrGNz+5WREaCc8YLAtclLbeX53croqldNZH0en+zn +wT+v2tx8L/8XdUuNKqZVvvWzCSD7HvNOu/2FHA76frA3WtPnuulK1nrCXSXZ9Df6L9p RCP7+WEhgH+eXPAss/4gcEz4BVVtiNRBM0O81wTrOnefjgOKcGNk+nO59f5HlMNo1tkb 1OQQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=references:in-reply-to:message-id:date:subject:cc:to:from :delivered-to:sender:list-help:list-post:list-archive:list-subscribe :list-unsubscribe:list-id:precedence:mailing-list:dkim-signature :domainkey-signature:arc-authentication-results; bh=irvnEYvgvHaWmrx2Q8RIdFzDWdxlHO/HcvsibierjG0=; b=MAqnv+68bXvq8WmlySo/mpvMXe2hz3xkuIa0vY4wWlSbjRed1344vigIiHcbZMMLOB a6yZUn88LOhpTOG3PTzwGDhN8faGVG92Zty3dpdutgyppjiqVoLm6vTUUaUjXw7X4pHB aFN6bXMCDfcrKhZ9CXwaxPTYJxKZVqbZjWNF7eG5BI7HheCdW9awgIb8dotduAnhO1eU Z1paaeP4CNIMW4/mlEuk3s1nphb+rPrHaYOihHnNT+O6LFx+RaY6td/Loyk1i61HXBlt Ng78YIzbkjVeXHVlV1nu0RCsJqQHXCuDdTmWcgjck0cYXl8zqP40aGrlxCN2zpqGPdq4 Fb0w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=ediz/X8f; spf=pass (google.com: domain of libc-alpha-return-89016-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=libc-alpha-return-89016-patch=linaro.org@sourceware.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from sourceware.org (server1.sourceware.org. 
[209.132.180.131]) by mx.google.com with ESMTPS id u85si11705724pfi.278.2018.01.10.04.51.46 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 10 Jan 2018 04:51:46 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-return-89016-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=ediz/X8f; spf=pass (google.com: domain of libc-alpha-return-89016-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=libc-alpha-return-89016-patch=linaro.org@sourceware.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to :references; q=dns; s=default; b=o9FUiEWNjl3nl6V7vsLk4+i7u/bcuAA PgbgCsRMZuDimqenYPai4NsfPx7KxEtgJV6McVwB2+/XQtNA29JSvuIihLm3Y1u2 5ReKEgwRrMQCp5HgQ5qK2dTrwBmHyvYSwRGZRdQkv3QWYLyhAivSkPkocJYL/6ru i2R06HoOIYMc= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to :references; s=default; bh=F7j63yb6UgyQXz10blCV4KGSRsw=; b=ediz/ X8f01POIM71oISfjj+eTuzWqHeM173pI2EJMpgBuEGicT6evDeu9RcpEvtBzSVi1 2EVhdcv2SX9+LG3XQIMaR9l8Vfq98TlryW0GgSPR5qsH2lXOcWyL3dFisAi33v3i cSxfPollMkyDXk7g+eY+LySEPzcdcXGAZ0hbuI= Received: (qmail 767 invoked by alias); 10 Jan 2018 12:48:49 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 124478 invoked by uid 89); 10 Jan 2018 12:48:33 -0000 Authentication-Results: 
sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.9 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=2430 X-HELO: mail-qk0-f176.google.com X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=irvnEYvgvHaWmrx2Q8RIdFzDWdxlHO/HcvsibierjG0=; b=A0ytSERtUx7rqdVIJD0zEqymRMSm1NrN7Zxc89juLk6CG/QA10n1xbrwX7U4/Pncbq KkHZWrzKdUn24a9K4rNtxmOejYL/mGLcJIbXQJI+GkF/DEo0bx51A41+dWJxhoDuXM3g ZHOOGP62Oku8vFyKXbzmzVUREPmgq2V6HT6MoRqe3WkxA9OHLvEirZ4b9exaE9zjjOhd MQAtZ2gDX0Ml2KX62gnSEjca0H0kedjmiN64baxIoLn42GZRtirQA/L0RKhXKdoqcapN BYz6oXqhkRVQ2AGTD4Z1oec6EhqUr68wS+OwKmZxPR+EC+KtjJyTi6iTGJVY2rRBERJY jQGA== X-Gm-Message-State: AKwxyte5AMe60xMgUIp33w3jjsU5G9f2J4LFvsjEfiqCQW4c4mdxPX6t wUPbsKXd2htFuY4ri/Ymj8/0wiRZW8U= X-Received: by 10.55.115.194 with SMTP id o185mr23900974qkc.143.1515588504506; Wed, 10 Jan 2018 04:48:24 -0800 (PST) From: Adhemerval Zanella To: libc-alpha@sourceware.org Cc: Richard Henderson Subject: [PATCH v3 10/18] string: Improve generic strchrnul Date: Wed, 10 Jan 2018 10:47:54 -0200 Message-Id: <1515588482-15744-11-git-send-email-adhemerval.zanella@linaro.org> In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org> References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org> From: Richard Henderson The new algorithm has the following key differences: - Reads the first word from the aligned-down address and uses the string-maskoff functions to mask off the bytes that precede the string. This strategy follows the assembly-optimized implementations for aarch64, powerpc and tile. - Uses the string-fz{b,i} functions. Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu, and sparcv9-linux-gnu by removing the arch-specific assembly implementations and disabling multi-arch (this covers both LE and BE for 64 and 32 bits).
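For reference, the contract that the rewritten function has to preserve can be stated one byte at a time. This is an illustrative sketch only, not glibc code; ref_strchrnul and ref_strchr are hypothetical names. It also shows why a strchrnul-style search never needs to return NULL, while strchr needs one extra check on the byte that stopped the search.

```c
#include <assert.h>
#include <stddef.h>

/* Byte-at-a-time reference: return a pointer to the first occurrence
   of C in S, or to the terminating NUL if C does not occur.  */
static char *
ref_strchrnul (const char *s, int c_in)
{
  unsigned char c = (unsigned char) c_in;
  assert (s != NULL);
  while (*s != '\0' && (unsigned char) *s != c)
    s++;
  return (char *) s;
}

/* strchr differs only in the final step: if the search stopped at the
   terminator rather than at C, report failure.  Searching for '\0'
   itself still succeeds, because the stopping byte then equals C.  */
static char *
ref_strchr (const char *s, int c_in)
{
  char *p = ref_strchrnul (s, c_in);
  return (unsigned char) *p == (unsigned char) c_in ? p : NULL;
}
```

With s pointing at "hello", ref_strchrnul (s, 'z') returns s + 5 (the terminator), whereas ref_strchr (s, 'z') returns NULL.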
[BZ #5806] * string/strchrnul.c: Use string-fzb.h, string-fzi.h. --- string/strchrnul.c | 146 +++++++++-------------------------------------------- 1 file changed, 25 insertions(+), 121 deletions(-) -- 2.7.4 diff --git a/string/strchrnul.c b/string/strchrnul.c index 5a17602..beeab88 100644 --- a/string/strchrnul.c +++ b/string/strchrnul.c @@ -21,8 +21,12 @@ . */ #include -#include #include +#include +#include +#include +#include +#include #undef __strchrnul #undef strchrnul @@ -33,134 +37,34 @@ /* Find the first occurrence of C in S or the final NUL byte. */ char * -STRCHRNUL (const char *s, int c_in) +STRCHRNUL (const char *str, int c_in) { - const unsigned char *char_ptr; - const unsigned long int *longword_ptr; - unsigned long int longword, magic_bits, charmask; - unsigned char c; + const op_t *word_ptr; + op_t found, word; - c = (unsigned char) c_in; + /* Set up a word, each of whose bytes is C. */ + op_t repeated_c = repeat_bytes (c_in); - /* Handle the first few characters by reading one character at a time. - Do this until CHAR_PTR is aligned on a longword boundary. */ - for (char_ptr = (const unsigned char *) s; - ((unsigned long int) char_ptr & (sizeof (longword) - 1)) != 0; - ++char_ptr) - if (*char_ptr == c || *char_ptr == '\0') - return (void *) char_ptr; + /* Align the input address to op_t. */ + uintptr_t s_int = (uintptr_t) str; + word_ptr = (op_t*) (s_int & -sizeof (op_t)); - /* All these elucidatory comments refer to 4-byte longwords, - but the theory applies equally well to 8-byte longwords. */ + /* Read the first aligned word, but force bytes before the string to + match neither zero nor goal (we make sure the high bit of each byte + is 1, and the low 7 bits are all the opposite of the goal byte). */ + op_t bmask = create_mask (s_int); + word = (*word_ptr | bmask) ^ (repeated_c & highbit_mask (bmask)); - longword_ptr = (unsigned long int *) char_ptr; - - /* Bits 31, 24, 16, and 8 of this number are zero. Call these bits - the "holes." 
Note that there is a hole just to the left of
-     each byte, with an extra at the end:
-
-     bits:  01111110 11111110 11111110 11111111
-     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD
-
-     The 1-bits make sure that carries propagate to the next 0-bit.
-     The 0-bits provide holes for carries to fall into. */
-  magic_bits = -1;
-  magic_bits = magic_bits / 0xff * 0xfe << 1 >> 1 | 1;
-
-  /* Set up a longword, each of whose bytes is C. */
-  charmask = c | (c << 8);
-  charmask |= charmask << 16;
-  if (sizeof (longword) > 4)
-    /* Do the shift in two steps to avoid a warning if long has 32 bits. */
-    charmask |= (charmask << 16) << 16;
-  if (sizeof (longword) > 8)
-    abort ();
-
-  /* Instead of the traditional loop which tests each character,
-     we will test a longword at a time. The tricky part is testing
-     if *any of the four* bytes in the longword in question are zero. */
-  for (;;)
+  while (1)
     {
-      /* We tentatively exit the loop if adding MAGIC_BITS to
-         LONGWORD fails to change any of the hole bits of LONGWORD.
-
-         1) Is this safe? Will it catch all the zero bytes?
-         Suppose there is a byte with all zeros. Any carry bits
-         propagating from its left will fall into the hole at its
-         least significant bit and stop. Since there will be no
-         carry from its most significant bit, the LSB of the
-         byte to the left will be unchanged, and the zero will be
-         detected.
-
-         2) Is this worthwhile? Will it ignore everything except
-         zero bytes? Suppose every byte of LONGWORD has a bit set
-         somewhere. There will be a carry into bit 8. If bit 8
-         is set, this will carry into bit 16. If bit 8 is clear,
-         one of bits 9-15 must be set, so there will be a carry
-         into bit 16. Similarly, there will be a carry into bit
-         24. If one of bits 24-30 is set, there will be a carry
-         into bit 31, so all of the hole bits will be changed.
-
-         The one misfire occurs when bits 24-30 are clear and bit
-         31 is set; in this case, the hole at bit 31 is not
-         changed. If we had access to the processor carry flag,
-         we could close this loophole by putting the fourth hole
-         at bit 32!
-
-         So it ignores everything except 128's, when they're aligned
-         properly.
-
-         3) But wait! Aren't we looking for C as well as zero?
-         Good point. So what we do is XOR LONGWORD with a longword,
-         each of whose bytes is C. This turns each byte that is C
-         into a zero. */
-
-      longword = *longword_ptr++;
-
-      /* Add MAGIC_BITS to LONGWORD. */
-      if ((((longword + magic_bits)
-
-            /* Set those bits that were unchanged by the addition. */
-            ^ ~longword)
-
-           /* Look at only the hole bits. If any of the hole bits
-              are unchanged, most likely one of the bytes was a
-              zero. */
-           & ~magic_bits) != 0 ||
-
-          /* That caught zeroes. Now test for C. */
-          ((((longword ^ charmask) + magic_bits) ^ ~(longword ^ charmask))
-           & ~magic_bits) != 0)
-        {
-          /* Which of the bytes was C or zero?
-             If none of them were, it was a misfire; continue the search. */
-
-          const unsigned char *cp = (const unsigned char *) (longword_ptr - 1);
-
-          if (*cp == c || *cp == '\0')
-            return (char *) cp;
-          if (*++cp == c || *cp == '\0')
-            return (char *) cp;
-          if (*++cp == c || *cp == '\0')
-            return (char *) cp;
-          if (*++cp == c || *cp == '\0')
-            return (char *) cp;
-          if (sizeof (longword) > 4)
-            {
-              if (*++cp == c || *cp == '\0')
-                return (char *) cp;
-              if (*++cp == c || *cp == '\0')
-                return (char *) cp;
-              if (*++cp == c || *cp == '\0')
-                return (char *) cp;
-              if (*++cp == c || *cp == '\0')
-                return (char *) cp;
-            }
-        }
+      if (has_zero_eq (word, repeated_c))
+        break;
+      word = *++word_ptr;
     }

-  /* This should never happen. */
-  return NULL;
+  found = index_first_zero_eq (word, repeated_c);
+
+  return (char *) (word_ptr) + found;
 }
 weak_alias (__strchrnul, strchrnul)
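For readers without the string-fz{b,i} headers at hand, here is a minimal portable sketch of what the two helpers used in the new loop could compute. It is not the glibc implementation: the `sketch_` names and the `word_t` type (a stand-in for op_t, assumed to be 64 bits) are hypothetical, and the index helper deliberately walks the bytes in memory order so that it does not depend on endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t word_t;            /* hypothetical stand-in for op_t */

#define ONES  ((word_t) -1 / 0xff)  /* 0x0101...01 */
#define HIGHS (ONES * 0x80)         /* 0x8080...80 */

/* Nonzero iff some byte of X is zero.  As a boolean this test is exact.  */
static int
sketch_has_zero (word_t x)
{
  return ((x - ONES) & ~x & HIGHS) != 0;
}

/* Replicate byte C into every byte of a word.  */
static word_t
sketch_repeat_byte (unsigned char c)
{
  return ONES * c;
}

/* Nonzero iff some byte of X is zero or equals the byte replicated in
   REPEATED_C.  XOR turns every matching byte into a zero byte.  */
static int
sketch_has_zero_eq (word_t x, word_t repeated_c)
{
  return sketch_has_zero (x) || sketch_has_zero (x ^ repeated_c);
}

/* Index, in memory order, of the first byte of X that is zero or equal
   to C; sizeof (word_t) if there is none.  */
static size_t
sketch_index_first_zero_eq (word_t x, unsigned char c)
{
  unsigned char b[sizeof x];
  memcpy (b, &x, sizeof x);
  for (size_t i = 0; i < sizeof x; i++)
    if (b[i] == 0 || b[i] == c)
      return i;
  return sizeof x;
}
```

The word test is what lets the loop skip a whole word per iteration; the byte walk only runs once, on the word that is already known to contain a hit.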
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Richard Henderson
Subject: [PATCH v3 11/18] string: Improve generic strcmp
Date: Wed, 10 Jan 2018 10:47:55 -0200
Message-Id: <1515588482-15744-12-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

From: Richard Henderson

The new generic implementation uses word operations along with the new
string-fz{b,i} functions, even for inputs with different alignments (in
which case it still uses aligned accesses plus a merge operation to get
a correct word-by-word comparison).

Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu, and
sparcv9-linux-gnu by removing the arch-specific assembly implementation
and disabling multi-arch (it covers both LE and BE for 64 and 32 bits).

	Richard Henderson
	Adhemerval Zanella

	* string/strcmp.c: Rewrite using memcopy.h, string-fzb.h,
	string-fzi.h.
---
 string/strcmp.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 89 insertions(+), 8 deletions(-)

-- 
2.7.4

diff --git a/string/strcmp.c b/string/strcmp.c
index e198d19..c346ab9 100644
--- a/string/strcmp.c
+++ b/string/strcmp.c
@@ -16,6 +16,12 @@
    . */

 #include
+#include
+#include
+#include
+#include
+#include
+#include

 #undef strcmp

@@ -29,19 +35,94 @@
 int
 STRCMP (const char *p1, const char *p2)
 {
-  const unsigned char *s1 = (const unsigned char *) p1;
-  const unsigned char *s2 = (const unsigned char *) p2;
+  const op_t *x1, *x2;
+  op_t w1, w2;
   unsigned char c1, c2;
+  uintptr_t i, n, ofs;
+  int diff;

-  do
+  /* Handle the unaligned bytes of p1 first. */
+  n = -(uintptr_t)p1 % sizeof(op_t);
+  for (i = 0; i < n; ++i)
     {
-      c1 = (unsigned char) *s1++;
-      c2 = (unsigned char) *s2++;
-      if (c1 == '\0')
-        return c1 - c2;
+      c1 = *p1++;
+      c2 = *p2++;
+      diff = c1 - c2;
+      if (c1 == '\0' || diff)
+        return diff;
     }
-  while (c1 == c2);

+  /* P1 is now aligned to op_t. P2 may or may not be. */
+  x1 = (const op_t *)p1;
+  w1 = *x1++;
+  ofs = (uintptr_t)p2 % sizeof(op_t);
+  if (ofs == 0)
+    {
+      x2 = (const op_t *)p2;
+      w2 = *x2++;
+      /* Aligned loop. If a difference is found, exit to compare the
+         bytes. Else if a zero is found we have equal strings. */
+      while (w1 == w2)
+        {
+          if (has_zero (w1))
+            return 0;
+          w1 = *x1++;
+          w2 = *x2++;
+        }
+    }
+  else
+    {
+      op_t w2a, w2b;
+      uintptr_t sh_1, sh_2;
+
+      x2 = (const op_t *)(p2 - ofs);
+      w2a = *x2++;
+      sh_1 = ofs * CHAR_BIT;
+      sh_2 = sizeof(op_t) * CHAR_BIT - sh_1;
+
+      /* Align the first partial of P2, with 0xff for the rest of the
+         bytes so that we can also apply the has_zero test to see if we
+         have already reached EOS. If we have, then we can simply fall
+         through to the final comparison. */
+      w2 = MERGE (w2a, sh_1, (op_t)-1, sh_2);
+      if (!has_zero (w2))
+        {
+          /* Unaligned loop. The invariant is that W2B, which is "ahead"
+             of W1, does not contain end-of-string. Therefore it is safe
+             (and necessary) to read another word from each while we do
+             not have a difference. */
+          while (1)
+            {
+              w2b = *x2++;
+              w2 = MERGE (w2a, sh_1, w2b, sh_2);
+              if (w1 != w2)
+                goto final_cmp;
+              if (has_zero (w2b))
+                break;
+              w1 = *x1++;
+              w2a = w2b;
+            }
+
+          /* Zero found in the second partial of P2. If we had EOS
+             in the aligned word, we have equality. */
+          if (has_zero (w1))
+            return 0;
+
+          /* Load the final word of P1 and align the final partial of P2. */
+          w1 = *x1++;
+          w2 = MERGE (w2b, sh_1, 0, sh_2);
+        }
+    }
+
+ final_cmp:
+  for (i = 0; i < sizeof (op_t); i++)
+    {
+      c1 = extractbyte (w1, i);
+      c2 = extractbyte (w2, i);
+      if (c1 == '\0' || c1 != c2)
+        return c1 - c2;
+    }
   return c1 - c2;
 }
+
 libc_hidden_builtin_def (strcmp)
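The unaligned path in the strcmp patch rests on the merge step: two adjacent aligned words are combined into the word that starts at a byte offset inside the first one. The following is a small sketch of that idea only, not the MERGE macro from memcopy.h. The names `sketch_merge` and `sketch_load_unaligned` and the 64-bit `word_t` are hypothetical, and the endianness check assumes the `__BYTE_ORDER__` macro predefined by GCC and Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t word_t;   /* hypothetical stand-in for op_t */

/* Combine two adjacent aligned words into the word that starts OFS bytes
   into W0.  OFS must be in 1 .. sizeof (word_t) - 1: with OFS == 0 one of
   the shifts would be by the full word width, which is undefined in C.
   That is why the aligned case needs its own path.  */
static word_t
sketch_merge (word_t w0, word_t w1, size_t ofs)
{
  unsigned sh_1 = (unsigned) ofs * 8;
  unsigned sh_2 = (unsigned) sizeof (word_t) * 8 - sh_1;
  assert (ofs > 0 && ofs < sizeof (word_t));
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  /* First byte in memory is the least significant one.  */
  return (w0 >> sh_1) | (w1 << sh_2);
#else
  /* First byte in memory is the most significant one.  */
  return (w0 << sh_1) | (w1 >> sh_2);
#endif
}

/* Value of the word at byte offset OFS from ALIGNED, obtained with two
   aligned loads and one merge.  */
static word_t
sketch_load_unaligned (const word_t *aligned, size_t ofs)
{
  return sketch_merge (aligned[0], aligned[1], ofs);
}
```

Passing an all-ones word as the second operand, as the patch does for the first partial word, fills the bytes that have not been loaded yet with 0xff, so a zero-byte test on the result can only fire on real string bytes.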
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Adhemerval Zanella
Subject: [PATCH v3 12/18] string: Improve generic strcpy
Date: Wed, 10 Jan 2018 10:47:56 -0200
Message-Id: <1515588482-15744-13-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

From: Adhemerval Zanella

The new generic implementation uses word operations along with the new
string-fz{b,i} functions, even for inputs with different alignments (in
which case it still uses aligned accesses plus a merge operation to get
a correct word-by-word copy).

Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu, and
sparcv9-linux-gnu by removing the arch-specific assembly implementation
and disabling multi-arch (it covers both LE and BE for 64 and 32 bits).

	Richard Henderson
	Adhemerval Zanella

	* string/strcpy.c: Rewrite using memcopy.h, string-fzb.h,
	string-fzi.h.
	* string/test-strcpy.c (test_main): Add more coverage.
---
 string/strcpy.c      | 109 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 string/test-strcpy.c | 24 +++++++++++-
 2 files changed, 130 insertions(+), 3 deletions(-)

-- 
2.7.4

diff --git a/string/strcpy.c b/string/strcpy.c
index a4cce89..358b1b1 100644
--- a/string/strcpy.c
+++ b/string/strcpy.c
@@ -15,8 +15,13 @@
    License along with the GNU C Library; if not, see
    . */

-#include
 #include
+#include
+#include
+#include
+#include
+#include
+#include

 #undef strcpy

@@ -28,6 +33,106 @@
 char *
 STRCPY (char *dest, const char *src)
 {
-  return memcpy (dest, src, strlen (src) + 1);
+  char *dst = dest;
+  const op_t *xs;
+  op_t *xd;
+  op_t ws;
+
+#if _STRING_ARCH_unaligned
+  /* For architectures which support unaligned memory operations, it first
+     aligns the source pointer, reads op_t bytes at a time until a zero is
+     found, and writes unaligned to the destination. */
+  uintptr_t n = -(uintptr_t) src % sizeof (op_t);
+  for (uintptr_t i = 0; i < n; ++i)
+    {
+      unsigned c = *src++;
+      *dst++ = c;
+      if (c == '\0')
+        return dest;
+    }
+  xs = (const op_t *) src;
+  ws = *xs++;
+  xd = (op_t *) dst;
+  while (!has_zero (ws))
+    {
+      *xd++ = ws;
+      ws = *xs++;
+    }
+#else
+  /* For architectures which only support aligned accesses, it first
+     aligns the destination pointer. */
+  uintptr_t n = -(uintptr_t) dst % sizeof (op_t);
+  for (uintptr_t i = 0; i < n; ++i)
+    {
+      unsigned c = *src++;
+      *dst++ = c;
+      if (c == '\0')
+        return dest;
+    }
+  xd = (op_t *) dst;
+
+  /* Destination is aligned to op_t while the source might not be. */
+  uintptr_t ofs = (uintptr_t) src % sizeof (op_t);
+  if (ofs == 0)
+    {
+      /* Aligned loop. If a zero is found, exit to copy the remaining
+         bytes. */
+      xs = (const op_t *) src;
+
+      ws = *xs++;
+      while (!has_zero (ws))
+        {
+          *xd++ = ws;
+          ws = *xs++;
+        }
+    }
+  else
+    {
+      /* Unaligned loop: align the source pointer and mask off the
+         undesirable bytes which are not part of the string. */
+      op_t wsa, wsb;
+      uintptr_t sh_1, sh_2;
+
+      xs = (const op_t *)(src - ofs);
+      wsa = *xs++;
+      sh_1 = ofs * CHAR_BIT;
+      sh_2 = sizeof(op_t) * CHAR_BIT - sh_1;
+
+      /* Align the first partial op_t from source, with 0xff for the rest
+         of the bytes so that we can also apply the has_zero test to see if we
+         have already reached EOS. If we have, then we can simply fall
+         through to the final byte copies. */
+      ws = MERGE (wsa, sh_1, (op_t)-1, sh_2);
+      if (!has_zero (ws))
+        {
+          while (1)
+            {
+              wsb = *xs++;
+              ws = MERGE (wsa, sh_1, wsb, sh_2);
+              if (has_zero (wsb))
+                break;
+              *xd++ = ws;
+              wsa = wsb;
+            }
+
+          /* WS may contain bytes that were not yet written to the
+             destination. Write them out and merge with the op_t
+             containing the EOS byte. */
+          if (!has_zero (ws))
+            {
+              *xd++ = ws;
+              ws = MERGE (wsb, sh_1, ws, sh_2);
+            }
+        }
+    }
+#endif
+
+  /* Just copy the final bytes from op_t. */
+  dst = (char *) xd;
+  uintptr_t fz = index_first_zero (ws);
+  for (uintptr_t i = 0; i < fz + 1; i++)
+    *dst++ = extractbyte (ws, i);
+
+  return dest;
 }
 libc_hidden_builtin_def (strcpy)
diff --git a/string/test-strcpy.c b/string/test-strcpy.c
index 2a1bf93..fa03c73 100644
--- a/string/test-strcpy.c
+++ b/string/test-strcpy.c
@@ -207,7 +207,7 @@ do_random_tests (void)
 int
 test_main (void)
 {
-  size_t i;
+  size_t i, j;

   test_init ();

@@ -222,12 +222,26 @@ test_main (void)
       do_test (0, 0, i, BIG_CHAR);
       do_test (0, i, i, SMALL_CHAR);
       do_test (i, 0, i, BIG_CHAR);
+
+      for (j = 1; j < 16; ++j)
+        {
+          do_test (0, 0, i + j, SMALL_CHAR);
+          do_test (0, 0, i + j, BIG_CHAR);
+          do_test (0, i, i + j, SMALL_CHAR);
+          do_test (i, 0, i + j, BIG_CHAR);
+        }
     }

   for (i = 1; i < 8; ++i)
     {
       do_test (0, 0, 8 << i, SMALL_CHAR);
       do_test (8 - i, 2 * i, 8 << i, SMALL_CHAR);
+
+      for (j = 1; j < 8; ++j)
+        {
+          do_test (0, 0, (8 << i) + j, SMALL_CHAR);
+          do_test (8 - i, 2 * i, (8 << i) + j, SMALL_CHAR);
+        }
     }

   for (i = 1; i < 8; ++i)
@@ -236,6 +250,14 @@ test_main (void)
       do_test (2 * i, i, 8 << i, BIG_CHAR);
       do_test (i, i, 8 << i, SMALL_CHAR);
       do_test (i, i, 8 << i, BIG_CHAR);
+
+      for (j = 1; j < 8; ++j)
+        {
+          do_test (i, 2 * i, (8 << i) + j, SMALL_CHAR);
+          do_test (2 * i, i, (8 << i) + j, BIG_CHAR);
+          do_test (i, i, (8 << i) + j, SMALL_CHAR);
+          do_test (i, i, (8 << i) + j, BIG_CHAR);
+        }
     }

   do_random_tests ();
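The last step of the strcpy patch copies the bytes of the final word up to and including the terminator, one byte at a time. A rough, self-contained sketch of that tail copy is given below. It only illustrates the idea: `sketch_extract_byte` is a hypothetical stand-in for extractbyte, `word_t` is assumed to be 64 bits, and the endianness check again assumes the `__BYTE_ORDER__` macro from GCC and Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t word_t;   /* hypothetical stand-in for op_t */

/* Byte number I of W, counted in memory order.  */
static unsigned char
sketch_extract_byte (word_t w, size_t i)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  return (unsigned char) (w >> (i * 8));
#else
  return (unsigned char) (w >> ((sizeof w - 1 - i) * 8));
#endif
}

/* Copy the bytes of W to DST up to and including the first zero byte and
   return how many bytes were written.  W is expected to contain a zero
   byte; the bound on I only guards against misuse.  */
static size_t
sketch_copy_tail (char *dst, word_t w)
{
  size_t i = 0;
  unsigned char c;
  do
    {
      c = sketch_extract_byte (w, i);
      dst[i++] = (char) c;
    }
  while (c != '\0' && i < sizeof w);
  return i;
}
```

Writing the tail bytewise means nothing past the terminator is stored, which a whole-word store of the final word could not guarantee.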
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Richard Henderson
Subject: [PATCH v3 13/18] hppa: Add memcopy.h
Date: Wed, 10 Jan 2018 10:47:57 -0200
Message-Id: <1515588482-15744-14-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

From: Richard Henderson

GCC's combine pass cannot merge (x >> c | y << (32 - c)) into a
double-word shift unless (1) the subtract is in the same basic block
and (2) the result of the subtract is used exactly once. Neither
condition is true for any use of MERGE.

By forcing the use of a double-word shift, we not only reduce
contention on SAR, but also allow the setting of SAR to be hoisted
outside of a loop.

Checked on hppa-linux-gnu.

	Richard Henderson

	* sysdeps/hppa/memcopy.h: New file.
---
 sysdeps/hppa/memcopy.h | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)
 create mode 100644 sysdeps/hppa/memcopy.h

-- 
2.7.4

diff --git a/sysdeps/hppa/memcopy.h b/sysdeps/hppa/memcopy.h
new file mode 100644
index 0000000..4dcade7
--- /dev/null
+++ b/sysdeps/hppa/memcopy.h
@@ -0,0 +1,44 @@
+/* Definitions for memory copy functions, PA-RISC version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library. If not, see
+   . */
+
+#include
+
+/* Use a single double-word shift instead of two shifts and an ior.
+   If the uses of MERGE were close to the computation of shl/shr,
+   the compiler might have been able to create this itself.
+   But instead that computation is well separated.
+
+   Using an inline function instead of a macro is the easiest way
+   to ensure that the types are correct. */
+
+#undef MERGE
+
+extern void link_error(void);
+
+static inline op_t
+MERGE(op_t w0, int shl, op_t w1, int shr)
+{
+  op_t res;
+  if (OPSIZ == 4)
+    asm("shrpw %1,%2,%%sar,%0" : "=r"(res) : "r"(w0), "r"(w1), "q"(shr));
+  else if (OPSIZ == 8)
+    asm("shrpd %1,%2,%%sar,%0" : "=r"(res) : "r"(w0), "r"(w1), "q"(shr));
+  else
+    link_error(), res = 0;
+  return res;
+}
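To see why one double-word shift can replace the two shifts and the OR, it may help to model the operation in portable C. The sketch below is an illustration under assumptions, not a description of the hardware taken from this series: it assumes that the 32-bit shift-right-pair concatenates its two source words, shifts the 64-bit pair right, and keeps the low 32 bits, and that the generic big-endian merge is (w0 << shl) | (w1 >> shr) with shl + shr == 32. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Two shifts and an OR.  SHL + SHR must equal 32 and both must be
   nonzero, otherwise one of the shifts is undefined in C.  */
static uint32_t
sketch_merge_two_shifts (uint32_t w0, unsigned shl, uint32_t w1, unsigned shr)
{
  return (w0 << shl) | (w1 >> shr);
}

/* One double-word shift: build the 64-bit pair W0:W1, shift it right by
   SHR and keep the low 32 bits.  Only SHR is needed, which is why the
   shift amount can be set up once outside a loop.  */
static uint32_t
sketch_merge_funnel (uint32_t w0, uint32_t w1, unsigned shr)
{
  uint64_t pair = ((uint64_t) w0 << 32) | w1;
  return (uint32_t) (pair >> shr);
}
```

In this model the left-shift count never appears in the funnel form, which matches the inline function in the patch using only its shr argument.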
xzswJwDWrB3mU22p52F49WMB4alsnq47hP7pLSVCMDDWEDNZG4s+S5cGovCm/CUUVI+j cL4g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=Ze+Nryz8; spf=pass (google.com: domain of libc-alpha-return-89014-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=libc-alpha-return-89014-patch=linaro.org@sourceware.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id a3si10533457pgc.105.2018.01.10.04.51.18 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 10 Jan 2018 04:51:19 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-return-89014-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=Ze+Nryz8; spf=pass (google.com: domain of libc-alpha-return-89014-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=libc-alpha-return-89014-patch=linaro.org@sourceware.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to :references; q=dns; s=default; b=QNIqg0eFSAd4PDtrb4gOaEhtVVC62ZI ll+SLCPxl3OhO7rea6UOYccW86Nzx0AyW9JiPlFowF5gFEMcD3p+OVo3rA/F8YsN B9PPvVnIEql668qJKO681aiQt7OfMKTbfKg2nfEg733LTsbO3GNtzgby9Y3ofGnk XCzEn2H33Ci0= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to :references; s=default; bh=4fAwDoNJ+LtSSP4OO3ZgiKckPxY=; b=Ze+Nr 
yz83hugF/GNU9095A6vVCAr/VJimVvArPyA8PbuIQlb5l3+QtR9I8a15oUjshLoj uwIUZGpY7VAtqWbEVlNiu8eN0ntJ5YMFRgk93EiULejPFibnlHRyF5AdkwuRzMsr WWB3DDXaP9jTDftplPrY2x9jxenJd8QB93FKXM= Received: (qmail 336 invoked by alias); 10 Jan 2018 12:48:47 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 125945 invoked by uid 89); 10 Jan 2018 12:48:36 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.9 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=bn X-HELO: mail-qt0-f194.google.com X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Z6EmNX77pGOB5rAa+ITkSM7/K6R0N+X+q3x4RWbR9bE=; b=j61UsQQrrZgZHBGRVVWWIx6R/I6ype9XQ0QRn46wHLjZ33encuCiZW6dz8olvEtAt6 clZink8UNGJp/eSiMHcjurva36siAz7vw1+X6DOulhBLxmFvHW2nH2Oju/4qyj9nZi1m oVv6/fumXJQPNfmw3IwB4TmhPSpK09igOi2cRi+p7zGGHRYtxCmBcUB43AUK7jqEv5Pm AELgO2nPJY49bXSQccX4qwXubBOZbTbNpeuykmAxaXHFkSc/3O3URrfFGasfbFdCkLuN O/d5pUYOZLLwTPD6SnZNuBG2w18Tp7n1HWjJcLgyBiDAD2RVnngRsva2r0vKNfQRFG4R aSSA== X-Gm-Message-State: AKwxyteW3gAnVj5KtMwNvfRSwgR6YJkBoVGhecvvP04OtFPonfLUFp7B A/Y6n3poQjWbdGnnE1DhOoDbdh/WvSY= X-Received: by 10.200.27.135 with SMTP id z7mr7447242qtj.58.1515588510762; Wed, 10 Jan 2018 04:48:30 -0800 (PST) From: Adhemerval Zanella To: libc-alpha@sourceware.org Cc: Richard Henderson Subject: [PATCH v3 14/18] hppa: Add string-fzb.h and string-fzi.h Date: Wed, 10 Jan 2018 10:47:58 -0200 Message-Id: <1515588482-15744-15-git-send-email-adhemerval.zanella@linaro.org> In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org> 
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org> From: Richard Henderson Use UXOR,SBZ to test for a zero byte within a word. While we can get semi-decent code out of asm-goto, we would do slightly better with a compiler builtin. For index_zero et al, sequential testing of bytes is less expensive than any tricks that involve a count-leading-zeros insn that we don't have. Checked on hppa-linux-gnu. Richard Henderson * sysdeps/hppa/string-fzb.h: New file. * sysdeps/hppa/string-fzi.h: Likewise. --- sysdeps/hppa/string-fzb.h | 69 ++++++++++++++++++++++++ sysdeps/hppa/string-fzi.h | 135 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 204 insertions(+) create mode 100644 sysdeps/hppa/string-fzb.h create mode 100644 sysdeps/hppa/string-fzi.h -- 2.7.4 diff --git a/sysdeps/hppa/string-fzb.h b/sysdeps/hppa/string-fzb.h new file mode 100644 index 0000000..0385d99 --- /dev/null +++ b/sysdeps/hppa/string-fzb.h @@ -0,0 +1,69 @@ +/* Zero byte detection, boolean. HPPA version. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef STRING_FZB_H +#define STRING_FZB_H 1 + +#include + +/* Determine if any byte within X is zero. This is a pure boolean test. 
*/ + +static inline _Bool +has_zero (op_t x) +{ + _Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + + /* It's more useful to expose a control transfer to the compiler + than to expose a proper boolean result. */ + asm goto ("uxor,sbz %%r0,%0,%%r0\n\t" + "b,n %l1" : : "r"(x) : : nbz); + return 1; + nbz: + return 0; +} + +/* Likewise, but for byte equality between X1 and X2. */ + +static inline _Bool +has_eq (op_t x1, op_t x2) +{ + _Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + + asm goto ("uxor,sbz %0,%1,%%r0\n\t" + "b,n %l2" : : "r"(x1), "r"(x2) : : nbz); + return 1; + nbz: + return 0; +} + +/* Likewise, but for zeros in X1 and equal bytes between X1 and X2. */ + +static inline _Bool +has_zero_eq (op_t x1, op_t x2) +{ + _Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + + asm goto ("uxor,sbz %%r0,%0,%%r0\n\t" + "uxor,nbz %0,%1,%%r0\n\t" + "b,n %l2" : : "r"(x1), "r"(x2) : : sbz); + return 0; + sbz: + return 1; +} + +#endif /* STRING_FZB_H */ diff --git a/sysdeps/hppa/string-fzi.h b/sysdeps/hppa/string-fzi.h new file mode 100644 index 0000000..22bd8ac --- /dev/null +++ b/sysdeps/hppa/string-fzi.h @@ -0,0 +1,135 @@ +/* string-fzi.h -- zero byte detection; indexes. HPPA version. + Copyright (C) 2016 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#ifndef STRING_FZI_H +#define STRING_FZI_H 1 + +#include + +/* Given a word X that is known to contain a zero byte, return the + index of the first such within the long in memory order. */ + +static inline unsigned int +index_first_zero (op_t x) +{ + unsigned int ret; + + _Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + + /* Since we have no clz insn, direct tests of the bytes are faster + than loading up the constants to do the masking. */ + asm ("extrw,u,<> %1,23,8,%%r0\n\t" + "ldi 2,%0\n\t" + "extrw,u,<> %1,15,8,%%r0\n\t" + "ldi 1,%0\n\t" + "extrw,u,<> %1,7,8,%%r0\n\t" + "ldi 0,%0" + : "=r"(ret) : "r"(x), "0"(3)); + + return ret; +} + +/* Similarly, but perform the search for byte equality between X1 and X2. */ + +static inline unsigned int +index_first_eq (op_t x1, op_t x2) +{ + return index_first_zero (x1 ^ x2); +} + +/* Similarly, but perform the search for zero within X1 or + equality between X1 and X2. */ + +static inline unsigned int +index_first_zero_eq (op_t x1, op_t x2) +{ + unsigned int ret; + + _Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + + /* Since we have no clz insn, direct tests of the bytes are faster + than loading up the constants to do the masking. */ + asm ("extrw,u,= %1,23,8,%%r0\n\t" + "extrw,u,<> %2,23,8,%%r0\n\t" + "ldi 2,%0\n\t" + "extrw,u,= %1,15,8,%%r0\n\t" + "extrw,u,<> %2,15,8,%%r0\n\t" + "ldi 1,%0\n\t" + "extrw,u,= %1,7,8,%%r0\n\t" + "extrw,u,<> %2,7,8,%%r0\n\t" + "ldi 0,%0" + : "=r"(ret) : "r"(x1), "r"(x1 ^ x2), "0"(3)); + + return ret; +} + +/* Similarly, but perform the search for zero within X1 or + inequality between X1 and X2. */ + +static inline unsigned int +index_first_zero_ne (op_t x1, op_t x2) +{ + unsigned int ret; + + _Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + + /* Since we have no clz insn, direct tests of the bytes are faster + than loading up the constants to do the masking. 
*/ + asm ("extrw,u,<> %2,23,8,%%r0\n\t" + "extrw,u,<> %1,23,8,%%r0\n\t" + "ldi 2,%0\n\t" + "extrw,u,<> %2,15,8,%%r0\n\t" + "extrw,u,<> %1,15,8,%%r0\n\t" + "ldi 1,%0\n\t" + "extrw,u,<> %2,7,8,%%r0\n\t" + "extrw,u,<> %1,7,8,%%r0\n\t" + "ldi 0,%0" + : "=r"(ret) : "r"(x1), "r"(x1 ^ x2), "0"(3)); + + return ret; +} + +/* Similarly, but search for the last zero within X. */ + +static inline unsigned int +index_last_zero (op_t x) +{ + unsigned int ret; + + _Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + + /* Since we have no ctz insn, direct tests of the bytes are faster + than loading up the constants to do the masking. */ + asm ("extrw,u,<> %1,15,8,%%r0\n\t" + "ldi 1,%0\n\t" + "extrw,u,<> %1,23,8,%%r0\n\t" + "ldi 2,%0\n\t" + "extrw,u,<> %1,31,8,%%r0\n\t" + "ldi 3,%0" + : "=r"(ret) : "r"(x), "0"(0)); + + return ret; +} + +static inline unsigned int +index_last_eq (op_t x1, op_t x2) +{ + return index_last_zero (x1 ^ x2); +} + +#endif /* STRING_FZI_H */
From patchwork Wed Jan 10 12:47:59 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 124079
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Richard Henderson
Subject: [PATCH v3 15/18] alpha: Add string-fzb.h and string-fzi.h
Date: Wed, 10 Jan 2018 10:47:59 -0200
Message-Id: <1515588482-15744-16-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

From: Richard Henderson

While alpha has the more important string functions in assembly, there are still a few for which the generic routines are used. Use the CMPBGE insn, via the builtin, for testing of zeros. Use a simplified expansion of __builtin_ctz when the insn isn't available.

Checked on alpha-linux-gnu.

Richard Henderson

* sysdeps/alpha/string-fzb.h: New file.
* sysdeps/alpha/string-fzi.h: Likewise.
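For readers without an Alpha toolchain, the two ideas in the description above can be modelled in portable C. This is only an illustrative sketch, not part of the patch: model_cmpbge_zero, model_index_first and model_index_first_zero are hypothetical names, and the loop merely emulates what CMPBGE with a zero first operand is expected to produce (bit i set when byte i of the word is zero). Byte 0 is taken to be the least significant byte, which matches memory order on little-endian Alpha.

```c
#include <stdint.h>

/* Hypothetical portable model of CMPBGE (0, X): bit I of the result is
   set when byte I of X is zero.  The patch itself uses
   __builtin_alpha_cmpbge; this loop is only a stand-in for it.  */
static unsigned int
model_cmpbge_zero (uint64_t x)
{
  unsigned int mask = 0;
  for (unsigned int i = 0; i < 8; i++)
    if (((x >> (8 * i)) & 0xff) == 0)
      mask |= 1u << i;
  return mask;
}

/* Index of the lowest set bit of the nonzero 8-bit mask C, computed
   without a count-trailing-zeros insn.  Mirrors the fallback branch of
   index_first_ in the patch.  */
static unsigned int
model_index_first (unsigned int c)
{
  c &= -c;			/* Isolate the lowest set bit.  */
  return (c & 0xf0 ? 4 : 0) + (c & 0xcc ? 2 : 0) + (c & 0xaa ? 1 : 0);
}

/* Index of the first zero byte of X, which must contain one.  */
static unsigned int
model_index_first_zero (uint64_t x)
{
  return model_index_first (model_cmpbge_zero (x));
}
```

Once a single bit remains, the three masks each contribute one binary digit of its position: 0xf0 covers bits 4 to 7, 0xcc covers bits 2, 3, 6 and 7, and 0xaa covers the odd bits.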
--- sysdeps/alpha/string-fzb.h | 51 ++++++++++++++++++++ sysdeps/alpha/string-fzi.h | 113 +++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 164 insertions(+) create mode 100644 sysdeps/alpha/string-fzb.h create mode 100644 sysdeps/alpha/string-fzi.h -- 2.7.4 diff --git a/sysdeps/alpha/string-fzb.h b/sysdeps/alpha/string-fzb.h new file mode 100644 index 0000000..0e6a71c --- /dev/null +++ b/sysdeps/alpha/string-fzb.h @@ -0,0 +1,51 @@ +/* Zero byte detection; boolean. Alpha version. + Copyright (C) 2016 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef STRING_FZB_H +#define STRING_FZB_H 1 + +#include + +/* Note that since CMPBGE creates a bit mask rather than a byte mask, + we cannot simply provide a target-specific string-fza.h. */ + +/* Determine if any byte within X is zero. This is a pure boolean test. */ + +static inline _Bool +has_zero (op_t x) +{ + return __builtin_alpha_cmpbge (0, x) != 0; +} + +/* Likewise, but for byte equality between X1 and X2. */ + +static inline _Bool +has_eq (op_t x1, op_t x2) +{ + return has_zero (x1 ^ x2); +} + +/* Likewise, but for zeros in X1 and equal bytes between X1 and X2. 
*/ + +static inline _Bool +has_zero_eq (op_t x1, op_t x2) +{ + return has_zero (x1) | has_eq (x1, x2); +} + +#endif /* STRING_FZB_H */ diff --git a/sysdeps/alpha/string-fzi.h b/sysdeps/alpha/string-fzi.h new file mode 100644 index 0000000..243a9e5 --- /dev/null +++ b/sysdeps/alpha/string-fzi.h @@ -0,0 +1,113 @@ +/* string-fzi.h -- zero byte detection; indices. Alpha version. + Copyright (C) 2016 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef STRING_FZI_H +#define STRING_FZI_H + +#include +#include + +/* Note that since CMPBGE creates a bit mask rather than a byte mask, + we cannot simply provide a target-specific string-fza.h. */ + +/* A subroutine for the index_zero functions. Given a bitmask C, + return the index of the first bit set in memory order. */ + +static inline unsigned int +index_first_ (unsigned long int c) +{ +#ifdef __alpha_cix__ + return __builtin_ctzl (c); +#else + c = c & -c; + return (c & 0xf0 ? 4 : 0) + (c & 0xcc ? 2 : 0) + (c & 0xaa ? 1 : 0); +#endif +} + +/* Similarly, but return the (memory order) index of the last bit + that is non-zero. Note that only the least 8 bits may be nonzero. 
*/ + +static inline unsigned int +index_last_ (unsigned long int x) +{ +#ifdef __alpha_cix__ + return __builtin_clzl (x) ^ 63; +#else + unsigned r = 0; + if (x & 0xf0) + r += 4; + if (x & (0xc << r)) + r += 2; + if (x & (0x2 << r)) + r += 1; + return r; +#endif +} + +/* Given a word X that is known to contain a zero byte, return the + index of the first such within the word in memory order. */ + +static inline unsigned int +index_first_zero (op_t x) +{ + return index_first_ (__builtin_alpha_cmpbge (0, x)); +} + +/* Similarly, but perform the test for byte equality between X1 and X2. */ + +static inline unsigned int +index_first_eq (op_t x1, op_t x2) +{ + return index_first_zero (x1 ^ x2); +} + +/* Similarly, but perform the search for zero within X1 or + equality between X1 and X2. */ + +static inline unsigned int +index_first_zero_eq (op_t x1, op_t x2) +{ + return index_first_ (__builtin_alpha_cmpbge (0, x1) + | __builtin_alpha_cmpbge (0, x1 ^ x2)); +} + +/* Similarly, but perform the search for zero within X1 or + inequality between X1 and X2. */ + +static inline unsigned int +index_first_zero_ne (op_t x1, op_t x2) +{ + return index_first_ (__builtin_alpha_cmpbge (0, x1) + | (__builtin_alpha_cmpbge (0, x1 ^ x2) ^ 0xFF)); +} + +/* Similarly, but search for the last zero within X. 
*/ + +static inline unsigned int +index_last_zero (op_t x) +{ + return index_last_ (__builtin_alpha_cmpbge (0, x)); +} + +static inline unsigned int +index_last_eq (op_t x1, op_t x2) +{ + return index_last_zero (x1 ^ x2); +} + +#endif /* STRING_FZI_H */
From patchwork Wed Jan 10 12:48:00 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 124087
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Richard Henderson
Subject: [PATCH v3 16/18] arm: Add string-fza.h
Date: Wed, 10 Jan 2018 10:48:00 -0200
Message-Id: <1515588482-15744-17-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

From: Richard Henderson

While arm has the more important string functions in
assembly, there are still a few generic routines used. Use the UQSUB8 insn for testing of zeros. Checked on armv7-linux-gnueabihf Richard Henderson * sysdeps/arm/armv6t2/string-fza.h: New file. --- sysdeps/arm/armv6t2/string-fza.h | 69 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 69 insertions(+) create mode 100644 sysdeps/arm/armv6t2/string-fza.h -- 2.7.4 diff --git a/sysdeps/arm/armv6t2/string-fza.h b/sysdeps/arm/armv6t2/string-fza.h new file mode 100644 index 0000000..8c38f87 --- /dev/null +++ b/sysdeps/arm/armv6t2/string-fza.h @@ -0,0 +1,69 @@ +/* Zero byte detection; basics. ARM version. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef STRING_FZA_H +#define STRING_FZA_H 1 + +#include + +/* This function returns at least one bit set within every byte + of X that is zero. */ + +static inline op_t +find_zero_all (op_t x) +{ + /* Use unsigned saturated subtraction from 1 in each byte. + That leaves 1 for every byte that was zero. */ + op_t ret, ones = (op_t)-1 / 0xff; + asm ("uqsub8 %0,%1,%2" : "=r"(ret) : "r"(ones), "r"(x)); + return ret; +} + +/* Identify bytes that are equal between X1 and X2. */ + +static inline op_t +find_eq_all (op_t x1, op_t x2) +{ + return find_zero_all (x1 ^ x2); +} + +/* Identify zero bytes in X1 or equality between X1 and X2. 
*/ + +static inline op_t +find_zero_eq_all (op_t x1, op_t x2) +{ + return find_zero_all (x1) | find_zero_all (x1 ^ x2); +} + +/* Identify zero bytes in X1 or inequality between X1 and X2. */ + +static inline op_t +find_zero_ne_all (op_t x1, op_t x2) +{ + /* Make use of the fact that we'll already have ONES in a register. */ + op_t ones = (op_t)-1 / 0xff; + return find_zero_all (x1) | (find_zero_all (x1 ^ x2) ^ ones); +} + +/* Define the "inexact" versions in terms of the exact versions. */ +#define find_zero_low find_zero_all +#define find_eq_low find_eq_all +#define find_zero_eq_low find_zero_eq_all +#define find_zero_ne_low find_zero_ne_all + +#endif /* STRING_FZA_H */
From patchwork Wed Jan 10 12:48:01 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 124080
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Richard Henderson
Subject: [PATCH v3 17/18] powerpc: Add string-fza.h
Date: Wed, 10 Jan 2018 10:48:01 -0200
Message-Id: <1515588482-15744-18-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

From: Richard Henderson

While ppc has the more important string functions in assembly, there are still a few generic routines used. Use the Power 6 CMPB insn for testing of zeros.

Checked on powerpc64le-linux-gnu.

Richard Henderson

* sysdeps/powerpc/power6/string-fza.h: New file.
* sysdeps/powerpc/powerpc32/power6/string-fza.h: Likewise.
* sysdeps/powerpc/powerpc64/power6/string-fza.h: Likewise.
---
sysdeps/powerpc/power6/string-fza.h | 65 +++++++++++++++++++++++++++
sysdeps/powerpc/powerpc32/power6/string-fza.h | 1 +
sysdeps/powerpc/powerpc64/power6/string-fza.h | 1 +
3 files changed, 67 insertions(+)
create mode 100644 sysdeps/powerpc/power6/string-fza.h
create mode 100644 sysdeps/powerpc/powerpc32/power6/string-fza.h
create mode 100644 sysdeps/powerpc/powerpc64/power6/string-fza.h
-- 2.7.4
Reviewed-by: Tulio Magno Quites Machado Filho
diff --git a/sysdeps/powerpc/power6/string-fza.h b/sysdeps/powerpc/power6/string-fza.h new file mode 100644 index 0000000..4549dde --- /dev/null +++ b/sysdeps/powerpc/power6/string-fza.h @@ -0,0 +1,65 @@ +/* Zero byte detection; basics. Power6/ISA 2.05 version. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef STRING_FZA_H
+#define STRING_FZA_H 1
+
+#include
+
+/* This function returns 0xff for each byte that is
+   equal between X1 and X2.  */
+
+static inline op_t
+find_eq_all (op_t x1, op_t x2)
+{
+  op_t ret;
+  asm ("cmpb %0,%1,%2" : "=r"(ret) : "r"(x1), "r"(x2));
+  return ret;
+}
+
+/* This function returns 0xff for each byte that is zero in X.  */
+
+static inline op_t
+find_zero_all (op_t x)
+{
+  return find_eq_all (x, 0);
+}
+
+/* Identify zero bytes in X1 or equality between X1 and X2.  */
+
+static inline op_t
+find_zero_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1) | find_eq_all (x1, x2);
+}
+
+/* Identify zero bytes in X1 or inequality between X1 and X2.  */
+
+static inline op_t
+find_zero_ne_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1) | ~find_eq_all (x1, x2);
+}
+
+/* Define the "inexact" versions in terms of the exact versions.  */
+#define find_zero_low find_zero_all
+#define find_eq_low find_eq_all
+#define find_zero_eq_low find_zero_eq_all
+#define find_zero_ne_low find_zero_ne_all
+
+#endif /* STRING_FZA_H */
diff --git a/sysdeps/powerpc/powerpc32/power6/string-fza.h b/sysdeps/powerpc/powerpc32/power6/string-fza.h
new file mode 100644
index 0000000..bb00d7c
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/power6/string-fza.h
@@ -0,0 +1 @@
+#include
diff --git a/sysdeps/powerpc/powerpc64/power6/string-fza.h b/sysdeps/powerpc/powerpc64/power6/string-fza.h
new file mode 100644
index 0000000..bb00d7c
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/power6/string-fza.h
@@ -0,0 +1 @@
+#include

From patchwork Wed Jan 10 12:48:02 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 124093
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Subject: [PATCH v3 18/18] sh: Add string-fzb.h
Date: Wed, 10 Jan 2018 10:48:02 -0200
Message-Id: <1515588482-15744-19-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

Use the SH cmp/str on has_{zero,eq,zero_eq}.

Checked on sh4-linux-gnu.

	Adhemerval Zanella

	* sysdeps/sh/string-fzb.h: New file.
---
 sysdeps/sh/string-fzb.h | 53 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)
 create mode 100644 sysdeps/sh/string-fzb.h

--
2.7.4

diff --git a/sysdeps/sh/string-fzb.h b/sysdeps/sh/string-fzb.h
new file mode 100644
index 0000000..70d8a8f
--- /dev/null
+++ b/sysdeps/sh/string-fzb.h
@@ -0,0 +1,53 @@
+/* Zero byte detection; boolean.  SH4 version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef STRING_FZB_H
+#define STRING_FZB_H 1
+
+#include
+
+/* Determine if any byte within X is zero.  This is a pure boolean test.  */
+
+static inline _Bool
+has_zero (op_t x)
+{
+  op_t zero = 0x0, ret;
+  asm volatile ("cmp/str %1,%2\n"
+                "movt %0\n"
+                : "=r" (ret)
+                : "r" (zero), "r" (x));
+  return ret;
+}
+
+/* Likewise, but for byte equality between X1 and X2.  */
+
+static inline _Bool
+has_eq (op_t x1, op_t x2)
+{
+  return has_zero (x1 ^ x2);
+}
+
+/* Likewise, but for zeros in X1 or equal bytes between X1 and X2.  */
+
+static inline _Bool
+has_zero_eq (op_t x1, op_t x2)
+{
+  return has_zero (x1) | has_eq (x1, x2);
+}
+
+#endif /* STRING_FZB_H */
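
For readers without an SH toolchain, the behaviour the header above relies
on can be modelled in portable C.  The following is only a sketch and is not
part of the patch: it assumes op_t is a 32-bit word (as on sh4) and that
cmp/str sets the T bit when any byte of one operand equals the byte at the
same position in the other.  All ref_* names are hypothetical and exist only
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: op_t is a 32-bit word, as on sh4.  */
typedef uint32_t ref_op_t;

/* Portable model of "cmp/str": true if any byte of A equals the byte
   at the same position in B.  */
static inline bool
ref_cmp_str (ref_op_t a, ref_op_t b)
{
  ref_op_t diff = a ^ b;
  for (unsigned int i = 0; i < sizeof (ref_op_t); i++)
    if (((diff >> (8 * i)) & 0xff) == 0)
      return true;
  return false;
}

/* Model of has_zero: compare X against an all-zero word.  */
static inline bool
ref_has_zero (ref_op_t x)
{
  return ref_cmp_str (0, x);
}

/* Model of has_eq: equal bytes show up as zero bytes in X1 ^ X2.  */
static inline bool
ref_has_eq (ref_op_t x1, ref_op_t x2)
{
  return ref_has_zero (x1 ^ x2);
}

/* Model of has_zero_eq: a zero byte in X1, or any equal byte pair.  */
static inline bool
ref_has_zero_eq (ref_op_t x1, ref_op_t x2)
{
  return ref_has_zero (x1) | ref_has_eq (x1, x2);
}

/* A few spot checks of the model.  */
static inline void
ref_self_check (void)
{
  assert (ref_has_zero (0x41004243u));
  assert (!ref_has_zero (0x41424344u));
  assert (ref_has_eq (0x41424344u, 0x51525344u));  /* Low two bytes match.  */
  assert (!ref_has_eq (0x41424344u, 0x44434241u));
  assert (ref_has_zero_eq (0x41424344u, 0x00000044u));
}
```

Such a model could serve as a reference when comparing the inline asm
version against the generic implementation on a host machine; it makes no
claim about performance.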