From patchwork Wed Jan 10 12:47:53 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 124081
Delivered-To: patch@linaro.org
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Richard Henderson
Subject: [PATCH v3 09/18] string: Improve generic strchr
Date: Wed, 10 Jan 2018 10:47:53 -0200
Message-Id: <1515588482-15744-10-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

From: Richard Henderson

The new algorithm has the following key differences:

  - Reads the first word from the aligned address at or before the
    start of the string and uses the string-maskoff functions to mask
    off the unwanted bytes that precede the string.  This strategy
    follows the assembly-optimized implementations for aarch64 and
    powerpc.

  - Uses the string-fz{b,i} and string-extbyte functions.
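For readers unfamiliar with the technique, the two points above can be
illustrated with a minimal standalone sketch.  This is not the code from
this patch: repeat_byte, has_zero, has_zero_or_eq and sketch_strchr are
hypothetical stand-ins for the glibc-internal helpers, the word size is
assumed to be 64 bits, and memcpy plus byte loops replace the
endian-specific mask and bit-scan logic.  It also assumes the string
lives in an 8-byte aligned buffer whose size is a multiple of 8, so that
whole-word reads stay inside the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: a word with every byte equal to C.  */
static uint64_t
repeat_byte (unsigned char c)
{
  return (UINT64_MAX / 0xff) * c;
}

/* Nonzero iff some byte of X is zero (classic subtract-and-mask test).  */
static int
has_zero (uint64_t x)
{
  const uint64_t ones = repeat_byte (0x01);
  const uint64_t highs = repeat_byte (0x80);
  return ((x - ones) & ~x & highs) != 0;
}

/* Nonzero iff some byte of X is zero or equal to the byte repeated in REP.  */
static int
has_zero_or_eq (uint64_t x, uint64_t rep)
{
  return has_zero (x) || has_zero (x ^ rep);
}

/* Assumption: S points into an 8-byte aligned buffer whose size is a
   multiple of 8, so every aligned word read below is in bounds.  */
char *
sketch_strchr (const char *s, int c_in)
{
  unsigned char c = (unsigned char) c_in;
  uint64_t rep = repeat_byte (c);
  size_t off = (uintptr_t) s % sizeof (uint64_t);
  const unsigned char *p = (const unsigned char *) s - off;
  unsigned char bytes[sizeof (uint64_t)];
  uint64_t word;

  /* Read the aligned word containing S, then overwrite the bytes that
     precede S with a filler that is neither zero nor C.  */
  memcpy (bytes, p, sizeof bytes);
  memset (bytes, (c & 0x80) ? 0x7f : 0xff, off);
  memcpy (&word, bytes, sizeof word);

  /* Scan one word at a time until a terminator or C shows up.  */
  while (!has_zero_or_eq (word, rep))
    {
      p += sizeof word;
      memcpy (&word, p, sizeof word);
    }

  /* Locate the first interesting byte in memory order.  */
  memcpy (bytes, &word, sizeof bytes);
  for (size_t i = 0; i < sizeof bytes; i++)
    {
      if (bytes[i] == c)
        return (char *) p + i;
      if (bytes[i] == '\0')
        return NULL;
    }
  return NULL;
}
```

Because the bytes before the string are overwritten in the first word, a
byte equal to C that happens to sit just before the start of the string
within the same aligned word is never reported as a match.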
Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu,
and sparcv9-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).

        Richard Henderson
        Adhemerval Zanella

        [BZ #5806]
        * string/strchr.c: Use string-fzb.h, string-fzi.h, string-extbyte.h.
        * sysdeps/s390/multiarch/strchr-c.c: Redefine weak_alias.
---
 string/strchr.c                   | 166 +++++++-------------------------------
 sysdeps/s390/multiarch/strchr-c.c |   1 +
 2 files changed, 29 insertions(+), 138 deletions(-)

-- 
2.7.4

diff --git a/string/strchr.c b/string/strchr.c
index a63fdfc..ee8ed5c 100644
--- a/string/strchr.c
+++ b/string/strchr.c
@@ -22,8 +22,15 @@
 
 #include
 #include
+#include
+#include
+#include
+#include
+#include
+#include
 
 #undef strchr
+#undef index
 
 #ifndef STRCHR
 # define STRCHR strchr
@@ -33,153 +40,36 @@
 char *
 STRCHR (const char *s, int c_in)
 {
-  const unsigned char *char_ptr;
-  const unsigned long int *longword_ptr;
-  unsigned long int longword, magic_bits, charmask;
-  unsigned char c;
+  const op_t *word_ptr;
+  op_t found, word;
 
-  c = (unsigned char) c_in;
+  /* Set up a word, each of whose bytes is C.  */
+  unsigned char c = (unsigned char) c_in;
+  op_t repeated_c = repeat_bytes (c_in);
 
-  /* Handle the first few characters by reading one character at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary.  */
-  for (char_ptr = (const unsigned char *) s;
-       ((unsigned long int) char_ptr & (sizeof (longword) - 1)) != 0;
-       ++char_ptr)
-    if (*char_ptr == c)
-      return (void *) char_ptr;
-    else if (*char_ptr == '\0')
-      return NULL;
+  /* Align the input address to op_t.  */
+  uintptr_t s_int = (uintptr_t) s;
+  word_ptr = (op_t*) (s_int & -sizeof (op_t));
 
-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to 8-byte longwords.  */
+  /* Read the first aligned word, but force bytes before the string to
+     match neither zero nor goal (we make sure the high bit of each byte
+     is 1, and the low 7 bits are all the opposite of the goal byte).  */
+  op_t bmask = create_mask (s_int);
+  word = (*word_ptr | bmask) ^ (repeated_c & highbit_mask (bmask));
 
-  longword_ptr = (unsigned long int *) char_ptr;
-
-  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
-     the "holes."  Note that there is a hole just to the left of
-     each byte, with an extra at the end:
-
-     bits:  01111110 11111110 11111110 11111111
-     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD
-
-     The 1-bits make sure that carries propagate to the next 0-bit.
-     The 0-bits provide holes for carries to fall into.  */
-  magic_bits = -1;
-  magic_bits = magic_bits / 0xff * 0xfe << 1 >> 1 | 1;
-
-  /* Set up a longword, each of whose bytes is C.  */
-  charmask = c | (c << 8);
-  charmask |= charmask << 16;
-  if (sizeof (longword) > 4)
-    /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
-    charmask |= (charmask << 16) << 16;
-  if (sizeof (longword) > 8)
-    abort ();
-
-  /* Instead of the traditional loop which tests each character,
-     we will test a longword at a time.  The tricky part is testing
-     if *any of the four* bytes in the longword in question are zero.  */
-  for (;;)
+  while (1)
     {
-      /* We tentatively exit the loop if adding MAGIC_BITS to
-         LONGWORD fails to change any of the hole bits of LONGWORD.
-
-         1) Is this safe?  Will it catch all the zero bytes?
-            Suppose there is a byte with all zeros.  Any carry bits
-            propagating from its left will fall into the hole at its
-            least significant bit and stop.  Since there will be no
-            carry from its most significant bit, the LSB of the
-            byte to the left will be unchanged, and the zero will be
-            detected.
-
-         2) Is this worthwhile?  Will it ignore everything except
-            zero bytes?  Suppose every byte of LONGWORD has a bit set
-            somewhere.  There will be a carry into bit 8.  If bit 8
-            is set, this will carry into bit 16.  If bit 8 is clear,
-            one of bits 9-15 must be set, so there will be a carry
-            into bit 16.  Similarly, there will be a carry into bit
-            24.  If one of bits 24-30 is set, there will be a carry
-            into bit 31, so all of the hole bits will be changed.
-
-            The one misfire occurs when bits 24-30 are clear and bit
-            31 is set; in this case, the hole at bit 31 is not
-            changed.  If we had access to the processor carry flag,
-            we could close this loophole by putting the fourth hole
-            at bit 32!
-
-            So it ignores everything except 128's, when they're aligned
-            properly.
-
-         3) But wait!  Aren't we looking for C as well as zero?
-            Good point.  So what we do is XOR LONGWORD with a longword,
-            each of whose bytes is C.  This turns each byte that is C
-            into a zero.  */
-
-      longword = *longword_ptr++;
-
-      /* Add MAGIC_BITS to LONGWORD.  */
-      if ((((longword + magic_bits)
-
-            /* Set those bits that were unchanged by the addition.  */
-            ^ ~longword)
-
-           /* Look at only the hole bits.  If any of the hole bits
-              are unchanged, most likely one of the bytes was a
-              zero.  */
-           & ~magic_bits) != 0 ||
-
-          /* That caught zeroes.  Now test for C.  */
-          ((((longword ^ charmask) + magic_bits) ^ ~(longword ^ charmask))
-           & ~magic_bits) != 0)
-        {
-          /* Which of the bytes was C or zero?
-             If none of them were, it was a misfire; continue the search.  */
-
-          const unsigned char *cp = (const unsigned char *) (longword_ptr - 1);
-
-          if (*cp == c)
-            return (char *) cp;
-          else if (*cp == '\0')
-            return NULL;
-          if (*++cp == c)
-            return (char *) cp;
-          else if (*cp == '\0')
-            return NULL;
-          if (*++cp == c)
-            return (char *) cp;
-          else if (*cp == '\0')
-            return NULL;
-          if (*++cp == c)
-            return (char *) cp;
-          else if (*cp == '\0')
-            return NULL;
-          if (sizeof (longword) > 4)
-            {
-              if (*++cp == c)
-                return (char *) cp;
-              else if (*cp == '\0')
-                return NULL;
-              if (*++cp == c)
-                return (char *) cp;
-              else if (*cp == '\0')
-                return NULL;
-              if (*++cp == c)
-                return (char *) cp;
-              else if (*cp == '\0')
-                return NULL;
-              if (*++cp == c)
-                return (char *) cp;
-              else if (*cp == '\0')
-                return NULL;
-            }
-        }
+      if (has_zero_eq (word, repeated_c))
+        break;
+      word = *++word_ptr;
     }
 
+  found = index_first_zero_eq (word, repeated_c);
+
+  if (extractbyte (word, found) == c)
+    return (char *) (word_ptr) + found;
   return NULL;
 }
 
-#ifdef weak_alias
-# undef index
 weak_alias (strchr, index)
-#endif
 libc_hidden_builtin_def (strchr)
diff --git a/sysdeps/s390/multiarch/strchr-c.c b/sysdeps/s390/multiarch/strchr-c.c
index 606cb56..e91ef94 100644
--- a/sysdeps/s390/multiarch/strchr-c.c
+++ b/sysdeps/s390/multiarch/strchr-c.c
@@ -19,6 +19,7 @@
 #if defined HAVE_S390_VX_ASM_SUPPORT && IS_IN (libc)
 # define STRCHR __strchr_c
 # undef weak_alias
+# define weak_alias(a, b)
 # ifdef SHARED
 #  undef libc_hidden_builtin_def
 #  define libc_hidden_builtin_def(name)				\