From patchwork Wed Jan 10 12:47:54 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 124095 Delivered-To: patch@linaro.org Received: by 10.140.22.227 with SMTP id 90csp5237259qgn; Wed, 10 Jan 2018 04:51:46 -0800 (PST) X-Google-Smtp-Source: ACJfBotEhhl+5pURTf/keAEfZXK22HnyPZ9VLd4y2wNRdT2eIzB3mJbG4oGJ6Or9NfOUYQUwGhNS X-Received: by 10.101.85.15 with SMTP id f15mr3805184pgr.153.1515588706908; Wed, 10 Jan 2018 04:51:46 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1515588706; cv=none; d=google.com; s=arc-20160816; b=sUWnZUEQ/VsG76d3arfkdVK7so6l4ZRh7FIuZCpByT4vG97CWvd6aJnjcgvYqA5UPy V3SS+r0EGmO6Z7bKhVnqCVafawM4Htw/tMv7IfjcE8sGiGc8Pf38Yuuy+fItYlbn4TVT jL4HdfXE4DUG9MNi/fISgXRRVVUqrGNz+5WREaCc8YLAtclLbeX53croqldNZH0en+zn +wT+v2tx8L/8XdUuNKqZVvvWzCSD7HvNOu/2FHA76frA3WtPnuulK1nrCXSXZ9Df6L9p RCP7+WEhgH+eXPAss/4gcEz4BVVtiNRBM0O81wTrOnefjgOKcGNk+nO59f5HlMNo1tkb 1OQQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=references:in-reply-to:message-id:date:subject:cc:to:from :delivered-to:sender:list-help:list-post:list-archive:list-subscribe :list-unsubscribe:list-id:precedence:mailing-list:dkim-signature :domainkey-signature:arc-authentication-results; bh=irvnEYvgvHaWmrx2Q8RIdFzDWdxlHO/HcvsibierjG0=; b=MAqnv+68bXvq8WmlySo/mpvMXe2hz3xkuIa0vY4wWlSbjRed1344vigIiHcbZMMLOB a6yZUn88LOhpTOG3PTzwGDhN8faGVG92Zty3dpdutgyppjiqVoLm6vTUUaUjXw7X4pHB aFN6bXMCDfcrKhZ9CXwaxPTYJxKZVqbZjWNF7eG5BI7HheCdW9awgIb8dotduAnhO1eU Z1paaeP4CNIMW4/mlEuk3s1nphb+rPrHaYOihHnNT+O6LFx+RaY6td/Loyk1i61HXBlt Ng78YIzbkjVeXHVlV1nu0RCsJqQHXCuDdTmWcgjck0cYXl8zqP40aGrlxCN2zpqGPdq4 Fb0w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=ediz/X8f; spf=pass (google.com: domain of libc-alpha-return-89016-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=libc-alpha-return-89016-patch=linaro.org@sourceware.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id u85si11705724pfi.278.2018.01.10.04.51.46 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 10 Jan 2018 04:51:46 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-return-89016-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=ediz/X8f; spf=pass (google.com: domain of libc-alpha-return-89016-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=libc-alpha-return-89016-patch=linaro.org@sourceware.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to :references; q=dns; s=default; b=o9FUiEWNjl3nl6V7vsLk4+i7u/bcuAA PgbgCsRMZuDimqenYPai4NsfPx7KxEtgJV6McVwB2+/XQtNA29JSvuIihLm3Y1u2 5ReKEgwRrMQCp5HgQ5qK2dTrwBmHyvYSwRGZRdQkv3QWYLyhAivSkPkocJYL/6ru i2R06HoOIYMc= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to :references; s=default; bh=F7j63yb6UgyQXz10blCV4KGSRsw=; b=ediz/ X8f01POIM71oISfjj+eTuzWqHeM173pI2EJMpgBuEGicT6evDeu9RcpEvtBzSVi1 2EVhdcv2SX9+LG3XQIMaR9l8Vfq98TlryW0GgSPR5qsH2lXOcWyL3dFisAi33v3i cSxfPollMkyDXk7g+eY+LySEPzcdcXGAZ0hbuI= Received: (qmail 767 invoked by alias); 10 Jan 2018 12:48:49 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 124478 invoked by uid 89); 10 Jan 2018 12:48:33 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.9 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=2430 X-HELO: mail-qk0-f176.google.com X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=irvnEYvgvHaWmrx2Q8RIdFzDWdxlHO/HcvsibierjG0=; b=A0ytSERtUx7rqdVIJD0zEqymRMSm1NrN7Zxc89juLk6CG/QA10n1xbrwX7U4/Pncbq KkHZWrzKdUn24a9K4rNtxmOejYL/mGLcJIbXQJI+GkF/DEo0bx51A41+dWJxhoDuXM3g ZHOOGP62Oku8vFyKXbzmzVUREPmgq2V6HT6MoRqe3WkxA9OHLvEirZ4b9exaE9zjjOhd MQAtZ2gDX0Ml2KX62gnSEjca0H0kedjmiN64baxIoLn42GZRtirQA/L0RKhXKdoqcapN BYz6oXqhkRVQ2AGTD4Z1oec6EhqUr68wS+OwKmZxPR+EC+KtjJyTi6iTGJVY2rRBERJY jQGA== X-Gm-Message-State: AKwxyte5AMe60xMgUIp33w3jjsU5G9f2J4LFvsjEfiqCQW4c4mdxPX6t wUPbsKXd2htFuY4ri/Ymj8/0wiRZW8U= X-Received: by 10.55.115.194 with SMTP id o185mr23900974qkc.143.1515588504506; Wed, 10 Jan 2018 04:48:24 -0800 (PST) From: Adhemerval Zanella To: libc-alpha@sourceware.org Cc: Richard Henderson Subject: [PATCH v3 10/18] string: Improve generic strchrnul Date: Wed, 10 Jan 2018 10:47:54 -0200 Message-Id: <1515588482-15744-11-git-send-email-adhemerval.zanella@linaro.org> In-Reply-To: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org> References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org> From: Richard Henderson New algorithm have the following key differences: - Reads first word unaligned and use string-maskoff function to remove unwanted data. This strategy follow assemble optimized ones for aarch64, powerpc and tile. - Use string-fz{b,i} functions. Checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu, and sparcv9-linux-gnu by removing the arch-specific assembly implementation and disabling multi-arch (it covers both LE and BE for 64 and 32 bits). [BZ #5806] * string/strchrnul.c: Use string-fzb.h, string-fzi.h. --- string/strchrnul.c | 146 +++++++++-------------------------------------------- 1 file changed, 25 insertions(+), 121 deletions(-) -- 2.7.4 diff --git a/string/strchrnul.c b/string/strchrnul.c index 5a17602..beeab88 100644 --- a/string/strchrnul.c +++ b/string/strchrnul.c @@ -21,8 +21,12 @@ . */ #include -#include #include +#include +#include +#include +#include +#include #undef __strchrnul #undef strchrnul @@ -33,134 +37,34 @@ /* Find the first occurrence of C in S or the final NUL byte. */ char * -STRCHRNUL (const char *s, int c_in) +STRCHRNUL (const char *str, int c_in) { - const unsigned char *char_ptr; - const unsigned long int *longword_ptr; - unsigned long int longword, magic_bits, charmask; - unsigned char c; + const op_t *word_ptr; + op_t found, word; - c = (unsigned char) c_in; + /* Set up a word, each of whose bytes is C. */ + op_t repeated_c = repeat_bytes (c_in); - /* Handle the first few characters by reading one character at a time. - Do this until CHAR_PTR is aligned on a longword boundary. */ - for (char_ptr = (const unsigned char *) s; - ((unsigned long int) char_ptr & (sizeof (longword) - 1)) != 0; - ++char_ptr) - if (*char_ptr == c || *char_ptr == '\0') - return (void *) char_ptr; + /* Align the input address to op_t. */ + uintptr_t s_int = (uintptr_t) str; + word_ptr = (op_t*) (s_int & -sizeof (op_t)); - /* All these elucidatory comments refer to 4-byte longwords, - but the theory applies equally well to 8-byte longwords. */ + /* Read the first aligned word, but force bytes before the string to + match neither zero nor goal (we make sure the high bit of each byte + is 1, and the low 7 bits are all the opposite of the goal byte). */ + op_t bmask = create_mask (s_int); + word = (*word_ptr | bmask) ^ (repeated_c & highbit_mask (bmask)); - longword_ptr = (unsigned long int *) char_ptr; - - /* Bits 31, 24, 16, and 8 of this number are zero. Call these bits - the "holes." Note that there is a hole just to the left of - each byte, with an extra at the end: - - bits: 01111110 11111110 11111110 11111111 - bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD - - The 1-bits make sure that carries propagate to the next 0-bit. - The 0-bits provide holes for carries to fall into. */ - magic_bits = -1; - magic_bits = magic_bits / 0xff * 0xfe << 1 >> 1 | 1; - - /* Set up a longword, each of whose bytes is C. */ - charmask = c | (c << 8); - charmask |= charmask << 16; - if (sizeof (longword) > 4) - /* Do the shift in two steps to avoid a warning if long has 32 bits. */ - charmask |= (charmask << 16) << 16; - if (sizeof (longword) > 8) - abort (); - - /* Instead of the traditional loop which tests each character, - we will test a longword at a time. The tricky part is testing - if *any of the four* bytes in the longword in question are zero. */ - for (;;) + while (1) { - /* We tentatively exit the loop if adding MAGIC_BITS to - LONGWORD fails to change any of the hole bits of LONGWORD. - - 1) Is this safe? Will it catch all the zero bytes? - Suppose there is a byte with all zeros. Any carry bits - propagating from its left will fall into the hole at its - least significant bit and stop. Since there will be no - carry from its most significant bit, the LSB of the - byte to the left will be unchanged, and the zero will be - detected. - - 2) Is this worthwhile? Will it ignore everything except - zero bytes? Suppose every byte of LONGWORD has a bit set - somewhere. There will be a carry into bit 8. If bit 8 - is set, this will carry into bit 16. If bit 8 is clear, - one of bits 9-15 must be set, so there will be a carry - into bit 16. Similarly, there will be a carry into bit - 24. If one of bits 24-30 is set, there will be a carry - into bit 31, so all of the hole bits will be changed. - - The one misfire occurs when bits 24-30 are clear and bit - 31 is set; in this case, the hole at bit 31 is not - changed. If we had access to the processor carry flag, - we could close this loophole by putting the fourth hole - at bit 32! - - So it ignores everything except 128's, when they're aligned - properly. - - 3) But wait! Aren't we looking for C as well as zero? - Good point. So what we do is XOR LONGWORD with a longword, - each of whose bytes is C. This turns each byte that is C - into a zero. */ - - longword = *longword_ptr++; - - /* Add MAGIC_BITS to LONGWORD. */ - if ((((longword + magic_bits) - - /* Set those bits that were unchanged by the addition. */ - ^ ~longword) - - /* Look at only the hole bits. If any of the hole bits - are unchanged, most likely one of the bytes was a - zero. */ - & ~magic_bits) != 0 || - - /* That caught zeroes. Now test for C. */ - ((((longword ^ charmask) + magic_bits) ^ ~(longword ^ charmask)) - & ~magic_bits) != 0) - { - /* Which of the bytes was C or zero? - If none of them were, it was a misfire; continue the search. */ - - const unsigned char *cp = (const unsigned char *) (longword_ptr - 1); - - if (*cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (sizeof (longword) > 4) - { - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - } - } + if (has_zero_eq (word, repeated_c)) + break; + word = *++word_ptr; } - /* This should never happen. */ - return NULL; + found = index_first_zero_eq (word, repeated_c); + + return (char *) (word_ptr) + found; } weak_alias (__strchrnul, strchrnul)