From patchwork Wed Jan 10 12:47:44 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 124082
Delivered-To: patch@linaro.org
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Subject: [PATCH v3 00/18] Improve generic string routines
Date: Wed, 10 Jan 2018 10:47:44 -0200
Message-Id: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org>

This is an update of Richard's previous patchset [1]. It provides generic
string implementations for newer ports, so that each port only has to
focus on a few specific routines to get a better overall improvement.
It is done by:

  1. Parameterizing the internal routines (for instance, finding a zero
     byte in a word) so each architecture can reimplement them without
     having to reimplement the whole routine.

  2. Vectorizing more string implementations (for instance strcpy and
     strcmp).

  3. Changing some implementations to use routines that may already be
     optimized (for instance strnlen).
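
As a rough illustration of item 1, the "find zero in a word" step can be
isolated in a small helper that the rest of a routine calls. The sketch
below is not the code from this patchset: word_t, word_has_zero and
sketch_find_zero are hypothetical names, and the helper uses the classic
portable bit trick where an architecture could substitute a dedicated
instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the word type a port would choose.  */
typedef unsigned long int word_t;

/* Return nonzero iff at least one byte of X is zero.  This is the
   piece an architecture could override with a special instruction.  */
static inline int
word_has_zero (word_t x)
{
  const word_t ones = (word_t) -1 / 0xff;   /* 0x0101...01 */
  const word_t highs = ones << 7;           /* 0x8080...80 */
  return ((x - ones) & ~x & highs) != 0;
}

/* Return the index of the first zero byte in BUF, or LEN if there is
   none.  Whole words are read only while they fit inside LEN.  */
static size_t
sketch_find_zero (const unsigned char *buf, size_t len)
{
  assert (buf != NULL);
  size_t i = 0;
  while (len - i >= sizeof (word_t))
    {
      word_t w;
      memcpy (&w, buf + i, sizeof w);   /* safe unaligned load */
      if (word_has_zero (w))
        break;
      i += sizeof w;
    }
  /* Finish byte by byte: locates the zero inside the flagged word,
     or scans the tail that is shorter than one word.  */
  while (i < len && buf[i] != 0)
    i++;
  return i;
}
```

Only word_has_zero depends on the word-level trick; the surrounding loop
stays generic, which is the separation the list above describes.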

This lets new ports focus on providing optimized implementations of only
a handful of symbols (for instance memchr), and lets those improvements
be reused by a larger set of routines.

For the rest of BZ #5806 I think we can handle them later, and if the
performance of the generic implementations comes close to that of the
old assembly implementations, I think it is better to just remove the
latter.

I checked on x86_64-linux-gnu, i686-linux-gnu, sparc64-linux-gnu, and
sparcv9-linux-gnu by removing the arch-specific assembly implementations
and disabling multiarch (this covers both LE and BE for 64 and 32 bits).
I also checked the string routines on alpha, hppa, and sh.

Changes since v2:

  * Moved string-fz{a,b,i} to their own patch.
  * Added an inline implementation for __builtin_c{l,t}z to avoid using
    compiler-provided symbols.
  * Added a new header, string-maskoff.h, to handle unaligned accesses
    in some implementations.
  * Fixed strcmp on LE machines.
  * Added an unaligned strcpy variant for architectures that define
    _STRING_ARCH_unaligned.
  * Added SH string-fzb.h (which uses the cmp/str instruction to find a
    zero in a word).

Changes since v1:

  * Marked ChangeLog entries with [BZ #5806], as appropriate.
  * Reorganized the headers, so that armv6t2 and power6 need to override
    as little as possible to use their (integer) zero detection insns.
  * Hopefully fixed all of the coding style issues.
  * Adjusted the memrchr algorithm as discussed.
  * Replaced the #ifdef STRRCHR etc. that are used by the multiarch
    files.
  * Tested on i386, i686, x86_64 (verified this is unused), ppc64,
    ppc64le --with-cpu=power8 (to use power6 in multiarch), armv7,
    aarch64, alpha (qemu) and hppa (qemu).

PS: This patchset is aimed at 2.28.
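
To picture the "inline implementation for __builtin_c{l,t}z" item from
the v2 changes: a bit count can be done with plain shifts and masks, so
no compiler support routine is pulled in. This is only a sketch under
that assumption; sketch_ctz64 and sketch_clz64 are hypothetical names
and are not taken from string-fzi.h.

```c
#include <assert.h>
#include <stdint.h>

/* Count trailing zero bits of a nonzero X by binary search.  */
static inline unsigned int
sketch_ctz64 (uint64_t x)
{
  assert (x != 0);
  unsigned int n = 0;
  if ((x & 0xffffffffu) == 0) { n += 32; x >>= 32; }
  if ((x & 0xffffu) == 0)     { n += 16; x >>= 16; }
  if ((x & 0xffu) == 0)       { n += 8;  x >>= 8; }
  if ((x & 0xfu) == 0)        { n += 4;  x >>= 4; }
  if ((x & 0x3u) == 0)        { n += 2;  x >>= 2; }
  if ((x & 0x1u) == 0)        { n += 1; }
  return n;
}

/* Count leading zero bits of a nonzero X by binary search.  */
static inline unsigned int
sketch_clz64 (uint64_t x)
{
  assert (x != 0);
  unsigned int n = 0;
  if ((x >> 32) == 0) { n += 32; x <<= 32; }
  if ((x >> 48) == 0) { n += 16; x <<= 16; }
  if ((x >> 56) == 0) { n += 8;  x <<= 8; }
  if ((x >> 60) == 0) { n += 4;  x <<= 4; }
  if ((x >> 62) == 0) { n += 2;  x <<= 2; }
  if ((x >> 63) == 0) { n += 1; }
  return n;
}
```

Given a mask that has the top bit set in each matching byte, dividing the
trailing-zero count by 8 gives the index of the first match on a
little-endian word; the leading-zero count plays the same role on
big-endian.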

[1] https://sourceware.org/ml/libc-alpha/2016-12/msg00830.html

Adhemerval Zanella (5):
  Add string-maskoff.h generic header
  Add string vectorized find and detection functions
  string: Improve generic strnlen
  string: Improve generic strcpy
  sh: Add string-fzb.h

Richard Henderson (13):
  Parameterize op_t from memcopy.h
  Parameterize OP_T_THRES from memcopy.h
  string: Improve generic strlen
  string: Improve generic memchr
  string: Improve generic memrchr
  string: Improve generic strchr
  string: Improve generic strchrnul
  string: Improve generic strcmp
  hppa: Add memcopy.h
  hppa: Add string-fzb.h and string-fzi.h
  alpha: Add string-fzb.h and string-fzi.h
  arm: Add string-fza.h
  powerpc: Add string-fza.h

 config.h.in | 8 +
 configure | 54 ++++++
 configure.ac | 34 ++++
 string/memchr.c | 157 ++++-----------
 string/memcmp.c | 4 -
 string/memrchr.c | 193 ++++--------------
 string/strchr.c | 166 +++-------------
 string/strchrnul.c | 146 +++-----------
 string/strcmp.c | 97 +++++++++-
 string/strcpy.c | 109 ++++++++++-
 string/strlen.c | 83 ++------
 string/strnlen.c | 139 +------------
 string/test-strcpy.c | 24 ++-
 sysdeps/alpha/string-fzb.h | 51 +++++
 sysdeps/alpha/string-fzi.h | 113 +++++++++++
 sysdeps/arm/armv6t2/string-fza.h | 69 +++++++
 sysdeps/generic/memcopy.h | 11 +-
 sysdeps/generic/string-extbyte.h | 35 ++++
 sysdeps/generic/string-fza.h | 117 +++++++++++
 sysdeps/generic/string-fzb.h | 49 +++++
 sysdeps/generic/string-fzi.h | 215 +++++++++++++++++++++
 sysdeps/generic/string-maskoff.h | 64 ++++++
 sysdeps/generic/string-opthr.h | 25 +++
 sysdeps/generic/string-optype.h | 31 +++
 sysdeps/hppa/memcopy.h | 44 +++++
 sysdeps/hppa/string-fzb.h | 69 +++++++
 sysdeps/hppa/string-fzi.h | 135 +++++++++++++
 sysdeps/i386/i686/multiarch/memrchr-c.c | 2 +
 sysdeps/i386/i686/multiarch/strnlen-c.c | 19 +-
 sysdeps/i386/memcopy.h | 3 -
 sysdeps/i386/string-opthr.h | 25 +++
 sysdeps/m68k/memcopy.h | 3 -
 sysdeps/powerpc/power6/string-fza.h | 65 +++++++
 sysdeps/powerpc/powerpc32/power4/memcopy.h | 5 -
 .../powerpc32/power4/multiarch/strnlen-ppc32.c | 19 +-
 sysdeps/powerpc/powerpc32/power6/string-fza.h | 1 +
 sysdeps/powerpc/powerpc64/power6/string-fza.h | 1 +
 sysdeps/s390/multiarch/memrchr-c.c | 2 +
 sysdeps/s390/multiarch/strchr-c.c | 1 +
 sysdeps/s390/multiarch/strnlen-c.c | 18 +-
 sysdeps/sh/string-fzb.h | 53 +++++
 sysdeps/tile/memcmp.c | 1 -
 sysdeps/tile/memcopy.h | 7 -
 sysdeps/tile/tilegx32/gmp-mparam.h | 30 +++
 44 files changed, 1707 insertions(+), 790 deletions(-)
 create mode 100644 sysdeps/alpha/string-fzb.h
 create mode 100644 sysdeps/alpha/string-fzi.h
 create mode 100644 sysdeps/arm/armv6t2/string-fza.h
 create mode 100644 sysdeps/generic/string-extbyte.h
 create mode 100644 sysdeps/generic/string-fza.h
 create mode 100644 sysdeps/generic/string-fzb.h
 create mode 100644 sysdeps/generic/string-fzi.h
 create mode 100644 sysdeps/generic/string-maskoff.h
 create mode 100644 sysdeps/generic/string-opthr.h
 create mode 100644 sysdeps/generic/string-optype.h
 create mode 100644 sysdeps/hppa/memcopy.h
 create mode 100644 sysdeps/hppa/string-fzb.h
 create mode 100644 sysdeps/hppa/string-fzi.h
 create mode 100644 sysdeps/i386/string-opthr.h
 create mode 100644 sysdeps/powerpc/power6/string-fza.h
 create mode 100644 sysdeps/powerpc/powerpc32/power6/string-fza.h
 create mode 100644 sysdeps/powerpc/powerpc64/power6/string-fza.h
 create mode 100644 sysdeps/sh/string-fzb.h
 create mode 100644 sysdeps/tile/tilegx32/gmp-mparam.h

--
2.7.4