From patchwork Mon Feb 19 20:46:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 774112 Delivered-To: patch@linaro.org Received: by 2002:a5d:4943:0:b0:33b:4db1:f5b3 with SMTP id r3csp1346231wrs; Mon, 19 Feb 2024 12:47:18 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCUqlkoz39vLirH1YR+yp8taFbPpgiColERY1tg90MLxAxIw1LOaTWJ7FhbrBdjgTsU5T8U8kU5zpjF6kuJbC0ct X-Google-Smtp-Source: AGHT+IFr3GLLurTSWeudqlUhek/Jb7A8tvtBRK7h9VRjdW8w6wXdrM15ZXfzmvNx45FzMhfmDzC2 X-Received: by 2002:a0d:d956:0:b0:608:1be7:6e4e with SMTP id b83-20020a0dd956000000b006081be76e4emr5262522ywe.33.1708375638324; Mon, 19 Feb 2024 12:47:18 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708375638; cv=pass; d=google.com; s=arc-20160816; b=ytkrpczVmGP/19PeqbgWwLJvlRba/jkIocVMwm8iuZgIYW6NQOvWmEnH/FJchYmzIc mdNBAaJFcSWW8skMgKnbtBHSedDG2xte9es9piNVUlOYHxmW/W2eD11l/zyrBHRi/1QL sfzHKTQysjed1+9002CvQx3Uj8KEdXJGSirbYFY/gjbqM5pi4IVt2O/VEkAqxeyU08q5 sQRwJ8bqDA0F7Ar46ct8SjPEV3h6p/eJtORZoLaXr+5aY5zQPmvODjZfH3czD8W171sf B01aYcyZPocU5LGyKm7aVEGI0dgzr+Nz2uMWLHRhuFMfMCXYUA1Y2c4IcFIne36OsRk8 xBzw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:message-id:date:subject:to:from:dkim-signature :arc-filter:dmarc-filter:delivered-to; bh=tgMJB7ZJmq1zecnlkTAMu+yz2VqrlcWAjZT0fb6dwQI=; fh=dHLBnA+MhGtNtN2B2JMAELi4oD+gmgMg7DL8H0jYbkI=; b=Xo9X4oAqsrbBBfNM0rJ4vTnR57//uaTC2g8Gxpn+H2nband7cIQ4Xs2bAt2EATTf/z fzpncVS1/MVTGhnij6VSKPbAJFMc/lYhmnHIYGunSPvJUKegPdSfR1MKGzX+qmBHaFz5 /SIzeozXiB+SzYx0ZG2d2MQYBXdbm5w6bHHIgMSv9R6ocTyq3jQV/1JzBfjgL0tmybDl RM/wswuYrhvAWudoX1XCMQ9prnTVUdo2Ttni4+BHDm2pY5ky3XKB2R4jmdYKKnWHLQw5 ZCEDIDbsFGpBVA36H/23lN6u4m2z6tmnOxv9dXAgNRKHuGTDCRJQC0drU3CgyChPk3F0 WCbQ==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=Hs68VdRJ; arc=pass (i=1); spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id i16-20020ac84890000000b0042deeac6756si6287522qtq.469.2024.02.19.12.47.18 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 Feb 2024 12:47:18 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=Hs68VdRJ; arc=pass (i=1); spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id CCE0D3858C41 for ; Mon, 19 Feb 2024 20:47:17 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pg1-x532.google.com (mail-pg1-x532.google.com [IPv6:2607:f8b0:4864:20::532]) by sourceware.org (Postfix) with ESMTPS id 27CE93858C42 for ; Mon, 19 Feb 2024 20:47:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 27CE93858C42 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 27CE93858C42 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::532 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1708375625; cv=none; b=VQNY/zgVjcTes/PbLxjvoUXc3vQ5P/xodG3uVKuJKpxOte9tDiWcimp663VYotGzE9nDAYU0TkIF8mXAJyx7U//BmaSfsfhp+O1O6rXCOfa+ZAARRyxeyK9sMHxbjMpPqDb7ISGnOJlWqZnkyjOS8sYClSzH9fl3Rw5P7aq8ajg= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1708375625; c=relaxed/simple; bh=uwTKsnnY7BZ/sVRSqsTqA0KhfEf6VMtlDfS/fjnzUm8=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=tySHPhKZB2OeEff//sWCAflXVKctMDSvznDsFXNj+7BgTUvyZJf18du4Q0FiQzhOcHR54w1WhuiS/+/M0ZXR/sFYUhHd7tK+q76Q0pep5GDdjr+u5mkIsSskgkvHBFU1iTL7gtCWIrUQW+saYYKbigp/6SRxo5LwMKFUPbUAUy4= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-pg1-x532.google.com with SMTP id 41be03b00d2f7-5ce07cf1e5dso3715181a12.2 for ; Mon, 19 Feb 2024 12:47:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1708375621; x=1708980421; darn=sourceware.org; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:from:to:cc:subject:date:message-id:reply-to; bh=tgMJB7ZJmq1zecnlkTAMu+yz2VqrlcWAjZT0fb6dwQI=; b=Hs68VdRJ6KVv30W993WRuWC2kJjyD6RLqwkcLOoaBK9XmNogxWsijATZNXFFSmdZTY e4N90hsmx84ki60NTQFRzjkSmVg+ZSCJDCouj5a64cqDR8hd4P+jXd3siBqPbOzmPnye 5rpwqGHspt1tM6vwCYZPZGAxyDAUBO7EZgZ3sNUImYhDTOHzmlFAZHEIKWPIGK1Vxbmz sOtzg+8yUHCf7UgZ0EnFBc9qwaB2ObsmtbS2oownUtyYFrb6HgMqZ30QKRLLJh9xEbi9 tZCWJyrdy+RozE8Z2xhaZwt12IyOIRrDyIfAgQeq2U9mqMjG96isag4sclGvcaGakzEv JX0Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1708375621; x=1708980421; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=tgMJB7ZJmq1zecnlkTAMu+yz2VqrlcWAjZT0fb6dwQI=; b=oj9rH4ashl3ZXCbM6aExWqifYzz7oFV1IpyMJrvvLXNGA5bQFW+4q6ooqcgFER9hoE bLyZdiHqhjnbB2ZUI44XXzz0ln7ouc4OdVRPDKPixJz/qiJcg+Y6bHVAXl+6WdGt12sr ZpwygCBvOczSWt8HhxhJfmrQsjz+KnKsI+zNeO+nGw25p22Yd9ND76ENwf70Zmjb7LMN gVWoIle9fw1LWUHzlnJFKLcBh0totvriLubxN/DtNeahBXBsM+ZQcx+iqdnbmNfaDxpL yskc1UmiIP8FQoZZggS3tNyC23V6kczy85E4yAUGXiWEIEWr6kb4hAv4I4UVS1R62wq0 qVQA== X-Gm-Message-State: AOJu0YxcSYGqh+JBdjyX7Y/JoW/1PpKIC+Cenq5C3Lg1yrNtiGvTf+PA NJnhbsrApU92uJoTdmiuFHCppSgLI5o+AXJR92xAnbKmldgjG8t1jn6qxWhGwOFmJ4BPSCSU+L8 g X-Received: by 2002:a17:90a:f411:b0:299:5d32:e53a with SMTP id ch17-20020a17090af41100b002995d32e53amr3706303pjb.7.1708375620059; Mon, 19 Feb 2024 12:47:00 -0800 (PST) Received: from mandiga.. ([2804:1b3:a7c0:8177:87f4:ac6d:3153:10b3]) by smtp.gmail.com with ESMTPSA id w3-20020a17090a528300b002990d91d31dsm5684587pjh.15.2024.02.19.12.46.58 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 Feb 2024 12:46:59 -0800 (PST) From: Adhemerval Zanella To: libc-alpha@sourceware.org Subject: [PATCH] powerpc: Remove power7 strstr optimization Date: Mon, 19 Feb 2024 17:46:55 -0300 Message-Id: <20240219204655.3096414-1-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SCC_10_SHORT_WORD_LINES, SCC_20_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES, SPF_HELO_NONE, SPF_PASS, TVD_SUBJ_WIPE_DEBT, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org The optimization is not faster than the generic algorithm, using the bench-strstr the geometric mean running on a POWER10 machine using gcc 13.1.1 is 482.47 while the default __strstr_ppc is 340.97 (which uses the generic implementation). Also, there is no need to redirect the internal str*/mem* call to optimized version, internal ifunc is supported and enabled for internal calls (meaning that the generic implementation will use any asm optimization if available). Checked on powerpc64le-linux-gnu. Reviewed-by: Peter Bergner --- sysdeps/powerpc/powerpc64/multiarch/Makefile | 2 +- .../powerpc64/multiarch/ifunc-impl-list.c | 9 - .../powerpc64/multiarch/strstr-power7.S | 33 -- .../powerpc64/multiarch/strstr-ppc64.c | 29 - sysdeps/powerpc/powerpc64/multiarch/strstr.c | 36 -- sysdeps/powerpc/powerpc64/power7/Makefile | 1 - .../powerpc/powerpc64/power7/strstr-ppc64.c | 27 - sysdeps/powerpc/powerpc64/power7/strstr.S | 535 ------------------ 8 files changed, 1 insertion(+), 671 deletions(-) delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/strstr-ppc64.c delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/strstr.c delete mode 100644 sysdeps/powerpc/powerpc64/power7/strstr-ppc64.c delete mode 100644 sysdeps/powerpc/powerpc64/power7/strstr.S diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile index 594fbb8058..5624249fe2 100644 --- a/sysdeps/powerpc/powerpc64/multiarch/Makefile +++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile @@ -24,7 +24,7 @@ sysdep_routines += memcpy-power8-cached memcpy-power7 memcpy-a2 memcpy-power6 \ strcmp-power8 strcmp-power7 strcmp-ppc64 \ strcat-power8 strcat-power7 strcat-ppc64 \ memmove-power7 memmove-ppc64 wordcopy-ppc64 \ - strncpy-power8 strstr-power7 strstr-ppc64 \ + strncpy-power8 \ strspn-power8 strspn-ppc64 strcspn-power8 strcspn-ppc64 \ strlen-power8 strcasestr-power8 strcasestr-ppc64 \ strcasecmp-ppc64 strcasecmp-power8 strncase-ppc64 \ diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c index 5b2d6a90ab..e3e6f5fd10 100644 --- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c +++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c @@ -432,15 +432,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, IFUNC_IMPL_ADD (array, i, strcspn, 1, __strcspn_ppc)) - /* Support sysdeps/powerpc/powerpc64/multiarch/strstr.c. */ - IFUNC_IMPL (i, name, strstr, - IFUNC_IMPL_ADD (array, i, strstr, - hwcap & PPC_FEATURE_ARCH_2_06, - __strstr_power7) - IFUNC_IMPL_ADD (array, i, strstr, 1, - __strstr_ppc)) - - /* Support sysdeps/powerpc/powerpc64/multiarch/strcasestr.c. */ IFUNC_IMPL (i, name, strcasestr, IFUNC_IMPL_ADD (array, i, strcasestr, diff --git a/sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S b/sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S deleted file mode 100644 index 73029fad24..0000000000 --- a/sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S +++ /dev/null @@ -1,33 +0,0 @@ -/* Optimized strstr implementation for POWER7. - Copyright (C) 2015-2024 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#define STRSTR __strstr_power7 - -#undef libc_hidden_builtin_def -#define libc_hidden_builtin_def(name) - -#define STRLEN __strlen_power7 -#define STRNLEN __strnlen_power7 -#define STRCHR __strchr_power7 -#ifdef SHARED -#define STRLEN_is_local -#define STRNLEN_is_local -#define STRCHR_is_local -#endif - -#include diff --git a/sysdeps/powerpc/powerpc64/multiarch/strstr-ppc64.c b/sysdeps/powerpc/powerpc64/multiarch/strstr-ppc64.c deleted file mode 100644 index 3cf22eebf7..0000000000 --- a/sysdeps/powerpc/powerpc64/multiarch/strstr-ppc64.c +++ /dev/null @@ -1,29 +0,0 @@ -/* Copyright (C) 2015-2024 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include - -#define STRSTR __strstr_ppc -#if IS_IN (libc) && defined(SHARED) -# undef libc_hidden_builtin_def -# define libc_hidden_builtin_def(name) \ - __hidden_ver1(__strstr_ppc, __GI_strstr, __strstr_ppc); -#endif - -extern __typeof (strstr) __strstr_ppc attribute_hidden; - -#include diff --git a/sysdeps/powerpc/powerpc64/multiarch/strstr.c b/sysdeps/powerpc/powerpc64/multiarch/strstr.c deleted file mode 100644 index a25ad0fa50..0000000000 --- a/sysdeps/powerpc/powerpc64/multiarch/strstr.c +++ /dev/null @@ -1,36 +0,0 @@ -/* Multiple versions of strstr. PowerPC64 version. - Copyright (C) 2015-2024 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -/* Define multiple versions only for definition in libc. */ -#if IS_IN (libc) -# define strstr __redirect_strstr -# include -# include -# include "init-arch.h" - -extern __typeof (strstr) __strstr_ppc attribute_hidden; -extern __typeof (strstr) __strstr_power7 attribute_hidden; -# undef strstr - -/* Avoid DWARF definition DIE on ifunc symbol so that GDB can handle - ifunc symbol properly. */ -libc_ifunc_redirected (__redirect_strstr, strstr, - (hwcap & PPC_FEATURE_ARCH_2_06) - ? __strstr_power7 - : __strstr_ppc); -#endif diff --git a/sysdeps/powerpc/powerpc64/power7/Makefile b/sysdeps/powerpc/powerpc64/power7/Makefile index 9a0e7474bb..7eeff4b770 100644 --- a/sysdeps/powerpc/powerpc64/power7/Makefile +++ b/sysdeps/powerpc/powerpc64/power7/Makefile @@ -9,7 +9,6 @@ $(foreach suf,$(all-object-suffixes),$(objpfx)rtld$(suf)): \ endif ifeq ($(subdir),string) -sysdep_routines += strstr-ppc64 CFLAGS-strncase.c += -funroll-loops CFLAGS-strncase_l.c += -funroll-loops endif diff --git a/sysdeps/powerpc/powerpc64/power7/strstr-ppc64.c b/sysdeps/powerpc/powerpc64/power7/strstr-ppc64.c deleted file mode 100644 index 12a1aa8802..0000000000 --- a/sysdeps/powerpc/powerpc64/power7/strstr-ppc64.c +++ /dev/null @@ -1,27 +0,0 @@ -/* Optimized strstr implementation for PowerPC64/POWER7. - Copyright (C) 2015-2024 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include - -#define STRSTR __strstr_ppc -#undef libc_hidden_builtin_def -#define libc_hidden_builtin_def(__name) - -extern __typeof (strstr) __strstr_ppc attribute_hidden; - -#include diff --git a/sysdeps/powerpc/powerpc64/power7/strstr.S b/sysdeps/powerpc/powerpc64/power7/strstr.S deleted file mode 100644 index d4a324780f..0000000000 --- a/sysdeps/powerpc/powerpc64/power7/strstr.S +++ /dev/null @@ -1,535 +0,0 @@ -/* Optimized strstr implementation for PowerPC64/POWER7. - Copyright (C) 2015-2024 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include - -/* Char * [r3] strstr (char *s [r3], char * pat[r4]) */ - -/* The performance gain is obtained using aligned memory access, load - * doubleword and usage of cmpb instruction for quicker comparison. */ - -#define ITERATIONS 64 - -#ifndef STRSTR -# define STRSTR strstr -#endif - -#ifndef STRLEN -/* For builds with no IFUNC support, local calls should be made to internal - GLIBC symbol (created by libc_hidden_builtin_def). */ -# ifdef SHARED -# define STRLEN __GI_strlen -# define STRLEN_is_local -# else -# define STRLEN strlen -# endif -#endif - -#ifndef STRNLEN -/* For builds with no IFUNC support, local calls should be made to internal - GLIBC symbol (created by libc_hidden_builtin_def). */ -# ifdef SHARED -# define STRNLEN __GI_strnlen -# define STRNLEN_is_local -# else -# define STRNLEN __strnlen -# endif -#endif - -#ifndef STRCHR -# ifdef SHARED -# define STRCHR __GI_strchr -# define STRCHR_is_local -# else -# define STRCHR strchr -# endif -#endif - -#define FRAMESIZE (FRAME_MIN_SIZE+32) - .machine power7 -/* Can't be ENTRY_TOCLESS due to calling __strstr_ppc which uses r2. */ -ENTRY (STRSTR, 4) - CALL_MCOUNT 2 - mflr r0 /* Load link register LR to r0. */ - std r31, -8(r1) /* Save callers register r31. */ - std r30, -16(r1) /* Save callers register r30. */ - std r29, -24(r1) /* Save callers register r29. */ - std r28, -32(r1) /* Save callers register r28. */ - std r0, 16(r1) /* Store the link register. */ - cfi_offset(r31, -8) - cfi_offset(r30, -16) - cfi_offset(r28, -32) - cfi_offset(r29, -24) - cfi_offset(lr, 16) - stdu r1, -FRAMESIZE(r1) /* Create the stack frame. */ - cfi_adjust_cfa_offset(FRAMESIZE) - - dcbt 0, r3 - dcbt 0, r4 - cmpdi cr7, r3, 0 - beq cr7, L(retnull) - cmpdi cr7, r4, 0 - beq cr7, L(retnull) - - mr r29, r3 - mr r30, r4 - mr r3, r4 - bl STRLEN -#ifndef STRLEN_is_local - nop -#endif - - cmpdi cr7, r3, 0 /* If search str is null. */ - beq cr7, L(ret_r3) - - mr r31, r3 - mr r4, r3 - mr r3, r29 - bl STRNLEN -#ifndef STRNLEN_is_local - nop -#endif - - cmpd cr7, r3, r31 /* If len(r3) < len(r4). */ - blt cr7, L(retnull) - mr r3, r29 - lbz r4, 0(r30) - bl STRCHR -#ifndef STRCHR_is_local - nop -#endif - - mr r11, r3 - /* If first char of search str is not present. */ - cmpdi cr7, r3, 0 - ble cr7, L(end) - /* Reg r28 is used to count the number of iterations. */ - li r28, 0 - rldicl r8, r3, 0, 52 /* Page cross check. */ - cmpldi cr7, r8, 4096-16 - bgt cr7, L(bytebybyte) - - rldicl r8, r30, 0, 52 - cmpldi cr7, r8, 4096-16 - bgt cr7, L(bytebybyte) - - /* If len(r4) < 8 handle in a different way. */ - /* Shift position based on null and use cmpb. */ - cmpdi cr7, r31, 8 - blt cr7, L(lessthan8) - - /* Len(r4) >= 8 reaches here. */ - mr r8, r3 /* Save r3 for future use. */ - mr r4, r30 /* Restore r4. */ - li r0, 0 - rlwinm r10, r30, 3, 26, 28 /* Calculate padding in bits. */ - clrrdi r4, r4, 3 /* Make r4 aligned to 8. */ - ld r6, 0(r4) - addi r4, r4, 8 - cmpdi cr7, r10, 0 /* Check if its already aligned? */ - beq cr7, L(begin1) -#ifdef __LITTLE_ENDIAN__ - srd r6, r6, r10 /* Discard unwanted bits. */ -#else - sld r6, r6, r10 -#endif - ld r9, 0(r4) - subfic r10, r10, 64 -#ifdef __LITTLE_ENDIAN__ - sld r9, r9, r10 /* Discard unwanted bits. */ -#else - srd r9, r9, r10 -#endif - or r6, r6, r9 /* Form complete search str. */ -L(begin1): - mr r29, r6 - rlwinm r10, r3, 3, 26, 28 - clrrdi r3, r3, 3 - ld r5, 0(r3) - cmpb r9, r0, r6 /* Check if input has null. */ - cmpdi cr7, r9, 0 - bne cr7, L(return3) - cmpb r9, r0, r5 /* Check if input has null. */ -#ifdef __LITTLE_ENDIAN__ - srd r9, r9, r10 -#else - sld r9, r9, r10 -#endif - cmpdi cr7, r9, 0 - bne cr7, L(retnull) - - li r12, -8 /* Shift values. */ - li r11, 72 /* Shift values. */ - cmpdi cr7, r10, 0 - beq cr7, L(nextbyte1) - mr r12, r10 - addi r12, r12, -8 - subfic r11, r12, 64 - -L(nextbyte1): - ldu r7, 8(r3) /* Load next dw. */ - addi r12, r12, 8 /* Shift one byte and compare. */ - addi r11, r11, -8 -#ifdef __LITTLE_ENDIAN__ - srd r9, r5, r12 /* Rotate based on mask. */ - sld r10, r7, r11 -#else - sld r9, r5, r12 - srd r10, r7, r11 -#endif - /* Form single dw from few bytes on first load and second load. */ - or r10, r9, r10 - /* Check for null in the formed dw. */ - cmpb r9, r0, r10 - cmpdi cr7, r9, 0 - bne cr7, L(retnull) - /* Cmpb search str and input str. */ - cmpb r9, r10, r6 - cmpdi cr7, r9, -1 - beq cr7, L(match) - addi r8, r8, 1 - b L(begin) - - .align 4 -L(match): - /* There is a match of 8 bytes, check next bytes. */ - cmpdi cr7, r31, 8 - beq cr7, L(return) - /* Update next starting point r8. */ - srdi r9, r11, 3 - subf r9, r9, r3 - mr r8, r9 - -L(secondmatch): - mr r5, r7 - rlwinm r10, r30, 3, 26, 28 /* Calculate padding in bits. */ - ld r6, 0(r4) - addi r4, r4, 8 - cmpdi cr7, r10, 0 /* Check if its already aligned? */ - beq cr7, L(proceed3) -#ifdef __LITTLE_ENDIAN__ - srd r6, r6, r10 /* Discard unwanted bits. */ - cmpb r9, r0, r6 - sld r9, r9, r10 -#else - sld r6, r6, r10 - cmpb r9, r0, r6 - srd r9, r9, r10 -#endif - cmpdi cr7, r9, 0 - bne cr7, L(proceed3) - ld r9, 0(r4) - subfic r10, r10, 64 -#ifdef __LITTLE_ENDIAN__ - sld r9, r9, r10 /* Discard unwanted bits. */ -#else - srd r9, r9, r10 -#endif - or r6, r6, r9 /* Form complete search str. */ - -L(proceed3): - li r7, 0 - addi r3, r3, 8 - cmpb r9, r0, r5 - cmpdi cr7, r9, 0 - bne cr7, L(proceed4) - ld r7, 0(r3) -L(proceed4): -#ifdef __LITTLE_ENDIAN__ - srd r9, r5, r12 - sld r10, r7, r11 -#else - sld r9, r5, r12 - srd r10, r7, r11 -#endif - /* Form single dw with few bytes from first and second load. */ - or r10, r9, r10 - cmpb r9, r0, r6 - cmpdi cr7, r9, 0 - bne cr7, L(return4) - /* Check for null in the formed dw. */ - cmpb r9, r0, r10 - cmpdi cr7, r9, 0 - bne cr7, L(retnull) - /* If the next 8 bytes dont match, start search again. */ - cmpb r9, r10, r6 - cmpdi cr7, r9, -1 - bne cr7, L(reset) - /* If the next 8 bytes match, load and compare next 8. */ - b L(secondmatch) - - .align 4 -L(reset): - /* Start the search again. */ - addi r8, r8, 1 - b L(begin) - - .align 4 -L(return3): - /* Count leading zeros and compare partial dw. */ -#ifdef __LITTLE_ENDIAN__ - addi r7, r9, -1 - andc r7, r7, r9 - popcntd r7, r7 - subfic r7, r7, 64 - sld r10, r5, r7 - sld r6, r6, r7 -#else - cntlzd r7, r9 - subfic r7, r7, 64 - srd r10, r5, r7 - srd r6, r6, r7 -#endif - cmpb r9, r10, r6 - cmpdi cr7, r9, -1 - addi r8, r8, 1 - /* Start search again if there is no match. */ - bne cr7, L(begin) - /* If the words match, update return values. */ - subfic r7, r7, 64 - srdi r7, r7, 3 - add r3, r3, r7 - subf r3, r31, r3 - b L(end) - - .align 4 -L(return4): - /* Count leading zeros and compare partial dw. */ -#ifdef __LITTLE_ENDIAN__ - addi r7, r9, -1 - andc r7, r7, r9 - popcntd r7, r7 - subfic r7, r7, 64 - sld r10, r10, r7 - sld r6, r6, r7 -#else - cntlzd r7, r9 - subfic r7, r7, 64 - srd r10, r10, r7 - srd r6, r6, r7 -#endif - cmpb r9, r10, r6 - cmpdi cr7, r9, -1 - addi r8, r8, 1 - bne cr7, L(begin) - subfic r7, r7, 64 - srdi r11, r11, 3 - subf r3, r11, r3 - srdi r7, r7, 3 - add r3, r3, r7 - subf r3, r31, r3 - b L(end) - - .align 4 -L(begin): - mr r3, r8 - /* When our iterations exceed ITERATIONS,fall back to default. */ - addi r28, r28, 1 - cmpdi cr7, r28, ITERATIONS - beq cr7, L(default) - lbz r4, 0(r30) - bl STRCHR -#ifndef STRCHR_is_local - nop -#endif - /* If first char of search str is not present. */ - cmpdi cr7, r3, 0 - ble cr7, L(end) - mr r8, r3 - mr r4, r30 /* Restore r4. */ - li r0, 0 - mr r6, r29 - clrrdi r4, r4, 3 - addi r4, r4, 8 - b L(begin1) - - /* Handle less than 8 search string. */ - .align 4 -L(lessthan8): - mr r4, r3 - mr r9, r30 - li r0, 0 - - rlwinm r10, r9, 3, 26, 28 /* Calculate padding in bits. */ - srdi r8, r10, 3 /* Padding in bytes. */ - clrrdi r9, r9, 3 /* Make r4 aligned to 8. */ - ld r6, 0(r9) - cmpdi cr7, r10, 0 /* Check if its already aligned? */ - beq cr7, L(proceed2) -#ifdef __LITTLE_ENDIAN__ - srd r6, r6, r10 /* Discard unwanted bits. */ -#else - sld r6, r6, r10 -#endif - subfic r8, r8, 8 - cmpd cr7, r8, r31 /* Next load needed? */ - bge cr7, L(proceed2) - ld r7, 8(r9) - subfic r10, r10, 64 -#ifdef __LITTLE_ENDIAN__ - sld r7, r7, r10 /* Discard unwanted bits. */ -#else - srd r7, r7, r10 -#endif - or r6, r6, r7 /* Form complete search str. */ -L(proceed2): - mr r29, r6 - rlwinm r10, r3, 3, 26, 28 - clrrdi r7, r3, 3 /* Make r3 aligned. */ - ld r5, 0(r7) - sldi r8, r31, 3 - subfic r8, r8, 64 -#ifdef __LITTLE_ENDIAN__ - sld r6, r6, r8 - cmpb r9, r0, r5 - srd r9, r9, r10 -#else - srd r6, r6, r8 - cmpb r9, r0, r5 - sld r9, r9, r10 -#endif - cmpdi cr7, r9, 0 - bne cr7, L(noload) - cmpdi cr7, r10, 0 - beq cr7, L(continue) - ld r7, 8(r7) -L(continue1): - mr r12, r10 - addi r12, r12, -8 - subfic r11, r12, 64 - b L(nextbyte) - - .align 4 -L(continue): - ld r7, 8(r7) - li r12, -8 /* Shift values. */ - li r11, 72 /* Shift values. */ -L(nextbyte): - addi r12, r12, 8 /* Mask for rotation. */ - addi r11, r11, -8 -#ifdef __LITTLE_ENDIAN__ - srd r9, r5, r12 - sld r10, r7, r11 - or r10, r9, r10 - sld r10, r10, r8 - cmpb r9, r0, r10 - srd r9, r9, r8 -#else - sld r9, r5, r12 - srd r10, r7, r11 - or r10, r9, r10 - srd r10, r10, r8 - cmpb r9, r0, r10 - sld r9, r9, r8 -#endif - cmpdi cr7, r9, 0 - bne cr7, L(retnull) - cmpb r9, r10, r6 - cmpdi cr7, r9, -1 - beq cr7, L(end) - addi r3, r4, 1 - /* When our iterations exceed ITERATIONS,fall back to default. */ - addi r28, r28, 1 - cmpdi cr7, r28, ITERATIONS - beq cr7, L(default) - lbz r4, 0(r30) - bl STRCHR -#ifndef STRCHR_is_local - nop -#endif - /* If first char of search str is not present. */ - cmpdi cr7, r3, 0 - ble cr7, L(end) - mr r4, r3 - mr r6, r29 - li r0, 0 - b L(proceed2) - - .align 4 -L(noload): - /* Reached null in r3, so skip next load. */ - li r7, 0 - b L(continue1) - - .align 4 -L(return): - /* Update return values. */ - srdi r9, r11, 3 - subf r3, r9, r3 - b L(end) - - /* Handling byte by byte. */ - .align 4 -L(bytebybyte): - mr r8, r3 - addi r8, r8, -1 -L(loop1): - addi r8, r8, 1 - mr r3, r8 - mr r4, r30 - lbz r6, 0(r4) - cmpdi cr7, r6, 0 - beq cr7, L(updater3) -L(loop): - lbz r5, 0(r3) - cmpdi cr7, r5, 0 - beq cr7, L(retnull) - cmpld cr7, r6, r5 - bne cr7, L(loop1) - addi r3, r3, 1 - addi r4, r4, 1 - lbz r6, 0(r4) - cmpdi cr7, r6, 0 - beq cr7, L(updater3) - b L(loop) - - /* Handling return values. */ - .align 4 -L(updater3): - subf r3, r31, r3 /* Reduce len of r4 from r3. */ - b L(end) - - .align 4 -L(ret_r3): - mr r3, r29 /* Return r3. */ - b L(end) - - .align 4 -L(retnull): - li r3, 0 /* Return NULL. */ - b L(end) - - .align 4 -L(default): - mr r4, r30 - bl __strstr_ppc - nop - - .align 4 -L(end): - addi r1, r1, FRAMESIZE /* Restore stack pointer. */ - cfi_adjust_cfa_offset(-FRAMESIZE) - ld r0, 16(r1) /* Restore the saved link register. */ - ld r28, -32(r1) /* Restore callers save register r28. */ - ld r29, -24(r1) /* Restore callers save register r29. */ - ld r30, -16(r1) /* Restore callers save register r30. */ - ld r31, -8(r1) /* Restore callers save register r31. */ - mtlr r0 /* Branch to link register. */ - blr -END (STRSTR) -libc_hidden_builtin_def (strstr)