From patchwork Tue Oct 31 20:09:24 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 739599
From: Adhemerval Zanella
To: libc-alpha@sourceware.org, Noah Goldstein, "H . J . Lu", Bruce Merry
Subject: [PATCH 3/4] x86: Do not prefer ERMS for memset on Zen3+
Date: Tue, 31 Oct 2023 17:09:24 -0300
Message-Id: <20231031200925.3297456-4-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231031200925.3297456-1-adhemerval.zanella@linaro.org>
References: <20231031200925.3297456-1-adhemerval.zanella@linaro.org>

The REP STOSB usage on memset does not show any performance gain on
Zen3/Zen4 cores compared to the vectorized loops.

Checked on x86_64-linux-gnu.
---
 sysdeps/x86/dl-cacheinfo.h | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
index 51e5ba200f..99ba0f776a 100644
--- a/sysdeps/x86/dl-cacheinfo.h
+++ b/sysdeps/x86/dl-cacheinfo.h
@@ -1018,11 +1018,17 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
   if (tunable_size > minimum_rep_movsb_threshold)
     rep_movsb_threshold = tunable_size;
 
-  /* NB: The default value of the x86_rep_stosb_threshold tunable is the
-     same as the default value of __x86_rep_stosb_threshold and the
-     minimum value is fixed.  */
-  rep_stosb_threshold = TUNABLE_GET (x86_rep_stosb_threshold,
-                                     long int, NULL);
+  /* For AMD Zen3+ architecture, the performance of the vectorized loop
+     is slightly better than ERMS.  */
+  if (cpu_features->basic.kind == arch_kind_amd)
+    rep_stosb_threshold = SIZE_MAX;
+
+  if (TUNABLE_IS_INITIALIZED (x86_rep_stosb_threshold))
+    /* NB: The default value of the x86_rep_stosb_threshold tunable is the
+       same as the default value of __x86_rep_stosb_threshold and the
+       minimum value is fixed.  */
+    rep_stosb_threshold = TUNABLE_GET (x86_rep_stosb_threshold,
+                                       long int, NULL);
 
   TUNABLE_SET_WITH_BOUNDS (x86_data_cache_size, data, 0, SIZE_MAX);
   TUNABLE_SET_WITH_BOUNDS (x86_shared_cache_size, shared, 0, SIZE_MAX);
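
As a reading aid, the precedence this hunk sets up (built-in default,
then the AMD override, then an explicitly set tunable) can be sketched
in standalone C. This is only an illustration, not the glibc code:
select_rep_stosb_threshold, enum cpu_kind and its enumerators are
hypothetical names, the default of 2048 is an assumption about the
initial value of rep_stosb_threshold, and the tunable is modelled as a
plain flag plus value rather than through the TUNABLE_* macros.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cpu_features->basic.kind.  */
enum cpu_kind { KIND_INTEL, KIND_AMD, KIND_OTHER };

/* Assumed built-in default; the real value lives in dl-cacheinfo.h.  */
#define DEFAULT_REP_STOSB_THRESHOLD 2048

/* Return the size above which memset would switch to REP STOSB.
   SIZE_MAX effectively disables the REP STOSB path.  */
size_t
select_rep_stosb_threshold (enum cpu_kind kind, bool tunable_set,
                            size_t tunable_value)
{
  size_t threshold = DEFAULT_REP_STOSB_THRESHOLD;

  /* On AMD, prefer the vectorized loop for every size by default.  */
  if (kind == KIND_AMD)
    threshold = SIZE_MAX;

  /* An explicitly set tunable overrides both defaults.  */
  if (tunable_set)
    threshold = tunable_value;

  return threshold;
}
```

Under these assumptions an AMD system with no tunable set never takes
the REP STOSB path, while a user who sets the tunable still gets the
requested threshold on any vendor.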