From patchwork Fri Aug 31 20:42:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 145690 Delivered-To: patch@linaro.org Received: by 2002:a2e:1648:0:0:0:0:0 with SMTP id 8-v6csp1210174ljw; Fri, 31 Aug 2018 13:43:00 -0700 (PDT) X-Google-Smtp-Source: ANB0Vdb74XgS/dGU2WX1dHErmXItyn7Jt+hC27KS1XZhp8QEbRwrOV6UouGALA3RM1k5T0fL0Gcx X-Received: by 2002:a65:6398:: with SMTP id h24-v6mr16177243pgv.245.1535748180570; Fri, 31 Aug 2018 13:43:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1535748180; cv=none; d=google.com; s=arc-20160816; b=x6Sqk/yU3e8ThWUsI3u+CPyUx7hnAhip0fDb0dJeaYOvhsgZcxJ6u4ZCkg4TJ41Zov NkHhGwei2tR/I7gPeqexVwYKR1kg3cmD4nkuRW3vRDUbNGGdAmyuSDzysrRzmi7nJcLp s2tw04Ua+qEeo2T1D/ZD+wKRX7YtmWQSs1XKMTUif0vqZn8xHU/onQOqPEJqC82fXr/d 0e3GR8SBYW9YWdE0J4Yae7nc++r3LvhGcZKedl8H6hoLsdUxlnB49HMkm8HkWrTmGo+w P+T/b7i12IJAv7LQOJt1lrWmux3/XZ0f3xr2KbTIxbZOVXv8bhkMFujVY25UQ7V6kmBv +tXQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:dkim-signature:delivered-to:sender:list-help:list-post :list-archive:list-subscribe:list-unsubscribe:list-id:precedence :mailing-list:dkim-signature:domainkey-signature :arc-authentication-results; bh=YKdf/EQ/fH3nXwczt1z9+eTguan8aCkhWgUFX5uvuY8=; b=seSvXmgozo1DRubtsXsd/jCwpo32OGQ/VbUa12ZEiOqM/EA0k+rHOT0S+KolaQ8PSO gfp2kKUKdrc/4c/TGVk4E3E07d6SS26G/L/BmUvhAJt3+IgA9tkU1HqxwDz7rMBl3J6X khukvc1WAjK61n+FdFDFJOcNdaUpQWkc7xwWOkDS7jwhqos5y3MsLBzSTZqrOpkMuyZI ab5e+C1oUZKqZtydbYqsLKxifhGQW0dyVkp7qL/RXi5HHnJROsDtaJMwFQCxSdlxFHC4 OsgQ1qzsjQ6wsQtw/DaWqOAdu1iLJCAjIMhCCO/CTqIFf/1qilOmJRxHaxWA2GTd8d/3 kZVA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=C5B4NSno; dkim=pass header.i=@linaro.org header.s=google header.b=OKwJjl65; spf=pass (google.com: domain of libc-alpha-return-95622-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom="libc-alpha-return-95622-patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id g15-v6si10493254plo.284.2018.08.31.13.43.00 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 31 Aug 2018 13:43:00 -0700 (PDT) Received-SPF: pass (google.com: domain of libc-alpha-return-95622-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=C5B4NSno; dkim=pass header.i=@linaro.org header.s=google header.b=OKwJjl65; spf=pass (google.com: domain of libc-alpha-return-95622-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom="libc-alpha-return-95622-patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:subject:date:message-id:mime-version :content-type:content-transfer-encoding; q=dns; s=default; b=Oat RTz6DVpxnDGC77x/NUTWkz/kVwCSiTblfHObkQjzJQDujBzws1KNJp2aH5Bysi8x /3D0cbewLznodiXI+K8KEDWUNQkFfgi2nBh07HwDAaPkZFYYL/5Mj6a1uzq9piDn A1YF8D8Tb8aRSiBETj+3WNM7+2Zb076U4zOsXZL8= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:subject:date:message-id:mime-version :content-type:content-transfer-encoding; s=default; bh=qoGuWd/RJ 4il6r3YZZJuw02/fJc=; b=C5B4NSnoMfe4pCiTrFJXoPSwgpE2a5RCsA0YQmRe/ csF8YbVtJvzBlXKJJ1Y5vRIfjXrK8fiGpYIEI2PLISODPoWg1L86/0ch4tKSrHS8 Hi9rJqhZFKFNx5EaDWcyXzsD+LNnOOU3PjRGl5mKTOcMUndY7bEj5AgtGTZ1A6Bk 9Y= Received: (qmail 53801 invoked by alias); 31 Aug 2018 20:42:50 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 53783 invoked by uid 89); 31 Aug 2018 20:42:49 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-9.0 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SEM_URI, SEM_URIRED, SPF_PASS autolearn=ham version=3.3.2 spammy=2950, attack, refs, ranging X-HELO: mail-qt0-f180.google.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:mime-version :content-transfer-encoding; bh=YKdf/EQ/fH3nXwczt1z9+eTguan8aCkhWgUFX5uvuY8=; b=OKwJjl651Mi+h8/K9Ezlv2RJuxnqqC/7jKlKvTAe1DMePNRUQzU8bozt/5bdFa+/X7 mVCCUFSTK8YycjCpGJk7hQZX7CWB/AousIl1u7s+Jwxt0LTnLZtytoHwBt5i5fyOue6I 4Vv7ICGhBtU8ERl5yBFHNcjQc99GPtJPskZHI= Return-Path: From: Adhemerval Zanella To: libc-alpha@sourceware.org Subject: [PATCH v2 0/7] Refactor qsort implementation Date: Fri, 31 Aug 2018 17:42:31 -0300 Message-Id: <20180831204238.10626-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 This is a version 2 of my previous refactor patch for qsort [1]. Main changes from previous versions are: - Drop Mersenne Twister implementation for support_random. Now that arc4random is being proposed by Florian, we can use it instead. I am just using a wrapper to POSIX rand48 interfaces to share code between tests and benchmark. - Tune the optimization generic swap implementation to avoid memcpy mempcpy calls on some platforms. -- This patchset refactor the qsort implementation to fix some long standing issues, add more tests coverage, and a default benchmark. The main changes are: - Use quicksort as default to avoid potentially calling malloc. - Convert the qsort tests to libsupport and add qsort_r tests. - Add a qsort benchmark. The reason to remove mergesort usage on qsort is to avoid malloc usage and the logic to decide whether to switch to quicksort (which requires issue syscalls to get total system physical memory). It also simplifies the implementation and make it fully AS-Safe and AC-Safe (since quicksort implementation uses O(1) space allocated on stack due the total number of possible elements constraint). I have checked smoothsort algorithm as a possible alternative implementation that also have O(1) space usage, however it is faster only for already sorted input being slower for random, mostly sorted or repeated inputs. Another possible alternative is a simpler heapsort, which is also slower. The quicksort have the disvantage of O(n^2) as worse case, however current glibc implementation seems to have handle the pivot selection in suitable way. Ondřej Bílka has raised questioning regarding it could be DoS attack [2], and although I do not dismiss this potentially issue (although unlikely due the median pivot selection) I think it would be worth to discuss it on another thread along with possible alternatives. Comparing current GLIBC performance using the proposed benchmark in this patchset (which contains the BZ#21719 [2] issue) against the resulting implementation I see for x86_64 (i7-4790K, gcc 7.2.1): Results for member size 4 MostlySorted nmemb | base | patched | diff 32 | 1791 | 1962 | 9.55 4096 | 530267 | 733319 | 38.29 32768 | 5319819 | 6888903 | 29.50 524288 | 105147020 | 132290195 | 25.81 Repeated nmemb | base | patched | diff 32 | 1988 | 2101 | 5.68 4096 | 898057 | 917065 | 2.12 32768 | 8890765 | 8824654 | -0.74 524288 | 178316071 | 174457327 | -2.16 Sorted nmemb | base | patched | diff 32 | 1511 | 1323 | -12.44 4096 | 277733 | 334323 | 20.38 32768 | 2634360 | 3342922 | 26.90 524288 | 49793076 | 67018557 | 34.59 Unsorted nmemb | base | patched | diff 32 | 2070 | 2267 | 9.52 4096 | 941830 | 967586 | 2.73 32768 | 9492371 | 9561046 | 0.72 524288 | 191355021 | 188618808 | -1.43 Results for member size 8 MostlySorted nmemb | base | patched | diff 32 | 1763 | 1895 | 7.49 4096 | 510794 | 737046 | 44.29 32768 | 5075103 | 7503068 | 47.84 524288 | 103741137 | 145792996 | 40.54 Repeated nmemb | base | patched | diff 32 | 1908 | 2392 | 25.37 4096 | 904798 | 941746 | 4.08 32768 | 8954918 | 8846305 | -1.21 524288 | 179825532 | 172015349 | -4.34 Sorted nmemb | base | patched | diff 32 | 1316 | 1154 | -12.31 4096 | 261069 | 302059 | 15.70 32768 | 2449581 | 2957746 | 20.74 524288 | 47772793 | 59248485 | 24.02 Unsorted nmemb | base | patched | diff 32 | 2011 | 2014 | 0.15 4096 | 953723 | 1014879 | 6.41 32768 | 9539278 | 9719416 | 1.89 524288 | 193690362 | 188728948 | -2.56 Results for member size 32 MostlySorted nmemb | base | patched | diff 32 | 4686 | 3054 | -34.83 4096 | 1688822 | 1093125 | -35.27 32768 | 17633569 | 10672711 | -39.48 524288 | 375170630 | 187567373 | -50.00 Repeated nmemb | base | patched | diff 32 | 5138 | 4084 | -20.51 4096 | 2187509 | 1334757 | -38.98 32768 | 22271793 | 12764998 | -42.69 524288 | 468956765 | 251722513 | -46.32 Sorted nmemb | base | patched | diff 32 | 3581 | 1232 | -65.60 4096 | 938145 | 301690 | -67.84 32768 | 9553669 | 2967993 | -68.93 524288 | 194239124 | 64345690 | -66.87 Unsorted nmemb | base | patched | diff 32 | 5235 | 3927 | -24.99 4096 | 2227377 | 1483229 | -33.41 32768 | 22875769 | 13819034 | -39.59 524288 | 484156353 | 272021523 | -43.82 So it is performance decrease ranging from 15% to 45%, mainly for sorted kind inputs, for array members of 4 and 8 (from my analysis to create the benchtest seems to most used kind of input) which I think it is acceptable considering the advantages of a qsort with constant extra memory requirements (around 1336 bytes for x86_64 and generic type size). I also pushed this patchset in a personal branch [3]. [1] https://sourceware.org/ml/libc-alpha/2018-01/msg00626.html [2] https://sourceware.org/ml/libc-alpha/2018-02/msg00358.html [3] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/qsort-refactor Adhemerval Zanella (7): stdlib: Adjust tst-qsort{2} to libsupport support: Add pseudo-random number generator interface stdlib: Add more qsort{_r} coverage benchtests: Add bench-qsort stdlib: Remove use of mergesort on qsort stdlib: Optimization qsort{_r} swap implementation stdlib: Remove undefined behavior from qsort implementation benchtests/Makefile | 2 +- benchtests/bench-qsort.c | 343 +++++++++++++++++++++++++++++++++++++++ manual/argp.texi | 2 +- manual/locale.texi | 3 +- manual/search.texi | 7 +- stdlib/Makefile | 5 +- stdlib/msort.c | 310 ----------------------------------- stdlib/qsort.c | 268 ++++++++---------------------- stdlib/qsort_common.c | 225 +++++++++++++++++++++++++ stdlib/tst-qsort.c | 45 +++-- stdlib/tst-qsort2.c | 44 +++-- stdlib/tst-qsort3.c | 235 +++++++++++++++++++++++++++ support/Makefile | 3 +- support/support_random.c | 90 ++++++++++ support/support_random.h | 59 +++++++ 15 files changed, 1071 insertions(+), 570 deletions(-) create mode 100644 benchtests/bench-qsort.c delete mode 100644 stdlib/msort.c create mode 100644 stdlib/qsort_common.c create mode 100644 stdlib/tst-qsort3.c create mode 100644 support/support_random.c create mode 100644 support/support_random.h -- 2.17.1