From patchwork Thu Jan 18 17:53:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 124995 Delivered-To: patch@linaro.org Received: by 10.46.64.27 with SMTP id n27csp228254lja; Thu, 18 Jan 2018 09:53:44 -0800 (PST) X-Google-Smtp-Source: ACJfBoswmqXcan7Z50zwD5y9g3iCH8SsOdlpySwh6Ya+016gxPDrGp35nn9gcObYpZhtioALP6HD X-Received: by 10.99.146.3 with SMTP id o3mr2071627pgd.309.1516298023972; Thu, 18 Jan 2018 09:53:43 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1516298023; cv=none; d=google.com; s=arc-20160816; b=sYt48mjHX3p8SnMM6TXsAt/yNHHCwQDPEI7yHwAj3m99hwPxgBzbroDENzhhHmcecF KYhGFuZNJ1mytEJ6Iw3hp1RRQxRpoNjx1g6xOnuUUWrMtfR9gYiOS5cRoUcwQik7RvC9 29Ja51wyi0Nq2lG4KskoXHqXtX20Pm4egrgZrM2DOVOjs6OQOc0FGN4f6olvmQdS1rjd fu3KOQeA4vlsNf+WmwiFViz01BVYU09dl75vkltOEvjPymoQ1A6NshXPiCfYsVS2lQsl iQuuycWzydluUWWMczQ+zSOkRzFGXKhrt4btP0xlPeGRkiNXQXPAPnYo2k73v8donN1o kwUQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=message-id:date:subject:to:from:delivered-to:sender:list-help :list-post:list-archive:list-subscribe:list-unsubscribe:list-id :precedence:mailing-list:dkim-signature:domainkey-signature :arc-authentication-results; bh=NrJU6vSO2KZyJsRVGWIf1W03RVaGScLJtTQ+W9flgOs=; b=pPzcnAgAQLTM4yecekSiMiZ8aHzIzIwbT+pgAbxTKdNxiUuS+CFOW8GIM9FvIcwvUV OAM61kXXESs94BEYSpbk/S1ffL34LRiabB//4Q/zuWaeEp7iz9OFFFAh9qsPZG5UrOR2 BGP+xVNkWcwTx1lXAtGcAZWXVx8TiWcwF7TA/0uJGEk9rgmpunGzLG17YyAdaHvVy/J1 Hk+mLr3Yo0NwFuNOLXNcZz2DIuziQZQdMoT7m/BKCqFgAEjePzyqUcx+a0H7g0vWlGpv Ukmk0OnbZXIli86xAASXMzmM9fC2QXYL/kE9r/XztdLndF4x6jvjUmviWN1ZzBu9NzdT oPRw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=CtcKnxaZ; spf=pass (google.com: domain of libc-alpha-return-89322-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=libc-alpha-return-89322-patch=linaro.org@sourceware.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id b2si715415pgr.434.2018.01.18.09.53.43 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 18 Jan 2018 09:53:43 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-return-89322-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=CtcKnxaZ; spf=pass (google.com: domain of libc-alpha-return-89322-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=libc-alpha-return-89322-patch=linaro.org@sourceware.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:subject:date:message-id; q=dns; s= default; b=IBP154GvzyIT92cl9RCs+9vKWySjJ+2ZIU7hGlEUZpb8fvrrcUwsE f4kM00C6c8Glf/0t1f/PQ/QzcP+UWB6ucLrPYiEn4/bYSolAPeoWnzmdCGJBchJp OiTc7o5ULbsM+bM9PKOhCesQ8RBxRR4zPK92U/6NI//zMhAFXSzbUY= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:subject:date:message-id; s=default; bh=cMQLJu/rtY49CX7R4u77Tj/KXEc=; b=CtcKnxaZIrJkIo2pEzMb8lwKf/8g eFnTueaI+Bg8yRwscTDXPGBALcX3VVHqS2xQdnAuqlXn7qf6y0hNoruQnEtMY6af ZzghnI20ahtUJLmMYBpG3Y/pr+kx2lvleugPg38eJzCgCs5v29hcIJS7Ayy9oKqV ZmKSYCLdeA7AIOg= Received: (qmail 78638 invoked by alias); 18 Jan 2018 17:53:34 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 78614 invoked by uid 89); 18 Jan 2018 17:53:33 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-12.4 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mail-qt0-f172.google.com X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id; bh=NrJU6vSO2KZyJsRVGWIf1W03RVaGScLJtTQ+W9flgOs=; b=JDjo00B2j4D7wE05Ix3t2ByhjZVTggMLab+hQkkeNP5PJF9MSMktQ1r3AjXf90+vke RrwWzbQETdQqkFsPuIm67lp/TEDrFdCkfWquxNSGb0m9bUrviY283H94P6HozBKw3IdV nSVc4aQLmsXO8Y6u2ZN+kjy8yfncCbYUHx8r7mJ9/7J702yHdF8vn1+YzG+zEJuMwOL3 37qe9VBpTbsZ4GZephIYgv2muS7WSTYPwLxuNuMdo+W4NyQAiZpGXu/r57bSY/o32QtE 0TQkb65c5IBBK24pKtxhYTliqwyNbEkcg4lzjEwcCEIuDnK1XFQQzpFYxBb1Il2wF3Kz /20Q== X-Gm-Message-State: AKwxytcgwkgGBTHy4BwjlPK5FL3gS+fjU23vxB3OBt2rPi+YDLmzGI82 Yq71JqcEhytpe+6uY6bFcLK68UxYW+4= X-Received: by 10.55.66.76 with SMTP id p73mr18831942qka.300.1516298009237; Thu, 18 Jan 2018 09:53:29 -0800 (PST) From: Adhemerval Zanella To: libc-alpha@sourceware.org Subject: [PATCH 0/7] Refactor qsort implementation Date: Thu, 18 Jan 2018 15:53:15 -0200 Message-Id: <1516298002-4618-1-git-send-email-adhemerval.zanella@linaro.org> This patchset refactor the qsort implementation to fix some long standing issues, add more tests coverage, and a default benchmark. The main changes are: - Use quicksort as default to avoid potentially calling malloc. - Convert the qsort tests to libsupport and add qsort_r tests. - Add a qsort benchmark. The reason to remove mergesort usage on qsort is to avoid malloc usage and the logici to decide whether to switch to quicksort (which requires issue syscalls to get total system physical memory). It also simplifies the implementation and make it fully AS-Safe and AC-Safe (since quicksort implementation uses O(1) space allocated on stack due the total number of possible elements constraint). I have checked smoothsort algorithm as a possible alternative implementation that also have O(1) space usage, however it is faster only for already sorted input being slower for random, mostly sorted or repeated inputs. For reference I have pushed the implementation I measured against on personal branch [1]. The quicksort have the disvantage of O(n^2) as worse case, however current glibc implementation seems to have handle the pivot selection in suitable way. Comparing current GLIBC performance using the proposed benchmark in this patchset (which contains the BZ#21719 [2] issue) against the resulting implementation I see for x86_64 (i7-4790K, gcc 7.2.1): Results for member size 4 Sorted nmemb | base | patched | diff 32| 1488 | 1257 | -15.52 4096| 262961 | 302235 | 14.94 32768| 2481627 | 3020728 | 21.72 524288| 47154892 | 59306436 | 25.77 Repeated nmemb | base | patched | diff 32| 1955 | 1873 | -4.19 4096| 911947 | 904864 | -0.78 32768| 8775122 | 8542801 | -2.65 524288| 176944163 | 168426795 | -4.81 MostlySorted nmemb | base | patched | diff 32| 1699 | 1776 | 4.53 4096| 495316 | 709937 | 43.33 32768| 5136835 | 6855890 | 33.47 524288| 102572259 | 129385161 | 26.14 Unsorted nmemb | base | patched | diff 32| 2055 | 1941 | -5.55 4096| 916862 | 969021 | 5.69 32768| 9380553 | 9462116 | 0.87 524288| 190338891 | 186560908 | -1.98 Results for member size 8 Sorted nmemb | base | patched | diff 32| 1431 | 1205 | -15.79 4096| 277474 | 325554 | 17.33 32768| 2740730 | 3264125 | 19.10 524288| 54565602 | 66107684 | 21.15 Repeated nmemb | base | patched | diff 32| 2201 | 2118 | -3.77 4096| 893247 | 979114 | 9.61 32768| 9284822 | 9028606 | -2.76 524288| 185279216 | 174903867 | -5.60 MostlySorted nmemb | base | patched | diff 32| 1852 | 2346 | 26.67 4096| 536032 | 759158 | 41.63 32768| 5654647 | 7810444 | 38.12 524288| 113271181 | 135900146 | 19.98 Unsorted nmemb | base | patched | diff 32| 5585 | 2301 | -58.80 4096| 987922 | 1014018 | 2.64 32768| 9685917 | 9888078 | 2.09 524288| 198097197 | 192479957 | -2.84 Results for member size 32 Sorted nmemb | base | patched | diff 32| 4098 | 1184 | -71.11 4096| 1119484 | 325865 | -70.89 32768| 11233415 | 3331750 | -70.34 524288| 236345467 | 69067176 | -70.78 Repeated nmemb | base | patched | diff 32| 5754 | 4813 | -16.35 4096| 2348098 | 1624137 | -30.83 32768| 24567198 | 15896739 | -35.29 524288| 524545398 | 316328778 | -39.69 MostlySorted nmemb | base | patched | diff 32| 5106 | 5332 | 4.43 4096| 1946236 | 1312703 | -32.55 32768| 20692983 | 12360726 | -40.27 524288| 448701099 | 231603294 | -48.38 Unsorted nmemb | base | patched | diff 32| 6116 | 6047 | -1.13 4096| 2508786 | 1695241 | -32.43 32768| 25171790 | 16430388 | -34.73 524288| 535393549 | 329496913 | -38.46 So it is performance decrease ranging from 15% to 45%, mainly for sorted kind inputs, for array members of 4 and 8 (from my analysis to create the benchtest seems to most used kind of input) which I think it is acceptable considering the advantages of a qsort with constant extra memory requirements (around 1336 bytes for x86_64 and generic type size). I also pushed this patchset in a personal branch [3]. [1] https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=shortlog;h=refs/heads/azanella/qsort-smooth [2] https://sourceware.org/bugzilla/show_bug.cgi?id=21719 [3] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/qsort-refactor Adhemerval Zanella (7): stdlib: Adjust tst-qsort{2} to libsupport support: Add Mersenne Twister pseudo-random number generator benchtests: Add bench-qsort stdlib: Add more qsort{_r} coverage stdlib: Remove use of mergesort on qsort stdlib: Optimization qsort{_r} swap implementation stdlib: Remove undefined behavior from qsort implementation benchtests/Makefile | 2 +- benchtests/bench-qsort.c | 352 +++++++++++++++++++++++++++++++++++++++++++ manual/argp.texi | 2 +- manual/locale.texi | 3 +- manual/search.texi | 7 +- stdlib/Makefile | 11 +- stdlib/msort.c | 310 ------------------------------------- stdlib/qsort.c | 262 ++++++++------------------------ stdlib/qsort_common.c | 225 +++++++++++++++++++++++++++ stdlib/tst-qsort.c | 45 +++--- stdlib/tst-qsort2.c | 44 +++--- stdlib/tst-qsort3.c | 231 ++++++++++++++++++++++++++++ support/Makefile | 2 + support/support_random.c | 219 +++++++++++++++++++++++++++ support/support_random.h | 109 ++++++++++++++ support/tst-support_random.c | 87 +++++++++++ 16 files changed, 1339 insertions(+), 572 deletions(-) create mode 100644 benchtests/bench-qsort.c delete mode 100644 stdlib/msort.c create mode 100644 stdlib/qsort_common.c create mode 100644 stdlib/tst-qsort3.c create mode 100644 support/support_random.c create mode 100644 support/support_random.h create mode 100644 support/tst-support_random.c -- 2.7.4