From patchwork Fri Aug 31 20:42:38 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 145697 Delivered-To: patch@linaro.org Received: by 2002:a2e:1648:0:0:0:0:0 with SMTP id 8-v6csp1211177ljw; Fri, 31 Aug 2018 13:44:13 -0700 (PDT) X-Google-Smtp-Source: ANB0VdbGTbbj69CHh97twfGY2F5pZ8S18aVgFiK8nBsP7oGs/TG9+quM8pbRQAOlLMP5HQJfDITC X-Received: by 2002:a63:df4e:: with SMTP id h14-v6mr16076439pgj.300.1535748252942; Fri, 31 Aug 2018 13:44:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1535748252; cv=none; d=google.com; s=arc-20160816; b=amz26w3ghLU4RU0U99hBzf0Fc+0Qmnn/9y0yNGDmQVAoPU/Teowruri3cfnDlGM6g6 yAKGoBgJrafGHsg4hnKaKzuRDXy36LiBGdKJUw1npim4RE8ljFghWzG498PGIN/hX6+j 6b/C2HXnjyuGEnqptsjIWwvXN4Y9oAj/7HuLsfxX+ttAOPc3LUdEHLICQosMlVF1iNon ceojJt4sFebwbwE+D1jwlAitJV4LG0TZVG9kOqhqKC5YWteAJ9ViOXLU/75Wy2r8hMAq oQEySN7187+76ZoU6txpZOWOrZu0pejKgMOLo2sndKM+1KHn0ZEu0HeREXWo+Of7Lu/8 wAkA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=references:in-reply-to:message-id:date:subject:to:from :dkim-signature:delivered-to:sender:list-help:list-post:list-archive :list-subscribe:list-unsubscribe:list-id:precedence:mailing-list :dkim-signature:domainkey-signature:arc-authentication-results; bh=gXrM1sfLY361hdTN6kTJzer5iMeBHUpFmUS075NDvLg=; b=oBAsoMHhiNuC7Y9qtM4LtoCNJd+7aJJYeDIG0y5bqyUcMpe3gRnbDC2Gm+/3/G0oIz Eaxkq1sekC4VcCKrAgKZBB731EDSX7Hj6MJoLVNUdepN+xoZ1dE75Hi0yf9rfEEtLoK+ XHfVzeECHBga1eKlY/yBzHb+94hD0x2jDQurMzgzxmdbwpoLV8ag6o29VzQ4fCueUVPQ S/U+U36uV3VoOfHE41ubCFYOCsLHlm7G9/YBDjIJCK1b9ueTMgkP/YHY3Xiwg5xdCjI4 U4WhRvowlsdVq4s0ZRTvbQ6i5rENBbMx6iTUPunrWsmS1epotQDMHCPZZRs0nt0cpuTf lFFQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=pHuyDMEn; dkim=pass header.i=@linaro.org header.s=google header.b=caOjF9Hx; spf=pass (google.com: domain of libc-alpha-return-95629-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom="libc-alpha-return-95629-patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id x7-v6si8474408pgi.465.2018.08.31.13.44.12 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 31 Aug 2018 13:44:12 -0700 (PDT) Received-SPF: pass (google.com: domain of libc-alpha-return-95629-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=pHuyDMEn; dkim=pass header.i=@linaro.org header.s=google header.b=caOjF9Hx; spf=pass (google.com: domain of libc-alpha-return-95629-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom="libc-alpha-return-95629-patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:subject:date:message-id:in-reply-to :references; q=dns; s=default; b=K7iG+ZCvAFBV7q3CgHJBwkO54cV1fqM JYaxYL9Tctjeb2hsoSTymBdf4bQIssVG4XVFhY/CuoDYukKM968BlREsGH4zzPfl y7WTWLHMpCSB6DuSugV6pXXDFuJzDJaVWFmzNtt48k8r1CDvlypEwdyJadsDCPTI dGLf297R63RM= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:subject:date:message-id:in-reply-to :references; s=default; bh=liYck7On5g8EFExAaZqNciwb0Xs=; b=pHuyD MEnU7EFCmDH5i4rex3/mfgUQiizNZMJaJBOSZEJqH/Bwz9kJlB4ZbrlbP28XGlmt TA1aeBAeLss2jsA8LfcaGeNtU01xwrHP1IPE/yO4bpC0poUYLOXpD6XEV5zUowYK vpIZ4gS4X3f3Conu4P32cs5zcOOMt4YA6IrO1Y= Received: (qmail 55450 invoked by alias); 31 Aug 2018 20:43:02 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 55334 invoked by uid 89); 31 Aug 2018 20:43:01 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-22.7 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_SHORT, SEM_URI, SEM_URIRED, SPF_PASS autolearn=ham version=3.3.2 spammy=mid, Engineering, famous, sorts X-HELO: mail-qk1-f179.google.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=gXrM1sfLY361hdTN6kTJzer5iMeBHUpFmUS075NDvLg=; b=caOjF9HxyTzAXwLRvQHMhHZLoj5xSC0HA/mAkxefGAUPmirQCfxr+SU3MyF0gtIkJ3 DlE8efsphgbIIywf92nLtX5jBWAJ4+E3c15gj+35hY4OSxNJzIlLJrqzoWLijQHOobZN S82R6mOdFWECwakSPMfFz8aw0f6h371gaVyBo= Return-Path: From: Adhemerval Zanella To: libc-alpha@sourceware.org Subject: [PATCH v2 7/7] stdlib: Remove undefined behavior from qsort implementation Date: Fri, 31 Aug 2018 17:42:38 -0300 Message-Id: <20180831204238.10626-8-adhemerval.zanella@linaro.org> In-Reply-To: <20180831204238.10626-1-adhemerval.zanella@linaro.org> References: <20180831204238.10626-1-adhemerval.zanella@linaro.org> Internally qsort is implemented on top of __qsort_r by casting the function pointer to another type (__compar_fn_t tp __compar_d_fn_t) and passing a NULL extra argument. Casting function pointer with different types for subsequent function call is undefined-behaviour (C11 6.3.2.3): "[8] A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the referenced type, the behavior is undefined." Also 'compatible' in this case also does not apply according to 6.7.6.3 Function declarators (including prototypes): "[15] For two function types to be compatible, both shall specify compatible return types. (146) Moreover, the parameter type lists, if both are present, shall agree in the number of parameters and in use of the ellipsis terminator; corresponding parameters shall have compatible types. [...]" Although this works on all architectures glibc supports (mostly because it casts function pointers with similar calling conventions), I think it is worth to avoid it. This patch fixes it by adding a common implementation (qsort_common.c) which redefines the function based on the required types. For x86_64 (i7-4790K, gcc 7.1.1) shows a slight better performance for qsort: Results for member size 4 MostlySorted nmemb | base | patched | diff 32 | 2012 | 1962 | -2.49 4096 | 789329 | 733319 | -7.10 32768 | 7871853 | 6888903 | -12.49 524288 | 151037732 | 132290195 | -12.41 Repeated nmemb | base | patched | diff 32 | 2017 | 2101 | 4.16 4096 | 948784 | 917065 | -3.34 32768 | 9242215 | 8824654 | -4.52 524288 | 182508235 | 174457327 | -4.41 Sorted nmemb | base | patched | diff 32 | 1332 | 1323 | -0.68 4096 | 351388 | 334323 | -4.86 32768 | 3556735 | 3342922 | -6.01 524288 | 67278267 | 67018557 | -0.39 Unsorted nmemb | base | patched | diff 32 | 2009 | 2267 | 12.84 4096 | 1039794 | 967586 | -6.94 32768 | 10116859 | 9561046 | -5.49 524288 | 198713626 | 188618808 | -5.08 Results for member size 8 MostlySorted nmemb | base | patched | diff 32 | 1965 | 1895 | -3.56 4096 | 762125 | 737046 | -3.29 32768 | 7017240 | 7503068 | 6.92 524288 | 129564606 | 145792996 | 12.53 Repeated nmemb | base | patched | diff 32 | 1998 | 2392 | 19.72 4096 | 970507 | 941746 | -2.96 32768 | 9173248 | 8846305 | -3.56 524288 | 179637006 | 172015349 | -4.24 Sorted nmemb | base | patched | diff 32 | 1205 | 1154 | -4.23 4096 | 308174 | 302059 | -1.98 32768 | 3018157 | 2957746 | -2.00 524288 | 59608087 | 59248485 | -0.60 Unsorted nmemb | base | patched | diff 32 | 2575 | 2014 | -21.79 4096 | 1040231 | 1014879 | -2.44 32768 | 10160881 | 9719416 | -4.34 524288 | 197305501 | 188728948 | -4.35 Results for member size 32 MostlySorted nmemb | base | patched | diff 32 | 4474 | 3054 | -31.74 4096 | 1185956 | 1093125 | -7.83 32768 | 10710788 | 10672711 | -0.36 524288 | 196553161 | 187567373 | -4.57 Repeated nmemb | base | patched | diff 32 | 4670 | 4084 | -12.55 4096 | 1351979 | 1334757 | -1.27 32768 | 12917614 | 12764998 | -1.18 524288 | 258020738 | 251722513 | -2.44 Sorted nmemb | base | patched | diff 32 | 1221 | 1232 | 0.90 4096 | 308146 | 301690 | -2.10 32768 | 3120670 | 2967993 | -4.89 524288 | 64995508 | 64345690 | -1.00 Unsorted nmemb | base | patched | diff 32 | 3975 | 3927 | -1.21 4096 | 1470465 | 1483229 | 0.87 32768 | 14109425 | 13819034 | -2.06 524288 | 276687595 | 272021523 | -1.69 Checked on x86_64-linux-gnu. * stdlib/qsort.c: Move common code to stdlib/qsort_common.c and parametrize the function definition based wether to use the '_r' variant. * stdlib/qsort_common.c: New file. --- stdlib/qsort.c | 208 ++------------------------------------ stdlib/qsort_common.c | 225 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 233 insertions(+), 200 deletions(-) create mode 100644 stdlib/qsort_common.c -- 2.17.1 diff --git a/stdlib/qsort.c b/stdlib/qsort.c index c3fb0e862f..10b805930a 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -16,17 +16,13 @@ License along with the GNU C Library; if not, see . */ -/* If you consider tuning this algorithm, you should consult first: - Engineering a sort function; Jon Bentley and M. Douglas McIlroy; - Software - Practice and Experience; Vol. 23 (11), 1249-1265, 1993. */ - #include #include #include #include -/* Swap SIZE bytes between addresses A and B. Helper to generic types - are provided as an optimization. */ +/* Swap SIZE bytes between addresses A and B. These helpers are provided + along the generic one as an optimization. */ typedef void (*swap_t)(void *, void *, size_t); @@ -104,202 +100,14 @@ typedef struct #define POP(low, high) ((void) (--top, (low = top->lo), (high = top->hi))) #define STACK_NOT_EMPTY (stack < top) - -/* Order size using quicksort. This implementation incorporates - four optimizations discussed in Sedgewick: - - 1. Non-recursive, using an explicit stack of pointer that store the - next array partition to sort. To save time, this maximum amount - of space required to store an array of SIZE_MAX is allocated on the - stack. Assuming a 32-bit (64 bit) integer for size_t, this needs - only 32 * sizeof(stack_node) == 256 bytes (for 64 bit: 1024 bytes). - Pretty cheap, actually. - - 2. Chose the pivot element using a median-of-three decision tree. - This reduces the probability of selecting a bad pivot value and - eliminates certain extraneous comparisons. - - 3. Only quicksorts TOTAL_ELEMS / MAX_THRESH partitions, leaving - insertion sort to order the MAX_THRESH items within each partition. - This is a big win, since insertion sort is faster for small, mostly - sorted array segments. - - 4. The larger of the two sub-partitions is always pushed onto the - stack first, with the algorithm then concentrating on the - smaller partition. This *guarantees* no more than log (total_elems) - stack size is needed (actually O(1) in this case)! */ - -void -__qsort_r (void *const pbase, size_t total_elems, size_t size, - __compar_d_fn_t cmp, void *arg) -{ - char *base_ptr = (char *) pbase; - - const size_t max_thresh = MAX_THRESH * size; - - if (total_elems == 0) - /* Avoid lossage with unsigned arithmetic below. */ - return; - - swap_t swap = select_swap_func (pbase, size); - - if (total_elems > MAX_THRESH) - { - char *lo = base_ptr; - char *hi = &lo[size * (total_elems - 1)]; - stack_node stack[STACK_SIZE]; - stack_node *top = stack; - - PUSH (NULL, NULL); - - while (STACK_NOT_EMPTY) - { - char *left_ptr; - char *right_ptr; - - /* Select median value from among LO, MID, and HI. Rearrange - LO and HI so the three values are sorted. This lowers the - probability of picking a pathological pivot value and - skips a comparison for both the LEFT_PTR and RIGHT_PTR in - the while loops. */ - - char *mid = lo + size * ((hi - lo) / size >> 1); - - if ((*cmp) ((void *) mid, (void *) lo, arg) < 0) - swap (mid, lo, size); - if ((*cmp) ((void *) hi, (void *) mid, arg) < 0) - swap (mid, hi, size); - else - goto jump_over; - if ((*cmp) ((void *) mid, (void *) lo, arg) < 0) - swap (mid, lo, size); - jump_over:; - - left_ptr = lo + size; - right_ptr = hi - size; - - /* Here's the famous ``collapse the walls'' section of quicksort. - Gotta like those tight inner loops! They are the main reason - that this algorithm runs much faster than others. */ - do - { - while ((*cmp) ((void *) left_ptr, (void *) mid, arg) < 0) - left_ptr += size; - - while ((*cmp) ((void *) mid, (void *) right_ptr, arg) < 0) - right_ptr -= size; - - if (left_ptr < right_ptr) - { - swap (left_ptr, right_ptr, size); - if (mid == left_ptr) - mid = right_ptr; - else if (mid == right_ptr) - mid = left_ptr; - left_ptr += size; - right_ptr -= size; - } - else if (left_ptr == right_ptr) - { - left_ptr += size; - right_ptr -= size; - break; - } - } - while (left_ptr <= right_ptr); - - /* Set up pointers for next iteration. First determine whether - left and right partitions are below the threshold size. If so, - ignore one or both. Otherwise, push the larger partition's - bounds on the stack and continue sorting the smaller one. */ - - if ((size_t) (right_ptr - lo) <= max_thresh) - { - if ((size_t) (hi - left_ptr) <= max_thresh) - /* Ignore both small partitions. */ - POP (lo, hi); - else - /* Ignore small left partition. */ - lo = left_ptr; - } - else if ((size_t) (hi - left_ptr) <= max_thresh) - /* Ignore small right partition. */ - hi = right_ptr; - else if ((right_ptr - lo) > (hi - left_ptr)) - { - /* Push larger left partition indices. */ - PUSH (lo, right_ptr); - lo = left_ptr; - } - else - { - /* Push larger right partition indices. */ - PUSH (left_ptr, hi); - hi = right_ptr; - } - } - } - - /* Once the BASE_PTR array is partially sorted by quicksort the rest - is completely sorted using insertion sort, since this is efficient - for partitions below MAX_THRESH size. BASE_PTR points to the beginning - of the array to sort, and END_PTR points at the very last element in - the array (*not* one beyond it!). */ - -#define min(x, y) ((x) < (y) ? (x) : (y)) - - { - char *const end_ptr = &base_ptr[size * (total_elems - 1)]; - char *tmp_ptr = base_ptr; - char *thresh = min(end_ptr, base_ptr + max_thresh); - char *run_ptr; - - /* Find smallest element in first threshold and place it at the - array's beginning. This is the smallest array element, - and the operation speeds up insertion sort's inner loop. */ - - for (run_ptr = tmp_ptr + size; run_ptr <= thresh; run_ptr += size) - if ((*cmp) ((void *) run_ptr, (void *) tmp_ptr, arg) < 0) - tmp_ptr = run_ptr; - - if (tmp_ptr != base_ptr) - swap (tmp_ptr, base_ptr, size); - - /* Insertion sort, running from left-hand-side up to right-hand-side. */ - - run_ptr = base_ptr + size; - while ((run_ptr += size) <= end_ptr) - { - tmp_ptr = run_ptr - size; - while ((*cmp) ((void *) run_ptr, (void *) tmp_ptr, arg) < 0) - tmp_ptr -= size; - - tmp_ptr += size; - if (tmp_ptr != run_ptr) - { - char *trav; - - trav = run_ptr + size; - while (--trav >= run_ptr) - { - char c = *trav; - char *hi, *lo; - - for (hi = lo = trav; (lo -= size) >= tmp_ptr; hi = lo) - *hi = *lo; - *hi = c; - } - } - } - } -} +#define R_VERSION +#define R_FUNC __qsort_r +#include libc_hidden_def (__qsort_r) weak_alias (__qsort_r, qsort_r) -void -qsort (void *b, size_t n, size_t s, __compar_fn_t cmp) -{ - return __qsort_r (b, n, s, (__compar_d_fn_t) cmp, NULL); -} +#define R_FUNC qsort +#include + libc_hidden_def (qsort) diff --git a/stdlib/qsort_common.c b/stdlib/qsort_common.c new file mode 100644 index 0000000000..666b1958ab --- /dev/null +++ b/stdlib/qsort_common.c @@ -0,0 +1,225 @@ +/* Common implementation for both qsort and qsort_r. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* If you consider tuning this algorithm, you should consult first: + Engineering a sort function; Jon Bentley and M. Douglas McIlroy; + Software - Practice and Experience; Vol. 23 (11), 1249-1265, 1993. */ + +#ifdef R_VERSION +# define R_CMP_TYPE __compar_d_fn_t +# define R_CMP_ARG , void *arg +# define R_CMP(p1, p2) cmp (p1, p2, arg) +#else +# define R_CMP_TYPE __compar_fn_t +# define R_CMP_ARG +# define R_CMP(p1, p2) cmp (p1, p2) +#endif + +/* Order size using quicksort. This implementation incorporates + four optimizations discussed in Sedgewick: + + 1. Non-recursive, using an explicit stack of pointer that store the + next array partition to sort. To save time, this maximum amount + of space required to store an array of SIZE_MAX is allocated on the + stack. Assuming a 32-bit (64 bit) integer for size_t, this needs + only 32 * sizeof(stack_node) == 256 bytes (for 64 bit: 1024 bytes). + Pretty cheap, actually. + + 2. Chose the pivot element using a median-of-three decision tree. + This reduces the probability of selecting a bad pivot value and + eliminates certain extraneous comparisons. + + 3. Only quicksorts TOTAL_ELEMS / MAX_THRESH partitions, leaving + insertion sort to order the MAX_THRESH items within each partition. + This is a big win, since insertion sort is faster for small, mostly + sorted array segments. + + 4. The larger of the two sub-partitions is always pushed onto the + stack first, with the algorithm then concentrating on the + smaller partition. This *guarantees* no more than log (total_elems) + stack size is needed (actually O(1) in this case)! */ + +void +R_FUNC (void *pbase, size_t total_elems, size_t size, R_CMP_TYPE cmp R_CMP_ARG) +{ + if (total_elems == 0) + /* Avoid lossage with unsigned arithmetic below. */ + return; + + char *base_ptr = (char *) pbase; + + const size_t max_thresh = MAX_THRESH * size; + + swap_t swap = select_swap_func (pbase, size); + + if (total_elems > MAX_THRESH) + { + char *lo = base_ptr; + char *hi = &lo[size * (total_elems - 1)]; + stack_node stack[STACK_SIZE]; + stack_node *top = stack; + + PUSH (NULL, NULL); + + while (STACK_NOT_EMPTY) + { + char *left_ptr; + char *right_ptr; + + /* Select median value from among LO, MID, and HI. Rearrange + LO and HI so the three values are sorted. This lowers the + probability of picking a pathological pivot value and + skips a comparison for both the LEFT_PTR and RIGHT_PTR in + the while loops. */ + + char *mid = lo + size * ((hi - lo) / size >> 1); + + if (R_CMP ((void *) mid, (void *) lo) < 0) + swap (mid, lo, size); + if (R_CMP ((void *) hi, (void *) mid) < 0) + swap (mid, hi, size); + else + goto jump_over; + if (R_CMP ((void *) mid, (void *) lo) < 0) + swap (mid, lo, size); + jump_over:; + + left_ptr = lo + size; + right_ptr = hi - size; + + /* Here's the famous ``collapse the walls'' section of quicksort. + Gotta like those tight inner loops! They are the main reason + that this algorithm runs much faster than others. */ + do + { + while (R_CMP ((void *) left_ptr, (void *) mid) < 0) + left_ptr += size; + + while (R_CMP ((void *) mid, (void *) right_ptr) < 0) + right_ptr -= size; + + if (left_ptr < right_ptr) + { + swap (left_ptr, right_ptr, size); + if (mid == left_ptr) + mid = right_ptr; + else if (mid == right_ptr) + mid = left_ptr; + left_ptr += size; + right_ptr -= size; + } + else if (left_ptr == right_ptr) + { + left_ptr += size; + right_ptr -= size; + break; + } + } + while (left_ptr <= right_ptr); + + /* Set up pointers for next iteration. First determine whether + left and right partitions are below the threshold size. If so, + ignore one or both. Otherwise, push the larger partition's + bounds on the stack and continue sorting the smaller one. */ + + if ((size_t) (right_ptr - lo) <= max_thresh) + { + if ((size_t) (hi - left_ptr) <= max_thresh) + /* Ignore both small partitions. */ + POP (lo, hi); + else + /* Ignore small left partition. */ + lo = left_ptr; + } + else if ((size_t) (hi - left_ptr) <= max_thresh) + /* Ignore small right partition. */ + hi = right_ptr; + else if ((right_ptr - lo) > (hi - left_ptr)) + { + /* Push larger left partition indices. */ + PUSH (lo, right_ptr); + lo = left_ptr; + } + else + { + /* Push larger right partition indices. */ + PUSH (left_ptr, hi); + hi = right_ptr; + } + } + } + + /* Once the BASE_PTR array is partially sorted by quicksort the rest + is completely sorted using insertion sort, since this is efficient + for partitions below MAX_THRESH size. BASE_PTR points to the beginning + of the array to sort, and END_PTR points at the very last element in + the array (*not* one beyond it!). */ + + { + char *const end_ptr = &base_ptr[size * (total_elems - 1)]; + char *tmp_ptr = base_ptr; + char *thresh = end_ptr < base_ptr + max_thresh ? + end_ptr : base_ptr + max_thresh; + char *run_ptr; + + /* Find smallest element in first threshold and place it at the + array's beginning. This is the smallest array element, + and the operation speeds up insertion sort's inner loop. */ + + for (run_ptr = tmp_ptr + size; run_ptr <= thresh; run_ptr += size) + if (R_CMP ((void *) run_ptr, (void *) tmp_ptr) < 0) + tmp_ptr = run_ptr; + + if (tmp_ptr != base_ptr) + swap (tmp_ptr, base_ptr, size); + + /* Insertion sort, running from left-hand-side up to right-hand-side. */ + + run_ptr = base_ptr + size; + while ((run_ptr += size) <= end_ptr) + { + tmp_ptr = run_ptr - size; + while (R_CMP ((void *) run_ptr, (void *) tmp_ptr) < 0) + tmp_ptr -= size; + + tmp_ptr += size; + if (tmp_ptr != run_ptr) + { + char *trav; + + trav = run_ptr + size; + while (--trav >= run_ptr) + { + char c = *trav; + char *hi, *lo; + + for (hi = lo = trav; (lo -= size) >= tmp_ptr; hi = lo) + *hi = *lo; + *hi = c; + } + } + } + } +} + +#undef R_NAME +#undef R_CMP_TYPE +#undef R_CMP_ARG +#undef R_CMP +#undef R_FUNC +#undef R_VERSION