From patchwork Thu Aug 31 18:00:31 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 111417
Delivered-To: patch@linaro.org
From: Paolo Valente
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 ulf.hansson@linaro.org, broonie@kernel.org, jeremywh7@gmail.com,
 lnicola@dend.ro, Paolo Valente
Subject: [PATCH BUGFIX/IMPROVEMENT 2/2] doc, block, bfq: better describe how
 to properly configure bfq
Date: Thu, 31 Aug 2017 20:00:31 +0200
Message-Id: <20170831180031.3747-3-paolo.valente@linaro.org>
X-Mailer: git-send-email 2.10.0
In-Reply-To: <20170831180031.3747-1-paolo.valente@linaro.org>
References: <20170831180031.3747-1-paolo.valente@linaro.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Many users have reported the lack of a HOWTO for properly configuring
bfq as a function of the goal one wants to achieve (max
responsiveness, max throughput, ...). In fact, all needed details are
already provided in the documentation file bfq-iosched.txt. Yet the
document lacks guidance on which parameter descriptions to look at.
This commit adds some simple direction.

Signed-off-by: Paolo Valente
Reviewed-by: Jeremy Hickman
Reviewed-by: Laurentiu Nicola
---
 Documentation/block/bfq-iosched.txt | 78 +++++++++++++++++++++++++------------
 1 file changed, 54 insertions(+), 24 deletions(-)

-- 
2.10.0

diff --git a/Documentation/block/bfq-iosched.txt b/Documentation/block/bfq-iosched.txt
index 03ff4cc..3d6951d 100644
--- a/Documentation/block/bfq-iosched.txt
+++ b/Documentation/block/bfq-iosched.txt
@@ -35,7 +35,7 @@ CONTENTS
  1-1 Personal systems
  1-2 Server systems
 2. How does BFQ work?
-3. What are BFQ's tunable?
+3. What are BFQ's tunables and how to properly configure BFQ?
 4. BFQ group scheduling
  4-1 Service guarantees provided
  4-2 Interface
@@ -147,12 +147,12 @@ plus a lot of code, are borrowed from CFQ.
     contrast, BFQ may idle the device for a short time interval,
     giving the process the chance to go on being served if it issues
     a new request in time. Device idling typically boosts the
-    throughput on rotational devices, if processes do synchronous
-    and sequential I/O. In addition, under BFQ, device idling is
-    also instrumental in guaranteeing the desired throughput
-    fraction to processes issuing sync requests (see the description
-    of the slice_idle tunable in this document, or [1, 2], for more
-    details).
+    throughput on rotational devices and on non-queueing flash-based
+    devices, if processes do synchronous and sequential I/O. In
+    addition, under BFQ, device idling is also instrumental in
+    guaranteeing the desired throughput fraction to processes
+    issuing sync requests (see the description of the slice_idle
+    tunable in this document, or [1, 2], for more details).
 
   - With respect to idling for service guarantees, if several
     processes are competing for the device at the same time, but
@@ -161,6 +161,15 @@ plus a lot of code, are borrowed from CFQ.
     idling the device. Throughput is thus as high as possible in
     this common scenario.
 
+  - On flash-based storage with internal queueing of commands
+    (typically NCQ), device idling happens to be always detrimental
+    for throughput. So, with these devices, BFQ performs idling
+    only when strictly needed for service guarantees, i.e., for
+    guaranteeing low latency or fairness. In these cases, overall
+    throughput may be sub-optimal. No solution currently exists to
+    provide both strong service guarantees and optimal throughput
+    on devices with internal queueing.
+
   - If low-latency mode is enabled (default configuration), BFQ
     executes some special heuristics to detect interactive and soft
     real-time applications (e.g., video or audio players/streamers),
@@ -248,13 +257,24 @@ plus a lot of code, are borrowed from CFQ.
     the Idle class, to prevent it from starving.
 
 
-3. What are BFQ's tunable?
-==========================
+3. What are BFQ's tunables and how to properly configure BFQ?
+=============================================================
 
-The tunables back_seek-max, back_seek_penalty, fifo_expire_async and
-fifo_expire_sync below are the same as in CFQ. Their description is
-just copied from that for CFQ. Some considerations in the description
-of slice_idle are copied from CFQ too.
+Most BFQ tunables affect service guarantees (basically latency and
+fairness) and throughput. For full details on how to choose the
+desired tradeoff between service guarantees and throughput, see the
+parameters slice_idle, strict_guarantees and low_latency. For details
+on how to maximise throughput, see slice_idle, timeout_sync and
+max_budget. The other performance-related parameters have been
+inherited from, and have been preserved mostly for compatibility with
+CFQ. So far, no performance improvement has been reported after
+changing the latter parameters in BFQ.
+
+In particular, the tunables back_seek-max, back_seek_penalty,
+fifo_expire_async and fifo_expire_sync below are the same as in
+CFQ. Their description is just copied from that for CFQ. Some
+considerations in the description of slice_idle are copied from CFQ
+too.
 
 per-process ioprio and weight
 -----------------------------
@@ -284,15 +304,17 @@ number of seeks and see improved throughput.
 
 Setting slice_idle to 0 will remove all the idling on queues and one
 should see an overall improved throughput on faster storage devices
-like multiple SATA/SAS disks in hardware RAID configuration.
+like multiple SATA/SAS disks in hardware RAID configuration, as well
+as flash-based storage with internal command queueing (and
+parallelism).
 
 So depending on storage and workload, it might be useful to set
 slice_idle=0. In general for SATA/SAS disks and software RAID of
 SATA/SAS disks keeping slice_idle enabled should be useful. For any
 configurations where there are multiple spindles behind single LUN
-(Host based hardware RAID controller or for storage arrays), setting
-slice_idle=0 might end up in better throughput and acceptable
-latencies.
+(Host based hardware RAID controller or for storage arrays), or with
+flash-based fast storage, setting slice_idle=0 might end up in better
+throughput and acceptable latencies.
 
 Idling is however necessary to have service guarantees enforced in
 case of differentiated weights or differentiated I/O-request lengths.
@@ -311,13 +333,14 @@ There is an important flipside for idling: apart from the above cases
 where it is beneficial also for throughput, idling can severely impact
 throughput. One important case is random workload. Because of this
 issue, BFQ tends to avoid idling as much as possible, when it is not
-beneficial also for throughput. As a consequence of this behavior, and
-of further issues described for the strict_guarantees tunable,
-short-term service guarantees may be occasionally violated. And, in
-some cases, these guarantees may be more important than guaranteeing
-maximum throughput. For example, in video playing/streaming, a very
-low drop rate may be more important than maximum throughput. In these
-cases, consider setting the strict_guarantees parameter.
+beneficial also for throughput (as detailed in Section 2). As a
+consequence of this behavior, and of further issues described for the
+strict_guarantees tunable, short-term service guarantees may be
+occasionally violated. And, in some cases, these guarantees may be
+more important than guaranteeing maximum throughput. For example, in
+video playing/streaming, a very low drop rate may be more important
+than maximum throughput. In these cases, consider setting the
+strict_guarantees parameter.
 
 strict_guarantees
 -----------------
@@ -419,6 +442,13 @@ The default value is 0, which enables auto-tuning: BFQ sets max_budget
 to the maximum number of sectors that can be served during
 timeout_sync, according to the estimated peak rate.
 
+For specific devices, some users have occasionally reported to have
+reached a higher throughput by setting max_budget explicitly, i.e., by
+setting max_budget to a higher value than 0. In particular, they have
+set max_budget to higher values than those to which BFQ would have set
+it with auto-tuning. An alternative way to achieve this goal is to
+just increase the value of timeout_sync, leaving max_budget equal to 0.
+
 weights
 -------
 
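
As an illustration of the guidance this patch adds to Section 3, the
mapping from goal to tunables could be sketched roughly as below. This
is only a sketch, not part of the patch: the goal labels and the helper
names (TUNABLES_BY_GOAL, tunable_path, set_tunable) are hypothetical,
"sda" is a placeholder device, and the code assumes the tunables are
exposed under /sys/block/<device>/queue/iosched/ with bfq selected as
the scheduler for that device.

```python
from pathlib import Path

# Tunables the patched text points to for each goal. The tunable names come
# from the patch; the goal labels are hypothetical.
TUNABLES_BY_GOAL = {
    "guarantees_vs_throughput": ("slice_idle", "strict_guarantees", "low_latency"),
    "max_throughput": ("slice_idle", "timeout_sync", "max_budget"),
}


def tunable_path(device, tunable, sysfs_root="/sys"):
    """Return the assumed sysfs path of a scheduler tunable for a block device."""
    return Path(sysfs_root) / "block" / device / "queue" / "iosched" / tunable


def set_tunable(device, tunable, value, sysfs_root="/sys"):
    """Write a value to a tunable file and return its path.

    Writing under the real /sys normally requires root privileges.
    """
    path = tunable_path(device, tunable, sysfs_root)
    path.write_text(f"{value}\n")
    return path
```

With these helpers, set_tunable("sda", "slice_idle", 0) would correspond
to the slice_idle=0 setting that the patched text suggests may help on
flash-based storage with internal command queueing. The sysfs_root
argument exists only so the sketch can be tried against a scratch
directory instead of the live system.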