From patchwork Sun Apr 19 15:00:06 2020
X-Patchwork-Id: 284217
From: Christian Schoenebeck
Date: Sun, 19 Apr 2020 17:00:06 +0200
Subject: [PATCH v6 1/5] tests/virtio-9p: added split readdir tests
To: qemu-devel@nongnu.org
Cc: Greg Kurz

The already existing 'basic' readdir test simply uses a 'count' parameter
big enough to retrieve all directory entries with a single Treaddir
request.

In the 3 new 'split' readdir tests added by this patch, the directory
entries are retrieved split over several Treaddir requests, by picking
'count' parameters small enough to force the server to truncate each
response. The test client therefore sends as many Treaddir requests as
necessary to get all directory entries.

The following new tests are added (executed in this sequence):

1. Split readdir test with count=512
2. Split readdir test with count=256
3. Split readdir test with count=128

This test sequence is chosen because the smaller the 'count' value, the
higher the chance of errors in case of implementation bugs on the server
side.

Signed-off-by: Christian Schoenebeck
---
 tests/qtest/virtio-9p-test.c | 108 +++++++++++++++++++++++++++++++++++
 1 file changed, 108 insertions(+)

diff --git a/tests/qtest/virtio-9p-test.c b/tests/qtest/virtio-9p-test.c
index 2167322985..de30b717b6 100644
--- a/tests/qtest/virtio-9p-test.c
+++ b/tests/qtest/virtio-9p-test.c
@@ -578,6 +578,7 @@ static bool fs_dirents_contain_name(struct V9fsDirent *e, const char* name)
     return false;
 }
 
+/* basic readdir test where reply fits into a single response message */
 static void fs_readdir(void *obj, void *data, QGuestAllocator *t_alloc)
 {
     QVirtio9P *v9p = obj;
@@ -631,6 +632,89 @@ static void fs_readdir(void *obj, void *data, QGuestAllocator *t_alloc)
     g_free(wnames[0]);
 }
 
+/* readdir test where overall request is split over several messages */
+static void fs_readdir_split(void *obj, void *data, QGuestAllocator *t_alloc,
+                             uint32_t count)
+{
+    QVirtio9P *v9p = obj;
+    alloc = t_alloc;
+    char *const wnames[] = { g_strdup(QTEST_V9FS_SYNTH_READDIR_DIR) };
+    uint16_t nqid;
+    v9fs_qid qid;
+    uint32_t nentries, npartialentries;
+    struct V9fsDirent *entries, *tail, *partialentries;
+    P9Req *req;
+    int fid;
+    uint64_t offset;
+
+    fs_attach(v9p, NULL, t_alloc);
+
+    fid = 1;
+    offset = 0;
+    entries = NULL;
+    nentries = 0;
+    tail = NULL;
+
+    req = v9fs_twalk(v9p, 0, fid, 1, wnames, 0);
+    v9fs_req_wait_for_reply(req, NULL);
+    v9fs_rwalk(req, &nqid, NULL);
+    g_assert_cmpint(nqid, ==, 1);
+
+    req = v9fs_tlopen(v9p, fid, O_DIRECTORY, 0);
+    v9fs_req_wait_for_reply(req, NULL);
+    v9fs_rlopen(req, &qid, NULL);
+
+    /*
+     * send as many Treaddir requests as required to get all directory
+     * entries
+     */
+    while (true) {
+        npartialentries = 0;
+        partialentries = NULL;
+
+        req = v9fs_treaddir(v9p, fid, offset, count, 0);
+        v9fs_req_wait_for_reply(req, NULL);
+        v9fs_rreaddir(req, &count, &npartialentries, &partialentries);
+        if (npartialentries > 0 && partialentries) {
+            if (!entries) {
+                entries = partialentries;
+                nentries = npartialentries;
+                tail = partialentries;
+            } else {
+                tail->next = partialentries;
+                nentries += npartialentries;
+            }
+            while (tail->next) {
+                tail = tail->next;
+            }
+            offset = tail->offset;
+        } else {
+            break;
+        }
+    }
+
+    g_assert_cmpint(
+        nentries, ==,
+        QTEST_V9FS_SYNTH_READDIR_NFILES + 2 /* "." and ".." */
+    );
+
+    /*
+     * Check all file names exist in returned entries, ignore their order
+     * though.
+     */
+    g_assert_cmpint(fs_dirents_contain_name(entries, "."), ==, true);
+    g_assert_cmpint(fs_dirents_contain_name(entries, ".."), ==, true);
+    for (int i = 0; i < QTEST_V9FS_SYNTH_READDIR_NFILES; ++i) {
+        char *name = g_strdup_printf(QTEST_V9FS_SYNTH_READDIR_FILE, i);
+        g_assert_cmpint(fs_dirents_contain_name(entries, name), ==, true);
+        g_free(name);
+    }
+
+    v9fs_free_dirents(entries);
+
+    g_free(wnames[0]);
+}
+
 static void fs_walk_no_slash(void *obj, void *data, QGuestAllocator *t_alloc)
 {
     QVirtio9P *v9p = obj;
@@ -793,6 +877,24 @@ static void fs_flush_ignored(void *obj, void *data, QGuestAllocator *t_alloc)
     g_free(wnames[0]);
 }
 
+static void fs_readdir_split_128(void *obj, void *data,
+                                 QGuestAllocator *t_alloc)
+{
+    fs_readdir_split(obj, data, t_alloc, 128);
+}
+
+static void fs_readdir_split_256(void *obj, void *data,
+                                 QGuestAllocator *t_alloc)
+{
+    fs_readdir_split(obj, data, t_alloc, 256);
+}
+
+static void fs_readdir_split_512(void *obj, void *data,
+                                 QGuestAllocator *t_alloc)
+{
+    fs_readdir_split(obj, data, t_alloc, 512);
+}
+
 static void register_virtio_9p_test(void)
 {
     qos_add_test("config", "virtio-9p", pci_config, NULL);
@@ -810,6 +912,12 @@ static void register_virtio_9p_test(void)
     qos_add_test("fs/flush/ignored", "virtio-9p", fs_flush_ignored, NULL);
 
     qos_add_test("fs/readdir/basic", "virtio-9p", fs_readdir, NULL);
+    qos_add_test("fs/readdir/split_512", "virtio-9p",
+                 fs_readdir_split_512, NULL);
+    qos_add_test("fs/readdir/split_256", "virtio-9p",
+                 fs_readdir_split_256, NULL);
+    qos_add_test("fs/readdir/split_128", "virtio-9p",
+                 fs_readdir_split_128, NULL);
 }
 
 libqos_init(register_virtio_9p_test);
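
To illustrate the paging behaviour these tests force, here is a
self-contained sketch. It is not part of the patch: it uses plain POSIX
readdir() instead of the libqos helpers, and entry_size() only
approximates v9fs_readdir_response_size() (fixed per-entry header plus
serialized name); the numbers it prints are therefore indicative only.

#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* rough stand-in for v9fs_readdir_response_size(), not the exact formula */
static size_t entry_size(const char *name)
{
    return 24 + 2 + strlen(name);
}

int main(void)
{
    DIR *dir = opendir(".");
    const size_t budget = 128; /* like the count=128 test variant */
    struct dirent *dent;
    size_t used = 0, pages = 1, total = 0;

    if (!dir) {
        return 1;
    }
    while ((dent = readdir(dir)) != NULL) {
        size_t len = entry_size(dent->d_name);
        if (used + len > budget) {
            /* "message full": the server would truncate the reply here */
            pages++;
            used = 0;
        }
        used += len;
        total++;
    }
    printf("%zu entries would need %zu Treaddir requests at count=%zu\n",
           total, pages, budget);
    closedir(dir);
    return 0;
}

The smaller the budget, the more truncation points are exercised, which
is exactly why the tests run count=512, 256 and 128 in that order.
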
From patchwork Sun Apr 19 15:02:27 2020
X-Patchwork-Id: 284215
From: Christian Schoenebeck
Date: Sun, 19 Apr 2020 17:02:27 +0200
Subject: [PATCH v6 3/5] 9pfs: add new function v9fs_co_readdir_many()
To: qemu-devel@nongnu.org
Cc: Greg Kurz

The newly added function v9fs_co_readdir_many() retrieves multiple
directory entries with a single fs driver request. It is intended to
replace uses of v9fs_co_readdir(), which only retrieves a single
directory entry per fs driver request.

The reason for this planned replacement is that for every fs driver
request the coroutine is dispatched from the main I/O thread to a
background I/O thread and eventually dispatched back to the main I/O
thread. Hopping between threads adds latency, so if a 9pfs Treaddir
request reads a large number of directory entries, this currently sums
up to huge latencies of several hundred ms or more. Using
v9fs_co_readdir_many() instead of v9fs_co_readdir() will therefore
provide significant performance improvements.

Signed-off-by: Christian Schoenebeck
---
 hw/9pfs/9p.h    |  22 ++++++
 hw/9pfs/codir.c | 181 +++++++++++++++++++++++++++++++++++++++++++++---
 hw/9pfs/coth.h  |   3 +
 3 files changed, 195 insertions(+), 11 deletions(-)

diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
index 9553700dbb..116977939b 100644
--- a/hw/9pfs/9p.h
+++ b/hw/9pfs/9p.h
@@ -215,6 +215,28 @@ static inline void v9fs_readdir_init(V9fsDir *dir)
     qemu_mutex_init(&dir->readdir_mutex);
 }
 
+/**
+ * Type for 9p fs drivers' (a.k.a. 9p backends) result of readdir requests,
+ * which is a chained list of directory entries.
+ */
+typedef struct V9fsDirEnt {
+    /* mandatory (must not be NULL) information for all readdir requests */
+    struct dirent *dent;
+    /*
+     * optional (may be NULL): a full stat of each directory entry is only
+     * done if the fs driver is explicitly told to do so
+     */
+    struct stat *st;
+    /*
+     * instead of an array, directory entries are always returned as a
+     * chained list; that's because the number of entries retrieved by fs
+     * drivers depends on the individual entries' names (since response
+     * messages are size limited), so the final amount cannot be estimated
+     * beforehand
+     */
+    struct V9fsDirEnt *next;
+} V9fsDirEnt;
+
 /*
  * Filled by fs driver on open and other
  * calls.
diff --git a/hw/9pfs/codir.c b/hw/9pfs/codir.c
index 73f9a751e1..45c65a8f5b 100644
--- a/hw/9pfs/codir.c
+++ b/hw/9pfs/codir.c
@@ -18,28 +18,187 @@
 #include "qemu/main-loop.h"
 #include "coth.h"
 
+/*
+ * This is solely executed on a background IO thread.
+ */
+static int do_readdir(V9fsPDU *pdu, V9fsFidState *fidp, struct dirent **dent)
+{
+    int err = 0;
+    V9fsState *s = pdu->s;
+    struct dirent *entry;
+
+    errno = 0;
+    entry = s->ops->readdir(&s->ctx, &fidp->fs);
+    if (!entry && errno) {
+        *dent = NULL;
+        err = -errno;
+    } else {
+        *dent = entry;
+    }
+    return err;
+}
+
+/*
+ * TODO: This will be removed for performance reasons.
+ * Use v9fs_co_readdir_many() instead.
+ */
 int coroutine_fn v9fs_co_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
                                  struct dirent **dent)
 {
     int err;
-    V9fsState *s = pdu->s;
 
     if (v9fs_request_cancelled(pdu)) {
         return -EINTR;
     }
-    v9fs_co_run_in_worker(
-        {
-            struct dirent *entry;
+    v9fs_co_run_in_worker({
+        err = do_readdir(pdu, fidp, dent);
+    });
+    return err;
+}
+
+/*
+ * This is solely executed on a background IO thread.
+ *
+ * See v9fs_co_readdir_many() (as its only user) below for details.
+ */
+static int do_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp,
+                           struct V9fsDirEnt **entries,
+                           int32_t maxsize, bool dostat)
+{
+    V9fsState *s = pdu->s;
+    V9fsString name;
+    int len, err = 0;
+    int32_t size = 0;
+    off_t saved_dir_pos;
+    struct dirent *dent;
+    struct V9fsDirEnt *e = NULL;
+    V9fsPath path;
+    struct stat stbuf;
 
-            errno = 0;
-            entry = s->ops->readdir(&s->ctx, &fidp->fs);
-            if (!entry && errno) {
+    *entries = NULL;
+    v9fs_path_init(&path);
+
+    /*
+     * TODO: Here should be a warn_report_once() if lock failed.
+     *
+     * With a good 9p client we should not get into concurrency here,
+     * because a good client would not use the same fid for concurrent
+     * requests. We do the lock here for safety reasons though. However
+     * the client would then suffer performance issues, so better log that
+     * issue here.
+     */
+    v9fs_readdir_lock(&fidp->fs.dir);
+
+    /* save the directory position */
+    saved_dir_pos = s->ops->telldir(&s->ctx, &fidp->fs);
+    if (saved_dir_pos < 0) {
+        err = saved_dir_pos;
+        goto out;
+    }
+
+    while (true) {
+        /* get directory entry from fs driver */
+        err = do_readdir(pdu, fidp, &dent);
+        if (err || !dent) {
+            break;
+        }
+
+        /*
+         * stop this loop as soon as it would exceed the allowed maximum
+         * response message size for the directory entries collected so far,
+         * because anything beyond that size would need to be discarded by
+         * the 9p controller (main thread / top half) anyway
+         */
+        v9fs_string_init(&name);
+        v9fs_string_sprintf(&name, "%s", dent->d_name);
+        len = v9fs_readdir_response_size(&name);
+        v9fs_string_free(&name);
+        if (size + len > maxsize) {
+            /* this is not an error case actually */
+            break;
+        }
+
+        /* append next node to result chain */
+        if (!e) {
+            *entries = e = g_malloc0(sizeof(V9fsDirEnt));
+        } else {
+            e = e->next = g_malloc0(sizeof(V9fsDirEnt));
+        }
+        e->dent = g_malloc0(sizeof(struct dirent));
+        memcpy(e->dent, dent, sizeof(struct dirent));
+
+        /* perform a full stat() for directory entry if requested by caller */
+        if (dostat) {
+            err = s->ops->name_to_path(
+                &s->ctx, &fidp->path, dent->d_name, &path
+            );
+            if (err < 0) {
                 err = -errno;
-            } else {
-                *dent = entry;
-                err = 0;
+                break;
             }
-        });
+
+            err = s->ops->lstat(&s->ctx, &path, &stbuf);
+            if (err < 0) {
+                err = -errno;
+                break;
+            }
+
+            e->st = g_malloc0(sizeof(struct stat));
+            memcpy(e->st, &stbuf, sizeof(struct stat));
+        }
+
+        size += len;
+        saved_dir_pos = dent->d_off;
+    }
+
+    /* restore (last) saved position */
+    s->ops->seekdir(&s->ctx, &fidp->fs, saved_dir_pos);
+
+out:
+    v9fs_readdir_unlock(&fidp->fs.dir);
+    v9fs_path_free(&path);
+    if (err < 0) {
+        return err;
+    }
+    return size;
+}
+
+/**
+ * @brief Reads multiple directory entries in one rush.
+ *
+ * Retrieves the requested (max. amount of) directory entries from the fs
+ * driver. This function must only be called by the main IO thread (top half).
+ * Internally this function call will be dispatched to a background IO thread
+ * (bottom half) where it is eventually executed by the fs driver.
+ *
+ * Acquiring multiple directory entries in one rush from the fs driver,
+ * instead of retrieving each directory entry individually, is very beneficial
+ * from a performance point of view, because every fs driver request adds
+ * latency; in practice this could lead to overall latencies of several
+ * hundred ms for reading all entries (of just a single directory) if every
+ * directory entry were individually requested from the driver.
+ *
+ * @param pdu - the causing 9p (T_readdir) client request
+ * @param fidp - already opened directory on which readdir shall be performed
+ * @param entries - output for directory entries (must not be NULL)
+ * @param maxsize - maximum result message body size (in bytes)
+ * @param dostat - whether a stat() should be performed and returned for
+ *                 each directory entry
+ * @returns resulting response message body size (in bytes) on success,
+ *          negative error code otherwise
+ */
+int coroutine_fn v9fs_co_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp,
+                                      struct V9fsDirEnt **entries,
+                                      int32_t maxsize, bool dostat)
+{
+    int err = 0;
+
+    if (v9fs_request_cancelled(pdu)) {
+        return -EINTR;
+    }
+    v9fs_co_run_in_worker({
+        err = do_readdir_many(pdu, fidp, entries, maxsize, dostat);
+    });
     return err;
 }
diff --git a/hw/9pfs/coth.h b/hw/9pfs/coth.h
index c2cdc7a9ea..a6851822d5 100644
--- a/hw/9pfs/coth.h
+++ b/hw/9pfs/coth.h
@@ -49,6 +49,9 @@ void co_run_in_worker_bh(void *);
 int coroutine_fn v9fs_co_readlink(V9fsPDU *, V9fsPath *, V9fsString *);
 int coroutine_fn v9fs_co_readdir(V9fsPDU *, V9fsFidState *, struct dirent **);
+int coroutine_fn v9fs_co_readdir_many(V9fsPDU *, V9fsFidState *,
+                                      struct V9fsDirEnt **,
+                                      int32_t, bool);
 off_t coroutine_fn v9fs_co_telldir(V9fsPDU *, V9fsFidState *);
 void coroutine_fn v9fs_co_seekdir(V9fsPDU *, V9fsFidState *, off_t);
 void coroutine_fn v9fs_co_rewinddir(V9fsPDU *, V9fsFidState *);
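
The chained-list result type above can be exercised standalone. The
following self-contained sketch uses plain POSIX and stand-in names:
struct dir_ent and free_dirents() mirror V9fsDirEnt and the
v9fs_free_dirents() helper added by the next patch in this series, but
none of it is QEMU API. It shows the same build-on-worker /
consume-on-caller split that do_readdir_many() and its caller follow.

#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Stand-in for V9fsDirEnt: a chained list is used because the number of
 * entries fitting into a size-limited reply depends on each entry's name
 * length, so it cannot be precomputed.
 */
struct dir_ent {
    struct dirent *dent;  /* mandatory */
    struct stat *st;      /* optional, only filled if a stat was requested */
    struct dir_ent *next;
};

/* mirrors what v9fs_free_dirents() does in the next patch */
static void free_dirents(struct dir_ent *e)
{
    struct dir_ent *next;

    for (; e; e = next) {
        next = e->next;
        free(e->dent);
        free(e->st);
        free(e);
    }
}

int main(void)
{
    DIR *dir = opendir(".");
    struct dir_ent *head = NULL, *tail = NULL;
    struct dirent *dent;

    if (!dir) {
        return 1;
    }
    /* build phase: what do_readdir_many() does on the worker thread */
    while ((dent = readdir(dir)) != NULL) {
        struct dir_ent *e = calloc(1, sizeof(*e));

        e->dent = malloc(sizeof(*dent));
        memcpy(e->dent, dent, sizeof(*dent));
        if (!head) {
            head = tail = e;
        } else {
            tail = tail->next = e;
        }
    }
    closedir(dir);

    /* consume phase: what the top half does when marshalling R_readdir */
    for (struct dir_ent *e = head; e; e = e->next) {
        puts(e->dent->d_name);
    }
    free_dirents(head);
    return 0;
}
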
From patchwork Sun Apr 19 15:06:17 2020
X-Patchwork-Id: 284216
Message-Id: <14ec5d880cfca878bf32e643243c7ab3f4a52440.1587309014.git.qemu_oss@crudebyte.com>
From: Christian Schoenebeck
Date: Sun, 19 Apr 2020 17:06:17 +0200
Subject: [PATCH v6 4/5] 9pfs: T_readdir latency optimization
To: qemu-devel@nongnu.org
Cc: Greg Kurz

Make top half really top half and bottom half really bottom half:

Each T_readdir request handling currently hops between threads (the main
I/O thread and background I/O driver threads) several times for every
individual directory entry, which sums up to huge latencies for handling
just a single T_readdir request.

Instead of doing that, collect all required directory entries (including
all potentially required stat buffers for each entry) in one rush on a
background I/O thread from the fs driver by calling the previously added
function v9fs_co_readdir_many() instead of v9fs_co_readdir(), then
assemble the entire resulting network response message for the readdir
request on the main I/O thread.

The fs driver still aborts the directory entry retrieval loop (on the
background I/O thread inside v9fs_co_readdir_many()) as soon as it would
exceed the client's requested maximum R_readdir response size, so this
does not introduce a performance penalty on the other end.
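
A back-of-the-envelope estimate with purely illustrative assumptions
(the hop cost and directory size below are made up, not measurements
from this series): if one coroutine dispatch to a worker thread and back
costs about 0.05 ms, and a directory holds 5000 entries, then handling
one T_readdir with one readdir round trip plus one stat round trip per
entry accumulates roughly

    5000 entries x 2 round trips x 0.05 ms = 500 ms

for a single request, whereas with v9fs_co_readdir_many() the same
request pays for one dispatch in total, on the order of 0.1 ms,
independent of the number of entries.
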
Signed-off-by: Christian Schoenebeck
---
 hw/9pfs/9p.c | 122 +++++++++++++++++++++++----------------------------
 1 file changed, 55 insertions(+), 67 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 43584aca41..8283a3cfbb 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -971,30 +971,6 @@ static int coroutine_fn fid_to_qid(V9fsPDU *pdu, V9fsFidState *fidp,
     return 0;
 }
 
-static int coroutine_fn dirent_to_qid(V9fsPDU *pdu, V9fsFidState *fidp,
-                                      struct dirent *dent, V9fsQID *qidp)
-{
-    struct stat stbuf;
-    V9fsPath path;
-    int err;
-
-    v9fs_path_init(&path);
-
-    err = v9fs_co_name_to_path(pdu, &fidp->path, dent->d_name, &path);
-    if (err < 0) {
-        goto out;
-    }
-    err = v9fs_co_lstat(pdu, &path, &stbuf);
-    if (err < 0) {
-        goto out;
-    }
-    err = stat_to_qid(pdu, &stbuf, qidp);
-
-out:
-    v9fs_path_free(&path);
-    return err;
-}
-
 V9fsPDU *pdu_alloc(V9fsState *s)
 {
     V9fsPDU *pdu = NULL;
@@ -2337,6 +2313,18 @@ size_t v9fs_readdir_response_size(V9fsString *name)
     return 24 + v9fs_string_size(name);
 }
 
+static void v9fs_free_dirents(struct V9fsDirEnt *e)
+{
+    struct V9fsDirEnt *next = NULL;
+
+    for (; e; e = next) {
+        next = e->next;
+        g_free(e->dent);
+        g_free(e->st);
+        g_free(e);
+    }
+}
+
 static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
                                         int32_t max_count)
 {
@@ -2345,54 +2333,53 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
     V9fsString name;
     int len, err = 0;
     int32_t count = 0;
-    off_t saved_dir_pos;
     struct dirent *dent;
+    struct stat *st;
+    struct V9fsDirEnt *entries = NULL;
 
-    /* save the directory position */
-    saved_dir_pos = v9fs_co_telldir(pdu, fidp);
-    if (saved_dir_pos < 0) {
-        return saved_dir_pos;
-    }
-
-    while (1) {
-        v9fs_readdir_lock(&fidp->fs.dir);
+    /*
+     * inode remapping requires the device id, which in turn might be
+     * different for different directory entries, so if inode remapping is
+     * enabled we have to make a full stat for each directory entry
+     */
+    const bool dostat = pdu->s->ctx.export_flags & V9FS_REMAP_INODES;
 
-        err = v9fs_co_readdir(pdu, fidp, &dent);
-        if (err || !dent) {
-            break;
-        }
-        v9fs_string_init(&name);
-        v9fs_string_sprintf(&name, "%s", dent->d_name);
-        if ((count + v9fs_readdir_response_size(&name)) > max_count) {
-            v9fs_readdir_unlock(&fidp->fs.dir);
+    /*
+     * Fetch all required directory entries altogether on a background IO
+     * thread from fs driver. We don't want to do that for each entry
+     * individually, because hopping between threads (this main IO thread
+     * and background IO driver thread) would sum up to huge latencies.
+     */
+    count = v9fs_co_readdir_many(pdu, fidp, &entries, max_count, dostat);
+    if (count < 0) {
+        err = count;
+        count = 0;
+        goto out;
+    }
+    count = 0;
 
-            /* Ran out of buffer. Set dir back to old position and return */
-            v9fs_co_seekdir(pdu, fidp, saved_dir_pos);
-            v9fs_string_free(&name);
-            return count;
-        }
+    for (struct V9fsDirEnt *e = entries; e; e = e->next) {
+        dent = e->dent;
 
         if (pdu->s->ctx.export_flags & V9FS_REMAP_INODES) {
-            /*
-             * dirent_to_qid() implies expensive stat call for each entry,
-             * we must do that here though since inode remapping requires
-             * the device id, which in turn might be different for
-             * different entries; we cannot make any assumption to avoid
-             * that here.
-             */
-            err = dirent_to_qid(pdu, fidp, dent, &qid);
+            st = e->st;
+            /* e->st should never be NULL, but just to be sure */
+            if (!st) {
+                err = -1;
+                break;
+            }
+
+            /* remap inode */
+            err = stat_to_qid(pdu, st, &qid);
             if (err < 0) {
-                v9fs_readdir_unlock(&fidp->fs.dir);
-                v9fs_co_seekdir(pdu, fidp, saved_dir_pos);
-                v9fs_string_free(&name);
-                return err;
+                break;
             }
         } else {
             /*
              * Fill up just the path field of qid because the client uses
              * only that. To fill the entire qid structure we will have
              * to stat each dirent found, which is expensive. For the
-             * latter reason we don't call dirent_to_qid() here. Only drawback
+             * latter reason we don't call stat_to_qid() here. Only drawback
              * is that no multi-device export detection of stat_to_qid()
              * would be done and provided as error to the user here. But
              * user would get that error anyway when accessing those
@@ -2405,25 +2392,26 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp,
             qid.version = 0;
         }
 
+        v9fs_string_init(&name);
+        v9fs_string_sprintf(&name, "%s", dent->d_name);
+
         /* 11 = 7 + 4 (7 = start offset, 4 = space for storing count) */
         len = pdu_marshal(pdu, 11 + count, "Qqbs",
                           &qid, dent->d_off,
                           dent->d_type, &name);
 
-        v9fs_readdir_unlock(&fidp->fs.dir);
+        v9fs_string_free(&name);
 
         if (len < 0) {
-            v9fs_co_seekdir(pdu, fidp, saved_dir_pos);
-            v9fs_string_free(&name);
-            return len;
+            err = len;
+            break;
         }
+
         count += len;
-        v9fs_string_free(&name);
-        saved_dir_pos = dent->d_off;
     }
 
-    v9fs_readdir_unlock(&fidp->fs.dir);
-
+out:
+    v9fs_free_dirents(entries);
     if (err < 0) {
         return err;
     }