From patchwork Tue Nov 3 14:19:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?b?SsO8cmdlbiBHcm/Dnw==?= X-Patchwork-Id: 317272 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3651CC56202 for ; Tue, 3 Nov 2020 14:20:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D6F3922447 for ; Tue, 3 Nov 2020 14:20:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="JgIsyUVa" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729580AbgKCOUk (ORCPT ); Tue, 3 Nov 2020 09:20:40 -0500 Received: from mx2.suse.de ([195.135.220.15]:44990 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729568AbgKCOTZ (ORCPT ); Tue, 3 Nov 2020 09:19:25 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1604413163; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/VP6WSc6DvlLF9yrIYqTd35xwvJM48FVR6NQTa6cRdE=; b=JgIsyUVaWOmbursyXqw02LL+sm55mcM3N4m9dB+HCGPJp0NLUum7j7i0W3Vxyt9Lrgi8iu AWjq9WthmhZ1KAwx5ipvmSXh9RoGwa/uCzIsgcT9pSItV+LWQ32VykfneAa+hdFuxDV6Mm SZAzmlVsVLYsjue/LsMYmGkQZzC8L5s= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 38614AD1A for ; Tue, 3 Nov 2020 14:19:23 +0000 (UTC) From: Juergen Gross To: stable@vger.kernel.org Subject: [PATCH v2 01/14] xen/events: don't use chip_data for legacy IRQs Date: Tue, 3 Nov 2020 15:19:09 +0100 Message-Id: <20201103141922.21026-2-jgross@suse.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201103141922.21026-1-jgross@suse.com> References: <20201103141922.21026-1-jgross@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org Since commit c330fb1ddc0a ("XEN uses irqdesc::irq_data_common::handler_data to store a per interrupt XEN data pointer which contains XEN specific information.") Xen is using the chip_data pointer for storing IRQ specific data. When running as a HVM domain this can result in problems for legacy IRQs, as those might use chip_data for their own purposes. Use a local array for this purpose in case of legacy IRQs, avoiding the double use. This is upstream commit 0891fb39ba67bd7ae023ea0d367297ffff010781 Cc: stable@vger.kernel.org Fixes: c330fb1ddc0a ("XEN uses irqdesc::irq_data_common::handler_data to store a per interrupt XEN data pointer which contains XEN specific information.") Signed-off-by: Juergen Gross Tested-by: Stefan Bader Reviewed-by: Boris Ostrovsky Link: https://lore.kernel.org/r/20200930091614.13660-1-jgross@suse.com --- drivers/xen/events/events_base.c | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 95e5a9300ff0..fdeeef2b9947 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -90,6 +90,8 @@ static bool (*pirq_needs_eoi)(unsigned irq); /* Xen will never allocate port zero for any purpose. */ #define VALID_EVTCHN(chn) ((chn) != 0) +static struct irq_info *legacy_info_ptrs[NR_IRQS_LEGACY]; + static struct irq_chip xen_dynamic_chip; static struct irq_chip xen_percpu_chip; static struct irq_chip xen_pirq_chip; @@ -154,7 +156,18 @@ int get_evtchn_to_irq(unsigned evtchn) /* Get info for IRQ */ struct irq_info *info_for_irq(unsigned irq) { - return irq_get_chip_data(irq); + if (irq < nr_legacy_irqs()) + return legacy_info_ptrs[irq]; + else + return irq_get_chip_data(irq); +} + +static void set_info_for_irq(unsigned int irq, struct irq_info *info) +{ + if (irq < nr_legacy_irqs()) + legacy_info_ptrs[irq] = info; + else + irq_set_chip_data(irq, info); } /* Constructors for packed IRQ information. */ @@ -375,7 +388,7 @@ static void xen_irq_init(unsigned irq) info->type = IRQT_UNBOUND; info->refcnt = -1; - irq_set_chip_data(irq, info); + set_info_for_irq(irq, info); list_add_tail(&info->list, &xen_irq_list_head); } @@ -424,14 +437,14 @@ static int __must_check xen_allocate_irq_gsi(unsigned gsi) static void xen_free_irq(unsigned irq) { - struct irq_info *info = irq_get_chip_data(irq); + struct irq_info *info = info_for_irq(irq); if (WARN_ON(!info)) return; list_del(&info->list); - irq_set_chip_data(irq, NULL); + set_info_for_irq(irq, NULL); WARN_ON(info->refcnt > 0); @@ -601,7 +614,7 @@ EXPORT_SYMBOL_GPL(xen_irq_from_gsi); static void __unbind_from_irq(unsigned int irq) { int evtchn = evtchn_from_irq(irq); - struct irq_info *info = irq_get_chip_data(irq); + struct irq_info *info = info_for_irq(irq); if (info->refcnt > 0) { info->refcnt--; @@ -1105,7 +1118,7 @@ int bind_ipi_to_irqhandler(enum ipi_vector ipi, void unbind_from_irqhandler(unsigned int irq, void *dev_id) { - struct irq_info *info = irq_get_chip_data(irq); + struct irq_info *info = info_for_irq(irq); if (WARN_ON(!info)) return; @@ -1139,7 +1152,7 @@ int evtchn_make_refcounted(unsigned int evtchn) if (irq == -1) return -ENOENT; - info = irq_get_chip_data(irq); + info = info_for_irq(irq); if (!info) return -ENOENT; @@ -1167,7 +1180,7 @@ int evtchn_get(unsigned int evtchn) if (irq == -1) goto done; - info = irq_get_chip_data(irq); + info = info_for_irq(irq); if (!info) goto done; From patchwork Tue Nov 3 14:19:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?b?SsO8cmdlbiBHcm/Dnw==?= X-Patchwork-Id: 317273 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2E94C63699 for ; Tue, 3 Nov 2020 14:20:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B7FFF2242E for ; Tue, 3 Nov 2020 14:20:42 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="BEqq6QE/" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729571AbgKCOUm (ORCPT ); Tue, 3 Nov 2020 09:20:42 -0500 Received: from mx2.suse.de ([195.135.220.15]:45020 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729572AbgKCOTY (ORCPT ); Tue, 3 Nov 2020 09:19:24 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1604413163; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=D5JMDRpr4XQxvKEMDlxF9Pb3B5vWxz5wnOaFOseKtlY=; b=BEqq6QE/1qJUzv6g0NmspiFD60C0l2wqrUOJx5JtqSfYkRX5CYRcCyL7b8ej8hYPBezIwh WEg665wOdAPi4TIwsFc6RzphzmMdX8fJuXsE8+LZPxlTWzGYKZoKjucb0RcqWJOCgBsbzb NUWdJYBDit/B5gM/W4XMHHnt5rbVskU= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 6396AADAD for ; Tue, 3 Nov 2020 14:19:23 +0000 (UTC) From: Juergen Gross To: stable@vger.kernel.org Subject: [PATCH v2 03/14] xen/events: add a proper barrier to 2-level uevent unmasking Date: Tue, 3 Nov 2020 15:19:11 +0100 Message-Id: <20201103141922.21026-4-jgross@suse.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201103141922.21026-1-jgross@suse.com> References: <20201103141922.21026-1-jgross@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org A follow-up patch will require certain write to happen before an event channel is unmasked. While the memory barrier is not strictly necessary for all the callers, the main one will need it. In order to avoid an extra memory barrier when using fifo event channels, mandate evtchn_unmask() to provide write ordering. The 2-level event handling unmask operation is missing an appropriate barrier, so add it. Fifo event channels are fine in this regard due to using sync_cmpxchg(). This is part of XSA-332. This is upstream commit 4d3fe31bd993ef504350989786858aefdb877daa Cc: stable@vger.kernel.org Suggested-by: Julien Grall Signed-off-by: Juergen Gross Reviewed-by: Julien Grall Reviewed-by: Wei Liu --- drivers/xen/events/events_2l.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/xen/events/events_2l.c b/drivers/xen/events/events_2l.c index 8edef51c92e5..e4b75693600e 100644 --- a/drivers/xen/events/events_2l.c +++ b/drivers/xen/events/events_2l.c @@ -91,6 +91,8 @@ static void evtchn_2l_unmask(unsigned port) BUG_ON(!irqs_disabled()); + smp_wmb(); /* All writes before unmask must be visible. */ + if (unlikely((cpu != cpu_from_evtchn(port)))) do_hypercall = 1; else { From patchwork Tue Nov 3 14:19:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?b?SsO8cmdlbiBHcm/Dnw==?= X-Patchwork-Id: 317275 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 56BFFC5DF9D for ; Tue, 3 Nov 2020 14:20:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1650F22280 for ; Tue, 3 Nov 2020 14:20:42 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="F5hDHXSl" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729568AbgKCOUl (ORCPT ); Tue, 3 Nov 2020 09:20:41 -0500 Received: from mx2.suse.de ([195.135.220.15]:45032 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729573AbgKCOTZ (ORCPT ); Tue, 3 Nov 2020 09:19:25 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1604413163; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=17FMAHhs4YL1/t0+anTrzXuWljmDTLAYkMFKjnFnPu8=; b=F5hDHXSl98U8GB8s027G7mn7cM+qF/oR3I30J2aVOLfZUJx22JUHBaFbqhtQXzJkAPoSmX +AcCQaHBpk5lOhOKiXkFcd1IrGG1Qe058jle97QjL8JUuWJ7BhQGAkia+1QbPDtGRvm2Q3 HJbymcdvAwylI12kp/ZiFvYX2nHOVIU= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 74975ADBB for ; Tue, 3 Nov 2020 14:19:23 +0000 (UTC) From: Juergen Gross To: stable@vger.kernel.org Subject: [PATCH v2 04/14] xen/events: fix race in evtchn_fifo_unmask() Date: Tue, 3 Nov 2020 15:19:12 +0100 Message-Id: <20201103141922.21026-5-jgross@suse.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201103141922.21026-1-jgross@suse.com> References: <20201103141922.21026-1-jgross@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org Unmasking a fifo event channel can result in unmasking it twice, once directly in the kernel and once via a hypercall in case the event was pending. Fix that by doing the local unmask only if the event is not pending. This is part of XSA-332. This is upstream commit f01337197419b7e8a492e83089552b77d3b5fb90 Cc: stable@vger.kernel.org Signed-off-by: Juergen Gross Reviewed-by: Jan Beulich --- drivers/xen/events/events_fifo.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c index 76b318e88382..3071256a9413 100644 --- a/drivers/xen/events/events_fifo.c +++ b/drivers/xen/events/events_fifo.c @@ -227,19 +227,25 @@ static bool evtchn_fifo_is_masked(unsigned port) return sync_test_bit(EVTCHN_FIFO_BIT(MASKED, word), BM(word)); } /* - * Clear MASKED, spinning if BUSY is set. + * Clear MASKED if not PENDING, spinning if BUSY is set. + * Return true if mask was cleared. */ -static void clear_masked(volatile event_word_t *word) +static bool clear_masked_cond(volatile event_word_t *word) { event_word_t new, old, w; w = *word; do { + if (w & (1 << EVTCHN_FIFO_PENDING)) + return false; + old = w & ~(1 << EVTCHN_FIFO_BUSY); new = old & ~(1 << EVTCHN_FIFO_MASKED); w = sync_cmpxchg(word, old, new); } while (w != old); + + return true; } static void evtchn_fifo_unmask(unsigned port) @@ -248,8 +254,7 @@ static void evtchn_fifo_unmask(unsigned port) BUG_ON(!irqs_disabled()); - clear_masked(word); - if (evtchn_fifo_is_pending(port)) { + if (!clear_masked_cond(word)) { struct evtchn_unmask unmask = { .port = port }; (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); } From patchwork Tue Nov 3 14:19:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?b?SsO8cmdlbiBHcm/Dnw==?= X-Patchwork-Id: 317276 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95275C56201 for ; Tue, 3 Nov 2020 14:20:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3BA3B223AB for ; Tue, 3 Nov 2020 14:20:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="n8pqmROB" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729586AbgKCOUk (ORCPT ); Tue, 3 Nov 2020 09:20:40 -0500 Received: from mx2.suse.de ([195.135.220.15]:45104 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729580AbgKCOT0 (ORCPT ); Tue, 3 Nov 2020 09:19:26 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1604413163; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6srHEdc7K9ItGRZ8SPqZ63pI0fKJQUiatFahMjHcQpI=; b=n8pqmROBSQx9juFkGAJ99ZdreXWIuTlFmdar4UK8fZFtwpvef6OBimoGGd6XfpzMKGknu9 t4/Qd2k6yQ2rYprYsNUmsOx4QN7/fD8a0VniV8LMm9MVgn27FYmfdtvhX7j6JHUtBYsvOm e7wTNk9GKNV4ABWrCibHRix5eg8DcMw= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id A1105AE47 for ; Tue, 3 Nov 2020 14:19:23 +0000 (UTC) From: Juergen Gross To: stable@vger.kernel.org Subject: [PATCH v2 06/14] xen/blkback: use lateeoi irq binding Date: Tue, 3 Nov 2020 15:19:14 +0100 Message-Id: <20201103141922.21026-7-jgross@suse.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201103141922.21026-1-jgross@suse.com> References: <20201103141922.21026-1-jgross@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org In order to reduce the chance for the system becoming unresponsive due to event storms triggered by a misbehaving blkfront use the lateeoi irq binding for blkback and unmask the event channel only after processing all pending requests. As the thread processing requests is used to do purging work in regular intervals an EOI may be sent only after having received an event. If there was no pending I/O request flag the EOI as spurious. This is part of XSA-332. This is upstream commit 01263a1fabe30b4d542f34c7e2364a22587ddaf2 Cc: stable@vger.kernel.org Reported-by: Julien Grall Signed-off-by: Juergen Gross Reviewed-by: Jan Beulich Reviewed-by: Wei Liu --- drivers/block/xen-blkback/blkback.c | 22 +++++++++++++++++----- drivers/block/xen-blkback/xenbus.c | 5 ++--- 2 files changed, 19 insertions(+), 8 deletions(-) diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c index 3666afa639d1..b18f0162cb9c 100644 --- a/drivers/block/xen-blkback/blkback.c +++ b/drivers/block/xen-blkback/blkback.c @@ -202,7 +202,7 @@ static inline void shrink_free_pagepool(struct xen_blkif_ring *ring, int num) #define vaddr(page) ((unsigned long)pfn_to_kaddr(page_to_pfn(page))) -static int do_block_io_op(struct xen_blkif_ring *ring); +static int do_block_io_op(struct xen_blkif_ring *ring, unsigned int *eoi_flags); static int dispatch_rw_block_io(struct xen_blkif_ring *ring, struct blkif_request *req, struct pending_req *pending_req); @@ -615,6 +615,8 @@ int xen_blkif_schedule(void *arg) struct xen_vbd *vbd = &blkif->vbd; unsigned long timeout; int ret; + bool do_eoi; + unsigned int eoi_flags = XEN_EOI_FLAG_SPURIOUS; set_freezable(); while (!kthread_should_stop()) { @@ -639,16 +641,23 @@ int xen_blkif_schedule(void *arg) if (timeout == 0) goto purge_gnt_list; + do_eoi = ring->waiting_reqs; + ring->waiting_reqs = 0; smp_mb(); /* clear flag *before* checking for work */ - ret = do_block_io_op(ring); + ret = do_block_io_op(ring, &eoi_flags); if (ret > 0) ring->waiting_reqs = 1; if (ret == -EACCES) wait_event_interruptible(ring->shutdown_wq, kthread_should_stop()); + if (do_eoi && !ring->waiting_reqs) { + xen_irq_lateeoi(ring->irq, eoi_flags); + eoi_flags |= XEN_EOI_FLAG_SPURIOUS; + } + purge_gnt_list: if (blkif->vbd.feature_gnt_persistent && time_after(jiffies, ring->next_lru)) { @@ -1121,7 +1130,7 @@ static void end_block_io_op(struct bio *bio) * and transmute it to the block API to hand it over to the proper block disk. */ static int -__do_block_io_op(struct xen_blkif_ring *ring) +__do_block_io_op(struct xen_blkif_ring *ring, unsigned int *eoi_flags) { union blkif_back_rings *blk_rings = &ring->blk_rings; struct blkif_request req; @@ -1144,6 +1153,9 @@ __do_block_io_op(struct xen_blkif_ring *ring) if (RING_REQUEST_CONS_OVERFLOW(&blk_rings->common, rc)) break; + /* We've seen a request, so clear spurious eoi flag. */ + *eoi_flags &= ~XEN_EOI_FLAG_SPURIOUS; + if (kthread_should_stop()) { more_to_do = 1; break; @@ -1202,13 +1214,13 @@ __do_block_io_op(struct xen_blkif_ring *ring) } static int -do_block_io_op(struct xen_blkif_ring *ring) +do_block_io_op(struct xen_blkif_ring *ring, unsigned int *eoi_flags) { union blkif_back_rings *blk_rings = &ring->blk_rings; int more_to_do; do { - more_to_do = __do_block_io_op(ring); + more_to_do = __do_block_io_op(ring, eoi_flags); if (more_to_do) break; diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index 25c41ce070a7..93896c992245 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -237,9 +237,8 @@ static int xen_blkif_map(struct xen_blkif_ring *ring, grant_ref_t *gref, BUG(); } - err = bind_interdomain_evtchn_to_irqhandler(blkif->domid, evtchn, - xen_blkif_be_int, 0, - "blkif-backend", ring); + err = bind_interdomain_evtchn_to_irqhandler_lateeoi(blkif->domid, + evtchn, xen_blkif_be_int, 0, "blkif-backend", ring); if (err < 0) { xenbus_unmap_ring_vfree(blkif->be->dev, ring->blk_ring); ring->blk_rings.common.sring = NULL; From patchwork Tue Nov 3 14:19:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?b?SsO8cmdlbiBHcm/Dnw==?= X-Patchwork-Id: 317277 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9DDBC2D0A3 for ; Tue, 3 Nov 2020 14:20:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 48A38223AB for ; Tue, 3 Nov 2020 14:20:39 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="YqTUFNrH" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729611AbgKCOUi (ORCPT ); Tue, 3 Nov 2020 09:20:38 -0500 Received: from mx2.suse.de ([195.135.220.15]:45122 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729585AbgKCOT1 (ORCPT ); Tue, 3 Nov 2020 09:19:27 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1604413163; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SKaEFe1nsei6EmdcfmJuP8QLhcXvJDuVM86N5s6gOT4=; b=YqTUFNrHTBCzEzURa/9Pnenq67nGKPkMZ0VVX0C+fIIHyKIGfXS+b2Ag9vu9fPesvez8ws WqV7eBT0YtNZ84kfYOzcf7IMw0ydj9go2YGQymkIJiaOmgx/H/Nqv03Z18EWHZhBz4tBid lxUxUkHu8VlRAdNWQitckFE+/m+kTKo= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id B4892ADC2 for ; Tue, 3 Nov 2020 14:19:23 +0000 (UTC) From: Juergen Gross To: stable@vger.kernel.org Subject: [PATCH v2 07/14] xen/netback: use lateeoi irq binding Date: Tue, 3 Nov 2020 15:19:15 +0100 Message-Id: <20201103141922.21026-8-jgross@suse.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201103141922.21026-1-jgross@suse.com> References: <20201103141922.21026-1-jgross@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org In order to reduce the chance for the system becoming unresponsive due to event storms triggered by a misbehaving netfront use the lateeoi irq binding for netback and unmask the event channel only just before going to sleep waiting for new events. Make sure not to issue an EOI when none is pending by introducing an eoi_pending element to struct xenvif_queue. When no request has been consumed set the spurious flag when sending the EOI for an interrupt. This is part of XSA-332. This is upstream commit 23025393dbeb3b8b3b60ebfa724cdae384992e27 Cc: stable@vger.kernel.org Reported-by: Julien Grall Signed-off-by: Juergen Gross Reviewed-by: Jan Beulich Reviewed-by: Wei Liu --- drivers/net/xen-netback/common.h | 15 +++++++ drivers/net/xen-netback/interface.c | 61 ++++++++++++++++++++++++----- drivers/net/xen-netback/netback.c | 11 +++++- drivers/net/xen-netback/rx.c | 13 ++++-- 4 files changed, 86 insertions(+), 14 deletions(-) diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 936c0b3e0ba2..86d23d0f563c 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -140,6 +140,20 @@ struct xenvif_queue { /* Per-queue data for xenvif */ char name[QUEUE_NAME_SIZE]; /* DEVNAME-qN */ struct xenvif *vif; /* Parent VIF */ + /* + * TX/RX common EOI handling. + * When feature-split-event-channels = 0, interrupt handler sets + * NETBK_COMMON_EOI, otherwise NETBK_RX_EOI and NETBK_TX_EOI are set + * by the RX and TX interrupt handlers. + * RX and TX handler threads will issue an EOI when either + * NETBK_COMMON_EOI or their specific bits (NETBK_RX_EOI or + * NETBK_TX_EOI) are set and they will reset those bits. + */ + atomic_t eoi_pending; +#define NETBK_RX_EOI 0x01 +#define NETBK_TX_EOI 0x02 +#define NETBK_COMMON_EOI 0x04 + /* Use NAPI for guest TX */ struct napi_struct napi; /* When feature-split-event-channels = 0, tx_irq = rx_irq. */ @@ -357,6 +371,7 @@ int xenvif_dealloc_kthread(void *data); irqreturn_t xenvif_ctrl_irq_fn(int irq, void *data); +bool xenvif_have_rx_work(struct xenvif_queue *queue, bool test_kthread); void xenvif_rx_action(struct xenvif_queue *queue); void xenvif_rx_queue_tail(struct xenvif_queue *queue, struct sk_buff *skb); diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index 4cafc31b98b7..c960cb7e3251 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -77,12 +77,28 @@ int xenvif_schedulable(struct xenvif *vif) !vif->disabled; } +static bool xenvif_handle_tx_interrupt(struct xenvif_queue *queue) +{ + bool rc; + + rc = RING_HAS_UNCONSUMED_REQUESTS(&queue->tx); + if (rc) + napi_schedule(&queue->napi); + return rc; +} + static irqreturn_t xenvif_tx_interrupt(int irq, void *dev_id) { struct xenvif_queue *queue = dev_id; + int old; - if (RING_HAS_UNCONSUMED_REQUESTS(&queue->tx)) - napi_schedule(&queue->napi); + old = atomic_fetch_or(NETBK_TX_EOI, &queue->eoi_pending); + WARN(old & NETBK_TX_EOI, "Interrupt while EOI pending\n"); + + if (!xenvif_handle_tx_interrupt(queue)) { + atomic_andnot(NETBK_TX_EOI, &queue->eoi_pending); + xen_irq_lateeoi(irq, XEN_EOI_FLAG_SPURIOUS); + } return IRQ_HANDLED; } @@ -116,19 +132,46 @@ static int xenvif_poll(struct napi_struct *napi, int budget) return work_done; } +static bool xenvif_handle_rx_interrupt(struct xenvif_queue *queue) +{ + bool rc; + + rc = xenvif_have_rx_work(queue, false); + if (rc) + xenvif_kick_thread(queue); + return rc; +} + static irqreturn_t xenvif_rx_interrupt(int irq, void *dev_id) { struct xenvif_queue *queue = dev_id; + int old; - xenvif_kick_thread(queue); + old = atomic_fetch_or(NETBK_RX_EOI, &queue->eoi_pending); + WARN(old & NETBK_RX_EOI, "Interrupt while EOI pending\n"); + + if (!xenvif_handle_rx_interrupt(queue)) { + atomic_andnot(NETBK_RX_EOI, &queue->eoi_pending); + xen_irq_lateeoi(irq, XEN_EOI_FLAG_SPURIOUS); + } return IRQ_HANDLED; } irqreturn_t xenvif_interrupt(int irq, void *dev_id) { - xenvif_tx_interrupt(irq, dev_id); - xenvif_rx_interrupt(irq, dev_id); + struct xenvif_queue *queue = dev_id; + int old; + + old = atomic_fetch_or(NETBK_COMMON_EOI, &queue->eoi_pending); + WARN(old, "Interrupt while EOI pending\n"); + + /* Use bitwise or as we need to call both functions. */ + if ((!xenvif_handle_tx_interrupt(queue) | + !xenvif_handle_rx_interrupt(queue))) { + atomic_andnot(NETBK_COMMON_EOI, &queue->eoi_pending); + xen_irq_lateeoi(irq, XEN_EOI_FLAG_SPURIOUS); + } return IRQ_HANDLED; } @@ -595,7 +638,7 @@ int xenvif_connect_ctrl(struct xenvif *vif, grant_ref_t ring_ref, shared = (struct xen_netif_ctrl_sring *)addr; BACK_RING_INIT(&vif->ctrl, shared, XEN_PAGE_SIZE); - err = bind_interdomain_evtchn_to_irq(vif->domid, evtchn); + err = bind_interdomain_evtchn_to_irq_lateeoi(vif->domid, evtchn); if (err < 0) goto err_unmap; @@ -653,7 +696,7 @@ int xenvif_connect_data(struct xenvif_queue *queue, if (tx_evtchn == rx_evtchn) { /* feature-split-event-channels == 0 */ - err = bind_interdomain_evtchn_to_irqhandler( + err = bind_interdomain_evtchn_to_irqhandler_lateeoi( queue->vif->domid, tx_evtchn, xenvif_interrupt, 0, queue->name, queue); if (err < 0) @@ -664,7 +707,7 @@ int xenvif_connect_data(struct xenvif_queue *queue, /* feature-split-event-channels == 1 */ snprintf(queue->tx_irq_name, sizeof(queue->tx_irq_name), "%s-tx", queue->name); - err = bind_interdomain_evtchn_to_irqhandler( + err = bind_interdomain_evtchn_to_irqhandler_lateeoi( queue->vif->domid, tx_evtchn, xenvif_tx_interrupt, 0, queue->tx_irq_name, queue); if (err < 0) @@ -674,7 +717,7 @@ int xenvif_connect_data(struct xenvif_queue *queue, snprintf(queue->rx_irq_name, sizeof(queue->rx_irq_name), "%s-rx", queue->name); - err = bind_interdomain_evtchn_to_irqhandler( + err = bind_interdomain_evtchn_to_irqhandler_lateeoi( queue->vif->domid, rx_evtchn, xenvif_rx_interrupt, 0, queue->rx_irq_name, queue); if (err < 0) diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 1c849106b793..f228298c3bd0 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -162,6 +162,10 @@ void xenvif_napi_schedule_or_enable_events(struct xenvif_queue *queue) if (more_to_do) napi_schedule(&queue->napi); + else if (atomic_fetch_andnot(NETBK_TX_EOI | NETBK_COMMON_EOI, + &queue->eoi_pending) & + (NETBK_TX_EOI | NETBK_COMMON_EOI)) + xen_irq_lateeoi(queue->tx_irq, 0); } static void tx_add_credit(struct xenvif_queue *queue) @@ -1613,9 +1617,14 @@ static bool xenvif_ctrl_work_todo(struct xenvif *vif) irqreturn_t xenvif_ctrl_irq_fn(int irq, void *data) { struct xenvif *vif = data; + unsigned int eoi_flag = XEN_EOI_FLAG_SPURIOUS; - while (xenvif_ctrl_work_todo(vif)) + while (xenvif_ctrl_work_todo(vif)) { xenvif_ctrl_action(vif); + eoi_flag = 0; + } + + xen_irq_lateeoi(irq, eoi_flag); return IRQ_HANDLED; } diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c index ef5887037b22..9b62f65b630e 100644 --- a/drivers/net/xen-netback/rx.c +++ b/drivers/net/xen-netback/rx.c @@ -490,13 +490,13 @@ static bool xenvif_rx_queue_ready(struct xenvif_queue *queue) return queue->stalled && prod - cons >= 1; } -static bool xenvif_have_rx_work(struct xenvif_queue *queue) +bool xenvif_have_rx_work(struct xenvif_queue *queue, bool test_kthread) { return xenvif_rx_ring_slots_available(queue) || (queue->vif->stall_timeout && (xenvif_rx_queue_stalled(queue) || xenvif_rx_queue_ready(queue))) || - kthread_should_stop() || + (test_kthread && kthread_should_stop()) || queue->vif->disabled; } @@ -527,15 +527,20 @@ static void xenvif_wait_for_rx_work(struct xenvif_queue *queue) { DEFINE_WAIT(wait); - if (xenvif_have_rx_work(queue)) + if (xenvif_have_rx_work(queue, true)) return; for (;;) { long ret; prepare_to_wait(&queue->wq, &wait, TASK_INTERRUPTIBLE); - if (xenvif_have_rx_work(queue)) + if (xenvif_have_rx_work(queue, true)) break; + if (atomic_fetch_andnot(NETBK_RX_EOI | NETBK_COMMON_EOI, + &queue->eoi_pending) & + (NETBK_RX_EOI | NETBK_COMMON_EOI)) + xen_irq_lateeoi(queue->rx_irq, 0); + ret = schedule_timeout(xenvif_rx_queue_timeout(queue)); if (!ret) break; From patchwork Tue Nov 3 14:19:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?b?SsO8cmdlbiBHcm/Dnw==?= X-Patchwork-Id: 317274 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA8ACC63697 for ; Tue, 3 Nov 2020 14:20:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7578522264 for ; Tue, 3 Nov 2020 14:20:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="lpppwbZE" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729253AbgKCOUk (ORCPT ); Tue, 3 Nov 2020 09:20:40 -0500 Received: from mx2.suse.de ([195.135.220.15]:45128 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729595AbgKCOT0 (ORCPT ); Tue, 3 Nov 2020 09:19:26 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1604413163; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=r8mHf+joyz/0GpLe10Xdqsr1z1obANQBo/Ex5juEWB0=; b=lpppwbZE7+8QEQRFCH/GFK6RAV7Mo22V9IMqMSKB/leMG5f8W2IHSFlCo6wrjGnzAZ8Cvs uXNPJJiycurpt8t/J3mvjB1NAuUgEUQUgMzqSkmvEz6gLBpGknycAbYR/aGDFd5Wca2xmj HbdFezTx4CIc/6wdZXxkb+wOkpblEMk= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id DC470AE0B for ; Tue, 3 Nov 2020 14:19:23 +0000 (UTC) From: Juergen Gross To: stable@vger.kernel.org Subject: [PATCH v2 09/14] xen/pvcallsback: use lateeoi irq binding Date: Tue, 3 Nov 2020 15:19:17 +0100 Message-Id: <20201103141922.21026-10-jgross@suse.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201103141922.21026-1-jgross@suse.com> References: <20201103141922.21026-1-jgross@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org In order to reduce the chance for the system becoming unresponsive due to event storms triggered by a misbehaving pvcallsfront use the lateeoi irq binding for pvcallsback and unmask the event channel only after handling all write requests, which are the ones coming in via an irq. This requires modifying the logic a little bit to not require an event for each write request, but to keep the ioworker running until no further data is found on the ring page to be processed. This is part of XSA-332. This is upstream commit c8d647a326f06a39a8e5f0f1af946eacfa1835f8 Cc: stable@vger.kernel.org Reported-by: Julien Grall Signed-off-by: Juergen Gross Reviewed-by: Stefano Stabellini Reviewed-by: Wei Liu --- drivers/xen/pvcalls-back.c | 76 +++++++++++++++++++++++--------------- 1 file changed, 46 insertions(+), 30 deletions(-) diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c index 398fd8b1639d..f94bb6034a5a 100644 --- a/drivers/xen/pvcalls-back.c +++ b/drivers/xen/pvcalls-back.c @@ -75,6 +75,7 @@ struct sock_mapping { atomic_t write; atomic_t io; atomic_t release; + atomic_t eoi; void (*saved_data_ready)(struct sock *sk); struct pvcalls_ioworker ioworker; }; @@ -96,7 +97,7 @@ static int pvcalls_back_release_active(struct xenbus_device *dev, struct pvcalls_fedata *fedata, struct sock_mapping *map); -static void pvcalls_conn_back_read(void *opaque) +static bool pvcalls_conn_back_read(void *opaque) { struct sock_mapping *map = (struct sock_mapping *)opaque; struct msghdr msg; @@ -116,17 +117,17 @@ static void pvcalls_conn_back_read(void *opaque) virt_mb(); if (error) - return; + return false; size = pvcalls_queued(prod, cons, array_size); if (size >= array_size) - return; + return false; spin_lock_irqsave(&map->sock->sk->sk_receive_queue.lock, flags); if (skb_queue_empty(&map->sock->sk->sk_receive_queue)) { atomic_set(&map->read, 0); spin_unlock_irqrestore(&map->sock->sk->sk_receive_queue.lock, flags); - return; + return true; } spin_unlock_irqrestore(&map->sock->sk->sk_receive_queue.lock, flags); wanted = array_size - size; @@ -150,7 +151,7 @@ static void pvcalls_conn_back_read(void *opaque) ret = inet_recvmsg(map->sock, &msg, wanted, MSG_DONTWAIT); WARN_ON(ret > wanted); if (ret == -EAGAIN) /* shouldn't happen */ - return; + return true; if (!ret) ret = -ENOTCONN; spin_lock_irqsave(&map->sock->sk->sk_receive_queue.lock, flags); @@ -169,10 +170,10 @@ static void pvcalls_conn_back_read(void *opaque) virt_wmb(); notify_remote_via_irq(map->irq); - return; + return true; } -static void pvcalls_conn_back_write(struct sock_mapping *map) +static bool pvcalls_conn_back_write(struct sock_mapping *map) { struct pvcalls_data_intf *intf = map->ring; struct pvcalls_data *data = &map->data; @@ -189,7 +190,7 @@ static void pvcalls_conn_back_write(struct sock_mapping *map) array_size = XEN_FLEX_RING_SIZE(map->ring_order); size = pvcalls_queued(prod, cons, array_size); if (size == 0) - return; + return false; memset(&msg, 0, sizeof(msg)); msg.msg_flags |= MSG_DONTWAIT; @@ -207,12 +208,11 @@ static void pvcalls_conn_back_write(struct sock_mapping *map) atomic_set(&map->write, 0); ret = inet_sendmsg(map->sock, &msg, size); - if (ret == -EAGAIN || (ret >= 0 && ret < size)) { + if (ret == -EAGAIN) { atomic_inc(&map->write); atomic_inc(&map->io); + return true; } - if (ret == -EAGAIN) - return; /* write the data, then update the indexes */ virt_wmb(); @@ -225,9 +225,13 @@ static void pvcalls_conn_back_write(struct sock_mapping *map) } /* update the indexes, then notify the other end */ virt_wmb(); - if (prod != cons + ret) + if (prod != cons + ret) { atomic_inc(&map->write); + atomic_inc(&map->io); + } notify_remote_via_irq(map->irq); + + return true; } static void pvcalls_back_ioworker(struct work_struct *work) @@ -236,6 +240,7 @@ static void pvcalls_back_ioworker(struct work_struct *work) struct pvcalls_ioworker, register_work); struct sock_mapping *map = container_of(ioworker, struct sock_mapping, ioworker); + unsigned int eoi_flags = XEN_EOI_FLAG_SPURIOUS; while (atomic_read(&map->io) > 0) { if (atomic_read(&map->release) > 0) { @@ -243,10 +248,18 @@ static void pvcalls_back_ioworker(struct work_struct *work) return; } - if (atomic_read(&map->read) > 0) - pvcalls_conn_back_read(map); - if (atomic_read(&map->write) > 0) - pvcalls_conn_back_write(map); + if (atomic_read(&map->read) > 0 && + pvcalls_conn_back_read(map)) + eoi_flags = 0; + if (atomic_read(&map->write) > 0 && + pvcalls_conn_back_write(map)) + eoi_flags = 0; + + if (atomic_read(&map->eoi) > 0 && !atomic_read(&map->write)) { + atomic_set(&map->eoi, 0); + xen_irq_lateeoi(map->irq, eoi_flags); + eoi_flags = XEN_EOI_FLAG_SPURIOUS; + } atomic_dec(&map->io); } @@ -343,12 +356,9 @@ static struct sock_mapping *pvcalls_new_active_socket( goto out; map->bytes = page; - ret = bind_interdomain_evtchn_to_irqhandler(fedata->dev->otherend_id, - evtchn, - pvcalls_back_conn_event, - 0, - "pvcalls-backend", - map); + ret = bind_interdomain_evtchn_to_irqhandler_lateeoi( + fedata->dev->otherend_id, evtchn, + pvcalls_back_conn_event, 0, "pvcalls-backend", map); if (ret < 0) goto out; map->irq = ret; @@ -882,15 +892,18 @@ static irqreturn_t pvcalls_back_event(int irq, void *dev_id) { struct xenbus_device *dev = dev_id; struct pvcalls_fedata *fedata = NULL; + unsigned int eoi_flags = XEN_EOI_FLAG_SPURIOUS; - if (dev == NULL) - return IRQ_HANDLED; + if (dev) { + fedata = dev_get_drvdata(&dev->dev); + if (fedata) { + pvcalls_back_work(fedata); + eoi_flags = 0; + } + } - fedata = dev_get_drvdata(&dev->dev); - if (fedata == NULL) - return IRQ_HANDLED; + xen_irq_lateeoi(irq, eoi_flags); - pvcalls_back_work(fedata); return IRQ_HANDLED; } @@ -900,12 +913,15 @@ static irqreturn_t pvcalls_back_conn_event(int irq, void *sock_map) struct pvcalls_ioworker *iow; if (map == NULL || map->sock == NULL || map->sock->sk == NULL || - map->sock->sk->sk_user_data != map) + map->sock->sk->sk_user_data != map) { + xen_irq_lateeoi(irq, 0); return IRQ_HANDLED; + } iow = &map->ioworker; atomic_inc(&map->write); + atomic_inc(&map->eoi); atomic_inc(&map->io); queue_work(iow->wq, &iow->register_work); @@ -940,7 +956,7 @@ static int backend_connect(struct xenbus_device *dev) goto error; } - err = bind_interdomain_evtchn_to_irq(dev->otherend_id, evtchn); + err = bind_interdomain_evtchn_to_irq_lateeoi(dev->otherend_id, evtchn); if (err < 0) goto error; fedata->irq = err; From patchwork Tue Nov 3 14:19:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?b?SsO8cmdlbiBHcm/Dnw==?= X-Patchwork-Id: 317278 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DEAD0C55179 for ; Tue, 3 Nov 2020 14:20:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8917222264 for ; Tue, 3 Nov 2020 14:20:39 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="ufIFlmQA" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729585AbgKCOUi (ORCPT ); Tue, 3 Nov 2020 09:20:38 -0500 Received: from mx2.suse.de ([195.135.220.15]:45136 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729605AbgKCOT0 (ORCPT ); Tue, 3 Nov 2020 09:19:26 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1604413164; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oAKbdIX/mxQQnuryR1KtecGtrFQFmm0qgt2EWT/5Uvk=; b=ufIFlmQAY0uvlpXjC4Oic4kvKyzWqILk7r5kjEU6J/1XubsUfK3V09iCe8+PwVBMozqZAO nQFwb8vpqMCb39zcHlGlliSJb6F3OZ4qXXyCvbHoZLGxVQ9UMzSQ7E06OlTLLxyp09trEm ocuzkfwQqu/rcWS+VChf7rUrDLpsSaE= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 0E4B9B298 for ; Tue, 3 Nov 2020 14:19:24 +0000 (UTC) From: Juergen Gross To: stable@vger.kernel.org Subject: [PATCH v2 11/14] xen/events: switch user event channels to lateeoi model Date: Tue, 3 Nov 2020 15:19:19 +0100 Message-Id: <20201103141922.21026-12-jgross@suse.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201103141922.21026-1-jgross@suse.com> References: <20201103141922.21026-1-jgross@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org Instead of disabling the irq when an event is received and enabling it again when handled by the user process use the lateeoi model. This is part of XSA-332. This is upstream commit c44b849cee8c3ac587da3b0980e01f77500d158c Cc: stable@vger.kernel.org Reported-by: Julien Grall Signed-off-by: Juergen Gross Tested-by: Stefano Stabellini Reviewed-by: Stefano Stabellini Reviewed-by: Jan Beulich Reviewed-by: Wei Liu --- drivers/xen/evtchn.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c index 47c70b826a6a..4b11e60e37a3 100644 --- a/drivers/xen/evtchn.c +++ b/drivers/xen/evtchn.c @@ -166,7 +166,6 @@ static irqreturn_t evtchn_interrupt(int irq, void *data) "Interrupt for port %d, but apparently not enabled; per-user %p\n", evtchn->port, u); - disable_irq_nosync(irq); evtchn->enabled = false; spin_lock(&u->ring_prod_lock); @@ -292,7 +291,7 @@ static ssize_t evtchn_write(struct file *file, const char __user *buf, evtchn = find_evtchn(u, port); if (evtchn && !evtchn->enabled) { evtchn->enabled = true; - enable_irq(irq_from_evtchn(port)); + xen_irq_lateeoi(irq_from_evtchn(port), 0); } } @@ -392,8 +391,8 @@ static int evtchn_bind_to_user(struct per_user_data *u, int port) if (rc < 0) goto err; - rc = bind_evtchn_to_irqhandler(port, evtchn_interrupt, 0, - u->name, evtchn); + rc = bind_evtchn_to_irqhandler_lateeoi(port, evtchn_interrupt, 0, + u->name, evtchn); if (rc < 0) goto err; From patchwork Tue Nov 3 14:19:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?b?SsO8cmdlbiBHcm/Dnw==?= X-Patchwork-Id: 317279 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17C46C388F7 for ; Tue, 3 Nov 2020 14:20:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9C9CB22264 for ; Tue, 3 Nov 2020 14:20:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="Xm8+8UDW" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729636AbgKCOUh (ORCPT ); Tue, 3 Nov 2020 09:20:37 -0500 Received: from mx2.suse.de ([195.135.220.15]:45132 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729599AbgKCOT1 (ORCPT ); Tue, 3 Nov 2020 09:19:27 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1604413164; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CW9wDHEhnzEkX5jcnT4JMDRWtOfLafD6JuLpK5F9pWQ=; b=Xm8+8UDWpsUzG+2XggHxjOMKxorFG7vLk9alF0c0iXTzIuS3rShc6cyIZiTbRe22DkHl6Q yfA3p5H3dYjPWyQuAEZXGV+HKmQGZaT95IR/aThmVku9MLWGKaK235L7GsX3alHxAgiNzx Cd/Rlaya2SV4UVljMJGRBLbdgWZGJt8= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 39AAAB29C for ; Tue, 3 Nov 2020 14:19:24 +0000 (UTC) From: Juergen Gross To: stable@vger.kernel.org Subject: [PATCH v2 13/14] xen/events: defer eoi in case of excessive number of events Date: Tue, 3 Nov 2020 15:19:21 +0100 Message-Id: <20201103141922.21026-14-jgross@suse.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201103141922.21026-1-jgross@suse.com> References: <20201103141922.21026-1-jgross@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org In case rogue guests are sending events at high frequency it might happen that xen_evtchn_do_upcall() won't stop processing events in dom0. As this is done in irq handling a crash might be the result. In order to avoid that, delay further inter-domain events after some time in xen_evtchn_do_upcall() by forcing eoi processing into a worker on the same cpu, thus inhibiting new events coming in. The time after which eoi processing is to be delayed is configurable via a new module parameter "event_loop_timeout" which specifies the maximum event loop time in jiffies (default: 2, the value was chosen after some tests showing that a value of 2 was the lowest with an only slight drop of dom0 network throughput while multiple guests performed an event storm). How long eoi processing will be delayed can be specified via another parameter "event_eoi_delay" (again in jiffies, default 10, again the value was chosen after testing with different delay values). This is part of XSA-332. This is upstream commit e99502f76271d6bc4e374fe368c50c67a1fd3070 Cc: stable@vger.kernel.org Reported-by: Julien Grall Signed-off-by: Juergen Gross Reviewed-by: Stefano Stabellini Reviewed-by: Wei Liu --- .../admin-guide/kernel-parameters.txt | 8 + drivers/xen/events/events_2l.c | 7 +- drivers/xen/events/events_base.c | 189 +++++++++++++++++- drivers/xen/events/events_fifo.c | 30 +-- drivers/xen/events/events_internal.h | 14 +- 5 files changed, 216 insertions(+), 32 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index fb129272240c..8dbc8d4ec8f0 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -5270,6 +5270,14 @@ with /sys/devices/system/xen_memory/xen_memory0/scrub_pages. Default value controlled with CONFIG_XEN_SCRUB_PAGES_DEFAULT. + xen.event_eoi_delay= [XEN] + How long to delay EOI handling in case of event + storms (jiffies). Default is 10. + + xen.event_loop_timeout= [XEN] + After which time (jiffies) the event handling loop + should start to delay EOI handling. Default is 2. + xirc2ps_cs= [NET,PCMCIA] Format: ,,,,,[,[,[,]]] diff --git a/drivers/xen/events/events_2l.c b/drivers/xen/events/events_2l.c index e4b75693600e..f026624898e7 100644 --- a/drivers/xen/events/events_2l.c +++ b/drivers/xen/events/events_2l.c @@ -161,7 +161,7 @@ static inline xen_ulong_t active_evtchns(unsigned int cpu, * a bitset of words which contain pending event bits. The second * level is a bitset of pending events themselves. */ -static void evtchn_2l_handle_events(unsigned cpu) +static void evtchn_2l_handle_events(unsigned cpu, struct evtchn_loop_ctrl *ctrl) { int irq; xen_ulong_t pending_words; @@ -242,10 +242,7 @@ static void evtchn_2l_handle_events(unsigned cpu) /* Process port. */ port = (word_idx * BITS_PER_EVTCHN_WORD) + bit_idx; - irq = get_evtchn_to_irq(port); - - if (irq != -1) - generic_handle_irq(irq); + handle_irq_for_port(port, ctrl); bit_idx = (bit_idx + 1) % BITS_PER_EVTCHN_WORD; diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index bbeb1d3aecfd..d565f7ebb8ef 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -34,6 +34,8 @@ #include #include #include +#include +#include #ifdef CONFIG_X86 #include @@ -63,6 +65,15 @@ #include "events_internal.h" +#undef MODULE_PARAM_PREFIX +#define MODULE_PARAM_PREFIX "xen." + +static uint __read_mostly event_loop_timeout = 2; +module_param(event_loop_timeout, uint, 0644); + +static uint __read_mostly event_eoi_delay = 10; +module_param(event_eoi_delay, uint, 0644); + const struct evtchn_ops *evtchn_ops; /* @@ -86,6 +97,7 @@ static DEFINE_RWLOCK(evtchn_rwlock); * irq_mapping_update_lock * evtchn_rwlock * IRQ-desc lock + * percpu eoi_list_lock */ static LIST_HEAD(xen_irq_list_head); @@ -118,6 +130,8 @@ static struct irq_chip xen_pirq_chip; static void enable_dynirq(struct irq_data *data); static void disable_dynirq(struct irq_data *data); +static DEFINE_PER_CPU(unsigned int, irq_epoch); + static void clear_evtchn_to_irq_row(unsigned row) { unsigned col; @@ -397,17 +411,120 @@ void notify_remote_via_irq(int irq) } EXPORT_SYMBOL_GPL(notify_remote_via_irq); +struct lateeoi_work { + struct delayed_work delayed; + spinlock_t eoi_list_lock; + struct list_head eoi_list; +}; + +static DEFINE_PER_CPU(struct lateeoi_work, lateeoi); + +static void lateeoi_list_del(struct irq_info *info) +{ + struct lateeoi_work *eoi = &per_cpu(lateeoi, info->eoi_cpu); + unsigned long flags; + + spin_lock_irqsave(&eoi->eoi_list_lock, flags); + list_del_init(&info->eoi_list); + spin_unlock_irqrestore(&eoi->eoi_list_lock, flags); +} + +static void lateeoi_list_add(struct irq_info *info) +{ + struct lateeoi_work *eoi = &per_cpu(lateeoi, info->eoi_cpu); + struct irq_info *elem; + u64 now = get_jiffies_64(); + unsigned long delay; + unsigned long flags; + + if (now < info->eoi_time) + delay = info->eoi_time - now; + else + delay = 1; + + spin_lock_irqsave(&eoi->eoi_list_lock, flags); + + if (list_empty(&eoi->eoi_list)) { + list_add(&info->eoi_list, &eoi->eoi_list); + mod_delayed_work_on(info->eoi_cpu, system_wq, + &eoi->delayed, delay); + } else { + list_for_each_entry_reverse(elem, &eoi->eoi_list, eoi_list) { + if (elem->eoi_time <= info->eoi_time) + break; + } + list_add(&info->eoi_list, &elem->eoi_list); + } + + spin_unlock_irqrestore(&eoi->eoi_list_lock, flags); +} + static void xen_irq_lateeoi_locked(struct irq_info *info) { evtchn_port_t evtchn; + unsigned int cpu; evtchn = info->evtchn; - if (!VALID_EVTCHN(evtchn)) + if (!VALID_EVTCHN(evtchn) || !list_empty(&info->eoi_list)) return; + cpu = info->eoi_cpu; + if (info->eoi_time && info->irq_epoch == per_cpu(irq_epoch, cpu)) { + lateeoi_list_add(info); + return; + } + + info->eoi_time = 0; unmask_evtchn(evtchn); } +static void xen_irq_lateeoi_worker(struct work_struct *work) +{ + struct lateeoi_work *eoi; + struct irq_info *info; + u64 now = get_jiffies_64(); + unsigned long flags; + + eoi = container_of(to_delayed_work(work), struct lateeoi_work, delayed); + + read_lock_irqsave(&evtchn_rwlock, flags); + + while (true) { + spin_lock(&eoi->eoi_list_lock); + + info = list_first_entry_or_null(&eoi->eoi_list, struct irq_info, + eoi_list); + + if (info == NULL || now < info->eoi_time) { + spin_unlock(&eoi->eoi_list_lock); + break; + } + + list_del_init(&info->eoi_list); + + spin_unlock(&eoi->eoi_list_lock); + + info->eoi_time = 0; + + xen_irq_lateeoi_locked(info); + } + + if (info) + mod_delayed_work_on(info->eoi_cpu, system_wq, + &eoi->delayed, info->eoi_time - now); + + read_unlock_irqrestore(&evtchn_rwlock, flags); +} + +static void xen_cpu_init_eoi(unsigned int cpu) +{ + struct lateeoi_work *eoi = &per_cpu(lateeoi, cpu); + + INIT_DELAYED_WORK(&eoi->delayed, xen_irq_lateeoi_worker); + spin_lock_init(&eoi->eoi_list_lock); + INIT_LIST_HEAD(&eoi->eoi_list); +} + void xen_irq_lateeoi(unsigned int irq, unsigned int eoi_flags) { struct irq_info *info; @@ -427,6 +544,7 @@ EXPORT_SYMBOL_GPL(xen_irq_lateeoi); static void xen_irq_init(unsigned irq) { struct irq_info *info; + #ifdef CONFIG_SMP /* By default all event channels notify CPU#0. */ cpumask_copy(irq_get_affinity_mask(irq), cpumask_of(0)); @@ -441,6 +559,7 @@ static void xen_irq_init(unsigned irq) set_info_for_irq(irq, info); + INIT_LIST_HEAD(&info->eoi_list); list_add_tail(&info->list, &xen_irq_list_head); } @@ -496,6 +615,9 @@ static void xen_free_irq(unsigned irq) write_lock_irqsave(&evtchn_rwlock, flags); + if (!list_empty(&info->eoi_list)) + lateeoi_list_del(info); + list_del(&info->list); set_info_for_irq(irq, NULL); @@ -1355,6 +1477,54 @@ void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) notify_remote_via_irq(irq); } +struct evtchn_loop_ctrl { + ktime_t timeout; + unsigned count; + bool defer_eoi; +}; + +void handle_irq_for_port(evtchn_port_t port, struct evtchn_loop_ctrl *ctrl) +{ + int irq; + struct irq_info *info; + + irq = get_evtchn_to_irq(port); + if (irq == -1) + return; + + /* + * Check for timeout every 256 events. + * We are setting the timeout value only after the first 256 + * events in order to not hurt the common case of few loop + * iterations. The 256 is basically an arbitrary value. + * + * In case we are hitting the timeout we need to defer all further + * EOIs in order to ensure to leave the event handling loop rather + * sooner than later. + */ + if (!ctrl->defer_eoi && !(++ctrl->count & 0xff)) { + ktime_t kt = ktime_get(); + + if (!ctrl->timeout) { + kt = ktime_add_ms(kt, + jiffies_to_msecs(event_loop_timeout)); + ctrl->timeout = kt; + } else if (kt > ctrl->timeout) { + ctrl->defer_eoi = true; + } + } + + info = info_for_irq(irq); + + if (ctrl->defer_eoi) { + info->eoi_cpu = smp_processor_id(); + info->irq_epoch = __this_cpu_read(irq_epoch); + info->eoi_time = get_jiffies_64() + event_eoi_delay; + } + + generic_handle_irq(irq); +} + static DEFINE_PER_CPU(unsigned, xed_nesting_count); static void __xen_evtchn_do_upcall(void) @@ -1362,6 +1532,7 @@ static void __xen_evtchn_do_upcall(void) struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); int cpu = get_cpu(); unsigned count; + struct evtchn_loop_ctrl ctrl = { 0 }; read_lock(&evtchn_rwlock); @@ -1371,7 +1542,7 @@ static void __xen_evtchn_do_upcall(void) if (__this_cpu_inc_return(xed_nesting_count) - 1) goto out; - xen_evtchn_handle_events(cpu); + xen_evtchn_handle_events(cpu, &ctrl); BUG_ON(!irqs_disabled()); @@ -1382,6 +1553,13 @@ static void __xen_evtchn_do_upcall(void) out: read_unlock(&evtchn_rwlock); + /* + * Increment irq_epoch only now to defer EOIs only for + * xen_irq_lateeoi() invocations occurring from inside the loop + * above. + */ + __this_cpu_inc(irq_epoch); + put_cpu(); } @@ -1828,9 +2006,6 @@ void xen_callback_vector(void) void xen_callback_vector(void) {} #endif -#undef MODULE_PARAM_PREFIX -#define MODULE_PARAM_PREFIX "xen." - static bool fifo_events = true; module_param(fifo_events, bool, 0); @@ -1838,6 +2013,8 @@ static int xen_evtchn_cpu_prepare(unsigned int cpu) { int ret = 0; + xen_cpu_init_eoi(cpu); + if (evtchn_ops->percpu_init) ret = evtchn_ops->percpu_init(cpu); @@ -1864,6 +2041,8 @@ void __init xen_init_IRQ(void) if (ret < 0) xen_evtchn_2l_init(); + xen_cpu_init_eoi(smp_processor_id()); + cpuhp_setup_state_nocalls(CPUHP_XEN_EVTCHN_PREPARE, "xen/evtchn:prepare", xen_evtchn_cpu_prepare, xen_evtchn_cpu_dead); diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c index 59e6002c9699..33462521bfd0 100644 --- a/drivers/xen/events/events_fifo.c +++ b/drivers/xen/events/events_fifo.c @@ -275,19 +275,9 @@ static uint32_t clear_linked(volatile event_word_t *word) return w & EVTCHN_FIFO_LINK_MASK; } -static void handle_irq_for_port(unsigned port) -{ - int irq; - - irq = get_evtchn_to_irq(port); - if (irq != -1) - generic_handle_irq(irq); -} - -static void consume_one_event(unsigned cpu, +static void consume_one_event(unsigned cpu, struct evtchn_loop_ctrl *ctrl, struct evtchn_fifo_control_block *control_block, - unsigned priority, unsigned long *ready, - bool drop) + unsigned priority, unsigned long *ready) { struct evtchn_fifo_queue *q = &per_cpu(cpu_queue, cpu); uint32_t head; @@ -320,16 +310,17 @@ static void consume_one_event(unsigned cpu, clear_bit(priority, ready); if (evtchn_fifo_is_pending(port) && !evtchn_fifo_is_masked(port)) { - if (unlikely(drop)) + if (unlikely(!ctrl)) pr_warn("Dropping pending event for port %u\n", port); else - handle_irq_for_port(port); + handle_irq_for_port(port, ctrl); } q->head[priority] = head; } -static void __evtchn_fifo_handle_events(unsigned cpu, bool drop) +static void __evtchn_fifo_handle_events(unsigned cpu, + struct evtchn_loop_ctrl *ctrl) { struct evtchn_fifo_control_block *control_block; unsigned long ready; @@ -341,14 +332,15 @@ static void __evtchn_fifo_handle_events(unsigned cpu, bool drop) while (ready) { q = find_first_bit(&ready, EVTCHN_FIFO_MAX_QUEUES); - consume_one_event(cpu, control_block, q, &ready, drop); + consume_one_event(cpu, ctrl, control_block, q, &ready); ready |= xchg(&control_block->ready, 0); } } -static void evtchn_fifo_handle_events(unsigned cpu) +static void evtchn_fifo_handle_events(unsigned cpu, + struct evtchn_loop_ctrl *ctrl) { - __evtchn_fifo_handle_events(cpu, false); + __evtchn_fifo_handle_events(cpu, ctrl); } static void evtchn_fifo_resume(void) @@ -416,7 +408,7 @@ static int evtchn_fifo_percpu_init(unsigned int cpu) static int evtchn_fifo_percpu_deinit(unsigned int cpu) { - __evtchn_fifo_handle_events(cpu, true); + __evtchn_fifo_handle_events(cpu, NULL); return 0; } diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h index 980a56d51e4c..2cb9c2d2c5c0 100644 --- a/drivers/xen/events/events_internal.h +++ b/drivers/xen/events/events_internal.h @@ -32,11 +32,15 @@ enum xen_irq_type { */ struct irq_info { struct list_head list; + struct list_head eoi_list; int refcnt; enum xen_irq_type type; /* type */ unsigned irq; unsigned int evtchn; /* event channel */ unsigned short cpu; /* cpu bound */ + unsigned short eoi_cpu; /* EOI must happen on this cpu */ + unsigned int irq_epoch; /* If eoi_cpu valid: irq_epoch of event */ + u64 eoi_time; /* Time in jiffies when to EOI. */ union { unsigned short virq; @@ -55,6 +59,8 @@ struct irq_info { #define PIRQ_SHAREABLE (1 << 1) #define PIRQ_MSI_GROUP (1 << 2) +struct evtchn_loop_ctrl; + struct evtchn_ops { unsigned (*max_channels)(void); unsigned (*nr_channels)(void); @@ -69,7 +75,7 @@ struct evtchn_ops { void (*mask)(unsigned port); void (*unmask)(unsigned port); - void (*handle_events)(unsigned cpu); + void (*handle_events)(unsigned cpu, struct evtchn_loop_ctrl *ctrl); void (*resume)(void); int (*percpu_init)(unsigned int cpu); @@ -80,6 +86,7 @@ extern const struct evtchn_ops *evtchn_ops; extern int **evtchn_to_irq; int get_evtchn_to_irq(unsigned int evtchn); +void handle_irq_for_port(evtchn_port_t port, struct evtchn_loop_ctrl *ctrl); struct irq_info *info_for_irq(unsigned irq); unsigned cpu_from_irq(unsigned irq); @@ -137,9 +144,10 @@ static inline void unmask_evtchn(unsigned port) return evtchn_ops->unmask(port); } -static inline void xen_evtchn_handle_events(unsigned cpu) +static inline void xen_evtchn_handle_events(unsigned cpu, + struct evtchn_loop_ctrl *ctrl) { - return evtchn_ops->handle_events(cpu); + return evtchn_ops->handle_events(cpu, ctrl); } static inline void xen_evtchn_resume(void)