From patchwork Fri May 16 00:01:58 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Chia-Yu Chang \(Nokia\)"
X-Patchwork-Id: 890412
From: chia-yu.chang@nokia-bell-labs.com
To: horms@kernel.org, donald.hunter@gmail.com, xandfury@gmail.com, netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, andrew+netdev@lunn.ch, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com,
 vidhi_goel@apple.com
Cc: Chia-Yu Chang
Subject: [PATCH v16 net-next 2/5] sched: Dump configuration and statistics of dualpi2 qdisc
Date: Fri, 16 May 2025 02:01:58 +0200
Message-Id: <20250516000201.18008-3-chia-yu.chang@nokia-bell-labs.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250516000201.18008-1-chia-yu.chang@nokia-bell-labs.com>
References: <20250516000201.18008-1-chia-yu.chang@nokia-bell-labs.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org
MIME-Version: 1.0

From: Chia-Yu Chang

The configuration and statistics dump of the DualPI2 Qdisc provides
information related to both queues, such as packet numbers and queuing
delays in the L-queue and C-queue, as well as general information such as
probability value, WRR credits, memory usage, packet marking counters, max
queue size, etc.

The following patch includes enqueue/dequeue for DualPI2.

v16:
- Update convert_ns_to_usec() to avoid overflow

Signed-off-by: Chia-Yu Chang
---
 include/uapi/linux/pkt_sched.h | 15 ++++++
 net/sched/sch_dualpi2.c        | 89 ++++++++++++++++++++++++++++++++++
 2 files changed, 104 insertions(+)

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index ae8af0e8d479..a7243f32ff0f 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -1264,4 +1264,19 @@ enum {
 
 #define TCA_DUALPI2_MAX   (__TCA_DUALPI2_MAX - 1)
 
+struct tc_dualpi2_xstats {
+	__u32 prob;		/* current probability */
+	__u32 delay_c;		/* current delay in C queue */
+	__u32 delay_l;		/* current delay in L queue */
+	__u32 packets_in_c;	/* number of packets enqueued in C queue */
+	__u32 packets_in_l;	/* number of packets enqueued in L queue */
+	__u32 maxq;		/* maximum queue size */
+	__u32 ecn_mark;		/* packets marked with ecn*/
+	__u32 step_marks;	/* ECN marks due to the step AQM */
+	__s32 credit;		/* current c_protection credit */
+	__u32 memory_used;	/* Memory used of both queues */
+	__u32 max_memory_used;	/* Maximum used memory */
+	__u32 memory_limit;	/* Memory limit of both queues */
+};
+
 #endif
diff --git a/net/sched/sch_dualpi2.c b/net/sched/sch_dualpi2.c
index ffdfb7803e1f..97986c754e47 100644
--- a/net/sched/sch_dualpi2.c
+++ b/net/sched/sch_dualpi2.c
@@ -123,6 +123,14 @@ static u32 dualpi2_scale_alpha_beta(u32 param)
 	return tmp;
 }
 
+static u32 dualpi2_unscale_alpha_beta(u32 param)
+{
+	u64 tmp = ((u64)param * NSEC_PER_SEC << ALPHA_BETA_SCALING);
+
+	do_div(tmp, MAX_PROB);
+	return tmp;
+}
+
 static ktime_t next_pi2_timeout(struct dualpi2_sched_data *q)
 {
 	return ktime_add_ns(ktime_get_ns(), q->pi2_tupdate);
@@ -223,6 +231,15 @@ static u32 convert_us_to_nsec(u32 us)
 	return lower_32_bits(ns);
 }
 
+static u32 convert_ns_to_usec(u64 ns)
+{
+	do_div(ns, NSEC_PER_USEC);
+	if (upper_32_bits(ns))
+		return 0xffffffff;
+	else
+		return lower_32_bits(ns);
+}
+
 static enum hrtimer_restart dualpi2_timer(struct hrtimer *timer)
 {
 	struct dualpi2_sched_data *q = from_timer(q, timer, pi2_timer);
@@ -458,6 +475,76 @@ static int dualpi2_init(struct Qdisc *sch, struct nlattr *opt,
 	return 0;
 }
 
+static int dualpi2_dump(struct Qdisc *sch, struct sk_buff *skb)
+{
+	struct dualpi2_sched_data *q = qdisc_priv(sch);
+	struct nlattr *opts;
+
+	opts = nla_nest_start_noflag(skb, TCA_OPTIONS);
+	if (!opts)
+		goto nla_put_failure;
+
+	if (nla_put_u32(skb, TCA_DUALPI2_LIMIT, READ_ONCE(sch->limit)) ||
+	    nla_put_u32(skb, TCA_DUALPI2_MEMORY_LIMIT,
+			READ_ONCE(q->memory_limit)) ||
+	    nla_put_u32(skb, TCA_DUALPI2_TARGET,
+			convert_ns_to_usec(READ_ONCE(q->pi2_target))) ||
+	    nla_put_u32(skb, TCA_DUALPI2_TUPDATE,
+			convert_ns_to_usec(READ_ONCE(q->pi2_tupdate))) ||
+	    nla_put_u32(skb, TCA_DUALPI2_ALPHA,
+			dualpi2_unscale_alpha_beta(READ_ONCE(q->pi2_alpha))) ||
+	    nla_put_u32(skb, TCA_DUALPI2_BETA,
+			dualpi2_unscale_alpha_beta(READ_ONCE(q->pi2_beta))) ||
+	    nla_put_u32(skb, TCA_DUALPI2_STEP_THRESH,
+			READ_ONCE(q->step_in_packets) ?
+			READ_ONCE(q->step_thresh) :
+			convert_ns_to_usec(READ_ONCE(q->step_thresh))) ||
+	    nla_put_u32(skb, TCA_DUALPI2_MIN_QLEN_STEP,
+			READ_ONCE(q->min_qlen_step)) ||
+	    nla_put_u8(skb, TCA_DUALPI2_COUPLING,
+		       READ_ONCE(q->coupling_factor)) ||
+	    nla_put_u8(skb, TCA_DUALPI2_DROP_OVERLOAD,
+		       READ_ONCE(q->drop_overload)) ||
+	    (READ_ONCE(q->step_in_packets) &&
+	     nla_put_flag(skb, TCA_DUALPI2_STEP_PACKETS)) ||
+	    nla_put_u8(skb, TCA_DUALPI2_DROP_EARLY,
+		       READ_ONCE(q->drop_early)) ||
+	    nla_put_u8(skb, TCA_DUALPI2_C_PROTECTION,
+		       READ_ONCE(q->c_protection_wc)) ||
+	    nla_put_u8(skb, TCA_DUALPI2_ECN_MASK, READ_ONCE(q->ecn_mask)) ||
+	    nla_put_u8(skb, TCA_DUALPI2_SPLIT_GSO, READ_ONCE(q->split_gso)))
+		goto nla_put_failure;
+
+	return nla_nest_end(skb, opts);
+
+nla_put_failure:
+	nla_nest_cancel(skb, opts);
+	return -1;
+}
+
+static int dualpi2_dump_stats(struct Qdisc *sch, struct gnet_dump *d)
+{
+	struct dualpi2_sched_data *q = qdisc_priv(sch);
+	struct tc_dualpi2_xstats st = {
+		.prob			= READ_ONCE(q->pi2_prob),
+		.packets_in_c		= q->packets_in_c,
+		.packets_in_l		= q->packets_in_l,
+		.maxq			= q->maxq,
+		.ecn_mark		= q->ecn_mark,
+		.credit			= q->c_protection_credit,
+		.step_marks		= q->step_marks,
+		.memory_used		= q->memory_used,
+		.max_memory_used	= q->max_memory_used,
+		.memory_limit		= q->memory_limit,
+	};
+	u64 qc, ql;
+
+	get_queue_delays(q, &qc, &ql);
+	st.delay_l = convert_ns_to_usec(ql);
+	st.delay_c = convert_ns_to_usec(qc);
+	return gnet_stats_copy_app(d, &st, sizeof(st));
+}
+
 /* Reset both L-queue and C-queue, internal packet counters, PI probability,
  * C-queue protection credit, and timestamps, while preserving current
  * configuration of DUALPI2.
@@ -562,6 +649,8 @@ static struct Qdisc_ops dualpi2_qdisc_ops __read_mostly = {
 	.destroy	= dualpi2_destroy,
 	.reset		= dualpi2_reset,
 	.change		= dualpi2_change,
+	.dump		= dualpi2_dump,
+	.dump_stats	= dualpi2_dump_stats,
 	.owner		= THIS_MODULE,
 };
 

From patchwork Fri May 16 00:01:59 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang \(Nokia\)"
X-Patchwork-Id: 890413
From: chia-yu.chang@nokia-bell-labs.com
To: horms@kernel.org, donald.hunter@gmail.com, xandfury@gmail.com, netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, andrew+netdev@lunn.ch, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com
Cc: Olga Albisser, Olivier Tilmans, Henrik Steen, Bob Briscoe, Chia-Yu Chang
Subject: [PATCH v16 net-next 3/5] sched: Add enqueue/dequeue of dualpi2 qdisc
Date: Fri, 16 May 2025 02:01:59 +0200
Message-Id: <20250516000201.18008-4-chia-yu.chang@nokia-bell-labs.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250516000201.18008-1-chia-yu.chang@nokia-bell-labs.com>
References: <20250516000201.18008-1-chia-yu.chang@nokia-bell-labs.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org
MIME-Version: 1.0

From: Koen De Schepper

DualPI2 provides L4S-type low latency & loss to traffic that uses a
scalable congestion controller (e.g. TCP-Prague, DCTCP) without
degrading the performance of 'classic' traffic (e.g. Reno,
Cubic etc.). It is to be the reference implementation of IETF RFC9332
DualQ Coupled AQM (https://datatracker.ietf.org/doc/html/rfc9332).

Note that creating two independent queues cannot meet the goal of
DualPI2 mentioned in RFC9332: "...to preserve fairness between
ECN-capable and non-ECN-capable traffic." Further, it could even
lead to starvation of Classic traffic, which is also inconsistent
with the requirements in RFC9332: "...although priority MUST be
bounded in order not to starve Classic traffic."
DualPI2 is designed to maintain approximate per-flow fairness across
the L-queue and C-queue by forming a single qdisc that uses the coupling
factor and a scheduler between the two queues.

The qdisc provides two queues called low latency and classic. It
classifies packets based on the ECN field in the IP headers. By
default it directs non-ECN and ECT(0) packets into the classic queue
and ECT(1) and CE packets into the low latency queue, as per the IETF
spec.

Each queue runs its own AQM:
* The classic AQM is called PI2, which is similar to the PIE AQM but
  more responsive and simpler. Classic traffic requires a decent
  target queue (default 15ms for Internet deployment) to fully
  utilize the link and to avoid high drop rates.
* The low latency AQM is, by default, a very shallow ECN marking
  threshold (1ms) similar to that used for DCTCP.

The DualQ isolates the low queuing delay of the Low Latency queue
from the larger delay of the 'Classic' queue. However, from a
bandwidth perspective, flows in either queue will share out the link
capacity as if there were just a single queue. This bandwidth pooling
effect is achieved by coupling together the drop and ECN-marking
probabilities of the two AQMs.

The PI2 AQM has two main parameters in addition to its target delay.
The integral gain factor alpha is used to slowly correct any
persistent standing queue error from the target delay, while the
proportional gain factor beta is used to quickly compensate for queue
changes (growth or shrinkage). Alpha and beta are either given as
parameters, or calculated by tc from alternative typical and maximum
RTT parameters.

Internally, the output of a linear Proportional Integral (PI)
controller is used for both queues. This output is squared to
calculate the drop or ECN-marking probability of the classic queue.
This counterbalances the square-root rate equation of Reno/Cubic,
which is the trick that balances flow rates across the queues.

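As an illustration of the squaring step described above, here is a
minimal user-space sketch (not part of this patch). It assumes that
probabilities are 32-bit fixed-point fractions of UINT32_MAX; the name
sketch_classic_prob() is hypothetical. The qdisc code in this patch does
not compute the square explicitly: dualpi2_classic_marking() instead
requires two independent dualpi2_roll(prob) calls to both succeed, which
yields the same squared probability in expectation.

```c
#include <stdint.h>

/* Assumption: probabilities are fixed-point fractions of UINT32_MAX,
 * i.e. UINT32_MAX stands for 100%.
 */
#define SKETCH_MAX_PROB UINT32_MAX

/* Hypothetical helper: classic-queue probability p_C = p'^2, where p' is
 * the PI controller output. The product is computed in 64 bits so that
 * it cannot overflow before being scaled back to 32 bits.
 */
static uint32_t sketch_classic_prob(uint32_t pi_prob)
{
	uint64_t squared = (uint64_t)pi_prob * pi_prob;

	return (uint32_t)(squared / SKETCH_MAX_PROB);
}
```

For example, a PI output of 50% (0x80000000) gives a classic probability
of 25% (0x40000000), so the classic queue sees a much gentler signal
than the raw controller output.
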
For the ECN-marking probability of the low latency queue, the output
of the base AQM is multiplied by a coupling factor. This determines
the balance between the flow rates in each queue. The default setting
makes the flow rates roughly equal, which should be generally
applicable.

If the DUALPI2 AQM detects overload (due to excessive non-responsive
traffic in either queue), it switches to signaling congestion solely
using drop, irrespective of the ECN field. Alternatively, it can be
configured to limit the drop probability and let the queue grow and
eventually overflow (like tail-drop).

GSO splitting in DUALPI2 is configurable from userspace; the default
behavior is to split GSO packets. When running DUALPI2 at unshaped
10gigE with a 4-download-stream test, splitting GSO packets roughly
halves the latency with no loss in throughput:

Summary of tcp_4down run 'no_split_gso':
                         avg       median          # data pts
 Ping (ms) ICMP   :       0.53      0.30 ms              350
 TCP download avg :    2326.86       N/A Mbits/s         350
 TCP download sum :    9307.42       N/A Mbits/s         350
 TCP download::1  :    2672.99   2568.73 Mbits/s         350
 TCP download::2  :    2586.96   2570.51 Mbits/s         350
 TCP download::3  :    1786.26   1798.82 Mbits/s         350
 TCP download::4  :    2261.21   2309.49 Mbits/s         350

Summary of tcp_4down run 'split_gso':
                         avg       median          # data pts
 Ping (ms) ICMP   :       0.22      0.23 ms              350
 TCP download avg :    2335.02       N/A Mbits/s         350
 TCP download sum :    9340.09       N/A Mbits/s         350
 TCP download::1  :    2335.30   2334.22 Mbits/s         350
 TCP download::2  :    2334.72   2334.20 Mbits/s         350
 TCP download::3  :    2335.28   2334.58 Mbits/s         350
 TCP download::4  :    2334.79   2334.39 Mbits/s         350

A similar result is observed when running DUALPI2 at unshaped 1gigE
with a 1-download-stream test:

Summary of tcp_1down run 'no_split_gso':
                       avg       median          # data pts
 Ping (ms) ICMP :       1.13      1.25 ms              350
 TCP download   :     941.41    941.46 Mbits/s         350

Summary of tcp_1down run 'split_gso':
                       avg       median          # data pts
 Ping (ms) ICMP :       0.51      0.55 ms              350
 TCP download   :     941.41    941.45 Mbits/s         350

Additional details can be found in the RFC:
https://datatracker.ietf.org/doc/html/rfc9332

Signed-off-by: Koen De Schepper
Co-developed-by: Olga Albisser
Signed-off-by: Olga Albisser
Co-developed-by: Olivier Tilmans
Signed-off-by: Olivier Tilmans
Co-developed-by: Henrik Steen
Signed-off-by: Henrik Steen
Signed-off-by: Bob Briscoe
Signed-off-by: Ilpo Järvinen
Co-developed-by: Chia-Yu Chang
Signed-off-by: Chia-Yu Chang
Acked-by: Dave Taht
---
 include/net/dropreason-core.h |   6 +
 net/sched/Kconfig             |  12 +
 net/sched/Makefile            |   1 +
 net/sched/sch_dualpi2.c       | 449 ++++++++++++++++++++++++++++++++++
 4 files changed, 468 insertions(+)

diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
index bea77934a235..faae9f416e54 100644
--- a/include/net/dropreason-core.h
+++ b/include/net/dropreason-core.h
@@ -120,6 +120,7 @@
 	FN(ARP_PVLAN_DISABLE)		\
 	FN(MAC_IEEE_MAC_CONTROL)	\
 	FN(BRIDGE_INGRESS_STP_STATE)	\
+	FN(DUALPI2_STEP_DROP)		\
 	FNe(MAX)
 
 /**
@@ -570,6 +571,11 @@ enum skb_drop_reason {
 	 * ingress bridge port does not allow frames to be forwarded.
 	 */
 	SKB_DROP_REASON_BRIDGE_INGRESS_STP_STATE,
+	/**
+	 * @SKB_DROP_REASON_DUALPI2_STEP_DROP: dropped by the step drop
+	 * threshold of DualPI2 qdisc.
+	 */
+	SKB_DROP_REASON_DUALPI2_STEP_DROP,
 	/**
 	 * @SKB_DROP_REASON_MAX: the maximum of core drop reasons, which
 	 * shouldn't be used as a real 'reason' - only for tracing code gen
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 9f0b3f943fca..dda66a3590d8 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -415,6 +415,18 @@ config NET_SCH_BPF
 
 	  If unsure, say N.
 
+config NET_SCH_DUALPI2
+	tristate "Dual Queue PI Square (DUALPI2) scheduler"
+	help
+	  Say Y here if you want to use the Dual Queue Proportional Integral
+	  Controller Improved with a Square scheduling algorithm.
+	  For more information, please see https://tools.ietf.org/html/rfc9332
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called sch_dualpi2.
+
+	  If unsure, say N.
+
 menuconfig NET_SCH_DEFAULT
 	bool "Allow override default queue discipline"
 	help
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 904d784902d1..5078ea84e6ad 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -63,6 +63,7 @@ obj-$(CONFIG_NET_SCH_CBS)	+= sch_cbs.o
 obj-$(CONFIG_NET_SCH_ETF)	+= sch_etf.o
 obj-$(CONFIG_NET_SCH_TAPRIO)	+= sch_taprio.o
 obj-$(CONFIG_NET_SCH_BPF)	+= bpf_qdisc.o
+obj-$(CONFIG_NET_SCH_DUALPI2)	+= sch_dualpi2.o
 
 obj-$(CONFIG_NET_CLS_U32)	+= cls_u32.o
 obj-$(CONFIG_NET_CLS_ROUTE4)	+= cls_route.o
diff --git a/net/sched/sch_dualpi2.c b/net/sched/sch_dualpi2.c
index 97986c754e47..7ecd7502332c 100644
--- a/net/sched/sch_dualpi2.c
+++ b/net/sched/sch_dualpi2.c
@@ -113,8 +113,44 @@ struct dualpi2_sched_data {
 	u32	step_marks;	/* ECN mark pkt counter due to step AQM */
 	u32	memory_used;	/* Memory used of both queues */
 	u32	max_memory_used;/* Maximum used memory */
+
+	/* Deferred drop statistics */
+	u32	deferred_drops_cnt;	/* Packets dropped */
+	u32	deferred_drops_len;	/* Bytes dropped */
+};
+
+struct dualpi2_skb_cb {
+	u64 ts;			/* Timestamp at enqueue */
+	u8 apply_step:1,	/* Can we apply the step threshold */
+	   classified:2,	/* Packet classification results */
+	   ect:2;		/* Packet ECT codepoint */
+};
+
+enum dualpi2_classification_results {
+	DUALPI2_C_CLASSIC	= 0,	/* C-queue */
+	DUALPI2_C_L4S		= 1,	/* L-queue (scale mark/classic drop) */
+	DUALPI2_C_LLLL		= 2,	/* L-queue (no drops/marks) */
+	__DUALPI2_C_MAX			/* Keep last*/
 };
 
+static struct dualpi2_skb_cb *dualpi2_skb_cb(struct sk_buff *skb)
+{
+	qdisc_cb_private_validate(skb, sizeof(struct dualpi2_skb_cb));
+	return (struct dualpi2_skb_cb *)qdisc_skb_cb(skb)->data;
+}
+
+static u64 dualpi2_sojourn_time(struct sk_buff *skb, u64 reference)
+{
+	return reference - dualpi2_skb_cb(skb)->ts;
+}
+
+static u64 head_enqueue_time(struct Qdisc *q)
+{
+	struct sk_buff *skb = qdisc_peek_head(q);
+
+	return skb ? dualpi2_skb_cb(skb)->ts : 0;
+}
+
 static u32 dualpi2_scale_alpha_beta(u32 param)
 {
 	u64 tmp = ((u64)param * MAX_PROB >> ALPHA_BETA_SCALING);
@@ -136,6 +172,25 @@ static ktime_t next_pi2_timeout(struct dualpi2_sched_data *q)
 	return ktime_add_ns(ktime_get_ns(), q->pi2_tupdate);
 }
 
+static bool skb_is_l4s(struct sk_buff *skb)
+{
+	return dualpi2_skb_cb(skb)->classified == DUALPI2_C_L4S;
+}
+
+static bool skb_in_l_queue(struct sk_buff *skb)
+{
+	return dualpi2_skb_cb(skb)->classified != DUALPI2_C_CLASSIC;
+}
+
+static bool dualpi2_mark(struct dualpi2_sched_data *q, struct sk_buff *skb)
+{
+	if (INET_ECN_set_ce(skb)) {
+		q->ecn_mark++;
+		return true;
+	}
+	return false;
+}
+
 static void dualpi2_reset_c_protection(struct dualpi2_sched_data *q)
 {
 	q->c_protection_credit = q->c_protection_init;
@@ -155,6 +210,398 @@ static void dualpi2_calculate_c_protection(struct Qdisc *sch,
 	dualpi2_reset_c_protection(q);
 }
 
+static bool dualpi2_roll(u32 prob)
+{
+	return get_random_u32() <= prob;
+}
+
+/* Packets in the C-queue are subject to a marking probability pC, which is the
+ * square of the internal PI probability (i.e., have an overall lower mark/drop
+ * probability). If the qdisc is overloaded, ignore ECT values and only drop.
+ *
+ * Note that this marking scheme is also applied to L4S packets during overload.
+ * Return true if packet dropping is required in C queue
+ */
+static bool dualpi2_classic_marking(struct dualpi2_sched_data *q,
+				    struct sk_buff *skb, u32 prob,
+				    bool overload)
+{
+	if (dualpi2_roll(prob) && dualpi2_roll(prob)) {
+		if (overload || dualpi2_skb_cb(skb)->ect == INET_ECN_NOT_ECT)
+			return true;
+		dualpi2_mark(q, skb);
+	}
+	return false;
+}
+
+/* Packets in the L-queue are subject to a marking probability pL given by the
+ * internal PI probability scaled by the coupling factor.
+ * + * On overload (i.e., @local_l_prob is >= 100%): + * - if the qdisc is configured to trade losses to preserve latency (i.e., + * @q->drop_overload), apply classic drops first before marking. + * - otherwise, preserve the "no loss" property of ECN at the cost of queueing + * delay, eventually resulting in taildrop behavior once sch->limit is + * reached. + * Return true if packet dropping is required in L queue + */ +static bool dualpi2_scalable_marking(struct dualpi2_sched_data *q, + struct sk_buff *skb, + u64 local_l_prob, u32 prob, + bool overload) +{ + if (overload) { + /* Apply classic drop */ + if (!q->drop_overload || + !(dualpi2_roll(prob) && dualpi2_roll(prob))) + goto mark; + return true; + } + + /* We can safely cut the upper 32b as overload==false */ + if (dualpi2_roll(local_l_prob)) { + /* Non-ECT packets could have been classified as L4S by filters. */ + if (dualpi2_skb_cb(skb)->ect == INET_ECN_NOT_ECT) + return true; +mark: + dualpi2_mark(q, skb); + } + return false; +} + +/* Decide whether a given packet must be dropped (or marked if ECT), according + * to the PI2 probability. + * + * Never mark/drop if we have a standing queue of less than 2 MTUs.
+ */ +static bool must_drop(struct Qdisc *sch, struct dualpi2_sched_data *q, + struct sk_buff *skb) +{ + u64 local_l_prob; + bool overload; + u32 prob; + + if (sch->qstats.backlog < 2 * psched_mtu(qdisc_dev(sch))) + return false; + + prob = READ_ONCE(q->pi2_prob); + local_l_prob = (u64)prob * q->coupling_factor; + overload = local_l_prob > MAX_PROB; + + switch (dualpi2_skb_cb(skb)->classified) { + case DUALPI2_C_CLASSIC: + return dualpi2_classic_marking(q, skb, prob, overload); + case DUALPI2_C_L4S: + return dualpi2_scalable_marking(q, skb, local_l_prob, prob, + overload); + default: /* DUALPI2_C_LLLL */ + return false; + } +} + +static void dualpi2_read_ect(struct sk_buff *skb) +{ + struct dualpi2_skb_cb *cb = dualpi2_skb_cb(skb); + int wlen = skb_network_offset(skb); + + switch (skb_protocol(skb, true)) { + case htons(ETH_P_IP): + wlen += sizeof(struct iphdr); + if (!pskb_may_pull(skb, wlen) || + skb_try_make_writable(skb, wlen)) + goto not_ecn; + + cb->ect = ipv4_get_dsfield(ip_hdr(skb)) & INET_ECN_MASK; + break; + case htons(ETH_P_IPV6): + wlen += sizeof(struct ipv6hdr); + if (!pskb_may_pull(skb, wlen) || + skb_try_make_writable(skb, wlen)) + goto not_ecn; + + cb->ect = ipv6_get_dsfield(ipv6_hdr(skb)) & INET_ECN_MASK; + break; + default: + goto not_ecn; + } + return; + +not_ecn: + /* Non pullable/writable packets can only be dropped hence are + * classified as not ECT. 
+ */ + cb->ect = INET_ECN_NOT_ECT; +} + +static int dualpi2_skb_classify(struct dualpi2_sched_data *q, + struct sk_buff *skb) +{ + struct dualpi2_skb_cb *cb = dualpi2_skb_cb(skb); + struct tcf_result res; + struct tcf_proto *fl; + int result; + + dualpi2_read_ect(skb); + if (cb->ect & q->ecn_mask) { + cb->classified = DUALPI2_C_L4S; + return NET_XMIT_SUCCESS; + } + + if (TC_H_MAJ(skb->priority) == q->sch->handle && + TC_H_MIN(skb->priority) < __DUALPI2_C_MAX) { + cb->classified = TC_H_MIN(skb->priority); + return NET_XMIT_SUCCESS; + } + + fl = rcu_dereference_bh(q->tcf_filters); + if (!fl) { + cb->classified = DUALPI2_C_CLASSIC; + return NET_XMIT_SUCCESS; + } + + result = tcf_classify(skb, NULL, fl, &res, false); + if (result >= 0) { +#ifdef CONFIG_NET_CLS_ACT + switch (result) { + case TC_ACT_STOLEN: + case TC_ACT_QUEUED: + case TC_ACT_TRAP: + return NET_XMIT_SUCCESS | __NET_XMIT_STOLEN; + case TC_ACT_SHOT: + return NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; + } +#endif + cb->classified = TC_H_MIN(res.classid) < __DUALPI2_C_MAX ? 
+ TC_H_MIN(res.classid) : DUALPI2_C_CLASSIC; + } + return NET_XMIT_SUCCESS; +} + +static int dualpi2_enqueue_skb(struct sk_buff *skb, struct Qdisc *sch, + struct sk_buff **to_free) +{ + struct dualpi2_sched_data *q = qdisc_priv(sch); + struct dualpi2_skb_cb *cb; + + if (unlikely(qdisc_qlen(sch) >= sch->limit) || + unlikely((u64)q->memory_used + skb->truesize > q->memory_limit)) { + qdisc_qstats_overlimit(sch); + if (skb_in_l_queue(skb)) + qdisc_qstats_overlimit(q->l_queue); + return qdisc_drop_reason(skb, sch, to_free, + SKB_DROP_REASON_QDISC_OVERLIMIT); + } + + if (q->drop_early && must_drop(sch, q, skb)) { + qdisc_drop_reason(skb, sch, to_free, + SKB_DROP_REASON_QDISC_OVERLIMIT); + return NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; + } + + cb = dualpi2_skb_cb(skb); + cb->ts = ktime_get_ns(); + q->memory_used += skb->truesize; + if (q->memory_used > q->max_memory_used) + q->max_memory_used = q->memory_used; + + if (qdisc_qlen(sch) > q->maxq) + q->maxq = qdisc_qlen(sch); + + if (skb_in_l_queue(skb)) { + /* Only apply the step if a queue is building up */ + dualpi2_skb_cb(skb)->apply_step = skb_is_l4s(skb) && + qdisc_qlen(q->l_queue) >= q->min_qlen_step; + /* Keep the overall qdisc stats consistent */ + ++sch->q.qlen; + qdisc_qstats_backlog_inc(sch, skb); + ++q->packets_in_l; + if (!q->l_head_ts) + q->l_head_ts = cb->ts; + return qdisc_enqueue_tail(skb, q->l_queue); + } + ++q->packets_in_c; + if (!q->c_head_ts) + q->c_head_ts = cb->ts; + return qdisc_enqueue_tail(skb, sch); +} + +/* By default, dualpi2 will split GSO skbs into independent skbs and enqueue + * each of those individually. This yields the following benefits, at the + * expense of CPU usage: + * - Finer-grained AQM actions as the sub-packets of a burst no longer share the + * same fate (e.g., the random mark/drop probability is applied individually) + * - Improved precision of the starvation protection/WRR scheduler at dequeue, + * as the size of the dequeued packets will be smaller. 
+ */ +static int dualpi2_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch, + struct sk_buff **to_free) +{ + struct dualpi2_sched_data *q = qdisc_priv(sch); + int err; + + err = dualpi2_skb_classify(q, skb); + if (err != NET_XMIT_SUCCESS) { + if (err & __NET_XMIT_BYPASS) + qdisc_qstats_drop(sch); + __qdisc_drop(skb, to_free); + return err; + } + + if (q->split_gso && skb_is_gso(skb)) { + netdev_features_t features; + struct sk_buff *nskb, *next; + int cnt, byte_len, orig_len; + int err; + + features = netif_skb_features(skb); + nskb = skb_gso_segment(skb, features & ~NETIF_F_GSO_MASK); + if (IS_ERR_OR_NULL(nskb)) + return qdisc_drop(skb, sch, to_free); + + cnt = 1; + byte_len = 0; + orig_len = qdisc_pkt_len(skb); + skb_list_walk_safe(nskb, nskb, next) { + skb_mark_not_on_list(nskb); + qdisc_skb_cb(nskb)->pkt_len = nskb->len; + dualpi2_skb_cb(nskb)->classified = + dualpi2_skb_cb(skb)->classified; + dualpi2_skb_cb(nskb)->ect = dualpi2_skb_cb(skb)->ect; + err = dualpi2_enqueue_skb(nskb, sch, to_free); + if (err == NET_XMIT_SUCCESS) { + /* Compute the backlog adjustment that needs + * to be propagated in the qdisc tree to reflect + * all new skbs successfully enqueued. + */ + ++cnt; + byte_len += nskb->len; + } + } + if (err == NET_XMIT_SUCCESS) { + /* The caller will add the original skb stats to its + * backlog, compensate this. + */ + --cnt; + byte_len -= orig_len; + } + qdisc_tree_reduce_backlog(sch, -cnt, -byte_len); + consume_skb(skb); + return err; + } + return dualpi2_enqueue_skb(skb, sch, to_free); +} + +/* Select the queue from which the next packet can be dequeued, ensuring that + * neither queue can starve the other with a WRR scheduler. + * + * The sign of the WRR credit determines the next queue, while the size of + * the dequeued packet determines the magnitude of the WRR credit change. If + * either queue is empty, the WRR credit is kept unchanged. 
+ * + * As the dequeued packet can be dropped later, the caller has to perform the + * qdisc_bstats_update() calls. + */ +static struct sk_buff *dequeue_packet(struct Qdisc *sch, + struct dualpi2_sched_data *q, + int *credit_change, + u64 now) +{ + struct sk_buff *skb = NULL; + int c_len; + + *credit_change = 0; + c_len = qdisc_qlen(sch) - qdisc_qlen(q->l_queue); + if (qdisc_qlen(q->l_queue) && (!c_len || q->c_protection_credit <= 0)) { + skb = __qdisc_dequeue_head(&q->l_queue->q); + WRITE_ONCE(q->l_head_ts, head_enqueue_time(q->l_queue)); + if (c_len) + *credit_change = q->c_protection_wc; + qdisc_qstats_backlog_dec(q->l_queue, skb); + /* Keep the global queue size consistent */ + --sch->q.qlen; + q->memory_used -= skb->truesize; + } else if (c_len) { + skb = __qdisc_dequeue_head(&sch->q); + WRITE_ONCE(q->c_head_ts, head_enqueue_time(sch)); + if (qdisc_qlen(q->l_queue)) + *credit_change = ~((s32)q->c_protection_wl) + 1; + q->memory_used -= skb->truesize; + } else { + dualpi2_reset_c_protection(q); + return NULL; + } + *credit_change *= qdisc_pkt_len(skb); + qdisc_qstats_backlog_dec(sch, skb); + return skb; +} + +static int do_step_aqm(struct dualpi2_sched_data *q, struct sk_buff *skb, + u64 now) +{ + u64 qdelay = 0; + + if (q->step_in_packets) + qdelay = qdisc_qlen(q->l_queue); + else + qdelay = dualpi2_sojourn_time(skb, now); + + if (dualpi2_skb_cb(skb)->apply_step && qdelay > q->step_thresh) { + if (!dualpi2_skb_cb(skb)->ect) + /* Drop this non-ECT packet */ + return 1; + if (dualpi2_mark(q, skb)) + ++q->step_marks; + } + qdisc_bstats_update(q->l_queue, skb); + return 0; +} + +static void drop_and_retry(struct dualpi2_sched_data *q, struct sk_buff *skb, + struct Qdisc *sch, enum skb_drop_reason reason) +{ + ++q->deferred_drops_cnt; + q->deferred_drops_len += qdisc_pkt_len(skb); + kfree_skb_reason(skb, reason); + qdisc_qstats_drop(sch); +} + +static struct sk_buff *dualpi2_qdisc_dequeue(struct Qdisc *sch) +{ + struct dualpi2_sched_data *q = qdisc_priv(sch); + 
struct sk_buff *skb; + int credit_change; + u64 now; + + now = ktime_get_ns(); + + while ((skb = dequeue_packet(sch, q, &credit_change, now))) { + if (!q->drop_early && must_drop(sch, q, skb)) { + drop_and_retry(q, skb, sch, + SKB_DROP_REASON_QDISC_CONGESTED); + continue; + } + + if (skb_in_l_queue(skb) && do_step_aqm(q, skb, now)) { + qdisc_qstats_drop(q->l_queue); + drop_and_retry(q, skb, sch, + SKB_DROP_REASON_DUALPI2_STEP_DROP); + continue; + } + + q->c_protection_credit += credit_change; + qdisc_bstats_update(sch, skb); + break; + } + + if (q->deferred_drops_cnt) { + qdisc_tree_reduce_backlog(sch, q->deferred_drops_cnt, + q->deferred_drops_len); + q->deferred_drops_cnt = 0; + q->deferred_drops_len = 0; + } + return skb; +} + static s64 __scale_delta(u64 diff) { do_div(diff, 1 << ALPHA_BETA_GRANULARITY); @@ -644,6 +1091,8 @@ static struct Qdisc_ops dualpi2_qdisc_ops __read_mostly = { .id = "dualpi2", .cl_ops = &dualpi2_class_ops, .priv_size = sizeof(struct dualpi2_sched_data), + .enqueue = dualpi2_qdisc_enqueue, + .dequeue = dualpi2_qdisc_dequeue, .peek = qdisc_peek_dequeued, .init = dualpi2_init, .destroy = dualpi2_destroy,
From patchwork Fri May 16 00:02:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Chia-Yu Chang \(Nokia\)" X-Patchwork-Id: 890411 From: chia-yu.chang@nokia-bell-labs.com To: horms@kernel.org, donald.hunter@gmail.com, xandfury@gmail.com, netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, andrew+netdev@lunn.ch, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com Cc: Chia-Yu Chang Subject: [PATCH v16 net-next 5/5] Documentation: netlink: specs: tc: Add DualPI2 specification Date: Fri, 16 May 2025 02:02:01 +0200 Message-Id: <20250516000201.18008-6-chia-yu.chang@nokia-bell-labs.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250516000201.18008-1-chia-yu.chang@nokia-bell-labs.com> References: <20250516000201.18008-1-chia-yu.chang@nokia-bell-labs.com> Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org MIME-Version: 1.0
From: Chia-Yu Chang Introduce the specification of tc qdisc DualPI2 stats and attributes, which is the reference implementation of IETF RFC9332 DualQ Coupled AQM (https://datatracker.ietf.org/doc/html/rfc9332) providing two different queues: low latency queue (L-queue) and classic queue (C-queue).
Signed-off-by: Chia-Yu Chang --- Documentation/netlink/specs/tc.yaml | 156 ++++++++++++++++++++++++++++ 1 file changed, 156 insertions(+) diff --git a/Documentation/netlink/specs/tc.yaml b/Documentation/netlink/specs/tc.yaml index 953aa837958b..162c38755446 100644 --- a/Documentation/netlink/specs/tc.yaml +++ b/Documentation/netlink/specs/tc.yaml @@ -51,6 +51,37 @@ definitions: - tundf - tunoam - tuncrit + - + name: tc-dualpi2-drop-overload-enum + type: enum + entries: + - overflow + - drop + - + name: tc-dualpi2-drop-early-enum + type: enum + entries: + - drop-dequeue + - drop-enqueue + - + name: tc-dualpi2-ecn-mask-enum + type: enum + entries: + - + name: l4s-ect + value: 1 + - + name: cla-ect + value: 2 + - + name: any-ect + value: 3 + - + name: tc-dualpi2-split-gso-enum + type: enum + entries: + - no-split-gso + - split-gso - name: tc-stats type: struct @@ -816,6 +847,58 @@ definitions: - name: drop-overmemory type: u32 + - + name: tc-dualpi2-xstats + type: struct + members: + - + name: prob + type: u32 + doc: Current probability + - + name: delay-c + type: u32 + doc: Current C-queue delay in microseconds + - + name: delay-l + type: u32 + doc: Current L-queue delay in microseconds + - + name: pkts-in-c + type: u32 + doc: Number of packets enqueued in the C-queue + - + name: pkts-in-l + type: u32 + doc: Number of packets enqueued in the L-queue + - + name: maxq + type: u32 + doc: Maximum number of packets seen by the DualPI2 + - + name: ecn-mark + type: u32 + doc: Number of packets marked with ECN + - + name: step-mark + type: u32 + doc: Number of packets marked with ECN by the L-queue step AQM + - + name: credit + type: s32 + doc: Current credit value for WRR + - + name: memory-used + type: u32 + doc: Memory used in bytes by the DualPI2 + - + name: max-memory-used + type: u32 + doc: Maximum memory used in bytes by the DualPI2 + - + name: memory-limit + type: u32 + doc: Memory limit in bytes - name: tc-fq-pie-xstats type: struct @@ -2301,6 +2384,73 @@ attribute-sets: -
name: quantum type: u32 + - + name: tc-dualpi2-attrs + attributes: + - + name: limit + type: u32 + doc: Limit on the total number of packets in the queue + - + name: memory-limit + type: u32 + doc: Memory limit in bytes for all packets in the queue + - + name: target + type: u32 + doc: Classic target delay in microseconds + - + name: tupdate + type: u32 + doc: Drop probability update interval time in microseconds + - + name: alpha + type: u32 + doc: Integral gain factor in Hz for PI controller + - + name: beta + type: u32 + doc: Proportional gain factor in Hz for PI controller + - + name: step-thresh + type: u32 + doc: L4S step marking threshold (see also step-packets) + - + name: step-packets + type: flag + doc: L4S step marking threshold is expressed in packets (otherwise in microseconds) + - + name: min-qlen-step + type: u32 + doc: The step threshold applies to packets enqueued to the L-queue only when the L-queue length is at least this value (0 is recommended) + - + name: coupling + type: u8 + doc: Probability coupling factor between Classic and L4S (2 is recommended) + - + name: drop-overload + type: u8 + doc: Control the overload strategy (drop to preserve latency or let the queue overflow) + enum: tc-dualpi2-drop-overload-enum + - + name: drop-early + type: u8 + doc: Decide where the Classic packets are PI-based dropped or marked (at enqueue or at dequeue) + enum: tc-dualpi2-drop-early-enum + - + name: c-protection + type: u8 + doc: Classic WRR weight in percentage (from 0 to 100) + - + name: ecn-mask + type: u8 + doc: Configure the L-queue ECN classifier + enum: tc-dualpi2-ecn-mask-enum + - + name: split-gso + type: u8 + doc: Whether to split aggregated (GSO) skbs + enum: tc-dualpi2-split-gso-enum - name: tc-ematch-attrs attributes: @@ -3681,6 +3831,9 @@ sub-messages: - value: drr attribute-set: tc-drr-attrs + - + value: dualpi2 + attribute-set: tc-dualpi2-attrs - value: etf attribute-set: tc-etf-attrs @@ -3848,6 +4001,9 @@ sub-messages: - value: codel fixed-header: tc-codel-xstats + - + value:
dualpi2 + fixed-header: tc-dualpi2-xstats - value: fq fixed-header: tc-fq-qd-stats