From patchwork Wed Apr 21 13:07:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zidenberg, Tsahi" X-Patchwork-Id: 426262 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D4C9DC433ED for ; Wed, 21 Apr 2021 13:08:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9C63261490 for ; Wed, 21 Apr 2021 13:08:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241227AbhDUNIq (ORCPT ); Wed, 21 Apr 2021 09:08:46 -0400 Received: from smtp-fw-6001.amazon.com ([52.95.48.154]:37991 "EHLO smtp-fw-6001.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240229AbhDUNHt (ORCPT ); Wed, 21 Apr 2021 09:07:49 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1619010437; x=1650546437; h=subject:from:to:cc:references:message-id:date: mime-version:in-reply-to:content-transfer-encoding; bh=qi9qpSSFejvHTma8np0UdFMojO/Y59AXXjDMbuDHHIA=; b=ItC5/VS62RnJv2M1gXRO4eioc7Ltm7J3S+fBIf9tG4fofB9Z25y4YW/U GpJXeGYu/wFYduHP5Auh9s7i4nG0wKWJJlWgV3QGNxadgmxIC8j0YRKr9 VbuVE++9ZPsTiVg784g53w10VE5QdcdjYWDdvYM+y5p3jFOsEVSS8m3Ed Q=; X-IronPort-AV: E=Sophos;i="5.82,240,1613433600"; d="scan'208";a="108989348" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO email-inbound-relay-2b-5bdc5131.us-west-2.amazon.com) ([10.43.8.2]) by smtp-border-fw-out-6001.iad6.amazon.com with ESMTP; 21 Apr 2021 13:07:09 +0000 Received: from EX13D01EUA001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan3.pdx.amazon.com [10.236.137.198]) by email-inbound-relay-2b-5bdc5131.us-west-2.amazon.com (Postfix) with ESMTPS id 51BAEA05D5; Wed, 21 Apr 2021 13:07:08 +0000 (UTC) Received: from 8c859063385e.ant.amazon.com (10.43.160.119) by EX13D01EUA001.ant.amazon.com (10.43.165.121) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Wed, 21 Apr 2021 13:07:05 +0000 Subject: [PATCH 1/8] uaccess: Add strict non-pagefault kernel-space read, function From: "Zidenberg, Tsahi" To: CC: Greg KH References: Message-ID: <2caa3d3d-9787-ab33-d68b-c39c079078c0@amazon.com> Date: Wed, 21 Apr 2021 16:07:00 +0300 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:78.0) Gecko/20100101 Thunderbird/78.9.1 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US X-Originating-IP: [10.43.160.119] X-ClientProxiedBy: EX13D20UWC002.ant.amazon.com (10.43.162.163) To EX13D01EUA001.ant.amazon.com (10.43.165.121) Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit 75a1a607bb7e6d918be3aca11ec2214a275392f4 upstream Add two new probe_kernel_read_strict() and strncpy_from_unsafe_strict() helpers which by default alias to the __probe_kernel_read() and the __strncpy_from_unsafe(), respectively, but can be overridden by archs which have non-overlapping address ranges for kernel space and user space in order to bail out with -EFAULT when attempting to probe user memory including non-canonical user access addresses [0]:   4-level page tables:     user-space mem: 0x0000000000000000 - 0x00007fffffffffff     non-canonical:  0x0000800000000000 - 0xffff7fffffffffff   5-level page tables:     user-space mem: 0x0000000000000000 - 0x00ffffffffffffff     non-canonical:  0x0100000000000000 - 0xfeffffffffffffff The idea is that these helpers are complementary to the probe_user_read() and strncpy_from_unsafe_user() which probe user-only memory. Both added helpers here do the same, but for kernel-only addresses. Both set of helpers are going to be used for BPF tracing. They also explicitly avoid throwing the splat for non-canonical user addresses from 00c42373d397 ("x86-64: add warning for non-canonical user access address dereferences"). For compat, the current probe_kernel_read() and strncpy_from_unsafe() are left as-is.   [0] Documentation/x86/x86_64/mm.txt Signed-off-by: Daniel Borkmann Signed-off-by: Alexei Starovoitov Cc: Linus Torvalds Cc: Masami Hiramatsu Cc: x86@kernel.org Link: https://lore.kernel.org/bpf/eefeefd769aa5a013531f491a71f0936779e916b.1572649915.git.daniel@iogearbox.net Cc: # 5.4 Signed-off-by: Tsahi Zidenberg ---  arch/x86/mm/Makefile    |  2 +-  arch/x86/mm/maccess.c   | 43 +++++++++++++++++++++++++++++++++++++++++  include/linux/uaccess.h |  4 ++++  mm/maccess.c            | 25 +++++++++++++++++++++++-  4 files changed, 72 insertions(+), 2 deletions(-)  create mode 100644 arch/x86/mm/maccess.c diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 84373dc9b341..bbc68a54795e 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -13,7 +13,7 @@ CFLAGS_REMOVE_mem_encrypt_identity.o    = -pg  endif    obj-y    :=  init.o init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \ -        pat.o pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o +        pat.o pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o maccess.o    # Make sure __phys_addr has no stackprotector  nostackp := $(call cc-option, -fno-stack-protector) diff --git a/arch/x86/mm/maccess.c b/arch/x86/mm/maccess.c new file mode 100644 index 000000000000..f5b85bdc0535 --- /dev/null +++ b/arch/x86/mm/maccess.c @@ -0,0 +1,43 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include +#include + +#ifdef CONFIG_X86_64 +static __always_inline u64 canonical_address(u64 vaddr, u8 vaddr_bits) +{ +    return ((s64)vaddr << (64 - vaddr_bits)) >> (64 - vaddr_bits); +} + +static __always_inline bool invalid_probe_range(u64 vaddr) +{ +    /* +     * Range covering the highest possible canonical userspace address +     * as well as non-canonical address range. For the canonical range +     * we also need to include the userspace guard page. +     */ +    return vaddr < TASK_SIZE_MAX + PAGE_SIZE || +           canonical_address(vaddr, boot_cpu_data.x86_virt_bits) != vaddr; +} +#else +static __always_inline bool invalid_probe_range(u64 vaddr) +{ +    return vaddr < TASK_SIZE_MAX; +} +#endif + +long probe_kernel_read_strict(void *dst, const void *src, size_t size) +{ +    if (unlikely(invalid_probe_range((unsigned long)src))) +        return -EFAULT; + +    return __probe_kernel_read(dst, src, size); +} + +long strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, long count) +{ +    if (unlikely(invalid_probe_range((unsigned long)unsafe_addr))) +        return -EFAULT; + +    return __strncpy_from_unsafe(dst, unsafe_addr, count); +} diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h index 38555435a64a..67f016010aad 100644 --- a/include/linux/uaccess.h +++ b/include/linux/uaccess.h @@ -311,6 +311,7 @@ copy_struct_from_user(void *dst, size_t ksize, const void __user *src,   * happens, handle that and return -EFAULT.   */  extern long probe_kernel_read(void *dst, const void *src, size_t size); +extern long probe_kernel_read_strict(void *dst, const void *src, size_t size);  extern long __probe_kernel_read(void *dst, const void *src, size_t size);    /* @@ -350,6 +351,9 @@ extern long notrace probe_user_write(void __user *dst, const void *src, size_t s  extern long notrace __probe_user_write(void __user *dst, const void *src, size_t size);    extern long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count); +extern long strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, +                       long count); +extern long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count);  extern long strncpy_from_unsafe_user(char *dst, const void __user *unsafe_addr,                       long count);  extern long strnlen_unsafe_user(const void __user *unsafe_addr, long count); diff --git a/mm/maccess.c b/mm/maccess.c index 2d3c3d01064c..3ca8d97e5010 100644 --- a/mm/maccess.c +++ b/mm/maccess.c @@ -43,11 +43,20 @@ probe_write_common(void __user *dst, const void *src, size_t size)   * do_page_fault() doesn't attempt to take mmap_sem.  This makes   * probe_kernel_read() suitable for use within regions where the caller   * already holds mmap_sem, or other locks which nest inside mmap_sem. + * + * probe_kernel_read_strict() is the same as probe_kernel_read() except for + * the case where architectures have non-overlapping user and kernel address + * ranges: probe_kernel_read_strict() will additionally return -EFAULT for + * probing memory on a user address range where probe_user_read() is supposed + * to be used instead.   */    long __weak probe_kernel_read(void *dst, const void *src, size_t size)      __attribute__((alias("__probe_kernel_read")));   +long __weak probe_kernel_read_strict(void *dst, const void *src, size_t size) +    __attribute__((alias("__probe_kernel_read"))); +  long __probe_kernel_read(void *dst, const void *src, size_t size)  {      long ret; @@ -157,8 +166,22 @@ EXPORT_SYMBOL_GPL(probe_user_write);   *   * If @count is smaller than the length of the string, copies @count-1 bytes,   * sets the last byte of @dst buffer to NUL and returns @count. + * + * strncpy_from_unsafe_strict() is the same as strncpy_from_unsafe() except + * for the case where architectures have non-overlapping user and kernel address + * ranges: strncpy_from_unsafe_strict() will additionally return -EFAULT for + * probing memory on a user address range where strncpy_from_unsafe_user() is + * supposed to be used instead.   */ -long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count) + +long __weak strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count) +    __attribute__((alias("__strncpy_from_unsafe"))); + +long __weak strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, +                       long count) +    __attribute__((alias("__strncpy_from_unsafe"))); + +long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count)  {      mm_segment_t old_fs = get_fs();      const void *src = unsafe_addr; From patchwork Wed Apr 21 13:15:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zidenberg, Tsahi" X-Patchwork-Id: 425834 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2FB90C41602 for ; Wed, 21 Apr 2021 13:16:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1081161462 for ; Wed, 21 Apr 2021 13:16:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241081AbhDUNRU (ORCPT ); Wed, 21 Apr 2021 09:17:20 -0400 Received: from smtp-fw-9103.amazon.com ([207.171.188.200]:11589 "EHLO smtp-fw-9103.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242848AbhDUNQS (ORCPT ); Wed, 21 Apr 2021 09:16:18 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1619010945; x=1650546945; h=subject:from:to:cc:references:message-id:date: mime-version:in-reply-to:content-transfer-encoding; bh=WjzUlaz2xJdprzzr617dUyV82cAgxp9xTWw8V0VCbx0=; b=SGsbirHAdDeVy2baySu9QjsF78q/Y+ba03xvuTPA/lpQoufihmoRQWP3 PsIMA5Q5MLtMrj92bStQCZmP23yY/spQfb5t+fCmRdnVDpxberXEgsn8o uPGvew/+JhtUbb28o5/pXnMmBEIBcIATnFMfHRw1pxqf1Z5pkk1wTBwy3 Q=; X-IronPort-AV: E=Sophos;i="5.82,240,1613433600"; d="scan'208";a="927962501" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO email-inbound-relay-1a-715bee71.us-east-1.amazon.com) ([10.25.36.210]) by smtp-border-fw-out-9103.sea19.amazon.com with ESMTP; 21 Apr 2021 13:15:42 +0000 Received: from EX13D01EUA001.ant.amazon.com (iad12-ws-svc-p26-lb9-vlan2.iad.amazon.com [10.40.163.34]) by email-inbound-relay-1a-715bee71.us-east-1.amazon.com (Postfix) with ESMTPS id 748F7A2251; Wed, 21 Apr 2021 13:15:41 +0000 (UTC) Received: from 8c859063385e.ant.amazon.com (10.43.161.253) by EX13D01EUA001.ant.amazon.com (10.43.165.121) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Wed, 21 Apr 2021 13:15:37 +0000 Subject: [PATCH 2/8] bpf: Add probe_read_{user, kernel} and probe_read_{user, , kernel}_str helpers From: "Zidenberg, Tsahi" To: CC: Greg KH References: Message-ID: <73702b7e-166f-b3d0-2853-ae8d545e56eb@amazon.com> Date: Wed, 21 Apr 2021 16:15:32 +0300 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:78.0) Gecko/20100101 Thunderbird/78.9.1 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US X-Originating-IP: [10.43.161.253] X-ClientProxiedBy: EX13D17UWB004.ant.amazon.com (10.43.161.132) To EX13D01EUA001.ant.amazon.com (10.43.165.121) Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit 6ae08ae3dea2cfa03dd3665a3c8475c2d429ef47 upstream The current bpf_probe_read() and bpf_probe_read_str() helpers are broken in that they assume they can be used for probing memory access for kernel space addresses /as well as/ user space addresses. However, plain use of probe_kernel_read() for both cases will attempt to always access kernel space address space given access is performed under KERNEL_DS and some archs in-fact have overlapping address spaces where a kernel pointer and user pointer would have the /same/ address value and therefore accessing application memory via bpf_probe_read{,_str}() would read garbage values. Lets fix BPF side by making use of recently added 3d7081822f7f ("uaccess: Add non-pagefault user-space read functions"). Unfortunately, the only way to fix this status quo is to add dedicated bpf_probe_read_{user,kernel}() and bpf_probe_read_{user,kernel}_str() helpers. The bpf_probe_read{,_str}() helpers are kept as-is to retain their current behavior. The two *_user() variants attempt the access always under USER_DS set, the two *_kernel() variants will -EFAULT when accessing user memory if the underlying architecture has non-overlapping address ranges, also avoiding throwing the kernel warning via 00c42373d397 ("x86-64: add warning for non-canonical user access address dereferences"). conflict resolution: upstream also defines skb_output helper. This commit adds skb_output to FUNC_MAPPER, but does not take any code to add support or comments detailing the support. Fixes: a5e8c07059d0 ("bpf: add bpf_probe_read_str helper") Fixes: 2541517c32be ("tracing, perf: Implement BPF programs attached to kprobes") Signed-off-by: Daniel Borkmann Signed-off-by: Alexei Starovoitov Acked-by: Andrii Nakryiko Link: https://lore.kernel.org/bpf/796ee46e948bc808d54891a1108435f8652c6ca4.1572649915.git.daniel@iogearbox.net Cc: # 5.4 Signed-off-by: Tsahi Zidenberg ---  include/uapi/linux/bpf.h       | 123 ++++++++++++++--------  kernel/trace/bpf_trace.c       | 181 ++++++++++++++++++++++++---------  tools/include/uapi/linux/bpf.h | 116 +++++++++++++--------  3 files changed, 294 insertions(+), 126 deletions(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 8649422e760c..2fe91a083f7a 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -560,10 +560,13 @@ union bpf_attr {   *     Return   *         0 on success, or a negative error in case of failure.   * - * int bpf_probe_read(void *dst, u32 size, const void *src) + * int bpf_probe_read(void *dst, u32 size, const void *unsafe_ptr)   *     Description   *         For tracing programs, safely attempt to read *size* bytes from - *         address *src* and store the data in *dst*. + *         kernel space address *unsafe_ptr* and store the data in *dst*. + * + *         Generally, use bpf_probe_read_user() or bpf_probe_read_kernel() + *         instead.   *     Return   *         0 on success, or a negative error in case of failure.   * @@ -1425,45 +1428,14 @@ union bpf_attr {   *     Return   *         0 on success, or a negative error in case of failure.   * - * int bpf_probe_read_str(void *dst, int size, const void *unsafe_ptr) + * int bpf_probe_read_str(void *dst, u32 size, const void *unsafe_ptr)   *     Description - *         Copy a NUL terminated string from an unsafe address - *         *unsafe_ptr* to *dst*. The *size* should include the - *         terminating NUL byte. In case the string length is smaller than - *         *size*, the target is not padded with further NUL bytes. If the - *         string length is larger than *size*, just *size*-1 bytes are - *         copied and the last byte is set to NUL. - * - *         On success, the length of the copied string is returned. This - *         makes this helper useful in tracing programs for reading - *         strings, and more importantly to get its length at runtime. See - *         the following snippet: - * - *         :: - * - *             SEC("kprobe/sys_open") - *             void bpf_sys_open(struct pt_regs *ctx) - *             { - *                     char buf[PATHLEN]; // PATHLEN is defined to 256 - *                     int res = bpf_probe_read_str(buf, sizeof(buf), - *                                              ctx->di); - * - *                 // Consume buf, for example push it to - *                 // userspace via bpf_perf_event_output(); we - *                 // can use res (the string length) as event - *                 // size, after checking its boundaries. - *             } - * - *         In comparison, using **bpf_probe_read()** helper here instead - *         to read the string would require to estimate the length at - *         compile time, and would often result in copying more memory - *         than necessary. + *         Copy a NUL terminated string from an unsafe kernel address + *         *unsafe_ptr* to *dst*. See bpf_probe_read_kernel_str() for + *         more details.   * - *         Another useful use case is when parsing individual process - *         arguments or individual environment variables navigating - *         *current*\ **->mm->arg_start** and *current*\ - *         **->mm->env_start**: using this helper and the return value, - *         one can quickly iterate at the right offset of the memory area. + *         Generally, use bpf_probe_read_user_str() or bpf_probe_read_kernel_str() + *         instead.   *     Return   *         On success, the strictly positive length of the string,   *         including the trailing NUL character. On error, a negative @@ -2750,6 +2722,72 @@ union bpf_attr {   *        **-EOPNOTSUPP** kernel configuration does not enable SYN cookies   *   *        **-EPROTONOSUPPORT** IP packet version is not 4 or 6 + * + * int bpf_probe_read_user(void *dst, u32 size, const void *unsafe_ptr) + *     Description + *         Safely attempt to read *size* bytes from user space address + *         *unsafe_ptr* and store the data in *dst*. + *     Return + *         0 on success, or a negative error in case of failure. + * + * int bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr) + *     Description + *         Safely attempt to read *size* bytes from kernel space address + *         *unsafe_ptr* and store the data in *dst*. + *     Return + *         0 on success, or a negative error in case of failure. + * + * int bpf_probe_read_user_str(void *dst, u32 size, const void *unsafe_ptr) + *     Description + *         Copy a NUL terminated string from an unsafe user address + *         *unsafe_ptr* to *dst*. The *size* should include the + *         terminating NUL byte. In case the string length is smaller than + *         *size*, the target is not padded with further NUL bytes. If the + *         string length is larger than *size*, just *size*-1 bytes are + *         copied and the last byte is set to NUL. + * + *         On success, the length of the copied string is returned. This + *         makes this helper useful in tracing programs for reading + *         strings, and more importantly to get its length at runtime. See + *         the following snippet: + * + *         :: + * + *             SEC("kprobe/sys_open") + *             void bpf_sys_open(struct pt_regs *ctx) + *             { + *                     char buf[PATHLEN]; // PATHLEN is defined to 256 + *                     int res = bpf_probe_read_user_str(buf, sizeof(buf), + *                                                   ctx->di); + * + *                 // Consume buf, for example push it to + *                 // userspace via bpf_perf_event_output(); we + *                 // can use res (the string length) as event + *                 // size, after checking its boundaries. + *             } + * + *         In comparison, using **bpf_probe_read_user()** helper here + *         instead to read the string would require to estimate the length + *         at compile time, and would often result in copying more memory + *         than necessary. + * + *         Another useful use case is when parsing individual process + *         arguments or individual environment variables navigating + *         *current*\ **->mm->arg_start** and *current*\ + *         **->mm->env_start**: using this helper and the return value, + *         one can quickly iterate at the right offset of the memory area. + *     Return + *         On success, the strictly positive length of the string, + *         including the trailing NUL character. On error, a negative + *         value. + * + * int bpf_probe_read_kernel_str(void *dst, u32 size, const void *unsafe_ptr) + *     Description + *         Copy a NUL terminated string from an unsafe kernel address *unsafe_ptr* + *         to *dst*. Same semantics as with bpf_probe_read_user_str() apply. + *     Return + *         On success, the strictly positive length of the string,    including + *         the trailing NUL character. On error, a negative value.   */  #define __BPF_FUNC_MAPPER(FN)        \      FN(unspec),            \ @@ -2862,7 +2900,12 @@ union bpf_attr {      FN(sk_storage_get),        \      FN(sk_storage_delete),        \      FN(send_signal),        \ -    FN(tcp_gen_syncookie), +    FN(tcp_gen_syncookie),        \ +    FN(skb_output),            \ +    FN(probe_read_user),        \ +    FN(probe_read_kernel),        \ +    FN(probe_read_user_str),    \ +    FN(probe_read_kernel_str),    /* integer value in 'imm' field of BPF_CALL instruction selects which helper   * function eBPF program intends to call diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 74c1db7178cf..0e329d48ab08 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -138,24 +138,140 @@ static const struct bpf_func_proto bpf_override_return_proto = {  };  #endif   -BPF_CALL_3(bpf_probe_read, void *, dst, u32, size, const void *, unsafe_ptr) +BPF_CALL_3(bpf_probe_read_user, void *, dst, u32, size, +       const void __user *, unsafe_ptr)  { -    int ret; +    int ret = probe_user_read(dst, unsafe_ptr, size);   -    ret = security_locked_down(LOCKDOWN_BPF_READ); -    if (ret < 0) -        goto out; +    if (unlikely(ret < 0)) +        memset(dst, 0, size); + +    return ret; +} + +static const struct bpf_func_proto bpf_probe_read_user_proto = { +    .func        = bpf_probe_read_user, +    .gpl_only    = true, +    .ret_type    = RET_INTEGER, +    .arg1_type    = ARG_PTR_TO_UNINIT_MEM, +    .arg2_type    = ARG_CONST_SIZE_OR_ZERO, +    .arg3_type    = ARG_ANYTHING, +}; + +BPF_CALL_3(bpf_probe_read_user_str, void *, dst, u32, size, +       const void __user *, unsafe_ptr) +{ +    int ret = strncpy_from_unsafe_user(dst, unsafe_ptr, size); + +    if (unlikely(ret < 0)) +        memset(dst, 0, size); + +    return ret; +} + +static const struct bpf_func_proto bpf_probe_read_user_str_proto = { +    .func        = bpf_probe_read_user_str, +    .gpl_only    = true, +    .ret_type    = RET_INTEGER, +    .arg1_type    = ARG_PTR_TO_UNINIT_MEM, +    .arg2_type    = ARG_CONST_SIZE_OR_ZERO, +    .arg3_type    = ARG_ANYTHING, +};   -    ret = probe_kernel_read(dst, unsafe_ptr, size); +static __always_inline int +bpf_probe_read_kernel_common(void *dst, u32 size, const void *unsafe_ptr, +                 const bool compat) +{ +    int ret = security_locked_down(LOCKDOWN_BPF_READ); + +    if (unlikely(ret < 0)) +        goto out; +    ret = compat ? probe_kernel_read(dst, unsafe_ptr, size) : +          probe_kernel_read_strict(dst, unsafe_ptr, size);      if (unlikely(ret < 0))  out:          memset(dst, 0, size); +    return ret; +} + +BPF_CALL_3(bpf_probe_read_kernel, void *, dst, u32, size, +       const void *, unsafe_ptr) +{ +    return bpf_probe_read_kernel_common(dst, size, unsafe_ptr, false); +} + +static const struct bpf_func_proto bpf_probe_read_kernel_proto = { +    .func        = bpf_probe_read_kernel, +    .gpl_only    = true, +    .ret_type    = RET_INTEGER, +    .arg1_type    = ARG_PTR_TO_UNINIT_MEM, +    .arg2_type    = ARG_CONST_SIZE_OR_ZERO, +    .arg3_type    = ARG_ANYTHING, +}; + +BPF_CALL_3(bpf_probe_read_compat, void *, dst, u32, size, +       const void *, unsafe_ptr) +{ +    return bpf_probe_read_kernel_common(dst, size, unsafe_ptr, true); +}   +static const struct bpf_func_proto bpf_probe_read_compat_proto = { +    .func        = bpf_probe_read_compat, +    .gpl_only    = true, +    .ret_type    = RET_INTEGER, +    .arg1_type    = ARG_PTR_TO_UNINIT_MEM, +    .arg2_type    = ARG_CONST_SIZE_OR_ZERO, +    .arg3_type    = ARG_ANYTHING, +}; + +static __always_inline int +bpf_probe_read_kernel_str_common(void *dst, u32 size, const void *unsafe_ptr, +                 const bool compat) +{ +    int ret = security_locked_down(LOCKDOWN_BPF_READ); + +    if (unlikely(ret < 0)) +        goto out; +    /* +     * The strncpy_from_unsafe_*() call will likely not fill the entire +     * buffer, but that's okay in this circumstance as we're probing +     * arbitrary memory anyway similar to bpf_probe_read_*() and might +     * as well probe the stack. Thus, memory is explicitly cleared +     * only in error case, so that improper users ignoring return +     * code altogether don't copy garbage; otherwise length of string +     * is returned that can be used for bpf_perf_event_output() et al. +     */ +    ret = compat ? strncpy_from_unsafe(dst, unsafe_ptr, size) : +          strncpy_from_unsafe_strict(dst, unsafe_ptr, size); +    if (unlikely(ret < 0)) +out: +        memset(dst, 0, size);      return ret;  }   -static const struct bpf_func_proto bpf_probe_read_proto = { -    .func        = bpf_probe_read, +BPF_CALL_3(bpf_probe_read_kernel_str, void *, dst, u32, size, +       const void *, unsafe_ptr) +{ +    return bpf_probe_read_kernel_str_common(dst, size, unsafe_ptr, false); +} + +static const struct bpf_func_proto bpf_probe_read_kernel_str_proto = { +    .func        = bpf_probe_read_kernel_str, +    .gpl_only    = true, +    .ret_type    = RET_INTEGER, +    .arg1_type    = ARG_PTR_TO_UNINIT_MEM, +    .arg2_type    = ARG_CONST_SIZE_OR_ZERO, +    .arg3_type    = ARG_ANYTHING, +}; + +BPF_CALL_3(bpf_probe_read_compat_str, void *, dst, u32, size, +       const void *, unsafe_ptr) +{ +    return bpf_probe_read_kernel_str_common(dst, size, unsafe_ptr, true); +} + +static const struct bpf_func_proto bpf_probe_read_compat_str_proto = { +    .func        = bpf_probe_read_compat_str,      .gpl_only    = true,      .ret_type    = RET_INTEGER,      .arg1_type    = ARG_PTR_TO_UNINIT_MEM, @@ -583,41 +699,6 @@ static const struct bpf_func_proto bpf_current_task_under_cgroup_proto = {      .arg2_type      = ARG_ANYTHING,  };   -BPF_CALL_3(bpf_probe_read_str, void *, dst, u32, size, -       const void *, unsafe_ptr) -{ -    int ret; - -    ret = security_locked_down(LOCKDOWN_BPF_READ); -    if (ret < 0) -        goto out; - -    /* -     * The strncpy_from_unsafe() call will likely not fill the entire -     * buffer, but that's okay in this circumstance as we're probing -     * arbitrary memory anyway similar to bpf_probe_read() and might -     * as well probe the stack. Thus, memory is explicitly cleared -     * only in error case, so that improper users ignoring return -     * code altogether don't copy garbage; otherwise length of string -     * is returned that can be used for bpf_perf_event_output() et al. -     */ -    ret = strncpy_from_unsafe(dst, unsafe_ptr, size); -    if (unlikely(ret < 0)) -out: -        memset(dst, 0, size); - -    return ret; -} - -static const struct bpf_func_proto bpf_probe_read_str_proto = { -    .func        = bpf_probe_read_str, -    .gpl_only    = true, -    .ret_type    = RET_INTEGER, -    .arg1_type    = ARG_PTR_TO_UNINIT_MEM, -    .arg2_type    = ARG_CONST_SIZE_OR_ZERO, -    .arg3_type    = ARG_ANYTHING, -}; -  struct send_signal_irq_work {      struct irq_work irq_work;      struct task_struct *task; @@ -697,8 +778,6 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)          return &bpf_map_pop_elem_proto;      case BPF_FUNC_map_peek_elem:          return &bpf_map_peek_elem_proto; -    case BPF_FUNC_probe_read: -        return &bpf_probe_read_proto;      case BPF_FUNC_ktime_get_ns:          return &bpf_ktime_get_ns_proto;      case BPF_FUNC_tail_call: @@ -725,8 +804,18 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)          return &bpf_current_task_under_cgroup_proto;      case BPF_FUNC_get_prandom_u32:          return &bpf_get_prandom_u32_proto; +    case BPF_FUNC_probe_read_user: +        return &bpf_probe_read_user_proto; +    case BPF_FUNC_probe_read_kernel: +        return &bpf_probe_read_kernel_proto; +    case BPF_FUNC_probe_read: +        return &bpf_probe_read_compat_proto; +    case BPF_FUNC_probe_read_user_str: +        return &bpf_probe_read_user_str_proto; +    case BPF_FUNC_probe_read_kernel_str: +        return &bpf_probe_read_kernel_str_proto;      case BPF_FUNC_probe_read_str: -        return &bpf_probe_read_str_proto; +        return &bpf_probe_read_compat_str_proto;  #ifdef CONFIG_CGROUPS      case BPF_FUNC_get_current_cgroup_id:          return &bpf_get_current_cgroup_id_proto; diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 8649422e760c..e4014608849d 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -560,10 +560,13 @@ union bpf_attr {   *     Return   *         0 on success, or a negative error in case of failure.   * - * int bpf_probe_read(void *dst, u32 size, const void *src) + * int bpf_probe_read(void *dst, u32 size, const void *unsafe_ptr)   *     Description   *         For tracing programs, safely attempt to read *size* bytes from - *         address *src* and store the data in *dst*. + *         kernel space address *unsafe_ptr* and store the data in *dst*. + * + *         Generally, use bpf_probe_read_user() or bpf_probe_read_kernel() + *         instead.   *     Return   *         0 on success, or a negative error in case of failure.   * @@ -1425,45 +1428,14 @@ union bpf_attr {   *     Return   *         0 on success, or a negative error in case of failure.   * - * int bpf_probe_read_str(void *dst, int size, const void *unsafe_ptr) + * int bpf_probe_read_str(void *dst, u32 size, const void *unsafe_ptr)   *     Description - *         Copy a NUL terminated string from an unsafe address - *         *unsafe_ptr* to *dst*. The *size* should include the - *         terminating NUL byte. In case the string length is smaller than - *         *size*, the target is not padded with further NUL bytes. If the - *         string length is larger than *size*, just *size*-1 bytes are - *         copied and the last byte is set to NUL. - * - *         On success, the length of the copied string is returned. This - *         makes this helper useful in tracing programs for reading - *         strings, and more importantly to get its length at runtime. See - *         the following snippet: - * - *         :: - * - *             SEC("kprobe/sys_open") - *             void bpf_sys_open(struct pt_regs *ctx) - *             { - *                     char buf[PATHLEN]; // PATHLEN is defined to 256 - *                     int res = bpf_probe_read_str(buf, sizeof(buf), - *                                              ctx->di); - * - *                 // Consume buf, for example push it to - *                 // userspace via bpf_perf_event_output(); we - *                 // can use res (the string length) as event - *                 // size, after checking its boundaries. - *             } + *         Copy a NUL terminated string from an unsafe kernel address + *         *unsafe_ptr* to *dst*. See bpf_probe_read_kernel_str() for + *         more details.   * - *         In comparison, using **bpf_probe_read()** helper here instead - *         to read the string would require to estimate the length at - *         compile time, and would often result in copying more memory - *         than necessary. - * - *         Another useful use case is when parsing individual process - *         arguments or individual environment variables navigating - *         *current*\ **->mm->arg_start** and *current*\ - *         **->mm->env_start**: using this helper and the return value, - *         one can quickly iterate at the right offset of the memory area. + *         Generally, use bpf_probe_read_user_str() or bpf_probe_read_kernel_str() + *         instead.   *     Return   *         On success, the strictly positive length of the string,   *         including the trailing NUL character. On error, a negative @@ -2750,6 +2722,65 @@ union bpf_attr {   *        **-EOPNOTSUPP** kernel configuration does not enable SYN cookies   *   *        **-EPROTONOSUPPORT** IP packet version is not 4 or 6 + * + * int bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr) + *     Description + *         Safely attempt to read *size* bytes from kernel space address + *         *unsafe_ptr* and store the data in *dst*. + *     Return + *         0 on success, or a negative error in case of failure. + * + * int bpf_probe_read_user_str(void *dst, u32 size, const void *unsafe_ptr) + *     Description + *         Copy a NUL terminated string from an unsafe user address + *         *unsafe_ptr* to *dst*. The *size* should include the + *         terminating NUL byte. In case the string length is smaller than + *         *size*, the target is not padded with further NUL bytes. If the + *         string length is larger than *size*, just *size*-1 bytes are + *         copied and the last byte is set to NUL. + * + *         On success, the length of the copied string is returned. This + *         makes this helper useful in tracing programs for reading + *         strings, and more importantly to get its length at runtime. See + *         the following snippet: + * + *         :: + * + *             SEC("kprobe/sys_open") + *             void bpf_sys_open(struct pt_regs *ctx) + *             { + *                     char buf[PATHLEN]; // PATHLEN is defined to 256 + *                     int res = bpf_probe_read_user_str(buf, sizeof(buf), + *                                                   ctx->di); + * + *                 // Consume buf, for example push it to + *                 // userspace via bpf_perf_event_output(); we + *                 // can use res (the string length) as event + *                 // size, after checking its boundaries. + *             } + * + *         In comparison, using **bpf_probe_read_user()** helper here + *         instead to read the string would require to estimate the length + *         at compile time, and would often result in copying more memory + *         than necessary. + * + *         Another useful use case is when parsing individual process + *         arguments or individual environment variables navigating + *         *current*\ **->mm->arg_start** and *current*\ + *         **->mm->env_start**: using this helper and the return value, + *         one can quickly iterate at the right offset of the memory area. + *     Return + *         On success, the strictly positive length of the string, + *         including the trailing NUL character. On error, a negative + *         value. + * + * int bpf_probe_read_kernel_str(void *dst, u32 size, const void *unsafe_ptr) + *     Description + *         Copy a NUL terminated string from an unsafe kernel address *unsafe_ptr* + *         to *dst*. Same semantics as with bpf_probe_read_user_str() apply. + *     Return + *         On success, the strictly positive length of the string,    including + *         the trailing NUL character. On error, a negative value.   */  #define __BPF_FUNC_MAPPER(FN)        \      FN(unspec),            \ @@ -2862,7 +2893,12 @@ union bpf_attr {      FN(sk_storage_get),        \      FN(sk_storage_delete),        \      FN(send_signal),        \ -    FN(tcp_gen_syncookie), +    FN(tcp_gen_syncookie),        \ +    FN(skb_output),            \ +    FN(probe_read_user),        \ +    FN(probe_read_kernel),        \ +    FN(probe_read_user_str),    \ +    FN(probe_read_kernel_str),    /* integer value in 'imm' field of BPF_CALL instruction selects which helper   * function eBPF program intends to call From patchwork Wed Apr 21 13:09:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zidenberg, Tsahi" X-Patchwork-Id: 425838 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F315CC433ED for ; Wed, 21 Apr 2021 13:10:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C3BEB6162B for ; Wed, 21 Apr 2021 13:10:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241580AbhDUNLA (ORCPT ); Wed, 21 Apr 2021 09:11:00 -0400 Received: from smtp-fw-33001.amazon.com ([207.171.190.10]:25098 "EHLO smtp-fw-33001.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242240AbhDUNJ6 (ORCPT ); Wed, 21 Apr 2021 09:09:58 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1619010565; x=1650546565; h=subject:from:to:cc:references:message-id:date: mime-version:in-reply-to:content-transfer-encoding; bh=4dz49e0j4dlJ7n1U/S3W4X62ctZqCqEkCAOg6S2K1QI=; b=DjVS1CVwiOCSSl3SZL9zuavUbcHCk2BabKX3SYl32gTP2ldMd8ePoPcl bWr/s5JrPygO8xrf5+YNdXOrrdTcQY70DaXp3cMpljbkev2/VrOfNLQ5U r3fOV/VtjRaSb9mW/P6Fkjay6rrwr0tEGnmfyFFqSCRkkd/xfs11bEk8z c=; X-IronPort-AV: E=Sophos;i="5.82,240,1613433600"; d="scan'208";a="120450391" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO email-inbound-relay-2a-90c42d1d.us-west-2.amazon.com) ([10.25.36.210]) by smtp-border-fw-out-33001.sea14.amazon.com with ESMTP; 21 Apr 2021 13:09:24 +0000 Received: from EX13D01EUA001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan3.pdx.amazon.com [10.236.137.198]) by email-inbound-relay-2a-90c42d1d.us-west-2.amazon.com (Postfix) with ESMTPS id 5FB47A1D40; Wed, 21 Apr 2021 13:09:24 +0000 (UTC) Received: from 8c859063385e.ant.amazon.com (10.43.162.28) by EX13D01EUA001.ant.amazon.com (10.43.165.121) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Wed, 21 Apr 2021 13:09:20 +0000 Subject: [PATCH 3/8] bpf: Restrict bpf_probe_read{, str}() only to archs where, they work From: "Zidenberg, Tsahi" To: CC: Greg KH References: Message-ID: <03beea9a-bbee-82c0-6842-833e57b0d786@amazon.com> Date: Wed, 21 Apr 2021 16:09:15 +0300 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:78.0) Gecko/20100101 Thunderbird/78.9.1 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US X-Originating-IP: [10.43.162.28] X-ClientProxiedBy: EX13D13UWA003.ant.amazon.com (10.43.160.181) To EX13D01EUA001.ant.amazon.com (10.43.165.121) Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit 0ebeea8ca8a4d1d453ad299aef0507dab04f6e8d upstream Given the legacy bpf_probe_read{,str}() BPF helpers are broken on archs with overlapping address ranges, we should really take the next step to disable them from BPF use there. To generally fix the situation, we've recently added new helper variants bpf_probe_read_{user,kernel}() and bpf_probe_read_{user,kernel}_str(). For details on them, see 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user,kernel}_str helpers"). Given bpf_probe_read{,str}() have been around for ~5 years by now, there are plenty of users at least on x86 still relying on them today, so we cannot remove them entirely w/o breaking the BPF tracing ecosystem. However, their use should be restricted to archs with non-overlapping address ranges where they are working in their current form. Therefore, move this behind a CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE and have x86, arm64, arm select it (other archs supporting it can follow-up on it as well). For the remaining archs, they can workaround easily by relying on the feature probe from bpftool which spills out defines that can be used out of BPF C code to implement the drop-in replacement for old/new kernels via: bpftool feature probe macro Suggested-by: Linus Torvalds Signed-off-by: Daniel Borkmann Signed-off-by: Alexei Starovoitov Reviewed-by: Masami Hiramatsu Acked-by: Linus Torvalds Cc: Brendan Gregg Cc: Christoph Hellwig Link: https://lore.kernel.org/bpf/20200515101118.6508-2-daniel@iogearbox.net Cc: # 5.4 Signed-off-by: Tsahi Zidenberg ---  arch/arm/Kconfig         | 1 +  arch/arm64/Kconfig       | 1 +  arch/x86/Kconfig         | 1 +  init/Kconfig             | 3 +++  kernel/trace/bpf_trace.c | 6 ++++--  5 files changed, 10 insertions(+), 2 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 9aa88715f196..70f4057fb5b0 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -14,6 +14,7 @@ config ARM      select ARCH_HAS_KEEPINITRD      select ARCH_HAS_KCOV      select ARCH_HAS_MEMBARRIER_SYNC_CORE +    select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE      select ARCH_HAS_PTE_SPECIAL if ARM_LPAE      select ARCH_HAS_PHYS_TO_DMA      select ARCH_HAS_SETUP_DMA_OPS diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 9c8ea5939865..a8c49916ab8c 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -22,6 +22,7 @@ config ARM64      select ARCH_HAS_KCOV      select ARCH_HAS_KEEPINITRD      select ARCH_HAS_MEMBARRIER_SYNC_CORE +    select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE      select ARCH_HAS_PTE_DEVMAP      select ARCH_HAS_PTE_SPECIAL      select ARCH_HAS_SETUP_DMA_OPS diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 8ef85139553f..b9f666db10c1 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -70,6 +70,7 @@ config X86      select ARCH_HAS_KCOV            if X86_64      select ARCH_HAS_MEM_ENCRYPT      select ARCH_HAS_MEMBARRIER_SYNC_CORE +    select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE      select ARCH_HAS_PMEM_API        if X86_64      select ARCH_HAS_PTE_DEVMAP        if X86_64      select ARCH_HAS_PTE_SPECIAL diff --git a/init/Kconfig b/init/Kconfig index 96fc45d1b686..1781810d1501 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2210,6 +2210,9 @@ config ASN1    source "kernel/Kconfig.locks"   +config ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE +    bool +  config ARCH_HAS_SYNC_CORE_BEFORE_USERMODE      bool   diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 0e329d48ab08..80f0072b31e0 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -808,14 +808,16 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)          return &bpf_probe_read_user_proto;      case BPF_FUNC_probe_read_kernel:          return &bpf_probe_read_kernel_proto; -    case BPF_FUNC_probe_read: -        return &bpf_probe_read_compat_proto;      case BPF_FUNC_probe_read_user_str:          return &bpf_probe_read_user_str_proto;      case BPF_FUNC_probe_read_kernel_str:          return &bpf_probe_read_kernel_str_proto; +#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE +    case BPF_FUNC_probe_read: +        return &bpf_probe_read_compat_proto;      case BPF_FUNC_probe_read_str:          return &bpf_probe_read_compat_str_proto; +#endif  #ifdef CONFIG_CGROUPS      case BPF_FUNC_get_current_cgroup_id:          return &bpf_get_current_cgroup_id_proto; From patchwork Wed Apr 21 13:10:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zidenberg, Tsahi" X-Patchwork-Id: 426259 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 34976C43460 for ; Wed, 21 Apr 2021 13:12:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 103CB6144C for ; Wed, 21 Apr 2021 13:12:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239667AbhDUNNW (ORCPT ); Wed, 21 Apr 2021 09:13:22 -0400 Received: from smtp-fw-4101.amazon.com ([72.21.198.25]:45879 "EHLO smtp-fw-4101.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241867AbhDUNLX (ORCPT ); Wed, 21 Apr 2021 09:11:23 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1619010651; x=1650546651; h=subject:from:to:cc:references:message-id:date: mime-version:in-reply-to:content-transfer-encoding; bh=X1Guu4g3y6XQ31GyM/6aVlFFsY0PGtWjyVIc1+SURnU=; b=hU+NOvNoGG1Mcp+2Q6heZXYDNlTVwPGzdH0aZSZMATIzf7ntORKkz67K rkqhye/Q49BuDj+ZcaQctgtZNqjFAuvjcaCA72T9lJIQfY/oD7FaBKYfc R7MCCMqatjtIaRQ3PgVaMEwJbO2D5nMT1fhsSKJtCXvSvAF+D1eh3O4ru Y=; X-IronPort-AV: E=Sophos;i="5.82,240,1613433600"; d="scan'208";a="103137117" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO email-inbound-relay-2b-a7fdc47a.us-west-2.amazon.com) ([10.43.8.2]) by smtp-border-fw-out-4101.iad4.amazon.com with ESMTP; 21 Apr 2021 13:10:44 +0000 Received: from EX13D01EUA001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan2.pdx.amazon.com [10.236.137.194]) by email-inbound-relay-2b-a7fdc47a.us-west-2.amazon.com (Postfix) with ESMTPS id B9476C059F; Wed, 21 Apr 2021 13:10:42 +0000 (UTC) Received: from 8c859063385e.ant.amazon.com (10.43.161.253) by EX13D01EUA001.ant.amazon.com (10.43.165.121) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Wed, 21 Apr 2021 13:10:39 +0000 Subject: [PATCH 4/8] powerpc/bpf: Enable bpf_probe_read{, str}() on powerpc, again From: "Zidenberg, Tsahi" To: CC: Greg KH References: Message-ID: <39c705ad-d4d2-6ba7-dcf1-6e1ef9db4639@amazon.com> Date: Wed, 21 Apr 2021 16:10:34 +0300 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:78.0) Gecko/20100101 Thunderbird/78.9.1 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US X-Originating-IP: [10.43.161.253] X-ClientProxiedBy: EX13D22UWB004.ant.amazon.com (10.43.161.165) To EX13D01EUA001.ant.amazon.com (10.43.165.121) Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit d195b1d1d1196681ac4775e0361e9cca70f740c2 upstream The commit 0ebeea8ca8a4d1d453a ("bpf: Restrict bpf_probe_read{, str}() only to archs where they work") caused that bpf_probe_read{, str}() functions were not longer available on architectures where the same logical address might have different content in kernel and user memory mapping. These architectures should use probe_read_{user,kernel}_str helpers. For backward compatibility, the problematic functions are still available on architectures where the user and kernel address spaces are not overlapping. This is defined CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE. At the moment, these backward compatible functions are enabled only on x86_64, arm, and arm64. Let's do it also on powerpc that has the non overlapping address space as well. Fixes: 0ebeea8ca8a4 ("bpf: Restrict bpf_probe_read{, str}() only to archs where they work") Signed-off-by: Petr Mladek Signed-off-by: Daniel Borkmann Acked-by: Michael Ellerman Link: https://lore.kernel.org/lkml/20200527122844.19524-1-pmladek@suse.com Cc: # 5.4 Signed-off-by: Tsahi Zidenberg ---  arch/powerpc/Kconfig | 1 +  1 file changed, 1 insertion(+) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index c4cbb65e742f..c50bfab7930b 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -129,6 +129,7 @@ config PPC      select ARCH_HAS_MMIOWB            if PPC64      select ARCH_HAS_PHYS_TO_DMA      select ARCH_HAS_PMEM_API +    select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE      select ARCH_HAS_PTE_DEVMAP        if PPC_BOOK3S_64      select ARCH_HAS_PTE_SPECIAL      select ARCH_HAS_MEMBARRIER_CALLBACKS From patchwork Wed Apr 21 13:11:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zidenberg, Tsahi" X-Patchwork-Id: 425836 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA6A4C43618 for ; Wed, 21 Apr 2021 13:16:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A916261490 for ; Wed, 21 Apr 2021 13:16:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243387AbhDUNRM (ORCPT ); Wed, 21 Apr 2021 09:17:12 -0400 Received: from smtp-fw-9103.amazon.com ([207.171.188.200]:6804 "EHLO smtp-fw-9103.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241372AbhDUNMI (ORCPT ); Wed, 21 Apr 2021 09:12:08 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1619010695; x=1650546695; h=subject:from:to:cc:references:message-id:date: mime-version:in-reply-to:content-transfer-encoding; bh=1Fo1DEpE+EuWqINvjKMYf9vDxCxl7txKdsxYq0F7drA=; b=SLr64dMagwqpdLXJYXnn6vrdDUEnt5loLoUTMn/Rp7C3aSJdwkkhJty/ x9XWvb7eg/tZcshLGBDAh4IpnUEVfEXXrKBE0KSEWBJFteFrT3PgMFUtw szpayat/5r5OD6s6Ve4q5Gkq7ccpozsS4LKbQ1hmYU5Ljdvk/qjlPgVV8 0=; X-IronPort-AV: E=Sophos;i="5.82,240,1613433600"; d="scan'208";a="927961451" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO email-inbound-relay-2a-538b0bfb.us-west-2.amazon.com) ([10.25.36.210]) by smtp-border-fw-out-9103.sea19.amazon.com with ESMTP; 21 Apr 2021 13:11:35 +0000 Received: from EX13D01EUA001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan3.pdx.amazon.com [10.236.137.198]) by email-inbound-relay-2a-538b0bfb.us-west-2.amazon.com (Postfix) with ESMTPS id A4C5EA1ACF; Wed, 21 Apr 2021 13:11:34 +0000 (UTC) Received: from 8c859063385e.ant.amazon.com (10.43.162.239) by EX13D01EUA001.ant.amazon.com (10.43.165.121) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Wed, 21 Apr 2021 13:11:31 +0000 Subject: [PATCH 5/8] bpf: Restrict bpf_trace_printk()'s %s usage and add %pks,, %pus specifier From: "Zidenberg, Tsahi" To: CC: Greg KH References: Message-ID: Date: Wed, 21 Apr 2021 16:11:26 +0300 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:78.0) Gecko/20100101 Thunderbird/78.9.1 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US X-Originating-IP: [10.43.162.239] X-ClientProxiedBy: EX13D40UWA004.ant.amazon.com (10.43.160.36) To EX13D01EUA001.ant.amazon.com (10.43.165.121) Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit b2a5212fb634561bb734c6356904e37f6665b955 upstream Usage of plain %s conversion specifier in bpf_trace_printk() suffers from the very same issue as bpf_probe_read{,str}() helpers, that is, it is broken on archs with overlapping address ranges. While the helpers have been addressed through work in 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers"), we need an option for bpf_trace_printk() as well to fix it. Similarly as with the helpers, force users to make an explicit choice by adding %pks and %pus specifier to bpf_trace_printk() which will then pick the corresponding strncpy_from_unsafe*() variant to perform the access under KERNEL_DS or USER_DS. The %pk* (kernel specifier) and %pu* (user specifier) can later also be extended for other objects aside strings that are probed and printed under tracing, and reused out of other facilities like bpf_seq_printf() or BTF based type printing. Existing behavior of %s for current users is still kept working for archs where it is not broken and therefore gated through CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE. For archs not having this property we fall-back to pick probing under KERNEL_DS as a sensible default. conflict resolution: in vsprintf.c, add the new options [u/k] without other options added upstream Fixes: 8d3b7dce8622 ("bpf: add support for %s specifier to bpf_trace_printk()") Reported-by: Linus Torvalds Reported-by: Christoph Hellwig Signed-off-by: Daniel Borkmann Signed-off-by: Alexei Starovoitov Cc: Masami Hiramatsu Cc: Brendan Gregg Link: https://lore.kernel.org/bpf/20200515101118.6508-4-daniel@iogearbox.net Cc: # 5.4 Signed-off-by: Tsahi Zidenberg ---  Documentation/core-api/printk-formats.rst | 14 ++++  kernel/trace/bpf_trace.c                  | 94 +++++++++++++++--------  lib/vsprintf.c                            | 12 +++  3 files changed, 88 insertions(+), 32 deletions(-) diff --git a/Documentation/core-api/printk-formats.rst b/Documentation/core-api/printk-formats.rst index ecbebf4ca8e7..cf8c2ad8abef 100644 --- a/Documentation/core-api/printk-formats.rst +++ b/Documentation/core-api/printk-formats.rst @@ -110,6 +110,20 @@ used when printing stack backtraces. The specifier takes into  consideration the effect of compiler optimisations which may occur  when tail-calls are used and marked with the noreturn GCC attribute.   +Probed Pointers from BPF / tracing +---------------------------------- + +:: + +    %pks    kernel string +    %pus    user string + +The ``k`` and ``u`` specifiers are used for printing prior probed memory from +either kernel memory (k) or user memory (u). The subsequent ``s`` specifier +results in printing a string. For direct use in regular vsnprintf() the (k) +and (u) annotation is ignored, however, when used out of BPF's bpf_trace_printk(), +for example, it reads the memory it is pointing to without faulting. +  Kernel Pointers  ---------------   diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 80f0072b31e0..396b91a9b669 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -325,17 +325,15 @@ static const struct bpf_func_proto *bpf_get_probe_write_proto(void)    /*   * Only limited trace_printk() conversion specifiers allowed: - * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %s + * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pks %pus %s   */  BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,         u64, arg2, u64, arg3)  { +    int i, mod[3] = {}, fmt_cnt = 0; +    char buf[64], fmt_ptype; +    void *unsafe_ptr = NULL;      bool str_seen = false; -    int mod[3] = {}; -    int fmt_cnt = 0; -    u64 unsafe_addr; -    char buf[64]; -    int i;        /*       * bpf_check()->check_func_arg()->check_stack_boundary() @@ -361,40 +359,71 @@ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,          if (fmt[i] == 'l') {              mod[fmt_cnt]++;              i++; -        } else if (fmt[i] == 'p' || fmt[i] == 's') { +        } else if (fmt[i] == 'p') {              mod[fmt_cnt]++; +            if ((fmt[i + 1] == 'k' || +                 fmt[i + 1] == 'u') && +                fmt[i + 2] == 's') { +                fmt_ptype = fmt[i + 1]; +                i += 2; +                goto fmt_str; +            } +              /* disallow any further format extensions */              if (fmt[i + 1] != 0 &&                  !isspace(fmt[i + 1]) &&                  !ispunct(fmt[i + 1]))                  return -EINVAL; -            fmt_cnt++; -            if (fmt[i] == 's') { -                if (str_seen) -                    /* allow only one '%s' per fmt string */ -                    return -EINVAL; -                str_seen = true; - -                switch (fmt_cnt) { -                case 1: -                    unsafe_addr = arg1; -                    arg1 = (long) buf; -                    break; -                case 2: -                    unsafe_addr = arg2; -                    arg2 = (long) buf; -                    break; -                case 3: -                    unsafe_addr = arg3; -                    arg3 = (long) buf; -                    break; -                } -                buf[0] = 0; -                strncpy_from_unsafe(buf, -                            (void *) (long) unsafe_addr, + +            goto fmt_next; +        } else if (fmt[i] == 's') { +            mod[fmt_cnt]++; +            fmt_ptype = fmt[i]; +fmt_str: +            if (str_seen) +                /* allow only one '%s' per fmt string */ +                return -EINVAL; +            str_seen = true; + +            if (fmt[i + 1] != 0 && +                !isspace(fmt[i + 1]) && +                !ispunct(fmt[i + 1])) +                return -EINVAL; + +            switch (fmt_cnt) { +            case 0: +                unsafe_ptr = (void *)(long)arg1; +                arg1 = (long)buf; +                break; +            case 1: +                unsafe_ptr = (void *)(long)arg2; +                arg2 = (long)buf; +                break; +            case 2: +                unsafe_ptr = (void *)(long)arg3; +                arg3 = (long)buf; +                break; +            } + +            buf[0] = 0; +            switch (fmt_ptype) { +            case 's': +#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE +                strncpy_from_unsafe(buf, unsafe_ptr,                              sizeof(buf)); +                break; +#endif +            case 'k': +                strncpy_from_unsafe_strict(buf, unsafe_ptr, +                               sizeof(buf)); +                break; +            case 'u': +                strncpy_from_unsafe_user(buf, +                    (__force void __user *)unsafe_ptr, +                             sizeof(buf)); +                break;              } -            continue; +            goto fmt_next;          }            if (fmt[i] == 'l') { @@ -405,6 +434,7 @@ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,          if (fmt[i] != 'i' && fmt[i] != 'd' &&              fmt[i] != 'u' && fmt[i] != 'x')              return -EINVAL; +fmt_next:          fmt_cnt++;      }   diff --git a/lib/vsprintf.c b/lib/vsprintf.c index fb4af73142b4..985ea5c87465 100644 --- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -2116,6 +2116,10 @@ static char *kobject_string(char *buf, char *end, void *ptr,   *                  c major compatible string   *                  C full compatible string   * - 'x' For printing the address. Equivalent to "%lx". + * - '[ku]s' For a BPF/tracing related format specifier, e.g. used out of + *           bpf_trace_printk() where [ku] prefix specifies either kernel (k) + *           or user (u) memory to probe, and: + *              s a string, equivalent to "%s" on direct vsnprintf() use   *   * ** When making changes please also update:   *    Documentation/core-api/printk-formats.rst @@ -2194,6 +2198,14 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr,          return kobject_string(buf, end, ptr, spec, fmt);      case 'x':          return pointer_string(buf, end, ptr, spec); +    case 'u': +    case 'k': +        switch (fmt[1]) { +        case 's': +            return string(buf, end, ptr, spec); +        default: +            return error_string(buf, end, "(einval)", spec); +        }      }        /* default is to _not_ leak addresses, hash before printing */ From patchwork Wed Apr 21 13:12:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zidenberg, Tsahi" X-Patchwork-Id: 426258 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CDCC2C43600 for ; Wed, 21 Apr 2021 13:16:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 95C2461476 for ; Wed, 21 Apr 2021 13:16:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241083AbhDUNRS (ORCPT ); Wed, 21 Apr 2021 09:17:18 -0400 Received: from smtp-fw-33001.amazon.com ([207.171.190.10]:28794 "EHLO smtp-fw-33001.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241647AbhDUNM6 (ORCPT ); Wed, 21 Apr 2021 09:12:58 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1619010745; x=1650546745; h=subject:from:to:cc:references:message-id:date: mime-version:in-reply-to:content-transfer-encoding; bh=zvmE2SQD2EUaZ6X04lwfy22RFgve7HWJGYg0gUkcLrI=; b=aIDfj1HudnFxyMr/OQDz7CaHnfbqfVPRi+DoposeMs8vcxDZfPM7fGiV QhAE/CSlcM6ZaRAiW8RQOtOMM+5hP7Kvh+6YeR9osxn3f/8MAZUpa0H7a WNkZRWtHmV8sjp+mCPtlsUlpsOSsweeZKHZRXSbcHKk0aHh1xjvJtZd86 I=; X-IronPort-AV: E=Sophos;i="5.82,240,1613433600"; d="scan'208";a="120451172" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO email-inbound-relay-1d-98acfc19.us-east-1.amazon.com) ([10.25.36.210]) by smtp-border-fw-out-33001.sea14.amazon.com with ESMTP; 21 Apr 2021 13:12:23 +0000 Received: from EX13D01EUA001.ant.amazon.com (iad12-ws-svc-p26-lb9-vlan3.iad.amazon.com [10.40.163.38]) by email-inbound-relay-1d-98acfc19.us-east-1.amazon.com (Postfix) with ESMTPS id CCBDEA20EF; Wed, 21 Apr 2021 13:12:22 +0000 (UTC) Received: from 8c859063385e.ant.amazon.com (10.43.160.209) by EX13D01EUA001.ant.amazon.com (10.43.165.121) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Wed, 21 Apr 2021 13:12:20 +0000 Subject: [PATCH 6/8] maccess: rename strncpy_from_unsafe_user to, strncpy_from_user_nofault From: "Zidenberg, Tsahi" To: CC: Greg KH References: Message-ID: Date: Wed, 21 Apr 2021 16:12:14 +0300 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:78.0) Gecko/20100101 Thunderbird/78.9.1 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US X-Originating-IP: [10.43.160.209] X-ClientProxiedBy: EX13D22UWC003.ant.amazon.com (10.43.162.250) To EX13D01EUA001.ant.amazon.com (10.43.165.121) Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit bd88bb5d4007949be7154deae7cef7173c751a95 upstream This matches the naming of strncpy_from_user, and also makes it more clear what the function is supposed to do. Signed-off-by: Christoph Hellwig Signed-off-by: Andrew Morton Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Masami Hiramatsu Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20200521152301.2587579-7-hch@lst.de Signed-off-by: Linus Torvalds Cc: # 5.4 Signed-off-by: Tsahi Zidenberg ---  include/linux/uaccess.h     | 4 ++--  kernel/trace/bpf_trace.c    | 4 ++--  kernel/trace/trace_kprobe.c | 2 +-  mm/maccess.c                | 4 ++--  4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h index 67f016010aad..23e655549be2 100644 --- a/include/linux/uaccess.h +++ b/include/linux/uaccess.h @@ -354,8 +354,8 @@ extern long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count);  extern long strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr,                         long count);  extern long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count); -extern long strncpy_from_unsafe_user(char *dst, const void __user *unsafe_addr, -                     long count); +long strncpy_from_user_nofault(char *dst, const void __user *unsafe_addr, +        long count);  extern long strnlen_unsafe_user(const void __user *unsafe_addr, long count);    /** diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 396b91a9b669..720d78c62d05 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -161,7 +161,7 @@ static const struct bpf_func_proto bpf_probe_read_user_proto = {  BPF_CALL_3(bpf_probe_read_user_str, void *, dst, u32, size,         const void __user *, unsafe_ptr)  { -    int ret = strncpy_from_unsafe_user(dst, unsafe_ptr, size); +    int ret = strncpy_from_user_nofault(dst, unsafe_ptr, size);        if (unlikely(ret < 0))          memset(dst, 0, size); @@ -418,7 +418,7 @@ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,                                 sizeof(buf));                  break;              case 'u': -                strncpy_from_unsafe_user(buf, +                strncpy_from_user_nofault(buf,                      (__force void __user *)unsafe_ptr,                               sizeof(buf));                  break; diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index 233322c77b76..6e26364f1005 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -1104,7 +1104,7 @@ fetch_store_string_user(unsigned long addr, void *dest, void *base)        __dest = get_loc_data(dest, base);   -    ret = strncpy_from_unsafe_user(__dest, uaddr, maxlen); +    ret = strncpy_from_user_nofault(__dest, uaddr, maxlen);      if (ret >= 0)          *(u32 *)dest = make_data_loc(ret, __dest - base);   diff --git a/mm/maccess.c b/mm/maccess.c index 3ca8d97e5010..84c598673aa9 100644 --- a/mm/maccess.c +++ b/mm/maccess.c @@ -205,7 +205,7 @@ long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count)  }    /** - * strncpy_from_unsafe_user: - Copy a NUL terminated string from unsafe user + * strncpy_from_user_nofault: - Copy a NUL terminated string from unsafe user   *                address.   * @dst:   Destination address, in kernel space.  This buffer must be at   *         least @count bytes long. @@ -222,7 +222,7 @@ long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count)   * If @count is smaller than the length of the string, copies @count-1 bytes,   * sets the last byte of @dst buffer to NUL and returns @count.   */ -long strncpy_from_unsafe_user(char *dst, const void __user *unsafe_addr, +long strncpy_from_user_nofault(char *dst, const void __user *unsafe_addr,                    long count)  {      mm_segment_t old_fs = get_fs(); From patchwork Wed Apr 21 13:13:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zidenberg, Tsahi" X-Patchwork-Id: 426257 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A8BC9C4363E for ; Wed, 21 Apr 2021 13:16:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6CBF16146D for ; Wed, 21 Apr 2021 13:16:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238604AbhDUNRS (ORCPT ); Wed, 21 Apr 2021 09:17:18 -0400 Received: from smtp-fw-33001.amazon.com ([207.171.190.10]:30186 "EHLO smtp-fw-33001.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239666AbhDUNOB (ORCPT ); Wed, 21 Apr 2021 09:14:01 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1619010808; x=1650546808; h=subject:from:to:cc:references:message-id:date: mime-version:in-reply-to:content-transfer-encoding; bh=Y3lJQFkY7OhNxx6rHkrgq18kqGIQr7cXxaUh0AOL81s=; b=M9q6Voyo1vonPsYiYrFDDzGyYiyok95YYpFBfV5J0zBsoKhchcO7ui+T WEMVwygGT3W9mJibLmSCmCu3acBgUh5qoVPeyE4XLMcCwQtmnpOW9ldqH bP1RE6vAUuD8wHcEwkKCf6xdTfMzbF+ycwi+G/TUHK7FJY/o/oVD0Ql9H 0=; X-IronPort-AV: E=Sophos;i="5.82,240,1613433600"; d="scan'208";a="120451457" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO email-inbound-relay-1d-2c665b5d.us-east-1.amazon.com) ([10.25.36.210]) by smtp-border-fw-out-33001.sea14.amazon.com with ESMTP; 21 Apr 2021 13:13:28 +0000 Received: from EX13D01EUA001.ant.amazon.com (iad12-ws-svc-p26-lb9-vlan3.iad.amazon.com [10.40.163.38]) by email-inbound-relay-1d-2c665b5d.us-east-1.amazon.com (Postfix) with ESMTPS id 88C09A0598; Wed, 21 Apr 2021 13:13:27 +0000 (UTC) Received: from 8c859063385e.ant.amazon.com (10.43.160.209) by EX13D01EUA001.ant.amazon.com (10.43.165.121) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Wed, 21 Apr 2021 13:13:25 +0000 Subject: [PATCH 7/8] maccess: rename strncpy_from_unsafe_strict to, strncpy_from_kernel_nofault From: "Zidenberg, Tsahi" To: CC: Greg KH References: Message-ID: <9239b0c8-50ac-2984-8d3e-e1f097d4bcd5@amazon.com> Date: Wed, 21 Apr 2021 16:13:18 +0300 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:78.0) Gecko/20100101 Thunderbird/78.9.1 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US X-Originating-IP: [10.43.160.209] X-ClientProxiedBy: EX13D05UWC002.ant.amazon.com (10.43.162.92) To EX13D01EUA001.ant.amazon.com (10.43.165.121) Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit c4cb164426aebd635baa53685b0ebf1a127e9803 upstream This matches the naming of strncpy_from_user_nofault, and also makes it more clear what the function is supposed to do. conflict resolution: comments in mm/maccess.c Signed-off-by: Christoph Hellwig Signed-off-by: Andrew Morton Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Masami Hiramatsu Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20200521152301.2587579-8-hch@lst.de Signed-off-by: Linus Torvalds Cc: # 5.4 Signed-off-by: Tsahi Zidenberg ---  arch/x86/mm/maccess.c    |  2 +-  include/linux/uaccess.h  |  4 ++--  kernel/trace/bpf_trace.c |  4 ++--  mm/maccess.c             | 31 +++++++++++++++++++++++++------  4 files changed, 30 insertions(+), 11 deletions(-) diff --git a/arch/x86/mm/maccess.c b/arch/x86/mm/maccess.c index f5b85bdc0535..62c4017a2473 100644 --- a/arch/x86/mm/maccess.c +++ b/arch/x86/mm/maccess.c @@ -34,7 +34,7 @@ long probe_kernel_read_strict(void *dst, const void *src, size_t size)      return __probe_kernel_read(dst, src, size);  }   -long strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, long count) +long strncpy_from_kernel_nofault(char *dst, const void *unsafe_addr, long count)  {      if (unlikely(invalid_probe_range((unsigned long)unsafe_addr)))          return -EFAULT; diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h index 23e655549be2..7c61d3ddae57 100644 --- a/include/linux/uaccess.h +++ b/include/linux/uaccess.h @@ -351,8 +351,8 @@ extern long notrace probe_user_write(void __user *dst, const void *src, size_t s  extern long notrace __probe_user_write(void __user *dst, const void *src, size_t size);    extern long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count); -extern long strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, -                       long count); +long strncpy_from_kernel_nofault(char *dst, const void *unsafe_addr, +        long count);  extern long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count);  long strncpy_from_user_nofault(char *dst, const void __user *unsafe_addr,          long count); diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 720d78c62d05..7b905aa800b2 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -242,7 +242,7 @@ bpf_probe_read_kernel_str_common(void *dst, u32 size, const void *unsafe_ptr,       * is returned that can be used for bpf_perf_event_output() et al.       */      ret = compat ? strncpy_from_unsafe(dst, unsafe_ptr, size) : -          strncpy_from_unsafe_strict(dst, unsafe_ptr, size); +          strncpy_from_kernel_nofault(dst, unsafe_ptr, size);      if (unlikely(ret < 0))  out:          memset(dst, 0, size); @@ -414,7 +414,7 @@ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,                  break;  #endif              case 'k': -                strncpy_from_unsafe_strict(buf, unsafe_ptr, +                strncpy_from_kernel_nofault(buf, unsafe_ptr,                                 sizeof(buf));                  break;              case 'u': diff --git a/mm/maccess.c b/mm/maccess.c index 84c598673aa9..82863bc6b550 100644 --- a/mm/maccess.c +++ b/mm/maccess.c @@ -167,17 +167,36 @@ EXPORT_SYMBOL_GPL(probe_user_write);   * If @count is smaller than the length of the string, copies @count-1 bytes,   * sets the last byte of @dst buffer to NUL and returns @count.   * - * strncpy_from_unsafe_strict() is the same as strncpy_from_unsafe() except - * for the case where architectures have non-overlapping user and kernel address - * ranges: strncpy_from_unsafe_strict() will additionally return -EFAULT for - * probing memory on a user address range where strncpy_from_unsafe_user() is - * supposed to be used instead. + * Same as strncpy_from_kernel_nofault() except that for architectures with + * not fully separated user and kernel address spaces this function also works + * for user address tanges. + * + * DO NOT USE THIS FUNCTION - it is broken on architectures with entirely + * separate kernel and user address spaces, and also a bad idea otherwise.   */    long __weak strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count)      __attribute__((alias("__strncpy_from_unsafe")));   -long __weak strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, +/** + * strncpy_from_kernel_nofault: - Copy a NUL terminated string from unsafe + *                 address. + * @dst:   Destination address, in kernel space.  This buffer must be at + *         least @count bytes long. + * @unsafe_addr: Unsafe address. + * @count: Maximum number of bytes to copy, including the trailing NUL. + * + * Copies a NUL-terminated string from unsafe address to kernel buffer. + * + * On success, returns the length of the string INCLUDING the trailing NUL. + * + * If access fails, returns -EFAULT (some data may have been copied + * and the trailing NUL added). + * + * If @count is smaller than the length of the string, copies @count-1 bytes, + * sets the last byte of @dst buffer to NUL and returns @count. + */ +long __weak strncpy_from_kernel_nofault(char *dst, const void *unsafe_addr,                         long count)      __attribute__((alias("__strncpy_from_unsafe")));   From patchwork Wed Apr 21 13:14:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zidenberg, Tsahi" X-Patchwork-Id: 425835 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2B380C43140 for ; Wed, 21 Apr 2021 13:16:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E3A5761452 for ; Wed, 21 Apr 2021 13:16:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241062AbhDUNRT (ORCPT ); Wed, 21 Apr 2021 09:17:19 -0400 Received: from smtp-fw-6002.amazon.com ([52.95.49.90]:53036 "EHLO smtp-fw-6002.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242508AbhDUNPK (ORCPT ); Wed, 21 Apr 2021 09:15:10 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1619010878; x=1650546878; h=subject:from:to:cc:references:message-id:date: mime-version:in-reply-to:content-transfer-encoding; bh=+p7LL96OPesTTJ2+Gjoku94KKXlIqR8BPVG20QQ92yI=; b=I5+Z0S8NUyT2zYLwy57cpRxnpfi3jrcz9/DMTgP7tGfj7auyWBgC1u80 yMzXRHf15HHvOZC/CiYg9eVfadaf104ro8ExwErL0SFxOR+QZs8ZZuOBj PFyUNdPFpXbfbLBDJrTK36EgbwKLPTVxn1CtXmIlXRPzdUlud/CeFoWac s=; X-IronPort-AV: E=Sophos;i="5.82,240,1613433600"; d="scan'208";a="107554639" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO email-inbound-relay-1e-c7c08562.us-east-1.amazon.com) ([10.43.8.2]) by smtp-border-fw-out-6002.iad6.amazon.com with ESMTP; 21 Apr 2021 13:14:31 +0000 Received: from EX13D01EUA001.ant.amazon.com (iad12-ws-svc-p26-lb9-vlan2.iad.amazon.com [10.40.163.34]) by email-inbound-relay-1e-c7c08562.us-east-1.amazon.com (Postfix) with ESMTPS id 8FA4F240BD0; Wed, 21 Apr 2021 13:14:30 +0000 (UTC) Received: from 8c859063385e.ant.amazon.com (10.43.161.253) by EX13D01EUA001.ant.amazon.com (10.43.165.121) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Wed, 21 Apr 2021 13:14:26 +0000 Subject: [PATCH 8/8] bpf: rework the compat kernel probe handling From: "Zidenberg, Tsahi" To: CC: Greg KH References: Message-ID: Date: Wed, 21 Apr 2021 16:14:21 +0300 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:78.0) Gecko/20100101 Thunderbird/78.9.1 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US X-Originating-IP: [10.43.161.253] X-ClientProxiedBy: EX13D17UWB003.ant.amazon.com (10.43.161.42) To EX13D01EUA001.ant.amazon.com (10.43.165.121) Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit 8d92db5c04d10381f4db70ed99b1b576f5db18a7 upstream Instead of using the dangerous probe_kernel_read and strncpy_from_unsafe helpers, rework the compat probes to check if an address is a kernel or userspace one, and then use the low-level kernel or user probe helper shared by the proper kernel and user probe helpers.  This slightly changes behavior as the compat probe on a user address doesn't check the lockdown flags, just as the pure user probes do. conflict resolution: in kernel 5.4, these bpf_func_proto structs are static. Signed-off-by: Christoph Hellwig Signed-off-by: Andrew Morton Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Masami Hiramatsu Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20200521152301.2587579-14-hch@lst.de Signed-off-by: Linus Torvalds Cc: # 5.4 Signed-off-by: Tsahi Zidenberg ---  kernel/trace/bpf_trace.c | 109 ++++++++++++++++++++++++---------------  1 file changed, 67 insertions(+), 42 deletions(-) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 7b905aa800b2..ac07eeb4fa06 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -138,17 +138,23 @@ static const struct bpf_func_proto bpf_override_return_proto = {  };  #endif   -BPF_CALL_3(bpf_probe_read_user, void *, dst, u32, size, -       const void __user *, unsafe_ptr) +static __always_inline int +bpf_probe_read_user_common(void *dst, u32 size, const void __user *unsafe_ptr)  { -    int ret = probe_user_read(dst, unsafe_ptr, size); +    int ret;   +    ret = probe_user_read(dst, unsafe_ptr, size);      if (unlikely(ret < 0))          memset(dst, 0, size); -      return ret;  }   +BPF_CALL_3(bpf_probe_read_user, void *, dst, u32, size, +       const void __user *, unsafe_ptr) +{ +    return bpf_probe_read_user_common(dst, size, unsafe_ptr); +} +  static const struct bpf_func_proto bpf_probe_read_user_proto = {      .func        = bpf_probe_read_user,      .gpl_only    = true, @@ -158,17 +164,24 @@ static const struct bpf_func_proto bpf_probe_read_user_proto = {      .arg3_type    = ARG_ANYTHING,  };   -BPF_CALL_3(bpf_probe_read_user_str, void *, dst, u32, size, -       const void __user *, unsafe_ptr) +static __always_inline int +bpf_probe_read_user_str_common(void *dst, u32 size, +                   const void __user *unsafe_ptr)  { -    int ret = strncpy_from_user_nofault(dst, unsafe_ptr, size); +    int ret;   +    ret = strncpy_from_user_nofault(dst, unsafe_ptr, size);      if (unlikely(ret < 0))          memset(dst, 0, size); -      return ret;  }   +BPF_CALL_3(bpf_probe_read_user_str, void *, dst, u32, size, +       const void __user *, unsafe_ptr) +{ +    return bpf_probe_read_user_str_common(dst, size, unsafe_ptr); +} +  static const struct bpf_func_proto bpf_probe_read_user_str_proto = {      .func        = bpf_probe_read_user_str,      .gpl_only    = true, @@ -179,25 +192,25 @@ static const struct bpf_func_proto bpf_probe_read_user_str_proto = {  };    static __always_inline int -bpf_probe_read_kernel_common(void *dst, u32 size, const void *unsafe_ptr, -                 const bool compat) +bpf_probe_read_kernel_common(void *dst, u32 size, const void *unsafe_ptr)  {      int ret = security_locked_down(LOCKDOWN_BPF_READ);        if (unlikely(ret < 0)) -        goto out; -    ret = compat ? probe_kernel_read(dst, unsafe_ptr, size) : -          probe_kernel_read_strict(dst, unsafe_ptr, size); +        goto fail; +    ret = probe_kernel_read_strict(dst, unsafe_ptr, size);      if (unlikely(ret < 0)) -out: -        memset(dst, 0, size); +        goto fail; +    return ret; +fail: +    memset(dst, 0, size);      return ret;  }    BPF_CALL_3(bpf_probe_read_kernel, void *, dst, u32, size,         const void *, unsafe_ptr)  { -    return bpf_probe_read_kernel_common(dst, size, unsafe_ptr, false); +    return bpf_probe_read_kernel_common(dst, size, unsafe_ptr);  }    static const struct bpf_func_proto bpf_probe_read_kernel_proto = { @@ -209,50 +222,37 @@ static const struct bpf_func_proto bpf_probe_read_kernel_proto = {      .arg3_type    = ARG_ANYTHING,  };   -BPF_CALL_3(bpf_probe_read_compat, void *, dst, u32, size, -       const void *, unsafe_ptr) -{ -    return bpf_probe_read_kernel_common(dst, size, unsafe_ptr, true); -} - -static const struct bpf_func_proto bpf_probe_read_compat_proto = { -    .func        = bpf_probe_read_compat, -    .gpl_only    = true, -    .ret_type    = RET_INTEGER, -    .arg1_type    = ARG_PTR_TO_UNINIT_MEM, -    .arg2_type    = ARG_CONST_SIZE_OR_ZERO, -    .arg3_type    = ARG_ANYTHING, -}; -  static __always_inline int -bpf_probe_read_kernel_str_common(void *dst, u32 size, const void *unsafe_ptr, -                 const bool compat) +bpf_probe_read_kernel_str_common(void *dst, u32 size, const void *unsafe_ptr)  {      int ret = security_locked_down(LOCKDOWN_BPF_READ);        if (unlikely(ret < 0)) -        goto out; +        goto fail; +      /* -     * The strncpy_from_unsafe_*() call will likely not fill the entire -     * buffer, but that's okay in this circumstance as we're probing +     * The strncpy_from_kernel_nofault() call will likely not fill the +     * entire buffer, but that's okay in this circumstance as we're probing       * arbitrary memory anyway similar to bpf_probe_read_*() and might       * as well probe the stack. Thus, memory is explicitly cleared       * only in error case, so that improper users ignoring return       * code altogether don't copy garbage; otherwise length of string       * is returned that can be used for bpf_perf_event_output() et al.       */ -    ret = compat ? strncpy_from_unsafe(dst, unsafe_ptr, size) : -          strncpy_from_kernel_nofault(dst, unsafe_ptr, size); +    ret = strncpy_from_kernel_nofault(dst, unsafe_ptr, size);      if (unlikely(ret < 0)) -out: -        memset(dst, 0, size); +        goto fail; + +    return 0; +fail: +    memset(dst, 0, size);      return ret;  }    BPF_CALL_3(bpf_probe_read_kernel_str, void *, dst, u32, size,         const void *, unsafe_ptr)  { -    return bpf_probe_read_kernel_str_common(dst, size, unsafe_ptr, false); +    return bpf_probe_read_kernel_str_common(dst, size, unsafe_ptr);  }    static const struct bpf_func_proto bpf_probe_read_kernel_str_proto = { @@ -264,10 +264,34 @@ static const struct bpf_func_proto bpf_probe_read_kernel_str_proto = {      .arg3_type    = ARG_ANYTHING,  };   +#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE +BPF_CALL_3(bpf_probe_read_compat, void *, dst, u32, size, +       const void *, unsafe_ptr) +{ +    if ((unsigned long)unsafe_ptr < TASK_SIZE) { +        return bpf_probe_read_user_common(dst, size, +                (__force void __user *)unsafe_ptr); +    } +    return bpf_probe_read_kernel_common(dst, size, unsafe_ptr); +} + +static const struct bpf_func_proto bpf_probe_read_compat_proto = { +    .func        = bpf_probe_read_compat, +    .gpl_only    = true, +    .ret_type    = RET_INTEGER, +    .arg1_type    = ARG_PTR_TO_UNINIT_MEM, +    .arg2_type    = ARG_CONST_SIZE_OR_ZERO, +    .arg3_type    = ARG_ANYTHING, +}; +  BPF_CALL_3(bpf_probe_read_compat_str, void *, dst, u32, size,         const void *, unsafe_ptr)  { -    return bpf_probe_read_kernel_str_common(dst, size, unsafe_ptr, true); +    if ((unsigned long)unsafe_ptr < TASK_SIZE) { +        return bpf_probe_read_user_str_common(dst, size, +                (__force void __user *)unsafe_ptr); +    } +    return bpf_probe_read_kernel_str_common(dst, size, unsafe_ptr);  }    static const struct bpf_func_proto bpf_probe_read_compat_str_proto = { @@ -278,6 +302,7 @@ static const struct bpf_func_proto bpf_probe_read_compat_str_proto = {      .arg2_type    = ARG_CONST_SIZE_OR_ZERO,      .arg3_type    = ARG_ANYTHING,  }; +#endif /* CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE */    BPF_CALL_3(bpf_probe_write_user, void __user *, unsafe_ptr, const void *, src,         u32, size)