[5.12,081/173] tracing: Correct the length check which causes memory corruption

From: Liangyan <liangyan.peng@linux.alibaba.com>

commit 3e08a9f9760f4a70d633c328a76408e62d6f80a3 upstream.

We've suffered severe kernel crashes due to memory corruption in
our production environment, for example:

Call Trace:
[1640542.554277] general protection fault: 0000 [#1] SMP PTI
[1640542.554856] CPU: 17 PID: 26996 Comm: python Kdump: loaded Tainted:G
[1640542.556629] RIP: 0010:kmem_cache_alloc+0x90/0x190
[1640542.559074] RSP: 0018:ffffb16faa597df8 EFLAGS: 00010286
[1640542.559587] RAX: 0000000000000000 RBX: 0000000000400200 RCX: 0000000006e931bf
[1640542.560323] RDX: 0000000006e931be RSI: 0000000000400200 RDI: ffff9a45ff004300
[1640542.560996] RBP: 0000000000400200 R08: 0000000000023420 R09: 0000000000000000
[1640542.561670] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff9a20608d
[1640542.562366] R13: ffff9a45ff004300 R14: ffff9a45ff004300 R15: 696c662f65636976
[1640542.563128] FS:  00007f45d7c6f740(0000) GS:ffff9a45ff840000(0000) knlGS:0000000000000000
[1640542.563937] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1640542.564557] CR2: 00007f45d71311a0 CR3: 000000189d63e004 CR4: 00000000003606e0
[1640542.565279] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1640542.566069] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[1640542.566742] Call Trace:
[1640542.567009]  anon_vma_clone+0x5d/0x170
[1640542.567417]  __split_vma+0x91/0x1a0
[1640542.567777]  do_munmap+0x2c6/0x320
[1640542.568128]  vm_munmap+0x54/0x70
[1640542.569990]  __x64_sys_munmap+0x22/0x30
[1640542.572005]  do_syscall_64+0x5b/0x1b0
[1640542.573724]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[1640542.575642] RIP: 0033:0x7f45d6e61e27

James Wang has reproduced it reliably on the latest 4.19 LTS.
After some debugging, we confirmed with a debug tool that it is
caused by an out-of-bounds access to the ftrace buffer:
[   86.775200] BUG: Out-of-bounds write at addr 0xffff88aefe8b7000
[   86.780806]  no_context+0xdf/0x3c0
[   86.784327]  __do_page_fault+0x252/0x470
[   86.788367]  do_page_fault+0x32/0x140
[   86.792145]  page_fault+0x1e/0x30
[   86.795576]  strncpy_from_unsafe+0x66/0xb0
[   86.799789]  fetch_memory_string+0x25/0x40
[   86.804002]  fetch_deref_string+0x51/0x60
[   86.808134]  kprobe_trace_func+0x32d/0x3a0
[   86.812347]  kprobe_dispatcher+0x45/0x50
[   86.816385]  kprobe_ftrace_handler+0x90/0xf0
[   86.820779]  ftrace_ops_assist_func+0xa1/0x140
[   86.825340]  0xffffffffc00750bf
[   86.828603]  do_sys_open+0x5/0x1f0
[   86.832124]  do_syscall_64+0x5b/0x1b0
[   86.835900]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Commit b220c049d519 ("tracing: Check length before giving out
the filter buffer") added a length check to guard against the trace
data overflow introduced in 0fc1b09ff1ff. That fix does not prevent
the overflow entirely: the length check must also take the sizeof
entry->array[0] into account, because array[0] is filled with the
length of the trace data and occupies additional space, which can
still overflow the buffer.
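
The arithmetic behind this can be illustrated with a small user-space
sketch. It is not the kernel code: the struct, the function names and
the 4096-byte page size are all assumptions made for illustration. It
only models an event made of a fixed header, a length slot in array[0],
and a payload of len bytes, and compares a check that ignores the
length slot with one that reserves room for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed buffer size for the sketch: one 4 KiB page. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical stand-in for the event header. Only the layout matters:
 * a fixed header followed by array[], where array[0] stores the payload
 * length and the payload itself starts right after it.
 */
struct sketch_event {
	uint32_t header;
	uint32_t array[];
};

/* Bytes the event really occupies: header + length slot + payload. */
static size_t sketch_bytes_used(size_t len)
{
	return sizeof(struct sketch_event) + sizeof(uint32_t) + len;
}

/* Length check that ignores the array[0] length slot. */
static bool sketch_fits_without_slot(size_t len)
{
	return len < SKETCH_PAGE_SIZE - sizeof(struct sketch_event);
}

/* Length check that also reserves room for array[0]. */
static bool sketch_fits_with_slot(size_t len)
{
	return len < SKETCH_PAGE_SIZE - sizeof(struct sketch_event)
		     - sizeof(((struct sketch_event *)0)->array[0]);
}

/* Walk through the boundary case under the assumptions above. */
static void sketch_self_check(void)
{
	size_t len = 4091;	/* largest len the first check accepts */

	assert(sketch_fits_without_slot(len));
	assert(sketch_bytes_used(len) > SKETCH_PAGE_SIZE);	/* 4099 bytes */
	assert(!sketch_fits_with_slot(len));
	assert(sketch_bytes_used(4087) <= SKETCH_PAGE_SIZE);	/* 4095 bytes */
}
```

With these assumed sizes, a payload of 4088 to 4091 bytes passes the
first check yet needs more than one page once the length slot is
counted, which is the kind of overrun described above.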

Link: https://lkml.kernel.org/r/20210607125734.1770447-1-liangyan.peng@linux.alibaba.com

Cc: stable@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Xunlei Pang <xlpang@linux.alibaba.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: b220c049d519 ("tracing: Check length before giving out the filter buffer")
Reviewed-by: Xunlei Pang <xlpang@linux.alibaba.com>
Reviewed-by: yinbinbin <yinbinbin@alibabacloud.com>
Reviewed-by: Wetp Zhang <wetp.zy@linux.alibaba.com>
Tested-by: James Wang <jnwang@linux.alibaba.com>
Signed-off-by: Liangyan <liangyan.peng@linux.alibaba.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/trace/trace.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Message ID	20210614102700.853933597@linuxfoundation.org
State	Superseded
Headers
Return-Path: <stable-owner@kernel.org>
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>, stable@vger.kernel.org, Ingo Molnar <mingo@redhat.com>, Xunlei Pang <xlpang@linux.alibaba.com>, yinbinbin <yinbinbin@alibabacloud.com>, Wetp Zhang <wetp.zy@linux.alibaba.com>, James Wang <jnwang@linux.alibaba.com>, Liangyan <liangyan.peng@linux.alibaba.com>, "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Subject: [PATCH 5.12 081/173] tracing: Correct the length check which causes memory corruption
Date: Mon, 14 Jun 2021 12:26:53 +0200
Message-Id: <20210614102700.853933597@linuxfoundation.org>
In-Reply-To: <20210614102658.137943264@linuxfoundation.org>
References: <20210614102658.137943264@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
