[1/4] perf/core: Add support to exclude kernel mode instruction tracing

Message ID 89c7ff59d887a0360434e607bd625393ec3190e5.1611909025.git.saiprakash.ranjan@codeaurora.org
State New
Series Add support to exclude kernel mode hardware assisted instruction tracing

Commit Message

Sai Prakash Ranjan Jan. 29, 2021, 7:05 p.m. UTC
Hardware assisted tracing families such as ARM Coresight and Intel PT
provide rich tracing capabilities, including instruction level tracing
and accurate timestamps, which are very useful for profiling but also
pose a significant security risk. One example of such a risk arises
when kernel mode tracing is not excluded: these hardware assisted
tracers can then be used to analyze cryptographic code execution. In
this case, even the root user must not be able to infer anything.

To explain it more clearly in the words of a security team member
(credits: Mattias Nissler),

"Consider a system where disk contents are encrypted and the encryption
key is set up by the user when mounting the file system. From that point
on the encryption key resides in the kernel. It seems reasonable to
expect that the disk encryption key be protected from exfiltration even
if the system later suffers a root compromise (or even against insiders
that have root access), at least as long as the attacker doesn't
manage to compromise the kernel."

Here the idea is to protect such important information from all users,
including root users, since root privileges do not have to mean full
control over the kernel [1] and root compromise does not have to be
the end of the world.

Currently we can exclude kernel mode tracing via the perf_event_paranoid
sysctl, but it has the following limitations:

 * It is applicable to all PMUs and not just the ones supporting
   instruction tracing.
 * No option to restrict kernel mode instruction tracing by the
   root user.
 * Not possible to restrict kernel mode instruction tracing when
   hardware assisted tracing IPs like ARM Coresight ETMs expose a
   sysfs interface for tracing in addition to the perf interface.
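
The second limitation can be illustrated with a minimal user-space model
of the existing gate. This is only a sketch under an assumption: that the
kernel-side check amounts to "deny kernel mode events when
perf_event_paranoid is above 1 and the caller is unprivileged". The names
paranoid_allows_kernel() and paranoid_examples() are hypothetical and are
not kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of the perf_event_paranoid gate for kernel mode
 * events. Assumption: access is denied only when the sysctl value is
 * above 1 AND the caller is not privileged.
 */
int paranoid_allows_kernel(int perf_event_paranoid, bool privileged)
{
	if (perf_event_paranoid > 1 && !privileged)
		return -EACCES;
	return 0;
}

/* Usage example: the gate never fires for a privileged (root) caller. */
void paranoid_examples(void)
{
	assert(paranoid_allows_kernel(2, false) == -EACCES);
	assert(paranoid_allows_kernel(2, true) == 0);
	assert(paranoid_allows_kernel(3, true) == 0);
}
```

In this model no sysctl value stops a privileged caller, which is the
gap the new config is meant to close.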

So introduce a new config, CONFIG_EXCLUDE_KERNEL_HW_ITRACE, to exclude
kernel mode instruction tracing. It is generic, applies to all hardware
tracing families, and can also be used with other interfaces such as
sysfs in the case of ETMs.

[1] https://lwn.net/Articles/796866/

Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Suggested-by: Al Grant <al.grant@arm.com>
Tested-by: Denis Nikitin <denik@chromium.org>
Link: https://lore.kernel.org/lkml/20201015124522.1876-1-saiprakash.ranjan@codeaurora.org/
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
 init/Kconfig         | 12 ++++++++++++
 kernel/events/core.c |  6 ++++++
 2 files changed, 18 insertions(+)

Comments

Sai Prakash Ranjan Feb. 1, 2021, 7:41 a.m. UTC | #1
Hi Peter,

On 2021-01-30 01:00, Peter Zijlstra wrote:
> On Sat, Jan 30, 2021 at 12:35:10AM +0530, Sai Prakash Ranjan wrote:
>
>> Here the idea is to protect such important information from all users
>> including root users since root privileges do not have to mean full
>> control over the kernel [1] and root compromise does not have to be
>> the end of the world.
>
> And yet, your thing lacks:
>

I guess you mean this lacks an explanation as to why this only applies
to ITRACE and not others? See below.

>> +config EXCLUDE_KERNEL_HW_ITRACE
>> +	bool "Exclude kernel mode hardware assisted instruction tracing"
>> +	depends on PERF_EVENTS
> 	depends on SECURITY_LOCKDOWN
>
> or whatever the appropriate symbol is.
>

Ok I suppose you mean CONFIG_SECURITY_LOCKDOWN_LSM? But I don't see
how this new config has to depend on that? This can work independently
whether complete lockdown is enforced or not since it applies to only
hardware instruction tracing. Ideally this depends on several hardware
tracing configs such as ETMs and others but we don't need them because
we are already exposing PERF_PMU_CAP_ITRACE check in the events core.

>> +	help
>> +	  Exclude kernel mode instruction tracing by hardware tracing
>> +	  family such as ARM Coresight ETM, Intel PT and so on.
>> +
>> +	  This option allows to disable kernel mode instruction tracing
>> +	  offered by hardware assisted tracing for all users(including root)
>> +	  especially for production systems where only userspace tracing might
>> +	  be preferred for security reasons.
>
> Also, colour me unconvinced, pretty much all kernel level PMU usage
> can be employed to side-channel / infer crypto keys, why focus on
> ITRACE over others?

Here ITRACE is not just instruction trace; it means hardware assisted
instruction trace such as Intel PT, Intel BTS, ARM Coresight etc. These
provide far more capabilities than normal instruction tracing, whether
it's kernel level or userspace. More specifically, they provide more
accurate branch trace, like Intel PT LBR (Last Branch Record) and Intel
BTS (Branch Trace Store), which can be used to decode the program flow
more accurately, with timestamps in real time, than other PMUs. Also,
there is cycle accurate tracing, which can theoretically be used for
some speculative execution based attacks. Which other kernel level PMUs
can be used to get a full branch trace that is not locked down? If
there is one, then this should probably be applied to it as well.

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
Peter Zijlstra Feb. 1, 2021, 1:41 p.m. UTC | #2
On Mon, Feb 01, 2021 at 01:11:04PM +0530, Sai Prakash Ranjan wrote:

> Ok I suppose you mean CONFIG_SECURITY_LOCKDOWN_LSM? But I don't see
> how this new config has to depend on that? This can work independently
> whether complete lockdown is enforced or not since it applies to only
> hardware instruction tracing. Ideally this depends on several hardware
> tracing configs such as ETMs and others but we don't need them because
> we are already exposing PERF_PMU_CAP_ITRACE check in the events core.

If you don't have lockdown, root pretty much owns the kernel, or am I
missing something?

> be used for some speculative execution based attacks. Which other
> kernel level PMUs can be used to get a full branch trace that is not
> locked down? If there is one, then this should probably be applied to
> it as well.

Just the regular counters. The information isn't as accurate, but given
enough goes you can infer plenty.

Just like all the SMT side-channel attacks.

Sure, PT and friends make it even easier, but I don't see a fundamental
distinction.
Sai Prakash Ranjan Feb. 2, 2021, 6:11 a.m. UTC | #3
Hi Peter,

On 2021-02-01 19:11, Peter Zijlstra wrote:
> On Mon, Feb 01, 2021 at 01:11:04PM +0530, Sai Prakash Ranjan wrote:
>
>> Ok I suppose you mean CONFIG_SECURITY_LOCKDOWN_LSM? But I don't see
>> how this new config has to depend on that? This can work independently
>> whether complete lockdown is enforced or not since it applies to only
>> hardware instruction tracing. Ideally this depends on several hardware
>> tracing configs such as ETMs and others but we don't need them because
>> we are already exposing PERF_PMU_CAP_ITRACE check in the events core.
>
> If you don't have lockdown, root pretty much owns the kernel, or am I
> missing something?
>

You are right in saying that without lockdown root would own the kernel,
but this config (EXCLUDE_KERNEL) will independently make sure that kernel
level PMU tracing is not allowed (we return -EACCES) even if the LOCKDOWN
config is disabled. So I'm saying that we don't need to depend on the
LOCKDOWN config; it's good to have the LOCKDOWN config enabled, but the
perf subsystem doesn't have to care about that.
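
From the caller's side, the -EACCES return means an event has to be
requested with exclude_kernel set. A minimal sketch of filling such an
attribute follows. The helper names init_user_only_attr() and
user_only_example() are hypothetical, and the PMU type is passed in as a
parameter on the assumption that for a dynamic PMU the caller would read
it from /sys/bus/event_source/devices/<pmu>/type. PERF_TYPE_HARDWARE is
used in the example only as a placeholder.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill a perf_event_attr that asks for user space only.
 * pmu_type is supplied by the caller (assumption: read from sysfs for
 * a dynamic PMU).
 */
void init_user_only_attr(struct perf_event_attr *attr, uint32_t pmu_type)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = pmu_type;
	attr->exclude_kernel = 1;	/* the bit the proposed check tests */
	attr->exclude_hv = 1;		/* also leave out hypervisor mode */
	attr->disabled = 1;		/* start disabled; enable later */
}

/* Usage example with a placeholder PMU type. */
void user_only_example(void)
{
	struct perf_event_attr attr;

	init_user_only_attr(&attr, PERF_TYPE_HARDWARE);
	assert(attr.exclude_kernel == 1);
	assert(attr.size == sizeof(attr));
}
```

With the proposed option enabled, an attribute that leaves
exclude_kernel clear would be the case rejected with -EACCES.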

>> be used for some speculative execution based attacks. Which other
>> kernel level PMUs can be used to get a full branch trace that is not
>> locked down? If there is one, then this should probably be applied to
>> it as well.
>
> Just the regular counters. The information isn't as accurate, but given
> enough goes you can infer plenty.
>
> Just like all the SMT side-channel attacks.
>
> Sure, PT and friends make it even easier, but I don't see a fundamental
> distinction.

Right, we should then exclude all kernel level PMU tracing; is that fine?

if (IS_ENABLED(CONFIG_EXCLUDE_KERNEL_HW_ITRACE) && !attr.exclude_kernel)
     return -EACCES;
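
To make the difference between the posted check and the broader one
concrete, here is a small user-space model of both. It is a sketch, not
kernel code: MODEL_PMU_CAP_ITRACE is a stand-in flag value chosen for
illustration, and check_itrace_only(), check_all_pmus() and
check_examples() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in flag for illustration only; not the kernel's definition. */
#define MODEL_PMU_CAP_ITRACE 0x20u

/* As posted: deny only ITRACE-capable PMUs that include kernel mode. */
int check_itrace_only(bool config_on, unsigned int pmu_caps,
		      bool exclude_kernel)
{
	if (config_on && (pmu_caps & MODEL_PMU_CAP_ITRACE) && !exclude_kernel)
		return -EACCES;
	return 0;
}

/* As suggested above: deny any event that includes kernel mode. */
int check_all_pmus(bool config_on, bool exclude_kernel)
{
	if (config_on && !exclude_kernel)
		return -EACCES;
	return 0;
}

void check_examples(void)
{
	/* A plain counter PMU (no ITRACE capability) tracing the kernel. */
	assert(check_itrace_only(true, 0, false) == 0);
	assert(check_all_pmus(true, false) == -EACCES);
	/* User-only events pass under both variants. */
	assert(check_itrace_only(true, MODEL_PMU_CAP_ITRACE, true) == 0);
	assert(check_all_pmus(true, true) == 0);
}
```

The only case where the two variants disagree is a PMU without the
ITRACE capability that includes kernel mode, which is the regular
counter case raised above.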

Thanks,
Sai

Sai Prakash Ranjan Feb. 10, 2021, 7:38 a.m. UTC | #4
Hi Peter,

On 2021-02-02 11:41, Sai Prakash Ranjan wrote:
> Hi Peter,
>
> On 2021-02-01 19:11, Peter Zijlstra wrote:
>> On Mon, Feb 01, 2021 at 01:11:04PM +0530, Sai Prakash Ranjan wrote:
>>
>>> Ok I suppose you mean CONFIG_SECURITY_LOCKDOWN_LSM? But I don't see
>>> how this new config has to depend on that? This can work independently
>>> whether complete lockdown is enforced or not since it applies to only
>>> hardware instruction tracing. Ideally this depends on several hardware
>>> tracing configs such as ETMs and others but we don't need them because
>>> we are already exposing PERF_PMU_CAP_ITRACE check in the events core.
>>
>> If you don't have lockdown, root pretty much owns the kernel, or am I
>> missing something?
>>
>
> You are right in saying that without lockdown root would own the kernel,
> but this config (EXCLUDE_KERNEL) will independently make sure that kernel
> level PMU tracing is not allowed (we return -EACCES) even if the LOCKDOWN
> config is disabled. So I'm saying that we don't need to depend on the
> LOCKDOWN config; it's good to have the LOCKDOWN config enabled, but the
> perf subsystem doesn't have to care about that.
>
>>> be used for some speculative execution based attacks. Which other
>>> kernel level PMUs can be used to get a full branch trace that is not
>>> locked down? If there is one, then this should probably be applied to
>>> it as well.
>>
>> Just the regular counters. The information isn't as accurate, but given
>> enough goes you can infer plenty.
>>
>> Just like all the SMT side-channel attacks.
>>
>> Sure, PT and friends make it even easier, but I don't see a fundamental
>> distinction.
>
> Right, we should then exclude all kernel level PMU tracing; is that fine?
>
> if (IS_ENABLED(CONFIG_EXCLUDE_KERNEL_HW_ITRACE) && !attr.exclude_kernel)
>     return -EACCES;
>

Sorry for being pushy, but does the above make sense?

Thanks,
Sai

Patch

diff --git a/init/Kconfig b/init/Kconfig
index af454a51f3c5..31b4d1f26bce 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1832,6 +1832,18 @@  config DEBUG_PERF_USE_VMALLOC
 
 endmenu
 
+config EXCLUDE_KERNEL_HW_ITRACE
+	bool "Exclude kernel mode hardware assisted instruction tracing"
+	depends on PERF_EVENTS
+	help
+	  Exclude kernel mode instruction tracing by hardware tracing
+	  family such as ARM Coresight ETM, Intel PT and so on.
+
+	  This option allows to disable kernel mode instruction tracing
+	  offered by hardware assisted tracing for all users(including root)
+	  especially for production systems where only userspace tracing might
+	  be preferred for security reasons.
+
 config VM_EVENT_COUNTERS
 	default y
 	bool "Enable VM event counters for /proc/vmstat" if EXPERT
diff --git a/kernel/events/core.c b/kernel/events/core.c
index aece2fe19693..044a774cef6d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11866,6 +11866,12 @@  SYSCALL_DEFINE5(perf_event_open,
 		goto err_task;
 	}
 
+	if (IS_ENABLED(CONFIG_EXCLUDE_KERNEL_HW_ITRACE) &&
+	    (event->pmu->capabilities & PERF_PMU_CAP_ITRACE) && !attr.exclude_kernel) {
+		err = -EACCES;
+		goto err_alloc;
+	}
+
 	if (is_sampling_event(event)) {
 		if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) {
 			err = -EOPNOTSUPP;