mbox series

[v3,0/8] Fork brute force attack mitigation

Message ID 20210221154919.68050-1-john.wood@gmx.com
Headers show
Series Fork brute force attack mitigation | expand

Message

John Wood Feb. 21, 2021, 3:49 p.m. UTC
Attacks against vulnerable userspace applications with the purpose to break
ASLR or bypass canaries traditionally use some level of brute force with
the help of the fork system call. This is possible since when creating a
new process using fork its memory contents are the same as those of the
parent process (the process that called the fork system call). So, the
attacker can test the memory infinite times to find the correct memory
values or the correct memory addresses without worrying about crashing the
application.

Based on the above scenario it would be nice to have this detected and
mitigated, and this is the goal of this patch serie. Specifically the
following attacks are expected to be detected:

1.- Launching (fork()/exec()) a setuid/setgid process repeatedly until a
    desirable memory layout is got (e.g. Stack Clash).
2.- Connecting to an exec()ing network daemon (e.g. xinetd) repeatedly
    until a desirable memory layout is got (e.g. what CTFs do for simple
    network service).
3.- Launching processes without exec() (e.g. Android Zygote) and exposing
    state to attack a sibling.
4.- Connecting to a fork()ing network daemon (e.g. apache) repeatedly until
    the previously shared memory layout of all the other children is
    exposed (e.g. kind of related to HeartBleed).

In each case, a privilege boundary has been crossed:

Case 1: setuid/setgid process
Case 2: network to local
Case 3: privilege changes
Case 4: network to local

So, what will really be detected are fork/exec brute force attacks that
cross any of the commented bounds.

The implementation details and comparison against other existing
implementations can be found in the "Documentation" patch.

This v3 version has changed a lot from the v2. Basically the application
crash period is now compute on an on-going basis using an exponential
moving average (EMA), a detection of a brute force attack through the
"execve" system call has been added and the crossing of the commented
privilege bounds are taken into account. Also, the fine tune has also been
removed and now, all this kind of attacks are detected without
administrator intervention.

In the v2 version Kees Cook suggested to study if the statistical data
shared by all the fork hierarchy processes can be tracked in some other
way. Specifically the question was if this info can be hold by the family
hierarchy of the mm struct. After studying this hierarchy I think it is not
suitable for the Brute LSM since they are totally copied on fork() and in
this case we want that they are shared. So I leave this road.

So, knowing all this information I will explain now the different patches:

The 1/8 patch defines a new LSM hook to get the fatal signal of a task.
This will be useful during the attack detection phase.

The 2/8 patch defines a new LSM and manages the statistical data shared by
all the fork hierarchy processes.

The 3/8 patch detects a fork/exec brute force attack.

The 4/8 patch narrows the detection taken into account the privilege
boundary crossing.

The 5/8 patch mitigates a brute force attack.

The 6/8 patch adds self-tests to validate the Brute LSM expectations.

The 7/8 patch adds the documentation to explain this implementation.

The 8/8 patch updates the maintainers file.

This patch serie is a task of the KSPP [1] and can also be accessed from my
github tree [2] in the "brute_v3" branch.

[1] https://github.com/KSPP/linux/issues/39
[2] https://github.com/johwood/linux/

The previous versions can be found in:

https://lore.kernel.org/kernel-hardening/20200910202107.3799376-1-keescook@chromium.org/
https://lore.kernel.org/kernel-hardening/20201025134540.3770-1-john.wood@gmx.com/

Changelog RFC -> v2
-------------------
- Rename this feature with a more suitable name (Jann Horn, Kees Cook).
- Convert the code to an LSM (Kees Cook).
- Add locking  to avoid data races (Jann Horn).
- Add a new LSM hook to get the fatal signal of a task (Jann Horn, Kees
  Cook).
- Add the last crashes timestamps list to avoid false positives in the
  attack detection (Jann Horn).
- Use "period" instead of "rate" (Jann Horn).
- Other minor changes suggested (Jann Horn, Kees Cook).

Changelog v2 -> v3
------------------
- Compute the application crash period on an on-going basis (Kees Cook).
- Detect a brute force attack through the execve system call (Kees Cook).
- Detect an slow brute force attack (Randy Dunlap).
- Fine tuning the detection taken into account privilege boundary crossing
  (Kees Cook).
- Taken into account only fatal signals delivered by the kernel (Kees
  Cook).
- Remove the sysctl attributes to fine tuning the detection (Kees Cook).
- Remove the prctls to allow per process enabling/disabling (Kees Cook).
- Improve the documentation (Kees Cook).
- Fix some typos in the documentation (Randy Dunlap).
- Add self-test to validate the expectations (Kees Cook).

John Wood (8):
  security: Add LSM hook at the point where a task gets a fatal signal
  security/brute: Define a LSM and manage statistical data
  securtiy/brute: Detect a brute force attack
  security/brute: Fine tuning the attack detection
  security/brute: Mitigate a brute force attack
  selftests/brute: Add tests for the Brute LSM
  Documentation: Add documentation for the Brute LSM
  MAINTAINERS: Add a new entry for the Brute LSM

 Documentation/admin-guide/LSM/Brute.rst  |  224 +++++
 Documentation/admin-guide/LSM/index.rst  |    1 +
 MAINTAINERS                              |    7 +
 include/linux/lsm_hook_defs.h            |    1 +
 include/linux/lsm_hooks.h                |    4 +
 include/linux/security.h                 |    4 +
 kernel/signal.c                          |    1 +
 security/Kconfig                         |   11 +-
 security/Makefile                        |    4 +
 security/brute/Kconfig                   |   13 +
 security/brute/Makefile                  |    2 +
 security/brute/brute.c                   | 1102 ++++++++++++++++++++++
 security/security.c                      |    5 +
 tools/testing/selftests/Makefile         |    1 +
 tools/testing/selftests/brute/.gitignore |    2 +
 tools/testing/selftests/brute/Makefile   |    5 +
 tools/testing/selftests/brute/config     |    1 +
 tools/testing/selftests/brute/exec.c     |   44 +
 tools/testing/selftests/brute/test.c     |  507 ++++++++++
 tools/testing/selftests/brute/test.sh    |  226 +++++
 20 files changed, 2160 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/admin-guide/LSM/Brute.rst
 create mode 100644 security/brute/Kconfig
 create mode 100644 security/brute/Makefile
 create mode 100644 security/brute/brute.c
 create mode 100644 tools/testing/selftests/brute/.gitignore
 create mode 100644 tools/testing/selftests/brute/Makefile
 create mode 100644 tools/testing/selftests/brute/config
 create mode 100644 tools/testing/selftests/brute/exec.c
 create mode 100644 tools/testing/selftests/brute/test.c
 create mode 100755 tools/testing/selftests/brute/test.sh

--
2.25.1

Comments

Randy Dunlap Feb. 22, 2021, 2:25 a.m. UTC | #1
Hi--

On 2/21/21 7:49 AM, John Wood wrote:
> 

> Signed-off-by: John Wood <john.wood@gmx.com>

> ---

>  security/brute/brute.c | 488 +++++++++++++++++++++++++++++++++++++++--

>  1 file changed, 474 insertions(+), 14 deletions(-)

> 

> diff --git a/security/brute/brute.c b/security/brute/brute.c

> index 70f812bb7763..645bd6e02638 100644

> --- a/security/brute/brute.c

> +++ b/security/brute/brute.c



> +/**

> + * print_fork_attack_running() - Warn about a fork brute force attack.

> + */

> +static inline void print_fork_attack_running(void)

> +{

> +	pr_warn("Fork brute force attack detected [%s]\n", current->comm);

> +}


Do these pr_warn() calls need to be rate-limited so that they don't
flood the kernel log?


> +/**

> + * print_exec_attack_running() - Warn about an exec brute force attack.

> + * @stats: Statistical data shared by all the fork hierarchy processes.

> + *

> + * The statistical data shared by all the fork hierarchy processes cannot be

> + * NULL.

> + *

> + * Before showing the process name it is mandatory to find a process that holds

> + * a pointer to the exec statistics.

> + *

> + * Context: Must be called with tasklist_lock and brute_stats_ptr_lock held.

> + */

> +static void print_exec_attack_running(const struct brute_stats *stats)

> +{

> +	struct task_struct *p;

> +	struct brute_stats **p_stats;

> +	bool found = false;

> +

> +	for_each_process(p) {

> +		p_stats = brute_stats_ptr(p);

> +		if (*p_stats == stats) {

> +			found = true;

> +			break;

> +		}

>  	}

> +

> +	if (WARN(!found, "No exec process\n"))

> +		return;

> +

> +	pr_warn("Exec brute force attack detected [%s]\n", p->comm);

> +}



thanks.
-- 
~Randy
Randy Dunlap Feb. 22, 2021, 2:30 a.m. UTC | #2
Hi,

one spello in 2 locations:

On 2/21/21 7:49 AM, John Wood wrote:
> To detect a brute force attack it is necessary that the statistics

> shared by all the fork hierarchy processes be updated in every fatal

> crash and the most important data to update is the application crash

> period. To do so, use the new "task_fatal_signal" LSM hook added in a

> previous step.

> 

> The application crash period must be a value that is not prone to change

> due to spurious data and follows the real crash period. So, to compute

> it, the exponential moving average (EMA) is used.

> 

> There are two types of brute force attacks that need to be detected. The

> first one is an attack that happens through the fork system call and the

> second one is an attack that happens through the execve system call. The

> first type uses the statistics shared by all the fork hierarchy

> processes, but the second type cannot use this statistical data due to

> these statistics dissapear when the involved tasks finished. In this


                   disappear

> last scenario the attack info should be tracked by the statistics of a

> higher fork hierarchy (the hierarchy that contains the process that

> forks before the execve system call).

> 

> Moreover, these two attack types have two variants. A slow brute force

> attack that is detected if the maximum number of faults per fork

> hierarchy is reached and a fast brute force attack that is detected if

> the application crash period falls below a certain threshold.

> 

> Also, this patch adds locking to protect the statistics pointer hold by

> every process.

> 

> Signed-off-by: John Wood <john.wood@gmx.com>

> ---

>  security/brute/brute.c | 488 +++++++++++++++++++++++++++++++++++++++--

>  1 file changed, 474 insertions(+), 14 deletions(-)

> 

> diff --git a/security/brute/brute.c b/security/brute/brute.c

> index 70f812bb7763..645bd6e02638 100644

> --- a/security/brute/brute.c

> +++ b/security/brute/brute.c




> +/**

> + * brute_get_exec_stats() - Get the exec statistics.

> + * @stats: When this function is called, this parameter must point to the

> + *         current process' statistical data. When this function returns, this

> + *         parameter points to the parent process' statistics of the fork

> + *         hierarchy that hold the current process' statistics.

> + *

> + * To manage a brute force attack that happens through the execve system call it

> + * is not possible to use the statistical data hold by this process due to these

> + * statistics dissapear when this task is finished. In this scenario this data


                 disappear

> + * should be tracked by the statistics of a higher fork hierarchy (the hierarchy

> + * that contains the process that forks before the execve system call).

> + *

> + * To find these statistics the current fork hierarchy must be traversed up

> + * until new statistics are found.

> + *

> + * Context: Must be called with tasklist_lock and brute_stats_ptr_lock held.

> + */

> +static void brute_get_exec_stats(struct brute_stats **stats)

> +{



-- 
~Randy
Randy Dunlap Feb. 22, 2021, 2:47 a.m. UTC | #3
Hi--

scripts/kernel-doc does not like these items to be marked
as being in kernel-doc notation. scripts/kernel-doc does not
recognize them as one of: struct, union, enum, typedef, so it
defaults to trying to interpret these as functions, and then
says:

(I copied these blocks to my test megatest.c source file.)


../src/megatest.c:1214: warning: cannot understand function prototype: 'const u64 BRUTE_EMA_WEIGHT_NUMERATOR = 7; '
../src/megatest.c:1219: warning: cannot understand function prototype: 'const u64 BRUTE_EMA_WEIGHT_DENOMINATOR = 10; '
../src/megatest.c:1228: warning: cannot understand function prototype: 'const unsigned char BRUTE_MAX_FAULTS = 200; '
../src/megatest.c:1239: warning: cannot understand function prototype: 'const unsigned char BRUTE_MIN_FAULTS = 5; '
../src/megatest.c:1249: warning: cannot understand function prototype: 'const u64 BRUTE_CRASH_PERIOD_THRESHOLD = 30000; '


On 2/21/21 7:49 AM, John Wood wrote:
> 

> +/**

> + * brute_stats_ptr_lock - Lock to protect the brute_stats structure pointer.

> + */

> +static DEFINE_RWLOCK(brute_stats_ptr_lock);


> +/**

> + * BRUTE_EMA_WEIGHT_NUMERATOR - Weight's numerator of EMA.

> + */

> +static const u64 BRUTE_EMA_WEIGHT_NUMERATOR = 7;


> +/**

> + * BRUTE_EMA_WEIGHT_DENOMINATOR - Weight's denominator of EMA.

> + */

> +static const u64 BRUTE_EMA_WEIGHT_DENOMINATOR = 10;


> +/**

> + * BRUTE_MAX_FAULTS - Maximum number of faults.

> + *

> + * If a brute force attack is running slowly for a long time, the application

> + * crash period's EMA is not suitable for the detection. This type of attack

> + * must be detected using a maximum number of faults.

> + */

> +static const unsigned char BRUTE_MAX_FAULTS = 200;


> +/**

> + * BRUTE_MIN_FAULTS - Minimum number of faults.

> + *

> + * The application crash period's EMA cannot be used until a minimum number of

> + * data has been applied to it. This constraint allows getting a trend when this

> + * moving average is used. Moreover, it avoids the scenario where an application

> + * fails quickly from execve system call due to reasons unrelated to a real

> + * attack.

> + */

> +static const unsigned char BRUTE_MIN_FAULTS = 5;


> +/**

> + * BRUTE_CRASH_PERIOD_THRESHOLD - Application crash period threshold.

> + *

> + * The units are expressed in milliseconds.

> + *

> + * A fast brute force attack is detected when the application crash period falls

> + * below this threshold.

> + */

> +static const u64 BRUTE_CRASH_PERIOD_THRESHOLD = 30000;


Basically we don't support scalars in kernel-doc notation...

-- 
~Randy
John Wood Feb. 23, 2021, 6:13 p.m. UTC | #4
Hi,

On Sun, Feb 21, 2021 at 06:25:51PM -0800, Randy Dunlap wrote:
> Hi--

>

> On 2/21/21 7:49 AM, John Wood wrote:

> >

> > +/**

> > + * print_fork_attack_running() - Warn about a fork brute force attack.

> > + */

> > +static inline void print_fork_attack_running(void)

> > +{

> > +	pr_warn("Fork brute force attack detected [%s]\n", current->comm);

> > +}

>

> Do these pr_warn() calls need to be rate-limited so that they don't

> flood the kernel log?


I think it is not necessary since when a brute force attack through the fork
system call is detected, a fork warning appears only once. Then, all the
offending tasks involved in the attack are killed. But if the parent try to run
again the same app already killed, a new crash will trigger a brute force attack
through the execve system call, then this parent is killed, and a new warning
message appears. Now, the parent and childs are killed, the attacks are
mitigated and only a few messages (one or two) have been shown in the kernel
log.

Thanks,
John Wood

> > +/**

> > + * print_exec_attack_running() - Warn about an exec brute force attack.

> > + * @stats: Statistical data shared by all the fork hierarchy processes.

> > + *

> > + * The statistical data shared by all the fork hierarchy processes cannot be

> > + * NULL.

> > + *

> > + * Before showing the process name it is mandatory to find a process that holds

> > + * a pointer to the exec statistics.

> > + *

> > + * Context: Must be called with tasklist_lock and brute_stats_ptr_lock held.

> > + */

> > +static void print_exec_attack_running(const struct brute_stats *stats)

> > +{

> > +	struct task_struct *p;

> > +	struct brute_stats **p_stats;

> > +	bool found = false;

> > +

> > +	for_each_process(p) {

> > +		p_stats = brute_stats_ptr(p);

> > +		if (*p_stats == stats) {

> > +			found = true;

> > +			break;

> > +		}

> >  	}

> > +

> > +	if (WARN(!found, "No exec process\n"))

> > +		return;

> > +

> > +	pr_warn("Exec brute force attack detected [%s]\n", p->comm);

> > +}

>

>

> thanks.

> --

> ~Randy

>
John Wood Feb. 23, 2021, 6:20 p.m. UTC | #5
Hi,

On Sun, Feb 21, 2021 at 06:47:16PM -0800, Randy Dunlap wrote:
> Hi--

>

> scripts/kernel-doc does not like these items to be marked

> as being in kernel-doc notation. scripts/kernel-doc does not

> recognize them as one of: struct, union, enum, typedef, so it

> defaults to trying to interpret these as functions, and then

> says:

>

> (I copied these blocks to my test megatest.c source file.)

>

>

> ../src/megatest.c:1214: warning: cannot understand function prototype: 'const u64 BRUTE_EMA_WEIGHT_NUMERATOR = 7; '

> ../src/megatest.c:1219: warning: cannot understand function prototype: 'const u64 BRUTE_EMA_WEIGHT_DENOMINATOR = 10; '

> ../src/megatest.c:1228: warning: cannot understand function prototype: 'const unsigned char BRUTE_MAX_FAULTS = 200; '

> ../src/megatest.c:1239: warning: cannot understand function prototype: 'const unsigned char BRUTE_MIN_FAULTS = 5; '

> ../src/megatest.c:1249: warning: cannot understand function prototype: 'const u64 BRUTE_CRASH_PERIOD_THRESHOLD = 30000; '

>

>

> On 2/21/21 7:49 AM, John Wood wrote:

> >

> > +/**

> > + * brute_stats_ptr_lock - Lock to protect the brute_stats structure pointer.

> > + */

> > +static DEFINE_RWLOCK(brute_stats_ptr_lock);

>

> > +/**

> > + * BRUTE_EMA_WEIGHT_NUMERATOR - Weight's numerator of EMA.

> > + */

> > +static const u64 BRUTE_EMA_WEIGHT_NUMERATOR = 7;

>

> > +/**

> > + * BRUTE_EMA_WEIGHT_DENOMINATOR - Weight's denominator of EMA.

> > + */

> > +static const u64 BRUTE_EMA_WEIGHT_DENOMINATOR = 10;

>

> > +/**

> > + * BRUTE_MAX_FAULTS - Maximum number of faults.

> > + *

> > + * If a brute force attack is running slowly for a long time, the application

> > + * crash period's EMA is not suitable for the detection. This type of attack

> > + * must be detected using a maximum number of faults.

> > + */

> > +static const unsigned char BRUTE_MAX_FAULTS = 200;

>

> > +/**

> > + * BRUTE_MIN_FAULTS - Minimum number of faults.

> > + *

> > + * The application crash period's EMA cannot be used until a minimum number of

> > + * data has been applied to it. This constraint allows getting a trend when this

> > + * moving average is used. Moreover, it avoids the scenario where an application

> > + * fails quickly from execve system call due to reasons unrelated to a real

> > + * attack.

> > + */

> > +static const unsigned char BRUTE_MIN_FAULTS = 5;

>

> > +/**

> > + * BRUTE_CRASH_PERIOD_THRESHOLD - Application crash period threshold.

> > + *

> > + * The units are expressed in milliseconds.

> > + *

> > + * A fast brute force attack is detected when the application crash period falls

> > + * below this threshold.

> > + */

> > +static const u64 BRUTE_CRASH_PERIOD_THRESHOLD = 30000;

>

> Basically we don't support scalars in kernel-doc notation...


So, to keep it commented it would be better to use a normal comment block?

/*
 * Documentation here
 */

What do you think?

Thanks,
John Wood

> --

> ~Randy

>
John Wood Feb. 23, 2021, 6:25 p.m. UTC | #6
Hi,

On Sun, Feb 21, 2021 at 06:30:10PM -0800, Randy Dunlap wrote:
> Hi,

>

> one spello in 2 locations:

>

> On 2/21/21 7:49 AM, John Wood wrote:

> [...]

> > these statistics dissapear when the involved tasks finished. In this

>

>                    disappear

> [...]

> > + * statistics dissapear when this task is finished. In this scenario this data

>

>                  disappear

> [...]


This typos will be corrected in the next version.

Thanks a lot,
John Wood

>

> --

> ~Randy

>
Randy Dunlap Feb. 23, 2021, 8:44 p.m. UTC | #7
On 2/23/21 10:20 AM, John Wood wrote:
> Hi,

> 

> On Sun, Feb 21, 2021 at 06:47:16PM -0800, Randy Dunlap wrote:

>> Hi--

>>

>> scripts/kernel-doc does not like these items to be marked

>> as being in kernel-doc notation. scripts/kernel-doc does not

>> recognize them as one of: struct, union, enum, typedef, so it

>> defaults to trying to interpret these as functions, and then

>> says:

>>

>> (I copied these blocks to my test megatest.c source file.)

>>

>>

>> ../src/megatest.c:1214: warning: cannot understand function prototype: 'const u64 BRUTE_EMA_WEIGHT_NUMERATOR = 7; '

>> ../src/megatest.c:1219: warning: cannot understand function prototype: 'const u64 BRUTE_EMA_WEIGHT_DENOMINATOR = 10; '

>> ../src/megatest.c:1228: warning: cannot understand function prototype: 'const unsigned char BRUTE_MAX_FAULTS = 200; '

>> ../src/megatest.c:1239: warning: cannot understand function prototype: 'const unsigned char BRUTE_MIN_FAULTS = 5; '

>> ../src/megatest.c:1249: warning: cannot understand function prototype: 'const u64 BRUTE_CRASH_PERIOD_THRESHOLD = 30000; '

>>

>>

>> On 2/21/21 7:49 AM, John Wood wrote:

>>>

>>> +/**

>>> + * brute_stats_ptr_lock - Lock to protect the brute_stats structure pointer.

>>> + */

>>> +static DEFINE_RWLOCK(brute_stats_ptr_lock);

>>

>>> +/**

>>> + * BRUTE_EMA_WEIGHT_NUMERATOR - Weight's numerator of EMA.

>>> + */

>>> +static const u64 BRUTE_EMA_WEIGHT_NUMERATOR = 7;

>>

>>> +/**

>>> + * BRUTE_EMA_WEIGHT_DENOMINATOR - Weight's denominator of EMA.

>>> + */

>>> +static const u64 BRUTE_EMA_WEIGHT_DENOMINATOR = 10;

>>

>>> +/**

>>> + * BRUTE_MAX_FAULTS - Maximum number of faults.

>>> + *

>>> + * If a brute force attack is running slowly for a long time, the application

>>> + * crash period's EMA is not suitable for the detection. This type of attack

>>> + * must be detected using a maximum number of faults.

>>> + */

>>> +static const unsigned char BRUTE_MAX_FAULTS = 200;

>>

>>> +/**

>>> + * BRUTE_MIN_FAULTS - Minimum number of faults.

>>> + *

>>> + * The application crash period's EMA cannot be used until a minimum number of

>>> + * data has been applied to it. This constraint allows getting a trend when this

>>> + * moving average is used. Moreover, it avoids the scenario where an application

>>> + * fails quickly from execve system call due to reasons unrelated to a real

>>> + * attack.

>>> + */

>>> +static const unsigned char BRUTE_MIN_FAULTS = 5;

>>

>>> +/**

>>> + * BRUTE_CRASH_PERIOD_THRESHOLD - Application crash period threshold.

>>> + *

>>> + * The units are expressed in milliseconds.

>>> + *

>>> + * A fast brute force attack is detected when the application crash period falls

>>> + * below this threshold.

>>> + */

>>> +static const u64 BRUTE_CRASH_PERIOD_THRESHOLD = 30000;

>>

>> Basically we don't support scalars in kernel-doc notation...

> 

> So, to keep it commented it would be better to use a normal comment block?

> 

> /*

>  * Documentation here

>  */

> 

> What do you think?


Yes, please, just a normal /* comment block.

thanks.
-- 
~Randy