
Documentation: dev-tools: Add Testing Overview

Message ID 20210410070529.4113432-1-davidgow@google.com
State New
Series Documentation: dev-tools: Add Testing Overview

Commit Message

David Gow April 10, 2021, 7:05 a.m. UTC
The kernel now has a number of testing and debugging tools, and we've
seen a bit of confusion about what the differences between them are.

Add basic documentation outlining the testing tools, when to use each,
and how they interact.

This is a pretty quick overview rather than the idealised "kernel
testing guide" that'd probably be optimal, but given the number of times
questions like "When do you use KUnit and when do you use Kselftest?"
are being asked, it seemed worth at least having something. Hopefully
this can form the basis for more detailed documentation later.

Signed-off-by: David Gow <davidgow@google.com>
---
 Documentation/dev-tools/index.rst            |   3 +
 Documentation/dev-tools/testing-overview.rst | 102 +++++++++++++++++++
 2 files changed, 105 insertions(+)
 create mode 100644 Documentation/dev-tools/testing-overview.rst

Comments

Jonathan Corbet April 11, 2021, 5:05 p.m. UTC | #1
A nit, but:

> +The bulk of kernel tests are written using either the :doc:`kselftest
> +<kselftest>` or :doc:`KUnit <kunit/index>` frameworks. These both provide
> +infrastructure to help make running tests and groups of tests easier, as well
> +as providing helpers to aid in writing new tests.

If you just mention the relevant file, the docs build will make links
for you...so just "Documentation/dev-tools/kselftest.rst" rather than
the :doc: directive.  That helps to improve the readability of the
plain-text documentation as well.

> +`KUnit` tests therefore are best written against small, self-contained parts
> +of the kernel, which can be tested in isolation. This aligns well with the
> +concept of Unit testing.

If you want literal text, you need a double backtick: ``KUnit``.
Otherwise I'd just use normal quotes.

Thanks,

jon
Marco Elver April 12, 2021, 10:43 a.m. UTC | #2
On Sat, 10 Apr 2021 at 13:53, Daniel Latypov <dlatypov@google.com> wrote:
> On Sat, Apr 10, 2021 at 12:05 AM David Gow <davidgow@google.com> wrote:

[...]
> > +
> > +
> > +Sanitizers
> > +==========
> > +
The "sanitizers" have originally been a group of tools that relied on
compiler instrumentation to perform various dynamic analysis
(initially ASan, TSan, MSan for user space). The term "sanitizer" has
since been broadened to include a few non-compiler based tools such as
GWP-ASan in user space, of which KFENCE is its kernel cousin but it
doesn't have "sanitizer" in its name (because we felt GWP-KASAN was
pushing it with the acronyms ;-)). Also, these days we have HW_TAGS
based KASAN, which doesn't rely on compiler instrumentation but
instead on MTE in Arm64.

Things like kmemleak have never really been called a sanitizer, but
they _are_ dynamic analysis tools.

So to avoid confusion, in particular avoid establishing "sanitizers"
to be synonymous with "dynamic analysis" ("all sanitizers are dynamic
analysis tools, but not all dynamic analysis tools are sanitizers"),
the section here should not be called "Sanitizers" but "Dynamic
Analysis Tools". We could have a subsection "Sanitizers", but I think
it's not necessary.

> > +The kernel also supports a number of sanitizers, which attempt to detect
> > +classes of issues when the occur in a running kernel. These typically
>
> *they occur
>
> > +look for undefined behaviour of some kind, such as invalid memory accesses,
> > +concurrency issues such as data races, or other undefined behaviour like
> > +integer overflows.
> > +
> > +* :doc:`kmemleak` (Kmemleak) detects possible memory leaks.
> > +* :doc:`kasan` detects invalid memory accesses such as out-of-bounds and
> > +  use-after-free errors.
> > +* :doc:`ubsan` detects behaviour that is undefined by the C standard, like
> > +  integer overflows.
> > +* :doc:`kcsan` detects data races.
> > +* :doc:`kfence` is a low-overhead detector of memory issues, which is much
> > +  faster than KASAN and can be used in production.
>
> Hmm, it lives elsewhere, but would also calling out lockdep here be useful?
> I've also not heard anyone call it a sanitizer before, but it fits the
> definition you've given.
>
> Now that I think about it, I've never looked for documentation on it,
> is this the best page?
> https://www.kernel.org/doc/html/latest/locking/lockdep-design.html

Not a "sanitizer" but our sanitizers are all dynamic analysis tools,
and lockdep is also a dynamic analysis tool.

If we want to be pedantic, the kernel has numerous options to add
"instrumentation" (compiler based or explicit) that will detect some
kind of error at runtime. Most of them live in lib/Kconfig.debug. I
think mentioning something like that is in scope of this document, but
we certainly can't mention all debug tools the kernel has to offer.
Mentioning the big ones like above and then referring to
lib/Kconfig.debug is probably fine.
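[Editorial aside: to make the lib/Kconfig.debug point concrete, a debug-oriented build might enable a handful of the options discussed in this thread. The option names below are real, but the selection is purely illustrative, not a recommended set.]

```
# Illustrative debug .config fragment -- not an endorsed configuration.
CONFIG_KASAN=y            # invalid memory accesses (out-of-bounds, use-after-free)
# CONFIG_KCSAN=y          # data races; mutually exclusive with KASAN, pick one per build
CONFIG_UBSAN=y            # behaviour undefined by the C standard
CONFIG_KFENCE=y           # low-overhead memory-error detection, usable in production
CONFIG_DEBUG_KMEMLEAK=y   # possible memory leaks
CONFIG_PROVE_LOCKING=y    # lockdep lock-ordering checks, from lib/Kconfig.debug
```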

Dmitry recently gave an excellent talk on some of this:
https://www.youtube.com/watch?v=ufcyOkgFZ2Q

Thanks,
-- Marco
Brendan Higgins April 12, 2021, 9:19 p.m. UTC | #3
On Sat, Apr 10, 2021 at 12:05 AM David Gow <davidgow@google.com> wrote:
>
> The kernel now has a number of testing and debugging tools, and we've
> seen a bit of confusion about what the differences between them are.
>
> Add a basic documentation outlining the testing tools, when to use each,
> and how they interact.
>
> This is a pretty quick overview rather than the idealised "kernel
> testing guide" that'd probably be optimal, but given the number of times
> questions like "When do you use KUnit and when do you use Kselftest?"
> are being asked, it seemed worth at least having something. Hopefully
> this can form the basis for more detailed documentation later.
>
> Signed-off-by: David Gow <davidgow@google.com>

With the exception of some minor nits, I think the below will make a
great initial testing overview guide!

Thanks for getting the ball rolling on this!

> ---
>  Documentation/dev-tools/index.rst            |   3 +
>  Documentation/dev-tools/testing-overview.rst | 102 +++++++++++++++++++
>  2 files changed, 105 insertions(+)
>  create mode 100644 Documentation/dev-tools/testing-overview.rst
>
> diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
> index 1b1cf4f5c9d9..f590e5860794 100644
> --- a/Documentation/dev-tools/index.rst
> +++ b/Documentation/dev-tools/index.rst
> @@ -7,6 +7,8 @@ be used to work on the kernel. For now, the documents have been pulled
>  together without any significant effort to integrate them into a coherent
>  whole; patches welcome!
>
> +A brief overview of testing-specific tools can be found in :doc:`testing-overview`.
> +

I think I would like to make this a little more apparent. This index
here is a bit bare bones and I think this testing-overview could be a
good: "I am lost where do I start?" sort of doc. That being said, I am
not sure what the best way to emphasize this might be. Maybe just have
an intro paragraph here with some callout text like in a `note` or
something like that.

>  .. class:: toc-title
>
>            Table of contents
> @@ -14,6 +16,7 @@ whole; patches welcome!
>  .. toctree::
>     :maxdepth: 2
>
> +   testing-overview
>     coccinelle
>     sparse
>     kcov
> diff --git a/Documentation/dev-tools/testing-overview.rst b/Documentation/dev-tools/testing-overview.rst
> new file mode 100644
> index 000000000000..8452adcb8608
> --- /dev/null
> +++ b/Documentation/dev-tools/testing-overview.rst
> @@ -0,0 +1,102 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +====================
> +Kernel Testing Guide
> +====================
> +
> +
> +There are a number of different tools for testing the Linux kernel, so knowing
> +when to use each of them can be a challenge. This document provides a rough
> +overview of their differences, and how they fit together.
> +
> +
> +Writing and Running Tests
> +=========================
> +
> +The bulk of kernel tests are written using either the :doc:`kselftest
> +<kselftest>` or :doc:`KUnit <kunit/index>` frameworks. These both provide
> +infrastructure to help make running tests and groups of tests easier, as well
> +as providing helpers to aid in writing new tests.
> +
> +If you're looking to verify the behaviour of the Kernel — particularly specific
> +parts of the kernel — then you'll want to use `KUnit` or `kselftest`.
> +
> +
> +The Difference Between KUnit and kselftest
> +------------------------------------------
> +
> +:doc:`KUnit <kunit/index>` is an entirely in-kernel system for "white box"
> +testing: because test code is part of the kernel, it can access internal
> +structures and functions which aren't exposed to userspace.
> +
> +`KUnit` tests therefore are best written against small, self-contained parts
> +of the kernel, which can be tested in isolation. This aligns well with the
> +concept of Unit testing.

I think I might have pushed the "unit testing" stuff too hard in the
past, but I feel that if you are going to mention "the concept of unit
testing" it might be a good idea to link to an authoritative source on
what unit testing is. Maybe link to Martin Fowler or something like
that?

> +For example, a KUnit test might test an individual kernel function (or even a
> +single codepath through a function, such as an error handling case), rather
> +than a feature as a whole.
> +
> +There is a KUnit test style guide which may give further pointers

I know you linked the index page for KUnit above, but I think you
might want to link the KUnit style guide here since you mention it.

> +:doc:`kselftest <kselftest>`, on the other hand, is largely implemented in
> +userspace, and tests are normal userspace scripts or programs.
> +
> +This makes it easier to write more complicated tests, or tests which need to
> +manipulate the overall system state more (e.g., spawning processes, etc.).
> +However, it's not possible to call kernel functions directly unless they're
> +exposed to userspace (by a syscall, device, filesystem, etc.) Some tests to
> +also provide a kernel module which is loaded by the test, though for tests
> +which run mostly or entirely within the kernel, `KUnit` may be the better tool.
> +
> +`kselftest` is therefore suited well to tests of whole features, as these will
> +expose an interface to userspace, which can be tested, but not implementation
> +details. This aligns well with 'system' or 'end-to-end' testing.

Again, I think you might want to link to some sources that explain
what "system" and "end-to-end" testing are.

Also, I think maybe adding a section on some common examples of when
to use Kselftest vs when to use KUnit would be helpful. For example:

 - If I add a new syscall, you might want to mention that the author is
   *required* to add accompanying Kselftests.
 - A new internal API - for example a new crypto API - is *strongly recommended*
   to have accompanying KUnit tests.
 - Many new features, that have a large in kernel API, but also have a user
   visible API, should probably have both Kselftests as well as KUnit tests.

Enumerating other examples is probably a good idea, but I think this
offers a good flavor.

> +Code Coverage Tools
> +===================
> +
> +The Linux Kernel supports two different code coverage mesurement tools. These
> +can be used to verify that a test is executing particular functions or lines
> +of code. This is useful for determining how much of the kernel is being tested,
> +and for finding corner-cases which are not covered by the appropriate test.
> +
> +:doc:`kcov` is a feature which can be built in to the kernel to allow
> +capturing coverage on a per-task level. It's therefore useful for fuzzing and
> +other situations where information about code executed during, for example, a
> +single syscall is useful.
> +
> +:doc:`gcov` is GCC's coverage testing tool, which can be used with the kernel
> +to get global or per-module coverage. Unlike KCOV, it does not record per-task
> +coverage. Coverage data can be read from debugfs, and interpreted using the
> +usual gcov tooling.
> +
> +
> +Sanitizers
> +==========
> +
> +The kernel also supports a number of sanitizers, which attempt to detect
> +classes of issues when the occur in a running kernel. These typically
> +look for undefined behaviour of some kind, such as invalid memory accesses,
> +concurrency issues such as data races, or other undefined behaviour like
> +integer overflows.
> +
> +* :doc:`kmemleak` (Kmemleak) detects possible memory leaks.
> +* :doc:`kasan` detects invalid memory accesses such as out-of-bounds and
> +  use-after-free errors.
> +* :doc:`ubsan` detects behaviour that is undefined by the C standard, like
> +  integer overflows.
> +* :doc:`kcsan` detects data races.
> +* :doc:`kfence` is a low-overhead detector of memory issues, which is much
> +  faster than KASAN and can be used in production.
> +
> +These tools tend to test the kernel as a whole, and do not "pass" like
> +kselftest or KUnit tests. They can be combined with KUnit or kselftest by
> +running tests on a kernel with a sanitizer enabled: you can then be sure
> +that none of these errors are occurring during the test.
> +
> +Some of these sanitizers integrate with KUnit or kselftest and will
> +automatically fail tests if an issue is detected by a sanitizer.
> +
> --
> 2.31.1.295.g9ea45b61b8-goog
>
Brendan Higgins April 12, 2021, 9:22 p.m. UTC | #4
On Mon, Apr 12, 2021 at 3:43 AM Marco Elver <elver@google.com> wrote:
>
> On Sat, 10 Apr 2021 at 13:53, Daniel Latypov <dlatypov@google.com> wrote:
> > On Sat, Apr 10, 2021 at 12:05 AM David Gow <davidgow@google.com> wrote:
> [...]
> > > +
> > > +
> > > +Sanitizers
> > > +==========
> > > +
>
> The "sanitizers" have originally been a group of tools that relied on
> compiler instrumentation to perform various dynamic analysis
> (initially ASan, TSan, MSan for user space). The term "sanitizer" has
> since been broadened to include a few non-compiler based tools such as
> GWP-ASan in user space, of which KFENCE is its kernel cousin but it
> doesn't have "sanitizer" in its name (because we felt GWP-KASAN was
> pushing it with the acronyms ;-)). Also, these days we have HW_TAGS
> based KASAN, which doesn't rely on compiler instrumentation but
> instead on MTE in Arm64.
>
> Things like kmemleak have never really been called a sanitizer, but
> they _are_ dynamic analysis tools.
>
> So to avoid confusion, in particular avoid establishing "sanitizers"
> to be synonymous with "dynamic analysis" ("all sanitizers are dynamic
> analysis tools, but not all dynamic analysis tools are sanitizers"),
> the section here should not be called "Sanitizers" but "Dynamic
> Analysis Tools". We could have a subsection "Sanitizers", but I think
> it's not necessary.
>
> > > +The kernel also supports a number of sanitizers, which attempt to detect
> > > +classes of issues when the occur in a running kernel. These typically
> >
> > *they occur
> >
> > > +look for undefined behaviour of some kind, such as invalid memory accesses,
> > > +concurrency issues such as data races, or other undefined behaviour like
> > > +integer overflows.
> > > +
> > > +* :doc:`kmemleak` (Kmemleak) detects possible memory leaks.
> > > +* :doc:`kasan` detects invalid memory accesses such as out-of-bounds and
> > > +  use-after-free errors.
> > > +* :doc:`ubsan` detects behaviour that is undefined by the C standard, like
> > > +  integer overflows.
> > > +* :doc:`kcsan` detects data races.
> > > +* :doc:`kfence` is a low-overhead detector of memory issues, which is much
> > > +  faster than KASAN and can be used in production.
> >
> > Hmm, it lives elsewhere, but would also calling out lockdep here be useful?
> > I've also not heard anyone call it a sanitizer before, but it fits the
> > definition you've given.
> >
> > Now that I think about it, I've never looked for documentation on it,
> > is this the best page?
> > https://www.kernel.org/doc/html/latest/locking/lockdep-design.html
>
> Not a "sanitizer" but our sanitizers are all dynamic analysis tools,
> and lockdep is also a dynamic analysis tool.
>
> If we want to be pedantic, the kernel has numerous options to add
> "instrumentation" (compiler based or explicit) that will detect some
> kind of error at runtime. Most of them live in lib/Kconfig.debug. I
> think mentioning something like that is in scope of this document, but
> we certainly can't mention all debug tools the kernel has to offer.
> Mentioning the big ones like above and then referring to
> lib/Kconfig.debug is probably fine.
>
> Dmitry recently gave an excellent talk on some of this:
> https://www.youtube.com/watch?v=ufcyOkgFZ2Q

Good point Marco, and we (KUnit - myself, Daniel, and David) gave a
talk on KUnit at LF. Also, I think Shuah is/has given one (soon)?
Might be a good idea to link those here?

Patch

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 1b1cf4f5c9d9..f590e5860794 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -7,6 +7,8 @@  be used to work on the kernel. For now, the documents have been pulled
 together without any significant effort to integrate them into a coherent
 whole; patches welcome!
 
+A brief overview of testing-specific tools can be found in :doc:`testing-overview`.
+
 .. class:: toc-title
 
 	   Table of contents
@@ -14,6 +16,7 @@  whole; patches welcome!
 .. toctree::
    :maxdepth: 2
 
+   testing-overview
    coccinelle
    sparse
    kcov
diff --git a/Documentation/dev-tools/testing-overview.rst b/Documentation/dev-tools/testing-overview.rst
new file mode 100644
index 000000000000..8452adcb8608
--- /dev/null
+++ b/Documentation/dev-tools/testing-overview.rst
@@ -0,0 +1,102 @@ 
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+Kernel Testing Guide
+====================
+
+
+There are a number of different tools for testing the Linux kernel, so knowing
+when to use each of them can be a challenge. This document provides a rough
+overview of their differences, and how they fit together.
+
+
+Writing and Running Tests
+=========================
+
+The bulk of kernel tests are written using either the :doc:`kselftest
+<kselftest>` or :doc:`KUnit <kunit/index>` frameworks. These both provide
+infrastructure to help make running tests and groups of tests easier, as well
+as providing helpers to aid in writing new tests.
+
+If you're looking to verify the behaviour of the Kernel — particularly specific
+parts of the kernel — then you'll want to use `KUnit` or `kselftest`.
+
+
+The Difference Between KUnit and kselftest
+------------------------------------------
+
+:doc:`KUnit <kunit/index>` is an entirely in-kernel system for "white box"
+testing: because test code is part of the kernel, it can access internal
+structures and functions which aren't exposed to userspace.
+
+`KUnit` tests therefore are best written against small, self-contained parts
+of the kernel, which can be tested in isolation. This aligns well with the
+concept of Unit testing.
+
+For example, a KUnit test might test an individual kernel function (or even a
+single codepath through a function, such as an error handling case), rather
+than a feature as a whole.
+
+There is a KUnit test style guide which may give further pointers
+
+
+:doc:`kselftest <kselftest>`, on the other hand, is largely implemented in
+userspace, and tests are normal userspace scripts or programs.
+
+This makes it easier to write more complicated tests, or tests which need to
+manipulate the overall system state more (e.g., spawning processes, etc.).
+However, it's not possible to call kernel functions directly unless they're
+exposed to userspace (by a syscall, device, filesystem, etc.) Some tests to
+also provide a kernel module which is loaded by the test, though for tests
+which run mostly or entirely within the kernel, `KUnit` may be the better tool.
+
+`kselftest` is therefore suited well to tests of whole features, as these will
+expose an interface to userspace, which can be tested, but not implementation
+details. This aligns well with 'system' or 'end-to-end' testing.
+
+
+Code Coverage Tools
+===================
+
+The Linux Kernel supports two different code coverage mesurement tools. These
+can be used to verify that a test is executing particular functions or lines
+of code. This is useful for determining how much of the kernel is being tested,
+and for finding corner-cases which are not covered by the appropriate test.
+
+:doc:`kcov` is a feature which can be built in to the kernel to allow
+capturing coverage on a per-task level. It's therefore useful for fuzzing and
+other situations where information about code executed during, for example, a
+single syscall is useful.
+
+:doc:`gcov` is GCC's coverage testing tool, which can be used with the kernel
+to get global or per-module coverage. Unlike KCOV, it does not record per-task
+coverage. Coverage data can be read from debugfs, and interpreted using the
+usual gcov tooling.
+
+
+Sanitizers
+==========
+
+The kernel also supports a number of sanitizers, which attempt to detect
+classes of issues when the occur in a running kernel. These typically
+look for undefined behaviour of some kind, such as invalid memory accesses,
+concurrency issues such as data races, or other undefined behaviour like
+integer overflows.
+
+* :doc:`kmemleak` (Kmemleak) detects possible memory leaks.
+* :doc:`kasan` detects invalid memory accesses such as out-of-bounds and
+  use-after-free errors.
+* :doc:`ubsan` detects behaviour that is undefined by the C standard, like
+  integer overflows.
+* :doc:`kcsan` detects data races.
+* :doc:`kfence` is a low-overhead detector of memory issues, which is much
+  faster than KASAN and can be used in production.
+
+These tools tend to test the kernel as a whole, and do not "pass" like
+kselftest or KUnit tests. They can be combined with KUnit or kselftest by
+running tests on a kernel with a sanitizer enabled: you can then be sure
+that none of these errors are occurring during the test.
+
+Some of these sanitizers integrate with KUnit or kselftest and will
+automatically fail tests if an issue is detected by a sanitizer.
+