diff mbox series

[v5,09/20] docs/sphinx: Add new qapi-doc Sphinx extension

Message ID 20200810195019.25427-10-peter.maydell@linaro.org
State Accepted
Headers show
Series Convert QAPI doc comments to generate rST instead of texinfo | expand

Commit Message

Peter Maydell Aug. 10, 2020, 7:50 p.m. UTC
Some of our documentation is auto-generated from documentation
comments in the JSON schema.

For Sphinx, rather than creating a file to include, the most natural
way to handle this is to have a small custom Sphinx extension which
processes the JSON file and inserts documentation into the rST
file being processed.

This is the same approach that kerneldoc and hxtool use.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

---
Changes v4->v5: match the changes in parameters to the
various visit_* methods from commit 7b3bc9e28f366
---
 docs/conf.py           |   6 +-
 docs/sphinx/qapidoc.py | 504 +++++++++++++++++++++++++++++++++++++++++
 MAINTAINERS            |   1 +
 3 files changed, 510 insertions(+), 1 deletion(-)
 create mode 100644 docs/sphinx/qapidoc.py

-- 
2.20.1

Comments

Richard Henderson Aug. 14, 2020, 6:40 p.m. UTC | #1
On 8/10/20 12:50 PM, Peter Maydell wrote:
> Some of our documentation is auto-generated from documentation

> comments in the JSON schema.

> 

> For Sphinx, rather than creating a file to include, the most natural

> way to handle this is to have a small custom Sphinx extension which

> processes the JSON file and inserts documentation into the rST

> file being processed.

> 

> This is the same approach that kerneldoc and hxtool use.

> 

> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

> ---

> Changes v4->v5: match the changes in parameters to the

> various visit_* methods from commit 7b3bc9e28f366

> ---

>  docs/conf.py           |   6 +-

>  docs/sphinx/qapidoc.py | 504 +++++++++++++++++++++++++++++++++++++++++

>  MAINTAINERS            |   1 +

>  3 files changed, 510 insertions(+), 1 deletion(-)

>  create mode 100644 docs/sphinx/qapidoc.py


Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~
Markus Armbruster Sept. 4, 2020, 12:29 p.m. UTC | #2
Peter Maydell <peter.maydell@linaro.org> writes:

> Some of our documentation is auto-generated from documentation

> comments in the JSON schema.

>

> For Sphinx, rather than creating a file to include, the most natural

> way to handle this is to have a small custom Sphinx extension which

> processes the JSON file and inserts documentation into the rST

> file being processed.

>

> This is the same approach that kerneldoc and hxtool use.

>

> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>


Got a pointer to "Writing Sphinx Extensions for Dummies" or similar?

Since I know nothing about Sphinx extensions, my review will be more or
less limited to the part facing QAPI.  Perhaps "generated docs look
sane" suffices for the rest.

> ---

> Changes v4->v5: match the changes in parameters to the

> various visit_* methods from commit 7b3bc9e28f366

> ---

>  docs/conf.py           |   6 +-

>  docs/sphinx/qapidoc.py | 504 +++++++++++++++++++++++++++++++++++++++++

>  MAINTAINERS            |   1 +

>  3 files changed, 510 insertions(+), 1 deletion(-)

>  create mode 100644 docs/sphinx/qapidoc.py

>

> diff --git a/docs/conf.py b/docs/conf.py

> index d6e173ef77b..44ec5050717 100644

> --- a/docs/conf.py

> +++ b/docs/conf.py

> @@ -52,7 +52,10 @@ except NameError:

>  # add these directories to sys.path here. If the directory is relative to the

>  # documentation root, use an absolute path starting from qemu_docdir.

>  #

> +# Our extensions are in docs/sphinx; the qapidoc extension requires

> +# the QAPI modules from scripts/.

>  sys.path.insert(0, os.path.join(qemu_docdir, "sphinx"))

> +sys.path.insert(0, os.path.join(qemu_docdir, "../scripts"))

>  

>  

>  # -- General configuration ------------------------------------------------

> @@ -67,7 +70,7 @@ needs_sphinx = '1.6'

>  # Add any Sphinx extension module names here, as strings. They can be

>  # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom

>  # ones.

> -extensions = ['kerneldoc', 'qmp_lexer', 'hxtool']

> +extensions = ['kerneldoc', 'qmp_lexer', 'hxtool', 'qapidoc']

>  

>  # Add any paths that contain templates here, relative to this directory.

>  templates_path = ['_templates']

> @@ -241,3 +244,4 @@ texinfo_documents = [

>  kerneldoc_bin = os.path.join(qemu_docdir, '../scripts/kernel-doc')

>  kerneldoc_srctree = os.path.join(qemu_docdir, '..')

>  hxtool_srctree = os.path.join(qemu_docdir, '..')

> +qapidoc_srctree = os.path.join(qemu_docdir, '..')

> diff --git a/docs/sphinx/qapidoc.py b/docs/sphinx/qapidoc.py

> new file mode 100644

> index 00000000000..4f571b2f45a

> --- /dev/null

> +++ b/docs/sphinx/qapidoc.py

> @@ -0,0 +1,504 @@

> +# coding=utf-8

> +#

> +# QEMU qapidoc QAPI file parsing extension

> +#

> +# Copyright (c) 2020 Linaro

> +#

> +# This work is licensed under the terms of the GNU GPLv2 or later.

> +# See the COPYING file in the top-level directory.

> +"""qapidoc is a Sphinx extension that implements the qapi-doc directive"""

> +

> +# The purpose of this extension is to read the documentation comments

> +# in QAPI JSON schema files, and insert them all into the current document.


Let's drop "JSON", it's not really true.

> +# The conf.py file must set the qapidoc_srctree config value to

> +# the root of the QEMU source tree.

> +# Each qapi-doc:: directive takes one argument which is the

> +# path of the .json file to process, relative to the source tree.


Beg your pardon?  Oh, you're talking about .rst files now.  The next two
patches will add such files.  Perhaps something like "The qapi-doc::
reST directive provided by this extension takes ..."

> +

> +import os

> +import re

> +

> +from docutils import nodes

> +from docutils.statemachine import ViewList

> +from docutils.parsers.rst import directives, Directive

> +from sphinx.errors import ExtensionError

> +from sphinx.util.nodes import nested_parse_with_titles

> +import sphinx

> +from qapi.gen import QAPISchemaVisitor

> +from qapi.schema import QAPIError, QAPISchema

> +

> +# Sphinx up to 1.6 uses AutodocReporter; 1.7 and later

> +# use switch_source_input. Check borrowed from kerneldoc.py.

> +Use_SSI = sphinx.__version__[:3] >= '1.7'

> +if Use_SSI:

> +    from sphinx.util.docutils import switch_source_input

> +else:

> +    from sphinx.ext.autodoc import AutodocReporter

> +

> +

> +__version__ = '1.0'

> +

> +# Function borrowed from pydash, which is under the MIT license

> +def intersperse(iterable, separator):

> +    """Like join, but for arbitrary iterables, notably arrays"""

> +    iterable = iter(iterable)

> +    yield next(iterable)

> +    for item in iterable:

> +        yield separator

> +        yield item


What's wrong with separator.join(iterable)?

> +

> +class QAPISchemaGenRSTVisitor(QAPISchemaVisitor):

> +    """A QAPI schema visitor which generates docutils/Sphinx nodes

> +

> +    This class builds up a tree of docutils/Sphinx nodes corresponding

> +    to documentation for the various QAPI objects. To use it, first create

> +    a QAPISchemaGenRSTVisitor object, and call its visit_begin() method.

> +    Then you can call one of the two methods 'freeform' (to add documentation

> +    for a freeform documentation chunk) or 'symbol' (to add documentation

> +    for a QAPI symbol). These will cause the visitor to build up the

> +    tree of document nodes. Once you've added all the documentation

> +    via 'freeform' and 'symbol' method calls, you can call 'get_document_nodes'

> +    to get the final list of document nodes (in a form suitable for returning

> +    from a Sphinx directive's 'run' method).

> +    """

> +    def __init__(self, sphinx_directive):

> +        self._cur_doc = None

> +        self._sphinx_directive = sphinx_directive

> +        self._top_node = nodes.section()

> +        self._active_headings = [self._top_node]

> +

> +    def _serror(self, msg):

> +        """Raise an exception giving a user-friendly syntax error message"""

> +        file = self._cur_doc.info.fname

> +        line = self._cur_doc.info.line

> +        raise ExtensionError('%s line %d: syntax error: %s' % (file, line, msg))


Note for later: ornate error message format, no use of QAPISourceInfo.

> +

> +    def _make_dlitem(self, term, defn):

> +        """Return a dlitem node with the specified term and definition.

> +

> +        term should be a list of Text and literal nodes.

> +        defn should be one of:

> +        - a string, which will be handed to _parse_text_into_node

> +        - a list of Text and literal nodes, which will be put into

> +          a paragraph node

> +        """

> +        dlitem = nodes.definition_list_item()

> +        dlterm = nodes.term('', '', *term)

> +        dlitem += dlterm

> +        if defn:

> +            dldef = nodes.definition()

> +            if isinstance(defn, list):

> +                dldef += nodes.paragraph('', '', *defn)

> +            else:

> +                self._parse_text_into_node(defn, dldef)

> +            dlitem += dldef

> +        return dlitem

> +

> +    def _make_section(self, title):

> +        """Return a section node with optional title"""

> +        section = nodes.section(ids=[self._sphinx_directive.new_serialno()])

> +        if title:

> +            section += nodes.title(title, title)

> +        return section

> +

> +    def _nodes_for_ifcond(self, ifcond, with_if=True):

> +        """Return list of Text, literal nodes for the ifcond

> +

> +        Return a list which gives text like ' (If: cond1, cond2, cond3)', where

> +        the conditions are in literal-text and the commas are not.

> +        If with_if is False, we don't return the "(If: " and ")".

> +        """

> +        condlist = intersperse([nodes.literal('', c) for c in ifcond],

> +                               nodes.Text(', '))

> +        if not with_if:

> +            return condlist

> +

> +        nodelist = [nodes.Text(' ('), nodes.strong('', 'If: ')]

> +        nodelist.extend(condlist)

> +        nodelist.append(nodes.Text(')'))

> +        return nodelist

> +

> +    def _nodes_for_one_member(self, member):

> +        """Return list of Text, literal nodes for this member

> +

> +        Return a list of doctree nodes which give text like

> +        'name: type (optional) (If: ...)' suitable for use as the

> +        'term' part of a definition list item.

> +        """

> +        term = [nodes.literal('', member.name)]

> +        if member.type.doc_type():

> +            term.append(nodes.Text(': '))

> +            term.append(nodes.literal('', member.type.doc_type()))

> +        if member.optional:

> +            term.append(nodes.Text(' (optional)'))

> +        if member.ifcond:

> +            term.extend(self._nodes_for_ifcond(member.ifcond))

> +        return term

> +

> +    def _nodes_for_variant_when(self, variants, variant):

> +        """Return list of Text, literal nodes for variant 'when' clause

> +

> +        Return a list of doctree nodes which give text like

> +        'when tagname is variant (If: ...)' suitable for use in

> +        the 'variants' part of a definition list.

> +        """

> +        term = [nodes.Text(' when '),

> +                nodes.literal('', variants.tag_member.name),

> +                nodes.Text(' is '),

> +                nodes.literal('', '"%s"' % variant.name)]

> +        if variant.ifcond:

> +            term.extend(self._nodes_for_ifcond(variant.ifcond))

> +        return term

> +

> +    def _nodes_for_members(self, doc, what, base=None, variants=None):

> +        """Return doctree nodes for the table of members"""

> +        dlnode = nodes.definition_list()

> +        for section in doc.args.values():

> +            term = self._nodes_for_one_member(section.member)

> +            # TODO drop fallbacks when undocumented members are outlawed

> +            if section.text:

> +                defn = section.text

> +            elif (variants and variants.tag_member == section.member

> +                  and not section.member.type.doc_type()):

> +                values = section.member.type.member_names()

> +                defn = [nodes.Text('One of ')]

> +                defn.extend(intersperse([nodes.literal('', v) for v in values],

> +                                        nodes.Text(', ')))

> +            else:

> +                defn = [nodes.Text('Not documented')]

> +

> +            dlnode += self._make_dlitem(term, defn)

> +

> +        if base:

> +            dlnode += self._make_dlitem([nodes.Text('The members of '),

> +                                         nodes.literal('', base.doc_type())],

> +                                        None)

> +

> +        if variants:

> +            for v in variants.variants:

> +                if v.type.is_implicit():

> +                    assert not v.type.base and not v.type.variants

> +                    for m in v.type.local_members:

> +                        term = self._nodes_for_one_member(m)

> +                        term.extend(self._nodes_for_variant_when(variants, v))

> +                        dlnode += self._make_dlitem(term, None)

> +                else:

> +                    term = [nodes.Text('The members of '),

> +                            nodes.literal('', v.type.doc_type())]

> +                    term.extend(self._nodes_for_variant_when(variants, v))

> +                    dlnode += self._make_dlitem(term, None)

> +

> +        if not dlnode.children:

> +            return None

> +

> +        section = self._make_section(what)

> +        section += dlnode

> +        return section

> +

> +    def _nodes_for_enum_values(self, doc, what):


@what is only ever 'Values', isn't it?

> +        """Return doctree nodes for the table of enum values"""

> +        seen_item = False

> +        dlnode = nodes.definition_list()

> +        for section in doc.args.values():

> +            termtext = [nodes.literal('', section.member.name)]

> +            if section.member.ifcond:

> +                termtext.extend(self._nodes_for_ifcond(section.member.ifcond))

> +            # TODO drop fallbacks when undocumented members are outlawed

> +            if section.text:

> +                defn = section.text

> +            else:

> +                defn = [nodes.Text('Not documented')]

> +

> +            dlnode += self._make_dlitem(termtext, defn)

> +            seen_item = True

> +

> +        if not seen_item:

> +            return None

> +

> +        section = self._make_section(what)

> +        section += dlnode

> +        return section

> +

> +    def _nodes_for_arguments(self, doc, boxed_arg_type):

> +        """Return doctree nodes for the arguments section"""

> +        if boxed_arg_type:

> +            assert not doc.args

> +            section = self._make_section('Arguments')

> +            dlnode = nodes.definition_list()

> +            dlnode += self._make_dlitem(

> +                [nodes.Text('The members of '),

> +                 nodes.literal('', boxed_arg_type.name)],

> +                None)

> +            section += dlnode

> +            return section

> +

> +        return self._nodes_for_members(doc, 'Arguments')

> +

> +    def _nodes_for_features(self, doc):

> +        """Return doctree nodes for the table of features"""

> +        seen_item = False

> +        dlnode = nodes.definition_list()

> +        for section in doc.features.values():

> +            dlnode += self._make_dlitem([nodes.literal('', section.name)],

> +                                        section.text)

> +            seen_item = True

> +

> +        if not seen_item:

> +            return None

> +

> +        section = self._make_section('Features')

> +        section += dlnode

> +        return section

> +

> +    def _nodes_for_example(self, exampletext):

> +        """Return doctree nodes for a code example snippet"""

> +        return nodes.literal_block(exampletext, exampletext)

> +

> +    def _nodes_for_sections(self, doc, ifcond):

> +        """Return doctree nodes for additional sections following arguments"""

> +        nodelist = []

> +        for section in doc.sections:

> +            snode = self._make_section(section.name)

> +            if section.name and section.name.startswith('Example'):

> +                snode += self._nodes_for_example(section.text)

> +            else:

> +                self._parse_text_into_node(section.text, snode)

> +            nodelist.append(snode)

> +        if ifcond:

> +            snode = self._make_section('If')

> +            snode += self._nodes_for_ifcond(ifcond, with_if=False)

> +            nodelist.append(snode)


I think I'd rather have a separate _node_for_ifcond().  Not a demand.

> +        if not nodelist:

> +            return None


Any particular reason for mapping [] to None?

> +        return nodelist

> +

> +    def _add_doc(self, typ, sections):

> +        """Add documentation for a command/object/enum...

> +

> +        We assume we're documenting the thing defined in self._cur_doc.

> +        typ is the type of thing being added ("Command", "Object", etc)

> +

> +        sections is a list of nodes for sections to add to the definition.

> +        """

> +

> +        doc = self._cur_doc

> +        snode = nodes.section(ids=[self._sphinx_directive.new_serialno()])

> +        snode += nodes.title('', '', *[nodes.literal(doc.symbol, doc.symbol),

> +                                       nodes.Text(' (' + typ + ')')])

> +        self._parse_text_into_node(doc.body.text, snode)

> +        for s in sections:

> +            if s is not None:

> +                snode += s


Looks like you're flattening the two-level list the callers create,
e.g. ...

> +        self._add_node_to_current_heading(snode)

> +

> +    def visit_enum_type(self, name, info, ifcond, features, members, prefix):

> +        doc = self._cur_doc

> +        self._add_doc('Enum',

> +                      [self._nodes_for_enum_values(doc, 'Values'),

> +                       self._nodes_for_features(doc),

> +                       self._nodes_for_sections(doc, ifcond)])


... this one: [[nodes for enum values...]
               [nodes for features...]
               [nodes for sections]]

Makes me wonder why you build two levels in the first place.

> +

> +    def visit_object_type(self, name, info, ifcond, features,

> +                          base, members, variants):

> +        doc = self._cur_doc

> +        if base and base.is_implicit():

> +            base = None

> +        self._add_doc('Object',

> +                      [self._nodes_for_members(doc, 'Members', base, variants),

> +                       self._nodes_for_features(doc),

> +                       self._nodes_for_sections(doc, ifcond)])

> +

> +    def visit_alternate_type(self, name, info, ifcond, features, variants):

> +        doc = self._cur_doc

> +        self._add_doc('Alternate',

> +                      [self._nodes_for_members(doc, 'Members'),

> +                       self._nodes_for_features(doc),

> +                       self._nodes_for_sections(doc, ifcond)])

> +

> +    def visit_command(self, name, info, ifcond, features, arg_type,

> +                      ret_type, gen, success_response, boxed, allow_oob,

> +                      allow_preconfig):

> +        doc = self._cur_doc

> +        self._add_doc('Command',

> +                      [self._nodes_for_arguments(doc,

> +                                                 arg_type if boxed else None),

> +                       self._nodes_for_features(doc),

> +                       self._nodes_for_sections(doc, ifcond)])

> +

> +    def visit_event(self, name, info, ifcond, features, arg_type, boxed):

> +        doc = self._cur_doc

> +        self._add_doc('Event',

> +                      [self._nodes_for_arguments(doc,

> +                                                 arg_type if boxed else None),

> +                       self._nodes_for_features(doc),

> +                       self._nodes_for_sections(doc, ifcond)])

> +

> +    def symbol(self, doc, entity):

> +        """Add documentation for one symbol to the document tree

> +

> +        This is the main entry point which causes us to add documentation

> +        nodes for a symbol (which could be a 'command', 'object', 'event',

> +        etc). We do this by calling 'visit' on the schema entity, which

> +        will then call back into one of our visit_* methods, depending

> +        on what kind of thing this symbol is.

> +        """

> +        self._cur_doc = doc

> +        entity.visit(self)

> +        self._cur_doc = None


Matches doc.py's QAPISchemaGenDocVisitor.symbol().  entity.visit() calls
one of the five functions preceding this one.  The function called
processes self._cur_doc.  They all boil down to a variation of
self._add_doc().  Okay.

> +

> +    def _start_new_heading(self, heading, level):

> +        """Start a new heading at the specified heading level

> +

> +        Create a new section whose title is 'heading' and which is placed

> +        in the docutils node tree as a child of the most recent level-1

> +        heading. Subsequent document sections (commands, freeform doc chunks,

> +        etc) will be placed as children of this new heading section.

> +        """

> +        if len(self._active_headings) < level:

> +            self._serror('Level %d subheading found outside a level %d heading'

> +                         % (level, level - 1))

> +        snode = self._make_section(heading)

> +        self._active_headings[level - 1] += snode

> +        self._active_headings = self._active_headings[:level]

> +        self._active_headings.append(snode)

> +

> +    def _add_node_to_current_heading(self, node):

> +        """Add the node to whatever the current active heading is"""

> +        self._active_headings[-1] += node

> +

> +    def freeform(self, doc):

> +        """Add a piece of 'freeform' documentation to the document tree

> +

> +        A 'freeform' document chunk doesn't relate to any particular

> +        symbol (for instance, it could be an introduction).

> +

> +        As a special case, if the freeform document is a single line

> +        of the form '= Heading text' it is treated as a section or subsection

> +        heading, with the heading level indicated by the number of '=' signs.

> +        """

> +

> +        # QAPIDoc documentation says free-form documentation blocks

> +        # must have only a body section, nothing else.

> +        assert not doc.sections

> +        assert not doc.args

> +        assert not doc.features

> +        self._cur_doc = doc

> +

> +        if re.match(r'=+ ', doc.body.text):

> +            # Section or subsection heading: must be the only thing in the block

> +            (heading, _, rest) = doc.body.text.partition('\n')

> +            if rest != '':

> +                raise ExtensionError('%s line %s: section or subsection heading'

> +                                     ' must be in its own doc comment block'

> +                                     % (doc.info.fname, doc.info.line))


Same ornate error message format as above, less 'syntax error: '.

> +            (leader, _, heading) = heading.partition(' ')

> +            self._start_new_heading(heading, len(leader))

> +            return

> +

> +        node = self._make_section(None)

> +        self._parse_text_into_node(doc.body.text, node)

> +        self._add_node_to_current_heading(node)

> +        self._cur_doc = None

> +

> +    def _parse_text_into_node(self, doctext, node):

> +        """Parse a chunk of QAPI-doc-format text into the node

> +

> +        The doc comment can contain most inline rST markup, including

> +        bulleted and enumerated lists.

> +        As an extra permitted piece of markup, @var will be turned

> +        into ``var``.

> +        """

> +

> +        # Handle the "@var means ``var`` case

> +        doctext = re.sub(r'@([\w-]+)', r'``\1``', doctext)

> +

> +        rstlist = ViewList()

> +        for line in doctext.splitlines():

> +            # The reported line number will always be that of the start line

> +            # of the doc comment, rather than the actual location of the error.

> +            # Being more precise would require overhaul of the QAPIDoc class

> +            # to track lines more exactly within all the sub-parts of the doc

> +            # comment, as well as counting lines here.

> +            rstlist.append(line, self._cur_doc.info.fname,

> +                           self._cur_doc.info.line)

> +        self._sphinx_directive.do_parse(rstlist, node)

> +

> +    def get_document_nodes(self):

> +        """Return the list of docutils nodes which make up the document"""

> +        return self._top_node.children

> +

> +class QAPIDocDirective(Directive):

> +    """Extract documentation from the specified QAPI .json file"""

> +    required_argument = 1

> +    optional_arguments = 1

> +    option_spec = {

> +        'qapifile': directives.unchanged_required

> +    }

> +    has_content = False

> +

> +    def new_serialno(self):

> +        """Return a unique new ID string suitable for use as a node's ID"""

> +        env = self.state.document.settings.env

> +        return 'qapidoc-%d' % env.new_serialno('qapidoc')

> +

> +    def run(self):

> +        env = self.state.document.settings.env

> +        qapifile = env.config.qapidoc_srctree + '/' + self.arguments[0]

> +

> +        # Tell sphinx of the dependency

> +        env.note_dependency(os.path.abspath(qapifile))

> +

> +        try:

> +            schema = QAPISchema(qapifile)

> +        except QAPIError as err:

> +            # Launder QAPI parse errors into Sphinx extension errors

> +            # so they are displayed nicely to the user

> +            raise ExtensionError(str(err))


I expected plumbing the error location through Sphinx to the user to
take a bit more effort.  I'm not complaining.

A QAPIError looks like this when converted to str:

    In file included from ...:
    ...
    In file included from ...:
    FILE: In ENTITY-TYPE 'NAME':
    FILE:LINE: MESSAGE

"In file" lines appear when the error is in a sub-module.

An "In ENTITY-TYPE" line appears when the error is in an entity
definition.

The other two ExtensionError() look like

    FILE line LINE: syntax error: MESSAGE

and

    FILE line LINE: MESSAGE

More ornate, less information.

I figure more errors can get thrown by nested_parse_with_titles() (see
below).  How do they look like?

Can we avoid the information loss?

Can we make the error message format more consistent?

> +

> +        vis = QAPISchemaGenRSTVisitor(self)

> +        vis.visit_begin(schema)

> +        for doc in schema.docs:

> +            if doc.symbol:

> +                vis.symbol(doc, schema.lookup_entity(doc.symbol))

> +            else:

> +                vis.freeform(doc)


Matches doc.py.  Visits the documentation doc comment blocks and the
entities they document.  The other backends use

           schema.visit(vis)

which visits the modules and their entities.  Okay.

> +

> +        return vis.get_document_nodes()

> +

> +    def do_parse(self, rstlist, node):

> +        """Parse rST source lines and add them to the specified node

> +

> +        Take the list of rST source lines rstlist, parse them as

> +        rST, and add the resulting docutils nodes as children of node.

> +        The nodes are parsed in a way that allows them to include

> +        subheadings (titles) without confusing the rendering of

> +        anything else.

> +        """

> +        # This is from kerneldoc.py -- it works around an API change in

> +        # Sphinx between 1.6 and 1.7. Unlike kerneldoc.py, we use

> +        # sphinx.util.nodes.nested_parse_with_titles() rather than the

> +        # plain self.state.nested_parse(), and so we can drop the saving

> +        # of title_styles and section_level that kerneldoc.py does,

> +        # because nested_parse_with_titles() does that for us.

> +        if Use_SSI:

> +            with switch_source_input(self.state, rstlist):

> +                nested_parse_with_titles(self.state, rstlist, node)

> +        else:

> +            save = self.state.memo.reporter

> +            self.state.memo.reporter = AutodocReporter(rstlist,

> +                                                       self.state.memo.reporter)

> +            try:

> +                nested_parse_with_titles(self.state, rstlist, node)

> +            finally:

> +                self.state.memo.reporter = save

> +

> +def setup(app):

> +    """ Register qapi-doc directive with Sphinx"""

> +    app.add_config_value('qapidoc_srctree', None, 'env')

> +    app.add_directive('qapi-doc', QAPIDocDirective)

> +

> +    return dict(

> +        version=__version__,

> +        parallel_read_safe=True,

> +        parallel_write_safe=True

> +    )

> diff --git a/MAINTAINERS b/MAINTAINERS

> index 0886eb3d2b5..fdd38f412d7 100644

> --- a/MAINTAINERS

> +++ b/MAINTAINERS

> @@ -3116,6 +3116,7 @@ M: Peter Maydell <peter.maydell@linaro.org>

>  S: Maintained

>  F: docs/conf.py

>  F: docs/*/conf.py

> +F: docs/sphinx/

>  

>  Miscellaneous

>  -------------


I had to read mostly backwards to understand the code.
Markus Armbruster Sept. 4, 2020, 2:44 p.m. UTC | #3
$ pycodestyle docs/sphinx/qapidoc.py 
docs/sphinx/qapidoc.py:42:1: E302 expected 2 blank lines, found 1
docs/sphinx/qapidoc.py:50:1: E302 expected 2 blank lines, found 1
docs/sphinx/qapidoc.py:74:80: E501 line too long (80 > 79 characters)
docs/sphinx/qapidoc.py:388:80: E501 line too long (80 > 79 characters)
docs/sphinx/qapidoc.py:391:80: E501 line too long (80 > 79 characters)
docs/sphinx/qapidoc.py:430:1: E302 expected 2 blank lines, found 1
docs/sphinx/qapidoc.py:489:80: E501 line too long (80 > 79 characters)
docs/sphinx/qapidoc.py:495:1: E302 expected 2 blank lines, found 1
$ PYTHONPATH=scripts pylint docs/sphinx/qapidoc.py 
************* Module qapidoc
docs/sphinx/qapidoc.py:36:4: E0611: No name 'AutodocReporter' in module 'sphinx.ext.autodoc' (no-name-in-module)
docs/sphinx/qapidoc.py:45:10: R1708: Do not raise StopIteration in generator, use return statement instead (stop-iteration-return)
docs/sphinx/qapidoc.py:104:4: R0201: Method could be a function (no-self-use)
docs/sphinx/qapidoc.py:253:4: R0201: Method could be a function (no-self-use)
docs/sphinx/qapidoc.py:34:4: C0412: Imports from package sphinx are not grouped (ungrouped-imports)

------------------------------------------------------------------
Your code has been rated at 9.64/10 (previous run: 8.85/10, +0.79)

Use your judgement.
Peter Maydell Sept. 4, 2020, 2:52 p.m. UTC | #4
On Fri, 4 Sep 2020 at 15:44, Markus Armbruster <armbru@redhat.com> wrote:
>

> $ pycodestyle docs/sphinx/qapidoc.py

> docs/sphinx/qapidoc.py:42:1: E302 expected 2 blank lines, found 1

> docs/sphinx/qapidoc.py:50:1: E302 expected 2 blank lines, found 1

> docs/sphinx/qapidoc.py:74:80: E501 line too long (80 > 79 characters)

> docs/sphinx/qapidoc.py:388:80: E501 line too long (80 > 79 characters)

> docs/sphinx/qapidoc.py:391:80: E501 line too long (80 > 79 characters)

> docs/sphinx/qapidoc.py:430:1: E302 expected 2 blank lines, found 1

> docs/sphinx/qapidoc.py:489:80: E501 line too long (80 > 79 characters)

> docs/sphinx/qapidoc.py:495:1: E302 expected 2 blank lines, found 1


Those seem like easy fixes.

> $ PYTHONPATH=scripts pylint docs/sphinx/qapidoc.py

> ************* Module qapidoc

> docs/sphinx/qapidoc.py:36:4: E0611: No name 'AutodocReporter' in module 'sphinx.ext.autodoc' (no-name-in-module)

> docs/sphinx/qapidoc.py:45:10: R1708: Do not raise StopIteration in generator, use return statement instead (stop-iteration-return)

> docs/sphinx/qapidoc.py:104:4: R0201: Method could be a function (no-self-use)

> docs/sphinx/qapidoc.py:253:4: R0201: Method could be a function (no-self-use)

> docs/sphinx/qapidoc.py:34:4: C0412: Imports from package sphinx are not grouped (ungrouped-imports)

>

> ------------------------------------------------------------------

> Your code has been rated at 9.64/10 (previous run: 8.85/10, +0.79)

>

> Use your judgement.


I'm pretty sure I remember running this or something with very
similar output at some point. IIRC they're mostly largely
unavoidable (in particular 1 and 5 are not satisfiable because
we're trying to handle multiple versions of Sphinx which present
different sets of importable symbols.)

thanks
-- PMM
Peter Maydell Sept. 21, 2020, 4:50 p.m. UTC | #5
On Fri, 4 Sep 2020 at 15:44, Markus Armbruster <armbru@redhat.com> wrote:
>
> $ pycodestyle docs/sphinx/qapidoc.py
> docs/sphinx/qapidoc.py:42:1: E302 expected 2 blank lines, found 1
> docs/sphinx/qapidoc.py:50:1: E302 expected 2 blank lines, found 1
> docs/sphinx/qapidoc.py:74:80: E501 line too long (80 > 79 characters)
> docs/sphinx/qapidoc.py:388:80: E501 line too long (80 > 79 characters)
> docs/sphinx/qapidoc.py:391:80: E501 line too long (80 > 79 characters)
> docs/sphinx/qapidoc.py:430:1: E302 expected 2 blank lines, found 1
> docs/sphinx/qapidoc.py:489:80: E501 line too long (80 > 79 characters)
> docs/sphinx/qapidoc.py:495:1: E302 expected 2 blank lines, found 1

All fixed.

> $ PYTHONPATH=scripts pylint docs/sphinx/qapidoc.py
> ************* Module qapidoc
> docs/sphinx/qapidoc.py:36:4: E0611: No name 'AutodocReporter' in module 'sphinx.ext.autodoc' (no-name-in-module)
> docs/sphinx/qapidoc.py:45:10: R1708: Do not raise StopIteration in generator, use return statement instead (stop-iteration-return)
> docs/sphinx/qapidoc.py:104:4: R0201: Method could be a function (no-self-use)
> docs/sphinx/qapidoc.py:253:4: R0201: Method could be a function (no-self-use)
> docs/sphinx/qapidoc.py:34:4: C0412: Imports from package sphinx are not grouped (ungrouped-imports)

Not fixed: I disagree with the linter on all these.

The first and fifth of these are unfixable because they are the
result of code that is trying to adapt to multiple versions of
Sphinx (one of which has AutodocReporter and the other of which
does not).

The second makes no sense and appears to me to be a linter
bug, because the code doesn't (directly) raise StopIteration.
In any case the function being complained about is just a
straight borrowing from pydash.

The third and fourth would mean that two of the 10 or so
_nodes_for_whatever methods would become functions and
gratuitously different in call signature from the rest
just because it happens that the current implementation
doesn't need 'self'.

(The version of pylint I have also warns about:
 * comments that say "TODO", both of which are ones that
   are carried over from the texinfo generator about dropping
   fallback code when undocumented members are outlawed
 * methods that are part of the QAPISchemaVisitor
   interface and apparently have "too many arguments")
 * single-letter variables
 * the Use_SSI variable name which is from the kerneldoc plugin
I'm going to ignore those too.)

thanks
-- PMM
Peter Maydell Sept. 21, 2020, 6:06 p.m. UTC | #6
On Fri, 4 Sep 2020 at 13:29, Markus Armbruster <armbru@redhat.com> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>
> > Some of our documentation is auto-generated from documentation
> > comments in the JSON schema.
> >
> > For Sphinx, rather than creating a file to include, the most natural
> > way to handle this is to have a small custom Sphinx extension which
> > processes the JSON file and inserts documentation into the rST
> > file being processed.
> >
> > This is the same approach that kerneldoc and hxtool use.
> >
> > Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
>
> Got a pointer to "Writing Sphinx Extensions for Dummies" or similar?

The upstream documentation on extending Sphinx is probably
a good place to start; in particular the tutorials are useful:

https://www.sphinx-doc.org/en/master/development/index.html

I have also found that a fair amount of trial-and-error,
examination of the source of Sphinx itself or of other
extensions, and occasionally asking questions on the Sphinx
mailing list has been necessary...

> > --- /dev/null
> > +++ b/docs/sphinx/qapidoc.py
> > @@ -0,0 +1,504 @@
> > +# coding=utf-8
> > +#
> > +# QEMU qapidoc QAPI file parsing extension
> > +#
> > +# Copyright (c) 2020 Linaro
> > +#
> > +# This work is licensed under the terms of the GNU GPLv2 or later.
> > +# See the COPYING file in the top-level directory.
> > +"""qapidoc is a Sphinx extension that implements the qapi-doc directive"""
> > +
> > +# The purpose of this extension is to read the documentation comments
> > +# in QAPI JSON schema files, and insert them all into the current document.
>
> Let's drop "JSON", it's not really true.

Fixed.

> > +# The conf.py file must set the qapidoc_srctree config value to
> > +# the root of the QEMU source tree.
> > +# Each qapi-doc:: directive takes one argument which is the
> > +# path of the .json file to process, relative to the source tree.
>
> Beg your pardon?  Oh, you're talking about .rst files now.  The next two
> patches will add such files.  Perhaps something like "The qapi-doc::
> reST directive provided by this extension takes ..."

OK, how about:

# The purpose of this extension is to read the documentation comments
# in QAPI schema files, and insert them all into the current document.
#
# It implements one new rST directive, "qapi-doc::".
# Each qapi-doc:: directive takes one argument, which is the
# path of the .json file to process, relative to the source tree.
#
# The docs/conf.py file must set the qapidoc_srctree config value to
# the root of the QEMU source tree.

?

> > +# Function borrowed from pydash, which is under the MIT license
> > +def intersperse(iterable, separator):
> > +    """Like join, but for arbitrary iterables, notably arrays"""
> > +    iterable = iter(iterable)
> > +    yield next(iterable)
> > +    for item in iterable:
> > +        yield separator
> > +        yield item
>
> What's wrong with separator.join(iterable)?

It doesn't work on all the things we'd like to intersperse().
If you try defining intersperse as

def intersperse(iterable, separator):
    separator.join(iterable)

then you get an exception:

  File "/home/petmay01/linaro/qemu-from-laptop/qemu/docs/../scripts/qapi/schema.py",
line 445, in visit
    self.base, self.local_members, self.variants)
  File "/home/petmay01/linaro/qemu-from-laptop/qemu/docs/sphinx/qapidoc.py",
line 314, in visit_object_type
    [self._nodes_for_members(doc, 'Members', base, variants),
  File "/home/petmay01/linaro/qemu-from-laptop/qemu/docs/sphinx/qapidoc.py",
line 173, in _nodes_for_members
    nodes.Text(', ')))
  File "/home/petmay01/linaro/qemu-from-laptop/qemu/docs/sphinx/qapidoc.py",
line 53, in intersperse
    separator.join(iterable)
TypeError: sequence item 0: expected str instance, literal found

> > +
> > +class QAPISchemaGenRSTVisitor(QAPISchemaVisitor):
> > +    """A QAPI schema visitor which generates docutils/Sphinx nodes
> > +
> > +    This class builds up a tree of docutils/Sphinx nodes corresponding
> > +    to documentation for the various QAPI objects. To use it, first create
> > +    a QAPISchemaGenRSTVisitor object, and call its visit_begin() method.
> > +    Then you can call one of the two methods 'freeform' (to add documentation
> > +    for a freeform documentation chunk) or 'symbol' (to add documentation
> > +    for a QAPI symbol). These will cause the visitor to build up the
> > +    tree of document nodes. Once you've added all the documentation
> > +    via 'freeform' and 'symbol' method calls, you can call 'get_document_nodes'
> > +    to get the final list of document nodes (in a form suitable for returning
> > +    from a Sphinx directive's 'run' method).
> > +    """
> > +    def __init__(self, sphinx_directive):
> > +        self._cur_doc = None
> > +        self._sphinx_directive = sphinx_directive
> > +        self._top_node = nodes.section()
> > +        self._active_headings = [self._top_node]
> > +
> > +    def _serror(self, msg):
> > +        """Raise an exception giving a user-friendly syntax error message"""
> > +        file = self._cur_doc.info.fname
> > +        line = self._cur_doc.info.line
> > +        raise ExtensionError('%s line %d: syntax error: %s' % (file, line, msg))
>
> Note for later: ornate error message format, no use of QAPISourceInfo.
>
> > +

> > +
> > +    def _nodes_for_enum_values(self, doc, what):
>
> @what is only ever 'Values', isn't it?

Yes. I think this was a legacy from conversion from scripts/qapi/doc.py,
which does pass in a 'Values' string, but it does that because it shares
code in texi_members() between enums and other things. In this code
I chose to split that out into different functions rather than using
one function with a member_func argument because the member vs enum
code is different enough that it's clearer that way. But I didn't
notice that there wasn't any longer much point in passing a 'what'
argument.

Fixed to hard-code 'Values' rather than passing it as an argument.

> > +    def _nodes_for_sections(self, doc, ifcond):
> > +        """Return doctree nodes for additional sections following arguments"""
> > +        nodelist = []
> > +        for section in doc.sections:
> > +            snode = self._make_section(section.name)
> > +            if section.name and section.name.startswith('Example'):
> > +                snode += self._nodes_for_example(section.text)
> > +            else:
> > +                self._parse_text_into_node(section.text, snode)
> > +            nodelist.append(snode)
> > +        if ifcond:
> > +            snode = self._make_section('If')
> > +            snode += self._nodes_for_ifcond(ifcond, with_if=False)
> > +            nodelist.append(snode)
>
> I think I'd rather have a separate _node_for_ifcond().  Not a demand.

I'm not sure what you have in mind here, so I'll leave it as it is :-)

> > +        if not nodelist:
> > +            return None
>
> Any particular reason for mapping [] to None?

I forget at this point. Probably related to the next query...

> > +        return nodelist
> > +
> > +    def _add_doc(self, typ, sections):
> > +        """Add documentation for a command/object/enum...
> > +
> > +        We assume we're documenting the thing defined in self._cur_doc.
> > +        typ is the type of thing being added ("Command", "Object", etc)
> > +
> > +        sections is a list of nodes for sections to add to the definition.
> > +        """
> > +
> > +        doc = self._cur_doc
> > +        snode = nodes.section(ids=[self._sphinx_directive.new_serialno()])
> > +        snode += nodes.title('', '', *[nodes.literal(doc.symbol, doc.symbol),
> > +                                       nodes.Text(' (' + typ + ')')])
> > +        self._parse_text_into_node(doc.body.text, snode)
> > +        for s in sections:
> > +            if s is not None:
> > +                snode += s
>
> Looks like you're flattening the two-level list the callers create,
> e.g. ...
>
> > +        self._add_node_to_current_heading(snode)
> > +
> > +    def visit_enum_type(self, name, info, ifcond, features, members, prefix):
> > +        doc = self._cur_doc
> > +        self._add_doc('Enum',
> > +                      [self._nodes_for_enum_values(doc, 'Values'),
> > +                       self._nodes_for_features(doc),
> > +                       self._nodes_for_sections(doc, ifcond)])
>
> ... this one: [[nodes for enum values...]
>                [nodes for features...]
>                [nodes for sections]]
>
> Makes me wonder why you build two levels in the first place.

This is probably just me not being a very experienced Python
programmer. What's the syntax for passing _add_doc the single list
which is just the concatenation of the various lists that
the _nodes_for_* methods return ?

> > +
> > +        if re.match(r'=+ ', doc.body.text):
> > +            # Section or subsection heading: must be the only thing in the block
> > +            (heading, _, rest) = doc.body.text.partition('\n')
> > +            if rest != '':
> > +                raise ExtensionError('%s line %s: section or subsection heading'
> > +                                     ' must be in its own doc comment block'
> > +                                     % (doc.info.fname, doc.info.line))
>
> Same ornate error message format as above, less 'syntax error: '.

Yes. I realized after sending v5 that this could be made to
call _serror() instead of directly raising the ExtensionError.

> > +        try:
> > +            schema = QAPISchema(qapifile)
> > +        except QAPIError as err:
> > +            # Launder QAPI parse errors into Sphinx extension errors
> > +            # so they are displayed nicely to the user
> > +            raise ExtensionError(str(err))
>
> I expected plumbing the error location through Sphinx to the user to
> take a bit more effort.  I'm not complaining.
>
> A QAPIError looks like this when converted to str:
>
>     In file included from ...:
>     ...
>     In file included from ...:
>     FILE: In ENTITY-TYPE 'NAME':
>     FILE:LINE: MESSAGE
>
> "In file" lines appear when the error is in a sub-module.
>
> An "In ENTITY-TYPE" line appears when the error is in an entity
> definition.
>
> The other two ExtensionError() look like
>
>     FILE line LINE: syntax error: MESSAGE
>
> and
>
>     FILE line LINE: MESSAGE
>
> More ornate, less information.
>
> I figure more errors can get thrown by nested_parse_with_titles() (see
> below).  How do they look like?

They look like this:

Warning, treated as error:
/home/petmay01/linaro/qemu-from-laptop/qemu/docs/../qapi/block.json:15:Bullet
list ends without a blank line; unexpected unindent.

(I've just noticed that with Sphinx 1.6, which we still have
to support, the file/line info isn't passed through, so you get:

Warning, treated as error:
/home/petmay01/linaro/qemu-from-laptop/qemu/docs/interop/qemu-qmp-ref.rst:7:Bullet
list ends without a blank line; unexpected unindent.

The plugin has code borrowed from kerneldoc.py which is
*supposed* to handle the older API Sphinx 1.6 used, but it
looks like it's broken. I'll have a look and see if it
is fixable, but possibly we may have to live with people
developing on old distros getting suboptimal errors.)

> Can we avoid the information loss?
>
> Can we make the error message format more consistent?

How do I get a QAPIError from within one of the
visit_* or freeform methods of a QAPISchemaVisitor ?
The current texinfo doc generator doesn't do anything like that.
If there's a way to get a QAPIError given
 * the 'doc' argument passed into freeform/visit*
 * the specific error message we want to emit
then we can make the errors this script itself emits
(which are just the ones about headings: wrongly-subnested
headings, and headings that aren't the only thing in their
freeform-doc block) consistent with the other QAPIError ones.
[As I mentioned in some earlier thread, we are identifying
these errors here for pragmatic "it was easy" reasons,
though you could make a case that scripts/qapi/parser.py
should be doing it.]

(Aside: what was your conclusion on the question of section
headers, anyway? I need to go back and re-read the older
threads but I think that is still an unresolved topic...)

However I suspect we can't generate QAPIErrors for errors which are
produced by Sphinx itself when it's handed invalid rST:
the API Sphinx provides us is that we can annotate the lines
of rST that we are assembling from the input files to indicate
their "true" file/line number, but we can't intercept error
messages it emits. Think of it like the C compiler -- a
preprocessor can add #line directives to pass fixed up file/line
info to the compiler, but it isn't in the loop when the compiler
actually prints out errors.

> I had to read mostly backwards to understand the code.

Yes. This seems to be the standard way to write a Sphinx extension.
(Compare the kerneldoc.py and the various example extensions
in the Sphinx docs.)

thanks
-- PMM
Markus Armbruster Sept. 22, 2020, 11:42 a.m. UTC | #7
Peter Maydell <peter.maydell@linaro.org> writes:

> On Fri, 4 Sep 2020 at 13:29, Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>
>> > Some of our documentation is auto-generated from documentation
>> > comments in the JSON schema.
>> >
>> > For Sphinx, rather than creating a file to include, the most natural
>> > way to handle this is to have a small custom Sphinx extension which
>> > processes the JSON file and inserts documentation into the rST
>> > file being processed.
>> >
>> > This is the same approach that kerneldoc and hxtool use.
>> >
>> > Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
>>
>> Got a pointer to "Writing Sphinx Extensions for Dummies" or similar?
>
> The upstream documentation on extending Sphinx is probably
> a good place to start; in particular the tutorials are useful:
>
> https://www.sphinx-doc.org/en/master/development/index.html

Do you think having this link in a comment could help future readers /
maintainers?

> I have also found that a fair amount of trial-and-error,
> examination of the source of Sphinx itself or of other
> extensions, and occasionally asking questions on the Sphinx
> mailing list has been necessary...
>
>> > --- /dev/null
>> > +++ b/docs/sphinx/qapidoc.py
>> > @@ -0,0 +1,504 @@
>> > +# coding=utf-8
>> > +#
>> > +# QEMU qapidoc QAPI file parsing extension
>> > +#
>> > +# Copyright (c) 2020 Linaro
>> > +#
>> > +# This work is licensed under the terms of the GNU GPLv2 or later.
>> > +# See the COPYING file in the top-level directory.
>> > +"""qapidoc is a Sphinx extension that implements the qapi-doc directive"""
>> > +
>> > +# The purpose of this extension is to read the documentation comments
>> > +# in QAPI JSON schema files, and insert them all into the current document.
>>
>> Let's drop "JSON", it's not really true.
>
> Fixed.
>
>> > +# The conf.py file must set the qapidoc_srctree config value to
>> > +# the root of the QEMU source tree.
>> > +# Each qapi-doc:: directive takes one argument which is the
>> > +# path of the .json file to process, relative to the source tree.
>>
>> Beg your pardon?  Oh, you're talking about .rst files now.  The next two
>> patches will add such files.  Perhaps something like "The qapi-doc::
>> reST directive provided by this extension takes ..."
>
> OK, how about:
>
> # The purpose of this extension is to read the documentation comments
> # in QAPI schema files, and insert them all into the current document.
> #
> # It implements one new rST directive, "qapi-doc::".
> # Each qapi-doc:: directive takes one argument, which is the
> # path of the .json file to process, relative to the source tree.
> #
> # The docs/conf.py file must set the qapidoc_srctree config value to
> # the root of the QEMU source tree.
>
> ?

Works for me with "pathname" instead of "path".  I'd also write "schema
file" instead of ".json file", but that's a matter of taste and up to
you.

>> > +# Function borrowed from pydash, which is under the MIT license
>> > +def intersperse(iterable, separator):
>> > +    """Like join, but for arbitrary iterables, notably arrays"""
>> > +    iterable = iter(iterable)
>> > +    yield next(iterable)
>> > +    for item in iterable:
>> > +        yield separator
>> > +        yield item
>>
>> What's wrong with separator.join(iterable)?
>
> It doesn't work on all the things we'd like to intersperse().
> If you try defining intersperse as
>
> def intersperse(iterable, separator):
>     separator.join(iterable)
>
> then you get an exception:
>
>   File "/home/petmay01/linaro/qemu-from-laptop/qemu/docs/../scripts/qapi/schema.py",
> line 445, in visit
>     self.base, self.local_members, self.variants)
>   File "/home/petmay01/linaro/qemu-from-laptop/qemu/docs/sphinx/qapidoc.py",
> line 314, in visit_object_type
>     [self._nodes_for_members(doc, 'Members', base, variants),
>   File "/home/petmay01/linaro/qemu-from-laptop/qemu/docs/sphinx/qapidoc.py",
> line 173, in _nodes_for_members
>     nodes.Text(', ')))
>   File "/home/petmay01/linaro/qemu-from-laptop/qemu/docs/sphinx/qapidoc.py",
> line 53, in intersperse
>     separator.join(iterable)
> TypeError: sequence item 0: expected str instance, literal found

Aha.

@iterable appears to be a list of docutils.node.literal, whatever that
may be.

The doc string is confusing (I figure you borrowed the it along with the
code).  str.join() works just fine for arrays.  It's not the kind of
iterable that's the problem, it's the kind of thing the iterable yields:
str.join() wants str, we have. docutils.node.literal.

The lazy solution is to delete the confusing doc string.

Here's my attempt at a non-lazy solution:

    """Yield the members of *iterable* interspersed with *separator*."""

>> > +
>> > +class QAPISchemaGenRSTVisitor(QAPISchemaVisitor):
>> > +    """A QAPI schema visitor which generates docutils/Sphinx nodes
>> > +
>> > +    This class builds up a tree of docutils/Sphinx nodes corresponding
>> > +    to documentation for the various QAPI objects. To use it, first create
>> > +    a QAPISchemaGenRSTVisitor object, and call its visit_begin() method.
>> > +    Then you can call one of the two methods 'freeform' (to add documentation
>> > +    for a freeform documentation chunk) or 'symbol' (to add documentation
>> > +    for a QAPI symbol). These will cause the visitor to build up the
>> > +    tree of document nodes. Once you've added all the documentation
>> > +    via 'freeform' and 'symbol' method calls, you can call 'get_document_nodes'
>> > +    to get the final list of document nodes (in a form suitable for returning
>> > +    from a Sphinx directive's 'run' method).
>> > +    """
>> > +    def __init__(self, sphinx_directive):
>> > +        self._cur_doc = None
>> > +        self._sphinx_directive = sphinx_directive
>> > +        self._top_node = nodes.section()
>> > +        self._active_headings = [self._top_node]
>> > +
>> > +    def _serror(self, msg):
>> > +        """Raise an exception giving a user-friendly syntax error message"""
>> > +        file = self._cur_doc.info.fname
>> > +        line = self._cur_doc.info.line
>> > +        raise ExtensionError('%s line %d: syntax error: %s' % (file, line, msg))
>>
>> Note for later: ornate error message format, no use of QAPISourceInfo.
>>
>> > +
>
>> > +
>> > +    def _nodes_for_enum_values(self, doc, what):
>>
>> @what is only ever 'Values', isn't it?
>
> Yes. I think this was a legacy from conversion from scripts/qapi/doc.py,
> which does pass in a 'Values' string, but it does that because it shares
> code in texi_members() between enums and other things. In this code
> I chose to split that out into different functions rather than using
> one function with a member_func argument because the member vs enum
> code is different enough that it's clearer that way. But I didn't
> notice that there wasn't any longer much point in passing a 'what'
> argument.
>
> Fixed to hard-code 'Values' rather than passing it as an argument.
>
>> > +    def _nodes_for_sections(self, doc, ifcond):
>> > +        """Return doctree nodes for additional sections following arguments"""
>> > +        nodelist = []
>> > +        for section in doc.sections:
>> > +            snode = self._make_section(section.name)
>> > +            if section.name and section.name.startswith('Example'):
>> > +                snode += self._nodes_for_example(section.text)
>> > +            else:
>> > +                self._parse_text_into_node(section.text, snode)
>> > +            nodelist.append(snode)
>> > +        if ifcond:
>> > +            snode = self._make_section('If')
>> > +            snode += self._nodes_for_ifcond(ifcond, with_if=False)
>> > +            nodelist.append(snode)
>>
>> I think I'd rather have a separate _node_for_ifcond().  Not a demand.
>
> I'm not sure what you have in mind here, so I'll leave it as it is :-)

_nodes_for_sections() is used like this:

        self._add_doc(WHAT,
                      [self._nodes_for_WHAT(doc, ...),
                       self._nodes_for_features(doc),
                       self._nodes_for_sections(doc, ifcond)])

._add_doc()'s second argument is a list of nodes for sections.

._nodes_for_WHAT() computes the nodes specific to the kind of thing
we're processing: enum, object, alternate, command, event.

._nodes_for_features() computes the nodes for the feature section.

._nodes_for_sections() computes both the nodes for additional sections
and the nodes for the ifcond section.

I'd compute the nodes for the ifcond section in their own function, to
keep .nodes_for_sections() focused on just one purpose.

>> > +        if not nodelist:
>> > +            return None
>>
>> Any particular reason for mapping [] to None?
>
> I forget at this point. Probably related to the next query...
>
>> > +        return nodelist
>> > +
>> > +    def _add_doc(self, typ, sections):
>> > +        """Add documentation for a command/object/enum...
>> > +
>> > +        We assume we're documenting the thing defined in self._cur_doc.
>> > +        typ is the type of thing being added ("Command", "Object", etc)
>> > +
>> > +        sections is a list of nodes for sections to add to the definition.
>> > +        """
>> > +
>> > +        doc = self._cur_doc
>> > +        snode = nodes.section(ids=[self._sphinx_directive.new_serialno()])
>> > +        snode += nodes.title('', '', *[nodes.literal(doc.symbol, doc.symbol),
>> > +                                       nodes.Text(' (' + typ + ')')])
>> > +        self._parse_text_into_node(doc.body.text, snode)
>> > +        for s in sections:
>> > +            if s is not None:
>> > +                snode += s
>>
>> Looks like you're flattening the two-level list the callers create,
>> e.g. ...
>>
>> > +        self._add_node_to_current_heading(snode)
>> > +
>> > +    def visit_enum_type(self, name, info, ifcond, features, members, prefix):
>> > +        doc = self._cur_doc
>> > +        self._add_doc('Enum',
>> > +                      [self._nodes_for_enum_values(doc, 'Values'),
>> > +                       self._nodes_for_features(doc),
>> > +                       self._nodes_for_sections(doc, ifcond)])
>>
>> ... this one: [[nodes for enum values...]
>>                [nodes for features...]
>>                [nodes for sections]]
>>
>> Makes me wonder why you build two levels in the first place.
>
> This is probably just me not being a very experienced Python
> programmer. What's the syntax for passing _add_doc the single list
> which is just the concatenation of the various lists that
> the _nodes_for_* methods return ?

You could replace

                      [self._nodes_for_WHAT(doc, ...),
                       self._nodes_for_features(doc),
                       self._nodes_for_sections(doc, ifcond)])

by

                      self._nodes_for_WHAT(doc, ...)
                      + self._nodes_for_features(doc)
                      + self._nodes_for_sections(doc, ifcond)

Operator + concatenates sequences.

>> > +
>> > +        if re.match(r'=+ ', doc.body.text):
>> > +            # Section or subsection heading: must be the only thing in the block
>> > +            (heading, _, rest) = doc.body.text.partition('\n')
>> > +            if rest != '':
>> > +                raise ExtensionError('%s line %s: section or subsection heading'
>> > +                                     ' must be in its own doc comment block'
>> > +                                     % (doc.info.fname, doc.info.line))
>>
>> Same ornate error message format as above, less 'syntax error: '.
>
> Yes. I realized after sending v5 that this could be made to
> call _serror() instead of directly raising the ExtensionError.
>
>> > +        try:
>> > +            schema = QAPISchema(qapifile)
>> > +        except QAPIError as err:
>> > +            # Launder QAPI parse errors into Sphinx extension errors
>> > +            # so they are displayed nicely to the user
>> > +            raise ExtensionError(str(err))
>>
>> I expected plumbing the error location through Sphinx to the user to
>> take a bit more effort.  I'm not complaining.
>>
>> A QAPIError looks like this when converted to str:
>>
>>     In file included from ...:
>>     ...
>>     In file included from ...:
>>     FILE: In ENTITY-TYPE 'NAME':
>>     FILE:LINE: MESSAGE
>>
>> "In file" lines appear when the error is in a sub-module.
>>
>> An "In ENTITY-TYPE" line appears when the error is in an entity
>> definition.
>>
>> The other two ExtensionError() look like
>>
>>     FILE line LINE: syntax error: MESSAGE
>>
>> and
>>
>>     FILE line LINE: MESSAGE
>>
>> More ornate, less information.
>>
>> I figure more errors can get thrown by nested_parse_with_titles() (see
>> below).  How do they look like?
>
> They look like this:
>
> Warning, treated as error:
> /home/petmay01/linaro/qemu-from-laptop/qemu/docs/../qapi/block.json:15:Bullet
> list ends without a blank line; unexpected unindent.
>
> (I've just noticed that with Sphinx 1.6, which we still have
> to support, the file/line info isn't passed through, so you get:
>
> Warning, treated as error:
> /home/petmay01/linaro/qemu-from-laptop/qemu/docs/interop/qemu-qmp-ref.rst:7:Bullet
> list ends without a blank line; unexpected unindent.
>
> The plugin has code borrowed from kerneldoc.py which is
> *supposed* to handle the older API Sphinx 1.6 used, but it
> looks like it's broken. I'll have a look and see if it
> is fixable, but possibly we may have to live with people
> developing on old distros getting suboptimal errors.)

I'm okay with that.

>> Can we avoid the information loss?
>>
>> Can we make the error message format more consistent?
>
> How do I get a QAPIError from within one of the
> visit_* or freeform methods of a QAPISchemaVisitor ?

The visit_ methods take an info argument.  With that:

    raise QAPISemError(info, "the error message")

QAPISchemaGenDocVisitor.freeform() lacks such an argument.  Use
doc.info.

> The current texinfo doc generator doesn't do anything like that.

None of the code generators detects any errors.

I added one to test the advice I gave above.  It doesn't get caught.
Easy enough to fix: move then gen_FOO() into main()'s try ... except.

> If there's a way to get a QAPIError given
>  * the 'doc' argument passed into freeform/visit*
>  * the specific error message we want to emit
> then we can make the errors this script itself emits
> (which are just the ones about headings: wrongly-subnested
> headings, and headings that aren't the only thing in their
> freeform-doc block) consistent with the other QAPIError ones.
> [As I mentioned in some earlier thread, we are identifying
> these errors here for pragmatic "it was easy" reasons,
> though you could make a case that scripts/qapi/parser.py
> should be doing it.]

Oh yes.

> (Aside: what was your conclusion on the question of section
> headers, anyway? I need to go back and re-read the older
> threads but I think that is still an unresolved topic...)

If I remember correctly, I disliked "[PATCH 19/29]
qapi/qapi-schema.json: Put headers in their own doc-comment blocks",
because "it sweeps the actual problem under the rug, namely flaws in our
parsing and representation of doc comments."

Our documentation has a tree structure.  Ideally, we'd represent it as a
tree.  I figure such a tree would suit you well: you could translate
more or less 1:1 to Sphinx nodes.

We currently have a two-level list instead: a list of blocks (either
definition or free-form), where each block has a list of sections (see
QAPIDoc's doc string).

My commit dcdc07a97c "qapi: Make section headings start a new doc
comment block" was a minimally invasive patch to get us closer to a
(partially flattened) tree.  Kind of collects the dirt in an (hopefully
obvious) pile next to the rug.

You could finish the job and rework the representation into a tree.
That would be lovely.  I appreciate help with cleaning up existing QAPI
messes very much, but do not feel obliged.

You could work with the two-level list as is, relying on the fact that a
'=' can only occur right at the start of a free-form block.

You could tweak the two-level list just a bit, so that the heading
becomes properly represented, and you don't have to match free-form body
section text to find it.

Does this help you along?

> However I suspect we can't generate QAPIErrors for errors which are
> produced by Sphinx itself when it's handed invalid rST:
> the API Sphinx provides us is that we can annotate the lines
> of rST that we are assembling from the input files to indicate
> their "true" file/line number, but we can't intercept error
> messages it emits. Think of it like the C compiler -- a
> preprocessor can add #line directives to pass fixed up file/line
> info to the compiler, but it isn't in the loop when the compiler
> actually prints out errors.

Perfection is not a requirement :)

Errors from TexInfo are arguably worse than that now.  They are rare,
however.  The situation will change when people use reST markup in doc
comments.

>> I had to read mostly backwards to understand the code.
>
> Yes. This seems to be the standard way to write a Sphinx extension.
> (Compare the kerneldoc.py and the various example extensions
> in the Sphinx docs.)

Mimicking existing similar extensions makes sense.
Markus Armbruster Sept. 22, 2020, 11:47 a.m. UTC | #8
Peter Maydell <peter.maydell@linaro.org> writes:

> On Fri, 4 Sep 2020 at 15:44, Markus Armbruster <armbru@redhat.com> wrote:
>>
>> $ pycodestyle docs/sphinx/qapidoc.py
>> docs/sphinx/qapidoc.py:42:1: E302 expected 2 blank lines, found 1
>> docs/sphinx/qapidoc.py:50:1: E302 expected 2 blank lines, found 1
>> docs/sphinx/qapidoc.py:74:80: E501 line too long (80 > 79 characters)
>> docs/sphinx/qapidoc.py:388:80: E501 line too long (80 > 79 characters)
>> docs/sphinx/qapidoc.py:391:80: E501 line too long (80 > 79 characters)
>> docs/sphinx/qapidoc.py:430:1: E302 expected 2 blank lines, found 1
>> docs/sphinx/qapidoc.py:489:80: E501 line too long (80 > 79 characters)
>> docs/sphinx/qapidoc.py:495:1: E302 expected 2 blank lines, found 1
>
> All fixed.
>
>> $ PYTHONPATH=scripts pylint docs/sphinx/qapidoc.py
>> ************* Module qapidoc
>> docs/sphinx/qapidoc.py:36:4: E0611: No name 'AutodocReporter' in module 'sphinx.ext.autodoc' (no-name-in-module)
>> docs/sphinx/qapidoc.py:45:10: R1708: Do not raise StopIteration in generator, use return statement instead (stop-iteration-return)
>> docs/sphinx/qapidoc.py:104:4: R0201: Method could be a function (no-self-use)
>> docs/sphinx/qapidoc.py:253:4: R0201: Method could be a function (no-self-use)
>> docs/sphinx/qapidoc.py:34:4: C0412: Imports from package sphinx are not grouped (ungrouped-imports)
>
> Not fixed: I disagree with the linter on all these.
>
> The first and fifth of these are unfixable because they are the
> result of code that is trying to adapt to multiple versions of
> Sphinx (one of which has AutodocReporter and the other of which
> does not).
>
> The second makes no sense and appears to me to be a linter
> bug, because the code doesn't (directly) raise StopIteration.
> In any case the function being complained about is just a
> straight borrowing from pydash.
>
> The third and fourth would mean that two of the 10 or so
> _nodes_for_whatever methods would become functions and
> gratuitously different in call signature from the rest
> just because it happens that the current implementation
> doesn't need 'self'.

As I said, use your judgement.

> (The version of pylint I have also warns about:
>  * comments that say "TODO", both of which are ones that
>    are carried over from the texinfo generator about dropping
>    fallback code when undocumented members are outlawed
>  * methods that are part of the QAPISchemaVisitor
>    interface and apparently have "too many arguments")
>  * single-letter variables
>  * the Use_SSI variable name which is from the kerneldoc plugin
> I'm going to ignore those too.)

John Snow is working on a combination of pylint configuration that suits
us, genuine removal of lint, and the occasional pylint directive to shut
it up when it's wrong.  Informational, don't worry about it now.
Peter Maydell Sept. 24, 2020, 1:25 p.m. UTC | #9
On Tue, 22 Sep 2020 at 12:42, Markus Armbruster <armbru@redhat.com> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>
> > On Fri, 4 Sep 2020 at 13:29, Markus Armbruster <armbru@redhat.com> wrote:
> >> Got a pointer to "Writing Sphinx Extensions for Dummies" or similar?
> >
> > The upstream documentation on extending Sphinx is probably
> > a good place to start; in particular the tutorials are useful:
> >
> > https://www.sphinx-doc.org/en/master/development/index.html
>
> Do you think having this link in a comment could help future readers /
> maintainers?

Easy enough to add it, and it doesn't hurt.

> > OK, how about:
> >
> > # The purpose of this extension is to read the documentation comments
> > # in QAPI schema files, and insert them all into the current document.
> > #
> > # It implements one new rST directive, "qapi-doc::".
> > # Each qapi-doc:: directive takes one argument, which is the
> > # path of the .json file to process, relative to the source tree.
> > #
> > # The docs/conf.py file must set the qapidoc_srctree config value to
> > # the root of the QEMU source tree.
> >
> > ?
>
> Works for me with "pathname" instead of "path".  I'd also write "schema
> file" instead of ".json file", but that's a matter of taste and up to
> you.

Fixed.

> The doc string is confusing (I figure you borrowed the it along with the
> code).  str.join() works just fine for arrays.  It's not the kind of
> iterable that's the problem, it's the kind of thing the iterable yields:
> str.join() wants str, we have. docutils.node.literal.
>
> The lazy solution is to delete the confusing doc string.
>
> Here's my attempt at a non-lazy solution:
>
>     """Yield the members of *iterable* interspersed with *separator*."""

Docstring updated to that text.

> >> I think I'd rather have a separate _node_for_ifcond().  Not a demand.
> >
> > I'm not sure what you have in mind here, so I'll leave it as it is :-)
>
> _nodes_for_sections() is used like this:
>
>         self._add_doc(WHAT,
>                       [self._nodes_for_WHAT(doc, ...),
>                        self._nodes_for_features(doc),
>                        self._nodes_for_sections(doc, ifcond)])
>
> ._add_doc()'s second argument is a list of nodes for sections.
>
> ._nodes_for_WHAT() computes the nodes specific to the kind of thing
> we're processing: enum, object, alternate, command, event.
>
> ._nodes_for_features() computes the nodes for the feature section.
>
> ._nodes_for_sections() computes both the nodes for additional sections
> and the nodes for the ifcond section.
>
> I'd compute the nodes for the ifcond section in their own function, to
> keep .nodes_for_sections() focused on just one purpose.

Oh, OK, that makes sense. The name "_nodes_for_ifcond" is already
taken, though, so I'll call it "_nodes_for_if_section".

> >> Makes me wonder why you build two levels in the first place.
> >
> > This is probably just me not being a very experienced Python
> > programmer. What's the syntax for passing _add_doc the single list
> > which is just the concatenation of the various lists that
> > the _nodes_for_* methods return ?
>
> You could replace
>
>                       [self._nodes_for_WHAT(doc, ...),
>                        self._nodes_for_features(doc),
>                        self._nodes_for_sections(doc, ifcond)])
>
> by
>
>                       self._nodes_for_WHAT(doc, ...)
>                       + self._nodes_for_features(doc)
>                       + self._nodes_for_sections(doc, ifcond)
>
> Operator + concatenates sequences.

Right. We also need to change some of the _nodes_for_whatever
functions that were sometimes returning just a single node
(eg "return section") to instead return a one-element list
("return [section]"), and make them return [] where they were
previously returning None. But this seems like a good change,
as then all these functions consistently return a list
(which might have 0, 1 or many elements).

> > How do I get a QAPIError from within one of the
> > visit_* or freeform methods of a QAPISchemaVisitor ?
>
> The visit_ methods take an info argument.  With that:
>
>     raise QAPISemError(info, "the error message")

We need to launder it into an ExtensionError, so
   qapierr = QAPISemError(info, "message")
   raise ExtensionError(str(qapierr))

but otherwise this works nicely, and you get:

Extension error:
In file included from
/home/petmay01/linaro/qemu-from-laptop/qemu/docs/../qapi/qapi-schema.json:72:
/home/petmay01/linaro/qemu-from-laptop/qemu/docs/../qapi/block.json:4:
section or subsection heading must be in its own doc comment block

(Or we could do the launder-QAPIError-into-ExtensionError
in run() by extending the scope of its try..except, as I think
you suggest below.)

> > (Aside: what was your conclusion on the question of section
> > headers, anyway? I need to go back and re-read the older
> > threads but I think that is still an unresolved topic...)
>
> If I remember correctly, I disliked "[PATCH 19/29]
> qapi/qapi-schema.json: Put headers in their own doc-comment blocks",
> because "it sweeps the actual problem under the rug, namely flaws in our
> parsing and representation of doc comments."
>
> Our documentation has a tree structure.  Ideally, we'd represent it as a
> tree.  I figure such a tree would suit you well: you could translate
> more or less 1:1 to Sphinx nodes.
>
> We currently have a two-level list instead: a list of blocks (either
> definition or free-form), where each block has a list of sections (see
> QAPIDoc's doc string).
>
> My commit dcdc07a97c "qapi: Make section headings start a new doc
> comment block" was a minimally invasive patch to get us closer to a
> (partially flattened) tree.  Kind of collects the dirt in an (hopefully
> obvious) pile next to the rug.

Ah, yes, I'd forgotten that commit. So I can drop the bit of
code you disliked previously that insists on section headings
being in their own freeform block (handling "header line followed
by freeform text" is easy). That also incidentally means we're down
to only one parser-error in the qapidoc.py, which is the one that
catches mis-ordered section headings (eg using '=== header' when
you hadn't previously used '== header').

> You could finish the job and rework the representation into a tree.
> That would be lovely.  I appreciate help with cleaning up existing QAPI
> messes very much, but do not feel obliged.
>
> You could work with the two-level list as is, relying on the fact that a
> '=' can only occur right at the start of a free-form block.

I'll take this option, I think.

> You could tweak the two-level list just a bit, so that the heading
> becomes properly represented, and you don't have to match free-form body
> section text to find it.

thanks
-- PMM
Peter Maydell Sept. 24, 2020, 4:30 p.m. UTC | #10
On Mon, 21 Sep 2020 at 19:06, Peter Maydell <peter.maydell@linaro.org> wrote:
> (I've just noticed that with Sphinx 1.6, which we still have
> to support, the file/line info isn't passed through, so you get:
>
> Warning, treated as error:
> /home/petmay01/linaro/qemu-from-laptop/qemu/docs/interop/qemu-qmp-ref.rst:7:Bullet
> list ends without a blank line; unexpected unindent.
>
> The plugin has code borrowed from kerneldoc.py which is
> *supposed* to handle the older API Sphinx 1.6 used, but it
> looks like it's broken. I'll have a look and see if it
> is fixable, but possibly we may have to live with people
> developing on old distros getting suboptimal errors.)

Tracked down the cause of this -- it turns out that if you
feed nested_parse_with_titles() bogus rST then in some
cases it will detect the error with a line number that's
one off the end of the input text, eg on the 2 lines:
0: * a bulleted list
1: a misindented line

there's a syntax error here where line 1 is misindented,
but at least Sphinx 1.6 wants to attribute the error to a
nonexistent line 2, which then doesn't match in the
input-lines-to-source-info mapping for the fragment
and so gets reported for the next level out (the .rst file).
It just happened that the syntax error I used to test the
file/line reporting this time around was one of this kind.
I assume Sphinx 3 either gets the line attribution more
accurate or is not using the same mechanism for finding
the input line in the source mapping.

The fix is just to always add a blank line at the end of
every .rst fragment we hand to the Sphinx parser, which
doesn't affect the generated output and does sidestep this
issue.

thanks
-- PMM
Markus Armbruster Sept. 25, 2020, 5:51 a.m. UTC | #11
Peter Maydell <peter.maydell@linaro.org> writes:

> On Mon, 21 Sep 2020 at 19:06, Peter Maydell <peter.maydell@linaro.org> wrote:
>> (I've just noticed that with Sphinx 1.6, which we still have
>> to support, the file/line info isn't passed through, so you get:
>>
>> Warning, treated as error:
>> /home/petmay01/linaro/qemu-from-laptop/qemu/docs/interop/qemu-qmp-ref.rst:7:Bullet
>> list ends without a blank line; unexpected unindent.
>>
>> The plugin has code borrowed from kerneldoc.py which is
>> *supposed* to handle the older API Sphinx 1.6 used, but it
>> looks like it's broken. I'll have a look and see if it
>> is fixable, but possibly we may have to live with people
>> developing on old distros getting suboptimal errors.)
>
> Tracked down the cause of this -- it turns out that if you
> feed nested_parse_with_titles() bogus rST then in some
> cases it will detect the error with a line number that's
> one off the end of the input text, eg on the 2 lines:
> 0: * a bulleted list
> 1: a misindented line
>
> there's a syntax error here where line 1 is misindented,
> but at least Sphinx 1.6 wants to attribute the error to a
> nonexistent line 2, which then doesn't match in the
> input-lines-to-source-info mapping for the fragment
> and so gets reported for the next level out (the .rst file).
> It just happened that the syntax error I used to test the
> file/line reporting this time around was one of this kind.

Lucky!

> I assume Sphinx 3 either gets the line attribution more
> accurate or is not using the same mechanism for finding
> the input line in the source mapping.
>
> The fix is just to always add a blank line at the end of
> every .rst fragment we hand to the Sphinx parser, which
> doesn't affect the generated output and does sidestep this
> issue.

Sounds good, as long as it has a comment explaining why we need this.
diff mbox series

Patch

diff --git a/docs/conf.py b/docs/conf.py
index d6e173ef77b..44ec5050717 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -52,7 +52,10 @@  except NameError:
 # add these directories to sys.path here. If the directory is relative to the
 # documentation root, use an absolute path starting from qemu_docdir.
 #
+# Our extensions are in docs/sphinx; the qapidoc extension requires
+# the QAPI modules from scripts/.
 sys.path.insert(0, os.path.join(qemu_docdir, "sphinx"))
+sys.path.insert(0, os.path.join(qemu_docdir, "../scripts"))
 
 
 # -- General configuration ------------------------------------------------
@@ -67,7 +70,7 @@  needs_sphinx = '1.6'
 # Add any Sphinx extension module names here, as strings. They can be
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
 # ones.
-extensions = ['kerneldoc', 'qmp_lexer', 'hxtool']
+extensions = ['kerneldoc', 'qmp_lexer', 'hxtool', 'qapidoc']
 
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
@@ -241,3 +244,4 @@  texinfo_documents = [
 kerneldoc_bin = os.path.join(qemu_docdir, '../scripts/kernel-doc')
 kerneldoc_srctree = os.path.join(qemu_docdir, '..')
 hxtool_srctree = os.path.join(qemu_docdir, '..')
+qapidoc_srctree = os.path.join(qemu_docdir, '..')
diff --git a/docs/sphinx/qapidoc.py b/docs/sphinx/qapidoc.py
new file mode 100644
index 00000000000..4f571b2f45a
--- /dev/null
+++ b/docs/sphinx/qapidoc.py
@@ -0,0 +1,504 @@ 
+# coding=utf-8
+#
+# QEMU qapidoc QAPI file parsing extension
+#
+# Copyright (c) 2020 Linaro
+#
+# This work is licensed under the terms of the GNU GPLv2 or later.
+# See the COPYING file in the top-level directory.
+"""qapidoc is a Sphinx extension that implements the qapi-doc directive"""
+
+# The purpose of this extension is to read the documentation comments
+# in QAPI JSON schema files, and insert them all into the current document.
+# The conf.py file must set the qapidoc_srctree config value to
+# the root of the QEMU source tree.
+# Each qapi-doc:: directive takes one argument which is the
+# path of the .json file to process, relative to the source tree.
+
+import os
+import re
+
+from docutils import nodes
+from docutils.statemachine import ViewList
+from docutils.parsers.rst import directives, Directive
+from sphinx.errors import ExtensionError
+from sphinx.util.nodes import nested_parse_with_titles
+import sphinx
+from qapi.gen import QAPISchemaVisitor
+from qapi.schema import QAPIError, QAPISchema
+
+# Sphinx up to 1.6 uses AutodocReporter; 1.7 and later
+# use switch_source_input. Check borrowed from kerneldoc.py.
+Use_SSI = sphinx.__version__[:3] >= '1.7'
+if Use_SSI:
+    from sphinx.util.docutils import switch_source_input
+else:
+    from sphinx.ext.autodoc import AutodocReporter
+
+
+__version__ = '1.0'
+
+# Function borrowed from pydash, which is under the MIT license
+def intersperse(iterable, separator):
+    """Like join, but for arbitrary iterables, notably arrays"""
+    iterable = iter(iterable)
+    yield next(iterable)
+    for item in iterable:
+        yield separator
+        yield item
+
+class QAPISchemaGenRSTVisitor(QAPISchemaVisitor):
+    """A QAPI schema visitor which generates docutils/Sphinx nodes
+
+    This class builds up a tree of docutils/Sphinx nodes corresponding
+    to documentation for the various QAPI objects. To use it, first create
+    a QAPISchemaGenRSTVisitor object, and call its visit_begin() method.
+    Then you can call one of the two methods 'freeform' (to add documentation
+    for a freeform documentation chunk) or 'symbol' (to add documentation
+    for a QAPI symbol). These will cause the visitor to build up the
+    tree of document nodes. Once you've added all the documentation
+    via 'freeform' and 'symbol' method calls, you can call 'get_document_nodes'
+    to get the final list of document nodes (in a form suitable for returning
+    from a Sphinx directive's 'run' method).
+    """
+    def __init__(self, sphinx_directive):
+        self._cur_doc = None
+        self._sphinx_directive = sphinx_directive
+        self._top_node = nodes.section()
+        self._active_headings = [self._top_node]
+
+    def _serror(self, msg):
+        """Raise an exception giving a user-friendly syntax error message"""
+        file = self._cur_doc.info.fname
+        line = self._cur_doc.info.line
+        raise ExtensionError('%s line %d: syntax error: %s' % (file, line, msg))
+
+    def _make_dlitem(self, term, defn):
+        """Return a dlitem node with the specified term and definition.
+
+        term should be a list of Text and literal nodes.
+        defn should be one of:
+        - a string, which will be handed to _parse_text_into_node
+        - a list of Text and literal nodes, which will be put into
+          a paragraph node
+        """
+        dlitem = nodes.definition_list_item()
+        dlterm = nodes.term('', '', *term)
+        dlitem += dlterm
+        if defn:
+            dldef = nodes.definition()
+            if isinstance(defn, list):
+                dldef += nodes.paragraph('', '', *defn)
+            else:
+                self._parse_text_into_node(defn, dldef)
+            dlitem += dldef
+        return dlitem
+
+    def _make_section(self, title):
+        """Return a section node with optional title"""
+        section = nodes.section(ids=[self._sphinx_directive.new_serialno()])
+        if title:
+            section += nodes.title(title, title)
+        return section
+
+    def _nodes_for_ifcond(self, ifcond, with_if=True):
+        """Return list of Text, literal nodes for the ifcond
+
+        Return a list which gives text like ' (If: cond1, cond2, cond3)', where
+        the conditions are in literal-text and the commas are not.
+        If with_if is False, we don't return the "(If: " and ")".
+        """
+        condlist = intersperse([nodes.literal('', c) for c in ifcond],
+                               nodes.Text(', '))
+        if not with_if:
+            return condlist
+
+        nodelist = [nodes.Text(' ('), nodes.strong('', 'If: ')]
+        nodelist.extend(condlist)
+        nodelist.append(nodes.Text(')'))
+        return nodelist
+
+    def _nodes_for_one_member(self, member):
+        """Return list of Text, literal nodes for this member
+
+        Return a list of doctree nodes which give text like
+        'name: type (optional) (If: ...)' suitable for use as the
+        'term' part of a definition list item.
+        """
+        term = [nodes.literal('', member.name)]
+        if member.type.doc_type():
+            term.append(nodes.Text(': '))
+            term.append(nodes.literal('', member.type.doc_type()))
+        if member.optional:
+            term.append(nodes.Text(' (optional)'))
+        if member.ifcond:
+            term.extend(self._nodes_for_ifcond(member.ifcond))
+        return term
+
+    def _nodes_for_variant_when(self, variants, variant):
+        """Return list of Text, literal nodes for variant 'when' clause
+
+        Return a list of doctree nodes which give text like
+        'when tagname is variant (If: ...)' suitable for use in
+        the 'variants' part of a definition list.
+        """
+        term = [nodes.Text(' when '),
+                nodes.literal('', variants.tag_member.name),
+                nodes.Text(' is '),
+                nodes.literal('', '"%s"' % variant.name)]
+        if variant.ifcond:
+            term.extend(self._nodes_for_ifcond(variant.ifcond))
+        return term
+
+    def _nodes_for_members(self, doc, what, base=None, variants=None):
+        """Return doctree nodes for the table of members"""
+        dlnode = nodes.definition_list()
+        for section in doc.args.values():
+            term = self._nodes_for_one_member(section.member)
+            # TODO drop fallbacks when undocumented members are outlawed
+            if section.text:
+                defn = section.text
+            elif (variants and variants.tag_member == section.member
+                  and not section.member.type.doc_type()):
+                values = section.member.type.member_names()
+                defn = [nodes.Text('One of ')]
+                defn.extend(intersperse([nodes.literal('', v) for v in values],
+                                        nodes.Text(', ')))
+            else:
+                defn = [nodes.Text('Not documented')]
+
+            dlnode += self._make_dlitem(term, defn)
+
+        if base:
+            dlnode += self._make_dlitem([nodes.Text('The members of '),
+                                         nodes.literal('', base.doc_type())],
+                                        None)
+
+        if variants:
+            for v in variants.variants:
+                if v.type.is_implicit():
+                    assert not v.type.base and not v.type.variants
+                    for m in v.type.local_members:
+                        term = self._nodes_for_one_member(m)
+                        term.extend(self._nodes_for_variant_when(variants, v))
+                        dlnode += self._make_dlitem(term, None)
+                else:
+                    term = [nodes.Text('The members of '),
+                            nodes.literal('', v.type.doc_type())]
+                    term.extend(self._nodes_for_variant_when(variants, v))
+                    dlnode += self._make_dlitem(term, None)
+
+        if not dlnode.children:
+            return None
+
+        section = self._make_section(what)
+        section += dlnode
+        return section
+
+    def _nodes_for_enum_values(self, doc, what):
+        """Return doctree nodes for the table of enum values"""
+        seen_item = False
+        dlnode = nodes.definition_list()
+        for section in doc.args.values():
+            termtext = [nodes.literal('', section.member.name)]
+            if section.member.ifcond:
+                termtext.extend(self._nodes_for_ifcond(section.member.ifcond))
+            # TODO drop fallbacks when undocumented members are outlawed
+            if section.text:
+                defn = section.text
+            else:
+                defn = [nodes.Text('Not documented')]
+
+            dlnode += self._make_dlitem(termtext, defn)
+            seen_item = True
+
+        if not seen_item:
+            return None
+
+        section = self._make_section(what)
+        section += dlnode
+        return section
+
+    def _nodes_for_arguments(self, doc, boxed_arg_type):
+        """Return doctree nodes for the arguments section"""
+        if boxed_arg_type:
+            assert not doc.args
+            section = self._make_section('Arguments')
+            dlnode = nodes.definition_list()
+            dlnode += self._make_dlitem(
+                [nodes.Text('The members of '),
+                 nodes.literal('', boxed_arg_type.name)],
+                None)
+            section += dlnode
+            return section
+
+        return self._nodes_for_members(doc, 'Arguments')
+
+    def _nodes_for_features(self, doc):
+        """Return doctree nodes for the table of features"""
+        seen_item = False
+        dlnode = nodes.definition_list()
+        for section in doc.features.values():
+            dlnode += self._make_dlitem([nodes.literal('', section.name)],
+                                        section.text)
+            seen_item = True
+
+        if not seen_item:
+            return None
+
+        section = self._make_section('Features')
+        section += dlnode
+        return section
+
+    def _nodes_for_example(self, exampletext):
+        """Return doctree nodes for a code example snippet"""
+        return nodes.literal_block(exampletext, exampletext)
+
+    def _nodes_for_sections(self, doc, ifcond):
+        """Return doctree nodes for additional sections following arguments"""
+        nodelist = []
+        for section in doc.sections:
+            snode = self._make_section(section.name)
+            if section.name and section.name.startswith('Example'):
+                snode += self._nodes_for_example(section.text)
+            else:
+                self._parse_text_into_node(section.text, snode)
+            nodelist.append(snode)
+        if ifcond:
+            snode = self._make_section('If')
+            snode += self._nodes_for_ifcond(ifcond, with_if=False)
+            nodelist.append(snode)
+        if not nodelist:
+            return None
+        return nodelist
+
+    def _add_doc(self, typ, sections):
+        """Add documentation for a command/object/enum...
+
+        We assume we're documenting the thing defined in self._cur_doc.
+        typ is the type of thing being added ("Command", "Object", etc)
+
+        sections is a list of nodes for sections to add to the definition.
+        """
+
+        doc = self._cur_doc
+        snode = nodes.section(ids=[self._sphinx_directive.new_serialno()])
+        snode += nodes.title('', '', *[nodes.literal(doc.symbol, doc.symbol),
+                                       nodes.Text(' (' + typ + ')')])
+        self._parse_text_into_node(doc.body.text, snode)
+        for s in sections:
+            if s is not None:
+                snode += s
+        self._add_node_to_current_heading(snode)
+
+    def visit_enum_type(self, name, info, ifcond, features, members, prefix):
+        doc = self._cur_doc
+        self._add_doc('Enum',
+                      [self._nodes_for_enum_values(doc, 'Values'),
+                       self._nodes_for_features(doc),
+                       self._nodes_for_sections(doc, ifcond)])
+
+    def visit_object_type(self, name, info, ifcond, features,
+                          base, members, variants):
+        doc = self._cur_doc
+        if base and base.is_implicit():
+            base = None
+        self._add_doc('Object',
+                      [self._nodes_for_members(doc, 'Members', base, variants),
+                       self._nodes_for_features(doc),
+                       self._nodes_for_sections(doc, ifcond)])
+
+    def visit_alternate_type(self, name, info, ifcond, features, variants):
+        doc = self._cur_doc
+        self._add_doc('Alternate',
+                      [self._nodes_for_members(doc, 'Members'),
+                       self._nodes_for_features(doc),
+                       self._nodes_for_sections(doc, ifcond)])
+
+    def visit_command(self, name, info, ifcond, features, arg_type,
+                      ret_type, gen, success_response, boxed, allow_oob,
+                      allow_preconfig):
+        doc = self._cur_doc
+        self._add_doc('Command',
+                      [self._nodes_for_arguments(doc,
+                                                 arg_type if boxed else None),
+                       self._nodes_for_features(doc),
+                       self._nodes_for_sections(doc, ifcond)])
+
+    def visit_event(self, name, info, ifcond, features, arg_type, boxed):
+        doc = self._cur_doc
+        self._add_doc('Event',
+                      [self._nodes_for_arguments(doc,
+                                                 arg_type if boxed else None),
+                       self._nodes_for_features(doc),
+                       self._nodes_for_sections(doc, ifcond)])
+
+    def symbol(self, doc, entity):
+        """Add documentation for one symbol to the document tree
+
+        This is the main entry point which causes us to add documentation
+        nodes for a symbol (which could be a 'command', 'object', 'event',
+        etc). We do this by calling 'visit' on the schema entity, which
+        will then call back into one of our visit_* methods, depending
+        on what kind of thing this symbol is.
+        """
+        self._cur_doc = doc
+        entity.visit(self)
+        self._cur_doc = None
+
+    def _start_new_heading(self, heading, level):
+        """Start a new heading at the specified heading level
+
+        Create a new section whose title is 'heading' and which is placed
+        in the docutils node tree as a child of the most recent level-1
+        heading. Subsequent document sections (commands, freeform doc chunks,
+        etc) will be placed as children of this new heading section.
+        """
+        if len(self._active_headings) < level:
+            self._serror('Level %d subheading found outside a level %d heading'
+                         % (level, level - 1))
+        snode = self._make_section(heading)
+        self._active_headings[level - 1] += snode
+        self._active_headings = self._active_headings[:level]
+        self._active_headings.append(snode)
+
+    def _add_node_to_current_heading(self, node):
+        """Add the node to whatever the current active heading is"""
+        self._active_headings[-1] += node
+
+    def freeform(self, doc):
+        """Add a piece of 'freeform' documentation to the document tree
+
+        A 'freeform' document chunk doesn't relate to any particular
+        symbol (for instance, it could be an introduction).
+
+        As a special case, if the freeform document is a single line
+        of the form '= Heading text' it is treated as a section or subsection
+        heading, with the heading level indicated by the number of '=' signs.
+        """
+
+        # QAPIDoc documentation says free-form documentation blocks
+        # must have only a body section, nothing else.
+        assert not doc.sections
+        assert not doc.args
+        assert not doc.features
+        self._cur_doc = doc
+
+        if re.match(r'=+ ', doc.body.text):
+            # Section or subsection heading: must be the only thing in the block
+            (heading, _, rest) = doc.body.text.partition('\n')
+            if rest != '':
+                raise ExtensionError('%s line %s: section or subsection heading'
+                                     ' must be in its own doc comment block'
+                                     % (doc.info.fname, doc.info.line))
+            (leader, _, heading) = heading.partition(' ')
+            self._start_new_heading(heading, len(leader))
+            return
+
+        node = self._make_section(None)
+        self._parse_text_into_node(doc.body.text, node)
+        self._add_node_to_current_heading(node)
+        self._cur_doc = None
+
+    def _parse_text_into_node(self, doctext, node):
+        """Parse a chunk of QAPI-doc-format text into the node
+
+        The doc comment can contain most inline rST markup, including
+        bulleted and enumerated lists.
+        As an extra permitted piece of markup, @var will be turned
+        into ``var``.
+        """
+
+        # Handle the "@var means ``var`` case
+        doctext = re.sub(r'@([\w-]+)', r'``\1``', doctext)
+
+        rstlist = ViewList()
+        for line in doctext.splitlines():
+            # The reported line number will always be that of the start line
+            # of the doc comment, rather than the actual location of the error.
+            # Being more precise would require overhaul of the QAPIDoc class
+            # to track lines more exactly within all the sub-parts of the doc
+            # comment, as well as counting lines here.
+            rstlist.append(line, self._cur_doc.info.fname,
+                           self._cur_doc.info.line)
+        self._sphinx_directive.do_parse(rstlist, node)
+
+    def get_document_nodes(self):
+        """Return the list of docutils nodes which make up the document"""
+        return self._top_node.children
+
+class QAPIDocDirective(Directive):
+    """Extract documentation from the specified QAPI .json file"""
+    required_argument = 1
+    optional_arguments = 1
+    option_spec = {
+        'qapifile': directives.unchanged_required
+    }
+    has_content = False
+
+    def new_serialno(self):
+        """Return a unique new ID string suitable for use as a node's ID"""
+        env = self.state.document.settings.env
+        return 'qapidoc-%d' % env.new_serialno('qapidoc')
+
+    def run(self):
+        env = self.state.document.settings.env
+        qapifile = env.config.qapidoc_srctree + '/' + self.arguments[0]
+
+        # Tell sphinx of the dependency
+        env.note_dependency(os.path.abspath(qapifile))
+
+        try:
+            schema = QAPISchema(qapifile)
+        except QAPIError as err:
+            # Launder QAPI parse errors into Sphinx extension errors
+            # so they are displayed nicely to the user
+            raise ExtensionError(str(err))
+
+        vis = QAPISchemaGenRSTVisitor(self)
+        vis.visit_begin(schema)
+        for doc in schema.docs:
+            if doc.symbol:
+                vis.symbol(doc, schema.lookup_entity(doc.symbol))
+            else:
+                vis.freeform(doc)
+
+        return vis.get_document_nodes()
+
+    def do_parse(self, rstlist, node):
+        """Parse rST source lines and add them to the specified node
+
+        Take the list of rST source lines rstlist, parse them as
+        rST, and add the resulting docutils nodes as children of node.
+        The nodes are parsed in a way that allows them to include
+        subheadings (titles) without confusing the rendering of
+        anything else.
+        """
+        # This is from kerneldoc.py -- it works around an API change in
+        # Sphinx between 1.6 and 1.7. Unlike kerneldoc.py, we use
+        # sphinx.util.nodes.nested_parse_with_titles() rather than the
+        # plain self.state.nested_parse(), and so we can drop the saving
+        # of title_styles and section_level that kerneldoc.py does,
+        # because nested_parse_with_titles() does that for us.
+        if Use_SSI:
+            with switch_source_input(self.state, rstlist):
+                nested_parse_with_titles(self.state, rstlist, node)
+        else:
+            save = self.state.memo.reporter
+            self.state.memo.reporter = AutodocReporter(rstlist,
+                                                       self.state.memo.reporter)
+            try:
+                nested_parse_with_titles(self.state, rstlist, node)
+            finally:
+                self.state.memo.reporter = save
+
+def setup(app):
+    """ Register qapi-doc directive with Sphinx"""
+    app.add_config_value('qapidoc_srctree', None, 'env')
+    app.add_directive('qapi-doc', QAPIDocDirective)
+
+    return dict(
+        version=__version__,
+        parallel_read_safe=True,
+        parallel_write_safe=True
+    )
diff --git a/MAINTAINERS b/MAINTAINERS
index 0886eb3d2b5..fdd38f412d7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3116,6 +3116,7 @@  M: Peter Maydell <peter.maydell@linaro.org>
 S: Maintained
 F: docs/conf.py
 F: docs/*/conf.py
+F: docs/sphinx/
 
 Miscellaneous
 -------------