diff mbox series

[19/29] qapi/qapi-schema.json: Put headers in their own doc-comment blocks

Message ID 20200206173040.17337-20-peter.maydell@linaro.org
State Superseded
Headers show
Series Convert QAPI doc comments to generate rST instead of texinfo | expand

Commit Message

Peter Maydell Feb. 6, 2020, 5:30 p.m. UTC
Our current QAPI doc-comment markup allows section headers
(introduced with a leading '=' or '==') anywhere in any documentation
comment.  This works for texinfo because the texi generator simply
prints a texinfo heading directive at that point in the output
stream.  For rST generation, since we're assembling a tree of
docutils nodes, this is awkward because a new section implies
starting a new section node at the top level of the tree and
generating text into there.

New section headings in the middle of the documentation of a command
or event would be pretty nonsensical, and in fact we only ever output
new headings using 'freeform' doc comment blocks whose only content
is the single line of the heading, with two exceptions, which are in
the introductory freeform-doc-block at the top of
qapi/qapi-schema.json.

Split that doc-comment up so that the heading lines are in their own
doc-comment.  This will allow us to tighten the specification to
insist that heading lines are always standalone, rather than
requiring the rST document generator to look at every line in a doc
comment block and handle headings in odd places.

This change makes no difference to the generated texi.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

---
 qapi/qapi-schema.json | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

-- 
2.20.1

Comments

Markus Armbruster Feb. 7, 2020, 3:34 p.m. UTC | #1
Peter Maydell <peter.maydell@linaro.org> writes:

> Our current QAPI doc-comment markup allows section headers

> (introduced with a leading '=' or '==') anywhere in any documentation

> comment.  This works for texinfo because the texi generator simply

> prints a texinfo heading directive at that point in the output

> stream.  For rST generation, since we're assembling a tree of

> docutils nodes, this is awkward because a new section implies

> starting a new section node at the top level of the tree and

> generating text into there.

>

> New section headings in the middle of the documentation of a command

> or event would be pretty nonsensical, and in fact we only ever output

> new headings using 'freeform' doc comment blocks whose only content

> is the single line of the heading, with two exceptions, which are in

> the introductory freeform-doc-block at the top of

> qapi/qapi-schema.json.

>

> Split that doc-comment up so that the heading lines are in their own

> doc-comment.  This will allow us to tighten the specification to

> insist that heading lines are always standalone, rather than

> requiring the rST document generator to look at every line in a doc

> comment block and handle headings in odd places.

>

> This change makes no difference to the generated texi.

>

> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

> ---

>  qapi/qapi-schema.json | 12 +++++++++---

>  1 file changed, 9 insertions(+), 3 deletions(-)

>

> diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json

> index 9751b11f8f1..f7ba60a5d0b 100644

> --- a/qapi/qapi-schema.json

> +++ b/qapi/qapi-schema.json

> @@ -1,7 +1,9 @@

>  # -*- Mode: Python -*-

>  ##

>  # = Introduction

> -#

> +##

> +

> +##

>  # This document describes all commands currently supported by QMP.

>  #

>  # Most of the time their usage is exactly the same as in the user Monitor, this

> @@ -25,9 +27,13 @@

>  #

>  # Please, refer to the QMP specification (docs/interop/qmp-spec.txt) for

>  # detailed information on the Server command and response formats.

> -#

> +##

> +

> +##

>  # = Stability Considerations

> -#

> +##

> +

> +##

>  # The current QMP command set (described in this file) may be useful for a

>  # number of use cases, however it's limited and several commands have bad

>  # defined semantics, specially with regard to command completion.


I figure this is a minimally invasive patch to avoid complications in
your rST generator.  I'm afraid it sweeps the actual problem under the
rug, namely flaws in our parsing and representation of doc comments.

The doc comment parser doesn't recognize headings.  Instead, that's done
somewhere in the bowels of the Texinfo generator.  Works as long as the
input is "sane", happily generates invalid Texinfo otherwise, see
tests/qapi-schema/doc-bad-section.json.

The proper fix is to make the parser recognize headers in the places
where headers make sense, and reject them elsewhere.

But maybe we don't have to.  Do you plan to support full rST in doc
comments?  If yes, why have our own syntax for headings?  Why not leave
it to rST?  If no, do you plan to support a subset of rST?  If yes,
define it, please.  How will it be enforced?
Peter Maydell Feb. 7, 2020, 4:13 p.m. UTC | #2
On Fri, 7 Feb 2020 at 15:35, Markus Armbruster <armbru@redhat.com> wrote:
>

> Peter Maydell <peter.maydell@linaro.org> writes:

>

> > Our current QAPI doc-comment markup allows section headers

> > (introduced with a leading '=' or '==') anywhere in any documentation

> > comment.  This works for texinfo because the texi generator simply

> > prints a texinfo heading directive at that point in the output

> > stream.  For rST generation, since we're assembling a tree of

> > docutils nodes, this is awkward because a new section implies

> > starting a new section node at the top level of the tree and

> > generating text into there.

> >

> > New section headings in the middle of the documentation of a command

> > or event would be pretty nonsensical, and in fact we only ever output

> > new headings using 'freeform' doc comment blocks whose only content

> > is the single line of the heading, with two exceptions, which are in

> > the introductory freeform-doc-block at the top of

> > qapi/qapi-schema.json.

> >

> > Split that doc-comment up so that the heading lines are in their own

> > doc-comment.  This will allow us to tighten the specification to

> > insist that heading lines are always standalone, rather than

> > requiring the rST document generator to look at every line in a doc

> > comment block and handle headings in odd places.


> I figure this is a minimally invasive patch to avoid complications in

> your rST generator.  I'm afraid it sweeps the actual problem under the

> rug, namely flaws in our parsing and representation of doc comments.

>

> The doc comment parser doesn't recognize headings.  Instead, that's done

> somewhere in the bowels of the Texinfo generator.  Works as long as the

> input is "sane", happily generates invalid Texinfo otherwise, see

> tests/qapi-schema/doc-bad-section.json.

>

> The proper fix is to make the parser recognize headers in the places

> where headers make sense, and reject them elsewhere.

>

> But maybe we don't have to.  Do you plan to support full rST in doc

> comments?  If yes, why have our own syntax for headings?  Why not leave

> it to rST?  If no, do you plan to support a subset of rST?  If yes,

> define it, please.  How will it be enforced?


Doc comments do support full rST. However, (as the commit message
here notes), if you're generating a tree of docutils nodes and
one of them has a section heading in it then you'll get a result
that looks like this:

[root]
  - [ some section created by the script for a QAPI command ]
  - [ some section ]
      - [text nodes, etc going into this section]
      - [a section resulting from rST parsing the header inside the docstring]
  - [ next section created by the script for a QAPI command ]

(ie you'll have defined a subsection within whatever document
paragraph/section the current command is documenting, not
a new top-level subsection which subsequent commands will
become children of)

What you actually want is that the new header results in
a differently structured tree:
[root]
  - [ some section created by the script for a QAPI command ]
  - [ some section ]
      - [text nodes, etc going into this section]
  - [ a new top level section whose header is whatever this header is ]
     - [ next section created by the script is a child of that section ]
     - [ etc ]

There's no way to get that without actually noticing and handling
headings specially as being something entirely different from
a lump of documentation text. "A heading is a single-line special-case
of a freeform comment" happens to be the way we mark up headings
now in 99% of cases, so that's what I implemented. (The Sphinx
extension will complain if there's trailing junk lines after
a heading line at the beginning of a freeform comment block.
If you use '== something' in a line in the middle of a doc
comment, we'll just interpret that as rST source, which is to
say a couple of literal equals signs at the start of a line.)

thanks
-- PMM
Markus Armbruster Feb. 8, 2020, 2:10 p.m. UTC | #3
Peter Maydell <peter.maydell@linaro.org> writes:

> On Fri, 7 Feb 2020 at 15:35, Markus Armbruster <armbru@redhat.com> wrote:

>>

>> Peter Maydell <peter.maydell@linaro.org> writes:

>>

>> > Our current QAPI doc-comment markup allows section headers

>> > (introduced with a leading '=' or '==') anywhere in any documentation

>> > comment.  This works for texinfo because the texi generator simply

>> > prints a texinfo heading directive at that point in the output

>> > stream.  For rST generation, since we're assembling a tree of

>> > docutils nodes, this is awkward because a new section implies

>> > starting a new section node at the top level of the tree and

>> > generating text into there.

>> >

>> > New section headings in the middle of the documentation of a command

>> > or event would be pretty nonsensical, and in fact we only ever output

>> > new headings using 'freeform' doc comment blocks whose only content

>> > is the single line of the heading, with two exceptions, which are in

>> > the introductory freeform-doc-block at the top of

>> > qapi/qapi-schema.json.

>> >

>> > Split that doc-comment up so that the heading lines are in their own

>> > doc-comment.  This will allow us to tighten the specification to

>> > insist that heading lines are always standalone, rather than

>> > requiring the rST document generator to look at every line in a doc

>> > comment block and handle headings in odd places.

>

>> I figure this is a minimally invasive patch to avoid complications in

>> your rST generator.  I'm afraid it sweeps the actual problem under the

>> rug, namely flaws in our parsing and representation of doc comments.

>>

>> The doc comment parser doesn't recognize headings.  Instead, that's done

>> somewhere in the bowels of the Texinfo generator.  Works as long as the

>> input is "sane", happily generates invalid Texinfo otherwise, see

>> tests/qapi-schema/doc-bad-section.json.

>>

>> The proper fix is to make the parser recognize headers in the places

>> where headers make sense, and reject them elsewhere.

>>

>> But maybe we don't have to.  Do you plan to support full rST in doc

>> comments?  If yes, why have our own syntax for headings?  Why not leave

>> it to rST?  If no, do you plan to support a subset of rST?  If yes,

>> define it, please.  How will it be enforced?

>

> Doc comments do support full rST. However, (as the commit message

> here notes), if you're generating a tree of docutils nodes and

> one of them has a section heading in it then you'll get a result

> that looks like this:

>

> [root]

>   - [ some section created by the script for a QAPI command ]

>   - [ some section ]

>       - [text nodes, etc going into this section]

>       - [a section resulting from rST parsing the header inside the docstring]

>   - [ next section created by the script for a QAPI command ]

>

> (ie you'll have defined a subsection within whatever document

> paragraph/section the current command is documenting, not

> a new top-level subsection which subsequent commands will

> become children of)

>

> What you actually want is that the new header results in

> a differently structured tree:

> [root]

>   - [ some section created by the script for a QAPI command ]

>   - [ some section ]

>       - [text nodes, etc going into this section]

>   - [ a new top level section whose header is whatever this header is ]

>      - [ next section created by the script is a child of that section ]

>      - [ etc ]

>

> There's no way to get that without actually noticing and handling

> headings specially as being something entirely different from

> a lump of documentation text. "A heading is a single-line special-case

> of a freeform comment" happens to be the way we mark up headings

> now in 99% of cases, so that's what I implemented. (The Sphinx

> extension will complain if there's trailing junk lines after

> a heading line at the beginning of a freeform comment block.

> If you use '== something' in a line in the middle of a doc

> comment, we'll just interpret that as rST source, which is to

> say a couple of literal equals signs at the start of a line.)


A couple of remarks.

Silently passing a "# == something" line to rST for literal
(mis-)interpretation is not nice.  It's the kind of indifference that
led to the messes you cleaned up in PATCH 04 and 08.  If the '=' markup
is only valid in certain places, it should be rejected where it isn't.

By refusing to translate "# == something" to rST (silently or loudly,
doesn't matter), the first tree structure becomes impossible.  Except
when I do the translating myself: I can put an *rST* section wherever I
want.

I'm still having difficulties understanding what exactly we gain by
translating '=' markup to rST.

By the way, your implementation rejects

    ##
    # = Introduction
    # xxx
    ##

but silently accepts

    ##
    # xxx
    # = Introduction
    ##

doc-good.json has more instances of this issue.  Before your series, we
actually check we generate the Texinfo we expect for it.  I can't find
where you cover this now.  It has saved me from my screwups more than
once, so I don't want to lose that.

Now let's put my doubts and your possible bugs / omissions aside and
assume we want '=' markup, and we want to keep the resulting sections
out of "sections created by the script for a QAPI command".

A schema's documentation is a sequence of comment blocks.

Each comment block is either a definition comment block or a free form
comment block.

Before your series, we recognize '=' markup everywhere, but that's
basically wrong (see "flaws in our parsing and representation of doc
comments" above).  It should be accepted only in free-form comment
blocks.

That way, the free-form comment blocks build a section structure, and
the definition comment blocks slot their stuff into this structure.

Form a language design perspective, I can't see the need for restricting
'=' further to occur only by themselves.

Is it an issue of implementation perhaps?
Peter Maydell Feb. 8, 2020, 2:43 p.m. UTC | #4
On Sat, 8 Feb 2020 at 14:10, Markus Armbruster <armbru@redhat.com> wrote:
> A couple of remarks.

>

> Silently passing a "# == something" line to rST for literal

> (mis-)interpretation is not nice.  It's the kind of indifference that

> led to the messes you cleaned up in PATCH 04 and 08.  If the '=' markup

> is only valid in certain places, it should be rejected where it isn't.

>

> By refusing to translate "# == something" to rST (silently or loudly,

> doesn't matter), the first tree structure becomes impossible.  Except

> when I do the translating myself: I can put an *rST* section wherever I

> want.

>

> I'm still having difficulties understanding what exactly we gain by

> translating '=' markup to rST.


We don't translate '=' markup to rST. We use it as the way
the input document tells us about the structure of the
tree of docutils sections it wants us to create. No rST
markup is ever involved in that process.

If we wanted to do something with '# == something' in actual
chunks of doc comment, we would have to specifically scan
the doc comment for that. Currently we simply pass doc
comments to the rST parser to be parsed. Then you have
to document the syntax of a doc comment as "rST, except
that as a special case a line starting == doesn't do what
it does in normal rST text but is a syntax error".

What I would like is:
 * doc comments are simply rST, and we interpret them as
   rST by passing them directly to the rST parser. (We do
   special case the "@var means ``var``" markup, but I
   would rather keep to a minimum the number of things we
   special case in that way.)
 * headings and subheadings affect the entire document
   structure; they are not rST source and are never
   interpreted as rST source or sent to the rST parser.
   They are a thing all of their own, not a bit of markup
   within a larger doc comment.
 * not to gratuitously change the way we in practice
   mark up headings in our document, because that makes
   the texi->rST transition harder for no clear gain.
   If we absolutely must mark them up in some other way
   than we do today, I can implement that, but it feels
   like unnecessary work.

> By the way, your implementation rejects

>

>     ##

>     # = Introduction

>     # xxx

>     ##

>

> but silently accepts

>

>     ##

>     # xxx

>     # = Introduction

>     ##


Yes. I could treat all freeform comments that aren't
the special one-line heading/subheading as being normal
freeform comments, which would avoid this inconsistency.
That's probably better.

> doc-good.json has more instances of this issue.  Before your series, we

> actually check we generate the Texinfo we expect for it.  I can't find

> where you cover this now.  It has saved me from my screwups more than

> once, so I don't want to lose that.


Yes, we no longer have a test case for "do we generate what we
expect to". That's harder to do with a Sphinx extension because
we don't ever output rST source anywhere that we could compare
to an expected version. (We still have the existing tests that
the doc comments spit out by parser.py match expected output.)
I'm dubious about running Sphinx and comparing the generated HTML
because that seems like it would be vulnerable to test failures
if Sphinx internals change the fine detail of how it outputs HTML.

> Now let's put my doubts and your possible bugs / omissions aside and

> assume we want '=' markup, and we want to keep the resulting sections

> out of "sections created by the script for a QAPI command".

>

> A schema's documentation is a sequence of comment blocks.

>

> Each comment block is either a definition comment block or a free form

> comment block.

>

> Before your series, we recognize '=' markup everywhere, but that's

> basically wrong (see "flaws in our parsing and representation of doc

> comments" above).  It should be accepted only in free-form comment

> blocks.

>

> That way, the free-form comment blocks build a section structure, and

> the definition comment blocks slot their stuff into this structure.

>

> Form a language design perspective, I can't see the need for restricting

> '=' further to occur only by themselves.

>

> Is it an issue of implementation perhaps?


Do you mean: could we make the implementation take a freeform
comment block like:

##
# = Foo
#
# Some text
#
# = Bar
#
# More text
##

and parse through that text to split it up into

- "start new level-1 section with heading 'Foo'"
- "a freeform comment block '\nSome text\n\n'" (for current section)
- "start new level-1 section with heading 'Bar'"
- "a freeform comment block '\nMore text\n\n'" (for current section)

so that the input rST doesn't need to mark off the headings
as being in their own freeform comment block ?

Yes, we could do that, and it wouldn't be too hard.
My issues with that are:

(1) you now don't have a way to literally write '= Foo'
to be interpreted the way that rST would interpret it
in a comment block that's otherwise just "any
rST markup is fine".

(2) it falsely suggests that headings are OK in
doc comment blocks and are just another kind of
markup within a doc comment, when they aren't, and
we're really just providing syntactic sugar here

(3) it obscures the actual structure of the
document, which is

 [root node]
  [section]
     [title 'Foo']
     [Text 'Some text']
  [section]
     [title 'Bar']
     [Text 'More text']

where "Some text" and "More text" are in entirely
different sections, even though they're in the same
doc comment block. (If 'Foo' is a subsection and
'Bar' a section, the text can end up even further
separated within the document tree.)

(4) it breaks a general approach within the handling
of doc comments, which is that we build the document
based on the various bits of information about the
symbol, etc, but a chunk of text from a document
comment is handled simply by handing it off to
the rST parser, not by doing some script-specific
pre-parsing and mangling of it and then handing it
to the rST parser. This in turn means that we need
to document the syntax of comment blocks as not
just "it's rST with the @var-means-``var`` extra"
but "it's rST, with @var-means-``var``, and = Foo
is invalid most places but within a freeform block
means this other thing". I think that minimising
the number of extras we add on top of "the syntax
for a block of text in a doc comment is rST"
makes things less confusing.

thanks
-- PMM
diff mbox series

Patch

diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index 9751b11f8f1..f7ba60a5d0b 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -1,7 +1,9 @@ 
 # -*- Mode: Python -*-
 ##
 # = Introduction
-#
+##
+
+##
 # This document describes all commands currently supported by QMP.
 #
 # Most of the time their usage is exactly the same as in the user Monitor, this
@@ -25,9 +27,13 @@ 
 #
 # Please, refer to the QMP specification (docs/interop/qmp-spec.txt) for
 # detailed information on the Server command and response formats.
-#
+##
+
+##
 # = Stability Considerations
-#
+##
+
+##
 # The current QMP command set (described in this file) may be useful for a
 # number of use cases, however it's limited and several commands have bad
 # defined semantics, specially with regard to command completion.