I'm trying to use tree-sitter-rst to parse numpydoc docstrings, which are based on rst but not a strict subset (see Carreau/velin#36).
While parsing, I noticed that this:
See Also
--------
item : description
Notes
-----
Some text.
will consume anything after the incomplete definition list item as part of a definition list:
(document (section (title)) (ERROR (classifier)))
where the definition list item consumes everything afterwards and dumps it into the classifier.
Instead, I would have expected an error node that contains only the actual term and classifier, with everything after it parsed as usual (in other words, I'd like tree-sitter to prefer inserting a missing token over consuming more tokens in this case).
Do you think there is anything that can be changed in this library to get this to work (in other words, is this a bug, either in tree-sitter-rst or in upstream tree-sitter)? Or would you rather recommend a derived grammar that is specific to numpydoc (if that's possible)?
numpydoc makes use of docutils, so I guess it expects the same behavior (not sure though, I'm no expert on that code base). Edit: It appears that numpydoc splits the document (docstring) into sections and parses the content of each one separately. So there is no involvement of docutils or any other parsing library, just a bunch of regular expressions. This means that it also does not try to classify content as paragraphs or definition lists.
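To illustrate the section-by-section approach described above, here is a minimal sketch of splitting a docstring on underlined headers with a regular expression. This is not numpydoc's actual code: the `HEADER` pattern and the `split_sections` helper are hypothetical names made up for this example, and the pattern assumes every section title is followed by a line of at least three dashes.

```python
import re

# Hypothetical pattern (an assumption, not taken from numpydoc):
# a non-empty title line followed by a line made only of dashes.
HEADER = re.compile(r"^(?P<title>\S.*)\n-{3,}[ \t]*$", re.MULTILINE)


def split_sections(docstring):
    """Split a numpydoc-style docstring into (title, body) pairs."""
    matches = list(HEADER.finditer(docstring))
    sections = []
    for i, match in enumerate(matches):
        # The body runs from the end of the underline to the next header,
        # or to the end of the docstring for the last section.
        if i + 1 < len(matches):
            end = matches[i + 1].start()
        else:
            end = len(docstring)
        body = docstring[match.end():end].strip("\n")
        sections.append((match.group("title").strip(), body))
    return sections


example = "See Also\n--------\nitem : description\nNotes\n-----\nSome text.\n"
sections = split_sections(example)
for title, body in sections:
    print(f"{title!r}: {body!r}")
```

Because a splitter like this only looks for headers, the missing blank line before `Notes` does not matter to it, whereas an rst grammar has to decide whether `item : description` starts a definition list.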
This might still be a duplicate of #20, but I also think that tree-sitter-rst can be a bit stricter than docutils (which appears to me to be very forgiving).