Errors and Edge Cases¶
There are 3 possible ways that parsing can fail. All of them raise subclasses of
ParseError. All instances
ParseError contain a
message, and fields
end_index indicating where in
the token stream the error is occurring.
When the parser can find no possible parse for a token stream, it raises
NoParseError. The location will be the
furthest the parse got before there were no possible parses. Additional fields are included:
encountered- The token we failed at, or
Noneif the end of the stream was reached.
expected_terminals- All the terminals that were evaluated against encountered (and failed)
expected- Similar to
expected_terminals, except the parser includes any non-terminals that could have started at the encountered token, and hides any terminals or non-terminals that are implicitly covered another one. This is usually a higher level summary of what was expected at any given point.
Note, you can override method
ParseRuleSet.is_anonymous() to return true for some heads. Any anonymous rule
will never be eligible to appear in
expected. This is useful if you are transforming or generating the grammar,
and some rules don’t make sense to report.
Ambiguous parses are not an immediate error. The
parse function simply returns a forest which contains all
possible parse trees. However, if you call
ParseForest.single and there is ambiguity, then
will be thrown. It will indicate the earliest possible ambiguity, but there may be others. The error will contain a
values containing the possible alternatives parses. However, it only contains a subtree of the full parse
tree, and additionally, it may be only halfway through building a rule, so the subtree may be missing elements.
These limitations ensure that
values is a short list. It is recommended you do not use
you need more detail on ambiguity.
In some obscure grammars it is possible to define rules that have an infinite number of possible parses. Here is a simple example:
ParseRule("s", [NT("s")] ParseRule("s", [T("word")]
["word"], all the following are valid parse trees:
(s: word) (s: (s: word)) (s: (s: (s: word))) ...
It’s possible to improve support for infinite parses if there is demand. Let me know.