Remove one local copy of the Python antlr3 module

* Remove the directory thirdparty/antlr3/
* Modify the antlr3 symbolic link to point to
  thirdparty/antlr3-antlr-3.5/runtime/Python/antlr3/

Change-Id: I8104b7352e96d8e282da4e5bd8ff4fb4817aaa32
This commit is contained in:
parent 41d26396d6
commit 5836bf07e0
@@ -1,159 +0,0 @@
""" @package antlr3
@brief ANTLR3 runtime package

This module contains all support classes, which are needed to use recognizers
generated by ANTLR3.

@mainpage

\note Please be warned that the line numbers in the API documentation do not
match the real locations in the source code of the package. This is an
unintended artifact of doxygen, which I could only convince to use the
correct module names by concatenating all files from the package into a single
module file...

Here is a little overview over the most commonly used classes provided by
this runtime:

@section recognizers Recognizers

These recognizers are baseclasses for the code which is generated by ANTLR3.

- BaseRecognizer: Base class with common recognizer functionality.
- Lexer: Base class for lexers.
- Parser: Base class for parsers.
- tree.TreeParser: Base class for %tree parser.

@section streams Streams

Each recognizer pulls its input from one of the stream classes below. Streams
handle stuff like buffering, look-ahead and seeking.

A character stream is usually the first element in the pipeline of a typical
ANTLR3 application. It is used as the input for a Lexer.

- ANTLRStringStream: Reads from a string object. The input should be a unicode
  object, or ANTLR3 will have trouble decoding non-ascii data.
- ANTLRFileStream: Opens a file and reads the contents, with optional character
  decoding.
- ANTLRInputStream: Reads the data from a file-like object, with optional
  character decoding.

A Parser needs a TokenStream as input (which in turn is usually fed by a
Lexer):

- CommonTokenStream: A basic and most commonly used TokenStream
  implementation.
- TokenRewriteStream: A modification of CommonTokenStream that allows the
  stream to be altered (by the Parser). See the 'tweak' example for a usecase.

And tree.TreeParser finally fetches its input from a tree.TreeNodeStream:

- tree.CommonTreeNodeStream: A basic and most commonly used tree.TreeNodeStream
  implementation.


@section tokenstrees Tokens and Trees

A Lexer emits Token objects which are usually buffered by a TokenStream. A
Parser can build a Tree, if the output=AST option has been set in the grammar.

The runtime provides these Token implementations:

- CommonToken: A basic and most commonly used Token implementation.
- ClassicToken: A Token object as used in ANTLR 2.x, used for %tree
  construction.

Tree objects are wrappers for Token objects.

- tree.CommonTree: A basic and most commonly used Tree implementation.

A tree.TreeAdaptor is used by the parser to create tree.Tree objects for the
input Token objects.

- tree.CommonTreeAdaptor: A basic and most commonly used tree.TreeAdaptor
  implementation.


@section Exceptions

A RecognitionException is generated when a recognizer encounters incorrect
or unexpected input.

- RecognitionException
  - MismatchedRangeException
  - MismatchedSetException
    - MismatchedNotSetException
    .
  - MismatchedTokenException
  - MismatchedTreeNodeException
  - NoViableAltException
  - EarlyExitException
  - FailedPredicateException
  .
.

A tree.RewriteCardinalityException is raised, when the parser hits a
cardinality mismatch during AST construction. Although this is basically a
bug in your grammar, it can only be detected at runtime.

- tree.RewriteCardinalityException
  - tree.RewriteEarlyExitException
  - tree.RewriteEmptyStreamException
  .
.

"""

# tree.RewriteRuleElementStream
# tree.RewriteRuleSubtreeStream
# tree.RewriteRuleTokenStream
# CharStream
# DFA
# TokenSource

# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
#    derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

import os
import sys

__version__ = '3.4'

# This runtime is compatible with generated parsers using the following
# API versions. 'HEAD' is only used by unittests.
compatible_api_versions = ['HEAD', 1]

top_dir = os.path.normpath(os.path.join(os.path.abspath(__file__),
                                        os.pardir))
sys.path.append(top_dir)

from antlr3.constants import *
from antlr3.dfa import *
from antlr3.exceptions import *
from antlr3.recognizers import *
from antlr3.streams import *
from antlr3.tokens import *
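The `top_dir` expression in the deleted `__init__.py` above (absolute path of `__file__`, joined with `os.pardir`, normalized) resolves to the directory containing the module. A minimal standalone sketch of that expression, with a hypothetical path for illustration:

```python
import os

def package_dir(module_file):
    # Mirror the expression from the deleted __init__.py: the absolute
    # path of the module file, joined with os.pardir and normalized,
    # yields the directory that contains the module.
    return os.path.normpath(os.path.join(os.path.abspath(module_file),
                                         os.pardir))

print(package_dir("/tmp/antlr3/__init__.py"))  # -> /tmp/antlr3 (on POSIX)
```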
@@ -1,48 +0,0 @@
"""Compatibility stuff"""

# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
#    derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]

try:
    set = set
    frozenset = frozenset
except NameError:
    from sets import Set as set, ImmutableSet as frozenset


try:
    reversed = reversed
except NameError:
    def reversed(l):
        l = l[:]
        l.reverse()
        return l
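The compat shim above backfills `reversed` for pre-2.4 Pythons by copying the list and reversing the copy in place, so the caller's list is untouched. A standalone sketch of that fallback:

```python
def reversed_fallback(l):
    # Copy first, then reverse the copy in place, exactly as in the
    # compat shim above; the caller's list is left unmodified.
    l = l[:]
    l.reverse()
    return l

print(reversed_fallback([1, 2, 3]))  # -> [3, 2, 1]
```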
@@ -1,57 +0,0 @@
"""ANTLR3 runtime package"""

# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
#    derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]

EOF = -1

## All tokens go to the parser (unless skip() is called in that rule)
# on a particular "channel". The parser tunes to a particular channel
# so that whitespace etc... can go to the parser on a "hidden" channel.
DEFAULT_CHANNEL = 0

## Anything on a different channel than DEFAULT_CHANNEL is not parsed
# by the parser.
HIDDEN_CHANNEL = 99

# Predefined token types
EOR_TOKEN_TYPE = 1

##
# imaginary tree navigation type; traverse "get child" link
DOWN = 2
##
# imaginary tree navigation type; finish with a child list
UP = 3

MIN_TOKEN_TYPE = UP + 1

INVALID_TOKEN_TYPE = 0
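The channel constants above implement token filtering: the parser only sees tokens on `DEFAULT_CHANNEL`, while whitespace and comments are typically emitted on `HIDDEN_CHANNEL` and skipped. A minimal sketch of that idea, using plain (text, channel) tuples as hypothetical stand-ins for Token objects:

```python
DEFAULT_CHANNEL = 0
HIDDEN_CHANNEL = 99

def on_default_channel(tokens):
    # Keep only tokens the parser would actually consume; hidden-channel
    # tokens (whitespace, comments) are skipped, as described above.
    return [text for (text, channel) in tokens if channel == DEFAULT_CHANNEL]

tokens = [("x", DEFAULT_CHANNEL), (" ", HIDDEN_CHANNEL),
          ("=", DEFAULT_CHANNEL), (" ", HIDDEN_CHANNEL),
          ("1", DEFAULT_CHANNEL)]
print(on_default_channel(tokens))  # -> ['x', '=', '1']
```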
File diff suppressed because it is too large
@@ -1,213 +0,0 @@
"""ANTLR3 runtime package"""

# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
#    derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]

from antlr3.constants import EOF
from antlr3.exceptions import NoViableAltException, BacktrackingFailed


class DFA(object):
    """@brief A DFA implemented as a set of transition tables.

    Any state that has a semantic predicate edge is special; those states
    are generated with if-then-else structures in a specialStateTransition()
    which is generated by cyclicDFA template.

    """

    def __init__(
        self,
        recognizer, decisionNumber,
        eot, eof, min, max, accept, special, transition
        ):
        ## Which recognizer encloses this DFA? Needed to check backtracking
        self.recognizer = recognizer

        self.decisionNumber = decisionNumber
        self.eot = eot
        self.eof = eof
        self.min = min
        self.max = max
        self.accept = accept
        self.special = special
        self.transition = transition


    def predict(self, input):
        """
        From the input stream, predict what alternative will succeed
        using this DFA (representing the covering regular approximation
        to the underlying CFL). Return an alternative number 1..n. Throw
        an exception upon error.
        """
        mark = input.mark()
        s = 0  # we always start at s0
        try:
            for _ in xrange(50000):
                #print "***Current state = %d" % s

                specialState = self.special[s]
                if specialState >= 0:
                    #print "is special"
                    s = self.specialStateTransition(specialState, input)
                    if s == -1:
                        self.noViableAlt(s, input)
                        return 0
                    input.consume()
                    continue

                if self.accept[s] >= 1:
                    #print "accept state for alt %d" % self.accept[s]
                    return self.accept[s]

                # look for a normal char transition
                c = input.LA(1)

                #print "LA = %d (%r)" % (c, unichr(c) if c >= 0 else 'EOF')
                #print "range = %d..%d" % (self.min[s], self.max[s])

                if c >= self.min[s] and c <= self.max[s]:
                    # move to next state
                    snext = self.transition[s][c-self.min[s]]
                    #print "in range, next state = %d" % snext

                    if snext < 0:
                        #print "not a normal transition"
                        # was in range but not a normal transition
                        # must check EOT, which is like the else clause.
                        # eot[s]>=0 indicates that an EOT edge goes to another
                        # state.
                        if self.eot[s] >= 0:  # EOT Transition to accept state?
                            #print "EOT trans to accept state %d" % self.eot[s]

                            s = self.eot[s]
                            input.consume()
                            # TODO: I had this as return accept[eot[s]]
                            # which assumed here that the EOT edge always
                            # went to an accept...faster to do this, but
                            # what about predicated edges coming from EOT
                            # target?
                            continue

                        #print "no viable alt"
                        self.noViableAlt(s, input)
                        return 0

                    s = snext
                    input.consume()
                    continue

                if self.eot[s] >= 0:
                    #print "EOT to %d" % self.eot[s]

                    s = self.eot[s]
                    input.consume()
                    continue

                # EOF Transition to accept state?
                if c == EOF and self.eof[s] >= 0:
                    #print "EOF Transition to accept state %d" \
                    #  % self.accept[self.eof[s]]
                    return self.accept[self.eof[s]]

                # not in range and not EOF/EOT, must be invalid symbol
                self.noViableAlt(s, input)
                return 0

            else:
                raise RuntimeError("DFA bang!")

        finally:
            input.rewind(mark)


    def noViableAlt(self, s, input):
        if self.recognizer._state.backtracking > 0:
            raise BacktrackingFailed

        nvae = NoViableAltException(
            self.getDescription(),
            self.decisionNumber,
            s,
            input
            )

        self.error(nvae)
        raise nvae


    def error(self, nvae):
        """A hook for debugging interface"""
        pass


    def specialStateTransition(self, s, input):
        return -1


    def getDescription(self):
        return "n/a"


##     def specialTransition(self, state, symbol):
##         return 0


    def unpack(cls, string):
        """@brief Unpack the runlength encoded table data.

        Terence implemented packed table initializers, because Java has a
        size restriction on .class files and the lookup tables can grow
        pretty large. The generated JavaLexer.java of the Java.g example
        would be about 15MB with uncompressed array initializers.

        Python does not have any size restrictions, but the compilation of
        such large source files seems to be pretty memory hungry. The memory
        consumption of the python process grew to >1.5GB when importing a
        15MB lexer, eating all my swap space, and I was too impatient to see
        if it could finish at all. With packed initializers that are unpacked
        at import time of the lexer module, everything works like a charm.

        """

        ret = []
        for i in range(len(string) / 2):
            (n, v) = ord(string[i*2]), ord(string[i*2+1])

            # Is there a bitwise operation to do this?
            if v == 0xFFFF:
                v = -1

            ret += [v] * n

        return ret

    unpack = classmethod(unpack)
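The `DFA.unpack` method above run-length decodes (count, value) pairs, with 0xFFFF standing in for the -1 "no transition" marker. A standalone Python 3 sketch of the same decoding, taking a list of code-point integers instead of calling `ord()` on a packed string:

```python
def unpack(packed):
    # Run-length decode (count, value) pairs, as DFA.unpack above does.
    # 0xFFFF encodes -1, since the packed string cells are unsigned.
    ret = []
    for i in range(len(packed) // 2):
        n, v = packed[i * 2], packed[i * 2 + 1]
        if v == 0xFFFF:
            v = -1
        ret += [v] * n
    return ret

# Three copies of 1, then two copies of -1:
print(unpack([3, 1, 2, 0xFFFF]))  # -> [1, 1, 1, -1, -1]
```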
@@ -1,210 +0,0 @@
""" @package antlr3.dottreegenerator
@brief ANTLR3 runtime package, tree module

This module contains all support classes for AST construction and tree parsers.

"""

# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
#    derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]

# lots of docstrings are missing, don't complain for now...
# pylint: disable-msg=C0111

from antlr3.tree import CommonTreeAdaptor
import stringtemplate3

class DOTTreeGenerator(object):
    """
    A utility class to generate DOT diagrams (graphviz) from
    arbitrary trees. You can pass in your own templates and
    can pass in any kind of tree or use Tree interface method.
    """

    _treeST = stringtemplate3.StringTemplate(
        template=(
            "digraph {\n" +
            "  ordering=out;\n" +
            "  ranksep=.4;\n" +
            "  node [shape=plaintext, fixedsize=true, fontsize=11, fontname=\"Courier\",\n" +
            "        width=.25, height=.25];\n" +
            "  edge [arrowsize=.5]\n" +
            "  $nodes$\n" +
            "  $edges$\n" +
            "}\n")
        )

    _nodeST = stringtemplate3.StringTemplate(
        template="$name$ [label=\"$text$\"];\n"
        )

    _edgeST = stringtemplate3.StringTemplate(
        template="$parent$ -> $child$ // \"$parentText$\" -> \"$childText$\"\n"
        )

    def __init__(self):
        ## Track node to number mapping so we can get proper node name back
        self.nodeToNumberMap = {}

        ## Track node number so we can get unique node names
        self.nodeNumber = 0


    def toDOT(self, tree, adaptor=None, treeST=_treeST, edgeST=_edgeST):
        if adaptor is None:
            adaptor = CommonTreeAdaptor()

        treeST = treeST.getInstanceOf()

        self.nodeNumber = 0
        self.toDOTDefineNodes(tree, adaptor, treeST)

        self.nodeNumber = 0
        self.toDOTDefineEdges(tree, adaptor, treeST, edgeST)
        return treeST


    def toDOTDefineNodes(self, tree, adaptor, treeST, knownNodes=None):
        if knownNodes is None:
            knownNodes = set()

        if tree is None:
            return

        n = adaptor.getChildCount(tree)
        if n == 0:
            # must have already dumped as child from previous
            # invocation; do nothing
            return

        # define parent node
        number = self.getNodeNumber(tree)
        if number not in knownNodes:
            parentNodeST = self.getNodeST(adaptor, tree)
            treeST.setAttribute("nodes", parentNodeST)
            knownNodes.add(number)

        # for each child, do a "<unique-name> [label=text]" node def
        for i in range(n):
            child = adaptor.getChild(tree, i)

            number = self.getNodeNumber(child)
            if number not in knownNodes:
                nodeST = self.getNodeST(adaptor, child)
                treeST.setAttribute("nodes", nodeST)
                knownNodes.add(number)

            self.toDOTDefineNodes(child, adaptor, treeST, knownNodes)


    def toDOTDefineEdges(self, tree, adaptor, treeST, edgeST):
        if tree is None:
            return

        n = adaptor.getChildCount(tree)
        if n == 0:
            # must have already dumped as child from previous
            # invocation; do nothing
            return

        parentName = "n%d" % self.getNodeNumber(tree)

        # for each child, do a parent -> child edge using unique node names
        parentText = adaptor.getText(tree)
        for i in range(n):
            child = adaptor.getChild(tree, i)
            childText = adaptor.getText(child)
            childName = "n%d" % self.getNodeNumber(child)
            edgeST = edgeST.getInstanceOf()
            edgeST.setAttribute("parent", parentName)
            edgeST.setAttribute("child", childName)
            edgeST.setAttribute("parentText", parentText)
            edgeST.setAttribute("childText", childText)
            treeST.setAttribute("edges", edgeST)
            self.toDOTDefineEdges(child, adaptor, treeST, edgeST)


    def getNodeST(self, adaptor, t):
        text = adaptor.getText(t)
        nodeST = self._nodeST.getInstanceOf()
        uniqueName = "n%d" % self.getNodeNumber(t)
        nodeST.setAttribute("name", uniqueName)
        if text is not None:
            text = text.replace('"', r'\"')
        nodeST.setAttribute("text", text)
        return nodeST


    def getNodeNumber(self, t):
        try:
            return self.nodeToNumberMap[t]
        except KeyError:
            self.nodeToNumberMap[t] = self.nodeNumber
            self.nodeNumber += 1
            return self.nodeNumber - 1


def toDOT(tree, adaptor=None, treeST=DOTTreeGenerator._treeST, edgeST=DOTTreeGenerator._edgeST):
    """
    Generate DOT (graphviz) for a whole tree not just a node.
    For example, 3+4*5 should generate:

    digraph {
        node [shape=plaintext, fixedsize=true, fontsize=11, fontname="Courier",
              width=.4, height=.2];
        edge [arrowsize=.7]
        "+" -> 3
        "+" -> "*"
        "*" -> 4
        "*" -> 5
    }

    Return the ST not a string in case people want to alter.

    Takes a Tree interface object.

    Example of invocation:

        import antlr3
        import antlr3.extras

        input = antlr3.ANTLRInputStream(sys.stdin)
        lex = TLexer(input)
        tokens = antlr3.CommonTokenStream(lex)
        parser = TParser(tokens)
        tree = parser.e().tree
        print tree.toStringTree()
        st = antlr3.extras.toDOT(t)
        print st

    """

    gen = DOTTreeGenerator()
    return gen.toDOT(tree, adaptor, treeST, edgeST)
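The DOTTreeGenerator above walks the tree twice, once emitting node definitions and once emitting edges, numbering each node on first sight. A self-contained sketch of the same recursion over hypothetical (text, children) tuples, collapsed into a single pass and without the stringtemplate3 dependency:

```python
def tree_to_dot(tree):
    # tree is a (text, [children]) nested tuple. Node definitions and
    # edges are emitted in one recursive walk; name() assigns each node
    # a unique "n<i>" name on first sight, like getNodeNumber() above.
    lines = ["digraph {"]
    names = {}

    def name(node):
        if id(node) not in names:
            names[id(node)] = "n%d" % len(names)
            lines.append('  %s [label="%s"];' % (names[id(node)], node[0]))
        return names[id(node)]

    def walk(node):
        parent = name(node)
        for child in node[1]:
            lines.append("  %s -> %s" % (parent, name(child)))
            walk(child)

    walk(tree)
    lines.append("}")
    return "\n".join(lines)

# The 3+4*5 example from the toDOT docstring above:
expr = ("+", [("3", []), ("*", [("4", []), ("5", [])])])
print(tree_to_dot(expr))
```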
@@ -1,364 +0,0 @@
"""ANTLR3 exception hierarchy"""

# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
#    derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]

from antlr3.constants import INVALID_TOKEN_TYPE


class BacktrackingFailed(Exception):
    """@brief Raised to signal failed backtrack attempt"""

    pass


class RecognitionException(Exception):
    """@brief The root of the ANTLR exception hierarchy.

    To avoid English-only error messages and to generally make things
    as flexible as possible, these exceptions are not created with strings,
    but rather the information necessary to generate an error. Then
    the various reporting methods in Parser and Lexer can be overridden
    to generate a localized error message. For example, MismatchedToken
    exceptions are built with the expected token type.
    So, don't expect getMessage() to return anything.

    Note that as of Java 1.4, you can access the stack trace, which means
    that you can compute the complete trace of rules from the start symbol.
    This gives you considerable context information with which to generate
    useful error messages.

    ANTLR generates code that throws exceptions upon recognition error and
    also generates code to catch these exceptions in each rule. If you
    want to quit upon first error, you can turn off the automatic error
    handling mechanism using rulecatch action, but you still need to
    override methods mismatch and recoverFromMismatchSet.

    In general, the recognition exceptions can track where in a grammar a
    problem occurred and/or what was the expected input. While the parser
    knows its state (such as current input symbol and line info) that
    state can change before the exception is reported, so the current token
    index is computed and stored at exception time. From this info, you can
    perhaps print an entire line of input not just a single token, for example.
    Better to just say the recognizer had a problem and then let the parser
    figure out a fancy report.

    """

    def __init__(self, input=None):
        Exception.__init__(self)

        # What input stream did the error occur in?
        self.input = None

        # What is index of token/char were we looking at when the error
        # occurred?
        self.index = None

        # The current Token when an error occurred. Since not all streams
        # can retrieve the ith Token, we have to track the Token object.
        # For parsers. Even when it's a tree parser, token might be set.
        self.token = None

        # If this is a tree parser exception, node is set to the node with
        # the problem.
        self.node = None

        # The current char when an error occurred. For lexers.
        self.c = None

        # Track the line at which the error occurred in case this is
        # generated from a lexer. We need to track this since the
        # unexpected char doesn't carry the line info.
        self.line = None

        self.charPositionInLine = None

        # If you are parsing a tree node stream, you will encounter some
        # imaginary nodes w/o line/col info. We now search backwards looking
        # for most recent token with line/col info, but notify getErrorHeader()
        # that info is approximate.
        self.approximateLineInfo = False


        if input is not None:
            self.input = input
            self.index = input.index()

            # late import to avoid cyclic dependencies
            from antlr3.streams import TokenStream, CharStream
            from antlr3.tree import TreeNodeStream

            if isinstance(self.input, TokenStream):
                self.token = self.input.LT(1)
                self.line = self.token.line
                self.charPositionInLine = self.token.charPositionInLine

            if isinstance(self.input, TreeNodeStream):
                self.extractInformationFromTreeNodeStream(self.input)

            else:
                if isinstance(self.input, CharStream):
                    self.c = self.input.LT(1)
                    self.line = self.input.line
                    self.charPositionInLine = self.input.charPositionInLine

                else:
                    self.c = self.input.LA(1)

    def extractInformationFromTreeNodeStream(self, nodes):
        from antlr3.tree import Tree, CommonTree
        from antlr3.tokens import CommonToken

        self.node = nodes.LT(1)
        adaptor = nodes.adaptor
        payload = adaptor.getToken(self.node)
        if payload is not None:
            self.token = payload
            if payload.line <= 0:
                # imaginary node; no line/pos info; scan backwards
                i = -1
                priorNode = nodes.LT(i)
                while priorNode is not None:
                    priorPayload = adaptor.getToken(priorNode)
                    if priorPayload is not None and priorPayload.line > 0:
                        # we found the most recent real line / pos info
                        self.line = priorPayload.line
                        self.charPositionInLine = priorPayload.charPositionInLine
                        self.approximateLineInfo = True
                        break

                    i -= 1
                    priorNode = nodes.LT(i)

            else:  # node created from real token
                self.line = payload.line
                self.charPositionInLine = payload.charPositionInLine

        elif isinstance(self.node, Tree):
            self.line = self.node.line
            self.charPositionInLine = self.node.charPositionInLine
            if isinstance(self.node, CommonTree):
                self.token = self.node.token

        else:
            type = adaptor.getType(self.node)
            text = adaptor.getText(self.node)
            self.token = CommonToken(type=type, text=text)


    def getUnexpectedType(self):
        """Return the token type or char of the unexpected input element"""

        from antlr3.streams import TokenStream
        from antlr3.tree import TreeNodeStream

        if isinstance(self.input, TokenStream):
            return self.token.type

        elif isinstance(self.input, TreeNodeStream):
            adaptor = self.input.treeAdaptor
            return adaptor.getType(self.node)

        else:
            return self.c

    unexpectedType = property(getUnexpectedType)


class MismatchedTokenException(RecognitionException):
    """@brief A mismatched char or Token or tree node."""

    def __init__(self, expecting, input):
        RecognitionException.__init__(self, input)
|
||||
self.expecting = expecting
|
||||
|
||||
|
||||
def __str__(self):
|
||||
#return "MismatchedTokenException("+self.expecting+")"
|
||||
return "MismatchedTokenException(%r!=%r)" % (
|
||||
self.getUnexpectedType(), self.expecting
|
||||
)
|
||||
__repr__ = __str__
|
||||
|
||||
|
||||
class UnwantedTokenException(MismatchedTokenException):
|
||||
"""An extra token while parsing a TokenStream"""
|
||||
|
||||
def getUnexpectedToken(self):
|
||||
return self.token
|
||||
|
||||
|
||||
def __str__(self):
|
||||
exp = ", expected %s" % self.expecting
|
||||
if self.expecting == INVALID_TOKEN_TYPE:
|
||||
exp = ""
|
||||
|
||||
if self.token is None:
|
||||
return "UnwantedTokenException(found=%s%s)" % (None, exp)
|
||||
|
||||
return "UnwantedTokenException(found=%s%s)" % (self.token.text, exp)
|
||||
__repr__ = __str__
|
||||
|
||||
|
||||
class MissingTokenException(MismatchedTokenException):
|
||||
"""
|
||||
We were expecting a token but it's not found. The current token
|
||||
is actually what we wanted next.
|
||||
"""
|
||||
|
||||
def __init__(self, expecting, input, inserted):
|
||||
MismatchedTokenException.__init__(self, expecting, input)
|
||||
|
||||
self.inserted = inserted
|
||||
|
||||
|
||||
def getMissingType(self):
|
||||
return self.expecting
|
||||
|
||||
|
||||
def __str__(self):
|
||||
if self.inserted is not None and self.token is not None:
|
||||
return "MissingTokenException(inserted %r at %r)" % (
|
||||
self.inserted, self.token.text)
|
||||
|
||||
if self.token is not None:
|
||||
return "MissingTokenException(at %r)" % self.token.text
|
||||
|
||||
return "MissingTokenException"
|
||||
__repr__ = __str__
|
||||
|
||||
|
||||
class MismatchedRangeException(RecognitionException):
|
||||
"""@brief The next token does not match a range of expected types."""
|
||||
|
||||
def __init__(self, a, b, input):
|
||||
RecognitionException.__init__(self, input)
|
||||
|
||||
self.a = a
|
||||
self.b = b
|
||||
|
||||
|
||||
def __str__(self):
|
||||
return "MismatchedRangeException(%r not in [%r..%r])" % (
|
||||
self.getUnexpectedType(), self.a, self.b
|
||||
)
|
||||
__repr__ = __str__
|
||||
|
||||
|
||||
class MismatchedSetException(RecognitionException):
|
||||
"""@brief The next token does not match a set of expected types."""
|
||||
|
||||
def __init__(self, expecting, input):
|
||||
RecognitionException.__init__(self, input)
|
||||
|
||||
self.expecting = expecting
|
||||
|
||||
|
||||
def __str__(self):
|
||||
return "MismatchedSetException(%r not in %r)" % (
|
||||
self.getUnexpectedType(), self.expecting
|
||||
)
|
||||
__repr__ = __str__
|
||||
|
||||
|
||||
class MismatchedNotSetException(MismatchedSetException):
|
||||
"""@brief Used for remote debugger deserialization"""
|
||||
|
||||
def __str__(self):
|
||||
return "MismatchedNotSetException(%r!=%r)" % (
|
||||
self.getUnexpectedType(), self.expecting
|
||||
)
|
||||
__repr__ = __str__
|
||||
|
||||
|
||||
class NoViableAltException(RecognitionException):
|
||||
"""@brief Unable to decide which alternative to choose."""
|
||||
|
||||
def __init__(
|
||||
self, grammarDecisionDescription, decisionNumber, stateNumber, input
|
||||
):
|
||||
RecognitionException.__init__(self, input)
|
||||
|
||||
self.grammarDecisionDescription = grammarDecisionDescription
|
||||
self.decisionNumber = decisionNumber
|
||||
self.stateNumber = stateNumber
|
||||
|
||||
|
||||
def __str__(self):
|
||||
return "NoViableAltException(%r!=[%r])" % (
|
||||
self.unexpectedType, self.grammarDecisionDescription
|
||||
)
|
||||
__repr__ = __str__
|
||||
|
||||
|
||||
class EarlyExitException(RecognitionException):
|
||||
"""@brief The recognizer did not match anything for a (..)+ loop."""
|
||||
|
||||
def __init__(self, decisionNumber, input):
|
||||
RecognitionException.__init__(self, input)
|
||||
|
||||
self.decisionNumber = decisionNumber
|
||||
|
||||
|
||||
class FailedPredicateException(RecognitionException):
|
||||
"""@brief A semantic predicate failed during validation.
|
||||
|
||||
Validation of predicates
|
||||
occurs when normally parsing the alternative just like matching a token.
|
||||
Disambiguating predicate evaluation occurs when we hoist a predicate into
|
||||
a prediction decision.
|
||||
"""
|
||||
|
||||
def __init__(self, input, ruleName, predicateText):
|
||||
RecognitionException.__init__(self, input)
|
||||
|
||||
self.ruleName = ruleName
|
||||
self.predicateText = predicateText
|
||||
|
||||
|
||||
def __str__(self):
|
||||
return "FailedPredicateException("+self.ruleName+",{"+self.predicateText+"}?)"
|
||||
__repr__ = __str__
|
||||
|
||||
|
||||
class MismatchedTreeNodeException(RecognitionException):
|
||||
"""@brief The next tree mode does not match the expected type."""
|
||||
|
||||
def __init__(self, expecting, input):
|
||||
RecognitionException.__init__(self, input)
|
||||
|
||||
self.expecting = expecting
|
||||
|
||||
def __str__(self):
|
||||
return "MismatchedTreeNodeException(%r!=%r)" % (
|
||||
self.getUnexpectedType(), self.expecting
|
||||
)
|
||||
__repr__ = __str__
|
|
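For context, the backward scan that extractInformationFromTreeNodeStream performs can be illustrated standalone: when the offending tree node is imaginary (line 0, no position info), walk earlier tokens until one carries a real line number, and flag the result as approximate. This sketch is not part of the removed file; the helper name and list representation are made up for illustration.

```python
def most_recent_line_info(lines, i):
    """Scan backwards from index i for the first entry with line > 0.

    Returns (line, approximate), where approximate is True because the
    info comes from an earlier token, or (None, False) if none is found.
    A value of 0 or None marks an imaginary node without position info.
    """
    while i >= 0:
        if lines[i] is not None and lines[i] > 0:
            return lines[i], True
        i -= 1
    return None, False

# The last two tokens are imaginary; the scan falls back to line 7.
lines = [3, 7, 0, 0]
print(most_recent_line_info(lines, len(lines) - 1))  # (7, True)
```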
@ -1,47 +0,0 @@
""" @package antlr3.dottreegenerator
|
||||
@brief ANTLR3 runtime package, tree module
|
||||
|
||||
This module contains all support classes for AST construction and tree parsers.
|
||||
|
||||
"""
|
||||
|
||||
# begin[licence]
|
||||
#
|
||||
# [The "BSD licence"]
|
||||
# Copyright (c) 2005-2008 Terence Parr
|
||||
# All rights reserved.
|
||||
#
|
||||
# Redistribution and use in source and binary forms, with or without
|
||||
# modification, are permitted provided that the following conditions
|
||||
# are met:
|
||||
# 1. Redistributions of source code must retain the above copyright
|
||||
# notice, this list of conditions and the following disclaimer.
|
||||
# 2. Redistributions in binary form must reproduce the above copyright
|
||||
# notice, this list of conditions and the following disclaimer in the
|
||||
# documentation and/or other materials provided with the distribution.
|
||||
# 3. The name of the author may not be used to endorse or promote products
|
||||
# derived from this software without specific prior written permission.
|
||||
#
|
||||
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
|
||||
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
|
||||
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
|
||||
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
|
||||
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
|
||||
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
|
||||
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
|
||||
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
||||
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
|
||||
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
#
|
||||
# end[licence]
|
||||
|
||||
# lot's of docstrings are missing, don't complain for now...
|
||||
# pylint: disable-msg=C0111
|
||||
|
||||
from treewizard import TreeWizard
|
||||
|
||||
try:
|
||||
from antlr3.dottreegen import toDOT
|
||||
except ImportError, exc:
|
||||
def toDOT(*args, **kwargs):
|
||||
raise exc
|
|
@ -1,305 +0,0 @@
"""ANTLR3 runtime package"""
|
||||
|
||||
# begin[licence]
|
||||
#
|
||||
# [The "BSD licence"]
|
||||
# Copyright (c) 2005-2008 Terence Parr
|
||||
# All rights reserved.
|
||||
#
|
||||
# Redistribution and use in source and binary forms, with or without
|
||||
# modification, are permitted provided that the following conditions
|
||||
# are met:
|
||||
# 1. Redistributions of source code must retain the above copyright
|
||||
# notice, this list of conditions and the following disclaimer.
|
||||
# 2. Redistributions in binary form must reproduce the above copyright
|
||||
# notice, this list of conditions and the following disclaimer in the
|
||||
# documentation and/or other materials provided with the distribution.
|
||||
# 3. The name of the author may not be used to endorse or promote products
|
||||
# derived from this software without specific prior written permission.
|
||||
#
|
||||
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
|
||||
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
|
||||
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
|
||||
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
|
||||
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
|
||||
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
|
||||
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
|
||||
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
||||
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
|
||||
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
#
|
||||
# end[licence]
|
||||
|
||||
|
||||
import sys
|
||||
import optparse
|
||||
|
||||
import antlr3
|
||||
|
||||
|
||||
class _Main(object):
|
||||
def __init__(self):
|
||||
self.stdin = sys.stdin
|
||||
self.stdout = sys.stdout
|
||||
self.stderr = sys.stderr
|
||||
|
||||
|
||||
def parseOptions(self, argv):
|
||||
optParser = optparse.OptionParser()
|
||||
optParser.add_option(
|
||||
"--encoding",
|
||||
action="store",
|
||||
type="string",
|
||||
dest="encoding"
|
||||
)
|
||||
optParser.add_option(
|
||||
"--input",
|
||||
action="store",
|
||||
type="string",
|
||||
dest="input"
|
||||
)
|
||||
optParser.add_option(
|
||||
"--interactive", "-i",
|
||||
action="store_true",
|
||||
dest="interactive"
|
||||
)
|
||||
optParser.add_option(
|
||||
"--no-output",
|
||||
action="store_true",
|
||||
dest="no_output"
|
||||
)
|
||||
optParser.add_option(
|
||||
"--profile",
|
||||
action="store_true",
|
||||
dest="profile"
|
||||
)
|
||||
optParser.add_option(
|
||||
"--hotshot",
|
||||
action="store_true",
|
||||
dest="hotshot"
|
||||
)
|
||||
optParser.add_option(
|
||||
"--port",
|
||||
type="int",
|
||||
dest="port",
|
||||
default=None
|
||||
)
|
||||
optParser.add_option(
|
||||
"--debug-socket",
|
||||
action='store_true',
|
||||
dest="debug_socket",
|
||||
default=None
|
||||
)
|
||||
|
||||
self.setupOptions(optParser)
|
||||
|
||||
return optParser.parse_args(argv[1:])
|
||||
|
||||
|
||||
def setupOptions(self, optParser):
|
||||
pass
|
||||
|
||||
|
||||
def execute(self, argv):
|
||||
options, args = self.parseOptions(argv)
|
||||
|
||||
self.setUp(options)
|
||||
|
||||
if options.interactive:
|
||||
while True:
|
||||
try:
|
||||
input = raw_input(">>> ")
|
||||
except (EOFError, KeyboardInterrupt):
|
||||
self.stdout.write("\nBye.\n")
|
||||
break
|
||||
|
||||
inStream = antlr3.ANTLRStringStream(input)
|
||||
self.parseStream(options, inStream)
|
||||
|
||||
else:
|
||||
if options.input is not None:
|
||||
inStream = antlr3.ANTLRStringStream(options.input)
|
||||
|
||||
elif len(args) == 1 and args[0] != '-':
|
||||
inStream = antlr3.ANTLRFileStream(
|
||||
args[0], encoding=options.encoding
|
||||
)
|
||||
|
||||
else:
|
||||
inStream = antlr3.ANTLRInputStream(
|
||||
self.stdin, encoding=options.encoding
|
||||
)
|
||||
|
||||
if options.profile:
|
||||
try:
|
||||
import cProfile as profile
|
||||
except ImportError:
|
||||
import profile
|
||||
|
||||
profile.runctx(
|
||||
'self.parseStream(options, inStream)',
|
||||
globals(),
|
||||
locals(),
|
||||
'profile.dat'
|
||||
)
|
||||
|
||||
import pstats
|
||||
stats = pstats.Stats('profile.dat')
|
||||
stats.strip_dirs()
|
||||
stats.sort_stats('time')
|
||||
stats.print_stats(100)
|
||||
|
||||
elif options.hotshot:
|
||||
import hotshot
|
||||
|
||||
profiler = hotshot.Profile('hotshot.dat')
|
||||
profiler.runctx(
|
||||
'self.parseStream(options, inStream)',
|
||||
globals(),
|
||||
locals()
|
||||
)
|
||||
|
||||
else:
|
||||
self.parseStream(options, inStream)
|
||||
|
||||
|
||||
def setUp(self, options):
|
||||
pass
|
||||
|
||||
|
||||
def parseStream(self, options, inStream):
|
||||
raise NotImplementedError
|
||||
|
||||
|
||||
def write(self, options, text):
|
||||
if not options.no_output:
|
||||
self.stdout.write(text)
|
||||
|
||||
|
||||
def writeln(self, options, text):
|
||||
self.write(options, text + '\n')
|
||||
|
||||
|
||||
class LexerMain(_Main):
|
||||
def __init__(self, lexerClass):
|
||||
_Main.__init__(self)
|
||||
|
||||
self.lexerClass = lexerClass
|
||||
|
||||
|
||||
def parseStream(self, options, inStream):
|
||||
lexer = self.lexerClass(inStream)
|
||||
for token in lexer:
|
||||
self.writeln(options, str(token))
|
||||
|
||||
|
||||
class ParserMain(_Main):
|
||||
def __init__(self, lexerClassName, parserClass):
|
||||
_Main.__init__(self)
|
||||
|
||||
self.lexerClassName = lexerClassName
|
||||
self.lexerClass = None
|
||||
self.parserClass = parserClass
|
||||
|
||||
|
||||
def setupOptions(self, optParser):
|
||||
optParser.add_option(
|
||||
"--lexer",
|
||||
action="store",
|
||||
type="string",
|
||||
dest="lexerClass",
|
||||
default=self.lexerClassName
|
||||
)
|
||||
optParser.add_option(
|
||||
"--rule",
|
||||
action="store",
|
||||
type="string",
|
||||
dest="parserRule"
|
||||
)
|
||||
|
||||
|
||||
def setUp(self, options):
|
||||
lexerMod = __import__(options.lexerClass)
|
||||
self.lexerClass = getattr(lexerMod, options.lexerClass)
|
||||
|
||||
|
||||
def parseStream(self, options, inStream):
|
||||
kwargs = {}
|
||||
if options.port is not None:
|
||||
kwargs['port'] = options.port
|
||||
if options.debug_socket is not None:
|
||||
kwargs['debug_socket'] = sys.stderr
|
||||
|
||||
lexer = self.lexerClass(inStream)
|
||||
tokenStream = antlr3.CommonTokenStream(lexer)
|
||||
parser = self.parserClass(tokenStream, **kwargs)
|
||||
result = getattr(parser, options.parserRule)()
|
||||
if result is not None:
|
||||
if hasattr(result, 'tree') and result.tree is not None:
|
||||
self.writeln(options, result.tree.toStringTree())
|
||||
else:
|
||||
self.writeln(options, repr(result))
|
||||
|
||||
|
||||
class WalkerMain(_Main):
|
||||
def __init__(self, walkerClass):
|
||||
_Main.__init__(self)
|
||||
|
||||
self.lexerClass = None
|
||||
self.parserClass = None
|
||||
self.walkerClass = walkerClass
|
||||
|
||||
|
||||
def setupOptions(self, optParser):
|
||||
optParser.add_option(
|
||||
"--lexer",
|
||||
action="store",
|
||||
type="string",
|
||||
dest="lexerClass",
|
||||
default=None
|
||||
)
|
||||
optParser.add_option(
|
||||
"--parser",
|
||||
action="store",
|
||||
type="string",
|
||||
dest="parserClass",
|
||||
default=None
|
||||
)
|
||||
optParser.add_option(
|
||||
"--parser-rule",
|
||||
action="store",
|
||||
type="string",
|
||||
dest="parserRule",
|
||||
default=None
|
||||
)
|
||||
optParser.add_option(
|
||||
"--rule",
|
||||
action="store",
|
||||
type="string",
|
||||
dest="walkerRule"
|
||||
)
|
||||
|
||||
|
||||
def setUp(self, options):
|
||||
lexerMod = __import__(options.lexerClass)
|
||||
self.lexerClass = getattr(lexerMod, options.lexerClass)
|
||||
parserMod = __import__(options.parserClass)
|
||||
self.parserClass = getattr(parserMod, options.parserClass)
|
||||
|
||||
|
||||
def parseStream(self, options, inStream):
|
||||
lexer = self.lexerClass(inStream)
|
||||
tokenStream = antlr3.CommonTokenStream(lexer)
|
||||
parser = self.parserClass(tokenStream)
|
||||
result = getattr(parser, options.parserRule)()
|
||||
if result is not None:
|
||||
assert hasattr(result, 'tree'), "Parser did not return an AST"
|
||||
nodeStream = antlr3.tree.CommonTreeNodeStream(result.tree)
|
||||
nodeStream.setTokenStream(tokenStream)
|
||||
walker = self.walkerClass(nodeStream)
|
||||
result = getattr(walker, options.walkerRule)()
|
||||
if result is not None:
|
||||
if hasattr(result, 'tree'):
|
||||
self.writeln(options, result.tree.toStringTree())
|
||||
else:
|
||||
self.writeln(options, repr(result))
|
File diff suppressed because it is too large
File diff suppressed because it is too large
@ -1,418 +0,0 @@
"""ANTLR3 runtime package"""
|
||||
|
||||
# begin[licence]
|
||||
#
|
||||
# [The "BSD licence"]
|
||||
# Copyright (c) 2005-2008 Terence Parr
|
||||
# All rights reserved.
|
||||
#
|
||||
# Redistribution and use in source and binary forms, with or without
|
||||
# modification, are permitted provided that the following conditions
|
||||
# are met:
|
||||
# 1. Redistributions of source code must retain the above copyright
|
||||
# notice, this list of conditions and the following disclaimer.
|
||||
# 2. Redistributions in binary form must reproduce the above copyright
|
||||
# notice, this list of conditions and the following disclaimer in the
|
||||
# documentation and/or other materials provided with the distribution.
|
||||
# 3. The name of the author may not be used to endorse or promote products
|
||||
# derived from this software without specific prior written permission.
|
||||
#
|
||||
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
|
||||
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
|
||||
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
|
||||
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
|
||||
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
|
||||
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
|
||||
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
|
||||
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
||||
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
|
||||
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
#
|
||||
# end[licence]
|
||||
|
||||
from antlr3.constants import EOF, DEFAULT_CHANNEL, INVALID_TOKEN_TYPE
|
||||
|
||||
############################################################################
|
||||
#
|
||||
# basic token interface
|
||||
#
|
||||
############################################################################
|
||||
|
||||
class Token(object):
|
||||
"""@brief Abstract token baseclass."""
|
||||
|
||||
def getText(self):
|
||||
"""@brief Get the text of the token.
|
||||
|
||||
Using setter/getter methods is deprecated. Use o.text instead.
|
||||
"""
|
||||
raise NotImplementedError
|
||||
|
||||
def setText(self, text):
|
||||
"""@brief Set the text of the token.
|
||||
|
||||
Using setter/getter methods is deprecated. Use o.text instead.
|
||||
"""
|
||||
raise NotImplementedError
|
||||
|
||||
|
||||
def getType(self):
|
||||
"""@brief Get the type of the token.
|
||||
|
||||
Using setter/getter methods is deprecated. Use o.type instead."""
|
||||
|
||||
raise NotImplementedError
|
||||
|
||||
def setType(self, ttype):
|
||||
"""@brief Get the type of the token.
|
||||
|
||||
Using setter/getter methods is deprecated. Use o.type instead."""
|
||||
|
||||
raise NotImplementedError
|
||||
|
||||
|
||||
def getLine(self):
|
||||
"""@brief Get the line number on which this token was matched
|
||||
|
||||
Lines are numbered 1..n
|
||||
|
||||
Using setter/getter methods is deprecated. Use o.line instead."""
|
||||
|
||||
raise NotImplementedError
|
||||
|
||||
def setLine(self, line):
|
||||
"""@brief Set the line number on which this token was matched
|
||||
|
||||
Using setter/getter methods is deprecated. Use o.line instead."""
|
||||
|
||||
raise NotImplementedError
|
||||
|
||||
|
||||
def getCharPositionInLine(self):
|
||||
"""@brief Get the column of the tokens first character,
|
||||
|
||||
Columns are numbered 0..n-1
|
||||
|
||||
Using setter/getter methods is deprecated. Use o.charPositionInLine instead."""
|
||||
|
||||
raise NotImplementedError
|
||||
|
||||
def setCharPositionInLine(self, pos):
|
||||
"""@brief Set the column of the tokens first character,
|
||||
|
||||
Using setter/getter methods is deprecated. Use o.charPositionInLine instead."""
|
||||
|
||||
raise NotImplementedError
|
||||
|
||||
|
||||
def getChannel(self):
|
||||
"""@brief Get the channel of the token
|
||||
|
||||
Using setter/getter methods is deprecated. Use o.channel instead."""
|
||||
|
||||
raise NotImplementedError
|
||||
|
||||
def setChannel(self, channel):
|
||||
"""@brief Set the channel of the token
|
||||
|
||||
Using setter/getter methods is deprecated. Use o.channel instead."""
|
||||
|
||||
raise NotImplementedError
|
||||
|
||||
|
||||
def getTokenIndex(self):
|
||||
"""@brief Get the index in the input stream.
|
||||
|
||||
An index from 0..n-1 of the token object in the input stream.
|
||||
This must be valid in order to use the ANTLRWorks debugger.
|
||||
|
||||
Using setter/getter methods is deprecated. Use o.index instead."""
|
||||
|
||||
raise NotImplementedError
|
||||
|
||||
def setTokenIndex(self, index):
|
||||
"""@brief Set the index in the input stream.
|
||||
|
||||
Using setter/getter methods is deprecated. Use o.index instead."""
|
||||
|
||||
raise NotImplementedError
|
||||
|
||||
|
||||
def getInputStream(self):
|
||||
"""@brief From what character stream was this token created.
|
||||
|
||||
You don't have to implement but it's nice to know where a Token
|
||||
comes from if you have include files etc... on the input."""
|
||||
|
||||
raise NotImplementedError
|
||||
|
||||
def setInputStream(self, input):
|
||||
"""@brief From what character stream was this token created.
|
||||
|
||||
You don't have to implement but it's nice to know where a Token
|
||||
comes from if you have include files etc... on the input."""
|
||||
|
||||
raise NotImplementedError
|
||||
|
||||
|
||||
############################################################################
|
||||
#
|
||||
# token implementations
|
||||
#
|
||||
# Token
|
||||
# +- CommonToken
|
||||
# \- ClassicToken
|
||||
#
|
||||
############################################################################
|
||||
|
||||
class CommonToken(Token):
|
||||
"""@brief Basic token implementation.
|
||||
|
||||
This implementation does not copy the text from the input stream upon
|
||||
creation, but keeps start/stop pointers into the stream to avoid
|
||||
unnecessary copy operations.
|
||||
|
||||
"""
|
||||
|
||||
def __init__(self, type=None, channel=DEFAULT_CHANNEL, text=None,
|
||||
input=None, start=None, stop=None, oldToken=None):
|
||||
Token.__init__(self)
|
||||
|
||||
if oldToken is not None:
|
||||
self.type = oldToken.type
|
||||
self.line = oldToken.line
|
||||
self.charPositionInLine = oldToken.charPositionInLine
|
||||
self.channel = oldToken.channel
|
||||
self.index = oldToken.index
|
||||
self._text = oldToken._text
|
||||
self.input = oldToken.input
|
||||
if isinstance(oldToken, CommonToken):
|
||||
self.start = oldToken.start
|
||||
self.stop = oldToken.stop
|
||||
|
||||
else:
|
||||
self.type = type
|
||||
self.input = input
|
||||
self.charPositionInLine = -1 # set to invalid position
|
||||
self.line = 0
|
||||
self.channel = channel
|
||||
|
||||
#What token number is this from 0..n-1 tokens; < 0 implies invalid index
|
||||
self.index = -1
|
||||
|
||||
# We need to be able to change the text once in a while. If
|
||||
# this is non-null, then getText should return this. Note that
|
||||
# start/stop are not affected by changing this.
|
||||
self._text = text
|
||||
|
||||
# The char position into the input buffer where this token starts
|
||||
self.start = start
|
||||
|
||||
# The char position into the input buffer where this token stops
|
||||
# This is the index of the last char, *not* the index after it!
|
||||
self.stop = stop
|
||||
|
||||
|
||||
def getText(self):
|
||||
if self._text is not None:
|
||||
return self._text
|
||||
|
||||
if self.input is None:
|
||||
return None
|
||||
|
||||
if self.start < self.input.size() and self.stop < self.input.size():
|
||||
return self.input.substring(self.start, self.stop)
|
||||
|
||||
return '<EOF>'
|
||||
|
||||
|
||||
def setText(self, text):
|
||||
"""
|
||||
Override the text for this token. getText() will return this text
|
||||
rather than pulling from the buffer. Note that this does not mean
|
||||
that start/stop indexes are not valid. It means that that input
|
||||
was converted to a new string in the token object.
|
||||
"""
|
||||
self._text = text
|
||||
|
||||
text = property(getText, setText)
|
||||
|
||||
|
||||
def getType(self):
|
||||
return self.type
|
||||
|
||||
def setType(self, ttype):
|
||||
self.type = ttype
|
||||
|
||||
def getTypeName(self):
|
||||
return str(self.type)
|
||||
|
||||
typeName = property(lambda s: s.getTypeName())
|
||||
|
||||
def getLine(self):
|
||||
return self.line
|
||||
|
||||
def setLine(self, line):
|
||||
self.line = line
|
||||
|
||||
|
||||
def getCharPositionInLine(self):
|
||||
return self.charPositionInLine
|
||||
|
||||
def setCharPositionInLine(self, pos):
|
||||
self.charPositionInLine = pos
|
||||
|
||||
|
||||
def getChannel(self):
|
||||
return self.channel
|
||||
|
||||
def setChannel(self, channel):
|
||||
self.channel = channel
|
||||
|
||||
|
||||
def getTokenIndex(self):
|
||||
return self.index
|
||||
|
||||
def setTokenIndex(self, index):
|
||||
self.index = index
|
||||
|
||||
|
||||
def getInputStream(self):
|
||||
return self.input
|
||||
|
||||
def setInputStream(self, input):
|
||||
self.input = input
|
||||
|
||||
|
||||
def __str__(self):
|
||||
if self.type == EOF:
|
||||
return "<EOF>"
|
||||
|
||||
channelStr = ""
|
||||
if self.channel > 0:
|
||||
channelStr = ",channel=" + str(self.channel)
|
||||
|
||||
txt = self.text
|
||||
if txt is not None:
|
||||
txt = txt.replace("\n","\\\\n")
|
||||
txt = txt.replace("\r","\\\\r")
|
||||
txt = txt.replace("\t","\\\\t")
|
||||
else:
|
||||
txt = "<no text>"
|
||||
|
||||
return "[@%d,%d:%d=%r,<%s>%s,%d:%d]" % (
|
||||
self.index,
|
||||
self.start, self.stop,
|
||||
txt,
|
||||
self.typeName, channelStr,
|
||||
self.line, self.charPositionInLine
|
||||
)
|
||||
|
||||
|
||||
class ClassicToken(Token):
|
||||
"""@brief Alternative token implementation.
|
||||
|
||||
A Token object like we'd use in ANTLR 2.x; has an actual string created
|
||||
and associated with this object. These objects are needed for imaginary
|
||||
tree nodes that have payload objects. We need to create a Token object
|
||||
that has a string; the tree node will point at this token. CommonToken
|
||||
has indexes into a char stream and hence cannot be used to introduce
|
||||
new strings.
|
||||
"""
|
||||
|
||||
def __init__(self, type=None, text=None, channel=DEFAULT_CHANNEL,
|
||||
oldToken=None
|
||||
):
|
||||
Token.__init__(self)
|
||||
|
||||
if oldToken is not None:
|
||||
self.text = oldToken.text
|
||||
self.type = oldToken.type
|
||||
self.line = oldToken.line
|
||||
self.charPositionInLine = oldToken.charPositionInLine
|
||||
self.channel = oldToken.channel
|
||||
|
||||
self.text = text
|
||||
self.type = type
|
||||
self.line = None
|
||||
self.charPositionInLine = None
|
||||
self.channel = channel
|
||||
self.index = None
|
||||
|
||||
|
||||
def getText(self):
|
||||
return self.text
|
||||
|
||||
def setText(self, text):
|
||||
self.text = text
|
||||
|
||||
|
||||
def getType(self):
|
||||
return self.type
|
||||
|
||||
def setType(self, ttype):
|
||||
self.type = ttype
|
||||
|
||||
|
||||
def getLine(self):
|
||||
return self.line
|
||||
|
||||
def setLine(self, line):
|
||||
self.line = line
|
||||
|
||||
|
||||
def getCharPositionInLine(self):
|
||||
return self.charPositionInLine
|
||||
|
||||
def setCharPositionInLine(self, pos):
|
||||
self.charPositionInLine = pos
|
||||
|
||||
|
||||
def getChannel(self):
|
||||
return self.channel
|
||||
|
||||
def setChannel(self, channel):
|
||||
self.channel = channel
|
||||
|
||||
|
||||
def getTokenIndex(self):
|
||||
return self.index
|
||||
|
||||
def setTokenIndex(self, index):
|
||||
self.index = index
|
||||
|
||||
|
||||
def getInputStream(self):
|
||||
return None
|
||||
|
||||
def setInputStream(self, input):
|
||||
pass
|
||||
|
||||
|
||||
def toString(self):
|
||||
channelStr = ""
|
||||
if self.channel > 0:
|
||||
channelStr = ",channel=" + str(self.channel)
|
||||
|
||||
txt = self.text
|
||||
if txt is None:
|
||||
txt = "<no text>"
|
||||
|
||||
return "[@%r,%r,<%r>%s,%r:%r]" % (self.index,
|
||||
txt,
|
||||
self.type,
|
||||
channelStr,
|
||||
self.line,
|
||||
self.charPositionInLine
|
||||
)
|
||||
|
||||
|
||||
__str__ = toString
|
||||
__repr__ = toString
|
||||
|
||||
|
||||
INVALID_TOKEN = CommonToken(type=INVALID_TOKEN_TYPE)
|
||||
|
||||
# In an action, a lexer rule can set token to this SKIP_TOKEN and ANTLR
|
||||
# will avoid creating a token for this symbol and try to fetch another.
|
||||
SKIP_TOKEN = CommonToken(type=INVALID_TOKEN_TYPE)
|
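The lazy-text trick CommonToken uses above can be shown with a small standalone sketch: store start/stop offsets into the input and slice only when the text is actually requested, with an explicit override taking precedence. The `LazyToken` class here is hypothetical, invented for illustration; it is not the diff's CommonToken.

```python
class LazyToken:
    """Hypothetical illustration of CommonToken's start/stop scheme."""

    def __init__(self, source, start, stop, text=None):
        self.source = source   # the full input string
        self.start = start     # index of the first char
        self.stop = stop       # index of the last char (inclusive!)
        self._text = text      # optional override, as in CommonToken

    @property
    def text(self):
        if self._text is not None:
            return self._text
        # stop is inclusive, hence the +1 when slicing
        return self.source[self.start:self.stop + 1]

tok = LazyToken("int x = 1;", 4, 4)
print(tok.text)  # x
```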
File diff suppressed because it is too large
@ -1,619 +0,0 @@
""" @package antlr3.tree
|
||||
@brief ANTLR3 runtime package, treewizard module
|
||||
|
||||
A utility module to create ASTs at runtime.
|
||||
See <http://www.antlr.org/wiki/display/~admin/2007/07/02/Exploring+Concept+of+TreeWizard> for an overview. Note that the API of the Python implementation is slightly different.
|
||||
|
||||
"""
|
||||
|
||||
# begin[licence]
|
||||
#
|
||||
# [The "BSD licence"]
|
||||
# Copyright (c) 2005-2008 Terence Parr
|
||||
# All rights reserved.
|
||||
#
|
||||
# Redistribution and use in source and binary forms, with or without
|
||||
# modification, are permitted provided that the following conditions
|
||||
# are met:
|
||||
# 1. Redistributions of source code must retain the above copyright
|
||||
# notice, this list of conditions and the following disclaimer.
|
||||
# 2. Redistributions in binary form must reproduce the above copyright
|
||||
# notice, this list of conditions and the following disclaimer in the
|
||||
# documentation and/or other materials provided with the distribution.
|
||||
# 3. The name of the author may not be used to endorse or promote products
|
||||
# derived from this software without specific prior written permission.
|
||||
#
|
||||
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
|
||||
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
|
||||
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
|
||||
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
|
||||
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
|
||||
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
|
||||
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
|
||||
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
||||
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
|
||||
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
#
|
||||
# end[licence]
|
||||
|
||||
from antlr3.constants import INVALID_TOKEN_TYPE
|
||||
from antlr3.tokens import CommonToken
|
||||
from antlr3.tree import CommonTree, CommonTreeAdaptor
|
||||
|
||||
|
||||
def computeTokenTypes(tokenNames):
    """
    Compute a dict that is an inverted index of
    tokenNames (which maps int token types to names).
    """

    if tokenNames is None:
        return {}

    return dict((name, type) for type, name in enumerate(tokenNames))


## token types for pattern parser
EOF = -1
BEGIN = 1
END = 2
ID = 3
ARG = 4
PERCENT = 5
COLON = 6
DOT = 7

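The inverted index built by computeTokenTypes can be sketched standalone; the token names below are invented for illustration and are not a real ANTLR token table:

```python
# Illustration only, not part of the runtime: enumerate() pairs each name
# with its list position (the token type), and the dict flips that into a
# name -> type lookup, the same idiom computeTokenTypes uses.
token_names = ["<invalid>", "<EOR>", "<DOWN>", "<UP>", "ID", "INT"]  # made up

name_to_type = dict((name, type) for type, name in enumerate(token_names))

print(name_to_type["ID"])   # 4
print(name_to_type["INT"])  # 5
```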
class TreePatternLexer(object):
    def __init__(self, pattern):
        ## The tree pattern to lex like "(A B C)"
        self.pattern = pattern

        ## Index into input string
        self.p = -1

        ## Current char
        self.c = None

        ## How long is the pattern in char?
        self.n = len(pattern)

        ## Set when token type is ID or ARG
        self.sval = None

        self.error = False

        self.consume()


    __idStartChar = frozenset(
        'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_'
    )
    __idChar = __idStartChar | frozenset('0123456789')

    def nextToken(self):
        self.sval = ""
        while self.c != EOF:
            if self.c in (' ', '\n', '\r', '\t'):
                self.consume()
                continue

            if self.c in self.__idStartChar:
                self.sval += self.c
                self.consume()
                while self.c in self.__idChar:
                    self.sval += self.c
                    self.consume()

                return ID

            if self.c == '(':
                self.consume()
                return BEGIN

            if self.c == ')':
                self.consume()
                return END

            if self.c == '%':
                self.consume()
                return PERCENT

            if self.c == ':':
                self.consume()
                return COLON

            if self.c == '.':
                self.consume()
                return DOT

            if self.c == '[':  # grab [x] as a string, returning x
                self.consume()
                while self.c != ']':
                    if self.c == '\\':
                        self.consume()
                        if self.c != ']':
                            self.sval += '\\'

                        self.sval += self.c

                    else:
                        self.sval += self.c

                    self.consume()

                self.consume()
                return ARG

            self.consume()
            self.error = True
            return EOF

        return EOF


    def consume(self):
        self.p += 1
        if self.p >= self.n:
            self.c = EOF

        else:
            self.c = self.pattern[self.p]
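The escape handling in the `[x]` branch of nextToken can be isolated into a small helper (scan_arg is a hypothetical name, not part of the runtime). A backslash escapes only `]`; any other backslash is kept verbatim:

```python
# Illustration only: mirrors the "[x]" scanning loop in TreePatternLexer.
def scan_arg(pattern, p):
    """Scan pattern starting just after '[' at index p.

    Returns (text, index_after_closing_bracket)."""
    sval = ""
    while pattern[p] != ']':
        if pattern[p] == '\\':
            p += 1
            if pattern[p] != ']':
                sval += '\\'  # keep the backslash unless it escaped ']'
            sval += pattern[p]
        else:
            sval += pattern[p]
        p += 1
    return sval, p + 1

# "ID[a\]b]" -- the escaped ']' becomes part of the argument text
print(scan_arg("ID[a\\]b]", 3))  # ('a]b', 8)
```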
class TreePatternParser(object):
    def __init__(self, tokenizer, wizard, adaptor):
        self.tokenizer = tokenizer
        self.wizard = wizard
        self.adaptor = adaptor
        self.ttype = tokenizer.nextToken()  # kickstart


    def pattern(self):
        if self.ttype == BEGIN:
            return self.parseTree()

        elif self.ttype == ID:
            node = self.parseNode()
            if self.ttype == EOF:
                return node

            return None  # extra junk on end

        return None


    def parseTree(self):
        if self.ttype != BEGIN:
            return None

        self.ttype = self.tokenizer.nextToken()
        root = self.parseNode()
        if root is None:
            return None

        while self.ttype in (BEGIN, ID, PERCENT, DOT):
            if self.ttype == BEGIN:
                subtree = self.parseTree()
                self.adaptor.addChild(root, subtree)

            else:
                child = self.parseNode()
                if child is None:
                    return None

                self.adaptor.addChild(root, child)

        if self.ttype != END:
            return None

        self.ttype = self.tokenizer.nextToken()
        return root


    def parseNode(self):
        # "%label:" prefix
        label = None

        if self.ttype == PERCENT:
            self.ttype = self.tokenizer.nextToken()
            if self.ttype != ID:
                return None

            label = self.tokenizer.sval
            self.ttype = self.tokenizer.nextToken()
            if self.ttype != COLON:
                return None

            self.ttype = self.tokenizer.nextToken()  # move to ID following colon

        # Wildcard?
        if self.ttype == DOT:
            self.ttype = self.tokenizer.nextToken()
            wildcardPayload = CommonToken(0, ".")
            node = WildcardTreePattern(wildcardPayload)
            if label is not None:
                node.label = label
            return node

        # "ID" or "ID[arg]"
        if self.ttype != ID:
            return None

        tokenName = self.tokenizer.sval
        self.ttype = self.tokenizer.nextToken()

        if tokenName == "nil":
            return self.adaptor.nil()

        text = tokenName
        # check for arg
        arg = None
        if self.ttype == ARG:
            arg = self.tokenizer.sval
            text = arg
            self.ttype = self.tokenizer.nextToken()

        # create node
        treeNodeType = self.wizard.getTokenType(tokenName)
        if treeNodeType == INVALID_TOKEN_TYPE:
            return None

        node = self.adaptor.createFromType(treeNodeType, text)
        if label is not None and isinstance(node, TreePattern):
            node.label = label

        if arg is not None and isinstance(node, TreePattern):
            node.hasTextArg = True

        return node
class TreePattern(CommonTree):
    """
    When using %label:TOKENNAME in a tree for parse(), we must
    track the label.
    """

    def __init__(self, payload):
        CommonTree.__init__(self, payload)

        self.label = None
        self.hasTextArg = None


    def toString(self):
        if self.label is not None:
            return '%' + self.label + ':' + CommonTree.toString(self)

        else:
            return CommonTree.toString(self)


class WildcardTreePattern(TreePattern):
    pass


class TreePatternTreeAdaptor(CommonTreeAdaptor):
    """This adaptor creates TreePattern objects for use during scan()"""

    def createWithPayload(self, payload):
        return TreePattern(payload)
class TreeWizard(object):
    """
    Build and navigate trees with this object. Must know about the names
    of tokens so you have to pass in a map or array of token names (from
    which this class can build the map). I.e., Token DECL means nothing
    unless the class can translate it to a token type.

    In order to create nodes and navigate, this class needs a TreeAdaptor.

    This class can build a token type -> node index for repeated use or for
    iterating over the various nodes with a particular type.

    This class works in conjunction with the TreeAdaptor rather than moving
    all this functionality into the adaptor. An adaptor helps build and
    navigate trees using methods. This class helps you do it with string
    patterns like "(A B C)". You can create a tree from that pattern or
    match subtrees against it.
    """

    def __init__(self, adaptor=None, tokenNames=None, typeMap=None):
        if adaptor is None:
            self.adaptor = CommonTreeAdaptor()

        else:
            self.adaptor = adaptor

        if typeMap is None:
            self.tokenNameToTypeMap = computeTokenTypes(tokenNames)

        else:
            if tokenNames is not None:
                raise ValueError("Can't have both tokenNames and typeMap")

            self.tokenNameToTypeMap = typeMap


    def getTokenType(self, tokenName):
        """Using the map of token names to token types, return the type."""

        try:
            return self.tokenNameToTypeMap[tokenName]
        except KeyError:
            return INVALID_TOKEN_TYPE


    def create(self, pattern):
        """
        Create a tree or node from the indicated tree pattern that closely
        follows ANTLR tree grammar tree element syntax:

        (root child1 ... child2).

        You can also just pass in a node: ID

        Any node can have a text argument: ID[foo]
        (notice there are no quotes around foo--it's clear it's a string).

        nil is a special name meaning "give me a nil node". Useful for
        making lists: (nil A B C) is a list of A B C.
        """

        tokenizer = TreePatternLexer(pattern)
        parser = TreePatternParser(tokenizer, self, self.adaptor)
        return parser.pattern()


    def index(self, tree):
        """Walk the entire tree and make a node name to nodes mapping.

        For now, use recursion but later nonrecursive version may be
        more efficient. Returns a dict int -> list where the list is
        of your AST node type. The int is the token type of the node.
        """

        m = {}
        self._index(tree, m)
        return m


    def _index(self, t, m):
        """Do the work for index"""

        if t is None:
            return

        ttype = self.adaptor.getType(t)
        elements = m.get(ttype)
        if elements is None:
            m[ttype] = elements = []

        elements.append(t)
        for i in range(self.adaptor.getChildCount(t)):
            child = self.adaptor.getChild(t, i)
            self._index(child, m)
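The recursion in _index can be sketched over plain (type, children) tuples instead of a TreeAdaptor (toy structures, illustration only; index_tree is not part of the runtime):

```python
# Illustration only: the same token-type -> node-list index that
# TreeWizard._index builds, but over bare tuples.
def index_tree(node, m=None):
    if m is None:
        m = {}
    ttype, children = node
    m.setdefault(ttype, []).append(node)  # group every node under its type
    for child in children:
        index_tree(child, m)
    return m

# (A (B C) C) with made-up token types A=1, B=2, C=3
tree = (1, [(2, [(3, [])]), (3, [])])
by_type = index_tree(tree)
print(len(by_type[3]))  # 2 -- both C nodes land in the same bucket
```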
    def find(self, tree, what):
        """Return a list of matching tokens.

        what may either be an integer specifying the token type to find or
        a string with a pattern that must be matched.

        """

        if isinstance(what, (int, long)):
            return self._findTokenType(tree, what)

        elif isinstance(what, basestring):
            return self._findPattern(tree, what)

        else:
            raise TypeError("'what' must be string or integer")


    def _findTokenType(self, t, ttype):
        """Return a List of tree nodes with token type ttype"""

        nodes = []

        def visitor(tree, parent, childIndex, labels):
            nodes.append(tree)

        self.visit(t, ttype, visitor)

        return nodes


    def _findPattern(self, t, pattern):
        """Return a List of subtrees matching pattern."""

        subtrees = []

        # Create a TreePattern from the pattern
        tokenizer = TreePatternLexer(pattern)
        parser = TreePatternParser(tokenizer, self, TreePatternTreeAdaptor())
        tpattern = parser.pattern()

        # don't allow invalid patterns
        if (tpattern is None or tpattern.isNil()
            or isinstance(tpattern, WildcardTreePattern)):
            return None

        rootTokenType = tpattern.getType()

        def visitor(tree, parent, childIndex, label):
            if self._parse(tree, tpattern, None):
                subtrees.append(tree)

        self.visit(t, rootTokenType, visitor)

        return subtrees
    def visit(self, tree, what, visitor):
        """Visit every node in tree matching what, invoking the visitor.

        If what is a string, it is parsed as a pattern and only matching
        subtrees will be visited.
        The implementation uses the root node of the pattern in combination
        with visit(t, ttype, visitor) so nil-rooted patterns are not allowed.
        Patterns with wildcard roots are also not allowed.

        If what is an integer, it is used as a token type and visit will match
        all nodes of that type (this is faster than the pattern match).
        The labels arg of the visitor action method is never set (it's None)
        since using a token type rather than a pattern doesn't let us set a
        label.
        """

        if isinstance(what, (int, long)):
            self._visitType(tree, None, 0, what, visitor)

        elif isinstance(what, basestring):
            self._visitPattern(tree, what, visitor)

        else:
            raise TypeError("'what' must be string or integer")


    def _visitType(self, t, parent, childIndex, ttype, visitor):
        """Do the recursive work for visit"""

        if t is None:
            return

        if self.adaptor.getType(t) == ttype:
            visitor(t, parent, childIndex, None)

        for i in range(self.adaptor.getChildCount(t)):
            child = self.adaptor.getChild(t, i)
            self._visitType(child, t, i, ttype, visitor)


    def _visitPattern(self, tree, pattern, visitor):
        """
        For all subtrees that match the pattern, execute the visit action.
        """

        # Create a TreePattern from the pattern
        tokenizer = TreePatternLexer(pattern)
        parser = TreePatternParser(tokenizer, self, TreePatternTreeAdaptor())
        tpattern = parser.pattern()

        # don't allow invalid patterns
        if (tpattern is None or tpattern.isNil()
            or isinstance(tpattern, WildcardTreePattern)):
            return

        rootTokenType = tpattern.getType()

        def rootvisitor(tree, parent, childIndex, labels):
            labels = {}
            if self._parse(tree, tpattern, labels):
                visitor(tree, parent, childIndex, labels)

        self.visit(tree, rootTokenType, rootvisitor)
    def parse(self, t, pattern, labels=None):
        """
        Given a pattern like (ASSIGN %lhs:ID %rhs:.) with optional labels
        on the various nodes and '.' (dot) as the node/subtree wildcard,
        return true if the pattern matches and fill the labels Map with
        the labels pointing at the appropriate nodes. Return false if
        the pattern is malformed or the tree does not match.

        If a node specifies a text arg in pattern, then that must match
        for that node in t.
        """

        tokenizer = TreePatternLexer(pattern)
        parser = TreePatternParser(tokenizer, self, TreePatternTreeAdaptor())
        tpattern = parser.pattern()

        return self._parse(t, tpattern, labels)


    def _parse(self, t1, tpattern, labels):
        """
        Do the work for parse. Check to see if the tpattern fits the
        structure and token types in t1. Check text if the pattern has
        text arguments on nodes. Fill labels map with pointers to nodes
        in tree matched against nodes in pattern with labels.
        """

        # make sure both are non-null
        if t1 is None or tpattern is None:
            return False

        # check roots (wildcard matches anything)
        if not isinstance(tpattern, WildcardTreePattern):
            if self.adaptor.getType(t1) != tpattern.getType():
                return False

            # if pattern has text, check node text
            if (tpattern.hasTextArg
                and self.adaptor.getText(t1) != tpattern.getText()):
                return False

        if tpattern.label is not None and labels is not None:
            # map label in pattern to node in t1
            labels[tpattern.label] = t1

        # check children
        n1 = self.adaptor.getChildCount(t1)
        n2 = tpattern.getChildCount()
        if n1 != n2:
            return False

        for i in range(n1):
            child1 = self.adaptor.getChild(t1, i)
            child2 = tpattern.getChild(i)
            if not self._parse(child1, child2, labels):
                return False

        return True
    def equals(self, t1, t2, adaptor=None):
        """
        Compare t1 and t2; return true if token types/text, structure match
        exactly.
        The trees are examined in their entirety so that (A B) does not match
        (A B C) nor (A (B C)).
        """

        if adaptor is None:
            adaptor = self.adaptor

        return self._equals(t1, t2, adaptor)


    def _equals(self, t1, t2, adaptor):
        # make sure both are non-null
        if t1 is None or t2 is None:
            return False

        # check roots
        if adaptor.getType(t1) != adaptor.getType(t2):
            return False

        if adaptor.getText(t1) != adaptor.getText(t2):
            return False

        # check children
        n1 = adaptor.getChildCount(t1)
        n2 = adaptor.getChildCount(t2)
        if n1 != n2:
            return False

        for i in range(n1):
            child1 = adaptor.getChild(t1, i)
            child2 = adaptor.getChild(t2, i)
            if not self._equals(child1, child2, adaptor):
                return False

        return True
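The whole-tree comparison in _equals can be sketched over plain (type, text, children) tuples; trees_equal is a hypothetical stand-in for the adaptor-based version, illustration only:

```python
# Illustration only: structural + token equality as in TreeWizard._equals.
def trees_equal(t1, t2):
    if t1 is None or t2 is None:
        return False
    type1, text1, kids1 = t1
    type2, text2, kids2 = t2
    if type1 != type2 or text1 != text2:
        return False  # roots differ
    if len(kids1) != len(kids2):
        return False  # different arity
    return all(trees_equal(a, b) for a, b in zip(kids1, kids2))

# Whole trees are compared, so (A B C) does not match (A (B C)).
a = (1, "A", [(2, "B", []), (3, "C", [])])   # (A B C)
b = (1, "A", [(2, "B", [(3, "C", [])])])     # (A (B C))
print(trees_equal(a, a))  # True
print(trees_equal(a, b))  # False
```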