Remove one local copy of the Python antlr3 module
* Remove the directory thirdparty/antlr3/
* Modify the antlr3 symbolic link to point to thirdparty/antlr3-antlr-3.5/runtime/Python/antlr3/

Change-Id: I8104b7352e96d8e282da4e5bd8ff4fb4817aaa32
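The two bullet points above amount to a remove-then-relink step. A minimal sketch in Python (the temporary stand-in repository root and the relative symlink target are assumptions for illustration; the real change was made in the project tree):

```python
import os
import tempfile

# Stand-in repository root so the sketch is self-contained.
repo = tempfile.mkdtemp()
os.makedirs(os.path.join(repo, "thirdparty", "antlr3-antlr-3.5",
                         "runtime", "Python", "antlr3"))
os.makedirs(os.path.join(repo, "thirdparty", "antlr3"))  # the duplicate local copy

# Step 1: remove the local copy of the module.
os.rmdir(os.path.join(repo, "thirdparty", "antlr3"))

# Step 2: re-point the antlr3 symbolic link at the 3.5 runtime.
os.symlink("antlr3-antlr-3.5/runtime/Python/antlr3",
           os.path.join(repo, "thirdparty", "antlr3"))

print(os.readlink(os.path.join(repo, "thirdparty", "antlr3")))
```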
parent 41d26396d6
commit 5836bf07e0
@ -1,159 +0,0 @@
""" @package antlr3
@brief ANTLR3 runtime package

This module contains all support classes, which are needed to use recognizers
generated by ANTLR3.

@mainpage

\note Please be warned that the line numbers in the API documentation do not
match the real locations in the source code of the package. This is an
unintended artifact of doxygen, which I could only convince to use the
correct module names by concatenating all files from the package into a single
module file...

Here is a little overview over the most commonly used classes provided by
this runtime:

@section recognizers Recognizers

These recognizers are base classes for the code which is generated by ANTLR3.

- BaseRecognizer: Base class with common recognizer functionality.
- Lexer: Base class for lexers.
- Parser: Base class for parsers.
- tree.TreeParser: Base class for %tree parsers.

@section streams Streams

Each recognizer pulls its input from one of the stream classes below. Streams
handle stuff like buffering, look-ahead and seeking.

A character stream is usually the first element in the pipeline of a typical
ANTLR3 application. It is used as the input for a Lexer.

- ANTLRStringStream: Reads from a string object. The input should be a unicode
  object, or ANTLR3 will have trouble decoding non-ascii data.
- ANTLRFileStream: Opens a file and reads the contents, with optional character
  decoding.
- ANTLRInputStream: Reads the data from a file-like object, with optional
  character decoding.

A Parser needs a TokenStream as input (which in turn is usually fed by a
Lexer):

- CommonTokenStream: A basic and most commonly used TokenStream
  implementation.
- TokenRewriteStream: A modification of CommonTokenStream that allows the
  stream to be altered (by the Parser). See the 'tweak' example for a usecase.

And tree.TreeParser finally fetches its input from a tree.TreeNodeStream:

- tree.CommonTreeNodeStream: A basic and most commonly used
  tree.TreeNodeStream implementation.


@section tokenstrees Tokens and Trees

A Lexer emits Token objects which are usually buffered by a TokenStream. A
Parser can build a Tree, if the output=AST option has been set in the grammar.

The runtime provides these Token implementations:

- CommonToken: A basic and most commonly used Token implementation.
- ClassicToken: A Token object as used in ANTLR 2.x, used for %tree
  construction.

Tree objects are wrappers for Token objects.

- tree.CommonTree: A basic and most commonly used Tree implementation.

A tree.TreeAdaptor is used by the parser to create tree.Tree objects for the
input Token objects.

- tree.CommonTreeAdaptor: A basic and most commonly used tree.TreeAdaptor
  implementation.


@section Exceptions

RecognitionExceptions are generated when a recognizer encounters incorrect
or unexpected input.

- RecognitionException
  - MismatchedRangeException
  - MismatchedSetException
    - MismatchedNotSetException
    .
  - MismatchedTokenException
  - MismatchedTreeNodeException
  - NoViableAltException
  - EarlyExitException
  - FailedPredicateException
  .
.

A tree.RewriteCardinalityException is raised when the parser hits a
cardinality mismatch during AST construction. Although this is basically a
bug in your grammar, it can only be detected at runtime.

- tree.RewriteCardinalityException
  - tree.RewriteEarlyExitException
  - tree.RewriteEmptyStreamException
  .
.

"""

# tree.RewriteRuleElementStream
# tree.RewriteRuleSubtreeStream
# tree.RewriteRuleTokenStream
# CharStream
# DFA
# TokenSource

# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
#    derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

import os
import sys

__version__ = '3.4'

# This runtime is compatible with generated parsers using the following
# API versions. 'HEAD' is only used by unittests.
compatible_api_versions = ['HEAD', 1]

top_dir = os.path.normpath(os.path.join(os.path.abspath(__file__),
                                        os.pardir))
sys.path.append(top_dir)

from antlr3.constants import *
from antlr3.dfa import *
from antlr3.exceptions import *
from antlr3.recognizers import *
from antlr3.streams import *
from antlr3.tokens import *
@ -1,48 +0,0 @@
"""Compatibility stuff"""

# begin[licence]
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
# (same BSD licence text as in the first file above)
# end[licence]

try:
    set = set
    frozenset = frozenset
except NameError:
    from sets import Set as set, ImmutableSet as frozenset


try:
    reversed = reversed
except NameError:
    def reversed(l):
        l = l[:]
        l.reverse()
        return l
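The `reversed` fallback for pre-2.4 Pythons can be exercised on its own. A standalone sketch (the name `reversed_fallback` is hypothetical here, since the real module shadows the builtin):

```python
# Same copy-then-reverse-in-place logic as the compatibility fallback above.
def reversed_fallback(l):
    l = l[:]        # copy, so the caller's list is left untouched
    l.reverse()
    return l

print(reversed_fallback([1, 2, 3]))  # → [3, 2, 1]
```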
@ -1,57 +0,0 @@
"""ANTLR3 runtime package"""

# begin[licence]
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
# (same BSD licence text as in the first file above)
# end[licence]

EOF = -1

## All tokens go to the parser (unless skip() is called in that rule)
# on a particular "channel". The parser tunes to a particular channel
# so that whitespace etc... can go to the parser on a "hidden" channel.
DEFAULT_CHANNEL = 0

## Anything on a different channel than DEFAULT_CHANNEL is not parsed
# by the parser.
HIDDEN_CHANNEL = 99

# Predefined token types
EOR_TOKEN_TYPE = 1

##
# imaginary tree navigation type; traverse "get child" link
DOWN = 2

##
# imaginary tree navigation type; finish with a child list
UP = 3

MIN_TOKEN_TYPE = UP+1

INVALID_TOKEN_TYPE = 0
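As a sketch of how DOWN and UP are used: a tree parser sees a tree such as (+ 1 (* 2 3)) as a flat stream of nodes with these imaginary navigation tokens interleaved. The string node labels below are illustrative stand-ins, not real antlr3 node objects:

```python
DOWN, UP = 2, 3   # same values as the constants defined above

# (+ 1 (* 2 3)) flattened with "enter child list" / "leave child list" markers.
flat = ['+', DOWN, '1', '*', DOWN, '2', '3', UP, UP]

# Every DOWN that opens a child list is balanced by a matching UP.
print(flat.count(DOWN) == flat.count(UP))  # → True
```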
File diff suppressed because it is too large
@ -1,213 +0,0 @@
"""ANTLR3 runtime package"""

# begin[licence]
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
# (same BSD licence text as in the first file above)
# end[licence]

from antlr3.constants import EOF
from antlr3.exceptions import NoViableAltException, BacktrackingFailed


class DFA(object):
    """@brief A DFA implemented as a set of transition tables.

    Any state that has a semantic predicate edge is special; those states
    are generated with if-then-else structures in a specialStateTransition()
    which is generated by the cyclicDFA template.

    """

    def __init__(
        self,
        recognizer, decisionNumber,
        eot, eof, min, max, accept, special, transition
        ):
        ## Which recognizer encloses this DFA? Needed to check backtracking
        self.recognizer = recognizer

        self.decisionNumber = decisionNumber
        self.eot = eot
        self.eof = eof
        self.min = min
        self.max = max
        self.accept = accept
        self.special = special
        self.transition = transition


    def predict(self, input):
        """
        From the input stream, predict what alternative will succeed
        using this DFA (representing the covering regular approximation
        to the underlying CFL). Return an alternative number 1..n. Throw
        an exception upon error.
        """
        mark = input.mark()
        s = 0  # we always start at s0
        try:
            for _ in xrange(50000):
                #print "***Current state = %d" % s

                specialState = self.special[s]
                if specialState >= 0:
                    #print "is special"
                    s = self.specialStateTransition(specialState, input)
                    if s == -1:
                        self.noViableAlt(s, input)
                        return 0
                    input.consume()
                    continue

                if self.accept[s] >= 1:
                    #print "accept state for alt %d" % self.accept[s]
                    return self.accept[s]

                # look for a normal char transition
                c = input.LA(1)

                #print "LA = %d (%r)" % (c, unichr(c) if c >= 0 else 'EOF')
                #print "range = %d..%d" % (self.min[s], self.max[s])

                if c >= self.min[s] and c <= self.max[s]:
                    # move to next state
                    snext = self.transition[s][c-self.min[s]]
                    #print "in range, next state = %d" % snext

                    if snext < 0:
                        #print "not a normal transition"
                        # was in range but not a normal transition
                        # must check EOT, which is like the else clause.
                        # eot[s]>=0 indicates that an EOT edge goes to another
                        # state.
                        if self.eot[s] >= 0:  # EOT Transition to accept state?
                            #print "EOT trans to accept state %d" % self.eot[s]

                            s = self.eot[s]
                            input.consume()
                            # TODO: I had this as return accept[eot[s]]
                            # which assumed here that the EOT edge always
                            # went to an accept...faster to do this, but
                            # what about predicated edges coming from EOT
                            # target?
                            continue

                        #print "no viable alt"
                        self.noViableAlt(s, input)
                        return 0

                    s = snext
                    input.consume()
                    continue

                if self.eot[s] >= 0:
                    #print "EOT to %d" % self.eot[s]

                    s = self.eot[s]
                    input.consume()
                    continue

                # EOF Transition to accept state?
                if c == EOF and self.eof[s] >= 0:
                    #print "EOF Transition to accept state %d" \
                    #  % self.accept[self.eof[s]]
                    return self.accept[self.eof[s]]

                # not in range and not EOF/EOT, must be invalid symbol
                self.noViableAlt(s, input)
                return 0

            else:
                raise RuntimeError("DFA bang!")

        finally:
            input.rewind(mark)

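The table-driven step at the heart of predict() can be shown with a toy decision, stripped of input streams, EOT/EOF edges and predicates. All tables and names below are hypothetical illustrations, not tables generated by ANTLR:

```python
def toy_predict(symbols, accept, mn, mx, transition):
    """Walk the transition tables like DFA.predict: stop at the first accept state."""
    s = 0                                   # always start in s0
    for c in symbols:
        if accept[s] >= 1:
            return accept[s]                # accept state for alternative accept[s]
        if mn[s] <= c <= mx[s]:
            s = transition[s][c - mn[s]]    # normal character transition
        else:
            return 0                        # no viable alternative
    return accept[s] if accept[s] >= 1 else 0

# Toy decision: alternative 1 if the input starts with 'a', alternative 2 for 'b'.
accept = [0, 1, 2]
mn, mx = [ord('a'), 0, 0], [ord('b'), 0, 0]
transition = [[1, 2], [], []]
print(toy_predict([ord('a'), ord('x')], accept, mn, mx, transition))  # → 1
```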
    def noViableAlt(self, s, input):
        if self.recognizer._state.backtracking > 0:
            raise BacktrackingFailed

        nvae = NoViableAltException(
            self.getDescription(),
            self.decisionNumber,
            s,
            input
            )

        self.error(nvae)
        raise nvae


    def error(self, nvae):
        """A hook for debugging interface"""
        pass


    def specialStateTransition(self, s, input):
        return -1


    def getDescription(self):
        return "n/a"


##     def specialTransition(self, state, symbol):
##         return 0


    def unpack(cls, string):
        """@brief Unpack the runlength encoded table data.

        Terence implemented packed table initializers, because Java has a
        size restriction on .class files and the lookup tables can grow
        pretty large. The generated JavaLexer.java of the Java.g example
        would be about 15MB with uncompressed array initializers.

        Python does not have any size restrictions, but the compilation of
        such large source files seems to be pretty memory hungry. The memory
        consumption of the python process grew to >1.5GB when importing a
        15MB lexer, eating all my swap space, and I was too impatient to see
        if it could finish at all. With packed initializers that are unpacked
        at import time of the lexer module, everything works like a charm.

        """

        ret = []
        for i in range(len(string) / 2):
            (n, v) = ord(string[i*2]), ord(string[i*2+1])

            # Is there a bitwise operation to do this?
            if v == 0xFFFF:
                v = -1

            ret += [v] * n

        return ret

    unpack = classmethod(unpack)
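The run-length scheme that unpack() decodes can be reproduced standalone. A sketch in Python 3 spelling (`//` instead of the `/` above; the function name is hypothetical):

```python
def unpack_rle(packed):
    """Decode (count, value) character pairs; 0xFFFF stands in for -1, as in DFA.unpack."""
    ret = []
    for i in range(len(packed) // 2):
        n, v = ord(packed[2 * i]), ord(packed[2 * i + 1])
        if v == 0xFFFF:
            v = -1          # 0xFFFF is the packed encoding of "no transition"
        ret += [v] * n      # expand the run: n copies of v
    return ret

print(unpack_rle("\u0003\u0007\u0002\uffff"))  # → [7, 7, 7, -1, -1]
```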
@ -1,210 +0,0 @@
""" @package antlr3.dottreegenerator
@brief ANTLR3 runtime package, tree module

This module contains all support classes for AST construction and tree parsers.

"""

# begin[licence]
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
# (same BSD licence text as in the first file above)
# end[licence]

# lots of docstrings are missing, don't complain for now...
# pylint: disable-msg=C0111

from antlr3.tree import CommonTreeAdaptor
import stringtemplate3

class DOTTreeGenerator(object):
    """
    A utility class to generate DOT diagrams (graphviz) from
    arbitrary trees. You can pass in your own templates and
    can pass in any kind of tree or use the Tree interface method.
    """

    _treeST = stringtemplate3.StringTemplate(
        template=(
            "digraph {\n" +
            "  ordering=out;\n" +
            "  ranksep=.4;\n" +
            "  node [shape=plaintext, fixedsize=true, fontsize=11, fontname=\"Courier\",\n" +
            "        width=.25, height=.25];\n" +
            "  edge [arrowsize=.5]\n" +
            "  $nodes$\n" +
            "  $edges$\n" +
            "}\n")
        )

    _nodeST = stringtemplate3.StringTemplate(
        template="$name$ [label=\"$text$\"];\n"
        )

    _edgeST = stringtemplate3.StringTemplate(
        template="$parent$ -> $child$ // \"$parentText$\" -> \"$childText$\"\n"
        )

    def __init__(self):
        ## Track node to number mapping so we can get proper node name back
        self.nodeToNumberMap = {}

        ## Track node number so we can get unique node names
        self.nodeNumber = 0


    def toDOT(self, tree, adaptor=None, treeST=_treeST, edgeST=_edgeST):
        if adaptor is None:
            adaptor = CommonTreeAdaptor()

        treeST = treeST.getInstanceOf()

        self.nodeNumber = 0
        self.toDOTDefineNodes(tree, adaptor, treeST)

        self.nodeNumber = 0
        self.toDOTDefineEdges(tree, adaptor, treeST, edgeST)
        return treeST


    def toDOTDefineNodes(self, tree, adaptor, treeST, knownNodes=None):
        if knownNodes is None:
            knownNodes = set()

        if tree is None:
            return

        n = adaptor.getChildCount(tree)
        if n == 0:
            # must have already dumped as child from previous
            # invocation; do nothing
            return

        # define parent node
        number = self.getNodeNumber(tree)
        if number not in knownNodes:
            parentNodeST = self.getNodeST(adaptor, tree)
            treeST.setAttribute("nodes", parentNodeST)
            knownNodes.add(number)

        # for each child, do a "<unique-name> [label=text]" node def
        for i in range(n):
            child = adaptor.getChild(tree, i)

            number = self.getNodeNumber(child)
            if number not in knownNodes:
                nodeST = self.getNodeST(adaptor, child)
                treeST.setAttribute("nodes", nodeST)
                knownNodes.add(number)

            self.toDOTDefineNodes(child, adaptor, treeST, knownNodes)


    def toDOTDefineEdges(self, tree, adaptor, treeST, edgeST):
        if tree is None:
            return

        n = adaptor.getChildCount(tree)
        if n == 0:
            # must have already dumped as child from previous
            # invocation; do nothing
            return

        parentName = "n%d" % self.getNodeNumber(tree)

        # for each child, do a parent -> child edge using unique node names
        parentText = adaptor.getText(tree)
        for i in range(n):
            child = adaptor.getChild(tree, i)
            childText = adaptor.getText(child)
            childName = "n%d" % self.getNodeNumber(child)
            edgeST = edgeST.getInstanceOf()
            edgeST.setAttribute("parent", parentName)
            edgeST.setAttribute("child", childName)
            edgeST.setAttribute("parentText", parentText)
            edgeST.setAttribute("childText", childText)
            treeST.setAttribute("edges", edgeST)
            self.toDOTDefineEdges(child, adaptor, treeST, edgeST)


    def getNodeST(self, adaptor, t):
        text = adaptor.getText(t)
        nodeST = self._nodeST.getInstanceOf()
        uniqueName = "n%d" % self.getNodeNumber(t)
        nodeST.setAttribute("name", uniqueName)
        if text is not None:
            text = text.replace('"', r'\"')
        nodeST.setAttribute("text", text)
        return nodeST


    def getNodeNumber(self, t):
        try:
            return self.nodeToNumberMap[t]
        except KeyError:
            self.nodeToNumberMap[t] = self.nodeNumber
            self.nodeNumber += 1
            return self.nodeNumber - 1


def toDOT(tree, adaptor=None, treeST=DOTTreeGenerator._treeST, edgeST=DOTTreeGenerator._edgeST):
    """
    Generate DOT (graphviz) for a whole tree not just a node.
    For example, 3+4*5 should generate:

    digraph {
        node [shape=plaintext, fixedsize=true, fontsize=11, fontname="Courier",
              width=.4, height=.2];
        edge [arrowsize=.7]
        "+"->3
        "+"->"*"
        "*"->4
        "*"->5
    }

    Return the ST, not a string, in case people want to alter it.

    Takes a Tree interface object.

    Example of invocation:

        import antlr3
        import antlr3.extras

        input = antlr3.ANTLRInputStream(sys.stdin)
        lex = TLexer(input)
        tokens = antlr3.CommonTokenStream(lex)
        parser = TParser(tokens)
        tree = parser.e().tree
        print tree.toStringTree()
        st = antlr3.extras.toDOT(t)
        print st

    """

    gen = DOTTreeGenerator()
    return gen.toDOT(tree, adaptor, treeST, edgeST)
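Without stringtemplate3 installed, the shape of the DOT output the generator produces can be sketched with plain string formatting. The tuple-based tree and the helper name are hypothetical; the `n<number>` node-naming scheme follows the generator above:

```python
def tree_to_dot(tree):
    """Emit DOT for a (label, children) tuple tree, mimicking DOTTreeGenerator output."""
    lines, names = ["digraph {"], {}

    def name(t):
        # Assign each node a unique name n0, n1, ... on first sight.
        return names.setdefault(id(t), "n%d" % len(names))

    def walk(t):
        label, children = t
        lines.append('  %s [label="%s"];' % (name(t), label))
        for child in children:
            walk(child)
            lines.append("  %s -> %s" % (name(t), name(child)))

    walk(tree)
    lines.append("}")
    return "\n".join(lines)

# 3+4*5, the example from the toDOT docstring above.
dot = tree_to_dot(('+', [('3', []), ('*', [('4', []), ('5', [])])]))
print(dot)
```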
@ -1,364 +0,0 @@
"""ANTLR3 exception hierarchy"""

# begin[licence]
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
# (same BSD licence text as in the first file above)
# end[licence]

from antlr3.constants import INVALID_TOKEN_TYPE


class BacktrackingFailed(Exception):
    """@brief Raised to signal failed backtrack attempt"""

    pass


class RecognitionException(Exception):
    """@brief The root of the ANTLR exception hierarchy.

    To avoid English-only error messages and to generally make things
    as flexible as possible, these exceptions are not created with strings,
    but rather the information necessary to generate an error. Then
    the various reporting methods in Parser and Lexer can be overridden
    to generate a localized error message. For example, MismatchedToken
    exceptions are built with the expected token type.
    So, don't expect getMessage() to return anything.

    Note that as of Java 1.4, you can access the stack trace, which means
    that you can compute the complete trace of rules from the start symbol.
    This gives you considerable context information with which to generate
    useful error messages.

    ANTLR generates code that throws exceptions upon recognition error and
    also generates code to catch these exceptions in each rule. If you
    want to quit upon first error, you can turn off the automatic error
    handling mechanism using a rulecatch action, but you still need to
    override methods mismatch and recoverFromMismatchSet.

    In general, the recognition exceptions can track where in a grammar a
    problem occurred and/or what was the expected input. While the parser
    knows its state (such as current input symbol and line info), that
    state can change before the exception is reported, so the current token
    index is computed and stored at exception time. From this info, you can
    perhaps print an entire line of input, not just a single token, for example.
    Better to just say the recognizer had a problem and then let the parser
    figure out a fancy report.

    """

    def __init__(self, input=None):
        Exception.__init__(self)

        # What input stream did the error occur in?
        self.input = None

        # What is the index of the token/char we were looking at when the
        # error occurred?
        self.index = None

        # The current Token when an error occurred. Since not all streams
        # can retrieve the ith Token, we have to track the Token object.
        # For parsers. Even when it's a tree parser, token might be set.
        self.token = None

        # If this is a tree parser exception, node is set to the node with
        # the problem.
        self.node = None

        # The current char when an error occurred. For lexers.
        self.c = None

        # Track the line at which the error occurred in case this is
        # generated from a lexer. We need to track this since the
        # unexpected char doesn't carry the line info.
        self.line = None

        self.charPositionInLine = None

        # If you are parsing a tree node stream, you will encounter some
        # imaginary nodes w/o line/col info. We now search backwards looking
|
|
||||||
# for most recent token with line/col info, but notify getErrorHeader()
|
|
||||||
# that info is approximate.
|
|
||||||
self.approximateLineInfo = False
|
|
||||||
|
|
||||||
|
|
||||||
if input is not None:
|
|
||||||
self.input = input
|
|
||||||
self.index = input.index()
|
|
||||||
|
|
||||||
# late import to avoid cyclic dependencies
|
|
||||||
from antlr3.streams import TokenStream, CharStream
|
|
||||||
from antlr3.tree import TreeNodeStream
|
|
||||||
|
|
||||||
if isinstance(self.input, TokenStream):
|
|
||||||
self.token = self.input.LT(1)
|
|
||||||
self.line = self.token.line
|
|
||||||
self.charPositionInLine = self.token.charPositionInLine
|
|
||||||
|
|
||||||
if isinstance(self.input, TreeNodeStream):
|
|
||||||
self.extractInformationFromTreeNodeStream(self.input)
|
|
||||||
|
|
||||||
else:
|
|
||||||
if isinstance(self.input, CharStream):
|
|
||||||
self.c = self.input.LT(1)
|
|
||||||
self.line = self.input.line
|
|
||||||
self.charPositionInLine = self.input.charPositionInLine
|
|
||||||
|
|
||||||
else:
|
|
||||||
self.c = self.input.LA(1)
|
|
||||||
|
|
||||||
def extractInformationFromTreeNodeStream(self, nodes):
|
|
||||||
from antlr3.tree import Tree, CommonTree
|
|
||||||
from antlr3.tokens import CommonToken
|
|
||||||
|
|
||||||
self.node = nodes.LT(1)
|
|
||||||
adaptor = nodes.adaptor
|
|
||||||
payload = adaptor.getToken(self.node)
|
|
||||||
if payload is not None:
|
|
||||||
self.token = payload
|
|
||||||
if payload.line <= 0:
|
|
||||||
# imaginary node; no line/pos info; scan backwards
|
|
||||||
i = -1
|
|
||||||
priorNode = nodes.LT(i)
|
|
||||||
while priorNode is not None:
|
|
||||||
priorPayload = adaptor.getToken(priorNode)
|
|
||||||
if priorPayload is not None and priorPayload.line > 0:
|
|
||||||
# we found the most recent real line / pos info
|
|
||||||
self.line = priorPayload.line
|
|
||||||
self.charPositionInLine = priorPayload.charPositionInLine
|
|
||||||
self.approximateLineInfo = True
|
|
||||||
break
|
|
||||||
|
|
||||||
i -= 1
|
|
||||||
priorNode = nodes.LT(i)
|
|
||||||
|
|
||||||
else: # node created from real token
|
|
||||||
self.line = payload.line
|
|
||||||
self.charPositionInLine = payload.charPositionInLine
|
|
||||||
|
|
||||||
elif isinstance(self.node, Tree):
|
|
||||||
self.line = self.node.line
|
|
||||||
self.charPositionInLine = self.node.charPositionInLine
|
|
||||||
if isinstance(self.node, CommonTree):
|
|
||||||
self.token = self.node.token
|
|
||||||
|
|
||||||
else:
|
|
||||||
type = adaptor.getType(self.node)
|
|
||||||
text = adaptor.getText(self.node)
|
|
||||||
self.token = CommonToken(type=type, text=text)
|
|
||||||
|
|
||||||
|
|
||||||
def getUnexpectedType(self):
|
|
||||||
"""Return the token type or char of the unexpected input element"""
|
|
||||||
|
|
||||||
from antlr3.streams import TokenStream
|
|
||||||
from antlr3.tree import TreeNodeStream
|
|
||||||
|
|
||||||
if isinstance(self.input, TokenStream):
|
|
||||||
return self.token.type
|
|
||||||
|
|
||||||
elif isinstance(self.input, TreeNodeStream):
|
|
||||||
adaptor = self.input.treeAdaptor
|
|
||||||
return adaptor.getType(self.node)
|
|
||||||
|
|
||||||
else:
|
|
||||||
return self.c
|
|
||||||
|
|
||||||
unexpectedType = property(getUnexpectedType)
|
|
||||||
|
|
||||||
|
|
||||||
class MismatchedTokenException(RecognitionException):
|
|
||||||
"""@brief A mismatched char or Token or tree node."""
|
|
||||||
|
|
||||||
def __init__(self, expecting, input):
|
|
||||||
RecognitionException.__init__(self, input)
|
|
||||||
self.expecting = expecting
|
|
||||||
|
|
||||||
|
|
||||||
def __str__(self):
|
|
||||||
#return "MismatchedTokenException("+self.expecting+")"
|
|
||||||
return "MismatchedTokenException(%r!=%r)" % (
|
|
||||||
self.getUnexpectedType(), self.expecting
|
|
||||||
)
|
|
||||||
__repr__ = __str__
|
|
||||||
|
|
||||||
|
|
||||||
class UnwantedTokenException(MismatchedTokenException):
|
|
||||||
"""An extra token while parsing a TokenStream"""
|
|
||||||
|
|
||||||
def getUnexpectedToken(self):
|
|
||||||
return self.token
|
|
||||||
|
|
||||||
|
|
||||||
def __str__(self):
|
|
||||||
exp = ", expected %s" % self.expecting
|
|
||||||
if self.expecting == INVALID_TOKEN_TYPE:
|
|
||||||
exp = ""
|
|
||||||
|
|
||||||
if self.token is None:
|
|
||||||
return "UnwantedTokenException(found=%s%s)" % (None, exp)
|
|
||||||
|
|
||||||
return "UnwantedTokenException(found=%s%s)" % (self.token.text, exp)
|
|
||||||
__repr__ = __str__
|
|
||||||
|
|
||||||
|
|
||||||
class MissingTokenException(MismatchedTokenException):
|
|
||||||
"""
|
|
||||||
We were expecting a token but it's not found. The current token
|
|
||||||
is actually what we wanted next.
|
|
||||||
"""
|
|
||||||
|
|
||||||
def __init__(self, expecting, input, inserted):
|
|
||||||
MismatchedTokenException.__init__(self, expecting, input)
|
|
||||||
|
|
||||||
self.inserted = inserted
|
|
||||||
|
|
||||||
|
|
||||||
def getMissingType(self):
|
|
||||||
return self.expecting
|
|
||||||
|
|
||||||
|
|
||||||
def __str__(self):
|
|
||||||
if self.inserted is not None and self.token is not None:
|
|
||||||
return "MissingTokenException(inserted %r at %r)" % (
|
|
||||||
self.inserted, self.token.text)
|
|
||||||
|
|
||||||
if self.token is not None:
|
|
||||||
return "MissingTokenException(at %r)" % self.token.text
|
|
||||||
|
|
||||||
return "MissingTokenException"
|
|
||||||
__repr__ = __str__
|
|
||||||
|
|
||||||
|
|
||||||
class MismatchedRangeException(RecognitionException):
|
|
||||||
"""@brief The next token does not match a range of expected types."""
|
|
||||||
|
|
||||||
def __init__(self, a, b, input):
|
|
||||||
RecognitionException.__init__(self, input)
|
|
||||||
|
|
||||||
self.a = a
|
|
||||||
self.b = b
|
|
||||||
|
|
||||||
|
|
||||||
def __str__(self):
|
|
||||||
return "MismatchedRangeException(%r not in [%r..%r])" % (
|
|
||||||
self.getUnexpectedType(), self.a, self.b
|
|
||||||
)
|
|
||||||
__repr__ = __str__
|
|
||||||
|
|
||||||
|
|
||||||
class MismatchedSetException(RecognitionException):
|
|
||||||
"""@brief The next token does not match a set of expected types."""
|
|
||||||
|
|
||||||
def __init__(self, expecting, input):
|
|
||||||
RecognitionException.__init__(self, input)
|
|
||||||
|
|
||||||
self.expecting = expecting
|
|
||||||
|
|
||||||
|
|
||||||
def __str__(self):
|
|
||||||
return "MismatchedSetException(%r not in %r)" % (
|
|
||||||
self.getUnexpectedType(), self.expecting
|
|
||||||
)
|
|
||||||
__repr__ = __str__
|
|
||||||
|
|
||||||
|
|
||||||
class MismatchedNotSetException(MismatchedSetException):
|
|
||||||
"""@brief Used for remote debugger deserialization"""
|
|
||||||
|
|
||||||
def __str__(self):
|
|
||||||
return "MismatchedNotSetException(%r!=%r)" % (
|
|
||||||
self.getUnexpectedType(), self.expecting
|
|
||||||
)
|
|
||||||
__repr__ = __str__
|
|
||||||
|
|
||||||
|
|
||||||
class NoViableAltException(RecognitionException):
|
|
||||||
"""@brief Unable to decide which alternative to choose."""
|
|
||||||
|
|
||||||
def __init__(
|
|
||||||
self, grammarDecisionDescription, decisionNumber, stateNumber, input
|
|
||||||
):
|
|
||||||
RecognitionException.__init__(self, input)
|
|
||||||
|
|
||||||
self.grammarDecisionDescription = grammarDecisionDescription
|
|
||||||
self.decisionNumber = decisionNumber
|
|
||||||
self.stateNumber = stateNumber
|
|
||||||
|
|
||||||
|
|
||||||
def __str__(self):
|
|
||||||
return "NoViableAltException(%r!=[%r])" % (
|
|
||||||
self.unexpectedType, self.grammarDecisionDescription
|
|
||||||
)
|
|
||||||
__repr__ = __str__
|
|
||||||
|
|
||||||
|
|
||||||
class EarlyExitException(RecognitionException):
|
|
||||||
"""@brief The recognizer did not match anything for a (..)+ loop."""
|
|
||||||
|
|
||||||
def __init__(self, decisionNumber, input):
|
|
||||||
RecognitionException.__init__(self, input)
|
|
||||||
|
|
||||||
self.decisionNumber = decisionNumber
|
|
||||||
|
|
||||||
|
|
||||||
class FailedPredicateException(RecognitionException):
|
|
||||||
"""@brief A semantic predicate failed during validation.
|
|
||||||
|
|
||||||
Validation of predicates
|
|
||||||
occurs when normally parsing the alternative just like matching a token.
|
|
||||||
Disambiguating predicate evaluation occurs when we hoist a predicate into
|
|
||||||
a prediction decision.
|
|
||||||
"""
|
|
||||||
|
|
||||||
def __init__(self, input, ruleName, predicateText):
|
|
||||||
RecognitionException.__init__(self, input)
|
|
||||||
|
|
||||||
self.ruleName = ruleName
|
|
||||||
self.predicateText = predicateText
|
|
||||||
|
|
||||||
|
|
||||||
def __str__(self):
|
|
||||||
return "FailedPredicateException("+self.ruleName+",{"+self.predicateText+"}?)"
|
|
||||||
__repr__ = __str__
|
|
||||||
|
|
||||||
|
|
||||||
class MismatchedTreeNodeException(RecognitionException):
|
|
||||||
"""@brief The next tree mode does not match the expected type."""
|
|
||||||
|
|
||||||
def __init__(self, expecting, input):
|
|
||||||
RecognitionException.__init__(self, input)
|
|
||||||
|
|
||||||
self.expecting = expecting
|
|
||||||
|
|
||||||
def __str__(self):
|
|
||||||
return "MismatchedTreeNodeException(%r!=%r)" % (
|
|
||||||
self.getUnexpectedType(), self.expecting
|
|
||||||
)
|
|
||||||
__repr__ = __str__
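# --- illustration (not part of the deleted module) --------------------------
# The backwards scan in extractInformationFromTreeNodeStream above can be
# tried in isolation: given a lookahead function LT(i), where negative i
# looks behind the current position, walk back until a payload carrying a
# real line number (> 0) turns up.  The toy LT below is a stand-in, not the
# real TreeNodeStream API.

def _find_prior_line(LT, get_line):
    i = -1
    node = LT(i)
    while node is not None:
        if get_line(node) > 0:
            return get_line(node)  # most recent real line info
        i -= 1
        node = LT(i)
    return None

_behind = [{'line': 0}, {'line': 0}, {'line': 7}]  # most recent first
_LT = lambda i: _behind[-i - 1] if -i - 1 < len(_behind) else None
print(_find_prior_line(_LT, lambda n: n['line']))  # -> 7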
@@ -1,47 +0,0 @@
""" @package antlr3.dottreegenerator
@brief ANTLR3 runtime package, tree module

This module contains all support classes for AST construction and tree parsers.

"""

# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
#    derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]

# lots of docstrings are missing, don't complain for now...
# pylint: disable-msg=C0111

from treewizard import TreeWizard

try:
    from antlr3.dottreegen import toDOT
except ImportError, exc:
    def toDOT(*args, **kwargs):
        raise exc
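# --- illustration (not part of the deleted module) --------------------------
# The try/except above defers an ImportError: importing the module always
# succeeds, and the missing feature only fails when toDOT() is actually
# called.  The same pattern in isolation ("no_such_module" is a deliberate
# dummy; "as exc" plus the _err binding keeps it portable to Python 3,
# where the bound exception name is cleared after the except block):

try:
    from no_such_module import fancy
except ImportError as exc:
    _err = exc
    def fancy(*args, **kwargs):
        raise _err

try:
    fancy()
except ImportError:
    print("fancy() unavailable; the rest of the module still works")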
@@ -1,305 +0,0 @@
"""ANTLR3 runtime package"""

# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
#    derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]


import sys
import optparse

import antlr3


class _Main(object):
    def __init__(self):
        self.stdin = sys.stdin
        self.stdout = sys.stdout
        self.stderr = sys.stderr


    def parseOptions(self, argv):
        optParser = optparse.OptionParser()
        optParser.add_option(
            "--encoding",
            action="store",
            type="string",
            dest="encoding"
            )
        optParser.add_option(
            "--input",
            action="store",
            type="string",
            dest="input"
            )
        optParser.add_option(
            "--interactive", "-i",
            action="store_true",
            dest="interactive"
            )
        optParser.add_option(
            "--no-output",
            action="store_true",
            dest="no_output"
            )
        optParser.add_option(
            "--profile",
            action="store_true",
            dest="profile"
            )
        optParser.add_option(
            "--hotshot",
            action="store_true",
            dest="hotshot"
            )
        optParser.add_option(
            "--port",
            type="int",
            dest="port",
            default=None
            )
        optParser.add_option(
            "--debug-socket",
            action='store_true',
            dest="debug_socket",
            default=None
            )

        self.setupOptions(optParser)

        return optParser.parse_args(argv[1:])


    def setupOptions(self, optParser):
        pass


    def execute(self, argv):
        options, args = self.parseOptions(argv)

        self.setUp(options)

        if options.interactive:
            while True:
                try:
                    input = raw_input(">>> ")
                except (EOFError, KeyboardInterrupt):
                    self.stdout.write("\nBye.\n")
                    break

                inStream = antlr3.ANTLRStringStream(input)
                self.parseStream(options, inStream)

        else:
            if options.input is not None:
                inStream = antlr3.ANTLRStringStream(options.input)

            elif len(args) == 1 and args[0] != '-':
                inStream = antlr3.ANTLRFileStream(
                    args[0], encoding=options.encoding
                    )

            else:
                inStream = antlr3.ANTLRInputStream(
                    self.stdin, encoding=options.encoding
                    )

            if options.profile:
                try:
                    import cProfile as profile
                except ImportError:
                    import profile

                profile.runctx(
                    'self.parseStream(options, inStream)',
                    globals(),
                    locals(),
                    'profile.dat'
                    )

                import pstats
                stats = pstats.Stats('profile.dat')
                stats.strip_dirs()
                stats.sort_stats('time')
                stats.print_stats(100)

            elif options.hotshot:
                import hotshot

                profiler = hotshot.Profile('hotshot.dat')
                profiler.runctx(
                    'self.parseStream(options, inStream)',
                    globals(),
                    locals()
                    )

            else:
                self.parseStream(options, inStream)


    def setUp(self, options):
        pass


    def parseStream(self, options, inStream):
        raise NotImplementedError


    def write(self, options, text):
        if not options.no_output:
            self.stdout.write(text)


    def writeln(self, options, text):
        self.write(options, text + '\n')


class LexerMain(_Main):
    def __init__(self, lexerClass):
        _Main.__init__(self)

        self.lexerClass = lexerClass


    def parseStream(self, options, inStream):
        lexer = self.lexerClass(inStream)
        for token in lexer:
            self.writeln(options, str(token))


class ParserMain(_Main):
    def __init__(self, lexerClassName, parserClass):
        _Main.__init__(self)

        self.lexerClassName = lexerClassName
        self.lexerClass = None
        self.parserClass = parserClass


    def setupOptions(self, optParser):
        optParser.add_option(
            "--lexer",
            action="store",
            type="string",
            dest="lexerClass",
            default=self.lexerClassName
            )
        optParser.add_option(
            "--rule",
            action="store",
            type="string",
            dest="parserRule"
            )


    def setUp(self, options):
        lexerMod = __import__(options.lexerClass)
        self.lexerClass = getattr(lexerMod, options.lexerClass)


    def parseStream(self, options, inStream):
        kwargs = {}
        if options.port is not None:
            kwargs['port'] = options.port
        if options.debug_socket is not None:
            kwargs['debug_socket'] = sys.stderr

        lexer = self.lexerClass(inStream)
        tokenStream = antlr3.CommonTokenStream(lexer)
        parser = self.parserClass(tokenStream, **kwargs)
        result = getattr(parser, options.parserRule)()
        if result is not None:
            if hasattr(result, 'tree') and result.tree is not None:
                self.writeln(options, result.tree.toStringTree())
            else:
                self.writeln(options, repr(result))
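# --- illustration (not part of the deleted module) --------------------------
# Typical driver usage: a parser generated from a hypothetical grammar "T"
# would wire itself to ParserMain roughly like this (class and rule names
# are placeholders, not from this commit):
#
#     if __name__ == '__main__':
#         main = ParserMain('TLexer', TParser)
#         main.execute(sys.argv)
#
# invoked e.g. as "python TParser.py --rule compilationUnit input.txt";
# --interactive gives a read-eval loop, and --profile/--hotshot wrap
# parseStream in a profiler.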


class WalkerMain(_Main):
    def __init__(self, walkerClass):
        _Main.__init__(self)

        self.lexerClass = None
        self.parserClass = None
        self.walkerClass = walkerClass


    def setupOptions(self, optParser):
        optParser.add_option(
            "--lexer",
            action="store",
            type="string",
            dest="lexerClass",
            default=None
            )
        optParser.add_option(
            "--parser",
            action="store",
            type="string",
            dest="parserClass",
            default=None
            )
        optParser.add_option(
            "--parser-rule",
            action="store",
            type="string",
            dest="parserRule",
            default=None
            )
        optParser.add_option(
            "--rule",
            action="store",
            type="string",
            dest="walkerRule"
            )


    def setUp(self, options):
        lexerMod = __import__(options.lexerClass)
        self.lexerClass = getattr(lexerMod, options.lexerClass)
        parserMod = __import__(options.parserClass)
        self.parserClass = getattr(parserMod, options.parserClass)


    def parseStream(self, options, inStream):
        lexer = self.lexerClass(inStream)
        tokenStream = antlr3.CommonTokenStream(lexer)
        parser = self.parserClass(tokenStream)
        result = getattr(parser, options.parserRule)()
        if result is not None:
            assert hasattr(result, 'tree'), "Parser did not return an AST"
            nodeStream = antlr3.tree.CommonTreeNodeStream(result.tree)
            nodeStream.setTokenStream(tokenStream)
            walker = self.walkerClass(nodeStream)
            result = getattr(walker, options.walkerRule)()
            if result is not None:
                if hasattr(result, 'tree'):
                    self.writeln(options, result.tree.toStringTree())
                else:
                    self.writeln(options, repr(result))
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -1,418 +0,0 @@
"""ANTLR3 runtime package"""

# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
#    derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]

from antlr3.constants import EOF, DEFAULT_CHANNEL, INVALID_TOKEN_TYPE

############################################################################
#
# basic token interface
#
############################################################################

class Token(object):
    """@brief Abstract token baseclass."""

    def getText(self):
        """@brief Get the text of the token.

        Using setter/getter methods is deprecated. Use o.text instead.
        """
        raise NotImplementedError

    def setText(self, text):
        """@brief Set the text of the token.

        Using setter/getter methods is deprecated. Use o.text instead.
        """
        raise NotImplementedError


    def getType(self):
        """@brief Get the type of the token.

        Using setter/getter methods is deprecated. Use o.type instead."""

        raise NotImplementedError

    def setType(self, ttype):
        """@brief Set the type of the token.

        Using setter/getter methods is deprecated. Use o.type instead."""

        raise NotImplementedError


    def getLine(self):
        """@brief Get the line number on which this token was matched

        Lines are numbered 1..n

        Using setter/getter methods is deprecated. Use o.line instead."""

        raise NotImplementedError

    def setLine(self, line):
        """@brief Set the line number on which this token was matched

        Using setter/getter methods is deprecated. Use o.line instead."""

        raise NotImplementedError


    def getCharPositionInLine(self):
        """@brief Get the column of the token's first character.

        Columns are numbered 0..n-1

        Using setter/getter methods is deprecated. Use o.charPositionInLine instead."""

        raise NotImplementedError

    def setCharPositionInLine(self, pos):
        """@brief Set the column of the token's first character.

        Using setter/getter methods is deprecated. Use o.charPositionInLine instead."""

        raise NotImplementedError


    def getChannel(self):
        """@brief Get the channel of the token

        Using setter/getter methods is deprecated. Use o.channel instead."""

        raise NotImplementedError

    def setChannel(self, channel):
        """@brief Set the channel of the token

        Using setter/getter methods is deprecated. Use o.channel instead."""

        raise NotImplementedError


    def getTokenIndex(self):
        """@brief Get the index in the input stream.

        An index from 0..n-1 of the token object in the input stream.
        This must be valid in order to use the ANTLRWorks debugger.

        Using setter/getter methods is deprecated. Use o.index instead."""

        raise NotImplementedError

    def setTokenIndex(self, index):
        """@brief Set the index in the input stream.

        Using setter/getter methods is deprecated. Use o.index instead."""

        raise NotImplementedError


    def getInputStream(self):
        """@brief From what character stream was this token created.

        You don't have to implement but it's nice to know where a Token
        comes from if you have include files etc... on the input."""

        raise NotImplementedError

    def setInputStream(self, input):
        """@brief From what character stream was this token created.

        You don't have to implement but it's nice to know where a Token
        comes from if you have include files etc... on the input."""

        raise NotImplementedError
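# --- illustration (not part of the deleted module) --------------------------
# CommonToken below avoids copying token text: it keeps start/stop indices
# into the input and slices only when the text is first requested.  The same
# idea in isolation (a plain string stands in for the CharStream API):

class _LazySlice(object):
    def __init__(self, buf, start, stop):
        self.buf, self.start, self.stop = buf, start, stop
        self._text = None

    @property
    def text(self):
        if self._text is None:
            # stop is the index of the last char, hence the +1
            self._text = self.buf[self.start:self.stop + 1]
        return self._text

print(_LazySlice("int x = 1;", 4, 4).text)  # -> x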


############################################################################
#
# token implementations
#
# Token
# +- CommonToken
# \- ClassicToken
#
############################################################################

class CommonToken(Token):
    """@brief Basic token implementation.

    This implementation does not copy the text from the input stream upon
    creation, but keeps start/stop pointers into the stream to avoid
    unnecessary copy operations.

    """

    def __init__(self, type=None, channel=DEFAULT_CHANNEL, text=None,
                 input=None, start=None, stop=None, oldToken=None):
        Token.__init__(self)

        if oldToken is not None:
            self.type = oldToken.type
            self.line = oldToken.line
            self.charPositionInLine = oldToken.charPositionInLine
            self.channel = oldToken.channel
            self.index = oldToken.index
            self._text = oldToken._text
            self.input = oldToken.input
            if isinstance(oldToken, CommonToken):
                self.start = oldToken.start
                self.stop = oldToken.stop

        else:
            self.type = type
            self.input = input
            self.charPositionInLine = -1 # set to invalid position
            self.line = 0
            self.channel = channel

            # What token number is this from 0..n-1 tokens; < 0 implies invalid index
            self.index = -1

            # We need to be able to change the text once in a while. If
            # this is non-null, then getText should return this. Note that
            # start/stop are not affected by changing this.
            self._text = text

            # The char position into the input buffer where this token starts
            self.start = start

            # The char position into the input buffer where this token stops
            # This is the index of the last char, *not* the index after it!
            self.stop = stop


    def getText(self):
        if self._text is not None:
            return self._text

        if self.input is None:
            return None

        if self.start < self.input.size() and self.stop < self.input.size():
            return self.input.substring(self.start, self.stop)

        return '<EOF>'


    def setText(self, text):
        """
        Override the text for this token. getText() will return this text
        rather than pulling from the buffer. Note that this does not mean
        that start/stop indexes are not valid. It means that the input
        was converted to a new string in the token object.
        """
        self._text = text

    text = property(getText, setText)


    def getType(self):
        return self.type

    def setType(self, ttype):
        self.type = ttype

    def getTypeName(self):
        return str(self.type)

    typeName = property(lambda s: s.getTypeName())

    def getLine(self):
        return self.line

    def setLine(self, line):
        self.line = line


    def getCharPositionInLine(self):
        return self.charPositionInLine

    def setCharPositionInLine(self, pos):
        self.charPositionInLine = pos


    def getChannel(self):
        return self.channel

    def setChannel(self, channel):
        self.channel = channel


    def getTokenIndex(self):
        return self.index

    def setTokenIndex(self, index):
        self.index = index


    def getInputStream(self):
|
|
||||||
return self.input
|
|
||||||
|
|
||||||
def setInputStream(self, input):
|
|
||||||
self.input = input
|
|
||||||
|
|
||||||
|
|
||||||
def __str__(self):
|
|
||||||
if self.type == EOF:
|
|
||||||
return "<EOF>"
|
|
||||||
|
|
||||||
channelStr = ""
|
|
||||||
if self.channel > 0:
|
|
||||||
channelStr = ",channel=" + str(self.channel)
|
|
||||||
|
|
||||||
txt = self.text
|
|
||||||
if txt is not None:
|
|
||||||
txt = txt.replace("\n","\\\\n")
|
|
||||||
txt = txt.replace("\r","\\\\r")
|
|
||||||
txt = txt.replace("\t","\\\\t")
|
|
||||||
else:
|
|
||||||
txt = "<no text>"
|
|
||||||
|
|
||||||
return "[@%d,%d:%d=%r,<%s>%s,%d:%d]" % (
|
|
||||||
self.index,
|
|
||||||
self.start, self.stop,
|
|
||||||
txt,
|
|
||||||
self.typeName, channelStr,
|
|
||||||
self.line, self.charPositionInLine
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class ClassicToken(Token):
|
|
||||||
"""@brief Alternative token implementation.
|
|
||||||
|
|
||||||
A Token object like we'd use in ANTLR 2.x; has an actual string created
|
|
||||||
and associated with this object. These objects are needed for imaginary
|
|
||||||
tree nodes that have payload objects. We need to create a Token object
|
|
||||||
that has a string; the tree node will point at this token. CommonToken
|
|
||||||
has indexes into a char stream and hence cannot be used to introduce
|
|
||||||
new strings.
|
|
||||||
"""
|
|
||||||
|
|
||||||
def __init__(self, type=None, text=None, channel=DEFAULT_CHANNEL,
|
|
||||||
oldToken=None
|
|
||||||
):
|
|
||||||
Token.__init__(self)
|
|
||||||
|
|
||||||
if oldToken is not None:
|
|
||||||
self.text = oldToken.text
|
|
||||||
self.type = oldToken.type
|
|
||||||
self.line = oldToken.line
|
|
||||||
self.charPositionInLine = oldToken.charPositionInLine
|
|
||||||
self.channel = oldToken.channel
|
|
||||||
|
|
||||||
self.text = text
|
|
||||||
self.type = type
|
|
||||||
self.line = None
|
|
||||||
self.charPositionInLine = None
|
|
||||||
self.channel = channel
|
|
||||||
self.index = None
|
|
||||||
|
|
||||||
|
|
||||||
def getText(self):
|
|
||||||
return self.text
|
|
||||||
|
|
||||||
def setText(self, text):
|
|
||||||
self.text = text
|
|
||||||
|
|
||||||
|
|
||||||
def getType(self):
|
|
||||||
return self.type
|
|
||||||
|
|
||||||
def setType(self, ttype):
|
|
||||||
self.type = ttype
|
|
||||||
|
|
||||||
|
|
||||||
def getLine(self):
|
|
||||||
return self.line
|
|
||||||
|
|
||||||
def setLine(self, line):
|
|
||||||
self.line = line
|
|
||||||
|
|
||||||
|
|
||||||
def getCharPositionInLine(self):
|
|
||||||
return self.charPositionInLine
|
|
||||||
|
|
||||||
def setCharPositionInLine(self, pos):
|
|
||||||
self.charPositionInLine = pos
|
|
||||||
|
|
||||||
|
|
||||||
def getChannel(self):
|
|
||||||
return self.channel
|
|
||||||
|
|
||||||
def setChannel(self, channel):
|
|
||||||
self.channel = channel
|
|
||||||
|
|
||||||
|
|
||||||
def getTokenIndex(self):
|
|
||||||
return self.index
|
|
||||||
|
|
||||||
def setTokenIndex(self, index):
|
|
||||||
self.index = index
|
|
||||||
|
|
||||||
|
|
||||||
def getInputStream(self):
|
|
||||||
return None
|
|
||||||
|
|
||||||
def setInputStream(self, input):
|
|
||||||
pass
|
|
||||||
|
|
||||||
|
|
||||||
def toString(self):
|
|
||||||
channelStr = ""
|
|
||||||
if self.channel > 0:
|
|
||||||
channelStr = ",channel=" + str(self.channel)
|
|
||||||
|
|
||||||
txt = self.text
|
|
||||||
if txt is None:
|
|
||||||
txt = "<no text>"
|
|
||||||
|
|
||||||
return "[@%r,%r,<%r>%s,%r:%r]" % (self.index,
|
|
||||||
txt,
|
|
||||||
self.type,
|
|
||||||
channelStr,
|
|
||||||
self.line,
|
|
||||||
self.charPositionInLine
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
__str__ = toString
|
|
||||||
__repr__ = toString
|
|
||||||
|
|
||||||
|
|
||||||
INVALID_TOKEN = CommonToken(type=INVALID_TOKEN_TYPE)
|
|
||||||
|
|
||||||
# In an action, a lexer rule can set token to this SKIP_TOKEN and ANTLR
|
|
||||||
# will avoid creating a token for this symbol and try to fetch another.
|
|
||||||
SKIP_TOKEN = CommonToken(type=INVALID_TOKEN_TYPE)
|
|
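The CommonToken docstring above describes its lazy-text design: the token holds start/stop offsets into the input stream and only slices the text out when asked, unless an override has been set. A minimal standalone sketch of that idea (the `LazyToken` class and the plain-string "stream" here are invented for illustration; the real class works against an ANTLR char stream):

```python
class LazyToken(object):
    """Holds start/stop offsets into the input; slices text only on demand."""

    def __init__(self, input, start, stop, text=None):
        self.input = input    # the character stream (a plain str here)
        self.start = start    # offset of the first character of the token
        self.stop = stop      # offset of the last character, inclusive
        self._text = text     # overrides the stream slice when set

    @property
    def text(self):
        if self._text is not None:
            return self._text
        if self.input is None:
            return None
        return self.input[self.start:self.stop + 1]   # stop is inclusive

    @text.setter
    def text(self, value):
        self._text = value

stream = "x = 42;"
tok = LazyToken(stream, start=4, stop=5)   # covers "42"
```

Reading `tok.text` slices `"42"` out of the stream; assigning to it afterwards pins the text without touching start/stop, mirroring `setText` above.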
File diff suppressed because it is too large
@@ -1,619 +0,0 @@
""" @package antlr3.tree
@brief ANTLR3 runtime package, treewizard module

A utility module to create ASTs at runtime.
See <http://www.antlr.org/wiki/display/~admin/2007/07/02/Exploring+Concept+of+TreeWizard> for an overview. Note that the API of the Python implementation is slightly different.

"""

# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
#    derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]

from antlr3.constants import INVALID_TOKEN_TYPE
from antlr3.tokens import CommonToken
from antlr3.tree import CommonTree, CommonTreeAdaptor


def computeTokenTypes(tokenNames):
    """
    Compute a dict that is an inverted index of
    tokenNames (which maps int token types to names).
    """

    if tokenNames is None:
        return {}

    return dict((name, type) for type, name in enumerate(tokenNames))
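computeTokenTypes above simply inverts the token-name table: the list index is the token type, and the result maps each name back to its type. A standalone sketch of the same idiom (the token names here are made up for illustration; a generated parser provides the real table):

```python
def compute_token_types(token_names):
    """Invert a list mapping int token types to names into a name -> type dict."""
    if token_names is None:
        return {}
    return dict((name, ttype) for ttype, name in enumerate(token_names))

# Hypothetical token-name table; the index in the list is the token type.
names = ["<invalid>", "<EOR>", "<DOWN>", "<UP>", "ID", "INT"]
types = compute_token_types(names)
```

With this index, `types["ID"]` recovers the numeric type that the wizard needs to interpret a name like `ID` inside a pattern.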


## token types for pattern parser
EOF = -1
BEGIN = 1
END = 2
ID = 3
ARG = 4
PERCENT = 5
COLON = 6
DOT = 7

class TreePatternLexer(object):
    def __init__(self, pattern):
        ## The tree pattern to lex like "(A B C)"
        self.pattern = pattern

        ## Index into input string
        self.p = -1

        ## Current char
        self.c = None

        ## How long is the pattern in char?
        self.n = len(pattern)

        ## Set when token type is ID or ARG
        self.sval = None

        self.error = False

        self.consume()


    __idStartChar = frozenset(
        'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_'
        )
    __idChar = __idStartChar | frozenset('0123456789')

    def nextToken(self):
        self.sval = ""
        while self.c != EOF:
            if self.c in (' ', '\n', '\r', '\t'):
                self.consume()
                continue

            if self.c in self.__idStartChar:
                self.sval += self.c
                self.consume()
                while self.c in self.__idChar:
                    self.sval += self.c
                    self.consume()

                return ID

            if self.c == '(':
                self.consume()
                return BEGIN

            if self.c == ')':
                self.consume()
                return END

            if self.c == '%':
                self.consume()
                return PERCENT

            if self.c == ':':
                self.consume()
                return COLON

            if self.c == '.':
                self.consume()
                return DOT

            if self.c == '[': # grab [x] as a string, returning x
                self.consume()
                while self.c != ']':
                    if self.c == '\\':
                        self.consume()
                        if self.c != ']':
                            self.sval += '\\'

                        self.sval += self.c

                    else:
                        self.sval += self.c

                    self.consume()

                self.consume()
                return ARG

            self.consume()
            self.error = True
            return EOF

        return EOF


    def consume(self):
        self.p += 1
        if self.p >= self.n:
            self.c = EOF

        else:
            self.c = self.pattern[self.p]
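The scanning loop of TreePatternLexer above can be condensed into a small standalone function; this sketch yields `(type, text)` pairs instead of mutating `sval`, uses `isalpha`/`isalnum` in place of the frozensets, and omits the backslash-escape handling inside `[...]` for brevity (the token codes match the module's constants):

```python
# Token codes, matching the pattern-parser constants in the module above.
EOF, BEGIN, END, ID, ARG, PERCENT, COLON, DOT = -1, 1, 2, 3, 4, 5, 6, 7

def lex_pattern(pattern):
    """Yield (type, text) pairs for a tree pattern like '(A B[payload])'."""
    singles = {'(': BEGIN, ')': END, '%': PERCENT, ':': COLON, '.': DOT}
    i, n = 0, len(pattern)
    while i < n:
        c = pattern[i]
        if c in ' \n\r\t':                 # skip whitespace
            i += 1
        elif c.isalpha() or c == '_':      # identifier: [A-Za-z_][A-Za-z0-9_]*
            j = i
            while j < n and (pattern[j].isalnum() or pattern[j] == '_'):
                j += 1
            yield ID, pattern[i:j]
            i = j
        elif c in singles:                 # single-char punctuation tokens
            yield singles[c], c
            i += 1
        elif c == '[':                     # grab [x] as a string, yielding x
            j = pattern.index(']', i)
            yield ARG, pattern[i + 1:j]
            i = j + 1
        else:
            raise ValueError('unexpected char %r' % c)

tokens = list(lex_pattern('(ASSIGN ID INT[3])'))
```

Here `INT[3]` lexes as an `ID` token followed by an `ARG` token carrying the text argument, exactly the shape the pattern parser below expects.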


class TreePatternParser(object):
    def __init__(self, tokenizer, wizard, adaptor):
        self.tokenizer = tokenizer
        self.wizard = wizard
        self.adaptor = adaptor
        self.ttype = tokenizer.nextToken() # kickstart


    def pattern(self):
        if self.ttype == BEGIN:
            return self.parseTree()

        elif self.ttype == ID:
            node = self.parseNode()
            if self.ttype == EOF:
                return node

            return None # extra junk on end

        return None


    def parseTree(self):
        if self.ttype != BEGIN:
            return None

        self.ttype = self.tokenizer.nextToken()
        root = self.parseNode()
        if root is None:
            return None

        while self.ttype in (BEGIN, ID, PERCENT, DOT):
            if self.ttype == BEGIN:
                subtree = self.parseTree()
                self.adaptor.addChild(root, subtree)

            else:
                child = self.parseNode()
                if child is None:
                    return None

                self.adaptor.addChild(root, child)

        if self.ttype != END:
            return None

        self.ttype = self.tokenizer.nextToken()
        return root


    def parseNode(self):
        # "%label:" prefix
        label = None

        if self.ttype == PERCENT:
            self.ttype = self.tokenizer.nextToken()
            if self.ttype != ID:
                return None

            label = self.tokenizer.sval
            self.ttype = self.tokenizer.nextToken()
            if self.ttype != COLON:
                return None

            self.ttype = self.tokenizer.nextToken() # move to ID following colon

        # Wildcard?
        if self.ttype == DOT:
            self.ttype = self.tokenizer.nextToken()
            wildcardPayload = CommonToken(0, ".")
            node = WildcardTreePattern(wildcardPayload)
            if label is not None:
                node.label = label
            return node

        # "ID" or "ID[arg]"
        if self.ttype != ID:
            return None

        tokenName = self.tokenizer.sval
        self.ttype = self.tokenizer.nextToken()

        if tokenName == "nil":
            return self.adaptor.nil()

        text = tokenName
        # check for arg
        arg = None
        if self.ttype == ARG:
            arg = self.tokenizer.sval
            text = arg
            self.ttype = self.tokenizer.nextToken()

        # create node
        treeNodeType = self.wizard.getTokenType(tokenName)
        if treeNodeType == INVALID_TOKEN_TYPE:
            return None

        node = self.adaptor.createFromType(treeNodeType, text)
        if label is not None and isinstance(node, TreePattern):
            node.label = label

        if arg is not None and isinstance(node, TreePattern):
            node.hasTextArg = True

        return node
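parseTree above is a tiny recursive-descent parser over the lexer's token stream: open paren, root node, children (recursing on nested parens), close paren. A standalone sketch of the same shape, consuming `(type, text)` pairs and building nested lists instead of adaptor-created nodes (token codes as in the module; labels, wildcards, and error recovery are left out):

```python
BEGIN, END, ID = 1, 2, 3

def parse_tree(tokens, pos=0):
    """Parse '(' ID node* ')' from a list of (type, text) pairs.

    Returns (subtree, next_pos); a subtree is [root, child1, ...].
    """
    assert tokens[pos][0] == BEGIN
    pos += 1
    ttype, text = tokens[pos]
    assert ttype == ID                      # root of the subtree
    tree = [text]
    pos += 1
    while tokens[pos][0] in (BEGIN, ID):    # children until ')'
        if tokens[pos][0] == BEGIN:
            subtree, pos = parse_tree(tokens, pos)
            tree.append(subtree)
        else:
            tree.append(tokens[pos][1])
            pos += 1
    assert tokens[pos][0] == END
    return tree, pos + 1

# token stream for the pattern "(A B (C D))"
toks = [(BEGIN, '('), (ID, 'A'), (ID, 'B'),
        (BEGIN, '('), (ID, 'C'), (ID, 'D'), (END, ')'), (END, ')')]
tree, _ = parse_tree(toks)
```

The nesting of the result mirrors the parentheses of the pattern, which is all the real parser does too, modulo node creation through the adaptor.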
class TreePattern(CommonTree):
    """
    When using %label:TOKENNAME in a tree for parse(), we must
    track the label.
    """

    def __init__(self, payload):
        CommonTree.__init__(self, payload)

        self.label = None
        self.hasTextArg = None


    def toString(self):
        if self.label is not None:
            return '%' + self.label + ':' + CommonTree.toString(self)

        else:
            return CommonTree.toString(self)


class WildcardTreePattern(TreePattern):
    pass


class TreePatternTreeAdaptor(CommonTreeAdaptor):
    """This adaptor creates TreePattern objects for use during scan()"""

    def createWithPayload(self, payload):
        return TreePattern(payload)
class TreeWizard(object):
    """
    Build and navigate trees with this object. Must know about the names
    of tokens so you have to pass in a map or array of token names (from which
    this class can build the map). I.e., Token DECL means nothing unless the
    class can translate it to a token type.

    In order to create nodes and navigate, this class needs a TreeAdaptor.

    This class can build a token type -> node index for repeated use or for
    iterating over the various nodes with a particular type.

    This class works in conjunction with the TreeAdaptor rather than moving
    all this functionality into the adaptor. An adaptor helps build and
    navigate trees using methods. This class helps you do it with string
    patterns like "(A B C)". You can create a tree from that pattern or
    match subtrees against it.
    """

    def __init__(self, adaptor=None, tokenNames=None, typeMap=None):
        if adaptor is None:
            self.adaptor = CommonTreeAdaptor()

        else:
            self.adaptor = adaptor

        if typeMap is None:
            self.tokenNameToTypeMap = computeTokenTypes(tokenNames)

        else:
            if tokenNames is not None:
                raise ValueError("Can't have both tokenNames and typeMap")

            self.tokenNameToTypeMap = typeMap


    def getTokenType(self, tokenName):
        """Using the map of token names to token types, return the type."""

        try:
            return self.tokenNameToTypeMap[tokenName]
        except KeyError:
            return INVALID_TOKEN_TYPE
    def create(self, pattern):
        """
        Create a tree or node from the indicated tree pattern that closely
        follows ANTLR tree grammar tree element syntax:

        (root child1 ... child2).

        You can also just pass in a node: ID

        Any node can have a text argument: ID[foo]
        (notice there are no quotes around foo--it's clear it's a string).

        nil is a special name meaning "give me a nil node". Useful for
        making lists: (nil A B C) is a list of A B C.
        """

        tokenizer = TreePatternLexer(pattern)
        parser = TreePatternParser(tokenizer, self, self.adaptor)
        return parser.pattern()


    def index(self, tree):
        """Walk the entire tree and make a node name to nodes mapping.

        For now, use recursion but later nonrecursive version may be
        more efficient. Returns a dict int -> list where the list is
        of your AST node type. The int is the token type of the node.
        """

        m = {}
        self._index(tree, m)
        return m


    def _index(self, t, m):
        """Do the work for index"""

        if t is None:
            return

        ttype = self.adaptor.getType(t)
        elements = m.get(ttype)
        if elements is None:
            m[ttype] = elements = []

        elements.append(t)
        for i in range(self.adaptor.getChildCount(t)):
            child = self.adaptor.getChild(t, i)
            self._index(child, m)
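The recursion in _index is a plain pre-order walk that buckets every node under its token type. A standalone sketch using invented `(type, [children])` tuples in place of adaptor-managed nodes (the tuple encoding and the numeric types are made up for illustration):

```python
def index_tree(node, mapping=None):
    """Pre-order walk building a {token_type: [nodes]} index.

    A node is a (type, [children]) tuple; the tuple access stands in
    for the adaptor.getType / getChildCount / getChild calls used by
    the real TreeWizard.
    """
    if mapping is None:
        mapping = {}
    if node is None:
        return mapping
    ttype, children = node
    mapping.setdefault(ttype, []).append(node)   # bucket node by its type
    for child in children:
        index_tree(child, mapping)
    return mapping

# (ASSIGN (ID) (INT)) with made-up numeric types 10, 4, 5
tree = (10, [(4, []), (5, [])])
idx = index_tree(tree)
```

The resulting dict lets you repeatedly fetch all nodes of a given type without re-walking the tree, which is the stated purpose of index().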
    def find(self, tree, what):
        """Return a list of matching tokens.

        what may either be an integer specifying the token type to find or
        a string with a pattern that must be matched.

        """

        if isinstance(what, (int, long)):
            return self._findTokenType(tree, what)

        elif isinstance(what, basestring):
            return self._findPattern(tree, what)

        else:
            raise TypeError("'what' must be string or integer")


    def _findTokenType(self, t, ttype):
        """Return a List of tree nodes with token type ttype"""

        nodes = []

        def visitor(tree, parent, childIndex, labels):
            nodes.append(tree)

        self.visit(t, ttype, visitor)

        return nodes


    def _findPattern(self, t, pattern):
        """Return a List of subtrees matching pattern."""

        subtrees = []

        # Create a TreePattern from the pattern
        tokenizer = TreePatternLexer(pattern)
        parser = TreePatternParser(tokenizer, self, TreePatternTreeAdaptor())
        tpattern = parser.pattern()

        # don't allow invalid patterns
        if (tpattern is None or tpattern.isNil()
            or isinstance(tpattern, WildcardTreePattern)):
            return None

        rootTokenType = tpattern.getType()

        def visitor(tree, parent, childIndex, label):
            if self._parse(tree, tpattern, None):
                subtrees.append(tree)

        self.visit(t, rootTokenType, visitor)

        return subtrees
    def visit(self, tree, what, visitor):
        """Visit every node in tree matching what, invoking the visitor.

        If what is a string, it is parsed as a pattern and only matching
        subtrees will be visited.
        The implementation uses the root node of the pattern in combination
        with visit(t, ttype, visitor) so nil-rooted patterns are not allowed.
        Patterns with wildcard roots are also not allowed.

        If what is an integer, it is used as a token type and visit will match
        all nodes of that type (this is faster than the pattern match).
        The labels arg of the visitor action method is never set (it's None)
        since using a token type rather than a pattern doesn't let us set a
        label.
        """

        if isinstance(what, (int, long)):
            self._visitType(tree, None, 0, what, visitor)

        elif isinstance(what, basestring):
            self._visitPattern(tree, what, visitor)

        else:
            raise TypeError("'what' must be string or integer")


    def _visitType(self, t, parent, childIndex, ttype, visitor):
        """Do the recursive work for visit"""

        if t is None:
            return

        if self.adaptor.getType(t) == ttype:
            visitor(t, parent, childIndex, None)

        for i in range(self.adaptor.getChildCount(t)):
            child = self.adaptor.getChild(t, i)
            self._visitType(child, t, i, ttype, visitor)


    def _visitPattern(self, tree, pattern, visitor):
        """
        For all subtrees that match the pattern, execute the visit action.
        """

        # Create a TreePattern from the pattern
        tokenizer = TreePatternLexer(pattern)
        parser = TreePatternParser(tokenizer, self, TreePatternTreeAdaptor())
        tpattern = parser.pattern()

        # don't allow invalid patterns
        if (tpattern is None or tpattern.isNil()
            or isinstance(tpattern, WildcardTreePattern)):
            return

        rootTokenType = tpattern.getType()

        def rootvisitor(tree, parent, childIndex, labels):
            labels = {}
            if self._parse(tree, tpattern, labels):
                visitor(tree, parent, childIndex, labels)

        self.visit(tree, rootTokenType, rootvisitor)
    def parse(self, t, pattern, labels=None):
        """
        Given a pattern like (ASSIGN %lhs:ID %rhs:.) with optional labels
        on the various nodes and '.' (dot) as the node/subtree wildcard,
        return true if the pattern matches and fill the labels Map with
        the labels pointing at the appropriate nodes. Return false if
        the pattern is malformed or the tree does not match.

        If a node specifies a text arg in pattern, then that must match
        for that node in t.
        """

        tokenizer = TreePatternLexer(pattern)
        parser = TreePatternParser(tokenizer, self, TreePatternTreeAdaptor())
        tpattern = parser.pattern()

        return self._parse(t, tpattern, labels)


    def _parse(self, t1, tpattern, labels):
        """
        Do the work for parse. Check to see if the tpattern fits the
        structure and token types in t1. Check text if the pattern has
        text arguments on nodes. Fill labels map with pointers to nodes
        in tree matched against nodes in pattern with labels.
        """

        # make sure both are non-null
        if t1 is None or tpattern is None:
            return False

        # check roots (wildcard matches anything)
        if not isinstance(tpattern, WildcardTreePattern):
            if self.adaptor.getType(t1) != tpattern.getType():
                return False

            # if pattern has text, check node text
            if (tpattern.hasTextArg
                and self.adaptor.getText(t1) != tpattern.getText()):
                return False

        if tpattern.label is not None and labels is not None:
            # map label in pattern to node in t1
            labels[tpattern.label] = t1

        # check children
        n1 = self.adaptor.getChildCount(t1)
        n2 = tpattern.getChildCount()
        if n1 != n2:
            return False

        for i in range(n1):
            child1 = self.adaptor.getChild(t1, i)
            child2 = tpattern.getChild(i)
            if not self._parse(child1, child2, labels):
                return False

        return True
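The recursion in _parse reduces to: roots agree (or the pattern root is a wildcard), labels get recorded, and children match pairwise. A standalone sketch over invented tuples, where a tree node is `(type, [children])` and a pattern node is `(type_or_WILDCARD, label_or_None, [children])` (both encodings are made up here; text-argument checking is omitted):

```python
WILDCARD = object()   # stands in for WildcardTreePattern

def match(tree, pattern, labels):
    """Structural match of pattern against tree, recording labelled nodes."""
    if tree is None or pattern is None:
        return False
    ptype, plabel, pchildren = pattern
    ttype, tchildren = tree
    # roots must agree unless the pattern root is a wildcard
    if ptype is not WILDCARD and ptype != ttype:
        return False
    if plabel is not None:
        labels[plabel] = tree            # map pattern label to tree node
    # children must match pairwise, same count
    if len(tchildren) != len(pchildren):
        return False
    return all(match(c, p, labels) for c, p in zip(tchildren, pchildren))

# (ASSIGN (ID) (INT)) matched against (ASSIGN %lhs:ID %rhs:.)
tree = (10, [(4, []), (5, [])])
pattern = (10, None, [(4, 'lhs', []), (WILDCARD, 'rhs', [])])
labels = {}
ok = match(tree, pattern, labels)
```

On a successful match, `labels` points each pattern label at the corresponding tree node, which is how callers of parse() pull out the `%lhs`/`%rhs` subtrees.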
    def equals(self, t1, t2, adaptor=None):
        """
        Compare t1 and t2; return true if token types/text, structure match
        exactly.
        The trees are examined in their entirety so that (A B) does not match
        (A B C) nor (A (B C)).
        """

        if adaptor is None:
            adaptor = self.adaptor

        return self._equals(t1, t2, adaptor)


    def _equals(self, t1, t2, adaptor):
        # make sure both are non-null
        if t1 is None or t2 is None:
            return False

        # check roots
        if adaptor.getType(t1) != adaptor.getType(t2):
            return False

        if adaptor.getText(t1) != adaptor.getText(t2):
            return False

        # check children
        n1 = adaptor.getChildCount(t1)
        n2 = adaptor.getChildCount(t2)
        if n1 != n2:
            return False

        for i in range(n1):
            child1 = adaptor.getChild(t1, i)
            child2 = adaptor.getChild(t2, i)
            if not self._equals(child1, child2, adaptor):
                return False

        return True