Remove one local copy of the Python antlr3 module

* Remove the directory thirdparty/antlr3/
* Modify the antlr3 symbolic link to point to
  thirdparty/antlr3-antlr-3.5/runtime/Python/antlr3/

Change-Id: I8104b7352e96d8e282da4e5bd8ff4fb4817aaa32
Victor Stinner 2015-09-23 16:50:17 +02:00
parent 41d26396d6
commit 5836bf07e0
15 changed files with 1 additions and 9428 deletions

antlr3

@@ -1 +1 @@
-thirdparty/antlr3
+thirdparty/antlr3-antlr-3.5/runtime/Python/antlr3/
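
The symbolic link change above can be reproduced in a scratch directory. The following is a minimal sketch, assuming a POSIX system; the helper names `repoint_symlink` and `demo` are hypothetical and not part of this commit. It builds the new link under a temporary name and renames it over the old one, so the link never disappears in between.

```python
import os
import tempfile


def repoint_symlink(link_path, new_target):
    """Replace the symlink at link_path with one pointing to new_target."""
    tmp_link = link_path + ".tmp"
    os.symlink(new_target, tmp_link)
    # rename() acts on the link itself, not on the directory it points to
    os.replace(tmp_link, link_path)


def demo():
    """Recreate the layout from the commit message in a temp directory."""
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(
        root, "thirdparty", "antlr3-antlr-3.5", "runtime", "Python", "antlr3"))
    link = os.path.join(root, "antlr3")
    os.symlink("thirdparty/antlr3", link)  # old target, removed by the commit
    repoint_symlink(link, "thirdparty/antlr3-antlr-3.5/runtime/Python/antlr3/")
    return os.readlink(link)
```

Calling `demo()` returns the new link target as stored in the link.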


@@ -1,159 +0,0 @@
""" @package antlr3
@brief ANTLR3 runtime package
This module contains all support classes, which are needed to use recognizers
generated by ANTLR3.
@mainpage
\note Please be warned that the line numbers in the API documentation do not
match the real locations in the source code of the package. This is an
unintended artifact of doxygen, which I could only convince to use the
correct module names by concatenating all files from the package into a single
module file...
Here is a short overview of the most commonly used classes provided by
this runtime:
@section recognizers Recognizers
These recognizers are baseclasses for the code which is generated by ANTLR3.
- BaseRecognizer: Base class with common recognizer functionality.
- Lexer: Base class for lexers.
- Parser: Base class for parsers.
- tree.TreeParser: Base class for %tree parser.
@section streams Streams
Each recognizer pulls its input from one of the stream classes below. Streams
handle stuff like buffering, look-ahead and seeking.
A character stream is usually the first element in the pipeline of a typical
ANTLR3 application. It is used as the input for a Lexer.
- ANTLRStringStream: Reads from a string object. The input should be a unicode
object, or ANTLR3 will have trouble decoding non-ASCII data.
- ANTLRFileStream: Opens a file and reads the contents, with optional character
decoding.
- ANTLRInputStream: Reads the data from a file-like object, with optional
character decoding.
A Parser needs a TokenStream as input (which in turn is usually fed by a
Lexer):
- CommonTokenStream: A basic and most commonly used TokenStream
implementation.
- TokenRewriteStream: A modification of CommonTokenStream that allows the
stream to be altered (by the Parser). See the 'tweak' example for a use case.
And tree.TreeParser finally fetches its input from a tree.TreeNodeStream:
- tree.CommonTreeNodeStream: A basic and most commonly used tree.TreeNodeStream
implementation.
@section tokenstrees Tokens and Trees
A Lexer emits Token objects which are usually buffered by a TokenStream. A
Parser can build a Tree, if the output=AST option has been set in the grammar.
The runtime provides these Token implementations:
- CommonToken: A basic and most commonly used Token implementation.
- ClassicToken: A Token object as used in ANTLR 2.x, used for %tree
construction.
Tree objects are wrappers for Token objects.
- tree.CommonTree: A basic and most commonly used Tree implementation.
A tree.TreeAdaptor is used by the parser to create tree.Tree objects for the
input Token objects.
- tree.CommonTreeAdaptor: A basic and most commonly used tree.TreeAdaptor
implementation.
@section Exceptions
A RecognitionException is raised when a recognizer encounters incorrect
or unexpected input.
- RecognitionException
  - MismatchedRangeException
  - MismatchedSetException
    - MismatchedNotSetException
    .
  - MismatchedTokenException
  - MismatchedTreeNodeException
  - NoViableAltException
  - EarlyExitException
  - FailedPredicateException
  .
.
A tree.RewriteCardinalityException is raised when the parser hits a
cardinality mismatch during AST construction. Although this is basically a
bug in your grammar, it can only be detected at runtime.
- tree.RewriteCardinalityException
  - tree.RewriteEarlyExitException
  - tree.RewriteEmptyStreamException
  .
.
"""
# tree.RewriteRuleElementStream
# tree.RewriteRuleSubtreeStream
# tree.RewriteRuleTokenStream
# CharStream
# DFA
# TokenSource
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import os
import sys
__version__ = '3.4'
# This runtime is compatible with generated parsers using the following
# API versions. 'HEAD' is only used by unittests.
compatible_api_versions = ['HEAD', 1]
top_dir = os.path.normpath(os.path.join(os.path.abspath(__file__),
                                        os.pardir))
sys.path.append(top_dir)
from antlr3.constants import *
from antlr3.dfa import *
from antlr3.exceptions import *
from antlr3.recognizers import *
from antlr3.streams import *
from antlr3.tokens import *


@@ -1,48 +0,0 @@
"""Compatibility stuff"""
# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]
try:
    set = set
    frozenset = frozenset
except NameError:
    from sets import Set as set, ImmutableSet as frozenset

try:
    reversed = reversed
except NameError:
    def reversed(l):
        l = l[:]
        l.reverse()
        return l
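
The fallback above is not a drop-in copy of the builtin: it returns a reversed list copy, while the builtin `reversed` returns an iterator. A small sketch of the difference, using a hypothetical stand-in named `reversed_fallback` and assuming list input (the slice-and-reverse approach needs a sequence with a `reverse()` method):

```python
def reversed_fallback(seq):
    """Stand-in for the fallback: copy the list, reverse the copy, return it."""
    copy = seq[:]
    copy.reverse()
    return copy


items = [1, 2, 3]
fallback_result = reversed_fallback(items)  # a new list
builtin_result = list(reversed(items))      # the builtin yields an iterator
```

Both results hold the same elements in the same order, and `items` itself is left untouched in either case.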


@@ -1,57 +0,0 @@
"""ANTLR3 runtime package"""
# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]
EOF = -1
## All tokens go to the parser (unless skip() is called in that rule)
# on a particular "channel". The parser tunes to a particular channel
# so that whitespace etc... can go to the parser on a "hidden" channel.
DEFAULT_CHANNEL = 0
## Anything on different channel than DEFAULT_CHANNEL is not parsed
# by parser.
HIDDEN_CHANNEL = 99
# Predefined token types
EOR_TOKEN_TYPE = 1
##
# imaginary tree navigation type; traverse "get child" link
DOWN = 2
##
#imaginary tree navigation type; finish with a child list
UP = 3
MIN_TOKEN_TYPE = UP+1
INVALID_TOKEN_TYPE = 0
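
The channel constants above describe how a parser can ignore whitespace without the lexer dropping it. A minimal sketch of that idea, assuming tokens are plain `(text, channel)` tuples; the tuple layout and the `visible_tokens` helper are illustrative assumptions, not the runtime's own token classes:

```python
DEFAULT_CHANNEL = 0
HIDDEN_CHANNEL = 99


def visible_tokens(tokens, channel=DEFAULT_CHANNEL):
    """Return the text of the tokens that sit on the given channel."""
    return [text for text, token_channel in tokens if token_channel == channel]


# "x = 1" as a lexer might emit it, with whitespace on the hidden channel
tokens = [
    ("x", DEFAULT_CHANNEL),
    (" ", HIDDEN_CHANNEL),
    ("=", DEFAULT_CHANNEL),
    (" ", HIDDEN_CHANNEL),
    ("1", DEFAULT_CHANNEL),
]
```

Here `visible_tokens(tokens)` keeps only what a parser tuned to the default channel would see, while the hidden tokens remain available in the original list.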

File diff suppressed because it is too large


@@ -1,213 +0,0 @@
"""ANTLR3 runtime package"""
# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]
from antlr3.constants import EOF
from antlr3.exceptions import NoViableAltException, BacktrackingFailed
class DFA(object):
    """@brief A DFA implemented as a set of transition tables.

    Any state that has a semantic predicate edge is special; those states
    are generated with if-then-else structures in a specialStateTransition()
    which is generated by cyclicDFA template.
    """

    def __init__(
        self,
        recognizer, decisionNumber,
        eot, eof, min, max, accept, special, transition
        ):
        ## Which recognizer encloses this DFA? Needed to check backtracking
        self.recognizer = recognizer
        self.decisionNumber = decisionNumber
        self.eot = eot
        self.eof = eof
        self.min = min
        self.max = max
        self.accept = accept
        self.special = special
        self.transition = transition

    def predict(self, input):
        """
        From the input stream, predict what alternative will succeed
        using this DFA (representing the covering regular approximation
        to the underlying CFL). Return an alternative number 1..n. Throw
        an exception upon error.
        """
        mark = input.mark()
        s = 0 # we always start at s0
        try:
            for _ in xrange(50000):
                #print "***Current state = %d" % s
                specialState = self.special[s]
                if specialState >= 0:
                    #print "is special"
                    s = self.specialStateTransition(specialState, input)
                    if s == -1:
                        self.noViableAlt(s, input)
                        return 0
                    input.consume()
                    continue

                if self.accept[s] >= 1:
                    #print "accept state for alt %d" % self.accept[s]
                    return self.accept[s]

                # look for a normal char transition
                c = input.LA(1)
                #print "LA = %d (%r)" % (c, unichr(c) if c >= 0 else 'EOF')
                #print "range = %d..%d" % (self.min[s], self.max[s])
                if c >= self.min[s] and c <= self.max[s]:
                    # move to next state
                    snext = self.transition[s][c-self.min[s]]
                    #print "in range, next state = %d" % snext
                    if snext < 0:
                        #print "not a normal transition"
                        # was in range but not a normal transition
                        # must check EOT, which is like the else clause.
                        # eot[s]>=0 indicates that an EOT edge goes to another
                        # state.
                        if self.eot[s] >= 0: # EOT Transition to accept state?
                            #print "EOT trans to accept state %d" % self.eot[s]
                            s = self.eot[s]
                            input.consume()
                            # TODO: I had this as return accept[eot[s]]
                            # which assumed here that the EOT edge always
                            # went to an accept...faster to do this, but
                            # what about predicated edges coming from EOT
                            # target?
                            continue

                        #print "no viable alt"
                        self.noViableAlt(s, input)
                        return 0

                    s = snext
                    input.consume()
                    continue

                if self.eot[s] >= 0:
                    #print "EOT to %d" % self.eot[s]
                    s = self.eot[s]
                    input.consume()
                    continue

                # EOF Transition to accept state?
                if c == EOF and self.eof[s] >= 0:
                    #print "EOF Transition to accept state %d" \
                    #  % self.accept[self.eof[s]]
                    return self.accept[self.eof[s]]

                # not in range and not EOF/EOT, must be invalid symbol
                self.noViableAlt(s, input)
                return 0

            else:
                raise RuntimeError("DFA bang!")

        finally:
            input.rewind(mark)

    def noViableAlt(self, s, input):
        if self.recognizer._state.backtracking > 0:
            raise BacktrackingFailed

        nvae = NoViableAltException(
            self.getDescription(),
            self.decisionNumber,
            s,
            input
            )
        self.error(nvae)
        raise nvae

    def error(self, nvae):
        """A hook for debugging interface"""
        pass

    def specialStateTransition(self, s, input):
        return -1

    def getDescription(self):
        return "n/a"

##     def specialTransition(self, state, symbol):
##         return 0

    def unpack(cls, string):
        """@brief Unpack the runlength encoded table data.

        Terence implemented packed table initializers, because Java has a
        size restriction on .class files and the lookup tables can grow
        pretty large. The generated JavaLexer.java of the Java.g example
        would be about 15MB with uncompressed array initializers.
        Python does not have any size restrictions, but the compilation of
        such large source files seems to be pretty memory hungry. The memory
        consumption of the python process grew to >1.5GB when importing a
        15MB lexer, eating all my swap space and I was too impatient to see
        if it could finish at all. With packed initializers that are unpacked
        at import time of the lexer module, everything works like a charm.
        """
        ret = []
        for i in range(len(string) / 2):
            (n, v) = ord(string[i*2]), ord(string[i*2+1])
            # Is there a bitwise operation to do this?
            if v == 0xFFFF:
                v = -1
            ret += [v] * n
        return ret

    unpack = classmethod(unpack)
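
The run-length decoding done by `unpack` can be tried in isolation. The sketch below is a standalone Python 3 rendering of the same idea, not the removed method itself: the function name `unpack_table` and the sample string are assumptions made for illustration. Each pair of characters encodes a repeat count followed by a value, and the value 0xFFFF stands for -1.

```python
def unpack_table(packed):
    """Expand (count, value) character pairs into a flat list of ints."""
    if len(packed) % 2:
        raise ValueError("packed data must contain an even number of characters")
    table = []
    for i in range(0, len(packed), 2):
        count, value = ord(packed[i]), ord(packed[i + 1])
        if value == 0xFFFF:
            value = -1  # 0xFFFF is the packed form of -1
        table.extend([value] * count)
    return table


# three 1s, two -1s, one 5
packed = "\u0003\u0001\u0002\uffff\u0001\u0005"
table = unpack_table(packed)
```

With this input, `table` is `[1, 1, 1, -1, -1, 5]`: six entries stored in six characters here, but long runs of identical entries shrink to a single pair.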


@@ -1,210 +0,0 @@
""" @package antlr3.dottreegenerator
@brief ANTLR3 runtime package, tree module
This module contains all support classes for AST construction and tree parsers.
"""
# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]
# lots of docstrings are missing, don't complain for now...
# pylint: disable-msg=C0111
from antlr3.tree import CommonTreeAdaptor
import stringtemplate3
class DOTTreeGenerator(object):
    """
    A utility class to generate DOT diagrams (graphviz) from
    arbitrary trees. You can pass in your own templates and
    can pass in any kind of tree or use Tree interface method.
    """

    _treeST = stringtemplate3.StringTemplate(
        template=(
        "digraph {\n" +
        " ordering=out;\n" +
        " ranksep=.4;\n" +
        " node [shape=plaintext, fixedsize=true, fontsize=11, fontname=\"Courier\",\n" +
        " width=.25, height=.25];\n" +
        " edge [arrowsize=.5]\n" +
        " $nodes$\n" +
        " $edges$\n" +
        "}\n")
        )

    _nodeST = stringtemplate3.StringTemplate(
        template="$name$ [label=\"$text$\"];\n"
        )

    _edgeST = stringtemplate3.StringTemplate(
        template="$parent$ -> $child$ // \"$parentText$\" -> \"$childText$\"\n"
        )

    def __init__(self):
        ## Track node to number mapping so we can get proper node name back
        self.nodeToNumberMap = {}
        ## Track node number so we can get unique node names
        self.nodeNumber = 0

    def toDOT(self, tree, adaptor=None, treeST=_treeST, edgeST=_edgeST):
        if adaptor is None:
            adaptor = CommonTreeAdaptor()

        treeST = treeST.getInstanceOf()
        self.nodeNumber = 0
        self.toDOTDefineNodes(tree, adaptor, treeST)
        self.nodeNumber = 0
        self.toDOTDefineEdges(tree, adaptor, treeST, edgeST)
        return treeST

    def toDOTDefineNodes(self, tree, adaptor, treeST, knownNodes=None):
        if knownNodes is None:
            knownNodes = set()

        if tree is None:
            return

        n = adaptor.getChildCount(tree)
        if n == 0:
            # must have already dumped as child from previous
            # invocation; do nothing
            return

        # define parent node
        number = self.getNodeNumber(tree)
        if number not in knownNodes:
            parentNodeST = self.getNodeST(adaptor, tree)
            treeST.setAttribute("nodes", parentNodeST)
            knownNodes.add(number)

        # for each child, do a "<unique-name> [label=text]" node def
        for i in range(n):
            child = adaptor.getChild(tree, i)
            number = self.getNodeNumber(child)
            if number not in knownNodes:
                nodeST = self.getNodeST(adaptor, child)
                treeST.setAttribute("nodes", nodeST)
                knownNodes.add(number)

            self.toDOTDefineNodes(child, adaptor, treeST, knownNodes)

    def toDOTDefineEdges(self, tree, adaptor, treeST, edgeST):
        if tree is None:
            return

        n = adaptor.getChildCount(tree)
        if n == 0:
            # must have already dumped as child from previous
            # invocation; do nothing
            return

        parentName = "n%d" % self.getNodeNumber(tree)
        # for each child, do a parent -> child edge using unique node names
        parentText = adaptor.getText(tree)
        for i in range(n):
            child = adaptor.getChild(tree, i)
            childText = adaptor.getText(child)
            childName = "n%d" % self.getNodeNumber(child)
            edgeST = edgeST.getInstanceOf()
            edgeST.setAttribute("parent", parentName)
            edgeST.setAttribute("child", childName)
            edgeST.setAttribute("parentText", parentText)
            edgeST.setAttribute("childText", childText)
            treeST.setAttribute("edges", edgeST)
            self.toDOTDefineEdges(child, adaptor, treeST, edgeST)

    def getNodeST(self, adaptor, t):
        text = adaptor.getText(t)
        nodeST = self._nodeST.getInstanceOf()
        uniqueName = "n%d" % self.getNodeNumber(t)
        nodeST.setAttribute("name", uniqueName)
        if text is not None:
            text = text.replace('"', r'\"')
        nodeST.setAttribute("text", text)
        return nodeST

    def getNodeNumber(self, t):
        try:
            return self.nodeToNumberMap[t]
        except KeyError:
            self.nodeToNumberMap[t] = self.nodeNumber
            self.nodeNumber += 1
            return self.nodeNumber - 1


def toDOT(tree, adaptor=None, treeST=DOTTreeGenerator._treeST, edgeST=DOTTreeGenerator._edgeST):
    """
    Generate DOT (graphviz) for a whole tree not just a node.
    For example, 3+4*5 should generate:

    digraph {
        node [shape=plaintext, fixedsize=true, fontsize=11, fontname="Courier",
            width=.4, height=.2];
        edge [arrowsize=.7]
        "+"->3
        "+"->"*"
        "*"->4
        "*"->5
    }

    Return the ST not a string in case people want to alter.
    Takes a Tree interface object.

    Example of invocation:

        import antlr3
        import antlr3.extras
        input = antlr3.ANTLRInputStream(sys.stdin)
        lex = TLexer(input)
        tokens = antlr3.CommonTokenStream(lex)
        parser = TParser(tokens)
        tree = parser.e().tree
        print tree.toStringTree()
        st = antlr3.extras.toDOT(tree)
        print st
    """
    gen = DOTTreeGenerator()
    return gen.toDOT(tree, adaptor, treeST, edgeST)
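
The node-then-edge walk used by the generator can be sketched without stringtemplate3 or the ANTLR tree classes. The version below is an illustrative simplification, not the removed implementation: it assumes trees are nested `(text, children)` tuples, and the `to_dot` name and output layout are hypothetical. Like `getNodeNumber`, it hands out `n0`, `n1`, ... as unique node names, and like `getNodeST` it escapes double quotes in labels.

```python
import itertools


def to_dot(tree):
    """Render a (text, children) tuple tree as Graphviz DOT source."""
    nodes, edges = [], []
    ids = itertools.count()

    def visit(node):
        text, children = node
        name = "n%d" % next(ids)           # unique node name
        label = text.replace('"', r'\"')   # escape quotes for DOT
        nodes.append('  %s [label="%s"];' % (name, label))
        for child in children:
            child_name = visit(child)
            edges.append("  %s -> %s" % (name, child_name))
        return name

    visit(tree)
    return "digraph {\n" + "\n".join(nodes + edges) + "\n}\n"


# 3+4*5, with "*" binding tighter than "+"
expr = ("+", [("3", []), ("*", [("4", []), ("5", [])])])
dot = to_dot(expr)
```

The resulting `dot` string can be fed to Graphviz. Unlike the removed `toDOT`, it is a plain string rather than a template object that callers can still alter.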


@@ -1,364 +0,0 @@
"""ANTLR3 exception hierarchy"""
# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]
from antlr3.constants import INVALID_TOKEN_TYPE
class BacktrackingFailed(Exception):
    """@brief Raised to signal failed backtrack attempt"""
    pass


class RecognitionException(Exception):
    """@brief The root of the ANTLR exception hierarchy.

    To avoid English-only error messages and to generally make things
    as flexible as possible, these exceptions are not created with strings,
    but rather the information necessary to generate an error. Then
    the various reporting methods in Parser and Lexer can be overridden
    to generate a localized error message. For example, MismatchedToken
    exceptions are built with the expected token type.
    So, don't expect getMessage() to return anything.

    Note that as of Java 1.4, you can access the stack trace, which means
    that you can compute the complete trace of rules from the start symbol.
    This gives you considerable context information with which to generate
    useful error messages.

    ANTLR generates code that throws exceptions upon recognition error and
    also generates code to catch these exceptions in each rule. If you
    want to quit upon first error, you can turn off the automatic error
    handling mechanism using rulecatch action, but you still need to
    override methods mismatch and recoverFromMismatchSet.

    In general, the recognition exceptions can track where in a grammar a
    problem occurred and/or what was the expected input. While the parser
    knows its state (such as current input symbol and line info) that
    state can change before the exception is reported so current token index
    is computed and stored at exception time. From this info, you can
    perhaps print an entire line of input not just a single token, for example.
    Better to just say the recognizer had a problem and then let the parser
    figure out a fancy report.
    """

    def __init__(self, input=None):
        Exception.__init__(self)
        # What input stream did the error occur in?
        self.input = None
        # What is the index of the token/char we were looking at when the
        # error occurred?
        self.index = None
        # The current Token when an error occurred. Since not all streams
        # can retrieve the ith Token, we have to track the Token object.
        # For parsers. Even when it's a tree parser, token might be set.
        self.token = None
        # If this is a tree parser exception, node is set to the node with
        # the problem.
        self.node = None
        # The current char when an error occurred. For lexers.
        self.c = None
        # Track the line at which the error occurred in case this is
        # generated from a lexer. We need to track this since the
        # unexpected char doesn't carry the line info.
        self.line = None
        self.charPositionInLine = None
        # If you are parsing a tree node stream, you will encounter some
        # imaginary nodes w/o line/col info. We now search backwards looking
        # for most recent token with line/col info, but notify getErrorHeader()
        # that info is approximate.
        self.approximateLineInfo = False

        if input is not None:
            self.input = input
            self.index = input.index()

            # late import to avoid cyclic dependencies
            from antlr3.streams import TokenStream, CharStream
            from antlr3.tree import TreeNodeStream

            if isinstance(self.input, TokenStream):
                self.token = self.input.LT(1)
                self.line = self.token.line
                self.charPositionInLine = self.token.charPositionInLine

            if isinstance(self.input, TreeNodeStream):
                self.extractInformationFromTreeNodeStream(self.input)

            else:
                if isinstance(self.input, CharStream):
                    self.c = self.input.LT(1)
                    self.line = self.input.line
                    self.charPositionInLine = self.input.charPositionInLine

                else:
                    self.c = self.input.LA(1)

    def extractInformationFromTreeNodeStream(self, nodes):
        from antlr3.tree import Tree, CommonTree
        from antlr3.tokens import CommonToken

        self.node = nodes.LT(1)
        adaptor = nodes.adaptor
        payload = adaptor.getToken(self.node)
        if payload is not None:
            self.token = payload
            if payload.line <= 0:
                # imaginary node; no line/pos info; scan backwards
                i = -1
                priorNode = nodes.LT(i)
                while priorNode is not None:
                    priorPayload = adaptor.getToken(priorNode)
                    if priorPayload is not None and priorPayload.line > 0:
                        # we found the most recent real line / pos info
                        self.line = priorPayload.line
                        self.charPositionInLine = priorPayload.charPositionInLine
                        self.approximateLineInfo = True
                        break

                    i -= 1
                    priorNode = nodes.LT(i)

            else: # node created from real token
                self.line = payload.line
                self.charPositionInLine = payload.charPositionInLine

        elif isinstance(self.node, Tree):
            self.line = self.node.line
            self.charPositionInLine = self.node.charPositionInLine
            if isinstance(self.node, CommonTree):
                self.token = self.node.token

        else:
            type = adaptor.getType(self.node)
            text = adaptor.getText(self.node)
            self.token = CommonToken(type=type, text=text)

    def getUnexpectedType(self):
        """Return the token type or char of the unexpected input element"""
        from antlr3.streams import TokenStream
        from antlr3.tree import TreeNodeStream

        if isinstance(self.input, TokenStream):
            return self.token.type

        elif isinstance(self.input, TreeNodeStream):
            adaptor = self.input.treeAdaptor
            return adaptor.getType(self.node)

        else:
            return self.c

    unexpectedType = property(getUnexpectedType)


class MismatchedTokenException(RecognitionException):
    """@brief A mismatched char or Token or tree node."""

    def __init__(self, expecting, input):
        RecognitionException.__init__(self, input)
        self.expecting = expecting

    def __str__(self):
        #return "MismatchedTokenException("+self.expecting+")"
        return "MismatchedTokenException(%r!=%r)" % (
            self.getUnexpectedType(), self.expecting
            )
    __repr__ = __str__


class UnwantedTokenException(MismatchedTokenException):
    """An extra token while parsing a TokenStream"""

    def getUnexpectedToken(self):
        return self.token

    def __str__(self):
        exp = ", expected %s" % self.expecting
        if self.expecting == INVALID_TOKEN_TYPE:
            exp = ""

        if self.token is None:
            return "UnwantedTokenException(found=%s%s)" % (None, exp)

        return "UnwantedTokenException(found=%s%s)" % (self.token.text, exp)
    __repr__ = __str__


class MissingTokenException(MismatchedTokenException):
    """
    We were expecting a token but it's not found. The current token
    is actually what we wanted next.
    """

    def __init__(self, expecting, input, inserted):
        MismatchedTokenException.__init__(self, expecting, input)
        self.inserted = inserted

    def getMissingType(self):
        return self.expecting

    def __str__(self):
        if self.inserted is not None and self.token is not None:
            return "MissingTokenException(inserted %r at %r)" % (
                self.inserted, self.token.text)

        if self.token is not None:
            return "MissingTokenException(at %r)" % self.token.text

        return "MissingTokenException"
    __repr__ = __str__


class MismatchedRangeException(RecognitionException):
    """@brief The next token does not match a range of expected types."""

    def __init__(self, a, b, input):
        RecognitionException.__init__(self, input)
        self.a = a
        self.b = b

    def __str__(self):
        return "MismatchedRangeException(%r not in [%r..%r])" % (
            self.getUnexpectedType(), self.a, self.b
            )
    __repr__ = __str__


class MismatchedSetException(RecognitionException):
    """@brief The next token does not match a set of expected types."""

    def __init__(self, expecting, input):
        RecognitionException.__init__(self, input)
        self.expecting = expecting

    def __str__(self):
        return "MismatchedSetException(%r not in %r)" % (
            self.getUnexpectedType(), self.expecting
            )
    __repr__ = __str__


class MismatchedNotSetException(MismatchedSetException):
    """@brief Used for remote debugger deserialization"""

    def __str__(self):
        return "MismatchedNotSetException(%r!=%r)" % (
            self.getUnexpectedType(), self.expecting
            )
    __repr__ = __str__


class NoViableAltException(RecognitionException):
    """@brief Unable to decide which alternative to choose."""

    def __init__(
        self, grammarDecisionDescription, decisionNumber, stateNumber, input
        ):
        RecognitionException.__init__(self, input)
        self.grammarDecisionDescription = grammarDecisionDescription
        self.decisionNumber = decisionNumber
        self.stateNumber = stateNumber

    def __str__(self):
        return "NoViableAltException(%r!=[%r])" % (
            self.unexpectedType, self.grammarDecisionDescription
            )
    __repr__ = __str__


class EarlyExitException(RecognitionException):
    """@brief The recognizer did not match anything for a (..)+ loop."""

    def __init__(self, decisionNumber, input):
        RecognitionException.__init__(self, input)
        self.decisionNumber = decisionNumber


class FailedPredicateException(RecognitionException):
    """@brief A semantic predicate failed during validation.

    Validation of predicates
    occurs when normally parsing the alternative just like matching a token.
    Disambiguating predicate evaluation occurs when we hoist a predicate into
    a prediction decision.
    """

    def __init__(self, input, ruleName, predicateText):
        RecognitionException.__init__(self, input)
        self.ruleName = ruleName
        self.predicateText = predicateText

    def __str__(self):
        return "FailedPredicateException("+self.ruleName+",{"+self.predicateText+"}?)"
    __repr__ = __str__


class MismatchedTreeNodeException(RecognitionException):
    """@brief The next tree node does not match the expected type."""

    def __init__(self, expecting, input):
        RecognitionException.__init__(self, input)
        self.expecting = expecting

    def __str__(self):
        return "MismatchedTreeNodeException(%r!=%r)" % (
            self.getUnexpectedType(), self.expecting
            )
    __repr__ = __str__


@ -1,47 +0,0 @@
""" @package antlr3.dottreegenerator
@brief ANTLR3 runtime package, tree module
This module contains all support classes for AST construction and tree parsers.
"""
# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]
# lots of docstrings are missing, don't complain for now...
# pylint: disable-msg=C0111
from treewizard import TreeWizard
try:
from antlr3.dottreegen import toDOT
except ImportError, exc:
def toDOT(*args, **kwargs):
raise exc


@ -1,305 +0,0 @@
"""ANTLR3 runtime package"""
# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]
import sys
import optparse
import antlr3
class _Main(object):
def __init__(self):
self.stdin = sys.stdin
self.stdout = sys.stdout
self.stderr = sys.stderr
def parseOptions(self, argv):
optParser = optparse.OptionParser()
optParser.add_option(
"--encoding",
action="store",
type="string",
dest="encoding"
)
optParser.add_option(
"--input",
action="store",
type="string",
dest="input"
)
optParser.add_option(
"--interactive", "-i",
action="store_true",
dest="interactive"
)
optParser.add_option(
"--no-output",
action="store_true",
dest="no_output"
)
optParser.add_option(
"--profile",
action="store_true",
dest="profile"
)
optParser.add_option(
"--hotshot",
action="store_true",
dest="hotshot"
)
optParser.add_option(
"--port",
type="int",
dest="port",
default=None
)
optParser.add_option(
"--debug-socket",
action='store_true',
dest="debug_socket",
default=None
)
self.setupOptions(optParser)
return optParser.parse_args(argv[1:])
def setupOptions(self, optParser):
pass
def execute(self, argv):
options, args = self.parseOptions(argv)
self.setUp(options)
if options.interactive:
while True:
try:
input = raw_input(">>> ")
except (EOFError, KeyboardInterrupt):
self.stdout.write("\nBye.\n")
break
inStream = antlr3.ANTLRStringStream(input)
self.parseStream(options, inStream)
else:
if options.input is not None:
inStream = antlr3.ANTLRStringStream(options.input)
elif len(args) == 1 and args[0] != '-':
inStream = antlr3.ANTLRFileStream(
args[0], encoding=options.encoding
)
else:
inStream = antlr3.ANTLRInputStream(
self.stdin, encoding=options.encoding
)
if options.profile:
try:
import cProfile as profile
except ImportError:
import profile
profile.runctx(
'self.parseStream(options, inStream)',
globals(),
locals(),
'profile.dat'
)
import pstats
stats = pstats.Stats('profile.dat')
stats.strip_dirs()
stats.sort_stats('time')
stats.print_stats(100)
elif options.hotshot:
import hotshot
profiler = hotshot.Profile('hotshot.dat')
profiler.runctx(
'self.parseStream(options, inStream)',
globals(),
locals()
)
else:
self.parseStream(options, inStream)
def setUp(self, options):
pass
def parseStream(self, options, inStream):
raise NotImplementedError
def write(self, options, text):
if not options.no_output:
self.stdout.write(text)
def writeln(self, options, text):
self.write(options, text + '\n')
class LexerMain(_Main):
def __init__(self, lexerClass):
_Main.__init__(self)
self.lexerClass = lexerClass
def parseStream(self, options, inStream):
lexer = self.lexerClass(inStream)
for token in lexer:
self.writeln(options, str(token))
class ParserMain(_Main):
def __init__(self, lexerClassName, parserClass):
_Main.__init__(self)
self.lexerClassName = lexerClassName
self.lexerClass = None
self.parserClass = parserClass
def setupOptions(self, optParser):
optParser.add_option(
"--lexer",
action="store",
type="string",
dest="lexerClass",
default=self.lexerClassName
)
optParser.add_option(
"--rule",
action="store",
type="string",
dest="parserRule"
)
def setUp(self, options):
lexerMod = __import__(options.lexerClass)
self.lexerClass = getattr(lexerMod, options.lexerClass)
def parseStream(self, options, inStream):
kwargs = {}
if options.port is not None:
kwargs['port'] = options.port
if options.debug_socket is not None:
kwargs['debug_socket'] = sys.stderr
lexer = self.lexerClass(inStream)
tokenStream = antlr3.CommonTokenStream(lexer)
parser = self.parserClass(tokenStream, **kwargs)
result = getattr(parser, options.parserRule)()
if result is not None:
if hasattr(result, 'tree') and result.tree is not None:
self.writeln(options, result.tree.toStringTree())
else:
self.writeln(options, repr(result))
class WalkerMain(_Main):
def __init__(self, walkerClass):
_Main.__init__(self)
self.lexerClass = None
self.parserClass = None
self.walkerClass = walkerClass
def setupOptions(self, optParser):
optParser.add_option(
"--lexer",
action="store",
type="string",
dest="lexerClass",
default=None
)
optParser.add_option(
"--parser",
action="store",
type="string",
dest="parserClass",
default=None
)
optParser.add_option(
"--parser-rule",
action="store",
type="string",
dest="parserRule",
default=None
)
optParser.add_option(
"--rule",
action="store",
type="string",
dest="walkerRule"
)
def setUp(self, options):
lexerMod = __import__(options.lexerClass)
self.lexerClass = getattr(lexerMod, options.lexerClass)
parserMod = __import__(options.parserClass)
self.parserClass = getattr(parserMod, options.parserClass)
def parseStream(self, options, inStream):
lexer = self.lexerClass(inStream)
tokenStream = antlr3.CommonTokenStream(lexer)
parser = self.parserClass(tokenStream)
result = getattr(parser, options.parserRule)()
if result is not None:
assert hasattr(result, 'tree'), "Parser did not return an AST"
nodeStream = antlr3.tree.CommonTreeNodeStream(result.tree)
nodeStream.setTokenStream(tokenStream)
walker = self.walkerClass(nodeStream)
result = getattr(walker, options.walkerRule)()
if result is not None:
if hasattr(result, 'tree'):
self.writeln(options, result.tree.toStringTree())
else:
self.writeln(options, repr(result))
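
The non-interactive branch of `_Main.execute` above picks its input in a fixed order: the `--input` string, then a single file argument other than `-`, then standard input. The following is a minimal standalone sketch of only that selection logic, assuming Python 3; `choose_source` and the returned tuples are hypothetical names used for illustration and are not part of the antlr3 runtime.

```python
import optparse


def choose_source(argv):
    """Return a (kind, value) pair describing where input would come from.

    Hypothetical helper: it mirrors the order of checks in _Main.execute
    but builds no antlr3 stream objects.
    """
    parser = optparse.OptionParser()
    parser.add_option("--input", action="store", type="string", dest="input")
    parser.add_option("--encoding", action="store", type="string",
                      dest="encoding")
    options, args = parser.parse_args(argv[1:])

    if options.input is not None:
        # An explicit string on the command line wins over everything else.
        return ("string", options.input)
    if len(args) == 1 and args[0] != "-":
        # Exactly one positional argument is treated as a file name.
        return ("file", args[0])
    # No argument, several arguments, or a lone "-" fall back to stdin.
    return ("stdin", None)


from_string = choose_source(["prog", "--input", "a+b"])
from_file = choose_source(["prog", "grammar.txt"])
from_dash = choose_source(["prog", "-"])
```

Note that, as in the original, passing more than one positional argument silently falls through to standard input rather than raising an error.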

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -1,418 +0,0 @@
"""ANTLR3 runtime package"""
# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]
from antlr3.constants import EOF, DEFAULT_CHANNEL, INVALID_TOKEN_TYPE
############################################################################
#
# basic token interface
#
############################################################################
class Token(object):
"""@brief Abstract token baseclass."""
def getText(self):
"""@brief Get the text of the token.
Using setter/getter methods is deprecated. Use o.text instead.
"""
raise NotImplementedError
def setText(self, text):
"""@brief Set the text of the token.
Using setter/getter methods is deprecated. Use o.text instead.
"""
raise NotImplementedError
def getType(self):
"""@brief Get the type of the token.
Using setter/getter methods is deprecated. Use o.type instead."""
raise NotImplementedError
def setType(self, ttype):
"""@brief Set the type of the token.
Using setter/getter methods is deprecated. Use o.type instead."""
raise NotImplementedError
def getLine(self):
"""@brief Get the line number on which this token was matched
Lines are numbered 1..n
Using setter/getter methods is deprecated. Use o.line instead."""
raise NotImplementedError
def setLine(self, line):
"""@brief Set the line number on which this token was matched
Using setter/getter methods is deprecated. Use o.line instead."""
raise NotImplementedError
def getCharPositionInLine(self):
"""@brief Get the column of the token's first character.
Columns are numbered 0..n-1
Using setter/getter methods is deprecated. Use o.charPositionInLine instead."""
raise NotImplementedError
def setCharPositionInLine(self, pos):
"""@brief Set the column of the token's first character.
Using setter/getter methods is deprecated. Use o.charPositionInLine instead."""
raise NotImplementedError
def getChannel(self):
"""@brief Get the channel of the token
Using setter/getter methods is deprecated. Use o.channel instead."""
raise NotImplementedError
def setChannel(self, channel):
"""@brief Set the channel of the token
Using setter/getter methods is deprecated. Use o.channel instead."""
raise NotImplementedError
def getTokenIndex(self):
"""@brief Get the index in the input stream.
An index from 0..n-1 of the token object in the input stream.
This must be valid in order to use the ANTLRWorks debugger.
Using setter/getter methods is deprecated. Use o.index instead."""
raise NotImplementedError
def setTokenIndex(self, index):
"""@brief Set the index in the input stream.
Using setter/getter methods is deprecated. Use o.index instead."""
raise NotImplementedError
def getInputStream(self):
"""@brief From what character stream was this token created.
You don't have to implement but it's nice to know where a Token
comes from if you have include files etc... on the input."""
raise NotImplementedError
def setInputStream(self, input):
"""@brief Set the character stream from which this token was created.
You don't have to implement but it's nice to know where a Token
comes from if you have include files etc... on the input."""
raise NotImplementedError
############################################################################
#
# token implementations
#
# Token
# +- CommonToken
# \- ClassicToken
#
############################################################################
class CommonToken(Token):
"""@brief Basic token implementation.
This implementation does not copy the text from the input stream upon
creation, but keeps start/stop pointers into the stream to avoid
unnecessary copy operations.
"""
def __init__(self, type=None, channel=DEFAULT_CHANNEL, text=None,
input=None, start=None, stop=None, oldToken=None):
Token.__init__(self)
if oldToken is not None:
self.type = oldToken.type
self.line = oldToken.line
self.charPositionInLine = oldToken.charPositionInLine
self.channel = oldToken.channel
self.index = oldToken.index
self._text = oldToken._text
self.input = oldToken.input
if isinstance(oldToken, CommonToken):
self.start = oldToken.start
self.stop = oldToken.stop
else:
self.type = type
self.input = input
self.charPositionInLine = -1 # set to invalid position
self.line = 0
self.channel = channel
# What token number is this from 0..n-1 tokens; < 0 implies invalid index
self.index = -1
# We need to be able to change the text once in a while. If
# this is non-null, then getText should return this. Note that
# start/stop are not affected by changing this.
self._text = text
# The char position into the input buffer where this token starts
self.start = start
# The char position into the input buffer where this token stops
# This is the index of the last char, *not* the index after it!
self.stop = stop
def getText(self):
if self._text is not None:
return self._text
if self.input is None:
return None
if self.start < self.input.size() and self.stop < self.input.size():
return self.input.substring(self.start, self.stop)
return '<EOF>'
def setText(self, text):
"""
Override the text for this token. getText() will return this text
rather than pulling from the buffer. Note that this does not mean
that start/stop indexes are not valid. It means that the input
was converted to a new string in the token object.
"""
self._text = text
text = property(getText, setText)
def getType(self):
return self.type
def setType(self, ttype):
self.type = ttype
def getTypeName(self):
return str(self.type)
typeName = property(lambda s: s.getTypeName())
def getLine(self):
return self.line
def setLine(self, line):
self.line = line
def getCharPositionInLine(self):
return self.charPositionInLine
def setCharPositionInLine(self, pos):
self.charPositionInLine = pos
def getChannel(self):
return self.channel
def setChannel(self, channel):
self.channel = channel
def getTokenIndex(self):
return self.index
def setTokenIndex(self, index):
self.index = index
def getInputStream(self):
return self.input
def setInputStream(self, input):
self.input = input
def __str__(self):
if self.type == EOF:
return "<EOF>"
channelStr = ""
if self.channel > 0:
channelStr = ",channel=" + str(self.channel)
txt = self.text
if txt is not None:
txt = txt.replace("\n","\\\\n")
txt = txt.replace("\r","\\\\r")
txt = txt.replace("\t","\\\\t")
else:
txt = "<no text>"
return "[@%d,%d:%d=%r,<%s>%s,%d:%d]" % (
self.index,
self.start, self.stop,
txt,
self.typeName, channelStr,
self.line, self.charPositionInLine
)
class ClassicToken(Token):
"""@brief Alternative token implementation.
A Token object like we'd use in ANTLR 2.x; has an actual string created
and associated with this object. These objects are needed for imaginary
tree nodes that have payload objects. We need to create a Token object
that has a string; the tree node will point at this token. CommonToken
has indexes into a char stream and hence cannot be used to introduce
new strings.
"""
def __init__(self, type=None, text=None, channel=DEFAULT_CHANNEL,
oldToken=None
):
Token.__init__(self)
if oldToken is not None:
self.text = oldToken.text
self.type = oldToken.type
self.line = oldToken.line
self.charPositionInLine = oldToken.charPositionInLine
self.channel = oldToken.channel
self.text = text
self.type = type
self.line = None
self.charPositionInLine = None
self.channel = channel
self.index = None
def getText(self):
return self.text
def setText(self, text):
self.text = text
def getType(self):
return self.type
def setType(self, ttype):
self.type = ttype
def getLine(self):
return self.line
def setLine(self, line):
self.line = line
def getCharPositionInLine(self):
return self.charPositionInLine
def setCharPositionInLine(self, pos):
self.charPositionInLine = pos
def getChannel(self):
return self.channel
def setChannel(self, channel):
self.channel = channel
def getTokenIndex(self):
return self.index
def setTokenIndex(self, index):
self.index = index
def getInputStream(self):
return None
def setInputStream(self, input):
pass
def toString(self):
channelStr = ""
if self.channel > 0:
channelStr = ",channel=" + str(self.channel)
txt = self.text
if txt is None:
txt = "<no text>"
return "[@%r,%r,<%r>%s,%r:%r]" % (self.index,
txt,
self.type,
channelStr,
self.line,
self.charPositionInLine
)
__str__ = toString
__repr__ = toString
INVALID_TOKEN = CommonToken(type=INVALID_TOKEN_TYPE)
# In an action, a lexer rule can set token to this SKIP_TOKEN and ANTLR
# will avoid creating a token for this symbol and try to fetch another.
SKIP_TOKEN = CommonToken(type=INVALID_TOKEN_TYPE)
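
The `CommonToken` docstring and comments above describe two behaviours that are easy to get wrong: the token keeps inclusive `start`/`stop` indexes into the input instead of copying text, and an explicitly set text overrides the buffer without touching those indexes. The sketch below restates that idea in standalone Python 3. `StringInput` and `LazyToken` are hypothetical stand-ins, not the antlr3 classes, and only the `size()`/`substring()` calls that `getText` relies on are modelled.

```python
class StringInput:
    """Hypothetical stand-in for a character stream."""

    def __init__(self, data):
        self.data = data

    def size(self):
        return len(self.data)

    def substring(self, start, stop):
        # stop is the index of the last character, not one past it.
        return self.data[start:stop + 1]


class LazyToken:
    """Token that reads its text from the input only when asked."""

    def __init__(self, input=None, start=None, stop=None, text=None):
        self.input = input
        self.start = start
        self.stop = stop
        self._text = text

    @property
    def text(self):
        if self._text is not None:
            # An explicit override always wins.
            return self._text
        if self.input is None:
            return None
        if self.start < self.input.size() and self.stop < self.input.size():
            return self.input.substring(self.start, self.stop)
        return "<EOF>"

    @text.setter
    def text(self, value):
        # start/stop are deliberately left unchanged.
        self._text = value


stream = StringInput("x = 42")
tok = LazyToken(input=stream, start=4, stop=5)
text_from_buffer = tok.text

tok.text = "forty-two"
text_after_override = tok.text
span_after_override = (tok.start, tok.stop)
```

Keeping the span separate from the override means a tool can still map a rewritten token back to its original position in the source.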

File diff suppressed because it is too large


@ -1,619 +0,0 @@
""" @package antlr3.tree
@brief ANTLR3 runtime package, treewizard module
A utility module to create ASTs at runtime.
See <http://www.antlr.org/wiki/display/~admin/2007/07/02/Exploring+Concept+of+TreeWizard> for an overview. Note that the API of the Python implementation is slightly different.
"""
# begin[licence]
#
# [The "BSD licence"]
# Copyright (c) 2005-2008 Terence Parr
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# 3. The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# end[licence]
from antlr3.constants import INVALID_TOKEN_TYPE
from antlr3.tokens import CommonToken
from antlr3.tree import CommonTree, CommonTreeAdaptor
def computeTokenTypes(tokenNames):
"""
Compute a dict that is an inverted index of
tokenNames (which maps int token types to names).
"""
if tokenNames is None:
return {}
return dict((name, type) for type, name in enumerate(tokenNames))
## token types for pattern parser
EOF = -1
BEGIN = 1
END = 2
ID = 3
ARG = 4
PERCENT = 5
COLON = 6
DOT = 7
class TreePatternLexer(object):
def __init__(self, pattern):
## The tree pattern to lex like "(A B C)"
self.pattern = pattern
## Index into input string
self.p = -1
## Current char
self.c = None
## How long is the pattern in chars?
self.n = len(pattern)
## Set when token type is ID or ARG
self.sval = None
self.error = False
self.consume()
__idStartChar = frozenset(
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_'
)
__idChar = __idStartChar | frozenset('0123456789')
def nextToken(self):
self.sval = ""
while self.c != EOF:
if self.c in (' ', '\n', '\r', '\t'):
self.consume()
continue
if self.c in self.__idStartChar:
self.sval += self.c
self.consume()
while self.c in self.__idChar:
self.sval += self.c
self.consume()
return ID
if self.c == '(':
self.consume()
return BEGIN
if self.c == ')':
self.consume()
return END
if self.c == '%':
self.consume()
return PERCENT
if self.c == ':':
self.consume()
return COLON
if self.c == '.':
self.consume()
return DOT
if self.c == '[': # grab [x] as a string, returning x
self.consume()
while self.c != ']':
if self.c == '\\':
self.consume()
if self.c != ']':
self.sval += '\\'
self.sval += self.c
else:
self.sval += self.c
self.consume()
self.consume()
return ARG
self.consume()
self.error = True
return EOF
return EOF
def consume(self):
self.p += 1
if self.p >= self.n:
self.c = EOF
else:
self.c = self.pattern[self.p]
class TreePatternParser(object):
def __init__(self, tokenizer, wizard, adaptor):
self.tokenizer = tokenizer
self.wizard = wizard
self.adaptor = adaptor
self.ttype = tokenizer.nextToken() # kickstart
def pattern(self):
if self.ttype == BEGIN:
return self.parseTree()
elif self.ttype == ID:
node = self.parseNode()
if self.ttype == EOF:
return node
return None # extra junk on end
return None
def parseTree(self):
if self.ttype != BEGIN:
return None
self.ttype = self.tokenizer.nextToken()
root = self.parseNode()
if root is None:
return None
while self.ttype in (BEGIN, ID, PERCENT, DOT):
if self.ttype == BEGIN:
subtree = self.parseTree()
self.adaptor.addChild(root, subtree)
else:
child = self.parseNode()
if child is None:
return None
self.adaptor.addChild(root, child)
if self.ttype != END:
return None
self.ttype = self.tokenizer.nextToken()
return root
def parseNode(self):
# "%label:" prefix
label = None
if self.ttype == PERCENT:
self.ttype = self.tokenizer.nextToken()
if self.ttype != ID:
return None
label = self.tokenizer.sval
self.ttype = self.tokenizer.nextToken()
if self.ttype != COLON:
return None
self.ttype = self.tokenizer.nextToken() # move to ID following colon
# Wildcard?
if self.ttype == DOT:
self.ttype = self.tokenizer.nextToken()
wildcardPayload = CommonToken(0, ".")
node = WildcardTreePattern(wildcardPayload)
if label is not None:
node.label = label
return node
# "ID" or "ID[arg]"
if self.ttype != ID:
return None
tokenName = self.tokenizer.sval
self.ttype = self.tokenizer.nextToken()
if tokenName == "nil":
return self.adaptor.nil()
text = tokenName
# check for arg
arg = None
if self.ttype == ARG:
arg = self.tokenizer.sval
text = arg
self.ttype = self.tokenizer.nextToken()
# create node
treeNodeType = self.wizard.getTokenType(tokenName)
if treeNodeType == INVALID_TOKEN_TYPE:
return None
node = self.adaptor.createFromType(treeNodeType, text)
if label is not None and isinstance(node, TreePattern):
node.label = label
if arg is not None and isinstance(node, TreePattern):
node.hasTextArg = True
return node
class TreePattern(CommonTree):
"""
When using %label:TOKENNAME in a tree for parse(), we must
track the label.
"""
def __init__(self, payload):
CommonTree.__init__(self, payload)
self.label = None
self.hasTextArg = None
def toString(self):
if self.label is not None:
return '%' + self.label + ':' + CommonTree.toString(self)
else:
return CommonTree.toString(self)
class WildcardTreePattern(TreePattern):
pass
class TreePatternTreeAdaptor(CommonTreeAdaptor):
"""This adaptor creates TreePattern objects for use during scan()"""
def createWithPayload(self, payload):
return TreePattern(payload)
class TreeWizard(object):
"""
Build and navigate trees with this object. Must know about the names
of tokens so you have to pass in a map or array of token names (from which
this class can build the map). I.e., Token DECL means nothing unless the
class can translate it to a token type.
In order to create nodes and navigate, this class needs a TreeAdaptor.
This class can build a token type -> node index for repeated use or for
iterating over the various nodes with a particular type.
This class works in conjunction with the TreeAdaptor rather than moving
all this functionality into the adaptor. An adaptor helps build and
navigate trees using methods. This class helps you do it with string
patterns like "(A B C)". You can create a tree from that pattern or
match subtrees against it.
"""
def __init__(self, adaptor=None, tokenNames=None, typeMap=None):
if adaptor is None:
self.adaptor = CommonTreeAdaptor()
else:
self.adaptor = adaptor
if typeMap is None:
self.tokenNameToTypeMap = computeTokenTypes(tokenNames)
else:
if tokenNames is not None:
raise ValueError("Can't have both tokenNames and typeMap")
self.tokenNameToTypeMap = typeMap
def getTokenType(self, tokenName):
"""Using the map of token names to token types, return the type."""
try:
return self.tokenNameToTypeMap[tokenName]
except KeyError:
return INVALID_TOKEN_TYPE
def create(self, pattern):
"""
Create a tree or node from the indicated tree pattern that closely
follows ANTLR tree grammar tree element syntax:
(root child1 ... childN).
You can also just pass in a node: ID
Any node can have a text argument: ID[foo]
(notice there are no quotes around foo--it's clear it's a string).
nil is a special name meaning "give me a nil node". Useful for
making lists: (nil A B C) is a list of A B C.
"""
tokenizer = TreePatternLexer(pattern)
parser = TreePatternParser(tokenizer, self, self.adaptor)
return parser.pattern()
def index(self, tree):
"""Walk the entire tree and make a token type to nodes mapping.
For now, use recursion, but a later nonrecursive version may be
more efficient. Returns a dict int -> list where the list is
of your AST node type. The int is the token type of the node.
"""
m = {}
self._index(tree, m)
return m
def _index(self, t, m):
"""Do the work for index"""
if t is None:
return
ttype = self.adaptor.getType(t)
elements = m.get(ttype)
if elements is None:
m[ttype] = elements = []
elements.append(t)
for i in range(self.adaptor.getChildCount(t)):
child = self.adaptor.getChild(t, i)
self._index(child, m)
def find(self, tree, what):
"""Return a list of matching tokens.
what may either be an integer specifying the token type to find or
a string with a pattern that must be matched.
"""
if isinstance(what, (int, long)):
return self._findTokenType(tree, what)
elif isinstance(what, basestring):
return self._findPattern(tree, what)
else:
raise TypeError("'what' must be string or integer")
def _findTokenType(self, t, ttype):
"""Return a List of tree nodes with token type ttype"""
nodes = []
def visitor(tree, parent, childIndex, labels):
nodes.append(tree)
self.visit(t, ttype, visitor)
return nodes
def _findPattern(self, t, pattern):
"""Return a List of subtrees matching pattern."""
subtrees = []
# Create a TreePattern from the pattern
tokenizer = TreePatternLexer(pattern)
parser = TreePatternParser(tokenizer, self, TreePatternTreeAdaptor())
tpattern = parser.pattern()
# don't allow invalid patterns
if (tpattern is None or tpattern.isNil()
or isinstance(tpattern, WildcardTreePattern)):
return None
rootTokenType = tpattern.getType()
def visitor(tree, parent, childIndex, label):
if self._parse(tree, tpattern, None):
subtrees.append(tree)
self.visit(t, rootTokenType, visitor)
return subtrees
def visit(self, tree, what, visitor):
"""Visit every node in tree matching what, invoking the visitor.
If what is a string, it is parsed as a pattern and only matching
subtrees will be visited.
The implementation uses the root node of the pattern in combination
with visit(t, ttype, visitor) so nil-rooted patterns are not allowed.
Patterns with wildcard roots are also not allowed.
If what is an integer, it is used as a token type and visit will match
all nodes of that type (this is faster than the pattern match).
The labels arg of the visitor action method is never set (it's None)
since using a token type rather than a pattern doesn't let us set a
label.
"""
if isinstance(what, (int, long)):
self._visitType(tree, None, 0, what, visitor)
elif isinstance(what, basestring):
self._visitPattern(tree, what, visitor)
else:
raise TypeError("'what' must be string or integer")
def _visitType(self, t, parent, childIndex, ttype, visitor):
"""Do the recursive work for visit"""
if t is None:
return
if self.adaptor.getType(t) == ttype:
visitor(t, parent, childIndex, None)
for i in range(self.adaptor.getChildCount(t)):
child = self.adaptor.getChild(t, i)
self._visitType(child, t, i, ttype, visitor)
def _visitPattern(self, tree, pattern, visitor):
"""
For all subtrees that match the pattern, execute the visit action.
"""
# Create a TreePattern from the pattern
tokenizer = TreePatternLexer(pattern)
parser = TreePatternParser(tokenizer, self, TreePatternTreeAdaptor())
tpattern = parser.pattern()
# don't allow invalid patterns
if (tpattern is None or tpattern.isNil()
or isinstance(tpattern, WildcardTreePattern)):
return
rootTokenType = tpattern.getType()
def rootvisitor(tree, parent, childIndex, labels):
labels = {}
if self._parse(tree, tpattern, labels):
visitor(tree, parent, childIndex, labels)
self.visit(tree, rootTokenType, rootvisitor)
def parse(self, t, pattern, labels=None):
"""
Given a pattern like (ASSIGN %lhs:ID %rhs:.) with optional labels
on the various nodes and '.' (dot) as the node/subtree wildcard,
return true if the pattern matches and fill the labels Map with
the labels pointing at the appropriate nodes. Return false if
the pattern is malformed or the tree does not match.
If a node specifies a text arg in pattern, then that must match
for that node in t.
"""
tokenizer = TreePatternLexer(pattern)
parser = TreePatternParser(tokenizer, self, TreePatternTreeAdaptor())
tpattern = parser.pattern()
return self._parse(t, tpattern, labels)
def _parse(self, t1, tpattern, labels):
"""
Do the work for parse. Check to see if the tpattern fits the
structure and token types in t1. Check text if the pattern has
text arguments on nodes. Fill labels map with pointers to nodes
in tree matched against nodes in pattern with labels.
"""
# make sure both are non-null
if t1 is None or tpattern is None:
return False
# check roots (wildcard matches anything)
if not isinstance(tpattern, WildcardTreePattern):
if self.adaptor.getType(t1) != tpattern.getType():
return False
# if pattern has text, check node text
if (tpattern.hasTextArg
and self.adaptor.getText(t1) != tpattern.getText()):
return False
if tpattern.label is not None and labels is not None:
# map label in pattern to node in t1
labels[tpattern.label] = t1
# check children
n1 = self.adaptor.getChildCount(t1)
n2 = tpattern.getChildCount()
if n1 != n2:
return False
for i in range(n1):
child1 = self.adaptor.getChild(t1, i)
child2 = tpattern.getChild(i)
if not self._parse(child1, child2, labels):
return False
return True
def equals(self, t1, t2, adaptor=None):
"""
Compare t1 and t2; return true if token types/text, structure match
exactly.
The trees are examined in their entirety so that (A B) does not match
(A B C) nor (A (B C)).
"""
if adaptor is None:
adaptor = self.adaptor
return self._equals(t1, t2, adaptor)
def _equals(self, t1, t2, adaptor):
# make sure both are non-null
if t1 is None or t2 is None:
return False
# check roots
if adaptor.getType(t1) != adaptor.getType(t2):
return False
if adaptor.getText(t1) != adaptor.getText(t2):
return False
# check children
n1 = adaptor.getChildCount(t1)
n2 = adaptor.getChildCount(t2)
if n1 != n2:
return False
for i in range(n1):
child1 = adaptor.getChild(t1, i)
child2 = adaptor.getChild(t2, i)
if not self._equals(child1, child2, adaptor):
return False
return True
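
The matching rules documented for `TreeWizard._parse` above (type check unless the pattern node is a wildcard, optional text check, label capture, then a strict child-by-child comparison) can be illustrated without the antlr3 runtime. This is a minimal sketch in Python 3 under stated assumptions: `Node`, `Pattern`, `matches` and the integer token types are hypothetical names, and patterns are built by hand rather than parsed from a string such as `(ASSIGN %lhs:ID %rhs:.)`.

```python
ASSIGN, ID, INT = 10, 11, 12  # hypothetical token types


class Node:
    def __init__(self, ttype, text=None, children=()):
        self.ttype = ttype
        self.text = text
        self.children = list(children)


class Pattern(Node):
    def __init__(self, ttype=None, text=None, children=(),
                 label=None, wildcard=False, has_text_arg=False):
        Node.__init__(self, ttype, text, children)
        self.label = label
        self.wildcard = wildcard
        self.has_text_arg = has_text_arg


def matches(tree, pattern, labels=None):
    """Return True if tree has the structure described by pattern."""
    if tree is None or pattern is None:
        return False
    if not pattern.wildcard:
        if tree.ttype != pattern.ttype:
            return False
        if pattern.has_text_arg and tree.text != pattern.text:
            return False
    if pattern.label is not None and labels is not None:
        # Record which tree node the labelled pattern node matched.
        labels[pattern.label] = tree
    if len(tree.children) != len(pattern.children):
        return False
    return all(matches(child, sub, labels)
               for child, sub in zip(tree.children, pattern.children))


# Tree for "x = 42" and a hand-built equivalent of (ASSIGN %lhs:ID %rhs:.)
tree = Node(ASSIGN, "=", [Node(ID, "x"), Node(INT, "42")])
pattern = Pattern(ASSIGN, children=[
    Pattern(ID, label="lhs"),
    Pattern(wildcard=True, label="rhs"),
])

labels = {}
matched = matches(tree, pattern, labels)
```

As in the original, the child counts must be equal even for a wildcard node, so a bare `.` in this sketch matches only a node without children.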