This is the tenth part of the YAUL series. For your convenience you can find other parts in the table of contents in Part 1 — Introduction
Our parser is almost done; only a few things are left. Let's begin.
Program structure
So far we have focused on parsing individual constructs. Now it is time to parse the whole program:
def p_start(p):
    """start : program"""
    p[0] = p[1]

def p_program_not_empty(p):
    """program : list_function_statement"""
    p[0] = Compiler.FunctionStatementBlock()
    p[0].Statements = System.Collections.Generic.List[Compiler.IStatement](p[1][0])
    p[0].Functions = System.Collections.Generic.List[Compiler.FunctionDefinition](p[1][1])

def p_program_empty(p):
    """program : """
    p[0] = Compiler.FunctionStatementBlock()
    p[0].Statements = System.Collections.Generic.List[Compiler.IStatement]([])
    p[0].Functions = System.Collections.Generic.List[Compiler.FunctionDefinition]([])

def p_list_function_statement_first_statement(p):
    """list_function_statement : statement"""
    p[0] = [p[1]], []

def p_list_function_statement_first_function(p):
    """list_function_statement : function_decl"""
    p[0] = [], [p[1]]

def p_list_function_statement_next_statement(p):
    """list_function_statement : list_function_statement statement"""
    p[0] = p[1]
    p[0][0].append(p[2])

def p_list_function_statement_next_function(p):
    """list_function_statement : list_function_statement function_decl"""
    p[0] = p[1]
    p[0][1].append(p[2])
A program is simply a list of functions and statements. As we go through the source code we collect them into a pair of lists, statements first and functions second, and hand both to the FunctionStatementBlock.
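To see how the two lists grow, here is a minimal sketch that replays the four `list_function_statement` actions by hand, without PLY. The helpers `reduce_first`, `reduce_next` and `collect`, as well as the string placeholders, are made up for illustration; in the real parser `p` is supplied by PLY and the items are AST nodes.

```python
# Hypothetical stand-ins for the grammar actions above; plain Python, no PLY.
def reduce_first(item, is_function):
    """Mimics the two 'first' rules: start a (statements, functions) pair."""
    return ([], [item]) if is_function else ([item], [])

def reduce_next(pair, item, is_function):
    """Mimics the two 'next' rules: append to the matching list in place."""
    statements, functions = pair
    (functions if is_function else statements).append(item)
    return pair

def collect(items):
    """Fold (item, is_function) entries in source order, as the parser would."""
    pair = None
    for item, is_function in items:
        if pair is None:
            pair = reduce_first(item, is_function)
        else:
            pair = reduce_next(pair, item, is_function)
    # An empty program corresponds to the 'program :' rule: two empty lists.
    return pair if pair is not None else ([], [])

source_order = [("x = 1;", False), ("function f", True), ("print x;", False)]
statements, functions = collect(source_order)
print(statements)  # ['x = 1;', 'print x;']
print(functions)   # ['function f']
```

The tuple itself is never modified, only the lists inside it, which is why assigning `p[0] = p[1]` and then calling `append` is enough in the "next" rules.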
Error handling
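We would like to diagnose common parsing errors, such as a missing semicolon or malformed syntax, and then keep parsing so that a single run can report more than one error. Recovering after an error in this way is called parser resynchronization. Here is the code: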
# resynchronization
def p_block_error(p):
    """block : '{' error '}'"""
    pass

def p_block_error_multiple(p):
    """block : '{' list_statement error '}'"""
    pass

def p_statement_error(p):
    """statement : error ';'"""

def p_function_decl_with_params2(p):
    """function_decl : error FUNCTION IDENT '(' list_param ')' block """
    p[0] = Compiler.FunctionDefinition()
    p[0].Name = p[3]
    p[0].Body = p[7]
    p[0].Parameters = System.Collections.Generic.List[Compiler.Parameter](p[5])

def p_error(p):
    global errors
    if not p:
        # p is None at end of input, so there is no token to inspect below.
        errors.append("Syntax error - unexpected EOF ")
        return

    errors.append("Line {0:3}:\tSyntax error - unexpected '{1}' ".format(p.lineno, str(p.value)))

    if p.type == '}':
        yacc.errok()  # skip additional }

    if p.type == 'FUNCTION':
        # resynchronize on function keyword
        return p
This is just basic error handling. In practice we would want better error descriptions, such as what the parser expected at that point, but for now this is sufficient.
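The idea behind rules such as `statement : error ';'` is often called panic-mode recovery: after an unexpected token, input is discarded up to a synchronizing token and parsing continues from there. The sketch below shows only that idea on a flat token list. It is not PLY's actual algorithm, and `ParseError`, `parse_statement`, `parse_program` and the toy rule IDENT '=' NUMBER ';' are assumptions made for the example.

```python
class ParseError(Exception):
    """Carries the index of the offending token (hypothetical helper)."""
    def __init__(self, message, position):
        Exception.__init__(self, message)
        self.position = position

def parse_statement(tokens, pos):
    """Toy rule (assumption): IDENT '=' NUMBER ';'. Returns the next position."""
    checks = (str.isidentifier, lambda t: t == "=", str.isdigit, lambda t: t == ";")
    for check in checks:
        if pos >= len(tokens):
            raise ParseError("unexpected EOF", pos)
        if not check(tokens[pos]):
            raise ParseError("unexpected '{0}'".format(tokens[pos]), pos)
        pos += 1
    return pos

def parse_program(tokens, sync=";"):
    """Parse all statements; after an error, skip to just past the next sync token."""
    parsed, errors = 0, []
    pos = 0
    while pos < len(tokens):
        try:
            pos = parse_statement(tokens, pos)
            parsed += 1
        except ParseError as error:
            errors.append("Token {0}: syntax error - {1}".format(error.position, str(error)))
            pos = error.position
            # Panic mode: discard tokens up to and including the sync token.
            while pos < len(tokens) and tokens[pos] != sync:
                pos += 1
            pos += 1
    return parsed, errors

tokens = ["a", "=", "1", ";", "b", "=", "=", ";", "c", "=", "3", ";"]
parsed, errors = parse_program(tokens)
print(parsed)  # 2
print(errors)  # ["Token 6: syntax error - unexpected '='"]
```

The broken second statement costs one error message, and the third statement is still parsed. That is the behaviour the `error` rules above aim for.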
Missing stuff
And here are the remaining pieces of the parser, the initialization code and a simple entry point:
def initialize(plyBasePath):
    # Everything imported or created here is published as a module-level global.
    global yacc
    global lex
    global sys
    global clr
    global imp
    global parser
    global lexer
    global System
    global Compiler
    global ConstantExpression

    import imp
    import sys
    import clr
    import os

    # Load the .NET assemblies (IronPython).
    clr.AddReference("System")
    clr.AddReference("System.Core")
    clr.AddReference("Compiler")

    import System
    import Compiler

    # Load PLY from the given directory and build the lexer and parser from this module.
    lex = imp.load_source('ply.lex', plyBasePath + '\\lex.py')
    yacc = imp.load_source('ply.yacc', plyBasePath + '\\yacc.py')
    lexer = lex.lex(module = sys.modules[__name__], debug=1)
    parser = yacc.yacc(module = sys.modules[__name__], debug=1)

def parse(text):
    return parser.parse(text, lexer=lexer)

# NOTE: sys, lexer and parser only exist after initialize(plyBasePath) has run,
# so this entry point assumes initialize() has already been called.
if __name__ == '__main__':
    if len(sys.argv) != 2:
        print("Not valid number of arguments")
        sys.exit(1)

    filename = sys.argv[1]
    file_content = open(filename, "r").read()

    astTree = parse(file_content)

    if errors:
        print "INTERPRETATION FAILED!"
        for error in errors:
            print error
    else:
        try:
            scriptExpression = Compiler.CreateExpressionFromAST(astTree)
            Compiler.ExecuteScript(scriptExpression)
        except Exception as e:
            print "INTERPRETATION FAILED!"
            print e

    # Keep the console window open until the user presses Enter.
    a = raw_input()
Summary
That is all for the PLY part. Next time we will finish the C# part, and our language will be complete.