This is the tenth part of YAUL series. For your convenience you can find other parts in the table of contents in Part 1 — Introduction

Our parser is almost done; there are just a few things left. Let’s begin.

Program structure

So far we have focused on parsing individual constructs. Now it is high time to parse the whole application:

def p_start(p):
    """start : program"""
    p[0] = p[1]

def p_program_not_empty(p):
    """program  : list_function_statement"""
    p[0] = Compiler.FunctionStatementBlock()
    p[0].Statements = System.Collections.Generic.List[Compiler.IStatement] ( p[1][0] )
    p[0].Functions = System.Collections.Generic.List[Compiler.FunctionDefinition]( p[1][1] )

def p_program_empty(p):
    """program  : """
    p[0] = Compiler.FunctionStatementBlock()
    p[0].Statements = System.Collections.Generic.List[Compiler.IStatement] ( [] )
    p[0].Functions = System.Collections.Generic.List[Compiler.FunctionDefinition]( [] )

def p_list_function_statement_first_statement(p):
    """list_function_statement : statement"""
    p[0] = [p[1]], []

def p_list_function_statement_first_function(p):
    """list_function_statement : function_decl"""
    p[0] = [], [p[1]]

def p_list_function_statement_next_statement(p):
    """list_function_statement : list_function_statement statement"""
    p[1][0].append(p[2])  # add the new statement to the statements list
    p[0] = p[1]

def p_list_function_statement_next_function(p):
    """list_function_statement : list_function_statement function_decl"""
    p[1][1].append(p[2])  # add the new function to the functions list
    p[0] = p[1]

Our application is simply a list of functions and statements. We go through the source code and collect all of them.
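
To make the accumulation concrete, here is a minimal sketch in plain Python, without PLY, that replays the same idea by hand. The `Statement` and `FunctionDecl` classes and the `collect` helper are hypothetical stand-ins for the AST nodes and the grammar actions; they are not part of the actual compiler.

```python
class Statement(object):
    """Hypothetical stand-in for a parsed statement node."""
    def __init__(self, text):
        self.text = text


class FunctionDecl(object):
    """Hypothetical stand-in for a parsed function definition node."""
    def __init__(self, name):
        self.name = name


def collect(nodes):
    """Keep a (statements, functions) pair, as the list_function_statement
    rules do, and append each node to the matching list in source order."""
    statements, functions = [], []
    for node in nodes:
        if isinstance(node, FunctionDecl):
            functions.append(node)
        else:
            statements.append(node)
    return statements, functions


# A tiny "program": two statements with a function definition in between.
nodes = [Statement("a = 1;"), FunctionDecl("foo"), Statement("foo();")]
statements, functions = collect(nodes)
```

The pair returned here plays the role of `p[1]` in `p_program_not_empty`, where the first element becomes `Statements` and the second becomes `Functions`.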

Error handling

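We would like to diagnose common parsing errors, such as a missing semicolon or malformed syntax, and keep parsing afterwards so that one mistake does not hide the rest. Bringing the parser back to a known state after an error is called parser resynchronization. Here is the code: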

# resynchronization
def p_block_error(p):
    """block : '{' error '}'"""

def p_block_error_multiple(p):
    """block : '{' list_statement error '}'"""

def p_statement_error(p):
    """statement : error ';'"""

def p_function_decl_with_params2(p):
    """function_decl : error FUNCTION IDENT '(' list_param ')' block """
    p[0] = Compiler.FunctionDefinition()
    p[0].Name = p[3]
    p[0].Body = p[7]
    p[0].Parameters = System.Collections.Generic.List[Compiler.Parameter] (p[5])

def p_error(p):
    global errors
    if p:
        errors.append("Line {0:3}:\tSyntax error - unexpected '{1}' ".format(p.lineno, str(p.value)))
    else:
        # p is None when the input ends unexpectedly
        errors.append("Syntax error - unexpected EOF ")
        return

    if p.type == '}':
        yacc.errok()  # skip additional }

    if p.type == 'FUNCTION':  # resynchronize on function keyword
        return p

This is just basic error handling. In practice we would like better error descriptions, such as what the parser expected at that point in the code, but for now this is sufficient.
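
The idea behind resynchronization can be sketched without PLY: when an unexpected token shows up, record an error, then discard tokens until one that can safely end or start a construct (here `;` or the `FUNCTION` keyword). The `Token` tuple, the `SYNC_TYPES` set and the `format_error` and `skip_to_sync` helpers are assumptions made for illustration only; PLY performs its own recovery through the `error` token used in the rules above.

```python
from collections import namedtuple

# Hypothetical token shape, loosely modelled on the fields p_error reads.
Token = namedtuple("Token", ["type", "value", "lineno"])

# Assumed synchronizing tokens: end of a statement and start of a function.
SYNC_TYPES = {";", "FUNCTION"}


def format_error(token):
    """Build a message in the spirit of p_error; None means end of input."""
    if token is None:
        return "Syntax error - unexpected EOF"
    return "Line {0:3}:\tSyntax error - unexpected '{1}'".format(
        token.lineno, token.value)


def skip_to_sync(tokens, start):
    """Return the index of the first synchronizing token at or after start,
    or len(tokens) if the input ends first."""
    index = start
    while index < len(tokens) and tokens[index].type not in SYNC_TYPES:
        index += 1
    return index


tokens = [
    Token("IDENT", "x", 1),
    Token("NUMBER", 5, 1),  # pretend the parser did not expect this token
    Token(";", ";", 1),
    Token("FUNCTION", "function", 2),
]
message = format_error(tokens[1])    # describe the offending token
resume_at = skip_to_sync(tokens, 1)  # parsing could resume at the ';'
```

Choosing the synchronizing set is a design decision: too few tokens and a single mistake swallows a large part of the file, too many and the parser resumes in the middle of a broken construct and reports follow-up errors.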

Missing stuff

And here are the remaining pieces of the parser:

def initialize(plyBasePath):
    global yacc
    global lex
    global sys
    global clr
    global imp
    global parser
    global lexer
    global System
    global Compiler
    global ConstantExpression

    import imp
    import sys
    import clr
    import os

    import System
    import Compiler

    lex = imp.load_source('ply.lex', plyBasePath + '\\')
    yacc = imp.load_source('ply.yacc',  plyBasePath + '\\')
    lexer = lex.lex(module = sys.modules[__name__], debug=1)
    parser = yacc.yacc(module = sys.modules[__name__], debug=1)

def parse(text):
    return parser.parse(text, lexer=lexer)

if __name__ == '__main__':
    if len(sys.argv) != 2:
        print("Not valid number of arguments")
        sys.exit(1)  # nothing to parse without a file name

    filename = sys.argv[1]
    file_content = open(filename, "r").read()

    astTree = parse(file_content)

    if errors:
        # report every collected syntax error and stop
        for error in errors:
            print error
    else:
        try:
            scriptExpression = Compiler.CreateExpressionFromAST(astTree)
        except Exception as e:
            print "INTERPRETATION FAILED!"
            print e



That is it for the PLY part. Next time we are going to finish the C# part and our language will be complete.