In this post I describe how to use Python Lex-Yacc (PLY) in C# with IronPython.

Introduction

IronPython is a .NET implementation of Python interpreter. It is based on Python 2.7.x and can be executed as separate application or run directly from .NET applications.

PLY is a Python implementation of Lex and Yacc. These tools are used to write compilers — they are capable of parsing file using LALR grammar and produce AST.

In this post I describe how to execute PLY from C# using IronPython. I have verified all steps in Visual Studio 2012-2015.

Let’s go

First, you need to create a project and install IronPython using Nuget. You can do it from package manager command line or by using Visual Studio package manager GUI. After installing IronPython you should have the following references in your project:

  • IronPython
  • IronPython.Modules
  • Microsoft.Dynamic
  • Microsoft.Scripting
  • Microsoft.Scripting.Metadata

Next, you need to download PLY files. Create a directory Lib\ply in your project directory and put there all PLY files (__init__.py, cpp.py, ctokens.py, lex.py, and yacc.py).

Since PLY uses some popular Python packages, you probably need to extract some files from ordinary Python distribution. In order to do that, just download Pyton 2.7.x, zip its Lib directory to python27.zip, and put the file in Lib directory in your project. You can remove redundant files from the archive and leave only ones required by PLY, however, it is not necessary for now.

Make sure, that all Lib files (python27.zip and PLY scripts) are copied to output directory — select them in Solution Explorer, hit F4, and set the flag appropriately. Also, change build action for python27.zip to Embedded Resource.

Now it is time to write some code. First, create a script Parsing\parser.py with method parse accepting text to parse:

def parse(text):
    if lex == 0 or yacc == 0:
        raise RuntimeError("Not initialized")

    global errors
    errors = []

    parsedObject = parser.parse(text, lexer=lexer)
    return parsedObject

This is just a stub of a parsing method which will invoke parsing using PLY. I leave all the rest for you, since the rest is just a PLY code (grammar definitions, code for building AST etc).

You also need another one function in your script, which will be used to load all PLY scripts:

def initialize(plyBasePath):
    global yacc
    global lex
    global sys
    global clr
    global parser
    global lexer
    global System

    import imp
    import sys
    import clr

    lex = imp.load_source('ply.lex', plyBasePath + '\\lex.py')
    yacc = imp.load_source('ply.yacc',  plyBasePath + '\\yacc.py')
    lexer = lex.lex(module = sys.modules[__name__], debug=1)
    parser = yacc.yacc(module = sys.modules[__name__])

    clr.AddReference("System")
    import System

This function defines variables for PLY, loads some .NET libraries, end prepares parser and lexer. Notice that it is running PLY in debug mode but it is only for tutorial purposes.

Make sure that parser.py is copied to output directory.

This is it when it comes to Python part (not including grammar and rest of PLY actual code). Now let’s move to C# part.

Let’s define a class for parser:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;
using IronPython.Hosting;
using IronPython.Modules;
using Microsoft.Scripting.Hosting;

namespace Parsing
{
    public class Parser
    {
        private static dynamic _ipy;

        public Parser()
        {
            _ipy = _ipy ?? CreateEngine();
        }

        private dynamic CreateEngine()
        {
            ScriptRuntimeSetup setup = Python.CreateRuntimeSetup(GetRuntimeOptions());
            var pyRuntime = new ScriptRuntime(setup);
            ScriptEngine engineInstance = Python.GetEngine(pyRuntime);

            AddPythonLibrariesToSysMetaPath(engineInstance);

            dynamic ipy = pyRuntime.UseFile(@"Parsing\parser.py");
            ipy.initialize(GetPlyPath());

            return ipy;
        }

        private static Dictionary < string, object> GetRuntimeOptions()
        {
            var options = new Dictionary < string, object>();
            options["Debug"] = false;
            return options;
        }

        private static string GetPlyPath()
        {
            return Path.Combine(Environment.CurrentDirectory, "Lib", "ply");
        }
    }
}

We have simple Parser class which creates an IronPython runtime. You can notice that I am passing Debug flag to runtime so it will print out all debugging informations when something wrong happens.

As we can see, first I create a runtime with runtime options. Next, I add Python libraries to Python paths so our scripts will be able to use them. Next, I load a file parser.py and call its initialize method (the one which I described above).

Below is a function which adds Python libraries to path:

private void AddPythonLibrariesToSysMetaPath(ScriptEngine engineInstance)
{
	Assembly asm = GetType().Assembly;
	IEnumerable < string> resQuery =
		from name in asm.GetManifestResourceNames()
		where name.ToLowerInvariant().EndsWith("python27.zip")
		select name;
	string resName = resQuery.Single();
	var importer = new ResourceMetaPathImporter(asm, resName);
	dynamic sys = engineInstance.GetSysModule();
	sys.meta_path.append(importer);
	sys.path.append(importer);
}

First, we get current executing assembly. Next, we access Python libraries stored in a zip file which is in resources. Next, we use importer to extract them in memory and add to path.

Finally, here comes the crucial method for parsing code:

public object Parse(string content)
{
	return _ipy.parse(content);
}

We simply pass a file content to IronPython runtime and call parse method from our script.