Antlr 4 with C# and Visual Studio

It’s been more than a year since I posted anything to Programming Pages, so I figure I should rectify that (I’ve been busy mainly with my alter ego, Physics Pages). Recently, I had another look at the parsing package Antlr and discovered that a whole new version has come out (Antlr 4), along with a book, The Definitive Antlr 4 Reference, written by Antlr’s author, Terence Parr. However, as with Antlr 3, the book is written exclusively in Java, so a fair bit of detective work is needed to discover how to use it with C# in Visual Studio.
The code required both to specify the lexer and parser, and to write the supporting C#, has changed almost completely from Antlr 3, so a fresh tutorial is needed.

Installation in Visual Studio

Support for Antlr 4 exists as an extension for an earlier release of Visual Studio (I no longer have this installed so I can’t test it), but not, it seems, for the release used here. To get the files you need, visit the Antlr download page and download the zipped package offered there for C#.

In the project directory, create a folder named Reference and within the Reference folder create another folder called Antlr. Unzip the entire contents of the downloaded archive into this folder.
You’ll need to do this for each project you create. Then, in Solution Explorer, right-click on the project and open its project file for editing. Scroll to near the end of the file, where you should find the line that begins Import Project=; this is the part of the project file that has to be changed so that the build knows about Antlr. Then right-click on References and add the correct version of the runtime dll from the Antlr folder. The version should match the version of .NET that you’re using in your project, so if you’re using .NET 4.5, load the file built for that version. Now your project should be all set.

One final note: you’ll need Java to be installed in order to run Antlr. This is true even if you’re coding in C#, since Antlr itself is written in Java and is called to generate the code for your lexer and parser.

Writing an Antlr 4 grammar

You should now be in a position to start work on the actual code.
If you installed the VS extension above, you can add an Antlr grammar file to your project via Add -> New Item. In the dialog you should see 7 Antlr items; 3 of these are for Antlr 4 files, with the other 4 being for Antlr 3. Since we’ll be working with both a lexer and a parser, select the Antlr 4 item that combines the two in a single grammar, and name it Calculator. In Solution Explorer you should then find the new Calculator grammar file. What we’re interested in are the parser and lexer rules. The syntax for these has changed significantly from Antlr 3, to the extent that any grammar files you may have written for the earlier version very probably won’t work in Antlr 4.
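The grammar listing itself is not reproduced here, so the following is only a sketch of a grammar consistent with the description in this section, not the original file. The alternative labels (Int, Parens, MulDiv, AddSub) and the INT and WS rules are named in the text; the start rule name prog, the operator label op and the token names MUL, DIV, ADD and SUB are assumptions made for illustration.

```antlr
grammar Calculator;

// Parser rules. The start rule name "prog" is an assumption.
prog : expr+ ;

// Each alternative carries a "# Label" tag; Antlr generates one
// Visit method per label. The label "op" on the operator token is
// an assumed name.
expr : expr op=(MUL | DIV) expr   # MulDiv
     | expr op=(ADD | SUB) expr   # AddSub
     | INT                        # Int
     | '(' expr ')'               # Parens
     ;

// Lexer rules. Token names for the operators are assumptions.
INT : [0-9]+ ;
MUL : '*' ;
DIV : '/' ;
ADD : '+' ;
SUB : '-' ;

// Whitespace: blanks, returns and newlines are discarded.
WS  : [ \r\n]+ -> skip ;
```

Listing the MulDiv alternative before AddSub is what gives multiplication and division the higher precedence, since Antlr 4 resolves ambiguity between alternatives in favour of the one written first.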
The INT token is defined using the usual regular expression for one or more digits. The four arithmetic operations are given labels that we’ll use later. Finally, we’ve modified the WS (whitespace) rule so it includes blanks, returns and newlines.

Each alternative in the parser rule for an expression ends with a label introduced by #. These are not comments; rather they are tags that are used in writing the code that tells the parser what to do when each of these expressions is found. This is where the biggest difference between Antlr 3 and Antlr 4 occurs: in Antlr 4, there is no code in the target language (C# here) written in the grammar file. All such code is moved elsewhere in the program. This makes the grammar file language-independent, so once you’ve written it you could drop it into another project and use it to support a Java application, or an application in any other language for which an Antlr 4 target is defined.

Using the grammar in a C# program
So how exactly do you use the grammar to interpret an input string? Before we can use the lexer and parser, we need to write the code that should be run when each type of expression is parsed from the input. To do this, we need to write a C# class called a visitor. Antlr 4 provides a base class for your visitor class, named CalculatorBaseVisitor<Result>. Your job is to override some or all of the methods in the base class so that they run the code you want in response to each bit of parsed input.

In our case, we want the calculator to return an int for each expression it calculates, so create a new class called (say) CalculatorVisitor and make it inherit CalculatorBaseVisitor<int>. To begin with, the skeleton class contains nothing but this declaration. Within the class, start typing an override and look at the methods on offer that start with Visit. Among these you should find a method for each label you assigned in the grammar file. Thus you should have VisitInt, VisitParens, VisitMulDiv and VisitAddSub. These are the methods you need to override. The other Visit methods you can ignore, as the versions provided in the base class will work fine.
Here’s the complete class. We’ll discuss the code in a minute.
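The listing is not reproduced here, so what follows is a sketch of a visitor that matches the discussion below, not the original code. It assumes Antlr has already generated CalculatorBaseVisitor<T> and CalculatorParser in the Calculator namespace, that the operator token in the grammar carries a label called op, and that the operator tokens are named MUL, DIV, ADD and SUB; all of those names are assumptions.

```csharp
namespace Calculator
{
    // Evaluates the parse tree, returning an int for every expression.
    // CalculatorBaseVisitor<T> and CalculatorParser are generated by
    // Antlr from the grammar file; they are not written by hand.
    class CalculatorVisitor : CalculatorBaseVisitor<int>
    {
        // A single integer: convert the token text to an int.
        public override int VisitInt(CalculatorParser.IntContext context)
        {
            return int.Parse(context.INT().GetText());
        }

        // expr (+|-) expr: evaluate both operands, then apply the operator.
        // "op" and the token name ADD are assumed names from the grammar.
        public override int VisitAddSub(CalculatorParser.AddSubContext context)
        {
            int left = Visit(context.expr(0));
            int right = Visit(context.expr(1));
            return context.op.Type == CalculatorParser.ADD
                ? left + right
                : left - right;
        }

        // expr (*|/) expr: same pattern; division is integer division.
        public override int VisitMulDiv(CalculatorParser.MulDivContext context)
        {
            int left = Visit(context.expr(0));
            int right = Visit(context.expr(1));
            return context.op.Type == CalculatorParser.MUL
                ? left * right
                : left / right;
        }

        // ( expr ): the value is whatever the inner expr evaluates to.
        public override int VisitParens(CalculatorParser.ParensContext context)
        {
            return Visit(context.expr());
        }
    }
}
```

Because the visitor returns int, division truncates, and dividing by zero throws a DivideByZeroException; a fuller calculator would probably want to handle both.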
Look first at the VisitInt() method. This is called when the object being parsed is a single integer, so what you want to return is the value of this integer. We can get this by calling the INT() method of the context object, and then GetText() on the result. This returns the integer as a string, so we need to use int.Parse to convert it to an int.

Now look at the VisitParens() method at the bottom. Here, the contents of the parentheses could be an expr of arbitrary complexity, but we want to return whatever that expr evaluates to as the result. This is what the inherited Visit() method does: it takes an expr as an argument and calls the correct method depending on the type of the expr. The expr() method of the context object supplies the expr found inside the parentheses.

That leaves VisitMulDiv() and VisitAddSub(). Both of these represent binary operators, so there are two subsidiary exprs to evaluate before the operator is applied. In each case, we evaluate the left and right operands by calling Visit(context.expr(0)) and Visit(context.expr(1)) respectively. Then we check which operator appears in the expression.
Note that in the grammar file we gave a name to the operator token in each of these two rules; that name shows up as a field of the context object. Also note that the labels we gave to the four arithmetic operators in the lexer rules show up as fields within the CalculatorParser class, so we can compare the Type of the operator token with those fields to find out which operator we have. Once we know that, we can return the correct calculation.

At long last, we’re ready to look at the code that uses all this stuff. Here’s the Main() function.
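The Main() listing is not reproduced here, so this is a sketch of the steps the discussion walks through rather than the original code. CalculatorLexer and CalculatorParser are the classes Antlr generates from a grammar called Calculator; the start rule name prog, and therefore the call parser.prog(), is an assumption, as is the visitor class name CalculatorVisitor.

```csharp
using System;
using System.IO;
using Antlr4.Runtime;
using Antlr4.Runtime.Tree;

namespace Calculator
{
    class Program
    {
        static void Main(string[] args)
        {
            // Read everything typed at the console up to end-of-file.
            StreamReader inputStream = new StreamReader(Console.OpenStandardInput());

            // Pass the text as a string rather than handing over the raw stream.
            AntlrInputStream input = new AntlrInputStream(inputStream.ReadToEnd());

            // Lexer -> token stream -> parser.
            CalculatorLexer lexer = new CalculatorLexer(input);
            CommonTokenStream tokens = new CommonTokenStream(lexer);
            CalculatorParser parser = new CalculatorParser(tokens);

            // Run the parser from the start rule ("prog" is an assumed name).
            IParseTree tree = parser.prog();

            // Walk the tree with the visitor and print the result.
            CalculatorVisitor visitor = new CalculatorVisitor();
            Console.WriteLine(visitor.Visit(tree));
        }
    }
}
```

Reading the whole input with ReadToEnd() before constructing the AntlrInputStream is a deliberate choice here: it sidesteps any dependence on how the runtime reads from a live console stream.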
The string read from the input stream (via inputStream.ReadToEnd()) is passed to an AntlrInputStream.

Important note: AntlrInputStream is supposed to accept a raw Stream object for its input, but I couldn’t get this to work. The program just hung when attempting to read from the Stream directly. It seems this is a bug in Antlr 4, so it may be fixed in a future release.

The lexer is then created with the AntlrInputStream, and the tokens from the lexer are saved in a CommonTokenStream, which is then passed to the parser. The parser is then run by calling the parser method that corresponds to the grammar’s start rule. The output from the parser is saved in an IParseTree, which is then passed to a CalculatorVisitor to do the evaluations. Finally, the result of the parse and evaluation is printed.

To run the program, type one or more expressions in the console (you can separate them by newlines or blanks). When you’re done, type Control-Z on a line by itself and the output from the program should then be displayed.