This is a translation of the German article posted at FrozenHand.

A bit more than 4 months later, I may still not have a prototype I could release, but what I do have is a working type system, an assembler, and parts of the compiler and the AST/code generator.

First I had a closer look at the ANTLR parser generator from the Java domain. ANTLR would have come with its own IDE and would even have created an AST automatically. Unfortunately the C# version was not working at that time, so I had to continue my search.

With Coco/R I have found an appropriate generator and am already using it successfully.

Pros:
Generates both scanner and parser
Uses a top-down algorithm and is thus easy to understand and debug
There is no separate runtime, all the logic is generated
Licensed under the GPL
Automatic error management and recovery (very basic)
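
To illustrate why a top-down parser is easy to follow, here is a minimal hand-written sketch of the recursive descent style, in which every grammar production becomes one method. This is not Coco/R output; the class name MiniParser, the tiny expression grammar and the error texts are all made up for this example.

```csharp
using System;

// Hypothetical example, not generated by Coco/R. The grammar it implements:
//   Expr   = Term   { ("+" | "-") Term }.
//   Term   = Factor { ("*" | "/") Factor }.
//   Factor = number | "(" Expr ")".
// Whitespace is not handled, to keep the sketch short.
public sealed class MiniParser
{
    private readonly string _text;
    private int _pos;

    private MiniParser(string text) { _text = text; }

    public static int Evaluate(string text)
    {
        MiniParser parser = new MiniParser(text);
        int value = parser.Expr();
        if (parser._pos != text.Length)
            throw new FormatException("Unexpected input at position " + parser._pos);
        return value;
    }

    // One method per production: a stack trace mirrors the derivation.
    private int Expr()
    {
        int value = Term();
        while (Peek() == '+' || Peek() == '-')
        {
            char op = Next();
            int right = Term();
            value = op == '+' ? value + right : value - right;
        }
        return value;
    }

    private int Term()
    {
        int value = Factor();
        while (Peek() == '*' || Peek() == '/')
        {
            char op = Next();
            int right = Factor();
            value = op == '*' ? value * right : value / right;
        }
        return value;
    }

    private int Factor()
    {
        if (Peek() == '(')
        {
            Next();
            int inner = Expr();
            if (Peek() != ')')
                throw new FormatException("Expected ')' at position " + _pos);
            Next();
            return inner;
        }
        int start = _pos;
        while (char.IsDigit(Peek())) _pos++;
        if (start == _pos)
            throw new FormatException("Expected number at position " + _pos);
        return int.Parse(_text.Substring(start, _pos - start));
    }

    // Returns '\0' at the end of input so callers need no bounds checks.
    private char Peek() { return _pos < _text.Length ? _text[_pos] : '\0'; }
    private char Next() { return _text[_pos++]; }
}
```

Calling MiniParser.Evaluate("(1+2)*3") walks Expr, Term and Factor in exactly the order the grammar suggests, so a breakpoint in any of these methods shows which production is currently being parsed.
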
Cons:
Total violation of .NET naming conventions
Generated files and classes are always named "Parser" / "Scanner" ( + ".cs")
Exceptions have to be manually translated into "errors"
No way to customize automatically generated error messages
No IDE integration
Especially the last point, the lack of IDE integration, results in an annoying scroll orgy once the grammar reaches a certain size. (Grammars, even context-free ones, are not one-dimensional constructs: one production can be referenced by multiple others.) In C# you can partition your code using #region directives. The attributed grammar language (*.atg) has no such concept.

I am trying to find out whether splitting the grammar up into multiple files will reduce the mess. At the same time I have integrated the invocation of the parser generator into the build process by using MSBuild. All I have to do is press Compile and PxCoco/R (my custom version of Coco/R) does its work. Should the *.atg file contain errors, Visual Studio 2005 will treat them like any C# error: they are listed in the errors tab, and double-clicking one takes you to the exact position in the grammar file where the error was reported.

A mapping between the embedded C# code in the grammar file and the generated output would be even more useful. Currently, if the C# compiler detects an error in the generated output, you have to look at the surrounding function, then switch over to the grammar definition and find the corresponding production.

Feature Update
More compiler optimizations
! and ? will not be part of identifiers but will serve as typecast and inline condition operators, ...
... instead you can use as part of identifiers
A single file format (*.pxs) for scripts, compiled applications and build instructions alike
No namespaces for now
User types will not be part of the initial releases but are definitely planned
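
A side note on the build integration described earlier: for grammar errors to show up in the Visual Studio error list, the generator has to report them in a form the build system understands. One common way, and only an assumption about how a tool like PxCoco/R could do it, is to print each error in the canonical MSBuild format "file(line,column): error CODE: text". The helper name BuildDiagnostics, the file name and the error code PX0001 below are invented for this sketch.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; not part of Coco/R or PxCoco/R.
public static class BuildDiagnostics
{
    // Formats a message as "file(line,column): error CODE: text",
    // the shape MSBuild recognizes in the output of an external tool.
    public static string FormatError(string file, int line, int column,
                                     string code, string message)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "{0}({1},{2}): error {3}: {4}",
            file, line, column, code, message);
    }
}
```

A generator could then write BuildDiagnostics.FormatError("Grammar.atg", 42, 7, "PX0001", "undefined production") to its output, which yields the line "Grammar.atg(42,7): error PX0001: undefined production". The file name and position in that line are what make double-clicking the entry jump to the right spot in the grammar.
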