Langbahn Team – Weltmeisterschaft

Draft:Tunnel Grammar Studio


Written inC++
Operating systemWindows
Available inEnglish
Licenseproprietary
Websitewww.experasoft.com/en/products/tgs/

The Tunnel Grammar Studio (TGS) is an Integrated Development Environment (IDE) for generating stand-alone parsing machines from given lexer and parser grammars. The accepted grammar syntax is based on the Augmented Backus–Naur form (ABNF) with an additional syntax for defining tokens with their lexemes. A graphical representation of the developed grammars is available directly into the IDE, which also includes a visual debugger that builds a syntax tree for a given input in forward and backward steps.

The parsers generated by TGS work with an optimized version of the tunnel parsing algorithm.[1]. The algorithm can parse an input linearly and deterministically on the basis of some types of ambiguous grammars[2] and grammars that have a deterministic push-down automaton. The algorithm can accept a sequence of tokens based on the their names and on the characters in their lexemes, case-sensitively or case-insensitively[3].

The tunnel parsing algorithm uses an additional stack with integer counters[4] to be able to handle integer number of repetitions that can be defined for an element in an ABNF grammar. The output of the tunnel parsing algorithm is a sequence of syntax structure construction commands that can be used by the builder module of the automatically generated parsing machine, to build a concrete or an abstract syntax tree with different levels of abstraction[5].

Each parsing machine might have different modules, such as: supplier, scanner, lexer, parser, optimizer, and builder[6]. The parser module in a generated parsing machine by TGS does not use the thread dedicated stack to recurse into the parsing rules (to minimize the possibility of stack-overflow errors), but instead operates over the input in iterative steps due to the use of the tunnel parsing algorithm. The creation of the syntax trees, their traversal, the conversion of their nodes into a string, and the tree's release can also be performed with iterative steps[7]

Example

An input string "Hello World" can be processed by a TGS generated lexer using the following lexer grammar:

; using the TGS's addition to the ABNF standard
word = 1*('a'-'z' / 'A'-'Z') 

; can also be written as "word = 1*(%x61-7A / %x41-5A)" by using the ABNF standard syntax

The result of the lexer's processing will be the following token sequence:

 t-sequence("word", "Hello")
 t-character(' '}
 t-sequence("word", "World")
 t-eof

The tokens can be parsed successfully on the basis of the following parser grammar with the tunnel parsing algorithm:

text = {word, %s"Hello"} *(' ' 0*1 {word}) 

; where: {word, %s"Hello"} defines an element that can accept tokens named "word" with a lexeme that matches case-sensitively (the %s prefix) to the string "Hello", {word} defines an element that matches tokens named "word" with any lexeme, and the characters ' ' and ',' can be written as %x20 and %x2C respectively by using the ABNF standard syntax

One of the syntax trees that can be automatically build by the generated, compiled and executed parsing machine for the grammars and the input is shown in the following image:

A graphical representation of the syntax tree
A graphical representation of the syntax tree

References

  1. ^ Handzhiyski, Nikolay (2024). "An iterative parsing algorithm with application in the profiling of parsers". University of Plovdiv Paisii Hilendarski.
  2. ^ Handzhiyski, Nikolay; Somova, Elena (2023). "Tunnel Parsing with Ambigous Grammars". Cybernetics and Information Technologies. 23 (2): 34–53. doi:10.2478/cait-2023-0012.
  3. ^ Handzhiyski, Nikolay; Somova, Elena (2022). "Tunnel Parsing with the Token's Lexeme". Cybernetics and Information Technologies. 22 (2): 124–144. doi:10.2478/cait-2022-0021.
  4. ^ Handzhiyski, Nikolay; Somova, Elena (2020). "Tunnel Parsing with counted repetitions". Computer Science. 21 (4): 441–462. doi:10.7494/csci.2020.21.4.3753.
  5. ^ Handzhiyski, Nikolay; Somova, Elena (2021). "The Expressive Power of the Statically Typed Concrete Syntax Trees". Proceedings of the 14th International Conference Education and Research in the Information Society.
  6. ^ Handzhiyski, Nikolay; Somova, Elena (2021). "A Parsing Machine Architecture Encapsulating Different Parsing Approaches". International Journal on Information Technologies and Security (IJITS). 13 (3): 27–38.
  7. ^ Handzhiyski, Nikolay; Somova, Elena (2023). "Tunnel Parsing". Composability, Comprehensibility and Correctness of Working Software. Lecture Notes in Computer Science. Vol. 11950. pp. 325–343. doi:10.1007/978-3-031-42833-3_8. ISBN 978-3-031-42832-6.