What this post is about?
it's a simple example of how context-free grammar parsers can be designed & implemented.
it's fairly advanced topic, requires knowledge of programming, programming languages, regular expressions & related.
perhaps it's too few words, please contact me if You wish something elaborated.
our parser will accept texts written in language with:
1. integer numbers & boolean values.
2. arithmetic operations on integer numbers:
3. arithmetic operations on boolean values:
4. expressions & grouping operations (parentheses).
5. variables & assignment instruction.
6. conditional instruction if.
7. input & print instructions.
this should enable for basic arithmetic/logic operations whose results can be printed on screen.
they are mandatory for any serious programming language in my opinion, input & print instructions as well - even if they are only standard library functions.
first we'll design language's grammar, which consists of productions, terminal symbols, nonterminal symbols & separators written in a serie of code lines.
* token -> instruction | expression
* instruction -> conditional_instruction | assignment_instruction | variable_declaration | input_instruction | print_instruction
* conditional_instruction -> if ( boolean_expression) instruction;
* assignment_instruction -> identifier = expression;
* variable_declaration -> variable_type identifier;
* variable_type -> int | boolean
* print_instruction -> print identifier;
* expression -> number_expression | boolean_expression | ( expression ) | input_instruction
* number_expression -> number | ( number_expression ) | -number_expression | number_expression + number_expression | number_expression - number_expression | number_expression * number_expression | number_expression / number_expression
* boolean_expression -> boolean_value | ( boolean_expression ) | not boolean_expression | boolean_expression or boolean_expression | boolean_expression and boolean_expression
in this language we do not allow for conversion between types, for simplicity.
what is an identifier, number, boolean_value etc... we'll define a little later, during tokenization process. terminal symbols of grammar will be defined there, as well as other things.
(to be continued, if/when i can).
See also: Context-Free Grammars, Simple compiler design.