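YACC (Yet Another Compiler Compiler) is a parser generator written in C. From a grammar specification it generates a C parser that checks whether the input given to a program, for example source code written in some programming language, follows the expected format.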
Syntax analysis is the second phase of a compiler and is machine independent. Its input is the stream of tokens produced by lexical analysis, and its output is the syntax tree, also called the parse tree. Parsing (another name for syntax analysis) is the process of constructing the parse tree according to the rules specified by the programming language.
For example, consider the statement 3+4.
If we specify the rule as "digit op digit" in infix notation, then the input 3+4 is valid and gives the correct result, which is 7.
If we give the input as 34+ instead, the parser reports a syntax error, because the rule is specified for infix notation and not for postfix notation.
The syntax rules are specified by a Context Free Grammar (CFG). Let's see an example. Suppose we have to write the rules for a calculator that performs the four basic operations: addition, subtraction, multiplication and division. The rules for addition and subtraction are written as follows (multiplication and division follow the same pattern):
E : E '+' T {$$ = $1 + $3;}
| E '-' T {$$ = $1 - $3;}
| T {$$ = $1;}
E and T are non-terminals (the variables of the grammar), and '+' and '-' are terminal symbols. Inside an action, $1 and $3 refer to the values of the first and third symbols on the right-hand side (here E and T respectively), and $$ refers to the value of the left-hand side. The result of $1 + $3 is assigned to $$, that is, to E. So for the input 3+4 the result is 7.
Let's see the program for a calculator using YACC. The file is conventionally saved with a .y (or .yacc) extension.
YACC file Specification
declarations
%%
rules
%%
subroutines
The Rule Section
expr : expr '+' term { $$ = $1 + $3; }
| term { $$ = $1; }
;
term : term '*' factor { $$ = $1 * $3; }
| factor { $$ = $1; }
;
factor : '(' expr ')' { $$ = $2; }
| ID
| NUM
;
How YACC Works?
YACC needs tokens to work on. These come from LEX, a tool that generates a scanner which splits a given statement into tokens.
Communication Between LEX and YACC
When YACC is run with the -d option, as in yacc -d calc.y, it creates a file y.tab.h which contains the token definitions. This file has to be included in the LEX file as a header file.
The LEX Code
calc.l
%{
#include <stdio.h>
#include <stdlib.h>
#include "y.tab.h" //generated by yacc -d
%}
%%
[0-9]+(\.[0-9]+)?([eE][0-9]+)? {yylval.f = atof(yytext); return NUM;}
[-+()*/] {return yytext[0];}
[ \t\f\v\n] { ; }
"$" {return 0;}
%%
/* a non-zero return value tells the scanner there is no more input */
int yywrap() {return 1;}
The YACC Code
calc.y
%{
#include <stdio.h>
#include <stdlib.h>
extern int yylex();
void yyerror(char *msg);
%}
/* every semantic value is a float */
%union { float f; }
%token <f> NUM
%type <f> E T F
%%
S : E {printf("%f\n", $1);}
;
E : E '+' T {$$ = $1 + $3;}
| E '-' T {$$ = $1 - $3;}
| T {$$ = $1;}
;
T : T '*' F {$$ = $1 * $3;}
| T '/' F {$$ = $1 / $3;}
| F {$$ = $1;}
;
F : '(' E ')' {$$ = $2;}
| '-' F {$$ = -$2;}
| NUM {$$ = $1;}
;
%%
/* called by the parser on a syntax error */
void yyerror(char *msg) { fprintf(stderr, "%s\n", msg); exit(1); }
int main() {printf("enter any expression"); yyparse(); return 0; }
Results:
The input should end with a $ sign, which indicates the end of the input.