By default, Flex and Bison generate old-school code with global variables. Trawling the manuals to find the options that generate re-entrant code is tedious, so I’m recording a small example that works on my system (which has Bison 2.7.12 and Flex 2.5.35).
Flex preamble
With these options, yylval is now a pointer. When converting existing Flex source, we mostly replace with yylval with *yylval.
%option outfile="flex.c" header-file="flex.h" %option reentrant bison-bridge %option noyywrap nounput noinput
%{ #include "bison.h" %} |
Bison preamble
%output "bison.c" %defines "bison.h" %define api.pure full %lex-param { yyscan_t scanner } %parse-param { yyscan_t scanner } %parse-param { val_callback_t callback }
%code requires { #include "val.h" #define YYSTYPE val_ptr #ifndef YY_TYPEDEF_YY_SCANNER_T #define YY_TYPEDEF_YY_SCANNER_T typedef void *yyscan_t; #endif }
%code { #include "flex.h" int yyerror(yyscan_t scanner, val_callback_t callback, const char *msg) { return 0; } } |
val.h: semantic values
Rather than use the %union Bison declaration or similar, I prefer to define the type that holds the semantic values in a C source file. In general, I like to minimize the amount of C in the Bison and Flex source.
enum { T_INT, T_STRING, };
struct val_s { int type; struct { char *s; struct val_s **kid; int nkid; }; }; typedef struct val_s *val_ptr; typedef int (*val_callback_t)(val_ptr); |
Calling the parser
Because the parser is no longer global, we must initialize and pass a yyscan_t variable to Bison and Flex.
yyscan_t scanner; if (yylex_init(&scanner)) exit(1); YY_BUFFER_STATE buf = NULL; // Uncomment to parse from a string instead of standard input. // buf = yy_scan_string("input string", scanner); int f(struct val_s *v) { val_print_pre(v); putchar('\n'); val_print_tree("", v); val_free(v); return 0; } if (yyparse(scanner, f)) exit(1); yy_delete_buffer(buf, scanner); yylex_destroy(scanner); |
Complete example
See https://github.com/blynn/symple/, which reads an expression and pretty-prints it:
$ ./main 'sin(x)*cos(y) + e^x' +(*(sin(x), cos(y)), ^(e, x)) +─┬─*─┬─sin───x │ └─cos───y └─^─┬─e └─x |
2 comments:
Blast from the past! I haven't used Flex or Bison since almost 20 years ago when I used them for in-house tools and products involving parsing. I remember being happy about the Flex "reentrant" option, and then also getting into C++ and using C++ for both Flex and Bison to create nicely encapsulated parsers. I should still have all that code around somewhere backed up. In 1995 I did switch away from C/C++ when possible though, and ended up creating my in-house parsing apps in ML (mlyacc, mllex) and then Haskell (Happy) instead... And now I'm still parsing, but using Scala. It never ends.
My experiences have been similar, e.g. I wrote a goyacc-compatible lexer for Go (http://cs.stanford.edu/~blynn/nex). But I still use C when I want speed.
Post a Comment