This is an EBNF grammar for ANSI C (C99), and it contains almost every rule. It may be missing stuff, so please tell me if you notice something missing.

I am writing a C compiler, with my own backend and, hopefully, my own frontend, in OCaml. That is why I wrote this grammar. I have also written an AWK grammar, but it’s not uploaded anywhere. Tell me if you want it.

Thanks.

  • Not with this grammar. There’s this parser-generator intermediary called BNFC that uses its own flavor of BNF (Labeled BNF) to generate Yacc/Lex input (or ANTLR when it can), an abstract syntax tree, etc., but I don’t like it. There are no EBNF parser generators AFAIK. One could, possibly, feed this to ChatGPT and ask for a Yacc/Lex pair in return, or even a hand-written parser! I may do that, but I first have to clean this up and add the stuff that isn’t there.

    ChatGPT has changed langdev a lot for me. I automate a good portion of the process with it. But one needs solid specs to feed to it.

    As I said, I wish to implement the frontend myself, basically the lexer/parser. But I kinda get bored with lexing and parsing because it’s too time-consuming. Plus, LR(1) parsers are realistically only ever generated; it’s LL(1) (recursive descent) that can be comfortably hand-written. I have not decided yet. I wish to focus more on the backend, because that is where you can do innovative shit and, perhaps, write a paper on it.

    Also, I’m going to leave C23 to people who have years of experience. ANSI C is the lowest common denominator of C. I am using the C99 standard, which should be able to compile a good portion of code bases. C99 is the C standard that POSIX required for years (POSIX.1-2001 and POSIX.1-2008 are both aligned with it). C itself has been an ISO standard since C90.

    Thanks.