Code-generation: lang=>JSON, JSON=>lang and merging into lang

Considering engineering my own code-generator. If I do go ahead, will open-source the end result.

Needs to read [parse] one language, and output JSON (conformant to a specific JSON-schema).
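
As a rough illustration of this first step, here is a minimal sketch for Python only, using the standard-library `ast` module. The output shape (`functions`, `name`, `params`, `doc`) is a made-up placeholder for this example, not a real JSON-schema, and `functions_to_json` is a hypothetical helper name.

```python
import ast
import json


def functions_to_json(source: str) -> str:
    """Parse Python source and describe its top-level functions as JSON."""
    tree = ast.parse(source)
    functions = []
    for node in tree.body:
        # Only plain top-level `def`s are handled in this sketch.
        if isinstance(node, ast.FunctionDef):
            functions.append(
                {
                    "name": node.name,
                    "params": [arg.arg for arg in node.args.args],
                    "doc": ast.get_docstring(node),
                }
            )
    return json.dumps({"functions": functions}, indent=2)


example = '''
def create_user(name, email):
    """Create a user record."""
    return {"name": name, "email": email}
'''

print(functions_to_json(example))
```

Go, Rust and JavaScript would each need their own parser front-end; only the JSON shape would be shared.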

Then needs to read JSON, and reproduce the code in that language, and [possibly] merge the generated code with existing code.
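
The reverse direction could look something like the sketch below, again for Python only. It assumes a hypothetical JSON layout (`functions`, `name`, `params`, `doc`) and emits stub functions from a text template, then uses `ast.parse` purely as a syntax check on the result.

```python
import ast
import json


def json_to_source(document: str) -> str:
    """Turn a JSON description of functions into Python stub source."""
    spec = json.loads(document)
    chunks = []
    for fn in spec["functions"]:
        params = ", ".join(fn["params"])
        chunks.append(
            f"def {fn['name']}({params}):\n"
            f"    {fn['doc']!r}\n"  # repr() gives a safely quoted docstring
            "    raise NotImplementedError\n"
        )
    source = "\n\n".join(chunks)
    # Raises SyntaxError if the generated text is not valid Python.
    # This checks syntax only; names from the JSON are not vetted.
    ast.parse(source)
    return source


document = json.dumps(
    {
        "functions": [
            {
                "name": "create_user",
                "params": ["name", "email"],
                "doc": "Create a user record.",
            }
        ]
    }
)

generated = json_to_source(document)
print(generated)
```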

Languages I’m looking to support are all rather popular (Python, Go, Rust, JavaScript).

Any pointers—e.g.: specific LLVM libraries and sub-projects to use—would be appreciated.


reproduce the code in that language

Is the intention to exactly reproduce the original source code, or some code that's functionally equivalent?
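
To illustrate why the distinction matters, here is a small sketch in Python (assuming Python 3.9+ for `ast.unparse`). A round trip through the syntax tree keeps the structure but drops comments and normalises spacing, so the result is equivalent code rather than the original text.

```python
import ast

original = (
    "def add(a, b):\n"
    "    # hand-written note that must survive\n"
    "    return a+b\n"
)

# Parse to a syntax tree and print it back out as source.
regenerated = ast.unparse(ast.parse(original))

# Comparing tree dumps (without line/column attributes) checks structure only.
same_tree = ast.dump(ast.parse(original)) == ast.dump(ast.parse(regenerated))

print(same_tree)
print(regenerated)
```

An exact reproduction would need a parser that keeps comments and whitespace (a concrete syntax tree), which the standard `ast` module does not provide.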

Possibly protobuf or Cap’n Proto would be much cleaner alternatives to JSON. I was working on interpreting instruction semantics a while back, and you shouldn’t have to write a parser to get the data structure back into coherent form; you can get what you want automatically and have the structure isolated into a common schema.

Dear Stephen, Kenneth and ,

Thanks for your replies.

Was intentionally not giving the full picture (because I didn’t want the cake given away; i.e.: wanted to figure things out for myself [then open-source it all]).

Anyway, I’ll tell you why I want JSON as the intermediary and what kind of format it should conform to.

Here’s an example setup when applied to the Web domain, but it would also apply to other domains:
  0. JSON written that generates documentation for the user/developer
  1. This JSON then used to generate a REST API in language , including database models, endpoints and tests with JSON mocks (via LLVM). Additionally input validation code will be attached to the models (and/or endpoints)
  2. Language edited in a restricted place (e.g.: adding a new mock and a new field to the database model, and adding a new endpoint)*
  3. “JSON documentation” (like 0.) generated from this source-code (via LLVM)
  4. JSON documentation used to generate frontend code, including tests with mocks, endpoint request (success/error handling), input validation & form generation (via JSON schema)
  5. Restricted* changes made to frontend code, e.g.: validation logic (regex?)
  6. Repeat 3. Repeat 1.

By 6. you start to see the advantage of merging the changes into an existing code-base (rather than starting fresh). The existing code-base would have all the implementation-specific logic, such as widgets and themes on the frontend, and transaction design, caching and complex database queries on the REST API end.
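
One conservative way to do such a merge is sketched below for Python. The policy is an assumption made for this example: any top-level function already present in the existing file is treated as hand-edited and left untouched, and only functions that are new in the generated output get appended. `merge_new_functions` is a hypothetical name; `end_lineno` needs Python 3.8+, and decorators are ignored.

```python
import ast


def merge_new_functions(existing: str, generated: str) -> str:
    """Append generated top-level functions that `existing` does not define."""
    present = {
        node.name
        for node in ast.parse(existing).body
        if isinstance(node, ast.FunctionDef)
    }
    gen_lines = generated.splitlines()
    merged = existing.rstrip("\n")
    for node in ast.parse(generated).body:
        if isinstance(node, ast.FunctionDef) and node.name not in present:
            # Copy the function's original text so its formatting is kept.
            text = "\n".join(gen_lines[node.lineno - 1 : node.end_lineno])
            merged += "\n\n\n" + text
    return merged + "\n"


existing = '''\
def create_user(db, name):
    # hand-tuned query with caching
    return db.cached_insert("users", name=name)
'''

generated = '''\
def create_user(db, name):
    raise NotImplementedError


def delete_user(db, user_id):
    raise NotImplementedError
'''

merged = merge_new_functions(existing, generated)
print(merged)
```

Because the existing file is handled as text rather than re-printed from a tree, hand-written comments and formatting survive. Picking up changes to existing functions (a new field, a changed signature) would need a finer-grained policy than this.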

Will likely conform to the API Blueprint (AST [JSON]) format or Swagger. In addition to them thinking of edge-cases I haven’t, they also do some [limited] code-generation themselves, such as basic clients (and sometimes servers) in various languages, and HTML documentation.

Now that you know my full use-case you’ll be better equipped to provide suggestions on how to proceed.

Thanks for your continued help,

Alec Taylor

* Edits to non-restricted places are fine, but won’t get picked up by the JSON generator

PS: Cap’n Proto is awesome [for unrelated stuff], thanks for the link :slight_smile:

PPS: As I flesh this out more and more, looks like I could generate everything from just looking in the tests rather than code from elsewhere


Well, I don’t know what language you are using, but the only protobuf implementation for OCaml (piqi) is actually pretty powerful: you can write a piqi spec and then export it to JSON and XML, and from the protobuf export you can automatically go between the language-native representation and the serialized string. It’s very straightforward. I’m really not even certain that it handles your use case, but using piqi doesn’t mean you have to use OCaml. You can use whatever language you want if you can just export your data via protobuf, because there are bindings for protobuf in just about every language.

I’m currently working on some improvements to it that should let it handle large amounts of data much better, by making library usage tail recursive.