First, I apologize for the long delay in my response. I was busy with exams and then went on vacation.
This sounds like a great idea that people would find useful.
It might be, especially because there are many places in LLVM where the YAML format is used to specify input configurations.
So could you reply with a short example showing how the parser code gets translated into a schema?
Sure, let's look at an example:
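#include "llvm/Support/YAMLTraits.h"
#include "llvm/Support/raw_ostream.h"
#include <optional>
#include <string>
#include <vector>

using namespace llvm;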
enum class ColorTy {
  White,
  Black,
  Blue,
};

struct Baby {
  std::string Name;
  ColorTy Color;
};
LLVM_YAML_IS_SEQUENCE_VECTOR(Baby)

struct Animal {
  std::string Name;
  std::optional<int> Age;
  std::vector<Baby> Babies;
};
LLVM_YAML_IS_SEQUENCE_VECTOR(Animal)

namespace llvm {
namespace yaml {

template <> struct ScalarEnumerationTraits<ColorTy> {
  static void enumeration(IO &io, ColorTy &value) {
    io.enumCase(value, "white", ColorTy::White);
    io.enumCase(value, "black", ColorTy::Black);
    io.enumCase(value, "blue", ColorTy::Blue);
  }
};

template <> struct MappingTraits<Baby> {
  static void mapping(IO &io, Baby &info) {
    io.mapRequired("name", info.Name);
    io.mapRequired("color", info.Color);
  }
};

template <> struct MappingTraits<Animal> {
  static void mapping(IO &io, Animal &info) {
    io.mapRequired("name", info.Name);
    io.mapOptional("age", info.Age);
    io.mapOptional("babies", info.Babies);
  }
};

} // namespace yaml
} // namespace llvm
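int main() {
  std::vector<Animal> Animals;
  raw_ostream &OS = outs();     // any raw_ostream works; stdout is used here
  yaml::GenerateSchema Gen(OS); // the proposed yaml::IO subclass
  Gen << Animals;
}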
This example first defines the types that hold the YAML data and then defines the traits needed to work with them. With these traits in place, creating a yaml::Input lets us read YAML text (from a string or memory buffer) into the objects, and creating a yaml::Output lets us dump an object, possibly modified in the meantime, into a raw_ostream. I propose adding another subclass of yaml::IO that makes it possible to obtain the general structure of a specific YAML format.
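To illustrate the idea without depending on LLVM, here is a minimal sketch of how a third IO implementation can collect structure from the same mapping callback. Everything in it (ToyIO, SchemaCollector, mapAnimal, describeAnimal) is a hypothetical stand-in made up for illustration; it is not the real yaml::IO interface or the proposed GenerateSchema class, and it handles only string fields.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for yaml::IO, reduced to the two calls used below.
class ToyIO {
public:
  virtual ~ToyIO() = default;
  virtual void mapRequired(const std::string &Key, std::string &Val) = 0;
  virtual void mapOptional(const std::string &Key, std::string &Val) = 0;
};

// Instead of reading or writing data, this implementation only records
// which keys the mapping callback asks for and which of them are required.
class SchemaCollector : public ToyIO {
public:
  std::vector<std::string> Properties;
  std::vector<std::string> Required;

  void mapRequired(const std::string &Key, std::string &) override {
    Properties.push_back(Key);
    Required.push_back(Key);
  }
  void mapOptional(const std::string &Key, std::string &) override {
    Properties.push_back(Key);
  }
};

struct ToyAnimal {
  std::string Name;
  std::string Age;
};

// Written once, in the same shape as a MappingTraits<T>::mapping function.
void mapAnimal(ToyIO &IO, ToyAnimal &A) {
  IO.mapRequired("name", A.Name);
  IO.mapOptional("age", A.Age);
}

// Runs the mapping callback against the collector and summarizes the result.
std::string describeAnimal() {
  ToyAnimal Dummy;
  SchemaCollector Collector;
  mapAnimal(Collector, Dummy);

  std::ostringstream OS;
  OS << "properties:";
  for (const std::string &P : Collector.Properties)
    OS << ' ' << P;
  OS << "; required:";
  for (const std::string &R : Collector.Required)
    OS << ' ' << R;
  return OS.str();
}
```

The point of the sketch is that mapAnimal is not changed at all: the same callback that would drive reading or writing also drives the collector, which is why no extra user code should be needed to obtain a schema.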
parser code gets translated into a schema
You don’t have to write any additional code to get the schema. Most of the information about keys, types, and default values (unfortunately not all, because the original implementation has no callback mechanism that reports type names) can be obtained from the callbacks made to the yaml::IO subclass. For the example above, we get the following schema in JSON format (initially it was also in YAML format, but it seems that my IDE expects schemas in JSON):
{
  "$schema": "http://json-schema.org/draft-04/schema",
  "title": "YAML Schema",
  "items": {
    "properties": {
      "age": {
        "type": "string"
      },
      "babies": {
        "items": {
          "properties": {
            "color": {
              "enum": [
                "white",
                "black",
                "blue"
              ],
              "type": "string"
            },
            "name": {
              "type": "string"
            }
          },
          "required": [
            "name",
            "color"
          ],
          "type": "object"
        },
        "type": "array"
      },
      "name": {
        "type": "string"
      }
    },
    "required": [
      "name"
    ],
    "type": "object"
  },
  "type": "array"
}
I manually added the "$schema" and "title" options for better IDE integration. The conveniences below are demonstrated in VS Code with the redhat.vscode-yaml extension, which supports YAML schemas.
I think you can also give the schema and a file to a verifier tool?
Sure. You need to associate the generated schema with the files that match a pattern:
{
  "yaml.schemas": {
    "/home/timur/timur/os-llvm/schema-test/schema.yaml": "*.my.yaml"
  }
}
After that, we can start working with the mytest.yaml file and see the IDE's schema-based recommendations for it.
is it possible to discover every “schema” currently in use in llvm and see whether this generator works with them?
I think so. I can try dumping the schema of the input file for clang-tidy or clang-format.
This could generate you a lot of corner cases that you can cover in unit tests using smaller examples.
Yes, I think a unit test could dump the schema from some LLVM tool and compare it with the expected one.
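As a rough sketch of what such a comparison could look like, the helper below reports the first line at which a dumped schema differs from the expected text. The name firstDifferingLine is hypothetical and not part of LLVM; a real test would more likely use the project's existing test infrastructure, so treat this only as an illustration of the idea.

```cpp
#include <optional>
#include <sstream>
#include <string>

// Compares two texts line by line. Returns the 1-based number of the first
// line that differs, or std::nullopt if all lines match. Because the texts
// are split with std::getline, a single trailing newline is ignored.
std::optional<int> firstDifferingLine(const std::string &Actual,
                                      const std::string &Expected) {
  std::istringstream ActualStream(Actual), ExpectedStream(Expected);
  std::string ActualLine, ExpectedLine;
  for (int Line = 1;; ++Line) {
    bool HasActual = static_cast<bool>(std::getline(ActualStream, ActualLine));
    bool HasExpected =
        static_cast<bool>(std::getline(ExpectedStream, ExpectedLine));
    if (!HasActual && !HasExpected)
      return std::nullopt; // both texts ended together: no difference
    if (HasActual != HasExpected || ActualLine != ExpectedLine)
      return Line; // one text ended early, or the lines differ
  }
}
```

Reporting a line number rather than a plain true/false makes a failing test easier to diagnose when a schema grows to hundreds of lines.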
Thank you for your reply, and I apologize again for the long delay. I am ready to answer any further questions.