Extending the SourceMgr

Hey folks,
Hope you're doing well.

It seems to me that it is not possible to extend the current behavior of `llvm::SourceMgr` (at least I couldn't think of any solution), and it is quite C/C++-ish (for lack of a better word).

For my use case, I need to tweak it a little bit, and since there is no way to do that, I have to write my own. That means I lose the ability to use any LLVM API that expects a source manager, which is a big price to pay.

It seems that the source manager would benefit from a trait-like design. This way, any frontend that needs a customized source manager could easily provide one, and the current source manager would be one implementation of that trait.

What do you think?

Cheers
Sameer

I /think/ the C++ terminology that’s a closer match for what you’re describing might be a “concept” rather than a “trait”?

But then, if I’m understanding correctly, all the code that currently is written in terms of the SourceMgr would have to become a template so it can be used by different implementations of the SourceMgr concept? That, at first guess, seems unlikely/infeasible?

  • Dave

Sorry about the terminology, I was thinking about CRTP-style traits. I see CRTP-based classes and templates quite often in the LLVM source code.

So I think the current SourceMgr can be renamed to something like DefaultSourceManager that implements the SourceMgr CRTP trait (sorry, I don’t know the exact term here).

Sorry about the terminology, I was thinking about CRTP-style traits. I see CRTP-based classes and templates quite often in the LLVM source code.

Yeah, there’s a fair bit of CRTP, and traits, but they are different things - https://www.internalpointers.com/post/quick-primer-type-traits-modern-cpp discusses traits, for some details.

So I think the current SourceMgr can be renamed to something like DefaultSourceManager that implements the SourceMgr CRTP trait (sorry, I don’t know the exact term here).

I’m still a bit unclear on how the existing SourceMgr-using code would work - it’d have to be updated to be templated on the specific SourceMgr implementation in use, would it? Some other ideas?

Might be worth a more narrowly defined discussion about what extensibility/features you’re looking for in SourceMgr, without the need to create a whole new hierarchy.

I think a bit of pseudo code might be good to demonstrate roughly what I had in mind:


    template <typename ConcreteType, template <typename T> class... Traits>
    class WithTrait : public Traits<ConcreteType>... {
    protected:
      WithTrait() = default;
      friend ConcreteType;
    };

    template <typename ConcreteType, template <typename> class TraitType>
    class TraitBase {
    protected:
      // CRTP downcast to the concrete implementation.
      ConcreteType &Object() { return static_cast<ConcreteType &>(*this); }
    };

    template <typename ConcreteType>
    class SourceMgr : public TraitBase<ConcreteType, SourceMgr> {
    public:
      // .... source mgr interface functions ....
    };

    class DefaultSourceMgr : public WithTrait<DefaultSourceMgr, SourceMgr> {
      // ..... implementation ....
    };

This way, functions that want to use SourceMgr would need a template parameter, but that parameter can be defaulted to DefaultSourceMgr. I hope that demonstrates my thoughts clearly enough.
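To make the "defaulted template" idea concrete, here is a minimal, self-contained sketch. It is not LLVM code: `SourceMgrTrait`, `addBuffer`, `getBuffer`, and `LineCounter` are hypothetical names chosen for illustration, and the interface is far smaller than the real `llvm::SourceMgr`.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical CRTP interface: each call forwards to the concrete type
// at compile time, so no virtual dispatch is involved.
template <typename ConcreteType>
class SourceMgrTrait {
public:
  // Registers a buffer and returns its 1-based id.
  unsigned addBuffer(std::string Text) {
    return derived().addBufferImpl(std::move(Text));
  }
  const std::string &getBuffer(unsigned Id) const {
    return derived().getBufferImpl(Id);
  }

private:
  ConcreteType &derived() { return static_cast<ConcreteType &>(*this); }
  const ConcreteType &derived() const {
    return static_cast<const ConcreteType &>(*this);
  }
};

// Stand-in for the "default" implementation (assumption: in-memory buffers only).
class DefaultSourceMgr : public SourceMgrTrait<DefaultSourceMgr> {
  friend class SourceMgrTrait<DefaultSourceMgr>;
  std::vector<std::string> Buffers;

  unsigned addBufferImpl(std::string Text) {
    Buffers.push_back(std::move(Text));
    return static_cast<unsigned>(Buffers.size());
  }
  const std::string &getBufferImpl(unsigned Id) const {
    return Buffers.at(Id - 1);
  }
};

// A consumer templated on the implementation, defaulted to DefaultSourceMgr,
// so existing callers could keep writing LineCounter<>.
template <typename SM = DefaultSourceMgr>
class LineCounter {
  const SourceMgrTrait<SM> &Mgr;

public:
  explicit LineCounter(const SourceMgrTrait<SM> &M) : Mgr(M) {}

  // Counts newline characters in the given buffer.
  std::size_t count(unsigned Id) const {
    const std::string &Text = Mgr.getBuffer(Id);
    return static_cast<std::size_t>(
        std::count(Text.begin(), Text.end(), '\n'));
  }
};
```

Note that every consumer still becomes a template and has to live in a header, even with the default in place.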

Yep yep. Not sure what the specific purpose of all the various abstractions there would be, but get the gist of the architecture.

As for the functionality I need in the SourceMgr: my compiler uses namespaces as the centerpiece and the unit of compilation. I want my source manager to accept a namespace name and load the source code of that namespace. At the moment I have had to copy-paste llvm::SourceMgr and tweak it manually.

Perhaps there’s something narrower that could be added directly to the existing SourceMgr (perhaps something fairly general/not specific to the particular features you’re designing for, but addressing whatever in SourceMgr doesn’t make it suitable for that direction/gets in the way)? I suspect if the extension is too invasive that way, or such that it would require the kind of abstraction/templates suggested above - it might be that SourceMgr isn’t a great foundation for the functionality you’re building.

What about adding some customizable hooks to the source manager to let users tweak different aspects of it?
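A rough sketch of what such a hook could look like, under stated assumptions: `HookedSourceMgr`, `setLoaderHook`, `load`, and `makeNamespaceLoader` are hypothetical names, none of them exist in `llvm::SourceMgr`, and returning 0 for "not found" is simply a convention picked for this example. The idea is that the manager stays a plain, non-template class and the frontend supplies a callback that turns a name (here, a namespace name) into source text.

```cpp
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical source manager with one customization point.
class HookedSourceMgr {
public:
  // The hook maps a name to source text, or std::nullopt if it cannot.
  using LoaderHook =
      std::function<std::optional<std::string>(const std::string &)>;

  void setLoaderHook(LoaderHook Hook) { Loader = std::move(Hook); }

  // Returns a 1-based buffer id, or 0 if no hook is set or the hook fails.
  unsigned load(const std::string &Name) {
    if (!Loader)
      return 0;
    std::optional<std::string> Text = Loader(Name);
    if (!Text)
      return 0;
    Buffers.push_back(std::move(*Text));
    return static_cast<unsigned>(Buffers.size());
  }

  const std::string &getBuffer(unsigned Id) const {
    return Buffers.at(Id - 1);
  }

private:
  LoaderHook Loader;
  std::vector<std::string> Buffers;
};

// Example hook: resolves namespace names from an in-memory table.
// A real frontend would presumably search the file system instead.
inline HookedSourceMgr::LoaderHook
makeNamespaceLoader(std::map<std::string, std::string> Table) {
  return [Table = std::move(Table)](
             const std::string &Ns) -> std::optional<std::string> {
    auto It = Table.find(Ns);
    if (It == Table.end())
      return std::nullopt;
    return It->second;
  };
}
```

Compared with the CRTP design, code that takes the manager by reference would not need to become a template; the cost is an indirect call through `std::function` on each load.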

Are there many llvm APIs that take a SourceMgr? The main ones I see are YAML parsing and some backend diagnostic stuff, but I admit I didn’t look very hard. Clang has its own SourceManager class that is independent of llvm’s, so it seems like the llvm::SourceMgr isn’t completely necessary for building a frontend?

I can see that it is used in many places throughout the source code.

IMHO the concept of a source manager is necessary, but since llvm::SourceMgr is built in a certain way and cannot be extended, frontends will try to make their own. If llvm::SourceMgr in its current state can’t deliver enough for frontends to benefit from, and they have to write their own anyway, that raises the question: what is the purpose of llvm::SourceMgr? Logically speaking, I think the source manager needs a generic API with a mechanism that lets frontends tweak its behavior based on their own requirements.

The source manager concept seems to be quite common among frontends, and I think everyone can benefit from a better implementation.

I think llvm’s SourceMgr is mostly implemented for LLVM’s own internal needs (TableGen and the integrated assembler - maybe some others?) & /probably/ isn’t the right foundation for more high level language frontends (as seen by Clang having its own SourceManager, not trying to use llvm’s SourceMgr).

AFAIK the clang implementation is older and the aim is to use the llvm one instead (based on a similar discussion on IRC). But I think if it is internal to LLVM, it’s better to mention that explicitly in the docs and maybe even remove it from code examples and tutorials. wdyt?

AFAIK the clang implementation is older and the aim is to use the llvm one instead (based on a similar discussion on IRC).

I would imagine that’d be fairly difficult, but perhaps some folks have that in mind - might be worth roping them into this discussion if they’ve got ideas about the direction SourceMgr should go to enable that.

But I think if it is internal to LLVM, it’s better to mention that explicitly in the docs and maybe even remove it from code examples and tutorials. wdyt?

I think it’s OK in the examples and tutorials - not every feature there is suitable for a production frontend - they’re only toy examples.

  • Dave