Summary of the roundtable discussion at the US LLVM Dev Meeting 2019

Hello all,

It was encouraging to see positive support for LLVM libc during the
round table discussion. For the benefit of those who were not present,
below is a high level summary of what was discussed. I encourage
others to add items I missed here.

1. Many expressed a desire to use LLVM libc in sandboxed environments.
This requires that LLVM libc provide the ability to selectively pick
and choose pieces suitable for one's context.
Side Note: This is in line with our goal of building a modular libc.
Header generation etc are part of the solution to build a modular
libc.

2. Some of the members pointed out that LLVM libc should be
implemented in a modern language so that modern static analysis tools
and sanitizers can be used to test it.
Side Note: We have started the implementation in C++. So, I guess we
are already good with respect to this point.

3. Some of the members were curious about how we build the abstraction
layer above the OS-specific syscall layer. This did not lead to a
discussion about any particular way. It was more a discussion about
making a case for the need for an abstraction layer to accommodate the
differences across OSes.
Side Note: I agree that this will be interesting. I am of the opinion
that there cannot be one single solution libc-wide. That is, how we
build the abstraction layers has to be taken up on a case-by-case
basis.

4. It was also suggested to check whether we can write parts of the
libc++ implementations in a way that they can be used by LLVM libc as
well. The implementation of std::vector was pointed out as an example
where such a scheme can be attempted.

5. With respect to header generation, there were questions about
selectively including/excluding specific standards.
Side Note: My personal opinion is that there will be aspects like this
for which we will end up using a hybrid (macros + header generation)
solution.

Thank you,
Siva Chandra

Thank you for that excellent summary. A few comments inline:

Hello all,

It was encouraging to see positive support for LLVM libc during the
round table discussion. For the benefit of those who were not present,
below is a high level summary of what was discussed. I encourage
others to add items I missed here.

1. Many expressed a desire to use LLVM libc in sandboxed environments.
This requires that LLVM libc provide the ability to selectively pick
and choose pieces suitable for one's context.
Side Note: This is in line with our goal of building a modular libc.
Header generation etc are part of the solution to build a modular
libc.

2. Some of the members pointed out that LLVM libc should be
implemented in a modern language so that modern static analysis tools
and sanitizers can be used to test it.
Side Note: We have started the implementation in C++. So, I guess we
are already good with respect to this point.

I think we're off to a good start here, but there's C++ and C++. We should aim to use modern C++ idioms that reduce the likelihood of vulnerabilities. For the most part, libc interfaces have very simple memory management and so we should be in a good position to write code that is amenable to analysis.
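
To make "C++ and C++" a little more concrete, here is a minimal sketch of one such idiom: keeping a pointer and its length together so that bounds are visible to both sanitizers and static analysis. The `Span` type and `copy_bounded` helper are hypothetical names invented for this illustration, not anything that exists in LLVM libc, and a real implementation would probably look different.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Hypothetical minimal view type: the pointer and its length travel
// together, so every access can be checked against the known bound.
template <typename T> class Span {
  T *data_;
  std::size_t size_;

public:
  Span(T *data, std::size_t size) : data_(data), size_(size) {}
  std::size_t size() const { return size_; }
  T &operator[](std::size_t i) const {
    assert(i < size_ && "index out of bounds");
    return data_[i];
  }
};

// Copies as many elements as fit in dst and returns the number copied.
// The caller cannot "forget" the destination size, unlike with raw
// pointer-based interfaces.
inline std::size_t copy_bounded(Span<char> dst, Span<const char> src) {
  std::size_t n = dst.size() < src.size() ? dst.size() : src.size();
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i];
  return n;
}

} // namespace sketch
```

The public C entry points would still take raw pointers, but internal helpers written this way give the tools something to check.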

3. Some of the members were curious about how we build the abstraction
layer above the OS-specific syscall layer. This did not lead to a
discussion about any particular way. It was more a discussion about
making a case for the need for an abstraction layer to accommodate the
differences across OSes.
Side Note: I agree that this will be interesting. I am of the opinion
that there cannot be one single solution libc-wide. That is, how we
build the abstraction layers has to be taken up on a case-by-case
basis.

There are two issues that I'd like to highlight here. The first is not so much the *kind* of platform abstraction layer, but simply the *existence* of a platform abstraction layer. It is far easier to modify an existing platform abstraction layer than to insert one from scratch. A few things to think about:

  - Don't assume that all platforms expose file handles that are `int`s.
    For example, on Windows a HANDLE is a pointer. For the C standard
    `FILE*` abstraction, the `FILE` can contain an arbitrary handle. For
    POSIX compatibility, some platforms will need to implement a file
    descriptor table on top of the platform's native support. Don't
    depend on that existing for non-POSIX APIs.
  - Don't assume that all platforms support ELF linker tricks. COFF and
    WebAssembly both have different linkage models that support
    overlapping feature sets.
  - Don't assume that you can open a file. Embedded platforms and some
    sandboxed environments will want to bake resources into the binary.
    Don't assume you can `[f]open` things like locales and time-zone
    files. Add a PAL function to open a specific resource. On most
    POSIXy systems, this may just be an `open` call in a specific
    directory.
  - Even if you can open a file, don't assume that you can open an
    *arbitrary* file or network connection. Some sandboxing policies
    require you to explicitly state intent for these (either statically
    in a policy manifest or by dynamically presenting a capability).
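
As a rough illustration of the points above, a PAL along these lines might look like the following. This is only a sketch under my own assumptions: `FileHandle`, `Resource`, `open_resource` and the embedded table are hypothetical names made up for the example, and the backend shown is the "resources baked into the binary" case. A POSIXy backend could implement the same function with an `open` call in a specific directory.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

namespace pal_sketch {

// Opaque handle: wide enough for an int file descriptor or a
// pointer-sized handle such as a Windows HANDLE. (Hypothetical type.)
struct FileHandle {
  std::uintptr_t raw;
};
static_assert(sizeof(FileHandle) >= sizeof(int), "must hold an int fd");
static_assert(sizeof(FileHandle) >= sizeof(void *), "must hold a pointer");

// Read-only view of a named resource (locale data, time-zone data, ...).
struct Resource {
  const char *data;
  std::size_t size;
};

// Backend for platforms with no filesystem: resources are compiled in.
// The names and contents here are placeholders for illustration only.
struct Embedded {
  std::string_view name;
  std::string_view contents;
};
inline constexpr Embedded kEmbedded[] = {
    {"locale/C", "LC_ALL=C\n"},
    {"tz/UTC", "UTC0\n"},
};

// The libc asks the PAL for a specific resource by name instead of
// calling [f]open on a path. Unknown names simply fail, so there is no
// way to reach an arbitrary file through this interface.
inline std::optional<Resource> open_resource(std::string_view name) {
  for (const Embedded &e : kEmbedded)
    if (e.name == name)
      return Resource{e.contents.data(), e.contents.size()};
  return std::nullopt;
}

} // namespace pal_sketch
```

The rest of the libc would only ever see `open_resource`, so swapping the embedded table for a filesystem-backed or capability-checked implementation would not touch the callers.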

4. It was also suggested to check whether we can write parts of the
libc++ implementations in a way that they can be used by LLVM libc as
well. The implementation of std::vector was pointed out as an example
where such a scheme can be attempted.

There was also some brief discussion about whether the same modularity approaches can be applied to libc++. If we can make a libc that supports lightweight embedded or sandboxed platforms with no filesystem and no locale support, it would be nice to be able to build a libc++ on top of it that also didn't expose these dependencies.

5. With respect to header generation, there were questions about
selectively including/excluding specific standards.
Side Note: My personal opinion is that there will be aspects like this
for which we will end up using a hybrid (macros + header generation)
solution.

+1. I really like where the TableGen approach seems to be going.

David