my experience with clang

Hi,

As promised previously, I'll try to provide a review of clang. I'm not an expert on compilers by any means, though.

I used clang to make a static code analyzer tool, based on the ARCHER paper from Stanford, albeit simpler. It is able to detect both static and dynamic memory overflows. It only supports intra-procedural analysis. It also provides analysis for the PHP interpreter API varargs functions (printf style).
In case someone is interested, the full source-code is available at: http://web.ist.utl.pt/nuno.lopes/sirs-project.tar.bz2
It also includes a presentation of the project in Portuguese, as well as some examples of bugs that it is able to find.

My code doesn't use the clang analysis framework, as the path-sensitive analyzer wasn't ready by the time I started the project.

So, about clang: it is a very nice tool with a gentle learning curve, really. I once tried to read the gcc code and gave up (I admit I didn't try very hard, but..). Of all the compiler tools I've worked with so far, clang proved to be the easiest one. This is due to the nice C++/OOP usage, as well as an intuitive AST (if you know C, you know what the AST looks like).
A con of clang from the point of view of code analysis is that clang is optimized for IDEs. That means that some AST nodes could be removed altogether (e.g. ParenExpr). Also, similar expressions are represented differently:
int x=2;
and
int x; x=2;

This makes sense in the IDE world, but only makes things more difficult in the analysis world. But I'm not sure how clang could be improved any further about this point.
Also, using clang as a gcc replacement is very difficult, mainly when you are using ./configure && make. I had to write a script to strip unknown options, as well as run gcc in parallel with clang (as ./configure usually checks if the compiler is able to create executable files).

Would I recommend clang? Yes, sure! Although the API is not stable, it's still a nice framework.

Thank you all, especially Ted, who was always ready to answer my questions with thorough explanations.

I hope you enjoyed my presence here and I hope this is not the end of my work in the compiler world :)

Regards,
Nuno Lopes

P.S.: I feel I'm missing a lot of things, but I'll send another e-mail if I remember something important.

Hi Nuno,

Sorry for the late response to this email. As I promised in my personal communication, I wanted to take a look at what you did in some detail after the holidays so that I could share it with the list. I think it is exciting what you were able to do with clang in such a short time. Comments inline.

I used clang to make a static code analyzer tool, based on the ARCHER paper
from Stanford, albeit simpler. It is able to detect both static and dynamic
memory overflows. It only supports intra-procedural analysis. It also
provides analysis for the PHP interpreter API varargs functions (printf
style).

For those of you who are not familiar, Archer essentially performs a DFS traversal of the possible "states" of a program/function, starting from the function entrance. Each state consists of a set of constraints on the values of one or more variables; e.g. that the variable 'x' is in the range '0' to '5'. These symbolic ranges can then be used to flag possible buffer overruns by determining if an index value could ever exceed the bounds of an array.
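The range idea can be sketched in a few lines of C. This is only an illustration under assumed names (`Range` and `may_overflow` are hypothetical, taken from neither Archer nor Nuno's tool): each tracked variable carries a closed interval, and an array access is flagged when some value in the index interval falls outside the array.

```c
#include <stdbool.h>

/* Hypothetical symbolic range for one integer variable: lo <= value <= hi. */
typedef struct {
    long lo;
    long hi;
} Range;

/* True if some value in the index range falls outside [0, array_len - 1]. */
static bool may_overflow(Range index, long array_len)
{
    return index.lo < 0 || index.hi >= array_len;
}
```

With 'x' in the range '0' to '5', an access `a[x]` is safe for an array of 6 elements and flagged for an array of 5.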

I like how your implementation resembles an interpreter. While the implementation style varies between dispatch based on switch statements and chains of if..else's, overall it's the same mindset that I believe was used to construct the original Archer tool, and a similar design view will be incorporated into the path-sensitive solver in clang.

The checking of the parameters for PHP is also really nice. With not that much code you were able to write a custom check for a code base that in practice can be really useful.

Regarding your implementation of the buffer overrun checker, one thing that I wasn't certain about was whether or not your analysis did any backtracking when it encountered an infeasible state. For example:

    if (x == 1) // do something
    ...
    if (x == 1) // do something

Here the two branches are control-dependent, and there are only two paths here. Backtracking in the DFS once an infeasible path is encountered can greatly reduce your search space, and thus increase your scalability. From what I can tell the checking of constraints is lazy; I believe they are only checked at an array access, and this is to only determine if an out-of-bounds access can occur, not if the state is infeasible. For checking feasibility of paths, one should try and do this eagerly to prune off as much of the search space as possible. Explicit state space exploration with caching in the worst case is exponential, but never caching means that the algorithm is always exponential. It also means that loops will be explored indefinitely, and thus cause certain paths to never be explored because of the nature of DFS.
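A rough sketch of eager feasibility checking, assuming states are simple intervals (the names below are hypothetical): every branch condition narrows the interval right away, and a path is abandoned as soon as the interval becomes empty. A single interval cannot express x != 1, so the sketch uses the condition x > 1 instead of x == 1.

```c
#include <stdbool.h>

/* Hypothetical constraint on one variable: lo <= x <= hi. */
typedef struct {
    long lo;
    long hi;
} Range;

/* An empty range means no value satisfies the constraints: the path is infeasible. */
static bool is_infeasible(Range r)
{
    return r.lo > r.hi;
}

/* Narrow the range along the taken branch of `if (x > c)`.
   Assumes c + 1 does not overflow. */
static Range assume_gt(Range r, long c)
{
    if (r.lo < c + 1)
        r.lo = c + 1;
    return r;
}

/* Narrow the range along the not-taken branch of `if (x > c)`, i.e. x <= c. */
static Range assume_le(Range r, long c)
{
    if (r.hi > c)
        r.hi = c;
    return r;
}
```

Taking the first `if (x > 1)` and then skipping the second one yields an empty range, so the DFS can backtrack at the second branch instead of exploring everything below it.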

As a possible algorithmic optimization, while the Archer paper did not use state caching (arguing that it didn't help), I believe that one could use liveness information to prune variables from a state, and thus increase the possibility of a cache hit. Caching is one of the key ways that analyses that perform an explicit state space exploration actually achieve real scalability. Clang has a liveness analysis already available, so it could be employed for this purpose.

In case someone is interested, the full source-code is available at:
http://web.ist.utl.pt/nuno.lopes/sirs-project.tar.bz2
It also includes a presentation of the project in Portuguese, as well as
some examples of bugs that it is able to find.

FYI: Using Google's translation tools, I was able to read most of your presentation. It's a good short presentation. Thanks for providing it!

My code doesn't use the clang analysis framework, as the path-sensitive
analyzer wasn't ready by the time I started the project.

So, about clang.. It is a very nice tool with a low learning curve. really.
I once tried to look to the gcc code and I gave up (I admit I didn't try too
much, but..). From all the compiler tools I've worked so far, clang proved
to be the easiest one. This is due to the nice C++/OOP usage, as well as an
intuitive AST (if you know C, you know how the AST looks like).
A con of clang in the point of view of code analysis is that clang is
optimized for IDEs. That means that some AST nodes could be removed
altogether (e.g. ParenExpr). Also, similar expressions are represented
differently:

int x=2;
and
int x; x=2;

This makes sense in the IDE world, but only makes things more difficult in
the analysis world. But I'm not sure how clang could be improved any further
about this point.

This is an excellent point, and as I believe you are implying, such things are an inherent tradeoff of trying to capture more of the lexical information of the program in the ASTs. There are two things I would like to add about this point when it comes to thinking about writing source-level tools.

The first point I would like to mention is that in general different analyses will "care" about certain kinds of AST nodes and be completely uninterested in other nodes. Thus almost any analysis at the source-level has to deal with "cruft" that it has to ignore, even though the nodes it ignores are semantically meaningful. From this perspective, adding things like ParenExpr, while sometimes annoying to deal with, doesn't necessarily require much more effort to handle, especially if you add wrapper functions (as you did in your implementation) to essentially strip such nodes away.
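A wrapper of this kind might look like the following sketch. It uses a toy AST in C, not clang's actual classes; `Expr`, `ExprKind` and `strip_parens` are made-up names for illustration.

```c
#include <stddef.h>

/* Toy AST, not clang's classes: just enough node kinds to show the idea. */
typedef enum { EXPR_PAREN, EXPR_INT_LITERAL, EXPR_VAR_REF } ExprKind;

typedef struct Expr {
    ExprKind kind;
    struct Expr *sub;   /* wrapped expression, used only by EXPR_PAREN */
    long value;         /* used only by EXPR_INT_LITERAL */
} Expr;

/* Skip any number of nested parentheses and return the expression inside. */
static Expr *strip_parens(Expr *e)
{
    while (e != NULL && e->kind == EXPR_PAREN)
        e = e->sub;
    return e;
}
```

An analysis that calls such a helper at the top of each visit function never has to mention the parenthesis node again.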

The second point I would like to add concerns dealing with similar expressions such as:

int x=2;
and
int x; x=2;

In general, handling both cases should fall out automatically in most cases simply by making the analysis *semantically* driven rather than syntactically driven. If one reasons about the semantics of the expressions using first-principles, such "corner cases" often are handled automatically. Your implementation of the buffer overrun checker actually appears to be fairly semantically based, which means it doesn't have to resort so much to the hacks that appear in bug-finding tools like lint (and even gcc) when flagging certain kinds of errors. Tools that essentially try to perform a "grep" on the AST tend to be fairly fragile, and can miss bugs easily because the same thing is written in a slightly different way. Of course there will always be corner cases that tools must be engineered to handle, but not making unnecessary assumptions about the syntactic structure of code goes a long way.

Also using clang as a gcc replacement is very difficult, mainly where you
are using ./configure && make. I had to do a script to strip unknown
options, as well as run gcc in parallel to clang (as ./configure usually
checks if the compiler is able to create executable files).

This is another, extremely valid point. The current incarnation of the clang driver is not a plug-in replacement for gcc; it doesn't support many (most?) of its arguments. There are many parties interested in developing a driver or evolving the current one to provide (near) plug-in replacement for gcc. As usual, anyone interested in helping out with this effort is more than encouraged to submit patches!

Stepping back a bit, almost all source-code analysis tools like those vended by GrammaTech and Coverity use various techniques to intercept the compiler, parse the source file that the compiler would parse, store some representation of the parsed file on disk, and then invoke the regular compiler as usual. The actual source-code analysis is then done offline, apart from the actual build of the codebase. The reason for this is that static analysis tools that are interprocedural require a whole-program image, which really can only be done after "capturing the build" of the entire codebase.

We are currently building infrastructure that can be used to capture the build for performing interprocedural analysis. Essentially we will intercept the native compiler, serialize the ASTs to disk, and then later perform some indexing on the serialized ASTs. Capturing the build of course is not only useful for finding bugs, but for other kinds of analyses as well, and so it is something we hope to do right in clang to serve various clients.

If I would recomend clang? Yes, sure! Although the API is not stable, it's
still a nice framework.

Wonderful! Naturally we hope to have the API stabilize through continued use and development of the front-end.

I hope this is not the end of my work in the compiler world :)

So do we.

Thanks for sharing with us your experiences with using clang!

Cheers,
Ted

Hi Nuno,

Sorry for the late response to this email. As I promised in my personal communication, I wanted to take a look at what you did in some detail after the holidays so that I could share it with the list. I think it is exciting what you were able to do with clang in such a short time. Comments inline.

Thank you for your thorough answer (as usual)! I really appreciate it and I've really learned a lot from your answers.

A few little comments:

The checking of the parameters for PHP is also really nice. With not that much code you were able to write a custom check for a code base that in practice can be really useful.

Yes, I agree. It is quite simple and really useful (it can save a few crashes and potential security bugs). I now need to port it to the liveness analyzer to get info about uninitialized variables.
My initial idea was to extend this to user-space (I even sent a proposal to the gcc mailing list some time ago). This would require some mechanism to allow arbitrary functions to be passed to gcc's __attribute__((__format__(my_function, 1, 2))). Not sure how the user-space program would specify my_function, but if someone has an idea about it, I would love to hear/read it :) I wouldn't mind implementing it in clang.

Regarding your implementation of the buffer overrun checker, one thing that I wasn't certain about was whether or not your analysis did any backtracking when it encountered an infeasible state. For example:

   if (x == 1) // do something
   ...
   if (x == 1) // do something

Yes, it is able to skip some infeasible paths. However, in this case it wouldn't work, as I didn't implement support for != restrictions (in this case, x != 1). With e.g. 'x > 1' instead, it would prune the infeasible paths. Anyway, the memory usage was really excessive. I had to limit the memory to 700 MB (in the CC script), because otherwise linux was freezing (linux is really bad at swapping..). This was not clang's fault, though (I had major memory leaks).

Nuno

The checking of the parameters for PHP is also really nice. With not
that much code you were able to write a custom check for a code base that
in practice can be really useful.

Yes, I agree. It is quite simple and really useful (it can save a few
crashes and potential security bugs). I now need to port it to the liveness
analyzer to get info about unititalized variables.
My initial idea was to extend this to user-space (I even sent a proposal to
the gcc mailing list some time ago).

One random and maybe interesting thought: the linux kernel people are marking pointers as user or kernel and using their 'sparse' tool to flag semantic violations. Instead of adding special support to clang to handle something like this, I wonder if Christopher's alternate address space work could be used to handle this...

-Chris

Interesting possibility! If this is something you’re interested in I’ll try to get my address spaces clang work committed sooner rather than later.

Another use of these types of pointer attributes is Microsoft’s __ptr32/__ptr64, though I don’t think that’s so much for analysis as pure pointer hackery.

The checking of the parameters for PHP is also really nice. With not
that much code you were able to write a custom check for a code base that
in practice can be really useful.

Yes, I agree. It is quite simple and really useful (it can save a few
crashes and potential security bugs). I now need to port it to the liveness
analyzer to get info about unititalized variables.
My initial idea was to extend this to user-space (I even sent a proposal to
the gcc mailing list some time ago).

One random and maybe interesting thought: the linux kernel people are
marking pointers as user or kernel and using their 'sparse' tool to
flag semantic violations. Instead of adding special support to clang
to handle something like this, I wonder if Christopher's alternate
address space work could be used to handle this...

Interesting possibility! If this is something you're interested in
I'll try to get my address spaces clang work committed sooner rather
than later.

Another use of these types of pointer attributes is Microsoft's
__ptr32/__ptr64, though I don't think that's so much for analysis as
pure pointer hackery.

Uhm, I wonder how this relates to the varargs function checks I was talking about.. As you probably know, gcc supports the printf checks through an __attribute__, and I don't see how the address spaces work could be used to parse the format string and so on (please enlighten me if I'm wrong!).

Nuno

Hi Nuno,

I think there may be a misunderstanding about what you mean by "extend this to user-space." I agree that the vararg checking and the address space qualifiers are not exactly the same topic, although the latter could be used to augment the former.

Not everyone has looked at your code, so they may not even be aware of what kinds of problems you were looking for in the use of the PHP interpreter API varargs functions. My understanding is that you were looking at internal consistency within the interpreter codebase of how these functions were used; from this perspective, I'm not certain what you mean by "user-space." That term is often overloaded; to an OS person the world is often divided into the "kernel" and "user" address spaces, and user-space pointers should never be directly dereferenced within the kernel (this can happen when arguments passed from system calls, etc., are not properly handled in the kernel).

My understanding (which could be wrong) is that this is a completely different concept from what you mean by extending the checking to user-space. If you could clarify a little more what you mean, that would be helpful. I'm also not really clear on what you mean by "porting" it (the varargs checker) to the liveness/uninitialized analyses.

Ted

Oh, it has nothing to do with it. You mentioned "user-space" and it triggered a random association in my brain. :)

-Chris

One random and maybe interesting thought: the linux kernel people are
marking pointers as user or kernel and using their 'sparse' tool to
flag semantic violations. Instead of adding special support to clang
to handle something like this, I wonder if Christopher's alternate
address space work could be used to handle this...

Interesting possibility! If this is something you're interested in I'll try to get my address spaces clang work committed sooner rather than later.

It was just a random idea, nothing is driving it. It would be interesting to see these patches when they are ready though, to match the llvm IR support. No need to rush it though.

Another use of these types of pointer attributes is Microsoft's __ptr32/__ptr64, though I don't think that's so much for analysis as pure pointer hackery.

Don't forget about support for 8086 machines with near and far pointers! ;)

-Chris

I think there may be a misunderstanding by what you mean by "extend this to user-space." I agree that the vararg checking and the address space qualifiers are not the same exact topic, although the latter could be used to augment the former.

Not everyone has looked at your code, so they may not even be aware of what kinds of problems you were looking for in the use of the PHP interpreter API varargs functions. My understanding you were looking at internal consistency within the interpreter codebase of how these functions were used; from this perspective, I'm not certain what you mean by "user-space." That term is often overloaded; to an OS person the world is often divided into the "kernel" and "user" address spaces, and user-space pointers should never be directly dereferenced within the kernel (this can happen when arguments passed from system calls, etc., are not properly handled in the kernel).

I think I didn't explain myself well, sorry.

The PHP interpreter has the following function:
int zend_parse_parameters(int num_args, char *type_spec, ...);

it is usually used like this:
zend_parse_parameters(ZEND_NUM_ARGS(), "s|l", &str, &str_len, &number);

The problem is that the number and type of arguments depend on the format string. In this case it receives a string (str + length) and a long (optional). No compiler is currently able (AFAIK) to check if the function is called correctly. Also, 'number' might not be initialized, while str and str_len are (if the function doesn't return FAILURE).
I implemented a simple checker with clang to verify the parameter types. I mentioned that I need to port it to the liveness analyzer because I want to check if the parameters after the '|' are used before initialization and if the ones before are not initialized unnecessarily.
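As a minimal sketch of the first half of such a check, the function below counts how many pointer arguments a type spec expects. It handles only the two specifiers that appear in the example above ('s' takes a string pointer plus a length pointer, 'l' takes one long pointer) and treats '|' as the start of the optional part; `SpecArity` and `spec_arity` are hypothetical names, not part of the PHP API or of the actual checker.

```c
#include <stddef.h>

typedef struct {
    int required;  /* pointer arguments guaranteed to be filled in on success */
    int total;     /* required plus optional pointer arguments */
} SpecArity;

/* Returns 0 on success, -1 on an unsupported specifier or a second '|'. */
static int spec_arity(const char *spec, SpecArity *out)
{
    int count = 0;
    int required = -1;   /* stays -1 until a '|' is seen */

    for (const char *p = spec; *p != '\0'; ++p) {
        switch (*p) {
        case 's': count += 2; break;   /* string pointer plus its length */
        case 'l': count += 1; break;   /* one long */
        case '|':
            if (required >= 0)
                return -1;
            required = count;
            break;
        default:
            return -1;                 /* specifier not covered by this sketch */
        }
    }
    out->required = required >= 0 ? required : count;
    out->total = count;
    return 0;
}
```

For "s|l" this gives total = 3 and required = 2: a checker can compare total against the number of variadic arguments at the call site, and treat everything past the first two as possibly left untouched.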

I doubt that anytime soon compilers will be able to analyze these varargs functions automatically (well, you could try to use some heuristics, like searching for a switch, but..), so my idea was to expose some kind of API to the programmers to allow them to specify some arbitrary function to validate the arguments.
GCC supports the following:
void my_printf(const char *format, ...) __attribute__((format(printf, 1, 2)));

but GCC only supports a fixed set of built-in archetypes (printf, scanf, strftime and strfmon). My idea was to generalize this, by allowing the user to specify some function (without touching the compiler's code).
While the idea seems fairly acceptable, I don't have any syntax proposal.
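For reference, this is roughly how the existing attribute is used on a user-defined wrapper. The `PRINTF_LIKE` macro is just a convenience invented for this sketch so that the code still compiles with compilers that lack the attribute.

```c
#include <stdarg.h>
#include <stdio.h>

/* With GCC-compatible compilers, calls to my_printf are checked as if they
   were calls to printf: argument 1 is the format string and the variadic
   arguments start at position 2. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt, first) __attribute__((format(printf, fmt, first)))
#else
#define PRINTF_LIKE(fmt, first)
#endif

int my_printf(const char *format, ...) PRINTF_LIKE(1, 2);

int my_printf(const char *format, ...)
{
    va_list args;
    va_start(args, format);
    int written = vprintf(format, args);   /* forward to the real formatter */
    va_end(args);
    return written;
}
```

With -Wformat enabled (it is part of -Wall), a call such as my_printf("%s", 42) draws a warning at compile time. The checking logic itself lives inside the compiler, which is exactly the part the proposal above would like to open up.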

Reference: Ian Lance Taylor - Re: expanding __attribute__((format,..))

Any thoughts? :)

Nuno

I doubt that anytime soon compilers will be able to analyze these varargs
functions automatically (well, you could try to do use some heuristics, like
searching for a switch, but..), so my idea was to expose some kind of API to
the programmers to allow them to specify some arbitrary function to validate
the arguments.
GCC supports the following:
void my_printf(const char *format, ...) __attribute__((format(printf, 1,
2)));

but GCC only supports the printf and scanf functions. My idea was to
generalize this, by allowing the user to specify some function (without
touching in the compiler's code).
While the idea seems fairly acceptable, I don't have any syntax proposal.

Reference: Ian Lance Taylor - Re: expanding __attribute__((format,..))

Something like this would be very nice, but it has to be simple enough to be comprehensible by actual mortals; therein lies the trick :)

-Chris

The PHP interpreter has the following function:
int zend_parse_parameters(int num_args, char *type_spec, ...);

it is usually used like this:
zend_parse_parameters(ZEND_NUM_ARGS(), "s|l", &str, &str_len, &number);

OK. This example really helps.

The problem is that the number and type of arguments depend on the format string. In this case it receives a string (str + length) and a long (optional). No compiler is currently able (AFAIK) to check if the function is called correctly. Also, 'number' might not be initialized, while str and str_len do (if the function doesn't return FAILURE).

So if I understand correctly, zend_parse_parameters has the following postcondition:

"return value" != FAILURE => str == INITIALIZED, str_len == INITIALIZED,

"return value" == FAILURE => str == UNINITIALIZED, str_len == UNINITIALIZED

What you would like to do is expand the "uninitialized values" analysis to take into account the "return value" so that you can flag possible bad uses of "str" and "str_len"?

I implemented a simple checker with clang to verify the parameter types. I mentioned that I need to port it to the liveness analyzer

I think you mean the "uninitialized values" analyzer, not the "liveness analyzer." They are two completely different concepts. Liveness determines if the value in a variable will ever be used after a given point. Uninitialized values determines if the value in a variable is garbage, regardless of whether or not the value will be used later. Further, in an optimizing compiler both analyses are a form of dataflow analysis, except that "uninitialized values" is a forward dataflow analysis (information propagates forward in the CFG) and "liveness" is a reverse dataflow analysis (information propagates backwards in the CFG). This is an implementation detail, but it does illustrate that they are two separate concepts.
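The difference in direction can be shown on straight-line code, where no CFG is needed. The following is a simplified sketch with hypothetical names (`Stmt`, `uninitialized_uses`, `liveness`), not clang's implementation; each variable is one bit in a mask.

```c
#include <stddef.h>

/* One straight-line statement: bit i of `uses`/`defs` stands for variable i. */
typedef struct {
    unsigned uses;   /* variables read by the statement */
    unsigned defs;   /* variables assigned by the statement */
} Stmt;

/* Forward pass: returns the variables that are read before any assignment. */
static unsigned uninitialized_uses(const Stmt *code, size_t n)
{
    unsigned initialized = 0, bad = 0;
    for (size_t i = 0; i < n; ++i) {
        bad |= code[i].uses & ~initialized;
        initialized |= code[i].defs;
    }
    return bad;
}

/* Backward pass: live_after[i] receives the variables whose current value
   is still read by some statement after statement i. */
static void liveness(const Stmt *code, size_t n, unsigned *live_after)
{
    unsigned live = 0;
    for (size_t i = n; i-- > 0; ) {
        live_after[i] = live;
        live = (live & ~code[i].defs) | code[i].uses;
    }
}
```

For the sequence `x = 1; z = x + y; x = 2;` (x, y, z as bits 0, 1, 2), the forward pass reports y as used while uninitialized, whereas the backward pass reports that z and the second value of x are never live: initialized but unused, the opposite kind of finding.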

because I want to check if the parameters after the '|' are used before initialization

Let me see if I understand what you mean. After a call to "zend_parse_parameters", you want to track the possible initialized/uninitialized state of the "str" and "str_len" arguments (which depends on the "return value" of zend_parse_parameters). If you use "str" or "str_len" (or whatever other variables were used as arguments) when they could be in the "uninitialized" state, you want to flag an error. Is this what you mean?

and if the ones before are not initialized unnecessarily.

For this one, I'm not certain what you mean by "not initialized unnecessarily."

I doubt that anytime soon compilers will be able to analyze these varargs functions automatically (well, you could try to do use some heuristics, like searching for a switch, but..), so my idea was to expose some kind of API to the programmers to allow them to specify some arbitrary function to validate the arguments.
GCC supports the following:
void my_printf(const char *format, ...) __attribute__((format(printf, 1, 2)));

but GCC only supports the printf and scanf functions. My idea was to generalize this, by allowing the user to specify some function (without touching in the compiler's code).
While the idea seems fairly acceptable, I don't have any syntax proposal.

There was some interesting work on ESC/Java on providing powerful, logic-based annotations to functions and classes. The annotations were injected in comments, and a hacked Java parser would read those comments (similar to parsing Javadoc comments) and use those annotations to describe pre- and postconditions for functions/classes/whatever. Some of the preconditions/postconditions one could associate with a function were extremely expressive; the downside is that they could require an expensive theorem prover to actually verify that the conditions would hold. On the other hand, the syntax of these annotations was actually not all that gross, although adding parser support for comment-based annotations for C/C++ is much more of a challenge because these languages are far messier in their syntactic structure. Adding attributes to support such stuff might be reasonable as well, as long as the logic-based annotations were embedded in a quoted string within the attributes.

I'm not proposing, however, that we implement ESC/Java for clang, although a subset of those features might be extremely useful, as it is better to encode such properties concerning the contract associated with a function's interface in the actual source code (e.g. header files) instead of hardwiring such knowledge into a specific tool. This not only allows the tool to become more extensible as more code is annotated, but also means that the knowledge is more portable, and doesn't die out when a specific tool dies out.

The other thing that I would like to mention is that the particular property you are describing is a little more than extending a flow-sensitive uninitialized values analysis. Because the uninitialized/initialized state of "str" and "str_len" depends on the return value of zend_parse_parameters, it almost inherently becomes a path-sensitive property if you want to check it with any real precision. We will likely extend the uninitialized values analysis to work in the new path-sensitive dataflow engine that we are building; in that case adding such information might actually be pretty easy and should give you the precision that you need to not spit out too much noise to the user.
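To make the path-sensitivity concrete, here is a small sketch of how the state could be split at the call, under the assumption that each path carries one initialization fact per variable. All names are hypothetical and this is not the design of the path-sensitive engine; it only models the "s|l" call discussed above.

```c
#include <stdbool.h>

typedef enum { UNINIT, MAYBE_INIT, INIT } InitState;

/* Hypothetical per-path facts for the variables passed to
   zend_parse_parameters(..., "s|l", &str, &str_len, &number). */
typedef struct {
    InitState str;
    InitState str_len;
    InitState number;
} PathState;

/* Path on which the call did not return FAILURE: the required arguments
   are filled in, the optional one only possibly. */
static PathState after_success(PathState before)
{
    PathState s = before;
    s.str = INIT;
    s.str_len = INIT;
    if (s.number != INIT)
        s.number = MAYBE_INIT;
    return s;
}

/* Path on which the call returned FAILURE: this sketch assumes nothing new
   is guaranteed, so the facts are left unchanged. */
static PathState after_failure(PathState before)
{
    return before;
}

/* A read is reported unless the variable is definitely initialized. */
static bool use_is_error(InitState s)
{
    return s != INIT;
}
```

A flow-sensitive analysis that merges both paths after the call would end up with MAYBE_INIT for "str" and "str_len" everywhere; keeping the two states apart is what lets a use of "str" after the FAILURE check pass without a warning.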

it is usually used like this:
zend_parse_parameters(ZEND_NUM_ARGS(), "s|l", &str, &str_len, &number);

So if I understand correctly, zend_parse_parameters has the following postcondition:

"return value" != FAILURE => str == INITIALIZED, str_len == INITIALIZED,

"return value" == FAILURE => str == UNINITIALIZED, str_len == UNINITIALIZED

yes, and 'number' may or may not be initialized.

What you would like to do is expand the "uninitialized values" analysis to take into account the "return value" so that you can flag possible bad uses of "str" and "str_len"?

exactly.

because I want to check if the parameters after the '|' are used before initialization

Let me see if I understand what you mean. After a call to "zend_parse_parameters", you want to track the possible initialized/ uninitialized state of the "str" and "str_len" arguments (which depends on the "return value" of zend_parse_parameters). If you use "str" or "str_len" (or whatever other variables were used as arguments) if they could be in the "uninitialized" state, you want to flag an error. Is this what you mean?

yep :)

and if the ones before are not initialized unnecessarily.

This one I'm not certain what you mean. I'm not certain what you mean by "not initialized unnecessarily."

expanding the previous example:

1: char *str = NULL;
2: int str_len, number = 3;
3:
4: if (zend_parse_parameters(ZEND_NUM_ARGS(), "s|l", &str, &str_len, &number) == FAILURE) {
5: return;
6: }
7:
8: printf("got the string: %s and the number: %d\n", str, number);

in this case the 'str' didn't need to be initialized, because it is guaranteed that after line 6 it was filled in by zend_parse_parameters. 'number' needs to be initialized, because it is used in line 8 and it isn't guaranteed that zend_parse_parameters will fill it in.

I'm not proposing, however, that we implement ESC/Java for clang, although a subset of those features might be extremely useful, as it is better to encode such properties concerning the contract associated with a function's interface in the actual source code (e.g. header files) instead of hardwiring such knowledge into a specific tool. This not only allows the tool to become more extensible as more code is annotated, but also means that the knowledge is more portable, and doesn't die out when a specific tool dies out.

Uhm, interesting.. I wasn't aware of this ESC/Java tool. I'll investigate it further, thanks.

The other thing that I would like to mention is that the particular property you are describing is a little more than extending a flow- sensitive uninitialized values analysis. Because the uninitialized/ initialized state of "str" and "str_len" depends on the return value of zend_parse_parameters, it almost inherently becomes a path- sensitive property if you want to check it with any real precision. We will likely extend the uninitialized values analysis to work in the new path-sensitive dataflow engine that we are building; in that case adding such information might actually be pretty easy and should give you the precision that you need to not spit out too much noise to the user.

Yes, you are right :) But in this case the usage of that function is pretty standard. If it returns FAILURE, the code simply returns. So most cases can be handled with this heuristic. Anyway, it'll report far fewer false positives than my current regex-based script :)
But sure, I'm waiting for your path-sensitive solver, so that I can trash mine :P

Thanks,
Nuno

and if the ones before are not initialized unnecessarily.

This one I'm not certain what you mean. I'm not certain what you mean by "not initialized unnecessarily."

expanding the previous example:

1: char *str = NULL;
2: int str_len, number = 3;
3:
4: if (zend_parse_parameters(ZEND_NUM_ARGS(), "s|l", &str, &str_len, &number) == FAILURE) {
5: return;
6: }
7:
8: printf("got the string: %s and the number: %d\n", str, number);

in this case the 'str' didn't need to be initialized, because it is guaranteed that after line 6 it was filled in by zend_parse_parameters. 'number' needs to be initialized, because it is used in line 8 and it isn't guaranteed that zend_parse_parameters will fill it in.

Got it. Makes sense.

I'm not proposing, however, that we implement ESC/Java for clang, although a subset of those features might be extremely useful, as it is better to encode such properties concerning the contract associated with a function's interface in the actual source code (e.g. header files) instead of hardwiring such knowledge into a specific tool. This not only allows the tool to become more extensible as more code is annotated, but also means that the knowledge is more portable, and doesn't die out when a specific tool dies out.

Uhm, interesting.. I wasn't aware of this ESC/Java tool. I'll investigate it further, thanks.

Keep in mind that ESC/Java was a research project, and not something that was deployed on a wide scale. There are some interesting lessons from that project, including that annotations can be a huge burden to users if not used measuredly. We can discuss this more if you are interested.

Another (more recent) use of annotations was at Microsoft, where they developed "SAL" (Standard Annotation Language) for annotating functions for use in buffer overrun detection. The goal was to provide enough annotations that they could *verify* the absence of buffer overruns in many cases:

   http://msdn2.microsoft.com/en-us/library/ms235402(VS.80).aspx

The one trick is that to use annotations in this way required some colossal effort by Microsoft engineers (and management), but the hope was that by using the annotations they would be safe from certain classes of security bugs (at least for the code that they annotated).