[RFC] Add option to limit llvm-profdata profile output size

We ran into a problem where a sample profile can grow without bound as more profiles are merged into it with llvm-profdata. In sample profiles collected from real applications, we observed that a large proportion of function entries have low sample counts. These functions are on the cold path, so whether or not they are inlined has little performance impact, and they can be dropped from the merged profile.

To reduce profile size, we implemented a new llvm-profdata flag, --output-size-limit=n, which reduces the size of the output to n or less by dropping cold functions. The option is format-agnostic, since it only constrains the size of the output file. Because the expected output size is hard to compute accurately (for example, --compress-all-sections changes it in the extensible binary format), we use a heuristic: each iteration drops a calculated number of functions, and we repeat until the size limit is satisfied.
Note that because the existing SampleProfileWriter code assumes OutputStream can only be a raw_fd_ostream, our approach is currently less efficient than it could be: each iteration actually writes the file. This could be optimized with a major refactor of SampleProfileWriter and related classes.
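
To illustrate the iterative trimming described above, here is a minimal Python sketch. It is not the llvm-profdata implementation. The names serialize and trim_to_size_limit are hypothetical, the profile is modeled as a plain mapping from function name to total sample count, and the number of functions dropped per iteration is estimated by assuming output size scales roughly linearly with function count, which is only one possible way to compute it. zlib stands in for any writer whose output size is hard to predict up front.

```python
import zlib
from typing import Callable, Dict, Tuple


def serialize(profile: Dict[str, int]) -> bytes:
    """Hypothetical stand-in for a profile writer with compression enabled."""
    text = "".join(f"{name}:{count}\n" for name, count in sorted(profile.items()))
    return zlib.compress(text.encode())


def trim_to_size_limit(
    profile: Dict[str, int],
    limit: int,
    write: Callable[[Dict[str, int]], bytes] = serialize,
) -> Tuple[Dict[str, int], bytes, int]:
    """Drop the coldest functions until the written output is at most `limit` bytes.

    Returns the kept functions, the final output, and the number of writes.
    If even an empty profile exceeds the limit, the oversized output is returned.
    """
    # Hottest first, so the coldest entries sit at the end of the list.
    remaining = sorted(profile.items(), key=lambda kv: kv[1], reverse=True)
    writes = 0
    while True:
        # Each iteration writes the whole profile just to measure its size.
        output = write(dict(remaining))
        writes += 1
        if len(output) <= limit or not remaining:
            return dict(remaining), output, writes
        # Assumed estimate: drop the same fraction of functions as the
        # fraction by which the output overshoots, and always at least one.
        overshoot = 1.0 - limit / len(output)
        num_to_drop = max(1, int(len(remaining) * overshoot))
        del remaining[len(remaining) - num_to_drop:]


if __name__ == "__main__":
    demo = {f"func_{i}": i for i in range(200)}
    kept, out, writes = trim_to_size_limit(demo, limit=200)
    print(len(kept), len(out), writes)
```

Because the estimate is only approximate, the loop may need several writes before the output fits, which is the same cost noted above for the current approach.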