Thread-local-storage memory layout


I am writing a freestanding, statically linked x86_64 kernel, and now
I want to load and use thread-local storage. Because the binary is
freestanding, I can't use the libc/Linux kernel thread-local mechanism
and need to implement the initialization code myself.

I am trying to find a clear explanation of the memory layout for
x86 TLS but can't find anything useful. Maybe LLVM developers who
have worked with TLS could help me.

OK, I have two TLS sections, .tdata and .tbss:

  [ 6] .tdata PROGBITS 00000000002a90d0 001a5dd0
       0000000000000004 0000000000000000 WAT 0 0 16
  [ 7] .tbss NOBITS 00000000002a90e0 001a5dd4
       0000000000009c94 0000000000000000 WAT 0 0 16

And here is the ELF segment information:

  TLS 0x00000000001a5dd0 0x00000000002a90d0 0x00000000002a90d0
                 0x0000000000000004 0x0000000000009ca4 R 0x10

So I load this segment into memory using the file offset, file size,
and memory size from the program header.
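
In code, the loading step is roughly the following. This is a minimal
sketch, not my exact loader: struct tls_image and tls_load are names
made up for this mail, and block is assumed to point to at least memsz
bytes aligned to the segment alignment (0x10 here).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the PT_TLS segment (values from the program header). */
struct tls_image {
    const uint8_t *init; /* initialized bytes (.tdata), found at the file offset */
    size_t filesz;       /* file size: 0x4 in the dump above */
    size_t memsz;        /* memory size: 0x9ca4 in the dump above */
    size_t align;        /* alignment: 0x10 in the dump above */
};

/* Copy the initialized part and zero the remainder (.tbss). */
static void tls_load(uint8_t *block, const struct tls_image *img)
{
    size_t i;

    for (i = 0; i < img->filesz; i++)
        block[i] = img->init[i];
    for (; i < img->memsz; i++)
        block[i] = 0;
}
```

Each thread would need its own copy made this way, so the original
.tdata bytes are only ever read, never modified.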

My understanding is that on x86_64 there is a "struct thread_info"
located right after the loaded TLS segment, and that the first element
of this structure is a pointer (i.e. sizeof(uintptr_t) bytes).
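
Written out as code, the layout I have in mind is the sketch below. It
assumes the usual x86-64 "variant II" arrangement, in which the
structure sits at the end of the TLS block after rounding the block
size up to the segment alignment. The helper names align_up, tls_tp and
tls_place are hypothetical, and the self-pointer assignment is an
assumption too (it is one of my questions below).

```c
#include <stddef.h>
#include <stdint.h>

/* Only the first field matters here; anything after it is up to the kernel. */
struct thread_info {
    struct thread_info *pointer; /* assumed to point back to the structure */
};

/* Round v up to a multiple of align (align must be a power of two). */
static uintptr_t align_up(uintptr_t v, uintptr_t align)
{
    return (v + align - 1) & ~(align - 1);
}

/* Address of the structure for a TLS block starting at an aligned base. */
static uintptr_t tls_tp(uintptr_t base, size_t memsz, size_t align)
{
    return base + align_up(memsz, align);
}

/* Initialize the structure that follows the TLS block at base. */
static struct thread_info *tls_place(uintptr_t base, size_t memsz, size_t align)
{
    struct thread_info *ti = (struct thread_info *)tls_tp(base, memsz, align);

    ti->pointer = ti; /* assumption: the first word refers to the struct itself */
    return ti;
}
```

With the numbers from the dump above (memory size 0x9ca4, alignment
0x10) this puts the structure 0x9cb0 bytes after the start of the
block, so there would be 12 bytes of padding after .tbss.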

Now I need to set up %fs and the thread_info structure, and this is
where I have a few questions:

* What is the alignment requirement for struct thread_info? Is it
sizeof(uintptr_t) (i.e. 8 bytes on x86_64)?
* Where does thread_info->pointer point? Does it point to the struct
itself?
* What should the value of the %fs register be? Is it the address of
the thread_info struct?
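
If the answer to the last question is yes, I would expect to program
the FS base roughly as follows. This is only a sketch under that
assumption: it relies on the CPU being in long mode at ring 0, where
the FS base is written through the IA32_FS_BASE MSR (0xC0000100), and
msr_split and write_fs_base are hypothetical helper names.

```c
#include <stdint.h>

#define MSR_IA32_FS_BASE 0xC0000100u

/* WRMSR takes the 64-bit value split across EDX (high) and EAX (low). */
struct msr_value {
    uint32_t lo;
    uint32_t hi;
};

static struct msr_value msr_split(uint64_t value)
{
    struct msr_value v;

    v.lo = (uint32_t)(value & 0xffffffffu);
    v.hi = (uint32_t)(value >> 32);
    return v;
}

#if defined(__x86_64__)
/* Privileged: faults outside ring 0, so this is for kernel code only. */
static inline void write_fs_base(uint64_t base)
{
    struct msr_value v = msr_split(base);

    __asm__ volatile("wrmsr"
                     :
                     : "c"(MSR_IA32_FS_BASE), "a"(v.lo), "d"(v.hi)
                     : "memory");
}
#endif
```

The idea would be to call write_fs_base with the address of the
thread_info structure before touching any thread-local variable.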

Thank you in advance.