Target::ReadMemory query

Hi,

I have a question regarding some of the logic of "Target::ReadMemory".
I'm looking at this code:

(in my copy this is line 1314)
     if (!addr.IsSectionOffset())
     {
         SectionLoadList &section_load_list = GetSectionLoadList();
         if (section_load_list.IsEmpty())
         {
             // No sections are loaded, so we must assume
             // we are not running yet and anything we are given is
             // a file address.
             file_addr = addr.GetOffset();
             // "addr" doesn't have a section, so its offset is the
             // file address
             m_images.ResolveFileAddress (file_addr, resolved_addr);
         }
         else
         {
             // We have at least one section loaded. This can be because
             // we have manually loaded some sections with
             // "target modules load ..."
             // or because we have a live process that has
             // sections loaded
             // through the dynamic loader
             load_addr = addr.GetOffset();
             // "addr" doesn't have a section,
             // so its offset is the load address
             section_load_list.ResolveLoadAddress (load_addr, resolved_addr);
         }
     }

My question is: why is section_load_list.IsEmpty() used (according to the comment that follows it) to deduce that the target is not running?

Further down in Target::ReadMemory I see an invocation of ProcessIsValid() to discover the reverse. So why don't we just use

if (!ProcessIsValid()) instead of section_load_list.IsEmpty() earlier in this function?

(To be clear: I'm looking at this because I'm trying to debug a loaded/running embedded target over gdb-remote, and my debugging of the lldb session shows that section_load_list.IsEmpty() is true. Possibly another issue, yes, but I think my question above still applies.)

I'm wondering if anyone on lldb-dev can comment on this?

Matt

Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom
More information can be found at www.csr.com. Keep up to date with CSR on our technical blog, www.csr.com/blog, CSR people blog, www.csr.com/people, YouTube, www.youtube.com/user/CSRplc, Facebook, www.facebook.com/pages/CSR/191038434253534, or follow us on Twitter at www.twitter.com/CSR_plc.
New for 2014, you can now access the wide range of products powered by aptX at www.aptx.com.

Hi,

I have a question regarding some of the logic of "Target::ReadMemory".
I'm looking at this code:

(in my copy this is line 1314)
   if (!addr.IsSectionOffset())
   {
       SectionLoadList &section_load_list = GetSectionLoadList();
       if (section_load_list.IsEmpty())
       {
           // No sections are loaded, so we must assume
           // we are not running yet and anything we are given is
           // a file address.
           file_addr = addr.GetOffset();
           // "addr" doesn't have a section, so its offset is the
           // file address
           m_images.ResolveFileAddress (file_addr, resolved_addr);
       }
       else
       {
           // We have at least one section loaded. This can be because
           // we have manually loaded some sections with
           // "target modules load ..."
           // or because we have a live process that has
           // sections loaded
           // through the dynamic loader
           load_addr = addr.GetOffset();
           // "addr" doesn't have a section,
           // so its offset is the load address
           section_load_list.ResolveLoadAddress (load_addr, resolved_addr);
       }
   }

My question is: why is section_load_list.IsEmpty() used (according to the comment that follows it) to deduce that the target is not running?

It's not that the target is running; it is for cases where you are symbolicating or manually loading images at various addresses. For symbolication we do things like:

1 - create target with /bin/ls (and it will load a bunch of shared libraries)
  (lldb) file /bin/ls
2 - take a crash log and load all executable images at their crash site location:
  (lldb) image load --file /bin/ls .text 0x10000
  (lldb) image load --file /usr/lib/libc.so .text 0x10000000
3 - make a memory request using the PC from the crash log:
  (lldb) memory read 0x10018

So if the target has mapped any modules to any addresses, we wish to be able to resolve these addresses. If they are section/offset addresses, then often these sections have data in the object files that can be read (like all code in your .text segments).
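The file-address vs. load-address distinction above can be sketched with a toy model (all names here are illustrative, not the real LLDB Target/SectionLoadList classes):

```python
# Toy model of why an empty section load list means "treat a raw
# address as a file address".  Illustrative only; these are not the
# real LLDB Target/SectionLoadList classes.

class ToySectionLoadList:
    def __init__(self):
        # load address of a section -> (module, section, section's file address)
        self.entries = {}

    def is_empty(self):
        return not self.entries

    def load_section(self, module, section, file_addr, load_addr):
        # mimics "(lldb) image load --file <module> <section> <load_addr>"
        self.entries[load_addr] = (module, section, file_addr)

    def resolve_load_address(self, load_addr, section_size=0x10000):
        # find the loaded section whose range contains load_addr
        for base, (module, section, file_base) in self.entries.items():
            if base <= load_addr < base + section_size:
                return (module, section, file_base + (load_addr - base))
        return None

def read_memory(addr, section_load_list):
    if section_load_list.is_empty():
        # nothing loaded: assume a file address, read from the object file
        return ("file", addr)
    # at least one section loaded: treat addr as a load address
    return ("load", section_load_list.resolve_load_address(addr))

sll = ToySectionLoadList()
print(read_memory(0x10018, sll))                       # ('file', 0x10018)
sll.load_section("/bin/ls", ".text", 0x1000, 0x10000)
print(read_memory(0x10018, sll))                       # resolves into /bin/ls .text
```

The second read succeeds only because the manual "image load" populated the toy load list, which is exactly the symbolication scenario described above.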

Further down in Target::ReadMemory I see an invocation of ProcessIsValid() to discover the reverse. So why don't we just use

if (!ProcessIsValid()) instead of section_load_list.IsEmpty() earlier in this function?

Because of the above case.

(To be clear: I'm looking at this as I'm trying to debug a loaded/running embedded target, over a gdb-remote, and my debug of the lldb session reports section_load_list.IsEmpty(). Possibly another issue, yes, but I think my question above still applies.)

I'm wondering if anyone in lldb-dev can comment on this?

The basic answer is that Target::ReadMemory is often used to read data from the sections of your object files when you aren't running yet, and it can do so using faked "load addresses" when the target's section load list has some valid entries. The target's section load list will be empty unless a dynamic loader modifies it at runtime or the user has manually loaded section addresses using "target modules load" or one of the SBTarget APIs.

Your embedded target should be using "<arch>-unknown-unknown" for a triple. This will cause your target at runtime to use the DynamicLoaderStatic plug-in which will set the "load" address for all of your binary images to match the "file" address in the file and should give you the result you are looking for.

Greg Clayton wrote:

Your embedded target should be using "<arch>-unknown-unknown" for a triple. This will cause your target at runtime to use the DynamicLoaderStatic plug-in which will set the "load" address for all of your binary images to match the "file" address in the file and should give you the result you are looking for.

Hi Greg,

The DynamicLoader fails to load my sections because GetEntryPoint returns invalid address:

addr_t
DynamicLoaderPOSIXDYLD::GetEntryPoint()
{
     if (m_entry_point != LLDB_INVALID_ADDRESS)
         return m_entry_point;

     if (m_auxv.get() == NULL)
         return LLDB_INVALID_ADDRESS;

     AuxVector::iterator I = m_auxv->FindEntry(AuxVector::AT_ENTRY);

     if (I == m_auxv->end())
==> return LLDB_INVALID_ADDRESS;

Is one of my problems the fact that DynamicLoaderPOSIXDYLD is being used, not DynamicLoaderStatic? (Of course my simple embedded target has no OS, processes, or loaders. However, I do have an ELF file, which I load prior to gdb-remote using "target create <elf-file>".)

Incidentally, the first part of my GDB-RSP exchange is:

-> +
-> QStartNoAckMode
<- +
<- OK
-> +
-> QThreadSuffixSupported
<-
-> QListThreadsInStopReply
<-
-> qHostInfo
<- pid:1;endian:little;triple:kalimba-unknown-unknown;ptrsize:4;
-> vCont?
<- vCont;c;t
-> qVAttachOrWaitSupported
<-
-> qProcessInfo
<- pid:1;endian:little;triple:kalimba-unknown-unknown;ptrsize:4;
-> ?
<- S05

(I get DynamicLoaderPOSIXDYLD used regardless of whether my triple string is kalimba-unknown-unknown or i386-unknown-unknown; kalimba is our chip's name.)
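For what it's worth, those reply payloads are just semicolon-separated key:value pairs, so a quick standalone way to sanity-check what the stub sends (a sketch, not LLDB code) is:

```python
# Parse the payload of a qHostInfo/qProcessInfo reply into a dict.
# Standalone sketch for sanity-checking stub output; not LLDB code.
def parse_info_packet(payload):
    info = {}
    for field in payload.split(";"):
        if field:
            key, _, value = field.partition(":")
            info[key] = value
    return info

reply = "pid:1;endian:little;triple:kalimba-unknown-unknown;ptrsize:4;"
info = parse_info_packet(reply)
print(info["triple"])   # kalimba-unknown-unknown
print(info["ptrsize"])  # 4
```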

I'll spend the rest of my day trying to figure out why DynamicLoaderStatic is not being used.

thanks
Matt


Matthew Gardiner wrote:

(I get DynamicLoaderPOSIXDYLD used regardless of whether my triple string is kalimba-unknown-unknown or i386-unknown-unknown; kalimba is our chip's name.)

I'll spend the rest of my day trying to figure out why DynamicLoaderStatic is not being used.

My hunch was correct (thanks for the hint, Greg!) - the dynamic loader plugin was previously DynamicLoaderPOSIXDYLD; if I *hack* stuff and arrange for DynamicLoaderStatic to be used, then my ELF has its sections loaded, and a memory read from a known global variable address works. (At that moment a smile appears on my face!)

The problem is figuring out why DynamicLoaderPOSIXDYLD was used as the dynamic loader in my "embedded target debug".

The first issue I've seen, as to the cause of the above, is that when the DYLDers are being queried by plugin discovery, my target's architecture is being reported as having os=linux.

{Data = "unknown--linux"
Arch = llvm::Triple::UnknownArch
Vendor = llvm::Triple::UnknownVendor
OS = llvm::Triple::Linux
..}

This is despite my stub, reporting "unknown" in the os part of the triple:

-> qHostInfo
<- pid:1;endian:little;triple:kalimba-unknown-unknown;ptrsize:4;
...
-> qProcessInfo
<- pid:1;endian:little;triple:kalimba-unknown-unknown;ptrsize:4;

I guess I'll have to dig deeper into the Target::GetArchitecture/SetArchitecture code. (After doing more digging since I first started to pen this mail, I believe that upon "target create", my lldb Target object initially acquires the host OS's id in its triple. Perhaps the issue is that ProcessGDBRemote fails to subsequently override this? That's what I'm looking at now.)

After forcing my copy's DynamicLoaderPOSIXDYLD::CreateInstance to return NULL, the next issue to hack around was:

DynamicLoader *
DynamicLoaderStatic::CreateInstance (Process* process, bool force)
{
...
create = (object_file->GetStrata() == ObjectFile::eStrataRawImage);
...
}

When parsing my ELF, object_file->GetStrata() returns eStrataUser as the strata, and thus initially DynamicLoaderStatic was not created.

So doesn't DynamicLoaderStatic::CreateInstance require some modification, e.g.

if (object_file)
{
     const ObjectFile::Strata s = object_file->GetStrata();
     create = (s == ObjectFile::eStrataRawImage) ||
       (s == ObjectFile::eStrataUser);
}

?

Matt


Your problem seems to be that ELF files all claim to have the host OS and vendor:

bool
ObjectFileELF::GetArchitecture (ArchSpec &arch)
{
    if (!ParseHeader())
        return false;

    arch.SetArchitecture (eArchTypeELF, m_header.e_machine, LLDB_INVALID_CPUTYPE);
    arch.GetTriple().setOSName (Host::GetOSString().GetCString());
    arch.GetTriple().setVendorName(Host::GetVendorString().GetCString());
    return true;
}

This code should probably check if the host architecture has the same CPU type before setting this:

bool
ObjectFileELF::GetArchitecture (ArchSpec &arch)
{
    if (!ParseHeader())
        return false;

    arch.SetArchitecture (eArchTypeELF, m_header.e_machine, LLDB_INVALID_CPUTYPE);

    // TODO: add code to look at .note sections and anything else in the program headers,
    // section headers, symbol table, etc to properly determine the vendor and OS.

    ArchSpec host_arch_32(Host::GetArchitecture (Host::eSystemDefaultArchitecture32));
    ArchSpec host_arch_64(Host::GetArchitecture (Host::eSystemDefaultArchitecture64));
    // Only set the vendor and os to the host values if the architectures match
    if ((host_arch_32.IsValid() && arch.IsCompatibleMatch(host_arch_32)) ||
        (host_arch_64.IsValid() && arch.IsCompatibleMatch(host_arch_64)))
    {
        arch.GetTriple().setOSName (Host::GetOSString().GetCString());
        arch.GetTriple().setVendorName(Host::GetVendorString().GetCString());
    }

    return true;
}

But it would be better to also look around in the ELF file for .note sections or anything else that can help you determine the correct triple for a given ELF file. If "kalimba" architectures are never native, you can put an extra check in here. You might also be able to look at the type of the ELF file in the ELF header (e_type) and see if it is:

ET_NONE - probably best not to set the os and vendor to host (is this the kind of file you have?)
ET_EXEC, ET_DYN, ET_CORE - do what is being done above with host architectures, and maybe add some .note code to see if you can identify anything more about the binary. I am guessing Linux ELF files for executables and shared libraries have something that you will be able to use to properly identify them.

So some more intelligent code in ObjectFileELF can help us classify the binaries more correctly; it should improve things in LLDB.
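A minimal sketch of the e_machine/e_type inspection being suggested, assuming plain struct parsing of the ELF header (the EM_KALIMBA value is taken from the readelf output later in this thread; none of this is LLDB code):

```python
import struct

# Minimal ELF32 little-endian header inspection sketching the
# e_machine/e_type based classification suggested above.
# EM_KALIMBA comes from the readelf output later in this thread;
# nothing here is LLDB code.
EM_KALIMBA = 0x72EC
ET_NONE, ET_EXEC, ET_DYN, ET_CORE = 0, 2, 3, 4

def classify_elf(header_bytes):
    if header_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_type and e_machine are the two 16-bit fields immediately
    # after the 16-byte e_ident array
    e_type, e_machine = struct.unpack_from("<HH", header_bytes, 16)
    if e_machine == EM_KALIMBA:
        # kalimba is never a native host binary:
        # don't inherit the host OS/vendor
        return ("kalimba", "unknown", "unknown", e_type)
    return (hex(e_machine), None, None, e_type)

# Fabricated header prefix: ELF32, little endian, ET_EXEC, kalimba
hdr = b"\x7fELF\x01\x01\x01\x00" + b"\x00" * 8 + struct.pack("<HH", ET_EXEC, EM_KALIMBA)
print(classify_elf(hdr))  # ('kalimba', 'unknown', 'unknown', 2)
```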

There are some comments on this in
http://llvm.org/bugs/show_bug.cgi?id=17209 including a partial list of
ELF note types.

Greg Clayton wrote:

Your problem seems to be that ELF files all claim to have the host OS and vendor:

Yes, indeed.

bool
ObjectFileELF::GetArchitecture (ArchSpec &arch)
{
     if (!ParseHeader())
         return false;

However, in my working copy (only a week old), GetArchitecture's logic is duplicated by GetModuleSpecifications, and upon invoking "target create" my lldb ends up using the host's specification:

ObjectFileELF::GetModuleSpecifications (const lldb_private::FileSpec& file,
...
{
<snip>

     if (spec.GetArchitecture().IsValid())
     {
        // We could parse the ABI tag information ...
==>spec.GetArchitecture().GetTriple().setOSName (Host::GetOSString().GetCString());

But regardless of this (i.e. ObjectFileELF::Get* returning an unreliable OS type at "target create"), I think the problem is that when the target's inferior is first attached, particularly in the case of embedded targets modeled by ProcessGDBRemote, the target architecture is not correctly "adjusted" when qHostInfo/qProcessInfo are received from the stub (why have the messages if they are not acted on?).

In ProcessGDBRemote::DoConnectRemote, we have:

if (!m_target.GetArchitecture().IsValid())
{
     if (m_gdb_comm.GetProcessArchitecture().IsValid())
     {
         m_target.SetArchitecture(m_gdb_comm.GetProcessArchitecture());
     }
     else
     {
         m_target.SetArchitecture(m_gdb_comm.GetHostArchitecture());
     }
}

So in my case, since my target's architecture is already "valid" (i.e. m_core is defined, etc.), the DoConnectRemote code doesn't consider the stub's opinion on the target. Surely in the remote/embedded case we must trust the stub's host info if supplied? In my opinion, this is the cause of a lot of my problems.

However, if I comment out "if (!m_target.GetArchitecture().IsValid())" and allow the SetArchitectures to proceed, I still run into problems, since:

bool
Target::SetArchitecture (const ArchSpec &arch_spec)
{
...
            m_arch = arch_spec;
....
==> SetExecutableModule (executable_sp, true);

That is, the arch_spec my gdb-remote passes to SetArchitecture is first assigned to m_arch, but is then overwritten by SetExecutableModule.

It turns out that SetExecutableModule overwrites the archspec supplied by my stub, since

void
Target::SetExecutableModule
{
         ....
         if (!m_arch.IsValid())
         {
         ==>m_arch = executable_sp->GetArchitecture();

!m_arch.IsValid() occurred, seemingly, because my stub was not setting cputype. However, when I set up the "cputype" in qHostInfo, more problems arise:

bool
GDBRemoteCommunicationClient::GetHostInfo (bool force)
{
...
     if (cpu != LLDB_INVALID_CPUTYPE)
     {

since the code inside this logic then sets up some very apple/ios/macosx-specific behaviour.

I apologise for the above braindump, but my conclusion is that to get lldb to work properly for non-apple, non-linux, etc. bare-metal embedded architectures, I need to submit several patches, in particular to the gdb-remote handling logic.

With your blessing, are you happy for me to do this?

But it would be better to also look around in the ELF file for .note sections or anything else that can help you determine the correct triple for a given ELF file. If "kalimba" architectures are never native, you can put an extra check in here. You might also be able to look at the type of the ELF file in the ELF header (e_type) and see if it is:

ET_NONE - probably best not to set the os and vendor to host (is this the kind of file you have?)
ET_EXEC, ET_DYN, ET_CORE - do what is being done above with host architectures, and maybe add some .note code to see if you can identify anything more about the binary. I am guessing Linux ELF files for executables and shared libraries have something that you will be able to use to properly identify them.

So some more intelligent code in ObjectFileELF can help us classify the binaries more correctly; it should improve things in LLDB.

I'm happy to supply some better ObjectFileELF code in lldb. But my opinion as stated above is that the information received from the stub should *strongly* influence the specification of the architecture/OS etc. in the final Target object.

Out of interest the ELF for one of our kalimba variants is as follows:

ELF Header:
   Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
   Class: ELF32
   Version: 1 (current)
   OS/ABI: UNIX - System V
   ABI Version: 0
   Type: EXEC (Executable file)
   Machine: <unknown>: 0x72ec
   Version: 0x1
   Entry point address: 0x80000000
....

(Sooner or later I'd like to submit a patch to upstream lldb with core definitions for this chip. )

thanks
Matt


Greg Clayton wrote:

Your problem seems to be that ELF files all claim to have the host OS and vendor:

Yes, indeed.

bool
ObjectFileELF::GetArchitecture (ArchSpec &arch)
{
    if (!ParseHeader())
        return false;

However, in my working copy (only a week old), GetArchitecture's logic is duplicated by GetModuleSpecifications, and upon invoking "target create" my lldb ends up using the host's specification:

ObjectFileELF::GetModuleSpecifications (const lldb_private::FileSpec& file,
...
{
<snip>

   if (spec.GetArchitecture().IsValid())
   {
      // We could parse the ABI tag information ...
==>spec.GetArchitecture().GetTriple().setOSName (Host::GetOSString().GetCString());

But regardless of this (i.e. ObjectFileELF::Get* returning an unreliable OS type at "target create"), I think the problem is that when the target's inferior is first attached, particularly in the case of embedded targets modeled by ProcessGDBRemote, the target architecture is not correctly "adjusted" when qHostInfo/qProcessInfo are received from the stub (why have the messages if they are not acted on?).

In ProcessGDBRemote::DoConnectRemote, we have:

if (!m_target.GetArchitecture().IsValid())
{
   if (m_gdb_comm.GetProcessArchitecture().IsValid())
   {
       m_target.SetArchitecture(m_gdb_comm.GetProcessArchitecture());
   }
   else
   {
       m_target.SetArchitecture(m_gdb_comm.GetHostArchitecture());
   }
}

So in my case, since my target's architecture is already "valid" (i.e. m_core is defined, etc.), the DoConnectRemote code doesn't consider the stub's opinion on the target. Surely in the remote/embedded case we must trust the stub's host info if supplied?

Not necessarily. A host can be "x86_64-apple-macosx", yet if we are debugging an iOS simulator app we will get "x86_64-apple-ios" as the process triple. So we really want to trust the process info over the host info.
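That precedence (process info over host info, with the object file as a last resort) could be sketched as follows; this is illustrative only, not the actual ProcessGDBRemote logic:

```python
# Sketch of the triple-selection precedence being discussed:
# prefer qProcessInfo, then qHostInfo, then the object file's guess.
# Illustrative only; not the actual ProcessGDBRemote logic.
def pick_triple(process_triple, host_triple, object_file_triple):
    for candidate in (process_triple, host_triple, object_file_triple):
        if candidate:  # treat None/"" as "not supplied by the stub"
            return candidate
    return None

# iOS simulator case: the process triple wins over the host triple
print(pick_triple("x86_64-apple-ios", "x86_64-apple-macosx",
                  "x86_64-apple-macosx"))  # x86_64-apple-ios
# embedded stub reporting its triple in qProcessInfo
print(pick_triple("kalimba-unknown-unknown", None,
                  "kalimba-unknown-linux"))  # kalimba-unknown-unknown
```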

In my opinion, this is the cause of a lot of my problems.

However, if I comment out "if (!m_target.GetArchitecture().IsValid())" and allow the SetArchitectures to proceed, I still run into problems, since:

bool
Target::SetArchitecture (const ArchSpec &arch_spec)
{
...
          m_arch = arch_spec;
....
==> SetExecutableModule (executable_sp, true);

That is, the arch_spec my gdb-remote passes to SetArchitecture is first assigned to m_arch, but is then overwritten by SetExecutableModule.

It turns out that SetExecutableModule overwrites the archspec supplied by my stub, since

void
Target::SetExecutableModule
{
       ....
       if (!m_arch.IsValid())
       {
       ==>m_arch = executable_sp->GetArchitecture();

!m_arch.IsValid() occurred, seemingly, because my stub was not setting cputype. However, when I set up the "cputype" in qHostInfo, more problems arise:

bool
GDBRemoteCommunicationClient::GetHostInfo (bool force)
{
...
   if (cpu != LLDB_INVALID_CPUTYPE)
   {

since the code inside this logic then sets up some very apple/ios/macosx-specific behaviour.

cputype and subtype are currently assumed to be a mach-o thing, and they then might default to the wrong vendor and OS. I would suggest setting the triple only for non-mach-o cases.

I apologise for the above braindump, but my conclusion is that to get lldb to work properly for non-apple, non-linux, etc. bare-metal embedded architectures, I need to submit several patches, in particular to the gdb-remote handling logic.

With your blessing, are you happy for me to do this?

We need to watch out for certain pitfalls, but yes, this should work better than it does now and fixes are required. We just need to make sure not to regress on the desktop and remote targets with any changes we make.

But it would be better to also look around in the ELF file for .note sections or anything else that can help you determine the correct triple for a given ELF file. If "kalimba" architectures are never native, you can put an extra check in here. You might also be able to look at the type of the ELF file in the ELF header (e_type) and see if it is:

ET_NONE - probably best not to set the os and vendor to host (is this the kind of file you have?)
ET_EXEC, ET_DYN, ET_CORE - do what is being done above with host architectures, and maybe add some .note code to see if you can identify anything more about the binary. I am guessing Linux ELF files for executables and shared libraries have something that you will be able to use to properly identify them.

So some more intelligent code in ObjectFileELF can help us classify the binaries more correctly; it should improve things in LLDB.

I'm happy to supply some better ObjectFileELF code in lldb. But my opinion as stated above is that the information received from the stub should *strongly* influence the specification of the architecture/OS etc. in the final Target object.

We need both to be correct. The object files often determine the triple before we run and we want to get this right in the object file. After we run, you might have created a target with a completely wrong file and arch and we might need to change things once we attach. So both should be possible and be as correct as they can be.

Out of interest the ELF for one of our kalimba variants is as follows:

ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: <unknown>: 0x72ec
Version: 0x1
Entry point address: 0x80000000
....

(Sooner or later I'd like to submit a patch to upstream lldb with core definitions for this chip. )

You could currently assume that if the ELF header contains kalimba (0x72ec), its triple is always kalimba-unknown-unknown.

Greg Clayton wrote:

So in my case, since my target's architecture is already "valid" (i.e. m_core is defined, etc.), the DoConnectRemote code doesn't consider the stub's opinion on the target. Surely in the remote/embedded case we must trust the stub's host info if supplied?

Not necessarily. A host can be "x86_64-apple-macosx", yet if we are debugging an iOS simulator app we will get "x86_64-apple-ios" as the process triple. So we really want to trust the process info over the host info.

Sorry, Greg, I didn't make my point that well. What I meant was that we should trust the information the stub returns to us in the form of the qHostInfo and qProcessInfo messages. I don't think we should be overwriting this data with data from the ELF, which I saw happening in lldb earlier on in my analysis.

I apologise for the above braindump, but my conclusion is that to get lldb to work properly for non-apple, non-linux, etc. bare-metal embedded architectures, I need to submit several patches, in particular to the gdb-remote handling logic.

With your blessing, are you happy for me to do this?

We need to watch out for certain pitfalls, but yes, this should work better than it does now and fixes are required. We just need to make sure not to regress on the desktop and remote targets with any changes we make.

Cool, thanks.

I'm happy to supply some better ObjectFileELF code in lldb. But my opinion as stated above is that the information received from the stub should *strongly* influence the specification of the architecture/OS etc. in the final Target object.

We need both to be correct. The object files often determine the triple before we run and we want to get this right in the object file. After we run, you might have created a target with a completely wrong file and arch and we might need to change things once we attach. So both should be possible and be as correct as they can be.

Yes, agreed. In an embedded debugging session, it is the information regarding the running inferior, modeled by the gdb-remote, that reflects the reality of the situation.

Incidentally, do you ever envisage a scenario where lldb could create an empty target, that is, from the command-line:

(lldb) target create
Target container created.

then using gdb-remote to map to the inferior?

for example:

(lldb) gdb-remote <port>
...
(lldb) target list
Current targets:
* target #0: (empty) ( arch=devicename-unknown-unknown, state=running, etc. )

In this scenario, clearly there would be no symbolic lookup, address/line mapping, and so on. However, disassembly (with breakpoints and instruction step), and memory/register access, would still be possible. Such a feature does have value in the embedded world. I believe that gdb can currently achieve this.

You could currently assume that if the ELF header contains kalimba (0x72ec), its triple is always kalimba-unknown-unknown.

Indeed. But it would also be nice to extract that information from the gdb-remote packets mentioned above. I'm going to try to achieve this.

thanks again
Matt


Greg Clayton wrote:

So in my case, since my target's architecture is already "valid" (i.e. m_core is defined, etc.), the DoConnectRemote code doesn't consider the stub's opinion on the target. Surely in the remote/embedded case we must trust the stub's host info if supplied?

Not necessarily. A host can be "x86_64-apple-macosx", yet if we are debugging an iOS simulator app we will get "x86_64-apple-ios" as the process triple. So we really want to trust the process info over the host info.

Sorry, Greg, I didn't make my point that well. What I meant was that we should trust the information the stub returns to us in the form of the qHostInfo and qProcessInfo messages. I don't think we should be overwriting this data with data from the ELF, which I saw happening in lldb earlier on in my analysis.

I apologise for the above braindump, but my conclusion is that to get lldb to work properly for non-apple, non-linux, etc. bare-metal embedded architectures, I need to submit several patches, in particular to the gdb-remote handling logic.

With your blessing, are you happy for me to do this?

We need to watch out for certain pitfalls, but yes, this should work better than it does now and fixes are required. We just need to make sure not to regress on the desktop and remote targets with any changes we make.

Cool, thanks.

I'm happy to supply some better ObjectFileELF code in lldb. But my opinion as stated above is that the information received from the stub should *strongly* influence the specification of the architecture/OS etc. in the final Target object.

We need both to be correct. The object files often determine the triple before we run and we want to get this right in the object file. After we run, you might have created a target with a completely wrong file and arch and we might need to change things once we attach. So both should be possible and be as correct as they can be.

Yes, agreed. In an embedded debugging session it is the information regarding the running inferior, modeled by the gdb-remote, which reflects the reality of the situation.

Incidentally, do you ever envisage a scenario where lldb could create an empty target, that is, from the command-line:

(lldb) target create
Target container created.

We could allow with an option:

(lldb) target create --empty

The lldb::SB API currently allows you to create an empty target by specifying None for the path and triple:

(lldb) script target = lldb.debugger.CreateTarget(None)

then using gdb-remote to map to the inferior?

for example:

(lldb) gdb-remote <port>

You can currently do this:

% lldb
(lldb) gdb-remote <port>

The problem is that if you already have a target, it will try to re-use the currently selected target; but if you are starting from scratch it will work, as it will create a target if there isn't one.

...
(lldb) target list
Current targets:
* target #0: (empty) ( arch=devicename-unknown-unknown, state=running, etc. )

In this scenario, clearly there would be no symbolic lookup, address/line mapping, and so on. However, disassembly (with breakpoints and instruction step), and memory/register access, would still be possible. Such a feature does have value in the embedded world. I believe that gdb can currently achieve this.

We can too. Just don't create a target and call "gdb-remote ...". This is where the world of lldb_private::Platform instances comes in. When we attach to a binary, you might have a dynamic loader that can find all loaded images (executables and shared libraries, or just a collection of ROM images), and using the UUID and the image names, your Platform might be able to locate the required files on disk. For iOS, for example, we might be asked to find a "/usr/lib/libfoo.dylib" with UUID 8AA96CBA-EFD6-31B1-AC82-75ED1CEEDE77. The PlatformRemoteiOS knows to go look in the SDK on disk for a local copy of this image, which it can find in "/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS8.0.sdk/usr/lib/libfoo.dylib". So if your embedded platform has a common way of storing files, or can use some system-wide search feature (like Spotlight on Mac OS X; I am sure Windows and Linux have similar search APIs) to locate the required files, your Platform can make it "just work".
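That lookup boils down to mapping a (path-on-target, UUID) pair to a local file. A hypothetical platform-style helper might look like this (locate_local_image and read_uuid are made-up names, not the real lldb_private::Platform API):

```python
import os

# Hypothetical platform-style image lookup: given an image's
# path-on-target and its UUID, find a matching local copy under a
# local SDK/symbols root.  locate_local_image and read_uuid are
# made-up names, not the real lldb_private::Platform API.
def locate_local_image(sdk_root, target_path, uuid, read_uuid):
    basename = os.path.basename(target_path)
    for dirpath, _dirnames, filenames in os.walk(sdk_root):
        if basename in filenames:
            candidate = os.path.join(dirpath, basename)
            # only accept a copy whose UUID matches the target's,
            # so stale local builds are rejected
            if read_uuid(candidate) == uuid:
                return candidate
    return None
```

read_uuid is injected here because how the UUID is stored (mach-o LC_UUID, an ELF .note, a sidecar file) is platform-specific.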

You could currently assume if the Elf header contains kalimba (0x72ec) that its triple is always then unknown-unknown.

Indeed. But it would also be nice to extract that information from the gdb-remote packets mentioned above. I'm going to try to achieve this.

I look forward to seeing what you come up with.