July 4, 2021

Debugging with gdb

Adapted from a post to edk2-devel by Laszlo Ersek.

  1. Clone Andrei Warkentin's repo https://github.com/andreiw/andreiw-wip.git

    git clone https://github.com/andreiw/andreiw-wip.git
    cd andreiw-wip
    
  2. From the root of your clone copy the subdirectory "uefi/DebugPkg" to your edk2 workspace:

    cp -av uefi/DebugPkg $WORKSPACE
    
  3. Apply the following patch to OvmfPkg/OvmfPkgX64.dsc:

    -----------------
    diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc
    index 97fdedb..794521b 100644
    --- a/OvmfPkg/OvmfPkgX64.dsc
    +++ b/OvmfPkg/OvmfPkgX64.dsc
    @@ -592,3 +592,4 @@
     !endif
    
       OvmfPkg/PlatformDxe/Platform.inf
    +  DebugPkg/GdbSyms/GdbSyms.inf
    -----------------
    

Note that we don't modify the FDF file.

Add the following to Initialize() in DebugPkg/GdbSyms/GdbSyms.c (taken from https://lists.01.org/pipermail/edk2-devel/2018-April/024156.html):

    DEBUG ((DEBUG_VERBOSE, "%a: %llx\n", __FUNCTION__, &ESTP));
    DEBUG ((DEBUG_VERBOSE, "%a: %llx\n", __FUNCTION__, &EDIITH));
    DEBUG ((DEBUG_VERBOSE, "%a: %llx\n", __FUNCTION__, &EIDH));
    DEBUG ((DEBUG_VERBOSE, "%a: %llx\n", __FUNCTION__, &EIOHU));
    DEBUG ((DEBUG_VERBOSE, "%a: %llx\n", __FUNCTION__, &EIDDE));
    DEBUG ((DEBUG_VERBOSE, "%a: %llx\n", __FUNCTION__, &EIDCNE));
    DEBUG ((DEBUG_VERBOSE, "%a: %llx\n", __FUNCTION__, &EIDCRE));
    DEBUG ((DEBUG_VERBOSE, "%a: %llx\n", __FUNCTION__, &EIDCME));

  4. Build OVMF (X64) as usual. Don't clean the build directory.

  5. Start OVMF in qemu. Pass the "-s" option to qemu (shorthand for "-gdb tcp::1234").

  6. Start gdb in another terminal, and issue the following commands:

    (gdb) set architecture i386:x86-64:intel
    (gdb) target remote localhost:1234
    (gdb) source [WORKSPACE]/DebugPkg/Scripts/gdb_uefi.py
    (gdb) reload-uefi -o [WORKSPACE]/Build/OvmfX64/DEBUG_GCC48/X64/DebugPkg/GdbSyms/GdbSyms/DEBUG/GdbSyms.dll
    
  7. Issue "bt" at the gdb prompt:

    (gdb) bt
    

You'll then get a stack trace, but the local variables cannot be displayed. Each will be reported as:

    <error reading variable: can't compute CFA for this frame>

To fix this, we must understand the above procedure, so here is an explanation (hand-wavy in places, sorry).

  (a) Andrei's DebugPkg/GdbSyms module does nothing at all. We don't even include it in the final firmware image (see the note about the FDF in (3)).

We need it because it uses the following types:

EFI_SYSTEM_TABLE_POINTER
EFI_DEBUG_IMAGE_INFO_TABLE_HEADER
EFI_IMAGE_DOS_HEADER
EFI_IMAGE_OPTIONAL_HEADER_UNION
EFI_IMAGE_DEBUG_DIRECTORY_ENTRY
EFI_IMAGE_DEBUG_CODEVIEW_NB10_ENTRY
EFI_IMAGE_DEBUG_CODEVIEW_RSDS_ENTRY
EFI_IMAGE_DEBUG_CODEVIEW_MTOC_ENTRY

When we build the tree in (4), the layout of the above structures (names, fields, offsets -- basically the type definitions) will be generated into the GdbSyms.dll file, as debug symbols.
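To make "layout" concrete, here is a minimal Python sketch of the kind of information the debug symbols give gdb for one of these types. It models EFI_IMAGE_DOS_HEADER with ctypes. The field list is assumed from the standard DOS/PE header layout rather than read out of GdbSyms.dll, and the class name EfiImageDosHeader and the helper field_offsets are made up for this illustration.

```python
import ctypes


class EfiImageDosHeader(ctypes.Structure):
    """Assumed layout of EFI_IMAGE_DOS_HEADER (standard DOS/PE header)."""
    _fields_ = [
        ("e_magic", ctypes.c_uint16),      # "MZ" magic number
        ("e_cblp", ctypes.c_uint16),
        ("e_cp", ctypes.c_uint16),
        ("e_crlc", ctypes.c_uint16),
        ("e_cparhdr", ctypes.c_uint16),
        ("e_minalloc", ctypes.c_uint16),
        ("e_maxalloc", ctypes.c_uint16),
        ("e_ss", ctypes.c_uint16),
        ("e_sp", ctypes.c_uint16),
        ("e_csum", ctypes.c_uint16),
        ("e_ip", ctypes.c_uint16),
        ("e_cs", ctypes.c_uint16),
        ("e_lfarlc", ctypes.c_uint16),
        ("e_ovno", ctypes.c_uint16),
        ("e_res", ctypes.c_uint16 * 4),
        ("e_oemid", ctypes.c_uint16),
        ("e_oeminfo", ctypes.c_uint16),
        ("e_res2", ctypes.c_uint16 * 10),
        ("e_lfanew", ctypes.c_uint32),     # file offset of the PE header
    ]


def field_offsets(struct_type):
    """Map each field name to its byte offset within the structure."""
    return {field[0]: getattr(struct_type, field[0]).offset
            for field in struct_type._fields_}


if __name__ == "__main__":
    for name, offset in field_offsets(EfiImageDosHeader).items():
        print("%-12s offset 0x%02x" % (name, offset))
```

Names, field types and offsets like these are what gdb needs before it can interpret raw guest memory as one of the structures above.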

  (b) The "DebugPkg/Scripts/gdb_uefi.py" script is a gdb extension (a new command), written in Python. It runs inside gdb and can drive gdb programmatically. It can peek and poke at memory and cast pointers, i.e. do whatever you can do at the gdb prompt, but algorithmically.

The script defines a new command, "reload-uefi". "reload-uefi" does the following:

  (b1) It loads the type information for the structures listed under (a), from the GdbSyms.dll file. GdbSyms.dll is used for nothing else.

  (b2) It locates the EFI system table in (guest) memory, looking at each multiple of 4MB, checking for the signature and verifying the CRC.

  (b3) When the system table is found, it follows a long chain of pointers and structures (relying heavily on the struct types from (a)) until it lands on the list of loaded EFI images.

  (b4) It iterates over the list of loaded EFI images. For each image, it reads the header that identifies the "debug symbols file" for the image, makes gdb load that symbol file, and asks gdb to "foist" that symbol file on the base address of the loaded image in question. The "debug symbols file" is just the dll file in the Build directory. E.g. for "QemuVideoDxe.efi", it is "QemuVideoDxe.dll" (with full, absolute path under $WORKSPACE of course).
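The scan for the system table can be sketched in a few lines of Python. This is an illustration, not Andrei's code: it searches a plain byte buffer instead of guest memory read through gdb, and it assumes that EFI_SYSTEM_TABLE_POINTER is a 24-byte structure (Signature, EfiSystemTableBase, Crc32, padding) whose CRC is computed with the Crc32 field set to zero. The function names make_estp and find_system_table are hypothetical.

```python
import binascii
import struct

# "IBI SYST" read as a little-endian 64-bit integer.
EST_SIGNATURE = 0x5453595320494249

# Assumed layout: UINT64 Signature, UINT64 EfiSystemTableBase,
# UINT32 Crc32, then 4 bytes of padding.
ESTP_FORMAT = "<QQI4x"
ESTP_SIZE = struct.calcsize(ESTP_FORMAT)


def make_estp(table_base):
    """Build a valid pointer structure (used here only to create test data)."""
    zeroed = struct.pack(ESTP_FORMAT, EST_SIGNATURE, table_base, 0)
    crc = binascii.crc32(zeroed) & 0xFFFFFFFF
    return struct.pack(ESTP_FORMAT, EST_SIGNATURE, table_base, crc)


def find_system_table(memory, stride=4 * 1024 * 1024):
    """Return the system table address found in `memory`, or None.

    Only addresses that are multiples of `stride` are examined.
    """
    for address in range(0, len(memory) - ESTP_SIZE + 1, stride):
        signature, table_base, crc = struct.unpack_from(
            ESTP_FORMAT, memory, address)
        if signature != EST_SIGNATURE:
            continue
        # Recompute the CRC over the structure with the Crc32 field zeroed.
        zeroed = struct.pack(ESTP_FORMAT, signature, table_base, 0)
        if binascii.crc32(zeroed) & 0xFFFFFFFF == crc:
            return table_base
    return None
```

The stride parameter exists only so the sketch can be exercised on a small buffer. With the default it examines each multiple of 4MB, as described in (b2).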

  (c) The "bt" command in (7) should just work, but it doesn't quite. For local variables to be displayed, the .eh_frame section must be present in the DLL files (e.g. QemuVideoDxe.dll), but it is absent.

https://refspecs.linuxbase.org/LSB_4.1.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html

It is absent for a reason: GenFw cannot cope with (relocation entries for) .eh_frame sections. We've discussed this earlier on the list, but for illustration it is simplest to look at the build log. I'm removing the longest common prefix from pathnames, for readability:

    "ld" -o QemuVideoDxe.dll ... \
    --script=.../BaseTools/Scripts/gcc4.4-ld-script ...

In this step, QemuVideoDxe.dll is created. The referenced linker script contains:

    .eh_frame ALIGN(0x20) :
    {
      KEEP (*(.eh_frame))
    }

so the DLL does contain an .eh_frame section at this point.

    "objcopy"  QemuVideoDxe.dll

I'm not sure why this step is useful; I think this command does nothing.

    cp -f QemuVideoDxe.dll QemuVideoDxe.debug

Aha! The DLL, with the .eh_frame section, is saved as "QemuVideoDxe.debug".

    objcopy --strip-unneeded -R .eh_frame QemuVideoDxe.dll

The .eh_frame section is hereby removed from the DLL.

    objcopy --add-gnu-debuglink=QemuVideoDxe.debug QemuVideoDxe.dll

This adds a .gnu_debuglink section to the DLL, pointing at the "QemuVideoDxe.debug" file. (Only the last pathname component, i.e. the filename, is saved in that section.)

This might be helpful for other debugging solutions, but it doesn't seem to help with qemu+gdb at all.

    "GenFw" -e UEFI_DRIVER -o QemuVideoDxe.efi QemuVideoDxe.dll

This is where GenFw converts the DLL to an EFI executable.

If we had the .eh_frame section still present in the DLL, this command would fail, because GenFw doesn't know how to handle .eh_frame.

(More precisely: it doesn't know how to handle relocations for .eh_frame, i.e. '.rela.eh_frame'. See the discussion at http://thread.gmane.org/gmane.comp.bios.tianocore.devel/3796/focus=3800 .)

Because the build process has removed the .eh_frame (and the referrer .rela.eh_frame) sections, GenFw succeeds. But, as a consequence, the DLL file will also be unusable for printing local variables, because it has no (or insufficient) call frame information.

  (d) The solution (or at least one solution) is to update Andrei's gdb script. Rather than loading the DLL file for the EFI file (see (b4)), let's load the .debug file directly. The .debug file is the original version of the DLL file, i.e. from before the stripping.

Hence, after step (2), before step (3), the following patch should be applied:

-----------------
--- DebugPkg/Scripts/gdb_uefi.py.orig   2014-06-18 22:36:42.131171623 +0200
+++ DebugPkg/Scripts/gdb_uefi.py        2014-06-18 23:24:56.086566783 +0200
@@ -18,6 +18,7 @@
 import array
 import getopt
 import binascii
+import re

 __license__ = "BSD"
 __version = "1.0.0"
@@ -243,6 +244,7 @@
             base = base + opt['SizeOfHeaders']
         if sym_name != self.EINVAL:
             sym_name = sym_name.cast (self.ptype('CHAR8')).string ()
+            sym_name = re.sub(r"\.dll$", ".debug", sym_name)
             syms.append ("add-symbol-file %s 0x%x" % \
                              (sym_name,
                               long (base)))
-----------------

And then local variables are displayed too.
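The effect of the added re.sub line can be checked outside gdb. The sketch below reuses the same regular expression and the same "add-symbol-file" format string as the patch. The function names and the example path and base address are hypothetical.

```python
import re


def debug_symbol_path(sym_name):
    """Swap a trailing ".dll" for ".debug"; other names pass through unchanged."""
    return re.sub(r"\.dll$", ".debug", sym_name)


def add_symbol_file_command(sym_name, base):
    """Build the gdb command that loads the unstripped symbols at `base`."""
    return "add-symbol-file %s 0x%x" % (debug_symbol_path(sym_name), base)


if __name__ == "__main__":
    # Hypothetical image path and load address, for illustration only.
    print(add_symbol_file_command("/work/edk2/Build/QemuVideoDxe.dll",
                                  0x7E4B000))
```

Because the pattern is anchored with "$", only a trailing ".dll" is rewritten, so a ".dll" elsewhere in the path is left alone.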

© Rebecca Cran 2022