19 March, 2012

What does this address mean? (introducing Address Lookup Tool)

Have you ever encountered cryptic error messages like "Access violation at address 00515974 in module 'Project1.exe'. Write of address 00000056"? Have you wondered which source code line this error message refers to? Well, look no further...

Exception tracer solution?

Well, basically, if you use EurekaLog or any other exception tracer tool - you usually don't get just an "Access Violation at address..." error message, but a full bug report instead - including call stack and CPU state. However, if:
  • you're working with an application without an exception tracer
  • the application handles the error by itself without passing it to the exception tracer
then you will only have a simple error message, without any advanced/detailed information. If you can reproduce the problem - good: you can catch it under the debugger and analyze the situation. But if for some reason you are unable to do that, yet urgently need to find the original source code location - you're stuck with numbers like "00515974".

Address Lookup Tool

EurekaLog 7, as well as the standalone EurekaLog Tools Pack, includes an Address Lookup Tool, which can help you analyze addresses in your application and gives you more information than just a raw address. You can launch this tool via the Start menu (Programs\EurekaLog 7\Tools\Address Lookup) or via the Tools\EurekaLog\Address Lookup IDE command (for EurekaLog 7 only, not applicable to Tools Pack).

Note: there are 32-bit and 64-bit versions available. Use the tool of the corresponding bitness. The 32-bit Address Lookup will fail for 64-bit executable files and vice versa. The 64-bit Address Lookup will only be installed on a Win64 machine.
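
If you are not sure which bitness a given file has, you can check its PE header before picking the tool. The following is a minimal Python sketch, not part of EurekaLog: the names pe_bitness and read_bitness are made up for this illustration, and only the two machine types relevant here (x86 and x64) are handled.

```python
import struct

# Machine field values from the PE/COFF specification.
MACHINE_I386 = 0x014C   # 32-bit x86
MACHINE_AMD64 = 0x8664  # 64-bit x64


def pe_bitness(header: bytes) -> int:
    """Return 32 or 64 given the first bytes of an .exe, DLL or BPL file."""
    if header[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # Offset 0x3C of the DOS header stores the position of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", header, 0x3C)
    if header[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The COFF header follows the 4-byte signature; its first field is Machine.
    (machine,) = struct.unpack_from("<H", header, pe_offset + 4)
    if machine == MACHINE_I386:
        return 32
    if machine == MACHINE_AMD64:
        return 64
    raise ValueError(f"unsupported machine type: {machine:#06x}")


def read_bitness(path: str) -> int:
    """Hypothetical helper: read the start of a file and report its bitness."""
    with open(path, "rb") as f:
        # 4096 bytes is assumed to be enough to cover the headers.
        return pe_bitness(f.read(4096))
```

For example, read_bitness("Project1.exe") returning 32 would mean that the 32-bit Address Lookup is the one to use.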

Address Lookup main window
As you can see - the interface is simple. First, you need to specify your application. It can be an .exe file, a DLL or a BPL (package). Use the file which raised the error in question. If it's not mentioned in the error message - you can try to find it from the address by using other tools (see the more detailed explanation below). Of course, the executable file must match the file which raised the error. I.e. you can't use version 2 of your application to resolve addresses from version 1 - that most probably won't work, since the addresses may have changed.

Next, select the source of debug information to look up. Usually, you should leave this option in its default position - "Auto". But you can select a specific source if you need to.

Note: support for madExcept information is experimental. It may not always work properly.

So, basically, you'll need at least one source of debug information available. If you use EurekaLog or another exception tracer solution - you already have one. If not - then you should supply it manually. The easiest way to do this is to enable the Map file = Detailed option on the Linker page of the project options in the IDE (however, you'll need to rebuild your application).

The last thing to do is to enter the address itself. There is a catch though. Usually you get the full absolute address like "00515974" in the example above. This address cannot be resolved into a source code line directly. What you need is an offset from the start of the exe/DLL/BPL file. If you have an .exe file - this task is easy: just subtract $400000 from your address value and you're done:

Looking for source line from address
$00515974 - $00400000 = $00115974 - and (as you can see) this is line number 128 in Unit99.pas, which belongs to TForm99.Button1Click method. Location found.
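
The subtraction above can be sketched in a few lines of Python. This is only an illustration: the function name offset_from_message and the regular expression are assumptions made for this example, and the default base of $00400000 applies to .exe files only.

```python
import re

DEFAULT_EXE_BASE = 0x00400000  # base address assumed for .exe files

# Matches "at address XXXXXXXX", optionally followed by "in module '...'".
_MESSAGE = re.compile(
    r"at address (?P<addr>[0-9A-Fa-f]{8,16})"
    r"(?: in module '(?P<module>[^']+)')?"
)


def offset_from_message(message: str, base: int = DEFAULT_EXE_BASE):
    """Return (module name or None, offset from the start of the module)."""
    match = _MESSAGE.search(message)
    if match is None:
        raise ValueError("no address found in message")
    address = int(match.group("addr"), 16)
    if address < base:
        raise ValueError("address is below the base address")
    return match.group("module"), address - base


msg = ("Access violation at address 00515974 in module 'Project1.exe'. "
       "Write of address 00000056")
module, offset = offset_from_message(msg)
print(module, f"${offset:08X}")  # Project1.exe $00115974
```

The resulting offset is the value to enter into the tool. For a DLL or BPL, pass the real base address of the module instead of relying on the default.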

The Address Lookup Tool also supports include files:

Finding address inside .inc file
A quick explanation for the information description:
  • Provider - the name of the EurekaLog debug information provider class which was able to extract location information from your executable. If you leave the "Debug information source" option at "Auto" - this can be any of the supported classes. If you select a specific source - this will be the single class which corresponds to your choice.
  • Location - a short description of found source code location. The same information is provided below this line with more detailed view:
    • Unit - name of the unit with the code. This is the name as it appears in the "unit UnitName;" statement.
    • Source - name of the source file. Usually this is the unit name + .pas extension, but not necessarily: think about the .dpr file and .inc files. This is the file which you need to open if you want to see the source code location.
    • Class - name of the class to which the method belongs. If your address belongs to a simple function - the class name will be empty.
    • Routine - name of the routine (procedure, function or method).
    • Line - absolute line number inside "source" file.
    The values which are not present in the detailed view above are: the offset ("(00114974)"), the binary file name ("{Project102.exe}") and the absolute address ("[0B735974]"). These values do not give you any additional information and are provided only for sanity checks. Please note that the .exe file will be loaded at a base address other than $00400000 (because $00400000 is already occupied by AddressLookup.exe), so the calculated absolute address (0B735974) will not match your data.
  • Routine offset (bytes) - the difference (offset) between current address and routine's start in bytes.
  • Routine offset (lines) - the difference (offset) between the current line and the first line of the routine (in lines). This value is useless if you use include files inside the routine.
  • Line offset (bytes) - the offset of code from the start of the current line in bytes. This is applicable for complex expressions in single line.
The last 3 pieces of information can help you to track down the location if the source code was changed. For example, suppose that you have an error in version 1 of your application, and you search for the source code location using the .exe file from version 1; of course, you'll get the location in the source code for version 1. But what if you don't have the source code for version 1 of your application? Then you can still try to find the location, provided the source code wasn't changed much around your address. Just find the routine by name in the source code of version 2 and then move down from the start of the routine by the specified number of lines (or bytes) - and you'll get to the desired location.
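
The "find the routine by name and move down" step can be approximated with a small script. This is a rough Python sketch, not something the tool provides: locate_by_routine is a made-up name, and it assumes that the line offset is counted from the routine's header line in the implementation section. If the tool counts from another line (for example, from "begin"), adjust the result accordingly.

```python
import re


def locate_by_routine(source: str, routine: str, line_offset: int) -> int:
    """Return the 1-based line number located line_offset lines below the
    header of routine (for example "TForm99.Button1Click")."""
    header = re.compile(
        r"^\s*(?:class\s+)?(?:procedure|function|constructor|destructor)\s+"
        + re.escape(routine) + r"\b",
        re.IGNORECASE,  # Delphi identifiers are case-insensitive
    )
    for number, line in enumerate(source.splitlines(), start=1):
        # The first matching header wins. For a plain (non-method) function
        # this may be a forward declaration in the interface section.
        if header.match(line):
            return number + line_offset
    raise LookupError(f"routine {routine!r} not found")


sample = """unit Unit99;

implementation

procedure TForm99.Button1Click(Sender: TObject);
begin
  DoSomething;
  DoSomethingElse;
end;
"""
print(locate_by_routine(sample, "TForm99.Button1Click", 2))  # 7
```

Treat the result as a starting point only: if lines were added or removed inside the routine between versions, the real location will be a few lines away.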

Working with addresses and offsets

Each executable module (exe or DLL) can be loaded into memory at different addresses, which are called base addresses. As you can guess, the same code will have a different address if the module was loaded at a different base address. Thus, the absolute address of a code location alone is usually useless - because the base address of the executable at analysis time may differ from what it was when the error was raised.

For example, suppose there was an error in your DLL at address $6E775974 and DLL itself was loaded at address $6E660000 (which is base address for that DLL). Now, you want to find source code location for that DLL from address $6E775974. So, you load DLL and try to look for this address, but on your machine DLL is now loaded at address $642A0000 (not $6E660000 - as it was when error was raised). This means that address $6E775974 is useless - it will point to something else.

This is similar to the process address space isolation, when two processes can have different data on the same address. If you need to exchange data between processes you can't pass pointers (addresses) - you need to use relative offsets.

A similar approach is applied here too: find the offset of your code from the start of the module. In our imaginary example this is $6E775974 - $6E660000 = $00115974. And now - when you have the offset - you can find exactly the same location at any time. Just retrieve the base address of the DLL (you can use the Process Explorer tool to find it) and add the offset to it: $642A0000 + $00115974 = $643B5974 - and that is the final address of your code in the DLL as you loaded it.
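
The same arithmetic, written as a short Python sketch (to_offset and rebase are illustrative names, not part of any tool):

```python
def to_offset(absolute: int, base: int) -> int:
    """Offset of an absolute address from the base address of its module."""
    if absolute < base:
        raise ValueError("address is below the module base address")
    return absolute - base


def rebase(absolute: int, old_base: int, new_base: int) -> int:
    """Translate an address from one load of the module to another."""
    return new_base + to_offset(absolute, old_base)


offset = to_offset(0x6E775974, 0x6E660000)
print(f"${offset:08X}")  # $00115974
print(f"${rebase(0x6E775974, 0x6E660000, 0x642A0000):08X}")  # $643B5974
```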

So that is how you work with addresses and offsets. To find source code location you need either:
  • relative offset (1 value)
  • absolute address and base address (2 values)
Just absolute address alone is useless.

However, there is a quick tip: exe files are loaded at the same address in 99.99% of cases. So you can blindly assume a base address of $00400000 for .exe files.
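
If you would rather check this assumption than rely on it, the preferred base address is stored in the PE header of the file. Below is a minimal Python sketch (preferred_image_base is a made-up name). Note that it reads the base address requested at link time; the system may still load the module elsewhere, so the value is a hint rather than a guarantee.

```python
import struct

PE32_MAGIC = 0x10B       # 32-bit optional header
PE32_PLUS_MAGIC = 0x20B  # 64-bit optional header


def preferred_image_base(header: bytes) -> int:
    """Return the ImageBase field given the first bytes of a PE file."""
    if header[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # Offset 0x3C of the DOS header stores the position of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", header, 0x3C)
    if header[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    optional = pe_offset + 4 + 20
    (magic,) = struct.unpack_from("<H", header, optional)
    if magic == PE32_MAGIC:
        # PE32: ImageBase is a 4-byte field at offset 28.
        (base,) = struct.unpack_from("<I", header, optional + 28)
    elif magic == PE32_PLUS_MAGIC:
        # PE32+: ImageBase is an 8-byte field at offset 24.
        (base,) = struct.unpack_from("<Q", header, optional + 24)
    else:
        raise ValueError(f"unknown optional header magic: {magic:#06x}")
    return base
```

To use it, pass the first few kilobytes of the file, read in binary mode.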

Manual search

If you can't use the Address Lookup Tool (for example, because you have no source of debug information), then you can still get useful information, but only if you have the source code for exactly the same version of your application. To do that:
  • load your application into the IDE (restore it to the previous version, if needed);
  • launch it via the Run/Run (F9) command;
  • put your application on pause (Run/Pause);
  • use the Search/Go to address command;
  • enter the absolute address (don't forget to add the '$' prefix) into the dialog;
  • click OK.

Searching for address location manually
And you'll be moved to the source code location. If there is debug information available for that location - you will see source code in IDE code editor; otherwise you'll see a CPU debugger positioned to your location.

You can download freeware EurekaLog Tools Pack here.

3 comments:

  1. Will this work for access violations where the address specified in the exception is also the address of the EIP pointer (instruction pointer)?

    Which should mean that the EIP points to an invalid address ? I do not think this will work for those kind of access violations and thus not all addresses are probably "findable".

    If there's somehow a way to figure out what the culprit is of such access violation I'd be interested to know this.

  2. just to be more clear it's also the address being read from for example such access violation:
    "Access violation at address 4787D800. Read of address 4787D800"

  3. This note shows you how to translate a raw address into a source code location in a .pas/.dpr file. So if you have an address of an invalid location - well, you'll find that invalid location, yes.

    And this is a simple diagnostic method, a last resort measure. If you want to get more information - you should use an exception tracer tool.

    P.S. If EIP is wrong, it usually means a wrong signature of a DLL function (so you will return to a random location after calling that function) or a stack corruption issue (you overwrite the return address on the stack, so you'll jump to an arbitrary location on return). You may try to use a debugger or exception tracer, but it rarely helps - because trashing the return address usually means trashing the call stack too.

    However, if you get invalid EIP due to function's call via unassigned/invalid pointer - then debugger or exception tracer tool will be able to show you call stack.
