Inspecting Our Program

There are at least two tools you can use to inspect and debug a program. First, assemble the following program with as -g -o main.o main.s. The -g flag adds debug information to the object file, and the linker carries it over into the executable. The linker command is the same as before: ld -o main main.o

.global _start 

_start: 

    mov     r4, #2      @ load 2 into r4 
    add     r4, #6      @ add 6 to r4 

Object dump #

The first tool you can use is objdump. If you invoke it as objdump -d main, it disassembles the executable and shows the actual instructions that the assembler wrote to the file. For this simple program, the output is:

pi@raspberrypi:~/hrm$ objdump -d main

main:     file format elf32-littlearm

Disassembly of section .text:

00010054 <_start>:
   10054:	e3a04002 	mov	r4, #2
   10058:	e2844006 	add	r4, r4, #6

The file format line tells you that the file is an ELF file, in this case 32-bit little-endian ARM. If you recall, the part of the program that holds all of the instructions is the .text section, so it makes sense that the disassembly comes from there. You can then see the _start label and our two instructions, MOV and ADD.

This is helpful because the assembler does not always emit exactly the instruction you wrote. Here, for example, the shorthand add r4, #6 was written out in its full three-operand form, add r4, r4, #6. It will sometimes substitute one instruction for another for various reasons.

Machine code #

Welcome to the nitty-gritty. In the end, the processor (like computers in general) only understands 1s and 0s. In the objdump output above, the first column under the _start label is the address of each instruction. The first instruction is at 0x10054 and the next at 0x10058 because every instruction is 32 bits, or 4 bytes, long.
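As a quick check of that arithmetic, here is a minimal Python sketch. Python is used only for illustration, and the function name instruction_addresses is made up for this example.

```python
# Illustrative sketch: where consecutive 32-bit ARM instructions sit in memory.
INSTRUCTION_SIZE = 4  # bytes; each instruction in this program is 32 bits wide


def instruction_addresses(base, count):
    """Return the address of each of `count` instructions starting at `base`."""
    return [base + i * INSTRUCTION_SIZE for i in range(count)]


# 0x10054 is the address objdump reported for _start.
for addr in instruction_addresses(0x10054, 2):
    print(hex(addr))
```

The two printed values should match the addresses in the first column of the objdump output.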

The next column is the machine code for each instruction, shown in hexadecimal. Written in binary, the first instruction (e3a04002) looks like this:

11100011 10100000 01000000 00000010
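If you would rather not convert hex to binary by hand, a short Python sketch can do it. The helper name to_binary is hypothetical, and Python is used here only as a calculator.

```python
# Illustrative sketch: print a 32-bit instruction word as four groups of 8 bits.
def to_binary(word):
    """Format a 32-bit value as four space-separated bytes of binary digits."""
    bits = format(word, "032b")  # zero-padded to 32 binary digits
    return " ".join(bits[i:i + 8] for i in range(0, 32, 8))


print(to_binary(0xE3A04002))  # the mov instruction from the objdump output
print(to_binary(0xE2844006))  # the add instruction from the objdump output
```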

The processor then splits these 32 bits into the following eight fields:

1    2  3 4    5 6    7    8
1110 00 1 1101 0 0000 0100 000000000010
  1. Condition Code: 1110 = Always execute
  2. Always 00 for data processing
  3. 1 = Immediate operand at end as opposed to a register
  4. Opcode: 1101 = MOV
  5. Update CPSR: This would be 1 if we used an instruction with an S suffix (such as MOVS) or an instruction such as CMP.
  6. Rn = 1st operand register: MOV has no first operand register, so this is 0000
  7. Rd = Destination register: 0100 = 4
  8. Operand2 = This was explained in the very beginning. Because the flag in field 3 is set, this is an immediate value. 0b10 = 2
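The field list above can be sketched as a small decoder. This is an illustrative Python sketch, not a full disassembler: the function name, the dictionary keys, and the two-entry opcode table are assumptions made for this example, and it only handles the data-processing layout described above.

```python
# Illustrative sketch: split a 32-bit ARM data-processing instruction into
# the eight fields listed above. Names here are made up for this example.
OPCODES = {0b1101: "MOV", 0b0100: "ADD"}  # partial table: only what this page uses


def decode_data_processing(word):
    """Return the eight fields of a data-processing instruction word."""
    return {
        "cond":      (word >> 28) & 0xF,   # 1. condition code
        "class":     (word >> 26) & 0x3,   # 2. 00 for data processing
        "immediate": (word >> 25) & 0x1,   # 3. 1 = Operand2 is an immediate
        "opcode":    (word >> 21) & 0xF,   # 4. operation
        "set_flags": (word >> 20) & 0x1,   # 5. update CPSR
        "rn":        (word >> 16) & 0xF,   # 6. first operand register
        "rd":        (word >> 12) & 0xF,   # 7. destination register
        "operand2":  word & 0xFFF,         # 8. second operand
    }


for word in (0xE3A04002, 0xE2844006):  # the two words from the objdump output
    fields = decode_data_processing(word)
    name = OPCODES.get(fields["opcode"], "?")
    print(name, "rd = r%d" % fields["rd"], "operand2 =", fields["operand2"])
```

Running it on both words from the objdump output shows that the add instruction uses the same layout, with r4 in both the Rn and Rd fields.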

If you want to deep dive into decoding instructions, they are all explained in the manual.

GNU Debugger #

The machine code is great for static analysis, but what if you want to inspect the program while it is running? The GNU Debugger (GDB) is a perfect tool for this. There are many tutorials on how to use it, as it has been in use for almost 35 years. Below is a quick overview, but I would highly suggest reading the documentation or looking at a detailed cheatsheet.

Running a program #

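To start GDB, just invoke gdb application_name (here, gdb main). After it loads, you will see a very uneventful (gdb) prompt. The start command runs the program and pauses at the beginning of main; if you would like to pass any arguments, add them after start (e.g., start arg1 arg2). Our program only defines _start and has no main, so for it you can either set a breakpoint with b _start and then run, or use starti (available in GDB 8.1 and later) to pause on the very first instruction. If you have set breakpoints, r or run will run straight to the first one instead of pausing at the start. The q command will exit GDB.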

Setting breakpoints #

If you want to stop on a label, you can simply type break labelname or b labelname. If you’re trying to set a breakpoint on a certain line, you can use the l (list) command to print ten lines of the source at a time, then set the breakpoint by typing b lineno. To remove a breakpoint, use clear with the same argument you used to set it (clear labelname or clear lineno). Alternatively, d (delete) removes a breakpoint by its number, which info breakpoints will show you.

Leaving breakpoints #

After the program is stopped, you can:

  • c: continues running until the next breakpoint
  • s: runs the next source line, which in an assembly file is a single instruction (steps into function calls)
  • n: runs the next source line (steps over function calls)

Reading data #

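There are two primary ways to look at the values your program is using. The first is to examine memory with the x command. For example, x/4xw $sp prints four words in hex, starting at the address held in the stack pointer.

The other way is to look at variable values, or in assembly the register values, using print. For example, p $r4 prints the value in r4, and info registers lists every register at once.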