The NSMB Hacking Domain » Some technical asm hacking questions


Luca91

Posted on 07-16-14, 12:45 pm


Hi,
I've just read "How ASM hacks are setup" and it gave me a good idea of how ASM hacks work (thank you Dirbaio!!).
Since I like to understand these things in detail, I'd like to ask some questions.
1) I'd like to know how the final .bin is injected into the ROM file. My thinking: wouldn't replacing or adding code in the game's functions mess up the function pointers? Also, if my code is longer than the original function's code, wouldn't the next function be overwritten by my custom code?
2) Can someone describe the injection process in detail?

Thank you very much for the help. Please add as many details as you can, I like low-level stuff so much

Dirbaio

Posted on 07-16-14, 08:32 pm (rev. 1 by Dirbaio on 07-16-14, 08:37 pm)


These are pretty good questions.

Yes, it's impossible to "move" code in RAM, because you'd have to fix all the jumps and pointers in the code, which is pretty hard to do reliably. This means we can't replace a function in place with a bigger one, and inserting hook code in the middle of existing code is pretty much impossible.

So, this is how I solved this problem:

The ARM9 binary is divided into 'sections' that go to different places in RAM. There's a small piece of code that executes right when the game boots; it decompresses the rest of the ARM9 binary and copies the sections to their right places in RAM according to a table.

Then there are overlays. These are placed in RAM after the ARM9 sections. They're loaded on demand, but they can simply be viewed as more sections.

Then the rest of the RAM is reserved for the 'arena', which is used for dynamic memory by the game.

For example, in the NSMB USA ROM, 0x02000000-0x020986E0 is used for the ARM9 binary, 0x020986E0-0x021901E0 is used for overlays, and the rest is the arena.
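To make the layout concrete, here is a small sketch that classifies an address using the ranges quoted above. The function name and the 4 MB end of main RAM are assumptions for illustration; only the three boundary addresses come from the post.

```python
# Boundaries quoted in the post for the NSMB USA ROM (end addresses exclusive).
ARM9_START = 0x02000000
OVERLAYS_START = 0x020986E0
ARENA_START = 0x021901E0
MAIN_RAM_END = 0x02400000  # assumption: 4 MB of DS main RAM starting at 0x02000000


def region_of(address: int) -> str:
    """Return which region of main RAM an address falls into (hypothetical helper)."""
    if not ARM9_START <= address < MAIN_RAM_END:
        return "outside main RAM"
    if address < OVERLAYS_START:
        return "arm9 binary"
    if address < ARENA_START:
        return "overlays"
    return "arena"


if __name__ == "__main__":
    for addr in (0x02065F10, 0x020A0000, 0x02200000):
        print(f"0x{addr:08X}: {region_of(addr)}")
```

The real arena probably does not run all the way to the end of main RAM, so treat the last region as "arena or whatever follows it".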

So, the easiest way to add more code is to add a new section after the overlays, and change the arena start address to make space for it. This is what NSMBe does: all your code gets compiled to an .elf file, which is then converted to .bin and inserted into the binary as a new section as-is.

(Yes, we're taking away space from the arena here, so the game will run out of memory if the code we insert is too big. However, NSMB doesn't seem to be very memory-heavy: I've been able to insert as much as 100 KB of code without the game crashing. I don't know where the limit is, but it seems high enough for our purposes.)
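The address arithmetic behind "add a section and move the arena start" can be sketched like this. It is not NSMBe's actual code: the helper names and the 4-byte alignment are assumptions, and only the original arena start comes from the post.

```python
ARENA_START = 0x021901E0  # original arena start quoted in the post (NSMB USA)


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def place_new_section(code_size: int, arena_start: int = ARENA_START,
                      alignment: int = 4):
    """Return (section_address, new_arena_start) for code_size bytes of new code.

    The 4-byte alignment is an assumption; ARM instructions need at least that.
    """
    section_address = align_up(arena_start, alignment)
    new_arena_start = align_up(section_address + code_size, alignment)
    return section_address, new_arena_start


if __name__ == "__main__":
    section, arena = place_new_section(100 * 1024)  # the ~100 KB case from the post
    print(f"new section at 0x{section:08X}, arena now starts at 0x{arena:08X}")
```

Everything between the two returned addresses is space the arena no longer has, which is why very large code blobs can make the game run out of memory.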

Then, to make hooks and function replacements work, NSMBe reads the function name list from your code and patches some instructions in the NSMB binary:

To make function replacements (nsub_02xx) work, it just replaces the first instruction of the original function with a jump (a B instruction in ARM ASM) to our new function. The rest of the old function code stays there, but it will never get executed because all calls to that function will hit the jump and go to our new function.
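For the curious, this is roughly what "replace the first instruction with a B" means at the byte level. The sketch uses the standard ARM encoding of an unconditional B; the helper names and the idea of patching a RAM image held in a bytearray are assumptions for illustration, not NSMBe's code.

```python
import struct

RAM_BASE = 0x02000000  # start of DS main RAM


def encode_branch(source: int, target: int) -> int:
    """Encode an unconditional ARM B placed at `source` that jumps to `target`."""
    offset = target - (source + 8)  # in ARM state, PC reads as the instruction address + 8
    if offset % 4:
        raise ValueError("ARM branch targets must be word-aligned")
    if not -(1 << 25) <= offset < (1 << 25):
        raise ValueError("target is out of range for B (about +/- 32 MB)")
    return 0xEA000000 | ((offset >> 2) & 0x00FFFFFF)


def patch_function_start(ram: bytearray, old_function: int, new_function: int) -> None:
    """Overwrite the first instruction of old_function with a B to new_function."""
    word = encode_branch(old_function, new_function)
    struct.pack_into("<I", ram, old_function - RAM_BASE, word)  # DS is little-endian


if __name__ == "__main__":
    # Hypothetical addresses: an original function and its replacement.
    print(f"{encode_branch(0x02000000, 0x021901E0):08X}")
```

Only four bytes of the original function change; the rest of its body stays in place as dead code.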

Hooks are a bit more complicated. The editor takes the instruction at the hooked address and replaces it with a B instruction. That B instruction points to a small piece of code that contains the replaced instruction (so it isn't lost and the original code doesn't break), then saves all registers to the stack, calls the hook function, pops all registers from the stack, and finally jumps back to the next instruction of the original code.

All this weirdness is to avoid crashes if the hook function overwrites any registers. It sucks because then you can't easily write a hook that modifies the value of a register, which is sometimes useful.
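A sketch of what such a hook stub could look like as raw instruction words, following the order described above. The exact stub NSMBe generates may differ (for example in which registers it saves); the register list, function names and addresses here are assumptions, while the B/BL and push/pop encodings are standard ARM.

```python
PUSH_ALL = 0xE92D5FFF  # stmfd sp!, {r0-r12, lr}
POP_ALL = 0xE8BD5FFF   # ldmfd sp!, {r0-r12, lr}


def encode_branch(source: int, target: int, link: bool = False) -> int:
    """Encode an ARM B (or BL when link=True) at `source` that jumps to `target`."""
    offset = target - (source + 8)  # PC reads as the instruction address + 8
    if offset % 4 or not -(1 << 25) <= offset < (1 << 25):
        raise ValueError("unaligned or out-of-range branch target")
    opcode = 0xEB000000 if link else 0xEA000000
    return opcode | ((offset >> 2) & 0x00FFFFFF)


def build_hook(hook_address: int, stub_address: int, hook_function: int,
               replaced_instruction: int):
    """Return (patch_word, stub_words) for a hook at hook_address.

    patch_word goes at hook_address; stub_words go at stub_address.
    Caveat: a replaced instruction that depends on PC will not behave
    the same once it runs from the stub.
    """
    stub = [
        replaced_instruction,                                    # keep the displaced instruction
        PUSH_ALL,                                                # save registers
        encode_branch(stub_address + 8, hook_function, link=True),  # call the hook
        POP_ALL,                                                 # restore registers
        encode_branch(stub_address + 16, hook_address + 4),      # back to the original code
    ]
    return encode_branch(hook_address, stub_address), stub


if __name__ == "__main__":
    patch, stub = build_hook(0x02000000, 0x021901E0, 0x02190200, 0xE1A00000)
    print(f"patch: {patch:08X}")
    print("stub: ", " ".join(f"{w:08X}" for w in stub))
```

Because the stub restores every register right after the hook returns, anything the hook writes to a register is thrown away, which is the limitation mentioned above.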

So, that's it, that's how the injection works.

Hope you understand it more now.

You can also look at NSMBe's source code if you're curious. These are the two most interesting files:

Arm9BinaryHandler.cs handles decompressing and adding sections to arm9.bin

PatchMaker.cs handles reading the result of the compilation, making the patches and adding the section to the arm9.bin.

Luca91

Posted on 07-16-14, 09:17 pm (rev. 1 by Luca91 on 07-16-14, 09:18 pm)


Posted by Dirbaio

These are pretty good questions.

Thank you Dirbaio

Also thanks for your explanation, you were very clear!

Posted by Dirbaio

For example, in the NSMB USA ROM, 0x02000000-0x020986E0 is used for the ARM9 binary, 0x020986E0-0x021901E0 is used for overlays, and the rest is the arena.

Uhm mate, are you sure those addresses are right? I'm asking because I saw that the address in your arenaoffs is 02065F10.

Posted by Dirbaio

To make function replacements (nsub_02xx) work, it just replaces the first instruction of the original function with a jump (a B instruction in ARM ASM) to our new function. The rest of the old function code stays there, but it will never get executed because all calls to that function will hit the jump and go to our new function.

What happens when the new function ends? Is there a jump back to the end of the original function to continue the original code flow?

Well, this is almost the same as injecting into a PE32 executable using a code cave (except for the stack pushing/popping), right?

Still many thanks,
these ROM hacking ideas are awesome, I always wanted to get into ROM hacking :')

Dirbaio

Posted on 07-16-14, 09:39 pm


Posted by Luca91

Uhm mate, are you sure those addresses are right? I'm asking because I saw that the address in your arenaoffs is 02065F10.

Yes. If you look at NSMB Overlay list you'll see the overlay offsets.

The arenaoffs.txt file doesn't contain the starting address of the arena. It contains the address of a value that holds the starting address of the arena. That value is part of a function that initializes the arena when the game boots. If you change the value at that address, you're basically changing the arena start address, which is what NSMBe does.

If you look at that address (02065F10) in the RAM viewer, you'll see the value 0x021901E0 (or something slightly higher if you have inserted code!)
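The indirection is easy to see in a few lines: 02065F10 is where the arena start is stored, not the arena start itself. This sketch works on a RAM image in memory; the helper names and the fake dump in the demo are made up, while the two addresses are the ones mentioned in this thread.

```python
import struct

RAM_BASE = 0x02000000             # start of DS main RAM
ARENA_START_POINTER = 0x02065F10  # the address from arenaoffs.txt (NSMB USA)


def read_u32(ram: bytes, address: int) -> int:
    """Read a little-endian 32-bit value at a RAM address."""
    return struct.unpack_from("<I", ram, address - RAM_BASE)[0]


def write_u32(ram: bytearray, address: int, value: int) -> None:
    """Write a little-endian 32-bit value at a RAM address."""
    struct.pack_into("<I", ram, address - RAM_BASE, value)


def move_arena_start(ram: bytearray, extra_bytes: int) -> int:
    """Push the arena start forward by extra_bytes and return the old start."""
    old_start = read_u32(ram, ARENA_START_POINTER)
    write_u32(ram, ARENA_START_POINTER, old_start + extra_bytes)
    return old_start


if __name__ == "__main__":
    # Fake dump for the demo: zeroes, plus the value the post says is stored there.
    ram = bytearray(0x66000)
    write_u32(ram, ARENA_START_POINTER, 0x021901E0)
    old = move_arena_start(ram, 0x1000)
    print(f"old 0x{old:08X} -> new 0x{read_u32(ram, ARENA_START_POINTER):08X}")
```

With a real dump you would load the file into the bytearray instead of building a fake one.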

The reason that address is in a separate file you have to give to NSMBe when patching is that it changes from game to game. If you want to hack a different game, you can put a RAM dump of it in IDA Pro, search for that function, put the new offset in the file and you can use NSMBe to insert code into another game! (Yes, it works in all Nintendo DS games built with the official devkit, at least all the ones I tried).

Posted by Luca91

What happens when the new function ends? Is there a jump back to the end of the original function to continue the original code flow?

First let me explain how a "normal" function call works.

When you want to call a function, you use the instruction BL (Branch with Link). What this does is it saves the address of the next instruction into the register R14, also called LR (Link Register), and then jumps to the function you want to call. (The address it saves is called the return address).

The function you called then does its thing, and then does a "return" by executing the instruction "BX LR". BX means "Branch and Exchange": it jumps to the address held in a register instead of a predefined address. So it returns to the return address, which is right after the BL instruction, and execution of the caller function continues.

(Some functions also push LR to the stack and then pop that value into PC at the end. PC is the Program Counter register, which tracks the instruction being executed, so writing to it basically does a jump. It's pretty much the same.)

So, back to the function replacement thing. It works exactly the same way, except there's a B instruction in the middle. Some code from NSMB calls the replaced function with a BL instruction. Then the replaced function jumps to our replacement function with the B instruction NSMBe patched in. Our function does whatever it wants and then returns using BX LR. This will jump right back to the original NSMB code that called the replaced function.

So you see, to replace a function we have to do very little!
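Here is a toy model of that control flow, just to show why nothing special is needed on return. It is not an ARM emulator: the tiny "instruction set", the addresses and the function name are all invented for the illustration, and only BL, B and BX LR behave like their real counterparts.

```python
def run(program: dict, entry: int, max_steps: int = 32) -> list:
    """Follow BL / B / BX LR through a toy program and return the visited addresses."""
    pc, lr, visited = entry, None, []
    for _ in range(max_steps):
        op, target = program[pc]
        visited.append(pc)
        if op == "BL":        # call: remember the return address, then jump
            lr, pc = pc + 4, target
        elif op == "B":       # plain jump: LR is left untouched
            pc = target
        elif op == "BX_LR":   # return: jump to whatever LR holds
            pc = lr
        elif op == "HALT":
            break
        else:                 # any other instruction just falls through
            pc += 4
    return visited


PROGRAM = {
    0x100: ("BL", 0x200),    # NSMB code calls the original function
    0x104: ("HALT", None),   # execution should end up back here
    0x200: ("B", 0x300),     # patched first instruction of the original function
    0x204: ("OLD", None),    # rest of the original function, now dead code
    0x300: ("NEW", None),    # body of our replacement function
    0x304: ("BX_LR", None),  # replacement returns straight to the caller
}

if __name__ == "__main__":
    print([hex(address) for address in run(PROGRAM, 0x100)])
```

The B in the middle never touches LR, so the replacement's BX LR lands on the instruction after the original BL, and the old function body at 0x204 is never visited.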

Note that we don't have to worry about overwriting registers or anything, because there's a thing called a "calling convention". It defines how one function calls another: which registers a function is allowed to modify when it's called, how parameters and return values are passed, etc. Both NSMB's code (compiled with Nintendo's compiler) and our code (compiled with GCC) respect the standard ARM calling convention, so it just works, and we can even pass parameters and return values.

Posted by Luca91

Well, this is almost the same as injecting into a PE32 executable using a code cave (except for the stack pushing/popping), right?

I dunno what's that. But uh, I guess yes? lol

Luca91

Posted on 07-16-14, 10:15 pm (rev. 3 by Luca91 on 07-16-14, 10:17 pm)


This helped me a lot to understand the process in depth!
Last question, just to be sure I've understood everything correctly: all functions in NSMB are called using a BL instruction, right? That's essentially what's needed for function replacements to work (I mean, so that BX LR jumps to the correct address).

Posted by Dirbaio

I dunno what's that. But uh, I guess yes? lol

When I started reverse engineering PE executables years ago, I found the entry point, replaced the next instruction with a jump to some free space, rewrote the replaced instruction there, then called MessageBoxA (just for testing), and then jumped back to the line just after the original instruction (the one I replaced with the jump instruction) to continue the code flow. Hope this makes sense

Dirbaio

Posted on 07-16-14, 10:36 pm


Posted by Luca91

Last question, just to be sure I've understood everything correctly: all functions in NSMB are called using a BL instruction, right? That's essentially what's needed for function replacements to work (I mean, so that BX LR jumps to the correct address).

Yes, that's how function calling in ASM is done. You'll find that in *any* ARM ASM code, not just NSMB or DS games.
