Pages: 1
Posted on 08-09-11, 12:31 am
☭ coffee and cream


Karma: 10415
Posts: 109/2768
Since: 06-26-11
So basically, I'm trying to code a SNES emulator for the DS. The SNES runs at 2.68MHz and the DS's ARM9 runs at 66MHz (plus SPC emulation will be done by the ARM7), so one would think the DS is powerful enough.

But in practice, something as silly as a simple ldr/str copy loop running 128 times, performed 192 times per frame (for a total of 24576 iterations per frame), takes up ~40% of the ARM9's power.

Here is the loop:
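drawline_mode1:
    ldr r3, =PPU_CGRAM      @ r3 = source (SNES palette RAM)
    mov r2, #0x200          @ r2 = byte offset; 0x200 bytes = 128 words
lol1:
    subs r2, r2, #4         @ step back one word; sets Z when the offset hits 0
    ldr r0, [r3, r2]        @ load one word from CGRAM
    str r0, [r1, r2]        @ store it to the destination in r1 (set up before this snippet)
    bne lol1                @ loop until offset 0 has been copied
    ldmia sp!, {lr}         @ restore lr (pushed earlier, not shown)
    bx lr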
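
To put the ~40% figure in perspective, here is a rough cycle budget for the loop above, written as a small C sketch. The 60 frames per second value is an assumption (the post doesn't state a frame rate), and the helper names are made up for illustration.

```c
#include <stdint.h>

/* Figures from the post, except FRAMES_PER_SEC, which is an assumption. */
enum {
    ARM9_HZ         = 66000000, /* ARM9 clock as quoted in the post */
    FRAMES_PER_SEC  = 60,       /* assumed frame rate */
    WORDS_PER_LINE  = 128,      /* 0x200 bytes / 4 bytes per word */
    LINES_PER_FRAME = 192       /* calls to the loop per frame */
};

/* Cycles the ARM9 has available in one frame. */
static uint32_t cycles_per_frame(void)
{
    return ARM9_HZ / FRAMES_PER_SEC;
}

/* Total ldr/str iterations executed per frame. */
static uint32_t iterations_per_frame(void)
{
    return WORDS_PER_LINE * LINES_PER_FRAME;
}

/* Average cycles spent per iteration if the loop eats `percent` of a frame
   (integer division, so the result is rounded down). */
static uint32_t cycles_per_iteration(uint32_t percent)
{
    uint64_t used = (uint64_t)cycles_per_frame() * percent / 100;
    return (uint32_t)(used / iterations_per_frame());
}
```

Under these assumptions, 40% of a frame works out to roughly 17 to 18 cycles per iteration of a four-instruction loop. If the counters can be trusted at all, that would point at slow memory accesses rather than at the number of instructions.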


The CPU part of the emulator isn't what is taking much power with my current test ROM. When the call to PPU_DrawLine (which runs the loop above) is commented out, ARM9 CPU usage goes down to literally 0%.

The worst part is that the final PPU code needs to be drawing 4 BGs and 128 OBJs and handling various control/priority/etc bits, not just copying stuff from CGRAM to the framebuffer. How much CPU power is that gonna take?


On the other hand, Dirbaio's Fireworlds is running a shitload of stuff, and isn't that hard on CPU...


So yeah. Is the DS really that underpowered, or am I just doing something wrong? (like relying on desmume's counters)
_________________________
Kuribo64 - RH-fucking-cafe - Kafuka

zrghij
Posted on 08-09-11, 06:23 pm
Super Mario
( ͡° ͜ʖ ͡°)

Karma: 10010
Posts: 577/4457
Since: 06-08-11
128*192*4 bytes = 96 KB is A LOT of data to copy each frame.
If you used DMA it'd be a little faster, but it's still A LOT.
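
The arithmetic behind that number, as a small C sketch. The function names are hypothetical, and any frame rate passed in is the caller's assumption (the thread doesn't give one).

```c
#include <stdint.h>

/* Bytes copied per frame: 32-bit words per line * 4 bytes * lines per frame. */
static uint32_t bytes_per_frame(uint32_t words_per_line, uint32_t lines)
{
    return words_per_line * 4u * lines;
}

/* Bytes copied per second at a given frame rate. */
static uint32_t bytes_per_second(uint32_t frame_bytes, uint32_t fps)
{
    return frame_bytes * fps;
}
```

bytes_per_frame(128, 192) gives 98304 bytes, which is the 96 KB above. At an assumed 60 fps that is a bit under 6 MB per second, just for the copy.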

If you're going to implement the SNES GPU on the DS CPU, it's going to be slow.
You need to use the DS GPU to emulate the SNES GPU, for example use the DS BGs to emulate the SNES BGs.
Yeah, I know, this is shit.

Fireworlds, for example, displays lots of stuff, but it uses the 3D GPU, so I don't have to send that much data.
For example, each quad is 1 polygon_attr write (32-bit), 1 texture param write (32-bit), 4 vertices (16-bit each) and 4 texcoords (16-bit each). So 4+4+4*4 = 24 bytes per quad. Let's say I'm displaying 1000 quads. Then it'd be only 24 KB of data to send.
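
The same count as a C sketch, so it can be compared with the 96 KB framebuffer copy. The sizes are the ones quoted in this post and haven't been checked against the hardware docs here; the constant and function names are made up for illustration.

```c
#include <stdint.h>

/* Per-quad command sizes in bytes, as quoted in the post (assumed, not verified). */
enum {
    POLYGON_ATTR_BYTES  = 4, /* one 32-bit write */
    TEXTURE_PARAM_BYTES = 4, /* one 32-bit write */
    VERTEX_BYTES        = 2, /* 16 bits per corner */
    TEXCOORD_BYTES      = 2, /* 16 bits per corner */
    CORNERS_PER_QUAD    = 4
};

/* Bytes sent to the 3D hardware for one quad. */
static uint32_t bytes_per_quad(void)
{
    return POLYGON_ATTR_BYTES + TEXTURE_PARAM_BYTES
         + CORNERS_PER_QUAD * (VERTEX_BYTES + TEXCOORD_BYTES);
}

/* Bytes sent for a whole frame of `quads` quads. */
static uint32_t bytes_for_quads(uint32_t quads)
{
    return quads * bytes_per_quad();
}
```

With 1000 quads that is 24000 bytes per frame, about a quarter of the 98304 bytes the CGRAM copy moves.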

Posted on 08-09-11, 06:46 pm
☭ coffee and cream


Karma: 10415
Posts: 110/2768
Since: 06-26-11
I'm sure you were going to suggest using the DS's hardware. And that isn't quite possible for the following reasons:
* DS tiles are stored in a linear format, while on SNES it's planar. Heavy conversion would have to be done every frame, or complex methods would be required to detect changes to VRAM, or something...
* the SNES's PPU supports some fancy stuff, like per-tile priority, 16x16 tiles, special blending/window logic functions, that the DS doesn't support.
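
To give an idea of the conversion meant in the first point, here is a minimal C sketch for one 8x8 4bpp tile. It assumes the usual SNES 4bpp layout (bitplanes 0/1 interleaved per row in the first 16 bytes, bitplanes 2/3 in the last 16, leftmost pixel in bit 7) and a DS-style packed layout (4 bytes per row, left pixel in the low nibble). The function names are hypothetical, and this is not how any existing emulator necessarily does it.

```c
#include <stdint.h>

/* Gather the 4-bit colour index of pixel x (0 = leftmost) from the four
   bitplane bytes of one tile row. */
static uint8_t planar_pixel(uint8_t p0, uint8_t p1, uint8_t p2, uint8_t p3, int x)
{
    int bit = 7 - x; /* leftmost pixel lives in the top bit */
    return (uint8_t)(((p0 >> bit) & 1)
                   | (((p1 >> bit) & 1) << 1)
                   | (((p2 >> bit) & 1) << 2)
                   | (((p3 >> bit) & 1) << 3));
}

/* Convert one 32-byte SNES 4bpp planar tile into a 32-byte packed tile
   with two pixels per byte. */
static void snes4bpp_to_packed(const uint8_t src[32], uint8_t dst[32])
{
    for (int row = 0; row < 8; row++) {
        uint8_t p0 = src[row * 2];          /* bitplane 0 */
        uint8_t p1 = src[row * 2 + 1];      /* bitplane 1 */
        uint8_t p2 = src[16 + row * 2];     /* bitplane 2 */
        uint8_t p3 = src[16 + row * 2 + 1]; /* bitplane 3 */

        for (int x = 0; x < 8; x += 2) {
            uint8_t left  = planar_pixel(p0, p1, p2, p3, x);
            uint8_t right = planar_pixel(p0, p1, p2, p3, x + 1);
            dst[row * 4 + x / 2] = (uint8_t)(left | (right << 4));
        }
    }
}
```

That is 64 pixel extractions per tile, which is why redoing it for every tile on every frame gets expensive and why caching converted tiles means tracking VRAM writes.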

Or well, it can be done, but isn't gonna be accurate... it's gonna end up like it is on SNemulDS. (my goal was to do something less glitched than SNemulDS)
_________________________
Kuribo64 - RH-fucking-cafe - Kafuka

zrghij
Posted on 08-17-11, 11:44 pm
Ninji


Karma: 379
Posts: 5/226
Since: 08-17-11
What about the DSi or 3DS?
Would that work better?
_________________________
Pro lurker

My Hack (whoops link is fixed now):
http://nsmbhd.net/thread/2953-super-luigi-world-ds/
Posted on 08-17-11, 11:48 pm
Super Mario
( ͡° ͜ʖ ͡°)

Karma: 10010
Posts: 638/4457
Since: 06-08-11
The 3DS for sure but it hasn't been hacked yet.

The DSi, well, it could but:
- DSi mode is not widely available (only the expensive iEvo card and the sudokuhax)
- DSi mode is only twice as powerful as the DS, so it still wouldn't be enough to do the rendering on the CPU.

Also, both kinda defeat the purpose of doing a SNES emu for the DS.