ASM Workshop Summary
So I promised to make this thread, and well... I am a day late. Or two. But here it is, anyway!

Okay, as some of you might know, we had two ASM workshops the past two Saturdays - on February 28th, we had a workshop for people who were about to begin with ASM. On March 7th, we had a semi-advanced workshop. However, I was sad to see most people could not actually tell what we were talking about, since they had to go, or were off-line for another reason.

So, they had to read the log. And you know... a log is very disorganised. You can try to read it, but you won't understand a thing, or at least not much, because of all the other random crap that was thrown in between. On March 7th, we had less of that, but the stuff we explained there was more advanced than the stuff we had explained on February 28th, so it still wasn't any easier to follow.

So, I promised to make a summary. And well... I guess the introduction of the thread stops here. Let me make it clear that I made this a thread so you can ask questions!

I will now put the links of the logs here, in case you are interested:

February 28th!
March 7th! <-- I accidentally removed this. Oh well, it's not necessary to read it.

-------------------------
1. BASIC WORKSHOP
1.1 - HEX AND BINARY
1.2 - A X Y REGISTERS
1.3 - LDA, STA, RTS, INC, DEC, INX, INY, STZ, ADC, SBC.
1.4 - DIRECT PAGE, TCD, TDC, XBA.
1.5 - BANKS
1.6 - EMULATION MODE, PROCESSOR FLAGS, REP, SEP, CLC, SEC, CLV, SED, SEI, CLI, CLD, CONDITIONAL OPERATIONS.
1.7 - JUMPS.
1.8 - INDEXING
1.9 - STACK, STACK POINTER, TSC, TCS, TXS, TSX.
1.10 - TRANSFERRING
--------------------------
2. SEMI-ADVANCED WORKSHOP
2.1 - VECTORS, H/V-BLANKING, INTERRUPTS
2.2 - BITWISE OPERATIONS
2.3 - VARIOUS PPU REGS
2.4 - DMA AND HDMA
2.5 - BLOCK MOVES -> MVN, MVP
2.6 - MORE ADDRESSING MODES
2.7 - USELESS OPCODES
--------------------------
==================================================
1. BASIC WORKSHOP
==================================================
1.1 HEX AND BINARY

Before you can understand anything about how the SNES works, you need to understand hex and binary.
They're pretty simple to comprehend.
The decimal system (I assume you know that one) goes as following:

0 1 2 3 4 5 6 7 8 9 - Base 10 system (10 digits)

The hexadecimal system (referred to as: hex) goes as following:

0 1 2 3 4 5 6 7 8 9 A B C D E F - Base 16 system (16 digits)

This means that:

Quote
Hex  | Dec
-----------
0    | 0
6    | 6
9    | 9
B    | 11
F    | 15
10   | 16
14   | 20
20   | 32
30   | 48
40   | 64
80   | 128
90   | 144
A0   | 160
C0   | 192
F0   | 240
FF   | 255
100  | 256
etc. | etc.


Should be relatively easy to comprehend, no?
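
If you want to check conversions like the ones in the chart yourself, a few lines of Python on your PC will do. This is only a sketch and not something from the workshop; the function names hex_to_dec and dec_to_hex are made up for this example.

```python
def hex_to_dec(hex_digits: str) -> int:
    """Interpret a string of hex digits (without the $ prefix) as a number."""
    return int(hex_digits, 16)


def dec_to_hex(value: int) -> str:
    """Format a number as uppercase hex digits (without the $ prefix)."""
    return format(value, "X")


# Print a few rows of the chart.
for digits in ["B", "10", "A0", "FF", "100"]:
    print(digits, "|", hex_to_dec(digits))
```

Each hex digit is worth 16 times as much as the digit to its right, which is why 10 in hex is 16 in decimal.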

Binary is relatively simple too.
It has the following digits:

0 1 - Base 2 system (2 digits)

0 is often referred to as 'clear', while 1 is often referred to as 'set'. 1 byte consists of 8 bits, which can be clear or set. A byte looks like this:

xxxxxxxx

Where x = a bit, set or clear.

The leftmost bit is referred to as Bit 7. Marked with brackets:
[x]xxxxxxx

While the rightmost bit is referred to as Bit 0. Marked with brackets:
xxxxxxx[x]

Little chart:
Quote
Binary  | Hex
---------------
00000000| 00
00000001| 01
00000010| 02
00000100| 04
00001000| 08
00010000| 10
00100000| 20
01000000| 40
10000000| 80
10000001| 81
10000010| 82
10000011| 83
11000000| C0
11100000| E0
11111110| FE
11111111| FF


So, as you might have noticed, Bit 7, when set, is worth much more than Bit 0. Bit 0 is only worth $01 (hex) when set, while Bit 7 is worth $80!
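
To see where those values come from: bit n is worth 2 to the power n. Here is a small Python sketch of that idea. The names bit_value and byte_from_bits are just made up for illustration.

```python
def bit_value(bit: int) -> int:
    """Value a single bit contributes when it is set (bit 0 = $01, bit 7 = $80)."""
    return 1 << bit


def byte_from_bits(bits: str) -> int:
    """Turn a string like '10000011' (bit 7 first) into the byte it stands for."""
    if len(bits) != 8 or any(b not in "01" for b in bits):
        raise ValueError("expected exactly 8 characters, each 0 or 1")
    # The leftmost character is bit 7, the rightmost is bit 0.
    return sum(bit_value(7 - i) for i, b in enumerate(bits) if b == "1")


print(format(byte_from_bits("10000011"), "02X"))
```

A byte is simply the sum of the values of all the bits that are set.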

----------------------------------

1.2 - A X Y REGISTERS

Okay, let's start finally. This is where ASM really starts.

The A, X and Y registers are places, in which you can temporarily keep a value. You can use these registers to transfer values from point A to point B.
A = Accumulator
X = X index, Y = Y index.
Normally, you will want to use the Accumulator for most things, since it is compatible with more commands. X and Y can be useful too -> see 1.8 INDEXING and 2.6 MORE ADDRESSING MODES.

Either way, these registers can be 8-bit or 16-bit (meaning they can contain one byte, or two bytes, at once.)
How one can manipulate this exactly -> see 1.6 - EMULATION MODE, PROCESSOR FLAGS, REP, SEP, CLC, SEC, CLV, SED, SEI, CLI, CLD, CONDITIONAL OPERATIONS.

----------------------------------

1.3 - LDA, STA, RTS, INC, DEC, INX, INY, STZ, ADC, SBC.

So what to do with these registers? You'll learn more about it in this paragraph.

LDA - Load a value into the Accumulator. We explained 4 addressing modes:

LDA #$00 -> Load 8-bit value $00 into the Accumulator. This value can vary from #$00 to #$FF in 8-bit mode. In 16-bit mode, it can vary from #$0000 to #$FFFF.
This command basically directly loads a value into the Accumulator.
LDA $00 -> Load a value from the address $00. This means, that whatever is in address $00 will get into A. Direct page is also involved -> see 1.4 DIRECT PAGE, TCD, TDC, XBA.
LDA $0000 -> Load a value from 16-bit address $0000. Needs to cooperate with the Data Bank register -> see 1.5 BANKS for info about this reg.
LDA $000000 -> Load value from 24-bit address $000000 -> see 1.5 BANKS for more info about how addresses are formatted.

The equivalents for X and Y are LDX and LDY. There are fewer addressing modes for them to use, though.

So basically, LDA will get a value in the register A. Now, you want to transfer the value in A over to another address. What do you use? You use STA!

The three addressing modes we explained:

Quote
STA $00 -> Store value from A into address $00.
STA $0000 -> Store value from A into address $0000.
STA $000000 -> Store value from A into address $000000.


(No, there is no STA #$00. That would be rather stupid, storing A into a value >_>)

XY equivalents: STX, STY.

Each code needs to end somehow, to return back to the game. We use RTS or RTL for this. RTS/RTL is used at the end of a subroutine. How exactly this goes -> see 1.7 - JUMPS.
Either way, putting RTS or RTL at the end of a routine, is almost always required to return back to the game (it is not necessary, but preferable).

Quote
Example:
LDA #$01 ; Load #$01 into A.
STA $19 ; Store into $19
RTS ; Return to the game.


INC - Will increase an address/register by 1.
DEC - Will decrease an address/register by 1.
INX, INY - Will increase X, Y by 1.
DEX, DEY - Will decrease X, Y by 1.

Note that INC / DEC can increase the Accumulator, but are not limited to this, unlike the commands for increasing/decreasing X and Y.
DEC on its own (or DEC A) will decrease A. INC (or INC A) will increase it. All by 1 per time it's run. INX, INY, DEX and DEY are used on their own, in the same way. So:

Quote
INX ; Increase X
INY ; Increase Y
DEX ; Decrease X
DEY ; Decrease Y


With INC/DEC, you can also increase/decrease an address, with:

Quote
DEC $00 ; Decrease $00
INC $01 ; Increase $01
DEC $0200 ; Decrease $0200
INC $1FFF ; Increase $1FFF


Note: There is no such thing as INC $000000 or DEC $000000. This command does not exist.

STZ - It will set an address to zero, without affecting A.

Quote
STZ $19 ; Will make $19 = #$00.


ADC / SBC.
Can add to or subtract from A.

Quote
Example:
LDA #$05 ; Load #$05 into A
CLC ; Clear carry. This is needed
ADC #$02 ; Add #$02 to A.
STA $19 ; Store to $19. Result = #$07.


Quote
Example 2:
LDA #$06 ; Load #$06 into A
SEC ; Set carry.
SBC #$04 ; Subtract #$04 from A.
STA $1A ; Store to $1A. Result = #$02.


You can also add addresses to or subtract them from A:

Quote
LDA #$06
CLC
ADC $00 ; Add A with $00.
STA $19


Quote
LDA #$12
SEC
SBC $01 ; Subtract $01 from A.
STA $1E


-----------------------------------------------------

1.4 DIRECT PAGE, TCD, TDC, XBA

The direct page register is used when working with 8-bit addresses, such as LDA $00, STA $06, ADC $10, etc.

However, there's a little trick to these addresses.
There is such a thing as a direct page register, which adds itself to the specified 8-bit address in commands such as LDA $00.
It's a 16-bit register, and separate from normal addresses - it is not located at any address inside a bank.

In SMW, this register is almost always #$0000. So LDA $00, actually just loads a value from 16-bit address $0000.
However, one can change it. If the register were #$0100 and LDA $00 were used, LDA $00 would actually be equivalent to LDA $0100.
If the register was #$0180, LDA $00 would actually be LDA $0180, etc.
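
The rule is just an addition, which you can sketch in Python like this. The function name direct_page_address is made up for this example; it models the direct page register being added to the 8-bit address, with the result staying inside bank 00.

```python
def direct_page_address(dp: int, offset: int) -> int:
    """Address in bank 00 that an 8-bit address such as the $00 in LDA $00 refers to."""
    # The direct page register is 16-bit, so the sum wraps around at $FFFF.
    return (dp + offset) & 0xFFFF


for dp in (0x0000, 0x0100, 0x0180):
    print("DP =", format(dp, "04X"), "-> LDA $00 reads from",
          format(direct_page_address(dp, 0x00), "04X"))
```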

So, how does one change the direct page? You use TCD to change it, and TDC to get the value back.
You will need a 16-bit value in Accumulator, and the command TCD, to get the value into the direct page.

Quote
Example:
LDA #$0200
TCD


This will change the DP register to #$0200. So LDA $01 equals LDA $0201 now.
To get the 16-bit value back, simply use:

TDC

You don't necessarily need a 16-bit accumulator for this, and a few other instructions.
You must know, that the A register is actually always 16-bit. Except, sometimes, only the 8-bit mode is actually active. But some commands, such as TDC and XBA, can still change the inactive byte. The low byte (the one which is always active) is simply called the 'A Accumulator', while the high byte is called the 'B Accumulator'. Combined, they're called 'C'.

XBA will exchange Accumulator A with Accumulator B.

So:

Quote
LDA #$0200
TCD


In 8-bit mode, the same effect would be:

Quote
LDA #$02 ; Load into low byte
XBA ; Swap low byte with high byte.
LDA #$00 ; Load into low byte.
TCD ; Transfer 16-bit A to DP register.
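
To convince yourself that the 8-bit version really ends up with #$0200 in the full 16-bit Accumulator, you can model A and B as one 16-bit number in Python. This is only a sketch; lda8, xba and build_0200 are made-up helper names, not real commands of any tool.

```python
def lda8(c: int, value: int) -> int:
    """8-bit LDA: replace only the low byte (A), keep the high byte (B)."""
    return (c & 0xFF00) | (value & 0xFF)


def xba(c: int) -> int:
    """XBA: swap the low byte (A) and the high byte (B)."""
    return ((c & 0xFF) << 8) | (c >> 8)


def build_0200(c: int) -> int:
    c = lda8(c, 0x02)  # LDA #$02
    c = xba(c)         # XBA
    c = lda8(c, 0x00)  # LDA #$00
    return c           # this is the value TCD would copy to the DP register


print(format(build_0200(0x1234), "04X"))
```

Whatever was in A and B before, the result is the same, because both bytes get overwritten along the way.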


---------------------------------

1.5 - BANKS

Banks are basically chunks of data.
They consist of $10000 bytes each (addresses $0000-$FFFF), which form the 16-bit addresses.
Banks themselves make the difference in 24-bit addresses - they're there to make the amount of data bigger. If banks didn't exist, a ROM would be limited to 32 KB (if only $8000-$FFFF consisted of ROM data). Luckily, there's much more space available, thanks to the banks, forming 24-bit addresses.

One can separate ROMs into LoROM and HiROM. The most important one to you is LoROM, since SMW is in this format. I'll briefly explain how banks $00 to $3F (the most important ones) are built up in LoROM:

$0000-$1FFF - Low RAM data. It is mapped from bank $7E, like in the RAM map. So, loading from these addresses with a 16-bit address gives the same result in any of these banks.
$2000-$2FFF - PPU data, and several other SNES regs.
$3000-$3FFF - SuperFx, other enhancement chips.
$4000-$4FFF - H/V-Counter, NMI/IRQ enable, controller data, etc.
$5000-$7FFF - Other enhancement chips, and stuff.
$8000-$FFFF - ROM data. NOTE : The ROM data is different for EACH bank!

This means that $008000 in a headered ROM is PC address x200, while $018000 would be PC address x8200. $01FFFF is PC address x101FF, while $028000 is PC address x10200, etc.
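
The pattern behind those numbers is that every bank contributes $8000 bytes of ROM, and a headered ROM has $200 extra bytes at the start of the file. Here is a Python sketch of that conversion. The function name lorom_to_pc is made up, and this simplified version only handles $8000-$FFFF in banks $00-$3F, the range described above.

```python
HEADER_SIZE = 0x200  # extra bytes at the start of a headered ROM file


def lorom_to_pc(snes_address: int, headered: bool = True) -> int:
    """Convert a LoROM address in banks $00-$3F to a PC file offset."""
    bank = snes_address >> 16
    offset = snes_address & 0xFFFF
    if not 0x00 <= bank <= 0x3F or offset < 0x8000:
        raise ValueError("only $8000-$FFFF in banks $00-$3F is handled here")
    # Each bank holds $8000 bytes of ROM, starting at $8000 inside the bank.
    pc = bank * 0x8000 + (offset - 0x8000)
    return pc + HEADER_SIZE if headered else pc


for address in (0x008000, 0x018000, 0x01FFFF, 0x028000):
    print(format(address, "06X"), "->", format(lorom_to_pc(address), "X"))
```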

You cannot STORE to ROM data. It won't have any reasonable effect. ROM = Read Only Memory.

Quote
STA $89AB


When the data bank register is between 00 and 3F, this will NOT have any effect you would want to accomplish with it, whatever that is.

You can store to any kind of RAM data in $0000-$1FFF in banks 00-3F, and the entire banks 7E and 7F.
Quote
STA $02
STA $1505
STA $001F00
STA $7E08B0
STA $7E8000
STA $7F8000


These are all valid.

Now for the bank registers. There are two of them:
Data Bank Register -> Used for 16-bit address commands
Program Bank Register -> Holds the current bank the code is running in

Data Bank Register. So it's used for 16-bit address commands, such as LDA $0102, STA $3984, etc. So how does it work? Well, the data bank register fills in the 'empty' bank in a 16-bit command. This means that:

Quote
; DBR = 02
LDA $8000 ; Load value from $028000.


Quote
; DBR = 05
STA $0200 ; Store value into $050200.


So, it's used to save a bit of space. The DBR does NOT affect 24-bit address commands, such as LDA $009020, STA $7F8000, etc. Those use exactly the address that is specified.
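
In other words, the DBR becomes the top byte of the address. A tiny Python sketch of that, where the function name absolute_address is made up for this example:

```python
def absolute_address(dbr: int, address16: int) -> int:
    """24-bit address used by a 16-bit address command such as LDA $8000."""
    # The data bank register fills in the bank byte; the operand is the rest.
    return ((dbr & 0xFF) << 16) | (address16 & 0xFFFF)


print(format(absolute_address(0x02, 0x8000), "06X"))
print(format(absolute_address(0x05, 0x0200), "06X"))
```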

Program Bank Register. It basically holds the bank the code is running in. If the code, at this moment, is running at $028030, the PBR = #$02. If it's running at $7F8000 (yes, RAM routines exist), PBR = #$7F.

------------------------------------

1.6 - EMULATION MODE, PROCESSOR FLAGS, REP, SEP, CLC, SEC, CLV, SED, SEI, CLI, CLD, CONDITIONAL OPERATIONS.

So, if you know LDA STA RTS and some other commands, that's all really nice, but sometimes, you want some code to run, only on a certain condition. To explain that, however, let me first explain you the processor flags.

8, or actually 9 bits, form the processor flags, in the following order:

envmxdizc

Let's handle these one by one.

E - Emulation Mode.

Emulation mode is a bit of a funny mode. When SNES runs in Emulation mode, it actually runs like a NES (pretty much). NES has more limits than SNES, so you will almost always want to switch to Native mode, as soon as possible. How do you do this? Well, for all other processor bits, you can use REP and SEP, but this one is a little different. Instead, you use XCE - Exchange Carry and Emulation bit.

Quote
CLC
XCE


This code will clear carry, and then exchange this bit with the emulation bit. This means that the emulation bit will be cleared, and you will be in native mode.

Quote
SEC
XCE


This will then set the carry bit, and then exchange this bit with the emulation bit, which means that emulation mode will now be on.

In emulation mode, the processor flags also have a different set-up:

nv-bdizc

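b is a different one here - the break flag. It is set in the copy of the flags that is pushed onto the stack when BRK is met, so the interrupt routine can tell a BRK apart from an IRQ.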

Negative

This bit will be set when the last result of an instruction was a negative value. A negative value =

#$80 - #$FF in 8 bit mode
#$8000 - #$FFFF in 16 bit mode

So, negative bit will be set when:

Quote
LDA #$C0


Or...

Quote
LDA #$70
CLC
ADC #$20


The negative bit will be cleared when:

Quote
LDY #$20 ; Load #$20 into Y


Quote
LDA #$FF
CLC
ADC #$02


And a few other occasions, which you will learn later.
You can manually set it with: SEP #$80
And clear it with: REP #$80

Overflow flag

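This is kind of a weird bit, since it's set/cleared on a few occasions other than you might expect. ADC and SBC set it when the result does not fit in a signed value (-128 to 127 in 8-bit mode), and clear it otherwise. The following:

Quote
LDA #$70
CLC
ADC #$20 ; Result = #$90. Positive + positive gave a negative, so overflow = set.


Quote
LDA #$80
CLC
ADC #$C0 ; Result = #$40. Negative + negative gave a positive, so overflow = set.


So basically, when you add two values with the same sign and the result has the other sign, overflow = set. Adding a value between #$80 and #$FF does not set it on its own; it depends on what is in A as well.

Quote
LDA #$80
SEC
SBC #$01 ; Result = #$7F. Negative - positive gave a positive, so overflow = set.


Quote
LDA #$05
SEC
SBC #$FF ; Result = #$06. 5 - (-1) = 6 fits fine, so overflow = cleared.


So, when subtracting, overflow = set when A and the subtracted value have different signs and the result does not have the same sign as A.

Quote
BIT $00 ; Set if bit 6 of $00 = set, cleared if bit 6 of $00 = clear.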


You can manually set this flag with: SEP #$40
And clear it with: REP #$40, or CLV
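
The overflow flag follows the usual signed-overflow rule: ADC sets it when both inputs have the same sign bit and the result has the other one. Below is a Python sketch of 8-bit ADC and SBC (binary mode, not decimal mode) that also computes overflow. The names adc8 and sbc8 are made up; SBC is modelled as ADC with all bits of the value flipped, which is how the subtraction works out on this processor family.

```python
def adc8(a: int, value: int, carry: int = 0):
    """8-bit ADC. Returns (result, carry flag, overflow flag)."""
    total = a + value + carry
    result = total & 0xFF
    carry_out = total > 0xFF
    # Overflow: a and value have the same sign bit, result has the other one.
    overflow = ((a ^ result) & (value ^ result) & 0x80) != 0
    return result, carry_out, overflow


def sbc8(a: int, value: int, carry: int = 1):
    """8-bit SBC, done as ADC with the value's bits flipped."""
    return adc8(a, value ^ 0xFF, carry)


print(adc8(0x70, 0x20))  # positive + positive
print(adc8(0x10, 0x80))  # adding #$80 does not always set overflow
print(sbc8(0x80, 0x01))  # negative - positive
```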

Accumulator Size
This is one of the few flags that can only be set manually. It holds the size of the Accumulator - 8 bit or 16 bit. If this bit is set, the Accumulator is 8 bit. If this bit is clear, the Accumulator is 16-bit.

In 8-bit mode, commands go as following:

Quote
LDA #$00

Quote
LDA $00 ; Get value from $00 into low byte of Accumulator

16-bit:

Quote
LDA #$0000

Quote
LDA $00 ; Get value from $00 into low byte of Accumulator, $01 into high byte of Accumulator


So, in 16-bit mode, you can do more at once.
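
Here is what that difference looks like if you model RAM as a list of bytes in Python. Just a sketch; load8 and load16 are made-up names, and the memory contents are example values.

```python
def load8(memory: bytearray, address: int) -> int:
    """8-bit LDA: one byte ends up in the low byte of the Accumulator."""
    return memory[address]


def load16(memory: bytearray, address: int) -> int:
    """16-bit LDA: address gives the low byte, address + 1 the high byte."""
    return memory[address] | (memory[address + 1] << 8)


ram = bytearray(0x100)
ram[0x00] = 0x34  # example value in $00
ram[0x01] = 0x12  # example value in $01

print(format(load8(ram, 0x00), "02X"))
print(format(load16(ram, 0x00), "04X"))
```

So with $00 = #$34 and $01 = #$12, a 16-bit LDA $00 gives #$1234.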

Quote

REP #$20 ; Clear m bit. 16-bit
SEP #$20 ; Set m bit. 8-bit


Index size

Same as Accumulator size, just for the indexes X and Y. Should be easy to understand.

Quote

REP #$10 ; Clear x bit. 16-bit
SEP #$10 ; Set x bit. 8-bit


Decimal flag

When this is set, ADC and SBC will calculate in decimal: each hex digit is treated as a decimal digit from 0 to 9.

Example:

Quote
LDA #$06
CLC
ADC #$06


With the decimal flag set, result = #$12 (instead of #$0C). Not exactly useful... but eh.
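
If you are curious how a decimal addition works out digit by digit, here is a Python sketch. The name adc8_decimal is made up, and it assumes both inputs are valid decimal-style values, so every hex digit is 0-9.

```python
def adc8_decimal(a: int, value: int, carry: int = 0):
    """8-bit ADC with the decimal flag set. Returns (result, carry)."""
    # Add the low digits; anything above 9 carries into the high digit.
    low = (a & 0x0F) + (value & 0x0F) + carry
    half_carry = 0
    if low > 9:
        low -= 10
        half_carry = 1
    # Add the high digits; anything above 9 sets the carry flag.
    high = (a >> 4) + (value >> 4) + half_carry
    carry_out = 0
    if high > 9:
        high -= 10
        carry_out = 1
    return (high << 4) | low, carry_out


result, carry = adc8_decimal(0x06, 0x06)
print(format(result, "02X"), carry)
```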

Setting it:
SED or SEP #$08

Clearing it:
CLD or REP #$08

Interrupt disable flag

This flag can disable interrupts such as IRQ -> 2.1 - VECTORS, H/V-BLANKING, INTERRUPTS.
It's not the only way to do it, but it's relatively convenient for disabling IRQ. Note: It will not disable NMI. There is a way to do this, however.

Set with: SEI, SEP #$04
Clear with: CLI, REP #$04

Zero flag

This flag will be set when the last result equaled zero.
That means it's set when:

Quote
LDA #$00


Quote
LDA #$FF
CLC
ADC #$01


And it's cleared when it does not equal zero.
You can also:
Set it with SEP #$02
Clear it with REP #$02

Carry flag

This flag is probably one of the more diverse flags.
It's set or cleared at the following occasions:

Quote
LDA #$FF
CLC
ADC #$02 ; Set


Quote
LDA #$60
CLC
ADC #$E0 ; Set


Quote
LDA #$20
SEC
SBC #$30 ; Cleared


Quote
LDA #$50
SEC
SBC #$F0 ; Cleared


Basically, when it wraps around when adding, it gets set, while when it wraps around when subtracting, it gets cleared. There are also a few other occasions.

Odd thing: CLC is run (clear carry), but carry gets set in the end. And vice versa for subtracting. Why is that?

Well, now, let me explain a bit more of ADC and SBC.
ADC adds the specified value + the carry bit to the accumulator.
This means that, if the carry bit is set, it will add an additional 1. This is why you always have to clear carry beforehand. So, how can this be useful?

Quote
LDA $94
CLC
ADC #$F0
STA $94
LDA $95
ADC #$00
STA $95


Let's say $94 was #$30 here, and $95 #$02.

#$30 + #$F0 will set carry, and result = #$20. Carry is not cleared afterwards, so it will stay set. As I just said, ADC will add specified value + carry bit. Carry is now 1, so it will add an additional 1. This means that:

Quote
LDA $95
ADC #$00
STA $95


Is now actually $95 + #$01 = new $95.
$95 was originally #$02, so the new result = #$03.
In 16-bit, this code would be:

Quote
LDA $94
CLC
ADC #$00F0
STA $94


So basically, with the carry flag, you can do pseudo 16-bit math.

SBC is similar. It subtracts an additional 1 from the value when carry = 0 (clear). This is why you always set carry beforehand.
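
The carry examples above can be replayed in Python. This is a simplified sketch of 8-bit ADC and SBC that only tracks the result and the carry flag; adc8, sbc8 and add16_with_two_adcs are made-up names.

```python
def adc8(a: int, value: int, carry: int):
    """8-bit ADC: adds value + carry. Returns (result, new carry)."""
    total = a + value + carry
    return total & 0xFF, 1 if total > 0xFF else 0


def sbc8(a: int, value: int, carry: int):
    """8-bit SBC: subtracts an extra 1 when carry is clear."""
    total = a - value - (1 - carry)
    return total & 0xFF, 0 if total < 0 else 1


def add16_with_two_adcs(low: int, high: int, amount: int):
    """The $94/$95 example: add a 16-bit amount with two 8-bit ADCs."""
    low, carry = adc8(low, amount & 0xFF, 0)      # CLC, then ADC on the low byte
    high, carry = adc8(high, amount >> 8, carry)  # ADC on the high byte, carry kept
    return low, high


low, high = add16_with_two_adcs(0x30, 0x02, 0x00F0)
print(format(low, "02X"), format(high, "02X"))
```

With $94 = #$30 and $95 = #$02, this gives #$20 and #$03, the same as the walkthrough above.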

You can also:
Set carry with: SEC, SEP #$01
Clear carry with: CLC, REP #$01

Conditional branches

So, now you know the processor flags, you can easily understand conditional branches. With these, you can branch on certain conditions. I'll explain the following commands:

Quote
CMP ; Comparing
BCC ; Branch if less
BCS ; Branch if equal to or more
BEQ ; Branch if equal
BNE ; Branch if not equal
BMI ; Branch if minus / negative
BPL ; Branch if plus / positive
BVS ; Branch if overflow set
BVC ; Branch if overflow clear
BRA ; Branch always
BRL ; Branch always long


CMP compares a value with the Accumulator. Its X and Y equivalents are CPX and CPY.
So, what does CMP actually do?

Quote
LDA #$02
CMP #$05


This will basically do #$02 - #$05, WITHOUT AFFECTING A.
But the processor flags WILL be affected! Which is what the branching commands rely on.

Let's explain a bit about a certain RAM address of SMW, which you will want to experiment with a lot: $19. In the RAM map, specified as $7E0019. Without changing the DP, LDA $19 will normally load the powerup status, and STA $19 will store a value to it.
The RAM address is like this:

00 - Small
01 - Big
02 - Cape
03 - Fire
04 through FF - Garbage / not recommended to use

So, to check if powerup = fire, do this:

Quote
LDA $19
CMP #$03
BEQ IsFire ; We're looking a bit ahead on the things. I'll explain this one later.


So, this will subtract #$03 from A, without affecting A.
Say, powerup = cape. Then it'd be:

#$02 - #$03 = #$FF. (Wraps around) Negative bit = set, carry bit = cleared.

You can also compare A with addresses:

Quote
LDA #$03
CMP $19


Except then, the subtraction will go the other way around, of course.
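
As a Python sketch, an 8-bit CMP boils down to this. The function name cmp8 and the dictionary it returns are made up for this example; only the negative, zero and carry flags are modelled.

```python
def cmp8(a: int, value: int) -> dict:
    """8-bit CMP: do a - value, throw the result away, keep the flags."""
    difference = (a - value) & 0xFF
    return {
        "negative": difference >= 0x80,  # bit 7 of the result
        "zero": difference == 0,         # both values were equal
        "carry": a >= value,             # set when no wrap-around was needed
    }


print(cmp8(0x02, 0x03))  # cape Mario compared with #$03
print(cmp8(0x03, 0x03))  # fire Mario compared with #$03
```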

So, that's what CMP does.
Now for the branching commands. Let's start with the easiest ones: BEQ and BNE.

BEQ - Branch if equal (zero flag = set)
BNE - Branch if not equal (zero flag = clear)

Quote
LDA $19
CMP #$03
BEQ IsFire


This will compare $19 with #$03 (fire). So, if he was fire Mario, how would this go?

#$03 - #$03 = #$00.

Result = #$00, so zero flag = set. BEQ branches when this flag is set, so you will branch to IsFire.
About that - any good assembler has the option to branch to labels.
BEQ IsFire isn't actually formatted like this in a ROM - say, you had this code:

Quote
LDA $19
CMP #$03
BEQ IsFire
NOP ; Useless
NOP ; Useless
IsFire: ; This is the label to branch to


You have to specify the label to branch to somewhere in the code, of course. This doesn't need to be later in the code, it can also be earlier in the code. Just make sure it's not too far from the branching command.

About BEQ IsFire. In this code, BEQ IsFire, in hex, would be BEQ $02.

If we now changed the code to this:

Quote
LDA $19
CMP #$03
BEQ IsFire
IsFire:


BEQ IsFire = BEQ $00.

Quote
LDA $19
CMP #$03
Loop:
BEQ Loop


BEQ Loop = BEQ $FE.

Branching commands have an 8-bit parameter, to specify where they will end up. This parameter is signed - it means that 80-FF will make it branch backwards, while 01-7F will make it branch forwards (00 will do nothing, really). I hope this makes sense - but normally, you will want to use a label for this, since it's easier.
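
If you ever need to work out such a parameter by hand, the rule is: take the distance from the command after the branch (the branch itself is 2 bytes) to the target. Here is a Python sketch. The function name branch_offset and the $8004 addresses are made up for this example.

```python
def branch_offset(branch_address: int, target_address: int) -> int:
    """8-bit parameter for a branch located at branch_address."""
    # The distance is counted from the command right after the 2-byte branch.
    distance = target_address - (branch_address + 2)
    if not -128 <= distance <= 127:
        raise ValueError("target is too far away for an 8-bit branch")
    return distance & 0xFF  # negative distances become $80-$FF


# Branch at $8004, label after two NOPs (1 byte each) -> $8008.
print(format(branch_offset(0x8004, 0x8008), "02X"))
# Label right after the branch.
print(format(branch_offset(0x8004, 0x8006), "02X"))
# Branch to itself.
print(format(branch_offset(0x8004, 0x8004), "02X"))
```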

Anyway, back to BNE. It will branch when zero flag = clear. So:

Quote
LDA $19
CMP #$03
BNE IsNotFire


If Mario was big, it'd be:

#$01 - #$03 = #$FE. So zero flag = cleared. And you branch now. If Mario was fire, it'd be:

#$03 - #$03 = #$00. So zero flag = set. You don't branch, so you still run the code after BNE IsNotFire - you won't jump to IsNotFire.

BCC - Branch if less (branch if carry clear)
BCS - Branch if equal to or more (branch if carry set)

Quote
LDA $19
CMP #$02


If Mario = big,

#$01 - #$02 = #$FF. Carry = cleared.
So BCC branches.

If Mario = small,

#$00 - #$02 = #$FE. Carry = cleared.
So BCC branches.

If Mario = cape,

#$02 - #$02 = #$00. Carry = set (CMP sets carry whenever the subtraction does not wrap around, so also when both values are equal).
So BCC doesn't branch.

And if Mario = fire,

#$03 - #$02 = #$01. Carry = set.
So BCC doesn't branch.

BCS is the opposite: it branches when carry is set (so when A is equal to or more than the specified value).
Some assemblers also accept the aliases BLT (for BCC) and BGE (for BCS).

BMI - Branch if minus
BPL - Branch if plus

These rely on the negative bit.
They're not really convenient for $19, but let's use them anyway:

Quote
LDA $19
CMP #$03
BMI BranchIfNegative


This will branch if:

#$02 - #$03 = #$FF (negative)

#$00 - #$03 = #$FD (negative)

But ALSO :

#$83 - #$03 = #$80 (negative)!

So, it will branch if the result of the subtraction is #$80-#$FF. In this case, that is for the values #$83 through #$FF, and #$00 through #$02 (it wraps around).

BPL will branch, if the negative bit is not set. So, in this case, for the values #$03 through #$82.
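
You can let Python list all 256 possible values and see which ones make BMI branch after CMP #$03. A sketch only; bmi_branches_after_cmp is a made-up name.

```python
def bmi_branches_after_cmp(a: int, value: int) -> bool:
    """True when an 8-bit CMP leaves the negative bit set, so BMI branches."""
    return ((a - value) & 0xFF) >= 0x80


branching = [a for a in range(0x100) if bmi_branches_after_cmp(a, 0x03)]
not_branching = [a for a in range(0x100) if not bmi_branches_after_cmp(a, 0x03)]

print("BMI branches for", len(branching), "values")
print("BPL branches for", format(not_branching[0], "02X"), "through",
      format(not_branching[-1], "02X"))
```

This is why BMI/BPL are not the same as 'less than' and 'more than': half of all values count as negative, wherever the compared value happens to be.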

BVS - Branch if overflow set
BVC - Branch if overflow clear

Not anything to state here, in particular...

BRA - Branch always. This will always make you branch to the specified label. Its parameter is 8-bit, like with the other branches.
BRL - Same as above, but with a 16-bit parameter. It's relatively useless, except for custom blocks. In sprites and ASM hacks -> use JMP.

Phew, done with this paragraph.

--------------------------------------------------

1.7 - JUMPS

There are four jump commands:

JMP - Jump, 16-bit
JML - Jump, 24-bit
JSR - Jump and push return addresses, 16-bit
JSL - Jump and push return addresses, 24-bit

Basically, these jump to a specified SNES address. In an assembler, it's, of course, also possible to jump to a label.

Let me explain these:

JMP - You will jump to a label or 16-bit address (JMP $89AB for example) with this command. A bit like BRL, but easier to use, because it's not PC relative, unlike all branch commands, which I explained earlier. However, this JMP will not push return addresses. So you will have to use the return address corresponding to the last JSR or JSL (which I'll explain later).

JML - Same as above, except 24-bit. WATCH OUT WITH THIS! If the last pushed return addresses were 16-bit, and you jump to a different bank, you'll be busted. There is a way to prevent this, though, by making use of some pushing commands. I'll explain that at 1.9 - STACK, STACK POINTER, TSC, TCS, TXS, TSX.

JSR - Same as JMP, except this will also push 16-bit return addresses. This means that, when you hit a return address, you will jump back to the place the JSR was (and then the next command). JMP does not do this.

JSR needs an RTS to end.

JSL - Same as above, except 24-bit. Can jump to other banks with no problems.

JSL needs an RTL to end.

-----------------------------
1.8 - INDEXING

So, all fine, these X and Y indexes... but what makes them REALLY useful?
Well... this paragraph is your answer... okay, and 1.10, 2.5, 2.6. But that aside.

So, how does indexing go? Well, before I tell you that: loading from a ROM address can be specified with a label, if that ROM address is inside a patch or sprite. So:

Quote
LDA $9000


If this command and the address $9000 are inside a patch or sprite, you can just specify this command as LDA Label, and put Label in front of the place where $9000 would be.

So anyway, that was a bit of general info. About indexing. Indexing is basically $9000,x or $9000,y. Or Label,x or Label,y. There are the following indexing commands (the basic ones):

$00,x ; Mostly used for RAM tables
$0000,x ; Can be RAM or ROM
$0000,y ; Can be RAM or ROM
$000000,x ; Can be RAM or ROM

As you can see, there's not much compatibility for ,y in normal indexing. So you will often want to use the X index for this. So, how does this go? Well, it's relatively simple.

LDA $9E,x

This will load a value from $9E + X.
Now say, X was #$02, you'd load from:

$9E + X index = $9E + $02 = $A0.

So technically, LDA $9E,x here, would be LDA $A0. 'Then why not use LDA $A0?' Well, that's the purpose of tables. Sometimes, you want to load from an address when X is a certain value.

Say:

Quote
LDX $19 ; powerup status into X
LDA Table,x
STA $12 ; Some random address
RTS ; Return, so the table is not run as code

Table:
db $02,$04,$26,$10


If Mario was small, you'd store #$02 into $12.
If Mario was big, you'd store #$04 into $12.
If Mario had a cape, you'd store #$26 into $12.
If Mario was fiery, you'd store #$10 into $12.

See how this can be useful?

So, this is basically how basic indexing goes. Specified address + specified index. Not too hard, is it?
Example 2:
Y = #$05.
LDA $9000,y.
So, $9000 + Y = $9000 + $05 = $9005.
So, LDA $9005.
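
The same idea in Python, where a list plays the role of the table. This is a simplified sketch: indexed_address and lookup are made-up names, and TABLE holds the same four bytes as the db line in the example above.

```python
TABLE = [0x02, 0x04, 0x26, 0x10]  # same bytes as the db line above


def indexed_address(base: int, index: int) -> int:
    """Address that a command like LDA base,x or LDA base,y really reads."""
    return base + index


def lookup(powerup: int) -> int:
    """What LDA Table,x loads when X holds the powerup status."""
    return TABLE[powerup]


print(format(indexed_address(0x9E, 0x02), "02X"))
print(format(indexed_address(0x9000, 0x05), "04X"))
print(format(lookup(0x02), "02X"))  # cape
```

Note that a table has no bounds check in ASM: if X were #$04 or higher here, LDA Table,x would simply load whatever bytes come after the table.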

-----------------------------

1.9 - STACK, STACK POINTER, TCS, TSC, TXS, TSX.

The stack is useful for temporarily keeping some values, which you will want back later. Several commands make use of the stack, other than JSR, JSL, RTS and RTL. The following, and even a few more, make use of them:

Quote
PEI
PER
PEA
PHA
PHB
PHD
PHK
PHP
PHX
PHY
PLA
PLB
PLD
PLP
PLX
PLY


Commands such as RTI use the stack too - I'll explain that one at 2.1 - VECTORS, H/V-BLANKING, INTERRUPTS.

Anyway, the action of getting values ON the stack, is called pushing, while getting values FROM the stack, is called pulling, or popping.

PHA, PHX, PHY, PLA, PLX, PLY - These push the values from A, X and Y onto the stack, and pull them back. Depending on the register's size, it's either an 8-bit or 16-bit value.

Quote
LDA #$80
PHA


This will temporarily push the value #$80 onto the stack. Note: #$80 will remain in A until A is changed.

You need to pull this value out. With PLA, you can get a value from the stack back into A again.
BIG FAT NOTE: Keep the push/pull in balance. Otherwise the stack will overflow, causing the game to crash eventually. So, when you push a value, you will need to get it back again.

Quote
LDA #$80
PHA
SEC
SBC #$7D
STA $19
PLA


So, with the stack, you can keep the value intact, while still doing something with the Accumulator. Note, that you don't have 1 stack per register. They all use the same stack. This does not limit things, however - you can use this to your advantage. Example:

Quote
LDA #$80
PHA
PLX ; Pull into X


This will get #$80 into X AND A. In this case, there's a better way for it (see 1.10 - TRANSFERRING), but for some other commands, like PHB PLA, it's definitely a convenient way to do it.

So:
PHA - Push value from A onto the stack
PHX - Push value from X onto the stack
PHY - Push value from Y onto the stack
PLA - Pull value from the stack into A
PLX - Pull value from the stack into X
PLY - Pull value from the stack into Y

Note: When you pull something, you pull the LAST PUSHED value! This also means you can pull return addresses, if you ever wanted to do so.

PHB - Push data bank register onto the stack (8-bit)
PLB - Pull value from the stack into data bank register (8-bit)

So:
Quote
LDA #$01
PHA
PLB


This will change the DBR to #$01 - one of the more convenient ways to do it, if not the only one.

Quote
PHB
PLA


Get DBR into A.

PHD - Push direct page onto the stack (16-bit)
PLD - Pull value from stack into direct page (16-bit)

Alternative to TCD, TDC.

PHK - Push program bank onto the stack (8-bit)

No, there's no PLK, that would be ridiculous, warping to another bank.
Anyway, often used:

Quote
PHK
PLB


Get program bank into data bank register. Often used.

Quote
PHK
PLA


Get program bank into A, for whatever purpose.

PHP - Push processor flags onto stack (nvmxdizc, not e) (8-bit)
PLP - Pull value from stack into processor flags (nvmxdizc, not e) (8-bit)

PEI - Push 16-bit value from the specified address.
Example:

Quote
PEI ($19)


This will push the 16-bit value from $19 and $1A onto the stack. The effect is comparable with a 16-bit A:

Quote
LDA $19
PHA


Just shorter.

PEA - Push 16-bit direct value
PER - Push 16-bit direct value, relative.

PEA $8000 will push the value $8000 onto the stack. A bit like:

Quote
LDA #$8000
PHA


But shorter.
PER does the same thing, but it's noted like branch commands - it's PC relative, which means that the value that is pushed depends on where the code is currently running. Normally you will want to use PEA, since PER is kind of redundant, like BRL is to JMP.

Stack pointer

Okay, so now you know these stack commands. Now, where is this stack, exactly?
In SMW, it starts at $01FF. The funny thing is that the stack actually grows down. So instead of going to $0200 after a push, it goes back to $01FE. And so on.

The stack pointer decreases by one at every push, and increases by one at every pull. The stack pointer actually points at the address to store the next push to. You can also change the stack pointer manually, with the commands TCS and TXS, and you can get the stack pointer with TSC and TSX. How do you do this? It's actually the same as with TCD and TDC - except there's also a command for X now.

Quote
LDA #$01FF
TCS


Change stack pointer to #$01FF.

Quote
LDX #$01FF
TXS


Same thing, just by using X.

About the pushing and pulling decrement and increment:

Quote

LDA #$01FF
TCS
SEP #$20
LDA #$02
PHA


An 8-bit value is pushed now, so the stack pointer = #$01FE.

Quote

LDA #$01FF
TCS
LDA #$0003
PHA
SEP #$20
LDA #$02
PHA


First a 16-bit value is pushed, then an 8-bit value, so the stack pointer = #$01FC.

Now it also should make sense why the push/pull ratio should be in balance.

So, basically, pushing is actually storing a value, and pulling is loading a value.
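
Here is a small Python model of that, so you can watch the stack pointer move. It is only a sketch: the class name Stack and its methods are made up, and RAM is modelled as a plain bytearray. A 16-bit push stores the high byte first, so the value ends up in memory low byte first.

```python
class Stack:
    def __init__(self, pointer: int = 0x01FF):
        self.memory = bytearray(0x2000)  # $0000-$1FFF
        self.pointer = pointer           # address the next push stores to

    def push8(self, value: int) -> None:
        self.memory[self.pointer] = value & 0xFF  # store...
        self.pointer -= 1                         # ...then move down

    def push16(self, value: int) -> None:
        self.push8(value >> 8)    # high byte first
        self.push8(value & 0xFF)  # then low byte

    def pull8(self) -> int:
        self.pointer += 1                 # move up...
        return self.memory[self.pointer]  # ...then load

    def pull16(self) -> int:
        low = self.pull8()
        high = self.pull8()
        return low | (high << 8)


stack = Stack()
stack.push16(0x0003)  # 16-bit PHA
stack.push8(0x02)     # 8-bit PHA
print(format(stack.pointer, "04X"))
```

Pulling in the reverse order (8-bit first, then 16-bit) gives the values back and returns the pointer to $01FF, which is exactly the balance the BIG FAT NOTE is about.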

Funny, useless example:

Quote
REP #$30
TSX
LDA #$0EFE ; Status bar.
TCS
SEP #$20
LDA.b #$00
PHA
LDA.b #$01
PHA
LDA.b #$02
PHA
LDA.b #$03
PHA
TXS
SEP #$30



Notice the statusbar? Now look at the code. Does it make sense?

---------------------------------------

1.10 - TRANSFERRING

This is a relatively easy chapter. Transfer register to register.

TAX - A -> X
TAY - A -> Y
TCD - 16-bit A -> DP
TCS - 16-bit A -> Stack Pointer
TDC - DP -> 16-bit A
TSC - Stack Pointer -> 16-bit A
TSX - Stack Pointer -> 16-bit X
TXA - X -> A
TXS - 16-bit X -> Stack Pointer
TXY - X -> Y
TYA - Y -> A
TYX - Y -> X

... do I need to explain more?

Alright, Basic workshop DONE!

------------------------------------------------
==================================================
2. SEMI-ADVANCED WORKSHOP
==================================================

2.1 - VECTORS, H/V-BLANKING, INTERRUPTS

Page about the SNES memory map, and vectors. Use it.
I also highly advise you to download regs.txt for this chapter!

So, anyway, vectors. What are they used for? Well, each one holds the address of the routine to run when a certain interrupt occurs. These vectors are all located at $00FFE0-$00FFFF, and since each one only holds a 16-bit address, the routine it points to is always somewhere in bank 00.

There are the following interrupts:
RESET
IRQ
NMI
BRK/COP/ABORT

But first, this. There are different vectors, depending on whether you are in emulation or native mode.
In Native mode, all these vectors are at $FFE0-$FFEF, while in Emulation mode, they're at $FFF0-$FFFF.

So anyway, to start at RESET.

RESET interrupt - when this interrupt occurs (that is, at Power On, or Reset), the game basically starts up. This address, in Emulation mode, is located at $FFFC-$FFFD. The matching slot for Native mode ($FFEC-$FFED) is unused, since the CPU always starts up in Emulation mode.
In SMW, this address points to $8000. This means, the game starts at $008000. But you can set it to make it start somewhere else in bank 00, as well. This routine is supposed to loop forever, until power is turned off / the game resets.

IRQ interrupt - with this interrupt, you can basically 'cut' a screen in multiple parts, having, for example, multiple background modes on screen at once. This can be done at any scanline or pixel.

Ersanio made a nice picture of how SMW makes use of it:


Smalls used it like here:


IRQ is not limited to only one per screen - technically, IRQ can be executed on any scanline, making it very similar to HDMA. However, IRQ is slower, but it's less limited. With IRQ, you can do almost anything - affect scrolling, BG mode, brightness, etc. etc. These options are set during H-blank.
The IRQ handler's vector in Emulation Mode is located at $FFFE-$FFFF, while in Native mode, it's located at $FFEE-$FFEF.
This interrupt is maskable by $4200 and the interrupt disable flag - I'll explain $4200 at NMI.
Bit 7 of $4211 is set when IRQ is fired.
$4209 and $420A set at which V count IRQ should occur, while $4207 and $4208 set at which H count IRQ should occur - in SMW, only $4209 and $420A are actually used for this.

NMI interrupt - This interrupt is fired during V-blank and is mainly used for changing PPU data (SNES registers $2100-$21FF), and PPU mirrors (like $3E in SMW, for $2105, and $40 for $2131). What this PPU is -> see 2.3 - VARIOUS PPU REGS.
It's not necessary, but preferable to keep NMI short, and not make it execute after V-blank ended.
The NMI handler's vector in Emulation Mode is located at $FFFA-$FFFB, while in Native mode, it's located at $FFEA-$FFEB.
NMI is not maskable with the interrupt disable flag! You will have to use $4200 for this.
Bit 7 of $4210 is set when NMI is fired.

About $4200 :
It's in this format:
n-yx---a

n = NMI enable
x/y = IRQ enable
a = Auto-Joypad Read Enable

If n is clear, no NMI will occur. If both x and y are clear, no IRQ will occur, although the same thing can be achieved by setting the i flag in the processor flags.
More info about this -> regs.txt.
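
As a quick illustration of the n-yx---a format, here is a Python sketch that picks a $4200 value apart. The function name and the dictionary keys are made up for this example; y is taken to be the V count IRQ bit and x the H count IRQ bit.

```python
def decode_nmitimen(value):
    """Split a $4200 value (n-yx---a) into its documented bits."""
    return {
        "nmi": bool(value & 0x80),          # n
        "v_irq": bool(value & 0x20),        # y
        "h_irq": bool(value & 0x10),        # x
        "auto_joypad": bool(value & 0x01),  # a
    }


print(decode_nmitimen(0x81))
```

With #$81, NMI and auto-joypad read are on and both IRQ bits are off.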

BRK and COP - Interrupts that occur when these opcodes are met. Not necessarily important.

ABORT - Not sure when exactly it's activated, perhaps by a hardware signal... aborts current instruction.

Finally - what all interrupts have in common, is that they push:
- The 24-bit address to return to (only 16-bit in Emulation mode)
- The processor flags.

RTI pulls these back again. This means that you ALWAYS end an interrupt with RTI (except for RESET, which should never end at all).
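
To tie the vector addresses together, here is a rough Python sketch that reads a vector out of a ROM image. It assumes a headerless LoROM file (so bank 00 $8000-$FFFF sits at file offset $0000-$7FFF); the COP, BRK and ABORT slots come from the standard 65816 layout rather than from the workshop, and the ROM below is fake data, not SMW.

```python
VECTORS = {
    # name: (Native mode address, Emulation mode address); None = no vector
    "COP":   (0xFFE4, 0xFFF4),
    "BRK":   (0xFFE6, 0xFFFE),   # Emulation mode shares the IRQ vector
    "ABORT": (0xFFE8, 0xFFF8),
    "NMI":   (0xFFEA, 0xFFFA),
    "RESET": (None,   0xFFFC),
    "IRQ":   (0xFFEE, 0xFFFE),
}


def read_vector(rom, name, native):
    """Return the 16-bit address stored in the given vector."""
    addr = VECTORS[name][0 if native else 1]
    if addr is None:
        raise ValueError(f"{name} has no vector in this mode")
    offset = addr - 0x8000                        # assumption: headerless LoROM
    return rom[offset] | (rom[offset + 1] << 8)   # stored low byte first


rom = bytearray(0x8000)              # fake 32 KB ROM for the example
rom[0x7FFC:0x7FFE] = b"\x00\x80"     # reset vector = $8000
print(f"${read_vector(rom, 'RESET', native=False):04X}")   # $8000
```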

------------------------------------
2.2 - BITWISE OPERATIONS

This paragraph will cover the following commands:
AND
ORA
EOR
TRB
TSB
ASL
LSR
ROL
ROR
BIT

... plenty of commands. Note: All of these are only compatible with A. There's no X or Y equivalents for them. So TAX TAY TXA TYA your way out here!

AND

Okay, let's start with AND. AND compares the value in A with the specified value, bit by bit. This means:

Quote
LDA #$30
AND #$10


You'd rather note this down in binary. So:

Quote
LDA #%00110000
AND #%00010000


Bit 0 in A is compared with bit 0 of the AND parameter, bit 1 in A with bit 1 of the parameter, bit 2 with bit 2, etc. Every pair of bits is handled on its own, and in the same way.

So anyway, AND compares these bits. And then? Well, AND outputs a resulting bit for them. If:

Bit in A = 0 (clear)
Bit in AND = 0 (clear)
Result = 0 (clear)

Bit in A = 1 (set)
Bit in AND = 0 (clear)
Result = 0 (clear)

Bit in A = 0 (clear)
Bit in AND = 1 (set)
Result = 0 (clear)

Bit in A = 1 (set)
Bit in AND = 1 (set)
Result = 1 (set)

So basically, you could note:

0 AND 0 = 0
1 AND 0 = 0
0 AND 1 = 0
1 AND 1 = 1

Or say: If either of the two bits is clear, the resulting bit will be clear.

So to speak:

Quote
LDA #%00110000
AND #%00010000


Compare these bit-by-bit, and the result:

Quote
#%00010000 ; Which is #$10


Goes into A.

So:

Quote
LDA #$30
AND #$10


Results in #$10.

Now, how is this useful? Well, let's take a controller address ($15 in SMW). Say, you want to check if Select is pressed. How do you do this?

Quote
LDA $15
AND #$20 ; Let's say, Select = #$20.
BNE IsPressed


So basically, this will ignore if any other button is pressed. It just checks whether Select is pressed. Well, AND #$20 now clears all bits out, except for the Select bit. So the value can either be #$00, or #$20. If Select is not pressed, result = #$00. So the zero flag is set. As you learned earlier, BNE = Branch if zero flag not set. So, you won't branch.
If Select is pressed, result = #$20. So the zero flag is not set. And so, you will branch. Convenient, no? So, why is this better than using CMP?

Quote
LDA $15
CMP #$20
BEQ IsPressed


CMP actually checks if only Select is pressed. This means, that, if you were pressing Select, but also for example Start (#$10, accumulating the value to #$30), you wouldn't branch. Even though Select was pressed.
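
The difference is easy to see in a few lines of Python. This is only a sketch of the two checks; the constant and function names are invented for the example, and the button values are the ones assumed above (Select = #$20, Start = #$10).

```python
SELECT = 0x20
START = 0x10


def select_pressed_and(buttons):
    """Like LDA $15 : AND #$20 : BNE - ignores every other button."""
    return (buttons & SELECT) != 0


def select_pressed_cmp(buttons):
    """Like LDA $15 : CMP #$20 : BEQ - true only if Select is the ONLY button."""
    return buttons == SELECT


both = SELECT | START                 # #$30
print(select_pressed_and(both))       # True
print(select_pressed_cmp(both))       # False
```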

IMPORTANT: AND does affect the Accumulator! So do ORA and EOR, and the shifts and rotates when they are used on A. BIT, TRB and TSB leave A alone.

You can also do AND $00, etc. All addressing modes compatible with LDA, are compatible with AND.

ORA

ORA works exactly like AND, except for the following:

0 ORA 0 = 0
1 ORA 0 = 1
0 ORA 1 = 1
1 ORA 1 = 1

So, you could say, if either of the two bits is set, the resulting bit will be set.

So, when exactly is this useful? Well, you could do this to set a bit, regardless of whether it was set before, or not.

Example:
Quote
LDA $12
ORA #$08 ; Set bit 3
STA $12


This is more reliable than LDA $12 CLC ADC #$08 STA $12 if you want bit 3 to be set: if bit 3 was already set, the ADC would clear it again and carry over into bit 4.
(Also, there's even a better way to run this code, but that aside...)
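
A small Python sketch of why ORA is the safer way to set a bit. Both functions are hypothetical stand-ins for the 8-bit ORA #$08 and CLC / ADC #$08 sequences.

```python
def set_bit3_ora(value):
    """Model of ORA #$08 on an 8-bit value."""
    return (value | 0x08) & 0xFF


def set_bit3_adc(value):
    """Model of CLC : ADC #$08 on an 8-bit value."""
    return (value + 0x08) & 0xFF


for start in (0x00, 0x08):
    print(f"#${start:02X}: ORA -> #${set_bit3_ora(start):02X}, "
          f"ADC -> #${set_bit3_adc(start):02X}")
```

Starting from #$08, ORA leaves #$08 while ADC gives #$10, so bit 3 ends up clear.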

ORA is typically used to check whether several RAM addresses are all zero, or not.

Example:
Quote
LDA $00
ORA $01
ORA $02
ORA $03
BNE OneOfThemIsNotZero


This will branch if not all of them are zero.

EOR

This is a fun one. It works like AND and ORA, except:

0 EOR 0 = 0
1 EOR 0 = 1
0 EOR 1 = 1
1 EOR 1 = 0

That means, you could say, the resulting bit = 1 when the compared bits were not equal, while the resulting bit = 0 when the compared bits were equal.

Classical example:

Quote
LDA #$02
EOR #$FF ; This will flip the accumulator's value around


If EOR #$FF is used on any value, it 'flips around'. What does that mean? Well, EOR #$FF on the following:

#$00 = #$FF
#$01 = #$FE
#$02 = #$FD
...
#$FC = #$03
#$FD = #$02
#$FE = #$01
#$FF = #$00

This can be relatively useful for, for example, reversing the direction of a speed value (add 1 afterwards, with INC A, if you need the exact negative).
EOR #$01 will flip bit 0, as here:

Quote
LDA $13D4 ; Pause flag in SMW
EOR #$01 ; Flip
STA $13D4


It's pretty convenient.
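
Here is the same idea as a Python sketch, limited to 8-bit values. `eor` and `negate` are names made up for the example; `negate` shows the usual EOR #$FF plus INC A trick for getting the exact negative.

```python
def eor(a, operand):
    """Model of an 8-bit EOR."""
    return (a ^ operand) & 0xFF


def negate(a):
    """EOR #$FF followed by INC A: the two's complement negative."""
    return (eor(a, 0xFF) + 1) & 0xFF


print(f"#${eor(0x02, 0xFF):02X}")    # #$FD
print(f"#${negate(0x10):02X}")       # #$F0, which is -$10 as a signed byte

flag = 0x00
flag = eor(flag, 0x01)               # 0 -> 1
flag = eor(flag, 0x01)               # 1 -> 0 again
```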

TRB

TRB is not often useful, but it can be slightly quicker. Say, you want to clear bit 7 of $12. How would you do this? Well, with your current knowledge, you would use:

Quote
LDA $12
AND #$7F ; Keep all bits, except bit 7. Clear that one out.
STA $12


TRB is slightly quicker. You have to specify the bit you want to clear in A, and then use TRB $Address. So in this case:

Quote
LDA #$80
TRB $12


This will do the same thing, just quicker.

TSB

Similar to TRB, except it will set the specified bits. I think you can guess where this is going...

Quote
LDA $12
ORA #$80 ; Set bit 7.
STA $12


With TSB:
Quote
LDA #$80
TSB $12


Easy, no?
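
For comparison, a Python sketch of what TRB and TSB do to the memory value. The function names are invented; the zero flag result (set when A AND the old memory value is zero) is included because that is how the 65816 defines it, even though the workshop did not cover it.

```python
def trb(a, mem):
    """Model of TRB: clear in memory every bit that is set in A.
    Returns (new memory value, zero flag)."""
    return mem & ~a & 0xFF, (a & mem) == 0


def tsb(a, mem):
    """Model of TSB: set in memory every bit that is set in A.
    Returns (new memory value, zero flag)."""
    return (mem | a) & 0xFF, (a & mem) == 0


new_mem, z = trb(0x80, 0xFF)
print(f"#${new_mem:02X}", z)     # #$7F False
new_mem, z = tsb(0x80, 0x01)
print(f"#${new_mem:02X}", z)     # #$81 True
```

A itself is not changed by either instruction.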

ASL

This will shift all bits of the Accumulator one position to the left. That comes down to a multiplication by 2. Carry is also affected by ASL, and the same goes for LSR, ROL and ROR.

Example:

Quote
LDA #%01000000 ; #$40
ASL A


Result = #%10000000 (#$80)

So, what if you ASL A another time?
Well, Accumulator will become #%00000000 (#$00), but the carry bit will be set (the bit will go into carry). However, if you were to ASL again, the Accumulator remains #%00000000, and the carry bit will be cleared. So the bit 'vanishes'.

You can also ASL $Address, etc.

LSR

Same as ASL, but then the other way around - to the right.

Quote
LDA #%00000010 ; (#$02)
LSR A


Will result in #%00000001 (#$01).
Another time: #%00000000, carry set, and another time, same for A, and carry clear.

ROL

ROL is exactly like ASL, except carry bit will go into bit 0 when ROL is used.
So:

Quote
CLC ; Start with carry clear
LDA #%10000000
ROL A


= Accumulator #%00000000, carry set.

ROL A again = Accumulator #%00000001, carry clear.
So here, it basically wraps around.

Quote
LDA #%00000000
SEC
ROL A


Result is #%00000001, and carry clear, here too.

ROR

Same as LSR, except carry bit will go into bit 7 when ROR is used. I don't really think I have to explain that anymore.
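
All four shifts fit in a few lines of Python. This is a sketch for 8-bit A only; each hypothetical function takes the accumulator and the carry going in, and returns the new accumulator and the carry coming out.

```python
def asl(a, carry):
    return (a << 1) & 0xFF, bool(a & 0x80)                  # bit 7 falls into carry


def lsr(a, carry):
    return a >> 1, bool(a & 0x01)                           # bit 0 falls into carry


def rol(a, carry):
    return ((a << 1) | int(carry)) & 0xFF, bool(a & 0x80)   # old carry enters bit 0


def ror(a, carry):
    return (a >> 1) | (int(carry) << 7), bool(a & 0x01)     # old carry enters bit 7


a, c = 0x80, False
a, c = rol(a, c)
print(f"{a:08b}", c)    # 00000000 True
a, c = rol(a, c)
print(f"{a:08b}", c)    # 00000001 False
```

ASL and LSR ignore the carry going in, which is why the bit "vanishes" there but wraps around with ROL and ROR.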

BIT

BIT sets a few processor flags when used - it's a bit like a LDA $Address, but it does not affect A, and it affects a few different processor flags: n, v and z.

So:

Quote
BIT $00


If bit 7 of the value at $00 is set, n is set (negative)
If bit 6 of the value at $00 is set, v is set (overflow)
If A AND the value at $00 gives zero, z is set (zero flag)

BIT #$value also exists. This one only affects the zero flag, not overflow or negative, so it's mostly a way to test bits in A without changing A.

Normally you won't need this command.
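
A Python sketch of the flags BIT $Address produces, for 8-bit A. The `bit` function is made up for this example; it takes n and v straight from the memory value and z from A AND that value, which is how the 65816 defines the instruction.

```python
def bit(a, mem):
    """Model of BIT $Address with 8-bit A; A itself is not changed."""
    return {
        "n": bool(mem & 0x80),    # copy of bit 7 of the memory value
        "v": bool(mem & 0x40),    # copy of bit 6 of the memory value
        "z": (a & mem) == 0,      # set when A and memory share no set bits
    }


print(bit(0x01, 0xC0))   # {'n': True, 'v': True, 'z': True}
print(bit(0x40, 0x40))   # {'n': False, 'v': True, 'z': False}
```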

---------------------------------

2.3 - VARIOUS PPU REGS

This is where regs.txt will be quite handy to you.
We explained quite a few registers:

$2100 - Brightness
$2105 - BG mode
$210D-$2114 - Layer XY positions
$2115-$2119 - VRAM address + write (not read)
$211A-$2120 - Various Mode 7 parameters
$2121-$2122 - CGRAM write
$2131 - CGADSUB
$2132 - Fixed Color Data
$2133 - Screen Mode / Video Select

$2100 - Brightness
f---bbbb


f = Force blank
b = brightness bit

This is, without a doubt, the easiest register. If the F bit is set, force blank is on. During force blank, you can write to VRAM with no problem.

The - bits don't do anything. You may clear or set them to your wishes.

The b bits are for the brightness. If force blank is on, brightness won't be seen, since everything is black.
0F is the maximum brightness (normally used in SMW levels, not all of them though), 00 the minimum (basically the same as force blank, but the screen won't actually be off).
SMW's mirror is $0DAE.

$2105 - BG Mode.
DCBAemmm


DCBA - One tile for a layer is either 8x8 or 16x16. A = Layer 1, B = Layer 2, C = Layer 3, D = Layer 4.

e - Layer 3 priority, only applies to Mode 1. This causes layer 3 to go above EVERYTHING! Mode 1 is the normal mode in SMW levels (not all of them use it, though, for example, the boss battles. But the status bar is always Mode 1, and on layer 3.)

mmm - Background mode.

There are 8 possibilities, so Mode 0 up to Mode 7.

I'll explain them briefly. Note: Sprites are NOT affected here!

Mode 0

Four layers, each 4 colours per palette. Layer 1 = palettes 0 and 1, layer 2 = palettes 2 and 3, layer 3 = palettes 4 and 5, layer 4 = palettes 6 and 7.

Mode 1

Three layers. BG1 and BG2 = 16 colours per palette, BG3 = 4 colours per palette. BG1 and BG2 use palettes 0-7, BG3 uses palettes 0-1.
Normally used in SMW.

Mode 2

Two layers. BG1 and BG2 = 16 colours per palette, use palettes 0-7. Offset mode can be used here, allowing tiles to scroll independently.

Mode 3

Two layers. BG1 = 256 colours (palettes 0-F), BG2 = 16 colours per palette (0-7). Could be useful for titlescreens.

Mode 4

Two layers. BG1 = 256 colours (palettes 0-F), BG2 = 4 colours per palette (0-1). And offset mode.

Mode 5

Two layers. BG1 = 16 colours per palette (0-7), BG2 = 4 colours per palette (0-1). And pseudo hi resolution (half-pixels).

Mode 6

One layer. BG1 = 16 colours per palette (0-7). Pseudo hi resolution + offset mode.

Mode 7

One, or two layers. BG1 = 256 colours (0-F), BG2 (if used) = 128 colours + priority bit. In mode 7, one can stretch and rotate the layer, even per scanline, which would enable creating pseudo 3D environments. More about this in $211A-$2120.

SMW uses $3E as a mirror.
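
To make the DCBAemmm layout concrete, here is a Python sketch that decodes a $2105 value. The function name and the keys are invented for the example; a set tile size bit is taken to mean 16x16.

```python
def decode_bgmode(value):
    """Split a $2105 value (DCBAemmm) into mode, priority bit and tile sizes."""
    return {
        "mode": value & 0x07,                    # mmm
        "bg3_priority": bool(value & 0x08),      # e
        "tile_16x16": {                          # A = bit 4 ... D = bit 7
            f"BG{n}": bool(value & (0x10 << (n - 1))) for n in range(1, 5)
        },
    }


print(decode_bgmode(0x09))
```

#$09 decodes to Mode 1 with the layer 3 priority bit set and 8x8 tiles on every layer.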

$210D-$2114
------xx xxxxxxxx : Modes 0-6
---mmmmm mmmmmmmm : Mode 7


These registers are used for Layer XY scrolling positions.
These are typically write twice registers - you need to write twice to them to get a good effect. Like SMW does it:

Quote
LDA $1A ; Low byte
STA $210D ; Layer 1 X scrolling
LDA $1B ; High byte
STA $210D ; Layer 1 X scrolling


$210D - Layer 1 X
$210E - Layer 1 Y
$210F - Layer 2 X
$2110 - Layer 2 Y
$2111 - Layer 3 X
$2112 - Layer 3 Y
$2113 - Layer 4 X
$2114 - Layer 4 Y

SMW uses $1A-$25 for a mirror for $210D-$2112 (note: $1A-$25 are 16-bit)

2115 wb++?- VMAIN - Video Port Control
i---mmii
2116 wl++?- VMADDL - VRAM Address low byte
2117 wh++?- VMADDH - VRAM Address high byte
aaaaaaaa aaaaaaaa
2118 wl++-- VMDATAL - VRAM Data Write low byte
2119 wh++-- VMDATAH - VRAM Data Write high byte
xxxxxxxx xxxxxxxx


First though - what is the VRAM? The VRAM holds the GFX for layers and sprites, and it holds the tilemaps for layers.

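So, by changing the VRAM, you can change the tilemap, and the GFX to use. You should see the VRAM as one full bank of 64 KB. Keep in mind that it is addressed in 16-bit words, though: the addresses you put in $2116/7 range from $0000-$7FFF, and each one covers two bytes.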

So, let's start. But with $2116-$2117 first.
With these two, you specify the 16-bit VRAM address to write to/read from. This address increments by a certain number at every write/read. The high byte is $2117, the low byte $2116.
So:

REP #$20 ; 16-bit A
LDA #$5800
STA $2116

The VRAM address to write to / read from = $5800.

Now, that should be clear. Let's do $2115.

2115 wb++?- VMAIN - Video Port Control
i---mmii

The first i explains when $2116/7 should be incremented - when $2118 is written to (8-bit input), or $2119 (16-bit input, both $2118 and $2119 are actually written to).

The mm bits select an address remapping mode (regs.txt has the details) - they're probably not too important though, as they're rarely used, or at least in SMW.
The other ii bits - if they're both clear, $2116/7 increments by 1. If the lower i bit is set, $2116/7 increment by 32 (used by stripe image in vertical direction mode). If the higher i bit, or both i bits are set, $2116/7 increment by 128 (could be used by stripe image in vertical direction mode, for BG Mode 7)

$2118/9 are used to be written to. They write to the VRAM address specified in $2116/7. If only $2118 is written to (8-bit value), and $2115 bit 7 is clear, VRAM address increments there. Example:

Quote
STZ $2115 ; Increment by write at $2118, by one.
REP #$20
LDA #$5800 ;
STA $2116 ; VRAM address = $5800
SEP #$20
LDA #$20 ; Write #$20 to $5800
STA $2118 ; Increment $2116/7 by one
LDA #$30 ; Write #$30 to $5801
STA $2118 ; Increment $2116/7 by one
LDA #$40 ; Write #$40 to $5802
STA $2118 ; Etc.


A word write is not too tough, either:

Quote
LDA #$80 ; Increment by write at $2119, by one
STA $2115
REP #$20
LDA #$5800
STA $2116 ; VRAM address = $5800
LDA #$38A0 ; Write #$38A0 to $5800
STA $2118 ; Low byte goes to $2118, high byte to $2119.
; The write to $2119 increments $2116/7, so they now point to $5801
LDA #$38A1 ; Write #$38A1 to $5801
STA $2118 ; Same story here.
; $2116/7 now point to $5802
; etc.


Makes sense, no?
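
If not, this Python sketch of the VRAM port may help. It is a simplified model written for this post: the class and its names are invented, the mm remapping bits are ignored, and $2116/7 is treated as a word address (as the correction further down the thread points out), so word $5800 lives at bytes $B000-$B001.

```python
class VramPort:
    """Toy model of $2115-$2119 (no address remapping)."""

    INCREMENTS = {0: 1, 1: 32, 2: 128, 3: 128}   # lower two bits of $2115

    def __init__(self):
        self.vram = bytearray(0x10000)   # 64 KB = 32K words
        self.vmain = 0                   # $2115
        self.addr = 0                    # $2116/7, a WORD address

    def write(self, reg, value):
        if reg == 0x2115:
            self.vmain = value
        elif reg == 0x2116:
            self.addr = (self.addr & 0xFF00) | value
        elif reg == 0x2117:
            self.addr = (self.addr & 0x00FF) | (value << 8)
        elif reg in (0x2118, 0x2119):
            high = reg == 0x2119
            byte_addr = ((self.addr & 0x7FFF) << 1) | int(high)
            self.vram[byte_addr] = value
            # Bit 7 of $2115 picks which of the two writes bumps the address.
            if high == bool(self.vmain & 0x80):
                step = self.INCREMENTS[self.vmain & 0x03]
                self.addr = (self.addr + step) & 0xFFFF


port = VramPort()
port.write(0x2115, 0x80)                 # increment after $2119, by one
port.write(0x2116, 0x00)
port.write(0x2117, 0x58)                 # VRAM address = $5800
for word in (0x38A0, 0x38A1):            # a 16-bit STA $2118 hits both registers
    port.write(0x2118, word & 0xFF)
    port.write(0x2119, word >> 8)
print(f"${port.addr:04X}")               # $5802
```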

211a wb++?- M7SEL - Mode 7 Settings
rc----yx
211b ww+++- M7A - Mode 7 Matrix A (also used with $2134/6)
211c ww+++- M7B - Mode 7 Matrix B (also used with $2134/6)
211d ww+++- M7C - Mode 7 Matrix C
211e ww+++- M7D - Mode 7 Matrix D
aaaaaaaa aaaaaaaa
211f ww+++- M7X - Mode 7 Center X
2120 ww+++- M7Y - Mode 7 Center Y
---xxxxx xxxxxxxx


These are several Mode 7 addresses. Let me explain them one by one:

$211A - The r bit is for telling the game whether to repeat the Mode 7 playing field over and over, or to replace it with nothing, or tile 0. If this bit is clear, it will repeat over and over, creating lulz:




So, you want to disable this. Well, set the r bit. And then? Well, the c bit is pretty important too. By making it clear, the stuff around the playing field will be invisible. If c is set, tile 0 will repeat, over and over.
And then there's y and x. When y is set, the playing field will be flipped vertically... and... well, x should be obvious.

$211B and $211E - Scale X and Scale Y.
These can stretch the mode 7 layer, X wise and Y wise... These are write twice registers.

$211C and $211D - Shear X and Shear Y.
These should rotate the mode 7 layer. In order to rotate, shear X and Y should be the opposite of each other. So, if X = #$F000, Y = #$1000. These are write twice registers.

$211F and $2120 - Center X and Y.
These indicate the center XY position of the Mode 7 layer, where all rotation and scaling originates from.

2121 wb+++- CGADD - CGRAM Address
cccccccc
2122 ww+++- CGDATA - CGRAM Data write
-bbbbbgg gggrrrrr


This enables you to change colours of a palette. $2121 specifies the colour, $2122 specifies the RGB value for this colour ($2122 is write twice).

Example:
Quote
LDA #$19 ; Change colour 9 of palette 1
STA $2121
LDA #$F0 ; Low byte of RGB value, gggrrrrr
STA $2122
LDA #$70 ; High byte of RGB value, -bbbbbgg
STA $2122
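
Working out the two bytes by hand gets old quickly, so here is a Python sketch that packs an RGB value into the -bbbbbgg gggrrrrr format. Both helper names are made up; r, g and b are 0-31 each.

```python
def cgram_index(palette, colour):
    """Value for $2121: 16 colours per palette."""
    return palette * 16 + colour


def pack_bgr15(r, g, b):
    """Return (low byte, high byte), in the order they go to $2122."""
    word = (b << 10) | (g << 5) | r
    return word & 0xFF, word >> 8


print(f"#${cgram_index(1, 9):02X}")            # #$19
low, high = pack_bgr15(r=16, g=7, b=28)
print(f"#${low:02X} #${high:02X}")             # #$F0 #$70
```

So the #$F0 / #$70 pair from the example is red 16, green 7, blue 28.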


2131 wb+++- CGADSUB - Color math designation
shbo4321


CGADSUB - which layers to apply colour math to. The fixed colour itself is specified in $2132.

1 through 4 - Layers 1 through 4
o - Object layer (sprite palettes C-F)
b - Backdrop
h - Half-colour mode (Add RGB of colour data with RGB of original colour, and divide by 2, creating a half-colour)
s - Subtract colours when set, add colours when clear.

This basically specifies on which layers the fixed colour data should be added. SMW's mirror for it is $40.

2132 wb+++- COLDATA - Fixed Color Data
bgrccccc


You may wish to write thrice to this one to get the wanted result. r, g or b specifies on which colour the intensity should have effect. Example:

Quote
LDA #$3F
STA $2132
LDA #$5F
STA $2132
LDA #$9F
STA $2132


White colour.

Quote
LDA #$20
STA $2132
LDA #$40
STA $2132
LDA #$80
STA $2132


Black colour.
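
The three writes follow a simple pattern, sketched here in Python. `coldata_writes` is a hypothetical helper; each intensity is 0-31 and gets its channel bit (r = #$20, g = #$40, b = #$80) added on top.

```python
def coldata_writes(r, g, b):
    """Return the three values to store to $2132, one per colour channel."""
    return [0x20 | r, 0x40 | g, 0x80 | b]


print([f"#${v:02X}" for v in coldata_writes(31, 31, 31)])   # white
print([f"#${v:02X}" for v in coldata_writes(0, 0, 0)])      # black
```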

--------------------------------

2.4 DMA AND HDMA

DMA is used to quickly transfer data to PPU regs.
HDMA does that too, but per scanline.

You should know about the following:

$420B - DMA Channel Enable
$420C - HDMA Channel Enable

All channels are between $4300-$437F, there are 8 channels, so $10 addresses for each one of them. DMA and HDMA share their channels.
These addresses ($420B and $420C) specify on which channels to enable DMA or HDMA on. Simple format:

76543210 (channel).

So, now about the (H)DMA channels.
Let's specify the addressing with $43x0-$43xF now, x being channel number.

$43x0 - DMA Control.
da-ifttt

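d = direction. If clear - ROM/RAM -> PPU
If set - PPU -> RAM

a = HDMA addressing mode (direct or indirect table), i = decrement the source address instead of incrementing it (DMA only), f = keep the source address fixed (DMA only). But they're not the most important ones.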

ttt - Transfer mode. If:
000 - One byte, to one register (p)
001 - Two registers, one byte per register (p, p+1)
010 - One register, write twice (p, p)
011 - Two registers, write twice (p, p, p+1, p+1)
100 - Four registers, write once (p, p+1, p+2, p+3)
101 - Two registers, write twice alternate (p, p+1, p, p+1)
110 - One register, write twice (p, p)
111 - Two register, write twice (p, p, p+1, p+1)

This affects the (H)DMA table a lot.
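
To see what the ttt bits mean in practice, here is a Python sketch that lists which $21xx registers successive bytes go to. The table is a direct copy of the list above; the names are invented for the example.

```python
TRANSFER_PATTERNS = {
    0: [0],
    1: [0, 1],
    2: [0, 0],
    3: [0, 0, 1, 1],
    4: [0, 1, 2, 3],
    5: [0, 1, 0, 1],
    6: [0, 0],
    7: [0, 0, 1, 1],
}


def dma_targets(mode, ppu_reg, count):
    """Registers hit by the first `count` bytes, for $43x0 mode and $43x1 value."""
    pattern = TRANSFER_PATTERNS[mode & 0x07]
    return [0x2100 + ppu_reg + pattern[i % len(pattern)] for i in range(count)]


print([f"${r:04X}" for r in dma_targets(1, 0x18, 4)])
# ['$2118', '$2119', '$2118', '$2119']
```

Mode 1 with $43x1 = #$18 is the classic setup for a VRAM transfer, and mode 2 with #$22 would suit the write twice register $2122.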

$43x1 - PPU Address
Write to or read from $21xx. Specify xx at $43x1.

$43x2-$43x4 - Source address.
Here you place the address of the table to read from.

$43x5-$43x6 - Amount of bytes to transfer.
Only used in DMA. This specifies the amount of bytes to transfer.
But if you specify it with #$0000, it will actually write 65536 bytes.

Those are the only relevant addresses for most simple (H)DMA codes. So, how does it go? Well, here's an example:

Quote
STZ $4330 ; One byte, one reg
STZ $4331 ; $2100 - Brightness
REP #$20
LDA #HDMATable ; 16-bit address
STA $4332
PHK ; In case it's in the same bank, of course
PLY ; Get into Y (Y is assumed to be 8-bit here)
STY $4334 ; Bank byte
SEP #$20
LDA #$08 ; Channel 3
STA $0D9F ; SMW's mirror for $420C.
RTS


Makes sense, no?
So, as for a HDMA table. How is it formatted? Well, that depends on $43x0. But in this case, it's formatted as follows:

Quote
HDMATable:
db $05 ; 5 scanlines
db $0F ; Full brightness

db $04 ; 4 scanlines
db $0E ; A bit less

db $06 ; 6 scanlines
db $0D ; Even less

db $00 ; End here (0 scanlines)


Same goes for DMA tables, except there's no such thing as a scanline byte here. So it's simply values for the registers.
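
A Python sketch that expands the HDMA table above into one brightness value per scanline might make the format clearer. It only handles this simple case (mode 0, direct table, scanline counts below #$80); `expand_hdma_table` is a made-up name.

```python
def expand_hdma_table(table):
    """(scanline count, value) pairs, ended by a count of 0."""
    values = []
    i = 0
    while table[i] != 0:
        count, value = table[i], table[i + 1]
        if count & 0x80:
            raise ValueError("counts with bit 7 set work differently; not modelled")
        values.extend([value] * count)   # value stays active for `count` scanlines
        i += 2
    return values


table = bytes([0x05, 0x0F,    # 5 scanlines, full brightness
               0x04, 0x0E,    # 4 scanlines, a bit less
               0x06, 0x0D,    # 6 scanlines, even less
               0x00])         # end of table
lines = expand_hdma_table(table)
print(len(lines))             # 15
```

After those 15 scanlines the table has ended, so nothing more is written for the rest of the frame.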

----------------------------------

2.5 - BLOCK MOVES -> MVN, MVP

This is a bit like DMA, except it's quite a bit slower. But it can do RAM -> RAM. So, MVN/MVP is basically a bunch of LDAs and STAs, but faster. How does it go?

Well, first of all, you need a 16-bit A, X and Y. And the data bank reg gets changed by the move (to the destination bank), so you want to preserve it:

PHB
REP #$30
move code here
SEP #$30
PLB

X will contain the source address (16-bit), Y the destination address (16-bit), and A the amount of bytes to transfer, minus one.
MVN will now specify the source bank and the destination bank, so for example: MVN $177E. Source bank = $17, destination bank = $7E.
So, say:

PHB
REP #$30
LDA #$0003
LDX #Table
LDY #$0EF9
MVN $177E ; Let's say this code was in bank 17.
SEP #$30
PLB
RTS

Table:
db $01,$02,$03,$04

A will get decremented with MVN, while X and Y increment. All by 1.
So, #$01 goes into $7E0EF9, #$02 into $7E0EFA, #$03 into $7E0EFB and #$04 into $7E0EFC. When A = #$FFFF, the transfer stops.
So, A is always #$FFFF after this operation.

What's the difference between MVN and MVP? Well, by using MVP, X and Y also decrement, together with A. This means, that you will specify the end of the source and destination tables. So usually, you want to use MVN.
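
Here is a Python sketch of both block moves, to show why A holds the byte count minus one and where X and Y end up. Memory is a plain dictionary keyed by 24-bit address, and the table address $179000 is an arbitrary choice for the example.

```python
def block_move(mem, a, x, y, src_bank, dst_bank, step):
    """step = +1 models MVN, step = -1 models MVP. Returns the final (A, X, Y)."""
    while True:
        mem[(dst_bank << 16) | y] = mem.get((src_bank << 16) | x, 0)
        x = (x + step) & 0xFFFF
        y = (y + step) & 0xFFFF
        a = (a - 1) & 0xFFFF
        if a == 0xFFFF:              # A wrapped around: the move is done
            return a, x, y


def mvn(mem, a, x, y, src_bank, dst_bank):
    return block_move(mem, a, x, y, src_bank, dst_bank, +1)


def mvp(mem, a, x, y, src_bank, dst_bank):
    return block_move(mem, a, x, y, src_bank, dst_bank, -1)


mem = {0x179000 + i: v for i, v in enumerate([0x01, 0x02, 0x03, 0x04])}
a, x, y = mvn(mem, 0x0003, 0x9000, 0x0EF9, 0x17, 0x7E)
print(f"A=${a:04X} X=${x:04X} Y=${y:04X}")   # A=$FFFF X=$9004 Y=$0EFD
```

For MVP you would pass the LAST byte of each block ($9003 and $0EFC here) and get the same four bytes copied, back to front.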

---------------------------------

2.6 - MORE ADDRESSING MODES

There is a total of 25 addressing modes. Let's explain these:

2.1 Immediate
2.2 Direct Page
2.3 Absolute
2.4 Absolute Long
2.5 Implied
2.6 Accumulator
2.7 PC Relative
2.8 Stack (Push/Pull)
2.9 Stack (Absolute)
2.10 Stack (PC Relative Long)
2.11 Stack (RTL/RTS/RTI)
2.12 Stack/Interrupt
2.13 Block Move
2.14 DP Indexed,X
2.15 Absolute Indexed,X / Absolute Indexed, Y
2.16 Absolute Long Indexed, X
2.17 DP Indirect
2.18 DP Indirect Long
2.19 DP Indexed Indirect, X
2.20 DP Indirect Indexed, Y
2.21 DP Indirect Long Indexed, Y
2.22 Stack (DP Indirect)
2.23 Stack Relative
2.24 Stack Relative Indirect Indexed, Y
2.25 N/A

Immediate - LDA #$00
Direct page - LDA $00
Absolute - LDA $0000
Absolute long - LDA $000000
Implied - INX, SEC, etc.
Accumulator - INC A
PC relative - BEQ Label, BRA Label, etc.
Stack (Push/Pull) - PHA, PHB, PLA, PLB, etc.
Stack (Absolute) - PEA $0000
Stack (PC Relative Long) - PER $0000
Stack (RTS/RTL/RTI) - ... yeah.
Stack/Interrupt - BRK, COP
Block Move - MVN $0000, MVP $0000
DP Indexed,x - LDA $00,x
Absolute Indexed,x / ,y - LDA $0000,x / LDA $0000,y
Absolute Long Indexed,x - LDA $000000,x
Stack (DP Indirect) - PEI ($00)

Others not explained yet.
But let's.

DP Indirect - LDA ($00). This will load a value from the address formed by $00 and $01. This means, that if $00 was #$19, and $01 was #$00, this would basically be LDA $0019.

DP Indirect Long - LDA [$00]. Same as above, except it's 24-bit. So $02 also affects it - the bank byte. If:
$00 = #$19
$01 = #$00
$02 = #$7E
LDA [$00] = LDA $7E0019.

DP Indexed Indirect, X - LDA ($00,x).
I always say 'solve the stuff between the ()s, like you would in math'. And that's basically true - first, you have to add x to $00. So, if x = #$02, this would basically be LDA ($02). If $02 = #$00 and $03 = #$90, you would load from $9000. So, here, LDA ($00,x) = LDA $9000.

DP Indirect Indexed, Y - LDA ($00),y. Here the ,y is outside the ()s, so you solve the ()s first and add y afterwards. If $00 = #$00 and $01 = #$90, this would basically be LDA $9000,y.

DP Indirect Long Indexed, Y - LDA [$00],y. So, first you solve the stuff between []'s here. If:

$00 = #$19
$01 = #$00
$02 = #$7E

It'd be LDA $7E0019,y, basically. And the funny thing is: This command doesn't exist on its own. There is no LDA $xxxxxx,y, only LDA $xxxxxx,x. But you can simulate it with LDA [$xx],y.

Stack Relative - LDA $00,s.
This one is indexed by the stack pointer.
Actually, to load the last pushed value, you'd do LDA $01,s, and the one before that, LDA $02,s. You will load them, WITHOUT pulling them!

Stack Relative Indirect Indexed, Y - LDA ($00,s),y. Or actually, LDA ($01,s),y. Say, the last pushed value was #$19, and the one before that was #$00, it'd basically be:

LDA $0019,y.

This command isn't too hard to understand, just really useless.
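
For those who prefer to see the "solve the brackets first" rule written out, here is a Python sketch that works out which address each mode ends up reading. It assumes the direct page register is $0000, uses a dictionary as RAM, and takes the data bank as a parameter; all function names are invented.

```python
def read16(ram, addr):
    return ram.get(addr, 0) | (ram.get(addr + 1, 0) << 8)


def read24(ram, addr):
    return read16(ram, addr) | (ram.get(addr + 2, 0) << 16)


def dp_indirect(ram, dp, bank):               # LDA ($dp)
    return (bank << 16) | read16(ram, dp)


def dp_indirect_long(ram, dp):                # LDA [$dp]
    return read24(ram, dp)


def dp_indexed_indirect_x(ram, dp, x, bank):  # LDA ($dp,x)
    return (bank << 16) | read16(ram, dp + x)


def dp_indirect_long_indexed_y(ram, dp, y):   # LDA [$dp],y
    return (read24(ram, dp) + y) & 0xFFFFFF


def stack_relative(sp, offset):               # LDA $offset,s
    return (sp + offset) & 0xFFFF


ram = {0x00: 0x19, 0x01: 0x00, 0x02: 0x7E}
print(f"${dp_indirect_long(ram, 0x00):06X}")                  # $7E0019
print(f"${dp_indirect_long_indexed_y(ram, 0x00, 0x04):06X}")  # $7E001D
print(f"${stack_relative(0x01FC, 0x01):04X}")                 # $01FD
```

The last line shows why LDA $01,s gives the last pushed byte: the stack pointer itself always points one byte below it.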

N/A - Used by WDM. Does nothing at all.

----------------------------------

2.7 - USELESS OPCODES

Yep.. well, useless opcodes (NOT addressing modes) are the following:

NOP
WDM
BRK (well, it can be useful. But barely)
COP

NOP basically does nothing. Yep. Well, not true. It wastes 2 cycles. Besides that, nothing.

WDM was reserved for use for future 16-bit opcodes (?), but it was never used after all.

BRK pushes certain values on the stack, and causes an interrupt. But in SMW, it's relatively useless. But perhaps one might be able to make use of it, by, for example, making it jump to a routine which is often accessed. That would be a quicker jump than having to use JML all the time.

COP... seems useless. Might perhaps be used for similar purposes?
---------------------------

Phew, this one done too.

If I missed anything, or if something is incorrect, please point it out!

tl;dr


(I've been working on this all day)


Mod edit: Added some inline links so you can easily go to the chapter that interests you.
Yes! Thanks for formatting this all Roy, fantastic job. You did a great job with the workshop and this is good too. Now I can use DMA! ^.^
Thank What a Nice jobs you make.
Thank roy I leanr a lot of new thing.

--------------------
Roy you are so fucking awesome for this. I never would've bothered with the logs. I love you so much (in a gay non-gay way).

Also, I say immediate sticky!
This forum has plenty of stickies, someone is already making a document from this summary, so that it can be preserved on SMWC.

--------------------
--------> Don't follow "Find Roy's Dignity", my hack. Because it's pretty outdated. <--------
Sticky? This is really a good thread. BTW Yup to many stickies in the forum. Beside the point though,
A good way to convert Decimal to Hex/Bin/Oct. Use the calculator, use scientific mode and Viola! Conversion ready!

--------------------
Your layout has been removed. I never had a layout.
Whoops, little error:
Writing to VRAM, say, $5000, will actually write to double that address in the VRAM bank, thus $A000.

So:
REP #$10
SEP #$20
LDA #$80
STA $2115
LDY #$5000 ; To VRAM $A000
STY $2116
LDY #$A038 ; Something
STY $2118

Probably because bit 0 isn't specified - 15-bit addresses in $2116/7, all even numbers, probably.

Stickied on request by someone else (I swear, I'm not egoistic enough to sticky this for no reason).

Just want to tell this:
If you notice any errors (hey, I typed this in what, a day or so? I was tired by the time it got finished ;_;), please PM me and tell me which part to fix.

Oh, awesome - this is certainly going to be so useful to me, considering that I partially missed both ASM workshops... thanks a bunch, Roy! I shall consider reading it later!
Destickied since a document on this ASM workshop already exists (go to the Documents section), and this thread didn't actually become what I intended to sticky it for in the first place - to be noticed and to be a place where questions could be asked.
This thread has been dead for more than 5 months, and since it thus serves the same purpose as the document, it has become obsolete.

Derp.

This tutorial really helped a lot, so;
*bumpty-bump*.
Originally posted by NAAMxLOOS
This tutorial really helped a lot, so;
*bumpty-bump*.

I know you're allowed to bump threads, but not just because you think it's important- it's more for help. Just so you know.

The purpose of this site is not to distribute copyrighted material, but to honor one of our favourite games.

Copyright © 2005 - 2019 - SMW Central