
In-Depth Sprite Coding Tutorial (WIP)


Tutorial is in WIP state! Chapters might be added later on!

MarioFanGamer's tutorial on how to code sprites, in-depth.


Hello, and welcome to my tutorial on how to create sprites and also how to work with them. Have you ever wished to make your own Goomba? Have you ever asked yourself how Carol made those bosses? Then this tutorial is for you!
Be careful, though: sprites are a complex topic. As such, this tutorial is, no joke, really big. If you don't like walls of text, then don't read this tutorial, but keep in mind that you won't learn much about coding sprites that way.

Since this is a programming tutorial, knowledge of 65c816 ASM for Asar is recommended. I also assume you know how to use the tools.

Big fat note before any complaints: This tutorial usually does not hand you finished code during a lesson; it only gives ideas about how to code. At best, there are code snippets, and the rest is where you have to code on your own. Only at the end of a part will I provide some (somewhat) documented code (you can find it inside the blue boxes), but it should only serve as an example of what a sprite can look like if you followed the tutorial. After all, the proof is in the pudding, not to mention that you only get the feeling of learning if a tutorial just hands out code. If you didn't get something, you can still look at the code, but please remember to study it and what it does.
Also, the tutorial is deliberately non-chronological inside chapters. The different order comes from sorting the chapters by difficulty/advancement rather than by theme.

  • Background: Layers 1-4. "Layers" also includes sprite tiles (they are, more or less, a layer type of their own).
  • OAM tiles, objects: Sprite tiles.
  • Extra prioritised layer 3: Layer 3 when it goes in front of everything. That's why stuff like the status bar or tides, despite being on layer 3, doesn't go behind sprites and other layers.
  • Scratch RAM: Miscellaneous RAM (often $00-$0F), used for pointers or to preserve values for a short time where the stack is counterproductive.

The sprites we create are for PIXI, the latest sprite insertion tool for SMW.

Table of Content

Part 0: Pre-information

print "INIT ",pc
print "MAIN ",pc
	RTL

This is a basic generator "code", or at least something which PIXI and Asar accept and which doesn't crash the game. It tells you a couple of things:
  • print "MAIN ",pc - This tells PIXI where the main portion of the sprite starts, i.e. the code which runs all the time while the sprite exists.
  • print "INIT ",pc - This marks the code which only runs when a sprite gets generated (usually). Shooters and generators don't have an init code, though. More about that in the part on regular sprites.
  • RTL - This is the final return opcode, where the sprite code is finished. Never use an RTS for the final return opcode!

Random fact: TRASM sprites (sprites for romi's Sprite Tool, which used that old assembler) use dcb "MAIN" and dcb "INIT" instead of print to mark where the code starts. But since we are forced to use Asar with PIXI, it doesn't matter.

Aside from that, we also need a CFG file which holds the configuration of the sprite. For that, we use the CFG Editor included with PIXI and Sprite Tool (each comes with its own version, but they mostly function similarly).

Part 1: Generators

Generators are code which runs every frame while the generator is active. They are the simplest type of sprite because they don't have any special requirements. In fact, they don't even use any RAM of their own besides the generator number. That means the CFG file for a generator is mostly a dummy and looks like this:

The only relevant data is ASM_File.asm which is the source code. Anything else is either irrelevant (the FF's) or not to touch (the 03).

Now, take the code above and put anything you want between print "MAIN ",pc and the RTL. You can do a lot with your knowledge: a power-up timer, a double jump, a coin spawner from the screen, etc. The sky is the limit (well, not quite, but you get the idea).
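As a minimal sketch of such a generator, the following hands Mario one coin roughly every second. It assumes the commonly documented vanilla addresses $9D (sprites locked flag), $14 (a frame counter which pauses with the game) and $13CC (coins still to be added), so double-check them against the RAM map before relying on them.

```asm
print "INIT ",pc
print "MAIN ",pc
	LDA $9D			; Assumed: non-zero while sprites are frozen
	BNE Return		; Do nothing while the game is frozen
	LDA $14			; Assumed: frame counter
	AND #$3F		; Zero once every 64 frames
	BNE Return
	INC $13CC		; Assumed: queue one more coin for Mario
Return:
	RTL
```

No table is read here, so the data bank does not matter for this example.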

Most generator stuff can be done with LevelASM, though, so you may prefer to use that instead of generators. However, when a generator loads, it replaces the currently loaded one (including the generator disabler), so you can easily switch generators between two screens.

Tip: If you use in-sprite tables (tables inside the ASM file), I recommend using a bank wrapper since neither Sprite Tool nor PIXI sets the data bank (the implied bank byte for 16-bit addresses) automatically. It looks like this:
print "INIT ",pc
print "MAIN ",pc
	PHB : PHK : PLB
	; Your code
	PLB
	RTL

What it does is preserve the data bank register first, replace it with the bank the code is in, run the code and, before the sprite finishes, restore the old data bank (more about preserving and restoring later on).
The reason lies in how accessing data works: every memory access on the SNES uses a 24-bit address. With absolute addressing ($xxxx), the missing bank byte comes from the data bank register.
If you don't do this, the generator will load wrong values (unless you use long addresses, see below), thus messing up the code.
Direct page (8-bit) addressing always uses bank $00, but this shouldn't really matter here.

Keep in mind that you need a PLB in front of every RTL. However, you can turn the actual code into a subroutine and call it inside the wrapper.
Either do that or force Asar to use long addresses with a .l right after the opcode (especially recommended when the generator / shooter / sprite only has a few ROM tables). For example, LDA Table becomes LDA.l Table.
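To illustrate both options, here is a sketch of a generator that reads a hypothetical in-sprite table called Speeds. The table, its values and the use of $7B (assumed to be Mario's X speed) and $13 (assumed to be a frame counter) are only there for the example. The main variant uses the wrapper with a subroutine; the commented lines show the long-address alternative.

```asm
print "INIT ",pc
print "MAIN ",pc
	PHB : PHK : PLB		; Data bank = bank of this code
	JSR Main
	PLB					; Restore the old data bank
	RTL

Main:
	LDA $13				; Assumed: frame counter, only used to pick an entry
	AND #$03
	TAY
	LDA Speeds,y		; 16-bit address, bank comes from the data bank register
	STA $7B				; Assumed: Mario's X speed
	RTS					; As many RTS as you like, but only one PLB : RTL

; Alternative without the wrapper (long addresses only support X as index):
;	TAX
;	LDA.l Speeds,x

Speeds:
	db $10,$20,$F0,$E0	; Hypothetical values
```

With only one table access, the long-address variant is shorter; with many accesses, the wrapper saves a byte per access.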

Anyway, with that knowledge, you shouldn't have too much trouble creating a simple generator.

Note: Code to spawn a sprite comes later on. Just do things which affect Mario first.

Part 2: Shoot, I forgot how to code!

Part 2.1: Pre-information

Not really, but the next thing we make is a shooter.
They are a bit more complicated than generators because they are specialised in spawning a sprite and also have RAM addresses reserved for them. Not only that, but you can have multiple shooters on screen. That leads to another quirk: each shooter RAM table is 8 bytes long (as long as you have not done any wizard stuff and made the tables larger or smaller), and another byte is reserved to hold the index.
As such, they are the bridge between generators / UberASM / blocks and other kinds of sprites.

Part 2.2: RAM addresses

Because of that, I list each RAM address which is important for shooters:
  • $18FF - The index of the current shooter (a single byte, not a table).
  • $1783,x - This holds the shooter number.
  • $178B,x - The shooter's Y position, low byte.
  • $1793,x - The shooter's Y position, high byte.
  • $179B,x - The shooter's X position, low byte.
  • $17A3,x - The shooter's X position, high byte.
  • $17AB,x - The shooter timer, often used to determine when the shooter should spawn a sprite. It decreases every second frame.
  • $17B3,x - The shooter's index in the level table.

That is quite a few RAM addresses. Many of them are indexed with X, which means the address actually used is the given address plus whatever is in X (or Y if you use ,y instead).

Note for the position: In most computers and programs, the top left corner is (0|0). As such, when I talk about a position, I mean the top left corner of a sprite or whatever.

Part 2.3: When do they shoot?

However, there is still one problem: What does shooter code look like? The explanation comes now:
First of all, most shooters are built to only run when the timer, $17AB,x, is zero. Right after that check, we set the number of frames until the shooter runs again. Because the timer decreases every second frame, the maximum time until the shooter runs again is about 510 frames (0xFF × 2), roughly 8.5 seconds.
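A sketch of that timer check could look like this. The label Return and the delay value 0x60 are placeholders chosen for the example.

```asm
	LDA $17AB,x		; Shooter timer
	BNE Return		; Still counting down: don't shoot yet
	LDA #$60		; 0x60 * 2 = 192 frames until the next shot
	STA $17AB,x
	; ... on-screen checks and the spawn code follow here ...
Return:
	RTS				; Assumes this runs in a JSR subroutine inside the bank wrapper
```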

Then, most shooters are programmed to run only when they are on-screen (but not despawned). You can check that by first subtracting the low byte of the layer 1 Y position from the low byte of the shooter's Y position and then doing the same with the high bytes, but this time as a subtraction with carry, so do not use a SEC before it (tip: since the difference of the low bytes is not stored and CMP ignores the incoming carry, you can just use a CMP there). Finally, check whether the result is zero, i.e. whether both high bytes match once the borrow is taken into account. If not, then return. That is a bit confusing, so I'll give you the code this time:
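	LDA $178B,x	; Shooter Y position (low byte)
	CMP $1C		; Compare with the layer 1 Y position (low byte), only the carry is kept
	LDA $1793,x	; Shooter Y position (high byte)
	SBC $1D		; Subtract the layer 1 Y position (high byte) and the borrow
	BNE Return	; Above the camera or 0x100+ pixels below its top edge: return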

Explanation: Simply comparing the high bytes has the problem that only those shooters will shoot which are in the same sub-screen (F1/F2 in Lunar Magic) as the camera. That is not what we want. We want the shooter to be inside the camera's view. Instead, we use the knowledge that when we compare the positions of two objects, the one further along the axis always has the same or a higher high byte, but its low byte can actually be lower. If you subtract two values (which is what CMP does), the carry flag gets cleared if the first low byte is lower than the second. Since an SBC with a clear carry subtracts the given value (or the value from the given RAM address) plus one, the high byte comparison is adjusted for that borrow.
For example, if the shooter is at position 0x0142 and the camera at 0x0080, we know that both are less than 0x100 pixels apart. However, the shooter's low byte (0x42) is smaller than the camera's (0x80), so subtracting the latter from the former clears the carry flag. We then subtract the layer 1 position high byte (0x00) from the shooter's position high byte (0x01). Without the borrow, the values would be unequal, but since the carry flag is clear this time, the SNES treats the layer 1 position high byte as one higher, thus making it equal to the shooter's position high byte.
That, my dear friend, is a so-called 'pseudo-16-bit comparison'. Pseudo-16-bit maths allows you to adjust the high byte to some degree by modifying the low byte in 8-bit mode. What I mean is that you can add 1 to or subtract 1 from the high byte if the low byte went beyond $FF or $00. Anything beyond that requires real 16-bit maths.
This is the reason why opcodes such as ADC and SBC exist: They allow the use of pseudo-16-bit maths. INC and DEC don't affect the carry flag, though, so you have to use branching instead.
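To make the idea concrete, here is a small sketch of pseudo-16-bit maths in 8-bit mode. It adds 0x10 to the shooter's X position and stores the result in scratch RAM ($00 and $01 are arbitrary choices for the example), then increments that 16-bit value by one with INC and a branch.

```asm
	LDA $179B,x		; X position, low byte
	CLC
	ADC #$10		; Sets the carry if the low byte went past $FF
	STA $00
	LDA $17A3,x		; X position, high byte
	ADC #$00		; Adds only the carry from the low byte
	STA $01

	INC $00			; INC doesn't touch the carry...
	BNE +			; ... so check whether the low byte wrapped to $00
	INC $01			; It did: bump the high byte by hand
+
```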

Most shooters do the same with the X position. However, shooters will only shoot when they are really on-screen horizontally. Because of that, they also check whether the difference in the X position is between 0xF0 and 0xFF. If yes, then return. Pretty complicated, isn't it?
Because of that, we will do it a little bit differently, not to mention I have to teach you this anyway: We will check both the high and the low byte together. That way, we can do real 16-bit maths.
The first thing we do is load the shooter's X position high byte. We then use an XBA. That opcode swaps the accumulator's low and high byte, even in 8-bit mode, therefore allowing you to assemble a 16-bit number in 8-bit mode. This saves the RAM, bytes and cycles we would have needed to store it in RAM first (including the stack, see below). It is especially useful for tables with split bytes like the sprite positions. We then load the low byte and change to 16-bit mode. Now we just have to subtract the layer 1 X position from it. Before we branch, we make sure A is back to 8-bit. SEP #$20 does not affect the carry flag, so it doesn't mess up the code.
Confused? Just look at this:
	LDA $17A3,x		; Load the shooter X position (high byte)
	XBA				; Swap accumulator low and high byte
	LDA $179B,x		; Load the shooter X position (low byte)
	REP #$20		; 16-bit A
	SEC : SBC $1A	; Subtract the layer 1 X position
	CMP #$00F0		; If the difference is 0x00F0 or larger...
	SEP #$20		; 8-bit A
	BCS Return		; ... return

That way, a shooter will only shoot if it is fully on-screen horizontally.

Part 2.4: Spawning a sprite

We have now covered when the shooter will shoot, but not how to spawn a sprite. This part teaches you how to do it.
We first call $02A9E4 or, alternatively, $02A9DE. Both lead into the built-in routine at $02A9EF. The reason there are two entry points is that the latter checks two sprite slots fewer than the former does. Also, this routine depends on the sprite memory setting and allows you to spawn at most 10 / 0xA sprites. For sprite memory independent code with more usable sprite slots (12 slots), just check whether any of the $14C8,x (see below) entries are empty.
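The sprite memory independent alternative mentioned above could be sketched like this. It simply scans all 12 slots from the highest to the lowest; the label FoundSlot is a placeholder.

```asm
	LDY #$0B		; 12 slots: $00-$0B
-
	LDA $14C8,y		; Sprite status, 0 = slot is empty
	BEQ FoundSlot
	DEY
	BPL -			; Try the next lower slot
	RTS				; Y is negative: no free slot, don't spawn anything
FoundSlot:
	; Y now holds the free slot, continue with the spawn code
```

The RTS assumes the code sits in a JSR subroutine, as recommended for the bank wrapper.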

Anyway, A and Y now contain the number of an empty sprite slot. If it is negative, no free slot was found, so this is the point where you don't spawn a sprite. If it is positive, you are ready to spawn a sprite. As you might have guessed, we first have to set the RAM addresses correctly, and since you don't know these yet, here is a short list of the most important sprite tables:
  • $9E,x - This holds the sprite number.
  • $AA,x - Sprite Y speed.
  • $B6,x - Sprite X speed.
  • $D8,x - Sprite Y position, low byte.
  • $E4,x - Sprite X position, low byte.
  • $14C8,x - Sprite status. Here is a list with all of them.
  • $14D4,x - Sprite Y position, high byte.
  • $14E0,x - Sprite X position, high byte.

These are generally the most important sprite tables for spawning a sprite from somewhere else. Only in a couple of other cases might you need other addresses, especially if the sprite isn't spawned in its initial state.
Also, in most cases you can use Y for the index. In fact, since we are in a shooter (where X holds the shooter index), we are forced to use it and only need X if Y does not support the addressing mode (e.g. long addresses indexed with Y, i.e. $xxxxxx,y, don't exist, so you have to use X for these instead) or if we call routines where the sprite slot must be in X.

Ironically, the very first thing we have to do is preserve X by using a PHX. In case you did not know, the main use of push opcodes (of which PHX is one) is to preserve values so you can use them later. You have seen it with the bank wrapper in part 1. Here, we preserve X instead of the data bank.
The values are put on a so-called stack, which holds all preserved values (i.e. there is no separate stack per register) and which you can compare to a real stack of, let's say, books. That also means the last value you have pushed must be pulled first, or in other words: "last in, first out" (short: "LIFO").
You also have to pull as many bytes as you have pushed to the stack unless you do address manipulation. Subroutine and return opcodes use the stack just as much as the push and pull opcodes, so if you push and pull wrongly, you mess up the return destination and (likely) crash the game.
And one last thing: Push and pull opcodes are quite slow. If speed is more important than space, reload X from the shooter index instead. But a difference of a few cycles and a few bytes isn't very large, so it doesn't matter for the most part.
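The "last in, first out" rule looks like this in practice (a minimal sketch, the code in between is up to you):

```asm
	PHX				; Stack: X
	PHY				; Stack: X, Y
	; ... code which overwrites X and Y ...
	PLY				; Pulled first because it was pushed last
	PLX				; Stack is balanced again, so a following RTS/RTL is safe
```

Swapping the two pulls would not crash anything here, but X and Y would end up exchanged.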

Anyway, we set the sprite number. The sprite we try to spawn is a Bullet Bill for now, but you can try to spawn any vanilla sprite.
For the next step, we have to initialise the newly spawned sprite. Simply call $07F7D2 so that the slot's sprite tables are reset and the tweaker bytes are set up for the sprite. This is also the reason why we have to use X instead of Y: $07F7D2 assumes the sprite slot is in X, not Y.
After we have done that, we need to get X back, so we use PLX, the opposite of PHX. We do that because we now use data of both the shooter and the newly spawned sprite, not to mention we have to pull the value from the stack anyway.

There is one important detail you have to keep in mind, though: Because we now use Y, we are limited by what that index supports. Not only does long indexing with Y not exist, even direct page indexing with Y (i.e. $xx,y) is unheard of in 65c816 ASM (except for LDX and STX). Asar assembles these automatically as 16-bit addresses, but keep this limitation in mind, especially with SA-1 (see below). I know, it is weird, but we on SMWC certainly didn't design the processor.
Alright, that's an important detail you have to know to code sprites. The only things left to do now are to set the positions (remember that you need to subtract 1 from the Y position and mind the high byte) and to set the sprite status to 0x01 or sometimes 0x08, and we have spawned a sprite successfully. Hooray!
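Put together, the steps of this part can be sketched as follows. This is only one possible arrangement and rests on a few assumptions: a non-SA-1 ROM, X holding the shooter index, the code sitting in a JSR subroutine, sprite number 0x1C being the Bullet Bill, and $07F7D2 leaving Y untouched. The labels are placeholders.

```asm
Shoot:
	JSL $02A9E4			; Look for a free sprite slot (result in Y)
	BMI Return			; Negative: nothing free, don't spawn
	PHX					; Preserve the shooter index
	TYX					; $07F7D2 wants the sprite slot in X
	LDA #$1C			; Assumed: Bullet Bill
	STA $9E,x
	JSL $07F7D2			; Reset the sprite tables, load the tweaker bytes
	PLX					; Shooter index back in X, sprite slot still in Y
	LDA $179B,x			; Copy the X position (low, then high byte)
	STA $00E4,y			; Written as 16-bit address because $xx,y doesn't exist for STA
	LDA $17A3,x
	STA $14E0,y
	LDA $178B,x			; Y position minus 1, pseudo-16-bit style
	SEC : SBC #$01
	STA $00D8,y
	LDA $1793,x
	SBC #$00
	STA $14D4,y
	LDA #$01			; Sprite status: initial state
	STA $14C8,y
Return:
	RTS
```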

This only covers regular sprites, though, but don't worry, a custom sprite is not much harder.
After the initialisation, we first have to set the custom sprite number. That is $7FAB9E,x for the low byte. In addition, you have to set bit 3 of $7FAB10,x to mark the sprite as custom*. On top of that, we call $0187A7, which clears additional data for custom sprites. There is no need to write to $9E,x because the sprite's "acts like" setting gets stored there.
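For a custom sprite, the middle part of the spawn code changes roughly as follows. !CustomSprite is a hypothetical define for the number the sprite has in the sprite list, Y is assumed to hold the free slot, and the addresses are the non-SA-1 ones named above. The order of the last two steps is my assumption; the text only requires that both happen.

```asm
!CustomSprite = $00		; Hypothetical: number of the sprite in the sprite list

	PHX					; Preserve the shooter index
	TYX					; Sprite slot into X (long addresses need X)
	JSL $07F7D2			; Initialise the sprite tables
	LDA #!CustomSprite
	STA $7FAB9E,x		; Custom sprite number
	JSL $0187A7			; Additional handling for custom sprites
	LDA #$08			; Bit 3 = custom sprite
	STA $7FAB10,x
	PLX					; Shooter index back into X
```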

Also, since we have successfully spawned a sprite (not just set one up), just store 0x09 to $1DFC for a sound effect.

At that point, we are basically done. Sure, we can polish it a little bit, like spawning a puff of smoke, but you now have a fully functional shooter.

Phew, that is a lot of stuff you have to do for a single shooter. Don't worry, once you get into it, it becomes a lot easier. Also, reusing code is normal.
Especially if you steal them from someone else. {B-)

Oh, and a final note: I recommend making the sprite number a define. I have put instructions for defines in my patch creation tutorial, but here is a short recap:
They are simply used to manage stuff like which values you want to use (especially freeRAM) without scrolling down to the value and changing every single line containing it. In other words: They add more user-friendliness for both you and the user.
In order to create a define, all you have to do is write an exclamation mark and right after it (no space) the define name. Basically this: !Define. Then put a space, an equals sign, another space and the string you want to put there (e.g. !Define = $42). Remember to put quotes around the string if it contains spaces (so for putting "PHB : PHK : PLB" into a define, you do this: '!Define = "PHB : PHK : PLB"').
You can also use some maths (arithmetic and bitwise, for that matter) in the ASM file. That is very important with GIEPY because the sprite number is a 10-bit number there and the tool also insists on having the bit set.
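A short sketch of defines in action, with names and values made up for the example (!ShootDelay in particular is hypothetical):

```asm
!SpriteNumber = $1C					; Sprite to spawn
!ShootDelay = $30					; Hypothetical: base value for the shooter timer
!BankWrapper = "PHB : PHK : PLB"	; Quotes because of the spaces

	!BankWrapper					; Expands to PHB : PHK : PLB
	LDA.b #!ShootDelay*2			; Maths works with defines too: $30*2 = $60
	STA $17AB,x
	LDA #!SpriteNumber
	STA $009E,y
	PLB
```

The explicit .b on the LDA is a design choice: once maths is involved, spelling out the operand size avoids any doubt about whether an 8-bit or a 16-bit immediate gets assembled.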

First Midway Point

As you can see, a shooter alone is more complicated than a block. But it gets worse. The next chapters are the beginner chapters, necessary to code a usable sprite. But this is quite a bit of information.
I mean, sprites have graphics and physics and are generally more flexible than shooters. If you really expected coding a sprite to be simple, then... you're kind of right, but "kind of right" still means you're wrong too.
Once again, once you know the basics, it gets easier, not to mention that reusing code is nothing to be ashamed of.

Part 3: Understanding the tweaker bytes

Before we start to code a sprite, we first create the configuration file. That is another property in which regular sprites differ from generators and shooters. As mentioned in the spawning part of the shooter chapter, regular sprites have "tweaker bytes". These control how individual sprites act, so sprites can have different behaviours despite using the same code.
The CFG file contains all this information, but knowing what is what is difficult, so we use a tool included with PIXI instead: the CFG editor. Obviously, a GUI makes editing these files much less of a pain in the butt. The PIXI CFG editor also has some visual help for hitboxes and palettes which may or may not help.
A preferred alternative to CFG files also exists: JSON files. Unlike CFG files, which pack the flags into byte values, JSON files have each flag separated. In addition, JSON files actually support custom sprite displays. Note that for the sake of the tutorial, I'll still refer to JSON files as CFG files.
In case you want to use Sprite Tool's CFG editor for some reason, you can take these two documentations as a reference for the former and Lunar Magic as a reference for the latter.

Now, most of the settings are quite obvious but let us look into each setting anyway:
  • Object clipping: How the sprite will interact with blocks. The darker area is the main position of the sprite, where you place it as the X tile in Lunar Magic, and the red dots mark the interaction points.
  • Can be jumped on: This controls the "spiky" flag, i.e. if you try to jump on a sprite with this flag disabled, you get hurt when you hit its top (unless you are on Yoshi, that is) with the default interaction (see below for more information).
  • Dies when jumped on: That just means the sprite will not turn carryable after you attack it from the top.
  • Hop in/kick shells: This is a bit weird. Basically, setting it makes the sprite act a little bit like a shell-less Koopa in terms of interaction with Koopa shells, so unless the sprite in question is a shell-less Koopa, leave it off.
  • Disappear in a cloud of smoke: One of the death animations. Most sprites either get squished (if jumped on, at least) or (otherwise) fall offscreen, but others like Piranha Plants actually disappear in a puff of smoke when killed.
  • Sprite clipping: See object clipping but for sprites and Mario.
  • Use shell as death frame: This is... a bit weird, honestly. We use custom graphics anyway so this should not bother you.
  • Falls straight down when killed: This only applies to sprites which you kill by jumping on them. It's the third death animation, only used when they neither get squished nor disappear in a puff of smoke.
  • Use second graphics page: That means the sprite will use the second sprite page (page 4 in 8x8 editor terms), otherwise the first page (8x8 editor page 3). This gets stored in the sprite's YXPPCCCT properties (more on that later), but remember that you need to implement it in the code to make use of it.
  • Palette: Similar to the second GFX page, this is part of the YXPPCCCT properties which control the colours. Once again, without an implementation of it, the palette is hardcoded.
  • Disable fireball killing: That makes the sprite immune (as in "it interacts with them but doesn't get hurt") to fireballs.
  • Disable cape killing: The sprite doesn't get killed by a cape, bounce blocks or earthquake (all of them use the same or similar interaction code) but still is thrown upwards.
  • Disable water splash: Most sprites (and Mario) spawn a puff of smoke when they enter or leave water (emulating a splash). Not those with this bit set.
  • Don't interact with Layer 2: Yeah, there are cases where layer 2 interaction with some sprites is buggy and therefore unwanted.
  • Don't disable clipping when star killed: It actually means that the sprite still runs its main code when it's killed. In custom sprites, it means that even dead sprites still run their code. Useful to overwrite the death graphics.
  • Invincible to star/cape/fire/bounce blk: The sprite will not interact with any of these. In contrast to "Disable fireball killing" and "Disable cape killing", this bit actually disables the interaction, whereas the others only make the sprite invulnerable while it still interacts with them. It also makes the sprite immune to Bob-omb explosions, and Yoshi won't interact with it.
  • Process when offscreen: That only means the sprite will not despawn when it moves too far away from the camera, but it still disappears when it is out of bounds.
  • Don't change into a shell when stunned: Sprites with that bit cleared jump in their graphics routine directly to the shell graphics when they are carryable. Basically, most carryable sprites use their own GFX routine. If a carryable sprite doesn't have one, it uses the shell graphics. The bit is only there to skip a lot of branching and thus save some time for carryable sprites. Since only specific sprites use non-shell carryable graphics, not to mention we will overwrite them anyway, this bit is quite useless.
  • Can't be kicked like a shell: There are two types of carryable sprites: those which slide when kicked (e.g. shells, throw blocks) and those which don't (all other sprites). With this bit enabled, the sprite uses the latter behaviour.
  • Process interaction with Mario every frame: Since too many sprites can easily cause lag in SMW, the developers made it so that sprites only interact with Mario every second frame. That effect is not always wanted, e.g. on platforms, because Mario would be in the air every second frame. This is why the developers implemented this bit.
  • Gives power-up when eaten by Yoshi: A bit you likely never want on your sprites. The idea is that there are some effects when Yoshi eats a sprite, most of which are power-ups. Now, the code has only been implemented for those. For the other sprites, the failsafe is this bit or the fact that you (usually) cannot eat them, or both. In fact, Chucks are another sprite which Yoshi treats as a power-up, but you have to do a sprite swap first because they are inedible. In the best case, you can finish the game quickly; in the worst case (which is almost always the case), the game crashes. In other words: Do not use it!
  • Don't use default interaction with Mario: The default interaction is a typical Mario-enemy interaction (jumping on the sprite kills it or hurts you, touching the sides always hurts you, and star power and sliding kill it without exceptions). Enabling that bit allows you to use a custom interaction instead.
  • Inedible: Yoshi just cannot eat the sprite without glitches. Period.
  • Stay in Yoshi's mouth: Yoshi will keep sprites with that bit set in his mouth, either because the sprite is hard to swallow (Koopas) or because the item is too useful (springboard, keys, etc.). Keep in mind that your sprite needs code for the carryable state.
  • Weird ground behaviour: ... I seriously have no idea what it does.
  • Don't interact with sprites: You have three guesses. If one of these is "There is no interaction between this sprite and other sprites", you are correct.
  • Don't change direction when touched: The idea is that sprites turn to you when you hit them from the side (default interaction). If this is undesired, enable this bit.
  • Don't turn into a coin when goal passed: Once again, self-explanatory. However, these sprites will still disappear when you hit a goal tape; they just will not give a coin.
  • Spawns a new sprite: As in "when you hit it from above" (default interaction). Some sprites turn into a different sprite (such as Para-Koopas into Koopas), and if their "acts like" is at least 0x73 (see below), they spawn a cape similar to a Super Koopa. You likely won't need it as you can always code your own interaction.
  • Don't interact with objects: "Objects" as in "blocks". No need to explain that.
  • Make platform passable from below: That sounds a bit weird, but solid and platform sprites use a similar routine. What controls their behaviour is this bit: sprites with the bit set are passable from below (acting like a platform, like tile 100), and those with the bit clear are not (acting like a solid block, like tile 130).
  • Don't erase when goal passed: Not all sprites turn into a coin when you hit the goal. However, not all of them stay alive either. This bit lets sprites stay onscreen when you hit a goal tape (as long as they don't turn into a coin in the first place, that is).
  • Can't be killed by sliding: Some sprites which use the default interaction are immune to sliding because of this bit.
  • Takes 5 fireballs to kill: Some sprites like Chucks, Morton, Roy and Ludwig have got some HP. They take multiple jumps or fireballs to actually kill them.
  • Can be jumped with upward Y speed: The idea is that enemies hurt you if your Y speed is negative, i.e. you move upwards, and you haven't stomped other sprites. This bit prevents that.
  • Death frame 2 tiles high: Most sprites just use one tile when they fall offscreen. Not those with this bit set. This doesn't matter since we work with custom graphics anyway.
  • Don't turn into a coin with silver POW: That makes the sprite immune to silver P-switches.
  • Don't get stuck into walls (carryable sprites): You might have noticed that some sprites push themselves out of a block. These sprites often are carryable so it is easy to put them into a wall so they obviously have this bit set. Others enjoy being stuck.

Aside from these, we have got a couple other settings too for Sprite Tool and PIXI:
  • Acts like: For tweaking, this is just the sprite number we want to tweak. For custom sprites, some general code in the original game uses the sprite number to determine some variations. For now, this doesn't matter and we use sprite 36, an unused sprite in the original game.
  • (Extra) Property byte: It is sometimes wasteful to have similar sprites which differ in just one setting. The idea of property bytes is to still use one ASM file instead of having multiple copies of ASM files with minor tweaks. We will talk about it in a later chapter.
  • Use Xkas for assembling (only romi's Sprite Tool): Old Sprite Tool supports both TRASM and xkas. Since we use Asar (unless you use for some reason* Sprite Tool 1.40 or older), we have to set the assembler option in the CFG file, though.
  • Extra Bytes Count (PIXI only): It sets the number of extra bytes the sprite can use. "Extra bit clear" is the number of extra bytes when the extra bit is clear, "extra bit set" is the same but for when the sprite's extra bit is set. You likely want both to use the same value.

*Other than SA-1 Sprite Tool (which only is in version 1.40), that is.

Part 4: The real coding.

Part 4.1: Pre-information.

After you have understood the tweaker bytes, we now get to the coding part. Do you remember the RAM addresses for regular sprites? I will show them again:
  • $9E,x - This holds the sprite number.
  • $AA,x - Sprite Y speed.
  • $B6,x - Sprite X speed.
  • $D8,x - Sprite Y position, low byte.
  • $E4,x - Sprite X position, low byte.
  • $14C8,x - Sprite status. Here is a list with all of them.
  • $14D4,x - Sprite Y position, high byte.
  • $14E0,x - Sprite X position, high byte.

Those are not all addresses, but I will introduce new ones as we go. Speaking of which, $15E9 is one of them. It holds the current sprite index, the regular sprites' counterpart to $18FF.

Regular sprites also use an init code, so put an RTL there for now. Similar to shooters, I recommend putting the sprite's main code into a subroutine so that you don't have to put a PLB in front of every RTL:
print "INIT ",pc
	RTL

print "MAIN ",pc
	PHB : PHK : PLB
	JSR MainCode
	PLB : RTL

MainCode:
	RTS

Part 4.2: Graphics

Part 4.2.1: Before the coding...

The first thing we look at is how to draw a sprite. We do that by putting the graphics code into at least one JSR subroutine, be it its own routine or the sprite's main code. If you do the latter, it must be the last thing in that routine. Yes, "must", not "should". The reason is one of the most important and common routines in sprites: GetDrawInfo.
This routine sets up some values for the GFX but also the offscreen flags.
In fact, it is so important that we introduce something new, something we didn't have at our disposal with shooters yet: shared subroutines. These are one of the biggest advantages of PIXI over romi's Sprite Tool. In case you did not know, shared routines are a way to save freespace by storing commonly used code only once.
Their main use is to avoid that almost every sprite carries its own copy of big routines (and GetDrawInfo is one of them), which was a major problem if you did not use the shared subroutines patch (and actually implemented it). In fact, there was a sprite pack in the sprites section which did just that: romi's sprites. If you used these sprites before the last remoderation (as of 2020), have you ever wondered why you had to patch GetDrawInfo and SubOffScreen? That's why.
Using shared subroutines in PIXI is not that hard. I mean, the tool itself, not a separate patch, takes care of them. The usage isn't hard either. The program is based on GPS, and before you coded sprites, you have probably worked on blocks, but if you didn't for some reason, here is the usage:
  • Open the ASM file you want to call the code.
  • Look at the top to see if there are any usage information (input, output)
  • Call a macro with the name of the ASM file (i.e. for GetDrawInfo.asm, it is %GetDrawInfo()).

Also, since I mentioned this is a PIXI feature: romi's Sprite Tool doesn't support it, so you have to do it differently there. If you want to use a routine directly, the best way is to download PIXI, copy the code from the ASM file and paste it into the sprite (I recommend putting it at the bottom of the code). The routine ends with an RTL but you can often change it to an RTS and put a label at the top of the code so you can JSR there instead.
The only exceptions are routines like, coincidentally, GetDrawInfo, which manipulate the return address on the stack. You can fix that by replacing all the pull opcodes at the end with just a PLA : PLA. Of course, you still have to change the RTL but that should be obvious.
Otherwise, the in- and outputs are the same.
Of course, nothing can stop you from using the shared subroutines patch for some routines.

The pulls at the end are also the reason why the GFX code must be located in a JSR routine and, if shared with the main code, be one of the last things to be executed: GetDrawInfo assumes the GFX routine is called per JSR and terminates it if the sprite is too far outside of the screen.

Anyway. A look into the ASM file reveals that GetDrawInfo.asm has no inputs, just outputs: Y, which holds the current OAM slot, $00, the sprite's X position relative to the screen and $01, the sprite's Y position relative to the screen.
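As a rough sketch of how this fits together (MainCode and Graphics are only placeholder label names, nothing PIXI requires), the graphics code can sit in its own JSR subroutine which calls GetDrawInfo first:

```asm
MainCode:
	JSR Graphics		; Draw the sprite (called per JSR, as GetDrawInfo expects)
	; ... the rest of the sprite's behaviour goes here ...
	RTS

Graphics:
	%GetDrawInfo()		; Output: Y = OAM index, $00 = X position, $01 = Y position on screen
	; ... write the tile to $0300,y - $0303,y here ...
	RTS
```

If the sprite is too far offscreen, GetDrawInfo leaves Graphics early and execution continues after the JSR in MainCode.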

We also need to look at the new RAM addresses:
  • $15EA,x: OAM tile index.
  • $0300,y: OAM tile X position.
  • $0301,y: OAM tile Y position.
  • $0302,y: OAM tile number.
  • $0303,y: OAM tile properties.
  • $0460,y: OAM X position high byte and tile size.

Now, OAM stands for "Object Attribute Memory" or in other words: sprite tiles. It is made out of two parts: the position and tile table, and the size and X high byte table. The former is a 512-byte table and the latter a 128-byte table, though you don't need to take care of the latter as SMW already does it for you.
Random fact: The OAM tile size table is actually the decompressed form of a 32-byte table (4 OAM tiles per byte).

There are quite a few things you should know about sprite tiles, though these are more side information which you can skip:
  • The OAM table in SMW technically starts at $0200,y but since SMW reserves the $0300,y addresses for regular sprites, we use those instead. The same goes for $0460,y: the entry of the first OAM tile is at $0420,y.
  • Each OAM tile in SMW has got a higher priority than the tiles drawn after it. Even if the priority in its properties is lower than that of the others, it is still drawn over the later tiles, or cuts them off if it goes behind the foreground.
  • The SNES has been built to draw only up to 32 OAM tiles on a single scanline. Any more and a 33 range over occurs: every later sprite tile disappears on said scanline.
    Tiles outside of the screen are not counted.
  • A similar problem occurs if the SNES has to draw 280 pixels (35 8x8 tiles) or more of OAM tiles on a single scanline, no matter the actual amount of OAM tiles. You can understand this as if each OAM tile gets converted into 8x8 tiles on a scanline (so a 16x16 tile counts horizontally as two 8x8 tiles). Once the SNES has to draw 35 or more of them, it fails to draw the rest on this scanline.
    Keep in mind that even if the SNES has to draw 35 8x8 tiles to achieve this effect, only the first 32 OAM tiles are actually drawn if it occurs. And similar to the 33 range over, any offscreen tiles are not counted.

Quite a few games like SMB3 deliberately utilise the second effect. Its main use is the so-called "masking" feature, a way to make enemies like Piranha Plants go behind only the ground they stand on but not behind any other tile like decoration or the background. To see this in effect in SMW, this version of the Piranha Plant has got the masking feature.
The latter two can be seen in Iggy and Larry's boss rooms where, if you look carefully, you notice that any sprite falling into the lava gets cut off.

If you want to understand more about OAM tiles, check out dotsarecool / Retro Game Mechanics's video about this.

Anyway, those are all the somewhat important points which you have to keep in mind when creating a sprite.

Now, let's move on to code the graphics routine.

Part 4.2.2: A simple 16x16 sprite

Alright, that knowledge is basically enough to create a GFX routine. Simply store $00 to $0300,y, do the same with $01 and $0301,y and store whatever tile number you want to $0302,y (here I used a mushroom). Now, for $0303,y, we first need to understand the YXPPCCCT format:
  • Y and X: Flip the tile vertically (Y) or horizontally (X) if the respective bit is set.
  • PP: Priority. Be aware that this only affects the priority against the backgrounds. As noted above, between two OAM tiles only the order in which they are drawn matters, not their priority bits. We will see this in detail later. Anyway, here is what each value means:
    • 3 is the highest priority (goes in front of everything but extra prioritised layer 3).
    • 2 is the common one; these sprite tiles go in front of everything but prioritised layer 1 and 2 and extra prioritised layer 3.
    • 1 is for sprites behind the foreground; only layer 3 tiles (save for the extra prioritised ones) have got a lower priority.
    • 0 is the lowest priority. Only unprioritised layer 3 has got a lower priority.
  • CCC: Palette. Be aware that OAM only addresses the second half of the CG-RAM, the palette.
  • T: Tile number high bit aka the page number.

Another thing we have to mention is the OAM finisher routine at $01B7B3. Similar to GetDrawInfo, it is one of the most important and most commonly used routines in sprite coding. Thankfully, it ends with an RTL which means you can freely JSL to it. It takes the following inputs:
  • A: Tiles to draw. The value you enter is the amount of tiles you want to draw minus one, i.e. 0x00 means you draw one tile, 0x03 means four tiles.
  • Y: OAM tile size. Usually, you don't need to store to $0460,y because the routine already does it for you if desired. In Y for this routine, 0x00 means only 8x8 tiles and 0x02 means only 16x16 tiles. If you want to set the tile size manually, just load any negative value like 0xFF into Y and, obviously, store the sizes (either 0x00 or 0x02) to $0460,y yourself. Keep in mind that the tile size index is a quarter of the OAM index so you have to LSR the index twice.
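If you go the manual route, a minimal sketch could look like the following. It assumes that Y still holds the OAM index from GetDrawInfo and that a single 16x16 tile has already been written:

```asm
	PHY			; Preserve the OAM index
	TYA
	LSR
	LSR			; OAM index / 4 = tile size index
	TAY
	LDA #$02		; 0x02 = 16x16 tile (0x00 would be 8x8)
	STA $0460,y
	PLY			; Restore the OAM index

	LDA #$00		; One tile (tiles to draw - 1)
	LDY #$FF		; Negative value: the tile size has been set manually
	JSL $01B7B3
```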

With that information, we are basically ready to code a simple sprite graphics routine. Here is an example code which draws a blue mushroom:

	LDA $00
	STA $0300,y	; X position
	LDA $01
	STA $0301,y	; Y position
	LDA #$24
	STA $0302,y	; Tile number
	LDA #$26
	STA $0303,y	; Properties

	LDA #$00	; Tiles to draw - 1
	LDY #$02	; 16x16 sprite
	JSL $01B7B3

Our result:

Of course, we can optimise it a bit further. I mentioned above that you can make the palette and tile page controllable in the CFG file. Also, the default priority isn't always the same: some level modes like the vertical ones, for some reason, give every regular sprite the highest priority.

The YXPPCCCT properties for the sprites are stored in $15F6,x (palette and tile page stored from the CFG editor but you can overwrite them) and the default priority (actually properties but only the priority bits are stored there) for the current level mode is $64.
That means, we load the value in $15F6,x, ORA it with $64 and store it to $0303,y et voilà, you got a generic sprite graphics routine.
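In the mushroom example from before, this would replace the hardcoded LDA #$26 (a sketch, assuming X still holds the sprite index and Y the OAM index):

```asm
	LDA $15F6,x		; Palette and tile page from the CFG file
	ORA $64			; Add the default priority of the level mode
	STA $0303,y		; Properties
```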

Note: You are allowed to edit $64 but remember to restore the value back.

Part 4.3: Do some stuff!

Part 4.3.1: A simple coin shroom

Besides graphics, of course, we want to give the sprite some action. It also occupies a sprite slot until we find a way to destroy it or a level is loaded.

Since we are outside of the graphics routine, we first want to use SubOffScreen. It removes the sprite when it's too far offscreen. The distance depends on the value in A, which serves as an index for the actual values, so don't forget to load A before calling the routine!
(Note: The SubOffScreen for Sprite Tool uses different labels for the different values. The ranges are still the same, though.)

Note: The values are in relation to the screen borders. -$xx is the distance to the left side of the screen and +$yy is to the right side of the screen.
  1. X0 - This is found in most sprites (and such you likely want to use this setting). It ranges from -$40 to +$30
  2. X1 - It is used by the horizontal Para-Koopa, checkerboard and rock platforms, dolphins, hammer brothers and their platform but also Big Boo and line-guided sprites. It ranges from -$40 to +$A0
  3. X2 - It is used by the brown chained platform. It ranges from -$10 to +$A0
  4. X3 - It is used by the Eerie. It ranges from -$70 to +$60
  5. X4 - It is used by mushroom scale platforms. It ranges from -$90 to +$A0
  6. X5 - Apparently unused but usable. It ranges from -$80 to +$A0
  7. X6 - It seems to be unused, especially since it ranges from $40 to +$A0 (for clarification, the former is in the middle of the screen, hence the missing minus).
  8. X7 - It is used by the rotating ball 'n' chain and platforms but also the Mega Mole. It ranges from -$50 to +$60

It might seem weird that the despawn range at the right screen border is smaller than at the left one but here is the reason: Most sprites have got a width of 16 pixels (one tile). This means that sprites despawn at the same distance in both directions instead of despawning one tile further to the right than to the left.
The bigger differences are there because the sprites are either larger, move within a certain range or because of some other weird reasons.

We also want to make sure the sprite only runs in the normal state, i.e. when $14C8,x is 0x08. This is because custom sprites run their code in any state, not just when they're alive.
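Put together, the start of the main code could look roughly like this. The label names are placeholders, and the check of $9D (SMW's "sprites locked" flag, i.e. the freeze flag) is a common addition I assume here rather than something covered above:

```asm
MainCode:
	LDA #$00		; X0: the range used by most sprites
	%SubOffScreen()		; Despawn the sprite if it is too far offscreen

	LDA $14C8,x		; Sprite status
	CMP #$08		; Only run in the normal state
	BNE .return
	LDA $9D			; Is the game frozen?
	BNE .return

	; ... the sprite's behaviour goes here ...

.return:
	RTS
```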

After doing this, let's move on to what you really wanted to read: how to code a proper sprite, not just some graphics routines or shooters.
We look for sprite routines which give the sprite e.g. interaction or speed. Here is a short list of routines which you might need:
  • $01A7DC - Makes the sprite interact with Mario
  • $019138 - Makes the sprite interact with blocks
  • $018032 - Makes the sprite interact with other sprites
  • $01801A - Updates the sprite Y position
  • $018022 - Updates the sprite X position
  • $01802A - Updates both of the sprite's position and adds gravity to it

Now, what do we want to code first? Um... let's see... Oh, I have an idea! We try to make it give you some coins when you touch it!

For now, we have to think about what we need for the sprite. The only thing really required for the mushroom is the contact routine. But first, let's talk about tweaker bytes. Remember "Don't use default interactions"? This controls how the contact routine works: either as a mere contact checker or with an interaction code. If you don't check "Don't use default interactions" for the sprite, it behaves like an enemy, something which we don't want.
Now back to the coding part: After checking for the sprite state and freeze flag, jump to $01A7DC and check the carry flag. If it is set, the sprite touches Mario; if not, it doesn't.
In the contact code (it may be in a subroutine but that's not needed for smaller codes), we want to give Mario some coins. Pro tip: Instead of adding to Mario's coin counter directly, add to $13CC which increases your coin count too but in an increasing manner. Better yet: Load the amount of coins to give into A and jump to $05B329 which does the same thing but also increases the amount of coins you have collected in the level (see green star blocks which give you a 1-up after collecting 30 coins).
I also recommend putting the amount of coins you get into a define. And to add more user-friendliness, please use decimal numbers for the coin value. That is, you leave out the dollar sign in front of the number.
We also want to destroy the mushroom. In order to do that, simply set $14C8,x to zero (don't worry, if the sprite state is smaller than $08, the game counts such sprites as "dead" and you don't need to mark them manually).
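A minimal sketch of the contact part, using the $13CC method. The define name !Coins and its value are made up for illustration:

```asm
!Coins = 5			; Coins to give, in decimal (hypothetical define)

	JSL $01A7DC		; Contact routine
	BCC .noContact		; Carry clear: Mario doesn't touch the sprite

	LDA $13CC		; Coins which still have to be added to the counter
	CLC
	ADC.b #!Coins
	STA $13CC
	STZ $14C8,x		; Sprite state 0: the sprite is gone

.noContact:
	RTS
```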

That way, our sprite is basically ready.


I sometimes have got additional lessons. This is where I won't guide you through the instructions that much but rather you should find out about the stuff yourself. Neither will I include the bonus code with the source code (exception: this lesson).
Anyway, your lesson is to add a sparkle effect for the mushroom and also increase the score by some value.
Tips: The glitter is a smoke so you might want to use SpawnSmoke. For the score, there is a routine you can jump to in order to get some points but you have to look up the information yourself in the ROM Map. You also might want to add some kind of sound effect to the sprite.

If you have followed the instruction correctly, you get this sprite:

Part 4.3.2: Please move!

For the next part, we learn about movement and block interaction. We know that $01802A adds sprite movement with gravity. In fact, it also handles ground interaction, which stops the sprite from falling down further, but the sprite still keeps its momentum. More on that later.
In order to add movement, simply store any value to $B6,x (its vertical counterpart is $AA,x). Values smaller than 0x80 are positive and move the sprite to the right, values of 0x80 and above are negative and move it to the left. As an additional bonus, try to make the coin shroom move as fast as a regular mushroom.
Keep in mind that speed in SMW is divided by 16. That means, if the speed value is 0x08 then the sprite moves half a pixel per frame whereas 0x7F means the sprite moves almost eight pixels per frame.

But wait! You haven't told us how to determine the sprite direction!
Oh, right. The direction of a sprite is usually $157C,x. That way, you can use it as the speed index (tip: since we are outside the graphics routine, Y is free to use).
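As a sketch, the speed table idea could look like this. The table name and the two speed values are placeholders (they are not necessarily the speed of a regular mushroom), and the data bank has to be set to the sprite's bank for the table access to work:

```asm
	LDY $157C,x		; Direction as the table index (0 = right, 1 = left)
	LDA XSpeed,y
	STA $B6,x		; X speed
	JSL $01802A		; Update the position with gravity

; Somewhere outside of the code flow:
XSpeed:
	db $10,$F0		; Right, left (placeholder values)
```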

Do it correctly and you get something like this:

There is one problem, though: While the gravity routine certainly detects the ground, the same can't be said for walls. We need to use $1588,x for that. It is the sprite table which holds the directions in which the sprite is blocked.
In order to change the sprite's direction, check if it touches any wall (tip: you don't need to check the directions individually because the branch destination is the same for both sides). If it does, you need to flip its direction and invert its speed. To change the direction, flip bit zero of $157C,x (use EOR for that). To invert the speed, form the speed's two's complement and store it back, i.e. flip all the bits (EOR #$FF) and then increase the result by one.

In addition, I mentioned above that the sprite gravity routine still retains the sprite's momentum. We can fix that if we check for the ground and set the Y speed to zero.
While we are at clearing the Y speed: the gravity routine doesn't detect ceilings either but you can combine the ground with the ceiling detection.
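Both checks could be sketched like this. The bit layout of $1588,x is taken from the RAM Map (bit 0 = right, bit 1 = left, bit 2 = below, bit 3 = above), so double check it there; the direction itself lives in $157C,x:

```asm
	LDA $1588,x
	AND #$03		; Bits 0 and 1: blocked on the right / left
	BEQ .noWall
	LDA $157C,x
	EOR #$01		; Flip the direction
	STA $157C,x
	LDA $B6,x
	EOR #$FF		; Two's complement:
	INC A			; flip all bits, then add one
	STA $B6,x
.noWall:
	LDA $1588,x
	AND #$0C		; Bits 2 and 3: blocked below (ground) / above (ceiling)
	BEQ .inAir
	STZ $AA,x		; Clear the Y speed
.inAir:
```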

Here is how the mushroom should move:


We want to make the sprite behave even closer to the mushroom which means the following: The mushroom only moves when it is in the air and stays stationary on the ground when placed. This means we need a sprite table which we can use as a flag. SMW has got many miscellaneous sprite tables and $C2,x is one of them. It's often used to define the sprite's own states (here whether the mushroom should move or not) but any miscellaneous sprite table works.
Cape interaction is done by the cape itself. Remember what controls how a sprite interacts with the cape!

Part 4.2.3: Animating a 16x16 sprite

For the next part, we want to modify the graphics routine a bit. We want to animate a sprite. This is done by using the frame counter and a tile table index.

Creating a tile table isn't difficult, is it? All you need to do is to put a Tiles: db $xx,$xx,$xx,... somewhere (with "Tiles" being any label name). Now we need a way to access it. That is done by using one of the index registers. The only problem: We only have two index registers and both are already used! That doesn't mean it's impossible. Remember that there are many ways to work around it:
  • We determine the tile before we call GetDrawInfo by using Y for the tile index, load the tile and store it into scratch RAM.
  • We use X/Y for the tile index. After we get the tile (or store it if we use X), we restore the register.

Each of the possibilities has got its own advantages and disadvantages.
The first option is rarely used (if at all) so we use the second one instead.

Now for the value for the index:
We want to use a frame counter for that. Yes, 'a', not 'the'. SMW has got two kinds of frame counters: a regular one which increases each frame (unless there is lag) and one which only increases when the game isn't frozen. The former is $13 and the latter, the one we want to use, is $14.

Next, we want to slow it down a bit so we add a couple of LSRs for A (usually, 2 or 3 LSRs are enough). Then, since we only have two frames of data, we limit the value with an AND #$01. This is our index and we need to put it in X. Where you put the code doesn't matter that much but for 16x16 sprites it's usually put where the tile is stored.

You can also set the frame index in the sprite's main code and store it to $1602,x which many, if not all sprites in SMW use as the graphics pointer (though you can still use it as a miscellaneous sprite table). That way, you have got easier control over the graphics frames. In fact, the game's frame counter method only allows powers of 2 for the amount of frames whereas with a sprite's own frame counter, you can use any value between 0 and 255.
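A sketch of the second option with X as the tile index. The two tile numbers in the table are placeholders for whatever frames your sprite uses:

```asm
	PHX			; Preserve the sprite index
	LDA $14			; Frame counter (stops while the game is frozen)
	LSR
	LSR
	LSR			; Change the frame every 8 frames
	AND #$01		; Two frames: index 0 or 1
	TAX
	LDA Tiles,x
	STA $0302,y		; Tile number
	PLX			; Restore the sprite index

; Somewhere outside of the code flow:
Tiles:
	db $A8,$AA		; Placeholder tile numbers
```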

We also want the sprite to flip. We create another table, the tile flip table (a table with two values, one of them is $40, the other one $00). The problem is that you need to preserve and restore Y twice but fortunately, we can use some scratch RAM to preserve the value too, can't we?
Tip: Remember how we managed to get around scratch RAM with the help of a single opcode in a previous lesson (you have to find this out yourself - it's not found in the source code). Alternatively, recall how the sprite gets its properties. This might be a bit tricky, though.

Anyway, adding with some simple movement this is how our sprite can look like:

Part 4.3.3: Help! I'm attacked!

Now we want the Goomba to attack Mario. The first thing we want to do is to use the CFG editor. We want to check "Execute while being killed" so we can use our own graphics when the Goomba has been killed. The catch is that we need to modify the sprite a bit so that it stores the correct graphics.
Now, we don't need to code a whole new routine for each state. Remember $1602,x? We move the frame counter into the main code (after the sprite state check) and store the result to $1602,x. The graphics routine then loads $1602,x for the index.
For the Y flipped tile, remember what controls the properties? $15F6,x, a sprite table. When an enemy has been killed and falls to the bottom of the screen, its sprite state, $14C8,x, is set to 0x02, so when the code checks the sprite state, you jump to the behaviour for falling off the screen. You also set $1602,x to one of the frames (here, the second frame as this is what Goombas in SMW do).
Alternatively, you can use some scratch RAM with bit seven set when the sprite is killed and use it later.
The exact procedure doesn't matter because all you need to do is to somehow flip the tile (tip: I recommend the first option because of the tile index).
If done correctly, the Goomba has got a proper death frame.

Next, we modify the main routine. The first thing we want is to make the sprite face Mario when spawned (sprites don't do this automatically). For that, we need the help of another routine: SubHorzPos. This checks whether Mario is on the left side of the sprite or not. If Y is zero, he's on the right, else on the left. We can just store it to the sprite's direction. We also want to add sprite interaction so jump to $018032 in the main code.

Now for the interaction. We go back to the CFG editor and want to have "Can be jumped on" and "Falls straight down when killed" checked (so that the sprite doesn't hurt you when you stomp it; smushing a sprite is a bit buggy so we leave that out) and have "Don't use default interaction" unchecked. Keep in mind that the default interaction always clears the carry flag (or it's supposed to) because the interaction part has already been done.

This results in a Goomba which you can stomp on. We can experiment a bit further by turning it into a Spiny if we uncheck "Can be jumped on" and change the graphics appropriately.
We also could make it get squished when jumped on but for that, we need to use a different interaction routine as using SMW's squished state would be a bit too difficult.
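Since this only has to happen once, the init code is a natural place for it. A minimal sketch, assuming PIXI's %SubHorzPos() shared routine:

```asm
print "INIT ",pc
	%SubHorzPos()		; Y = 0 if Mario is on the right, 1 if he is on the left
	TYA
	STA $157C,x		; Face Mario
	RTL
```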

Finally, we also want the sprite to interact with other sprites so jump to $018032.

I also introduce you some more sprite tables which we didn't need before:
  • $1504,x, $1534,x, $1594,x and $160E,x are all general sprite tables with no special uses or properties.
  • $1510,x is uninitialised. You have to initialise it manually. Moreover, no sprite other than the brown chained platform uses it (that doesn't stop you from using it too).
  • $1528,x is, when activated with "Takes 5 fireballs to kill", the health points of a sprite à la Chuck.
  • $1540,x, $15AC,x and $163E,x decrease themselves each frame. $1540,x is similar to $C2,x but for timers, i.e. it is the most common sprite timer table.
  • $154C,x and $1564,x are other sprite tables decreasing themselves. The former is used to disable interaction with Mario and the latter with other sprites.
  • $1558,x also decreases itself each frame. Sprites killed in lava / mud use it to sink before they disappear but you can use it in the main code since being killed in lava / mud is a state of its own.
  • $15B8,x is the slope RAM. More information on the RAM Map
  • $15D0,x is the flag whether a sprite is eaten or not.
  • $161A,x controls whether the sprite will reload or not (0x00 to reload, 0xFF to not reload). Use it e.g. in custom death animations, when the sprite is supposed to be dead but is technically still alive (see e.g. the Super Mario Land sprites).
  • $1626,x is the counter of consecutive enemies killed by this sprite. Basically, when you kill enemies with shells, you can chain the kills to get 1-ups.
  • $164A,x is the "sprite is in water flag". It's non-zero when the sprite is in water and negative when in lava.
  • $187B,x is used in the default interaction to give the sprite some stomp immunity (think of chucks after you hit them).
  • $1FD6,x is unused but you can use it as a general sprite table. Not recommended unless you run out of sprite tables, though, since some UberASM codes rely on the unused sprite table (e.g. Erik's sprite shooter).
  • $1FE2,x is basically a temporary "Disable Water Splash" and "Invincible to star/fire/bouncing blk.". Newly initialised sprites also have this set by default for a couple of frames.

Depending on which routines you call, some special tables can be used as general tables because it's the routines which give the tables a use.


You can try to make our Goomba jump after some time. For that, we use $1540,x for the timer.

Part 4.3.4: "Your" own custom default interaction

Now we want to talk about custom default interactions. That is, we want to recreate or at least emulate SMW's contact routine. For reference, you can check "DefaultInteractR" in all.log for further information.
Here is what we know about what happens when you interact with the sprite:
  • If Mario has a star, kill the sprite. It gets some kind of boost in the Y direction and also moves horizontally.
  • If Mario is on top of the sprite, attack it, else hurt Mario (the routine takes care of that).
  • You can avoid getting hurt when you slide into the sprite (both on a slope and with the cape). Being on top still counts as a jump, though.
  • When the sprite hurts you, it'll (usually) turn towards Mario. But only when the sprite wins.
  • If you attack the sprite with a jump, let Mario gain a boost. $01AA33 handles this for you.
  • If you ride Yoshi, kill the sprite in a puff of smoke.
  • A spin jump is similar to Yoshi, i.e. kill the sprite in a puff of smoke, but only stop Mario instead of boosting him into the air. The game sets the player's Y speed to 0xF8 (that's to account for gravity).
  • In all three cases, you gain points or a 1-up if you do very well.

Yoshi's interaction is handled by Yoshi himself and the same goes for fireballs, the cape and bounce blocks. I'll explain the details and how you add custom interaction to them in a different chapter.
Also, when killing the sprite, remember to not set $14C8,x to zero as this will just erase the sprite. Falling, smushed and spin jumped are some of the various sprite states which the sprite should turn into (well, not smushed but the other two).

Now, how do you check whether Mario is on top of the sprite or not? Here is how:
SMW uses the clipping values to check whether he is on top of a sprite or not (code at $01A897). We check the difference between Mario's and the sprite's position instead. Fortunately, the general contact checker calls SubVertPos which stores the vertical difference between Mario and the sprite to $0E, and $0E hasn't been changed since. We then compare it with some kind of offset (usually 0xE6 but the exact value may differ depending on the clipping values). If the result is negative, Mario is on top, else he's on the sides / bottom.
The alternative method (used in SMW) is to use the clipping values directly, as the default interaction does. It's a bit complicated so I'll put the code here:
	LDA #$14	; Load 0x14 as the height of Mario's body
	STA $01		;
	LDA $05		; Get the sprite clipping Y position (low byte)
	SEC		;
	SBC $01		; Subtract the height of Mario's body
	ROL $00		; Preserve carry A
	CMP $D3		; Compare with Mario's Y position (low byte)
	PHP		; Preserve carry B
	LSR $00		; Get carry A
	LDA $0B		; Correct the high byte
	SBC #$00	;
	PLP		; Get carry B
	SBC $D4		; Subtract Mario's Y position (high byte)
	BMI SpriteWins	; Negative: Mario isn't above the sprite

The above code is pretty complicated (especially since it can be solved with 16-bit maths) but what it does is pretty simple: If the sprite's top clipping is below Mario's Y position + 0x14, Mario counts as "above" the sprite, else he's on the side or below. For a better visualisation: Mario's position is always the same regardless of his size; it's only shifted upwards (by 0x10 pixels, one block) when he's on Yoshi. Taking his position and his hitbox into account, this means 0xC pixels from his feet (0x1C from Yoshi's feet) count as "above the sprite".
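For comparison, here is how the same check might look with 16-bit maths. This is only a sketch of mine, not SMW's code; it assumes $05 and $0B still hold the Y position of the sprite clipping (low and high byte) and reuses the SpriteWins label from above:

```asm
	LDA $0B			; Sprite clipping Y position, high byte
	XBA
	LDA $05			; Sprite clipping Y position, low byte
	REP #$20		; 16-bit A
	SEC
	SBC #$0014		; Subtract the height of Mario's body
	CMP $D3			; Compare with Mario's Y position ($D3-$D4)
	SEP #$20		; 8-bit A (this doesn't touch the negative flag)
	BMI SpriteWins		; Negative: Mario isn't above the sprite
```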

There are other things which you shouldn't forget:
  • $1697 is the consecutive enemies stomped RAM whereas $18D2 is the same but for star killed enemies.
  • Speaking of $1697: SMW uses it to determine whether you can jump on the sprite while moving upwards, even if the appropriate tweaker bit isn't set.
  • The star counter only resets when you touch a sprite without a star or stand on a moving platform. Done that way, because... SMW.
  • Adding points and their respective sound effects has to be handled manually. The only exception is the 1-up where the sound effect comes from the score sprite instead of the interaction routine.
  • You are also the one responsible for not letting $1697 and $18D2 go beyond 8.
  • You have to add the star particles in case of a spin / Yoshi jump with the help of the routine at $07FC3B.
  • Speaking of particles, $01AB99 generates the contact graphics.
  • For easier customisability, you can add checks whether specific tweaker bits are activated or not. For example, you can check the "Can be jumped on" bit to make the sprite spiky or do some other weird stuff (keep in mind that you don't get hurt when you jump on a spiky enemy with a spin jump or on Yoshi).

For this part, I have just included a custom generic interaction routine, which you are free to modify or build upon, instead of a whole sprite. But don't think you won't have any need to study it! In fact, you should experiment with it.

Here are some things you can try:
  • You can add some kind of timer when attacking the sprite. That way, you can add a squished state to the sprite (no, it [still] isn't setting the sprite state, else we would have done this long ago).
    For that, when attacking the sprite, you set a miscellaneous sprite table to mark the sprite as squished and let it disappear after some time (or revive it if you feel like it).
    (By the way, this is something you really should do.)
  • Similar to above, you can add a hit point system.
  • The Goomba can behave as a kind of moving springboard.
  • Spin jumps count as regular or Yoshi jumps and vice versa.
  • Spin and/or Yoshi jumps don't work when the sprite is spiked.

Part 4.2.4: How to draw two tiles

Now you're ready to draw more than one sprite tile. It is a bit more complicated but still not groundbreaking. Here is how sprites draw two or more tiles:
  • Most sprites draw their tiles in a loop.
  • Sometimes, you can set multiple tiles at once.

The former is usually needed for larger and more complicated graphics but you might want to use the latter for sprites which use two OAM tiles. In fact, SMW even does the latter for most of the two tile sprites.

So... let's make a copy of the file, change the ASM source in the new CFG to the new ASM file and try to turn our Goomba into a Koopa (at least graphically). In order to do that, keep in mind that each OAM tile uses four bytes (excluding the X position high byte and size but you get the idea). As such, in order to get to the next tile, you increase the RAM address by four. That means, $0300,y is the first tile's X position, $0304,y is the second one's and $0308,y the third one's. You get the idea for the other tiles and other settings.

Next, we need to know how to move tiles. We do this by adding some value to the position before storing it. Moreover, remember that adding a positive value shifts the tile to the bottom / right and adding a negative value (which in eight bits means $80 and above) shifts the tile to the top / left. We want to move the second tile upwards by one block. This means 0x10 pixels. But since positive values mean "move the tile to the bottom", we have to take its negative, 0xF0, for that. You add it to the position before you store the result to the OAM tile position (here, the Y position).
Moreover, you don't need to load stuff multiple times for each tile. For example, after you have loaded the tile's X position, you can simply store it to $0300,y and $0304,y without loading the position again.

Lastly, don't forget to change the tile count since we draw two tiles, not one.
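All of the above combined could be sketched as follows. The two defines and their tile numbers are placeholders for your own graphics:

```asm
!TileTop = $82			; Placeholder tile numbers
!TileBottom = $A2

	LDA $00
	STA $0300,y		; X position, first tile
	STA $0304,y		; X position, second tile (no reload needed)
	LDA $01
	STA $0301,y		; Y position, first tile
	CLC
	ADC #$F0		; 0xF0 = -0x10: one block upwards
	STA $0305,y		; Y position, second tile
	LDA #!TileBottom
	STA $0302,y		; Tile number, first tile
	LDA #!TileTop
	STA $0306,y		; Tile number, second tile
	LDA $15F6,x
	ORA $64
	STA $0303,y		; Same properties for both tiles
	STA $0307,y

	LDA #$01		; Two tiles (tiles to draw - 1)
	LDY #$02		; 16x16 tiles
	JSL $01B7B3
```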

Testing the sprite, it works fine...

... except when we kill it:

Just because we flipped a tile doesn't mean we flipped the whole tilemap!
16x32 sprites in SMW don't flip their tiles and just fall offscreen instead but if you feel like doing that, either swap the tiles or make the offset a variable and invert it when the sprite dies.
Another problem is that the tiles are unflipped for one frame. That's because we have set the Y flip after the graphics routine. Either use the custom generic interaction and set the Y bit in YXPPCCCT there, check the sprite state before calling the graphics routine (but still make the sprite draw when not killed) or do something different: check if the tiles themselves have been flipped and then do the offset flip.

Anyway. That's it! You now know how to draw multiple tiles in one shot (i.e. outside of a loop).


With that information, you have enough knowledge to give the Goomba a squished frame in a similar way to how SMW does it (two 8x8 tiles, same tile number but one is flipped horizontally). Just don't forget that we draw 8x8 tiles instead of 16x16 tiles!

Part 4.2.5: Complicated graphics routine.

Next, we want to set up the tiles in a loop instead of putting them all at once. Setting up a loop isn't really difficult since you just need to use some kind of loop count (usually X or some kind of scratch RAM). Most use X for that but we use scratch RAM instead. More on that later.

Be careful: one of the problems with drawing tiles in a loop is that one index is used for the OAM tiles and another one to get the tile data. This means, if you want to access sprite tables, you either have to restore the sprite index (tip: load it from $15E9 instead of using the stack) or store the necessary values in scratch RAM beforehand (e.g. properties are often done that way).

Anyways. Here is how drawing sprites in a loop differs from drawing them outside of one (besides the problem with the index):
  • All the data (usually except for the properties and sometimes the tiles) needs to come from tables.
  • In addition, before you decrease the loop count, you have to increment Y by four (i.e. four INYs), else you would only ever write to a single OAM tile.

Now you might ask: But which values should I take? That depends on the hitbox. Our sprite currently uses a hitbox for 16x16 sprites, though you can use the CFG editor and change the sprite clipping index to 0x16 to at least make it bigger for sprites.
Anyway. Which values do we have to choose? Well, let's think about how to make our hitbox not stand out too much: We want the hitbox to be as centred on the graphics as possible. However, the sprite's object hitbox is located at the bottom, which is why the graphics have to be offset vertically. In other words: The tiles should extend to the top and equally to both sides in the horizontal direction. Here are the values we need for now:
For the vertical displacement, we use 0xF0 for the top tiles and 0x00 for the bottom tiles.
In the horizontal direction, we use 0xF8 (left tiles) and 0x08 (right tiles). 0xF8 and 0x08 instead of 0x00 and 0x10 because the sprite itself is centred in the horizontal direction and not placed at the left side of the hitbox.

We put these values in the displacement tables. Yes, "tables". We need two tables, one for the X displacement and another one for the Y displacement.
Tip: Lay the values out as 2x2 tables instead of 4x1 tables. That way, you can see more easily which value belongs to which tile.

Keep in mind that we only know how to draw a sprite with just one frame.
In order to draw an animated 32x32 sprite, we need to copy our tile table and set the values for the second frame. Next, we want to make sure the graphics routine can access the second table.
Remember how we determined the frames for a single tile? For larger sprites, there is a small twist: After you got the correct value, you use two ASLs afterwards. Or better yet: You remove as many LSRs as you need ASLs and shift the value on the AND accordingly.
Next, we come to the tile flipping part. The procedure is similar to the tile table. This time, we don't have LSRs to cancel out and just use two ASLs. Keep in mind that we need to duplicate our X displacement table but with mirrored offsets, i.e. the offsets for the right tiles become the ones for the left tiles and vice versa.

Keep in mind that this only works when the number of tiles per frame is a power of two (i.e. one, two, four, eight, etc.). For other values, I recommend you to make an offset table.

There is still one last thing to cover before the 32x32 sprite animates and flips its tiles: Adjusting the index to make the sprite use the second tables instead of the first ones.
This is done by getting the loop counter into A (this is why the loop counter is in scratch RAM, not in X), adding the offset for the necessary table to A and copying the result to X with TAX (fortunately, if you do it this way, the loop counter itself is never overwritten).
If you do use X for the loop counter, though, don't forget to preserve and restore X so that the loop doesn't get messed up.
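
One possible way to put all of this together is sketched below. Treat it as an illustration, not as the routine from the source code: the tile numbers and the scratch RAM layout ($02 to $05) are assumptions, %GetDrawInfo() is PIXI's macro and the graphics are assumed to face left by default.

XDisp:
	db $08,$F8              ; facing right (mirrored): top tiles
	db $08,$F8              ;                          bottom tiles
	db $F8,$08              ; facing left: top left, top right
	db $F8,$08              ;              bottom left, bottom right
YDisp:
	db $F0,$F0              ; top row: one block up
	db $00,$00              ; bottom row: on the sprite's position
Tiles:
	db $80,$82,$A0,$A2      ; frame 1 (hypothetical tile numbers)
	db $84,$86,$A4,$A6      ; frame 2

Graphics:
	%GetDrawInfo()          ; Y = OAM index, $00/$01 = position on screen

	LDA $14                 ; frame counter
	LSR                     ; same result as LSR #3 : AND #$01 : ASL #2
	AND #$04
	STA $02                 ; $02 = offset into Tiles (0 or 4)

	LDA $157C,x             ; direction: 0 = right, 1 = left
	ASL #2
	STA $03                 ; $03 = offset into XDisp (0 or 4)

	LDA $157C,x
	LSR                     ; carry = direction
	LDA $15F6,x
	BCS +
	ORA #$40                ; facing right: flip the tiles horizontally
+
	ORA $64
	STA $04                 ; $04 = properties for all four tiles

	LDA #$03
	STA $05                 ; $05 = loop counter (four tiles)
.loop
	LDA $05
	CLC
	ADC $03
	TAX                     ; X = tile index plus direction offset
	LDA $00
	CLC
	ADC XDisp,x
	STA $0300,y

	LDX $05                 ; the Y displacement needs no offset
	LDA $01
	CLC
	ADC YDisp,x
	STA $0301,y

	LDA $05
	CLC
	ADC $02
	TAX                     ; X = tile index plus frame offset
	LDA Tiles,x
	STA $0302,y

	LDA $04
	STA $0303,y

	INY #4                  ; next OAM tile
	DEC $05
	BPL .loop

	LDX $15E9               ; restore the sprite index
	LDY #$02                ; all tiles are 16x16
	LDA #$03                ; four tiles drawn, minus one
	JSL $01B7B3
	RTS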

Here is how our results look:

Don't they look beautiful?

That's it! That's how far a simple 32x32 (and to an extent, any multi-tile) GFX routine goes.
How exactly you draw the tiles is up to you. Experimenting with the graphics routine and seeing how it works is the key to drawing sprites.
You can check the source code for a way to draw a 32x32 sprite. Or not, and develop your own routine.

For the most part, this is all you need.

But if you really understood how to draw multiple sprite tiles, there are three more things to mention:
  • As mentioned above, the priority between OAM tiles depends on which tile is drawn first, regardless of the priority in its YXPPCCCT properties.
  • In some other cases, you have special tiles which must be handled differently (e.g. a tile separate from the sprite's initial position). I recommend you to draw them outside the loop, else the setup would get complicated. But don't forget to work with the updated index!
  • There also are cases where you want to have tiles of different sizes. This is a bit complicated since, as mentioned in the chapter about OAM tiles, the sizes are located in a separate table. And to make it more complicated, the OAM index in SMW is meant for the main table, not for the smaller table. This means, you have to find a way to divide Y by four (two LSRs).
    But don't worry, the procedure is similar to getting the correct table index but with Y instead of X and division instead of addition, i.e. copy Y to A, use two LSRs and copy it back to Y. Now you can store the size to $0460,y. The only values you can store there are 0x00 (8x8) and 0x02 (16x16). Finally, and that's the most important part, when loading Y for the OAM finisher, it must be negative (we usually use 0xFF for that) if the sprite uses tiles of different sizes*.

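	; parts of the GFX routine

	PHY                 ; keep the OAM index for the main table
	TYA
	LSR #2              ; divide it by four for the size table
	TAY
	LDA Sizes,x         ; 0x00 = 8x8, 0x02 = 16x16
	STA $0460,y
	PLY                 ; back to the main table index

	; rest of the GFX routine

	LDY #$FF            ; negative: the sizes have been set manually
	LDA #$something     ; number of tiles drawn, minus one
	JSL $01B7B3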


Because of complaints about the old tutorial: if you really did understand this chapter, try to modify the given code in such a way that you get a 64x64 sprite.


I also have got a second bonus lesson.
This time, we make the hitbox 32x32 because a hitbox smaller or bigger than the actual graphics certainly looks weird. As you know, Mega Moles use an actual 32x32 hitbox. Their object clipping index is 0x07 and their sprite clipping index is 0x2F (at least, that's what I think).
This also means you have to adjust the displacement tables so that they account for the different hitbox. As you know, the darker box is the sprite's main position, i.e. where the X tile is located when you place the sprite in Lunar Magic. Think about it and try to put the correct values into the tables.

Second Midway Point

I wasn't kidding that sprites are even harder than shooters. Don't be surprised if you got lost in this tutorial. Take a break and read the parts you didn't get again.
Anyway. This knowledge is enough to create regular sprites and even to support the sprites section.
In the next parts, we will discuss how to turn a sprite into a different one and talk about extra bytes, hitboxes and health points alongside a simple boss.

After the boss, there is a third midway point where we talk about more specific stuff like carryable sprites, line-guided sprites and dynamic sprites.

Part 5: Advanced sprite creation

These sections are more random than the first part as they all contribute to making fancier sprites but are not "really" necessary to create sprites.

Part 5.1: Extra Whatever

This part is about Extra Property Bytes, Extra Bytes and the Extra Bit.

To explain what these extra things are: In order to save ROM space, there are settings which are set by the sprite number or the level data instead of the ASM file.
  • What the settings do is coded in the ASM file. How you implement it is up to you, though conditionals (see below) are the most common way.
  • The property bytes are additional sprite tables whose values depend on the CFG/JSON file.
    The first byte is $7FAB28,x and the second one $7FAB34,x.
  • Most sprites in SMW usually take three bytes in the sprite level data but Lunar Magic allows you to expand this by a couple bytes. This is what we call extra bytes.
    How many extra bytes a sprite can use depends on the CFG/JSON file. If you remember chapter 3, you can change the value in the CFG Editor in the field for the extra byte count. The addresses where the extra bytes are stored are $7FAB40,x, $7FAB4C,x, $7FAB58,x and $7FAB64,x.
  • In addition, you get the familiar extra bit (fittingly called "extra bit") which is bit 2 of $7FAB10,x.
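
As a small sketch of how these tables can be read, assuming the plain (non-SA-1) addresses $7FAB10,x for the extra bits and $7FAB40,x for the first extra byte, and with X holding the sprite index:

Main:
	LDA $7FAB10,x           ; extra bits of this sprite
	AND #$04                ; bit 2 is the extra bit
	BEQ .clear
	; code for a set extra bit goes here
.clear
	LDA $7FAB40,x           ; first extra byte, as entered in Lunar Magic
	; A can now be used as a setting, e.g. as a speed or a timer
	RTS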

Now we go to conditionals: Conditionals are an Asar feature which allows you to control which parts of the code get assembled. If you have ever coded in other programming languages, you may know this from their preprocessors (such as the one in C).
Anyway, the usage is simple:
if !Define
	; Something
endif

To keep it short, anything between an "if" and an "endif" only gets assembled if the value after the "if" is non-zero. If you set "!Define" to zero, Asar skips the code inside the conditional; any other value tells Asar to assemble it.
Furthermore, you can refine the condition with operators like "==" (non-zero if both values are equal) and use "else" for alternative code in case the condition is zero.
Check Asar's readme for more information.
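
For illustration, here is a sketch with "==" and "else". The define !Fast and both speed values are made up for this example:

!Fast = 1                   ; hypothetical setting: 0 = slow, 1 = fast

if !Fast == 1
	LDA #$18            ; only assembled when !Fast is 1
else
	LDA #$08            ; assembled for every other value
endif
	STA $B6,x           ; sprite X speed

Only one of the two LDAs ends up in the ROM, so the setting costs neither space nor time when the sprite runs.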

Next, we come to the actual coding. We want to turn our Goomba into a Goombrat, i.e. give him the ability to turn around at ledges. We do this by separating the ground and ceiling contact. The ceiling contact then just clears out the Y speed.
Now we come to the ground routine. We need code for both cases, the sprite in the air and the sprite on the ground. On the ground, aside from clearing out the Y speed (which is a must because of the momentum), we also set some sprite table (e.g. $151C,x) to zero. In the air, we check if said sprite table is zero. If yes, flip the sprite and also mark the sprite as flipped.

Next, we want the flip to be only applied when the appropriate setting is activated. Try to find a way on that one.


We want to combine the ledge-Goomba with a jumping Goomba into a single ASM file. How can you combine both kinds of Goombas into one ASM file?
Tip: You need to use AND #2^x (i.e. a value with a single set bit such as #$01, #$02 or #$04) to get individual bits.

Side note: It sometimes might be a good idea to save the extra bit, property bytes and extra bytes into a sprite table (in the init routine) or into scratch RAM inside routines. The reason for that is that their RAM addresses are 24-bit (except on SA-1 if the BW-RAM mirrors are used) and long addresses only support a few addressing modes. It also helps when a setting needs to be shifted before you can use it (especially if it takes up multiple bits), since you then only have to shift once.

Part 5.2: Transforming a sprite into a different one and approximation checks

Remember when we spawned a sprite from a shooter? Let's recap this:
  • Search for an empty sprite slot.
  • We set the old sprite's position plus-minus some offset to the new one.
  • Set the sprite number.
  • Initialise sprite tables.
  • We either set it in an initialised state or already living, sometimes stunned (0x01, 0x08 and 0x09, respectively).
  • Set the sprite tables for the sprite (when spawning in an already existing state).
  • Set extra bytes and mark the sprite as custom and call the custom sprite initialise routine (if it is custom).

You can use most of that knowledge to transform a sprite into another one. The most notable differences:
  • The biggest one is that you don't need to change the index. That's because you don't search for an empty sprite slot.
  • Setting the position is unnecessary as well because the initialise sprite routine doesn't clear out the positions.
  • In case the sprite is supposed to be "dead" when it transforms into a different sprite, set $161A,x to 0xFF.
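
A minimal sketch of such a transformation, assuming the target is a vanilla sprite (0x0F, the Goomba, is picked arbitrarily) and X still holds the slot of the current sprite. Turning into a custom sprite needs the additional steps from the recap above.

Transform:
	LDA #$0F            ; hypothetical target: sprite 0x0F
	STA $9E,x           ; new sprite number, same slot
	JSL $07F7D2         ; reset the sprite tables for the new sprite number
	LDA #$08            ; already living (0x01 would run its init routine first)
	STA $14C8,x
	RTS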

There is a lot we can do but first, we create a sprite whose only ability is to transform itself into another sprite. Immediately, when it spawns. (This also means you don't need to use routines like GetDrawInfo as the sprite only runs its code once.)

The idea is to simulate so-called "run-once" sprites. Well, "sprite" is the wrong term: when the game loads a run-once "sprite", it simply loads a sprite in a different way than usual. That allows it to spawn sprites in a different state than usual (e.g. stunned Koopas aka Koopa shells) and/or multiple sprites at once (e.g. 5 Eeries and 3 gray chained platforms).
Having a sprite which only contains a main routine with code to transform into a different one works similarly. To see this in action, look into the kicked shells. All they do is transform into kicked Koopas (shells) with some X speed.

We want more!

Your lesson is to code an already crumbled Dry Bones and Bony Beetle which will revive themselves after some time. Which values you have to set can be seen in all.log.
$1534,x is the crumbled state and $1540,x is the timer for how long it stays crumbled.

Now we try something else. We want our sprite to transform into something else but only when Mario is close to it. For that, we use some kind of proximity check like SubHorzPos and SubVertPos.
You know the former from when we used it to determine whether the sprite is on the right or the left side of Mario and set its facing direction. It has got another use: It outputs the distance between Mario and the sprite. There are multiple versions of it but the way PIXI's routines work is that SubHorzPos stores the result to $0E (16-bit) and SubVertPos to $0F (low byte) and A (high byte).
After you have gotten the output, go to 16-bit mode and check if the value is negative. If yes, negate the value, else don't. Compare it with the distance you have set (which may or may not differ between the directions). That being said, there are cases where you only want a proximity check in one direction. For one, you save some work and time (yours and the SNES's); for the other, it might even be required.
Even SMW sometimes checks just the horizontal direction (e.g. Thwomps) but sprites checking in both directions aren't unheard of (e.g. if you're too high above or too low below a Rip Van Fish, it won't chase you even if you have got the same X position).
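
For the horizontal direction, the check described above can be sketched like this. The range !Range is a made-up value and %SubHorzPos() is PIXI's macro with the 16-bit distance in $0E:

!Range = $0040              ; hypothetical trigger distance in pixels

CheckRange:
	%SubHorzPos()       ; $0E-$0F = 16-bit distance between Mario and the sprite
	REP #$20            ; 16-bit A
	LDA $0E
	BPL +
	EOR #$FFFF          ; negative: invert all bits...
	INC A               ; ...and add one to get the absolute value
+
	CMP.w #!Range       ; carry clear = distance is smaller than !Range
	SEP #$20            ; back to 8-bit A (doesn't touch the carry flag)
	BCS .tooFar
	; Mario is close enough: do the action here
.tooFar
	RTS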

Or if you really want it fancy, you can use a circular proximity check. You do this by taking the differences between Mario's and the sprite's positions, squaring them individually and adding them together (aka the Pythagorean theorem with r² = (xM - xS)² + (yM - yS)²). In order to square a value, simply multiply it with itself (though if the radius is set inside the sprite, you can square it in Asar). xM is Mario's X position, xS the sprite's X position, and yM and yS are the same but for the Y position. If the result is smaller than the square of the radius, then do the action.
Keep in mind that a circular proximity check is a bit slower than a rectangular one and the radius is quite limited too: squaring doubles the number of bytes you need, meaning that you can't check for distances larger than a whole screen (unless you find a way to do a 16-bit * 16-bit multiplication on the SNES, which makes the routine even slower).

You also can add an effect like a smoke animation when transforming. For the sprite you should create now, though, you need to recreate the Bone Pile, i.e. modify the crumbled Dry Bones in such a way that they revive if and only if Mario is close to them. (But please, for the sake of the tutorial, don't look up the code!)

Part 5.3: Custom Sprite Interactions

Now we start making a custom sprite interaction. Specifically sprite <-> sprite interaction, but also the interaction with the cape, fireballs and bounce sprites.

You know that you can add a sprite<->sprite interaction with $018032. In addition, the interactions with fireballs, capes and bounce sprites (and Yoshi stomps, which are basically bounce sprites) are handled by those themselves. You can hijack each of the interactions but this is a rather bad idea. Luckily, you can create a custom interaction yourself.

Part 5.3.1: Sprite To Sprite Interaction

Before we go on and make a custom sprite interaction, we first talk about how SMW handles it: Sprites only run their interaction code every second frame (the frame counter is XOR'd with the sprite index). That's pretty obvious as it reduces the number of cycles. However, sprites also only interact with other sprites if the former are in a higher slot than the latter. This has a side effect: if a sprite has interaction with other sprites enabled but doesn't call the sprite<->sprite interaction routine itself, others can only interact with it if they are in a higher sprite slot. This is a known phenomenon in SMW with e.g. hopping flames and Thwomps, which can only be "randomly" killed by a shell.
Now, why is the interaction done that way? Because that, obviously, avoids slowdown too.
Do the maths: SMW can run up to 12 sprites onscreen. Since a sprite won't interact with itself, there are 11 other sprites to check for. Repeat that for each of the twelve sprites. The resulting total is 12*11 or 132 checks and SMW can't afford that (SA-1 allows you to have up to 22 sprites onscreen which means there are 22*21 or 462 checks).
Now we only check for sprites in the lower sprite slots. Let's start with slot 12: There are 11 sprites to check. The next slot is 11 which checks for 10 sprites, followed by slot 10 with 9 sprites, etc. This means we have the sum of the natural numbers up to 11. Since the sum of all natural numbers up to n is a triangular number, we use the formula n(n+1)/2. If n equals 11, we get 11*12/2 = 66.
As you can see, the number of checks has been halved.

There are exceptions, though: Bosses. Bosses use a custom interaction in which they check for all sprites. That makes sense considering that it can be really annoying if you can only attack with a few sprites, and since there usually is one boss, at most two bosses on screen, time isn't as much of a problem for them. And even for non-boss sprites, there are cases where you want to check for all sprites (save for the sprites in slot 10 and 11 which are reserved for powerups). Explosions (which are just Bob-Ombs in disguise) also check all sprite slots, which is one of the reasons why the game lags if there are too many explosions onscreen.
Now, how should we check for the sprites? Loop over all sprites, of course. A custom interaction is useless if the boss can interact only with a few chosen enemies.

Now, here we are at the actual coding. As a matter of fact, you loop through all sprites and check if their state is 0x08 or higher. That's quite obvious. But you also should check if the other sprite is supposed to interact with the custom sprite before you check for their clipping (possible filters: fireball-cape-bounce-block-invulnerability, whitelist, blacklist, carryable and kicked, etc.).
You then check if they touch each other. This is done by calling the routines $03B69F and $03B6E5. The only difference between the two is where the clipping data is stored to. Otherwise, both are identical. In fact, both get the clipping data of the sprite in whatever slot X holds, so make sure to transfer Y to X before you call one of the two get clipping routines for the other sprite.
After getting the clipping data of both sprites, call $03B72B. The output you get is the carry flag which is set if both sprites touch each other and clear if not (think of the Mario<->sprite interaction routine which calls that routine too).
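
The loop described above can be sketched as follows. It assumes a non-SA-1 ROM with 12 sprite slots and leaves the filters and the reaction as comments, since those depend on your sprite:

Interact:
	LDY #$0B            ; highest sprite slot (assuming 12 slots)
.loop
	CPY $15E9           ; don't check against ourselves
	BEQ .next
	LDA $14C8,y
	CMP #$08
	BCC .next           ; state below 0x08: not alive
	; further filters (whitelist, blacklist, ...) go here

	JSL $03B69F         ; clipping of our sprite (slot in X)
	PHX
	TYX                 ; the routine reads the slot from X
	JSL $03B6E5         ; clipping of the other sprite
	PLX
	JSL $03B72B         ; carry set = both sprites touch
	BCC .next
	; contact: react here, Y is the slot of the other sprite
.next
	DEY
	BPL .loop
	RTS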

You then are free to do whatever you want. You can make an HP counter, a sprite killing other enemies, etc. The sky is once again your limit.

Currently, there are no plans for a sprite which can benefit from that information but you will need this knowledge for the boss chapter.

Part 5.3.2: Sprite To Others Interaction

So, sprite<->sprite interaction wasn't that difficult and using other sprite types isn't really difficult either. The only difference is the clipping routine you call for the other sprite type. Fireballs (and extended sprites in general) get their clipping from $02A547, quake sprites and Yoshi stomps from $029663.
Capes are a bit different, though, because their clipping data comes from $029696 which is an RTS routine. You need to use JSL2RTS to get the data or you can use the code below:
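	LDA.W $13E9     ; cape X position, low byte
	SEC             ; a subtraction needs the carry set first
	SBC.B #$02
	STA $00         ; clipping X position, low byte
	LDA.W $13EA
	SBC.B #$00
	STA $08         ; clipping X position, high byte
	LDA.B #$14
	STA $02         ; clipping width
	LDA.W $13EB
	STA $01         ; clipping Y position, low byte
	LDA.W $13EC
	STA $09         ; clipping Y position, high byte
	LDA.B #$10
	STA $03         ; clipping height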

Keep in mind to call $03B69F and not $03B6E5 to get the sprite clipping data because the latter stores to the same scratch RAM which is already used for the clipping data of the other sprite type.

As in the previous sub-subchapter, I can't think of any example which benefits from this outside of a boss. Well, maybe with the exception of the custom mushroom we created earlier.

Part 6: HP, RNG and Boss Creation

Actually, no, we don't. I haven't decided what boss to create so let's skip that part.

Goal Point!

That's it! Now you have pretty much the basic knowledge of coding sprites.

This isn't all of it, though, there still is a secret exit left with the topics about extended and cluster sprites, non-standard states (such as carryable), line-guided sprites, dynamic sprites, etc.

Bonus Parts

Additional information I left out of the main tutorial because it isn't really "required" or is "unrelated" to sprites.

Bonus 1: SA-1 support

Now, with the increased use of SA-1, we want to make the sprite support this enhancement chip. To explain SA-1 briefly: It is a chip for the SNES included in a couple games like Super Mario RPG. It runs roughly four times as fast as the SNES CPU (or rather as fast as that one reads the ROM in SlowROM) and is commonly used for speedup but sometimes also for bitmap conversion (which makes it easier to transform sprites).
The problem: Most of the RAM and even ROM addresses have to be remapped to different addresses to work with SA-1. You can omit SA-1 support but keep in mind that moderators may or may not do the job of adding SA-1 support for you.

Anyway, the conversion isn't really a hard thing. In fact, PIXI even features a couple SA-1 defines, specifically !dp, !addr, !bank (and their old counterparts !Base1, !Base2 and !BankB), !BankA and sprite defines.
  • !dp is used to remap the addresses $0000-$00FF but unless you're forced to use absolute addressing (such as loading and storing indexed with Y or using indirect jumps), it's usually not required.
  • !addr is more commonly used. It is for the addresses $0100-$1FFF.
  • !bank is also very common. It is used for ROM addresses. In contrast to the other defines, this one actually is zero on SA-1 and $800000 on SNES. That's because we almost never work beyond a 4MiB ROM and banks $80+ are the FastROM mirror so we jump there instead of banks $00+. On SA-1, it's always its own space, though.

  • !BankA is the rarest SA-1 define you will use. It is solely used for the Map16 addresses which have been remapped to BW-RAM. Other WRAM in banks $7E/$7F is either left that way (not recommended since the SA-1 chip handles sprites) or uses a different address altogether. In fact, if you have FreeRAM, I recommend you to make it independent of !BankA.

  • In order to use these, you simply need to OR the address with the define. For example, $15E9 becomes $15E9|!addr.

    For sprite addresses, this is a bit different: Since version 1.10, SA-1 Pack allows one to use more sprites onscreen. This means sprite tables have to use different RAM addresses. They too are defines. In fact, the defines' names are the same as the sprite tables' addresses, i.e. all you need to do is to change the dollar sign into an exclamation mark, or use PIXI's named defines which you can check out in sa1def.asm (see bonus 2).
    By the way, Asar converts to the correct addressing mode when the address is indexed by Y (which, as mentioned above, only exists with absolute addressing when used with A). If you feel this is too unsafe, force Asar to use absolute addressing (i.e. use a .w).
    That being said, this only applies to regular sprites. Any other RAM address, unless defined as such, needs to use the above defines, including the RAM tables of other kinds of sprites (don't worry if you forget it, it happens to everyone, myself included).
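
    A small sketch of where each define goes, assuming PIXI's defines are available. The lines are unrelated to each other and only serve as examples of each kind of address:

    	LDX $15E9|!addr         ; RAM in $0100-$1FFF: OR it with !addr
    	LDA !14C8,x             ; sprite table: "$" swapped for "!"
    	LDA $00                 ; direct page addressing: no define needed
    	LDA.w $0000|!dp,y       ; $0000-$00FF with absolute addressing: OR it with !dp
    	JSL $01B7B3|!bank       ; ROM address: OR it with !bank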

    There also are cases where you want to use SNES registers. Unfortunately, SA-1 is isolated from the SNES, registers included. That being said, you likely will only use the multiplication and division registers (see below) which exist not only on the SNES but, in a better form, on SA-1 too. See the appropriate bonus chapter on how to use both arithmetic operations. That, and try to use conditionals like here:
    if !SA1
    	; SA-1 code here
    else
    	; SNES code here
    endif

    And just as a big, big, big reminder: When you create SA-1 hybrids, don't think that converting all RAM addresses means that your sprite is already compatible with SA-1! You need to test it on an SA-1 ROM to make sure there aren't any bugs which only appear in the SA-1 version or any missed conversions!

    As for romi's Sprite Tool... Unfortunately, it only supports SNES natively, in part due to the fact that the code is inside the program and in part because Asar support was added just for the sake of having Asar support. Worse yet: The latest SA-1 compatible version is based on version 1.40 which officially only supports Xkas (the Asar version is version 1.41, btw.) and sprites have to be converted to SA-1 only.

    Bonus 2: Having a readable code

    Nothing in coding would work if the entire code were undocumented (granted, it could, but it would be very difficult). Here are a couple tricks (besides using comments):

    RAM defines

    Why would you ever need to use RAM defines (other than for FreeRAM, of course) when you can just use the RAM addresses anyway? Well, it's for easier code management (and SA-1). It's often nice to see which function a RAM address serves (and for SA-1 hybrids, you need defines anyway). For example, we can put $C2,x of the coin mushroom in a define and name it the moving flag. That's all there is to RAM defines.
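
    As a sketch, such a define could look like this. The name !MovingFlag is made up for this example and !C2 is assumed to be the sprite table define from the SA-1 chapter:

    !MovingFlag = !C2           ; $C2,x holds the moving flag

    Main:
    	LDA !MovingFlag,x       ; reads like what it does
    	BEQ .notMoving
    	; movement code goes here
    .notMoving
    	RTS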

    Splitting the sections

    It's also recommended to split the code into sections so you know where you need to edit it.
    Here are some common approaches:
    • The main code is separated from its header and the init routine.
    • If the init routine is large enough, you might consider putting it in its own section.
    • Subroutines are separated from the main code.
    • Individual subroutines are separated from each other too.

    Bonus 3: State Pointers

    If you want a sprite to have multiple states then don't use a chain of comparisons. Instead, we use some kind of pointers. There are four ways of doing this:
    1. You can use SMW's built-in pointer routine by calling $0086DF (needs to be a JSL). The list of pointers comes directly after the JSL. Which pointer is chosen depends on the value in A. Keep in mind that the routine doesn't set a return address. If you want to jump back after the routines, you have to set the return address manually.
    2. You can double A, put it to an index register, load the address from the pointer table and store the result to some kind of scratch RAM. You then use JMP.w ($0000) or whatever scratch RAM you use (keep in mind that the address needs to be absolute, so you have to add a "|!Base1" to addresses $0000-$00FF for SA-1 compatibility). Note that there is no JSR.w ($0000) on the 65c816, so JSR to the code with the JMP if you need to return.
    3. You can double A, put it to an index register, load the address minus one from the pointer table, push it to the stack and use RTS.
    4. You can double A, put it to X and use JSR.w (Pointers,x) for that.

    But how do we set up the pointers? Simply with a pointer table! A pointer is nothing more than a value containing an address. You can even use labels as values, i.e. dw Label is completely valid.
    You also need to put a label in front of the table if you don't use the first option. Here is what I mean:
    Pointers: ; Not required when jumping to $0086DF
    dw Code1
    dw Code2
    dw Code3

    Here is a list of examples to use each way of pointer implementation:

    Execute pointers (option 1):
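
    A minimal sketch, assuming the state number is kept in !C2,x (an arbitrary choice) and the routine StateHandler was called with a JSR:

    StateHandler:
    	LDA !C2,x               ; A = state number
    	JSL $0086DF|!bank       ; jumps to the pointer picked by A, never returns here
    	dw Code1
    	dw Code2

    Code1:
    	; code of the first state
    	RTS                     ; returns to whoever called StateHandler
    Code2:
    	; code of the second state
    	RTS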

    Scratch RAM pointers (option 2):
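
    A minimal sketch, assuming the state number is kept in !C2,x, the table is the Pointers table from above and $00 is free scratch RAM. The extra JSR is there so that the RTS of a state has somewhere to return to:

    StateHandler:
    	JSR .dispatch           ; the state's RTS returns to the next line
    	RTS

    .dispatch
    	LDA !C2,x
    	ASL                     ; two bytes per pointer
    	TAX
    	REP #$20                ; 16-bit A
    	LDA Pointers,x
    	STA $00                 ; $00-$01 = address of the state
    	SEP #$20                ; 8-bit A
    	LDX $15E9|!addr         ; restore the sprite index
    	JMP ($0000|!dp)         ; absolute address, hence the define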

    Return address manipulation (option 3):
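
    A minimal sketch, assuming the state number is kept in !C2,x. The table name PointersRTS is made up; note that each entry is the destination minus one because RTS adds one to the address it pulls:

    StateHandler:
    	JSR .dispatch           ; the state's RTS returns to the next line
    	RTS

    .dispatch
    	LDA !C2,x
    	ASL                     ; two bytes per pointer
    	TAX
    	REP #$20                ; 16-bit A
    	LDA PointersRTS,x
    	PHA                     ; pushed like a return address
    	SEP #$20                ; 8-bit A
    	LDX $15E9|!addr         ; restore the sprite index
    	RTS                     ; "returns" to the state

    PointersRTS:
    	dw Code1-1
    	dw Code2-1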

    Indexed address pointer (option 4):
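
    A minimal sketch, assuming the state number is kept in !C2,x and the Pointers table from above sits in the same bank as the code:

    StateHandler:
    	LDA !C2,x
    	ASL                     ; two bytes per pointer
    	TAX
    	JSR (Pointers,x)        ; the state's RTS returns to the next line
    	RTS

    Code1:
    	LDX $15E9|!addr         ; X held the table index, so restore it first
    	; code of the first state
    	RTS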

    Okay, we have the codes but which pointers should we take? Well, let's take a look at them:
    1. Option 1 saves the most space but is the slowest option because it manipulates the stack.
    2. Option 2 is quite large but at least faster than the first option.
    3. Option 3 is slightly smaller than the second one but not as fast, not to mention that you have to subtract one from each destination address.
    4. Option 4 isn't much larger than the first option and also faster than the second and third option but requires X so you always have to restore it. JSR (Pointers,y) doesn't exist on the 65c816 processor. Also keep in mind that the table is read from the program bank here, whereas the LDA in options 2 and 3 reads it through the data bank, so set the data bank to the same value as the program bank.

    It's really up to you which option to choose as they all have their own advantages and disadvantages.

    Bonus 4: Advanced arithmetics

    This is about "advanced" arithmetics, arithmetics beyond addition and subtraction. The SNES uses a 65c816 processor which means addition and subtraction exist as opcodes but this doesn't hold true for multiplication and division. In addition, this bonus chapter also covers the use of square roots and trigonometry.

    Bonus 4.1: Multiplication and Division

    This sub-chapter is about repeated addition and its inverse aka multiplication and division. In contrast to addition and subtraction, which exist as opcodes, the 65c816 doesn't support them natively. This doesn't mean that there is no way to do these two things. In fact, since multiplication is just repeated addition, all you need to do is to create a loop with one factor being the loop counter and the other one being the value to add, preferably swapping the factors beforehand so that the smaller one is the loop counter and you save some time by doing fewer additions.
    Division isn't as easy, though. Doing this by hand takes quite a bit of time. In fact, some processors which support multiplication but not division (e.g. Super FX) just use a lookup table of reciprocals and multiply with the looked-up number instead. It's more like this: x / y = x * reciprocal[y].

    That being said, four of the binary opcodes allow you to multiply and divide by 2. These are ASL, LSR, ROL and ROR. Here is a short explanation of what each of the opcodes does:
    • ASL (Arithmetic Shift Left) shifts each bit to the left. Bit 7 goes to the carry flag and bit 0 becomes zero.
    • LSR (Logical Shift Right) shifts each bit to the right. Bit 0 goes to the carry flag and bit 7 becomes zero.
    • ROL (ROtate Left) shifts each bit to the left. Bit 7 goes to the carry flag and the carry flag goes to bit 0.
    • ROR (ROtate Right) shifts each bit to the right. Bit 0 goes to the carry flag and the carry flag goes to bit 7.
    (For all of them: Replace bit 7 with bit 15 if in 16-bit mode)

    As you can see, the multiplication and division by two opcodes are nothing more than bitshifts. This means each digit moves either to the left or to the right, effectively multiplying or dividing by the base number, hence why these opcodes multiply and divide by two. You also can attach a RAM address (only direct page and absolute, optionally indexed by X) to shift the value at that address instead of the accumulator.
    Moreover, there are two kinds of shifting opcodes: The regular shifts (ASL and LSR) and the rotational shifts (ROL and ROR). What makes them different is that the former treat the accumulator and memory as a row of digits (any bit going outside becomes the carry flag but can't be retrieved) whereas the latter treat the accumulator and memory plus the carry flag as a ring of bits (meaning that the bits wrap around when they overshoot).
    Now the question is, which bit shift should I use? That depends on whether the wrap around is really needed or not. For simple multiplication and division, the regular shifts are all you need. On the other hand, if you do need to retrieve the carry flag (such as for pseudo 16-bit multiplication and division), use the rotational shifts. Other than that, both kinds of opcodes are identical, even in speed.
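    As a small illustration of both kinds of shifts, here is a sketch which assumes 8-bit A and a 16-bit value split over scratch RAM $00 (low byte) and $01 (high byte). The addresses are only picked for the example:

```asm
; Regular shifts on the accumulator
	LDA $02
	ASL                 ; A = value * 2
	ASL                 ; A = value * 4
	LSR                 ; A = value * 2 again (halved)

; Pseudo 16-bit multiplication by two with 8-bit A:
; the bit which falls out of the low byte is rotated into the high byte.
	ASL $00             ; Low byte * 2, bit 7 goes to the carry
	ROL $01             ; High byte * 2, the carry goes to bit 0

; Pseudo 16-bit division by two starts at the high byte instead.
	LSR $01             ; High byte / 2, bit 0 goes to the carry
	ROR $00             ; Low byte / 2, the carry goes to bit 7
```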

    Furthermore, you can combine the shifts with some addition to multiply a value by something other than a power of two and still be faster than with repeated addition (e.g. in order to multiply by 3, use this: STA $00 : ASL : CLC : ADC $00). As for division, there is no similar trick except for fractions and even then, it's quite a bit complicated.
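    The same trick works for other constants. As a sketch (8-bit A, scratch RAM $00 and $01 chosen freely, input small enough that the result still fits into a byte, i.e. at most 25), a multiplication by 10 can be split into value * 8 + value * 2:

```asm
	LDA $00
	ASL                 ; A = value * 2
	STA $01             ; Remember value * 2
	ASL
	ASL                 ; A = value * 8
	CLC
	ADC $01             ; A = value * 8 + value * 2 = value * 10
```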

    In case both factors are variables, you need to use repeated addition or better, use the Ancient Egyptian Multiplication (or the Russian Peasant Multiplication, depending on which language you speak). The AEM works by checking if factor A is odd. If yes, add the value in factor B to the intermediate result, else don't. Next, divide A by two and multiply B by two. Repeat until A is zero. The sum of all the added values is the product.
    You don't really need to code it from scratch as there is already a routine ready here.
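    If you want to see the steps in code anyway, here is a minimal sketch (not the linked routine). It assumes 8-bit A, factor A in $00, factor B in $01, and it uses $02 as the high byte of B and $04-$05 for the 16-bit product. All of these addresses are my own choice:

```asm
; Ancient Egyptian / Russian Peasant multiplication, 8-bit * 8-bit = 16-bit.
; Input:  $00 = factor A (destroyed), $01 = factor B (destroyed)
; Output: $04-$05 = product
EgyptianMultiply:
	STZ $02             ; High byte of factor B
	STZ $04             ; Clear the product
	STZ $05
.loop:
	LSR $00             ; Halve A, the old bit 0 (odd or even) goes to the carry
	BCC .skip           ; A was even: don't add
	LDA $04             ; A was odd: add B to the product
	CLC
	ADC $01
	STA $04
	LDA $05
	ADC $02
	STA $05
.skip:
	ASL $01             ; Double B (16-bit through the carry)
	ROL $02
	LDA $00
	BNE .loop           ; Repeat until A is zero
	RTS
```

    The LSR does the odd check and the halving in one go, which is why there is no separate test for bit 0.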
    As for division, the best way is to do a long division. I give you the code right away but you can take a look at the Wikipedia article about binary long division if you want to understand how it works.
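    A minimal sketch of an 8-bit binary long division could look like this. It is the classic shift-and-subtract loop, not necessarily the routine referred to above, and it assumes 8-bit A and X with the dividend in $00 and the divisor in $01 (addresses picked for the example). Dividing by zero isn't caught, so check the divisor beforehand:

```asm
; Input:  $00 = dividend, $01 = divisor (must not be zero)
; Output: $00 = quotient, A = remainder
LongDivide:
	LDA #$00            ; Remainder
	LDX #$08            ; One iteration per bit of the dividend
.loop:
	ASL $00             ; Next dividend bit into the carry, quotient bit 0 = 0
	ROL                 ; ...and from there into the remainder
	BCS .subtract       ; Remainder overflowed 8 bits: it is bigger for sure
	CMP $01
	BCC .skip           ; Remainder < divisor: quotient bit stays 0
.subtract:
	SBC $01             ; The carry is always set here, so no SEC is needed
	INC $00             ; Quotient bit = 1
.skip:
	DEX
	BNE .loop
	RTS
```

    For example, 7 divided by 2 leaves $00 = 3 and A = 1.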

    Or you can use the SNES's and SA-1's multiplication and division registers. While the 65c816 processor hasn't got an opcode for multiplication and division, it doesn't mean that the CPU doesn't provide them as registers. Registers behave similarly to RAM addresses in certain ways but instead of controlling the game, they control the SNES itself. That way you can change the palette or graphics on-screen. Another oddity, and another reason why registers aren't RAM, is their access. Most PPU registers (which usually control the graphics) can only be accessed during f- or v-blank (i.e. when the screen is turned off or, so to speak, in NMI), most of them also during h-blank (often done per IRQ but HDMA also runs in h-blank) and only a few of them can be accessed at any time (and that alone can have its own side effects).
    Fortunately, CPU registers are less strict than their PPU counterparts.

    SNES registers are located in $002100-$007FFF (specifically $002100-$0021FF for typically PPU registers, $4200-$43FF for CPU registers and the rest is usually unused if not used by enhancement chips or HiROM) but are mirrored in banks $01-$3F which means you can access them per absolute addressing ($xxxx doesn't point to bank $7E if the data bank isn't set to $7E for that matter). You might have believed that $2000-$7FFF points to RAM just as much as $0000-$1FFF but in fact, even $0000-$1FFF doesn't access WRAM, at least not directly.
    I wrote in the chapter on the data bank change (you know, PHB, PHK, etc.) that the implied bank byte of absolute addressing is the data bank. However, the data bank wrapper sets the data bank to wherever the code is located and sprite codes are in ROM, not WRAM banks. So how in the heck are we able to access RAM if the bank is set to ROM? The answer is WRAM mirrors. They exist in the first two MiB of a SNES LoROM, banks $00-$3F, and mirror WRAM addresses $0000-$1FFF. There is a reason why ROM in a LoROM is located in addresses $8000-$FFFF, half a bank.
    If something is neither ROM nor SNES WRAM, it's either a processor register (SNES or enhancement chip) or a RAM mirror of an expansion chip (e.g. SA-1's BW-RAM).

    The SNES multiplication registers are located at $4202 (8-bit multiplicand A) and $4203 (8-bit multiplicand B) and the result is stored to $4216 and $4217 (16-bit product). In addition, you need to wait 8 cycles (around four NOPs) before you read the result. The division registers are $4204 (dividend low byte), $4205 (dividend high byte) and $4206 (8-bit divisor), the result is at $4214 and $4215 (16-bit quotient) and the remainder at $4216 and $4217 (16-bit remainder). Here you need to wait for 16 cycles (eight NOPs).
    Actually, I lied about the delay. The SNES takes one cycle to fetch a byte, so the load which reads the result already spends a few cycles on fetching its opcode and address. Since it's unlikely you have changed the direct page to $2100, you realistically need to wait for 5 and 13 cycles, respectively.
    In other words: The SNES supports 8-bit × 8-bit multiplication which results in a 16-bit product and 16-bit ÷ 8-bit division with a 16-bit quotient and remainder.
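    Here is a sketch of how both register sets can be used. It assumes 8-bit A on entry, code running on the SNES CPU (not the SA-1) and freely chosen scratch RAM. The NOP counts follow the conservative 8 and 16 cycle figures:

```asm
; Multiplication: $00 * $01, product to $02-$03
	LDA $00
	STA $4202           ; Multiplicand A
	LDA $01
	STA $4203           ; Multiplicand B, this write starts the multiplication
	NOP : NOP : NOP : NOP
	REP #$20            ; 16-bit A
	LDA $4216           ; 16-bit product
	STA $02
	SEP #$20            ; 8-bit A

; Division: 16-bit $00-$01 / 8-bit $02, quotient to $04-$05, remainder to $06-$07
	REP #$20
	LDA $00
	STA $4204           ; Dividend, low and high byte in one write
	SEP #$20
	LDA $02
	STA $4206           ; Divisor, this write starts the division
	NOP : NOP : NOP : NOP
	NOP : NOP : NOP : NOP
	REP #$20
	LDA $4214           ; 16-bit quotient
	STA $04
	LDA $4216           ; 16-bit remainder
	STA $06
	SEP #$20
```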

    In addition, there is another multiplication register, specifically a PPU register. The matrix registers $211B and $211C feature a second function outside of Mode 7: 16-bit * 8-bit multiplication. The usage is a bit different from the CPU multiplication registers in that $211B is a write twice register, meaning that you need two 8-bit writes in order to fill it (e.g. LDA #$xx : STA $211B : LDA #$yy : STA $211B). $211C is a write twice register too but as a multiplication register, it behaves like a write once one. A second difference is that PPU multiplication is signed so be careful when you work with values of $8000 and $80 or above, respectively! If you tend to work with large numbers and something went wrong, this likely is the reason why (especially with the 8-bit multiplicand).

    The result of the 16-bit * 8-bit multiplication is stored to $2134-$2136 (24-bit product)
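    As a sketch, the whole sequence could look like this, assuming 8-bit A, that the screen isn't in Mode 7 and that the factors sit in scratch RAM $00-$01 (16-bit) and $02 (8-bit). These addresses are only chosen for the example:

```asm
	LDA $00
	STA $211B           ; First write: low byte of the 16-bit factor
	LDA $01
	STA $211B           ; Second write: high byte of the 16-bit factor
	LDA $02
	STA $211C           ; 8-bit factor
	LDA $2134           ; Signed 24-bit product, low byte
	STA $04
	LDA $2135           ; Middle byte
	STA $05
	LDA $2136           ; High byte
	STA $06
```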
    Also an important reminder: Don't forget that the CPU multiplication and division registers are unsigned! If you want a sign-dependent multiplication or division, I recommend you to invert all negative values, multiply them together and then invert the result if exactly one of the two values was negative (tip: use EOR on the two values to get the sign of the result in the negative flag).
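    A sketch of that recipe with the CPU multiplication registers follows. It assumes 8-bit A, signed 8-bit factors in $00 and $01, and it uses $02 for the sign and $04-$05 for the signed 16-bit product (all picked for the example):

```asm
SignedMultiply:
	LDA $00
	EOR $01
	STA $02             ; Bit 7 is set if exactly one factor is negative
	LDA $00
	BPL +
	EOR #$FF            ; Negative: flip the sign to get the absolute value
	INC A
+	STA $4202
	LDA $01
	BPL +
	EOR #$FF
	INC A
+	STA $4203           ; This write starts the multiplication
	NOP : NOP : NOP : NOP
	REP #$20            ; 16-bit A
	LDA $4216           ; Unsigned product of the absolute values
	STA $04
	SEP #$20            ; 8-bit A
	LDA $02
	BPL .done           ; Signs were equal: the product stays positive
	REP #$20
	LDA #$0000          ; Otherwise negate it: 0 - product
	SEC
	SBC $04
	STA $04
	SEP #$20
.done:
	RTS
```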

    SA-1 works a bit differently: Compared to the SNES, the SA-1 multiplication and division registers are shared. So in order to determine whether the operation is a multiplication or division, simply set $2250 to zero if you want to multiply two numbers and to one to divide two numbers.
    As for the arithmetic registers themselves, the multiplicand registers are $2251 and $2253 (both 16-bit so don't forget to store to $2252 and $2254 too if you are in 8-bit mode) and the result is stored to $2306-$2309 (yes, the result is a 32-bit number). The division registers are the same, i.e. the dividend goes to $2251, the divisor goes to $2253, the quotient to $2306 and the remainder to $2308 (again, all of these registers are 16-bit). The wait time is estimated at 5 cycles so you need to place at most a single NOP if you want to get the value directly!
    Also note that SA-1's multiplication and division registers are, similar to PPU multiplication, signed (except for the divisor)! Much like before, check for very large values.
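    A sketch of an SA-1 multiplication, assuming the code runs on the SA-1 CPU, A is 8-bit on entry and the 16-bit factors are in scratch RAM $00-$01 and $02-$03 (chosen for the example):

```asm
	LDA #$00
	STA $2250           ; 0 = multiplication, 1 = division
	REP #$20            ; 16-bit A so that each store fills a whole register
	LDA $00
	STA $2251           ; First factor (or dividend)
	LDA $02
	STA $2253           ; Second factor (or divisor)
	NOP                 ; Short wait
	LDA $2306           ; Low word of the 32-bit product (or the quotient)
	STA $04
	LDA $2308           ; High word of the product (or the remainder)
	STA $06
	SEP #$20            ; Back to 8-bit A
```

    For a division, only the value written to $2250 changes.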

    Another problem which might occur is that since the SNES doesn't support floating point numbers but just integers, using numbers with fractional digits might be a bit difficult too. In order to compensate for that, there is a fixed point between bit 7 and 8, between the high and low byte. As such, the actual value is multiplied by 256 in the accumulator. This makes the high byte the integer portion and the low byte the fractional digits.
    This is really important for square roots and trigonometry where most of the results are irrational numbers and only sometimes are rational fractions but especially for trigonometry, almost never integers.

    Bonus 4.2: Square Roots

    In contrast to multiplication and division, the SNES hasn't got any CPU registers for square roots and SMW doesn't make use of square roots either. So in order to be able to use square roots, we use a lookup table. The input of that routine is the 16-bit value in A with the carry flag as bit 16. The result is 17-bit too, with the carry flag acting as bit 16, but as mentioned in the previous chapter, since the SNES doesn't support floating point numbers, there is a fixed point between bit 7 and 8, meaning the actual result is basically multiplied by 256 in the accumulator.
    Getting the integer portion also is somewhat different because of the carry flag. If the input was just the accumulator, you can simply take the high byte from that result, though you might want to have the number rounded. In that case, you just need to multiply the low byte (and just the low byte) by two, swap the low and high byte like here and add carry to the high byte:
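    One possible way to write that rounding, assuming A still holds the 16-bit fixed point result (high byte integer, low byte fraction) and the carry as bit 16 isn't needed:

```asm
	SEP #$20            ; 8-bit A, the high byte is kept in the hidden B half
	ASL                 ; Low byte * 2: fraction bit 7 (worth 0.5) goes to the carry
	XBA                 ; Swap: the integer portion is now the low byte (carry untouched)
	ADC #$00            ; Add the carry: rounds up if the fraction was at least 0.5
```

    Keep in mind that an integer portion of $FF which gets rounded up wraps to $00 with the carry set.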

    If you use it with the carry flag, you can simply swap the low and high byte with each other, load 0x00 and use ROL to get the carry:

    If you want to round the value with the carry flag, I recommend you to preserve the processor flag and thus the carry flag (i.e. use PHP*) before you do the above rounding. After rounding the low byte, swap the high and low byte again, load zero, add it to the carry, use PLP to pull the carry flag back and add the carry flag once again:

    *The opcode, not the scripting language.

    Bonus 4.3: Trigonometry

    If you are old enough and paid attention in maths class, you might recognise this so you can skip the first part of it. If you also don't want to read this, fine. You can skip that part too.
    If not, here is what trigonometry is: Trigonometry is the measurement of right triangles. Furthermore, it allows you to get the position of a point on a circle, useful for circling sprites or for a wave motion.
    But first, let's talk about what trigonometry needs: We use right triangles because one of the angles is constantly at 90°, not to mention that trigonometry uses the Pythagorean theorem which also uses right triangles. This leaves us with two variable angles but because the angular sum in triangles is always 180°, only one of them is actually free.
    This results in functions which use only one variable. But how are these functions defined? We need to talk about the sides: The side opposite the right angle is the longest side and is called the "hypotenuse". The shorter sides are called "catheti" (singular: "cathetus").
    There is no definite difference between the two catheti. The difference comes from which angle you look at them from. Since each right triangle has got one right angle and it always is 90°, we ignore that angle and only focus on the other two. Each cathetus has got one angle it is opposite to and one it is adjacent to (which is how the individual catheti have got their names, "opposite" and "adjacent").
    Moreover, each angle's adjacent is the other angle's opposite. As such, we only focus on angle α which also makes side a the opposite and side b the adjacent (and for that matter, the right angle is γ and the hypotenuse is side c). We also put the triangle in a circle in such a way that the hypotenuse goes from the centre to the circle's edge, i.e. the hypotenuse's length is the same as the circle's radius, and α is located at the centre.
    Confused? Here is an image showing it:

    That being said, the two catheti have got special names if the hypotenuse has got the length one (or as a circle the radius one, i.e. you have a "unit circle"). The special opposite is called "sine" and the special adjacent is called "cosine".
    Furthermore, we can define these catheti as the functions "sin(α)" and "cos(α)".

    You also can use both functions to calculate the catheti if you know the hypotenuse and an angle. This means:
    • Any opposite is the hypotenuse times the sine or a = c*sin(α)
    • Any adjacent is the hypotenuse times the cosine or b = c*cos(α)

    You likely are more familiar with sin(α) = a/c and cos(α) = b/c but the reason we do it that way is because we have the hypotenuse (which we now refer to as "length" or "radius") and the angle given but search for the catheti.

    Now to the actual programming (don't skip it!): SMW has got its own trigonometric table (after all, there are sprites like Ball 'n' Chains, Boo Circles, etc.). Alternatively, you can use some circle routines like PIXI's own ones or use your own table if you aren't satisfied with the existing tables.

    SMW's trigonometry table covers a half-period. In order to get the other half, we simply need to invert the sign of the value. It also is a 16-bit table but similar to the square root, the values are often irrational numbers and only sometimes rational fractions. This means there is a fixed point between the 7th and 8th bit.
    In addition, this table is just a sine table. Sine and cosine are basically the same function with the only difference that the latter is shifted by a quarter period (after all, each angle's adjacent is the other angle's opposite). In order to get the cosine value, you need to add 0x80 to the angle and then AND the value with 0x01FF before we use that as the index. Store both values somewhere in scratch RAM so that we can determine whether the value is positive or not.
    The table is located at $07F7DB. In order to get the index, we need to multiply the low byte (remember that the table is a half-period) by two (due to it being a 16-bit table). With the index, we can load from $07F7DB,x.
    After we have gotten the value, we need to check if the integer, the high byte, is one. If yes then we skip the multiplication (after all, multiplying something by 1 does effectively nothing). Otherwise, we multiply the low byte, the fraction, with the radius.
    Though the result is stored as a 16-bit value to $4216, we only need to focus on the high byte in $4217 because the low byte is still the fractional value and the high byte the integer value. You can, however, shift the value in $4216 to the left and then add zero to $4217 without clearing the carry flag to round the result (remember the chapter about the square root). Next, we check if the angle's high byte is one or not (that's why I wrote that you need to preserve it). We can use "LSR $xx" for that and check if the carry is clear. If yes then don't invert the value.

    That might be complicated to some of you so I provide you the code for it:
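    The following is a sketch of the steps above rather than a definitive routine. It assumes the 16-bit angle ($0000-$01FF) in $00-$01, an 8-bit radius of at most $7F in $02, 8-bit A and X/Y on entry and code running on the SNES CPU (not the SA-1). The scratch addresses and the label are my own choice. For the cosine, add 0x80 to the angle and AND it with 0x01FF before calling it:

```asm
; Output: A = radius * sin(angle) as a signed 8-bit value.
; Destroys $01 and $04-$05.
Sine:
	PHX                 ; Keep the sprite index
	REP #$30            ; 16-bit A and X/Y
	LDA $00
	AND #$00FF          ; The low byte picks the entry within the half-period
	ASL                 ; Times two because the entries are 16-bit
	TAX
	LDA.l $07F7DB,x     ; Fixed point sine: high byte integer, low byte fraction
	STA $04
	SEP #$30            ; Back to 8-bit A and X/Y
	PLX
	LDA $05             ; Is the integer portion one?
	BEQ .multiply
	LDA $02             ; Yes: radius * 1 = radius, skip the multiplication
	BRA .sign
.multiply:
	LDA $04
	STA $4202           ; Fraction of the sine
	LDA $02
	STA $4203           ; Radius, this write starts the multiplication
	NOP : NOP : NOP : NOP
	LDA $4216
	ASL                 ; Bit 7 of the fractional byte goes to the carry...
	LDA $4217
	ADC #$00            ; ...which rounds the integer byte
.sign:
	LSR $01             ; Angle's high byte: bit 0 goes to the carry
	BCC .done           ; Clear: first half-period, value stays positive
	EOR #$FF            ; Set: second half-period, invert the value
	INC A
.done:
	RTS
```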

    In addition, here are a couple of facts about sine and cosine:
    • You can imagine the circle as a clock with one hand. It points to the right at 0°. Increasing the angle moves the hand counter-clockwise (at least in usual maths. It's clockwise in computer science). This means the X position is affected by cosine and the Y position by sine. That way you can determine e.g. the direction in which an enemy shoots a projectile.
    • One of the properties of sine and cosine is that not only are these two functions periodic but also their derivatives. Derivatives are a tool in calculus to measure the steepness of a slope, the rate of change of a function. Furthermore, some units in physics are derivatives of other units, like speed is the derivative of distance and acceleration is the derivative of speed.
      The derivative of sin(θ) is cos(θ), whose derivative is -sin(θ), whose derivative is -cos(θ), whose derivative is back at sin(θ) (so not only are the functions themselves periodic but also their derivatives).
    • Take care of the chain rule. Basically, if you have got a function inside another one as in u(v(x)) then not only do you need to derive the outer function but also multiply it with the derivative of the inner function: u'(v(x)) × v'(x). In this case, if there is a factor in front of the x (e.g. sin(3x)), then it means that its derivative is cos(3x) × (3x)' = 3 × cos(3x). Fortunately, any added constant without an x inside the function vanishes in the inner derivative (e.g. (cos(4x + 2))' = -sin(4x + 2) × (4x + 2)' = -4sin(4x + 2)).
    • Because sine and cosine are catheti on a triangle with a hypotenuse having the length 1 and, by the Pythagorean theorem, the sum of the catheti's squares equals the hypotenuse's square, sin²(θ) + cos²(θ) (the superscripted 2 denotes that you need to square the function's result) always equals 1 (technically 1² but 1 * 1 is always one so we can leave out the square).

    Bonus X: Smaller stuff

    Here are a couple of smaller things which don't deserve their own chapter but still might be helpful when creating sprites.

    JSL on RTS

    Because of how SMW is coded, many routines which could have been accessible from all banks can only be called from bank 1 due to being RTS routines. This would suck but thankfully, there is a way to work around that: We know that when the SNES calls a subroutine, it puts the return address on the stack. But we also know that the stack can be accessed by the push and pull commands. The code is so commonly used, you can even put it inside a macro.
    Anyways. We have to know how the SNES stores and pulls the return address. The format is as follows: Bank, high, low. This is the order when storing the address. When getting the return address, the order is reversed: Low, high, bank.
    As such, we can push the current program bank and then the rest of the return destination (keep in mind to subtract one from the return destination). But there is one problem: An RTS only returns somewhere in the same bank. We can get around that if we make the RTS jump to an RTL. We then push the RTL's address (minus one, obviously). For example, there is an RTL in bank 1 at $018021. We push the value 0x8020 onto the stack.
    You can use PEA $xxxx which puts the 16-bit value (not address despite looking like an address) onto the stack.

    In fact, you might want to use it as a macro:
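    macro JSL2RTS(destination, rtl)
    	PHK					; Bank of the return destination
    	PEA ?return-1		; Return destination minus one
    	PEA <rtl>-1			; Address of an RTL in the routine's bank, minus one
    	JML <destination>
    ?return:
    endmacro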

    Ranged comparison

    You sometimes want to compare stuff between two values. Sure, we can use multiple CMPs for that but there is a better way to do this:
    In order to check whether an unsigned number is between two values, you first subtract the lowest value you want to compare with from A. You then compare with the highest value minus the lowest value minus 1 (the latter becomes +1 due to minus * minus = plus), i.e. highest - lowest + 1. Carry clear means the value is in range.
    Example: A branch is only taken when A is between 0x13 and 0x42:
    	LDA $00
    	SEC : SBC #$13
    	CMP #$30	; Because 0x42 - (0x13 - 1) = 1 + 0x42 - 0x13 = 0x30
    	BCC Branch

    If you want to compare between a negative and a positive number, you add to A in such a way that the negative value becomes zero. The comparison works similarly to the above: You check the range plus (not minus this time) one and if the carry is clear, the value is in range.
    Example: A branch is only taken when A is between 0xF0 (i.e. -0x10) and 0x10:
    	LDA $00
    	CLC : ADC #$10
    	CMP #$21	; Because 0x10 - 0xF0 + 1 = 0x10 - (-0x10) + 1 = 0x10 + 0x10 + 1 = 0x21
    	BCC Branch

    Edit: Minor fix where I have entered the wrong file name for the generic sprite interaction.
    I put a second post here just in case I need to split my post.
    Dropping by to say I'm looking forward to any future updates!
    It looks extremely detailed so far!

    I know the basics of sprite making, but the userbase really needs an updated custom sprite programming tutorial.

    Looks good so far. I'd suggest to explain simple things like adding an acceleration to your sprite, making it jump properly, and so on. I know this is a WIP tutorial, but the suggestions stand.

    For the 32x32 GFX routine, interesting you used a scratch RAM instead of the X register to control the routine loop. I've never thought using this approach.

    Dream team (feed them, please):

    This tutorial really helped me out a lot! :D
    I'm hoping for future updates ;)
    With this tutorial I was able to create my first sprite!
    I don't know what to put here.
    Originally posted by MarioFanGamer
    I put a second post here just in case I need to split my post.

    Uhh, can you send the code of a Goomba that can be squished?
    Thinking about what patcher to use.

    Your ASM Tutorials...
    They rock.

    They're the best tutorials I've ever seen in my whole entire life!

    Thank you! #smw{:TUP:}

    koopas are AWESOME,,,
    I have decided to just ignore the boss chapter for now and added information about the extra bit, extra property bytes, extra bytes, transforming sprites, proximity and custom sprite interaction with other sprites.

    In addition, I used teletype for inline code snippets and opcodes and made some other fixes (spelling, grammar).

    Not all downloads work since the staff's FTP server doesn't work since the move to Cloudflare (which happened around... three years ago).
    Thanks! I needed this. Now I can learn how to write sprites without building off others' works.

    The handheld NSMBs are the best. 😀 Layout BG by Retronator on Tumblr.

    I'm currently making "Mega Man" related sprites.
    First of all, I'm making 4 types of Metall, can you give me some advice?

    What I want is that while the face is hidden, stepping on it treats it like an invincible shell and fireballs don't work, and while the face is exposed, stepping on it flips it over like a Goomba so that it can be grabbed and thrown.
    I want to change the specifications with Extra Bits except for Mega Man 7's Metall FX.
    One is "Mega Man 1" Metall and "Mega Man 2" Neo Metall, and the other is "Mega Man 3" Metall DX walking type and flight type.

    All you have to do is look at this ASM File and give us some advice on where to write it!
    If you want to make a sprite flip over with a custom interaction like a Galoomba, you need to use a code like this but with some modifications. The code which needs to be changed is JumpedOnSprite which handles what happens when you stomp on a Metall. For regular jumps:
    	LDA #$02
    	STA $14C8,x		; Make the sprite falling down the screen
    	STZ $AA,x		; Remove speed
    	STZ $B6,x

    That needs to be replaced with the following code:
    	LDA #$02
    	STA !1540,x		; Set the stun timer
    	LDA #$09		; Put the sprite into the carryable (stunned) state
    	STA !14C8,x

    This will put the Metall into a carryable state like a shell or Galoomba. Right now, I haven't explained custom carryable states and the like so it's better if you just set the Metall's "acts like" to 0x0F (Galoomba). This will make it truly act like a Galoomba in this state.

    Depending on whether you want to make the Metall invulnerable to Yoshi and spin jumps when it's shielding, you need to add a check of $1504,x (assuming you use this code in your sprite) after JumpedOnSprite. Speaking of the linked code: It already explains how to make the Metall temporarily invulnerable to fireballs so no need for me to explain it as well.

    Of course, with Dyzen, things are a bit different. I code the sprites on my own, i.e. without Dyzen, mostly because I don't really need it and some things it does are IMO too overblown. Consequently, my knowledge of Dyzen is close to zero which is why I can't help you out further. Either way, the knowledge I posted still applies with Dyzen, just in a different way.
    What is the code for the Cut Man's cutter throw and cutter action?

    Cut Man's ASM file contains

    ;!hasami_flag,x is:
    ;0 = open
    ;1 = closed
    ;2 = not drawn

    !cutter_sprite = $06

    	LDA !1588,x
    	AND #$03
    	BEQ +
    	STZ !sprite_speed_x,x
    	AND #$08
    	BEQ +
    	STZ !sprite_speed_y,x
    	JSL $01802A|!BankB
    	JSR check_mario
    	BMI +
    	CMP #$08
    	BCC +
    	LDA #$08
    	LDA !sprite_blocked_status,x
    	AND #$04
    	BEQ ++
    	STA !157C,x
    	LDA !sprite_blocked_status,x
    	AND #$03
    	BEQ +
    	LDA #$C8
    	STA !sprite_speed_y,x
    	LDA #$08
    	STA !pose,x
    	LDA $14
    	AND #$01
    	STA !hasami_flag,x
    	LDA WalkingXSpeed,y
    	STA !sprite_speed_x,x
    	REP #$20
    	LDA $0E
    	BPL +
    	EOR #$FFFF
    	CMP #$0030
    	SEP #$20
    	BCS +
    	LDA #$A6
    	BRA -
    	INC !1570,x
    	LDA !1570,x
    	CMP #$08
    	BCC ++
    	STZ !1570,x
    	INC !anime_count,x
    	LDA #$FF
    	LDA !anime_count,x
    	CMP #$06
    	BCC +
    	STZ !anime_count,x
    	LDA !anime_count,x
    	ADC #$02
    	STA !pose,x
    	LDA #$10	; size of range of timer values (i.e. max value - min value)
    	ADC #$10	; min value
    	STA !timer,x
    	db $30,$D0
    	JSR check_mario
    	STA !157C,x
    	LDA !timer,x
    	BNE +
    	LDA #$10
    	STA !timer,x
    	DEC !state,x
    	INC !hasami_flag,x
    	LDA #$09
    	STA !pose,x
    	STZ $00
    	STZ $01
    	LDA cutter_speed,y
    	STA $02
    	STZ $03
    	LDA #$00
    	STA $00
    	LDA #$00
    	STA $01
    	LDA #$00
    	STA $02
    	LDA #$00
    	STA $03
    	LDA #!cutter_sprite
        STA $01
        PHX                     ;\ preserve sprite indexes of Magikoopa and magic
        PHY                     ;/
        %SubVertPos()           ; $0E = vertical distance to Mario
        STY $02                 ; $02 = vertical direction to Mario
        LDA $0F                 ;\ $0C = vertical distance to Mario, positive
        BPL +                   ; |
            EOR #$FF            ; |
            CLC : ADC #$01      ; |
    +   STA $0C                 ;/
        %SubHorzPos()           ; $0F = horizontal distance to Mario
        STY $03                 ; $03 = horizontal direction to Mario
        LDA $0E                 ;\ $0D = horizontal distance to Mario, positive
        BPL +                   ; |
            EOR #$FF            ; |
            CLC : ADC #$01      ; |
    +   STA $0D
        LDY #$00
        LDA $0D                 ;\ if vertical distance less than horizontal distance,
        CMP $0C                 ; |
        BCS +                   ;/ branch
            INY                 ; set y register
            PHA                 ;\ switch $0C and $0D
            LDA $0C             ; |
            STA $0D             ; |
            PLA                 ; |
            STA $0C             ;/
    +   STZ $0B                 ;\ zero out $00 and $0B
        STZ $00                 ;/
        LDX $01                 ;\ divide $0C by $0D?
    -   LDA $0B                 ; |\ if $0C + loop counter is less than $0D,
        CLC : ADC $0C           ; | |
        CMP $0D                 ; | |
        BCC +                   ; |/ branch
            SBC $0D             ; | else, subtract $0D
            INC $00             ; | and increase $00
    +   STA $0B                 ; |
        DEX                     ; |\ if still cycles left to run,
        BNE -                   ;/ / go to start of loop
        TYA                     ;\ if $0C and $0D was not switched,
        BEQ +                   ;/ branch
            LDA $00             ;\ else, switch $00 and $01
            PHA                 ; |
            LDA $01             ; |
            STA $00             ; |
            PLA                 ; |
            STA $01             ;/
    +   LDA $00                 ;\ if horizontal distance was inverted,
        LDY $02                 ; | invert $00
        BEQ +                   ; |
            EOR #$FF            ; |
            CLC : ADC #$01      ; |
            STA $00             ;/
    +   LDA $01                 ;\ if vertical distance was inverted,
        LDY $03                 ; | invert $01
        BEQ +                   ; |
            EOR #$FF            ; |
            CLC : ADC #$01      ; |
            STA $01             ;/
    +   PLY                     ;\ retrieve Magikoopa and magic sprite indexes
        PLX                     ;/
    	LDA !1588,x
    	AND #$03
    	BEQ +
    	STZ !sprite_speed_x,x
    	AND #$08
    	BEQ +
    	STZ !sprite_speed_y,x
    	JSL $01802A|!BankB
    	JSR check_mario
    	BMI +
    	CMP #$08
    	BCC +
    	LDA #$08
    	LDA !sprite_blocked_status,x
    	AND #$04
    	STA !157C,x
    	LDA !sprite_blocked_status,x
    	AND #$03
    	BEQ +
    	LDA #$C8
    	STA !sprite_speed_y,x
    	LDA #$08
    	STA !pose,x
    	LDA $14
    	AND #$02
    	STA !hasami_flag,x
    	LDA WalkingXSpeed,y
    	STA !sprite_speed_x,x
    	REP #$20
    	LDA $0E
    	BPL +
    	EOR #$FFFF
    	CMP #$0030
    	SEP #$20
    	BCS +
    	LDA #$A6
    	BRA -
    	INC !1570,x
    	LDA !1570,x
    	CMP #$08
    	STZ !1570,x
    	INC !anime_count,x
    	LDA #$FF
    	LDA !anime_count,x
    	CMP #$06
    	BCC +
    	STZ !anime_count,x
    	LDA !anime_count,x
    	ADC #$02
    	STA !pose,x

    The Rolling Cutter ASM file contains

    !CutManSpriteNumber = $05
    print "INIT ",pc
    	STA $157C,x
    print "MAIN ",pc
    PHB : PHK : PLB
    	JSR MainCode
    	JSR Graphics
    	LDA #$00
    	%SubOffScreen()	; Check if offscreen
    	db $AA,$AE,$AA,$AE
    	PHY			; Preserve OAM index
    	LDY $157C,x	; Set tile flip depending on direction
    	LDA TileFlip,y
    	ORA $15F6,x	; Put it together with the sprite's and level's OAM properties
    	ORA $64
    	STA $03
    	PLY			; Restore OAM index
    	LDA #$01	; Set frame index
    	STA $02
    	LDA $00
    	STA $0300|!Base2,y	; X position
    	LDA $01
    	STA $0301,y	; Y position
    	LDY $1602,x
    	LDA tilemap,y
    	PLY			; Restore OAM slot
    	STA $0302|!Base2,y	; Tile number
    	LDA $03
    	STA $0303|!Base2,y	; Properties
    	LDA #$00	; Tiles to draw - 1
    	LDY #$02	; 16x16 sprite
    	JSL $01B7B3

    There is one thing I don't understand when drawing 32x32 sprites, in particular looking at this code example.

    First, you retrieve a free OAM slot with GetDrawInfo, and its index is stored in Y. After that, you loop four times (GFXloop) to draw the four tiles, and on each loop you increase Y by 4 to go to next OAM slot. How do you know that the next OAM slot is free? From what I understand, GetDrawInfo only guarantees that Y is free, not Y+4, Y+8, and Y+12, so we don't know if those slots are being used by another sprite. Am I missing something?
    ROM Hack Manager - SMW Resources - SMW Toolbox
    Actually, GetDrawInfo doesn't guarantee a tile is assigned as the allocation happens before the sprite gets processed (its main purpose is to mark a sprite as offscreen, get the sprite position relative to the screen and terminate the GFX code if it's too far outside the screen).
    In the vanilla game, the OAM allocation uses hardcoded values, though they're dependent on the sprite header which allows the game to use larger sprites when necessary at the cost of not being able to place as many of them onscreen (and it is actively limited in software).
    For NMSTL, the allocation is dynamic and it happens by checking every OAM slot until it finds the first used object and uses that one's index + 1 for the current sprite. In such a system, the slots are guaranteed to be free by the algorithm since if a higher slot were free, the NMSTL algorithm would have stopped there.
    Ok, let me try to describe the vanilla behaviour for OAM allocation and see if I understand it correctly.

    Every vanilla sprite specifies somewhere whether it is normal (16x16) or special (32x32, e.g. Big Mole). When the game runs, for each active sprite, the game allocates it an OAM slice based on its size. For normal sprites (16x16) the game will assign OAM slots (4 bytes) in increasing (or decreasing?) order in the normal sprite area (as defined in the sprite header); no more OAM slots available means the sprite will not be drawn. If the sprite is special (e.g., big mole, 32x32), the game positions the sprite in the OAM area (16 bytes) dedicated to special sprites (as defined in the sprite header). This is not very flexible, because if I'm not placing any special sprite in the level, there will be 4 unused OAM slots that could be used for normal sprites.

    For instance, sprite header 04: M:08-01=07 SP:60,9F M1:01-00=01 M2:01-00=01, means we can have 7 normal sprites and 2 special sprites. If I have 8 koopas active in a screen and no special sprite, only 7 koopas will be drawn, as 1 doesn't have a free OAM slot, which is a shame because there are slots reserved for special sprites that are currently empty.

    Applying the NMSTL patch (or SA-1 I guess) makes it so that the sprite header is irrelevant for determining where a sprite will be placed in OAM, and there is no more distinction between normal and special sprite area. The game assigns OAM slots in increasing (or decreasing) order based on the sprite size (4 bytes for a normal sprite, 16 bytes for a special sprite), but there is no longer an OAM area dedicated to special sprites. This has the evident advantage of being able to have more than two big moles on screen, have 8 koopas, etc.

    Now, how do custom sprites work? When I create a custom sprite I'm not specifying whether it's special (32x32) or normal (16x16) anywhere, so how can I make the vanilla game know if it should take a normal OAM slot or a special one? How does NMSTL know if it should allocate 4 or 16 bytes?

    Concerning allocation, does it mean storing the index of the first OAM slot allocated to the sprite in table $15EA?
    Okay, let's start again because there is a fundamental misunderstanding about how OAM allocation works as well as about the point of this part of the tutorial.

    One key factor is that Sprite Memory not only handles OAM allocation but also how many sprites can be spawned (as an aside, I hate the SpawnSprite routine included with PIXI because it doesn't respect the sprite memory, which has resulted in quite a few shenanigans; ditto for GPS's spawn_sprite).

    Originally posted by zuccha
    This is not very flexible, because if I'm not placing any special sprite in the level, there will be 4 unused OAM slots that could be used for normal sprites.

    Indeed, the way SMW handles that is limited and wasteful, and so many want to see it replaced (NMSTL exists, but everyone agrees it isn't a good system either). However, the only ones who need to take care of OAM allocation are those who change it.

    Originally posted by zuccha
    Every vanilla sprite specifies somewhere whether it is normal (16x16) or special (32x32, e.g. Big Mole).

    No. It's true that there are special sprites. However, they use a lot of tiles compared to the Mega Moles, which are bigger than the average Koopa but still get dwarfed by even bigger sprites like the Banzai Bill and the Brown Chained Platform. The list of these sprites is defined at $02A7D2 and $02A7E4 (indexed by $1692). This list is only used when spawning a sprite, as it forces them to be placed in certain slots (the ones which have enough objects to fit their tilemap).

    Originally posted by zuccha
    For instance, sprite header 04: M:08-01=07 SP:60,9F M1:01-00=01 M2:01-00=01, means we can have 7 normal sprites and 2 special sprites. If I have 8 koopas active in a screen and no special sprite, only 7 koopas will be drawn, as 1 doesn't have a free OAM slot, which is a shame because there are slots reserved for special sprites that are currently empty.

    Almost. Sprite memory also affects how many sprites can exist at a time, so seven Koopas will indeed be drawn in this situation, but that's because only seven of them can even be spawned.

    Originally posted by zuccha
    Applying the NMSTL patch (or SA-1 I guess) makes it so that the sprite header is irrelevant for determining where a sprite will be placed in OAM, and there is no more distinction between normal and special sprite area. The game assigns OAM slots in increasing (or decreasing) order based on the sprite size (4 bytes for a normal sprite, 16 bytes for a special sprite), but there is no longer an OAM area dedicated to special sprites. This has the evident advantage of being able to have more than two big moles on screen, have 8 koopas, etc.

    The first part (irrelevant sprite header, no reserved OAM) is correct but the last one (allocation of tiles) is wrong, since the game doesn't allocate them in any order but instead on fixed slots (that's what I meant by "hardcoded values" and why I mentioned the allocation by NMSTL is "dynamic"). You can see the indices at $07F000, indexed by sprite_index + ($07F0B4 + sprite_memory_index) (it's not quite sprite_memory_index * 12 in this situation but I hope you get the point).
    This is how the issue of the Mega Moles is solved: most sprite memory settings, including 0 (all slots, no special sprites), assign five objects to each sprite, which is why the game has no trouble having more than three of them onscreen (this is possible in their level, Valley of Bowser 1). Sprites like Chucks, Thwomps and Urchins also use five tiles but are regularly placed into sprite memory 0 levels, often multiple at the same time (hint: Splittin' Chucks are a thing).
    The Ball 'n' Chain is also a good example of a non-reserved sprite, as it takes six tiles (four for the ball, two for the chain), yet quite a few levels have more than two onscreen at a time (heck, Ludwig's Castle has five at one point). This is solved by using sprite memory E, which assigns each sprite six tiles at the cost of two sprite slots being unavailable.
    Actually, I lied at one point: the two uppermost slots (10 and 11, i.e. 0x0A and 0x0B) have only one tile reserved each (this is true for all sprite memory settings, for that matter). However, these slots are also reserved for items (which only use one tile), while regular sprites will never be placed into them (that's another reason I dislike the aforementioned spawn sprite routines).
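    To make the fixed lookup more concrete (and to answer the $15EA question), here is a minimal sketch of how such a table lookup can be written. It is meant as an illustration, not as a copy of the original routine: the label GetFixedOAMIndex is made up, and the sketch assumes 8-bit A/X/Y, non-SA-1 addresses, X holding the sprite slot index (as usual in sprite code), $1692 holding the sprite memory setting and $15EA being the per-sprite OAM index table.

    ```asm
    ; Sketch only: "GetFixedOAMIndex" is a hypothetical label, not a vanilla routine name.
    ; Input:  X = sprite slot index
    ; Output: $15EA,x = fixed OAM index assigned to that slot
    GetFixedOAMIndex:
        PHX                 ; keep the sprite slot index
        TXA                 ; A = sprite slot index
        LDX $1692           ; X = sprite memory setting
        CLC
        ADC.l $07F0B4,x     ; A = slot index + table offset for this sprite memory setting
        TAX
        LDA.l $07F000,x     ; A = hardcoded OAM index for this slot
        PLX                 ; restore the sprite slot index
        STA $15EA,x         ; the graphics routine later loads this into Y
        RTS
    ```

    Note that nothing here looks at the sprite number or its size. The result only depends on the slot and the sprite memory setting, which is why a sprite can't "ask" for more tiles.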

    Originally posted by zuccha
    Now, how do custom sprites work? When I create a custom sprite I'm not specifying whether it's special (32x32) or normal (16x16) anywhere, so how can I make the vanilla game know if it should take a normal OAM slot or a special one? How does NMSTL know if it should allocate 4 or 16 bytes?

    Custom sprites work the same way as vanilla sprites do: they simply don't tell the game, and NMSTL doesn't work that way either. It actually allocates OAM for the current sprite from the first used tile (that is, top to bottom) to the end of OAM, so it virtually allocates "infinitely" many tiles for the current sprite.
    In fact, the reason why disassemblies of big sprites require you to use NMSTL is that they're typically set to act like sprite 36 (which is a null sprite and thus guaranteed to not be referenced within the shared routines) and thus can't take advantage of this OAM reservation.

    The punchline is that this section doesn't try to explain what "special sprites" are (not that they're usable by any means) but instead how to draw more than one tile for a sprite. If it indeed worked the way you thought, I would have described that already, but in reality the system is that simple. Don't put any more thought into it; just draw as if you had an infinite number of tiles available.
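    In practice, "just do it" can look like the following minimal sketch of a graphics routine that draws two 16x16 tiles on top of each other. It assumes PIXI (for the %GetDrawInfo() shared routine), 8-bit A/X/Y and non-SA-1 addresses. The tile numbers are placeholders, so treat it as an example and not as the one correct way to do it.

    ```asm
    ; Sketch only: draws a 16x32 sprite out of two 16x16 tiles.
    Graphics:
        %GetDrawInfo()      ; PIXI shared routine: Y = OAM index, $00/$01 = X/Y position on screen

        LDA $00
        STA $0300,y         ; X position, bottom tile
        STA $0304,y         ; X position, top tile
        LDA $01
        STA $0301,y         ; Y position, bottom tile
        SEC
        SBC #$10
        STA $0305,y         ; Y position, top tile (16 pixels higher)
        LDA #$02            ; placeholder tile number, bottom tile
        STA $0302,y
        LDA #$00            ; placeholder tile number, top tile
        STA $0306,y
        LDA $15F6,x         ; palette, GFX page and flip of this sprite
        ORA $64             ; add the priority bits
        STA $0303,y
        STA $0307,y

        LDY #$02            ; $02 = all tiles are 16x16
        LDA #$01            ; number of tiles drawn minus one
        JSL $01B7B3         ; vanilla routine that finishes the OAM write
        RTS
    ```

    The second tile simply goes into the next four bytes after the first one. There is no extra allocation step anywhere.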
    Alright, I think that now I understand something about OAM allocation, which is already a lot more than before. I guess that the rest will become more clear as I delve deeper into SMW hacking.

    Thanks for taking the time to give such an extensive answer, it's a complicated and hard subject to explain.
    I want to make the Kintot from "Mega Man 7" move like a Cheep-Cheep, but first, please tell me how to set up the graphics like this.
