A few notes, mostly optimization-related. First, some of your percussion macros do more than they need to. This, for example, should be K8. c16 instead, since the instrument call in it is redundant. Second, all your [r1] loops would be more efficient as [r2] loops (and some of the shorter ones shouldn't be looped at all, although that's incredibly minor, and I'm only pointing it out since I'm already pointing out other things). Third, it's better to use $DF instead of p0,0 if you want to disable vibrato. Finally, I tested it, and the ARAM overflow is only 0x100 bytes if you include @15, so I'm sure you could make this Yoshi-compatible by slightly downsampling any one of the samples.
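To illustrate the second and third points, here's a rough before/after sketch. The loop counts are made up for the example and aren't taken from your file; the point is only that the total rest length stays the same when you halve the rest and double the count.

```
; Before (hypothetical loop count)
[r1]8     ; eight whole-note rests
p0,0      ; vibrato zeroed out with the full vibrato command

; After
[r2]16    ; sixteen half-note rests, same total length
$DF       ; dedicated vibrato-off command
```

Whatever counts you actually have, doubling them when going from r1 to r2 keeps the timing identical.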
The one non-logistical note I have is that the clap feels a bit too loud, especially at the beginning. Otherwise, nice job!