Feedback and Rules Changes

Link Thread Closed
Well thankfully you're not the one judging for me. I promise I won't be hurt over the opinion of someone whose comments, while having bits of valid points, seem to be more so surrounding the fact they are sour over an easier level winning that doesn't meet their criteria of kaizo.
Just chiming in to say that I support the idea of a Kaizo Hard LDC. While Kaizo Hard hacking has admittedly gotten relatively quiet in the last years due to the rise of savestateless kaizo, there still appears to be some demand for a Kaizo Hard contest and I feel like restricting people to conform to the Kaizo Light difficulty limits creativity (just look at some of the entries in this contest that had good ideas but failed to score high because they were too difficult, like Twisted Plains).

(Also, as a side note: he schlee doesn't even have pure vanilla aesthetics, it goes out of its way to pixelate the screen after death. Even ignoring the ugly cement block spam, it is designed to look bad. Don't really get the 5/10 in aesthetics either, but it doesn't matter after all)
I will be honest, knowing it did not have any hope in hell of winning I can't say I scored it with any form of critical thought process. At first glance it's a big vanilla cement wall. I knew what the level was by the time I reached it in my judging. Moved on to spend my time playing other serious levels as opposed to spending it on he schlee.

I find it funny that's the area of concern being brought to light with my judging when the entry was clearly a troll and other comments have reflected I put a lot of thought & care into my judging.

Also worth noting I loved the first 6 hours I dumped into Twisted Plains. The fast travel aspect was neat as was the layering of abilities. Once I cleared shadow castle was where I opted out.

I don't see myself ever being a kaizo hard player but if there's actual demand I understand the ask.
Originally posted by strizer86
Well thankfully you're not the one judging for me. I promise I won't be hurt over the opinion of someone whose comments, while having bits of valid points, seem to be more so surrounding the fact they are sour over an easier level winning that doesn't meet their criteria of kaizo.


dude. cmon. im not here to fight

but ill say this, you dont know shit about me to assume im 'sour' about anything.
And if you are slightly smart and pay attention to anything, im not disappointed with an 'easy' level winning. My disappointment is with a level that brings absolutely nothing new, different or even better to the table winning.. and especially with much more creative and interesting levels getting such bad placings. This really makes me question whats the purpose of the contest..
That's why I added this edit to give an example:

"Also worth noting I loved the first 6 hours I dumped into Twisted Plains. The fast travel aspect was neat as was the layering of abilities. Once I cleared shadow castle was where I opted out.

I don't see myself ever being a kaizo hard player but if there's actual demand I understand the ask."


My point is your initial criticism of my judging is silly based on the level, and then telling me it's absurdly wrong is over the top. Especially when all other comments I've gotten reflect otherwise.
Ok, time out.

Please stop with the individual sniping - I asked for overall feedback, not comments on any specific judge, and this is veering away from "debate over the point of the contest" to individual accusation and justification over scores, which this is not meant to be.

I think it's fair to say that I've heard the statements on judge diversity and a need for more formalized judging. No need to beat that into the ground, it has been expressed well and in ways that make it actionable. If you want to use that topic to attack or accuse judges of malpractice, do so privately and respectfully, but not here.

If anyone has other points of feedback for me, please keep them coming here.
I'm not a doctor.

For what it's worth Strizer, I thought your streams were pleasantly chill and I heard a lot of good things.

Originally posted by Doctor No
Originally posted by NaroGugul
But its 'confusing' (in absence of a better word) to shape any kind of deeper thinking if the views on what the contest is about can vary so much.


Originally posted by GbreezeSunset
These are all valid ways to make Kaizo, but the Twitch community needs to understand that the long levels, puzzly and cryptic levels, and otherwise unpopular-on-twitch design tropes are an equally valid aspect of Kaizo design. My goal is not to change the perspective of others players or attempt to convince someone that their way of designing or their interpretation of Kaizo is somehow invalid or lesser than another. My goal is to have a contest and group of judges that will allow for all aspects of the Kaizo community to be appreciated.



These two things are what my biggest takeaway from this year's contest will be, apart from the structural things that I know can be improved on (judge diversity and consistency in communicating results/criteria, me rambling into a microphone more than is necessary, etc.).

I'm not sure any other contest here fundamentally questions its existence like KLDC has led us to now, but I also think that our contest is uniquely open to such observation. My job isn't to sit on a mountaintop and dictate the Ten Commandments of Kaizo to the community; I study kaizo but I don't play it and barely create it.

At the end of the day, it's up to the whole community - players, casual creators, kaizo auteurs, and everyone in between - to determine what this contest should recognize and celebrate.

Previous KLDCs have had multiple categories, so maybe that's something to explore - some sort of delineation that allows for people to enter whatever kind of level they wish and have it compared to its peers. I don't know, which is why I'm asking. I'm not good enough at anything kaizo related to dictate the terms of what makes a good level or not beyond my own personal taste. If the community can come to a consensus on what we want KLDC to celebrate, I'm happy to take it and put in the effort to throw the biggest party I can.


I suppose we've done enough criticizing and it's time to think of solutions. First, to address the elephants in the room: boardcar and twisted plains (which have already been discussed a lot). I don't think they should have ranked higher in this contest. But I do think these levels should have an opportunity to shine and kaizo:hard shouldn't be completely barred from existence. Does that mean holding a separate contest? I'm not sure, as I haven't made a kaizo:hard level in like 5 years, I have no idea if there would be a ton of demand for it. But perhaps it's worth thinking about? Not sure what the format would be, if they would be co-hosted at the same time and you couldn't enter both?

Second, the second elephant: long levels. I think a good compromise would be this: make sure at least 1 judge next time doesn't mind long levels at all (even better: actively likes them), but institute a rule where levels can't be over X minutes for a full perfect run to avoid super long levels (this X number can be discussed of course). I know that pyro's rather long level scored really high, so it's obviously not impossible, but the length issue is worth mentioning regardless.

Next, there's the cryptic and puzzly kaizo levels. I'm mainly thinking of stuff like modular or what people call "old morsel design", that you would see in like gaijin or something. These are levels where the solution to setups is extremely unclear and you have to figure out how to solve setups, as well as grind out the muscle memory of the actual level. I won't argue that these levels can do well (lazy's entry was fairly puzzly and placed quite well), but it's no shocker that these levels can be extremely controversial in the twitch community. There's not much to say on this besides that a slight bit of judge diversity might help, and that dode's suggestion on the weight of the creativity score may help this a bit as well (and I'm curious what thoughts are on this)

Last thing I'd add is more advice, and then some encouragement. This can be taken with a grain of salt, as I am not a twitch streamer, and I know people rely on twitch revenue for their livelihood. But I wanna point out the power of taking breaks, especially for my own playing. When I test a bunch of levels in a row, my patience quickly runs thin. I definitely understand how that can happen when you're hours into a judging stream and levels are just starting to become annoying, but I doubt any viewers would be upset if you just took a break from judging and played something more cathartic or relaxing, like your fave kaizo hack or something. Idk, it's just a thought. Lastly, some encouragement for Doctor No and the judges: I think this contest was hosted extremely well, and for an effort to do something totally fresh and new on smwc, a great job was done all around. I'm only suggesting ways that next time could be improved.

Edit: ninja'd, added some stuff
I know there is this language barrier and i probably sound more aggressive than i intend to be.

Im sorry Doctor No about anything

My intention was never to call out any judge, but to illustrate the point in question.. eg.: short comments from the judges.
(even tho i specified his name in this case, i wasnt trying to denigrate his judge status.. even if i disagree with some points)


Anyway, despite any disagreement, thanks for making the contest this year a reality. The kaizo community really deserved an event like that.
I appreciate that Gbreeze! I had a lot of fun, and going back and trimming clips of the first time I beat each section of each stage has been a great reminder of the feelings of joy & accomplishment that come with it.
Originally posted by GbreezeSunset
Second, the second elephant: long levels. I think a good compromise would be this: make sure at least 1 judge next time doesn't mind long levels at all (even better: actively likes them), but institute a rule where levels can't be over X minutes for a full perfect run to avoid super long levels (this X number can be discussed of course). I know that pyro's rather long level scored really high, so it's obviously not impossible, but the length issue is worth mentioning regardless.


The X-minute clear limit was something on my mind. From talking to judges and reviewing comments, one of the issues with "level length" is really "forgiveness of checkpoints." You can have a 10-minute long clear level that is a pleasure to play through because it's very forgiving with checkpoints, and you can have a 2-minute clear level that is toothpicks under your fingernails because it's just 15 yumps in a row before your first checkpoint.

My interpretation, and this is reinforced after watching some blind playthroughs of past KLDC winners and standouts, is that the levels that scored well did so because they balanced their length with checkpoints. Not saying that you need a checkpoint after every trick, because that's just essentially save-stating, but giving the player a chance to breathe after doing something impressive and moving on to a new section with new techniques.

(Also, for what it's worth after watching the results VOD, I really want to find a way to celebrate how good these levels are without people fixating on the ordinals. ALL of these levels deserve to be remembered and celebrated.)
I'm not a doctor.

Originally posted by strizer86
For clarification on my judging of aesthetics I took 5/10 as vanilla.

If it was normal and functional it was 5/10. If it had palette swaps, more minor custom graphics, and functional it was moreso a 6/7, and if they went above and beyond with custom GFX it was like 8+.


Originally posted by Doctor No
Aesthetics (10)
Does the level have any visual or audio errors that detract from the gameplay?


I don't think your judging of vanilla levels as 5/10 is wrong, because that is definitely a valid and acceptable way to judge. But this is the kind of thing that I would hope can be clarified ahead of time and standardized via a rubric, because some judges seemed to take very different opinions on what qualified as a loss or gain of points in this category. Taking my level as an example, I specifically chose to use understated and non-distracting GFX and use vanilla foreground because the only criterion given in the rules was 'visual or audio errors', so I assumed that all that mattered for points was function, and not being negative to gameplay.


Unrelated to that, here is my own opinion on these matters, so what follows will be much more subjective and how I feel aesthetics should be evaluated in romhacking for SMW. I feel that it's not exactly right to reward levels that do complete visual overhauls over levels that prefer vanilla aesthetics. With the way that SMWC works and how fantastic of a resource it is, it would take only 10 minutes for me to download some very aesthetically pleasing GFX packs and to completely re-do the visuals of any level. As a result, choosing vanilla aesthetics isn't something bad that should be punished with a lower score- Vanilla foregrounds are the most well-known, and every player should have no trouble discerning the proper hitbox and function of obstacles when encountering vanilla foreground, so I choose to use them in most everything I do (one level in my hack did not use them, and nearly every player complained that they didn't know what certain tiles functioned as). I understand that perhaps if you were to be creating every resource used in your level, like some design contest where beyond the baseROM every resource has to be made, coded, etc. by you, then a complete visual overhaul would be indicative of a significant amount of work and care. The fact that the winner of the contest involved a complete graphics overhaul and a custom port is perhaps related to this as well- looking at the scoring method on the website, that should hold no addition to their score. Obviously there is a good argument for giving more credit to people who worked harder on crafting the aesthetic of their hack, but with the way that resources are shared and used in the romhacking scene, understanding who put the effort in can be a non-trivial matter. By the same token, my level should not receive more points because I coded the custom sprite myself.
Can totally appreciate wanting clarification on that ahead of time in future. Thankfully it only equated to 10% of the overall level score. I always feel like graphics are a risk in some ways because, as you mention, there's basically universal indicators and understandings of functionality at this point, and straying too far from those can result in it being a detractor rather than benefiting the level.
Judging aesthetics just by how much work you think it took to make is a misunderstanding of the role aesthetics have in the gameplay.
Aesthetics is ultimately the main 'means of communication' between the level and the player. If its not clear, not consistent, gameplay suffers. If it doesnt tell a story or if the music doesnt fit (pace/ambience/etc), your experience as a player suffers.
Its not just to 'look good'.
You can have absolutely amazing results with vanilla (just see some of the past vldcs levels) as you can have really terrible results with chocolate.
But, some think its 'silly' to question about it.. well..
Originally posted by NaroGugul
Judging aesthetics just by how much work you think it took to make is a misunderstanding of the role aesthetics have in the gameplay.
Aesthetics is ultimately the main 'means of communication' between the level and the player. If its not clear, not consistent, gameplay suffers. If it doesnt tell a story or if the music doesnt fit (pace/ambience/etc), your experience as a player suffers.
Its not just to 'look good'.
You can have absolutely amazing results with vanilla (just see some of the past vldcs levels) as you can have really terrible results with chocolate.
But, some think its 'silly' to question about it.. well..

Of course, we can actually argue about the criteria which should be used to rate the aesthetics, but I feel the important thing here is that those criteria should be a standard of the contest. I'm sure every judge comes up with their own criteria in those LDCs, like the one Strizer has explained, and I'm sure all of those are pretty valid on their own, but it's more an issue of consistency between judges, and *knowing* the *rules of the game* before playing it. idol tried something like this for the CLDC: purely vanilla aesthetics will be worth 6/10 points. I'm not saying that that's the best approach (as I have proposed, I would use well-thought-out rubrics), but it is at least a starting point that surely could be refined contest after contest.

If done right, this could also help judges judge, as they would have clear indicators and they could actually analyze the levels with those in mind, not only in the aesthetics department. For example, one of the issues seems to be with the level length. Working with rubrics, one of the indicators on the design department could be:
"Is there a good distribution of checkpoints in the level, taking into account its length?" (note that it is just an example). And this shouldn't limit judges' subjectivity at all: they have the freedom to assign a number of points to that indicator and it's completely up to them what "good distribution" means: that's where judges' subjectivity comes into play. But this approach has the benefit that a player who got a bad score in the design department because they only added one checkpoint in a pretty long level can actually look at the scores, understand *why* he/she got such a low score (since the results would be split into indicators for each department) and actually *improve* from that.

I'm not making those things up btw: it's how evaluation actually works (or, I should say, how it should work) in education and other fields, at least in my country. It helps creators know how the judging process will work *before* the contest and understand their score *after* the contest, and it helps ensure consistency in the judging.
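
The indicator-based rubric described above could be sketched roughly like this. It is only an illustration: the checkpoint indicator comes from the example in the post and the "visual or audio errors" one from the contest's aesthetics rule, while every other indicator name, the point values, and the function names are hypothetical.

```python
# Hypothetical rubric: department -> {indicator: maximum points}.
# Point values are made up for illustration only.
RUBRIC = {
    "design": {
        "good distribution of checkpoints for the level's length": 10,
        "difficulty curve": 10,  # hypothetical indicator
    },
    "aesthetics": {
        "no visual or audio errors that detract from gameplay": 5,
        "visuals communicate obstacles clearly": 5,  # hypothetical indicator
    },
}


def score_breakdown(rubric, awarded):
    """Pair each awarded score with its maximum, per indicator.

    `awarded` mirrors the rubric's shape; how many points an indicator
    deserves is still entirely the judge's call.
    """
    breakdown = {}
    for department, indicators in rubric.items():
        rows = {}
        for indicator, max_points in indicators.items():
            points = awarded[department][indicator]
            if not 0 <= points <= max_points:
                raise ValueError(f"{indicator}: {points} is outside 0..{max_points}")
            rows[indicator] = (points, max_points)
        breakdown[department] = rows
    return breakdown


def department_totals(breakdown):
    """Sum the awarded points of each department."""
    return {
        department: sum(points for points, _ in rows.values())
        for department, rows in breakdown.items()
    }


# One judge's (made-up) scores for one level.
awarded = {
    "design": {
        "good distribution of checkpoints for the level's length": 4,
        "difficulty curve": 8,
    },
    "aesthetics": {
        "no visual or audio errors that detract from gameplay": 5,
        "visuals communicate obstacles clearly": 4,
    },
}

breakdown = score_breakdown(RUBRIC, awarded)
totals = department_totals(breakdown)
```

Publishing the per-indicator breakdown rather than only the department totals is what would let a creator see that, say, the design score was dragged down by checkpoints specifically.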
Length is a delicate subject.

Theres no effective way to evaluate that.
You can make a level thats just 1 minute long for a deathless clear and it could take dozens of hours of blind gameplay.
At the same time you can make a level thats 10 minutes long and people beat it blind in less than 20 maybe.
Number of screens? Number of sublevels? thats all irrelevant, because it really depends on the circumstances.
Also, the perception of length is totally subjective.

I can't say length shouldnt be limited somehow, but from the creation perspective, it should be as long as you still have 'something to say'.

This is just random babbling.. but when i think about length i imagine Bach composing a violin partita (a set of seemingly irrelevant pieces of unaccompanied music for the violin).. Then he starts writing the Chaconne (a kinda boring repetitive style of music).. i wonder, what if he was limited to writing only 2 minutes? If you know Bach's Chaconne and its significance, i think you already know what im talking about...
I believe no hard rule will make levels that overstay their welcome vanish; that's the logical equivalent of trying to prevent bad levels from existing. I do not believe that making time a strict limit is a good idea. In fact, limiting the length of levels might end up hurting a good idea, because makers will have a knife pressed up against their necks if they want to make a lengthy level. Just look at idol's entry: it has a 5-minute clear video, but it didn't take judges more than an hour to beat it on the first play. A level might be lengthy but have enough variety to keep the player going. The only way to judge if a level goes on for too long is in the level design judging process itself. I also maintain that level design IS NOT subjective; in fact it's the most objective criterion in the scoring system. Design is a science: it focuses on how efficiently the thing being designed accomplishes its main purpose. If something is well designed, it will be well-rounded, achieve its objectives with confidence and reliability, and be open to meaningful and desirable interactions with the user. A good level must have focus, not waste your time, not be easily broken and offer a meaningful experience to the player. Long levels can be said to be poorly designed when their ideas are not focused enough to create a meaningful interaction with the player, wasting their time and therefore lacking precision in achieving the purposes they set out to reach.
Level design is not a matter of "things I like vs things I don't like in a level"; that's subjective. Designing is not. You might not like that a level is big, but that's your own personal taste, it has nothing to do with the design of the level itself.
This is not an attack on any judges, or the team behind the contest, but people should have clearer standards, and if a judging process wants to be fair and unbiased, it should judge levels for the levels themselves, not on personal preferences.

TL;DR: Long levels that overstay their welcome with unfocused obstacles should receive lower scores, and a restrictive hard time limit can work against the creators.
Also, design is objective, so much so that there are ways to reliably measure good design in every field where design is a thing.
"Not all pie sitters cry."
-James Morgan McGill
I promise ill stop talking :)

While i stand for a more objective judging on every aspect, i know it can fail if the criteria used are wrong. Again, and for the last time, this isnt an attack, but look at how strizer judged the aesthetics. Yes, its a very objective way to judge aesthetics. Is it the best criteria? Hardly
(and i say this from experience.. if you look at my portfolio you see ive always favored visual/aesthetic overhauls.. so trust me, i know how much work and thinking can go into that)

But anyway.. what i propose for future contest judging. Please, bear with me.

You know anything can 'be good' standing by themselves, alone. The problem is its a competition, so it shouldnt be only judged as singular.

I know this is a weird example, but are you familiar with the 'classical' bodybuilding judging style?
in short:

- its 4 rounds, each judging a different aspect of the physique presented.. All rounds have the same weight to the final scoring.

- theres no 'point' system. The judges give straight up placings in each round. (look it up if you are curious).. eg. if someone gets 1st from all five judges in a round he will get 5 points (1+1+1+1+1). Then the sum of all rounds gives your final score.. Lowest total wins.

- they are judged 1st by themselves, then through comparisons.
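
As a rough sketch of how the placing-sum idea in the list above could be tallied for a level contest: each judge submits an ordering per round, every entry collects its place number, and the lowest total wins. The round names, judge names, and level names below are hypothetical, and ties are simply left in input order.

```python
from collections import defaultdict


def rank_by_placings(placings):
    """Sum each entry's placings over all rounds and judges.

    `placings[round][judge]` is that judge's ordering for the round,
    best entry first. Returns (entry, total) pairs, lowest total first.
    """
    totals = defaultdict(int)
    for judges in placings.values():
        for order in judges.values():
            for place, entry in enumerate(order, start=1):
                totals[entry] += place
    return sorted(totals.items(), key=lambda item: item[1])


# Hypothetical two-round, two-judge contest.
example = {
    "design": {
        "judge1": ["Level A", "Level B", "Level C"],
        "judge2": ["Level A", "Level C", "Level B"],
    },
    "creativity": {
        "judge1": ["Level B", "Level A", "Level C"],
        "judge2": ["Level B", "Level A", "Level C"],
    },
}

print(rank_by_placings(example))
# [('Level A', 6), ('Level B', 7), ('Level C', 11)]
```

Because only the order matters, a judge who scores everything harshly and one who scores generously end up with the same influence on the result.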

This is interesting, because it not only evaluates each one on their own, but also takes the whole/context into consideration.

So what im basically saying is, even if some levels are amazing by themselves, when compared to others they can flip positions quite easily. So comparisons should be made (maybe the kldc judges did it, idk)

Anyway. Im not going to go deeper into this. If anyone is interested, just search around (you will notice judging rounds changed over the years but the logic stays the same). Arnold Schwarzenegger's book Encyclopedia of Bodybuilding has a nice explanation.

While that system isnt free from controversial results, its an interesting way to favor a more objective result in a highly subjective sport.
Just throwing my two cents in here:

I think that fundamentally there are two ways you can run these contests.

1. Set up constraints on how people make things, but allow any design goal (e.g. 24h and vanilla... at least the way they should be run)

2. Set up constraints on what types of levels and design goals are acceptable, then be as flexible as possible with how the author gets there (this contest)

I guess I just generally think that (2) is flawed. Even though people are told not to make things "for the judges", you have set up the entire premise of the contest on subjective footing. I admit that in any contest, judges are going to have their own biases, but if you group contests along (2) they are more likely to try to fit things to some model of an "ideal level" in their head.

Supposing that the purpose of these contests is to get people to make cool stuff, I would be strongly against any rule or restriction that limits what an author is allowed to aim for. E.g. for this contest, level length, time limit, puzzles, etc. The more you standardize things the more you are going to get "follow the leader", which is the exact opposite of the spirit of kaizo.

I guess that isn't helpful without providing alternative things to judge on, so here are a few:




Clear creative vision : did the level have a clear conceptual theme that was communicated to the player. Did it stick to that theme the whole time.
--good example : (you know what this looks like)
--bad example : a level which is long by virtue of changing gimmicks halfway through. a "generic" level which doesn't present any new ideas.

Coherence : did the level's design consistently match the vision communicated above.
--good example : (you know what this looks like)
--bad example : a fast paced platforming level with an out of place puzzle stuck in the middle. basically anything that feels unintentional

Depth : did the design of the level thoroughly explore the idea it is based on. did it linger on the same idea without adding new variations
--good example : a short punchy level. a long level which iterates on an idea with a lot of depth, without repeating itself
--bad example : a level which repeats itself

Pacing : does the rate at which new ideas are expressed remain relatively consistent
--good example : (you know what this looks like)
--bad example : a giant difficulty spike causing progress to grind to a halt. a level which allocates lots of time to one thought, then barrages the player with a bunch of other things near the end.
--bad example : a level which goes from being extremely fast paced to slow paced or vice versa, where it is clear that this wasn't done deliberately.

annoying / out of place / tedious parts : subtract x * the number of these from the final score.




I think something like this would be way better suited for basically every contest. But thats just like my opinion man.
Hey guys, thank you so much for the feedback. The whole judging thing was a new experience for me and I was very happy to play everyone's creations.

I do agree my comments on the levels could've been way better. I mostly tried to focus on why levels lost points with me and didn't focus enough on commentating everything else.

I've seen a few people discussing this, and I followed my own rubric to try to judge levels more consistently, breaking down the main categories into mini sections based on what I felt belonged in each. For example, in level design I paid attention to and rated: checkpoint placement, difficulty curve/spikes, pacing & flow, introduction/expansion of ideas, and whether the level felt well tested (no camera issues, blind jumps, or despawning).

I'll take what I've learned here and, if i get the chance, apply it to future events or maybe just some for fun fan judging. :)
Just noticed this thread is actually a thing, so here are my 2 cents as well:

- I agree with Dode that creativity should've been weighted higher. I also agree with snoruntpyro that the "Kaizo elements" category was unnecessary and random. If these 10 points had been added to creativity, i think the scoring would've been fairer, since creativity is an important aspect on its own (not considering if obstacles are fair, checkpoint usage etc, but instead focusing on how well the designer played with his gimmick). Instead we had this quite subjective kaizo elements thing...

- A vkldc as its own contest sounds like a great idea but i think i said that already at some point

- I also agree that streaming doesn't replace the scoring sheet. Not only are the comments super short for some levels, but I would've also liked to see the exact scoring for the levels. Did my level lack creativity? Or did judges dislike its design? Exact scores and detailed comments are important to improve, even if this means a "streaming pause" of around 10-15 minutes (just my rough estimation). Of course there could be a non-streaming judge too. I'm indifferent on that last aspect though.

- i agree with idol that disqualifying trolls / low effort entries should be more strict. Why should he schlee get a participation trophy only for building a cement wall and making some custom palette?

- And i agree with Naro that i cannot understand how Chroma Castle could win this contest. The level isn't bad, but as Naro said there were so many other, way more creative levels. Chroma Castle's most custom thing was its flashy palettes and that's all. It kinda reminded me of VLDC9, where a certain judge gave such flashy levels almost perfect scores only for their aesthetics while places 4-2 actually had way better design and creative gimmicks. As said, the level isn't bad, but it feels strange that such a simple "safe play" wins 1st place.

- Even if judging is anonymous i wouldn't entirely disagree with judges discussing their scores before revealing them. I think this was done for VLDCX? I'm not saying every level should be rescored after this, but it would be easier to notice if some judges are "too random" with their scoring (e.g. if someone gives a level 10 points more or less than the other judges) and some rebalancing could be made to ensure a fairer total score in the end.
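
The "too random" check from the last point could be sketched like this, assuming each judge gives each level one total score. The 10-point threshold is the figure from the post; comparing against the median of the other judges is an assumption made here for illustration, as are the function name and the sample data.

```python
from statistics import median


def flag_outlier_scores(scores, threshold=10):
    """Flag judges whose score for a level is far from the other judges'.

    `scores` maps level -> {judge: total score}. A score is flagged when
    it differs from the median of the remaining judges' scores by at
    least `threshold` points. Returns (level, judge, difference) tuples.
    """
    flagged = []
    for level, by_judge in scores.items():
        if len(by_judge) < 2:
            continue  # nothing to compare against
        for judge, score in by_judge.items():
            others = [s for j, s in by_judge.items() if j != judge]
            difference = score - median(others)
            if abs(difference) >= threshold:
                flagged.append((level, judge, difference))
    return flagged


# Made-up scores from four hypothetical judges.
scores = {
    "Level A": {"judge1": 80, "judge2": 78, "judge3": 79, "judge4": 60},
    "Level B": {"judge1": 70, "judge2": 72, "judge3": 68, "judge4": 71},
}

flagged = flag_outlier_scores(scores)
```

A flag here would only be a prompt for the judges to talk a score over before the reveal, not an automatic correction.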

Just to clarify: I'm not mad or anything, but slightly disappointed. In the end Doctor No did a great job with hosting this and i appreciate the judges' efforts. But i hope all the feedback is taken into consideration so that the next KLDC or VKLDC can be even better.







