Levels of Testing

by InvertedVertex
in Discussion, Theory
on 10 August 2020

Testing your games is important. I’ve said that in a lot of different ways and contexts in my articles thus far. With interactive systems, the only time you can get useful metrics and input is when you actually go and interact with them. This time, we’ll take a deeper dive into testing, specifically a look at the different (albeit arbitrary) levels of testing, based on who is doing the testing.

So, without further ado, let’s get into the list.

Level 1 – Solo testing

Solo testing, as the name implies, is done with the designer playtesting the game all by themselves, i.e. by taking the role of each individual player that would normally be in the game. While it might seem strange if you haven’t done this before, solo testing is extremely valuable, especially with early prototypes. This type of testing is used to give you preliminary input on the game’s systems. Here, you can see if your rules work, and given the fact that there are no other real players, you can do as many changes on the fly as necessary without worrying about breaking up the flow of the playtest.

Now, the main downside of solo testing is that there is practically no hidden information, since you’re going to be aware of everything that each different player would know. For games with little to no hidden/private information, this doesn’t cause many issues, but in games where hidden info is key, you won’t get much useful feedback about how that information is used. The best you can do is pretend that each ‘player’ lacks some part of the information, so you can still kinda test the interactions. For games that have hidden info as a core staple of the design (e.g. my game Neon Arena), solo testing becomes pretty much impossible.

Solo testing isn’t mandatory for every game, but I like to do it where applicable before testing with others, to make sure that the game doesn’t crash and burn immediately. And speaking of testing with others, the next level of testing is…

Level 2 – Testing with friends/family

Here, the designer tests the game with friends/family/associates. Some designers will tell you that this is a bad idea, since people closer to you will be more reluctant to tell you that they disliked something, and that might not give you a very objective picture of things. There is truth to this, but I think there definitely is value in this type of testing, especially if you make it clear ahead of time that honest feedback will be the most useful. Personally, I always try to test with friends after going through solo testing. For one, this is usually the most convenient method of testing with others, since chances are that organizing the where and the when will be easiest with people of this category. Also, before testing the game with random people, I want to polish the game to a certain degree, barring some quick and dirty prototypes, but this is more a matter of personal preference.

Be aware that when testing multiple times with the same people, you increase the risk of creating an echo chamber effect, where both you and the players get comfortable with the changes and direction of the game, and might start ignoring different perspectives. This is largely because people in general try to avoid large changes, especially once things start to work properly the way you’re currently handling them. To prevent this, you need a much wider variety of input, and the best way to get it is to test the game with random people.

Level 3 – Testing with random people

For most games, this is the first ‘real’ test of how the game holds up. People who aren’t socially connected to you have no real reason to sugar-coat the feedback they give, whereas friends or family might, even subconsciously. Still, you need to let people know what kind of feedback you’re looking for beforehand in order to get the best results. The biggest benefit of testing with random people is that you can’t really know what their preferences are, nor what kind of perspective they might have, so you’ll get varied feedback, which will help with seeing how things work from different angles. Testing with random people is best used when you want to gauge the reactions to a first-time playthrough of the game. However, that can also be a double-edged sword, because for some things in a game, the first playthrough might not be enough for the players to fully grasp or get used to some of the mechanics. It’s important to balance testing between completely new players and ones who already have experience with the game.

The biggest issue with testing with random people, however, is logistics. The easiest way is to find playtesting (or generally just tabletop game) groups locally and try to organize a playtest that way. Board game events, regardless of size, are also a good way to find playtesters. Unlike testing with friends, though, you need to be a bit more organized and announce things further ahead of time, to ensure that people are aware of your test and that there is time for it.

Level 3.1 – Testing with other designers

This is kind of a subcategory of the previous level, at least the way I see it. Playtesting groups are going to be made up of other designers for the most part. The distinction comes mostly from the form of feedback you’ll get, and from what other designers might pay attention to that normal players might not pick up on. Designers can offer more ideas on how to handle issues, especially if they’ve had experience with similar mechanics in the past. However, that doesn’t mean that their input should be held in higher regard than that of non-designer playtesters. Take all feedback equally, as you’re the only one who knows how to properly evaluate it for your game.

Level 4 – Observing playtests (actively)

In all previous levels, the assumption was that you took part in the playtest by being one of the players in the game. However, past a certain point, you need to start doing tests where you don’t play the game yourself. This is where this level comes into play. Here, you literally watch from the sidelines as other people play the game by themselves. This level assumes that you’re still actively engaged with the playtest. That engagement most often takes the form of explaining the rules to anyone playing the game for the first time, and assisting players with any rules questions they might have during gameplay. You should avoid giving non-rules advice to players as much as possible, because that carries the risk of polluting the data you want to get, which brings us to…

Level 4.1 – Observing playtests (passively)

The same as the previous level, except this time you don’t do anything except observe the gameplay after initially explaining the rules. This approach is useful for getting data on how understandable and intuitive the rules of the game are, and allows you to see how the players will handle any edge cases or weird interactions they might come across.

So now, after removing most of the designer’s influence on the test, it’s time to go for the logical conclusion, the final level, which cuts the designer completely off from the playtest itself.

Level 5 – Blind testing

Blind testing is best reserved for games which are in the later stages of design, though nothing’s really preventing you from doing this immediately from the start should you choose to do so. A blind test is one that tests the rulebook* as much as it does the game itself. The concept is simple: players learn the game through the provided rulebook, and then proceed to play as normal. This simulates the ‘fresh out of the box’ experience players would have with the game when it’s a finalized product. Here, even if you’re observing, it’s absolutely paramount that you don’t interfere in any way. This method is one that evaluates the game as a whole, and that’s why I mentioned it’s best left for later.

You want your game to be polished enough so that the feedback you receive won’t be cluttered with reports of mechanics not working or rules making certain gameplay impossible. In my opinion, blind testing is a definite way of stress-testing the system you’ve created.

*Note: blind tests are meant to test the tutorial/learning resources for your game, regardless of format. Whether it’s the most common case of the in-box rulebook, or something like a how-to-play video, the purpose of blind testing remains the same.

An important note to mention here is that all the levels we’ve discussed can be done in any order you see fit, and some can be outright skipped if you don’t want to do them. Often, you might find yourself going back to some previous level after making certain changes to your game. Returning to a previous level does not mean you’re losing progress. There is no one single right way to test, and as long as you do as much of it as you can, you should be fine. However, the order isn’t completely arbitrary either. It can be a useful thing to keep in the back of your mind as a quick roadmap of how you might start and progress your testing efforts.

As always, keep designing, keep playing, and keep testing. Next time we’ll be tackling another MOPED principle: O for Optimization, which is in equal part trial-and-error and actually knowing what you’re doing.

This is InvertedVertex, signing off.

Tags: beginner-friendly, InvertedVertex, theory
