Game Testing - Exploring the Test Space

Introduction
So you are about to plan how to test the game you are developing or, preferably,
are about to start developing. Where do you start? Maybe you have some
understanding of what you are going to develop: requirements, or stakeholders
telling you what should be in the game and, to some degree, how it should work.
But often this covers only a small portion of the complete test space. I use
“test space” to denote the complete set of tests you would have to run to test
absolutely everything.
You know that you will never cover the entire test space with tests, since this
would not be very cost effective. But you probably have some idea that just
testing that the core functionality of the game works will not be enough. Which
tests you actually perform will always be a question of priorities, but how do
you think of all the tests that you could possibly need in the first place?
In this article I will try to give a list of different types of tests that I think
should at least be considered before development of a new game starts. I will use
ISO 25010 [1] as a base, and add thoughts from James Bach [2] and James Whittaker
[3], who have developed different approaches to exploratory testing that offer
other ways to explore the test space. This will not be a complete,
all-encompassing list, but my hope is that it will be a good start.
ISO 25010
ISO 25010 is a quality model for systems and software. It contains eight
categories of software quality, which can be good to use as a starting point when
designing tests.
Functional Suitability
This category covers a large test space, and it is easy to completely miss large
parts of it if you do not take the time to think through your approach.
The degree to which the product provides functions that meet stated and implied
needs when the product is used under specified conditions [1]. A user action
should lead to some kind of result. When I press the “New Game” button
something should happen. When I press the mouse button in game it has some
effect depending on where the cursor is located. When a user presses “Connect
to Social Media Platform” the game should connect to said social media
platform. And so on. A large chunk of your tests will probably end up in this
category, and these are most likely the first ones you think of.
Also included in this category, and equally obvious, would be artificial
intelligence. Is the AI taking appropriate actions based on specified conditions?
Same goes for audio – are the correct audio files played based on specified
conditions? Verifying game mechanics is included here as well, as is verifying
physics engines.

Another big thing, depending on the game under test, would be verifying that the
game world meets stated or implied needs. Is everything in the right place? A
rock, a vendor, monsters, NPCs, caves, castles, houses, lakes, mountains etc.
Multiplayer functionality is also part of functional suitability.
There are obviously many different types of tests in this category, and they vary
wildly between different genres and games. Trying to create a complete
overview map of all functional suitability tests for different genres is a whole
article in itself. And even then a specific game could deviate from the genre and
require other unique tests.
Coming up with tests for all these different areas is not an easy task. In a
previous article I mentioned using Systematic Inventive Thinking (SIT) as one way
to think creatively about what to test, by starting with normal user behavior and
applying SIT [4].
Reliability
So apart from the functional suitability tests, another category of tests is
reliability tests. The degree to which a system or component performs specified
functions under specified conditions for a specified period of time [1].
Fault tolerance is one aspect. What happens when something goes wrong? Will
the game crash, or will the game handle the fault in a suitable way?
Recovery is another. What happens when the game actually crashes? What
happens to user data such as progression, items and scores? Preferably the game
should never crash, but if it does, it should have minimal impact on the user.
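A recovery test can be sketched as follows. This is only an illustration: the save format, the function names and the crash_before_commit switch are all assumptions of mine, not taken from any particular game or engine. The idea is to simulate a crash in the middle of a save and check that the previous save is still readable.

```python
import json
import os
import tempfile


def save_progress(path, progress, crash_before_commit=False):
    """Write to a temporary file, then swap it into place in one step."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(progress, handle)
    if crash_before_commit:
        # Simulated crash: the new data never replaces the old save file.
        raise RuntimeError("simulated crash during save")
    os.replace(tmp_path, path)


def load_progress(path):
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def previous_save_survives_crash():
    """Return True if a crash mid-save leaves the earlier save readable."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "save.json")
        save_progress(path, {"level": 3, "gold": 120})
        try:
            save_progress(path, {"level": 4, "gold": 150},
                          crash_before_commit=True)
        except RuntimeError:
            pass  # the "game" crashed; now check what is left on disk
        return load_progress(path) == {"level": 3, "gold": 120}
```

In a real project the crash would be provoked in the actual game process, for example by killing it while it saves, rather than by a flag in the save function.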
Duration and maturity tests are also included in the reliability category. What
happens if you leave the game on for 24 hours? A week? Memory leaks can
typically be found by this kind of test. How does the game run after 12 hours of
intensive playing? Does performance degrade? Does anything else degrade?
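As a minimal sketch of a duration test that looks for memory leaks, assume the game loop can be driven one frame at a time from a test. The LeakyGame and StableGame classes below are hypothetical stand-ins for such a loop, and the allowed growth is an arbitrary threshold chosen for the example. Only tracemalloc is a real Python standard library module.

```python
import tracemalloc


class LeakyGame:
    """Hypothetical game loop that leaks a little memory every frame."""

    def __init__(self):
        self._history = []

    def tick(self):
        # Defect under test: per-frame data is appended and never released.
        self._history.append(bytearray(1024))


class StableGame:
    """Hypothetical game loop with bounded memory use."""

    def __init__(self):
        self._frame = bytearray(1024)

    def tick(self):
        # The old buffer is dropped each frame, so memory stays flat.
        self._frame = bytearray(1024)


def memory_growth(game, warmup_ticks=100, measured_ticks=2000):
    """Return the bytes of traced memory gained over measured_ticks frames."""
    tracemalloc.start()
    try:
        for _ in range(warmup_ticks):
            game.tick()
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(measured_ticks):
            game.tick()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


def looks_leaky(game, allowed_growth_bytes=100_000):
    """Flag the game if memory grew more than the allowed amount."""
    return memory_growth(game) > allowed_growth_bytes
```

A real soak test would run for hours against the actual game and sample the process memory from outside, but the principle is the same: measure after a warm-up, measure again much later, and compare.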
In this category I would also include random automated tests that click through
the UI, and tests that drop automated bots into a game world and let them
traverse the world randomly until they pass the walls of the game world. This
could, for example, let you find places where character models could get stuck in
the world.
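The random bot idea can be sketched on a toy map. Everything here is invented for illustration: the grid format, the 'P' pit cell that can be entered but not left, and the function names. A real test would drive the game's own movement and collision code instead.

```python
import random

# Hypothetical 2D map: '.' is walkable, '#' is a wall, and 'P' is a pit
# that can be entered but not left (the kind of defect bots should find).
WORLD = [
    "#####",
    "#..P#",
    "#...#",
    "#####",
]

MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def legal_moves(world, pos):
    """Return the neighbouring cells the bot may step into from pos."""
    row, col = pos
    if world[row][col] == "P":
        return []  # once inside the pit, no move is possible
    return [
        (row + d_row, col + d_col)
        for d_row, d_col in MOVES
        if world[row + d_row][col + d_col] != "#"
    ]


def random_walk(world, start, steps, rng):
    """Walk randomly; return the cell where the bot got stuck, or None."""
    pos = start
    for _ in range(steps):
        options = legal_moves(world, pos)
        if not options:
            return pos
        pos = rng.choice(options)
    return None


def find_stuck_spots(world, start, bots=50, steps=200, seed=1234):
    """Drop several bots into the world and collect every stuck position."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    stuck = set()
    for _ in range(bots):
        spot = random_walk(world, start, steps, rng)
        if spot is not None:
            stuck.add(spot)
    return stuck
```

Seeding the random generator is a deliberate choice: when a bot does get stuck, the same run can be replayed to reproduce the problem.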
Stress testing is also part of reliability testing. In its simplest form it can include
tests like pressing a button many times instead of once. It can also include
thousands of users logging into a server at the same time, or hundreds of players
fighting in PvP against each other, or participating in a public non-instanced
quest together. Another example is a large number of players performing
microtransactions simultaneously. Under the reliability umbrella we are checking
whether the game can handle these situations without crashing, logging you out
or something similar. The performance under these conditions is covered by the
Performance Efficiency category below.
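A simple version of the "thousands of users logging in at the same time" test might look like the sketch below. FakeLoginServer is a hypothetical in-memory stand-in; a real test would call the game's actual login service, and the user and worker counts are arbitrary. The check is purely about reliability: every login should succeed and no session should be lost.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeLoginServer:
    """Hypothetical in-memory stand-in for a login service."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sessions = set()

    def login(self, user_id):
        # The lock protects the shared session set from concurrent writes.
        with self._lock:
            self._sessions.add(user_id)
        return True

    def session_count(self):
        with self._lock:
            return len(self._sessions)


def stress_login(server, users=2000, workers=64):
    """Log in many users concurrently; return the number of failed logins."""

    def attempt(user_id):
        try:
            return server.login(user_id)
        except Exception:
            return False  # any error counts as a failed login

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(attempt, range(users)))
    return results.count(False)
```

Usage would be along the lines of creating a server, calling stress_login(server) and then comparing server.session_count() with the number of users that were sent in.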

Operability
The degree to which the product has attributes that enable it to be understood,
learned, used and attractive to the user, when used under specified conditions
[1]. So what does this actually mean?
Ease of use, learnability, attractiveness. Is the game easy to start playing? Can the
user understand the game mechanics? Is the user interface understandable? Is
the tutorial good?
But this could also include fun factor testing when it comes to games. Is the game
actually fun to play? Realism testing could also be included in this category,
which includes the believability of the physics engine and many other things. Is
the game realistic and immersive to the player? Balancing could also be included
here, as it can affect whether the game is fun and understandable.
Performance Efficiency
The performance relative to the amount of resources used under stated
conditions [1]. This can include response time, loading time, and different kinds
of time behavior. It can also include different types of resource utilization, such
as RAM, GPU and CPU usage. FPS would be connected to that resource
utilization.
This also includes performance tests under stressful conditions, such as relative
FPS when fighting alongside 10 people in PvP versus 200 people.
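Relative FPS can be computed from recorded frame times. The sketch below assumes the game can log how long each frame took in milliseconds. The sample values are invented, roughly 60 FPS with 10 players and 24 FPS with 200 players.

```python
def average_fps(frame_times_ms):
    """Average frames per second from a list of frame durations in ms."""
    total_seconds = sum(frame_times_ms) / 1000.0
    return len(frame_times_ms) / total_seconds


def relative_fps(baseline_frame_times_ms, stressed_frame_times_ms):
    """FPS under load as a fraction of baseline FPS (1.0 means no drop)."""
    baseline = average_fps(baseline_frame_times_ms)
    stressed = average_fps(stressed_frame_times_ms)
    return stressed / baseline


# Hypothetical samples, not measurements from a real game.
ten_players = [16.7] * 600
two_hundred_players = [41.7] * 600

# With these samples the ratio is about 0.4, i.e. the game keeps roughly
# 40 percent of its baseline frame rate in the 200 player fight.
ratio = relative_fps(ten_players, two_hundred_players)
```

Whether a given ratio is acceptable is a product decision. The test only makes the drop visible and comparable between builds.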
Many of these tests also impact how attractive the game is to the user, which we
talked about in the operability section.
Security
The degree of protection of information and data so that unauthorized persons
or systems cannot read or modify them and authorized persons or systems are
not denied access to them [1]. Even in a single player game without any
microtransactions, security is important. But this importance explodes when
testing an MMO, MOBA or similar game. Confidentiality, integrity, accountability,
authenticity [1]. Security testing requires very specific competence compared to
other types of testing, but is essential to modern games. How to battle botting
and gold sellers can be included in this category.
Compatibility
The degree to which two or more systems or components can exchange
information and/or perform their required functions while sharing the same
hardware or software environment [1].
Interoperability is part of this. Support for different game pads, gaming mice,
keyboards, Oculus and Morpheus, and similar gaming paraphernalia. One
software system – the game – and one or more hardware systems.

Co-existence between programs. Having Spotify running in the background.
Running TeamSpeak, Ventrilo or other VoIP clients. Two or more different
software systems co-existing.
Maintainability
The degree of effectiveness and efficiency with which the product can be
modified [1]. Many activities included in this category are development
activities. But ensuring testability [5] is definitely something of interest to the
game tester.
You could include testing for modification [6] support in this category.
Transferability
The degree to which a system or component can be effectively and efficiently
transferred from one hardware, software or other operational or usage
environment to another [1].
Here we can test on different hardware and OS. Mobile games need to be tested
on iOS, Android and Kindle for example. Then there are different manufacturers
of hardware. Different OS versions. Different versions of the hardware from the
same manufacturer. For PC there is a practically unlimited number of configurations. Here
the key would be to have enough business information to be able to prioritize
different hardware and OS configurations, framed by minimum and
recommended hardware requirements.
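One way to turn business information into a test priority is to rank configurations by the share of expected players they represent and test as many as the budget allows. The sketch below assumes such shares are available. All configuration names, shares and the budget are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str
    player_share: float  # fraction of expected players, from business data
    meets_minimum_spec: bool


def prioritize(configs, budget):
    """Return the supported configurations with the largest player share."""
    supported = [c for c in configs if c.meets_minimum_spec]
    ranked = sorted(supported, key=lambda c: c.player_share, reverse=True)
    return ranked[:budget]


def coverage(selected):
    """Fraction of expected players covered by the selected configurations."""
    return sum(c.player_share for c in selected)


# All names and shares below are hypothetical.
CANDIDATES = [
    Config("phone A / OS 8", 0.30, True),
    Config("phone B / OS 7", 0.25, True),
    Config("tablet C / OS 8", 0.20, True),
    Config("phone D / OS 5", 0.15, False),  # below minimum requirements
    Config("tablet E / OS 7", 0.10, True),
]

selected = prioritize(CANDIDATES, budget=3)
```

Player share is only one possible ranking criterion. Revenue per configuration or known risk on certain hardware could be used in the same way.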
Heuristics by James Bach [7]
James Bach, one of the most famous software testing experts, has created a list of
general test techniques that are simple and universal enough to be applicable in
many different contexts, and can also be applied to game testing. This is another
way of looking at the test space compared to the categories that I have
previously presented, attacking the problem from a different angle.
• Function Testing - Test what it can do
• Domain Testing - Look for any data processed by the product
• Stress Testing - Overwhelm the product
• Flow Testing - Do one thing after another
• Scenario Testing - Test to a compelling story
• Claims Testing - Verify every claim
• User Testing - Involve the users
• Risk Testing - Imagine a problem, then look for it
• Automatic Checking - Check a million different facts
For a fuller explanation of each technique, see the Heuristic Test Strategy Model [7].
Testing Tours by James Whittaker [8]
James Whittaker is a professor and software testing evangelist at Microsoft, and
has previously worked at Google as well. During his time at Google he developed
a way of performing exploratory testing [9] which he called Testing Tours.

Testing Tours can also be applied to explore the test space, but they take yet
another approach to doing so than the two previous ones.
“Suppose you are visiting a large city like London, England, for the very first time.
It’s a big, busy, confusing place for new tourists, with lots of things to see and do.
Indeed, even the richest, most time-unconstrained tourist would have a hard time
seeing everything a city like London has to offer. The same can be said of well-
equipped testers trying to explore complex software; all the funding in the world
won’t guarantee completeness.” [8]
So James Whittaker created a number of “Testing Tours” that you could perform
to explore the software under test, based on this tourism metaphor.
• The Guidebook Tour - Follow the user manual’s advice just like the wary
traveler, by never deviating from its lead
• The Money Tour - Run through sales demos to make sure everything
that is used for sales purposes works
• The Landmark Tour - Choose a set of features, decide on an ordering for
them, and then explore the application going from feature to feature
until you’ve explored all of them in your list.
• The Intellectual Tour - This tour takes the approach of asking the
software hard questions. How do we make the software work as hard as
possible?
• The FedEx Tour - During this tour, a tester must concentrate on
data. Try to identify inputs that are stored and “follow” them around the
software.
• The Garbage Collector’s Tour - This is like a methodical spot check. We
can decide to spot check the interface where we go screen by screen,
dialog by dialog (favoring, like the garbage collector, the shortest route),
and not stopping to test in detail, but checking the obvious things.
• The Bad-Neighborhood Tour - Software also has bad neighborhoods:
those sections of the code populated by bugs. This tour is about running
tests in those sections of the code.
• The Museum Tour - During this tour, testers should identify older code
and executable artifacts and ensure they receive a fair share of testing
attention.
• The Back Alley Tour - If your organization tracks feature usage, this tour
will direct you to test the ones at the bottom of the list. If your
organization tracks code coverage, this tour implores you to find ways to
test the code yet to be covered.
• The All-Nighter Tour - Exploratory testers on the All-Nighter tour will
keep their application running without closing it.
• The Supermodel Tour - During the Supermodel tour, the focus is not on
functionality or real interaction. It’s only on the interface.
• The Couch Potato Tour - A Couch Potato tour means doing as little actual
work as possible. This means accepting all default values (values
prepopulated by the application), leaving input fields blank, filling in as
little form data as possible, never clicking on an advertisement, paging
through screens without clicking any buttons or entering any data, and
so forth.
• The Obsessive-Compulsive Tour - Perform the same action over and
over. Repeat, redo, copy, paste, borrow, and then do all that some more.
For a fuller description of each tour, see the Testing Tours reference [8].
Conclusion
This has been my attempt to give a good base for continued exploration of the
test space to help new testers or developers think of all the necessary tests they
need to run for their game.
This is by no means the only way to tackle this problem, but I usually combine
these three approaches when I think about a testing problem.
Unfortunately this is just the first step. The next step is to prioritize the different
tests, since running them all is not feasible. But that is a topic for another article.
Johan Hoberg

References
[1] ISO/IEC 25010:2011
http://www.iso.org/iso/catalogue_detail.htm?csnumber=35733
[2] James Bach
http://www.satisfice.com/blog/
[3] James Whittaker
http://googletesting.blogspot.se/search/label/James%20Whittaker
[4] Systematic Inventive Thinking and Game Testing
http://www.gamasutra.com/blogs/JohanHoberg/20140801/222379/Systematic_Inventive_Thinking_and_Game_Testing.php
[5] Testability
http://en.wikipedia.org/wiki/Software_testability
[6] Video Game Mod
http://en.wikipedia.org/wiki/Mod_(video_gaming)
[7] Heuristic Test Strategy Model
http://www.satisfice.com/tools/htsm.pdf
[8] Testing Tours
http://msdn.microsoft.com/en-us/library/jj620911.aspx#bkmk_tours
[9] Exploratory Testing
http://en.wikipedia.org/wiki/Exploratory_testing
