Video Game Quality: Achieving Functional Suitability, Efficiency and More

Software and Product Quality for Video Games
Antonio García-Domínguez
Aston University
CS2010 2016–17
February 14, 2017

What is this talk for?
In Stage 1...
You formed your team, and got to know your peers
You developed an idea for a game into an MVP
The priority was getting something “out there”

For Stage 2
You should now have a minimal version of your game
It’s time to turn it into a “high quality” product, but what is quality?
In this talk
First, we will discuss quality in general
Then, we will introduce some useful practices and concepts

Overview
1 Understanding quality
2 Achieving software product quality
Functional Suitability
Efficiency
Compatibility
Usability
Security
Maintainability

“Quality” according to various dictionaries
Oxford Living Dictionary:
The standard of something as measured against other things
of a similar kind; the degree of excellence of something
Dictionary.com:
4. high grade; superiority; excellence:
Cambridge Dictionary:
how good or bad something is; a high standard
Are these definitions useful? What are they missing?

Software quality and product quality
More questions
As a player, what is a “good videogame”?
As a developer, what is a “good videogame”?
How are these two related?

External vs internal quality (ISO 9126)
External quality: is it fun? → fitness for purpose
Internal quality: is it easy to work on? → quality of construction
Can we have one thing without the other?

External and internal quality are related
Some internal flaws can be very visible to the user of the software
Would you enjoy playing a game like this?
What does this tell us about the developer?

Good products do not have to be perfect inside
Other internal flaws may not be relevant to most users, however
“Snake” game injected into Pokémon Silver
https://www.youtube.com/watch?v=c81P5srA7vY&t=180s

What does this tell us?
Game is not robust against maliciously constructed inputs
This is an arbitrary code execution vulnerability!

Is this a serious issue?
How likely is this to happen naturally?
What is the impact of this security flaw?
What if this were a web server, and not a Game Boy game?

Quality in use
The same software can work well or not depending on the context
Car crashes due to Pokémon Go
Walk around, catch Pokémon
Fun for everyone (e.g. adults)
Adults have cars...
14 crashes between July 10 and 19, 2016
Whose fault is it?
Was the game badly made?
Did the game work incorrectly?
Or was it that the developers did not account for this context?

Balancing quality with other demands
Figure: triangle diagram relating Time, Cost, Scope and Quality
“Fast, good, cheap: pick any two”
Commonly represented as the Project Management Triangle
Quality is related to the other three
However, doubling the budget won’t really halve the time:
The bearing of a child takes nine months, no matter how
many women are assigned. (Fred Brooks)

Various definitions of quality
We know the types of quality and the tradeoffs, but what is quality?
Software quality
McCall (1977): product revision/operations/transition perspectives
Boehm (1978): as-is utility, maintainability, portability + levels
Grady (1992): FURPS (functionality, usability, reliability, performance, supportability)
ISO 25010 (replaces 9126): based on McCall and Boehm, 8 attributes
. . . and many others
Process quality (we won’t focus on them here)
ISO 9001: company-wide process quality management
CMMI: framework to evaluate and improve your processes

ISO 25010:2011 product quality model
ISO 25010 combines external and internal quality from ISO 9126 into one model
Product quality characteristics and their subcharacteristics:
Functional suitability: functional completeness, functional correctness, functional appropriateness
Performance efficiency: time behaviour, resource utilization, capacity
Compatibility: co-existence, interoperability
Usability: appropriateness recognisability, learnability, operability, user error protection, user interface aesthetics, accessibility
Reliability: maturity, availability, fault tolerance, recoverability
Security: confidentiality, integrity, non-repudiation, accountability, authenticity
Maintainability: modularity, reusability, analysability, modifiability, testability
Portability: adaptability, installability, replaceability

ISO 25010:2011 quality in use model
Quality in use characteristics and their subcharacteristics:
Effectiveness
Efficiency
Satisfaction: usefulness, trust, pleasure, comfort
Freedom from risk: economic risk mitigation, health and safety risk mitigation, environmental risk mitigation
Context coverage: context completeness, flexibility
These are related to the experience that the user obtains from the
software. For example:
Does the user achieve their goals, efficiently?
Does the user trust and take pleasure in playing your game?
This will usually come from a combination of the quality of the
product and the specific context of the user

Recap
So far...
Discussed the concept of software quality
Split it into internal, external and “in use” quality
Considered the tradeoffs with cost, time and scope
Introduced the ISO 25010:2011 quality models
Next half
Useful practices for some ISO 25010 characteristics
Also more examples from videogames

Building the right product: challenges
ISO 25010 subcharacteristics
Functional completeness: covers all the specified tasks and user objectives
Functional appropriateness: facilitates the specified tasks and objectives
Where does the specification for a game come from?
Pitch a game idea to someone (as you did)
Contract work from a publisher (e.g. licensed game)
Develop and self-publish on your own
Risks
Gamers (users) and publishers (clients) may demand changes
You might find out that a game idea doesn’t work out

Building the right product: changes in Diablo (1/2)
For functional completeness, the specification can change drastically
GDC’16 postmortem on Diablo (David Brevik)
Talk: https://www.youtube.com/watch?v=VscdPA6sUkc
Pitch: http://graybeardgames.blogspot.co.uk/
Original pitch (1994)
Turn-based, with timed actions
Multi-player only for “arena”
Collectible expansion disks
Final game (1996)
Real-time, multi-player coop
No expansion disks

Building the right product: changes in Diablo (2/2)
Functional appropriateness and the hot bar in Diablo
Figure: UI changes in Diablo during development (David Brevik, GDC’16)
Very important goal — don’t die!
Requires drinking potions — what’s wrong in the old UI?
Can you find any other interesting differences?

Build the product right
Functional correctness: making it work as it should
How can we know if we did it right?
Read the code ourselves (code reviews: you know about these)
Have another program read our code (static analysis)
Run the program to evaluate it (testing)
What about formal proofs?
Even if the algorithm is proven correct, the implementation can be wrong!
Beware of bugs in the above code; I have only proved it correct,
not tried it. (Knuth, 1977 lecture notes)

Static analysis
Some tools can help us find bugs and bad code as well
Examples of static analysis tools for Java
Enforcing a consistent coding style: Checkstyle
Bad practices: PMD (unnecessary loops, empty try/catch blocks...)
Potential bugs: FindBugs (e.g. potential null pointer exceptions)
Inconsistent code: Bixie (code that contradicts itself)
Some notes on using these tools
Some work on source code, some on bytecode (.class files)
Take their reports with a grain of salt: they can be wrong!
Use them often (some integrate with your IDE)
They can improve your coding skills if you pay attention to what they report
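As a rough illustration of what such tools tend to report, the sketch below pairs two commonly flagged patterns with one possible fix each: an empty catch block (a bad practice of the kind listed for PMD above) and a possible null dereference (a potential bug of the kind listed for FindBugs). The class and method names (HighScores, parseScore, labelLength) are made up for this example, and whether a given tool reports each pattern depends on its version and rule set.

```java
class HighScores {
    // Commonly flagged: the empty catch block silently hides bad input,
    // so "abc" quietly becomes a score of 0.
    static int parseScoreUnsafe(String text) {
        int score = 0;
        try {
            score = Integer.parseInt(text);
        } catch (NumberFormatException e) {
        }
        return score;
    }

    // One possible fix: tell the caller that the input was invalid.
    static int parseScore(String text) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Not a valid score: " + text, e);
        }
    }

    // Commonly flagged: 'label' is null on one path and is then dereferenced.
    static int labelLengthUnsafe(String name) {
        String label = (name == null) ? null : name.trim();
        return label.length();
    }

    // One possible fix: handle the null case explicitly.
    static int labelLength(String name) {
        return (name == null) ? 0 : name.trim().length();
    }
}
```

The unsafe variants compile without errors, which is why a tool that reads the code can be useful here: the problems only show up at run time, and only for some inputs.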

Software testing: concept and guidelines
Definition of software testing (Myers, 2004)
Testing is the process of executing a program with the intent of finding errors.
Some testing guidelines (also from Myers)
Programmers shouldn’t test their own programs
Test invalid and unexpected conditions, not only expected ones:
Programs should fail in a predictable way, too!
Check also that the program doesn’t do what it shouldn’t
Do not start testing assuming you won’t find bugs
Buggy sections are likely to hide even more bugs
Testing is creative and intellectually challenging!

Software testing: basic types
Based on knowledge about the program
Black box: try all inputs
White box: try all paths
Both are infeasible in practice, so we can only test a selection
Based on scope
Unit test: methods or 1–2 classes
Integration test: more classes, or subsystems
Function test: a feature in the game
System test: the whole game
Acceptance test: playtesting, publisher approval, certification...
Installation test

Software testing: black box testing (1/2)
How do we sample all possible inputs?
We identify equivalence classes: sets of inputs for which the program
should behave the same (whether valid or invalid)
We design test cases by:
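Valid classes: write tests that each cover as many uncovered valid classes as possible, until all are covered
Invalid classes: write tests that each cover exactly one uncovered invalid class, until all are covered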
Consider boundaries (if a class is from 1 to 100, pick both 1 and 100)
Exercise
For a program that sums a list of 1 to 100 positive numbers:
List the input conditions
Define valid and invalid classes for each input condition
Create the tests from these classes

Let’s go for the equivalence classes first:
External condition   Valid classes   Invalid classes
List size            1-100 (#1)      0 (#2), 101 (#3)
Element values       >0 (#4)         ≤ 0 (#6)

Now the tests:
List with 1 item >0 (#1 low boundary, #4)
List with 100 items >0 (#1 high boundary, #4)
Empty list (#2)
List with 101 items >0 (#3, #4)
List with 1 item =0 (#1, #6 high boundary)
List with 1 item <0 (#1, #6 low boundary)
Are we missing anything?
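The six tests above could be written down roughly as follows. This is only a sketch: ListSummer is a hypothetical implementation of the program in the exercise, and it assumes that invalid input is rejected with an IllegalArgumentException, which the exercise does not specify. Plain checks are used instead of a test framework to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ListSummer {
    static final int MAX_SIZE = 100;

    // Sums a list of 1 to 100 strictly positive numbers; rejects anything else.
    static long sum(List<Integer> values) {
        if (values.isEmpty() || values.size() > MAX_SIZE) {
            throw new IllegalArgumentException("List size must be between 1 and " + MAX_SIZE);
        }
        long total = 0;
        for (int v : values) {
            if (v <= 0) {
                throw new IllegalArgumentException("Elements must be > 0, got " + v);
            }
            total += v;
        }
        return total;
    }
}

class ListSummerChecks {
    // Builds a list with 'size' copies of 'value'.
    static List<Integer> listOf(int size, int value) {
        return new ArrayList<>(Collections.nCopies(size, value));
    }

    // True if sum() refuses the input (assumed failure mode, see above).
    static boolean rejects(List<Integer> input) {
        try {
            ListSummer.sum(input);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // Runs the six tests from the list; returns true if all of them pass.
    static boolean runAll() {
        return ListSummer.sum(listOf(1, 7)) == 7      // #1 low boundary, #4
            && ListSummer.sum(listOf(100, 1)) == 100  // #1 high boundary, #4
            && rejects(listOf(0, 1))                  // #2: empty list
            && rejects(listOf(101, 1))                // #3, #4
            && rejects(listOf(1, 0))                  // #1, #6 high boundary
            && rejects(listOf(1, -1));                // #1, #6 low boundary
    }
}
```

Each invalid class gets its own test, so a failure points at exactly one broken check.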

Software testing: white box testing (1/3)
protected int computeDamage(int def, boolean expAmmo) {
    int baseDamage;
    if (def < 100) {
        // Up to 100 defense is applied directly
        baseDamage = 150 - def;
    } else {
        // Need 200+ defense to completely avoid damage
        baseDamage = Math.max(50 - (def - 100) / 2, 0);
    }
    // Double damage if we are using explosive ammo
    return expAmmo ? baseDamage * 2 : baseDamage;
}
What are the different paths we can take through the code?

Considering Math.max, we have 6 paths in total:
def < 100, expAmmo = true
def < 100, expAmmo = false
100 ≤ def ≤ 200, expAmmo = true
100 ≤ def ≤ 200, expAmmo = false
def > 200, expAmmo = true
def > 200, expAmmo = false
Our tests should:
Try all these paths
Take boundary values into account
Define the expected damage for each test
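A sketch of tests that cover the six paths, including the boundary values 99, 100 and 200, is shown below. The method body is copied from computeDamage and made static so the example stands alone; the wrapper class names are made up. The expected values are worked out from the code itself, so they check that the paths are exercised, not that the damage formula matches the game design.

```java
class DamageCalculator {
    // Same logic as computeDamage, made static for a self-contained sketch.
    static int computeDamage(int def, boolean expAmmo) {
        int baseDamage;
        if (def < 100) {
            baseDamage = 150 - def;
        } else {
            baseDamage = Math.max(50 - (def - 100) / 2, 0);
        }
        return expAmmo ? baseDamage * 2 : baseDamage;
    }
}

class DamageCalculatorChecks {
    static void check(int def, boolean expAmmo, int expected) {
        int actual = DamageCalculator.computeDamage(def, expAmmo);
        if (actual != expected) {
            throw new AssertionError("def=" + def + ", expAmmo=" + expAmmo
                + ": expected " + expected + " but got " + actual);
        }
    }

    static void runAll() {
        // Paths 1 and 2: def < 100 (upper boundary: 99)
        check(0, false, 150);
        check(99, false, 51);
        check(99, true, 102);
        // Paths 3 and 4: 100 <= def <= 200 (boundaries: 100 and 200)
        check(100, false, 50);
        check(100, true, 100);
        check(150, false, 25);
        check(200, true, 0);
        // Paths 5 and 6: def > 200, damage is clamped to 0
        check(250, false, 0);
        check(250, true, 0);
    }
}
```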

Coverage testing tools help find missed paths
Figure: EclEmma screenshot (credit: JBoss.org)

Software testing: unit and integration tests
Unit testing and TDD
Related but not the same: TDD is one way to do unit testing
We already discussed TDD in a lab, so we won’t dwell on it here
Integration tests
They involve the combination of multiple classes / modules
Can be done bottom-up (basic modules → integrations) or top-down
Can be tricky to get right sometimes: UI, networking, sound...
Exercise for integration testing
You need to test that each time you attack, you play a sound
You really don’t want to try it manually every time
Your GameEngine takes a SoundSystem
Any ideas on how to do it?

Software testing: mocking and behaviour-based testing
Stubs/mocks for testing interactions
We give GameEngine a mock of a SoundSystem that records if the
playSound() method was passed the right sound
This is known as behaviour-based testing: we don’t check the state of
the objects, but rather how they interact
Manual mocks are time-consuming: use a library, e.g. Mockito (Java)
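import static org.mockito.Mockito.*;
import org.junit.Test; // JUnit 4 assumed

@Test
public void attackingPlaysSound() {
    // Create the mock first, so it can be handed to the engine
    SoundSystem soundSystem = mock(SoundSystem.class);
    GameEngine eng = new GameEngine(...); // must receive soundSystem
    Soldier soldier = new Soldier(...);

    soldier.attack();

    // Here is where we check the interaction
    // (expectedSound is assumed to be defined elsewhere in the test class)
    verify(soundSystem).playSound(expectedSound);
}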
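For comparison, the same behaviour-based check can be written with a hand-made mock and no library at all. The sketch below is self-contained, but every type in it is hypothetical: the slides only state that a GameEngine takes a SoundSystem, so the interface, the constructor, the attack() method and the sound name are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical interface: only its existence is given in the slides.
interface SoundSystem {
    void playSound(String soundName);
}

// Hypothetical engine that receives its SoundSystem through the constructor.
class GameEngine {
    static final String ATTACK_SOUND = "attack.wav";
    private final SoundSystem soundSystem;

    GameEngine(SoundSystem soundSystem) {
        this.soundSystem = soundSystem;
    }

    void attack() {
        // ... damage calculation would go here ...
        soundSystem.playSound(ATTACK_SOUND);
    }
}

// Hand-written mock: records every call instead of producing audio.
class RecordingSoundSystem implements SoundSystem {
    final List<String> played = new ArrayList<>();

    @Override
    public void playSound(String soundName) {
        played.add(soundName);
    }
}

class AttackSoundCheck {
    static boolean attackingPlaysSound() {
        RecordingSoundSystem mock = new RecordingSoundSystem();
        GameEngine engine = new GameEngine(mock);
        engine.attack();
        // Behaviour-based check: exactly one call, with the right sound
        return mock.played.equals(Collections.singletonList(GameEngine.ATTACK_SOUND));
    }
}
```

Passing the SoundSystem in through the constructor is what makes the swap possible. Writing one recording class per collaborator gets tedious quickly, which is the argument for a mocking library.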

Software testing: system tests
There are many other types of tests you want to run at a global level:
Volume testing: try a very large map
Stress testing: test many enemies appearing at once
Usability testing: can be part of playtesting
Security testing: important in online games!
Performance testing: we’ll look at it later
Configuration testing: try various hardware / OS setups
Installability testing: try installing your game (obvious?)
Documentation testing: try reading your manual!

Software testing: playtesting
A very specific type of testing for videogames: test the fun
According to Valve (GDC 2009)
Not for finding bugs, not for balancing, not for focus testing
Goal = fun game, designs = hypotheses, playtesting = experiments:
Direct observation (don’t coach the player!)
Can include talking aloud (can be distracting)
Can include specific Q&A (don’t appeal to be liked!)
Objective metrics (death heatmaps, eye tracking, heartrate...)
One example: Portal 2 (“Smooth Jazz”)
This map was one of the first of the older Portal maps that we
beat up and decayed to bind the two games together. The
‘smooth jazz’ joke is probably the oldest one in the game. The
team discovered through playtesting that smooth jazz was funny
to all ages, genders and cultures. (Mike Morasky)

Bad performance on a high-budget game: Ultima IX
IGN review from 1999 (Trent C. Ward):
This is all too bad, because if you can make it past the terrible framerates and the fact that the Avatar runs around whacking bad guys like Lara Croft on acid, there’s actually an amazing depth to this game.
Review scores: 9 presentation, 7 graphics, 9.5 sound, 3.5 gameplay, 3.5 lasting appeal
It likely runs well on current hardware, but this flaw and others killed the Ultima series, which had been running since 1981!

When and how do we optimise our game?
Rules of Optimisation (Ward Cunningham’s wiki)
So you want to write some smart, tricky, efficient code?

1 Don’t do it!
2 Don’t do it yet!
3 (Advanced) If you must, use a profiler — examples for Java:
VisualVM (bundled with the JDK up to version 8, a separate download since then)
YourKit, JProfiler (commercial tools — offer licenses to OSS projects)
Rationale
Focus on writing maintainable and robust code first
When the game runs slow, use a profiler to find the hot spots
These will likely be in places you didn’t think of!
Best investment of your time — also, make sure you have tests

Performance profiling
Figure: Screenshot of a VisualVM session (Java Code Geeks, 2012)
Usual operation
Find a case where the game runs slowly
Reproduce it with a test
Hook the profiler into it
Find where most time is spent
Caveats
Sampling is cheap but inexact
Instrumentation is exact but expensive (use only if needed)
Some tools also help with memory leaks
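The "reproduce it with a test" step could look roughly like the sketch below: a deterministic scenario that repeats the suspected slow work long enough for a sampling profiler to collect data. The scenario itself (an all-pairs distance check between enemies) and all names are made up for illustration; a real game would replay whatever situation was observed to be slow.

```java
import java.util.ArrayList;
import java.util.List;

class SlowFrameScenario {
    // Hypothetical hot spot: every enemy is compared with every other enemy,
    // which is O(n^2) work per frame.
    static int countClosePairs(List<double[]> positions, double radius) {
        int pairs = 0;
        for (int i = 0; i < positions.size(); i++) {
            for (int j = i + 1; j < positions.size(); j++) {
                double dx = positions.get(i)[0] - positions.get(j)[0];
                double dy = positions.get(i)[1] - positions.get(j)[1];
                if (dx * dx + dy * dy <= radius * radius) {
                    pairs++;
                }
            }
        }
        return pairs;
    }

    // Deterministic grid of enemies, so that every run does the same work.
    static List<double[]> grid(int side) {
        List<double[]> positions = new ArrayList<>();
        for (int x = 0; x < side; x++) {
            for (int y = 0; y < side; y++) {
                positions.add(new double[] {x, y});
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        int frames = args.length > 0 ? Integer.parseInt(args[0]) : 200;
        List<double[]> enemies = grid(40); // 1600 enemies
        long total = 0;
        long start = System.nanoTime();
        for (int frame = 0; frame < frames; frame++) {
            total += countClosePairs(enemies, 1.0);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(frames + " frames, " + total + " close pairs, " + elapsedMs + " ms");
    }
}
```

Raising the frame count keeps the process alive longer if the profiler needs more time to attach. The printed time is only a rough indicator; the profiler output is what shows where the time goes.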

Picking the right algorithms and data structures: Doom
Arranging level subsectors into a BSP tree allowed 3D graphics in the 386/486 era
Figure: Doom E1M1 “Hangar” map: color-coded subsectors (Doom Wiki)
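To give an idea of why a BSP tree helps, the sketch below builds a tiny 2D tree and lists its subsectors from nearest to farthest for a given viewer position: at each partition line, the side containing the viewer is visited first. This is a simplified illustration and not Doom's actual code or data format; all class names and the three-subsector map are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

abstract class BspNode {
    // Appends subsector names to 'out', nearest to the viewer first.
    abstract void frontToBack(double viewX, double viewY, List<String> out);
}

// Leaf of the tree: a convex piece of the level.
class Subsector extends BspNode {
    final String name;

    Subsector(String name) {
        this.name = name;
    }

    @Override
    void frontToBack(double viewX, double viewY, List<String> out) {
        out.add(name);
    }
}

// Inner node: a partition line through (x, y) with direction (dx, dy).
class Partition extends BspNode {
    final double x, y, dx, dy;
    final BspNode right, left;

    Partition(double x, double y, double dx, double dy, BspNode right, BspNode left) {
        this.x = x;
        this.y = y;
        this.dx = dx;
        this.dy = dy;
        this.right = right;
        this.left = left;
    }

    @Override
    void frontToBack(double viewX, double viewY, List<String> out) {
        // The sign of the 2D cross product tells on which side the viewer is:
        // positive means left of the line direction, otherwise right.
        double side = dx * (viewY - y) - dy * (viewX - x);
        BspNode near = side > 0 ? left : right;
        BspNode far = side > 0 ? right : left;
        near.frontToBack(viewX, viewY, out);
        far.frontToBack(viewX, viewY, out);
    }
}

class BspDemo {
    // Invented map: a vertical line at x = 0 splits west from east,
    // then a horizontal line at y = 0 splits the east half.
    static BspNode sampleMap() {
        BspNode east = new Partition(0, 0, 1, 0,
            new Subsector("south-east"), new Subsector("north-east"));
        return new Partition(0, 0, 0, 1, east, new Subsector("west"));
    }

    static List<String> order(double viewX, double viewY) {
        List<String> out = new ArrayList<>();
        sampleMap().frontToBack(viewX, viewY, out);
        return out;
    }
}
```

The ordering costs one side test per partition line, with no sorting and no distance computations, which is the kind of saving that mattered on hardware of that era.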

Compatibility: co-existence
Some programs may prevent others from running: SecuROM and antiviruses
Figure: Norton Community thread on SecuROM/NAV 2009 clashes

Compatibility: interoperability
Separate implementations of Rocket League playing against each other
Figure: Polygon article about PC/XB1 cross-play for Rocket League
What does this mean in terms of development / maintenance?

Learnability
Figure: Super Mario Bros. world 1-1 (Rick N. Bruns, 2008)
Miyamoto: https://www.youtube.com/watch?v=zRGRJRUWafY
Introduce features in a safe space, one by one
Slowly remove safety nets and introduce combinations

Operability
A bad control scheme can also ruin your game (or not!)
Figure: Jurassic Park: Trespasser (Hardcore Gaming 101)
Figure: Octodad (Rare Gamer)
Hard to control: both require manually operating limbs!
First game was panned, second one was praised – why?

Accessibility: survey by Yuan et al. (2011)
Figure: Interaction finite state machine (Yuan et al., 2011)
WHO classification of impairments
Visual: low vision, blindness
Hearing: complete/partial loss of ability
Motor: arthritis, paralysis, palsy, RSI...
Cognitive: Alzheimer’s, senility...
Strategies
Receiving stimuli: enhance, replace
Determine response: reduce stimuli, reduce time constraints, reduce input
Provide input: reduce, replace
Check the survey!

How important is security in videogames?
Going back to the subcharacteristics
Confidentiality: others cannot see my data
Integrity: others cannot change code / data
Non-repudiation: actions/events can be proven to have happened
Accountability: actions/events can be traced to someone
Authenticity: identities/resources are the ones they claim to be
More questions
Can you think of one game-related situation for each?
How would you rank them?

Security issues in MMORPGs
Availability can be a key issue in subscription-based games
Figure: IBT article on Distributed Denial of Service attack on Battle.net (2016)

Cutting corners can slow down development
Modularity, reusability, analysability and modifiability are needed to stay agile
Technical debt (Ward Cunningham, 1992)
Shipping first time code is like going into debt. A little debt speeds
development so long as it is paid back promptly with a rewrite. [...] The
danger occurs when the debt is not repaid. Every minute spent on
not-quite-right code counts as interest on that debt. Entire engineering
organizations can be brought to a stand-still under the debt load of an
unconsolidated implementation [...]

In summary
It is acceptable to incur some debt to complete an iteration or
prototype a new concept for your game
However, you should allocate time in each iteration to manage this
debt, or development will slow down to a crawl!

Conclusion
What we have discussed
A general idea of software quality
Examples of some of the ISO 25010 quality attributes
Things you can do or think about to improve your game
What you can do
Go over the cited resources and videos
Ask questions (this is hard, we know!)
Discuss with your team and decide how to tackle quality for TP2

Bibliography I
International Organization for Standardization.
ISO/IEC 25010:2011(en): Systems and software engineering —
Systems and software Quality Requirements and Evaluation (SQuaRE)
— System and software quality models.
https://www.iso.org/obp/ui/#iso:std:iso-iec:25010:ed-1:v1:en
Fred Brooks.
The Mythical Man-month: Essays on Software Engineering.
Addison Wesley, second edition (1995).
ISBN: 978-0201835953.
Glenford J. Myers, Tom Badgett, Corey Sandler.
The Art of Software Testing.
John Wiley & Sons, third edition (2011).
ISBN: 978-1118031964.

Bibliography II
Game Developers Conference.
YouTube channel.
https://www.youtube.com/channel/UC0JB7TSe49lg56u6qH8y_MQ
Donald E. Knuth, Peter van Emde Boas.
Correspondence on priority queues during spring 1977.
https://staff.fnwi.uva.nl/p.vanemdeboas/knuthnote.pdf
Ward Cunningham, and many others.
WikiWikiWeb.
http://wiki.c2.com/
NES Maps.
Super Mario Brothers Map Selection.
http://www.nesmaps.com/maps/SuperMarioBrothers/SuperMarioBrothers.html

Bibliography III
The Portal Wiki.
Portal 2 developer commentary.
https://theportalwiki.com/wiki/Portal_2_developer_commentary
John W. Ayers, Eric C. Leas, Mark Dredze.
Pokémon GO — A New Distraction for Drivers and Pedestrians
JAMA Intern Med. 2016, 176(12):1865-1866.
http://dx.doi.org/10.1001/jamainternmed.2016.6274
B. Yuan, E. Folmer, F. C. Harris Jr.
Game accessibility: a survey.
Univ Access Inf Soc (2011), 10:81-100.
http://dx.doi.org/10.1007/s10209-010-0189-5
