These are the slides for a second-year 2-hour lecture in the CS2010 "Group Project" module explaining software quality through the ISO 25010 standard and giving some basics of software testing. The talk illustrates software quality concepts through relevant videogames, in line with the "strategy game" theme chosen for this year's group coursework.
Video Game Quality: Achieving Functional Suitability, Efficiency and More
1. Software and Product Quality for Video Games
Antonio García-Domínguez
Aston University
CS2010 2016–17
February 14, 2017
Antonio García-Domínguez Software and Product Quality CS2010 2016–17 1 / 54
3. What is this talk for?
In Stage 1...
You formed your team, and got to know your peers
You developed an idea for a game into an MVP
The priority was getting something “out there”
For Stage 2
You should now have a minimal version of your game
It’s time to turn it into a “high quality” product — but what’s quality?
In this talk
First, we will discuss quality in general
Then, we will introduce some useful practices and concepts
6. “Quality” according to various dictionaries
Oxford Living Dictionary:
The standard of something as measured against other things
of a similar kind; the degree of excellence of something
Dictionary.com:
4. high grade; superiority; excellence:
Cambridge Dictionary:
how good or bad something is; a high standard
Are these definitions useful? What are they missing?
8. Software quality and product quality
More questions
As a player, what is a “good videogame”?
As a developer, what is a “good videogame”?
How are these two related?
External vs internal quality (ISO 9126)
External quality: is it fun? → fitness for purpose
Internal quality: is it easy to work on? → quality of construction
Can we have one thing without the other?
9. External and internal quality are related
Some internal flaws can be very visible to the user of the software
Would you enjoy playing a game like this?
What does this tell us about the developer?
12. Good products do not have to be perfect inside
Other internal flaws may not be relevant to most users, however
“Snake” game injected into Pokémon Silver
https://www.youtube.com/watch?v=c81P5srA7vY&t=180s
What does this tell us?
Game is not robust against maliciously constructed inputs
This is an arbitrary code execution vulnerability!
Is this a serious issue?
How likely is this to happen naturally?
What is the impact of this security flaw?
What if this were a web server, and not a Game Boy game?
13. Quality in use
The same software can work well or not depending on the context
Car crashes due to Pokémon Go
Walk around, catch Pokémon
Fun for everyone (e.g. adults)
Adults have cars...
14 crashes between July 10 and 19, 2016
Whose fault is it?
Was the game badly made?
Did the game work incorrectly?
Or was it that the developers did not account for this context?
14. Balancing quality with other demands
Figure: triangle diagram labelled Time, Cost, Scope and Quality
“Fast, good, cheap: pick any two”
Commonly represented as the Project Management Triangle
Quality is related to the other three
However, doubling the budget won’t really halve the time:
The bearing of a child takes nine months, no matter how
many women are assigned. (Fred Brooks)
15. Various definitions of quality
We know types of quality and the tradeoffs, but what is quality?
Software quality
McCall (1977): product revision/operations/transition perspectives
Boehm (1978): as-is utility, maintainability, portability + levels
Grady (1999): FURPS
ISO 25010 (replaces 9126): based on McCall and Boehm, 8 attributes
. . . and many others
Process quality (we won’t focus on it here)
ISO 9001: company-wide process quality management
CMMI: framework to evaluate and improve your processes
16. ISO 25010:2011 product quality model
ISO 25010 combines external and internal quality from ISO 9126 into one model
Product quality characteristics and their subcharacteristics:
Functional suitability: functional completeness, functional correctness, functional appropriateness
Performance efficiency: time behaviour, resource utilization, capacity
Compatibility: co-existence, interoperability
Usability: appropriateness recognisability, learnability, operability, user error protection, UI aesthetics, accessibility
Security: confidentiality, integrity, non-repudiation, accountability, authenticity
Maintainability: modularity, reusability, analysability, modifiability, testability
Reliability: maturity, availability, fault tolerance, recoverability
Portability: adaptability, installability, replaceability
17. ISO 25010:2011 quality in use model
Quality in use characteristics and their subcharacteristics:
Effectiveness
Efficiency
Satisfaction: usefulness, trust, pleasure, comfort
Freedom from risk: economic risk mitigation, health and safety risk mitigation, environmental risk mitigation
Context coverage: context completeness, flexibility
These are related to the experience that the user obtains from the
software. For example:
Does the user achieve their goals, efficiently?
Does the user trust and take pleasure in playing your game?
This will usually come from a combination of the quality of the
product and the specific context of the user
18. Recap
So far...
Discussed the concept of software quality
Split it into internal, external and “in use” quality
Considered the tradeoffs with cost, time and scope
Introduced the ISO 25010:2011 quality models
Next half
Useful practices for some ISO 25010 characteristics
Also more examples from videogames
21. Building the right product: challenges
ISO 25010 subcharacteristics
Functional completeness: the functions cover all the specified tasks and user objectives
Functional appropriateness: the functions facilitate the specified tasks and objectives
Where does the specification for a game come from?
Pitch a game idea to someone (as you did)
Contract work from a publisher (e.g. licensed game)
Develop and self-publish on your own
Risks
Gamers (users) and publishers (clients) may demand changes
You might find out that a game idea doesn’t work out
22. Building the right product: changes in Diablo (1/2)
In functional completeness, the specification can change drastically
GDC’16 postmortem on Diablo (David Brevik)
Talk: https://www.youtube.com/watch?v=VscdPA6sUkc
Pitch: http://graybeardgames.blogspot.co.uk/
Original pitch (1994)
Turn-based, with timed actions
Multi-player only for “arena”
Collectible expansion disks
Final game (1996)
Real-time, multi-player coop
No expansion disks
23. Building the right product: changes in Diablo (2/2)
Functional appropriateness and the hot bar in Diablo
Figure: UI changes in Diablo during development (David Brevik, GDC’16)
Very important goal — don’t die!
Requires drinking potions — what’s wrong in the old UI?
Find any other interesting differences?
24. Build the product right
Functional correctness: making it work as it should
How can we know if we did it right?
Read the code ourselves (code reviews: you know about these)
Have another program read our code (static analysis)
Run the program to evaluate it (testing)
What about formal proofs?
Even if the algorithm is proven correct, the implementation can be wrong!
Beware of bugs in the above code; I have only proved it correct,
not tried it. (Knuth, 1977 lecture notes)
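A well-known instance of this gap (not taken from the original slides) is the midpoint computation in binary search: the algorithm is correct over mathematical integers, but a direct Java translation can overflow a 32-bit int. The sketch below is only an illustration, and the class and method names are hypothetical.

```java
// Hypothetical illustration: a correct algorithm with a faulty implementation.
final class MidpointDemo {
    // Textbook midpoint: fine for mathematical integers,
    // but low + high can overflow a 32-bit int.
    static int naiveMid(int low, int high) {
        return (low + high) / 2;
    }

    // Overflow-safe variant, valid when 0 <= low <= high.
    static int safeMid(int low, int high) {
        return low + (high - low) / 2;
    }

    public static void main(String[] args) {
        int low = 1_500_000_000;
        int high = 2_000_000_000;
        // The sum exceeds Integer.MAX_VALUE and wraps around to a negative value
        System.out.println("naive: " + naiveMid(low, high));
        // The safe variant stays between low and high
        System.out.println("safe:  " + safeMid(low, high));
    }
}
```

A proof about the algorithm would not catch this, but a test with large indices would.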
25. Static analysis
Some tools can help us find bugs and bad code as well
Examples of static analysis tools for Java
Enforcing a consistent coding style: Checkstyle
Bad practices: PMD (unnecessary loops, empty try/catch blocks...)
Potential bugs: FindBugs (e.g. potential null pointer exceptions)
Inconsistent code: Bixie (code that contradicts itself)
Some notes on using these tools
Some work on source code, some on bytecode (.class files)
Take their reports with a grain of salt: they can be wrong!
Use them often (some integrate with your IDE)
They can improve your coding skills if you listen carefully
27. Software testing: concept and guidelines
Definition of software testing (Myers, 2004)
Testing is the process of executing a program with the intent of finding errors.
Some testing guidelines (also from Myers)
Programmers shouldn’t test their own programs
Test invalid and unexpected conditions, not only expected ones:
Programs should fail in a predictable way, too!
Check also that the program doesn’t do what it shouldn’t
Do not start testing assuming you won’t find bugs
Buggy sections are likely to hide even more bugs
Testing is creative and intellectually challenging!
28. Software testing: basic types
Based on knowledge about the program
Black box: try all inputs
White box: try all paths
Both are infeasible: we can only test a selection
Based on scope
Unit test: methods or 1–2 classes
Integration test: more classes, or subsystems
Function test: a feature in the game
System test: the whole game
Acceptance test: playtesting, publisher approval, certification...
Installation test
29. Software testing: black box testing (1/2)
How do we sample all possible inputs?
We identify equivalence classes: sets of inputs for which the program
should behave the same (whether valid or invalid)
We design test cases by:
Valid classes: each test covers as many uncovered valid classes as possible, until all of them are covered
Invalid classes: each test covers exactly one uncovered invalid class, until all of them are covered
Consider boundaries (if a class is from 1 to 100, pick both 1 and 100)
Exercise
For a program that sums a list of 1 to 100 positive numbers:
List the input conditions
Define valid and invalid classes for each input condition
Create the tests from these classes
31. Software testing: black box testing (2/2)
Let’s go for the equivalence classes first:
External condition   Valid classes   Invalid classes
List size            1-100 (#1)      0 (#2), 101 (#3)
Element values       >0 (#4)         ≤ 0 (#6)
Now the tests:
List with 1 item >0 (#1 low boundary, #4)
List with 100 items >0 (#1 high boundary, #4)
Empty list (#2)
List with 101 items >0 (#3, #4)
List with 1 item =0 (#1, #6 high boundary)
List with 1 item <0 (#1, #6 low boundary)
Are we missing anything?
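The six tests above can be sketched as plain Java checks. This is a minimal sketch under assumptions: the ListSummer class, its sum method and the choice of IllegalArgumentException for invalid inputs are hypothetical, since the slides do not define the program under test.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical program under test: sums a list of 1 to 100 positive numbers.
final class ListSummer {
    static long sum(List<Integer> values) {
        if (values.isEmpty() || values.size() > 100) {
            throw new IllegalArgumentException("list size must be 1-100");
        }
        long total = 0;
        for (int v : values) {
            if (v <= 0) {
                throw new IllegalArgumentException("elements must be > 0");
            }
            total += v;
        }
        return total;
    }
}

final class ListSummerBlackBoxTests {
    // Builds a list with n copies of value.
    static List<Integer> listOf(int n, int value) {
        return new ArrayList<>(Collections.nCopies(n, value));
    }

    // True if sum() rejects the input with an IllegalArgumentException.
    static boolean rejects(List<Integer> values) {
        try {
            ListSummer.sum(values);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    static void check(boolean condition, String name) {
        if (!condition) throw new AssertionError("failed: " + name);
    }

    public static void main(String[] args) {
        check(ListSummer.sum(listOf(1, 7)) == 7, "1 item >0 (#1 low boundary, #4)");
        check(ListSummer.sum(listOf(100, 1)) == 100, "100 items >0 (#1 high boundary, #4)");
        check(rejects(listOf(0, 1)), "empty list (#2)");
        check(rejects(listOf(101, 1)), "101 items >0 (#3, #4)");
        check(rejects(listOf(1, 0)), "1 item =0 (#1, #6 high boundary)");
        check(rejects(listOf(1, -5)), "1 item <0 (#1, #6 low boundary)");
        System.out.println("all black box tests passed");
    }
}
```

Each invalid test uses exactly one invalid class, so a failure points at a single cause.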
32. Software testing: white box testing (1/3)
protected int computeDamage(int def, boolean expAmmo) {
  int baseDamage;
  if (def < 100) {
    // Up to 100 defense is applied directly
    baseDamage = 150 - def;
  } else {
    // Need 200+ defense to completely avoid damage
    baseDamage = Math.max(50 - (def - 100) / 2, 0);
  }
  // Double damage if we are using explosive ammo
  return expAmmo ? baseDamage * 2 : baseDamage;
}
What are the different paths we can take through the code?
33. Software testing: white box testing (2/3)
Considering Math.max, we have 6 paths in total:
def < 100, expAmmo = true
def < 100, expAmmo = false
100 ≤ def ≤ 200, expAmmo = true
100 ≤ def ≤ 200, expAmmo = false
def > 200, expAmmo = true
def > 200, expAmmo = false
Our tests should:
Try all these paths
Take boundary values into account
Define the expected damage for each test
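One possible set of tests for these six paths is sketched below. It assumes the method is copied into a hypothetical DamageCalculator class and made static for convenience; the expected values were worked out by hand from the formulas in the method.

```java
// Hypothetical holder class: same logic as computeDamage from the slide,
// made static so that it can be called without building a game object.
final class DamageCalculator {
    static int computeDamage(int def, boolean expAmmo) {
        int baseDamage;
        if (def < 100) {
            baseDamage = 150 - def;
        } else {
            baseDamage = Math.max(50 - (def - 100) / 2, 0);
        }
        return expAmmo ? baseDamage * 2 : baseDamage;
    }
}

final class DamageWhiteBoxTests {
    static void check(int def, boolean expAmmo, int expected) {
        int actual = DamageCalculator.computeDamage(def, expAmmo);
        if (actual != expected) {
            throw new AssertionError("def=" + def + ", expAmmo=" + expAmmo
                + ": expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        // def < 100 (99 is the upper boundary of this branch)
        check(50, false, 100);
        check(99, true, 102);
        // 100 <= def <= 200 (both boundaries, plus a middle value)
        check(100, false, 50);
        check(150, true, 50);
        check(200, false, 0);
        // def > 200: Math.max clamps the damage to 0
        check(300, false, 0);
        check(300, true, 0);
        System.out.println("all white box tests passed");
    }
}
```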
34. Software testing: white box testing (3/3)
Coverage testing tools help find missed paths
Figure: EclEmma screenshot (credit: JBoss.org)
35. Software testing: unit and integration tests
Unit testing and TDD
Related but not the same: TDD is one way to do unit testing
We already discussed TDD in a lab, so we won’t dwell on it here
Integration tests
They involve the combination of multiple classes / modules
Can be done bottom-up (basic modules → integrations) or top-down
Can be tricky to get right sometimes: UI, networking, sound...
Exercise for integration testing
You need to test that each time you attack, you play a sound
You really don’t want to try it manually every time
Your GameEngine takes a SoundSystem
Any ideas on how to do it?
36. Software testing: mocking and behaviour-based testing
Stubs/mocks for testing interactions
We give GameEngine a mock of a SoundSystem that records if the
playSound() method was passed the right sound
This is known as behaviour-based testing: we don’t check the state of
the objects, but rather how they interact
Manual mocks are time-consuming: use a library, e.g. Mockito (Java)
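import static org.mockito.Mockito.*;
import org.junit.Test; // assuming JUnit 4
@Test
public void attackingPlaysSound() {
  // Create the mock first, so that it can be handed to the engine
  SoundSystem soundSystem = mock(SoundSystem.class);
  GameEngine eng = new GameEngine(...); // receives soundSystem among its arguments
  Soldier soldier = new Soldier(...);
  soldier.attack();
  // Here is where we check the interaction
  verify(soundSystem).playSound(expectedSound);
}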
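To show what a library such as Mockito automates, here is a hand-written recording mock with no external dependencies. It is only a sketch: the SoundSystem interface, the GameEngine constructor, the attack() method and the "attack" sound name are all assumptions made for illustration, not the coursework's actual API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical interface that the engine depends on.
interface SoundSystem {
    void playSound(String soundName);
}

// Hand-written mock: it plays nothing, it only records the requests.
final class RecordingSoundSystem implements SoundSystem {
    final List<String> played = new ArrayList<>();

    @Override
    public void playSound(String soundName) {
        played.add(soundName);
    }
}

// Hypothetical minimal engine that takes its SoundSystem in the constructor.
final class GameEngine {
    static final String ATTACK_SOUND = "attack";
    private final SoundSystem soundSystem;

    GameEngine(SoundSystem soundSystem) {
        this.soundSystem = soundSystem;
    }

    void attack() {
        soundSystem.playSound(ATTACK_SOUND);
    }
}

final class AttackSoundTest {
    public static void main(String[] args) {
        RecordingSoundSystem sounds = new RecordingSoundSystem();
        GameEngine engine = new GameEngine(sounds);
        engine.attack();
        // Behaviour-based check: we inspect the interaction, not the engine state
        if (!sounds.played.equals(Collections.singletonList(GameEngine.ATTACK_SOUND))) {
            throw new AssertionError("unexpected sounds: " + sounds.played);
        }
        System.out.println("attack played the expected sound");
    }
}
```

Passing the dependency through the constructor is what makes the swap possible: the test can substitute the recorder without touching the engine code.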
37. Software testing: system tests
There are many other types of tests you want to run at a global level:
Volume testing: try a very large map
Stress testing: test many enemies appearing at once
Usability testing: can be part of playtesting
Security testing: important in online games!
Performance testing: we’ll look at it later
Configuration testing: try various hardware / OS setups
Installability testing: try installing your game (obvious?)
Documentation testing: try reading your manual!
38. Software testing: playtesting
A very specific type of testing for videogames: test the fun
According to Valve (GDC 2009)
Not for finding bugs, not for balancing, not for focus testing
Goal = fun game, designs = hypotheses, playtesting = experiments:
Direct observation (don’t coach the player!)
Can include talking aloud (can be distracting)
Can include specific Q&A (don’t appeal to be liked!)
Objective metrics (death heatmaps, eye tracking, heartrate...)
One example: Portal 2 (“Smooth Jazz”)
This map was one of the first of the older Portal maps that we
beat up and decayed to bind the two games together. The
‘smooth jazz’ joke is probably the oldest one in the game. The
team discovered through playtesting that smooth jazz was funny
to all ages, genders and cultures. (Mike Morasky)
40. Bad performance on a high-budget game: Ultima IX
IGN review from 1999 (Trent C. Ward):
This is all too bad, because if you can
make it past the terrible framerates and
the fact that the Avatar runs around
whacking bad guys like Lara Croft on
acid, there’s actually an amazing depth
to this game.
Review scores: 9 presentation, 7
graphics, 9.5 sound, 3.5 gameplay, 3.5
lasting appeal
It likely runs well now, but this flaw and others killed the Ultima series, which had been running since 1981!
44. When and how do we optimise our game?
Rules of Optimisation (Ward Cunningham’s wiki)
So you want to write some smart, tricky, efficient code?
1 Don’t do it!
2 Don’t do it yet!
3 (Advanced) If you must, use a profiler — examples for Java:
VisualVM (comes with JDK)
YourKit, JProfiler (commercial tools — offer licenses to OSS projects)
Rationale
Focus on writing maintainable and robust code first
When the game runs slowly, use a profiler to find the hot spots
These will likely be in places you didn’t think of!
Best investment of your time — also, make sure you have tests
45. Performance profiling
Figure: Screenshot of a VisualVM session (Java Code Geeks, 2012)
Usual operation
Find a case where the game runs slowly
Reproduce it with a test
Hook the profiler into it
Find where most time is spent
Caveats
Sampling is cheap but inexact
Instrumentation is exact but expensive (use only if needed)
Some tools also help with memory leaks
46. Picking the right algorithms and data structures: Doom
Arranging level subsectors into a BSP tree allowed 3D graphics in the 386/486 era
Figure: Doom E1M1 “Hangar” map: color-coded subsectors (Doom Wiki)
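The reason a BSP tree helps is that it gives a near-to-far ordering of the subsectors for any viewer position without sorting. The sketch below shows that idea on a tiny 2D tree. It is not Doom's actual code: the class, the partition-line representation and the sample map are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical 2D BSP node. Inner nodes hold a partition line
// nx*x + ny*y = d; leaves hold the name of a subsector.
final class BspNode {
    final double nx, ny, d;
    final BspNode front, back;
    final String subsector; // non-null only for leaves

    private BspNode(double nx, double ny, double d,
                    BspNode front, BspNode back, String subsector) {
        this.nx = nx;
        this.ny = ny;
        this.d = d;
        this.front = front;
        this.back = back;
        this.subsector = subsector;
    }

    static BspNode leaf(String subsector) {
        return new BspNode(0, 0, 0, null, null, subsector);
    }

    static BspNode split(double nx, double ny, double d, BspNode front, BspNode back) {
        return new BspNode(nx, ny, d, front, back, null);
    }

    // Collects subsectors starting with the ones on the viewer's side
    // of each partition line, so nearer subsectors come first.
    void collectFrontToBack(double vx, double vy, List<String> out) {
        if (subsector != null) {
            out.add(subsector);
            return;
        }
        boolean viewerInFront = nx * vx + ny * vy - d >= 0;
        BspNode near = viewerInFront ? front : back;
        BspNode far = viewerInFront ? back : front;
        near.collectFrontToBack(vx, vy, out);
        far.collectFrontToBack(vx, vy, out);
    }
}

final class BspDemo {
    // Sample map: the line x = 0 splits west from east,
    // and the line y = 0 splits the east half into NE and SE.
    static BspNode sampleTree() {
        BspNode east = BspNode.split(0, 1, 0, BspNode.leaf("NE"), BspNode.leaf("SE"));
        return BspNode.split(1, 0, 0, east, BspNode.leaf("W"));
    }

    public static void main(String[] args) {
        List<String> order = new ArrayList<>();
        sampleTree().collectFrontToBack(1, 1, order);
        System.out.println(order); // viewer in the NE corner: [NE, SE, W]
    }
}
```

The tree is built once per level, so the per-frame cost is a single traversal rather than a sort of all the geometry.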
48. Compatibility: co-existence
Some programs may prevent others from running: SecuROM and antiviruses
Figure: Norton Community thread on SecuROM/NAV 2009 clashes
49. Compatibility: interoperability
Separate implementations of Rocket League playing against each other
Figure: Polygon article about PC/XB1 cross-play for Rocket League
What does this mean in terms of development / maintenance?
51. Learnability
Figure: Super Mario Bros. world 1-1 (Rick N. Bruns, 2008)
Miyamoto: https://www.youtube.com/watch?v=zRGRJRUWafY
Introduce features in a safe space, one by one
Slowly remove safety nets and introduce combinations
52. Operability
A bad control scheme can also ruin your game (or not!)
Figure: Jurassic Park: Trespasser (Hardcore Gaming 101)
Figure: Octodad (Rare Gamer)
Hard to control: both require manually operating limbs!
The first game was panned, the second one was praised: why?
53. Accessibility: survey by Yuan et al. (2011)
Figure: Interaction finite state machine (Yuan et al., 2011)
WHO classification of impairments
Visual: low vision, blindness
Hearing: complete/partial loss of ability
Motor: arthritis, paralysis, palsy, RSI...
Cognitive: Alzheimer’s, senility...
Strategies
Receiving stimuli: enhance, replace
Determine response: reduce stimuli, reduce time constraints, reduce input
Provide input: reduce, replace
Check the survey!
55. How important is security in videogames?
Going back to the subcharacteristics
Confidentiality: others cannot see my data
Integrity: others cannot change code / data
Non-repudiation: actions/events can be proven to have happened
Accountability: actions/events can be traced to someone
Authenticity: identities/resources are the ones they claim to be
More questions
Can you think of one game-related situation for each?
How would you rank them?
56. Security issues in MMORPGs
Availability can be a key issue in subscription-based games
Figure: IBT article on Distributed Denial of Service attack on Battle.net (2016)
59. Cutting corners can slow down development
Modularity, reusability, analysability and modifiability are needed to stay agile
Technical debt (Ward Cunningham, 1992)
Shipping first time code is like going into debt. A little debt speeds
development so long as it is paid back promptly with a rewrite. [...] The
danger occurs when the debt is not repaid. Every minute spent on
not-quite-right code counts as interest on that debt. Entire engineering
organizations can be brought to a stand-still under the debt load of an
unconsolidated implementation [...]
In summary
It is acceptable to incur some debt to complete an iteration or
prototype a new concept for your game
However, you should allocate time in each iteration to manage this
debt, or development will slow down to a crawl!
60. Conclusion
What we have discussed
A general idea of software quality
Examples of some of the ISO 25010 quality attributes
Things you can do or think about to improve your game
What you can do
Go over the cited resources and videos
Ask questions (this is hard, we know!)
Discuss with your team and decide how to tackle quality for TP2
61. Bibliography I
International Organization for Standardization.
ISO/IEC 25010:2011(en): Systems and software engineering. Systems and software Quality Requirements and Evaluation (SQuaRE). System and software quality models.
https://www.iso.org/obp/ui/#iso:std:iso-iec:25010:ed-1:v1:en
Fred Brooks.
The Mythical Man-Month: Essays on Software Engineering.
Addison Wesley, second edition (1995).
ISBN: 978-0201835953.
Glenford J. Myers, Tom Badgett, Corey Sandler.
The Art of Software Testing.
John Wiley & Sons, third edition (2011).
ISBN: 978-1118031964.
62. Bibliography II
Game Developers Conference.
YouTube channel.
https://www.youtube.com/channel/UC0JB7TSe49lg56u6qH8y_MQ
Donald E. Knuth, Peter van Emde Boas.
Correspondence on priority queues during spring 1977.
https://staff.fnwi.uva.nl/p.vanemdeboas/knuthnote.pdf
Ward Cunningham, and many others.
WikiWikiWeb.
http://wiki.c2.com/
NES Maps.
Super Mario Brothers Map Selection.
http://www.nesmaps.com/maps/SuperMarioBrothers/SuperMarioBrothers.html
63. Bibliography III
The Portal Wiki.
Portal 2 developer commentary.
https://theportalwiki.com/wiki/Portal_2_developer_commentary
John W. Ayers, Eric C. Leas, Mark Dredze.
Pokémon GO — A New Distraction for Drivers and Pedestrians
JAMA Intern Med. 2016, 176(12):1865-1866.
http://dx.doi.org/10.1001/jamainternmed.2016.6274
B. Yuan, E. Folmer, F. C. Harris Jr.
Game accessibility: a survey.
Univ Access Inf Soc (2011), 10:81-100.
http://dx.doi.org/10.1007/s10209-010-0189-5