USABILITY TESTING
- Punto Damar P -
WHY ?
" Since the limitation of data, and the
lack of theoretical foundation in
Game Design, most of games have
been developed based solely on own
experiences and intuitions of the
Designer. As the result, about 80% of
games fail on the market every
year."
( Game Software Industry Report in AlienBrain product catalog. NxN
software. 2001 )
WHY ? (2)
"However, it is necessary to point out that,
too often, video game interfaces are an
afterthought. The reason is, too many
project managers assume the most
important part of a software
development project is the programming,
and then the interface can come later. As
the result, insufficient time is assigned for
interface design which may leads to a poor
quality interface." ( Fox 2005 )
MORE INFORMATION ...
"Human Computer Interaction in
Game Design"
- Nguyen Hung -
http://www.theseus.fi/bitstream/handle/10024/43234/Nguyen_Hung.pdf?sequence=1
MORE INFORMATION ... (2)
"Quantifying the
User Experience"
- Jeff Sauro / James R. Lewis -
USABILITY TESTING ?
DEBUGGING != USABILITY TESTING
BUG FREE != USABLE
HOW DO WE DO IT ?
• Compare it to a specific benchmark or
goal.
• Use statistical methods to get more
precise answers.
• Get statistically significant evidence
from small samples.
HOW DO WE SET A
BENCHMARK ?
• Based on historical data obtained from
previous test that included the task.
• Based on findings reported in published
scientific or marketing research.
• Negotiate criteria with the stakeholders who
are responsible for the product.
HOW DO WE SET A
BENCHMARK ? (2)
Some suggestions :
• The best objective basis is data from previous
usability studies of predecessor or competing
products.
• The source of historical data should be studies of
similar types of participants, completing the same
tasks, under the same conditions.
• Negotiate with other stakeholders for the final set of
shared goals.
HOW DO WE SET A
BENCHMARK ? (3)
Some other suggestions :
• Establish some specific objectives
immediately, so you can measure
improvements.
• Revise your product in the early stages.
• Do not change reasonable goals to
accommodate an unusable product.
COMPARING A COMPLETION RATE
TO A BENCHMARK
small sample test & large sample test
SMALL SAMPLE TEST
• success / fail
• "small" sample size = the total number of
users tested is less than 30.
HERE'S THE FORMULA
( brace yourselves )
Use the exact probabilities from the binomial distribution:
$$ p(x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x} $$
where :
x = the number of users who successfully completed the
task
n = sample size
p = the benchmark completion rate
LIFE HACK ..
Use Microsoft Excel's function :
BINOMDIST()
EXAMPLE 1
Eight of nine users successfully
completed a task.
Is there sufficient evidence to conclude
that at least 70% of all users would
be able to complete the same task ?
ANSWER
$$ p(8) = \frac{9!}{8!\,(9-8)!}\, 0.7^{8} (1-0.7)^{(9-8)} = 0.1556 $$
$$ p(9) = \frac{9!}{9!\,(9-9)!}\, 0.7^{9} (1-0.7)^{(9-9)} = 0.04035 $$
OR..
= BINOMDIST (8 , 9 , 0.7 , FALSE) = 0.1556
= BINOMDIST (9 , 9 , 0.7 , FALSE) = 0.04035
CONCLUSION
0.1556 + 0.04035 = 0.1960
If the true completion rate were exactly 70%, the
probability of 8 or 9 successes out of nine
attempts would be 0.1960.
(1 - 0.1960) * 100 = 80.4%
There is an 80.4% chance that the
completion rate exceeds 70%
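The exact-probability calculation for Example 1 can be sketched in Python as a cross-check on the spreadsheet. This is a minimal sketch, not part of the original slides: the helper names `binom_pmf` and `exact_tail` are made up for illustration, and `binom_pmf` is assumed to play the role of BINOMDIST(x, n, p, FALSE).

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n trials with success rate p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def exact_tail(x: int, n: int, p: float) -> float:
    """Probability of x or more successes in n trials with success rate p."""
    return sum(binom_pmf(k, n, p) for k in range(x, n + 1))


# Example 1: 8 of 9 users succeeded, benchmark completion rate is 70%.
p8 = binom_pmf(8, 9, 0.7)
p9 = binom_pmf(9, 9, 0.7)
tail = exact_tail(8, 9, 0.7)
confidence = 1 - tail

print(f"p(8) = {p8:.4f}, p(9) = {p9:.5f}")
print(f"p(8 or 9) = {tail:.4f}, confidence = {confidence:.1%}")
```

Summing the tail in a loop rather than hard-coding p(8) + p(9) lets the same function handle any observed count, for example 7 of 9.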
MID - PROBABILITY
0.5*(0.1556) + 0.04035 = 0.1182
Counting only half the probability of the observed
result (8 successes), the probability of 8 or 9
successes out of nine attempts is 0.1182.
(1 - 0.1182) * 100 = 88.2%
There is an 88.2% chance that the
completion rate exceeds 70%
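The mid-probability adjustment can be recomputed the same way. The sketch below is an illustration under assumptions: `binom_pmf` and `mid_p_tail` are hypothetical helper names, and the adjustment is implemented as described on the slide (half the probability of the observed count plus the probability of every larger count).

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n trials with success rate p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def mid_p_tail(x: int, n: int, p: float) -> float:
    """Half of p(x) plus the probability of more than x successes."""
    above = sum(binom_pmf(k, n, p) for k in range(x + 1, n + 1))
    return 0.5 * binom_pmf(x, n, p) + above


# Example 1 again: 8 of 9 users succeeded, benchmark completion rate is 70%.
mid_p = mid_p_tail(8, 9, 0.7)
mid_confidence = 1 - mid_p

print(f"mid-p = {mid_p:.4f}, confidence = {mid_confidence:.1%}")
```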
IMPORTANT NOTES
• Not strong enough evidence for production, but sufficient
to show that efforts are better spent on
improving other functions.
• The probability we computed is called an "exact"
probability, not because the answer is exactly correct,
but because the probabilities are calculated
exactly rather than approximated.
• The exact result tends to be conservative.
LARGE SAMPLE TEST
• success / fail
• "large" sample size = at least 15 failures
and 15 successes.
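The "large sample" rule above can be written as a one-line check. This is only a sketch: the function name `use_normal_approximation` and the `minimum` parameter are hypothetical, and the threshold of 15 is taken directly from the slide.

```python
def use_normal_approximation(successes: int, failures: int, minimum: int = 15) -> bool:
    """True when there are at least `minimum` successes and `minimum` failures."""
    return successes >= minimum and failures >= minimum


# 85 successes and 15 failures just meet the rule; 8 successes and 1 failure do not.
print(use_normal_approximation(85, 15))
print(use_normal_approximation(8, 1))
```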
HERE'S THE FORMULA
( brace yourselves again)
Use the normal approximation to the binomial:
$$ z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}} $$
where :
$\hat{p}$ = the observed completion rate expressed as a proportion
p = benchmark
n = number of users tested
EXAMPLE 2
85 out of 100 users were able to
successfully locate a specific product
and add it to their shopping cart.
Is there enough evidence to conclude
that at least 75% of all users can
complete this task successfully ?
ANSWER
$$ z = \frac{0.85 - 0.75}{\sqrt{\frac{0.75(1 - 0.75)}{100}}} = 2.309 $$
• Use NORMSDIST() to convert the z-score to a cumulative probability.
• Final result = abs( NORMSDIST(2.309) - 1 )
= 0.0105
CONCLUSION
0.0105 * 100 = 1.05 %
There is around a 99% chance that at
least 75% of users can complete the
task.
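Example 2 can also be reproduced outside a spreadsheet. The sketch below uses Python's standard-library `statistics.NormalDist` where the slides use NORMSDIST(); the function name `completion_rate_z` is a hypothetical helper, not something from the original deck.

```python
from math import sqrt
from statistics import NormalDist


def completion_rate_z(successes: int, n: int, benchmark: float) -> float:
    """z-score of the observed completion rate against a benchmark proportion."""
    p_hat = successes / n
    return (p_hat - benchmark) / sqrt(benchmark * (1 - benchmark) / n)


# Example 2: 85 of 100 users succeeded, benchmark completion rate is 75%.
z = completion_rate_z(85, 100, 0.75)
p_value = 1 - NormalDist().cdf(z)  # one-tailed area above z
confidence = 1 - p_value

print(f"z = {z:.3f}, p = {p_value:.4f}, confidence = {confidence:.1%}")
```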
COMPARING A TASK TIME TO A
BENCHMARK
HERE'S THE FORMULA
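$$ t = \frac{\ln(\mu) - \hat{x}_{\ln}}{s_{\ln} / \sqrt{n}} $$
where :
$\mu$ = benchmark task time
$\hat{x}_{\ln}$ = mean of the log values
$s_{\ln}$ = standard deviation of the log values
$n$ = number of users tested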
EXAMPLE 3
11 users completed a task in a financial
application.
Task times : 90, 59, 54, 55, 171, 86, 107,
53, 79, 72, 157
Is there enough evidence that the average
task time is less than 100 seconds?
ANSWER
• Task Times =
90, 59, 54, 55, 171, 86, 107, 53, 79, 72, 157
• Log-transformed times =
4.50, 4.08, 3.99, 4.01, 5.14, 4.45, 4.67, 3.97, 4.37, 4.28, 5.06
• Mean of log times = 4.41
• Geometric mean = EXP(mean of log times) = EXP(4.41) =
82.3
• Standard deviation of log times = 0.411
• Log of benchmark (100s) = 4.61
ANSWER (2)
Find the t-statistic value:
$$ t = \frac{4.61 - 4.41}{0.411 / \sqrt{11}} \approx \frac{0.19}{0.124} \approx 1.53 $$
Use the probability on 10 degrees of freedom
(n-1):
TDIST(1.53,10,1) = 0.0785
CONCLUSION
The probability of seeing an average time of 82.3
seconds if the actual population time is greater
than 100 seconds is around 7.85%
OR
We can be 92.15% confident that users can
complete this task in less than 100 seconds.
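Example 3 can be checked end to end with the standard library. This is a rough sketch under assumptions: Python has no built-in t-distribution, so the hypothetical helper `t_upper_tail` integrates the t density numerically with Simpson's rule as a stand-in for TDIST(t, df, 1). Because the code works from unrounded log values, it gives a slightly larger t (roughly 1.57) than the rounded hand calculation.

```python
from math import exp, gamma, log, pi, sqrt
from statistics import mean, stdev


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """One-tailed P(T > t) for t >= 0, using Simpson's rule on [0, t]."""
    if t < 0:
        raise ValueError("t must be non-negative")
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Example 3: task times in seconds, benchmark is 100 seconds.
times = [90, 59, 54, 55, 171, 86, 107, 53, 79, 72, 157]
logs = [log(x) for x in times]
mean_log = mean(logs)
sd_log = stdev(logs)  # sample standard deviation (n - 1 in the denominator)
geo_mean = exp(mean_log)

t_stat = (log(100) - mean_log) / (sd_log / sqrt(len(times)))
p_value = t_upper_tail(t_stat, len(times) - 1)

print(f"geometric mean = {geo_mean:.1f} s")
print(f"t = {t_stat:.2f}, one-tailed p = {p_value:.3f}")
```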
IMPORTANT NOTES
• What is the geometric mean?
The best estimate of the middle task time for
small-sample usability data (fewer than 25 users).
• What about large-sample usability data?
Use the sample median.
(won't be explained here)
TOOLS
http://pencil.evolus.vn/
https://marvelapp.com/
https://proto.io/
http://www.invisionapp.com/
FIXING COST
Source : Theo Allen
UNIFIED PROCESS MODEL
https://en.wikipedia.org/wiki/Unified_Process
THANK YOU
Punto Damar P.
facebook.com/puntodamar
@ puntodamar
BikinGame.com
