Combinatorial Group Testing for Software Integration. Leslie Wu.
Software testing under-researched but important in practice (Amitabh Srivastava, MSFT VP)
Today’s datacenter-scale distributed systems deploy software-as-a-service:
A pageview on Google may trigger upwards of 50 services (Jeff Dean)
View on Amazon.com may access more than 100 services (AMZN CTO)
Our work: apply “group testing” to software integration in-the-large.
Goal: accelerate integration-defect root cause analysis
3. Motivation
• Software testing under-researched but important in practice (Amitabh Srivastava, MSFT VP)
• Today’s datacenter-scale distributed systems deploy software-as-a-service:
• A pageview on Google may trigger upwards of 50 services (Jeff Dean)
• View on Amazon.com may access more than 100 services (AMZN CTO)
• Our work: apply “group testing” to software integration in-the-large
• Goal: accelerate integration-defect root cause analysis
4. Background
• Group testing: screening World War II draftees for syphilis
• Method invented by Robert Dorfman, an economist
• Electronics testing (1960s-)
• Pooling designs in Biology (1990s-)
• Software Integration (modern day)
5. Related work
• Combinatorial Group Testing (Du and Hwang 2000)
• Pooling Designs in Biology (Du and Hwang 2006)
• Web services / Delta debugging (Zeller)
• Group Testing on Complexes vs. Isolated Components
• (More details in final report)
7. Problem statement
• Testing a group g of services means:
• Upgrade all services in g
• Perform an automated integration test
• Graph model of integration: each vertex corresponds to a service
[Figure: example graph with vertices A, B, C]
• Problem: Find the defective (“positive”) edge
• Assume only one bad edge (k-complexes for k=1)
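The model above can be made concrete with a small sketch. It assumes (this is an interpretation, not something the slides spell out) that a group test comes back positive exactly when both endpoints of the defective edge are upgraded together. The names make_group_test and group_test are hypothetical.

```python
from itertools import combinations


def make_group_test(bad_edge):
    """Build a simulated group test for one hidden defective edge (u, v).

    Assumption: the integration test fails (is "positive") only when both
    endpoints of the bad edge are in the upgraded group.
    """
    bad = frozenset(bad_edge)

    def group_test(group):
        # "Upgrade all services in g, then run the integration test."
        return bad <= set(group)

    return group_test


services = ["A", "B", "C"]
# Complete graph: every pair of services is a potential integration edge.
edges = list(combinations(services, 2))
group_test = make_group_test(("A", "C"))

print(edges)                         # [('A', 'B'), ('A', 'C'), ('B', 'C')]
print(group_test({"A", "B"}))        # False: C was not upgraded
print(group_test({"A", "B", "C"}))   # True: both A and C were upgraded
```

Under this reading, a positive result only says that the bad edge lies somewhere inside the tested group, which is why further tests are needed to isolate it.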
8. Integration cost metrics
1) Number of integration tests (traditional group testing)
2) Depth of recursion tree (inspired by pooling designs)
• More tests means more test machines required
• Deeper recursion tree means longer time to isolate integration defect
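One way to track both metrics in a simulation is a small meter that runs tests in parallel rounds. This is a sketch only: the CostMeter class, its field names, and the idea that all tests in a round run at the same time are assumptions made for illustration.

```python
from itertools import combinations


class CostMeter:
    """Counts integration tests and sequential rounds for a hidden bad edge."""

    def __init__(self, bad_edge):
        self._bad = frozenset(bad_edge)
        self.tests = 0          # metric 1: total number of integration tests
        self.depth = 0          # metric 2: sequential rounds (recursion depth)
        self.max_parallel = 0   # largest round, i.e. test machines needed at once

    def run_round(self, groups):
        """Run one round of group tests and return one outcome per group."""
        groups = [set(g) for g in groups]
        self.depth += 1
        self.tests += len(groups)
        self.max_parallel = max(self.max_parallel, len(groups))
        return [self._bad <= g for g in groups]


meter = CostMeter(bad_edge=(2, 5))
first = meter.run_round([{0, 1, 2, 3}, {2, 3, 4, 5}, {4, 5, 6, 7}])
# Follow up inside the positive group by testing each of its pairs.
second = meter.run_round([set(p) for p in combinations([2, 3, 4, 5], 2)])

print(first)    # [False, True, False]
print(second)   # [False, False, True, False, False, False]
print(meter.tests, meter.depth, meter.max_parallel)   # 9 2 6
```

The two counters pull in different directions: wider rounds need more machines but fewer sequential steps.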
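12. Bottom-up integration
[Figure: integration tree over components A, C, B, with one test result labeled “OK”]
Image from http://www.gigaflop.demon.co.uk/comp/chapt3.htm
13. Bottom-up integration
• Failure! ...what’s the root cause?
[Figure: same integration tree over components A, C, B, with three results labeled “OK” and one labeled “Failure!”]
Image from http://www.gigaflop.demon.co.uk/comp/chapt3.htm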
14. Natural divide-and-conquer
• Example:
• Given 64 services/components, divide into four subsets of size 16
• Test all (4 choose 2) = 6 pairs of subsets (6 edges), recurse on the positive edge
• Doesn’t scale as the branching factor grows: b subsets need (b choose 2) tests per level
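The example can be sketched as a loop over shrinking candidate sets. This is an illustrative reading of the slide, not the author's simulation code: the function names, the contiguous split, and the handling of a defect that falls inside a single subset (by intersecting all positive pairs) are assumptions.

```python
from itertools import combinations


def split(items, parts):
    """Split items into `parts` contiguous, nearly equal, non-empty chunks."""
    q, r = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + q + (1 if i < r else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def divide_and_conquer(services, group_test, branching=4):
    """Return (bad_edge, number_of_tests, depth) for a single hidden bad edge."""
    candidates = list(services)
    tests = depth = 0
    while len(candidates) > 2:
        chunks = split(candidates, min(branching, len(candidates)))
        positive = None
        # One test per pair of subsets: (b choose 2) tests at each level.
        for a, b in combinations(chunks, 2):
            union = set(a) | set(b)
            tests += 1
            if group_test(union):
                positive = union if positive is None else positive & union
        depth += 1
        candidates = [s for s in candidates if s in positive]
    return tuple(candidates), tests, depth


bad = frozenset((5, 40))
result = divide_and_conquer(range(64), lambda group: bad <= set(group))
print(result)   # ((5, 40), 30, 5)
```

For 64 services and this particular bad edge, the sketch needs five levels of six tests each. Raising the branching factor shortens the recursion but makes each level quadratically more expensive.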
21. Group Testing for 1-Complexes
• Shattr-style:
• Instead, perform group tests on larger, overlapping subsets
• ...recurse on intersection of positive subsets
• Example: 3 tests suffice to determine defective edge
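A minimal sketch of the overlapping-subsets idea follows. The construction is an assumption chosen for illustration and is not necessarily the scheme shown on the slide: the candidates are split into a few parts, each group test leaves out exactly one part (so any two groups overlap), and the search recurses on the intersection of the positive groups. The function name and the parts parameter are hypothetical.

```python
from itertools import combinations


def overlapping_search(services, group_test, parts=3):
    """Return (bad_edge, number_of_tests, depth) for a single hidden bad edge."""
    if parts < 3:
        raise ValueError("need at least 3 parts so every round shrinks the candidates")
    candidates = list(services)
    tests = depth = 0
    while len(candidates) > 2:
        b = min(parts, len(candidates))
        chunks = [candidates[i::b] for i in range(b)]   # b non-empty parts
        positive = set(candidates)
        # One test per part: upgrade everything except that part.
        for left_out in chunks:
            group = set(candidates) - set(left_out)
            tests += 1
            if group_test(group):
                positive &= group   # keep only the overlap of positive groups
        depth += 1
        candidates = [s for s in candidates if s in positive]
    return tuple(candidates), tests, depth


bad = frozenset((2, 7))
result = overlapping_search(range(9), lambda group: bad <= set(group))
print(result)   # ((2, 7), 6, 2)
```

In this construction each round costs one test per part rather than one test per pair of parts, so the per-round cost grows linearly with the branching factor.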
26. Simulation Results
A sample:
(n=128)   Time   # tests (avg.)
Tree      7      32
Shattr    4      28
(More details in report)
27. Contributions
• Novel application of “Group Testing for 1-Complexes” to Software Integration
• Proposed graph-theoretic model for Software Integration as a social system
• Introduced software service integration cost metrics
• Described several integration algorithms
• Implemented simulation in Ruby, more analysis in report
• Demonstrated potential way to find integration defects more quickly
• Literature review
28. Future work
• Remove restrictions: generalize to k-complexes
• More specific graph model: interactions do not generally form a complete graph
• More data-driven: integration defect count and probability
• Core theory: practical group testing on complexes relatively unstudied
• Prototyping in the wild: is group testing feasible in practice?