3. Get involved: explore everything PASS has to offer
• Free online webinar events
• Free 1-day local training events
• Local user groups around the world
• Online special interest user groups
• Business analytics training
• Free Online Resources
• Newsletters
• PASS.org
4. Session evaluations
Your feedback is important and valuable.
3 Ways to Access:
• Download the GuideBook App and search: PASS Summit 2018
• Follow the QR code link displayed on session signage throughout
the conference venue and in the program guide
• Go to passSummit.com
Submit by 5pm Friday, November 16th to win prizes.
5. Ike Ellis
• MVP & Partner, Crafting Bytes
• Author of Developing Azure Solutions
• MVP for 8 years
• Speaker at PASS Summit, SQL Bits, Redgate SQL in the City, and
many other events
• @ike_ellis
• Teaches courses on SQL Server, SQL Development, Business
Intelligence, Azure Cloud, and Power BI
6. Agenda
• Why do we test? What’s the purpose?
• What is an Enterprise Data Architecture?
• What are the challenges of testing in an EDA?
• What solutions exist for those challenges?
• What challenges will likely always be there?
• What are some best practices in general to work within the
constraints we have?
7. Purpose of automatic testing
• To verify that changes to a production system do not break
earlier expected work
• To preserve intent and self-document functionality
• To automatically deploy a system once all tests successfully
pass
• To increase our rate of change and match the pace the
business expects from us
• To test things unattended so the QA department is not a
bottleneck for product delivery
9. What’s better at preserving intent?
Comments, Source Control Comments, or
Tests?
• Source control comments are done AFTER the changes, not
before or with the changes
• Comments are often outdated or ignored, and the compiler
can’t check them
• Tests fail when intent is violated, pointing to the code that
failed, making it inviolate
10. Testing: Rate of change
[Figure: time it takes to make a change, plotted against time since project began]
11. QA Dept as Bottleneck
• Over time, developers can outpace a QA team. As more
application surface area is built, there is more to test
completely to see the impact of each change.
• Even the smallest changes begin having the largest impact
12. Enterprise Data Architecture
• Data Mart/Data Warehouse ODS
• Data Lake Architecture
• Lambda Architecture
• Kappa Architecture
• IoT Architecture
18. What do all of these architectures have in
common?
• Some move data more than others, but there is always an
element of moving data from one place to the other
• They have processes that drastically change the data so that
it doesn’t resemble what it originally looked like
• They store the exact same data several times. I’ve seen
company EDAs that store the same source data seven times
before it arrives at the final destination
• I’ve seen companies that store different views of the same
aggregation: Weekly, Monthly, Quarterly, MTD, YTD, YOY
19. What are the common challenges of
testing these environments?
Why do so many companies fail at automated testing and give
up?
20. Well, it’s not their fault, and it’s not yours
either
• EDAs have some insane complexity
• Most EDAs have all the things that software developers
avoid when writing code
22. Problem #2: Side-Effects
Stored procedures always violate the DRY
principle: Don’t repeat yourself
They do this for a variety of reasons:
• It is difficult to pass tables of data between
procedures
• It is easy to load a temp table up and
perform 100s of operations against it
• Stored procedures don’t have any of the
components necessary to avoid repetition
like inheritance, encapsulation,
polymorphism, arrays, foreach loops, etc.
All of this means we end up getting very large
stored procedures with 1000s of lines that
comprise a lot of our ETL
23. Problem #3: Queries take a long time to run
• If a suite of tests takes hours to
run, then we can’t run them very
often.
• We will never have good test
coverage of our architecture if
we can’t run the tests in between
changes to the system
• Queries time out
• Queries against source systems
are a bad idea, meaning we can’t
tell if we got all the data with
100% accuracy
24. Problem #4: Source systems don’t really want
us there
If we’re pulling data from transactional systems, testing against
the OLTP database creates a load. Create too much, and we’ll
get kicked off the source, pronto.
25. Problem #5: All of these EDAs have a lot of
repetitive data. Where do we test?
• Do we test data from source to staging? From staging to
ODS? From source to ODS? From source to DW?
• With all the repetition, we might have the manufacturing
problem that Taiichi Ohno railed against while inventing lean
manufacturing and Just-in-Time
26. Problem #6: Accuracy is deadly important
• Test data is not always available
• Production data must be 100% reconciled and accurate
• Cash flow statements, reports to the board, SEC filings,
auditing records, etc.
• No mistakes means no mistakes
27. Remember our goals with testing
• Did a change break the system?
• Are we preserving existing, needed functionality?
• Can we deploy automatically when our tests pass?
28. OK, that’s the bad news, now what can we
do about it?
• Testing Strategies
• Testing Tools
• Testing Best Practices
29. Testing Strategies
• Are we all looking at the same requirements?
• Write a test plan, execute it!
• 100% test coverage is impossible, so what will we test?
• Where will we test?
• Where does it tend to break? Are we recording our outages
and bugs?
• Stay focused
30. It’s not a perfect world, but here are some
things we can do
• tSQLt
• Approval Tests
• Except Query Replacement Testing
• QueryApprovals
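One way to read "Except Query Replacement Testing" is: when an existing query is replaced by a rewritten one, run both and use EXCEPT in each direction to confirm they return the same rows. The sketch below illustrates that reading using Python's built-in sqlite3 module. The `sales` table, the three queries, and the helper names `make_demo_db` and `query_diff` are hypothetical and chosen only for illustration; they are not taken from the talk.

```python
import sqlite3

# Hypothetical queries, for illustration only.
LEGACY_SQL = "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"

# A rewrite that is meant to be equivalent to the legacy query.
REWRITE_OK_SQL = """
    SELECT DISTINCT o.region,
           (SELECT SUM(i.amount) FROM sales AS i WHERE i.region = o.region) AS total
    FROM sales AS o
"""

# A rewrite with a bug: it silently drops refunds (negative amounts).
REWRITE_BUGGY_SQL = (
    "SELECT region, SUM(amount) AS total FROM sales WHERE amount > 0 GROUP BY region"
)


def make_demo_db():
    """Build a small in-memory database with an assumed sales table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sales (region TEXT, amount INTEGER);
        INSERT INTO sales VALUES ('West', 100), ('West', -20), ('East', 50);
    """)
    return conn


def query_diff(conn, old_sql, new_sql):
    """Return (rows only in old, rows only in new) using EXCEPT both ways."""
    only_in_old = conn.execute(
        f"SELECT * FROM ({old_sql}) EXCEPT SELECT * FROM ({new_sql})"
    ).fetchall()
    only_in_new = conn.execute(
        f"SELECT * FROM ({new_sql}) EXCEPT SELECT * FROM ({old_sql})"
    ).fetchall()
    return only_in_old, only_in_new


if __name__ == "__main__":
    conn = make_demo_db()
    print(query_diff(conn, LEGACY_SQL, REWRITE_OK_SQL))     # both lists empty
    print(query_diff(conn, LEGACY_SQL, REWRITE_BUGGY_SQL))  # West differs
```

EXCEPT compares sets of rows, so it does not notice when the two queries differ only in how many times a duplicate row appears. Pairing it with a row-count check covers that gap.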
38. Best practices for data testing
• Test row counts
• Test random and sampled data elements
• Start with thorough manual testing and then slowly
automate that
• Alert in your audit log
• Test that records are streaming
• Make testing fast!
• Deleting tests is essential!
• It’s absolutely ok to delete tests that aren’t working for you. It is not
OK to give up on testing.
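The first two practices above (row counts and sampled data elements) can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the tables `staging_customer` and `dw_customer`, and the helpers `row_count` and `sample_mismatches`, are hypothetical, and sqlite3 stands in for whatever engine the real pipeline uses.

```python
import random
import sqlite3


def row_count(conn, table):
    """Count rows in a table. Table names come from the test author, not user input."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def sample_mismatches(conn, source, target, key, sample_size, seed=0):
    """Pick random keys from source and compare each full row against target.

    Returns a list of (key, source_row, target_row) for rows that differ.
    Loading every key is fine for a demo; on a large table the sampling
    would be pushed into the database instead.
    """
    keys = [r[0] for r in conn.execute(f"SELECT {key} FROM {source} ORDER BY {key}")]
    rng = random.Random(seed)  # fixed seed so a failing sample can be reproduced
    chosen = rng.sample(keys, min(sample_size, len(keys)))
    mismatches = []
    for k in chosen:
        src = conn.execute(f"SELECT * FROM {source} WHERE {key} = ?", (k,)).fetchone()
        tgt = conn.execute(f"SELECT * FROM {target} WHERE {key} = ?", (k,)).fetchone()
        if src != tgt:
            mismatches.append((k, src, tgt))
    return mismatches


def make_demo_db():
    """Build two identical hypothetical tables with 100 rows each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging_customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE dw_customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
    rows = [(i, f"customer-{i}") for i in range(1, 101)]
    conn.executemany("INSERT INTO staging_customer VALUES (?, ?)", rows)
    conn.executemany("INSERT INTO dw_customer VALUES (?, ?)", rows)
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    # Cheap check first: do the counts agree?
    print(row_count(conn, "staging_customer") == row_count(conn, "dw_customer"))
    # Then spot-check a handful of rows in detail.
    print(sample_mismatches(conn, "staging_customer", "dw_customer", "customer_id", 10))
```

Row counts are cheap and catch dropped or duplicated loads. The sampled comparison catches rows that arrived but were changed along the way, without paying for a full table comparison on every run.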
39. Report Testing
• SSRS Testing
• Power BI Testing
• Push logic into the database
• Push coloring into the database
• Test at the database
• JPG/PDF testing is still not mature enough to use
• Test report data sizes and subscription dates
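As an illustration of "push coloring into the database" and "test at the database": if the report's status color is computed in a view rather than in the report definition, it can be asserted with an ordinary query. The view `report_region_sales`, its columns, and the 90% threshold below are assumptions made up for this sketch; sqlite3 is used only to keep it self-contained.

```python
import sqlite3


def make_demo_db():
    """Build a hypothetical sales table plus the view a report would read from."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE region_sales (region TEXT, actual INTEGER, target INTEGER);
        INSERT INTO region_sales VALUES
            ('East', 120, 100),
            ('West', 95, 100),
            ('North', 60, 100);

        -- The coloring rule lives here, not in the report layout.
        -- Assumed rule: at or above target is green, within 90% is yellow.
        CREATE VIEW report_region_sales AS
        SELECT region, actual, target,
               CASE
                   WHEN actual >= target THEN 'green'
                   WHEN actual * 10 >= target * 9 THEN 'yellow'
                   ELSE 'red'
               END AS status_color
        FROM region_sales;
    """)
    return conn


def status_colors(conn):
    """Return {region: status_color} exactly as the report would receive it."""
    return dict(conn.execute("SELECT region, status_color FROM report_region_sales"))


if __name__ == "__main__":
    print(status_colors(make_demo_db()))
```

With the rule in the view, the report only has to map a color name to a fill, and the logic that can actually be wrong is covered by a fast database test instead of image or PDF comparison.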
40. Develop a culture of testing
There are teams that talk about testing in every sentence
There are teams that treat testing as a burden or someone
else’s problem
Guess who writes better products, has happier users, and enjoys
making changes to their own system?
Remember change lock. It will make you want to quit your job.
Write tests to avoid change lock