A Picture is Worth a Thousand Words
1. Visualization - A Picture Is Worth a Thousand Words
Mark Ott, Teradata
John Park, Qlik
Data Scientist, Teradata
2. Module Objectives
After completing this module, you will have exposure to:
• Teradata Aster Lens ™
• Tibco Spotfire™
• RStudio™
• Tableau™
• D3™
• Qlik™
3. Visualization makes data easier to digest
• Information overload and data glut are the problem. Visualization is the solution
• You can now see the hidden patterns and connections that matter most
• Use color and scale to see the forest for the trees
Do you see a pattern in these 4 data sets?
Patterns are much easier to see in a graph
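The point about four data sets can be illustrated with a small sketch. The slide does not name its data, so this assumes something like Anscombe's quartet: four data sets whose summary statistics are nearly identical even though their plots look completely different. The helper name `summarize` is illustrative only.

```python
from statistics import mean, variance

# Anscombe's quartet: sets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return (mean of x, mean of y, sample variance of x), rounded to 1 decimal."""
    return (round(mean(xs), 1), round(mean(ys), 1), round(variance(xs), 1))


summaries = {name: summarize(xs, ys) for name, (xs, ys) in quartet.items()}
for name, stats in summaries.items():
    print(name, stats)
```

The tables of numbers give matching summaries for all four sets; only a plot shows that one is linear, one is curved, and two are dominated by a single outlier.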
4. Aster Lens is our Visualization Web application
• It is an interactive Web application that allows Users to find, view and
share results from their nPathViz and cFilterViz queries
These are called Cards. Click a Card to open its chart.
These are Categories, where you group charts together. Where a chart lands depends on which table you point to during INSERT. See next slide.
• nPathViz is for Pattern Detection charts
• cFilterViz is for Collaborative Filter charts
5. Some of the Aster Lens Chart types
Sankey, Chord
6. nPathViz – Chord chart
Query: Display the most popular first two clicks on my Web page
INSERT into aster_lens.workshop
SELECT * from nPathViz
(on(SELECT * from retail_dept)
partition by 1
order by freq desc
graph_type('chord')
path_col('path')
frequency_col('freq')
directed('true')
title('Chord chart'));
[Figure callouts: Aster function, input table. Panels: Input, Output]
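The query above reads a table with a path column and a frequency column. As a rough sketch of how such rows could be derived for "first two clicks", the Python below counts two-click prefixes from a clickstream. The session data, the function name `first_two_click_paths`, and the bracketed path format are all assumptions for illustration; this is not how nPathViz itself works internally.

```python
from collections import Counter

# Hypothetical clickstream: session id -> ordered list of pages visited.
sessions = {
    "s1": ["home", "toys", "checkout"],
    "s2": ["home", "toys"],
    "s3": ["home", "books", "toys"],
    "s4": ["search"],  # fewer than two clicks, so it contributes no path
}


def first_two_click_paths(sessions):
    """Return (path, freq) rows for the first two clicks, most frequent first."""
    counts = Counter()
    for clicks in sessions.values():
        if len(clicks) >= 2:
            # Assumed path format: "[first, second]"
            counts["[" + ", ".join(clicks[:2]) + "]"] += 1
    return counts.most_common()


rows = first_two_click_paths(sessions)
for path, freq in rows:
    print(path, freq)
```

Each resulting row plays the role of one (path, freq) record that a chord chart could draw as a link between two pages.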
7. Creating Multiple Sankey charts with 1 statement
INSERT INTO aster_lens.cart_abandonment
SELECT * FROM nPathViz(
  ON aster_lens.npath_output_abandoned_shopping_order AS input
  PARTITION BY storeid
  graph_type('sankey') frequency_col('cnt') path_col('path')
  arguments('start_date=4/12/2013','end_date=4/30/2013','owner=ASTER','tags=Coupon Sale')
  title('CPN Shopping Order 1')
  subtitle('Sequence of items purchased - Coupon ''13')
  accumulate('storeid'));
[Figure: input rows and the Aster Lens output]
Note: the input has 2 StoreIDs, so the statement outputs 2 charts
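The "one chart per StoreID" behavior comes from partitioning the input by storeid. A minimal Python sketch of that grouping idea follows; the rows and the function name `charts_by_partition` are hypothetical and only mimic the shape of the input table (storeid, path, cnt).

```python
from collections import defaultdict

# Hypothetical input rows: (storeid, path, cnt).
rows = [
    (1, "[bread, milk]", 40),
    (1, "[milk, eggs]", 25),
    (2, "[bread, eggs]", 30),
]


def charts_by_partition(rows):
    """Group rows by storeid; each group would feed one chart."""
    charts = defaultdict(list)
    for storeid, path, cnt in rows:
        charts[storeid].append((path, cnt))
    return dict(charts)


charts = charts_by_partition(rows)
print(len(charts), "charts, one per distinct storeid")
```

With two distinct store IDs in the input, the grouping yields two separate sets of paths, which matches the two charts noted on the slide.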
8. nPathViz – Sankey chart
INSERT INTO aster_lens.workshop
SELECT * FROM nPathViz (
  ON (
    SELECT path, count(*) AS freq
    FROM (
      SELECT path, cnt, id,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY cnt DESC) AS reihe
      FROM npath (
        ON vt_tv PARTITION BY id ORDER BY ts
        mode(overlapping) pattern('a*.bb')
        symbols(tvshow <> 'BreakingBad' AS a, tvshow = 'BreakingBad' AS bb)
        result(accumulate(tvshow OF ANY (a, bb)) AS path,
               COUNT(* OF ANY (a, bb)) AS cnt,
               first(id OF ANY (a)) AS id)
        FILTER(first(ts + '20 minutes'::interval OF ANY (a)) > first(ts OF ANY (bb)))
      )
    ) AS a
    WHERE reihe = 1 GROUP BY 1 ORDER BY 2 DESC
  )
  PARTITION BY 1 ORDER BY freq DESC
  graph_type('sankey')
  frequency_col('freq') path_col('path')
  title('Channel Surf 20 minute before Breaking Bad'));
[Figure: input rows and the resulting Sankey chart output]
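The idea behind the query (collect what each viewer watched in the 20 minutes leading up to Breaking Bad, then count identical paths) can be sketched in plain Python. This is a simplified illustration, not a reimplementation of nPath: it only looks at each viewer's first Breaking Bad viewing, and the event data and the function name `surf_paths` are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def surf_paths(events, target="BreakingBad", window=timedelta(minutes=20)):
    """events: iterable of (viewer_id, timestamp, tvshow). Returns a Counter of paths."""
    by_viewer = {}
    for vid, ts, show in sorted(events, key=lambda e: (e[0], e[1])):
        by_viewer.setdefault(vid, []).append((ts, show))

    paths = Counter()
    for views in by_viewer.values():
        # Position of the viewer's first viewing of the target show, if any.
        idx = next((i for i, (_, s) in enumerate(views) if s == target), None)
        if idx is None:
            continue
        target_ts = views[idx][0]
        # Keep only earlier shows that started less than `window` before the target.
        lead_in = [s for ts, s in views[:idx] if target_ts - ts < window]
        paths[tuple(lead_in + [target])] += 1
    return paths


t0 = datetime(2013, 1, 1, 20, 0)
events = [
    ("u1", t0 - timedelta(minutes=30), "News"),
    ("u1", t0 - timedelta(minutes=15), "Sports"),
    ("u1", t0 - timedelta(minutes=5), "Comedy"),
    ("u1", t0, "BreakingBad"),
    ("u2", t0 - timedelta(minutes=10), "Sports"),
    ("u2", t0, "BreakingBad"),
    ("u3", t0, "News"),  # never watches the target show
]
paths = surf_paths(events)
for path, freq in paths.most_common():
    print(" -> ".join(path), freq)
```

The path tuples and their counts correspond to the path and freq columns that the Sankey chart consumes.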
10. Using Tibco Spotfire in a parameterized query
Want even more visualization options? How about parameterizing your queries?
For example, in Spotfire's pull-down menu, I pick 'Target Show' and then select '20' minutes.
The chart shows all TV shows watched in the 20 minutes before 'Breaking Bad'
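Parameterizing a query means the target show and the time window become inputs rather than hard-coded literals. The sketch below shows the general idea with Python's built-in sqlite3 module and named placeholders. It is a stand-in, not Spotfire's own mechanism, and the `views` table, its columns, and the function `shows_before` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical viewing log: viewer id, start time in minutes, show name.
conn.execute("CREATE TABLE views (id TEXT, minute INTEGER, tvshow TEXT)")
conn.executemany(
    "INSERT INTO views VALUES (?, ?, ?)",
    [
        ("u1", 0, "News"), ("u1", 45, "Sports"), ("u1", 60, "BreakingBad"),
        ("u2", 50, "Comedy"), ("u2", 60, "BreakingBad"),
    ],
)

# :target and :minutes are bound at run time, like picks from a pull-down menu.
QUERY = """
SELECT v.tvshow, COUNT(*) AS freq
FROM views AS v
JOIN views AS t ON t.id = v.id AND t.tvshow = :target
WHERE v.tvshow <> :target
  AND v.minute < t.minute
  AND t.minute - v.minute < :minutes
GROUP BY v.tvshow
ORDER BY freq DESC, v.tvshow
"""


def shows_before(conn, target, minutes):
    """Shows watched fewer than `minutes` minutes before `target`, with counts."""
    return conn.execute(QUERY, {"target": target, "minutes": minutes}).fetchall()


print(shows_before(conn, "BreakingBad", 20))
print(shows_before(conn, "BreakingBad", 90))
```

Changing the two arguments reruns the same query text with different values, which is the effect the pull-down menu selections have on the chart.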
11. Creating Charts with RStudio
RStudio is an open-source integrated development environment (IDE) for R,
a programming language for statistical computing and graphics.
To use RStudio with Aster, first download the TOASTER package, then point it to the Aster ODBC driver, and you are good to go.
When you highlight code and click the Run button, execution is complete when the > prompt reappears in the Console.
12. Creating a Scatterplot with RStudio
Let’s create a scatterplot comparing baseball strikeouts to walks (bases on balls) across 3 decades to see who is getting the upper hand (pitchers or hitters). We will use the 1950s as the benchmark.
13. Using Tableau, isolate bad Electric Car Batteries
Analyst finds increasing warranty costs
[Charts: costs by month, Jan to Jun (axis $0 m to $60 m); Warranty Costs, January vs. June (axis $0.0 m to $4.5 m); categories: Inventory, Warranty, Materials, Labor]
Need to investigate the root cause
> Need self-service access to all data
– Data warehouse and Hadoop
> Must be able to join between data stores
> Support for multi-structured data
– Combine fixed schema and variable schema data
> Must support fast, iterative processing
14. Tableau - Where are the bad car batteries coming from?
Lot 4102 has bad batteries
15. Using d3
Another open-source visualization library, written in JavaScript
16. Using d3 to view 100 Charts simultaneously
Lot 4102 has bad batteries
17. Qlik – Differentiators That Matter
• Association – broader, more flexible application
• Exploration – unparalleled navigation
• Search – flexible and powerful
• Real-time collaboration
All of this in an Intuitive and Fast Interface
Better Insights = Greater Business Value
18. Using Qlik – Find where my customers are churning Telecom contracts
• Easy integration with maps: polygon and point maps
• Use color gradients to show a measure graphically
19. Using Qlik – Use an interactive Sankey to ask what paths customers take when churning
• Integrate advanced visualizations such as Sankey
• Interactive visualization for drill-down
Set up a data source to return only information applicable for a certain user or group
This is the workflow of the process. We see two passes through the analysis. The first produces a very generalized result. The second, or last pass, produces the specific Golden Pathway toward cancellation. Both processes are identical except for the parameters supplied to nPath. No complex programming required.
On the left we see the 3-way join of the tables which then feed directly into nPath. Next there is a small amount of SQL processing followed by the Pathmap SQL-MR function. Pathmap prepares the data for visualization.
In the real world, and in this case, there are many iterations between the first and last pass. With the technologies commonly available today, each iteration may be a project of its own, hence the high cost of this kind of analysis. Here, the iterations took a few days and were done by a business analyst, not an engineer. No project required.