2. Data Expertise / Lynn Langit
• Industry awards
– Microsoft – MVP for SQL Server
– Google – GDE for Cloud Platform
– 10Gen – Master for MongoDB
• Practicing Architect
• Technical author / trainer
– Pluralsight – Google Cloud Series
– DevelopMentor – SQL Server Series
– 2 books on SQL Server BI
• Former MSFT FTE
– 4 years
4. BigData, NoSQL… => No Microsoft?
Big Data => keeping / getting more data
• Cheap Storage
• Cloud Storage
• Open Source data projects (Hadoop)
NoSQL => schema-lite, scalable storage
• NoSQL data projects
• Mostly open source
• Sharded replicas
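A minimal sketch of what "sharded replicas" can mean in practice, assuming a simple hash-and-ring placement. The function names, the key format, and the node counts are hypothetical and do not describe how any particular NoSQL product places data.

```python
import hashlib
from typing import List


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash (illustrative scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def replicas_for(key: str, shard_count: int, replica_count: int) -> List[int]:
    """Return the primary shard plus the next nodes on the ring as replicas."""
    primary = shard_for(key, shard_count)
    return [(primary + i) % shard_count for i in range(replica_count)]


# Hypothetical cluster: 4 nodes, each key stored on 3 of them.
placement = replicas_for("customer:42", shard_count=4, replica_count=3)
print(placement)
```

The same key always hashes to the same nodes, so reads and writes can be routed without a central lookup table.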
5. In a (Open Source) Relationship?
NoSQL
• MongoDB
• Neo4j
• Riak
• Cassandra
Hadoop
Cloud
• AWS
• Heroku
• RackSpace
• OpenStack
12. Database Lifecycle Management
• Evaluating current processes
• Improving processes
• Adding new tools
– SSDT
• Data synchronization processes
13. Storing the data
Relational
• SQL Server – can use partitioning for scalability
Beyond relational via relational
• Specialized data types
• XML, Hierarchy, Filestream/Filetable, Geospatial
• Columnstore index
Multi-dimensional / in-memory
• OLAP cubes / Mining Models
• Tabular models
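A conceptual sketch of why a columnstore index helps aggregate queries: data is laid out per column, so a scan touches only the column it needs. This is plain Python with made-up sample rows, not SQL Server's actual storage format.

```python
# Row-oriented layout: one record per order (sample data is invented).
rows = [
    {"order_id": 1, "region": "West", "amount": 120.0},
    {"order_id": 2, "region": "East", "amount": 80.0},
    {"order_id": 3, "region": "West", "amount": 50.0},
]

# Column-oriented layout: one list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Row store: every full record is visited to read one field.
total_row_store = sum(row["amount"] for row in rows)

# Column store: only the "amount" column is scanned.
total_column_store = sum(columns["amount"])

print(total_row_store, total_column_store)
```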
14. Big Data in SQL Server 2012 – Relational Enhancements
DEMO
COLUMNSTORE, XML, FILETABLE
18. Types of Data Quality Projects
T-SQL scripts (boolean match)
• Exact matches: WHERE =, WHERE <>, WHERE IN
• LIKE string matching with %
Full-text matching (semantic word match)
• CONTAINS
Semantic Search (semantic phrase match)
• SEMANTICSIMILARITYTABLE
SSIS tasks (transactional, multi-valued matching)
• Listed in the notes at the end
DQS (KB matching)
• Knowledge base rules / matches
• Data quality project: clean / correct data
MDS (One view of truth)
• Versioned Entities, Attributes and Rules
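The boolean-match predicates above can be tried out with a small sketch. It uses Python's built-in sqlite3 as a stand-in for SQL Server, so the table name and sample rows are invented, and CONTAINS and Semantic Search are not shown because SQLite has no equivalent built in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT)")
conn.executemany(
    "INSERT INTO customers (name) VALUES (?)",
    [("Acme Corp",), ("Acme Corporation",), ("ACME corp",), ("Apex Ltd",)],
)


def names(sql, params=()):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql, params)]


# Exact matches: =, <> and IN compare whole values.
exact = names("SELECT name FROM customers WHERE name = ? ORDER BY name", ("Acme Corp",))
not_equal = names("SELECT name FROM customers WHERE name <> ? ORDER BY name", ("Acme Corp",))
in_list = names(
    "SELECT name FROM customers WHERE name IN (?, ?) ORDER BY name",
    ("Acme Corp", "Apex Ltd"),
)

# LIKE with % matches any run of characters after the prefix.
# SQLite's LIKE ignores ASCII case by default; in SQL Server this depends on collation.
prefix = names("SELECT name FROM customers WHERE name LIKE ? ORDER BY name", ("Acme%",))

print(exact, not_equal, in_list, prefix)
```

Exact matching finds only one spelling of the customer, while LIKE picks up the variants, which is the gap the later rows of the list (full-text, SSIS, DQS) are meant to close.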
19. Data Presentation
• View-only client
• View & manipulate (hide-only) client
• View & query (aggregate) client
• View & query (drill through) client
• View & mash-up (add new data) client
• View & update client
• Timeliness of data (latency)
• Beauty of data
20. But, does it work in Excel?
• Clean up data with Data Quality Services
• Extract-Transform-Load with Data Explorer
• Authorize with Master Data Services
• Mash-up data with PowerPivot
• Mine with 3rd party – Predixion
• Import data – including Hadoop via ODBC
21. From Pivot tables to Visualized Data Mash-ups with Mining
DEMO
THE POWER OF EXCEL
22. What about the UDM?
• UDM / Data Mining is fully supported in SSAS
• Must be installed in this mode
– Mutually exclusive to Tabular mode
• But, should you use it anymore?
23. Big Data in SQL Server 2012
– Non-Relational Features
DEMO
TABULAR MODELS
DATA MINING
27. Data Fluency and Job Roles
Consumer
• View and understand
Analyzer
• View, manipulate and decide
Cleaner
• Validate and update
Artist
• Visualize and present
28. BigData in SQL Server 2012
Relational engine
• Scaling via
– Partitioning for tables, indexes
– PDW
• Columnstore indexes
• Special data types
– XML, Hierarchy, Filetable
Analysis Services engines
• OLAP Cubes
• Tabular Models
• Data Mining Models
Other services
• Data Quality Services
• Master Data Services
• StreamInsight
29. Other Data Services from Microsoft
• Windows Azure Marketplace
• SQL Azure
• Data Explorer
• Power Pivot
30. NoSQL – New Products / Betas
• Semantic Search
• SSRS on Azure
• PowerView
• HDInsight (Hadoop on Azure)
• Cloud-based Data Explorer
33. www.TeachingKidsProgramming.org
• Free Courseware (recipes)
• Do a Recipe, Teach a Kid (ages 10+)
• Java or Microsoft SmallBasic
• C# on Pluralsight
34. Toward Data Craftsmanship…
Follow me
• @LynnLangit
• www.LynnLangit.com
• YouTube - SoCalDevGal
Hire me
• To help build your BI/Big Data solution
• To teach your team next gen BI
• To learn more about using NoSQL solutions
Editor's notes
SSIS Tasks
• Lookup transformation (this for that, substitutions)
• Cache transformation (multiple lookups)
• Fuzzy Lookup (lookup based on threshold matching)
• Fuzzy Grouping (grouping based on thresholds)
• Data Mining Query (based on mining model algorithms)
• DQS Cleansing (uses a KB)
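The idea behind a threshold-based lookup, as in the Fuzzy Lookup note, can be sketched in a few lines. This uses Python's difflib similarity ratio as a stand-in; SSIS uses its own matching algorithm, and the function name, reference list, and 0.8 threshold here are assumptions for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_lookup(value, reference, threshold=0.8):
    """Return (best_match, score); best_match is None if below the threshold."""
    best, best_score = None, 0.0
    for candidate in reference:
        score = SequenceMatcher(None, value.lower(), candidate.lower()).ratio()
        if score > best_score:
            best, best_score = candidate, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score


# Hypothetical reference table of clean city names.
reference = ["Seattle", "Portland", "San Diego"]

close_match, close_score = fuzzy_lookup("Seatle", reference)
no_match, low_score = fuzzy_lookup("Zurich", reference)
print(close_match, no_match)
```

Raising the threshold trades fewer false matches for more unmatched rows that need manual review.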
Comparison of features from MSDN -- http://msdn.microsoft.com/en-us/library/hh212940(v=sql.110).aspx