© 2015 IBM Corporation
Using Bluemix and dashDB for Twitter Analysis
Session # 1824
Torsten Steinbach @torsstei
Please Note:
• IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion.
• Information regarding potential future products is intended to outline our general product direction and it should not be relied on in
making a purchasing decision.
• The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any
material, code or functionality. Information about potential future products may not be incorporated into any contract.
• The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
Performance is based on measurements and projections using standard IBM benchmarks in a
controlled environment. The actual throughput or performance that any user will experience will vary
depending upon many factors, including considerations such as the amount of multiprogramming in the
user’s job stream, the I/O configuration, the storage configuration, and the workload processed.
Therefore, no assurance can be given that an individual user will achieve results similar to those stated
here.
dashDB
IBM Insights for Twitter Service in Bluemix
Query exactly the data that
your social application needs.
Get IBM analytics enrichments
in addition to base Twitter data.
Whenever needed, check
whether previously received
Tweets are still valid
(compliance).
Ingest, enrich, curate,
govern Decahose
data over time.
Receive & process
compliance events.
Social Application using the IBM Insights for Twitter Service
Diagram labels:
Social Application
IBM Insights for Twitter Service: Search over enriched Decahose Data
IBM Insights for Twitter System on SoftLayer
Twitter Data enriched through IBM Analytics
Twitter GNIP APIs
Store and index up to 2-year history of enriched Tweets, point in time compliant
PowerTrack
collection rules &
filters.
Queries
keyword
  Matches tweets that have "keyword" in their body. The search is case-insensitive.
  Example: cat
"exact phrase match"
  Matches tweets that contain the exact keyword sequence <"exact", "phrase", "match">.
  Example: "cats and dogs"
#hashtag
  Matches tweets with the hashtag "#hashtag".
  Example: #insight2014
from:twitterHandle
  Returns tweets from authors with the preferredUsername twitterHandle. Must not contain the @ sign.
  Example: from:alexlang11
followers_count:lower
followers_count:lower,upper
  Matches tweets of authors that have at least "lower" followers. The upper bound is optional and both limits are inclusive.
  Example: followers_count:500
posted:startTime
posted:startTime,endTime
  Matches tweets that have been posted at or after "startTime". The "endTime" bound is optional, and is inclusive.
  Timestamps have to be in one of the following two formats:
    yyyy-mm-dd
    yyyy-mm-dd'T'HH:MM:SS'Z'
  Timezone is UTC.
  Example: posted:2014-12-01T00:00:00Z,2014-12-12T00:00:00Z
The query language mimics the Gnip PowerTrack query language; only a subset of PowerTrack operators is available. See the documentation in Bluemix as we roll out more query operators.
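The timestamp and range rules above are easy to get wrong by hand, so here is a minimal Python sketch of helpers that format the `posted:`, `followers_count:` and `from:` terms. The function names are hypothetical (they are not part of any IBM client library), and treating naive datetimes as UTC is an assumption of this sketch.

```python
from datetime import datetime, timezone
from typing import Optional


def format_timestamp(ts: datetime) -> str:
    """Render a datetime in the yyyy-mm-dd'T'HH:MM:SS'Z' form, in UTC."""
    if ts.tzinfo is None:
        # Assumption of this sketch: naive datetimes are already in UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def posted(start: datetime, end: Optional[datetime] = None) -> str:
    """Build a posted: term; the end bound is optional and inclusive."""
    term = "posted:" + format_timestamp(start)
    if end is not None:
        term += "," + format_timestamp(end)
    return term


def followers_count(lower: int, upper: Optional[int] = None) -> str:
    """Build a followers_count: term; both bounds are inclusive."""
    term = f"followers_count:{lower}"
    if upper is not None:
        term += f",{upper}"
    return term


def from_user(handle: str) -> str:
    """Build a from: term; the handle must not contain the @ sign."""
    return "from:" + handle.lstrip("@")
```

For example, `posted(datetime(2014, 12, 1), datetime(2014, 12, 12))` produces a term with zero-padded dates and no whitespace, which matters because whitespace between terms is treated as AND.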
Boolean Operators
Operator precedence: “-” is stronger than “AND”, and “AND” is stronger than “OR”. You can (and should) use parentheses to make operator precedence explicit.
Example: ibm twitter -(lame OR boring) searches for tweets that contain both the terms “ibm” and “twitter” but neither “lame” nor “boring”.
Query terms
All of the following query terms can be freely combined with the boolean operators introduced above, e.g. ibm apple followers_count:500
Operator, example(s), description:
term1 AND term2
  Examples: cat dog; cat AND dog; #cutecat food
  Returns tweets that contain both term1 and term2. Whitespace between two terms is treated as AND, so the operator can be omitted.
term1 OR term2
  Example: #money OR broke
  Returns tweets that contain either term1 or term2.
-term1
  Example: ibm -apple
  Returns tweets that do not contain term1.
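Since the slide recommends explicit parentheses, a small Python sketch can make that the default when queries are assembled in code. The helper names below are hypothetical, and the parenthesisation rule in `exclude` is deliberately simple (it only checks the first character).

```python
def all_of(*terms: str) -> str:
    """AND: whitespace between terms is treated as AND by the service."""
    return " ".join(terms)


def any_of(*terms: str) -> str:
    """OR: always parenthesised so precedence never has to be guessed."""
    return "(" + " OR ".join(terms) + ")"


def exclude(term: str) -> str:
    """Negation: wrap multi-word terms so '-' applies to the whole group."""
    if " " in term and not term.startswith(("(", '"')):
        term = f"({term})"
    return "-" + term


# Rebuilds the example from the slide: ibm twitter -(lame OR boring)
query = all_of("ibm", "twitter", exclude(any_of("lame", "boring")))
print(query)
```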
Count: /messages/count?q=QUERY
• Use to find out how many Tweets match a given query
HTTP code 200:
  Number of results at json_path("search.results")
  URL to retrieve documents at json_path("related.search.href")
  Note: add your client_id and your client_secret to this URL
  Example response:
  {
    "search": { "results": 21695 },
    "related": { "search": { "href": "https://server.bluemix.net/api/v1/messages/search?q=ibm" } }
  }
HTTP code 4xx:
  There was a problem with your query. Please have a look at json_path("error") to identify the problem.
HTTP code 5xx:
  There was a problem with the service. Please have a look at json_path("error") and contact support.
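A minimal Python sketch of handling the count response described above. The host is copied from the example response and will differ per service instance; the function names are hypothetical, and how client_id and client_secret are attached to the URL is not specified in the slide, so it is left out here.

```python
import json
from typing import Optional, Tuple
from urllib.parse import urlencode

# Host taken from the example response; an assumption, not a fixed endpoint.
BASE_URL = "https://server.bluemix.net/api/v1"


def count_url(query: str) -> str:
    """Build the /messages/count URL with a properly encoded query."""
    return f"{BASE_URL}/messages/count?" + urlencode({"q": query})


def parse_count_response(status: int, body: str) -> Tuple[int, Optional[str]]:
    """Return (number of matches, URL for retrieving the documents)."""
    doc = json.loads(body)
    if status != 200:
        # 4xx and 5xx responses carry details under "error".
        raise RuntimeError(f"HTTP {status}: {doc.get('error')}")
    results = doc["search"]["results"]
    href = doc.get("related", {}).get("search", {}).get("href")
    return results, href
```

Keeping the parsing separate from the HTTP call makes it possible to exercise the 200, 4xx and 5xx paths without network access.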
Search: /messages/search?q=QUERY&size=NUMBER
• Search & retrieve <= NUMBER Tweets matching QUERY
HTTP code 200:
  Number of overall results at json_path("search.results")
  First batch of results at json_path("tweets")
  URL to retrieve the next batch of documents (if available) at json_path("related.next.href")
  Note: add your client_id and your client_secret to this URL
  Example response (abbreviated):
  { "search": { "results": 16283624 },
    "tweets": [ { "message": {
      …
      "body": "this is a nice tweet"
      …
      "actor": { "followersCount": 456,
                 "displayName": "IBM Tweeter"
      …
      "cde": {
        "sentiment": { "polarity": "POSITIVE" ...
        "author": { "gender": "male" …
  }
HTTP code 4xx:
  There was a problem with your query. Please have a look at json_path("error") to identify the problem.
HTTP code 5xx:
  There was a problem with the service. Please have a look at json_path("error") and contact support.
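Because each search response carries the URL of the next batch, retrieving a full result set is a loop over `related.next.href`. The Python sketch below shows one way to do that; `iter_tweets` and the injected `fetch_json` callable are hypothetical names, and the HTTP call itself (including credentials) is left to the caller.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_tweets(first_url: str,
                fetch_json: Callable[[str], Dict],
                limit: Optional[int] = None) -> Iterator[Dict]:
    """Yield tweets batch by batch, following related.next.href."""
    url: Optional[str] = first_url
    seen = 0
    while url and (limit is None or seen < limit):
        page = fetch_json(url)  # caller performs the GET and decodes the JSON
        for tweet in page.get("tweets", []):
            if limit is not None and seen >= limit:
                return
            yield tweet
            seen += 1
        # The next-batch URL is only present if more results are available.
        url = page.get("related", {}).get("next", {}).get("href")
```

Passing the fetch function in keeps the paging logic testable with canned pages, and the optional `limit` stops early without requesting batches that are not needed.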
Example Queries
• Get Tweets about an upcoming movie for a given time frame to sense interest &
reactions to trailer:
search?q="posted:2015-02-01T00:00:00Z AND #starwars"&size=5
• Get Tweets with positive/negative sentiment about a product to learn what
customers like / dislike about the product:
search?q="IBM Bluemix sentiment:positive"
• Get Tweets about a product being marketed and compare over time to sense
audience reaction to the campaign:
search?q="posted:2015-02-01T00:00:00Z,2015-02-15T00:00:00Z AND #IBM"
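When these queries are sent over HTTP, characters such as `#`, `:` and spaces need percent-encoding; an unencoded `#` would otherwise start a URL fragment and cut the query short. Here is a small Python sketch using only the standard library. The function name is hypothetical, and it assumes the double quotes around the example queries are presentational rather than part of the query.

```python
from typing import Optional
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def search_path(query: str, size: Optional[int] = None) -> str:
    """Build the relative search URL with a percent-encoded query."""
    params = {"q": query}
    if size is not None:
        params["size"] = size
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return "search?" + urlencode(params, quote_via=quote)


path = search_path("posted:2015-02-01T00:00:00Z AND #starwars", size=5)
print(path)

# Round trip: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(path).query)
print(decoded["q"][0])
```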
Built-in Tool to load Tweets to dashDB
R & Python for dashDB
dashDB
Predictive Analytics With R In dashDB 1/3
• Built-in R runtime & RStudio
• ibmdbR package
▪ Data frames logically representing data physically residing in dashDB tables
> con <- idaConnect("BLUDB", "", "")
> idaInit(con)
> sysusage<-ida.data.frame('DB2INST1.SHOWCASE_SYSUSAGE')
> systems<-ida.data.frame('DB2INST1.SHOWCASE_SYSTEMS')
> systypes<-ida.data.frame('DB2INST1.SHOWCASE_SYSTYPES')
▪ Push down of R data preparation to dashDB
> sysusage2 <- sysusage[sysusage$MEMUSED>50000,c("MEMUSED","USERS")]
> mergedSys<-idaMerge(systems, systypes, by='TYPEID')
> mergedUsage<-idaMerge(sysusage2, mergedSys, by='SID')
▪ Push down of analytic algorithms to in-db execution
> lm1 <- idaLm(MEMUSED~USERS, mergedUsage)
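To make the idea of a data frame that only logically represents a database table more concrete, here is a toy Python sketch. It is not how ibmdbR is implemented: `LazyTable` is a hypothetical class, SQLite stands in for dashDB, and the rows are made up. The point is that filtering and column selection are recorded and turned into SQL, so only the matching rows leave the database.

```python
import sqlite3
from typing import List, Optional, Tuple


class LazyTable:
    """Hypothetical stand-in for a data frame whose rows stay in the database."""

    def __init__(self, conn: sqlite3.Connection, table: str,
                 columns: Tuple[str, ...] = ("*",),
                 where: Optional[str] = None):
        self.conn, self.table = conn, table
        self.columns, self.where = columns, where

    def filter(self, condition: str) -> "LazyTable":
        # Nothing is read here; the condition is only recorded.
        where = condition if self.where is None else f"({self.where}) AND ({condition})"
        return LazyTable(self.conn, self.table, self.columns, where)

    def select(self, *columns: str) -> "LazyTable":
        return LazyTable(self.conn, self.table, columns, self.where)

    def to_sql(self) -> str:
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.where:
            sql += f" WHERE {self.where}"
        return sql

    def fetch(self) -> List[Tuple]:
        # Only now does a query run inside the database.
        return self.conn.execute(self.to_sql()).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SHOWCASE_SYSUSAGE (SID INTEGER, USERS INTEGER, MEMUSED INTEGER)")
conn.executemany("INSERT INTO SHOWCASE_SYSUSAGE VALUES (?, ?, ?)",
                 [(1, 300, 60000), (2, 34, 5015)])  # made-up rows

# Mirrors sysusage[sysusage$MEMUSED>50000, c("MEMUSED","USERS")] from the slide.
sysusage2 = LazyTable(conn, "SHOWCASE_SYSUSAGE").filter("MEMUSED > 50000").select("MEMUSED", "USERS")
print(sysusage2.to_sql())
print(sysusage2.fetch())
```

The sketch interpolates conditions directly into SQL for brevity; real code would need to guard against injection.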
Diagram labels: R Runtime, Browser, Any R Runtime, ibmdbR, RStudio, REST Client, REST
Predictive Analytics With R In dashDB 2/3
▪ Dynamite-native implementation of statistical functions
• colnames, cor, cov, dim, head, length, max, mean, min, names, print, sd, summary, var
▪ Logically derived columns pushed down to Dynamite
> myDF <- ida.data.frame('DB2INST1.SHOWCASE_SYSUSAGE')
> myDF$MemPerUser <- myDF$MEMUSED / myDF$USERS
▪ Sampling of tables in Dynamite
> idaSample(myDF, 3)
SID DATE USERS MEMUSED ALERT MemPerUser
1 8 2014-02-14 23:39:00.000000 34 5015 f 147
2 5 2014-01-22 07:52:00.000000 96 11512 f 119
3 7 2013-09-12 05:17:00.000000 39 5592 t 143
▪ Statistics about tables in Dynamite
> summary(myDF)
SID USERS MEMUSED ALERT MemPerUser
Min. :0.000 Min. : 3.000 Min. : 350.000 f :3655563 Min. :105.000
1st Qu.:2.000 1st Qu.: 35.000 1st Qu.: 5113.000 t :1344437 1st Qu.:135.000
Median :4.500 Median : 64.000 Median : 9455.000 NA's: NA Median :150.000
Mean : NA Mean : NA Mean : NA Mean : NA
3rd Qu.:7.000 3rd Qu.:111.000 3rd Qu.:16517.000 3rd Qu.:165.000
Max. :9.000 Max. :347.000 Max. :62379.000 Max. :209.000
▪ Statistics about categorical values
> idaTable(myDF)
ALERT
f t
3655563 1344437
Predictive Analytics With R In dashDB 3/3
▪ Store R objects in Dynamite database
> myPrivateObjects <- ida.list(type='private')
> myPrivateObjects['series100'] <- 1:100
> x <- myPrivateObjects['series100']
> x
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
[23] 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44
[45] 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66
[67] 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88
[89] 89 90 91 92 93 94 95 96 97 98 99 100
> names(myPrivateObjects)
[1] "series100"
> myPrivateObjects['series100'] <- NULL
▪ Manage Dynamite tables
> idaExistTable('DB2INST1.SHOWCASE_SYSUSAGE')
[1] TRUE
> idaShowTables()
Schema Name Owner Type
1 BLUADMIN R_OBJECTS_PRIVATE BLUADMIN T
2 BLUADMIN R_OBJECTS_PRIVATE_META BLUADMIN T
3 BLUADMIN R_OBJECTS_PUBLIC BLUADMIN T
4 BLUADMIN R_OBJECTS_PUBLIC_META BLUADMIN T
> myView <- idaCreateView(myDF)
> idaIsView(myView)
[1] TRUE
> idaDropView(myView)
> idaIsView(myView)
[1] FALSE
Running R in dashDB via REST API
▪ Create your R script with RStudio
• Storing it in home dir inside dashDB
▪ POST <dashdb-server>/dashdb-api/rscript/<fileName>
• Run the specified R script
▪ GET <dashdb-server>/dashdb-api/home
• List all files under user home (recursively)
– E.g. list the output written by your R script
▪ GET <dashdb-server>/dashdb-api/home/<fileName>
• Download the specified file
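The REST workflow on this slide (run a stored script, list the home directory, download an output file) can be sketched as a sequence of requests. The Python helper below only builds the method and URL pairs; the server name and file names are placeholders, the function name is hypothetical, and authentication and request bodies are not covered by the slide, so they are left out.

```python
from typing import List, Tuple
from urllib.parse import quote


def rscript_requests(server: str, script_name: str,
                     output_name: str) -> List[Tuple[str, str]]:
    """Return the (method, URL) sequence for running a stored R script."""
    base = server.rstrip("/") + "/dashdb-api"
    return [
        ("POST", f"{base}/rscript/{quote(script_name)}"),  # run the script
        ("GET", f"{base}/home"),                           # list files under user home
        ("GET", f"{base}/home/{quote(output_name)}"),      # download one output file
    ]


# Placeholder host and file names, for illustration only.
for method, url in rscript_requests("https://dashdb.example.com",
                                    "analysis.R", "out/result.csv"):
    print(method, url)
```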
dashDB
Predictive Analytics With Python In dashDB
• Bluemix Analytic Notebooks
• ibmdbPy package
▪ https://pypi.python.org/pypi/ibmdbpy
▪ Data frames logically representing data physically residing in dashDB tables
from ibmdbpy import IdaDataBase, IdaDataFrame
idadb = IdaDataBase("DASHDB") # open the connection first; "DASHDB" is an assumed data source name
idadf = IdaDataFrame(idadb, "IRIS", indexer = "ID")
idadf = idadf[["ID","sepal_length", "sepal_width"]]
idadf['new'] = idadf['sepal_width'] + idadf['sepal_length'].mean()
idadf.head()
▪ Push down of analytic algorithms to in-db execution
from ibmdbpy.learn import KMeans
kmeans = KMeans(3) # clustering with 3 clusters
kmeans.fit_predict(idadf).head()
Diagram labels: Analytics for Spark Notebook in Bluemix, Browser, Any Python Runtime, ibmdbPy
Loading Twitter Data to dashDB with Bluemix App
Show Case for box office analysis with Twitter:
www.youtube.com/watch?v=9yVNwOs9L4c
Twitter loader app for dashDB: hub.jazz.net/project/torsstei/Twitter-Loader/overview
(www.youtube.com/watch?v=ANakSSGM4zU)
Movie Analysis Show Case
Public map data for US counties
https://www.census.gov/geo/maps-data/data/tiger-line.html
In Bluemix
dashDB service for analytics and
correlation between Tweets and
box office data
Box Office stats from the-numbers.com
Interactive app for visualization
using Node.js and D3.js library
Tweets about movies from Bluemix service
dashDB
Analysis using
built-in R &
RStudio
https://hub.jazz.net/project/torsstei/movie-analysis
S3
Swift
Populating dashDB with Data
dashDB
Geodata in Esri Shapefiles
On Premise Databases
Mobile App Data
in Cloudant
GeoJSON
Twitter
The Weather Company
CSVs
Open Data
Bluemix
Cloud Storage
data.gc.ca, data.gov, data.gov.uk,
datahub.io, openAFRICA
Open
Data
Loader
The Weather Company Data Loader Bluemix App
Backup
dashDB: Key Use Cases
DR in the Cloud
• Minimize capital expense of DR solution
We Bring Netezza Compatible Analytic Platform to the Cloud
Analytics Applications of ISVs and Customers
Application Integration: R Wrapper, Watson Analytics, ESRI ArcGIS Connector, …
Canned Analytics:
• OLAP Functions: ROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD
• STDDEV, COVAR, …
• Linear Regression, Kmeans Clustering, Decision Tree, Association Rules, Naive Bayes
• Spatial Operators: Contains, Touches, Within, Intersects, Crosses, Overlaps
Analytic Extension Framework:
• UDX C++ API
• AE Framework: In-DB R, In-DB LUA, In-DB Python, In-DB Perl
Analytics of Warehouse Data
This is where we start from: All analytic processing done on application side
• Data pulled out and processed in analytic application
Diagram labels: Analytic Applications, Analytic Code & Algorithms, Analytic Data
Accelerate Analytics for Warehouse Data
Push Down Step 1: BLU tables only logically represented in analytic application
• Simple data lookup & massage operations pushed down as SQL operations
• Benefit: Acceleration with no SQL skills required
Diagram labels: Analytic Applications, SQLs, Analytic Code & Algorithms, Analytic Data
Accelerate Analytics for Warehouse Data
Push Down Step 2: Typical and popular algorithms pushed down to canned UDFs in the db
• Call built-in functions via SQL to execute typical algorithms inside db
• Benefit: Bring Standard Analytics to the Data
Diagram labels: Analytic Applications, Cloud Tooling, SQLs, Canned Algorithms, Analytic Code & Algorithms, Analytic Data
Accelerate Analytics for Warehouse Data
Push Down Step 3: Execute entire customer analytic programs inside the db
• Deploy customer code and call via special SQL function interfaces
• Benefit: Bring Custom Analytics to the Data
Diagram labels: Analytic Applications, SQLs, Canned Algorithms, Language Framework (UDX & AE), Analytic Code & Algorithms, Analytic Data
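The difference between processing on the application side and pushing work down into the database can be shown with a small Python sketch. SQLite stands in for dashDB here, the table and its rows are made up, and the function names are hypothetical. The first function pulls every row into the application before fitting a simple linear regression; the second lets the database compute five aggregates with plain SQL and fits the same line from those numbers alone. This is closest in spirit to step 1; step 2 would instead call a built-in function inside the database, which is not shown.

```python
import sqlite3
from typing import Tuple


def coefficients(n, sx, sy, sxx, sxy) -> Tuple[float, float]:
    """Least-squares slope and intercept from the five sufficient statistics."""
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


def fit_in_app(conn: sqlite3.Connection) -> Tuple[float, float]:
    # Starting point: all rows are transferred, all arithmetic runs in the app.
    rows = conn.execute("SELECT USERS, MEMUSED FROM SYSUSAGE").fetchall()
    return coefficients(len(rows),
                        sum(x for x, _ in rows), sum(y for _, y in rows),
                        sum(x * x for x, _ in rows), sum(x * y for x, y in rows))


def fit_pushed_down(conn: sqlite3.Connection) -> Tuple[float, float]:
    # Push down: the database aggregates; only one row of five numbers is returned.
    row = conn.execute(
        "SELECT COUNT(*), SUM(USERS), SUM(MEMUSED), "
        "SUM(USERS * USERS), SUM(USERS * MEMUSED) FROM SYSUSAGE").fetchone()
    return coefficients(*row)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SYSUSAGE (USERS INTEGER, MEMUSED INTEGER)")
# Made-up rows that lie exactly on MEMUSED = 100 + 150 * USERS.
conn.executemany("INSERT INTO SYSUSAGE VALUES (?, ?)",
                 [(10, 1600), (20, 3100), (40, 6100)])

print(fit_in_app(conn))
print(fit_pushed_down(conn))
```

Both calls return the same coefficients; the pushed-down version moves a constant amount of data regardless of table size.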
Don’t forget to submit your Insight session and speaker feedback! Your
feedback is very important to us – we use it to continually improve the
conference.
Access your surveys at insight2015survey.com to quickly submit your surveys
from your smartphone, laptop or conference kiosk.
We Value Your Feedback!
Notices and Disclaimers
Copyright © 2015 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form
without written permission from IBM.
U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.
Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for
accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to
update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO
EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO,
LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted
according to the terms and conditions of the agreements under which they are provided.
Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.
Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are presented as
illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other
results in other operating environments may vary.
References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services
available in all countries in which IBM operates or does business.
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the
views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or
other guidance or advice to any individual participant or their specific situation.
It is the customer’s responsibility to ensure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the
identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the
customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will
ensure that the customer is in compliance with any law.
Notices and Disclaimers (cont’d)
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly
available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance,
compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the
suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to
interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights,
trademarks or other intellectual property right.
•IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DB2® , DOORS®, Emptoris®, Enterprise
Document Management System™, FASP®, FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™, IBM
SmartCloud®, IBM Social Business®, IMS™, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON,
OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®,
pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®, StoredIQ, Tealeaf®,
Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International
Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or
other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at:
www.ibm.com/legal/copytrade.shtml.
Thank You

WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 

IBM Insight 2015 - 1824 - Using Bluemix and dashDB for Twitter Analysis

  • 1. © 2015 IBM Corporation Using Bluemix and dashDB for Twitter Analysis Session # 1824 Torsten Steinbach @torsstei
  • 2. Please Note: • IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. • Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. • The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. • The development, release, and timing of any future features or functionality described for our products remains at our sole discretion. Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.
  • 4. IBM Insights for Twitter Service in Bluemix
  • 5. Social Application using the IBM Insights for Twitter Service. Query exactly the data that your social application needs. Get IBM analytics enrichments in addition to base Twitter data. Whenever needed, check whether previously received Tweets are still valid (compliance). Ingest, enrich, curate, govern Decahose data over time. Receive & process compliance events. Diagram labels: Social Application; IBM Insights for Twitter Service: Search over enriched Decahose Data; Twitter GNIP APIs; IBM Insights for Twitter System on Softlayer; Twitter Data enriched through IBM Analytics; Store and index up to 2-year history of enriched Tweets, point in time compliant; PowerTrack collection rules & filters.
  • 6. Queries
      The query language mimics the Gnip PowerTrack query language; a subset of PowerTrack operators is available. See the documentation in Bluemix as we roll out more query operators.
      Boolean operators (operator, examples, description):
        – term1 AND term2 (examples: cat dog; cat AND dog; #cutecat food): Returns tweets that contain both term1 and term2. Whitespace between two terms is treated as AND, so the operator can be omitted.
        – term1 OR term2 (example: #money OR broke): Returns tweets that contain either term1 or term2.
        – -term1 (example: ibm -apple): Returns tweets that do not contain term1.
      Operator precedence: “-” is stronger than “AND”, and “AND” is stronger than “OR”. You can (and should) use parentheses to make operator precedence explicit. Example: ibm twitter -(lame OR boring) searches for tweets that contain both the terms “ibm” and “twitter” but neither “lame” nor “boring”.
      Query terms (all of the following can be freely combined with the boolean operators introduced above, e.g. ibm apple followers_count:500):
        – keyword (example: cat): Matches tweets that have “keyword” in their body. The search is case-insensitive.
        – "exact phrase match" (example: "cats and dogs"): Matches tweets that contain the exact keyword sequence <"exact", "phrase", "match">.
        – #hashtag (example: #insight2014): Matches tweets with the hashtag “#hashtag”.
        – from:twitterHandle (example: from:alexlang11): Returns tweets from authors with the preferredUsername twitterHandle. Must not contain the @ sign.
        – followers_count:lower or followers_count:lower,upper (example: followers_count:500): Matches tweets of authors that have at least “lower” followers. The upper bound is optional and both limits are inclusive.
        – posted:startTime or posted:startTime,endTime (example: posted:2014-12-01T00:00:00Z,2014-12-12T00:00:00Z): Matches tweets that have been posted at or after “startTime”. The “endTime” bound is optional and inclusive. Timestamps have to be in one of the following two formats: “yyyy-mm-dd” or “yyyy-mm-dd'T'HH:MM:SS'Z'”. The timezone is UTC.
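
The operators on this slide compose as plain strings, so a few small helpers can keep precedence explicit. The following is only a sketch: the helper names (`all_of`, `any_of`, `exclude`, `followers_count`, `posted`) are hypothetical and not part of any IBM library; they simply emit the syntax listed above.

```python
# Hypothetical helpers that emit the query syntax from the slide.
# None of these names belong to an IBM API; they only build strings.

def all_of(*terms):
    """Whitespace between terms is treated as AND."""
    return " ".join(terms)

def any_of(*terms):
    """OR binds weakest, so always wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"

def exclude(term):
    """Prefix a term (or a parenthesized group) with '-'."""
    return "-" + term

def followers_count(lower, upper=None):
    """Both bounds are inclusive; the upper bound is optional."""
    if upper is None:
        return f"followers_count:{lower}"
    return f"followers_count:{lower},{upper}"

def posted(start, end=None):
    """Timestamps are passed through as already formatted strings."""
    if end is None:
        return f"posted:{start}"
    return f"posted:{start},{end}"

# Reproduces the slide's own precedence example.
query = all_of("ibm", "twitter", exclude(any_of("lame", "boring")))
print(query)

# Combines keyword terms with a range term, as the slide suggests.
print(all_of("ibm", "apple", followers_count(500)))
```

Wrapping every OR group in parentheses follows the slide's advice to make precedence explicit rather than relying on the default binding order.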
  • 7. Count: /messages/count?q=QUERY
      • Use to find out how many Tweets match a given query.
      HTTP code 200: Number of results at json_path("search.results"). URL to retrieve documents at json_path("related.search.href"). Note: add your client_id and your client_secret to this URL. Example response: { "search": { "results": 21695 }, "related": { "search": { "href": "https://server.bluemix.net/api/v1/messages/search?q=ibm" } } }
      HTTP code 4xx: There was a problem with your query. Please have a look at json_path("error") to identify the problem.
      HTTP code 5xx: There was a problem with the service. Please have a look at json_path("error") and contact support.
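
As a minimal sketch of how a client might use the count endpoint, the snippet below builds the request URL and reads the two JSON paths named on the slide. The host and the `/api/v1` prefix are taken from the slide's sample `href` and are assumed to apply to `/messages/count` as well; how the credentials are attached is not shown on the slide and is left out here. No network call is made: the response is parsed from a sample string.

```python
import json
from urllib.parse import urlencode

# Placeholder base URL, copied from the sample href on the slide (assumption).
BASE_URL = "https://server.bluemix.net/api/v1"

def count_url(query, base_url=BASE_URL):
    """Build the count URL; urlencode escapes spaces, '#', ':' and so on."""
    return f"{base_url}/messages/count?{urlencode({'q': query})}"

def parse_count(body):
    """Return (number of matches, URL for retrieving the documents)."""
    doc = json.loads(body)
    return doc["search"]["results"], doc["related"]["search"]["href"]

# Sample response with the same shape as the slide's example.
sample = json.dumps({
    "search": {"results": 21695},
    "related": {"search": {
        "href": "https://server.bluemix.net/api/v1/messages/search?q=ibm"}},
})

total, search_href = parse_count(sample)
print(count_url("ibm"))
print(total, search_href)
```

Checking the count first is a cheap way to decide whether a full search is worth paging through.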
  • 8. Search: /messages/search?q=QUERY&size=NUMBER
      • Search & retrieve <= NUMBER Tweets matching QUERY.
      HTTP code 200: Number of overall results at json_path("search.results"). First batch of results at json_path("tweets"). URL to retrieve the next batch of documents (if available) at json_path("related.next.href"). Note: add your client_id and your client_secret to this URL. Example response: { "search": { "results": 16283624 }, "tweets": [ { "message": { … "body": "this is a nice tweet" … "actor": { "followersCount": 456, "displayName": "IBM Tweeter" … "cde": { "sentiment": { "polarity": "POSITIVE" ... "author": { "gender": "male" … }
      HTTP code 4xx: There was a problem with your query. Please have a look at json_path("error") to identify the problem.
      HTTP code 5xx: There was a problem with the service. Please have a look at json_path("error") and contact support.
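
Because each search response carries the URL of the next batch at `related.next.href`, retrieving a full result set is a simple follow-the-link loop. The sketch below assumes a caller-supplied `fetch` callable (hypothetical) that maps a URL to the JSON response text, so the paging logic can be shown without credentials or network access; the fake pages use only the `message.body` field from the slide's example.

```python
import json

def iter_tweets(first_url, fetch):
    """Yield tweets batch by batch, following related.next.href until absent.

    fetch is any callable that takes a URL and returns the JSON text;
    in real use it would perform the authenticated HTTP GET.
    """
    url = first_url
    while url:
        page = json.loads(fetch(url))
        yield from page.get("tweets", [])
        url = page.get("related", {}).get("next", {}).get("href")

# Two fake pages standing in for the service (illustrative data only).
first = "https://server.bluemix.net/api/v1/messages/search?q=ibm&size=2"
pages = {
    first: json.dumps({
        "search": {"results": 3},
        "tweets": [{"message": {"body": "first"}},
                   {"message": {"body": "second"}}],
        "related": {"next": {"href": "page-2"}},
    }),
    "page-2": json.dumps({
        "search": {"results": 3},
        "tweets": [{"message": {"body": "third"}}],
    }),
}

bodies = [t["message"]["body"] for t in iter_tweets(first, pages.__getitem__)]
print(bodies)
```

Passing the fetch function in keeps the generator independent of any particular HTTP client or authentication scheme.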
  • 9. Example Queries
      • Get Tweets about an upcoming movie for a given time frame to sense interest & reactions to trailer: search?q="posted:2015-02-01T00:00:00Z AND #starwars"&size=5
      • Get Tweets with positive/negative sentiment about a product to learn what customers like / dislike about the product: search?q="IBM Bluemix sentiment:positive"
      • Get Tweets about a product being marketed and compare over time to sense audience reaction to the campaign: search?q="posted:2015-02-01T00:00:00Z,2015-02-15T00:00:00Z AND #IBM"
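
The example queries above contain characters such as `#`, `:` and spaces that need percent-encoding before they go into a URL (an unescaped `#` would start the URL fragment). The following sketch formats UTC timestamps in the `yyyy-mm-dd'T'HH:MM:SS'Z'` form and encodes the query; `posted_term` and `search_path` are hypothetical helper names, not part of the service.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode, parse_qs

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def posted_term(start, end=None):
    """Format timezone-aware datetimes as a posted: term in UTC."""
    bounds = [start] if end is None else [start, end]
    stamps = [b.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT) for b in bounds]
    return "posted:" + ",".join(stamps)

def search_path(query, size=None):
    """Return the relative search path with the query safely encoded."""
    params = {"q": query}
    if size is not None:
        params["size"] = size
    return "search?" + urlencode(params)

start = datetime(2015, 2, 1, tzinfo=timezone.utc)
end = datetime(2015, 2, 15, tzinfo=timezone.utc)

trailer_query = posted_term(start) + " AND #starwars"
campaign_query = posted_term(start, end) + " AND #IBM"

path = search_path(trailer_query, size=5)
print(path)

# Decoding the path again returns the original query text.
decoded = parse_qs(path.split("?", 1)[1])
print(decoded["q"][0])
```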
  • 10. Built-in Tool to load Tweets to dashDB
  • 11. R & Python for dashDB
  • 12. Predictive Analytics With R In dashDB 1/3
      • Built-in R runtime & RStudio
      • ibmdbR package
      ▪ Data frames logically representing data physically residing in dashDB tables
          > con <- idaConnect("BLUDB", "", "")
          > idaInit(con)
          > sysusage <- ida.data.frame('DB2INST1.SHOWCASE_SYSUSAGE')
          > systems <- ida.data.frame('DB2INST1.SHOWCASE_SYSTEMS')
          > systypes <- ida.data.frame('DB2INST1.SHOWCASE_SYSTYPES')
      ▪ Push down of R data preparation to dashDB
          > sysusage2 <- sysusage[sysusage$MEMUSED>50000, c("MEMUSED","USERS")]
          > mergedSys <- idaMerge(systems, systypes, by='TYPEID')
          > mergedUsage <- idaMerge(sysusage2, mergedSys, by='SID')
      ▪ Push down of analytic algorithms to in-db execution
          > lm1 <- idaLm(MEMUSED~USERS, mergedUsage)
      Diagram labels: dashDB; R Runtime; Browser; Any R Runtime; ibmdbR; RStudio; REST Client; REST
  • 13. Predictive Analytics With R In dashDB 2/3
      ▪ Dynamite-native implementation of statistical functions
      • colnames, cor, cov, dim, head, length, max, mean, min, names, print, sd, summary, var
      ▪ Logically derived columns pushed down to Dynamite
          > myDF <- ida.data.frame('DB2INST1.SHOWCASE_SYSUSAGE')
          > myDF$MemPerUser <- myDF$MEMUSED / myDF$USERS
      ▪ Sampling of tables in Dynamite
          > idaSample(myDF, 3)
            SID DATE                       USERS MEMUSED ALERT MemPerUser
          1   8 2014-02-14 23:39:00.000000    34    5015     f        147
          2   5 2014-01-22 07:52:00.000000    96   11512     f        119
          3   7 2013-09-12 05:17:00.000000    39    5592     t        143
      ▪ Statistics about tables in Dynamite
          > summary(myDF)
               SID            USERS             MEMUSED              ALERT          MemPerUser
          Min.   :0.000   Min.   :  3.000   Min.   :  350.000   f   :3655563   Min.   :105.000
          1st Qu.:2.000   1st Qu.: 35.000   1st Qu.: 5113.000   t   :1344437   1st Qu.:135.000
          Median :4.500   Median : 64.000   Median : 9455.000   NA's:     NA   Median :150.000
          Mean   :   NA   Mean   :     NA   Mean   :       NA                  Mean   :     NA
          3rd Qu.:7.000   3rd Qu.:111.000   3rd Qu.:16517.000                  3rd Qu.:165.000
          Max.   :9.000   Max.   :347.000   Max.   :62379.000                  Max.   :209.000
      ▪ Statistics about categorical values
          > idaTable(myDF)
          ALERT
                f       t
          3655563 1344437
  • 14. Predictive Analytics With R In dashDB 3/3
      ▪ Store R objects in Dynamite database
          > myPrivateObjects <- ida.list(type='private')
          > myPrivateObjects['series100'] <- 1:100
          > x <- myPrivateObjects['series100']
          > x
           [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
          [23] 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44
          [45] 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66
          [67] 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88
          [89] 89 90 91 92 93 94 95 96 97 98 99 100
          > names(myPrivateObjects)
          [1] "series100"
          > myPrivateObjects['series100'] <- NULL
      ▪ Manage Dynamite tables
          > idaExistTable('DB2INST1.SHOWCASE_SYSUSAGE')
          [1] TRUE
          > idaShowTables()
            Schema   Name                   Owner    Type
          1 BLUADMIN R_OBJECTS_PRIVATE      BLUADMIN T
          2 BLUADMIN R_OBJECTS_PRIVATE_META BLUADMIN T
          3 BLUADMIN R_OBJECTS_PUBLIC       BLUADMIN T
          4 BLUADMIN R_OBJECTS_PUBLIC_META  BLUADMIN T
          > myView <- idaCreateView(myDF)
          > idaIsView(myView)
          [1] TRUE
          > idaDropView(myView)
          > idaIsView(myView)
          [1] FALSE
  • 15. Running R in dashDB via REST API
      ▪ Create your R script with RStudio
      • Store it in the home dir inside dashDB
      ▪ POST <dashdb-server>/dashdb-api/rscript/<fileName>
      • Run the specified R script
      ▪ GET <dashdb-server>/dashdb-api/home
      • List all files under user home (recursively)
        – E.g. list the output written by your R script
      ▪ GET <dashdb-server>/dashdb-api/home/<fileName>
      • Download the specified file
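
The three endpoints on this slide can be prepared from Python's standard library as shown in the sketch below. The server URL `https://dashdb.example.com` and the file name `analysis.R` are placeholders, and authentication is omitted because the slide does not describe it. The requests are only constructed, not sent; `urllib.request.urlopen` would perform the actual call.

```python
from urllib.parse import quote
from urllib.request import Request

# Placeholder for <dashdb-server>; replace with the real host (assumption).
SERVER = "https://dashdb.example.com"

def run_script_request(file_name, server=SERVER):
    """POST <dashdb-server>/dashdb-api/rscript/<fileName>: run an R script."""
    return Request(f"{server}/dashdb-api/rscript/{quote(file_name)}", method="POST")

def list_home_request(server=SERVER):
    """GET <dashdb-server>/dashdb-api/home: list files under the user home."""
    return Request(f"{server}/dashdb-api/home", method="GET")

def download_request(file_name, server=SERVER):
    """GET <dashdb-server>/dashdb-api/home/<fileName>: download one file."""
    return Request(f"{server}/dashdb-api/home/{quote(file_name)}", method="GET")

for req in (run_script_request("analysis.R"),
            list_home_request(),
            download_request("analysis.R")):
    print(req.get_method(), req.full_url)
```

A typical sequence would be to run the script, list the home directory to find its output, and then download the result file.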
  • 16. Predictive Analytics With Python In dashDB
      • Bluemix Analytic Notebooks
      • ibmdbPy package
      ▪ https://pypi.python.org/pypi/ibmdbpy
      ▪ Data frames logically representing data physically residing in dashDB tables
          from ibmdbpy import IdaDataFrame
          idadf = IdaDataFrame(idadb, "IRIS", indexer = "ID")
          idadf = idadf[["ID","sepal_length", "sepal_width"]]
          idadf['new'] = idadf['sepal_width'] + idadf['sepal_length'].mean()
          idadf.head()
      ▪ Push down of analytic algorithms to in-db execution
          from ibmdbpy.learn import KMeans
          kmeans = KMeans(3) # clustering with 3 clusters
          kmeans.fit_predict(idadf).head()
      Diagram labels: dashDB; Analytics for Spark Notebook in Bluemix; Browser; Any Python Runtime; ibmdbPy
  • 17. Loading Twitter Data to dashDB with Bluemix App Show Case for box office analysis with Twitter: www.youtube.com/watch?v=9yVNwOs9L4c Twitter loader app for dashDB: hub.jazz.net/project/torsstei/Twitter-Loader/overview (www.youtube.com/watch?v=ANakSSGM4zU)
  • 18. Movie Analysis Show Case. Public map data for US counties: https://www.census.gov/geo/maps-data/data/tiger-line.html. In Bluemix: dashDB service for analytics and correlation between Tweets and box office data. Box office stats from the-numbers.com. Interactive app for visualization using Node.js and the D3.js library. Tweets about movies from Bluemix service. dashDB analysis using built-in R & RStudio. https://hub.jazz.net/project/torsstei/movie-analysis
  • 19. Movie Analysis Show Case https://hub.jazz.net/project/torsstei/movie-analysis
  • 21. Populating dashDB with Data. Diagram labels: dashDB; S3; Swift; Bluemix Cloud Storage; Geodata in Esri Shapefiles; On Premise Databases; Mobile App Data in Cloudant; GeoJSON; Twitter; The Weather Company; CSVs; Open Data (data.gc.ca, data.gov, data.gov.uk, datahub.io, openAFRICA)
  • 23. The Weather Company Data Loader Bluemix App
  • 25. dashDB: Key Use Cases • DR in the Cloud: Minimize capital expense of DR solution
  • 26. We Bring Netezza Compatible Analytic Platform to the Cloud. Diagram labels: Analytic Extension Framework; UDX C++ API; Canned Analytics; Application Integration; AE Framework; In-DB R; In-DB LUA; In-DB Python; In-DB Perl; OLAP Functions; ROW_NUMBER; RANK; LAG; LEAD; DENSE_RANK; STDDEV; COVAR; …; Linear Regression; Kmeans Clustering; Decision Tree; Association Rules; Naive Bayes; Spatial Operators; Contains; Touches; Within; Intersects; Crosses; Overlaps; R Wrapper; Watson Analytics; ESRI ArcGIS Connector; …; Analytics Applications of ISVs and Customers
  • 27. Analytics of Warehouse Data. This is where we start from: All analytic processing done on application side. Data pulled out and processed in analytic application. Diagram labels: Analytic Applications; Analytic Code & Algorithms; Analytic Data
  • 28. Accelerate Analytics for Warehouse Data. Push Down Step 1: BLU tables only logically represented in analytic application. Simple data lookup & massage operations pushed down as SQL operations. Benefit: Acceleration with no SQL skills required. Diagram labels: Analytic Applications; Analytic Code & Algorithms; Analytic Data; SQLs
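
The difference between application-side processing and push-down step 1 can be sketched with any SQL database. The example below uses Python's built-in `sqlite3` purely as a stand-in for dashDB (an assumption for illustration; it is not how ibmdbR or ibmdbPy connect) and a toy table modeled on the SHOWCASE_SYSUSAGE columns used earlier in the deck. Both approaches return the same rows, but the pushed-down version lets the database do the filtering so only matching rows leave it.

```python
import sqlite3

# In-memory SQLite database standing in for the warehouse (illustrative only).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sysusage (sid INTEGER, users INTEGER, memused INTEGER)")
con.executemany(
    "INSERT INTO sysusage VALUES (?, ?, ?)",
    [(1, 34, 5015), (2, 96, 61512), (3, 39, 75592)],
)

# Application-side: pull every row out, then filter in the application.
all_rows = con.execute(
    "SELECT users, memused FROM sysusage ORDER BY sid").fetchall()
client_side = [(users, mem) for (users, mem) in all_rows if mem > 50000]

# Push-down: the filter travels to the database as a WHERE clause.
pushed_down = con.execute(
    "SELECT users, memused FROM sysusage WHERE memused > ? ORDER BY sid",
    (50000,),
).fetchall()

print(len(all_rows), "rows transferred without push-down")
print(len(pushed_down), "rows transferred with push-down")
print(client_side == pushed_down)
```

On a three-row table the saving is trivial; on a warehouse table with millions of rows, the amount of data moved to the application is what dominates.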
  • 29. Accelerate Analytics for Warehouse Data. Push Down Step 2: Typical and popular algorithms pushed down to canned UDFs in the db. Call built-in functions via SQL to execute typical algorithms inside db. Benefit: Bring Standard Analytics to the Data. Diagram labels: Analytic Applications; Cloud Tooling; Analytic Code & Algorithms; Analytic Data; SQLs; Canned Algorithms
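
To illustrate what pushing an algorithm into the database means, the sketch below fits a simple linear regression where the database computes all the sums and only five numbers are returned to the application. It uses `sqlite3` and hand-written aggregates as a stand-in; this is not the implementation behind dashDB's canned functions or `idaLm`, only the same idea in miniature. The toy data follows memused = 100 * users + 50 exactly.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE usage (users REAL, memused REAL)")
con.executemany(
    "INSERT INTO usage VALUES (?, ?)",
    [(10, 1050), (20, 2050), (30, 3050), (40, 4050)],
)

# The database scans the table once and returns only the sufficient statistics.
n, sx, sy, sxy, sxx = con.execute(
    """
    SELECT COUNT(*),
           SUM(users),
           SUM(memused),
           SUM(users * memused),
           SUM(users * users)
    FROM usage
    """
).fetchone()

# Ordinary least squares for one predictor, from the aggregated sums.
slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n

print(f"memused ~ {slope:.1f} * users + {intercept:.1f}")
```

The table itself never leaves the database, which is the point of bringing standard analytics to the data.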
  • 30. Accelerate Analytics for Warehouse Data. Push Down Step 3: Execute entire customer analytic programs inside the db. Deploy customer code and call via special SQL function interfaces. Benefit: Bring Custom Analytics to the Data. Diagram labels: Analytic Applications; Analytic Code & Algorithms; Analytic Data; SQLs; Canned Algorithms; Language Framework (UDX & AE)
  • 31. We Value Your Feedback! Don’t forget to submit your Insight session and speaker feedback! Your feedback is very important to us – we use it to continually improve the conference. Access your surveys at insight2015survey.com to quickly submit your surveys from your smartphone, laptop or conference kiosk.
  • 32. Notices and Disclaimers Copyright © 2015 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission from IBM. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM. Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided. Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice. Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary. References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM.
All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation. It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law.
  • 33. Notices and Disclaimers (cont’d) Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right. IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DB2®, DOORS®, Emptoris®, Enterprise Document Management System™, FASP®, FileNet®, Global Business Services®, Global Technology Services®, IBM ExperienceOne™, IBM SmartCloud®, IBM Social Business®, IMS™, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®, StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies.
A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.
  • 34. © 2015 IBM Corporation Thank You

Editor's notes

  1. Value: Data analysts can use SQL (R skills are sufficient) -> we speak the language of analysts
  2. Value: Bring standard analytics to the data
  3. Value: Bring customer analytic functions to the data. To do: bring out the value on the charts