10. THE TROUBLE WITH DATA
• You need to find a data API
• Get access – sign up for a key
• Find the data endpoint
• Read the docs to learn which parameters you have
• Get the data in an obscure format
• Use the data after converting and filtering
• The more APIs you use, the greater your annoyance
21. ACCESSING PRIVATE DATA
http://query.yahooapis.com/v1/yql
Uses OAuth 1.0 for authorization
OAuth is complicated – use one of our SDKs at
https://github.com/yahoo
22. You can also mix and match several web services using the in() command.
23. select * from search.termextract where context in (select description
from rss where url='http://rss.news.yahoo.com/rss/topstories')
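A nested statement like the one above still has to be sent to the service as a URL. The following is a minimal sketch of that step, not the official client: it assumes a public endpoint at `/v1/public/yql` (the slides only show the OAuth-protected `/v1/yql` URL) and assumes the statement travels in a `q` parameter with a `format` parameter selecting the output. The name `build_yql_url` is hypothetical. It only builds the URL and makes no network call.

```python
from urllib.parse import urlencode

# Assumption: public endpoint; the slides only give the OAuth-protected /v1/yql URL.
YQL_PUBLIC_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def build_yql_url(query, fmt="json", endpoint=YQL_PUBLIC_ENDPOINT):
    """Return a GET URL carrying the YQL statement in the assumed `q` parameter."""
    params = {"q": query, "format": fmt}
    # urlencode percent-encodes quotes, parentheses and the inner feed URL.
    return endpoint + "?" + urlencode(params)


# The sub-select statement from the slide, written as one Python string.
nested_query = (
    "select * from search.termextract where context in "
    "(select description from rss "
    "where url='http://rss.news.yahoo.com/rss/topstories')"
)

url = build_yql_url(nested_query)
print(url)
```

Encoding the whole statement in one step keeps the inner feed URL from being mistaken for part of the outer request.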
40. QUERY EXAMPLES
select * from yahoo.finance.quotes where symbol in ("^IXIC","^DJI","YHOO","AAPL")
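When the list of values comes from a program rather than a slide, the in() clause can be assembled from a list. This is a rough sketch under stated assumptions: `quote_literal` and `select_in` are hypothetical helper names, and values containing a double quote are rejected instead of escaped, because the slides do not describe YQL's escaping rules.

```python
def quote_literal(value):
    """Wrap a value in double quotes for use inside an in() list."""
    # Assumption: refuse embedded double quotes rather than guess at escaping.
    if '"' in value:
        raise ValueError("value contains a double quote: %r" % value)
    return '"' + value + '"'


def select_in(table, field, values):
    """Build: select * from <table> where <field> in ("a","b",...)."""
    if not values:
        raise ValueError("in() needs at least one value")
    literals = ",".join(quote_literal(v) for v in values)
    return "select * from {} where {} in ({})".format(table, field, literals)


query = select_in("yahoo.finance.quotes", "symbol",
                  ["^IXIC", "^DJI", "YHOO", "AAPL"])
print(query)
```

The same helper would cover the location list on the next slide, for example `select_in("weather.bylocation", "location", ["bangalore, in", "new york, us"])`.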
41. QUERY EXAMPLES
select * from weather.bylocation where location in ("bangalore, in", "new york, us")
42. QUERY EXAMPLES
Find hackday tweets:
SELECT * FROM twitter.search where q='hackday'
Search Yahoo! Answers for resolved questions about cars:
select * from answers.search where query="cars" and type="resolved"
Find distance between Bangalore and Mumbai:
select * from geo.distance where place1="bangalore" and
place2="mumbai"
Extract important terms from top stories on Yahoo! news:
select * from search.termextract where context in (select description
from rss where url='http://rss.news.yahoo.com/rss/topstories')
43. QUERY EXAMPLES
Get the Olympic medal list:
select * from html where url='http://sports.yahoo.com/olympics/medals.html'
and xpath='//*[@id="mediasportsoverallmedalcount"]/div[2]/table/tbody/tr/td/a'
Shorten a URL:
insert into yahoo.y.ahoo.it (url, keysize) values ('http://www.javarants.com', 5)
Search apartments on Craigslist:
select * from craigslist.search where location="bangalore" and
type="apa" and query="indiranagar"
44. QUERY EXAMPLES
Scrape news from Yahoo! Finance:
select * from html where url="http://finance.yahoo.com/q?s=yhoo"
and xpath='//div[@id="yfi_headlines"]/div[2]/ul/li/a'
Select and filter data from Google Spreadsheets:
select * from csv where
url="https://spreadsheets.google.com/pub?key=0ArYndzim-lbrdF8wc3A5QWl1ZGRpdkxRZk80SU9zUXc&output=csv"
and col5 like 'Bangalore%';
60. WEBMEME.IN
Fetch multiple feeds in different formats, such as Atom and RSS, and
transform them into a consistent RSS format:
select * from rss where url in ('http://feeds.feedburner.com/pluggd',
'http://quatrainman.blogspot.com/atom.xml', '…')
Filter news containing "india" from multiple feeds:
select * from rss where url in ('http://feeds.feedburner.com/TechCrunch',
'http://www.readwriteweb.com/rss.xml', 'http://gigaom.com/feed/')
and description like '%india%'
61. YQL is open – you can get your data tables in our system
62. All you need to do is write an XML schema and put it on GitHub.