6. Computers Humans
| BEST PRACTICES for PUBLISHING DATA | Hjalmar Gislason, hg@datamarket.com | October 2012
7. Computers Humans
• Computers: Structure
• Humans: Understand and use
8. Computers Humans
• Computers: Structure
• Humans: Understand and use
9. Publishing for Computers
1. Simple formats
2. Indexes, unique IDs and meta-data
3. FAQs and feedback channels
12. Simple Formats:
Tim Berners-Lee’s Five Stars
13. Simple Formats:
You lost me at “Semantics”
14. Standards will emerge and there will
be more and more of them
• RDF
• OData vs. GData
• DSPL
• SDMX
15. Indexes, unique IDs and meta-data
16. Indexes, unique IDs and meta-data
17. Indexes, unique IDs and meta-data
18. Indexes, unique IDs and meta-data
• Must: Unique ID, Title, Last updated
• Should: Meta-data
• Why?
• No need for scraping
• Less load on your end
• Ensures full coverage
• Ensures content removal and updates
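As a rough illustration of why an index with unique IDs and last-updated dates removes the need for scraping, a consumer could compare two snapshots of the index and learn exactly what was added, updated or removed. This is only a sketch: the entry layout, the IDs and the `diff_index` helper are assumptions made up for this example, not a format the slides prescribe.

```python
from datetime import date


def diff_index(previous, current):
    """Compare two index snapshots keyed by unique dataset ID."""
    prev = {entry["id"]: entry for entry in previous}
    curr = {entry["id"]: entry for entry in current}
    added = sorted(curr.keys() - prev.keys())
    removed = sorted(prev.keys() - curr.keys())
    # A dataset counts as updated when its last-updated date has moved forward.
    updated = sorted(
        ds_id
        for ds_id in curr.keys() & prev.keys()
        if curr[ds_id]["last_updated"] > prev[ds_id]["last_updated"]
    )
    return {"added": added, "updated": updated, "removed": removed}


# Hypothetical index snapshots: ID, title and last-updated date per dataset.
yesterday = [
    {"id": "ds-001", "title": "Population by region", "last_updated": date(2012, 9, 1)},
    {"id": "ds-002", "title": "Unemployment rate", "last_updated": date(2012, 9, 15)},
]
today = [
    {"id": "ds-001", "title": "Population by region", "last_updated": date(2012, 10, 1)},
    {"id": "ds-003", "title": "Consumer price index", "last_updated": date(2012, 10, 1)},
]

changes = diff_index(yesterday, today)
print(changes)
# {'added': ['ds-003'], 'updated': ['ds-001'], 'removed': ['ds-002']}
```

The consumer only re-fetches `ds-001` and `ds-003` and drops `ds-002`, which is where the lower load and the reliable removals come from.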
19. Indexes, unique IDs and meta-data
• Hard to emphasize enough!
• Unique IDs for everything: Datasets, columns, entities, ...
• Why?
• Continuity: A small change for a man = giant leap for a
computer
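The continuity point can be made concrete with a small sketch. Here a publisher makes a minor, human-friendly change to a title; a consumer that keys its stored series on the title ends up with two series, while one that keys on a stable ID keeps a single continuous series. All names, IDs and values below are invented for illustration.

```python
# Hypothetical stored observations, keyed two different ways.
by_title = {"Unemployment rate": [7.1, 6.9]}
by_id = {"ds-002": [7.1, 6.9]}

# The publisher tweaks the title slightly in the next release.
new_release = {"id": "ds-002", "title": "Unemployment rate (%)", "values": [6.8]}


def append_by_title(store, release):
    # Fragile: any change in wording creates a brand-new series.
    store.setdefault(release["title"], []).extend(release["values"])


def append_by_id(store, release):
    # Robust: the ID stays the same even when the title changes.
    store.setdefault(release["id"], []).extend(release["values"])


append_by_title(by_title, new_release)
append_by_id(by_id, new_release)

print(len(by_title))  # 2
print(by_id)          # {'ds-002': [7.1, 6.9, 6.8]}
```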
20. Indexes, unique IDs and meta-data
• Any relevant contextual information
• URL(s), descriptions, methodology, next updated, authors,
keywords, units, license information, ...
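One possible way to carry this contextual information alongside the required ID, title and last-updated date is a small metadata record serialized as JSON. The field names, the `DatasetMetadata` class and the example URL are assumptions chosen for this sketch; the slides do not define a schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class DatasetMetadata:
    # Required ("must") fields.
    id: str
    title: str
    last_updated: str  # ISO 8601 date string
    # Optional ("should") contextual fields.
    url: Optional[str] = None
    description: Optional[str] = None
    methodology: Optional[str] = None
    next_update: Optional[str] = None
    authors: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    units: Optional[str] = None
    license: Optional[str] = None


# Hypothetical record; every value here is made up for the example.
record = DatasetMetadata(
    id="ds-002",
    title="Unemployment rate",
    last_updated="2012-10-01",
    url="https://example.org/datasets/ds-002",
    units="percent",
    keywords=["labour market", "unemployment"],
)

as_json = json.dumps(asdict(record), indent=2, sort_keys=True)
print(as_json)
```

Fields that are unknown stay `null` in the output rather than being left out, so a consumer can tell a missing value from a missing field.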
21. FAQs and feedback channels
#1 reason for not publishing data:
“There are errors in the data and I don't
want others to discover them”
22. FAQs and feedback channels
#1 reason for publishing data:
“There are errors in the data and I do
want others to discover them”
23. FAQs and feedback channels
24. FAQs and feedback channels
25. Publishing for Computers
1. Simple formats
2. Indexes, unique IDs and meta-data
3. FAQs and feedback channels
26. Computers Humans
• Computers: Structure
• Humans: Understand and use
28. Search / Discovery
• Requirements differ from web/text search
• Far less textual content to base the search on
• Synonyms, dictionaries, autocomplete
• But (hopefully) good meta-data = facets and filtering
• Give people ways to browse
• Categories vs. tags vs. search
• Serendipity: Random, related, interesting...
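To show how good meta-data can stand in for the missing text, here is a minimal sketch of keyword search with a synonym table, plus facet counts for filtering and browsing. The catalog entries, the `SYNONYMS` table and both helper functions are hypothetical; a real search service would be considerably more involved.

```python
from collections import Counter

# Hypothetical catalog built from dataset meta-data.
catalog = [
    {"id": "ds-001", "title": "Population by region",
     "category": "Demographics", "keywords": ["population", "regions"]},
    {"id": "ds-002", "title": "Unemployment rate",
     "category": "Labour market", "keywords": ["unemployment", "jobs"]},
    {"id": "ds-003", "title": "Employment by industry",
     "category": "Labour market", "keywords": ["employment", "jobs"]},
]

# Tiny synonym dictionary mapping user terms to catalog keywords.
SYNONYMS = {"work": "jobs", "joblessness": "unemployment"}


def search(datasets, term=None, category=None):
    """Filter by an optional category facet and an optional search term."""
    if term:
        term = SYNONYMS.get(term.lower(), term.lower())
    results = []
    for ds in datasets:
        if category and ds["category"] != category:
            continue
        if term and term not in ds["keywords"] and term not in ds["title"].lower():
            continue
        results.append(ds)
    return results


def facet_counts(datasets, key):
    """Count how many datasets fall under each value of a meta-data field."""
    return Counter(ds[key] for ds in datasets)


hits = search(catalog, term="work")
print([ds["id"] for ds in hits])       # ['ds-002', 'ds-003']
print(facet_counts(hits, "category"))  # Counter({'Labour market': 2})
```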
36. Visualize
• What you should offer depends on the data
• Statistical data
• Focus on the most common charts and get them right
• Do NOT invent new visualizations or chart types
• Use standards-compatible technologies
• No Flash!
• Charting and visualization libraries
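As one example of a common chart type built on a web standard, the sketch below emits a plain SVG bar chart that any modern browser can render without plugins. It is a toy under stated assumptions: the `bar_chart_svg` function, its layout constants and the sample values are invented here, and in practice an established charting library would usually be the better choice.

```python
from xml.sax.saxutils import escape


def bar_chart_svg(data, width=400, height=200, bar_gap=10, label_space=20):
    """Return a minimal SVG bar chart for a non-empty list of (label, value) pairs.

    Assumes all values are non-negative and at least one is positive.
    """
    max_value = max(value for _, value in data)
    bar_width = (width - bar_gap * (len(data) + 1)) / len(data)
    parts = [
        '<svg xmlns="http://www.w3.org/2000/svg" width="%d" height="%d">' % (width, height)
    ]
    for i, (label, value) in enumerate(data):
        bar_height = (height - label_space) * value / max_value
        x = bar_gap + i * (bar_width + bar_gap)
        y = height - label_space - bar_height
        parts.append(
            '<rect x="%.1f" y="%.1f" width="%.1f" height="%.1f" fill="steelblue"/>'
            % (x, y, bar_width, bar_height)
        )
        # Label centered under each bar; escape() guards against markup in labels.
        parts.append(
            '<text x="%.1f" y="%d" text-anchor="middle" font-size="12">%s</text>'
            % (x + bar_width / 2, height - 5, escape(label))
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Made-up sample values, purely for illustration.
svg = bar_chart_svg([("2010", 3), ("2011", 5), ("2012", 4)])
print(svg)
```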
39. Download
• Make it easy to use your data outside your tools
• Play nicely with those providing functionality beyond what
you can offer: Tableau, R, SAS, MATLAB, Mathematica,
SPSS, ...
• Provide downloads in the formats most commonly used by your
users:
• Raw data: Excel, CSV, feeds (R, Excel live feeds, APIs)
• Charts and visualizations: Bitmap, vector, PPT, embeds?
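For the raw-data download, CSV is the lowest common denominator that Excel, R and most of the tools above can open. A minimal sketch using Python's standard `csv` module follows; the `to_csv` helper, the column names and the rows are assumptions for the example.

```python
import csv
import io


def to_csv(rows, columns):
    """Serialize a list of dict rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical rows; note the comma inside one of the region names.
rows = [
    {"region": "North", "year": 2011, "population": 1200},
    {"region": "South, coastal", "year": 2011, "population": 3400},
]

csv_text = to_csv(rows, ["region", "year", "population"])
print(csv_text)
```

Using a CSV writer rather than joining strings by hand matters here: the value containing a comma is quoted automatically, so the file still parses into three columns.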
40. Computers Humans
• Computers: Structure
• Simple formats
• Indexes, unique IDs and meta-data
• FAQs and feedback channels
• Humans: Understand and use
• Search / Discovery
• Visualization
• Download
41. FIND AND UNDERSTAND DATA
Hjalmar Gislason, founder & CEO
Twitter: @datamarket · Facebook: DataMarket · E-mail: hg@datamarket.com