This document provides an overview of computer-assisted reporting (CAR), which uses data from databases, spreadsheets, and other sources to uncover stories. CAR can confirm obvious facts, reveal unexpected findings, and report information without needing to attribute to sources. Some potential data sources discussed include inspection reports, complaints, licenses and registrations. The document outlines challenges in obtaining data through freedom of information requests and provides tips for negotiating access. It also discusses software options, potential stories that could be uncovered, and resources for learning more about CAR techniques.
7. Understand data tables Fields (or columns) contain types of data. Records (or rows) contain the information you want.
8. Why use data? • No more “according to…” State as fact, don’t attribute. • Uncovers stories even the subjects don’t know. • Confirms the obvious; reveals the unexpected.
34. Other CAR ideas • Overpass inspection records. • Mayoralty campaign donors. • Ambulance response times. • Health/safety reports in city-run apartment buildings. • Complaints against taxi drivers.
35. More CAR ideas • Pet licenses by postal code. • Single men/women by census tract. • Day most marriage licenses. • Most common street name.
36. Data sources • Inspection reports • Complaints • Incident reports • Discipline records • Registrations and licenses • Check reporting requirements.
38. The CAR Inverse Pyramid of Aggravation • Obtaining data (hard) • Formatting data • Analyzing data • Reporting • Writing (easy)
39. Where to get data • Ask for it. • Download from the Web. • Scrape from the Web. • Build it yourself from documents. • FOI or ATIP.
42. FOI/ATIP strategies • Ask for everything except names (and sometimes addresses). • Negotiate. • Appeal.
43. Negotiating for data • Request a sample of the data. • Arrange a meeting with the ATIP/FOI person or the data expert. • Eliminate fields that need to be severed. • Modify your request.
44. Finding the story • Think vertically. Look at columns. • Cross-tab columns. • Chart over time. • Look for patterns. • Dig down from data.
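A cross-tab counts how often each pairing of two columns occurs. The sketch below is one minimal way to do it in Python rather than in a spreadsheet; the complaint records, years and complaint types are invented for illustration and do not come from any real dataset.

```python
from collections import Counter

# Hypothetical complaint records as (year, complaint_type) pairs.
# These values are made up for illustration only.
records = [
    (2005, "noise"),
    (2005, "parking"),
    (2005, "noise"),
    (2006, "noise"),
    (2006, "parking"),
    (2006, "parking"),
    (2006, "parking"),
]


def cross_tab(rows):
    """Count how often each (row value, column value) pair occurs."""
    return Counter(rows)


def print_cross_tab(counts):
    """Print the counts as a small table: one line per year, one column per type."""
    years = sorted({year for year, _ in counts})
    kinds = sorted({kind for _, kind in counts})
    print("year".ljust(6) + "".join(kind.rjust(10) for kind in kinds))
    for year in years:
        cells = "".join(str(counts[(year, kind)]).rjust(10) for kind in kinds)
        print(str(year).ljust(6) + cells)


counts = cross_tab(records)
print_cross_tab(counts)
```

Reading across a row shows the mix of complaint types in one year; reading down a column shows how one type changes over time, which is the "chart over time" step in miniature.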
46. Spreadsheets • “Smart paper,” great for adding up totals, calculating percentages and other data summaries. • Limited to 65,000 records.
48. Database managers • Quickly organize large numbers of records. • No 65,000-record limit. • Can summarize data, too.
50. Mapping or GIS • Puts data on a map. • Analyzes data based on location and distance.
53. Software for Mac • Spreadsheet: Excel ($), OpenOffice (open source) • Database manager: FileMaker ($), MySQL (open source) • Mapping: QGIS (open source), Google Maps mash-ups.
54. Software for PC • Spreadsheet: Excel ($), OpenOffice (open source) • Database manager: Access ($), MySQL (open source) • Mapping: ArcView ($, Citizen owns a copy), MapInfo ($), Google Maps mash-ups.
55. Problems with CAR • Long lead times. • Time consuming. • Requires software and hardware. • Extra work not seen by readers (or editors).
56. Scale • A little goes a long way • Think story, not series • Keep and reuse your data
57. Why learn CAR? • Few doing it in Canada • Easy online components • Useful in almost any beat • Works in print, broadcast, web
59. Ask for help • (613) 235-6685 • [email_address] • http://www.sushiboy.org/car.pdf • http://www.sushiboy.org/car.ppt -30-
Editor's notes
CAR is a misnomer. Predates use of the internet in newsrooms. Everyone who does a Google search is computer assisted. We are using electronic data to do reporting.
Just about anything large organizations (and governments) do gets put into a database somewhere. Email is a form of database. Maps are graphic representations of data. Census stuff.
Each line is a separate record that represents a single gun somewhere in Canada. Record level data. Not summary data. Summary data would be “21 per cent of handguns in Canada are registered in Quebec.”
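The difference between record-level and summary data can be sketched in a few lines of Python. The records below are invented (five made-up rows, not real registry data), and `share_by_field` is a hypothetical helper name; the point is only that a summary figure is something you compute from records, while records cannot be rebuilt from a summary.

```python
from collections import Counter

# Hypothetical record-level data: one row per registered handgun.
# Serial numbers and provinces are made up for illustration.
records = [
    {"serial": "A100", "province": "QC"},
    {"serial": "A101", "province": "ON"},
    {"serial": "A102", "province": "QC"},
    {"serial": "A103", "province": "BC"},
    {"serial": "A104", "province": "ON"},
]


def share_by_field(rows, field):
    """Collapse record-level rows into percentage shares for one field."""
    counts = Counter(row[field] for row in rows)
    total = sum(counts.values())
    return {value: 100 * n / total for value, n in counts.items()}


# Summary data: one number per province instead of one row per gun.
shares = share_by_field(records, "province")
for province, pct in sorted(shares.items()):
    print(f"{province}: {pct:.0f} per cent")
```

With record-level data you can re-cut the same rows by any other field you were given; with only the summary you are stuck with the breakdown someone else chose.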
Shows the file number of a complaint. Which are most interesting? Most serious?
Govt. contracts by department and date. Party column was added after the fact.
Need to know what things are called when you ask for them.
“Ottawa Police solve fewer murders of women than police forces in eight other major Canadian cities, a Citizen analysis shows.” Parking meter on Lisgar most ticketed in the city. Lots of tickets in ByWard Market; but Lynda Lane close.
Low income people twice as likely to live near lotto dealers. Neighbourhood with most outlets per capita. How to do this story?
Black boys twice as likely to be suspended as white boys. More interesting lede?
Data works great on web. Lets readers drill down themselves so you don’t have to.
Nevada rural ambulances worst in US. Grocery clerks and miners.
Might have the same data in Canada. Find out what it’s called.
Ledes with compelling story; data hit comes later.
Reporter’s process on this? Complaint from a person, idea, data, analysis, back to a person with the same problem. Sources?
Looks like summary data, but produced by the paper. Again, ledes with one person’s story. A bit hackneyed but it works.
Everyone fills up with gas. Water-cooler value. The numbers behind the mundane things we do every day. Lots of papers have done this story. Nice simple lede. Sources?
Texas Assessment of Knowledge and Skills. We publish results of standardized testing every year. Imagine if we could show which schools cheat the most? Multiple choice data easy to computer capture.
Call this guy, asking him about Ontario standardized tests. Privacy concerns easy to get around.
Easy to knock-off here. Are we funding groups with Christian affiliations? More since Harper elected?
Conclusion a bit dramatic.
Huge amount of donations from two addresses. Takes abstract idea like campaign finance and reduces it down to bricks-and-mortar.
Quantifying something most people have had happen to them. Obvious: thieves hit parking lots. Less obvious: South Iowa Street is the hot zone.
Google Mash-up. Can do with Platial or code it yourself. Country club vs public course.
Cellphone towers as a visual blight? What about health hazard? European standards?
Methodology shaky? Important thing: getting the data from FCC.
Guy who issued more tickets than anyone else. Crunch the numbers, then go find him and talk to him.
List of parking officers’ names. Nearly got it wrong. Panic two days before publication because Raine ranked second. Two guys named Charbonneau. Figured out with badge numbers.
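The shared-surname trap is easy to reproduce. A minimal sketch, assuming each ticket record carries both a surname and a badge number: the surnames, badge numbers and counts below are all invented, and the ranking they produce says nothing about the real story.

```python
from collections import Counter

# Hypothetical ticket records as (officer surname, badge number).
# Two different officers share the surname "Smith". All values are invented.
tickets = [
    ("Smith", 101),
    ("Smith", 101),
    ("Smith", 202),
    ("Smith", 202),
    ("Jones", 303),
    ("Jones", 303),
    ("Jones", 303),
]

# Counting by surname merges the two Smiths into one "officer".
by_name = Counter(name for name, _ in tickets)

# Counting by (surname, badge) keeps each officer separate.
by_badge = Counter(tickets)

top_by_name = by_name.most_common(1)[0]
top_by_badge = by_badge.most_common(1)[0]

print("Top by surname:", top_by_name)
print("Top by badge:  ", top_by_badge)
```

Grouping on the name alone puts "Smith" first with four tickets; grouping on the badge number shows that no single Smith wrote more than two. A unique identifier, where the data has one, is the safer field to count on.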
Whenever possible, put your data on the map. Confirms obvious: parking in the Market, Lynda Lane.
Same idea. Mapping break-ins. “11 News put a special computer program to work.”
“The Fat Belt” obesity charted by geography. Nobody in grey, less than 10 per cent. What is your story based on this data? (Poverty) What is the SECOND story? (Michigan).
What are paczkis and coneys? Experts blame: long winters, no bike trails, suburban population boom in Detroit increases commuting time, elimination of phys ed in school.
Look for data stories behind the A1 or C1 news hit.
Doesn’t have to be death and despair.
Government logs everything. Professional bodies… lawyers, nurses, dentists, morticians. Story about veterinarian who gave too much laughing gas to a Norwegian Blue parrot.
Hard stuff at the beginning. Get many irons in the fire.
Stupidity: Wouldn’t give me the Lobbyist Registry because it was available in very limited form online. ATIP coordinators are just learning this, too.
Analysis based on location, e.g., how many homes are within a km of a cellphone tower?
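A distance question like this can be roughed out without GIS software. The sketch below uses the haversine great-circle formula, one common choice and not necessarily what any mapping package does internally. The tower and home coordinates are invented, and `homes_near_tower` is a hypothetical helper name.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def homes_near_tower(homes, tower, radius_km=1.0):
    """Return the homes lying within radius_km of the tower."""
    tower_lat, tower_lon = tower
    return [
        home for home in homes
        if haversine_km(home[1], home[2], tower_lat, tower_lon) <= radius_km
    ]


# Hypothetical coordinates, invented for illustration.
tower = (45.4215, -75.6972)
homes = [
    ("home A", 45.4250, -75.6950),  # a few hundred metres away
    ("home B", 45.4280, -75.6972),  # under 1 km due north
    ("home C", 45.4600, -75.7500),  # several km away
]

nearby = homes_near_tower(homes, tower)
print(len(nearby), "homes within 1 km:", [name for name, _, _ in nearby])
```

For a handful of towers and a few thousand addresses this brute-force loop is enough; a GIS package earns its keep once the data needs spatial indexing, proper map projections or an actual map.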