SlideShare a Scribd company logo
1 of 18
Download to read offline
Open Legislation
  Spring 2011
Open Data
(Government)
Secondary Sources are nice
●   OpenCongress
●   GovTrack.US
●   OpenStates
●   FedSpending.org

●   Many more
Primary Sources are better
●   Data.gov
●   USAspending.gov
●   California
●   Oregon
●   Washington

●   Many more
Sometimes though...
Open Data is not Enough.

  We need Platforms.
A Different Breed of Open
●   Making data accessible:
    ●   Built-in search
    ●   Permanent URIs
    ●   Standardized Feeds
    ●   Real-time Alerts


●   REST Architecture with Feed Publishing
    ●   RSS/Atom => Pubsubhubbub => Alerts
So back to
Open Legislation
Browse, Search, and Share
http://open.nysenate.gov/legislation
Its not a Service;
Its an Open Platform
1 Year Re-cap
●   Open Sourced It (for real)
●   Improved the API (xml/json)
●   Decreased Load Times
●   Restructured the Back-end
●   Basic Documentation
●   Wrapped into a build system
The next year
●   In general..
    ●   Data Quality and Documentation
    ●   Usage Tracking and Statistics
    ●   User Interface Improvements
    ●   Further separation of the Platform and Service

●   Right now
    ●   Data Quality, Data Quality, Data Quality
    ●   And a little bit of documentation
The Senate has Legislative
   Data Quality issues?
Well, not exactly
●   Legislative Research Service has the data
    ●   Big, ancient mainframe to boot


●   They FTP us updates every 5 minutes
    ●   In SOBI formats (what?)
    ●   With some XML mixed in

●   We parse it back into XML/JSON/SQL structure
Reasons for Difficulty
●   Poorly Documented SOBI behavior

●   Formatted as a change log (sometimes)
    ●   Finding sources of error can be hard


●   LRS is not co-operative
Solutions
●   Version Control
    ●   Write objects to JSON/XML files
    ●   With Git, commit each new version
        –   Commit message points to the source SOBI
    ●   Use git to trace data errors back to SOBI files


●   Unit Test known corner cases

●   Periodically do a scrape check?
Progress
✔   Parsing has been overhauled
✔   Objects are written to file
✔   Bugs have been found and fixed
✔   Periodic Scrapes are approved
A short task list
✗   Integrate git into the parsing system.
✗   Document expected behavoir
✗   Write a small test suite
✗   Try to avoid having to scrape.
HFOSS Symposium 2011
●   Bryan Sivak – Civic Commons
●   Mark Prutalis – Sahana Foundation
●   Many universities, Mozilla, Google

●   David, Moorthy, Brian, and Myself!
    ●   1 Hour and a few 3' x 4' posters.

More Related Content

Similar to Open Legislation Spring 2011 Talk 1

#OSSPARIS19 - How to improve database observability - CHARLES JUDITH, Criteo
#OSSPARIS19 - How to improve database observability - CHARLES JUDITH, Criteo#OSSPARIS19 - How to improve database observability - CHARLES JUDITH, Criteo
#OSSPARIS19 - How to improve database observability - CHARLES JUDITH, CriteoParis Open Source Summit
 
A culture of Automation - Joe Smith - DevOpsDays Tel Aviv 2017
A culture of Automation - Joe Smith - DevOpsDays Tel Aviv 2017A culture of Automation - Joe Smith - DevOpsDays Tel Aviv 2017
A culture of Automation - Joe Smith - DevOpsDays Tel Aviv 2017DevOpsDays Tel Aviv
 
Pakistan Census Data – Case Study
Pakistan Census Data – Case StudyPakistan Census Data – Case Study
Pakistan Census Data – Case StudyJabran Rafique
 
Indextank east bay ruby meetup slides
Indextank east bay ruby meetup slidesIndextank east bay ruby meetup slides
Indextank east bay ruby meetup slidesYogiWanKenobi
 
Open Data and Web API
Open Data and Web APIOpen Data and Web API
Open Data and Web APISammy Fung
 
WEBINAR: Proven Patterns for Loading Test Data for Managed Package Testing
WEBINAR: Proven Patterns for Loading Test Data for Managed Package TestingWEBINAR: Proven Patterns for Loading Test Data for Managed Package Testing
WEBINAR: Proven Patterns for Loading Test Data for Managed Package TestingCodeScience
 
Keeping up with the changes: Automating UAT - Damian Sweeney, Student and Aca...
Keeping up with the changes: Automating UAT - Damian Sweeney, Student and Aca...Keeping up with the changes: Automating UAT - Damian Sweeney, Student and Aca...
Keeping up with the changes: Automating UAT - Damian Sweeney, Student and Aca...Blackboard APAC
 
Presentation1.pdf
Presentation1.pdfPresentation1.pdf
Presentation1.pdfZixunZhou
 
OSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles JudithOSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles JudithNETWAYS
 
An Introduction to Pentaho Kettle
An Introduction to Pentaho KettleAn Introduction to Pentaho Kettle
An Introduction to Pentaho KettleDan Moore
 
(Greach 2015) Decathlon Sport Meeting
(Greach 2015) Decathlon Sport Meeting(Greach 2015) Decathlon Sport Meeting
(Greach 2015) Decathlon Sport MeetingAlonso Torres
 
Netflix OSS Meetup Season 4 Episode 4
Netflix OSS Meetup Season 4 Episode 4Netflix OSS Meetup Season 4 Episode 4
Netflix OSS Meetup Season 4 Episode 4aspyker
 
10 ways to stumble with big data
10 ways to stumble with big data10 ways to stumble with big data
10 ways to stumble with big dataLars Albertsson
 
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...Data Con LA
 
Beyond the Hype: 4 Years of Go in Production
Beyond the Hype: 4 Years of Go in ProductionBeyond the Hype: 4 Years of Go in Production
Beyond the Hype: 4 Years of Go in ProductionC4Media
 

Similar to Open Legislation Spring 2011 Talk 1 (20)

#OSSPARIS19 - How to improve database observability - CHARLES JUDITH, Criteo
#OSSPARIS19 - How to improve database observability - CHARLES JUDITH, Criteo#OSSPARIS19 - How to improve database observability - CHARLES JUDITH, Criteo
#OSSPARIS19 - How to improve database observability - CHARLES JUDITH, Criteo
 
A culture of Automation - Joe Smith - DevOpsDays Tel Aviv 2017
A culture of Automation - Joe Smith - DevOpsDays Tel Aviv 2017A culture of Automation - Joe Smith - DevOpsDays Tel Aviv 2017
A culture of Automation - Joe Smith - DevOpsDays Tel Aviv 2017
 
Pakistan Census Data – Case Study
Pakistan Census Data – Case StudyPakistan Census Data – Case Study
Pakistan Census Data – Case Study
 
Indextank east bay ruby meetup slides
Indextank east bay ruby meetup slidesIndextank east bay ruby meetup slides
Indextank east bay ruby meetup slides
 
Ice dec04-04-sammy
Ice dec04-04-sammyIce dec04-04-sammy
Ice dec04-04-sammy
 
Open Data and Web API
Open Data and Web APIOpen Data and Web API
Open Data and Web API
 
WEBINAR: Proven Patterns for Loading Test Data for Managed Package Testing
WEBINAR: Proven Patterns for Loading Test Data for Managed Package TestingWEBINAR: Proven Patterns for Loading Test Data for Managed Package Testing
WEBINAR: Proven Patterns for Loading Test Data for Managed Package Testing
 
ION Ljubljana - Aaron Hughes: Best Current Operational Practices
ION Ljubljana - Aaron Hughes: Best Current Operational PracticesION Ljubljana - Aaron Hughes: Best Current Operational Practices
ION Ljubljana - Aaron Hughes: Best Current Operational Practices
 
Keeping up with the changes: Automating UAT - Damian Sweeney, Student and Aca...
Keeping up with the changes: Automating UAT - Damian Sweeney, Student and Aca...Keeping up with the changes: Automating UAT - Damian Sweeney, Student and Aca...
Keeping up with the changes: Automating UAT - Damian Sweeney, Student and Aca...
 
Presentation1.pdf
Presentation1.pdfPresentation1.pdf
Presentation1.pdf
 
OSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles JudithOSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles Judith
 
An Introduction to Pentaho Kettle
An Introduction to Pentaho KettleAn Introduction to Pentaho Kettle
An Introduction to Pentaho Kettle
 
(Greach 2015) Decathlon Sport Meeting
(Greach 2015) Decathlon Sport Meeting(Greach 2015) Decathlon Sport Meeting
(Greach 2015) Decathlon Sport Meeting
 
Introduction To Pentaho Kettle
Introduction To Pentaho KettleIntroduction To Pentaho Kettle
Introduction To Pentaho Kettle
 
Netflix OSS Meetup Season 4 Episode 4
Netflix OSS Meetup Season 4 Episode 4Netflix OSS Meetup Season 4 Episode 4
Netflix OSS Meetup Season 4 Episode 4
 
10 ways to stumble with big data
10 ways to stumble with big data10 ways to stumble with big data
10 ways to stumble with big data
 
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
 
Streaming Analytics
Streaming AnalyticsStreaming Analytics
Streaming Analytics
 
Beyond the Hype: 4 Years of Go in Production
Beyond the Hype: 4 Years of Go in ProductionBeyond the Hype: 4 Years of Go in Production
Beyond the Hype: 4 Years of Go in Production
 
Dynamic sitemaps
Dynamic sitemapsDynamic sitemaps
Dynamic sitemaps
 

Recently uploaded

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 

Recently uploaded (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 

Open Legislation Spring 2011 Talk 1

  • 1. Open Legislation Spring 2011
  • 3. Secondary Sources are nice ● OpenCongress ● GovTrack.US ● OpenStates ● FedSpending.org ● Many more
  • 4. Primary Sources are better ● Data.gov ● USAspending.gov ● California ● Oregon ● Washington ● Many more
  • 5. Sometimes though... Open Data is not Enough. We need Platforms.
  • 6. A Different Breed of Open ● Making data accessible: ● Built-in search ● Permanent URIs ● Standardized Feeds ● Real-time Alerts ● REST Architecture with Feed Publishing ● RSS/Atom => Pubsubhubbub => Alerts
  • 7. So back to Open Legislation
  • 8. Browse, Search, and Share http://open.nysenate.gov/legislation
  • 9. Its not a Service; Its an Open Platform
  • 10. 1 Year Re-cap ● Open Sourced It (for real) ● Improved the API (xml/json) ● Decreased Load Times ● Restructured the Back-end ● Basic Documentation ● Wrapped into a build system
  • 11. The next year ● In general.. ● Data Quality and Documentation ● Usage Tracking and Statistics ● User Interface Improvements ● Further separation of the Platform and Service ● Right now ● Data Quality, Data Quality, Data Quality ● And a little bit of documentation
  • 12. The Senate has Legislative Data Quality issues?
  • 13. Well, not exactly ● Legislative Research Service has the data ● Big, ancient mainframe to boot ● They FTP us updates every 5 minutes ● In SOBI formats (what?) ● With some XML mixed in ● We parse it back into XML/JSON/SQL structure
  • 14. Reasons for Difficulty ● Poorly Documented SOBI behavior ● Formatted as a change log (sometimes) ● Finding sources of error can be hard ● LRS is not co-operative
  • 15. Solutions ● Version Control ● Write objects to JSON/XML files ● With Git, commit each new version – Commit message points to the source SOBI ● Use git to trace data errors back to SOBI files ● Unit Test known corner cases ● Periodically do a scrape check?
  • 16. Progress ✔ Parsing has been overhauled ✔ Objects are written to file ✔ Bugs have been found and fixed ✔ Periodic Scrapes are approved
  • 17. A short task list ✗ Integrate git into the parsing system. ✗ Document expected behavoir ✗ Write a small test suite ✗ Try to avoid having to scrape.
  • 18. HFOSS Symposium 2011 ● Bryan Sivak – Civic Commons ● Mark Prutalis – Sahana Foundation ● Many universities, Mozilla, Google ● David, Moorthy, Brian, and Myself! ● 1 Hour and a few 3' x 4' posters.