Senate expensesslideshare
    Presentation Transcript

    • Argentina’s Senate Expenses: Opening Public Data and Data Journalism
    • Argentina's Senate Site (www.senado.gov.ar) - Senate Expenses [screenshot with links numbered 1 to 4]
    • www.senado.gov.ar Site Description (Feb 8, 2013). The identified links are:
        1. Procurement Tenders (1,320 PDFs)
        2. Administrative decisions (14,205 PDFs)
        3. Accounting decisions (6,817 PDFs)
        4. Presidential Decrees (11,315 PDFs)
      Total: 33,657 PDFs
    • 2) Processing Data - Password-Protected PDFs - Password Removal: The files are PDFs generated from scanned paper documents, but they are password protected against copying and printing. So, before running the OCR process, we need to remove the password protection.
    • 3) Transforming Data - Run OCR → 33,000 TXT documents
    • 3) Transforming Data - OCR issues: "Noisy" or unevenly scanned documents produce unpredictable results.
    • 4) Processing Data I: Insert the 33,000 TXT files into an Excel worksheet, identifying entities (persons and companies), amounts (money or bodyguards) and customizable keywords → Analyze by applying filters
    • 5) Identifying Patterns and Entities (what, who, when, where, how much): Travel expenses requested by the President of the Senate (Vice President of Argentina), Mr. Amado Boudou, for 4 security agents for his trip to the United States of America from 10/04/2012 to 10/05/2012: $154,864.90 (~USD 35,000).
    • 5) Identifying Patterns and Entities - Security agents' expenses: 15 Santa Fe agents, 6/18/2012 - 6/21/2012, $49,737. Callout: OCR mistakes.
    • 6) Processing Data II: Detecting the date interval, destination, number of security agents, and amount of money
    • Journalistic Case: Show Service Company - Direct Purchases
    • 1) Getting Data - Application to download more than 33,000 documents
    • Senate President's Decree
    • Senate Administration Department, e.g.
    • Senate Accounting Department, e.g.
    • Senate Expenses Structured Dataset in Excel
    • Sunday Print Edition and Infographic
    • Sunday Print Edition Infographic
    • Tableau: filters by amount and destination
    • IMPACT: VP takes a stack of printed decrees to TV
    • IMPACT: Judge reopens investigation on VP’s expenses
    • Interactive Gantt chart of reimbursed trips that overlap or never existed
    • Print Edition
    • Senate Expenses tag gathers stories that originated in this database
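
The keyword filtering described in step 4 was done in Excel. The same idea can be sketched in a few lines of Python. This is only an illustration: the function name `filter_documents`, the file names and the sample texts are all hypothetical, and the real documents would be Spanish OCR output rather than clean English.

```python
def filter_documents(docs, keywords):
    """Return {document name: matched keywords} for documents containing any keyword."""
    wanted = [k.lower() for k in keywords]
    hits = {}
    for name, text in docs.items():
        lowered = text.lower()  # case-insensitive match
        matched = [k for k in wanted if k in lowered]
        if matched:
            hits[name] = matched
    return hits


# Hypothetical stand-ins for the OCR'd TXT files.
docs = {
    "decree_001.txt": "Travel expenses for 4 Security Agents ...",
    "decree_002.txt": "Procurement tender for office supplies",
}
hits = filter_documents(docs, ["security agents", "travel"])
```

Plain substring matching is fragile against OCR mistakes, so a real pass would probably need fuzzier matching or manual review of near misses.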
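
Step 6 (detecting the date interval, number of security agents and amount of money) could be approximated with regular expressions. The sketch below is not the team's actual method. It assumes English text like the slide example, dates written as dd/mm/yyyy or mm/dd/yyyy, and amounts in the Argentine style with a dot for thousands and a comma for decimals. The names `extract_trip_fields` and `parse_amount` are hypothetical.

```python
import re
from decimal import Decimal

DATE_RE = re.compile(r"\b(\d{1,2}/\d{1,2}/\d{4})\b")
# Assumed raw format: "$ 154.864,90" (dot = thousands, comma = decimals).
AMOUNT_RE = re.compile(r"\$\s*(\d{1,3}(?:\.\d{3})*(?:,\d{1,2})?)")
AGENTS_RE = re.compile(r"\b(\d+)\s+security\s+agents?\b", re.IGNORECASE)


def parse_amount(raw):
    """Convert '154.864,90' to Decimal('154864.90')."""
    return Decimal(raw.replace(".", "").replace(",", "."))


def extract_trip_fields(text):
    """Pull the first two dates, the agent count and the first amount from text."""
    dates = DATE_RE.findall(text)
    amount = AMOUNT_RE.search(text)
    agents = AGENTS_RE.search(text)
    return {
        "start": dates[0] if len(dates) > 0 else None,
        "end": dates[1] if len(dates) > 1 else None,
        "agents": int(agents.group(1)) if agents else None,
        "amount": parse_amount(amount.group(1)) if amount else None,
    }


sample = (
    "Travel expenses requested by the President of the Senate for "
    "4 Security Agents for his trip to the United States of America "
    "from 10/04/2012 to 10/05/2012. $ 154.864,90"
)
fields = extract_trip_fields(sample)
```

Dates are kept as strings on purpose, because the day/month order cannot be determined from the text alone.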
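
The overlapping trips shown in the Gantt chart can be found by comparing date ranges pairwise. A minimal sketch, assuming all trips belong to the same traveler and each one is a hypothetical `(id, start, end)` tuple; the dates below are made up for illustration and are not taken from the dataset.

```python
from datetime import date
from itertools import combinations


def overlapping_trips(trips):
    """Return pairs of trip ids whose inclusive date ranges intersect."""
    pairs = []
    for (id_a, start_a, end_a), (id_b, start_b, end_b) in combinations(trips, 2):
        # Two ranges intersect when each one starts before the other ends.
        if start_a <= end_b and start_b <= end_a:
            pairs.append((id_a, id_b))
    return pairs


# Hypothetical trips, for illustration only.
trips = [
    ("A", date(2012, 6, 18), date(2012, 6, 21)),
    ("B", date(2012, 6, 20), date(2012, 6, 23)),
    ("C", date(2012, 7, 1), date(2012, 7, 3)),
]
overlaps = overlapping_trips(trips)
```

The pairwise comparison is quadratic in the number of trips, which is fine for one traveler's trips. Sorting by start date would scale better for larger sets.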