SlideShare une entreprise Scribd logo
1  sur  9
Pingar SharePoint NZ Idol

   For Wave to incorporate
   into Peter’s presentation
Time spent on information tasks
                             Avg. hours per week
14.5

                           = 37K year/person
        13.3


               9.6   9.5
                           8.8   8.3
                                       6.8   6.7
                                                     5.6     5.6
                                                                      4.3     4.2

                                                                                        1




                                               Source: IDC, Hidden Cost of Information (2005)
Time spent on information tasks
                             Avg. hours per week
                                                            …can be rescued!
14.5
        13.3


               9.6   9.5
                           8.8   8.3
                                       6.8   6.7
                                                     5.6     5.6
                                                                      4.3     4.2

                                                                                        1




                                                   Source: IDC, Hidden Cost of Information (2005)
Redaction example is from dysonology.wordpress.com
New Pingar API

Rapid Discovery
Related searches
Dynamic facets
Document preview


       HCIR Workshop
         20 October 2011
   Google, Mountain View
New Pingar API

        Entity Extraction
        Named entity extraction
        Taxonomy mapping
        Linked Data connectors
        Address detection
        Invoice analysis

    Mining Custom Taxonomies
                Sept 2010 – Feb 2012
NZ Ministry of Science and Innovation
      University of Waikato & Pingar
New Pingar API

Content Analysis              query
Sanitization and redaction
Offensive content filtering
Summarization                 Link to download
Report generation             an auto-generated
                              PDF report


       Exploring verticals
                      Legal
                 Bioscience
                 Education
               Government
Demo time

Contenu connexe

Similaire à Discover New Value from Unstructured Data

Linked Data: opportunities and challenges
Linked Data: opportunities and challengesLinked Data: opportunities and challenges
Linked Data: opportunities and challengesMichael Hausenblas
 
Publishing of Scientific Data - Science Foundation Ireland Summit 2010
Publishing of Scientific Data  - Science Foundation Ireland Summit 2010Publishing of Scientific Data  - Science Foundation Ireland Summit 2010
Publishing of Scientific Data - Science Foundation Ireland Summit 2010jodischneider
 
On the diversity and availability of temporal information in linked open data
On the diversity and availability of temporal information in linked open dataOn the diversity and availability of temporal information in linked open data
On the diversity and availability of temporal information in linked open dataAnisa Rula
 
Melrose
MelroseMelrose
Melrosemcruce
 
The architecture of data analytics PaaS on AWS
The architecture of data analytics PaaS on AWSThe architecture of data analytics PaaS on AWS
The architecture of data analytics PaaS on AWSTreasure Data, Inc.
 
How to Build Linked Data Sites with Drupal 7 and RDFa
How to Build Linked Data Sites with Drupal 7 and RDFaHow to Build Linked Data Sites with Drupal 7 and RDFa
How to Build Linked Data Sites with Drupal 7 and RDFascorlosquet
 
Leyline: A provenance-based desktop search
Leyline: A provenance-based desktop searchLeyline: A provenance-based desktop search
Leyline: A provenance-based desktop searchSoroush Ghorashi
 
Albert Simard - Mobilizing Knowledge: Acquisition, Analysis, and Action
Albert Simard - Mobilizing Knowledge: Acquisition, Analysis, and ActionAlbert Simard - Mobilizing Knowledge: Acquisition, Analysis, and Action
Albert Simard - Mobilizing Knowledge: Acquisition, Analysis, and ActionInstitute for Knowledge Mobilization
 
RDAP13 Mark Leggott: Stewarding research data using the Islandora framework
RDAP13 Mark Leggott: Stewarding research data using the Islandora frameworkRDAP13 Mark Leggott: Stewarding research data using the Islandora framework
RDAP13 Mark Leggott: Stewarding research data using the Islandora frameworkASIS&T
 
HEC Initiatives IT(Division) By A.Chattha
HEC Initiatives IT(Division) By A.ChatthaHEC Initiatives IT(Division) By A.Chattha
HEC Initiatives IT(Division) By A.ChatthaRaheel Raza
 
The current architecture of TYPO3 5.0
The current architecture of TYPO3 5.0The current architecture of TYPO3 5.0
The current architecture of TYPO3 5.0Robert Lemke
 
Building Optimisation using Scenario Modeling and Linked Data
Building Optimisation using Scenario Modeling and Linked DataBuilding Optimisation using Scenario Modeling and Linked Data
Building Optimisation using Scenario Modeling and Linked DataEdward Curry
 
Big Data and Content Management. SkyDox and the European Court of Human Righ...
Big Data and Content Management.  SkyDox and the European Court of Human Righ...Big Data and Content Management.  SkyDox and the European Court of Human Righ...
Big Data and Content Management. SkyDox and the European Court of Human Righ...SkyDox LTD
 
Invited talk @ DCC09 workshop
Invited talk @ DCC09 workshopInvited talk @ DCC09 workshop
Invited talk @ DCC09 workshopPaolo Missier
 
Data Management for Librarians: An Introduction
Data Management for Librarians: An IntroductionData Management for Librarians: An Introduction
Data Management for Librarians: An IntroductionGarethKnight
 
Presentation Ispass 2012 Session6 Presentation1
Presentation Ispass 2012 Session6 Presentation1Presentation Ispass 2012 Session6 Presentation1
Presentation Ispass 2012 Session6 Presentation1sairahul321
 

Similaire à Discover New Value from Unstructured Data (20)

Linked Data: opportunities and challenges
Linked Data: opportunities and challengesLinked Data: opportunities and challenges
Linked Data: opportunities and challenges
 
Publishing of Scientific Data - Science Foundation Ireland Summit 2010
Publishing of Scientific Data  - Science Foundation Ireland Summit 2010Publishing of Scientific Data  - Science Foundation Ireland Summit 2010
Publishing of Scientific Data - Science Foundation Ireland Summit 2010
 
On the diversity and availability of temporal information in linked open data
On the diversity and availability of temporal information in linked open dataOn the diversity and availability of temporal information in linked open data
On the diversity and availability of temporal information in linked open data
 
Restfs
RestfsRestfs
Restfs
 
Melrose
MelroseMelrose
Melrose
 
The architecture of data analytics PaaS on AWS
The architecture of data analytics PaaS on AWSThe architecture of data analytics PaaS on AWS
The architecture of data analytics PaaS on AWS
 
How to Build Linked Data Sites with Drupal 7 and RDFa
How to Build Linked Data Sites with Drupal 7 and RDFaHow to Build Linked Data Sites with Drupal 7 and RDFa
How to Build Linked Data Sites with Drupal 7 and RDFa
 
Leyline: A provenance-based desktop search
Leyline: A provenance-based desktop searchLeyline: A provenance-based desktop search
Leyline: A provenance-based desktop search
 
Knowledge mobilization
Knowledge mobilization Knowledge mobilization
Knowledge mobilization
 
Albert Simard - Mobilizing Knowledge: Acquisition, Analysis, and Action
Albert Simard - Mobilizing Knowledge: Acquisition, Analysis, and ActionAlbert Simard - Mobilizing Knowledge: Acquisition, Analysis, and Action
Albert Simard - Mobilizing Knowledge: Acquisition, Analysis, and Action
 
Lgd 2
Lgd 2Lgd 2
Lgd 2
 
RDAP13 Mark Leggott: Stewarding research data using the Islandora framework
RDAP13 Mark Leggott: Stewarding research data using the Islandora frameworkRDAP13 Mark Leggott: Stewarding research data using the Islandora framework
RDAP13 Mark Leggott: Stewarding research data using the Islandora framework
 
HEC Initiatives IT(Division) By A.Chattha
HEC Initiatives IT(Division) By A.ChatthaHEC Initiatives IT(Division) By A.Chattha
HEC Initiatives IT(Division) By A.Chattha
 
The current architecture of TYPO3 5.0
The current architecture of TYPO3 5.0The current architecture of TYPO3 5.0
The current architecture of TYPO3 5.0
 
Building Optimisation using Scenario Modeling and Linked Data
Building Optimisation using Scenario Modeling and Linked DataBuilding Optimisation using Scenario Modeling and Linked Data
Building Optimisation using Scenario Modeling and Linked Data
 
Big Data and Content Management. SkyDox and the European Court of Human Righ...
Big Data and Content Management.  SkyDox and the European Court of Human Righ...Big Data and Content Management.  SkyDox and the European Court of Human Righ...
Big Data and Content Management. SkyDox and the European Court of Human Righ...
 
Invited talk @ DCC09 workshop
Invited talk @ DCC09 workshopInvited talk @ DCC09 workshop
Invited talk @ DCC09 workshop
 
Data Management for Librarians: An Introduction
Data Management for Librarians: An IntroductionData Management for Librarians: An Introduction
Data Management for Librarians: An Introduction
 
Treasure Data and Heroku
Treasure Data and HerokuTreasure Data and Heroku
Treasure Data and Heroku
 
Presentation Ispass 2012 Session6 Presentation1
Presentation Ispass 2012 Session6 Presentation1Presentation Ispass 2012 Session6 Presentation1
Presentation Ispass 2012 Session6 Presentation1
 

Dernier

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 

Dernier (20)

Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 

Discover New Value from Unstructured Data

  • 1. Pingar SharePoint NZ Idol For Wave to incorporate into Peter’s presentation
  • 2. Time spent on information tasks Avg. hours per week 14.5 = 37K year/person 13.3 9.6 9.5 8.8 8.3 6.8 6.7 5.6 5.6 4.3 4.2 1 Source: IDC, Hidden Cost of Information (2005)
  • 3. Time spent on information tasks Avg. hours per week …can be rescued! 14.5 13.3 9.6 9.5 8.8 8.3 6.8 6.7 5.6 5.6 4.3 4.2 1 Source: IDC, Hidden Cost of Information (2005)
  • 4. Redaction example is from dysonology.wordpress.com
  • 5. New Pingar API Rapid Discovery Related searches Dynamic facets Document preview HCIR Workshop 20 October 2011 Google, Mountain View
  • 6. New Pingar API Entity Extraction Named entity extraction Taxonomy mapping Linked Data connectors Address detection Invoice analysis Mining Custom Taxonomies Sept 2010 – Feb 2012 NZ Ministry of Science and Innovation University of Waikato & Pingar
  • 7. New Pingar API Content Analysis query Sanitization and redaction Offensive content filtering Summarization Link to download Report generation an auto-generated PDF report Exploring verticals Legal Bioscience Education Government
  • 8.

Notes de l'éditeur

  1. Approximately 1/3 of time spent at work is on tasks that can be automated.Just the time devoted to searching and organizing documents costs an average company 37K per person per year.
  2. Pingar can’t help you write emails or create presentations. But it can rescue much of the time and costs spent on other tasks related to unstructured documents.
  3. Examples:Creating a search report without the need of visiting each link in search resultsAnalyzing thousands of emails released by Enron and detecting interesting pattern in theseA SharePoint webpart that helps users formulate their query and understand search resultsHelp editing documents, e.g. removing all the sensitive information from them (redaction, sanitization)Gathering information by summarizing documentsMetadata is information about documents that can be used for filtering search results. People avoid creating metadata – they already spent time creating documents, they don’t want to spend time organizing them. Pingar generates metadata on the fly – it automatically extracts keywords, identifies names of places, companies, people.