Digital Enterprise Research Institute                                                                         www.deri.ie




              TOWARDS EXPERTISE MODELLING FOR ROUTING DATA
                  CLEANING TASKS WITHIN A COMMUNITY OF
                          KNOWLEDGE WORKERS
                                             Umair ul Hassan, Sean O’Riain, Edward Curry
                                                                     Digital Enterprise Research Institute
                                                                     National University of Ireland, Galway




   17th International Conference on Information
   Quality (ICIQ 2012), Paris, France
 Copyright 2012 Digital Enterprise Research Institute. All rights reserved.
Agenda



         Paper Overview
         Motivation
                    Enterprise Data Landscape
                    Collaborative Data Quality
                    Human Computation
             Problem Space
                    Task Routing
                    Challenges of Push Routing
             CAMEE Prototype
                    DBpedia.org & SKOS
                    Expertise Assessment
                    Task Routing
         Experiments
         Summary



Paper Overview



             Motivation
                    Data quality management is limited to a few individuals (e.g. MDM)
                    Involve a community of users in data quality tasks
                    Tasks require expertise and domain knowledge
             Problem
                    How to assess and model human expertise
                    How to effectively route tasks to appropriate workers
             Contribution
                    Concept-based approach for modelling and assessing knowledge
                     workers’ expertise
                    Concept matching approach for routing data quality tasks
                    Prototype implementation using SKOS vocabulary








             Big Data & Information Quality

             MOTIVATION


Enterprise Data Landscape



             Enterprises will have to deal with much more data in future

             The Reality: all data relevant to the enterprise and its operations
             The Known: relevant external data
             The Managed: enterprise data, directly managed by the enterprise and its departments
             Collaboratively Managed: MDM, reference data managed through well-defined
              policies and a governance council

                                                                         “The data deluge,” The Economist, Feb-2010



Collaborative Data Quality


             [Figure: data sources pass through data quality algorithms (developers, data
              governance) and human computation (external crowd, internal community) to
              yield clean data]




Human Computation



             Solve computationally hard problems with help of humans
             Algorithms control human workers
             Computation is carried out by Humans
             [Figure: the developer defines the algorithm; workers compute]




* Barowy et al., “AutoMan: a platform for integrating human-based and digital computation,” OOPSLA ’12



Human Computation




             [Figure: Input -> Task Router (before computation; our focus) -> Task Design
              (during computation) -> Output Aggregation (after computation) -> Output]




* Edith Law and Luis von Ahn, Human Computation - Core Research Questions and State of the Art







             Challenges of Task Routing in Collaborative Data Quality

             PROBLEM SPACE


Task Routing




             Pull Routing
                    System provides an interface to support workers
                    Workers actively seek tasks and assign to themselves

             [Figure: the algorithm publishes tasks through a search and browse interface;
              workers select tasks and return results]




* www.mturk.com



Task Routing




             Push Routing
                    System has complete control over assignment of tasks
                       – Based on criteria such as expertise, cost, and latency
                    Workers passively receive tasks
             [Figure: the algorithm assigns tasks to workers through a task interface;
              workers return results]



* www.mobileworks.com



Challenges of Push Routing




             Workers have different domain knowledge and expertise

               1.      How to represent expertise required for task?


               2.      How to assess and represent expertise of workers?


               3.      How to match a task with expertise of workers?




CAMEE: Collaborative Management of Enterprise Entities




             Leverages concepts from data to build expertise profiles

                    Data and Concepts: use concepts from the data sources
                    Task: associate data concepts with tasks
                    Expertise: profile worker expertise against concepts
                    Routing: leverage profiles for making routing decisions



CAMEE
             Input: dirty dataset
                    dbp-res:X-Men:_First_Class
                        rdf:type dbp-owl:Film ;
                        foaf:name "X-Men: First Class"@en ;
                        dbp-prop:budget "9600.0" ;
                        dbp-owl:distributor dbp-res:20th_Century_Fox .
             Candidate update from the data quality algorithms
                    dbp-res:X-Men:_First_Class dbp-prop:released "25-05-2011" .
             Components
                    Data Quality Algorithms
                    Task Manager (Task Model)
                    Crowd Manager (Expertise Model)
                    Routing Model
                    Feedback Manager
             Steps
                    1) Update & Concepts
                    2) Assessment
                    3) Expertise
                    4) Task
                    5) Task UI
                    6) Response
             Crowd
                    Worker1 (Sci-Fi, Action, Adventure)
                    Worker2 (Drama, Action, Thriller)
             Example task
                    Question: Was the film “X-Men: First Class” released on 25 May 2011?
                    Response: True
             Output: clean dataset
                    dbp-res:X-Men:_First_Class
                        rdf:type dbp-owl:Film ;
                        foaf:name "X-Men: First Class"@en ;
                        dbp-prop:budget "9600.0" ;
                        dbp-owl:distributor dbp-res:20th_Century_Fox ;
                        dbp-prop:released "25-05-2011" .
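The X-Men example suggests how a candidate update can become a yes/no verification task. The following is a minimal sketch of that step, not CAMEE's actual code: the `TEMPLATES` table, the `make_question` helper, and the assumption that the date literal is in day-month-year order are all hypothetical.

```python
from datetime import datetime

# Hypothetical question templates keyed by property (an assumption for illustration).
TEMPLATES = {
    "dbp-prop:released": 'Was the film "{name}" released on {value}?',
}


def make_question(name, prop, raw_value):
    """Turn a candidate property update into a yes/no verification question."""
    value = raw_value
    if prop == "dbp-prop:released":
        # Assumes the literal is written as day-month-year, as in "25-05-2011".
        parsed = datetime.strptime(raw_value, "%d-%m-%Y")
        value = f"{parsed.day} {parsed.strftime('%B')} {parsed.year}"
    return TEMPLATES[prop].format(name=name, value=value)


question = make_question("X-Men: First Class", "dbp-prop:released", "25-05-2011")
print(question)
```

A worker's "True" response would then confirm the update before it is written to the clean dataset.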




             SKOS Concepts based Implementation of CAMEE

             PROTOTYPE


Challenges of Push Routing




             Workers have different domain knowledge and expertise

               1.      How to represent expertise required for task?
                      –      DBpedia & SKOS Concepts


               2.      How to assess and represent expertise of workers?
                      –      Expertise based on Self/Task Assessment


               3.      How to match a task with expertise of workers?
                      –      Task Routing based on Matching




DBpedia & SKOS




             DBpedia.org
                    Structured Database from Wikipedia Facts


             Simple Knowledge Organization System
                    Common model for knowledge organization
                       – Facilitate interoperability
                       – Machine readability
                    “Concept” is basic element
                       – Identified by URI and represented with RDF
                    Defines concept schemes
                       – Hierarchical and associative relationships

* www.dbpedia.org, www.w3.org/2004/02/skos/
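The hierarchical relationships in a SKOS concept scheme can be pictured as links from each concept to its broader concepts. The sketch below models this with a plain dictionary rather than an RDF library; the `BROADER` mapping is a toy hierarchy invented for illustration, not copied from DBpedia, and `ancestors` is a hypothetical helper.

```python
# Toy concept scheme: concept URI -> list of broader concept URIs (skos:broader).
# The hierarchy is illustrative only and is not taken from DBpedia.
DBP_CAT = "http://dbpedia.org/resource/Category:"

BROADER = {
    DBP_CAT + "American_biographical_films": [
        DBP_CAT + "Biographical_films",
        DBP_CAT + "American_films",
    ],
    DBP_CAT + "Biographical_films": [DBP_CAT + "Films_by_genre"],
    DBP_CAT + "American_films": [DBP_CAT + "Films_by_country"],
}


def ancestors(concept, broader=BROADER):
    """Return every concept reachable by following broader links."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in broader.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


result = ancestors(DBP_CAT + "American_biographical_films")
print(sorted(result))
```

Walking the hierarchy this way is one possible route to relating a task's concepts to nearby concepts in a worker's profile.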



Example Entity




* http://dbpedia.org/resource/A_Beautiful_Mind_(film)



CAMEE with SKOS

             Source Data
                    Entity: A Beautiful Mind
                    Properties & Values:
                       – dbpedia-owl:Work/runtime: 135.0
                       – dbpedia-owl:director: dbpedia:Ron_Howard
                       – dbpedia-owl:producer: dbpedia:Ron_Howard, dbpedia:Brian_Grazer
                       – dbpedia-owl:starring: dbpedia:Ed_Harris, dbpedia:Russell_Crowe
                    SKOS Concepts: American_biographical_films, Films_set_in_the_1950s
             Data Quality Algorithm
                    Update: Missing Value
                       – dbpedia-owl:writer = dbpedia:Akiva_Goldsman
                    SKOS Concepts: American_biographical_films, Films_set_in_the_1950s
             Task Model
                    Task: Confirm Missing Value
                       – Did Akiva Goldsman write the movie "A Beautiful Mind"?
                    SKOS Concepts: American_biographical_films, Films_set_in_the_1950s
             Workers & Expertise Model
                    Worker Expertise (SKOS Concepts):
                       – Films_set_in_the_1950s (Good)
                       – Films_about_psychiatry (Poor)
                       – American_drama_films (Fair)
             Routing Model
                    Task Routing (Match):
                       – American_biographical_films
                       – American_drama_films (Fair)




Expertise Assessment



             Build profiles of workers
                    To quantify expertise or knowledge levels of workers against concepts

             Two Approaches
                    Self-Assessment: Workers provide self-assessment of knowledge for each concept
                    Task Assessment: Workers provide responses to assessment tasks



             Expertise profiles take the form of a matrix E(C,W)
                    where C is the set of concepts and W is the set of workers
                                        Concept                          Worker 1   Worker 2    Worker 3
                                        1990s_comedy-drama_films           0.6         0.2        0.2
                                        Films_about_psychiatry             0.6         0.2        0.6
                                        American_biographical_films        0.8         0.4        0.4
                                        American_comedy-drama_films        0.8         0.6        0.6
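The matrix above can be held as a nested mapping and used to rank workers for a task. This is a sketch under an assumed scoring rule: summing a worker's scores over the task's concepts is an illustration, not necessarily the matching function used in the prototype, and `rank_workers` is a hypothetical helper.

```python
# Expertise matrix E(C, W) from the slide: concept -> worker -> score.
E = {
    "1990s_comedy-drama_films":    {"Worker 1": 0.6, "Worker 2": 0.2, "Worker 3": 0.2},
    "Films_about_psychiatry":      {"Worker 1": 0.6, "Worker 2": 0.2, "Worker 3": 0.6},
    "American_biographical_films": {"Worker 1": 0.8, "Worker 2": 0.4, "Worker 3": 0.4},
    "American_comedy-drama_films": {"Worker 1": 0.8, "Worker 2": 0.6, "Worker 3": 0.6},
}


def rank_workers(task_concepts, profiles):
    """Rank workers by summed expertise over the task's concepts (assumed rule)."""
    totals = {}
    for concept in task_concepts:
        for worker, score in profiles.get(concept, {}).items():
            totals[worker] = totals.get(worker, 0.0) + score
    # Highest total first; ties are broken by worker name for determinism.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_workers(["American_biographical_films", "Films_about_psychiatry"], E)
for worker, total in ranking:
    print(worker, round(total, 2))
```

For this task, Worker 1 ranks first, followed by Worker 3 and then Worker 2, so a push router using this rule would assign the task to Worker 1.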




Example Self-Assessment




Example Task-Assessment




Task Routing




        








             Leveraging Expertise Profiles for Task routing

             EXPERIMENTS


Experiment




             Hypothesis
                    Data quality tasks routed using concept-based expertise profiles
                     have higher response rates if the expertise model is built using a
                     task-assessment approach rather than a self-assessment approach.


             Two stages of experiment
                    Assessment Stage (build profiles)
                    Routing Stage (leverage profiles)
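The assessment stage can be sketched as turning a worker's responses into per-concept scores. The slides do not state the scoring rule, so the one below is an assumption made for illustration: a concept's score is the share of assessment tasks tagged with that concept that the worker answered with anything other than "Don't know". The `build_profile` helper and the sample responses are hypothetical.

```python
from collections import defaultdict


def build_profile(responses):
    """Build one worker's profile from (task_concepts, answer) pairs.

    Assumed rule: score = answered tasks / seen tasks, per concept,
    where "Don't know" counts as not answered.
    """
    answered = defaultdict(int)
    seen = defaultdict(int)
    for concepts, answer in responses:
        for concept in concepts:
            seen[concept] += 1
            if answer != "Don't know":
                answered[concept] += 1
    return {concept: answered[concept] / seen[concept] for concept in seen}


# Hypothetical assessment-stage responses for a single worker.
responses = [
    (["American_biographical_films"], "Agree"),
    (["American_biographical_films", "Films_about_psychiatry"], "Don't know"),
    (["Films_about_psychiatry"], "Don't know"),
    (["American_biographical_films"], "Strongly Disagree"),
]

profile = build_profile(responses)
print(profile)
```

The resulting scores would fill one column of the expertise matrix, which the routing stage then consults.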




Dataset




             Popular Movies in DBpedia
                    Top 100 grossing movies in Hollywood and Bollywood

                              Characteristic                  Value

                              Number of entities (dbp:Film)   724

                              No. of concepts (film genres)   42

                              No. of data quality tasks       230


             Knowledge Workers
                              Characteristic                  Value

                              No. of Workers                  11

                              Tasks for Assessment Stage      100

                              Tasks for Routing Stage         130




Response Rate




             Hypothesis
                    Data quality tasks routed using concept-based expertise profiles
                     have higher response rates if the expertise model is built using a
                     task-assessment approach rather than a self-assessment approach.

             Data
                           Routing (Assessment)   Random    Matching (Self-Assessment)   Matching (Task Assessment)
                           Don't know             71.54%    58.46%                       10.00%
                           Strongly Disagree      5.38%     16.92%                       29.23%
                           Disagree               6.92%     2.31%                        13.08%
                           Neutral                2.31%     2.31%                        8.46%
                           Agree                  3.85%     4.62%                        12.31%
                           Strongly Agree         10.00%    15.38%                       26.92%
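The response rate for each routing method can be read off the table, assuming that any answer other than "Don't know" counts as a response. That definition is an assumption made for this sketch; the percentages themselves are the ones reported above.

```python
# Share of "Don't know" answers per routing method, from the table (percent).
DONT_KNOW = {
    "Random": 71.54,
    "Matching (Self-Assessment)": 58.46,
    "Matching (Task Assessment)": 10.00,
}

# Assumed definition: response rate = 100% minus the "Don't know" share.
response_rate = {
    method: round(100.0 - share, 2) for method, share in DONT_KNOW.items()
}

for method, rate in response_rate.items():
    print(f"{method}: {rate:.2f}%")
```

Under this reading, the rates are 28.46% for random routing, 41.54% for matching on self-assessment profiles, and 90.00% for matching on task-assessment profiles, which is consistent with the hypothesis.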




Assessment Effort




             Combine self-assessment with task assessment
                    Filtering assessment tasks based on self-rated concepts reduces effort
                     required during assessment
                    [Figure: bar chart of effort (average decisions per worker, axis 0 to 150) by
                     assessment method for expertise profiling: RND, SA, TA, SA&TA, SA&TA (Poor+),
                     SA&TA (Fair+), SA&TA (Good+), SA&TA (Excellent). For example, SA&TA (Good+)
                     filters tasks to concepts with a self-rating of Good or higher.]
                    SA: Self-Assessment
                    TA: Task Assessment



Task Routing




             Likelihood of response and quality of response remain near
              maximum during routing stage

                    [Figure: response rate and accuracy (axis 0% to 100%) by assessment method
                     for expertise profiling: RND, SA, TA, SA&TA, SA&TA (Poor+), SA&TA (Fair+),
                     SA&TA (Good+), SA&TA (Excellent). For example, SA&TA (Good+) filters tasks
                     to concepts with a self-rating of Good or higher.]
                    SA: Self-Assessment
                    TA: Task Assessment



Summary




             Conclusion
                    Effective task routing is a fundamental aspect of collaborative data
                     quality management
                    Concepts are effective for expertise assessment and modelling
                    Task routing that leverages Task Assessment based profiles has a better
                     likelihood of response from workers


             Future Directions
                    Load balancing under constraints
                       – Cost, Latency, Motivation, Expertise, Utility
                    Trade-off between assessment for profiling and exploitation



Further Reading




                            17th International Conference on Information Quality (ICIQ 2012)
                                                Paris, 16-17 November 2012




        U. Ul Hassan, S. O’Riain, and E. Curry, “Towards Expertise Modelling for Routing Data
        Cleaning Tasks within a Community of Knowledge Workers,” in 17th International
        Conference on Information Quality - ICIQ’12, 2012.




        http://www.deri.ie/about/team/member/umair_ul_hassan/



Selected References



             Big Data & Data Quality
                    S. Lavalle, E. Lesser, R. Shockley, M. S. Hopkins, and N. Kruschwitz, “Big Data, Analytics and the
                     Path from Insights to Value,” MIT Sloan Management Review, vol. 52, no. 2, pp. 21–32, 2011.
                    A. Haug and J. S. Arlbjørn, “Barriers to master data quality,” Journal of Enterprise Information
                     Management, vol. 24, no. 3, pp. 288–303, 2011.
                    R. Silvola, O. Jaaskelainen, H. Kropsu-Vehkapera, and H. Haapasalo, “Managing one master data –
                     challenges and preconditions,” Industrial Management & Data Systems, vol. 111, no. 1,
                     pp. 146–162, 2011.
                    E. Curry, S. Hasan, and S. O’Riain, “Enterprise Energy Management using a Linked Dataspace for
                     Energy Intelligence,” in Second IFIP Conference on Sustainable Internet and ICT for
                     Sustainability, 2012.
                    D. Loshin, Master Data Management. San Francisco, CA, USA: Morgan Kaufmann Publishers
                     Inc., 2008.
                    B. Otto and A. Reichert, “Organizing Master Data Management: Findings from an Expert Survey,” in
                     Proceedings of the 2010 ACM Symposium on Applied Computing - SAC ’10, 2010, pp. 106–110.




Selected References



             Collective Intelligence, Crowdsourcing & Human Computation
                    E. Curry, A. Freitas, and S. O’Riain, “The Role of Community-Driven Data Curation for Enterprises,”
                     in Linking Enterprise Data, D. Wood, Ed. Boston, MA: Springer US, 2010, pp. 25–47.
                    A. Doan, R. Ramakrishnan, and A. Y. Halevy, “Crowdsourcing systems on the World-Wide Web,”
                     Communications of the ACM, vol. 54, no. 4, p. 86, Apr. 2011.
                    E. Law and L. von Ahn, “Human Computation,” Synthesis Lectures on Artificial Intelligence and
                     Machine Learning, vol. 5, no. 3, pp. 1–121, Jun. 2011.
                    M. J. Franklin, D. Kossmann, T. Kraska, S. Ramesh, and R. Xin, “CrowdDB: Answering Queries with
                     Crowdsourcing,” in Proceedings of the 2011 international conference on Management of data -
                     SIGMOD ’11, 2011, p. 61.
                    P. Wichmann, A. Borek, R. Kern, P. Woodall, A. K. Parlikad, and G. Satzger, “Exploring the ‘Crowd’
                     as Enabler of Better Information Quality,” in Proceedings of the 16th International Conference on
                     Information Quality, 2011, pp. 302–312.







             Expert Finding
                    K. Balog, L. Azzopardi, and M. de Rijke, “Formal models for expert finding in enterprise corpora,” in
                     Proceedings of the 29th annual international ACM SIGIR conference on Research and development
                     in information retrieval - SIGIR ’06, 2006, p. 43.
                    K. Balog, T. Bogers, L. Azzopardi, M. de Rijke, and A. van den Bosch, “Broad expertise retrieval in
                     sparse data environments,” in Proceedings of the 30th annual international ACM SIGIR conference
                     on Research and development in information retrieval - SIGIR ’07, 2007, p. 551.
                    K. Balog and M. de Rijke, “Determining expert profiles (with an application to expert finding),” in
                     Proceedings of the 20th international joint conference on Artificial intelligence, 2007, pp. 2657–2662.







             Linked Data & User Feedback
                    S. O’Riain, E. Curry, and A. Harth, “XBRL and open data for global financial ecosystems: A linked
                     data approach,” International Journal of Accounting Information Systems, Mar. 2012.
                    U. Ul Hassan, S. O’Riain, and E. Curry, “Leveraging Matching Dependencies for Guided User
                     Feedback in Linked Data Applications,” in 9th International Workshop on Information Integration on
                     the Web IIWeb2012, 2012.
                    A. Miles and J. R. Pérez-Agüera, “SKOS: Simple Knowledge Organisation for the Web,” Cataloging &
                     Classification Quarterly, vol. 43, no. 3–4, pp. 69–83, Apr. 2007.
                    C. Bizer, J. Lehmann, G. Kobilarov, S. Auer, C. Becker, R. Cyganiak, and S. Hellmann, “DBpedia - A
                     crystallization point for the Web of Data,” Web Semantics: Science, Services and Agents on the
                     World Wide Web, vol. 7, no. 3, pp. 154–165, Sep. 2009.
                    S. R. Jeffery, M. J. Franklin, and A. Y. Halevy, “Pay-as-you-go user feedback for dataspace systems,”
                     in Proceedings of the 2008 ACM SIGMOD international conference on Management of data -
                     SIGMOD ’08, 2008, pp. 847–860.





Towards Expertise Modelling for Routing Data Cleaning Tasks within a Community of Knowledge Workers

  • 1. Digital Enterprise Research Institute www.deri.ie TOWARDS EXPERTISE MODELLING FOR ROUTING DATA CLEANING TASKS WITHIN A COMMUNITY OF KNOWLEDGE WORKERS Umair ul Hassan, Sean O’Riain, Edward Curry Digital Enterprise Research Institute National University of Ireland, Galway 17th International Conference on Information Quality (ICIQ 2012), Paris, France Copyright 2012 Digital Enterprise Research Institute. All rights reserved.
  • 2. Agenda
      - Paper Overview
      - Motivation
          - Enterprise Data Landscape
          - Collaborative Data Quality
          - Human Computation
      - Problem Space
          - Task Routing
          - Challenges of Push Routing
      - CAMEE Prototype
          - DBpedia.org & SKOS
          - Expertise Assessment
          - Task Routing
      - Experiments
      - Summary
  • 3. Paper Overview
      - Motivation
          - Data quality management is limited to a few individuals (e.g. MDM)
          - Involve a community of users in data quality tasks
          - Tasks require expertise and domain knowledge
      - Problem
          - How to assess and model human expertise
          - How to effectively route tasks to appropriate workers
      - Contribution
          - Concept-based approach for modelling and assessing knowledge workers’ expertise
          - Concept matching approach for routing data quality tasks
          - Prototype implementation using the SKOS vocabulary
  • 4. MOTIVATION: Big Data & Information Quality
  • 5. Enterprise Data Landscape
      - Enterprises will have to deal with much more data in future
      - The Reality (Relevant External Data): all data relevant to the enterprise and its operations
      - The Known (Managed Enterprise Data): data directly managed by the enterprise and its departments
      - The Managed (Collaboratively Managed): reference data managed through well-defined MDM policies and a governance council
      * “The data deluge,” The Economist, Feb-2010
  • 6. Collaborative Data Quality
      [Diagram: Data Sources, Data Quality Algorithms, Human Computation, Clean Data; participants: Developers, Data Governance, External Crowd, Internal Community]
  • 7. Human Computation
      - Solve computationally hard problems with the help of humans
      - Algorithms control human workers
      - Computation is carried out by humans
      [Diagram: Developer, Algorithm, Workers; labels: Define, Compute]
      * Barowy et al., “AutoMan: a platform for integrating human-based and digital computation,” OOPSLA ’12
  • 8. Human Computation
      [Diagram: Input, Task Design (before computation), Task Router (during computation), Output Aggregation (after computation), Output; “Our Focus” marks the Task Router]
      * Edith Law and Luis von Ahn, Human Computation - Core Research Questions and State of the Art
  • 9. PROBLEM SPACE: Challenges of Task Routing in Collaborative Data Quality
  • 10. Task Routing
      - Pull Routing
          - System provides an interface to support workers
          - Workers actively seek tasks and assign them to themselves
      [Diagram: Algorithm, Tasks, Search & Browse Interface, Workers; labels: Select, Result]
      * www.mturk.com
  • 11. Task Routing
      - Push Routing
          - System has complete control over the assignment of tasks, based on criteria such as expertise, cost, and latency
          - Workers passively receive tasks
      [Diagram: Algorithm, Tasks, Task Interface, Workers; labels: Assign, Result]
      * www.mobileworks.com
  • 12. Challenges of Push Routing
      - Workers have different domain knowledge and expertise
          1. How to represent the expertise required for a task?
          2. How to assess and represent the expertise of workers?
          3. How to match a task with the expertise of workers?
  • 13. CAMEE: Collaborative Management of Enterprise Entities
      - Leverages concepts from data to build expertise profiles
          - Data Concepts: use concepts from the data sources
          - Task: associate data concepts with tasks
          - Expertise: profile worker expertise against concepts
          - Routing: leverage profiles for making routing decisions
  • 14. CAMEE
      [Architecture diagram: Input (Dirty Dataset), Data Quality Algorithms, Task Manager, Task Model, Expertise Model, Routing Model, Crowd Manager, UI, Feedback Manager, Crowd, Output (Clean Dataset)]
      - Flow labels: 1) Update & Concepts, 2) Assessment, 3) Expertise, 4) Task, 5) Task, 6) Response
      - Crowd: Worker1 (Sci-Fi, Action, Adventure), Worker2 (Drama, Action, Thriller)
      - Example entity (shown for both the input and the output dataset):
          dbp-res:X-Men:_First_Class
              rdf:type dbp-owl:Film ;
              foaf:name "X-Men: First Class"@en ;
              dbp-prop:released "25-05-2011" ;
              dbp-prop:budget "9600.0" ;
              dbp-owl:distributor dbp-res:20th_Century_Fox .
      - Example task: Was film “X-Men: First Class” released on 25 May 2011? Response: True
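The step from a flagged property value to a yes/no verification task can be sketched as below. This is only an illustration under stated assumptions: the `Task` class, the `TEMPLATES` table, and `make_task` are hypothetical names, not part of the CAMEE prototype, and the question wording is a simple template rather than the authors' task design.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A yes/no verification task about one property value (hypothetical structure)."""
    subject: str
    prop: str
    value: str
    question: str


# Hypothetical question templates keyed by property (an assumption for this sketch).
TEMPLATES = {
    "dbp-prop:released": 'Was film "{name}" released on {value}?',
}


def make_task(subject, name, prop, value):
    """Build a verification question for a single (subject, property, value) triple."""
    # Fall back to a generic question when no property-specific template exists.
    template = TEMPLATES.get(prop, 'Is "{value}" the correct {prop} for "{name}"?')
    question = template.format(name=name, value=value, prop=prop)
    return Task(subject, prop, value, question)


task = make_task(
    "dbp-res:X-Men:_First_Class",
    "X-Men: First Class",
    "dbp-prop:released",
    "25-05-2011",
)
print(task.question)
```

A worker's True/False response to such a task would then feed the feedback step that confirms or rejects the value.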
  • 15. PROTOTYPE: SKOS Concepts based Implementation of CAMEE
  • 16. Challenges of Push Routing
      - Workers have different domain knowledge and expertise
          1. How to represent the expertise required for a task?
              - DBpedia & SKOS concepts
          2. How to assess and represent the expertise of workers?
              - Expertise based on self-assessment or task assessment
          3. How to match a task with the expertise of workers?
              - Task routing based on matching
  • 17. DBpedia & SKOS
      - DBpedia.org
          - Structured database built from Wikipedia facts
      - Simple Knowledge Organization System (SKOS)
          - Common model for knowledge organization
              - Facilitates interoperability
              - Machine readability
          - “Concept” is the basic element
              - Identified by a URI and represented with RDF
          - Defines concept schemes
              - Hierarchical and associative relationships
      * www.dbpedia.org, www.w3.org/2004/02/skos/
  • 18. Example Entity
      * http://dbpedia.org/resource/A_Beautiful_Mind_(film)
  • 19. CAMEE with SKOS
      - Source Data
          - Entity: A Beautiful Mind
          - Properties & values:
              dbpedia-owl:Work/runtime  135.0
              dbpedia-owl:director      dbpedia:Ron_Howard
              dbpedia-owl:producer      dbpedia:Ron_Howard, dbpedia:Brian_Graze
              dbpedia-owl:starring      dbpedia:Ed_Harris, dbpedia:Russell_Crowe
      - Data Quality Algorithm
          - Update: Missing Value
          - dbpedia-owl:writer = dbpedia:Akiva_Goldsman
          - SKOS Concepts: American_biographical_films, Films_set_in_the_1950s
      - Task Model
          - Task: Confirm Missing Value
          - Did Akiva Goldsman write the movie "A Beautiful Mind"?
          - SKOS Concepts: American_biographical_films, Films_set_in_the_1950s
      - Worker Expertise (Workers & Expertise Model)
          - SKOS Concepts: Films_set_in_the_1950s (Good), American_biographical_films, Films_about_psychiatry (Poor), American_drama_films (Fair)
      - Task Routing (Routing Model)
          - Match on SKOS Concepts: American_biographical_films, Films_set_in_the_1950s
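The match between a task's SKOS concepts and a worker's rated concepts can be sketched as follows. This is a minimal illustration, not the prototype's scoring method: the numeric weights in `RATING_WEIGHT` and the `match_score` helper are assumptions, and the profile uses only the three rated concepts from the example above.

```python
# Hypothetical numeric weights for the ordinal ratings (an assumption for this sketch).
RATING_WEIGHT = {"Poor": 0.25, "Fair": 0.5, "Good": 0.75, "Excellent": 1.0}


def match_score(task_concepts, worker_profile):
    """Average rating weight over the task's concepts; unrated concepts count as 0."""
    if not task_concepts:
        return 0.0
    total = sum(
        RATING_WEIGHT.get(worker_profile.get(concept), 0.0)
        for concept in task_concepts
    )
    return total / len(task_concepts)


task_concepts = ["American_biographical_films", "Films_set_in_the_1950s"]
worker_profile = {
    "Films_set_in_the_1950s": "Good",
    "Films_about_psychiatry": "Poor",
    "American_drama_films": "Fair",
}

score = match_score(task_concepts, worker_profile)
print(score)
```

Averaging over the task's concepts keeps scores comparable between tasks with different numbers of concepts; other aggregations (maximum, weighted sum) would be equally plausible choices.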
  • 20. Expertise Assessment
      - Build profiles of workers
          - To quantify the expertise or knowledge levels of workers against concepts
      - Two Approaches
          - Self-Assessment: workers provide a self-assessment of their knowledge for each concept
          - Task Assessment: workers provide responses to assessment tasks
      - Expertise profiles in the form of a matrix E(C,W)
          - where C is the set of concepts and W is the set of workers

          Concept                        Worker 1   Worker 2   Worker 3
          1990s_comedy-drama_films       0.6        0.2        0.2
          Films_about_psychiatry         0.6        0.2        0.6
          American_biographical_films    0.8        0.4        0.4
          American_comedy-drama_films    0.8        0.6        0.6
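One way to use the matrix E(C,W) for routing is to rank workers by their average expertise over a task's concepts. The sketch below uses the values from the table; the mean-based scoring rule and the `rank_workers` helper are assumptions made for illustration, not the ranking method of the paper.

```python
# Expertise matrix E(C, W) from the table: concept -> {worker: expertise level}.
E = {
    "1990s_comedy-drama_films":    {"Worker 1": 0.6, "Worker 2": 0.2, "Worker 3": 0.2},
    "Films_about_psychiatry":      {"Worker 1": 0.6, "Worker 2": 0.2, "Worker 3": 0.6},
    "American_biographical_films": {"Worker 1": 0.8, "Worker 2": 0.4, "Worker 3": 0.4},
    "American_comedy-drama_films": {"Worker 1": 0.8, "Worker 2": 0.6, "Worker 3": 0.6},
}


def rank_workers(task_concepts, matrix):
    """Rank workers by mean expertise over the task's concepts (assumed scoring rule)."""
    known = [c for c in task_concepts if c in matrix]
    if not known:
        return []
    workers = sorted({w for c in known for w in matrix[c]})
    scores = {
        w: sum(matrix[c].get(w, 0.0) for c in known) / len(known)
        for w in workers
    }
    # Highest score first; ties keep alphabetical worker order.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_workers(["Films_about_psychiatry", "American_biographical_films"], E)
for worker, score in ranking:
    print(worker, round(score, 2))
```

For a task tagged with Films_about_psychiatry and American_biographical_films, this ordering places Worker 1 first, then Worker 3, then Worker 2.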
  • 21. Example Self-Assessment
  • 22. Example Task-Assessment
  • 23. Task Routing
  • 24. EXPERIMENTS: Leveraging Expertise Profiles for Task Routing
  • 25. Experiment
      - Hypothesis
          - Data quality tasks routed using concept-based expertise profiles have higher response rates if the expertise model is built using a task-assessment approach rather than a self-assessment approach.
      - Two stages of experiment
          - Assessment Stage (build profiles)
          - Routing Stage (leverage profiles)
  • 26. Dataset
      - Popular movies in DBpedia
          - Top 100 grossing movies in Hollywood and Bollywood

          Characteristic                    Value
          Number of entities (dbp:Film)     724
          No. of concepts (film genres)     42
          No. of data quality tasks         230

      - Knowledge Workers

          Characteristic                    Value
          No. of Workers                    11
          Tasks for Assessment Stage        100
          Tasks for Routing Stage           130
  • 27. Response Rate
      - Hypothesis
          - Data quality tasks routed using concept-based expertise profiles have higher response rates if the expertise model is built using a task-assessment approach rather than a self-assessment approach.
      - Data, by Routing (Assessment) method

          Response             Random     Matching (Self-Assessment)   Matching (Task Assessment)
          Don't know           71.54%     58.46%                       10.00%
          Strongly Disagree     5.38%     16.92%                       29.23%
          Disagree              6.92%      2.31%                       13.08%
          Neutral               2.31%      2.31%                        8.46%
          Agree                 3.85%      4.62%                       12.31%
          Strongly Agree       10.00%     15.38%                       26.92%
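The table can be read as response rates by subtracting the "Don't know" share from 100%. This reading is an assumption (the slide does not define response rate explicitly); the short sketch below only makes the arithmetic concrete.

```python
# Share of "Don't know" answers per routing method, taken from the table (percent).
dont_know = {
    "Random": 71.54,
    "Matching (Self-Assessment)": 58.46,
    "Matching (Task Assessment)": 10.00,
}

# Assumption: response rate = 100% minus the "Don't know" share.
response_rate = {
    method: round(100.0 - share, 2) for method, share in dont_know.items()
}

for method, rate in response_rate.items():
    print(f"{method}: {rate:.2f}%")
```

Under this reading, routing on task-assessment profiles yields roughly 90% answered tasks, against about 42% for self-assessment profiles and about 28% for random routing.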
  • 28. Assessment Effort
      - Combine self-assessment with task assessment
      - Filtering assessment tasks based on self-rated concepts reduces the effort required during assessment
          - For example, filter tasks with concepts of Good or higher self-rating
      [Chart: Effort (average decisions per worker), scale 0 to 150, by Assessment Method for Expertise Profiling: RND, SA, TA, and SA&TA filtered at Poor+, Fair+, Good+, Excellent]
      - SA: Self-Assessment, TA: Task Assessment
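The combined SA&TA approach, where assessment tasks are filtered by a worker's self-ratings, can be sketched as below. The rating scale order, the task dictionaries, and the `filter_assessment_tasks` helper are assumptions chosen for illustration; the slide only states the idea of filtering at a threshold such as Good or higher.

```python
# Ordinal self-rating scale; the ordering is an assumption based on the chart labels.
SCALE = ["Poor", "Fair", "Good", "Excellent"]


def filter_assessment_tasks(tasks, self_ratings, threshold="Good"):
    """Keep tasks with at least one concept self-rated at or above the threshold."""
    minimum = SCALE.index(threshold)

    def qualifies(concept):
        rating = self_ratings.get(concept)
        return rating in SCALE and SCALE.index(rating) >= minimum

    return [t for t in tasks if any(qualifies(c) for c in t["concepts"])]


# Hypothetical assessment tasks and one worker's self-ratings.
tasks = [
    {"id": 1, "concepts": ["American_biographical_films"]},
    {"id": 2, "concepts": ["Films_about_psychiatry"]},
    {"id": 3, "concepts": ["American_drama_films", "Films_set_in_the_1950s"]},
]
self_ratings = {
    "American_biographical_films": "Excellent",
    "Films_about_psychiatry": "Poor",
    "Films_set_in_the_1950s": "Good",
}

selected = filter_assessment_tasks(tasks, self_ratings)
print([t["id"] for t in selected])
```

Raising the threshold shrinks the set of assessment tasks a worker has to answer, which is the effort reduction the chart describes.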
  • 29. Task Routing
      - Likelihood of response and quality of response remain near maximum during the routing stage
          - For example, filter tasks with concepts of Good or higher self-rating
      [Chart: Response Rate and Accuracy, scale 0% to 100%, by Assessment Method for Expertise Profiling: RND, SA, TA, and SA&TA filtered at Poor+, Fair+, Good+, Excellent]
      - SA: Self-Assessment, TA: Task Assessment
  • 30. Summary
      - Conclusion
          - Effective task routing is a fundamental aspect of collaborative data quality management
          - Concepts are effective for expertise assessment and modelling
          - Task routing that leverages task-assessment-based profiles has a better likelihood of response from workers
      - Future Directions
          - Load balancing under constraints: cost, latency, motivation, expertise, utility
          - Trade-off between assessment for profiling and exploitation
  • 31. Further Reading
      - 17th International Conference on Information Quality (ICIQ 2012), Paris, 16-17 November 2012
      - U. Ul Hassan, S. O’Riain, and E. Curry, “Towards Expertise Modelling for Routing Data Cleaning Tasks within a Community of Knowledge Workers,” in 17th International Conference on Information Quality - ICIQ ’12, 2012.
      - http://www.deri.ie/about/team/member/umair_ul_hassan/
  • 32. Selected References: Big Data & Data Quality
      - S. Lavalle, E. Lesser, R. Shockley, M. S. Hopkins, and N. Kruschwitz, “Big Data, Analytics and the Path from Insights to Value,” MIT Sloan Management Review, vol. 52, no. 2, pp. 21–32, 2011.
      - A. Haug and J. S. Arlbjørn, “Barriers to master data quality,” Journal of Enterprise Information Management, vol. 24, no. 3, pp. 288–303, 2011.
      - R. Silvola, O. Jaaskelainen, H. Kropsu-Vehkapera, and H. Haapasalo, “Managing one master data – challenges and preconditions,” Industrial Management & Data Systems, vol. 111, no. 1, pp. 146–162, 2011.
      - E. Curry, S. Hasan, and S. O’Riain, “Enterprise Energy Management using a Linked Dataspace for Energy Intelligence,” in Second IFIP Conference on Sustainable Internet and ICT for Sustainability, 2012.
      - D. Loshin, Master Data Management. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2008.
      - B. Otto and A. Reichert, “Organizing Master Data Management: Findings from an Expert Survey,” in Proceedings of the 2010 ACM Symposium on Applied Computing - SAC ’10, 2010, pp. 106–110.
  • 33. Selected References: Collective Intelligence, Crowdsourcing & Human Computation
      - E. Curry, A. Freitas, and S. O’Riain, “The Role of Community-Driven Data Curation for Enterprises,” in Linking Enterprise Data, D. Wood, Ed. Boston, MA: Springer US, 2010, pp. 25–47.
      - A. Doan, R. Ramakrishnan, and A. Y. Halevy, “Crowdsourcing systems on the World-Wide Web,” Communications of the ACM, vol. 54, no. 4, p. 86, Apr. 2011.
      - E. Law and L. von Ahn, “Human Computation,” Synthesis Lectures on Artificial Intelligence and Machine Learning, vol. 5, no. 3, pp. 1–121, Jun. 2011.
      - M. J. Franklin, D. Kossmann, T. Kraska, S. Ramesh, and R. Xin, “CrowdDB: Answering Queries with Crowdsourcing,” in Proceedings of the 2011 international conference on Management of data - SIGMOD ’11, 2011, p. 61.
      - P. Wichmann, A. Borek, R. Kern, P. Woodall, A. K. Parlikad, and G. Satzger, “Exploring the ‘Crowd’ as Enabler of Better Information Quality,” in Proceedings of the 16th International Conference on Information Quality, 2011, pp. 302–312.
  • 34. Selected References: Expert Finding
      - K. Balog, L. Azzopardi, and M. de Rijke, “Formal models for expert finding in enterprise corpora,” in Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR ’06, 2006, p. 43.
      - K. Balog, T. Bogers, L. Azzopardi, M. de Rijke, and A. van den Bosch, “Broad expertise retrieval in sparse data environments,” in Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR ’07, 2007, p. 551.
      - K. Balog and M. de Rijke, “Determining expert profiles (with an application to expert finding),” in Proceedings of the 20th international joint conference on Artificial intelligence, 2007, pp. 2657–2662.
  • 35. Selected References: Linked Data & User Feedback
      - S. O’Riain, E. Curry, and A. Harth, “XBRL and open data for global financial ecosystems: A linked data approach,” International Journal of Accounting Information Systems, Mar. 2012.
      - U. Ul Hassan, S. O’Riain, and E. Curry, “Leveraging Matching Dependencies for Guided User Feedback in Linked Data Applications,” in 9th International Workshop on Information Integration on the Web IIWeb2012, 2012.
      - A. Miles and J. R. Pérez-Agüera, “SKOS: Simple Knowledge Organisation for the Web,” Cataloging & Classification Quarterly, vol. 43, no. 3–4, pp. 69–83, Apr. 2007.
      - C. Bizer, J. Lehmann, G. Kobilarov, S. Auer, C. Becker, R. Cyganiak, and S. Hellmann, “DBpedia - A crystallization point for the Web of Data,” Web Semantics: Science, Services and Agents on the World Wide Web, vol. 7, no. 3, pp. 154–165, Sep. 2009.
      - S. R. Jeffery, M. J. Franklin, and A. Y. Halevy, “Pay-as-you-go user feedback for dataspace systems,” in Proceedings of the 2008 ACM SIGMOD international conference on Management of data - SIGMOD ’08, 2008, pp. 847–860.

Editor's notes

  1. Personal background
  2. Show start and the end before. Break the builds.
  3. The ranking allows us to select best worker based on scoring method