Mendeley's Data and Perspectives on Data Challenges

Kris Jack, PhD
Chief Data Scientist
https://twitter.com/_krisjack
Overview

➔
    What's Mendeley?

➔
    Why Run Challenges?

➔
    Mendeley's Challenges

➔
    Conclusions
What's Mendeley?
➔
    Mendeley is a platform that connects
    researchers, research data and apps




                         Mendeley Open API


➔
    How are we building our community?
Mendeley provides tools to help users...

...organise their research
➔
    Reference management
➔
    Cite-as-you-write
➔
    Full-text article search
➔
    Digitalised annotations
Mendeley provides tools to help users...

...collaborate with one another
➔
    Professional research groups
➔
    Social network
➔
    Annotation sharing
Mendeley provides tools to help users...

...discover new research
➔
    Personalised article recommendations
➔
    Related research
➔
    Research contact suggestions
Our community from a data perspective
➔
    Social network (~2M users)
➔
    Personal libraries (~300M articles)
➔
    Research groups (~175K groups)
➔
    Research catalogue (~50M unique articles)
Why Run Challenges?
➔
    An important part of our mission is to make science more open

    “All the time we are very conscious of the huge challenges
    that human society has now – curing cancer, understanding
    the brain for Alzheimer's [...].

    But a lot of the state of knowledge of the human race is
    sitting in the scientists’ computers, and is currently not
    shared […] We need to get it unlocked so we can tackle
    those huge problems.”

➔
    We run challenges that aim to open up science
➔
    Your skills in information sciences are valuable to us
Mendeley's Challenges
PLoS/Mendeley's Binary Battle

Challenge: Build an application with our data,
           make science more open.


Results:




             More details at http://dev.mendeley.com/api-binary-battle/
ScienceRec Challenge 2012

Challenge: Build an off-line system for scientific
           recommendations with our API
           and the DataTEL data set

Results:   Will discuss today
           How to improve for the future?

Data:      50K users, with at least
           20 articles each

             More details at http://2012.recsyschallenge.com/tracks/sciencerec/
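
As a rough illustration of what an off-line evaluation of this kind might look like, the sketch below keeps only users whose library holds at least 20 articles, hides a few articles per user, and scores a simple popularity baseline with precision@k. It is not the challenge's actual protocol: the function names, the hold-out size, the popularity baseline and the toy data are all assumptions made for illustration.

```python
from collections import Counter

MIN_LIBRARY_SIZE = 20  # threshold taken from the slide; everything else is assumed


def split_library(articles, n_test=5):
    """Hold out the last n_test articles of a library as the test set."""
    return articles[:-n_test], articles[-n_test:]


def popularity_recommender(train_libraries, k):
    """Build a recommender that suggests the k most common unseen articles."""
    counts = Counter(a for lib in train_libraries.values() for a in lib)
    ranked = [article for article, _ in counts.most_common()]

    def recommend(user):
        seen = set(train_libraries[user])
        return [a for a in ranked if a not in seen][:k]

    return recommend


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that appear in the held-out set."""
    if k == 0:
        return 0.0
    hits = sum(1 for a in recommended[:k] if a in relevant)
    return hits / k


def evaluate(libraries, k=10, n_test=5):
    """Mean precision@k over all users with a large enough library."""
    eligible = {u: lib for u, lib in libraries.items()
                if len(lib) >= MIN_LIBRARY_SIZE}
    train, test = {}, {}
    for user, lib in eligible.items():
        train[user], test[user] = split_library(lib, n_test)
    recommend = popularity_recommender(train, k)
    scores = [precision_at_k(recommend(u), set(test[u]), k) for u in eligible]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical toy data: two overlapping libraries and one that is too small.
    toy = {
        "u1": [f"a{i}" for i in range(0, 20)],
        "u2": [f"a{i}" for i in range(5, 25)],
        "u3": ["z1", "z2", "z3"],
    }
    print(evaluate(toy, k=5, n_test=5))
```

Any other recommender can be swapped in for the popularity baseline, as long as it returns a ranked list of article identifiers for a given user.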
Conclusions
➔
    Mendeley makes tools that help researchers to:
    ➔
        organise their research
    ➔
        collaborate with one another
    ➔
        discover new research
➔
    We are crowdsourcing a wealth of research data
➔
    We're opening it up to the world
➔
    And inviting you to participate
We're Hiring!
➔
    Data Scientist
    ➔
        apply recommender technologies to Mendeley's data
    ➔
        work on improving the quality of Mendeley's research catalogue
    ➔
        starting in the first quarter of 2013
    ➔
        6-month secondment at KNOW Center, TU Graz, Austria, as part of the EC FP7
        TEAM project (http://team-project.tugraz.at/)
➔
    http://www.mendeley.com/careers/
www.mendeley.com
A Challenge for the Future?

Challenge: Investigate how well algorithms
           perform in real-world settings

Motivation: Industry repeatedly finds that
            aggressive A/B testing is required
            because offline improvements do
            not necessarily translate to online
            improvements

Problem:    Academia tends not to have access
            to large online communities

Solution:   Industry runs A/B tests with
            academic algorithms and reports
            results

What about privacy?
  Use publicly available data
  Anonymise and aggregate results reported
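
To make the idea of reporting only aggregated A/B results more concrete, here is a minimal sketch that compares two recommender variants using nothing but per-group counts. It assumes click-through rate as the online metric and a two-proportion z-test as the comparison; the function names, the minimum group size and the example numbers are hypothetical, not something the slides specify.

```python
import math

MIN_GROUP_SIZE = 1000  # hypothetical privacy threshold for reporting a group


def two_proportion_z_test(clicks_a, users_a, clicks_b, users_b):
    """Return (z, two-sided p-value) for the difference in click-through rate."""
    p_a = clicks_a / users_a
    p_b = clicks_b / users_b
    pooled = (clicks_a + clicks_b) / (users_a + users_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def aggregate_report(clicks_a, users_a, clicks_b, users_b):
    """Build a report that contains only group-level numbers, no user data."""
    if min(users_a, users_b) < MIN_GROUP_SIZE:
        raise ValueError("group too small to report without risking privacy")
    z, p_value = two_proportion_z_test(clicks_a, users_a, clicks_b, users_b)
    return {
        "ctr_a": clicks_a / users_a,
        "ctr_b": clicks_b / users_b,
        "z": z,
        "p_value": p_value,
    }


if __name__ == "__main__":
    # Hypothetical counts: variant A is the baseline, B the academic algorithm.
    print(aggregate_report(clicks_a=100, users_a=1000, clicks_b=150, users_b=1000))
```

Only the four counts leave the industry partner's systems in this sketch, which is one way the "anonymise and aggregate" point could be honoured.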
