Towards a Community-driven Data
Science Body of Knowledge
FAIR 2016, Florence
14-15 November 2016
Andrea Manieri
Engineering Ingegneria Informatica S.p.A.
EDISON – Education for Data Intensive
Science to Open New science frontiers
Grant 675419 (INFRASUPP-4-2015: CSA)
Credits:
• Yuri Demchenko (UvA)
• Steve Brewer (SOTON)
• Kim Hee (GOETHE)
• Adam Belloum (UvA)
• Spiros Koulozis (UvA)
A sense of urgency – dated 2013
“Europe faces up to 700,000 unfilled ICT jobs and declining competitiveness. The number of
digital jobs is growing – by 3% each year during the crisis – but the number of new ICT
graduates and other skilled ICT workers is shrinking. Our youth need actions not words, and
companies operating in Europe need the right people or they will move operations
elsewhere”. EC press release, 25 Jan 2013
Grand Coalition for Digital Jobs + EU eSkills strategy for 2020, becoming the Digital Skills and
Jobs Coalition (launch conference 1 Dec 2016 in Brussels)
Data Scientist shortage:
- Gartner, 2012
- McKinsey, 2013
- Forbes, 2013 https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century
Who needs Data skills?
• As a student
– I need recommendations on Data-driven careers
• As a researcher
– I need to cover gaps with respect to eScience
• As a librarian
– I want to promote my competences
• As an employee
– I want to reskill in data-driven jobs
Who needs Data skills?
• As a scholar/lecturer
– I need to update my background
• As a training manager
– I want to innovate my offering
• As a course designer
– I have to define the right topics and know-how to be taught
Who needs Data skills?
• As an HR manager
– I want to find fit-for-purpose candidates
• As a team leader
– I need to cover the know-how and skills for a task/project
• As an employer
– I want to define re-skilling plans for my workforce
EDISON CSA: serving its customer base
VISION and Background
How it all began
Visionaries and Drivers
The Fourth Paradigm: Data-Intensive Scientific Discovery.
By Jim Gray, Microsoft, 2009. Edited by Tony Hey, et al.
http://research.microsoft.com/en-us/collaboration/fourthparadigm/
Riding the wave: How Europe can gain from the rising tide of scientific data.
Final report of the High Level Expert Group on Scientific Data. October 2010.
http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/hlg-sdi-report.pdf
The Data Harvest: How sharing research data can yield knowledge, jobs and growth.
An RDA Europe Report. December 2014
https://rd-alliance.org/data-harvest-report-sharing-data-knowledge-jobs-and-growth.html
https://www.rd-alliance.org/
NIST Big Data Working Group (NBD-WG)
https://www.rd-alliance.org/ (since 2013)
ISO/IEC JTC1 Big Data Study Group (SGBD)
http://jtc1bigdatasg.nist.gov/home.php (2014)
EDISON & RDA
• 1st RDA Plenary meeting – 18-20 March 2013
– 1st BoF on Education and Skills Development in Data Intensive Science
– Attended by 16 representatives from universities, libraries, e-Science, data
centers, research coordination bodies
• 3rd RDA Plenary meeting – 26-28 March 2014, Dublin
– 3rd BoF on Education and Skills Development in Data Intensive Science
– EDISON (Education for Data Intensive Science to Open New science
frontiers) Initiative announced
• 4th RDA Plenary meeting – 22-24 September 2014, Amsterdam
– IG Education and Training on Handling of Research Data (ETHRD)
established
– EDISON Workshop – 21 Sept 2014, Science Park Amsterdam
– Decision to form a consortium and submit a proposal to the INFRASUPP-4-2015 call
• 8th RDA Plenary meeting – 15-17 September 2016, Denver, USA
– BoFs and IG meetings – now developing Certification and Accreditation proposal
EDISON Data Science Framework (EDSF)
[Figure: EDSF overview. Layers: Foundation & Concepts, Services, Biz Model. Components shown: CF-DS, DS-BoK, MC-DS, Taxonomy and Vocabulary, DS Prof Profiles, eLearning Platform, Datasciencepro.eu, Community Portal (CP), Professional certification, Data Science career & prof development, Roadmap & Sustainability.]
• EDISON Framework components
– CF-DS – Data Science Competence Framework
– DS-BoK – Data Science Body of Knowledge
– MC-DS – Data Science Model Curriculum
– DSP - Data Science Professional profiles definition
– Data Science Taxonomies and Scientific Disciplines Classification
Data Science
Body-of-Knowledge
A shared vision of knowledge corpus
• Based on the definition by the NIST Big Data WG (NIST SP1500, 2015)
• A Data Scientist is a practitioner who has sufficient knowledge in
the overlapping regimes of expertise in business needs, domain
knowledge, analytical skills, and programming and systems
engineering expertise to manage the end-to-end scientific method
process through each stage in the big data lifecycle
– … through to the delivery of the expected scientific and business value to science or
industry
• Other definitions add such features as
– Ability to solve a variety of business problems, tell “stories”, and provide input to
decision making
– Optimize performance and suggest new services for the organisation
– Develop a special mindset and be statistically minded, understand raw
data and “appreciate data as a first class product”
Data Scientist Definition
• Data science is the empirical synthesis of actionable knowledge and technologies required to
handle data from raw data through the complete data lifecycle process.
• Big Data is the technology used to build systems and infrastructures that process large volumes of
structurally complex data in a time-effective way
[ref] Legacy: NIST BDWG definition of Data Science
• Commonly accepted Data Science competences/skills groups include
– Data Analytics or Business Analytics or Machine Learning
– Engineering or Programming
– Subject/Scientific Domain Knowledge
• EDISON identified 2 additional competence groups demanded by
organisations
– Data Management, Curation, Preservation
– Scientific or Research Methods and/vs Business
Processes/Operations
• Other commonly recognized skills, also known as “soft skills” or “social/professional
intelligence”
– Inter-personal skills or team work, cooperativeness
• Important aspects of integrating the Data Scientist into the organisation structure
– General Data Science (and Data) literacy for all involved roles and management
– A commonly agreed and understandable way of communicating and
presenting information/data
– Role of the Data Scientist: provide a kind of literacy advice and guidance to
the organisation
Data Science Competence Groups
• Group 1: Skills/experience related to
competences
– Data Analytics and Machine Learning
– Data Management/Curation (both general
and scientific)
– Data Science Engineering (hardware and
software) skills
– Scientific/Research Methods or Business
Process Management
– Application/subject domain related (research
or business)
– Mathematics and Statistics
• Group 2: Big Data (Data Science) tools
and platforms
– Big Data Analytics platforms
– Mathematics & Statistics applications & tools
– Databases (SQL and NoSQL)
– Data Management and Curation platform
– Data and applications visualisation
– Cloud based platforms and tools
Data Science Skills/Experiences
• Group 3: Programming and
programming languages and IDE
– General and specialized development
platforms for data analysis and statistics
• Group 4: Soft skills or Social
Intelligence
– Personal, inter-personal communication, team
work, professional network
Comparing with relevant BoK
• ACM Computer Science Body of Knowledge (ACM CS-BoK)
• ICT professional Body of Knowledge (ICT-BoK)
• Business Analysis Body of Knowledge (BABOK)
• Software Engineering Body of Knowledge (SWEBOK)
• Data Management Body of Knowledge (DAMA-BoK) by Data
Management Association International (DAMAI)
• Project Management Professional Body of Knowledge (PM-BoK)
• DS-BoK Knowledge Area Groups (KAG)
– KAG1-DSA: Data Analytics group including Machine Learning, statistical methods, and Business Analytics
– KAG2-DSE: Data Science Engineering group including Software and infrastructure engineering
– KAG3-DSDM: Data Management group including data curation, preservation and data infrastructure
– KAG4-DSRM: Scientific/Research Methods group
– KAG5-DSBP: Business process management group
• Data Science domain knowledge to be defined by related expert groups
Data Science BoK (DS-BoK)
Process Groups – knowledge at work
• Data Identification and Creation:
– how to obtain digital information from in-silico experiments and instrumentation, how to collect and store it in digital form,
and any techniques, models, standards and tools needed to perform these activities, depending on the specific discipline.
• Data Access and Retrieval:
– tools, techniques and standards used to access any type of data from any type of media and retrieve it in compliance with
IPRs and established legislation.
• Data Curation and Preservation:
– includes activities related to data cleansing, normalisation, validation and storage.
• Data Fusion (or Data Integration):
– the integration of multiple data and knowledge representing the same real-world object into a consistent, accurate, and
useful representation.
• Data Organisation and Management:
– how to organise the storage of data for the various purposes required by each discipline; tools, techniques, standards and
best practices (including IPR management, compliance with laws and regulations, and metadata definition and
completion) to set up ICT solutions that achieve the required Service Level Agreement for data conservation.
• Data Storage and Stewardship:
– how to enhance the use of data by using metadata and other techniques to establish long-term access to and extended
use of that data, also by scientists and researchers from other disciplines and long after the data was
produced.
• Data Processing:
– tools, techniques and standards to analyse different and heterogeneous data coming from various sources and different
scientific domains, and of a variety of sizes (up to exabytes); it includes the notion of programming paradigms.
• Data Visualisation and Communication:
– techniques, models and best practices to merge and join various data sets, and techniques and tools for data analytics and
visualisation, depending on the significance of the data and the discipline.
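As an illustration of the Data Fusion process group, the sketch below merges several partial records that describe the same real-world object into one consistent record. It is a minimal example under stated assumptions, not part of the DS-BoK: the function name `fuse_records`, the `id` key, the sample sensor records and the conflict rule (the first non-empty value per field wins) are all hypothetical.

```python
def fuse_records(records, key="id"):
    """Merge records that share the same key into one record per object.

    Assumed conflict rule: the first non-empty value seen for a field wins.
    """
    fused = {}
    for rec in records:
        target = fused.setdefault(rec[key], {})
        for field, value in rec.items():
            # Skip empty values so that a later source can fill the gap.
            if value not in (None, "") and field not in target:
                target[field] = value
    return list(fused.values())


# Hypothetical input: two sources describe sensor "s1" with complementary gaps.
sources = [
    {"id": "s1", "name": "Sensor A", "unit": ""},
    {"id": "s1", "name": "", "unit": "K"},
    {"id": "s2", "name": "Sensor B", "unit": "Pa"},
]
merged = fuse_records(sources)
```

A real fusion pipeline would also need entity matching (deciding which records refer to the same object) and an explicit policy for conflicting non-empty values; both are left out here.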
Data Science Data Management Group
(DSDM)
KAG3-DSDM: Data Management group including data curation, preservation and data infrastructure
DAMA-BoK selected KAs
(1) Data Governance
(2) Data Architecture
(3) Data Modelling and Design
(4) Data Storage and Operations
(5) Data Security
(6) Data Integration and Interoperability
(7) Documents and Content
(8) Reference and Master Data
(9) Data Warehousing and Business Intelligence
(10) Metadata
(11) Data Quality
General Data Management KAs
• Data Lifecycle Management
• Data archives/storage compliance and certification
New KAs to support RDA recommendations and community data management models (Open Access, Open Data, etc.)
• Data type registries, PIDs
• Data infrastructure and Data Factories
• …
Data Science
Competence Framework
A bottom-up approach
• Professional profile groups are defined in compliance with the ESCO taxonomy
Data Science Professions Family
• Relevance of a competence to a DSP profile: 5 – high, 1 – low
Mapping DS-BoK GAs to DSP profiles
E - CO2 Classification
• Text Filtering
• Find overlapping terms
• Calculate TF-IDF of terms
• For each category vector calculate cosine similarity
• The output is a CSV with the similarity for each
category
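The classification steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the EDISON tool itself: the stopword list, the smoothed IDF formula, the function names, the sample text and the category descriptions are all assumptions made for the example; only the category codes are taken from the slides.

```python
import csv
import io
import math
import re
from collections import Counter

# Assumed, deliberately tiny stopword list used for the text-filtering step.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "for", "with", "on"}


def tokenize(text):
    """Text filtering: lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (smoothed IDF, an assumption)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        total = len(toks) or 1
        vectors.append({
            term: (count / total) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; only overlapping terms contribute."""
    dot = sum(u[t] * v[t] for t in set(u) & set(v))
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def classify(text, categories):
    """Score one text against every category description."""
    names = list(categories)
    vectors = tfidf_vectors([categories[name] for name in names] + [text])
    return {name: cosine(vectors[-1], vec) for name, vec in zip(names, vectors)}


def to_csv(scores):
    """Write the per-category similarities as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["category", "similarity"])
    for name, score in scores.items():
        writer.writerow([name, f"{score:.4f}"])
    return buf.getvalue()


# Hypothetical category descriptions keyed by the competence group codes.
CATEGORIES = {
    "DSDA": "data analytics machine learning statistical methods business analytics",
    "DSEN": "software infrastructure engineering",
    "DSDM": "data management curation preservation data infrastructure",
}
scores = classify("Experience with machine learning and statistical methods", CATEGORIES)
report = to_csv(scores)
```

The same function can score a CV, a job advertisement or a course description, which is what makes side-by-side comparisons such as education offered versus market requests possible.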
Education offered vs. Market requests
DSDA: Data Science Analytics
DSDK: Data Science Domain Knowledge
DSEN: Data Science Engineering
DSRM: Scientific/Research Methods
DSDM: Data Management
DSDA: Data Science Analytics
DSDK: Data Science Domain Knowledge
DSEN: Data Science Engineering
DSRM: Scientific/Research Methods
DSDM: Data Management
CV vs. Job offering
Model Curricula
in Data Science
Supporting EU academia in excelling
and matching market needs
• Data Science Model Curriculum includes
– Learning Outcomes (LO) definition based on CF-DS
• LOs are defined for CF-DS competence groups and for all
enumerated competences
– LOs mapping to Learning Units (LU)
• LUs are based on CCS (2012) and universities' best practices
• Data Science university programmes and courses inventory
(interactive)
http://edison-project.eu/university-programs-list
– LU/course relevance: Mandatory Tier 1, Tier 2,
Elective, Prerequisite
– Learning methods and learning models (in progress)
• Based on Bloom’s Taxonomy, Outcome Based Learning, etc.
Data Science Model Curriculum (MC-DS)
Using the MC-DS
Some numbers (2015)
• A portfolio of more than 300 courses
• 200 trainers and experts
• 5 offices and 16 classrooms
• 18,000 person-hours of training
• New on-line platform
Aosta
Roma
Padova
Milano
Frosinone
Engineering IT & Management school
Data Science Master at Univ. of Perugia
http://masterds.unipg.it/en/index.html
Accreditation and Certification - RDA BoF
Aim: contribute to the sustainable development of the data
science profession.
Goal: deliver a report that presents a concise but
representative picture of the various accreditation and
certification schemes that exist around the world
Outcome: a 9-month working group proposal needs to be developed,
centred on supporting the members of RDA in developing their
own professional career paths around their own skills, interests
and contexts.
Career development and reskilling
Mind the gap!
Get practical recommendations
Training and Education Inventory
Sharing education events and experiences
What’s next?
Putting theory into practice and
supporting service delivery
EDISON Community portal
Powered and hosted by
What we can do with you
1. Improve and Validate EDSF
   1. Identifying the “soft skills”: how to ask a research/business question?
   2. Identifying the community needs: from stewards to scientists, any market, any discipline
   3. Validate completeness of BoK, coverage of CF, usability of MC
   4. Promote national workshops for bottom-up adoption of EDSF
2. Career Development
   1. Specifications for DSP job positions in Data Management and Librarian teams, and engagement mechanisms between employers and DSP candidates
   2. Links and recommendations for placing students to gain DSP work experience
   3. Facilitate cross-institutional agreements on DSP career paths
   4. Supporting training through DataSciencePro.eu
   5. Mapping and comparing career paths and learning opportunities for the Personal Competence Portfolio (PCP)
   6. Advise on events, courses and tools for community training
   7. Develop Virtual Labs that are re-usable and promoted beyond your community
   8. Certification: from badges to professions – the how-to of a Community-driven Data Science Certification (RDA)
Promoting the Data Science Profession
• Invitation to contribute and cooperate:
– Forum, EDISON Liaison Groups, Champions Conference (Spring & Summer 2017)
• EDISON project website http://edison-project.eu/
• EDISON Data Science Framework Release 1 (EDSF)
http://edison-project.eu/edison-data-science-framework-edsf
• Community-oriented survey on Data Science competences (available soon)

Contenu connexe

Tendances

Addressing non economical externalities
Addressing non economical externalitiesAddressing non economical externalities
Addressing non economical externalitiesBYTE Project
 
A-XLRM summary for BYTE case studies: Crisis, culture and health
A-XLRM summary for BYTE case studies: Crisis, culture and healthA-XLRM summary for BYTE case studies: Crisis, culture and health
A-XLRM summary for BYTE case studies: Crisis, culture and healthBYTE Project
 
Digital notebooks - a Jisc perspective
Digital notebooks - a Jisc perspectiveDigital notebooks - a Jisc perspective
Digital notebooks - a Jisc perspectiveChristopher Brown
 
Algorithmic Systems Transparency and Accountability in Big Data & Cognitive Era
Algorithmic Systems Transparency and Accountability in Big Data & Cognitive EraAlgorithmic Systems Transparency and Accountability in Big Data & Cognitive Era
Algorithmic Systems Transparency and Accountability in Big Data & Cognitive EraNozha Boujemaa
 
OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation Research Data Alliance
 
BYTE bdva Valencia Summit November 2016
BYTE bdva Valencia Summit November 2016BYTE bdva Valencia Summit November 2016
BYTE bdva Valencia Summit November 2016Trilateral Research
 
Phaedra II Technology foresight, 17 Nov 2016
Phaedra II Technology foresight, 17 Nov 2016Phaedra II Technology foresight, 17 Nov 2016
Phaedra II Technology foresight, 17 Nov 2016Trilateral Research
 
Cross-Disciplinary Insights on Big Data Challenges and Solutions
Cross-Disciplinary Insights on Big Data Challenges and SolutionsCross-Disciplinary Insights on Big Data Challenges and Solutions
Cross-Disciplinary Insights on Big Data Challenges and SolutionsBYTE Project
 
Research engagement in EUDAT| www.eudat.eu |
Research engagement in EUDAT| www.eudat.eu | Research engagement in EUDAT| www.eudat.eu |
Research engagement in EUDAT| www.eudat.eu | EUDAT
 
Open Data: Barriers, Risks, and Opportunities
Open Data: Barriers, Risks, and OpportunitiesOpen Data: Barriers, Risks, and Opportunities
Open Data: Barriers, Risks, and OpportunitiesSlim Turki, Dr.
 
Data curator: who is s / he?
Findings of the IFLA Library Theory and Research...
Data curator: who is s / he?
Findings of the IFLA Library Theory and Research...Data curator: who is s / he?
Findings of the IFLA Library Theory and Research...
Data curator: who is s / he?
Findings of the IFLA Library Theory and Research...Anna Maria Tammaro
 
Holger Wollschläger | E-government at its best: Open, transparent and useful
Holger Wollschläger | E-government at its best: Open, transparent and usefulHolger Wollschläger | E-government at its best: Open, transparent and useful
Holger Wollschläger | E-government at its best: Open, transparent and usefulsemanticsconference
 
Open data ecosystems research talk at Copenhagen Business School on 25042014
Open data ecosystems research talk at Copenhagen Business School on 25042014Open data ecosystems research talk at Copenhagen Business School on 25042014
Open data ecosystems research talk at Copenhagen Business School on 25042014Matti Rossi
 
Cambridgeshire Insight Open Data: What we’ve learnt from the unexpected - He...
Cambridgeshire Insight Open Data: What we’ve learnt from the unexpected - He...Cambridgeshire Insight Open Data: What we’ve learnt from the unexpected - He...
Cambridgeshire Insight Open Data: What we’ve learnt from the unexpected - He...CambridgeshireInsight
 
Rda in a_nutshell_february_2017_updated
Rda in a_nutshell_february_2017_updatedRda in a_nutshell_february_2017_updated
Rda in a_nutshell_february_2017_updatedResearch Data Alliance
 
Data Mining & Knowledge Management Process (IJDKP)
Data Mining & Knowledge Management Process (IJDKP)Data Mining & Knowledge Management Process (IJDKP)
Data Mining & Knowledge Management Process (IJDKP)IJDKP
 

Tendances (20)

Addressing non economical externalities
Addressing non economical externalitiesAddressing non economical externalities
Addressing non economical externalities
 
Data Science and its impact on society
Data Science and its impact on societyData Science and its impact on society
Data Science and its impact on society
 
A-XLRM summary for BYTE case studies: Crisis, culture and health
A-XLRM summary for BYTE case studies: Crisis, culture and healthA-XLRM summary for BYTE case studies: Crisis, culture and health
A-XLRM summary for BYTE case studies: Crisis, culture and health
 
Digital notebooks - a Jisc perspective
Digital notebooks - a Jisc perspectiveDigital notebooks - a Jisc perspective
Digital notebooks - a Jisc perspective
 
Algorithmic Systems Transparency and Accountability in Big Data & Cognitive Era
Algorithmic Systems Transparency and Accountability in Big Data & Cognitive EraAlgorithmic Systems Transparency and Accountability in Big Data & Cognitive Era
Algorithmic Systems Transparency and Accountability in Big Data & Cognitive Era
 
OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation
 
Collaborate to Share
Collaborate to ShareCollaborate to Share
Collaborate to Share
 
BYTE bdva Valencia Summit November 2016
BYTE bdva Valencia Summit November 2016BYTE bdva Valencia Summit November 2016
BYTE bdva Valencia Summit November 2016
 
Phaedra II Technology foresight, 17 Nov 2016
Phaedra II Technology foresight, 17 Nov 2016Phaedra II Technology foresight, 17 Nov 2016
Phaedra II Technology foresight, 17 Nov 2016
 
Cross-Disciplinary Insights on Big Data Challenges and Solutions
Cross-Disciplinary Insights on Big Data Challenges and SolutionsCross-Disciplinary Insights on Big Data Challenges and Solutions
Cross-Disciplinary Insights on Big Data Challenges and Solutions
 
Research engagement in EUDAT| www.eudat.eu |
Research engagement in EUDAT| www.eudat.eu | Research engagement in EUDAT| www.eudat.eu |
Research engagement in EUDAT| www.eudat.eu |
 
Research Data Alliance Overview
Research Data Alliance OverviewResearch Data Alliance Overview
Research Data Alliance Overview
 
Open Data: Barriers, Risks, and Opportunities
Open Data: Barriers, Risks, and OpportunitiesOpen Data: Barriers, Risks, and Opportunities
Open Data: Barriers, Risks, and Opportunities
 
Data curator: who is s / he?
Findings of the IFLA Library Theory and Research...
Data curator: who is s / he?
Findings of the IFLA Library Theory and Research...Data curator: who is s / he?
Findings of the IFLA Library Theory and Research...
Data curator: who is s / he?
Findings of the IFLA Library Theory and Research...
 
Holger Wollschläger | E-government at its best: Open, transparent and useful
Holger Wollschläger | E-government at its best: Open, transparent and usefulHolger Wollschläger | E-government at its best: Open, transparent and useful
Holger Wollschläger | E-government at its best: Open, transparent and useful
 
Open data ecosystems research talk at Copenhagen Business School on 25042014
Open data ecosystems research talk at Copenhagen Business School on 25042014Open data ecosystems research talk at Copenhagen Business School on 25042014
Open data ecosystems research talk at Copenhagen Business School on 25042014
 
Cambridgeshire Insight Open Data: What we’ve learnt from the unexpected - He...
Cambridgeshire Insight Open Data: What we’ve learnt from the unexpected - He...Cambridgeshire Insight Open Data: What we’ve learnt from the unexpected - He...
Cambridgeshire Insight Open Data: What we’ve learnt from the unexpected - He...
 
Rda in a_nutshell_february_2017_updated
Rda in a_nutshell_february_2017_updatedRda in a_nutshell_february_2017_updated
Rda in a_nutshell_february_2017_updated
 
Data Mining & Knowledge Management Process (IJDKP)
Data Mining & Knowledge Management Process (IJDKP)Data Mining & Knowledge Management Process (IJDKP)
Data Mining & Knowledge Management Process (IJDKP)
 
How to elaborate a data management plan
How to elaborate a data management planHow to elaborate a data management plan
How to elaborate a data management plan
 

Similaire à Towards a Community-driven Data Science Body of Knowledge – Data Management Skills and Competences

Data management plans – EUDAT Best practices and case study | www.eudat.eu
Data management plans – EUDAT Best practices and case study | www.eudat.euData management plans – EUDAT Best practices and case study | www.eudat.eu
Data management plans – EUDAT Best practices and case study | www.eudat.euEUDAT
 
Building the Data Science Profession in Europe
Building the Data Science Profession in EuropeBuilding the Data Science Profession in Europe
Building the Data Science Profession in EuropeSteven Miller
 
My FAIR share of the work - Diamond Light Source - Dec 2018
My FAIR share of the work - Diamond Light Source - Dec 2018My FAIR share of the work - Diamond Light Source - Dec 2018
My FAIR share of the work - Diamond Light Source - Dec 2018Susanna-Assunta Sansone
 
Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...
Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...
Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...Tomasz Bednarz
 
PROPEL . Austrian's Roadmap for Enterprise Linked Data
PROPEL . Austrian's Roadmap for Enterprise Linked DataPROPEL . Austrian's Roadmap for Enterprise Linked Data
PROPEL . Austrian's Roadmap for Enterprise Linked DataSemantic Web Company
 
Luciano uvi hackfest.28.10.2020
Luciano uvi hackfest.28.10.2020Luciano uvi hackfest.28.10.2020
Luciano uvi hackfest.28.10.2020Joanne Luciano
 
Global Research Data Initiatives
Global Research Data InitiativesGlobal Research Data Initiatives
Global Research Data InitiativesSarah Jones
 
Turning FAIR data into reality
Turning FAIR data into realityTurning FAIR data into reality
Turning FAIR data into realitySarah Jones
 
e-SIDES workshop at EBDVF 2018, Vienna 14/11/2018
e-SIDES workshop at EBDVF 2018, Vienna 14/11/2018 e-SIDES workshop at EBDVF 2018, Vienna 14/11/2018
e-SIDES workshop at EBDVF 2018, Vienna 14/11/2018 e-SIDES.eu
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactDr. Sunil Kr. Pandey
 
H2020 Open Research Data pilot
H2020 Open Research Data pilotH2020 Open Research Data pilot
H2020 Open Research Data pilotSarah Jones
 
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013Birgit Plietzsch “RDM within research computing support” SALCTG June 2013
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013SALCTG
 
Introduction to UC San Diego’s Integrated Digital Infrastructure
Introduction to UC San Diego’s Integrated Digital InfrastructureIntroduction to UC San Diego’s Integrated Digital Infrastructure
Introduction to UC San Diego’s Integrated Digital InfrastructureLarry Smarr
 
Big data presentation for University of Reykjavik, Iceland, March 22
Big data presentation for University of Reykjavik, Iceland, March 22 Big data presentation for University of Reykjavik, Iceland, March 22
Big data presentation for University of Reykjavik, Iceland, March 22 Thorhildur Jetzek, Ph.D.
 
LIBER Webinar: Turning FAIR Data Into Reality
LIBER Webinar: Turning FAIR Data Into RealityLIBER Webinar: Turning FAIR Data Into Reality
LIBER Webinar: Turning FAIR Data Into RealityLIBER Europe
 

Similaire à Towards a Community-driven Data Science Body of Knowledge – Data Management Skills and Competences (20)

Data management plans – EUDAT Best practices and case study | www.eudat.eu
Data management plans – EUDAT Best practices and case study | www.eudat.euData management plans – EUDAT Best practices and case study | www.eudat.eu
Data management plans – EUDAT Best practices and case study | www.eudat.eu
 
Building the Data Science Profession in Europe
Building the Data Science Profession in EuropeBuilding the Data Science Profession in Europe
Building the Data Science Profession in Europe
 
Rdaeu russia_fg_1_july2014_final
Rdaeu  russia_fg_1_july2014_finalRdaeu  russia_fg_1_july2014_final
Rdaeu russia_fg_1_july2014_final
 
My FAIR share of the work - Diamond Light Source - Dec 2018
My FAIR share of the work - Diamond Light Source - Dec 2018My FAIR share of the work - Diamond Light Source - Dec 2018
My FAIR share of the work - Diamond Light Source - Dec 2018
 
Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...
Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...
Platform for Big Data Analytics and Visual Analytics: CSIRO use cases. Februa...
 
PROPEL . Austrian's Roadmap for Enterprise Linked Data
PROPEL . Austrian's Roadmap for Enterprise Linked DataPROPEL . Austrian's Roadmap for Enterprise Linked Data
PROPEL . Austrian's Roadmap for Enterprise Linked Data
 
Luciano uvi hackfest.28.10.2020
Luciano uvi hackfest.28.10.2020Luciano uvi hackfest.28.10.2020
Luciano uvi hackfest.28.10.2020
 
Data Scientist Enablement roadmap 1.0
Data Scientist Enablement roadmap 1.0Data Scientist Enablement roadmap 1.0
Data Scientist Enablement roadmap 1.0
 
Global Research Data Initiatives
Global Research Data InitiativesGlobal Research Data Initiatives
Global Research Data Initiatives
 
CODATA: Open Data, FAIR Data and Open Science/Simon Hodson
CODATA: Open Data, FAIR Data and Open Science/Simon HodsonCODATA: Open Data, FAIR Data and Open Science/Simon Hodson
CODATA: Open Data, FAIR Data and Open Science/Simon Hodson
 
Turning FAIR data into reality
Turning FAIR data into realityTurning FAIR data into reality
Turning FAIR data into reality
 
e-SIDES workshop at EBDVF 2018, Vienna 14/11/2018
e-SIDES workshop at EBDVF 2018, Vienna 14/11/2018 e-SIDES workshop at EBDVF 2018, Vienna 14/11/2018
e-SIDES workshop at EBDVF 2018, Vienna 14/11/2018
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
 
data science
data sciencedata science
data science
 
data science
data sciencedata science
data science
 
H2020 Open Research Data pilot
H2020 Open Research Data pilotH2020 Open Research Data pilot
H2020 Open Research Data pilot
 
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013Birgit Plietzsch “RDM within research computing support” SALCTG June 2013
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013
 
Introduction to UC San Diego’s Integrated Digital Infrastructure
Introduction to UC San Diego’s Integrated Digital InfrastructureIntroduction to UC San Diego’s Integrated Digital Infrastructure
Introduction to UC San Diego’s Integrated Digital Infrastructure
 
Big data presentation for University of Reykjavik, Iceland, March 22
Big data presentation for University of Reykjavik, Iceland, March 22 Big data presentation for University of Reykjavik, Iceland, March 22
Big data presentation for University of Reykjavik, Iceland, March 22
 
LIBER Webinar: Turning FAIR Data Into Reality
LIBER Webinar: Turning FAIR Data Into RealityLIBER Webinar: Turning FAIR Data Into Reality
LIBER Webinar: Turning FAIR Data Into Reality
 

Towards a Community-driven Data Science Body of Knowledge – Data Management Skills and Competences

  • 1. Towards a Community-driven Data Science Body of Knowledge FAIR 2016, Florence, 14-15 November 2016 Andrea Manieri, Engineering Ingegneria Informatica S.p.A. EDISON – Education for Data Intensive Science to Open New science frontiers Grant 675419 (INFRASUPP-4-2015: CSA) Credits: • Yuri Demchenko (UvA) • Steve Brewer (SOTON) • Kim Hee (GOETHE) • Adam Belloum (UvA) • Spiros Koulozis (UvA)
  • 2. A sense of urgency – dated 2013 “Europe faces up to 700.000 unfilled ICT jobs and declining competitiveness. The number of digital jobs is growing – by 3% each year during the crisis – but the number of new ICT graduates and other skilled ICT workers is shrinking. Our youth need actions not words, and companies operating in Europe need the right people or they will move operations elsewhere”. EC press release, 25 Jan 2013 Grand Coalition for Digital Jobs + EU eSkills strategy for 2020, becoming the Digital Skills and Jobs Coalition (launch conference 1 Dec 2016 in Brussels) Data Scientist shortage: - Gartner, 2012 - McKinsey, 2013 - Forbes, 2013 https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century
  • 3. Who needs data skills? • As a student – I need recommendations on data-driven careers • As a researcher – I need to cover gaps with respect to eScience • As a librarian – I want to promote my competences • As an employee – I want to reskill for data-driven jobs
  • 4. Who needs data skills? • As a scholar/lecturer – I need to update my background • As a training manager – I want to innovate my offering • As a course designer – I have to define the right topics and know-how to be taught
  • 5. Who needs data skills? • As an HR manager – I want to find fit-for-purpose candidates • As a team leader – I need to cover the know-how and skills for a task/project • As an employer – I want to define re-skilling plans for my workforce
  • 6. EDISON CSA: serving its customer base
  • 8. Visionaries and Drivers The Fourth Paradigm: Data-Intensive Scientific Discovery. By Jim Gray, Microsoft, 2009. Edited by Tony Hey, et al. http://research.microsoft.com/en-us/collaboration/fourthparadigm/ Riding the wave: How Europe can gain from the rising tide of scientific data. Final report of the High Level Expert Group on Scientific Data. October 2010. http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/hlg-sdi-report.pdf The Data Harvest: How sharing research data can yield knowledge, jobs and growth. An RDA Europe Report. December 2014 https://rd-alliance.org/data-harvest-report-sharing-data-knowledge-jobs-and-growth.html https://www.rd-alliance.org/ NIST Big Data Working Group (NBD-WG) https://www.rd-alliance.org/ (since 2013) ISO/IEC JTC1 Big Data Study Group (SGBD) http://jtc1bigdatasg.nist.gov/home.php (2014)
  • 9. EDISON & RDA • 1st RDA Plenary meeting – 18-20 March 2013 – 1st BoF on Education and Skills Development in Data Intensive Science – Attended by 16 representatives from universities, libraries, e-Science, data centers, research coordination bodies • 3rd RDA Plenary meeting – 26-28 March 2014, Dublin – 3rd BoF on Education and Skills Development in Data Intensive Science – EDISON (Education for Data Intensive Science to Open New science frontiers) Initiative announced • 4th RDA Plenary meeting – 22-24 September 2014, Amsterdam – IG Education and Training on Handling of Research Data (ETHRD) established – EDISON Workshop – 21 Sept 2014, Science Park Amsterdam – Decision to form a consortium and submit a proposal to the INFRASUPP-4-2015 call • 8th RDA Plenary meeting – 15-17 September 2016, Denver, USA – BoFs and IG meetings – now developing a Certification and Accreditation proposal
  • 10. EDISON Data Science Framework (EDSF) [Diagram labels: Data Science Framework: CF-DS, DS-BoK, MC-DS, DS Prof Profiles, Taxonomy and Vocabulary; Foundation & Concepts; Services; Biz Model; eLearning Platform Datasciencepro.eu; Roadmap & Sustainability; Community Portal (CP); Professional certification; Data Science career & prof development] • EDISON Framework components – CF-DS – Data Science Competence Framework – DS-BoK – Data Science Body of Knowledge – MC-DS – Data Science Model Curriculum – DSP – Data Science Professional profiles definition – Data Science Taxonomies and Scientific Disciplines Classification
  • 11. Data Science Body-of-Knowledge A shared vision of knowledge corpus
  • 12. Data Scientist Definition • Based on the definition by NIST Big Data WG (NIST SP1500 - 2015) • A Data Scientist is a practitioner who has sufficient knowledge in the overlapping regimes of expertise in business needs, domain knowledge, analytical skills, and programming and systems engineering expertise to manage the end-to-end scientific method process through each stage in the big data lifecycle – …Till the delivery of expected scientific and business value to science or industry • Other definitions add such features as – Ability to solve a variety of business problems, tell “stories”, input to decision making – Optimize performance and suggest new services for the organisation – Develop a special mindset and be statistically minded, understand raw data and “appreciate data as a first class product” Legacy: NIST BDWG definition of Data Science • Data science is the empirical synthesis of actionable knowledge and technologies required to handle data from raw data through the complete data lifecycle process. • Big Data is the technology to build systems and infrastructures to process large volumes of structurally complex data in a time-effective way [ref]
  • 13. Data Science Competence Groups • Commonly accepted Data Science competences/skills groups include – Data Analytics or Business Analytics or Machine Learning – Engineering or Programming – Subject/Scientific Domain Knowledge • EDISON identified 2 additional competence groups demanded by organisations – Data Management, Curation, Preservation – Scientific or Research Methods and/vs Business Processes/Operations • Other commonly recognized skills, also known as “soft skills” or “social/professional intelligence” – Inter-personal skills or team work, cooperativeness • Important aspects of integrating the Data Scientist into the organisation structure – General Data Science (and Data) literacy for all involved roles and management – Commonly agreed and understandable way of communication and information/data presentation – Role of Data Scientist: Provide a kind of literacy advice and guidance to the organisation
  • 14. Data Science Skills/Experiences • Group 1: Skills/experience related to competences – Data Analytics and Machine Learning – Data Management/Curation (both general and scientific) – Data Science Engineering (hardware and software) skills – Scientific/Research Methods or Business Process Management – Application/subject domain related (research or business) – Mathematics and Statistics • Group 2: Big Data (Data Science) tools and platforms – Big Data Analytics platforms – Mathematics & Statistics applications & tools – Databases (SQL and NoSQL) – Data Management and Curation platform – Data and applications visualisation – Cloud based platforms and tools • Group 3: Programming and programming languages and IDE – General and specialized development platforms for data analysis and statistics • Group 4: Soft skills or Social Intelligence – Personal, inter-personal communication, team work, professional network
  • 15. Comparing with relevant BoKs • ACM Computer Science Body of Knowledge (ACM CS-BoK) • ICT professional Body of Knowledge (ICT-BoK) • Business Analytics Body of Knowledge (BABOK) • Software Engineering Body of Knowledge (SWEBOK) • Data Management Body of Knowledge (DAMA-BoK) by Data Management Association International (DAMAI) • Project Management Professional Body of Knowledge (PM-BoK)
  • 16. Data Science BoK (DS-BoK) • DS-BoK Knowledge Area Groups (KAG) • KAG1-DSA: Data Analytics group including Machine Learning, statistical methods, and Business Analytics • KAG2-DSE: Data Science Engineering group including Software and infrastructure engineering • KAG3-DSDM: Data Management group including data curation, preservation and data infrastructure • KAG4-DSRM: Scientific/Research Methods group • KAG5-DSBP: Business process management group • Data Science domain knowledge to be defined by related expert groups
  • 17. Process Groups – knowledge at work • Data Identification and Creation: – how to obtain digital information from in-silico experiments and instrumentation, how to collect and store it in digital form, and the techniques, models, standards and tools needed to perform these activities, depending on the specific discipline. • Data Access and Retrieval: – tools, techniques and standards used to access any type of data from any type of media and retrieve it in compliance with IPRs and established legislation. • Data Curation and Preservation: – includes activities related to data cleansing, normalisation, validation and storage. • Data Fusion (or Data Integration): – the integration of multiple data and knowledge representing the same real-world object into a consistent, accurate, and useful representation. • Data Organisation and Management: – how to organise the storage of data for the various purposes required by each discipline; tools, techniques, standards and best practices (including IPR management, compliance with laws and regulations, and metadata definition and completion) to set up ICT solutions that achieve the required Service Level Agreement for data conservation. • Data Storage and Stewardship: – how to enhance the use of data by using metadata and other techniques to establish long-term access to and extended use of that data, including by scientists and researchers from other disciplines and long after the data was produced. • Data Processing: – tools, techniques and standards to analyse different and heterogeneous data coming from various sources and different scientific domains, and of a variety of sizes (up to exabytes) – it includes the notion of programming paradigms. • Data Visualisation and Communication: – techniques, models and best practices to merge and join various data sets, and techniques and tools for data analytics and visualisation, depending on the significance of the data and the discipline.
  • 18. Data Science Data Management Group (DSDM) KAG3-DSDM: Data Management group including data curation, preservation and data infrastructure DAMA-BoK selected KAs (1) Data Governance (2) Data Architecture (3) Data Modelling and Design (4) Data Storage and Operations (5) Data Security (6) Data Integration and Interoperability (7) Documents and Content (8) Reference and Master Data (9) Data Warehousing and Business Intelligence (10) Metadata (11) Data Quality General Data Management KAs • Data Lifecycle Management • Data archives/storage compliance and certification New KAs to support RDA recommendations and community data management models (Open Access, Open Data, etc.) • Data type registries, PIDs • Data infrastructure and Data Factories • …
  • 20. Data Science Professions Family • Professional profile groups are defined in compliance with the ESCO taxonomy
  • 21. Mapping DS-BoK GAs to DSP profiles • Relevance of a competence to a DSP profile: 5 – high, 1 – low
  • 22. E - CO2 Classification • Text Filtering • Find overlapping terms • Calculate TF-IDF of terms • For each category vector calculate cosine similarity • The output is a CSV with the similarity for each category
  • 23. Education offered vs. Market requests [Chart legend] DSDA: Data Science Analytics; DSDK: Data Science Domain Knowledge; DSEN: Data Science Engineering; DSRM: Scientific/Research Methods; DSDM: Data Management
  • 24. CV vs. Job offering [Chart legend] DSDA: Data Science Analytics; DSDK: Data Science Domain Knowledge; DSEN: Data Science Engineering; DSRM: Scientific/Research Methods; DSDM: Data Management
  • 25. Model Curricula in Data Science Supporting EU Academy in excelling and matching the market needs
  • 26. Data Science Model Curriculum (MC-DS) • Data Science Model Curriculum includes – Learning Outcomes (LO) definition based on CF-DS • LOs are defined for CF-DS competence groups and for all enumerated competences – LOs mapping to Learning Units (LU) • LUs are based on CCS(2012) and universities' best practices • Data Science university programmes and courses inventory (interactive) http://edison-project.eu/university-programs-list – LU/course relevance: Mandatory Tier 1, Tier 2, Elective, Prerequisite – Learning methods and learning models (in progress) • Based on Bloom’s Taxonomy, Outcome Based Learning, etc.
  • 28. Engineering IT & Management school: some numbers (2015) • A portfolio of more than 300 courses • 200 trainers and experts • 5 offices and 16 classrooms • 18.000 training person-hours • New on-line platform [Map labels: Aosta, Roma, Padova, Milano, Frosinone]
  • 29. Data Science Master at Univ. of Perugia http://masterds.unipg.it/en/index.html
  • 30. Accreditation and Certification - RDA BoF Aim: contribute to the sustainable development of the data science profession. Goal: deliver a report that presents a concise but representative picture of the various accreditation and certification schemes that exist around the world. Outcome: need to develop a 9-month working group proposal centered on supporting the members of RDA in developing their own professional career paths around their own skills, interests and contexts.
  • 35. Sharing education events and experiences
  • 36. What’s next? Putting theory into practice and supporting service delivery
  • 38. What we can do with you 1. Improve and Validate EDSF 1. Identifying the “soft skills”: how to ask a research/business question? 2. Identifying the Community need: from stewards to scientists, any market, any discipline 3. Validate completeness of BoK, coverage of CF, usability of MC 4. Promote National workshop for bottom-up adoption of EDSF 2. Career Development 1. Specifications for DSP job positions in Data Management and Librarian teams and Engagement mechanisms Employers/DSP candidates 2. Links and Recommendations for placing students for getting DSP work experience 3. Facilitate cross-institutional agreements on DSP career paths 4. Supporting Training through DataSciencePro.eu 5. Mapping and comparing career paths and Learning opportunities for Personal Competence Portfolio (PCP) 6. Advice Events, Courses and Tools for Community training 7. Develop Virtual Labs, re-usable and promoted further out of your Community 8. Certification: from badges to professions – the How-to of a Community-driven Data Science Certification (RDA)
  • 39. Promoting the Data Science Profession
  • 40. • Invitation to contribute and cooperate: – Forum, EDISON Liaison Groups, Champions Conference (Spring & Summer 2017) • EDISON project website http://edison-project.eu/ • EDISON Data Science Framework Release 1 (EDSF) http://edison-project.eu/edison-data-science-framework-edsf • Community oriented – Survey of Data Science Competences (available soon)
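
The classification steps listed on slide 22 (text filtering, overlapping terms, TF-IDF, cosine similarity per category, CSV output) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the EDISON tool itself: the stopword list, the smoothed IDF formula, the function names, and the keyword descriptions attached to each category code are all assumptions made for the example. Only the category codes and their names come from the slides.

```python
import csv
import io
import math
import re
from collections import Counter

# Assumed, deliberately tiny stopword list used for the text-filtering step.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "with", "on"}

# Hypothetical keyword descriptions; only the codes and names are from the slides.
CATEGORIES = {
    "DSDA": "data analytics machine learning statistical methods business analytics",
    "DSEN": "software engineering infrastructure engineering programming",
    "DSDM": "data management curation preservation metadata data infrastructure",
    "DSRM": "scientific research methods experiments",
}


def tokenize(text):
    """Text filtering: lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Map each document name to a {term: TF-IDF weight} dict (smoothed IDF)."""
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    vectors = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        total = len(tokens) or 1
        vectors[name] = {
            term: (count / total) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; only overlapping terms contribute."""
    dot = sum(u[t] * v[t] for t in set(u) & set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify(text, categories=CATEGORIES):
    """Return {category: cosine similarity between the text and that category}."""
    docs = {name: tokenize(desc) for name, desc in categories.items()}
    docs["__input__"] = tokenize(text)
    vectors = tfidf_vectors(docs)
    query = vectors.pop("__input__")
    return {name: cosine(query, vec) for name, vec in vectors.items()}


def to_csv(scores):
    """Serialise the scores as CSV, one row per category, highest similarity first."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["category", "similarity"])
    for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        writer.writerow([name, f"{score:.4f}"])
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(classify("Job offer: data curation and preservation of metadata")))
```

Under these assumptions, a job description or CV is scored against each category and the resulting CSV could feed comparisons like those on slides 23 and 24. A real deployment would need much richer category vocabularies than the one-line descriptions used here.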