WHAT IS DATA SCIENCE?
Jeffrey Stanton
School of Information Studies
Syracuse University
BIG DATA
KILO, MEGA, GIGA, TERA, PETA, EXA
ZETTA = 10^21 BYTES
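The prefix ladder on this slide can be written out as a small lookup table. The following is a minimal sketch, assuming the standard SI decimal meaning of each prefix (each step is a factor of 1,000); the names `SI_PREFIXES`, `bytes_for`, and `to_largest_prefix` are hypothetical and chosen only for illustration.

```python
# SI decimal prefixes named on the slide, mapped to their powers of ten.
# Assumption: decimal (SI) prefixes, not binary ones such as kibi or mebi.
SI_PREFIXES = {
    "kilo": 3,
    "mega": 6,
    "giga": 9,
    "tera": 12,
    "peta": 15,
    "exa": 18,
    "zetta": 21,
}


def bytes_for(prefix: str, count: int = 1) -> int:
    """Return the number of bytes in `count` units of the given prefix."""
    return count * 10 ** SI_PREFIXES[prefix]


def to_largest_prefix(n_bytes: int) -> tuple[float, str]:
    """Express a byte count using the largest prefix that fits."""
    # Walk the table from the largest exponent down to the smallest.
    for prefix, exponent in sorted(SI_PREFIXES.items(), key=lambda kv: -kv[1]):
        if n_bytes >= 10 ** exponent:
            return n_bytes / 10 ** exponent, prefix
    # Smaller than one kilobyte: no prefix applies.
    return float(n_bytes), ""


if __name__ == "__main__":
    print(bytes_for("zetta"))                       # one zettabyte in bytes
    print(bytes_for("zetta") // bytes_for("tera"))  # terabytes per zettabyte
    print(to_largest_prefix(5 * 10 ** 21))
```

Under these assumptions, one zettabyte equals a billion terabytes, which gives a sense of the scale the slide is pointing at.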
…An organization employing 1,000 knowledge workers loses $5.7 million annually just in time wasted having to reformat information as they move among applications. Not finding information costs that same organization an additional $5.3m a year.
Source: IDC

Over 95% of the digital universe is "unstructured data", meaning its content can't be truly represented by its field in a record, such as name, address, or date of last transaction. In organizations, unstructured data accounts for more than 80% of all information.
Source: IDC
WHY DATA SCIENCE?

• Available data on a scale millions of times larger than 20
  years ago: customer transactions; environmental sensor
  outputs; genetic and epigenetic sequences; web documents;
  digital images and audio
• Heterogeneous data sets, with different representations and
  formats; mixtures of structured and unstructured data;
  some, little, or no metadata; distributed across systems
• Chaotic information life cycle, where little time and effort is
  spent on what should be kept and what can be discarded
• Diverse and/or legacy infrastructure: mainframes running
  Cobol connected with high speed networks to sensor arrays
  running Linux
CRITICAL QUESTIONS

• How will global climate change affect sea levels in major
  coastal metropolitan areas worldwide?
• Does genetic screening reduce cancer mortality for adults
  between the ages of 50 and 59?
• What gene sequences in cereal grains are associated with
  greater crop yields in arid environments?
• How can we reduce false positives in automated airline
  baggage scans without reducing accuracy?
• What Internet data can be mined as predictive of firm
  creation among startups that provide new jobs?
“BIG DATA” PROVIDES ANSWERS

• Water sustainability
• Climate analysis and prediction
• Energy through fusion
• CO2 sequestration
• Hazard analysis and management
• Cancer detection and therapy
• Drug design and development
• Advanced materials analysis
• New combustion systems
• Virtual product design
• In silico semiconductor design
NSF Advisory Committee for Cyberinfrastructure, Taskforce for Grand Challenges, Final Report,
March 2011. http://www.nsf.gov/od/oci/taskforces/TaskForceReport_GrandChallenges.pdf
“All grand challenges face barriers due to challenges in software, in data management and visualization, and in coordinating the work of diverse communities that must work together to develop new models and algorithms, and to evaluate outputs as a basis for critical decisions.”
NSF Advisory Committee for Cyberinfrastructure, Taskforce for Grand Challenges, Final Report, March 2011. http://www.nsf.gov/od/oci/taskforces/TaskForceReport_GrandChallenges.pdf
[Diagram: Knowledge Development for Industry, Education, Government, Research]
Domain Experts: expertise in specific subject areas; limited opportunity to master technology skills; proliferation of big data & new technology; need for knowledge and information managers
Infrastructure Professionals: rapid pace of IT development; limited expertise in domain areas; specialized knowledge of HW, FW, MW, SW; communication challenges
Data Scientists (center, linking the two groups): Information Organization & Visualization; Information Analysis; Solution Integration; Digital Curation
Data Scientists: Transforming Data Into Decisions
A DEFINITION OF A DATA SCIENTIST

• A data scientist uses deep expertise in the
  management, transformation, and analysis of large,
  heterogeneous data sets to:
   • Help infrastructure experts with the architecture of hardware
     and software to manage big data challenges
   • Help domain experts and decision makers reduce the data
     deluge into usable knowledge, visualizations, and
     presentations
   • Help institutions and organizations control and curate data
     throughout the information lifecycle


Editor's notes

  1. Facebook friend connections worldwide, a network diagram of the Enron email set, a comparison of similar gene sequences between humans, chimps, and macaques
  2. HW, FW, MW, SW: Hardware Firmware Middleware Software