Software Carpentry @ Arizona!
Instructors
• Titus Brown
• Karen Cranston
• Rich Enbody

• Deren, Chas, Katie, Nirav
What do scientists care about?
1. Correctness
2. Reproducibility and provenance
3. Efficiency
What do scientists actually care about?
1. Efficiency
2. Correctness
3. Reproducibility and provenance
Our concern
• As we become more reliant on computational
  inference, does more of our science become wrong?
• “Big Data” increasingly requires sophisticated
  computational pipelines…
• We know that simple computational errors have gone
  undetected for many years
   – a sign error => retraction of 3 Science, 1 Nature, 1 PNAS
   – Rejection of grants, publications!
   http://boscoh.com/protein/a-sign-a-flipped-structure-and-a-scientific-flameout-of-epic-proportions
Our central thesis
With only a little bit of training and effort,
• Computational scientists can become more
  efficient and effective at getting their work
  done,
• while considerably improving correctness and
  reproducibility of their code.
Automation
Why Python, and not R?
In my opinion,

• Python is a more general-purpose language, while R is
  mostly about data analysis.

• Everyone will need to learn multiple languages; R and
  Python are pretty dominant in bio right now.

• Luckily, once you get the hang of it, new languages are not
  so difficult to pick up.

• Ultimately, we’re trying to teach process, not details.
Administrivia
• Asking for help

• Using the Web site

• Sticky notes: ok? Not ok?

• Minute cards: at the end of every session, write
  down
     • One thing you learned
     • One thing you are confused about

2013 arizona-swc