
Beyond The Bench Workshops


  1. 1. #BeyondTheBench   #BECareer2013   #CurrentExchange   ORGANIZERS: SPONSORS:
  2. 2. Establishing your online presence Robert Aboukhalil
  3. 3. Why? You’re being Googled
  4. 4. #1: LinkedIn
  5. 5. Why LinkedIn? •  Online CV + networking •  Recruiters use LinkedIn •  Find jobs posted on LinkedIn •  Apply to jobs
  6. 6. www.linkedin.com/pub/robert-aboukhalil/84/a648df/
  7. 7. #2: Facebook
  8. 8. #3: Twitter
  9. 9. #4: Your website
  10. 10. Step 1: Wordpress.com
  11. 11. Step 1: Wordpress.com
  12. 12. Step 2: themeforest.net
  13. 13. Step 2: themeforest.net
  14. 14. Step 3: Have an awesome portfolio
  15. 15. Now what?
  16. 16. A language all scientists should know How R helped me look at billions of genotypes and how it can help you too Mitchell Bekritsky WSBS Graduate Student
  17. 17. What is R? •  Language for statistical analysis, data manipulation and graphics •  Open source •  Flexible language •  Powerful built-in functions •  Strong user community •  Publication quality graphs •  Free! Graphic from http://blenditbayes.blogspot.com/2013/06/visualising-crime-hotspots-in-england_25.html
  18. 18. Who uses R? Source: http://www.revolutionanalytics.com/what-is-open-source-r/companies-using-r.php
  19. 19. What is R used for? •  Movie recommendations •  Clinical drug development •  Credit risk analysis •  News graphics •  Tailoring online advertising •  Modeling oil spills •  Predicting economic activity •  Predicting election outcomes Graphic from http://www.nytimes.com/interactive/2009/06/25/arts/0625-jackson-graphic.html
  20. 20. But I’m a biologist…
  21. 21. How R helped me see my data •  First time looking at microsatellite genotypes •  How many microsatellites differ from reference genome? •  By how much? Problems: –  Lots of data (4.7 million genotypes) –  Complex information –  Too big for Excel –  No good graphics in Excel either
  22. 22. One of my first graphs in R Lessons learned about my data •  Lots of microsatellites differ from reference by a little bit •  Thousands differ by ± 20 bp •  8.27% of all microsatellites differ from reference (~400k) Lessons learned about my graph •  This is a terrible graph
  23. 23. A bad R graph is better than no R graph Bad graphs helped me •  Understand my data better •  Improve my analyses •  Improve how I communicate my data •  R has incredible flexibility for graphing—if you can dream it, you can probably build it
  24. 24. A bad R graph is better than no R graph Bad graphs helped me •  Understand my data better •  Improve my analyses •  Improve how I communicate my data •  R has incredible flexibility for graphing—if you can dream it, you can probably build it My best R graphs make one point clearly without clutter
  25. 25. For example…
  26. 26. How R saved my thesis •  Processing lots of sequencing data in hundreds of people •  Too many people and processes to monitor all steps of pipeline by eye while data was being processed Sanity check •  After data processing did data look bi-allelic?
  27. 27. How R saved my thesis •  Processing lots of sequencing data in hundreds of people •  Too many people and processes to monitor all steps of pipeline by eye while data was being processed Sanity check •  After data processing did data look bi-allelic? No!!  
  28. 28. Troubleshooting using R •  People don’t actually have massive deletions and amplifications •  My pipeline was deleting files because of a bug, which would remove large chunks of chromosomes •  Thanks to R, I found people where this had happened, tracked down the bug, and didn’t report massive CNVs in autistic children Side note •  If it looks too good to be true, it probably is
  29. 29. R helped me build a better genotyper •  Some non-reference alleles aren’t covered well •  Leads to incorrect genotype calls Problem •  How do I develop a smarter genotyper and know that it works?
  30. 30. R helped me build a better genotyper •  Some non-reference alleles aren’t covered well •  Leads to incorrect genotype calls Problem •  How do I develop a smarter genotyper and know that it works? [Figure: chr19:54772760, A repeat, reference length 8; 10 bp allele coverage vs. 8 bp allele coverage; genotypes 10|-1, 10|10, 8|-1, 8|10, 8|8]
  31. 31. Modeling genotypes in R •  Built a model for biased genotypes in R •  Model helped me build a more accurate genotyper •  When applied to real data, clear improvements
  32. 32. R finds de novo mutations for me •  >300 million genotypes •  How do I find de novo mutations in all that data? R to the rescue!
  33. 33. What R has done for me Data mining •  Finding de novo mutations •  Quality control for my data Data manipulation •  Converting raw read counts to genotypes Data simulation and modeling •  Finding ways to improve my genotyper Data visualization
  34. 34. R has extensive support for biologists Bioconductor is an incredible resource for biological analyses in R •  Microarrays •  Differential expression (DESeq, edgeR, cummeRbund) •  Gene models •  Flow cytometry (flowCore, flowStats, flowViz) •  Interacting with Ensembl, Cosmic, Gramene, etc. (biomaRt)
  35. 35. Installing R •  R can be downloaded from r-project.org •  R runs on PCs, Macs and Linux computers •  The R project website has an R manual to get you started
  36. 36. Working in R Native R interface can be hard to work with •  Lots of windows •  Difficult to keep things organized
  37. 37. RStudio interface •  All your variables, help pages, script windows and consoles in one place •  Highlights R code for easier programming •  Tabbed windows for multiple scripts •  History saves all previous commands, plot history saves all previous plots •  Find it at rstudio.com
  38. 38. Learning R Many online tutorials •  R has its own introduction •  Statistics Using R with Biological Examples Take interesting data, use it to explore R •  Plot, graph, use statistical tests Ask someone who knows R •  Getting started is pretty easy •  Learn what you need when you need it
  39. 39. Thanks!!
  40. 40. The Bioscience Enterprise Club is dedicated to helping CSHL’s science research professionals and alumni cultivate and leverage their cross-disciplinary skill sets and expertise to transition into diverse careers.
  41. 41. Current Exchange is CSHL’s very own student-run magazine. We feature articles about science aimed at a general audience. Check out our inaugural issue at issuu.com/currentexchange Send your articles to raboukha@cshl.edu by November 5, 2013
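
As a rough illustration of the first look at the data described in slides 21 and 22 (how many microsatellites differ from the reference, and by how much), here is a minimal base R sketch. The data are simulated and every name in it (`diff_bp`, `differs`, `frac_diff`) is hypothetical; the 8% rate and the shape of the size distribution are assumptions picked for the example, not the speaker's data or code.

```r
# Simulated stand-in data: length difference (bp) between each
# microsatellite genotype and the reference. Purely illustrative.
set.seed(1)
n <- 10000

# Assume roughly 8% of sites differ from the reference
differs <- runif(n) < 0.08

# Assume most differences are small; cap them at 20 bp
size <- pmin(1 + rgeom(n, prob = 0.3), 20)
direction <- sample(c(-1, 1), n, replace = TRUE)
diff_bp <- ifelse(differs, direction * size, 0)

# How many microsatellites differ from the reference?
frac_diff <- mean(diff_bp != 0)
cat(sprintf("%.2f%% differ from reference\n", 100 * frac_diff))

# By how much? One histogram bar per bp, excluding sites that match
nonzero <- diff_bp[diff_bp != 0]
hist(nonzero,
     breaks = seq(-20.5, 20.5, by = 1),
     main = "Microsatellite length vs. reference",
     xlab = "Difference from reference (bp)",
     ylab = "Number of microsatellites")
```

With real genotype calls, `diff_bp` would instead be computed from the called allele lengths and the reference length; the summary and plotting steps would stay the same.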
