Workshops in next-generation science at UNC Charlotte 2014
Workshop 1 - Design, sequence, align, count, visualize
  
Workshop Locations
•  Section 1 - Room 801
– Ann Loraine, UNC Charlotte
– Naim Matasci, University of Arizona, iPlant
•  Section 2 - Room 802
– Ivory Clabaugh Blakley, UNC Charlotte
– Xiangqin Cui, University of Alabama Birmingham
•  Please stay in your section
– Both sections cover the same material, but timing may vary
  
Meet your TAs
•  Graduate students from UNCC Dept of Bioinformatics and Genomics
–  801 Roshonda Barner, Ibro Mujacic, Chi-Yu "Jack" Yen, Warren (G.) Cole, Tony Dao, Greg Linchango, Sushma Madamanchi, Anuja Jain
–  802 Richard Linchangco, Fred Lin, Chris Ball, Lu Tian, Shawn Chaffin, Natascha Moestl, Walter Clemens, Adriano Schneider
•  Loraine Lab members
–  801 Kyle Suttlemyre (IGB support), April Estrada (Research Specialist, Expert IGB User)
–  802 David Norris (IGB Developer)
  
Schedule
•  Workshop 1 - planning an experiment, data processing, visualization
– 9:00 to 11:30, then Lunch
•  Workshop 2 - introduction to R & RStudio for data analysis, differential expression
– 12:30 to 2:30, then a 30-minute break
•  Workshop 3 - biological interpretation using pathway tools, Gene Ontology, the Web
– 3:00 to 5:00, then Done
  
Using RNA-Seq data set for WiNGS2014
pollennetwork.org
•  Sponsored by Pollen Research Coordination Network in Integrative Pollen Biology (annual meeting starts tonight)
•  Visit Web site for more info
  
RNA-Seq data set for the workshop
•  Goal: Provide resources for pollen biology
–  Example RNA-Seq data analysis
–  Catalog of genes expressed in pollen
–  Highlight important area of pollen research
•  Problem: Pollen in some plant species is vulnerable to heat stress, which reduces yields
–  Exposure to mild heat stress (acclimation) can protect against more severe stress later - called acquired thermotolerance (Firon 2012)
•  To learn more, we sequenced RNA extracted from pollen undergoing a mild heat stress
–  Same temperature that can establish thermotolerance
  
Samples from the lab of Nurit Firon, Volcani Institute, Israel
•  Firon lab studies effects of heat stress on tomato pollen
•  Showed (along with others) that high temp. reduces pollen viability, sugar content
•  Studying a heat-tolerant tomato cultivar: Hazera 3042
– Pollen is sensitive to heat stress but not as much as other varieties
  
Nurit's experiment: RNA-Seq of heat-tolerant tomato cultivar Hazera 3042
•  Collected pollen from plants growing in temperature-controlled greenhouses
–  Control 25/18° C optimal temperature
–  Treatment 32/26° C mild chronic heat stress
•  Collected batches of pollen from ~10 plants during Sep. & Oct. 2013
–  One treatment, one control per collection
–  Made RNA from five collections, 5 treatment, 5 control "batches"
–  Sequenced at UCLA (69 base, PE)
  
Arabidopsis cold stress RNA-Seq
•  Simpler data set with one treatment & control
–  Using data from part of chr1, treatment sample to illustrate data processing, visualization, effects of parameter settings on results (maximum intron size in tophat spliced alignment program)
•  For details, see:
–  experiment record at the Short Read Archive http://www.ncbi.nlm.nih.gov/sra/SRP029896
–  sample http://www.ncbi.nlm.nih.gov/sra/SRX348640
•  Published in Methods in Molecular Biology http://www.ncbi.nlm.nih.gov/pubmed/24792048
  
Workshop 1: RNA-Seq: Design, sequence, align, count, visualize
wings 2014
  
Goals
•  Learn the basics (20')
– Plan an experiment
– Library prep for RNA-Seq
– Illumina sequencing
•  Practice: Quality analysis using FastQC (30')
•  Practice: Data processing (30')
– Align reads (make BAM files and junction files)
– Make counts files for statistical analysis
– Merge reads into transcript models w/ Cufflinks
•  Practice: Visualize results in IGB (60')
– Compare to data set in Galaxy, TAIR10 gene models
  
Overview
[Figure: workflow diagram. Labels: Sequencing Strategy; FASTQ files (WildType1a.fastq); FASTQC; Alignment onto Genome ($Command Line…); WildType1a.bam; Visualization using IGB; Generation of Counts Data (Counts.txt); Workshop 1; Workshop 2]
  
RNA-seq: ultra-high throughput cDNA sequencing
•  Several papers published in 2008, first in May
http://blog.sbgenomics.com/rna-seq-the-first-wave/
[Figure labels: Ecker lab, Snyder lab; 999 cites, 1,076 cites]
  
Mortazavi 2008 "Mapping and quantifying mammalian transcriptomes by RNA-Seq" Nature Methods
•  Published later in 2008, but > 3000 citations
•  Why? Maybe because it emphasized RNA-Seq as a replacement for expression DNA microarrays
•  Comment in same issue: "Beginning of the end for microarrays?"
(Citation counts: Google Scholar)
  
RNA-Seq Overview - Illumina
1. Design
•  Plan experiment
–  Biological replication
–  Sequencing strategy
–  Data analysis strategy
2. Making Libraries
•  Collect samples
•  Extract RNA, purify polyA+
•  Fragment
•  Synthesize cDNA (random hexamers)
•  Repair ends
•  Add "A" bases to 3' ends
•  Ligate adapters
•  Amplify
•  Quality assessment
•  Library reflects RNA from original sample
3. Sequencing
•  Prepare flowcell
•  Sequence by synthesis
4. Data Analysis
•  Data, fastq sequence files: millions of reads per library
•  Map to genome (alignments)
•  Count reads per gene
•  Identify differentially expressed genes
•  Improve gene models
•  Analyze splicing
•  And much more...
  
Five steps for design
1.  Articulate your questions or hypothesis
2.  Define your unit of biological replication
3.  Write up your sample collection protocol in detail
–  Does the protocol allow you to test your hypothesis?
4.  Define library synthesis & sequencing strategy
–  Read lengths, paired end vs. single end, depth, barcoding
5.  Ask an experienced data analyst to review your plan, revise as needed
  
Library synthesis
[Figure: library synthesis steps. Labels: Fork or "Y" adapters; size selection]
Image: David C Corney Ph. D. http://www.labome.com/method/RNA-seq-Using-Next-Generation-Sequencing.html
  
Example library molecule
•  Y adapters contain indexes, allow multiplexing
[Figure labels: Universal adapter; Rd1; Unknown sequence; Rd2; barcode; Index Primer]
  
Flow cell preparation & sequencing by synthesis
•  Rd1 & Rd2 are read from opposite strands (reverse complements) and might overlap.
Ref: http://nextgen.mgh.harvard.edu/IlluminaChemistry.html
[Figure labels: P5, P7, Rd1, Rd2]
  
https://www.youtube.com/watch?v=HMyCqWhwB8E
  
	
  
Review: Paired End vs Single End
•  Single End – cheaper
•  Paired End – more expensive
– two reads per fragment
– counting fragments, not reads
– call normalized counts FPKM not RPKM
[Figure labels: sequenced in SE; sequenced in PE; indexed adapter]
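
The FPKM normalization named above (fragments per kilobase of transcript per million mapped fragments) can be sketched in a few lines of Python. This is a minimal illustration, not the workshop's own code; the function name and the example numbers are made up for the purpose.

```python
def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    kilobases = gene_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragment_count / (kilobases * millions)


# Hypothetical gene: 500 fragments, 2,000 bases long,
# in a library with 10 million mapped fragments.
example_value = fpkm(500, 2_000, 10_000_000)
print(example_value)
```

For paired-end data each fragment is counted once even though it yields two reads, which is why the unit is FPKM rather than RPKM.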
  
Get the reads in a FASTQ file
•  File contains millions of records
– Each record has four lines, represents ONE sequence
•  Line 1 – the name, starts with @
•  Line 2 – the sequence, starts at new line
•  Line 3 – starts with +, optionally followed by the name again
•  Line 4 – the quality scores, starts at new line
@SN1083:379:H8VA1ADXX:2:1101:1248:2144 1:N:0:12
CCTAAATGGTGCCATGCTAGGAGGCCGTGCCCTTCTTGAAAAGTTGTATGTGAA
+
BBBFFFFFFBFFFIIIIFI<FFIIIIIFIIIIFBFIIIIIIIIFFFIIIIFIII
base = T
score = F = 37
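
The four-line record structure described above can be read with a short Python generator. This is a minimal sketch that assumes each sequence sits on a single line (as in the example record); the function name `read_fastq` is made up for illustration.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) tuples from an open FASTQ text handle."""
    while True:
        name = handle.readline().rstrip("\n")
        if not name:
            break  # end of file
        sequence = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not name.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {name!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {name!r}")
        yield name[1:], sequence, quality


# The example record from the slide, held in memory instead of a file.
record_text = (
    "@SN1083:379:H8VA1ADXX:2:1101:1248:2144 1:N:0:12\n"
    "CCTAAATGGTGCCATGCTAGGAGGCCGTGCCCTTCTTGAAAAGTTGTATGTGAA\n"
    "+\n"
    "BBBFFFFFFBFFFIIIIFI<FFIIIIIFIIIIFBFIIIIIIIIFFFIIIIFIII\n"
)
records = list(read_fastq(io.StringIO(record_text)))
print(len(records), records[0][0])
```

For a gzip-compressed file, the same function should work on a handle opened with `gzip.open(path, "rt")`.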
  
Phred Quality score Q
http://en.wikipedia.org/wiki/FASTQ_format
Describes how exponentially unlikely it is that a given base call is wrong.
$Q = -10 \log_{10} p_e$, where $p_e$ is the probability that the base call is wrong
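
The relationship between a quality character, its Phred score Q, and the error probability can be sketched in Python. This assumes the Phred+33 encoding, which matches the slide's example of F = 37; other Illumina pipelines used other offsets, so `offset` is a parameter. The function names are illustrative.

```python
def phred_scores(quality_string, offset=33):
    """Convert FASTQ quality characters to integer Phred scores."""
    return [ord(character) - offset for character in quality_string]


def error_probability(q):
    """Invert Q = -10 * log10(p_e) to get the error probability p_e."""
    return 10 ** (-q / 10)


scores = phred_scores("BFI<")
print(scores)
print(error_probability(37))
```

A score of 30 corresponds to a 1 in 1,000 chance that the base call is wrong, and every additional 10 points makes an error ten times less likely.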
  
Different Illumina data processing pipelines used different score encodings
http://drive5.com/usearch/manual/quality_score.html
  
Get two files - Read1 & Read2 - from paired end sequencing
•  Read1 and Read2 have the same read identifier and come from opposite ends (and opposite strands) of the same fragment
•  Example is from processing pipeline CASAVA 1.8; older versions used different naming conventions
R1
@SN1083:379:H8VA1ADXX:2:1101:1248:2144 1:N:0:12
CCTAAATGGTGCCATGCTAGGAGGCCGTGCCCTTCTTGAAAAGTTGTATGTGAA
+
BBBFFFFFFBFFFIIIIFI<FFIIIIIFIIIIFBFIIIIIIIIFFFIIIIFIII
R2
@SN1083:379:H8VA1ADXX:2:1101:1248:2144 2:N:0:12
CATTTTCGACGTTGTTAATAAGCTCTGCGTACTTGCAAGCTATCTGCGCGAACG
+
BBBFFFFFFFFFFIIIIIIIIIIIIIIIIFIIIIIIIIIIIIIIIIIIIIIFFF
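
A quick way to confirm that two records are mates is to compare the part of the identifier before the space and check the read numbers after it. The sketch below assumes the identifier layout shown in the example; the helper names are made up.

```python
def split_identifier(line):
    """Return (fragment name, read number) from a FASTQ identifier line."""
    name, _, description = line.lstrip("@").partition(" ")
    read_number = description.split(":")[0]
    return name, read_number


def is_mate_pair(r1_line, r2_line):
    """True if both lines name the same fragment and are read 1 and read 2."""
    name1, number1 = split_identifier(r1_line)
    name2, number2 = split_identifier(r2_line)
    return name1 == name2 and (number1, number2) == ("1", "2")


r1 = "@SN1083:379:H8VA1ADXX:2:1101:1248:2144 1:N:0:12"
r2 = "@SN1083:379:H8VA1ADXX:2:1101:1248:2144 2:N:0:12"
print(is_mate_pair(r1, r2))
```

Records are expected to appear in the same order in both files, so a check like this can be run record by record while reading the Read1 and Read2 files side by side.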
  
Sequence identifier line in CASAVA 1.8
@SN1083:379:H8VA1ADXX:2:1101:1248:2144 1:N:0:12
•  Before the space: machine : run# : flow-cell-id : lane : tile : x-pos : y-pos
•  After the space: read# : is-filtered : control : index (barcode)
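
The field layout above can be turned into a small parser. This is a sketch under the assumption that every identifier follows the layout shown on this slide; the dictionary keys are names chosen here for illustration.

```python
def parse_identifier(line):
    """Split an identifier line (layout as on the slide) into named fields."""
    location, description = line.lstrip("@").split(" ", 1)
    machine, run, flowcell, lane, tile, x_pos, y_pos = location.split(":")
    read, is_filtered, control, index = description.split(":")
    return {
        "machine": machine,
        "run": int(run),
        "flowcell_id": flowcell,
        "lane": int(lane),
        "tile": int(tile),
        "x_pos": int(x_pos),
        "y_pos": int(y_pos),
        "read": int(read),
        "is_filtered": is_filtered == "Y",  # "N" means the read passed filtering
        "control": int(control),
        "index": index,  # barcode sequence or sample number
    }


fields = parse_identifier("@SN1083:379:H8VA1ADXX:2:1101:1248:2144 1:N:0:12")
print(fields["machine"], fields["lane"], fields["tile"], fields["read"])
```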
  
FastQC
•  Many groups use FastQC as a first pass quality assessment
•  Free from Babraham http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
•  Run interactively (point-and-click) or command line (won't cover this)
  
Practice: Using FastQC
•  Go to Conference DropBox link:
–  http://bitly.com/rnaseq2014
•  Note two folders – FastQC and FastQC-Examples
–  FastQC-Examples has FastQC reports from different species, sample types (next slide)
•  FastQC folder, download
–  Example.fastq
–  FastQC_Manual.pdf
•  Start FastQC, open Example.fastq
  
Practice: Watch FastQC video
•  https://www.youtube.com/watch?v=bz93ReOv87Y (start around 34 sec)
•  Take-home #1: FastQC assesses whether your data files are typical
•  Take-home #2: A "bad result" from FastQC doesn't always mean your data are not useful or valuable
•  Explore on your own! (~15 minutes)
  
Practice: View reports in FastQC-Examples (~15 min)
•  Blueberry
– OnealRipe_1
– OzarkblueGreen_1
•  Tomato pollen
– T2_1
– C2_1
•  Rice
– Control2h-R2
[Figure: Per read %GC]
  
Practice: Data processing
•  Double-click "Alignment.tar.gz" on your Desktop to unpack it
•  Also available from http://bitly.com/rnaseq2014
  
Practice: Look at "align.sh"
•  Open Alignment folder
•  Right-click "align.sh"
•  Select "open with text editor"
•  This is a shell script
–  Commands executed in sequence
–  Very useful for automating tasks
•  First line is "she-bang" line
–  tells Terminal it's a shell script
•  All other lines starting with # are comments (not run)
[Figure: book cover, "Learning the bash shell". Great guide to writing shell scripts]
  
align.sh - simple pipeline for RNA-Seq data processing
•  Aligns a sample fastq file to genome
–  tophat2, bowtie2
–  fastq file is from Arabidopsis cold stress experiment (Short Read Archive SRX348640)
–  file ColdTreatment-little.fastq.gz (gzip-compressed, .gz)
•  Counts reads that align to TAIR10 genes
–  featureCounts
–  only counting reads that uniquely align
•  Merges alignments into transcript models
–  cufflinks
  
Practice: Intro to Terminal
•  Double-click Terminal shortcut on desktop
–  Program for entering commands or running scripts
–  Also called a "shell" or "Unix shell"
–  Can open multiple Terminal windows, each one a separate shell
•  Terminal shows hierarchical view of file system
–  An upside-down tree, where every folder is inside another folder
–  Folders are also called "directories"
–  The top folder (that contains everything else) is called "root" directory - / (forward slash)
  
Practice: Open Terminal, try these commands
•  cd - change directory
–  by itself means "go to user home directory"
–  with an argument means: go there
–  with ".." means go up one
•  pwd - "print the current working directory" & find out where you are
  
Practice: Try these commands
•  ls - lists files and directories in the current directory
  
Practice: Try these commands
•  ls -l "list long"
– report more information about files
– "d" means it's a directory (folder)
  	
  	
  
Practice: Run align.sh in Terminal
•  Go to home directory
•  Go to Desktop
•  Go to Alignment
•  Run align.sh
  
Now running: tophat2 spliced alignment tool
[Figure: Figure 1 from "TopHat: discovering splice junctions with RNA-Seq", Cole Trapnell, Lior Pachter and Steven L. Salzberg]
  
Tophat Output - we'll open in IGB
•  Creates new folder with files, including...
•  accepted_hits.bam - "binary alignments" file contains read alignments
–  BAM - compressed version of SAM - "sequence alignment", needs index ".bai" file (made using samtools)
•  junctions.bed - reports boundaries of introns, called "junction" features
–  BED format, tab-delimited plain text file
–  one junction feature per line
–  fifth field is score, no. spliced reads aligned across the junction
–  see: http://genome.ucsc.edu/FAQ/FAQformat.html#format1
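
Because junctions.bed is tab-delimited plain text, the score in the fifth field is easy to read programmatically. The sketch below assumes the standard BED column order and skips a leading "track" line if present; the two junction lines are invented for illustration, not taken from the workshop data.

```python
def parse_junction(line):
    """Return the first six BED fields of one junction line as a dict."""
    fields = line.rstrip("\n").split("\t")
    return {
        "chrom": fields[0],
        "start": int(fields[1]),
        "end": int(fields[2]),
        "name": fields[3],
        "score": int(fields[4]),  # number of spliced reads across the junction
        "strand": fields[5],
    }


def read_junctions(lines):
    """Yield parsed junctions, skipping blank, comment, and track lines."""
    for line in lines:
        if not line.strip() or line.startswith(("track", "#")):
            continue
        yield parse_junction(line)


# Hypothetical file contents for illustration only.
example_lines = [
    'track name=junctions description="example"\n',
    "Chr1\t1000\t1500\tJUNC00000001\t12\t+\t1000\t1500\t255,0,0\t2\t50,60\t0,440\n",
    "Chr1\t3000\t3400\tJUNC00000002\t3\t-\t3000\t3400\t255,0,0\t2\t40,45\t0,355\n",
]
junctions = list(read_junctions(example_lines))
best = max(junctions, key=lambda junction: junction["score"])
print(best["name"], best["score"])
```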
  
Practice: Start IGB while script runs
•  Double-click IGB desktop icon
•  Click Arabidopsis flower on start screen
  
Practice: How to get IGB if you're using your own computer
•  Go to http://bioviz.org
•  Follow Download link
•  Choose Medium Memory option (typical)
  
TAIR10 annotations, June 2009 Columbia-0 genome release
•  TAIR10 protein-coding gene models loaded automatically from IGB data server
•  Forward & reverse strand in separate tracks
[Figure labels: Forward, Reverse]
  
RNA-Seq, ChIP-Seq, other data sets available in Data Access tab
•  IGB data servers, can set up your own
  
Arabidopsis pollen data sets
•  Read alignments, coverage graphs, junction files
•  From 2013 Plant Phys. Pollen RNA-Seq paper
  
Practice: Combine Plus & Minus Tracks
•  Click "+/-" to combine tracks
•  Use Data Management Table to change track color, name, visibility, load options, strand options
Summary of moving and zooming
•  Animated zooming
–  click to position zoom stripe, sets zoom focus
–  horizontal zoom & vertical stretch
•  Moving from side to side (panning)
–  arrows in toolbar
–  hand icon - the move tool
•  Jump-zooming
–  Click-drag coordinate axis with arrow tool
–  Double-click to zoom in on a feature
–  Search by name
  
Practice: Zoom in on a feature
•  Zoom in on alt-spliced gene models * on chr1
•  This is animated zooming
1. Click to set zoom focus
2. Drag slider to zoom in
[Figure: IGB screenshot; * marks the gene models to zoom in on]
  
Practice: Click move arrows to reposition during zoom
•  Click data display to re-focus zoom on target location
  
Practice: Or use move tool (hand) to reposition during zoom
•  Click display to focus zoom on target
1. Select move tool (hand)
2. Click-drag to move
  
Practice: Click-drag sequence axis to jump-zoom to a region
1. Select pointer tool
2. Click number line
3. Drag
4. Release
•  Highlighted region becomes new view
  
Practice: Jump-zoom to gene model
•  Double-click label, space a little above exon blocks, or intron to jump-zoom to a gene model
–  Also selects it, selected items outlined in red
1. Select pointer tool
2. Double-click label or intron
  
After jump-zoom, gene model is selected
•  Arrows indicate direction of transcription
•  Selected gene model outlined in red
  
Practice: Gene model close-up
•  Use vertical slider to make gene models taller
•  Increase window size to make more room
•  Drag slider to stretch vertically
  
Practice: Interact with data using pointer. Select pointer (arrow) in toolbar
•  Click intron, label, or region above blocks to select whole gene model
•  Click blocks to select parts of a gene model
•  SHIFT-click to multi-select
•  CLICK-drag to select & count everything in a region
•  Selection Info, top right, reports counts
–  "i" button shows info if one item selected
  
Practice: View edge matching
•  Edges that match selected item edges are highlighted in red
•  To change edge-match color choose File > Preferences > Other Options
•  To turn off or on, see View > Edge Matching
  
Practice: to work with sequence data, click Load Sequence
•  Sequence appears in Coordinates track
  
Practice: Zoom in to see amino acids
•  Note: Must load genomic sequence first
  
Practice: Zoom in on end of translation
•  Click the "thick end" and then zoom in
•  Note: Variants encode same C-term amino acids
  
Practice: Select genomic sequence
1. Choose pointer tool in toolbar
2. Click-drag genomic sequence to select a region
3. CTRL-click to copy
•  Length of selected region reported in Selection Info box (top right)
•  Useful for designing primers, measuring regions
  
Practice: Right-click (or CTRL-click) gene model
•  Shows options to run a Web search, BLAST search, view sequence
  
Practice: Quick Search
•  Enter search text, select option
•  Jump-zoom to selected gene
•  Choose At-SR30
  
Zoomed to At-SR30, RNA-binding protein involved in splicing
  
Looking ahead to Workshop 3
•  Some genes that were highly expressed in tomato pollen are annotated as "Unknown" proteins & have no counterpart in Arabidopsis.
•  You can use IGB to quickly find those genes and then run BLASTX or BLASTP searches at NCBI to find out...
– Are they unique to tomato?
– Could they be non-coding?
  
Practice: Open files from align.sh
•  Zoom out to show more of At-SR30 region
•  Choose File > Open
– Select "accepted_hits.bam" & "junctions.bed"
•  A new empty track appears for each file
•  Click Load Data to load reads and junctions
  
[Figure: IGB screenshot. Labels: read alignments stack; reads at top of stack not being shown (too many to fit); junction features, summarizing spliced reads]
  
Practice: Configure view - Load Sequence
•  Click Load Sequence to load genomic bases for this region
  	
  
Practice: Configure view - Lock mRNA track height
1. Click TAIR10 mRNA track label to select it
2. Open Annotation tab
3. Select Lock Track Height, enter 170, click Apply
  
Practice: Configure view - configure junction track
1. Click junctions track label to select junctions track
2. Open Annotation tab
3. Select score in Label Field
4. Select +/- in Strand
  
Practice: Configure view - lock junction track height
1. Click junctions track label to select it
2. Open Annotation tab
3. Select Lock Track Height, enter 120, click Apply
  
Practice: Change read stack height to see more reads
1. CTRL-click (or right-click) accepted_hits.bam track label
2. Choose Set Stack Height...
3. Enter 50
  
Practice: Set mRNA stack height
1. Right-click TAIR10 mRNA track label, choose Set Stack Height
2. Enter 3 - tallest stack has 3 models
Note: Tabs are minimized to make more space
  
Practice: Note read support for alternative splicing
Take-home: Many spliced reads support both variants, but there are also many reads inside the introns, indicating failure to splice. This may be typical of alt-spliced introns?
  
Practice: Use junction track to quantify support for splice variants
1.  Click-drag to genes track
2.  Scores are number of spliced reads supporting each junction.
  
Practice: Compare Cufflinks GTF file to gene models
•  Open Alignments > cufflinks_cold > transcripts.gtf
  
Practice: View Cufflinks gene models
1. Click Load Data to see Cufflinks models
2. Click-drag new track next to gene models
3. Use vertical slider to make more room
Take-home: Cufflinks annotations close, but incomplete.
  	
  	
  
Practice: Load data from Galaxy
1. Go to usegalaxy.org
2. Open Shared Data
3. Choose Published Histories
  
Practice: Load data from Galaxy
1. Search for Cold
3. Select Cold stress in Arabidopsis (with default maximum intron size)
  	
  
Practice: Load data from Galaxy
•  Illustrates results when tophat is run with default settings:
–  default maximum intron size is 500,000 bases
•  Tophat was developed with human data in mind, where large introns are common
•  Select Import History
  	
  
Practice: Select "start using this history"
1. Select Treatment junctions
2. Select display in IGB View
New tab opens. Select Click to go to IGB
New track
1. Click Load Data
  
Practice: Remove reads - don't need them now
1.  Right-click accepted_hits.bam
2.  Choose Delete Track
  
1.  Zoom out all the way
2.  Click Load Data
Your data are here
Take-home: Tophat run with default parameters predicts enormous introns. Important to understand parameter settings -- defaults are not always best.
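
One way to spot implausibly large introns outside the browser is to compute each junction's intron length from the BED block fields. The sketch below assumes TopHat's junction layout, in which each feature has two blocks (the read overhangs on either side) and the intron is the gap between them. The 5,000-base threshold and the two example lines are hypothetical, chosen only for illustration.

```python
def intron_length(bed_line):
    """Length of the gap between the two blocks of a junction feature."""
    fields = bed_line.rstrip("\n").split("\t")
    block_sizes = [int(size) for size in fields[10].rstrip(",").split(",")]
    block_starts = [int(start) for start in fields[11].rstrip(",").split(",")]
    # The intron begins where block 1 ends and stops where block 2 begins.
    return block_starts[1] - block_sizes[0]


def oversized_junctions(bed_lines, max_intron=5_000):
    """Return names of junctions whose intron exceeds max_intron bases."""
    names = []
    for line in bed_lines:
        if not line.strip() or line.startswith(("track", "#")):
            continue
        if intron_length(line) > max_intron:
            names.append(line.split("\t")[3])
    return names


# Hypothetical junction lines for illustration only.
example_bed = [
    "Chr1\t1000\t1500\tJUNC00000001\t12\t+\t1000\t1500\t255,0,0\t2\t50,60\t0,440\n",
    "Chr1\t2000\t402100\tJUNC00000002\t1\t-\t2000\t402100\t255,0,0\t2\t40,60\t0,400040\n",
]
print(oversized_junctions(example_bed))
```

Junctions flagged this way are candidates for closer inspection in IGB; a large intron supported by only one or two reads is more likely an alignment artifact than one supported by many.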
  
Now you can
•  Describe Illumina library synthesis, sequencing
•  Evaluate data quality using FastQC
•  Run a data processing pipeline (shell script)
•  View and explore data in a genome browser
– and load data sets from Galaxy, local files
  
Thank you for your attention!
  

Contenu connexe

Tendances

2015.04.08-Next-generation-sequencing-issues
2015.04.08-Next-generation-sequencing-issues2015.04.08-Next-generation-sequencing-issues
2015.04.08-Next-generation-sequencing-issues
Dongyan Zhao
 
DEseq, voom and vst
DEseq, voom and vstDEseq, voom and vst
DEseq, voom and vst
Qiang Kou
 

Tendances (20)

RNA-seq: analysis of raw data and preprocessing - part 2
RNA-seq: analysis of raw data and preprocessing - part 2RNA-seq: analysis of raw data and preprocessing - part 2
RNA-seq: analysis of raw data and preprocessing - part 2
 
RNA-seq Data Analysis Overview
RNA-seq Data Analysis OverviewRNA-seq Data Analysis Overview
RNA-seq Data Analysis Overview
 
Catalyzing Plant Science Research with RNA-seq
Catalyzing Plant Science Research with RNA-seqCatalyzing Plant Science Research with RNA-seq
Catalyzing Plant Science Research with RNA-seq
 
Part 2 of RNA-seq for DE analysis: Investigating raw data
wings2014 Workshop 1 Design, sequence, align, count, visualize

  • 1. Workshops in next-generation science at UNC Charlotte 2014
      Workshop 1 - Design, sequence, align, count, visualize
  • 2. Workshop Locations
      • Section 1 - Room 801
          – Ann Loraine, UNC Charlotte
          – Naim Matasci, University of Arizona, iPlant
      • Section 2 - Room 802
          – Ivory Clabaugh Blakley, UNC Charlotte
          – Xiangqin Cui, University of Alabama Birmingham
      • Please stay in your section
          – Cover same material, but timing may vary
  • 3. Meet your TAs
      • Graduate students from UNCC Dept of Bioinformatics and Genomics
          – 801: Roshonda Barner, Ibro Mujacic, Chi-Yu "Jack" Yen, Warren (G.) Cole, Tony Dao, Greg Linchango, Sushma Madamanchi, Anuja Jain
          – 802: Richard Linchangco, Fred Lin, Chris Ball, Lu Tian, Shawn Chaffin, Natascha Moestl, Walter Clemens, Adriano Schneider
      • Loraine Lab members
          – 801: Kyle Suttlemyre (IGB support), April Estrada (Research Specialist, Expert IGB User)
          – 802: David Norris (IGB Developer)
  • 4. Schedule
      • Workshop 1 - planning an experiment, data processing, visualization
          – 9:00 to 11:30, then lunch
      • Workshop 2 - introduction to R & RStudio for data analysis, differential expression
          – 12:30 to 2:30, then a 30-minute break
      • Workshop 3 - biological interpretation using pathway tools, Gene Ontology, the Web
          – 3:00 to 5:00, then done
  • 5. Using RNA-Seq data set for WiNGS2014
      • pollennetwork.org
      • Sponsored by Pollen Research Coordination Network in Integrative Pollen Biology (annual meeting starts tonight)
      • Visit Web site for more info
  • 6. RNA-Seq data set for the workshop
      • Goal: Provide resources for pollen biology
          – Example RNA-Seq data analysis
          – Catalog of genes expressed in pollen
          – Highlight important area of pollen research
      • Problem: Pollen in some plant species is vulnerable to heat stress, which reduces yields
          – Exposure to mild heat stress (acclimation) can protect against more severe stress later; this is called acquired thermotolerance (Firon 2012)
      • To learn more, we sequenced RNA extracted from pollen undergoing a mild heat stress
          – Same temperature that can establish thermotolerance
  • 7. Samples from the lab of Nurit Firon, Volcani Institute, Israel
      • Firon lab studies effects of heat stress on tomato pollen
      • Showed (along with others) that high temperature reduces pollen viability and sugar content
      • Studying a heat-tolerant tomato cultivar: Hazera 3042
          – Pollen is sensitive to heat stress but not as much as other varieties
  • 8. Nurit's experiment: RNA-Seq of heat-tolerant tomato cultivar Hazera 3042
      • Collected pollen from plants growing in temperature-controlled greenhouses
          – Control: 25/18 °C, optimal temperature
          – Treatment: 32/26 °C, mild chronic heat stress
      • Collected batches of pollen from ~10 plants during Sep. & Oct. 2013
          – One treatment, one control per collection
          – Made RNA from five collections: 5 treatment, 5 control "batches"
          – Sequenced at UCLA (69 base, PE)
  • 9. Arabidopsis cold stress RNA-Seq
      • Simpler data set with one treatment & control
          – Using data from part of chr1, treatment sample, to illustrate data processing, visualization, and effects of parameter settings on results (maximum intron size in tophat spliced alignment program)
      • For details, see:
          – experiment record at the Short Read Archive http://www.ncbi.nlm.nih.gov/sra/SRP029896
          – sample http://www.ncbi.nlm.nih.gov/sra/SRX348640
      • Published in Methods in Molecular Biology http://www.ncbi.nlm.nih.gov/pubmed/24792048
  • 10. Workshop 1: RNA-Seq: Design, sequence, align, count, visualize
      wings 2014
  • 11. Goals
      • Learn the basics (20 min)
          – Plan an experiment
          – Library prep for RNA-Seq
          – Illumina sequencing
      • Practice: Quality analysis using FastQC (30 min)
      • Practice: Data processing (30 min)
          – Align reads (make BAM files and junction files)
          – Make counts files for statistical analysis
          – Merge reads into transcript models with Cufflinks
      • Practice: Visualize results in IGB (60 min)
          – Compare to data set in Galaxy, TAIR10 gene models
  • 12. Overview (workflow diagram)
      • Workshop 1: Sequencing strategy; FASTQ files (WildType1a.fastq); FastQC; alignment onto genome ($Command Line…, WildType1a.bam); visualization using IGB; generation of counts data (Counts.txt)
      • Workshop 2
  • 13. RNA-seq: ultra-high throughput cDNA sequencing
      • Several papers published in 2008, first in May
      • Source: http://blog.sbgenomics.com/rna-seq-the-first-wave/
      • Figure labels: Ecker lab, Snyder lab; 999 cites, 1,076 cites
  • 14. Mortazavi 2008 "Mapping and quantifying mammalian transcriptomes by RNA-Seq" Nature Methods
      • Published later in 2008, but > 3000 citations (Google Scholar)
      • Why? Maybe because it emphasized RNA-Seq as a replacement for expression DNA microarrays
      • Comment in same issue: "Beginning of the end for microarrays?"
  • 15. RNA-Seq Overview - Illumina
      • 1. Design
          – Plan experiment: biological replication, sequencing strategy, data analysis strategy
          – Collect samples
      • 2. Making Libraries
          – Extract RNA, purify polyA+
          – Fragment
          – Synthesize cDNA (random hexamers)
          – Repair ends
          – Add "A" bases to 3' ends
          – Ligate adapters
          – Amplify
          – Library reflects RNA from original sample
      • 3. Sequencing
          – Prepare flowcell
          – Sequence by synthesis
          – Data: fastq sequence files, millions of reads per library
      • 4. Data Analysis
          – Quality assessment
          – Map to genome (alignments)
          – Count reads per gene
          – Identify differentially expressed genes
          – Improve gene models
          – Analyze splicing
          – And much more
  • 16. Five steps for design
      1. Articulate your questions or hypothesis
      2. Define your unit of biological replication
      3. Write up your sample collection protocol in detail
          – Does the protocol allow you to test your hypothesis?
      4. Define library synthesis & sequencing strategy
          – Read lengths, paired end vs. single end, depth, barcoding
      5. Ask an experienced data analyst to review your plan; revise as needed
  • 17. Library synthesis
      • Figure labels: fork or "Y" adapters; size selection
      • Y adapters contain indexes, allow multiplexing
      • Image: David C Corney Ph. D. http://www.labome.com/method/RNA-seq-Using-Next-Generation-Sequencing.html
  • 18. Example library molecule
      • Figure labels: P5, universal adapter, Rd1, unknown sequence, Rd2, index (barcode), primer, P7
      • Rd1 & Rd2 are read from reverse-complementary strands and might overlap
      • Ref: http://nextgen.mgh.harvard.edu/IlluminaChemistry.html
  • 19. Flow cell preparation & sequencing by synthesis
      • https://www.youtube.com/watch?v=HMyCqWhwB8E
  • 20. Review: Paired End vs Single End
      • Single End: cheaper
      • Paired End: more expensive
          – two reads per fragment
          – counting fragments, not reads
          – call normalized counts FPKM, not RPKM
      • Figure labels: sequenced in SE; sequenced in PE; indexed adapter
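The FPKM normalization named on this slide (fragments per kilobase of transcript per million mapped fragments) can be sketched as a short calculation. This is a minimal sketch using the standard definition; the function name and the example numbers are hypothetical and do not come from the workshop data.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    fragments: fragments (read pairs) assigned to the gene
    gene_length_bp: gene or transcript length in bases
    total_mapped_fragments: all mapped fragments in the library
    """
    per_kilobase = gene_length_bp / 1_000
    per_million = total_mapped_fragments / 1_000_000
    return fragments / (per_kilobase * per_million)


# Hypothetical example: 500 fragments on a 2,000-base gene
# in a library of 10 million mapped fragments.
example_value = fpkm(500, 2_000, 10_000_000)
print(example_value)
```

For single-end data the same formula applied to reads gives RPKM, which is why the slide distinguishes the two names.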
  • 21. Get the reads in a FASTQ file
      • File contains millions of records
          – Each record has four lines and represents ONE sequence
      • Line 1: the name, starts with @
      • Line 2: the sequence, starts at new line
      • Line 3: some other stuff, optional, starts with +
      • Line 4: the quality scores, starts at new line
      • Example record:
          @SN1083:379:H8VA1ADXX:2:1101:1248:2144 1:N:0:12
          CCTAAATGGTGCCATGCTAGGAGGCCGTGCCCTTCTTGAAAAGTTGTATGTGAA
          +
          BBBFFFFFFBFFFIIIIFI<FFIIIIIFIIIIFBFIIIIIIIIFFFIIIIFIII
      • Figure annotation: base = T, score = F = 37
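The four-line record layout described above can be read with a few lines of Python. This is a minimal sketch, assuming unwrapped records (exactly four lines each); the function name `read_fastq` is hypothetical, and the record is the one shown on the slide.

```python
import io

EXAMPLE_FASTQ = (
    "@SN1083:379:H8VA1ADXX:2:1101:1248:2144 1:N:0:12\n"
    "CCTAAATGGTGCCATGCTAGGAGGCCGTGCCCTTCTTGAAAAGTTGTATGTGAA\n"
    "+\n"
    "BBBFFFFFFBFFFIIIIFI<FFIIIIIFIIIIFBFIIIIIIIIFFFIIIIFIII\n"
)


def read_fastq(handle):
    """Yield (name, sequence, quality) for each four-line FASTQ record."""
    while True:
        header = handle.readline()
        if not header.strip():
            return  # end of file (or trailing blank line)
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # line 3: '+' separator, ignored
        quality = handle.readline().rstrip("\n")
        # Line 1 starts with '@'; drop it to get the read name.
        yield header.rstrip("\n")[1:], sequence, quality


records = list(read_fastq(io.StringIO(EXAMPLE_FASTQ)))
for name, sequence, quality in records:
    print(name, len(sequence), len(quality))
```

For a real file, the same function would take an open file handle (or one from `gzip.open(path, "rt")` for a `.gz` file) instead of the in-memory example.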
  • 22. Phred Quality score Q
      • Describes how exponentially unlikely it is that a given base call is wrong
      • Q = -10 log10(p_e), where p_e is the probability that the base call is wrong
      • http://en.wikipedia.org/wiki/FASTQ_format
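The relationship Q = -10 log10(p_e) can be checked numerically in both directions. This is a minimal sketch; the function names are hypothetical.

```python
import math


def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p_e):
    """Phred score for an error probability p_e (0 < p_e <= 1)."""
    return -10 * math.log10(p_e)


# Each step of 10 in Q is a tenfold drop in error probability.
for q in (10, 20, 30, 40):
    print(q, phred_to_error_prob(q))
```

So Q = 20 corresponds to a 1 in 100 chance of a wrong call, and Q = 30 to 1 in 1,000.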
  • 23. Different Illumina data processing pipelines used different score encodings
      • http://drive5.com/usearch/manual/quality_score.html
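Each quality character stores a score as an ASCII code plus an offset, and the offset is what changed between pipelines. The sketch below assumes the two common offsets, 33 (used by Cassava 1.8 and Sanger-style files) and 64 (used by some earlier Illumina pipelines); the function name is hypothetical.

```python
def decode_quality(quality, offset=33):
    """Convert a FASTQ quality string to a list of integer scores."""
    return [ord(char) - offset for char in quality]


# With offset 33, 'F' decodes to 37, matching the FASTQ slide's annotation.
print(decode_quality("BF<I", offset=33))

# The same score of 40 is written 'I' with offset 33 but 'h' with offset 64.
print(decode_quality("I", offset=33), decode_quality("h", offset=64))
```

Decoding with the wrong offset shifts every score by 31, so it is worth confirming which pipeline produced a file before filtering on quality.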
  • 24. Get two files - Read1 & Read2 - from paired end sequencing
      • Read1 and Read2 have the same read identifier and are read from opposite strands of the same fragment
      • Example is from processing pipeline Cassava 1.8; older versions used different naming conventions
      • R1:
          @SN1083:379:H8VA1ADXX:2:1101:1248:2144 1:N:0:12
          CCTAAATGGTGCCATGCTAGGAGGCCGTGCCCTTCTTGAAAAGTTGTATGTGAA
          +
          BBBFFFFFFBFFFIIIIFI<FFIIIIIFIIIIFBFIIIIIIIIFFFIIIIFIII
      • R2:
          @SN1083:379:H8VA1ADXX:2:1101:1248:2144 2:N:0:12
          CATTTTCGACGTTGTTAATAAGCTCTGCGTACTTGCAAGCTATCTGCGCGAACG
          +
          BBBFFFFFFFFFFIIIIIIIIIIIIIIIIFIIIIIIIIIIIIIIIIIIIIIFFF
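Because mates share the identifier before the space and differ only in the read number after it, pairing can be checked from the header lines alone. This is a minimal sketch for Cassava 1.8 style headers; the function names are hypothetical.

```python
def split_header(header):
    """Return (fragment_name, read_number) from a Cassava 1.8 header line."""
    name, _, tail = header.lstrip("@").partition(" ")
    read_number = int(tail.split(":")[0])
    return name, read_number


def is_mate_pair(header1, header2):
    """True if the two headers are read 1 and read 2 of one fragment."""
    name1, read1 = split_header(header1)
    name2, read2 = split_header(header2)
    return name1 == name2 and {read1, read2} == {1, 2}


r1 = "@SN1083:379:H8VA1ADXX:2:1101:1248:2144 1:N:0:12"
r2 = "@SN1083:379:H8VA1ADXX:2:1101:1248:2144 2:N:0:12"
print(is_mate_pair(r1, r2))
```

A check like this is useful after filtering or trimming, since aligners generally expect the Read1 and Read2 files to list mates in the same order.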
  • 25. Sequence identifier line in Cassava 1.8
      @SN1083:379:H8VA1ADXX:2:1101:1248:2144 1:N:0:12
      • Before the space: machine, run#, flow-cell-id, lane, tile, x-pos, y-pos
      • After the space: read#, is-filtered, control, index (barcode)
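The identifier line can be split into its named fields on the space and the colons. This is a minimal sketch, assuming the Cassava 1.8 field order (instrument, run, flow cell, lane, tile, x, y, then read number, is-filtered flag, control number, index); the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReadId:
    instrument: str
    run: int
    flowcell: str
    lane: int
    tile: int
    x: int
    y: int
    read: int
    is_filtered: bool  # 'Y' means the read failed the filter
    control: int
    index: str


def parse_read_id(line):
    """Parse a Cassava 1.8 FASTQ identifier line into a ReadId."""
    left, right = line.lstrip("@").split(" ")
    instrument, run, flowcell, lane, tile, x, y = left.split(":")
    read, filtered, control, index = right.split(":")
    return ReadId(instrument, int(run), flowcell, int(lane), int(tile),
                  int(x), int(y), int(read), filtered == "Y",
                  int(control), index)


read_id = parse_read_id("@SN1083:379:H8VA1ADXX:2:1101:1248:2144 1:N:0:12")
print(read_id)
```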
  • 26. FastQC
      • Many groups use FastQC as a first-pass quality assessment
      • Free from Babraham http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
      • Run interactively (point-and-click) or from the command line (won't cover this)
  • 27. Practice: Using FastQC
      • Go to Conference DropBox link:
          – http://bitly.com/rnaseq2014
      • Note two folders: FastQC and FastQC-Examples
          – FastQC-Examples has FastQC reports from different species, sample types (next slide)
      • From the FastQC folder, download
          – Example.fastq
          – FastQC_Manual.pdf
      • Start FastQC, open Example.fastq
  • 28. Practice: Watch FastQC video
      • https://www.youtube.com/watch?v=bz93ReOv87Y (start around 34 sec)
      • Take-home #1: FastQC assesses whether your data files are typical
      • Take-home #2: A "bad result" from FastQC doesn't always mean your data are not useful or valuable
      • Explore on your own! (~15 minutes)
  • 29. Practice: View reports in FastQC-Examples (~15 min)
      • Blueberry
          – OnealRipe_1
          – OzarkblueGreen_1
      • Tomato pollen
          – T2_1
          – C2_1
      • Rice
          – Control2h-R2: per read %GC
  • 30. Practice: Data processing • Double-click "Alignment.tar.gz" on your Desktop to unpack it • Also available from http://bitly.com/rnaseq2014
  • 31. Practice: Look at "align.sh" • Open Alignment folder • Right-click "align.sh" • Select "open with text editor" • This is a shell script – Commands executed in sequence – Very useful for automating tasks • First line is the "shebang" line – tells Terminal it's a shell script • All other lines starting with # are comments (not run) • Further reading: "Learning the bash Shell", a great guide to writing shell scripts
  • 32. align.sh - simple pipeline for RNA-Seq data processing • Aligns a sample fastq file to genome – tophat2, bowtie2 – fastq file is from Arabidopsis cold stress experiment (Short Read Archive SRX348640) – file ColdTreatment-little.fastq.gz (gzip-compressed, .gz) • Counts reads that align to TAIR10 genes – featureCounts – only counting reads that uniquely align • Merges alignments into transcript models – cufflinks
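The contents of align.sh are not reproduced in this transcript, so the following is only a sketch of the three steps named on the slide, written as a Python dry run that builds the command lines without executing them. The index name, annotation file, and output folder names are hypothetical placeholders; only the basic options (`-o` for tophat2 and cufflinks, `-a` and `-o` for featureCounts) are used, and the real script may pass others.

```python
import shlex


def build_pipeline(fastq, bowtie_index, annotation_gtf,
                   tophat_dir="tophat_out", cufflinks_dir="cufflinks_out"):
    """Return the three pipeline commands as argument lists (nothing is run)."""
    bam = f"{tophat_dir}/accepted_hits.bam"
    return [
        # 1. Spliced alignment of reads to the genome (tophat2 calls bowtie2).
        ["tophat2", "-o", tophat_dir, bowtie_index, fastq],
        # 2. Count reads overlapping annotated genes.
        ["featureCounts", "-a", annotation_gtf, "-o", "counts.txt", bam],
        # 3. Assemble alignments into transcript models.
        ["cufflinks", "-o", cufflinks_dir, bam],
    ]


if __name__ == "__main__":
    # "TAIR10_index" and "TAIR10.gtf" are made-up names for illustration.
    commands = build_pipeline("ColdTreatment-little.fastq.gz",
                              "TAIR10_index", "TAIR10.gtf")
    for command in commands:
        print(" ".join(shlex.quote(part) for part in command))
```

Printing the commands first, before running anything, is a cheap way to check file names and the order of steps.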
  • 33. Practice: Intro to Terminal • Double-click Terminal shortcut on desktop – Program for entering commands or running scripts – Also called a "shell" or "Unix shell" – Can open multiple Terminal windows • Each window runs its own shell • Terminal shows hierarchical view of file system – An upside-down tree, where every folder is inside another folder – Folders are also called "directories" – The top folder (that contains everything else) is called the "root" directory: / (forward slash)
  • 34. Practice: Open Terminal, try these commands • cd - "change directory" – by itself means "go to user home directory" – with an argument means "go there" – with ".." means "go up one level" • pwd - "print the current working directory"; use it to find out where you are
  • 35. Practice: Try these commands • ls - lists files and directories in the current directory
  • 36. Practice: Try these commands • ls -l - "list long" – reports more information about files – a leading "d" means it's a directory (folder)
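For anyone who later wants to script these navigation steps, the same ideas can be sketched with Python's standard library. This is an analogy rather than a replacement for the Terminal commands; `long_listing` is a hypothetical helper that mimics only the leading "d" flag of `ls -l`.

```python
import os
import stat
from pathlib import Path


def long_listing(directory):
    """Return (flag, name) pairs; flag is "d" for directories, "-" otherwise."""
    rows = []
    for entry in sorted(Path(directory).iterdir()):
        flag = "d" if stat.S_ISDIR(entry.lstat().st_mode) else "-"
        rows.append((flag, entry.name))
    return rows


if __name__ == "__main__":
    os.chdir(Path.home())            # like "cd" with no argument
    print(Path.cwd())                # like "pwd"
    for flag, name in long_listing(Path.cwd()):
        print(flag, name)            # a much-reduced "ls -l"
    os.chdir("..")                   # like "cd .."
    print(Path.cwd())
```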
  • 37. Practice: Run align.sh in Terminal • Go to home directory • Go to Desktop • Go to Alignment • Run align.sh
  • 38. Now running: tophat2, a spliced alignment tool • Reference: "TopHat: discovering splice junctions with RNA-Seq", Cole Trapnell, Lior Pachter and Steven L. Salzberg (Figure 1)
  • 39. TopHat output - we'll open in IGB • Creates new folder with files, including... • accepted_hits.bam - "binary alignments" file, contains read alignments – BAM - compressed binary version of SAM ("Sequence Alignment/Map"), needs index ".bai" file (made using samtools) • junctions.bed - reports boundaries of introns, called "junction" features – BED format, tab-delimited plain text file – one junction feature per line – fifth field is score, the number of spliced reads aligned across the junction – see: http://genome.ucsc.edu/FAQ/FAQformat.html#format1
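Because the junctions file is plain tab-delimited text, it is easy to read with a few lines of code. The Python sketch below pulls out the name and score (fifth field) of each junction and keeps those with at least a chosen number of spliced reads. The example lines and the cutoff of 10 are made up for illustration.

```python
def parse_junctions(lines):
    """Parse BED junction lines into dictionaries, skipping header lines."""
    junctions = []
    for line in lines:
        line = line.rstrip("\n")
        # BED files may begin with "track" or "browser" header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")
        junctions.append({
            "chrom": fields[0],
            "start": int(fields[1]),
            "end": int(fields[2]),
            "name": fields[3],
            "score": int(fields[4]),   # number of spliced reads
            "strand": fields[5] if len(fields) > 5 else ".",
        })
    return junctions


# Made-up example lines, not taken from the workshop data.
EXAMPLE = [
    'track name=junctions description="example junctions"',
    "chr1\t100\t500\tJUNC00000001\t12\t+",
    "chr1\t800\t1500\tJUNC00000002\t3\t-",
]

if __name__ == "__main__":
    for junction in parse_junctions(EXAMPLE):
        if junction["score"] >= 10:
            print(junction["name"], junction["score"])
```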
  • 40. Practice: Start IGB while script runs • Double-click IGB desktop icon • Click Arabidopsis flower on start screen
  • 41. Practice: How to get IGB if you're using your own computer • Go to http://bioviz.org • Follow Download link • Choose Medium Memory option (typical)
  • 42. TAIR10 annotations, June 2009 Columbia-0 genome release • TAIR10 protein-coding gene models loaded automatically from IGB data server • Forward & reverse strand in separate tracks (Forward, Reverse)
  • 43. RNA-Seq, ChIP-Seq, other data sets available in Data Access tab • IGB data servers, can set up your own
  • 44. Arabidopsis pollen data sets • Read alignments, coverage graphs, junction files • From 2013 Plant Phys. pollen RNA-Seq paper
  • 45. Practice: Combine Plus & Minus Tracks • Click "+/-" to combine tracks • Use Data Management Table to change track color, name, visibility, load options, strand options
  • 46. Summary of moving and zooming • Animated zooming – click to position zoom stripe, sets zoom focus – horizontal zoom & vertical stretch • Moving from side to side (panning) – arrows in toolbar – hand icon - the move tool • Jump-zooming – Click-drag coordinate axis with arrow tool – Double-click to zoom in on a feature – Search by name
  • 47. Practice: Zoom in on a feature • Zoom in on alt-spliced gene models (marked *) on chr1 • This is animated zooming 1. Click to set zoom focus 2. Drag slider to zoom in
  • 48. Practice: Click move arrows to reposition during zoom • Click data display to re-focus zoom on target location
  • 49. Practice: Or use move tool (hand) to reposition during zoom • Click display to focus zoom on target 1. Select move tool (hand) 2. Click-drag to move
  • 50. Practice: Click-drag sequence axis to jump-zoom to a region 1. Select pointer tool 2. Click number line 3. Drag 4. Release • Highlighted region becomes new view
  • 51. Practice: Jump-zoom to gene model • Double-click label, space a little above exon blocks, or intron to jump-zoom to a gene model – Also selects it, selected items outlined in red 1. Select pointer tool 2. Double-click label or intron
  • 52. After jump-zoom, gene model is selected • Arrows indicate direction of transcription • Selected gene model outlined in red
  • 53. Practice: Gene model close-up • Use vertical slider to make gene models taller • Increase window size to make more room • Drag slider to stretch vertically
  • 54. Practice: Interact with data using pointer. Select pointer (arrow) in toolbar • Click intron, label, or region above blocks to select whole gene model • Click blocks to select parts of a gene model • SHIFT-click to multi-select • Click-drag to select & count everything in a region • Selection Info, top right, reports counts – "i" button shows info if one item selected
  • 55. Practice: View edge matching • Edges that match selected item edges are highlighted in red • To change edge-match color choose File > Preferences > Other Options • To turn off or on, see View > Edge Matching
  • 56. Practice: To work with sequence data, click Load Sequence • Sequence appears in Coordinates track
  • 57. Practice: Zoom in to see amino acids • Note: Must load genomic sequence first
  • 58. Practice: Zoom in on end of translation • Click the "thick end" and then zoom in • Note: Variants encode same C-terminal amino acids
  • 59. Practice: Select genomic sequence 1. Choose pointer tool in toolbar 2. Click-drag genomic sequence to select a region 3. CTRL-click to copy • Length of selected region reported in Selection Info box (top right) • Useful for designing primers, measuring regions
  • 60. Practice: Right-click (or CTRL-click) gene model • Shows options to run a Web search, BLAST search, view sequence
  • 61. Practice: Quick Search • Enter search text, select option (choose At-SR30) • Jump-zoom to selected gene
  • 62. Zoomed to At-SR30, RNA-binding protein involved in splicing
  • 63. Looking ahead to Workshop 3 • Some genes that were highly expressed in tomato pollen are annotated as "Unknown" proteins & have no counterpart in Arabidopsis. • You can use IGB to quickly find those genes and then run BLASTX or BLASTP searches at NCBI to find out... – Are they unique to tomato? – Could they be non-coding?
  • 64. Practice: Open files from align.sh • Zoom out to show more of At-SR30 region • Choose File > Open – Select "accepted_hits.bam" & "junctions.bed" • A new empty track appears for each file • Click Load Data to load reads and junctions
  • 65. Read alignments stack • reads at top of stack not being shown (too many to fit)
  • 66. Junction features, summarizing spliced reads
  • 67. Practice: Configure view - Load Sequence • Click Load Sequence to load genomic bases for this region
  • 68. Practice: Configure view - Lock mRNA track height 1. Click TAIR10 mRNA track label to select it 2. Open Annotation tab 3. Select Lock Track Height, enter 170, click Apply
  • 69. Practice: Configure view - configure junction track 1. Click junctions track label to select junctions track 2. Open Annotation tab 3. Select score in Label Field 4. Select +/- in Strand
  • 70. Practice: Configure view - lock junction track height 1. Click junctions track label to select it 2. Open Annotation tab 3. Select Lock Track Height, enter 120, click Apply
  • 71. Practice: Change read stack height to see more reads 1. CTRL-click (or right-click) accepted_hits.bam track label 2. Choose Set Stack Height...
  • 72. Practice: Change read stack height to see more reads 3. Enter 50
  • 73. Practice: Set mRNA stack height 1. Right-click TAIR10 mRNA track label, choose Set Stack Height 2. Enter 3 - tallest stack has 3 models • Note: Tabs are minimized to make more space
  • 74. Practice: Note read support for alternative splicing • Take-home: Many spliced reads support both variants, but there are also many reads inside the introns, indicating failure to splice. This may be typical of alt-spliced introns?
  • 75. Practice: Use junction track to quantify support for splice variants 1. Click-drag junctions track next to genes track 2. Scores are number of spliced reads supporting each junction.
  • 76. Practice: Compare Cufflinks GTF file to gene models • Open Alignment > cufflinks_cold > transcripts.gtf
  • 77. Practice: View Cufflinks gene models 1. Click Load Data to see Cufflinks models 2. Click-drag new track next to gene models 3. Use vertical slider to make more room • Take-home: Cufflinks annotations close, but incomplete.
  • 78. Practice: Load data from Galaxy 1. Go to usegalaxy.org 2. Open Shared Data 3. Choose Published Histories
  • 79. Practice: Load data from Galaxy 1. Search for Cold 2. Select Cold stress in Arabidopsis (with default maximum intron size)
  • 80. Practice: Load data from Galaxy • Illustrates results when tophat is run with default settings: – default maximum intron size is 500,000 bases • Tophat was developed with human data in mind, where large introns are common • Select Import History
  • 81. Practice: Select start using this history
  • 82. 1. Select Treatment junctions 2. Select display in IGB View
  • 83. New tab opens. Select Click to go to IGB
  • 84. New track 1. Click Load Data
  • 85. Practice: Remove reads - don't need them now 1. Right-click accepted_hits.bam 2. Choose Delete Track
  • 86. 1. Zoom out all the way 2. Click Load Data • Your data are here
  • 87. Take-home: Tophat run with default parameters predicts enormous introns. It is important to understand parameter settings; defaults are not always best.
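One way to spot such oversized introns outside the browser is to compute intron lengths directly from a junctions file. The Python sketch below assumes the 12-column BED layout in which each junction has two blocks (the flanking read overhangs) and the intron is the gap between them. The example lines and the 10,000-base cutoff are assumptions chosen for illustration, not values from the workshop.

```python
def intron_length(bed_line: str) -> int:
    """Length of the gap between the two blocks of a 12-column junction line."""
    fields = bed_line.rstrip("\n").split("\t")
    chrom_start = int(fields[1])
    block_sizes = [int(v) for v in fields[10].split(",") if v]
    block_starts = [int(v) for v in fields[11].split(",") if v]
    intron_start = chrom_start + block_sizes[0]    # end of first overhang
    intron_end = chrom_start + block_starts[1]     # start of second overhang
    return intron_end - intron_start


def suspicious_junctions(bed_lines, max_plausible=10_000):
    """Return (name, length) for junctions longer than max_plausible bases."""
    flagged = []
    for line in bed_lines:
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        length = intron_length(line)
        if length > max_plausible:
            flagged.append((line.split("\t")[3], length))
    return flagged


# Made-up junctions: one short intron, one implausibly long one.
EXAMPLE = [
    "chr1\t100\t500\tJUNC1\t12\t+\t100\t500\t255,0,0\t2\t50,40\t0,360",
    "chr1\t1000\t401000\tJUNC2\t1\t+\t1000\t401000\t255,0,0\t2\t30,30\t0,399970",
]

if __name__ == "__main__":
    for name, length in suspicious_junctions(EXAMPLE):
        print(name, length)
```

A cutoff like this is only a screening aid; a sensible value depends on the intron sizes known for the species being studied.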
  • 88. Now you can • Describe Illumina library synthesis, sequencing • Evaluate data quality using FastQC • Run a data processing pipeline (shell script) • View and explore data in a genome browser – and load data sets from Galaxy, local files • Thank you for your attention!