2010/06/17
                       
kaneko.satoko(at)ocha.ac.jp 
                   
 

R              /R        

R Bioconductor(GeneR)        
(   )        

                     
                             
 



     
 –                ‐
                                         
                                                             
                                                 
                            
                        




                                             
                               (       )
               
         (   )

                                   (                )    
      
 –                                  ‐
                                         
                         (A:61%, G:32%, C: 7%)                               
          5             (A:80%, G:20%) 10               (A:60%, G:40%)               
                                                                                      

                                                                                 
                                                     
[Figure: drawing samples from the population. Sample of 5: 01 A, 02 A, 03 A, 04 G, 05 A (A: 80%, G: 20%). Sample of 10: 01 A, 02 A, 03 A, 04 G, 05 A, 06 G, 07 A, 08 G, 09 A, 10 G (A: 60%, G: 40%).]
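The sampling effect shown here, where small samples give proportions that differ from those of the population, can be tried out with a short R sketch. The population proportions (A: 61%, G: 32%, C: 7%) are the ones given on the slide; the helper name `draw_sample`, the fixed seed, and the extra sample size of 1000 are assumptions added for illustration.

```r
# Population proportions as given on the slide
population <- c(A = 0.61, G = 0.32, C = 0.07)

# Hypothetical helper: draw n nucleotides at random and
# return the observed proportion of each nucleotide
draw_sample <- function(n) {
  drawn <- sample(names(population), size = n, replace = TRUE, prob = population)
  table(factor(drawn, levels = names(population))) / n
}

set.seed(1)        # assumption: fixed seed so that the run can be repeated
draw_sample(5)     # small sample: proportions can be far from the population
draw_sample(10)
draw_sample(1000)  # large sample: proportions approach the population values
```

The observed proportions depend on the seed, so no expected output is shown; the point is only that the larger sample tends to track the population more closely.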
 –                     ‐
                                                                    
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
                                                        
                                                                
                                         (       )  


         (        ‐1)       (   )             



         
                                                            
                         
 –           ‐
                       
                   




[Figure: three curves labelled (5, 0.5), (5, 1.0), and (5, 2.0)]
R 
R             

R      (                  ) 

Bioconductor(GeneR)    

                                
R                          
                               R        
                                    
(As of 14 June 2010: R 2.11.1)

R-2.11.1.dmg
                    
                
R                       –             ‐
             R                        




                  q()              
Save workspace image? [y/n/c]: 
                y                            
n
R           –R              ‐
                 R   dock    




                                  
R       –   ‐
R                   –           ‐ 

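> 1 + 2        # addition
[1] 3

> 3 - 1        # subtraction
[1] 2

> 10 * 2       # multiplication
[1] 20

> 7 / 2        # division
[1] 3.5

> 2^2          # exponentiation
[1] 4

> sqrt(4)      # square root
[1] 2

Operators and functions used: +  -  *  /  ^  sqrt()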
R                      –              ‐
R                                                                              
                                                              
     c                                                            

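> x <- c(1,2,5,6)        # c() combines the values into a vector, which is assigned to x
> x
[1] 1 2 5 6

> x + 10                 # adds 10 to each element of x (1,2,5,6)
[1] 11 12 15 16

> x * 10                 # multiplies each element of x by 10
[1] 10 20 50 60


> x <- c("France", "123", "abc")     # character strings are enclosed in ""
> x
[1] "France" "123"    "abc"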
R                  –                          ‐
> number <- c(1,2,3,4,5,6,7,8,9,10)
> number
 [1]  1  2  3  4  5  6  7  8  9 10

> length(number)     # length: number of elements
[1] 10

> sum(number)        # sum
[1] 55

> max(number)        # maximum
[1] 10

> min(number)        # minimum
[1] 1

> mean(number)       # mean
[1] 5.5
R                     –                           1‐
       [41,55,51,61,45,38,60,48,43,46,63,51,55,55,53]   
                                                            

> x <- c(41,55,51,61,45,38,60,48,43,46,63,51,55,55,53)
> length(x)
[1] 15

> sum(x)         # sum
[1] 765

# mean
> 765/15
[1] 51
> mean(x)
[1] 51
R                      –                                        2‐
       [41,55,51,61,45,38,60,48,43,46,63,51,55,55,53]   
                                                                       

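> sub <- x - 51          # deviation of each value from the mean
> sub
 [1] -10   4   0  10  -6 -13   9  -3  -8  -5  12   0   4   4   2

> sub^2                  # squared deviations
 [1] 100  16   0 100  36 169  81   9  64  25 144   0  16  16   4

> sum(sub^2)             # sum of squared deviations
[1] 780

> 780/15                 # variance (divided by n)
[1] 52

> sqrt(52)               # standard deviation
[1] 7.211103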
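The step-by-step calculation above divides by n = 15. R's built-in `sd()` divides by n - 1 instead, so it returns a slightly larger value for the same data. A minimal sketch of the comparison, with variable names (`ss`, `sd_n`, `sd_n1`) chosen here for illustration:

```r
x <- c(41, 55, 51, 61, 45, 38, 60, 48, 43, 46, 63, 51, 55, 55, 53)
n <- length(x)

ss <- sum((x - mean(x))^2)    # sum of squared deviations (780 on the slide)
sd_n  <- sqrt(ss / n)         # divide by n, as in the calculation on the slide
sd_n1 <- sqrt(ss / (n - 1))   # divide by n - 1
sd_builtin <- sd(x)           # R's sd() uses the n - 1 denominator

print(c(sd_n = sd_n, sd_n1 = sd_n1, sd_builtin = sd_builtin))
```

Here `sd_n` reproduces the 7.211103 obtained above, while `sd_n1` and `sd(x)` agree with each other at about 7.46.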
Bioconductor                    
                            
                                                                     
                                                             

http://www.bioconductor.org/packages/release/Software.html
Bioconductor/GeneR
GeneR

> source("http://www.bioconductor.org/biocLite.R")
> biocLite("GeneR")

> library(GeneR)

> ls("package:GeneR")       # list the functions provided by GeneR
Bioconductor/GeneR                               1 
> s <- "gtcatgcatgctaggtgacagttaaaatgcgtctaggtgacagtctaacaa"
> placeString(s)        # GeneR's placeString() stores the sequence in a buffer
[1] 0


> translate()           # translate in reading frame 1
[1] "VMHAR*QLKCV*VTV*Q"
> translate(from=c(1,2,3), to=c(0,0,0))    # translate in 3 reading frames
[1] "VMHAR*QLKCV*VTV*Q" "SCMLGDS*NASR*QSN"  "HAC*VTVKMRLGDSLT"

> strCompoSeq(s, wsize=1)       # nucleotide composition (as fractions)
             T         C         A         G X
[1,] 0.2549020 0.1764706 0.3137255 0.2549020 0

> strCompoSeq(s, wsize=2)       # composition of 2-nucleotide words
     TT TC   TA   TG TX   CT CC  CA CG CX   AT AC   AA   AG AX   GT   GC
[1,]  0  0 0.04 0.08  0 0.12  0 0.2  0  0 0.04  0 0.08 0.08  0 0.24 0.04
       GA GG GX XT XC XA XG XX
[1,] 0.08  0  0  0  0  0  0  0
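The single-nucleotide composition reported by `strCompoSeq(s, wsize=1)` can be cross-checked without GeneR. The sketch below uses only base R and assumes that the backtick in the printed sequence stands for the lost ligature "tt", which gives a 51-base sequence whose translation matches the output above.

```r
# Sequence from the slide, with the garbled character read as "tt" (assumption)
s <- "gtcatgcatgctaggtgacagttaaaatgcgtctaggtgacagtctaacaa"

# Split into single bases and count each nucleotide
bases  <- strsplit(toupper(s), "")[[1]]
counts <- table(factor(bases, levels = c("T", "C", "A", "G")))

# Fractions in the same column order as the strCompoSeq output
freq <- counts / length(bases)
print(counts)
print(round(freq, 7))
```

The counts come out as T 13, C 9, A 16, G 13 out of 51 bases, which agrees with the fractions 0.2549020, 0.1764706, 0.3137255, and 0.2549020 shown above.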
‐Rhodopsin       ‐ 

                 


    Rhodopsin (RHO) 




        5 exons,   CDS 1047bp (348a.a.) 
human, chimpanzee, macaque Rhodopsin
transition, transversion
1) REST
2)                    CDS        R CDS
3) ClustalW multiple alignment
4)        transition transversion
Clustalw     
                 
Clustalw            multiple alignment
alignment                        output           format                      
Clustalw             Clustalx    GUI                  




                                                                      
                                     available for download here  
                                                      
Clustalw                       
    Clustalw-2.0.12 (as of June 2010)

                                        -2.0.12-
                                          -2.0.12-
                                  Users/tg03/bin
                                             ftp.ebi.ac.uk
                                  clustalw-2.0.12macosx
                                                        
Nucleotides (nucleotide): A, T, G, C (4 types)
           A, G: purine       T, C: pyrimidine
A T                     G C
purine ↔ purine, pyrimidine ↔ pyrimidine: transition
purine ↔ pyrimidine: transversion




[Figure: substitutions among the purines (A, G) and the pyrimidines (C, T). α: transition, β: transversion.]
Figure 4-4  Molecular Biology of the Cell (© Garland Science 2008)
transition/transversion
Rhodopsin CDS (coding sequence)      1047bp
transition: A ↔ G, C ↔ T
transversion: A,G ↔ C,T

NCBI   ID
human(NM_000539) chimpanzee(XM_516740)  macaque(XM_001094250)

human – chimpanzee, human – macaque
                                  transition transversion

                                             p distance
p distance = (number of differing sites) / (number of sites compared)

                    transition   transversion   p distance
human-chimpanzee
human-macaque
Bioconductor/GeneR                                                  1 
R                                  Users/tg03/bin        

> library(GeneR)    # load GeneR into R

1) REST             NCBI           Rhodopsin
        x                          R
> x <- "         "
> s <- placeString(x)


2)                        CDS            R CDS
# write to rhodopsin.txt in fasta format
> writeFasta(file="rhodopsin.txt", from=96, to=1142, comment="human")
[1] 1             # 1: success, -1: failure
                                                                      
Bioconductor/GeneR                                                     2 
1') REST              NCBI     chimpanzee Rhodopsin
> x <- "         "
> s <- placeString(x)

2')                   CDS            R CDS
> writeFasta(file="rhodopsin.txt", from =    , to=   , comment="chimpanzee", append=T)
# append=T appends to the existing file


1'') REST             NCBI     macaque Rhodopsin
> x <- "         "
> s <- placeString(x)

2'')                  CDS             R CDS
> writeFasta(file="rhodopsin.txt", from =    , to=   , comment="macaque", append=T)


 Users/tg03/bin            rhodopsin.txt                         
               Seq_R_96_1142                                                       
                                   
Clustalw                      
3)                   ClustalW multiple alignment

$ export PATH=$PATH:/Users/tg03/bin/clustalw-2.0.12-macosx
$ cd /Users/tg03/bin/clustalw-2.0.12-macosx
$ mv ~/bin/rhodopsin.txt ~/bin/clustalw-2.0.12-macosx
$ clustalw2 -INFILE=rhodopsin.txt -OUTFILE=rho.aln
rho.aln                
CotEditor                         /        Courier           Courier New    




4)          transition transversion

Rhodopsin CDS (coding sequence)           1047bp
transition: A ↔ G, C ↔ T
transversion: A,G ↔ C,T

human – chimpanzee, human – macaque
(NM_000539)-(XM_516740) (NM_000539)-(XM_001094250)
                                  transition transversion

                    transition   transversion   p distance
human-chimpanzee    5            1              6/1047 = 0.006
human-macaque       32           8              40/1047 = 0.038
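The p distances in the table are the total number of differing sites divided by the 1047 sites compared. As a small sketch in R, with the counts taken from the table and the variable names chosen here for illustration:

```r
n_sites <- 1047   # length of the Rhodopsin CDS

# Counts from the table above
transitions   <- c("human-chimpanzee" = 5, "human-macaque" = 32)
transversions <- c("human-chimpanzee" = 1, "human-macaque" = 8)

# p distance: proportion of sites that differ
p <- (transitions + transversions) / n_sites
print(round(p, 3))

# Ratio of transitions to transversions for each pair
print(transitions / transversions)
```

Rounded to three decimals this gives 0.006 and 0.038, as in the table, and transition/transversion ratios of 5 and 4.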

                                          600               
                                  3000                  
                                                                                   
 

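[Figure: substitution history of an example sequence. TCTGAGACCT changes to TCTGTGACCT (5th: A→T), which then gives rise to TCAGTGACCT (3rd: T→A) and TCTGTCACGT (6th: G→C, 9th: C→G). Substitutions labelled in the figure: 5th A→T, 3rd T→A, 6th G→C, 9th C→G, 6th C→G.]

TCAGTGACCT               TCTGTCACGT            TCAGTGACCT                 TCTGTCACGT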
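The two example sequences TCAGTGACCT and TCTGTCACGT can be compared site by site to count the observed transitions and transversions. The function below is a minimal sketch in base R; its name `classify_differences` is hypothetical, and it assumes two aligned sequences of equal length containing only A, C, G, and T (no gaps or ambiguity codes).

```r
# Hypothetical helper: count transitions and transversions between
# two aligned sequences of equal length (A, C, G, T only)
classify_differences <- function(seq1, seq2) {
  a <- strsplit(toupper(seq1), "")[[1]]
  b <- strsplit(toupper(seq2), "")[[1]]
  stopifnot(length(a) == length(b))

  purine  <- c("A", "G")
  differs <- a != b
  # Same chemical class (purine/purine or pyrimidine/pyrimidine) means transition
  same_class <- (a %in% purine) == (b %in% purine)

  c(transition   = sum(differs & same_class),
    transversion = sum(differs & !same_class),
    p_distance   = mean(differs))
}

result <- classify_differences("TCAGTGACCT", "TCTGTCACGT")
print(result)
```

For these two sequences the differences are at the 3rd, 6th, and 9th positions, all of them transversions, so the function reports 0 transitions, 3 transversions, and a p distance of 0.3. Only the differences that remain visible are counted; earlier substitutions at the same site are not observed.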
 

                                                               

    1) p distance:

    2) Jukes and Cantor model (1969):

        d = -\frac{3}{4} \ln\left(1 - \frac{4}{3} p\right)

             A   T   C   G
         A   -   α   α   α
         T   α   -   α   α
         C   α   α   -   α
         G   α   α   α   -

    3) Kimura's two parameter model (1980):
        transition: α
        transversion: β

        P = \frac{1}{4}\left(1 - 2e^{-4(\alpha+\beta)t} + e^{-8\beta t}\right)
        Q = \frac{1}{2}\left(1 - e^{-8\beta t}\right)
        d \equiv 2rt = 2\alpha t + 4\beta t = -\frac{1}{2}\ln(1 - 2P - Q) - \frac{1}{4}\ln(1 - 2Q)

             A   T   C   G
         A   -   β   β   α
         T   β   -   α   β
         C   β   α   -   β
         G   α   β   β   -
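The Kimura two-parameter distance can be evaluated directly from the observed counts. The sketch below assumes that P and Q are estimated as the proportions of sites showing a transition and a transversion difference, respectively; the function name `k2p_distance` is hypothetical, and the counts are the Rhodopsin values reported earlier (5 and 1 for human-chimpanzee, 32 and 8 for human-macaque, over 1047 sites).

```r
# Hypothetical helper: Kimura two-parameter distance
#   d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q)
# P: proportion of sites with a transition difference
# Q: proportion of sites with a transversion difference
k2p_distance <- function(n_transitions, n_transversions, n_sites) {
  P <- n_transitions / n_sites
  Q <- n_transversions / n_sites
  -(1/2) * log(1 - 2 * P - Q) - (1/4) * log(1 - 2 * Q)
}

d_chimp   <- k2p_distance(5, 1, 1047)    # human-chimpanzee
d_macaque <- k2p_distance(32, 8, 1047)   # human-macaque
print(c(d_chimp, d_macaque))
```

Both values come out slightly larger than the corresponding p distances, because the model corrects for multiple substitutions at the same site.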
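[Figure: a site of an ancestral sequence a and of its two descendant sequences a' and a'' between times t and t+1, with substitution rate r = 3α. Terms shown in the figure: a site stays unchanged with probability (1-r) and changes to one particular other nucleotide with probability r/3. Panel 1: both descendants stay unchanged with probability (1-r)^2, approximated as 1-2r because r is of the order of 10^-8 to 10^-9 and the r^2 term is negligible. Panel 2: a site that differs between a' and a'' becomes identical with probability 2r(1-r)/3, approximated as 2r/3.]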
Jukes and Cantor model (1969):
q_t, q_{t+1}: values at times t and t+1
q_{t+1} - q_t → dq/dt
d = 2rt
d = -\frac{3}{4} \ln\left(1 - \frac{4}{3} p\right)
p_t = (1 - q_t)
p: (p distance)
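The intermediate steps of this derivation are not legible in this copy. A sketch of the standard Jukes and Cantor argument is given below, assuming that q_t is the probability that the two sequences carry the same nucleotide at a site at time t, that r = 3α is the substitution rate per site, and that terms in r^2 are neglected. These readings are assumptions based on the surrounding fragments.

```latex
\begin{align}
q_{t+1} &= (1 - 2r)\,q_t + \frac{2r}{3}\,(1 - q_t) \\
\frac{dq}{dt} &= \frac{2r}{3} - \frac{8r}{3}\,q \\
q_t &= 1 - \frac{3}{4}\left(1 - e^{-8rt/3}\right) \qquad (q_0 = 1) \\
p_t &= 1 - q_t = \frac{3}{4}\left(1 - e^{-8rt/3}\right) \\
d &= 2rt = -\frac{3}{4}\,\ln\left(1 - \frac{4}{3}\,p\right)
\end{align}
```

The last line follows by solving the expression for p_t for e^{-8rt/3} and taking logarithms, which gives the distance formula used in the rest of the slides.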
human, chimpanzee, macaque Rhodopsin
    Jukes and Cantor model

                    transition   transversion   p distance         JC distance
human-chimpanzee    5            1              6/1047 = 0.006
human-macaque       32           8              40/1047 = 0.038

d = -\frac{3}{4} \ln\left(1 - \frac{4}{3} p\right)

              4                         

> -(3/4)*log(1-(4/3)*0.006)
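The command above can be wrapped in a small function so that both species pairs are handled in one call. This is a minimal sketch; the function name `jc_distance` is hypothetical, and the unrounded p distances 6/1047 and 40/1047 are used instead of 0.006 and 0.038.

```r
# Hypothetical helper: Jukes and Cantor distance
#   d = -(3/4) ln(1 - (4/3) p)
jc_distance <- function(p) {
  # the formula is only defined for 0 <= p < 3/4
  stopifnot(all(p >= 0), all(p < 3/4))
  -(3/4) * log(1 - (4/3) * p)
}

p <- c("human-chimpanzee" = 6 / 1047, "human-macaque" = 40 / 1047)
d <- jc_distance(p)
print(round(d, 4))
```

The corrected distances are about 0.0058 and 0.0392, slightly larger than the p distances, and the correction grows as p increases.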
 R                         
                           

                                       
(transition/transversion       )


       

100617_statistics1

  • 1. 2010/06/17   kaneko.satoko(at)ocha.ac.jp   
  • 2.   R /R   R Bioconductor(GeneR)  
  • 3. ( )          
  • 4.  – ‐             ( )   ( ) ( )    
  • 5.  – ‐   (A:61%, G:32%, C: 7%)   5 (A:80%, G:20%) 10 (A:60%, G:40%)         5     01 A 10 ( ) 02 A 03 A 01 A A G A G G 02 A G A A 04 G A A A G 05 A 03 A G A A C G 04 G C A A A: 80%, G: 20% A A A A 05 A A G A G 06 G 07 A 08 G 09 A 10 G A: 60%, G: 40%
  • 6.  – ‐   1 n x = ∑ xi n i=1         ( )   €    ( ‐1) ( )            
  • 7.  – ‐     5   5,  0.5 5,  1.0 5,  2.0
  • 8. R  R R ( )  Bioconductor(GeneR)    
  • 9. R R     (2010 6 14 R 2 11.1) R‐2.11.1.dmg      
  • 10. R   – ‐  R   q()    Save workspace image? [y/n/c]:  y   n
  • 11. R   –R ‐ R dock  
  • 12. R   – ‐
  • 13. R   – ‐  > 1+2    #   [1] 3  + >3‐1     #   ‐ [1] 2  * >10*2    #   / [1] 20  ^ >7/2      #   sqrt() [1] 3.5  > 2^2    #   [1] 4  > sqrt(4)    #   [1] 2 
  • 14. R   – ‐ R     c    > x  <‐  c(1,2,5,6)    # c() () x   > x  [1] 1 2 5 6  > x + 10        #x (1,2,5,6) 10   [1] 11 12 15 16  > x * 10        #x 10   [1] 10 20 50 60  > x <‐ c(“France”, “123”, “abc”)    # “”   > x  [1] "France" "123"    "abc" 
  • 15. R   – ‐ > number<‐ c(1,2,3,4,5,6,7,8,9,10)  > number   [1]  1  2  3  4  5  6  7  8  9 10  > length (number)    #length   [1] 10  > sum (number)  #   [1] 55  > max (number)  #   [1] 10  >min (number)  #   [1] 1  > mean(number)   #   [1] 5.5 
  • 16. R   – 1‐ [41,55,51,61,45,38,60,48,43,46,63,51,55,55,53]                         > x<‐ c(41,55,51,61,45,38,60,48,43,46,63,51,55,55,53)  > length(x)  [1] 15  > sum(x)  #   [1] 765  #   > 765/15    > mean(x)    [1] 51 
  • 17. R   – 2‐ [41,55,51,61,45,38,60,48,43,46,63,51,55,55,53]                         > sub <‐ x‐51  > sub   [1] ‐10   4   0  10  ‐6 ‐13   9  ‐3  ‐8  ‐5  12   0   4   4   2  > sub^2   [1] 100  16   0 100  36 169  81   9  64  25 144   0  16  16   4  > sum(sub^2)  [1] 780  > 780/15  [1] 52  >sqrt(52)    #   [1] 7.211103
  • 18. Bioconductor http://www.bioconductor.org/packages/release/Software.html
  • 19. Bioconductor/GeneR: installing GeneR
      > source("http://www.bioconductor.org/biocLite.R")
      > biocLite("GeneR")
      > library(GeneR)
      > ls("package:GeneR")     # list the contents of GeneR
  • 20. Bioconductor/GeneR 1
      > s <- "gtcatgcatgctaggtgacagttaaaatgcgtctaggtgacagtctaacaa"
      > placeString(s)    # GeneR placeString() stores the sequence
      [1] 0
      > translate()       # translate in one reading frame
      [1] "VMHAR*QLKCV*VTV*Q"
      > translate(from=c(1,2,3), to=c(0,0,0))    # translate in three reading frames
      [1] "VMHAR*QLKCV*VTV*Q" "SCMLGDS*NASR*QSN"  "HAC*VTVKMRLGDSLT"
      > strCompoSeq(s, wsize=1)    # base composition
                   T         C         A         G X
      [1,] 0.2549020 0.1764706 0.3137255 0.2549020 0
      > strCompoSeq(s, wsize=2)    # composition of 2-base words
           TT TC   TA   TG TX   CT CC  CA CG CX   AT AC   AA   AG AX   GT   GC
      [1,]  0  0 0.04 0.08  0 0.12  0 0.2  0  0 0.04  0 0.08 0.08  0 0.24 0.04
             GA GG GX XT XC XA XG XX
      [1,] 0.08  0  0  0  0  0  0  0
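The composition figures on this slide can be checked without GeneR. The following is a base-R sketch, not GeneR's implementation; it assumes that the wsize=2 result counts non-overlapping 2-base windows (positions 1-2, 3-4, ...), which is what the multiples of 0.04 (= 1/25) on the slide suggest for a 51-base sequence.

```r
s <- "gtcatgcatgctaggtgacagttaaaatgcgtctaggtgacagtctaacaa"
bases <- strsplit(toupper(s), "")[[1]]
n <- length(bases)                      # 51 bases

# Single-base composition, in the same T, C, A, G order as the slide
base_freq <- table(factor(bases, levels = c("T", "C", "A", "G"))) / n

# 2-base words from non-overlapping windows (assumption, see above)
starts <- seq(1, n - 1, by = 2)
dinuc <- paste0(bases[starts], bases[starts + 1])
dinuc_freq <- table(dinuc) / length(dinuc)

print(round(base_freq, 7))
print(dinuc_freq)
```

Only the 2-base words that actually occur appear in dinuc_freq; the slide's output also lists the words with frequency 0.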
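  • 21. Practical: Rhodopsin. Rhodopsin (RHO): 5 exons, CDS 1047 bp (348 a.a.). Compare the Rhodopsin of human, chimpanzee and macaque and count transitions and transversions.
      1) REST: get the sequences from NCBI
      2) Extract the CDS with R and write it to a file
      3) Multiple alignment with ClustalW
      4) Count transitions and transversions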
  • 22. Clustalw: multiple alignment program; alignment output format; Clustalw (command line) and Clustalx (GUI)
  • 23. Clustalw: version Clustalw-2.0.12 (June 2010), from ftp.ebi.ac.uk (clustalw-2.0.12-macosx), placed under Users/tg03/bin
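  • 24. Nucleotides: the four bases A, T, G, C. A and G are purines; T and C are pyrimidines. A substitution between two purines or between two pyrimidines is a transition (rate α); a substitution between a purine and a pyrimidine is a transversion (rate β). [Figure: bases A, G (purine) and C, T (pyrimidine) with arrows labelled α for transitions and β for transversions; Figure 4-4, Molecular Biology of the Cell (© Garland Science 2008)]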
  • 25. Counting transitions/transversions. Rhodopsin CDS (coding sequence), 1047 bp
      transition: A ↔ G, C ↔ T
      transversion: A,G ↔ C,T
      NCBI IDs: human (NM_000539), chimpanzee (XM_516740), macaque (XM_001094250)
      For human - chimpanzee and human - macaque, count the transitions and transversions and compute the p distance.
      p distance = (number of sites that differ) / (number of sites compared)
                          transition  transversion  p distance
      human-chimpanzee
      human-macaque
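Counting by eye works for a few differences, but the classification rule is simple enough to script. The sketch below is one possible base-R helper (count_substitutions is a name introduced here, not a GeneR or ClustalW function). It assumes the two sequences are already aligned to equal length and skips any column that contains a gap or an ambiguous base.

```r
count_substitutions <- function(seq1, seq2) {
  s1 <- strsplit(toupper(seq1), "")[[1]]
  s2 <- strsplit(toupper(seq2), "")[[1]]
  stopifnot(length(s1) == length(s2))   # sequences must already be aligned

  # Compare only columns where both sequences have A, C, G or T
  valid <- s1 %in% c("A", "C", "G", "T") & s2 %in% c("A", "C", "G", "T")
  s1 <- s1[valid]
  s2 <- s2[valid]

  purines <- c("A", "G")
  differs <- s1 != s2
  same_class <- (s1 %in% purines) == (s2 %in% purines)

  ts <- sum(differs & same_class)    # purine-purine or pyrimidine-pyrimidine
  tv <- sum(differs & !same_class)   # purine-pyrimidine
  c(transition = ts, transversion = tv, sites = length(s1),
    p_distance = (ts + tv) / length(s1))
}

# Two short 10-base sequences, used only for illustration
res <- count_substitutions("TCAGTGACCT", "TCTGTCACGT")
print(res)
```

For the practical, each pair of aligned sequences from the ClustalW output would be passed in as two strings of equal length.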
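  • 26. Bioconductor/GeneR practical 1. R working directory: Users/tg03/bin
      > library(GeneR)  # load GeneR into R
      1) REST: get the human Rhodopsin sequence from NCBI and assign it to x in R
      > x <- "         "
      > s <- placeString(x)
      2) Extract the CDS with R and write it to a file
      # write to rhodopsin.txt in fasta format
      > writeFasta(file="rhodopsin.txt", from=96, to=1142, comment="human")
      [1] 1   # status code (1 or -1)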
  • 27. Bioconductor/GeneR practical 2
      1') REST: get the chimpanzee Rhodopsin sequence from NCBI
      > x <- "         "
      > s <- placeString(x)
      2') Extract the CDS with R and append it to the file
      > writeFasta(file="rhodopsin.txt", from=    , to=    , comment="chimpanzee", append=T)    # append=T adds to the existing file
      1'') REST: get the macaque Rhodopsin sequence from NCBI
      > x <- "         "
      > s <- placeString(x)
      2'') Extract the CDS with R and append it to the file
      > writeFasta(file="rhodopsin.txt", from=    , to=    , comment="macaque", append=T)
      rhodopsin.txt is written to Users/tg03/bin; sequence name in the file: Seq_R_96_1142
  • 28. Clustalw 3) Multiple alignment with ClustalW
      $ export PATH=$PATH:/Users/tg03/bin/clustalw-2.0.12-macosx
      $ cd /Users/tg03/bin/clustalw-2.0.12-macosx
      $ mv ~/bin/rhodopsin.txt ~/bin/clustalw-2.0.12-macosx
      $ clustalw2 -INFILE=rhodopsin.txt -OUTFILE=rho.aln
      Open rho.aln in CotEditor or a similar editor with a monospaced font (Courier / Courier New).
      4) Count transitions and transversions
  • 29. Results. Rhodopsin CDS (coding sequence), 1047 bp
      transition: A ↔ G, C ↔ T
      transversion: A,G ↔ C,T
      human - chimpanzee: (NM_000539)-(XM_516740); human - macaque: (NM_000539)-(XM_001094250)
                          transition  transversion  p distance
      human-chimpanzee    5           1             6/1047 = 0.006
      human-macaque       32          8             40/1047 = 0.038
      600 3000
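The p distances in the table follow directly from the counts. A short sketch of that arithmetic in R, using only the numbers on the slide (the data frame layout and column names are chosen here for illustration):

```r
n_sites <- 1047   # length of the Rhodopsin CDS

counts <- data.frame(
  pair         = c("human-chimpanzee", "human-macaque"),
  transition   = c(5, 32),
  transversion = c(1, 8)
)

# p distance = sites that differ / sites compared
counts$p_distance <- (counts$transition + counts$transversion) / n_sites

# Ratio of transitions to transversions for each pair
counts$ts_tv_ratio <- counts$transition / counts$transversion

print(counts)
print(round(counts$p_distance, 3))
```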
  • 30. [Figure: two scenarios, one starting from TCTGAGACCT and one from TCTGTGACCT, that both end in the pair TCAGTGACCT / TCTGTCACGT. Substitutions labelled in the figure: 3rd T→A, 5th A→T, 6th G→C, 6th C→G, 9th C→G.]
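  • 31. Models for estimating the number of substitutions
      1) p distance: proportion of sites that differ
      2) Jukes and Cantor model (1969): every substitution occurs at the same rate α
             A  T  C  G
          A  -  α  α  α
          T  α  -  α  α
          C  α  α  -  α
          G  α  α  α  -
         $d = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)$
      3) Kimura's two parameter model (1980): transitions occur at rate α, transversions at rate β
             A  T  C  G
          A  -  β  β  α
          T  β  -  α  β
          C  β  α  -  β
          G  α  β  β  -
         P (proportion of sites with a transition difference): $P = \frac{1}{4}\left(1 - 2e^{-4(\alpha+\beta)t} + e^{-8\beta t}\right)$
         Q (proportion of sites with a transversion difference): $Q = \frac{1}{2}\left(1 - e^{-8\beta t}\right)$
         $d \equiv 2rt = 2\alpha t + 4\beta t = -\frac{1}{2}\ln(1 - 2P - Q) - \frac{1}{4}\ln(1 - 2Q)$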
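The Kimura two parameter distance needs only P and Q, so it can be evaluated from the transition and transversion counts of the Rhodopsin comparison (5 and 1 for human-chimpanzee, 32 and 8 for human-macaque, over 1047 sites). A minimal sketch; k2p_distance is a helper name introduced here, not a function from GeneR:

```r
# Kimura (1980) two parameter distance from the observed proportions
# P: proportion of sites with a transition difference
# Q: proportion of sites with a transversion difference
k2p_distance <- function(P, Q) {
  -(1/2) * log(1 - 2 * P - Q) - (1/4) * log(1 - 2 * Q)
}

n_sites <- 1047
d_hc <- k2p_distance(P = 5 / n_sites,  Q = 1 / n_sites)   # human-chimpanzee
d_hm <- k2p_distance(P = 32 / n_sites, Q = 8 / n_sites)   # human-macaque

print(c(human_chimpanzee = d_hc, human_macaque = d_hm))
```

Both values come out slightly above the corresponding p distances, as expected from a correction for multiple substitutions at the same site.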
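  • 32. Derivation 1. An ancestral sequence (a) gives rise to a' and a''; compare one site at time t and at time t+1.
      r: substitution rate per site, on the order of 10^-8 to 10^-9; with the four bases, r = 3α. Terms in r^2 are neglected.
      In one step a base stays unchanged with probability 1-r and changes to one particular other base with probability r/3.
      a' and a'' have the same base at t and still the same base at t+1: (1-r)^2 ≈ 1-2r
      a' and a'' have different bases at t and the same base at t+1: 2r(1-r)/3 ≈ 2r/3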
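  • 33. Derivation 2. Jukes and Cantor model (1969):
      q_t: proportion of sites that are identical between the two sequences at time t; q_{t+1}: the same at time t+1
      Approximate q_{t+1} - q_t by dq/dt and solve for q_t.
      The number of substitutions per site (d) between the two sequences is 2rt; express d in terms of q_t:
      $d = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)$
      p_t = (1 - q_t); p: proportion of sites that differ (p distance)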
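The steps from the one-step probabilities to the distance formula can be written out as follows. This is a sketch of the usual Jukes and Cantor argument under the assumptions on the two slides above (equal rates, terms in r^2 neglected). The initial condition q_0 = 1, meaning the two sequences are identical when they split, is an assumption added here.

```latex
\begin{align*}
q_{t+1} &= (1-r)^2\, q_t + \frac{2r(1-r)}{3}\,(1-q_t)
         \approx (1-2r)\, q_t + \frac{2r}{3}\,(1-q_t) \\
\frac{dq}{dt} &\approx q_{t+1} - q_t = \frac{2r}{3} - \frac{8r}{3}\, q_t \\
q_t &= 1 - \frac{3}{4}\left(1 - e^{-8rt/3}\right) \qquad (q_0 = 1) \\
p_t &= 1 - q_t = \frac{3}{4}\left(1 - e^{-8rt/3}\right) \\
d &= 2rt = -\frac{3}{4}\ln\left(1 - \frac{4}{3}\,p\right)
\end{align*}
```

The last line follows from solving the expression for p_t for 8rt/3 and multiplying by 3/4.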
  • 34. Rhodopsin of human, chimpanzee, macaque: Jukes and Cantor model
                          transition  transversion  p distance        JC distance
      human-chimpanzee    5           1             6/1047 = 0.006
      human-macaque       32          8             40/1047 = 0.038
      $d = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)$
      > -(3/4)*log(1-(4/3)*0.006)
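The one-line calculation above can be wrapped in a small function so that both comparisons are handled at once. A sketch, with jc_distance as a helper name introduced here; it uses the unrounded p distances 6/1047 and 40/1047 rather than 0.006 and 0.038:

```r
# Jukes and Cantor (1969) distance from the p distance
jc_distance <- function(p) {
  stopifnot(all(p >= 0), all(p < 3/4))   # the logarithm is undefined for p >= 3/4
  -(3/4) * log(1 - (4/3) * p)
}

p <- c("human-chimpanzee" = 6 / 1047, "human-macaque" = 40 / 1047)
d_jc <- jc_distance(p)

print(round(rbind(p_distance = p, JC_distance = d_jc), 4))
```

For distances this small the correction is minor; it grows as p increases, because more sites are expected to have changed more than once.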
  • 35. R (transition/transversion)