The Role of Machine Learning in
Modelling the Cell.

      John Hawkins
      ARC Centre for Complex Systems
      University of Queensland
      Australia
Overview of Talk

   Overview of cell biology
   Modelling the cell
   Subcellular localisation signals
   Machine Learning in General
   Neural networks
       Feed Forward versus Recurrent
Cell Biology – Quick and Dirty
                      Membrane bound
                       Organelles
                      Nucleus
                      DNA -> RNA ->
                       Protein
                      Transport, e.g.
                        Mitochondria

                        Peroxisome

                      Modification, e.g.
                        Disulphide
                          Bond Formation
                        Glycosylation
Cell Feedback

   At a particular time point a set of genes
    will be expressed.
   These do not remain constant; instead,
    the emerging picture is that
       There is some essential cycle of gene
        expression
       With a capacity to indulge in alternative
        pathways of expression under external
        stimulus.
   The pattern of expression is
Modelling the cell
   Ideally we would like to model the cell
    at the level of a 3D physical
    simulation.
       Currently this is infeasible
   So numerous approaches are taken to
    form abstractions
       Gene Regulatory Networks
       Differential equation models of particular
        pathways
       Machine learning models of particular
Biological Sequences
   Many Important Biological Molecules are
    Polymers.
       Thus representable as a sequence of discrete
        symbols.
   Sequence M = [m1, m2, …, mn] where:
   DNA: mi ∈ { A, T, G, C }
   RNA: mi ∈ { A, U, G, C }
   Protein: mi ∈ { G, A, V, L, I, P, S, C, T, M, D,
    E, H, K, R, N, Q, F, Y, W }
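The alphabets above map directly onto code. The following is a minimal sketch in Python, not something taken from the talk: the names `ALPHABETS`, `is_valid` and `one_hot` are hypothetical. It checks a sequence against its alphabet and builds a one-hot encoding, which is one common way to present discrete symbols to a learning method.

```python
# Alphabets as listed on the slide; the helper names are illustrative only.
ALPHABETS = {
    "DNA": "ATGC",
    "RNA": "AUGC",
    "Protein": "GAVLIPSCTMDEHKRNQFYW",
}

def is_valid(sequence, kind):
    """Return True if every symbol of `sequence` belongs to the alphabet `kind`."""
    alphabet = set(ALPHABETS[kind])
    return all(symbol in alphabet for symbol in sequence)

def one_hot(sequence, kind):
    """Encode each symbol as a 0/1 vector with a single 1 at the symbol's index."""
    alphabet = ALPHABETS[kind]
    return [[1 if symbol == letter else 0 for letter in alphabet]
            for symbol in sequence]

print(is_valid("ATGC", "DNA"))   # True
print(is_valid("AUGC", "DNA"))   # False: U is not in the DNA alphabet
print(one_hot("AT", "DNA"))      # [[1, 0, 0, 0], [0, 1, 0, 0]]
```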
Information Content
   How much information in a linear sequence?
   Two crucial elements to function
       Physical/chemical properties
       Molecular shape
   Each residue has well known properties
   Denaturation experiments (Anfinsen, 1973).
       Sequence defines arrangement of chemical
        properties which in turn defines folding.
Biological Patterns

   Motifs – General term for patterns
   Numerous Definitions & Visualisations
       PROSITE Patterns – Regular Expression
       PROSITE Profiles – Probability Matrix
       LOGOs
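To make the "PROSITE Patterns – Regular Expression" point concrete, here is a rough sketch that translates the basic PROSITE pattern syntax into a Python regular expression. The function name `prosite_to_regex` is hypothetical, and the translation is deliberately simplified: it assumes anchors (`<`, `>`) appear only at the ends of the pattern, never inside square brackets. The example uses the N-glycosylation site pattern `N-{P}-[ST]-{P}`.

```python
import re

def prosite_to_regex(pattern):
    """Translate a basic PROSITE pattern into a Python regular expression.

    Simplified sketch: anchors inside square brackets are not handled.
    """
    regex = pattern.rstrip(".")                          # drop the trailing period
    regex = regex.replace("-", "")                       # '-' only separates elements
    regex = regex.replace("x", ".")                      # x matches any residue
    regex = regex.replace("{", "[^").replace("}", "]")   # {P} -> [^P]
    regex = regex.replace("(", "{").replace(")", "}")    # x(2,4) -> .{2,4}
    regex = regex.replace("<", "^").replace(">", "$")    # terminus anchors
    return regex

glyco = prosite_to_regex("N-{P}-[ST]-{P}.")
print(glyco)                                  # N[^P][ST][^P]
print(re.search(glyco, "AANGTAA").group())    # NGTA
```

A profile (probability matrix) carries more information than such a pattern, because it scores each residue at each position instead of simply allowing or forbidding it.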
Peroxisomal Localisation

   Predominantly controlled by a C-terminal
    sequence called the PTS1 signal.
   Roughly 12 residues long
   Known dependencies between
    locations
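Because the signal sits at the C-terminus and is roughly 12 residues long, a predictor only needs the tail of each protein as input. A small sketch of that extraction step follows; the helper name `c_terminal_window`, the padding symbol and the toy sequence are assumptions made for illustration, not details from the talk.

```python
def c_terminal_window(sequence, length=12, pad="-"):
    """Return the last `length` residues, left-padded if the protein is shorter."""
    window = sequence[-length:]
    return window.rjust(length, pad)

# Made-up sequence, used only to show the slicing.
toy_protein = "MKTAYIAKQRQISFVKSHFSRQSKL"
print(c_terminal_window(toy_protein))
print(c_terminal_window("SKL"))   # ---------SKL
```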
Nuclear Export
   Some proteins move continuously between the
    nucleus and cytoplasm of the cell.
   Either as:
       Transporters
       Regulators
Machine Learning
   Requires a set of examples, with
       Raw input (sequence data), and
       Known classes that the machine should
        predict
   In essence Function Approximation
       Start with a general parametrised
        function over the input data
       Adjust the parameters until the output of
        the function is a good approximation to
        the known classes of the examples.
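The two steps above can be sketched in a few lines of Python. This is an illustrative toy, not the method used in the talk: the parametrised function is assumed to be a single logistic unit, the adjustment rule is assumed to be plain gradient descent, and the four training examples are made up.

```python
import math

def predict(params, x):
    """General parametrised function: a logistic unit over the input vector x."""
    weights, bias = params
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, n_inputs, rate=0.5, epochs=2000):
    """Adjust the parameters by gradient descent on the cross-entropy error."""
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        for x, target in examples:
            error = predict((weights, bias), x) - target
            weights = [w - rate * error * xi for w, xi in zip(weights, x)]
            bias -= rate * error
    return weights, bias

# Toy examples (made up): the known class is 1 only when both inputs are 1.
examples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
params = train(examples, n_inputs=2)
predictions = [round(predict(params, x)) for x, _ in examples]
print(predictions)
```

After training, the rounded outputs should agree with the known classes of the examples, which is what "a good approximation" means here.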
Bias

   Bias is generally unavoidable
       (Mitchell, 1980)
   Three Sources of Bias
       Input Encoding
       Function Structure (Architecture)
       Parameter adjustment algorithm (learning)
Neural Networks
   Graphical Model consisting of layers of
    nodes connected by weights
   Feed forward neural networks
       Fixed input window
       Signal propagates in a single pass through the
        layers
   Recurrent Neural Networks
       Signal processed in parts
       Recurrent connections maintain a memory state
       Output generated after processing the last piece
        of the input signal
Simple Neural Networks




   FFNN: Oh = S (W1 ∙ I1 + W2 ∙ I2 + b)
   RNN:  Oh = S (W1 ∙ I2 + W2 ∙ S (W1 ∙ I1 + b) + b)
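The two expressions above can be written out directly. In this sketch the symbols W1, W2, I1, I2 and b are scalars taken from the slide, and S is assumed to be the logistic sigmoid (the slide does not say which squashing function is meant). The function names, and the generalisation to longer inputs in `rnn_output_sequence`, are illustrative additions.

```python
import math

def S(z):
    """Squashing function; the logistic sigmoid is assumed here."""
    return 1.0 / (1.0 + math.exp(-z))

def ffnn_output(i1, i2, w1, w2, b):
    """Feed forward: both inputs are seen at once through a fixed window."""
    return S(w1 * i1 + w2 * i2 + b)

def rnn_output(i1, i2, w1, w2, b):
    """Recurrent: i1 is processed first; its output feeds back through w2."""
    previous = S(w1 * i1 + b)
    return S(w1 * i2 + w2 * previous + b)

def rnn_output_sequence(inputs, w1, w2, b):
    """Same recurrence for any input length; the first step has no memory term."""
    state = S(w1 * inputs[0] + b)
    for x in inputs[1:]:
        state = S(w1 * x + w2 * state + b)
    return state

print(ffnn_output(0.0, 0.0, 1.0, 1.0, 0.0))   # 0.5
print(rnn_output(0.5, -1.0, 0.8, 0.3, 0.1))
```

The recurrent form reuses the same weights at every step, so it can read inputs of any length, while the feed forward form needs one weight per position in its fixed window.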
RNNs in Bioinformatics

   Bi-Directional RNN
Applications

   We have applied these techniques to
       Subcellular Localisation to
           Endoplasmic Reticulum
           Mitochondria
           Chloroplast
           Peroxisome
   http://pprowler.imb.uq.edu.au
   Working with whole genome data and
    wet lab biologists to use these tools for
    data mining.
The End…




           ?
