"Junk" DNA Proves to be Highly Valuable¹
What was once thought of as DNA with zero value in plants--dubbed "junk" DNA--may turn out to be key in helping scientists improve the control of gene expression in transgenic crops.² Cooper and collaborators investigated "junk" DNA in the model plant Arabidopsis thaliana, using a computer program to find short segments of DNA that appeared as molecular patterns… These linked patterns are called pyknons… This discovery in plants illustrates that the link between coding DNA and junk DNA crosses higher orders of biology and suggests a universal genetic mechanism at play that is not yet fully understood.
¹ Alfredo Flores, June 2, 2009; http://www.ars.usda.gov/is/pr/2009/090602.htm
² Bret Cooper, Soybean Genomics and Improvement Laboratory, Agricultural Research Service, USDA.
“Perhaps it is time to bid farewell to the term ‘junk’ DNA – we knew not your true nature.” (Regulatory RNAs and the demise of ‘junk’ DNA. Genome Biology 2006, 7:328)
Diagram labels: The genome; genes; Functional elements? Functional Elements: 90%?? Junk: 10%??
"...a certain amount of hubris was required for anyone to call any part of the genome 'junk,' given our level of ignorance." (Francis Collins, 2006)
Fig. 1. Pyknons in the 3' UTRs of the apoptosis inhibitor birc4 (shown above the horizontal line) and nine other genes. Source: Rigoutsos, Isidore et al. (2006) Proc. Natl. Acad. Sci. USA 103, 6605-6610. Copyright ©2006 by the National Academy of Sciences.
WordSeeker: A Software Suite for Discovery and Characterization of Genomic Words and Genome-Wide Patterns
www.word-seeker.org
Word discovery methods (diagram labels, in slide order): sequence-driven (alignment-based); pattern-driven (enumerative); exhaustive; optimized; probabilistic optimization; deterministic optimization; YMF; preprocess; combine short patterns; AlignAce; MEME; WINNOWER; heuristic; exact; Teiresias, WordSeeker suffix tree, Weeder
Sources: GuhaThakurta D., Computational identification of transcriptional regulatory elements in DNA sequence. Nucleic Acids Res. 2006 Jul 19;34(12):3585-98. Review. Sandve GK, Drabløs F., A survey of motif discovery methods in an integrated framework. Biol Direct. 2006 Apr 6;1:11.
The WORDIFIER Pattern for Functional and Regulatory Genomics
Diagram labels: sequence(s); words; WORDIFIER; scientist; scientist
OWEF: An Open Source Word Enumeration Framework for Bioinformatics
Kyle Kurz, Lonnie R. Welch, Frank Drews, Lee Nau, Jens Lichtenberg
Ohio University, School of EECS, Bioinformatics Laboratory
Motivation
  • Create a robust Motif Discovery framework using abstracted core algorithms
  • Use a modular design, allowing new methods and algorithms to be implemented quickly and easily
  • Abstract C++ classes
  • Easily extensible
  • Support the Scientific Discovery process
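One way to read the "abstract C++ classes" and "modular design" goals is sketched below. It is an illustration only and not OWEF's actual code: the names WordEnumerator, MapEnumerator, SortEnumerator, and count_of are hypothetical, and the two strategies are deliberately simple stand-ins for real enumeration algorithms.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical interface (an assumption, not OWEF's API): every strategy
// maps each word of a fixed length to its number of occurrences.
using WordCounts = std::map<std::string, std::size_t>;

class WordEnumerator {
public:
    virtual ~WordEnumerator() = default;
    virtual WordCounts enumerate(const std::string& sequence,
                                 std::size_t word_length) const = 0;
};

// Strategy 1: slide a window over the sequence and count directly in a map.
class MapEnumerator : public WordEnumerator {
public:
    WordCounts enumerate(const std::string& sequence,
                         std::size_t word_length) const override {
        assert(word_length > 0);
        WordCounts counts;
        for (std::size_t i = 0; i + word_length <= sequence.size(); ++i) {
            ++counts[sequence.substr(i, word_length)];
        }
        return counts;
    }
};

// Strategy 2: collect all windows, sort them, then count runs of equal words.
class SortEnumerator : public WordEnumerator {
public:
    WordCounts enumerate(const std::string& sequence,
                         std::size_t word_length) const override {
        assert(word_length > 0);
        std::vector<std::string> words;
        for (std::size_t i = 0; i + word_length <= sequence.size(); ++i) {
            words.push_back(sequence.substr(i, word_length));
        }
        std::sort(words.begin(), words.end());
        WordCounts counts;
        for (std::size_t i = 0; i < words.size();) {
            std::size_t j = i;
            while (j < words.size() && words[j] == words[i]) ++j;
            counts.emplace_hint(counts.end(), words[i], j - i);
            i = j;
        }
        return counts;
    }
};

// Client code depends only on the abstract interface, so a new strategy
// can be added without touching it.
std::size_t count_of(const WordEnumerator& enumerator,
                     const std::string& sequence,
                     const std::string& word) {
    const WordCounts counts = enumerator.enumerate(sequence, word.size());
    const auto it = counts.find(word);
    return it == counts.end() ? 0 : it->second;
}
```

Because count_of takes a reference to the base class, either strategy can be passed in, which is the kind of extensibility the slide describes.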
Approach
Project Information
  • Project: http://bio-s1.cs.ohiou.edu/~wordseek/download/
  • Open Source License: GNU General Public License (GPL v3)
  • Language: C++
  • Applications: Currently in final testing phase
  • Future Work: Will provide backend for WordSeeker tool at Ohio University and Ohio Supercomputer Center; will be used to fully analyze the Arabidopsis thaliana genome
Open Source Implementation of Batch Extraction for Coding and Non-Coding Sequences
Jens Lichtenberg, Lonnie R. Welch
Bioinformatics Laboratory, School of EECS, Ohio University
Motivation
  • Regulatory Genomics tools return and operate on lists of Gene Symbols (e.g. STAT5A, Cd59a, Slc35f4)
  • To our knowledge, there is no currently supported, open source “tool” that allows extraction of specific non-coding sequences for any organism
  • Ensembl API provides limited functionality
Approach (flowchart labels)
  • Input: Gene Symbol; Promoter length
  • Steps: connect to Ensembl database; Set up repository; Retrieve Gene Adaptor; create gene object; Retrieve 5’UTR; Retrieve 3’UTR; Retrieve Exons; Retrieve Upstream Adaptor; Retrieve Introns; Retrieve Promoter
  • Output: Output Files
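The promoter step, which takes a user-supplied promoter length, can be illustrated without a database. The sketch below is written in C++ rather than the project's Perl and does not use the Ensembl API. It assumes a hypothetical Gene record with 0-based, half-open coordinates on the forward strand, and the function names are illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Reverse complement of a DNA string; any character other than
// A, C, G, or T is replaced by N.
std::string reverse_complement(const std::string& dna) {
    std::string rc(dna.rbegin(), dna.rend());
    for (char& base : rc) {
        switch (base) {
            case 'A': base = 'T'; break;
            case 'C': base = 'G'; break;
            case 'G': base = 'C'; break;
            case 'T': base = 'A'; break;
            default:  base = 'N'; break;
        }
    }
    return rc;
}

// Hypothetical gene record (an assumption for this sketch): the gene
// occupies [start, end) on the forward strand of the chromosome string.
struct Gene {
    std::size_t start;
    std::size_t end;
    bool forward_strand;
};

// Return up to promoter_length bases immediately upstream of the gene,
// read 5'->3' on the gene's own strand. The region is clipped at the
// chromosome boundaries.
std::string promoter(const std::string& chromosome, const Gene& gene,
                     std::size_t promoter_length) {
    assert(gene.start <= gene.end && gene.end <= chromosome.size());
    if (gene.forward_strand) {
        // Upstream lies to the left of the gene start.
        const std::size_t begin =
            gene.start > promoter_length ? gene.start - promoter_length : 0;
        return chromosome.substr(begin, gene.start - begin);
    }
    // On the reverse strand, upstream lies to the right of the gene end.
    const std::size_t length =
        std::min(promoter_length, chromosome.size() - gene.end);
    return reverse_complement(chromosome.substr(gene.end, length));
}
```

For a reverse-strand gene the upstream region sits after the gene on the forward strand and must be reverse-complemented, which is why the strand flag is part of the record.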
Project Information
  • Project: http://opensource.msseeker.org
  • GNU General Public License (GPL)
  • Language: Perl
  • Integrated in WordSeeker motif discovery tool of Ohio University Bioinformatics Lab
  • Future Work: Connection to Genbank repository information; release into BioPerl or CPAN
Acknowledgements: Thomas Bitterman, OSC; Laura Elnitski, NHGRI; Susan Evans, OU; Matt Geisler, SIU; Erich Grotewold, OSU; Edwin Jacox, NHGRI; Stephen S. Lee, U. Idaho; Pooja M. Majmudar, OU; Paul Morris, BGSU; Chase Nelson, Oberlin; Eric Stockinger, OSU; Sarah Wyatt, OU; Alper Yilmaz, OSU; Jeffrey Parvin, OSU; Kun Huang, OSU; Thomas Mitchell, OSU; Kengo Morohashi, OSU; Rebecca Lamb, OSU; John Finer, OSU
Jens Lichtenberg
Rami Alouran
Frank Drews

Welch Wordifier Bosc2009

  • 34. Kaiyu Shen; Collaborators; WordSeeker Team; Former Members of the team
  • 35. A pattern “describes a problem which occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use the solution a million times over, without ever doing it the same way twice” [1]. [1] C. Alexander, S. Ishikawa, and M. Silverstein, A Pattern Language: Towns, Buildings, Construction. Oxford University Press, 1977.
  • 37. evidence for its validity
  • 39. stated in the form of an instruction, so that you know exactly what you need to do to build the pattern. Diagram: shows the solution, with labels to indicate its main components. A paragraph which ties the pattern to all those smaller patterns in the language which are needed to complete this pattern, to embellish it, to fill it out…
  • 40. Picture, Introduction, Headline. With the availability of the genomic sequences of numerous organisms, life scientists are working in conjunction with bioinformaticians to decipher the meanings of the genomes. Projects such as the Encyclopedia of DNA Elements (ENCODE) [2] and Pyknons [3] seek to identify and characterize the functional elements in genomes. The functional elements are often referred to as words. Given a genomic sequence (or a set of sequences), an important problem is the enumeration of all subsequences (words) contained in the sequence (or the set of sequences). The WORDIFIER Pattern for Functional and Regulatory Genomics
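The enumeration problem stated on this slide can be made concrete with a small sketch. The code below is an assumption-laden illustration, not the WordSeeker implementation: the names WordStats and enumerate_words are hypothetical, words are taken to be contiguous substrings within a chosen length range, and a brute-force scan stands in for the suffix-tree style methods mentioned elsewhere in the deck.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Per-word statistics over a set of sequences (hypothetical record).
struct WordStats {
    std::size_t occurrences = 0;  // total occurrences, overlaps included
    std::size_t sequences = 0;    // number of sequences containing the word
};

// Enumerate every contiguous word whose length lies in
// [min_length, max_length], across all input sequences.
std::map<std::string, WordStats> enumerate_words(
        const std::vector<std::string>& sequences,
        std::size_t min_length, std::size_t max_length) {
    assert(min_length > 0 && min_length <= max_length);
    std::map<std::string, WordStats> table;
    for (const std::string& seq : sequences) {
        std::set<std::string> seen;  // words already credited to this sequence
        for (std::size_t len = min_length; len <= max_length; ++len) {
            for (std::size_t i = 0; i + len <= seq.size(); ++i) {
                const std::string word = seq.substr(i, len);
                WordStats& stats = table[word];
                ++stats.occurrences;
                // Count each sequence at most once per word.
                if (seen.insert(word).second) ++stats.sequences;
            }
        }
    }
    return table;
}
```

Tracking both counts separates a word that repeats many times in one sequence from a word shared across many sequences. This brute-force version is only practical for short inputs.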