SNP Allele Designations (Bio::SNP::Inherit) Christopher Bottoms BOSC 2010
5 million data “items”
one CPU: 2+ days
eight CPUs: 1-2 days
SNP ID  Sample ID  Base1  Base2
1       1          A      A
1       2          A      A
1       3          A      G
…       …          …      …
1       5000       A      A
2       1          C      C
…       …          …      …
…       …          …      …
1106    5000       GG     GG
“Matrix” data file format

SNP ID  1   2   3   …  5000
SNP1    AA  AA  AG  …  AA
SNP2    CC  GG  GG  …  CG
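As a rough illustration of the preprocessing the matrix layout calls for, the Python sketch below re-attaches a SNP ID and a sample ID to every genotype. It is not taken from Bio::SNP::Inherit: the function name `tag_matrix`, the tab delimiter, and the three-sample excerpt are assumptions made for this example.

```python
import csv
import io


def tag_matrix(handle, delimiter="\t"):
    """Yield (snp_id, sample_id, genotype) for every cell of a matrix file.

    Assumes the first row holds the sample IDs and the first column of each
    later row holds the SNP ID (delimiter is an assumption, not from the talk).
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    sample_ids = header[1:]  # skip the "SNP ID" column label
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        snp_id, genotypes = row[0], row[1:]
        if len(genotypes) != len(sample_ids):
            raise ValueError(f"{snp_id}: expected {len(sample_ids)} genotypes")
        for sample_id, genotype in zip(sample_ids, genotypes):
            yield snp_id, sample_id, genotype


# Hypothetical three-sample excerpt of the matrix shown on the slide.
EXAMPLE = "SNP ID\t1\t2\t3\nSNP1\tAA\tAA\tAG\nSNP2\tCC\tGG\tGG\n"

records = list(tag_matrix(io.StringIO(EXAMPLE)))
for record in records:
    print(record)
```

Each yielded tuple corresponds to one line of the original long format, so the compact file can still feed code that expects tagged data points.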
Using new data format
ID’s file

ID  Name    Group
1   B73     B73
2   B73xZ1  NAMF1
3   Mo17    Control
4   M100    IBM
5   Bob     B73xZ1
“Human Parsed” ID’s file

ID Name Group A (ID) B (ID) AxB (ID)
1 B73 B73
2 B73xZ1 NAMF1
3 Mo17 Control
4 M100 IBM 1 3
5 Bob B73xZ1 1 2
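To make the idea of stating relationships explicitly more concrete, here is a minimal Python sketch that reads a “human parsed” manifest and resolves the ID references to sample names. It is an illustration under stated assumptions, not the module's code: the tab-delimited layout, the helper names `load_manifest` and `describe`, and the placement of Bob's values (A = 1, AxB = 2) are guesses, since the flattened slide does not show which columns are blank. Only the M100 row (A = B73, B = Mo17) is backed by the talk's remark that IBM is an Intermated B73 x Mo17 population.

```python
import csv
import io

# Hypothetical tab-delimited rendering of the "Human Parsed" ID's file.
# Column placement for sample 5 (Bob) is an assumption.
MANIFEST = (
    "ID\tName\tGroup\tA (ID)\tB (ID)\tAxB (ID)\n"
    "1\tB73\tB73\t\t\t\n"
    "2\tB73xZ1\tNAMF1\t\t\t\n"
    "3\tMo17\tControl\t\t\t\n"
    "4\tM100\tIBM\t1\t3\t\n"
    "5\tBob\tB73xZ1\t1\t\t2\n"
)

# Manifest column -> short label used in the printed description.
RELATIVE_COLUMNS = {"A (ID)": "A", "B (ID)": "B", "AxB (ID)": "AxB"}


def load_manifest(handle):
    """Return a dict mapping sample ID to its manifest row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["ID"]: row for row in reader}


def describe(sample_id, samples):
    """Resolve the ID references of one sample into readable names."""
    row = samples[sample_id]
    parts = []
    for column, label in RELATIVE_COLUMNS.items():
        ref = row[column]
        if ref:  # blank cell means no relative recorded in this column
            parts.append(f"{label}={samples[ref]['Name']}")
    return f"{row['Name']}: " + (", ".join(parts) or "no recorded relatives")


samples = load_manifest(io.StringIO(MANIFEST))
for sample_id in samples:
    print(describe(sample_id, samples))
```

The program no longer has to infer from the group name “IBM” that B73 and Mo17 are involved; the relationship is simply looked up by ID.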
Lessons learned
Acknowledgements
End

Editor's notes

  1. The data file had to be read into the database, and then the information from the database was used to determine inheritance codes.
  2. We had 5000 samples of data associated with each “SNP ID”, and we had over 1000 SNP IDs, making our data file over 5 million lines long. It was actually much messier looking than this, and I ended up processing each line and storing the results in a database. After I talked with my boss about this, he provided me the same data in a different format.
  3. This format really condensed the data file: from 800 MB to less than 15 MB, in fact. However, each “data point” is no longer “tagged”, so some additional preprocessing needed to be done.
  4. The sample IDs I showed you earlier each represented a different individual corn plant. Knowing the relationships among the different plants was required for processing the data. Here, since I’m a human familiar with the genetic system, I know that IBM stands for an Intermated B73 x Mo17 population. This is a simplified example of a manifest file. Z1, M100, and “Bob” are just made-up names, and any similarity to known names is purely coincidental. When you start looking at these, you see that the relationships were defined in multiple ways. There isn’t anything here that directly tells you that IBM, Mo17, and B73 are related. To take advantage of this information, I wrote a long series of rules. Well, the breakthrough came with the realization that I couldn’t keep this up forever. Instead of telling the computer how to understand these relationships, I decided to just tell the computer what the relationships are (next slide).
  5. This is organized in a way that is simple for both humans and computer programs to understand.
  6. Configuration files are great for some tasks that are easy for humans but more difficult to program. They are also great for things that are variable. Setting up the configuration file only takes minutes. If we don’t know what these relationships are to start with, then we’re in trouble anyway. Simple for humans ≠ simple for computers. Something else I didn’t put up here is that reducing your dependencies sure makes it easier to install.