INTRODUCTION TO HMMER
Biosequence Analysis Using Profile Hidden Markov Models

Anaxagoras Fotopoulos | 2014
Course: Algorithms in Molecular Biology
A brief History
Sean Eddy
• HMMER 1.8, the first public release of HMMER, came in April 1995

• “Far too much of HMMER was written in coffee shops, airport lounges, transoceanic flights, and Graeme Mitchison’s kitchen”
• “If the world worked as I hoped, the combination of the book Biological Sequence Analysis and the existence of HMMER2 as a widely-used proof of principle should have motivated the widespread adoption of probabilistic modeling methods for sequence analysis.”
• “BLAST continued to be the most widely used search program. HMMs widely considered as a mysterious and orthogonal black box.”
• “NCBI, seemed to be slow to adopt or even understand HMM methods. This nagged at me; the revolution was unfinished!”

• “In 2006 we moved the lab and I decided that we should aim to replace BLAST with an entirely new generation of software. The result is the HMMER3 project.”
Usage

• HMMER searches sequence databases (or single sequences) for homologs of protein or DNA sequences by comparing them against a profile HMM.
• It can also produce sequence alignments.
• It is especially powerful when the query is an alignment of multiple members of a sequence family.
• Automated construction and maintenance of large multiple alignment databases, which is useful for organizing sequences into evolutionarily related families.
• Automated annotation of the domain structure of proteins by searching protein family databases such as Pfam and InterPro.
How it works

HMMER makes a profile HMM from a multiple sequence alignment.

The profile serves as the query: it assigns a position-specific scoring system for substitutions, insertions and deletions.

HMMER3 uses Forward scores rather than Viterbi scores, which improves sensitivity. Forward scores are better for detecting distant homologs because they sum over all possible alignments instead of relying on the single best one.
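
The difference between the two scores can be illustrated on a toy model. The sketch below is not HMMER code: the two-state HMM and every probability in it are made-up assumptions, chosen only to show that the Forward score sums over all state paths while the Viterbi score keeps just the single best path.

```python
# Toy two-state HMM. All numbers are illustrative assumptions,
# not parameters of any real HMMER profile.
STATES = ("M", "I")
START = {"M": 0.9, "I": 0.1}
TRANS = {"M": {"M": 0.8, "I": 0.2}, "I": {"M": 0.5, "I": 0.5}}
EMIT = {"M": {"A": 0.7, "C": 0.3}, "I": {"A": 0.4, "C": 0.6}}


def viterbi_prob(seq):
    """Probability of the single most likely state path for seq."""
    v = {s: START[s] * EMIT[s][seq[0]] for s in STATES}
    for x in seq[1:]:
        # Keep only the best predecessor for each state.
        v = {s: max(v[p] * TRANS[p][s] for p in STATES) * EMIT[s][x]
             for s in STATES}
    return max(v.values())


def forward_prob(seq):
    """Probability of seq summed over every possible state path."""
    f = {s: START[s] * EMIT[s][seq[0]] for s in STATES}
    for x in seq[1:]:
        # Add up the contributions of all predecessors for each state.
        f = {s: sum(f[p] * TRANS[p][s] for p in STATES) * EMIT[s][x]
             for s in STATES}
    return sum(f.values())
```

For any sequence the Forward probability is at least as large as the Viterbi probability, because the best path is only one of the terms in the Forward sum.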

Sequences that score significantly better against the profile HMM than against a null model are considered to be homologous.
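
A minimal sketch of the idea of scoring against a null model, under strong simplifying assumptions: the profile is ungapped, has three hypothetical DNA columns with invented probabilities, and the null model is uniform. HMMER's real scoring also handles insertions and deletions; this only shows what a log-odds score in bits means.

```python
import math

# Hypothetical 3-column DNA profile (assumed emission probabilities).
PROFILE = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
# Null model: every residue equally likely (an assumption for this sketch).
NULL = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds_bits(seq, profile=PROFILE, null=NULL):
    """Sum over positions of log2(P(residue | profile) / P(residue | null))."""
    if len(seq) != len(profile):
        raise ValueError("sequence length must equal the number of profile columns")
    return sum(math.log2(col[x] / null[x]) for x, col in zip(seq, profile))
```

A positive score means the sequence is better explained by the profile than by the null model; a negative score means the opposite.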
Posterior probabilities of alignment are reported, enabling assessments on a residue-by-residue basis.
HMMER3 also makes extensive use of vector-parallel (SIMD) instructions to increase computational speed, building on a significant acceleration of the Smith-Waterman algorithm for aligning two sequences (Farrar M, 2007).
Index of Commands (1/4)

Build models and align sequences (DNA or protein)
hmmbuild

Build a profile HMM from an input multiple alignment.

hmmalign

Make a multiple alignment of many sequences to a common profile HMM.
Index of Commands (2/4)
Search protein queries against protein databases
phmmer

Search a single protein sequence against a protein sequence database (like BLASTP).

jackhmmer

Iteratively search a protein sequence against a protein sequence database (like PSI-BLAST).

hmmsearch

Search a protein profile HMM against a protein sequence database.

hmmscan

Search a protein sequence against a protein profile HMM database.

hmmpgmd

Search daemon used for hmmer.org website.
Index of Commands (3/4)

Search DNA queries against DNA databases
nhmmer

Search DNA queries against a DNA database (like BLASTN).

nhmmscan

Search a DNA sequence against a DNA profile HMM database.
Index of Commands (4/4)

Other Utilities
alimask

Modify alignment file to mask column ranges.

hmmconvert

Convert profile formats to/from HMMER3 format.

hmmemit

Generate (sample) sequences from a profile HMM.

hmmfetch

Get a profile HMM by name or accession from an HMM database.

hmmpress

Format an HMM database into a binary format for hmmscan.

hmmstat

Show summary statistics for each profile in an HMM database.
Basic Examples with HMMER

hmmbuild [options] <hmmfile_out> <multiple sequence alignment file>

> hmmbuild globins4.hmm tutorial/globins4.sto

Most Used Options
-o <f> Direct the summary output to file <f>, rather than to stdout.
-O <f> Resave annotated modified source alignments to a file <f> in Stockholm format.
--amino Specify that all sequences in msafile are proteins.
--dna Specify that all sequences in msafile are DNAs.
--rna Specify that all sequences in msafile are RNAs.
--pnone Don’t use any priors. Probability parameters will simply be the observed frequencies, after relative sequence weighting.
--plaplace Use a Laplace +1 prior in place of the default mixture Dirichlet prior.
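
To make the difference between --pnone and --plaplace concrete, here is a small sketch that estimates emission probabilities for a single alignment column. It is an illustration only: the function name, the DNA alphabet and the example column are assumptions, and it ignores the relative sequence weighting and the default mixture Dirichlet prior that hmmbuild actually uses.

```python
from collections import Counter


def column_probs(column, alphabet="ACGT", pseudocount=0.0):
    """Estimate emission probabilities for one alignment column.

    pseudocount=0 gives the raw observed frequencies (no prior);
    pseudocount=1 adds one count per symbol (a Laplace +1 prior).
    Characters outside the alphabet, such as gaps, are ignored.
    """
    counts = Counter(c for c in column if c in alphabet)
    total = sum(counts.values()) + pseudocount * len(alphabet)
    if total == 0:
        raise ValueError("column has no residues and no pseudocounts")
    return {a: (counts[a] + pseudocount) / total for a in alphabet}


# Hypothetical column taken from four aligned sequences.
observed = column_probs("AAAC")                  # no prior
laplace = column_probs("AAAC", pseudocount=1.0)  # Laplace +1 prior
```

Without a prior, residues that never appear in the column get probability zero, so a single unseen residue would rule a sequence out entirely; the +1 prior keeps every residue possible.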
Basic Examples with HMMER

hmmbuild [options] <hmmfile_out> <multiple sequence alignment file>

> hmmbuild globins4.hmm tutorial/globins4.sto
Internal Use!
Basic Examples with HMMER

hmmsearch [options] <hmmfile> <seqdb>

Search a protein profile HMM against a protein sequence database.

> hmmsearch globins4.hmm uniprot_sprot.fasta > globins4.out

Key notes
hmmsearch accepts any FASTA file as target database input. It also accepts EMBL/UniProt text format.
-o <f> Direct the human-readable output to a file <f> instead of the default stdout.
-A <f> Save a multiple alignment of all significant hits (those satisfying inclusion thresholds) to the file <f>.
--tblout <f> Save a simple tabular (space-delimited) file summarizing the per-target output, with one data line per homologous target sequence found.
--domtblout <f> Save a simple tabular (space-delimited) file summarizing the per-domain output, with one data line per homologous domain detected in a query sequence for each homologous model.

• The most important number here is the sequence E-value.
• The lower the E-value, the more significant the hit.
• If both the full sequence E-value and the single best domain E-value are significant (<< 1), the sequence is likely to be homologous to your query.
• If the full sequence E-value is significant but the single best domain E-value is not, the target sequence is probably a multidomain remote homolog.
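
The E-value rules above can be written as a small decision function. This is only a sketch: the function name, the returned labels and the 0.01 cutoff standing in for "<< 1" are assumptions, not values defined by HMMER.

```python
def interpret_hit(seq_evalue, best_domain_evalue, threshold=0.01):
    """Classify a hit from its full sequence and best single domain E-values.

    threshold is an assumed stand-in for "significant (<< 1)".
    """
    seq_ok = seq_evalue < threshold
    dom_ok = best_domain_evalue < threshold
    if seq_ok and dom_ok:
        return "likely homolog"
    if seq_ok:
        # Significant overall, but no single domain is convincing on its own.
        return "possible multidomain remote homolog"
    return "not significant"
```

For example, `interpret_hit(1e-5, 0.5)` falls into the second case: the sequence as a whole is significant although its best single domain is not.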
Basic Examples with HMMER

phmmer [options] <seqfile> <seqdb>

Search protein sequence(s) against a protein sequence database.

> phmmer tutorial/HBB_HUMAN uniprot_sprot.fasta

jackhmmer [options] <seqfile> <seqdb>

Iterative protein searches

> jackhmmer tutorial/HBB_HUMAN uniprot_sprot.fasta

Key notes
• phmmer works essentially just like hmmsearch does, except you provide a query sequence instead of a query profile HMM.
• The default score matrix is BLOSUM62.
• Everything about the output is essentially as previously described for hmmsearch.
• jackhmmer searches a single sequence query iteratively against a sequence database (like PSI-BLAST).

• The first round is identical to a phmmer search. All the matches that pass the inclusion thresholds are put in a multiple alignment.
• In the second (and subsequent) rounds, a profile is made from these results, and the database is searched again with the profile.
• Iterations continue either until no new sequences are detected or the maximum number of iterations is reached.
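
The iteration scheme described above can be sketched as a simple loop. This is not jackhmmer's implementation: `search` is a hypothetical callable standing in for a database search with the current profile, and the toy "database" of relatives is invented for illustration.

```python
def iterative_search(query, search, max_iterations=5):
    """Repeat `search` until no new sequences appear or the cap is reached.

    `search` is a hypothetical callable: it receives the ids currently
    included in the alignment (standing in for the profile) and returns
    the ids that pass the inclusion threshold.
    """
    included = {query}
    rounds = 0
    for rounds in range(1, max_iterations + 1):
        hits = set(search(frozenset(included)))
        if hits <= included:   # converged: nothing new was detected
            break
        included |= hits       # the next round uses the enlarged alignment
    return included, rounds


# Toy stand-in for a database: each id "finds" its listed relatives.
RELATIVES = {"query": {"a"}, "a": {"b"}, "b": set()}


def toy_search(model_ids):
    found = set()
    for seq_id in model_ids:
        found |= RELATIVES.get(seq_id, set())
    return found


result, rounds = iterative_search("query", toy_search)
```

In the toy run, the distant relative "b" is only reachable through "a", which is why a second round with the enlarged alignment finds more than the first.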
Basic Examples with HMMER

jackhmmer [options] <seqfile> <seqdb>

Iterative protein searches

> jackhmmer tutorial/HBB_HUMAN uniprot_sprot.fasta

• After round one, the output tells you that the new alignment contains 936 sequences: your query plus 935 significant matches.
• For round two, it has built a new model from this alignment.
• After round two, many more globin sequences have been found.
• After round five, the search ends because it reaches the default maximum of five iterations.
Basic Examples with HMMER

hmmalign [options] <hmmfile> <seqfile>

Creating multiple alignments

> hmmalign globins4.hmm tutorial/globins45.fasta

tutorial/globins45.fasta: a file with 45 unaligned globin sequences

Output annotation: posterior probability estimate
Smart(Hmm)er
Create a tiny database
> hmmpress minifam
> hmmscan minifam tutorial/7LESS_DROME

The following three commands are identical:
> hmmsearch globins4.hmm uniprot_sprot.fasta
> cat globins4.hmm | hmmsearch - uniprot_sprot.fasta
> cat uniprot_sprot.fasta | hmmsearch globins4.hmm -

> hmmfetch --index Pfam-A.hmm
> cat myqueries.list | hmmfetch -f Pfam-A.hmm - | hmmsearch - uniprot_sprot.fasta
This takes a list of query profile names/accessions in myqueries.list, fetches them one by one from Pfam, and runs an hmmsearch with each of them against UniProt.
Latest Edition Features
DNA sequence comparison. HMMER now includes tools that are specifically designed for DNA/DNA comparison: nhmmer and nhmmscan. The most notable improvement over using HMMER3’s tools is the ability to search long (e.g. chromosome length) target sequences.

More sequence input formats. HMMER now handles a wide variety of input sequence file formats, both aligned (Stockholm, Aligned FASTA, Clustal, NCBI PSI-BLAST, PHYLIP, Selex, UCSC SAM A2M) and unaligned (FASTA, EMBL, Genbank), usually with autodetection.

MSV stage of HMMER acceleration pipeline now even faster. Bjarne Knudsen, Chief Scientific Officer of CLC bio in Denmark, contributed an important optimization of the MSV filter (the first stage in the accelerated “filter pipeline”) that increases overall HMMER3 speed by about two-fold. This speed improvement has no impact on sensitivity.

Web implementation of HMMER
Available Online
phmmer
hmmscan
hmmsearch
jackhmmer

http://hmmer.janelia.org/search/hmmsearch
Advantages/Disadvantages

Advantages
• The methods are consistent and therefore highly automatable, allowing us to make libraries of hundreds of profile HMMs and apply them on a very large scale to whole genome analysis.
• HMMER can be used as a search tool for additional homologues.

Disadvantages
• HMMs do not capture any higher-order correlations. An HMM assumes that the identity of a particular position is independent of the identity of all other positions.
• Profile HMMs are often not good models of structural RNAs, for instance, because an HMM cannot describe base pairs.
More Information

http://hmmer.janelia.org

http://cryptogenomicon.org/
Thank you!

Algorithms in Molecular Biology
Information Technologies in Medicine and Biology

Technological Education Institute of Athens, Department of Biomedical Engineering
National & Kapodistrian University of Athens, Department of Informatics
Biomedical Research Foundation, Academy of Athens
Demokritos National Center for Scientific Research

 
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingCall Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingNehru place Escorts
 
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service Available
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls ITPL Just Call 7001305949 Top Class Call Girl Service Available
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 

Dernier (20)

Ahmedabad Call Girls CG Road 🔝9907093804 Short 1500 💋 Night 6000
Ahmedabad Call Girls CG Road 🔝9907093804  Short 1500  💋 Night 6000Ahmedabad Call Girls CG Road 🔝9907093804  Short 1500  💋 Night 6000
Ahmedabad Call Girls CG Road 🔝9907093804 Short 1500 💋 Night 6000
 
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
 
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call NowKolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
 
call girls in Connaught Place DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
call girls in Connaught Place  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...call girls in Connaught Place  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
call girls in Connaught Place DELHI 🔝 >༒9540349809 🔝 genuine Escort Service ...
 
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
 
VIP Call Girls Pune Vrinda 9907093804 Short 1500 Night 6000 Best call girls S...
VIP Call Girls Pune Vrinda 9907093804 Short 1500 Night 6000 Best call girls S...VIP Call Girls Pune Vrinda 9907093804 Short 1500 Night 6000 Best call girls S...
VIP Call Girls Pune Vrinda 9907093804 Short 1500 Night 6000 Best call girls S...
 
Call Girls Hebbal Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hebbal Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hebbal Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hebbal Just Call 7001305949 Top Class Call Girl Service Available
 
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
 
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.Artifacts in Nuclear Medicine with Identifying and resolving artifacts.
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.
 
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
 
Glomerular Filtration and determinants of glomerular filtration .pptx
Glomerular Filtration and  determinants of glomerular filtration .pptxGlomerular Filtration and  determinants of glomerular filtration .pptx
Glomerular Filtration and determinants of glomerular filtration .pptx
 
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original PhotosCall Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
 
sauth delhi call girls in Bhajanpura 🔝 9953056974 🔝 escort Service
sauth delhi call girls in Bhajanpura 🔝 9953056974 🔝 escort Servicesauth delhi call girls in Bhajanpura 🔝 9953056974 🔝 escort Service
sauth delhi call girls in Bhajanpura 🔝 9953056974 🔝 escort Service
 
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service MumbaiVIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
VIP Call Girls Mumbai Arpita 9910780858 Independent Escort Service Mumbai
 
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking ModelsMumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
Mumbai Call Girls Service 9910780858 Real Russian Girls Looking Models
 
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
 
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment BookingCall Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
 
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingCall Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
 
Escort Service Call Girls In Sarita Vihar,, 99530°56974 Delhi NCR
Escort Service Call Girls In Sarita Vihar,, 99530°56974 Delhi NCREscort Service Call Girls In Sarita Vihar,, 99530°56974 Delhi NCR
Escort Service Call Girls In Sarita Vihar,, 99530°56974 Delhi NCR
 
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service Available
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls ITPL Just Call 7001305949 Top Class Call Girl Service Available
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service Available
 

Introduction to HMMER - A biosequence analysis tool with Hidden Markov Models

  • 1. INTRODUCTION TO HMMER: Biosequence Analysis Using Profile Hidden Markov Models. Anaxagoras Fotopoulos | 2014. Course: Algorithms in Molecular Biology
  • 2. A brief History (Sean Eddy)
      • HMMER 1.8, the first public release of HMMER, came in April 1995.
      • “Far too much of HMMER was written in coffee shops, airport lounges, transoceanic flights, and Graeme Mitchison’s kitchen”
      • “If the world worked as I hoped, the combination of the book Biological Sequence Analysis and the existence of HMMER2 as a widely-used proof of principle should have motivated the widespread adoption of probabilistic modeling methods for sequence analysis.”
      • “BLAST continued to be the most widely used search program. HMMs widely considered as a mysterious and orthogonal black box.”
      • “NCBI, seemed to be slow to adopt or even understand HMM methods. This nagged at me; the revolution was unfinished!”
      • “In 2006 we moved the lab and I decided that we should aim to replace BLAST with an entirely new generation of software. The result is the HMMER3 project.”
  • 3. Usage
      • HMMER searches sequence databases, or single sequences, for homologs of protein or DNA sequences by comparing a profile HMM against them.
      • It can also make sequence alignments.
      • It is especially powerful when the query is an alignment of multiple instances of a sequence family.
      • Automated construction and maintenance of large multiple alignment databases, which is useful for organizing sequences into evolutionarily related families.
      • Automated annotation of the domain structure of proteins by searching protein family databases such as Pfam and InterPro.
  • 4. How it works
      • HMMER makes a profile HMM from a multiple sequence alignment.
      • The profile acts as a query that assigns a position-specific scoring system for substitutions, insertions and deletions.
      • HMMER3 uses Forward scores rather than Viterbi scores, which improves sensitivity. Forward scores sum over all possible alignments instead of scoring only the single best one, so they are better for detecting distant homologs.
      • Sequences that score significantly better against the profile HMM than against a null model are considered to be homologous.
      • Posterior probabilities of alignment are reported, enabling assessments on a residue-by-residue basis.
      • HMMER3 also makes extensive use of vector-parallel (SIMD) instructions to increase computational speed, building on a significant acceleration of the Smith-Waterman algorithm for aligning two sequences (Farrar M, 2007).
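To make the Forward versus Viterbi distinction on slide 4 concrete, here is a minimal Python sketch on a toy two-state HMM. The states, probabilities and test sequence are invented for illustration and are not a HMMER profile; HMMER's own implementation is far more elaborate. The sketch only shows that the Forward probability (summed over all state paths) is never smaller than the Viterbi probability (the single best path).

```python
# Toy two-state HMM; all numbers are hypothetical, chosen only for illustration.
states = ("M", "I")
start = {"M": 0.9, "I": 0.1}
trans = {"M": {"M": 0.8, "I": 0.2}, "I": {"M": 0.5, "I": 0.5}}
emit = {
    "M": {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    "I": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
}


def forward(seq):
    """Probability of seq summed over all state paths."""
    f = {s: start[s] * emit[s][seq[0]] for s in states}
    for x in seq[1:]:
        f = {s: emit[s][x] * sum(f[r] * trans[r][s] for r in states) for s in states}
    return sum(f.values())


def viterbi(seq):
    """Probability of seq along the single most probable state path."""
    v = {s: start[s] * emit[s][seq[0]] for s in states}
    for x in seq[1:]:
        v = {s: emit[s][x] * max(v[r] * trans[r][s] for r in states) for s in states}
    return max(v.values())


seq = "ACAA"
print(f"Forward: {forward(seq):.6f}  Viterbi: {viterbi(seq):.6f}")
```

The gap between the two numbers is the probability mass carried by all the alternative alignments that Viterbi ignores, which is where weak, ambiguous similarity tends to sit.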
  • 5. Index of Commands (1/4): Build models and align sequences (DNA or protein)
      • hmmbuild: Build a profile HMM from an input multiple alignment.
      • hmmalign: Make a multiple alignment of many sequences to a common profile HMM.
  • 6. Index of Commands (2/4): Search protein queries to protein databases
      • phmmer: Search a single protein sequence against a protein sequence database (like BLASTP).
      • jackhmmer: Iteratively search a protein sequence against a protein sequence database (like PSI-BLAST).
      • hmmsearch: Search a protein profile HMM against a protein sequence database.
      • hmmscan: Search a protein sequence against a protein profile HMM database.
      • hmmpgmd: Search daemon used for the hmmer.org website.
  • 7. Index of Commands (3/4): Search DNA queries to DNA databases
      • nhmmer: Search DNA queries against a DNA database (like BLASTN).
      • nhmmscan: Search a DNA sequence against a DNA profile HMM database.
  • 8. Index of Commands (4/4): Other Utilities
      • alimask: Modify an alignment file to mask column ranges.
      • hmmconvert: Convert profile formats to/from HMMER3 format.
      • hmmemit: Generate (sample) sequences from a profile HMM.
      • hmmfetch: Get a profile HMM by name or accession from an HMM database.
      • hmmpress: Format an HMM database into a binary format for hmmscan.
      • hmmstat: Show summary statistics for each profile in an HMM database.
  • 9. Basic Examples with HMMER: hmmbuild
      hmmbuild [options] <hmmfile out> <multiple sequence alignment file>
      > hmmbuild globins4.hmm tutorial/globins4.sto
      Most Used Options
      • -o <f>: Direct the summary output to file <f>, rather than to stdout.
      • -O <f>: Resave annotated modified source alignments to a file <f> in Stockholm format.
      • --amino: Specify that all sequences in msafile are proteins.
      • --dna: Specify that all sequences in msafile are DNAs.
      • --rna: Specify that all sequences in msafile are RNAs.
      • --pnone: Don’t use any priors. Probability parameters will simply be the observed frequencies, after relative sequence weighting.
      • --plaplace: Use a Laplace +1 prior in place of the default mixture Dirichlet prior.
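The difference between the --pnone and --plaplace options on slide 9 can be illustrated with a small Python sketch. This is not HMMER code: the alignment column, the DNA alphabet and the helper name `column_probs` are assumptions made for the example, and the sketch ignores relative sequence weighting and the default mixture Dirichlet prior.

```python
from collections import Counter

ALPHABET = "ACGT"  # assumed DNA alphabet for this toy example


def column_probs(column, pseudocount=0.0):
    """Emission probabilities for one alignment column.

    pseudocount=0.0 gives plain observed frequencies (the idea behind --pnone);
    pseudocount=1.0 adds one count per symbol (the idea behind --plaplace).
    """
    counts = Counter(column)
    total = len(column) + pseudocount * len(ALPHABET)
    return {a: (counts[a] + pseudocount) / total for a in ALPHABET}


column = "AAAC"  # hypothetical column from a four-sequence alignment
observed = column_probs(column)
laplace = column_probs(column, pseudocount=1.0)
print("observed:", observed)
print("laplace: ", laplace)
```

With no prior, residues never seen in the column get probability zero, so a single unseen residue in a target would rule out a match at that position; the +1 prior keeps every residue possible at the cost of flattening the distribution.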
  • 10. Basic Examples with HMMER: hmmbuild output
      hmmbuild [options] <hmmfile out> <multiple sequence alignment file>
      > hmmbuild globins4.hmm tutorial/globins4.sto
      Internal Use!
  • 11. Basic Examples with HMMER: hmmsearch
      hmmsearch [options] <hmmfile> <seqdb>: Search a protein profile HMM against a protein sequence database.
      > hmmsearch globins4.hmm uniprot_sprot.fasta > globins4.out
      Keynotes
      • hmmsearch accepts any FASTA file as target database input. It also accepts EMBL/UniProt text format.
      • -o <f>: Direct the human-readable output to a file <f> instead of the default stdout.
      • -A <f>: Save a multiple alignment of all significant hits (those satisfying inclusion thresholds) to the file <f>.
      • --tblout <f>: Save a simple tabular (space-delimited) file summarizing the per-target output, with one data line per homologous target sequence found.
      • --domtblout <f>: Save a simple tabular (space-delimited) file summarizing the per-domain output, with one data line per homologous domain detected in a query sequence for each homologous model.
      Reading the output
      • The most important number is the sequence E-value.
      • The lower the E-value, the more significant the hit.
      • If both the full-sequence and the best-domain E-values are significant (<< 1), the sequence is likely to be homologous to your query.
      • If the full-sequence E-value is significant but the single best domain E-value is not, the target sequence is probably a multidomain remote homolog.
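The E-value rules of thumb on slide 11 can be written down as a small decision function. This is a sketch in Python, not part of HMMER: the function name `interpret_hit`, the returned labels and the 0.01 cutoff are illustrative assumptions, and the slide itself only says that a significant E-value is much smaller than 1.

```python
def interpret_hit(seq_evalue, best_domain_evalue, threshold=0.01):
    """Apply the slide's rules of thumb to one hit.

    threshold is an illustrative cutoff (an assumption of this sketch).
    """
    seq_ok = seq_evalue < threshold
    dom_ok = best_domain_evalue < threshold
    if seq_ok and dom_ok:
        return "likely homolog"
    if seq_ok:
        # Full sequence is significant, but no single domain is on its own.
        return "possible multidomain remote homolog"
    return "not significant"


# Hypothetical E-values, not taken from a real search.
print(interpret_hit(1e-50, 1e-48))
print(interpret_hit(1e-5, 0.3))
print(interpret_hit(2.0, 5.0))
```

A hit in the middle category deserves manual inspection of its alignments rather than automatic acceptance.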
  • 12. Basic Examples with HMMER: phmmer and jackhmmer
      phmmer [options] <seqfile> <seqdb>: Search protein sequence(s) against a protein sequence database.
      > phmmer tutorial/HBB_HUMAN uniprot_sprot.fasta
      Keynotes
      • phmmer works essentially just like hmmsearch does, except you provide a query sequence instead of a query profile HMM.
      • The default score matrix is BLOSUM62.
      • Everything about the output is essentially as previously described for hmmsearch.
      jackhmmer [options] <seqfile> <seqdb>: Iterative protein searches.
      > jackhmmer tutorial/HBB_HUMAN uniprot_sprot.fasta
      • jackhmmer searches a single sequence query iteratively against a sequence database (like PSI-BLAST).
      • The first round is identical to a phmmer search. All the matches that pass the inclusion thresholds are put in a multiple alignment.
      • In the second (and subsequent) rounds, a profile is made from these results, and the database is searched again with the profile.
      • Iterations continue either until no new sequences are detected or the maximum number of iterations is reached.
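The jackhmmer rounds described on slide 12 follow a simple control loop, sketched below in Python. Nothing here calls HMMER: `iterative_search`, `toy_search` and the tiny similarity table are hypothetical stand-ins, with set membership playing the role of "passes the inclusion threshold". Only the stopping logic (no new sequences, or the iteration limit) is taken from the slide.

```python
def iterative_search(query, database, search, max_iterations=5):
    """Repeat a search, growing the included set until it stops changing.

    search(included, database) is a caller-supplied stand-in for one
    profile search; it must return the set of sequences it considers hits.
    Returns the final included set and the number of rounds run.
    """
    included = {query}
    for round_number in range(1, max_iterations + 1):
        hits = search(included, database)
        new = hits - included
        if not new:  # converged: this round found nothing new
            return included, round_number
        included |= new  # the next round's profile is built from the larger set
    return included, max_iterations


# Toy data: each sequence is only detectable from its nearest relative.
similar = {"q": {"a"}, "a": {"b"}, "b": {"c"}, "c": set()}


def toy_search(profile_members, database):
    """Return everything similar to any sequence already in the profile."""
    found = set()
    for member in profile_members:
        found |= database.get(member, set())
    return found


members, rounds = iterative_search("q", similar, toy_search)
print(sorted(members), rounds)
```

In the toy data, sequence "c" is reachable only through "b", which is reachable only through "a": the same chaining that lets an iterative search find homologs a single-round search would miss.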
  • 13. Basic Examples with HMMER: jackhmmer output
      jackhmmer [options] <seqfile> <seqdb>: Iterative protein searches.
      > jackhmmer tutorial/HBB_HUMAN uniprot_sprot.fasta
      • The output tells you that the new alignment contains 936 sequences: your query plus 935 significant matches.
      • For round two, it has built a new model from this alignment.
      • After round two, many more globin sequences have been found.
      • After round five, the search ends because it reaches the default maximum of five iterations.
  • 14. Basic Examples with HMMER: hmmalign
      hmmalign [options] <hmmfile> <seqfile>: Creating multiple alignments.
      > hmmalign globins4.hmm tutorial/globins45.fasta
      • tutorial/globins45.fasta is a file with 45 unaligned globin sequences.
      • The output alignment is annotated with a posterior probability estimate.
  • 15. Smart(Hmm)er
      Create a tiny database:
      > hmmpress minifam
      > hmmscan minifam tutorial/7LESS_DROME
      The following three commands are identical:
      > hmmsearch globins4.hmm uniprot_sprot.fasta
      > cat globins4.hmm | hmmsearch - uniprot_sprot.fasta
      > cat uniprot_sprot.fasta | hmmsearch globins4.hmm -
      Fetch profiles and pipe them into a search:
      > hmmfetch --index Pfam-A.hmm
      > cat myqueries.list | hmmfetch -f Pfam.hmm - | hmmsearch - uniprot_sprot.fasta
      This takes a list of query profile names/accessions in myqueries.list, fetches them one by one from Pfam, and does an hmmsearch with each of them against UniProt.
  • 16. Latest Edition Features
      • DNA sequence comparison. HMMER now includes tools that are specifically designed for DNA/DNA comparison: nhmmer and nhmmscan. The most notable improvement over using HMMER3’s tools is the ability to search long (e.g. chromosome length) target sequences.
      • More sequence input formats. HMMER now handles a wide variety of input sequence file formats, both aligned (Stockholm, Aligned FASTA, Clustal, NCBI PSI-BLAST, PHYLIP, Selex, UCSC SAM A2M) and unaligned (FASTA, EMBL, Genbank), usually with autodetection.
      • MSV stage of HMMER acceleration pipeline now even faster. Bjarne Knudsen, Chief Scientific Officer of CLC bio in Denmark, contributed an important optimization of the MSV filter (the first stage in the accelerated “filter pipeline”) that increases overall HMMER3 speed by about two-fold. This speed improvement has no impact on sensitivity.
      • Web implementation of HMMER.
  • 18. Advantages/Disadvantages
      Advantages
      • The methods are consistent and therefore highly automatable, allowing us to make libraries of hundreds of profile HMMs and apply them on a very large scale to whole genome analysis.
      • HMMER can be used as a search tool for additional homologues.
      Disadvantages
      • HMMs do not capture any higher-order correlations. An HMM assumes that the identity of a particular position is independent of the identity of all other positions.
      • Profile HMMs are often not good models of structural RNAs, for instance, because an HMM cannot describe base pairs.
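The base-pair limitation on slide 18 follows directly from the independence assumption, and a small Python sketch shows it. The four two-column "sequences" are an invented toy alignment of perfectly base-paired positions, and the model is a bare position-independent frequency profile rather than a full profile HMM with insert and delete states.

```python
from collections import Counter

# Toy alignment: two columns that always form a Watson-Crick pair.
pairs = ["AU", "UA", "GC", "CG"]


def column_freqs(seqs, i):
    """Observed residue frequencies in column i, ignoring all other columns."""
    counts = Counter(s[i] for s in seqs)
    return {base: n / len(seqs) for base, n in counts.items()}


col0 = column_freqs(pairs, 0)
col1 = column_freqs(pairs, 1)


def independent_prob(seq):
    """Probability under a model that treats the two columns independently."""
    return col0.get(seq[0], 0.0) * col1.get(seq[1], 0.0)


print(independent_prob("AU"))  # a real base pair seen in the alignment
print(independent_prob("AA"))  # never seen, and not a base pair
```

Each column on its own looks uniform, so the model gives the unpaired "AA" exactly the same probability as the paired "AU"; the information that the two positions covary is lost.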
  • 20. Thank you!
      Algorithms in Molecular Biology. Information Technologies in Medicine and Biology.
      • Technological Education Institute of Athens, Department of Biomedical Engineering
      • National & Kapodistrian University of Athens, Department of Informatics
      • Biomedical Research Foundation, Academy of Athens
      • Demokritos National Center for Scientific Research