Phylogenetic Analysis and Identification of 1,4-Dioxane-Degrading Genes
Keith Sanders
May 9th, 2016
Dr. Iyer and Brian Iken
Abstract
1,4-Dioxane is a substance that was used as a solvent for other organic compounds.
Exposure to this compound can have numerous deleterious effects on living organisms, and it is
a suspected carcinogen. Originally, this substance was known only as an occupational hazard.
Unfortunately, 1,4-dioxane has also been found to contaminate groundwater. After a brief
analysis of the compound, bioremediation emerged as a key possibility for degrading the
substance in contaminated areas. A search for organisms able to do this led to the bacterium
Pseudonocardia dioxanivorans, which can degrade 1,4-dioxane thanks to its multicomponent
monooxygenase, a system in which several specific genes work together with the monooxygenase.
It is my hypothesis that organisms with a multicomponent monooxygenase system
phylogenetically similar to that of Pseudonocardia dioxanivorans will also be effective
1,4-dioxane degraders. After the particular genes of interest were identified, programs like
BLAST were used to find similar sequences within many biotechnology databases. Next, Clustal
Omega was used to create multiple sequence alignments, as well as output data that provided
further phylogenetic information about the sequences gathered. The results of the project
showed that while many organisms contained components of the monooxygenase or notable
biomarkers, three organisms provided sufficient evidence of being true 1,4-dioxane degraders.
These organisms are Rhodococcus sp. YYL, Pseudonocardia tetrahydrofuranoxydans, and
Pseudonocardia sp. ENV478.
Introduction
1,4-Dioxane was used as a solvent for numerous organic and inorganic compounds. This
compound is a clear, colorless liquid with an odor similar to ether. It is soluble in water and
is known to be highly flammable in both its liquid and vapor states. 1,4-Dioxane is hazardous
to humans. Short-term exposure, such as inhalation of this chemical, can cause minor ailments
such as dizziness and headaches, or major ailments such as irritation of the throat, lungs,
and eyes. 1,4-Dioxane can also be absorbed through the skin, causing mild to severe skin
irritation. Chronic exposure to 1,4-dioxane can be extremely detrimental, and even lethal.
Studies have shown that long-term exposure to 1,4-dioxane can damage the kidney and the
liver. Multiple studies that exposed rats to 1,4-dioxane in both their drinking water and as
vapor resulted in a number of rats suffering damage to the organs of their endocrine system
(Kasai, T., Kano, H., Umeda, Y., Sasaki, T., Ikawa, N., Nishizawa, T., . . . Fukushima, S. (2009)).
These rats also developed cancerous cells. These studies led 1,4-dioxane to be classified as a
probable human carcinogen. Typically, humans only come in contact with this substance
through occupational exposure. However, 1,4-dioxane has been detected as a contaminant in
both surface water and groundwater. 1,4-Dioxane is a very dangerous chemical, and
unfortunately it is also difficult to get rid of. The purpose of this study is to find
bio-degraders, organisms that can perform bioremediation by degrading one substance and
converting it into a different product. These bio-degraders are often favored for remediation
problems because they are easy to maintain and generally less harmful to the environment.
After performing a literature review on 1,4-dioxane, I started to search for literature
about organisms. More specifically, I was looking for genes that had the ability to degrade
this substance and the organisms they belong to. After reviewing articles, I discovered that a
key gene of interest was a monooxygenase component, MmoB/DmpM. This gene was reported in
the organism Pseudonocardia dioxanivorans strain CB1190. This monooxygenase was particularly
interesting because it did not require other organic substrates to degrade 1,4-dioxane
(Gedalanga, P. B., Pornwongthong, P., Mora, R., Chiang, S. D., Baldwin, B., Ogles, D., & Mahendra, S.
(2014)). Moreover, the monooxygenase found in Pseudonocardia dioxanivorans was part of a
multi-component gene cluster that aided its ability to degrade 1,4-dioxane. These components
included an alpha subunit, a beta subunit, and a reductase. The other genes in the complex
were evaluated and used to confirm or scrutinize the results; however, the monooxygenase
MmoB/DmpM was the target gene. Another article led me to examine biomarkers that showed
promise for identifying 1,4-dioxane degraders. This article provided me the means to search
for genes like phenol 2-monooxygenase and propane monooxygenase, which need specific
substrates to operate. It also guided me to look into alcohol dehydrogenase genes. Since there
was a lot of information on Pseudonocardia dioxanivorans, I used the genes from this organism
as a basis of comparison against new information (Gedalanga, P. B., Pornwongthong, P., Mora, R.,
Chiang, S. D., Baldwin, B., Ogles, D., & Mahendra, S. (2014)).
Going into the research project, I wanted to make sure I had enough information to
evaluate the results of my search. I gathered information about the degradation pathway in
order to get a better idea of which target genes and organisms to look into further
(Stevenson, E., & Turnbull, M. (2013, April 17)). This article also pointed me to different
avenues which could be revisited for additional experimentation.
In this project I will use principal bioinformatics techniques to approach and analyze
genes capable of degrading 1,4-dioxane. Starting the project, I already know of one organism
that can perform 1,4-dioxane degradation, so there are only a few possible outcomes. One is
that Pseudonocardia dioxanivorans is alone in its degradation ability; the other is that a
large multitude of organisms can perform this task. A compromise between the two possible
outcomes is that while the genes themselves are not exclusive to Pseudonocardia
dioxanivorans, there is a system at work in this organism which makes it more effective on a
critical level than most organisms. It is my hypothesis that organisms with monooxygenase
systems phylogenetically similar to that of Pseudonocardia dioxanivorans will also be
effective 1,4-dioxane degraders.
Materials
The materials used in this project consisted entirely of bioinformatics tools, namely
computational applications and databases. As a result, no chemical reagents were used.
Instead, many different computer applications and databases were used to explore the subject
material. Although the specifics of the hardware are not important, it is worth noting that
most of the work on this project was done at the College of Technology computer lab and the
library computer lab at the University of Houston.
The key materials of this project are the sequences. These sequences come in FASTA
format and are found at the National Center for Biotechnology Information (NCBI). FASTA, in
terms of this project, is a text-based format used to represent DNA, RNA, and protein
sequences. These sequences are kept in FASTA format because it is nearly universal among
bioinformatics applications. Most of the work done in this project was conducted on,
translated from, or produced in FASTA format.
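As an illustration of the format, the following is a minimal sketch of a FASTA reader. It assumes the usual layout of a header line beginning with ">" followed by one or more sequence lines. The function name `parse_fasta` and the two records are hypothetical and were made up for this example; they are not sequences from the project.

```python
from typing import Dict


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records: Dict[str, str] = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; keep the header text without the ">".
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped, so append them to the current record.
            records[header] += line.upper()
    return records


# Hypothetical records, for illustration only.
example = """>seq1 hypothetical monooxygenase component
ATGACCGA
TTAG
>seq2 hypothetical homolog
ATGACGGATTAG
"""

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```

Keeping each record as a plain header and sequence pair is what lets the same file be passed between BLAST, alignment programs, and tree builders.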
NCBI also plays a critical role in this process. NCBI is the central hub for many of the
databases used to produce the information that the project builds on. Other databases such
as PDB play a role in the analysis of the proteins encoded by the genes. NCBI contains the
PubMed database, which was used during most of the literature review. It also provides
accession numbers, which allow sequences to be retrieved and referenced across other databases.
ExPASy and EBI were also used as sources of applications. ExPASy is a bioinformatics
resource portal. It was used as a source for other bioinformatics applications, including
GENIO/LOGO, T-Coffee, and the PHYLIP tools. GENIO/LOGO was used to create the consensus
sequence logos. T-Coffee was the secondary tool used to create multiple sequence alignments,
and the PHYLIP tools are a set of programs ranging from DNA and protein sequence analysis to
phylogeny tree building. EBI stands for the European Bioinformatics Institute. It hosted many
of the programs used to produce the multiple sequence alignments (MSAs). The primary EBI
program used in the project was Clustal Omega. This program was able to create MSAs and
produce output in both visual and FASTA formats. Clustal Omega could also create phylogeny
trees and tree file output, which could be used in other programs.
The final program used was TreeView. This program can read tree files or PHYLIP tree
file output and convert them into visual images of the phylogeny tree. It also offers
different styles of phylogeny tree. Ultimately, this tool was used to create the phylogeny
trees seen in the Results section.
Methods
Literary Review
This project has three main goals. The first is to discover the identity of genes that
could degrade 1,4-dioxane. The second is to provide an MSA of the sequence of each gene in
question. The last is to construct a phylogenetic tree of the genes along with the organisms
that carry them.
To accomplish my first task, I conducted a literature review. This simply means that I
searched my resources to find publications pertaining to the scope of my study. In this case,
the goal was the identity of a gene that could degrade 1,4-dioxane. The identity of a gene and
organism was discovered using articles found on PubMed. PubMed is one of the many databases
located on the NCBI website. Likewise, the other NCBI databases, such as Gene, Protein,
Nucleotide, and GenBank, were used to find and record new sequences.
During my initial searches the organism.
Articles eventually led me to the discovery of the organism Pseudonocardia
dioxanivorans. More importantly, this led me to my first gene of interest, monooxygenase
component MmoB/DmpM. I found information on the gene using the Gene database located on
NCBI. With this information I was able to gather key features of the gene. The most important
of these features were its family identifiers and its FASTA sequence. While still using the
Gene database on NCBI, I found more genes related to the gene family MmoB/DmpM. The search
for these monooxygenase genes also led to the discovery of many different monooxygenases
which, unlike the monooxygenase component MmoB/DmpM, required different organic substrates
in order to perform degradation. The results of these searches included monooxygenases that
used propane, phenol, and toluene as substrates. My literature review led me to believe that
some substrate-dependent monooxygenases had tested as viable options while others had not.
BLAST
After I discovered all the genes I could using the NCBI search, I used the BLAST
algorithm to analyze each FASTA sequence and compare it with sequences in other databases.
BLAST stands for Basic Local Alignment Search Tool. This tool led me to a few genes that I
missed during my previous search using only the NCBI databases.
This tool also had parameters which allowed me to control my searches. When I searched
with the nucleotide sequence of a gene, I used the Megablast option. I also excluded models
and uncultured/environmental samples from my search because I felt it was important to the
project that I obtain non-hypothetical results. Furthermore, this search was conducted
against the nucleotide collection (nr/nt) database because it contained the largest source of
DNA sequence information. Whenever I used a protein sequence in BLAST, I used the protein
BLAST program with the DELTA-BLAST algorithm. These searches were run against the
UniProtKB/Swiss-Prot database, mostly because I was more familiar with this database. The
DELTA-BLAST option helped validate my results by excluding matches with low similarity.
As before, I also excluded models and uncultured/environmental samples from this search.
Multiple Sequence Alignments
After more genes were discovered and compiled, it was time to perform MSAs. To do
this I used EBI's Clustal Omega program. I kept the parameters at their defaults for all my
alignments. I broke my sequence analysis down in a way that allowed me to look at each type
of gene separately before compiling everything together. The first group examined was the
monooxygenase component MmoB/DmpM genes. The genes that followed were the individual
components of the gene cluster associated with the monooxygenase component MmoB/DmpM.
These genes include the alpha subunit, the beta subunit, and the reductase. The next genes
evaluated were the propane monooxygenase, the phenol 2-monooxygenase, and the alcohol
dehydrogenase. Once the MSAs were completed, the output was converted into identity matrix
scores, FASTA MSAs, visual MSAs, and phylogeny tree files. All four of these output files were
created by Clustal Omega. If a particular gene showed a score of 60% or below on the MSA
identity matrix, it was excluded from further processing. Exceptions were made when the DNA
score passed the threshold but the protein score failed it, or vice versa. Another program,
T-Coffee, was also used to conduct the MSAs. This program ran the sequences using its default
parameters. I thought that it was important to obtain a second opinion on the MSAs. Although
the algorithms used by T-Coffee and Clustal Omega may differ slightly, I was mostly looking
for huge changes in MSA scores rather than small ones. In the end I decided to stick with the
Clustal Omega MSAs.
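The 60% exclusion rule can be sketched in a few lines. This is only an illustration, not the calculation Clustal Omega performs internally: it assumes percent identity is counted over alignment columns where neither sequence has a gap, and the sequence names and residues are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two aligned sequences, ignoring gapped columns.

    Assumption: columns where either sequence has a gap ("-") are skipped.
    Real tools may define the denominator differently.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def passes_threshold(alignment, reference, threshold=60.0):
    """Return {name: True/False}; a score at or below the threshold fails."""
    ref = alignment[reference]
    return {
        name: percent_identity(ref, seq) > threshold
        for name, seq in alignment.items()
        if name != reference
    }


# Hypothetical aligned sequences, for illustration only.
alignment = {
    "target": "ATGACCGATT",
    "hitA": "ATGACGGATT",
    "hitB": "ATG-CCTTAA",
}

kept = passes_threshold(alignment, "target")
print(kept)
```

In this toy alignment, hitA differs from the target at one of ten positions and is kept, while hitB matches at only five of its nine ungapped positions and would be dropped.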
Phylogenetic Tree
After the MSAs were completed, it was time to move into the final phase of the analysis
work for the project. Phylogenetic trees were constructed using the sequences discovered.
Even though both Clustal Omega and T-Coffee produce a phylogenetic tree from their results, I
decided that I wanted to use a different program for this. Using the PHYLIP tree file from the
Clustal Omega output, I constructed phylogenetic trees with the TreeView program. Trees were
drawn using the default parameters of TreeView. The style in which the phylogenetic trees are
drawn is known as a phylogram. I started by making tree files for the gene groups
individually, as before. This means the first group of trees contained just the monooxygenase
component MmoB/DmpM genes, and the next contained the monooxygenase genes with substrates.
Finally, a phylogenetic tree using all the sequences I compiled was produced. The trees
produced by TreeView do not show distance values directly on the image. To compensate for
that, there is a distance scale in the bottom left-hand corner.
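The figure captions in the Results section state that the trees were created using "Average Distance % Identity". Assuming this refers to average-linkage (UPGMA) clustering on distances derived from percent identity, the idea can be sketched as follows. The function `upgma`, the labels, and the distances (taken here as 100 minus a made-up percent identity) are hypothetical, and this is not the code used by Clustal Omega or TreeView. The output is a Newick string, the kind of plain-text tree file that tree viewers read.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a UPGMA tree and return it as a Newick string.

    names: list of sequence labels.
    dist:  dict mapping frozenset({a, b}) -> distance between labels a and b.
    """
    # Each active cluster id maps to (newick_text, member_count, node_height).
    clusters = {name: (name, 1, 0.0) for name in names}
    d = dict(dist)
    next_id = 0
    while len(clusters) > 1:
        # Pick the closest pair of active clusters.
        a, b = min(combinations(sorted(clusters), 2), key=lambda p: d[frozenset(p)])
        text_a, size_a, height_a = clusters.pop(a)
        text_b, size_b, height_b = clusters.pop(b)
        # The new internal node sits at half the distance between the two clusters.
        height = d[frozenset((a, b))] / 2.0
        merged_id = f"#node{next_id}"
        next_id += 1
        # Average-linkage update: size-weighted mean of the two old distances.
        for other in clusters:
            d[frozenset((merged_id, other))] = (
                size_a * d[frozenset((a, other))] + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        text = f"({text_a}:{height - height_a:.2f},{text_b}:{height - height_b:.2f})"
        clusters[merged_id] = (text, size_a + size_b, height)
    text, _, _ = next(iter(clusters.values()))
    return text + ";"


# Hypothetical distances (100 minus percent identity), for illustration only.
labels = ["seqA", "seqB", "seqC", "seqD"]
distances = {
    frozenset(("seqA", "seqB")): 4.0,
    frozenset(("seqA", "seqC")): 10.0,
    frozenset(("seqB", "seqC")): 10.0,
    frozenset(("seqA", "seqD")): 30.0,
    frozenset(("seqB", "seqD")): 30.0,
    frozenset(("seqC", "seqD")): 30.0,
}

newick = upgma(labels, distances)
print(newick)
```

The branch lengths in the Newick string carry the distance information that a phylogram displays as horizontal branch length, which is why a single scale bar is enough to read distances off the image.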
Results
The project yielded results for the genes alcohol dehydrogenase, phenol
2-monooxygenase, propane monooxygenase, and multi-component monooxygenase MmoB/DmpM,
as well as the other components of the multi-component monooxygenase complex. These
additional components include an alpha subunit, a beta subunit, and a reductase. I thought it
was also important to evaluate the multi-component unit as a whole.
When applicable, results for both DNA and protein sequences are presented. However,
there were situations where either DNA or protein sequences could not be obtained.
Alcohol Dehydrogenase
Protein MSA Score:
This contains the score of each protein sequence used from the multiple sequence alignment.
Figure 1
Protein Multiple Sequence Alignment (See Attachment):
Figure 2
Protein sequence consensus Logo (See Attachment):
Using multiple amino acid residue sequences, a consensus sequence is created. A logo is used
to visually represent the sequence, where the height of the residue represents how often it
appears at the given position. The taller the residue, the more often it appears in that position.
Figure 3
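The idea behind the logo can be sketched as a per-column frequency count. This follows the description above, in which letter height tracks how often a residue appears; many logo programs instead scale letters by information content, so this is an approximation of the concept and not the GENIO/LOGO calculation. The function names and sequences are hypothetical.

```python
from collections import Counter


def column_frequencies(aligned):
    """Return, for each alignment column, a {residue: frequency} dictionary."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    profile = []
    for column in zip(*aligned):
        counts = Counter(column)
        total = len(column)
        profile.append({res: n / total for res, n in counts.items()})
    return profile


def consensus(profile):
    """Pick the most frequent residue per column (alphabetical on ties)."""
    return "".join(max(sorted(col), key=col.get) for col in profile)


# Hypothetical aligned DNA sequences, for illustration only.
aligned = ["ATGC", "ATGA", "ACGA", "ATGA"]

profile = column_frequencies(aligned)
print(consensus(profile))
print(profile[1])
```

A column with a single residue at frequency 1.0 corresponds to a full-height letter in the logo, while a column split evenly between two residues corresponds to two letters of equal height.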
Protein Phylogeny Tree (See Attachment):
This is a phylogeny tree created from the amino acid residue sequences. The phylogenetic trees
were created using the Average Distance % Identity.
Figure 4
Phenol 2-monooxygenase
DNA MSA Score:
This contains the score of each DNA nucleotide sequence used from the multiple sequence alignment.
Figure 5
DNA Multiple Sequence Alignment (See Attachment):
Figure 6
DNA nucleotide sequence consensus logo (See Attachment):
Using multiple DNA sequences, a consensus sequence is created. A logo is used to visually represent
the sequence, where the height of the residue represents how often it appears at the given position.
The taller the residue, the more often it appears in that position.
Figure 8
Protein MSA Score:
This contains the score of each protein sequence used from the multiple sequence alignment.
Figure 9
Protein sequence consensus Logo (See Attachment):
Using multiple amino acid residue sequences, a consensus sequence is created. A logo is used to
visually represent the sequence, where the height of the residue represents how often it appears at
the given position. The taller the residue, the more often it appears in that position.
Figure 10
Protein Multiple Sequence Alignment (See Attachment):
Figure 12
Propane Monooxygenase
DNA MSA Score:
This contains the score of each DNA nucleotide sequence used from the multiple sequence alignment.
Figure 13
DNA nucleotide sequence consensus logo (See Attachment):
Using multiple DNA sequences, a consensus sequence is created. A logo is used to visually represent
the sequence, where the height of the residue represents how often it appears at the given position.
The taller the residue, the more often it appears in that position.
Figure 15
Protein MSA Score:
This contains the score of each protein sequence used from the multiple sequence alignment.
Figure 17
Protein Multiple Sequence Alignment (See Attachment):
Figure 18
Protein sequence consensus Logo (See Attachment):
Using multiple amino acid residue sequences, a consensus sequence is created. A logo is used to
visually represent the sequence, where the height of the residue represents how often it appears at
the given position. The taller the residue, the more often it appears in that position.
Figure 19
Protein Phylogeny Tree (See Attachment):
This is a phylogeny tree created from the amino acid residue sequences. The phylogenetic trees were
created using the Average Distance % Identity.
Figure 20
Multi-component monooxygenase
DNA MSA Score:
This contains the score of each DNA nucleotide sequence used from the multiple sequence alignment.
Figure 21
DNA Multiple Sequence Alignment (See Attachment):
Figure 22
DNA nucleotide sequence consensus logo (See Attachment):
Using multiple DNA sequences, a consensus sequence is created. A logo is used to visually represent
the sequence, where the height of the residue represents how often it appears at the given position.
The taller the residue, the more often it appears in that position.
Figure 23
DNA Phylogeny Tree (See Attachment):
This is a phylogeny tree created from the DNA nucleotide sequences. The phylogenetic trees were
created using the Average Distance % Identity.
Figure 24
Protein MSA Score:
This contains the score of each protein sequence used from the multiple sequence alignment.
Figure 25
Protein Multiple Sequence Alignment (See Attachment):
Figure 26
Protein sequence consensus Logo (See Attachment):
Using multiple amino acid residue sequences, a consensus sequence is created. A logo is used to
visually represent the sequence, where the height of the residue represents how often it appears at
the given position. The taller the residue, the more often it appears in that position.
Figure 27
Protein Phylogeny Tree (See Attachment):
This is a phylogeny tree created from the amino acid residue sequences. The phylogenetic trees were
created using the Average Distance % Identity.
Figure 28
Alpha Subunit
DNA MSA Score:
This contains the score of each DNA nucleotide sequence used from the multiple sequence alignment.
Figure 29
DNA Multiple Sequence Alignment (See Attachment):
Figure 30
DNA nucleotide sequence consensus logo (See Attachment):
Using multiple DNA sequences, a consensus sequence is created. A logo is used to visually represent
the sequence, where the height of the residue represents how often it appears at the given position.
The taller the residue, the more often it appears in that position.
Figure 31
DNA Phylogeny Tree (See Attachment):
This is a phylogeny tree created from the DNA nucleotide sequences. The phylogenetic trees were
created using the Average Distance % Identity.
Figure 32
Protein MSA Score:
This contains the score of each protein sequence used from the multiple sequence alignment.
Figure 33
Protein Multiple Sequence Alignment (See Attachment):
Figure 34
Protein sequence consensus Logo (See Attachment):
Using multiple amino acid residue sequences, a consensus sequence is created. A logo is used to
visually represent the sequence, where the height of the residue represents how often it appears at
the given position. The taller the residue, the more often it appears in that position.
Figure 35
Protein Phylogeny Tree (See Attachment):
This is a phylogeny tree created from the amino acid residue sequences. The phylogenetic trees were
created using the Average Distance % Identity.
Figure 36
Beta Subunit
DNA MSA Score:
This contains the score of each DNA nucleotide sequence used from the multiple sequence alignment.
Figure 37
DNA Multiple Sequence Alignment (See Attachment):
Figure 38
DNA nucleotide sequence consensus logo (See Attachment):
Using multiple DNA sequences, a consensus sequence is created. A logo is used to visually represent
the sequence, where the height of the residue represents how often it appears at the given position.
The taller the residue, the more often it appears in that position.
Figure 39
DNA Phylogeny Tree (See Attachment):
This is a phylogeny tree created from the DNA nucleotide sequences. The phylogenetic trees were
created using the Average Distance % Identity.
Figure 40
Protein MSA Score:
This contains the score of each protein sequence used from the multiple sequence alignment.
Figure 41
Protein Multiple Sequence Alignment (See Attachment):
Figure 42
Protein sequence consensus Logo (See Attachment):
Using multiple amino acid residue sequences, a consensus sequence is created. A logo is used to
visually represent the sequence, where the height of the residue represents how often it appears at
the given position. The taller the residue, the more often it appears in that position.
Figure 43
Protein Phylogeny Tree (See Attachment):
This is a phylogeny tree created from the amino acid residue sequences. The phylogenetic trees were
created using the Average Distance % Identity.
Figure 44
Reductase
DNA MSA Score:
This contains the score of each DNA nucleotide sequence used from the multiple sequence alignment.
Figure 45
DNA Multiple Sequence Alignment (See Attachment):
Figure 46
DNA nucleotide sequence consensus logo (See Attachment):
Using multiple DNA sequences, a consensus sequence is created. A logo is used to visually represent
the sequence, where the height of the residue represents how often it appears at the given position.
The taller the residue, the more often it appears in that position.
Figure 47
DNA Phylogeny Tree (See Attachment):
This is a phylogeny tree created from the DNA nucleotide sequences. The phylogenetic trees were
created using the Average Distance % Identity.
Figure 48
Protein MSA Score:
This contains the score of each protein sequence used from the multiple sequence alignment.
Figure 49
Protein Multiple Sequence Alignment (See Attachment):
Figure 50
Protein sequence consensus Logo (See Attachment):
Using multiple amino acid residue sequences, a consensus sequence is created. A logo is used to
visually represent the sequence, where the height of the residue represents how often it appears at
the given position. The taller the residue, the more often it appears in that position.
Figure 51
Protein Phylogeny Tree (See Attachment):
This is a phylogeny tree created from the amino acid residue sequences. The phylogenetic trees were
created using the Average Distance % Identity.
Figure 52
Multi-component Gene Complex
DNA MSA Score:
This contains the score of each DNA nucleotide sequence used from the multiple sequence alignment.
Figure 53
DNA nucleotide sequence consensus logo (See Attachment):
Using multiple DNA sequences, a consensus sequence is created. A logo is used to visually represent
the sequence, where the height of the residue represents how often it appears at the given position.
The taller the residue, the more often it appears in that position.
Figure 54
DNA Phylogeny Tree (See Attachment):
This is a phylogeny tree created from the DNA nucleotide sequences. The phylogenetic trees were
created using the Average Distance % Identity.
Figure 55
Master Phylogeny Tree (See Attachment):
This is the phylogeny tree created from all the sequences collected during the project. It is
annotated with color for easier navigation.
The first tree contains the propane monooxygenase and the regular monooxygenase gene clusters
along with the phenol 2-monooxygenase. This tree does not include the alcohol dehydrogenase.
Figure 56
The last figure shows the complete phylogeny tree with all the sequences used for the project.
Figure 57
Conclusion
At the start of my project, I was able to find literature that led me to believe Pseudonocardia
dioxanivorans was an organism with the ability to degrade 1,4-dioxane. Furthermore, the
identity of the gene that made this possible was discovered. Monooxygenase MmoB/DmpM
was the target gene that started much of the research. After more research was conducted, it
was discovered that the monooxygenase MmoB/DmpM worked within a gene complex. This
complex contained a reductase, an alpha subunit, and a beta subunit. The complex was then
analyzed by its individual components, as well as a whole. Part of the reason for analyzing
the parts individually was to find different sequences that might be lost in the overall gene
cluster. When this was performed, only the alpha subunit of the monooxygenase yielded new
results.
The initial analysis provided me with the organisms Pseudonocardia sp. K1,
Pseudonocardia sp. ENV478, and Rhodococcus sp. YYL. It is also important to note that when
using DNA sequences, the organism was displayed as Pseudonocardia sp. K1, whereas when
using protein sequences it was listed under the name Pseudonocardia tetrahydrofuranoxydans.
Pseudonocardia tetrahydrofuranoxydans and Pseudonocardia sp. K1 are indeed the same
organism. Looking at the percent identity scores, you can see that these organisms have the
strongest identity scores with our target organism, Pseudonocardia dioxanivorans. The gene of
interest was analyzed along with the individual components of the gene complex and the
complex as a whole. For the monooxygenase complex, in both its individual components and
as a whole, the percent identity score never drops below 90%. This is a very strong indicator
of functional similarity. The percent identity score of the propane monooxygenase stayed
mostly in the 70% range. This implied an identity score strong enough to be relevant. Alcohol
dehydrogenase percent identity scores ranged from the high 60s to the mid-80s. This range
provided me with results significant enough to continue the project. The phenol
2-monooxygenase scores never rose above 70% but never fell below 60%. This, coupled with
the suggestion from my literature review to investigate them, is what kept these genes in for
further evaluation.
The alcohol dehydrogenase, phenol 2-monooxygenase, and propane monooxygenase
were all evaluated as well, using the same analytical techniques. The propane monooxygenase
was the only gene found to have a gene cluster similar to the primary monooxygenase gene
cluster. This prompted me to exclude propane monooxygenases that were not part of such a
cluster, because overall they had low percent identity or they were related to only one part
of the cluster and not to the gene cluster as a whole. The only reason the monooxygenase
alpha subunits were allowed to keep their single-component matches was that their percent
identity scores were still much too high to exclude. The overall identity scores of the
propane monooxygenase, phenol 2-monooxygenase, and alcohol dehydrogenase were high
enough to be significant, but not as high as those of the monooxygenase within the gene
cluster discussed above.
The consensus logo is a way to visualize the results of the MSA and the percent identity
score. For DNA, the gene cluster for the monooxygenase shared the strongest consistency,
with multiple positions matching at a 100% frequency. This was found in the individual
components. Oddly enough, while the gene cluster as a whole still holds a percent identity
score above 90%, its consensus sequence frequently varies between two different nucleotides.
In proteins, the consensus sequence varied. This variation was observed as at least two amino
acids sharing a 50% frequency each.
The phenol 2-monooxygenase showed both strong and mixed consensus in its DNA logo.
The protein logo showed mixed consensus, with three amino acids usually competing at a
position. The alcohol dehydrogenase showed strong consensus in its protein sequence. Many
positions had a 100% frequency. For the propane monooxygenase, a logo of the entire cluster
was created for the nucleotide sequences; however, since creating a logo of the same cluster
was problematic for proteins, only the propane monooxygenase itself was given a protein
logo. The DNA logo of the cluster shows plenty of conflicting consensus and very little 100%
frequency. The protein logo showed much stronger consensus among its sequences, with most
positions at a 50% frequency. The alcohol dehydrogenase only has a protein logo because not
all of the nucleotide sequences could be found. These sequences show strong consensus with
many 100% frequencies.
Multiple sequence alignments were also produced. In typical fashion, "*" represents a
completely conserved residue, ":" indicates a conserved residue, and "." represents a
semi-conserved residue. A blank represents a position with no kind of conservative match.
MSAs were conducted for all sequences; however, it was problematic to exhibit the MSAs for
the complete gene clusters of the monooxygenase and the propane monooxygenase due to the
enormous size of that data. Individual DNA and protein MSAs for each of the components of
the monooxygenase have been provided. Both the individual components and the gene cluster
show very strong fully conserved regions. This matches their percent identity scores. The
propane monooxygenase portion of its gene cluster has been provided. It shows a mixture of
fully conserved and, to a lesser extent, conserved regions. The phenol 2-monooxygenase and
the alcohol dehydrogenase show similar results, with a mix of fully conserved, conserved,
and semi-conserved regions, most of them fully conserved.
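The conservation symbols described above can be reproduced for a protein alignment with a short sketch. The residue groups listed below are the "strong" and "weak" amino acid groups as I recall them from the Clustal documentation; treat them as an assumption to verify against the official documentation. The function name and the three sequences are hypothetical.

```python
# Assumed Clustal-style residue groups (verify against the Clustal documentation).
STRONG = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND", "SNDEQK", "NDEQHK",
        "NEQHRK", "FVLIM", "HFY"]


def conservation_line(aligned):
    """Return one symbol per column of a protein alignment: * : . or blank."""
    symbols = []
    for column in zip(*aligned):
        residues = set(column)
        if "-" in residues:
            symbols.append(" ")  # gapped columns get no symbol
        elif len(residues) == 1:
            symbols.append("*")  # fully conserved
        elif any(residues <= set(group) for group in STRONG):
            symbols.append(":")  # conserved (strongly similar residues)
        elif any(residues <= set(group) for group in WEAK):
            symbols.append(".")  # semi-conserved (weakly similar residues)
        else:
            symbols.append(" ")  # no conservative match
    return "".join(symbols)


# Hypothetical aligned protein fragments, for illustration only.
aligned = ["MKCLG", "MRSIP", "MKSVW"]

line = conservation_line(aligned)
print(repr(line))
```

The ":" and "." symbols are only meaningful for protein alignments, since they depend on amino acid similarity groups; for DNA alignments only "*" and blank apply.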
Phylogeny trees were constructed for every type of gene; however, only results from
genes with more than three entries are provided. This is because a tree with three or
fewer entries gives little to no practical information, especially given the scope of this project.
For the monooxygenase, it is important to notice that our target organism,
Pseudonocardia dioxanivorans, and the genes associated with it are usually most closely related to
Rhodococcus sp. YYL. This can be observed in both the individual genes and the gene cluster. The
alcohol dehydrogenase genes show considerable diversity among themselves. An
important factor to notice is that a Pseudonocardia dioxanivorans entry was located among them. In
the phylogeny tree, this organism is more closely related to the other
Pseudonocardia than to the Rhodococcus. Comparing this to the monooxygenase tree may
suggest that while these two organisms have strong similarities within the monooxygenase genes,
there are still many areas where they differ. The final tree includes all the genes examined in
this project, based on their protein sequences. As before, Pseudonocardia dioxanivorans and
Rhodococcus sp. YYL are the most closely related to each other. The exception to this is
the alpha subunit. The alcohol dehydrogenase genes are isolated furthest away
from the rest of the genes. This could suggest that their role in 1, 4-dioxane degradation is
entirely different from that of the rest of the genes.
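The point that a tree with three or fewer entries carries little information can be checked with a short calculation. The number of distinct unrooted, fully bifurcating tree shapes for n labeled sequences is (2n - 5)!!, which equals 1 when n = 3, so three sequences can only ever be arranged one way. The sketch below evaluates this standard formula; the function name is an assumption made for illustration.

```python
def unrooted_topologies(n_taxa: int) -> int:
    """Number of unrooted, fully bifurcating topologies: (2n - 5)!!."""
    if n_taxa < 3:
        return 1  # one or two taxa allow only a single trivial arrangement
    count = 1
    # Multiply the odd numbers 3 * 5 * ... * (2n - 5).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, unrooted_topologies(n))
```

With four entries there are already three competing topologies, so the tree begins to discriminate between alternative groupings.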
The results of this tree prompted me to make another tree without the alcohol
dehydrogenase. This tree uses the entire gene cluster of the primary and propane
monooxygenases. I did this because I thought that the results for the separated components of
the gene complex were mostly redundant. These results showed that Pseudonocardia dioxanivorans
and Rhodococcus sp. YYL are still the closest in relation.
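As a rough illustration of what "closest in relation" means numerically, the sketch below computes a simple pairwise distance (the proportion of differing sites, often called the p-distance) and reports the pair of sequences with the smallest value. This is a simplification: the smallest raw distance does not always correspond to sister branches in a tree, and it is not the method used to build the trees in this project. The taxon names and sequences are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites among aligned, gap-free columns."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(1 for x, y in pairs if x != y) / len(pairs)


def closest_pair(aligned: Dict[str, str]) -> Tuple[str, str, float]:
    """Return the two names with the smallest p-distance, and that distance."""
    best = None
    for (n1, s1), (n2, s2) in combinations(aligned.items(), 2):
        d = p_distance(s1, s2)
        if best is None or d < best[2]:
            best = (n1, n2, d)
    if best is None:
        raise ValueError("need at least two sequences")
    return best


# Hypothetical aligned sequences, invented for illustration only.
toy = {
    "taxon_A": "ATGGCCAAGT",
    "taxon_B": "ATGGCCAAGA",
    "taxon_C": "ATGACGTAGA",
    "taxon_D": "TTGACGTAGC",
}

if __name__ == "__main__":
    print(closest_pair(toy))
```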
There were errors and limitations during the project. A major
limitation I faced was the number of sequences available. I had to work with the sequences that
were available in the databases and with my ability to find those sequences. This means that I
could have missed a sequence, or that there could be more organisms that are 1, 4-dioxane
degraders but whose genetic sequences are not yet available. Another source of
error could be human error on my part, specifically my judgment of what counts as valuable
information. I wanted to include sequences that I thought were relevant, but I fear that I might
have excluded some sequences based on my own exclusion criteria.
The information gained in this project has many useful applications. The first is that it
increases the number of organisms suspected to be 1, 4-dioxane degraders. While
more experiments are needed to evaluate their effectiveness, phylogenetic analysis does
provide evidence to support further study. Having multiple organisms that can perform this
task makes using them for bioremediation more feasible. Furthermore, while
researching 1, 4-dioxane degradation, I discovered articles about fungi that could perform this
task. This means that more information about different organisms with the ability to
perform this task may still be out there (Kinne, M., Poraj-Kobielska, M., Ralph, S. A., Ullrich, R.,
Hofrichter, M., & Hammel, K. E. (2009)). This information can also be compared with the
results of this project to examine the different or similar processes both kinds of organisms use
to degrade 1, 4-dioxane.
In summation, propane monooxygenase and phenol 2-monooxygenase have the ability to
degrade 1, 4-dioxane, but only when their particular substrates are available. This makes them
less optimal than the other monooxygenase examined in this project. Alcohol dehydrogenase is
the least related to any of the genes, which would suggest that its role in dioxane degradation is
not as direct as that of the other genes. The key result is that three organisms,
Pseudonocardia sp. K1, Pseudonocardia sp. ENV478, and Rhodococcus sp. YYL, have genes most
closely related to the gene of interest. The gene complex, which contains the alpha subunit,
beta subunit, reductase, and monooxygenase component, is what gives these organisms more
affinity for the degradation task. This supports the idea that these organisms are true
1, 4-dioxane degraders. Furthermore, the genes associated with Rhodococcus sp. YYL are the
most closely related to those of Pseudonocardia dioxanivorans.
References
1. Sales, C. M., Mahendra, S., Grostern, A., Parales, R. E., Goodwin, L. A., Woyke, T., . . . Alvarez-Cohen,
L. (2011). Genome Sequence of the 1,4-Dioxane-Degrading Pseudonocardia dioxanivorans Strain
CB1190. Journal of Bacteriology, 193(17), 4549-4550. doi:10.1128/jb.00415-11
2. Gedalanga, P. B., Pornwongthong, P., Mora, R., Chiang, S. D., Baldwin, B., Ogles, D., & Mahendra, S.
(2014). Identification of Biomarker Genes To Predict Biodegradation of 1,4-Dioxane. Applied and
Environmental Microbiology, 80(10), 3209-3218. doi:10.1128/aem.04162-13
3. 1,4-Dioxane (1,4-Diethyleneoxide). (n.d.). Retrieved April 30, 2016, from
https://www3.epa.gov/airtoxics/hlthef/dioxane.html
4. Kasai, T., Kano, H., Umeda, Y., Sasaki, T., Ikawa, N., Nishizawa, T., . . . Fukushima, S.
(2009). Two-year inhalation study of carcinogenicity and chronic toxicity of 1,4-dioxane in
male rats. Inhalation Toxicology, 21(11), 889-897. doi:10.1080/08958370802629610
5. Stevenson, E., & Turnbull, M. (2013, April 17). 1,4-Dioxane Pathway Map. Retrieved May
09, 2016, from http://eawag-bbd.ethz.ch/diox/diox_map.html
6. Kinne, M., Poraj-Kobielska, M., Ralph, S. A., Ullrich, R., Hofrichter, M., & Hammel, K. E.
(2009). Oxidative Cleavage of Diverse Ethers by an Extracellular Fungal
Peroxygenase. Journal of Biological Chemistry, 284(43), 29343-29349.
doi:10.1074/jbc.m109.040857
Attachments
Alcohol Dehydrogenase
Protein Multiple Sequence Alignment (See Attachment):
Figure 2
Protein sequence consensus logo (See Attachment):
Figure 3
Protein Phylogeny Tree (See Attachment):
Figure 4
Phenol 2-monooxygenase
DNA Multiple Sequence Alignment (See Attachment):
Figure 6
DNA nucleotide sequence consensus logo (See Attachment):
Figure 7
Protein Multiple Sequence Alignment (See Attachment):
Figure 10
Protein sequence consensus logo (See Attachment):
Figure 11
Propane Monooxygenase
Figure 15
Protein Multiple Sequence Alignment (See Attachment):
Figure 18
Protein sequence consensus logo (See Attachment):
Figure 19
Protein Phylogeny Tree (See Attachment):
Figure 20
Multi-component monooxygenase
DNA Multiple Sequence Alignment (See Attachment):
Figure 22
DNA nucleotide sequence consensus logo (See Attachment):
Figure 23
DNA Phylogeny Tree (See Attachment):
Figure 24
Protein Multiple Sequence Alignment (See Attachment):
Figure 26
Protein sequence consensus logo (See Attachment):
Figure 27
Protein Phylogeny Tree (See Attachment):
Figure 28
Alpha Subunit
DNA Multiple Sequence Alignment (See Attachment):
Figure 30
DNA nucleotide sequence consensus logo (See Attachment):
Figure 31
DNA Phylogeny Tree (See Attachment):
Figure 32
Protein Multiple Sequence Alignment (See Attachment):
Figure 34
Protein sequence consensus logo (See Attachment):
Figure 35
Protein Phylogeny Tree (See Attachment):
Figure 36
Beta Subunit
DNA Multiple Sequence Alignment (See Attachment):
Figure 38
DNA nucleotide sequence consensus logo (See Attachment):
Figure 39
DNA Phylogeny Tree (See Attachment):
Figure 40
Protein Multiple Sequence Alignment (See Attachment):
Figure 42
Protein sequence consensus logo (See Attachment):
Figure 43
Protein Phylogeny Tree (See Attachment):
Figure 44
Reductase
DNA Multiple Sequence Alignment (See Attachment):
Figure 46
DNA nucleotide sequence consensus logo (See Attachment):
Figure 47
DNA Phylogeny Tree (See Attachment):
Figure 48
Protein Multiple Sequence Alignment (See Attachment):
Figure 50
Protein sequence consensus logo (See Attachment):
Figure 51
Protein Phylogeny Tree (See Attachment):
Figure 52
Multi-component Gene complex
DNA nucleotide sequence consensus logo (See Attachment):
Figure 54
DNA Phylogeny Tree (See Attachment):
Figure 55
Master Phylogenic Tree
Figure 56
Figure 56
Master Sequence Collection
(Protein)
Propane Monooxygenase
>gi|323461805|dbj|BAJ76721.1| phenol and propane monooxygenase coupling protein [Mycobacterium goodii]
>gi|38678096|dbj|BAD03959.1| propane monooxygenase coupling protein [Gordonia sp. TY-5]
>gi|115511403|dbj|BAF34311.1| propane monooxygenase coupling protein [Pseudonocardia sp. TY-7]
Alcohol Dehydrogenase
>gi|503437687|ref|WP_013672348.1| alcohol dehydrogenase [Pseudonocardia dioxanivorans]
>gi|739178298|ref|WP_037042195.1| alcohol dehydrogenase [Pseudonocardia autotrophica]
>gi|502712806|ref|WP_012947904.1| alcohol dehydrogenase [Geodermatophilus obscurus]
>gi|655587148|ref|WP_028934354.1| alcohol dehydrogenase [Pseudonocardia spinosispora]
>gi|655567920|ref|WP_028921610.1| alcohol dehydrogenase [Pseudonocardia acaciae]
>gi|657222806|ref|WP_029336517.1| alcohol dehydrogenase [Geodermatophilaceae bacterium URHB0048]
>gi|655577420|ref|WP_028928735.1| alcohol dehydrogenase [Pseudonocardia asaccharolytica]
>gi|664302843|ref|WP_030832244.1| alcohol dehydrogenase [Streptomyces hygroscopicus]
>gi|739374752|ref|WP_037235711.1| alcohol dehydrogenase [Rhodococcus wratislaviensis]
>gi|1005622338|ref|WP_061698619.1| alcohol dehydrogenase [Rhodococcus sp. LB1]
>gi|522114292|ref|WP_020625501.1| alcohol dehydrogenase [Pseudonocardia sp. P2]
>gi|983563300|ref|WP_060713791.1| alcohol dehydrogenase [Pseudonocardia sp. HH130629-09]
>gi|517141860|ref|WP_018330678.1| alcohol dehydrogenase [Actinomycetospora chiangmaiensis]
>gi|652459636|ref|WP_026854440.1| alcohol dehydrogenase [Geodermatophilaceae bacterium URHB0062]
Alpha-Subunit
>gi|10443292|emb|CAC10506.1| alpha-subunit of multicomponent tetrahydrofuran monooxygenase [Pseudonocardia tetrahydrofuranoxydans]
>gi|338794148|gb|AEI99544.1| tetrahydrofuran monooxygenase oxygenase component alpha subunit [Pseudonocardia sp. ENV478]
>gi|193888337|gb|ACF28534.1| multicomponent tetrahydrofuran-degrading monooxygenase alhpa-subnit [Rhodococcus sp. YYL]
>gi|975830114|dbj|BAU36821.1| soluble di-iron monooxygenase alpha subunit, partial [Rhodococcus ruber]
>gi|975830092|dbj|BAU36810.1| soluble di-iron monooxygenase alpha subunit, partial [Pseudonocardia dioxanivorans]
>gi|975830110|dbj|BAU36819.1| soluble di-iron monooxygenase alpha subunit, partial [Pseudonocardia sp. D17]
Beta-Subunit
>gi|315936315|gb|ADU55885.1| ThmB [Rhodococcus sp. YYL]
>gi|10443295|emb|CAC10509.1| beta-subunit of multicomponent tetrahydrofuran monooxygenase [Pseudonocardia tetrahydrofuranoxydans]
>gi|338794150|gb|AEI99546.1| tetrahydrofuran monooxygenase oxygenase component beta subunit [Pseudonocardia sp. ENV478]
>gi|326955346|gb|AEA29039.1| methane/phenol/toluene hydroxylase (plasmid) [Pseudonocardia dioxanivorans CB1190]
Monooxygenase component
>gi|503969935|ref|WP_014203929.1| monooxygenase component MmoB/DmpM [Pseudonocardia dioxanivorans]
>gi|10443296|emb|CAC10510.1| regulatory protein of multicomponent tetrahydrofuran monooxygenase [Pseudonocardia tetrahydrofuranoxydans]
>gi|315936316|gb|ADU55886.1| ThmC [Rhodococcus sp. YYL]
>gi|338794151|gb|AEI99547.1| tetrahydrofuran monooxygenase coupling protein [Pseudonocardia sp. ENV478]
Phenol 2-monooxygenase
>gi|326949330|gb|AEA23027.1| Phenol 2-monooxygenase [Pseudonocardia dioxanivorans CB1190]
>gi|93354574|gb|ABF08663.1| Phenol hydroxylase P3 protein (Phenol 2-monooxygenase P3 component) [Cupriavidus metallidurans CH34]
>gi|187728549|gb|ACD29713.1| methane/phenol/toluene hydroxylase [Ralstonia pickettii 12J]
Reductase
>gi|10443294|emb|CAC10508.1| reductase component of multicomponent terahydrofuran monooxygenase [Pseudonocardia tetrahydrofuranoxydans]
>gi|338794149|gb|AEI99545.1| tetrahydrofuran monooxygenase reductase component [Pseudonocardia sp. ENV478]
>gi|193888338|gb|ACF28535.1| multicomponent terahydrofuran-degrading monooxygenase reductase component [Rhodococcus sp. YYL]
>gi|375129130|ref|YP_004991225.1| Ferredoxin--NAD(+) reductase (plasmid) [Pseudonocardia dioxanivorans CB1190]
(DNA)
Gene Cluster
>gi|10443289|emb|AJ296087.1| Pseudonocardia sp. K1 ORF y, thmS gene, thmA gene, ORF x, thmD gene, thmB gene, thmC gene, ORF Q, ORF Z and thmH gene
>gi|338794146|gb|HQ699618.1| Pseudonocardia sp. ENV478 tetrahydrofuran degradation gene cluster, complete sequence
>gi|315936312|gb|EU732588.2| Rhodococcus sp. YYL tetrahydrofuran-degrading gene cluster, partial sequence
>gb|CP002597.1|:28891-38076 Pseudonocardia dioxanivorans CB1190 plasmid pPSED02 genomic sequence
Alpha Subunit
>gi|10443289:2945-3326 Pseudonocardia sp. K1 ORF y, thmS gene, thmA gene, ORF x, thmD gene, thmB gene, thmC gene, ORF Q, ORF Z and thmH gene
>gb|HQ699618.1|:3382-3763 Pseudonocardia sp. ENV478 tetrahydrofuran degradation gene cluster, complete sequence
>gb|EU732588.2|:2916-3297 Rhodococcus sp. YYL tetrahydrofuran-degrading gene cluster, partial sequence
>gi|975830109|dbj|LC114144.1| Pseudonocardia sp. D17 SDIMO gene for soluble di-iron monooxygenase alpha subunit, partial cds
>gi|975830113|dbj|LC114146.1| Rhodococcus ruber SDIMO gene for soluble di-iron monooxygenase alpha subunit, partial cds, strain: T5
>gb|CP002597.1|:31833-32214 Pseudonocardia dioxanivorans CB1190 plasmid pPSED02 genomic sequence
Beta Subunit
>gi|10443289:5108-6148 Pseudonocardia sp. K1 ORF y, thmS gene, thmA gene, ORF x, thmD gene, thmB gene, thmC gene, ORF Q, ORF Z and thmH gene
>gb|HQ699618.1|:5547-6590 Pseudonocardia sp. ENV478 tetrahydrofuran degradation gene cluster, complete sequence
>gb|EU732588.2|:5083-6123 Rhodococcus sp. YYL tetrahydrofuran-degrading gene cluster, partial sequence
>gb|CP002597.1|:34003-35043 Pseudonocardia dioxanivorans CB1190 plasmid pPSED02 genomic sequence
Monooxygenase Component
>gb|CP002597.1|:35043-35396 Pseudonocardia dioxanivorans CB1190 plasmid pPSED02 genomic sequence
>gb|EU732588.2|:6123-6476 Rhodococcus sp. YYL tetrahydrofuran-degrading gene cluster, partial sequence
>gb|HQ699618.1|:6587-6940 Pseudonocardia sp. ENV478 tetrahydrofuran degradation gene cluster, complete sequence
>gi|10443289:6148-6501 Pseudonocardia sp. K1 ORF y, thmS gene, thmA gene, ORF x, thmD gene, thmB gene, thmC gene, ORF Q, ORF Z and thmH gene
Reductase
>gi|10443289:3995-5077 Pseudonocardia sp. K1 ORF y, thmS gene, thmA gene, ORF x, thmD gene, thmB gene, thmC gene, ORF Q, ORF Z and thmH gene
>gb|HQ699618.1|:4434-5516 Pseudonocardia sp. ENV478 tetrahydrofuran degradation gene cluster, complete sequence
>gb|EU732588.2|:3964-5052 Rhodococcus sp. YYL tetrahydrofuran-degrading gene cluster, partial sequence
>gi|375129105:32884-33972 Pseudonocardia dioxanivorans CB1190 plasmid pPSED02, complete sequence
Phenol 2-Monooxygenase
>gb|CP002593.1|:820783-822312 Pseudonocardia dioxanivorans CB1190, complete genome
>gb|CP000352.1|:1935574-1937121 Cupriavidus metallidurans CH34, complete genome
>gb|CP001069.1|:954525-956069 Ralstonia pickettii 12J chromosome 2, complete sequence
Propane Monooxygenase
>gi|115511399|dbj|AB250942.1| Pseudonocardia sp. TY-7 prm2A, prm2B, prm2C, prm2D genes for propane monooxygenase
>gi|38678092|dbj|AB112920.1| Gordonia sp. TY-5 prmA, prmB, prmC, prmD, orf1, orf2, adh1, orf3 genes for propane monooxygenase
>gi|323461801|dbj|AB568291.1| Mycobacterium goodii mimA, mimB, mimC, mimD genes, complete cds, strain: 12523
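The headers listed above follow two patterns: pipe-delimited identifiers (for example gi|...|dbj|...|) and identifiers followed by a :start-end coordinate range for a sub-region of a larger record. The sketch below shows one way such header lines could be split into an identifier and an optional range. The HeaderInfo type and parse_header function are hypothetical helpers written for this illustration, not part of any NCBI tool, and the parser only covers the patterns seen in this collection.

```python
import re
from typing import NamedTuple, Optional


class HeaderInfo(NamedTuple):
    identifier: str       # last non-empty pipe-delimited field
    start: Optional[int]  # 1-based start coordinate, if a range was given
    end: Optional[int]    # 1-based end coordinate, if a range was given


_RANGE = re.compile(r":(\d+)-(\d+)$")


def parse_header(line: str) -> HeaderInfo:
    """Split a FASTA header into its identifier and optional :start-end range."""
    if not line.startswith(">"):
        raise ValueError("FASTA header lines start with '>'")
    parts = line[1:].split(None, 1)
    if not parts:
        raise ValueError("empty FASTA header")
    token = parts[0]  # everything before the first whitespace
    start = end = None
    match = _RANGE.search(token)
    if match:
        start, end = int(match.group(1)), int(match.group(2))
        token = token[: match.start()]
    fields = [f for f in token.split("|") if f]
    return HeaderInfo(fields[-1], start, end)


if __name__ == "__main__":
    info = parse_header(">gb|CP002597.1|:28891-38076 plasmid pPSED02 genomic sequence")
    print(info.identifier, info.start, info.end)
    # Length of the extracted region, counting both end coordinates.
    print(info.end - info.start + 1)
```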

Phylogenetic Analysis and Identification Of Dioxane Degrader

  • 1. Phylogenetic Analysis and Identification of 1, 4-Dioxane Degrading Genes Keith Sanders May 9th 2016 Dr. Iyer and Brian Iken Abstract 1, 4-dioxane is a substance that was used as a solvent for other organic compounds. Exposure to this compound can have numerous deleterious effects on a living organisms and is suspected as a carcinogen. Originally, this substance was known as just an occupational hazard. Unfortunately, 1, 4-dioxane has also been found to contaminate ground water. After a brief analysis of the compound, bioremediation became a key possibility in the degradation of the substance in contaminated areas. In order to accomplish this, the bacterium Pseudonocardia Dioxanivorans was discovered. Pseudonocardia Dioxanivorans has the ability to degrade 1, 4- dioxane thanks to its multicomponent monooxygenase, which contained specific genes working together with the monooxygenase. It is my hypothesis that organisms with a multicomponent monooxygenase system phylogenetically similar Pseudonocardia Dioxanivorans will also be effective 1, 4-dioxane degraders. After the particular genes of interest were discovered, programs like BLAST were used to discover similar sequences within many biotechnology databases. Next Clustal Omega was used to create multiple sequence alignments, as well as output data which would provide further phylogenetic data from the sequences gathered. The results of the project showed that while many organism contained components of the monooxygenase or notable biomarkers, three notable organisms provided sufficient evidence of being true 1,4 dioxane degraders. These organisms are Rhodococcus sp. YYL, Pseudonocardia tetrahydrofuranoxydans, and Pseudonocardia sp. ENV478. Introduction 1, 4 dioxane was used as a solvent for numerous organic and inorganic compounds. This compound is a clear colorless liquid with an odor similar to ether. 1, 4 dioxane is also soluble in water. 
This compound is also known to be highly flammable in both its liquid and vapor state. 1, 4 dioxane is hazardous for humans. Short-term exposure such as inhalation of this chemical can cause minor ailments such as dizziness and headaches, or major aliments such as irritation of
  • 2. the throat, lungs, eyes. 1, 4 dioxane can also be absorbed through the skin causing mild to severe skin irritation. Chronic exposure to 1, 4 dioxane can be extremely detrimental, and even lethal. Studies have shown that long term exposure to 1, 4 dioxane can damage the kidney and the liver. Multiple studies using exposing rats to 1, 4 dioxane in both their drinking water and vapor resulting in a number of rats suffering damage to the organs in their endocrine system (Kasai, T., Kano, H.,Umeda, Y., Sasaki, T., Ikawa,N., Nishizawa, T., . . . Fukushima, S. (2009). These rats also developed cancerous cells. These studies lead 1, 4 dioxane to be classified as a probable human carcinogen. Typically, humans only come in contact with this substance as a part of occupational hazards. However, 1, 4 dioxane has been detected as a contaminant in both surface and ground water. 1, 4-dioxane is a very dangerous chemical and unfortunately it is also problematic to get rid of. The purpose of this study is to find bio-degraders, organisms which can perform bioremediation by degrading one substance and converting it into a different product. These bio-degraders are often favored for remediation problems because they are easy to maintain and generally less harmful to the environment. After performing a literary review on 1, 4 dioxane, I started to search for literature about organisms. More specifically I was looking for genes which had the ability to degrade this substance and the organisms they belong too. After reviewing articles I discovered that a key gene of interest was a monooxygenase component MmoB/DmpM. This gene was mentioned in the organism Pseudonocardia Dioxanivorans strain 1190. This monooxygenase was particularly interesting because it did not require other organic substrates to degrade 1,4-dioxane( Gedalanga, P. B.,Pornwongthong, P., Mora, R., Chiang, S. D.,Baldwin, B., Ogles, D.,& Mahendra, S. 
(2014) ) .Interesting moreover, the monooxygenase found in Pseudonocardia Dioxanivorans contained a multi-component gene cluster which aided in its ability to degrade 1,4 -dioxane. These components included things such as an alpha and beta subunits, a reductase. The other genes that were in the complexes were evaluated and used to confirm or scrutinize the results, however the monooxygenase MmoB/DmpM was the target gene. Another article led me to examine biomarkers which showed promise in being 1, 4-dioxane degraders. This article provided me the means to search for genes like phenol-2 monooxygenase and propane monooxygenase which needed specific substrates to operate. This article also guided me into
  • 3. looking into alcohol dehydrogenase genes. Since there was a lot of information on Pseudonocardia Dioxanivorans, I used the genes from this organism as a comparative measure against new information (Gedalanga, P. B.,Pornwongthong, P.,Mora, R., Chiang, S. D., Baldwin, B., Ogles, D.,& Mahendra, S. (2014)). Going into the research project, I wanted to make sure I had enough information to evaluate the results of my search. Information about the degradation pathway was discovered in order to get a better idea of target genes and organisms to further look into (Stevenson, E., & Turnbull, M. (2013, April 17). Thisarticle alsopointedme todifferent avenueswhichcouldbe revisited for additional experimentation. In this project I will use principle bioinformatics techniques to approach and analyze genes capable of degrading 1, 4-dioxane. Starting the project I already know of a select organism which can perform 1, 4-dioxane degradation so there are a few only possible outcomes. One is that Pseudonocardia Dioxanivorans is alone in its degradation ability while the other being that large multitude of organisms which can perform this task. A compromise between the two possible outcomes is that while the genes themselves are not exclusive to Pseudonocardia Dioxanivorans, there is a system at work in this organism which makes it more effective on a critical level than most organisms. It is my hypothesis that organisms with monooxygenase systems phylogenetically similar Pseudonocardia Dioxanivorans will also be effective 1, 4-dioxane degraders. Materials The materials used in this project were entirely composed of bioinformatics practices using computational applications and databases. As a result there weren’t any chemical reagents used. Instead, many different computer applications and databases were used to conduct and explore the subject material. 
Although the specifics of the hardware are not important, it is worth noting that most of the work on this project was done at the College of Technology computer lab and the library computer lab at the University of Houston.
  • 4. The key materials of this project are the sequences themselves. These sequences are in FASTA format and were obtained from the National Center for Biotechnology Information (NCBI). FASTA, in terms of this project, is a text-based format used to represent DNA, RNA, and protein sequences. The sequences are kept in FASTA format because it is accepted by nearly all bioinformatics applications. Most of the work in this project was conducted on, translated from, or produced in FASTA format. NCBI also plays a critical role in this process. It is the central hub for many of the databases that supplied the information this project builds on. Other databases such as PDB play a role in the analysis of the proteins encoded by the genes. NCBI hosts the PubMed database, which was used for most of the literature review. It also provides accession numbers, which allow sequences to be retrieved and cross-referenced across other databases. ExPASy and EBI were used as sources of applications. ExPASy is a bioinformatics resource portal. It was used as a source for other bioinformatics applications including GENIO/LOGO, T-Coffee, and the PHYLIP tools. GENIO/LOGO was used to create the consensus sequence logos. T-Coffee was the secondary tool used to create multiple sequence alignments, and PHYLIP is a set of programs for analyzing DNA and protein sequences and for building phylogenetic trees. EBI stands for the European Bioinformatics Institute. It hosts many of the programs used to produce the multiple sequence alignments (MSAs). The primary EBI program used in the project was Clustal Omega. This program was able to create MSAs and produce output in both visual and FASTA formats. Clustal Omega could also create phylogenetic trees and tree file output, which could be used in other programs. The final program used was TreeView.
This program can read tree files or PHYLIP tree file output and convert them into visual images of the phylogenetic tree. It also offers different tree styles. Ultimately, this tool was used to create the phylogenetic trees seen in the results section.
Methods
  • 5. Literature Review
This project has three main goals. The first is to identify genes which enable the degradation of 1,4-dioxane. The second is to produce an MSA of the sequences of the genes in question. The last is to construct a phylogenetic tree of the genes together with the organisms that carry them. To accomplish my first task, I conducted a literature review. This simply means that I searched my resources for publications within the scope of my study, in this case the identity of a gene which enables the degradation of 1,4-dioxane. The gene and organism were identified using articles found on PubMed, one of the many databases hosted on the NCBI website. Likewise, the other NCBI databases, such as Gene, Protein, Nucleotide, and GenBank, were used to find and record new sequences. During my initial searches, articles eventually led me to the organism Pseudonocardia dioxanivorans. More importantly, this led me to my first gene of interest, monooxygenase component MmoB/DmpM. I found information on the gene using the Gene database on NCBI. With this information I was able to gather key features of the gene. The most important of these were its family identifiers and its FASTA sequence. Still using the Gene database, I found more genes related to the MmoB/DmpM family. The search for these monooxygenase genes also led to the discovery of many different monooxygenases which, unlike the monooxygenase component MmoB/DmpM, depend on other organic substrates to operate. These included monooxygenases that use propane, phenol, and toluene as substrates. My literature review led me to believe that some of these substrate-dependent monooxygenases were viable candidates while others were not.
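The FASTA records gathered in this step can be read with a few lines of code. The following is a minimal sketch in Python and was not part of the original project; the function name read_fasta, the record names, and the short sequences are all made up for illustration.

```python
from typing import Iterator, Tuple


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical records; these are not sequences from the project.
example = """>seq_A hypothetical protein [Organism one]
MKTAY
IAKQR
>seq_B hypothetical protein [Organism two]
MKTAYLAKQR
"""

# Key each record by the first token of its header line.
records = {header.split()[0]: seq for header, seq in read_fasta(example)}
```

Keying the records by the first header token keeps the identifier separate from the free-text description, which is convenient when the same sequences are later passed to an alignment program.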
BLAST
After I had found all the genes I could using the NCBI search, I used the BLAST algorithm to compare each FASTA sequence with sequences in other databases. BLAST
  • 6. stands for Basic Local Alignment Search Tool. This tool led me to a few genes that I had missed during my previous search of the NCBI databases alone. It also has parameters which allowed me to control my searches. When I performed a search using the nucleotide sequence of a gene, I used the megablast option. I also excluded models and uncultured/environmental samples from my search because I felt it was important to the project that I obtain non-hypothetical results. Furthermore, this search was run against the nucleotide collection (nr/nt) database because it contained the largest source of DNA sequence information. Whenever I used a protein sequence, I used protein BLAST with the DELTA-BLAST algorithm. These searches were run against the UniProtKB/Swiss-Prot database, mostly because I was more familiar with it. The DELTA-BLAST option helped validate my results by excluding matches with low similarity. As before, I excluded models and uncultured/environmental samples from this search.
Multiple Sequence Alignments
After more genes were discovered and compiled, it was time to perform MSAs. To do this I used EBI's Clustal Omega program. I kept the parameters at their defaults for all my alignments. I broke my sequence analysis down so that I could look at each type of gene separately before compiling everything together. The first group examined was the monooxygenase component MmoB/DmpM genes. Next came the individual components of the gene cluster associated with the monooxygenase component MmoB/DmpM: the alpha subunit, the beta subunit, and the reductase. The last genes evaluated were the propane monooxygenase, the phenol 2-monooxygenase, and the alcohol dehydrogenase.
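An alignment of this kind is commonly summarized as pairwise percent identity, and a cutoff can then be applied to the scores. The sketch below is illustrative only: it assumes identity is counted over columns where neither sequence has a gap, which may differ from how Clustal Omega computes its identity matrix, and the sequence names and aligned strings are invented. The 60% cutoff is the one used for exclusion in this project.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned sequences; "ref" stands in for the target gene.
aligned = {
    "ref": "ATGC-ATGCA",
    "s1": "ATGCTATGCA",
    "s2": "TTGC-AAGGA",
    "s3": "TTTT-TTTTT",
}

THRESHOLD = 60.0  # sequences scoring 60% or below are excluded

scores = {
    name: percent_identity(aligned["ref"], seq)
    for name, seq in aligned.items()
    if name != "ref"
}
kept = [name for name, score in scores.items() if score > THRESHOLD]
```

Comparing every sequence against a single reference mirrors the way the target gene was used as the yardstick here; a full identity matrix would apply the same function to every pair.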
Once the MSAs were completed, the results were exported as identity matrix scores, FASTA MSAs, visual MSAs, and phylogenetic tree files. All four of these outputs were produced by Clustal Omega. If a particular gene scored 60% or below in the MSA identity matrix, it was excluded from further processing. Some exceptions were made when a sequence passed the threshold on its DNA score but failed it on its protein score, or vice versa. Another program, T-Coffee, was also used to produce MSAs. This program ran the sequences using its default parameters. I thought that it was important to
  • 7. provide a second opinion on the MSAs. Although the algorithms used by T-Coffee and Clustal Omega differ somewhat, I was mostly looking for large changes in MSA scores rather than small ones. In the end I decided to stick with the Clustal Omega MSAs.
Phylogenetic Tree
After the MSAs were completed, it was time to move into the final phase of the analysis. A phylogenetic tree was constructed from the sequences discovered. Even though both Clustal Omega and T-Coffee produce a phylogenetic tree from their results, I decided to use a different program for this. Using the PHYLIP tree file from the Clustal Omega output, I drew phylogenetic trees with the TreeView program. Trees were drawn using TreeView's default parameters. The style of tree produced is known as a phylogram. As before, I started by making trees of the gene groups individually. The first group of trees contained just the monooxygenase component MmoB/DmpM genes, and the next contained the substrate-dependent monooxygenase genes. Finally, a phylogenetic tree using all the sequences I compiled was produced. The trees drawn by TreeView do not display distance values on the image. To compensate for that, there is a distance scale in the bottom left-hand corner.
Results
The project yielded results for the genes alcohol dehydrogenase, phenol 2-monooxygenase, propane monooxygenase, and multi-component monooxygenase MmoB/DmpM, as well as the other components of the multi-component monooxygenase complex. These additional components are the alpha subunit, the beta subunit, and the reductase. I thought it was also important to evaluate the multi-component unit as a whole. When applicable, results for both DNA and protein sequences are presented. However, there were situations where either the DNA or the protein sequences could not be obtained.
Alcohol Dehydrogenase
Protein MSA Score: This contains the score of each protein sequence used from the multiple sequence alignment.
  • 8. Figure 1
Protein Multiple Sequence Alignment (See Attachment):
Figure 2
Protein sequence consensus logo (See Attachment): Using multiple amino acid residue sequences, a consensus sequence is created. A logo is used to visually represent the sequence, where the height of the residue represents its appearance at the given position. The taller the residue, the more often it appears in that position.
Figure 3
Protein Phylogeny Tree (See Attachment): This is a phylogeny tree created from the amino acid residue sequences. The phylogenetic trees were created using the Average Distance % Identity.
Figure 4
Phenol 2-monooxygenase
DNA MSA Score: This contains the score of each DNA nucleotide sequence used from the multiple sequence alignment.
Figure 5
DNA Multiple Sequence Alignment (See Attachment):
Figure 6
DNA nucleotide sequence consensus logo (See Attachment):
  • 9. Using multiple DNA sequences, a consensus sequence is created. A logo is used to visually represent the sequence, where the height of the residue represents its appearance at the given position. The taller the residue, the more often it appears in that position.
Figure 8
Protein MSA Score: This contains the score of each protein sequence used from the multiple sequence alignment.
Figure 9
Protein sequence consensus logo (See Attachment): Using multiple amino acid residue sequences, a consensus sequence is created. A logo is used to visually represent the sequence, where the height of the residue represents its appearance at the given position. The taller the residue, the more often it appears in that position.
Figure 10
Protein Multiple Sequence Alignment (See Attachment):
Figure 12
Propane Monooxygenase
DNA MSA Score: This contains the score of each DNA nucleotide sequence used from the multiple sequence alignment.
Figure 13
  • 10. DNA nucleotide sequence consensus logo (See Attachment): Using multiple DNA sequences, a consensus sequence is created. A logo is used to visually represent the sequence, where the height of the residue represents its appearance at the given position. The taller the residue, the more often it appears in that position.
Figure 15
Protein MSA Score: This contains the score of each protein sequence used from the multiple sequence alignment.
Figure 17
Protein Multiple Sequence Alignment (See Attachment):
Figure 18
Protein sequence consensus logo (See Attachment): Using multiple amino acid residue sequences, a consensus sequence is created. A logo is used to visually represent the sequence, where the height of the residue represents its appearance at the given position. The taller the residue, the more often it appears in that position.
Figure 19
Protein Phylogeny Tree (See Attachment): This is a phylogeny tree created from the amino acid residue sequences. The phylogenetic trees were created using the Average Distance % Identity.
Figure 20
Multi-component Monooxygenase
DNA MSA Score:
  • 11. This contains the score of each DNA nucleotide sequence used from the multiple sequence alignment.
Figure 21
DNA Multiple Sequence Alignment (See Attachment):
Figure 22
DNA nucleotide sequence consensus logo (See Attachment): Using multiple DNA sequences, a consensus sequence is created. A logo is used to visually represent the sequence, where the height of the residue represents its appearance at the given position. The taller the residue, the more often it appears in that position.
Figure 23
DNA Phylogeny Tree (See Attachment): This is a phylogeny tree created from the DNA nucleotide sequences. The phylogenetic trees were created using the Average Distance % Identity.
Figure 24
Protein MSA Score: This contains the score of each protein sequence used from the multiple sequence alignment.
  • 12. Figure 25
Protein Multiple Sequence Alignment (See Attachment):
Figure 26
Protein sequence consensus logo (See Attachment): Using multiple amino acid residue sequences, a consensus sequence is created. A logo is used to visually represent the sequence, where the height of the residue represents its appearance at the given position. The taller the residue, the more often it appears in that position.
Figure 27
Protein Phylogeny Tree (See Attachment): This is a phylogeny tree created from the amino acid residue sequences. The phylogenetic trees were created using the Average Distance % Identity.
Figure 28
Alpha Subunit
DNA MSA Score: This contains the score of each DNA nucleotide sequence used from the multiple sequence alignment.
  • 13. Figure 29
DNA Multiple Sequence Alignment (See Attachment):
Figure 30
DNA nucleotide sequence consensus logo (See Attachment): Using multiple DNA sequences, a consensus sequence is created. A logo is used to visually represent the sequence, where the height of the residue represents its appearance at the given position. The taller the residue, the more often it appears in that position.
Figure 31
DNA Phylogeny Tree (See Attachment): This is a phylogeny tree created from the DNA nucleotide sequences. The phylogenetic trees were created using the Average Distance % Identity.
Figure 32
Protein MSA Score: This contains the score of each protein sequence used from the multiple sequence alignment.
Figure 33
  • 14. Protein Multiple Sequence Alignment (See Attachment):
Figure 34
Protein sequence consensus logo (See Attachment): Using multiple amino acid residue sequences, a consensus sequence is created. A logo is used to visually represent the sequence, where the height of the residue represents its appearance at the given position. The taller the residue, the more often it appears in that position.
Figure 35
Protein Phylogeny Tree (See Attachment): This is a phylogeny tree created from the amino acid residue sequences. The phylogenetic trees were created using the Average Distance % Identity.
Figure 36
Beta Subunit
DNA MSA Score: This contains the score of each DNA nucleotide sequence used from the multiple sequence alignment.
Figure 37
DNA Multiple Sequence Alignment (See Attachment):
Figure 38
DNA nucleotide sequence consensus logo (See Attachment): Using multiple DNA sequences, a consensus sequence is created. A logo is used to visually represent the sequence, where the height of the residue represents its appearance at the given position. The taller the residue, the more often it appears in that position.
Figure 39
  • 15. DNA Phylogeny Tree (See Attachment): This is a phylogeny tree created from the DNA nucleotide sequences. The phylogenetic trees were created using the Average Distance % Identity.
Figure 40
Protein MSA Score: This contains the score of each protein sequence used from the multiple sequence alignment.
Figure 41
Protein Multiple Sequence Alignment (See Attachment):
Figure 42
Protein sequence consensus logo (See Attachment): Using multiple amino acid residue sequences, a consensus sequence is created. A logo is used to visually represent the sequence, where the height of the residue represents its appearance at the given position. The taller the residue, the more often it appears in that position.
Figure 43
Protein Phylogeny Tree (See Attachment): This is a phylogeny tree created from the amino acid residue sequences. The phylogenetic trees were created using the Average Distance % Identity.
Figure 44
Reductase
  • 16. DNA MSA Score: This contains the score of each DNA nucleotide sequence used from the multiple sequence alignment.
Figure 45
DNA Multiple Sequence Alignment (See Attachment):
Figure 46
DNA nucleotide sequence consensus logo (See Attachment): Using multiple DNA sequences, a consensus sequence is created. A logo is used to visually represent the sequence, where the height of the residue represents its appearance at the given position. The taller the residue, the more often it appears in that position.
Figure 47
DNA Phylogeny Tree (See Attachment): This is a phylogeny tree created from the DNA nucleotide sequences. The phylogenetic trees were created using the Average Distance % Identity.
Figure 48
Protein MSA Score: This contains the score of each protein sequence used from the multiple sequence alignment.
  • 17. Figure 49
Protein Multiple Sequence Alignment (See Attachment):
Figure 50
Protein sequence consensus logo (See Attachment): Using multiple amino acid residue sequences, a consensus sequence is created. A logo is used to visually represent the sequence, where the height of the residue represents its appearance at the given position. The taller the residue, the more often it appears in that position.
Figure 51
Protein Phylogeny Tree (See Attachment): This is a phylogeny tree created from the amino acid residue sequences. The phylogenetic trees were created using the Average Distance % Identity.
Figure 52
Multi-component Gene Complex
DNA MSA Score: This contains the score of each DNA nucleotide sequence used from the multiple sequence alignment.
Figure 53
DNA nucleotide sequence consensus logo (See Attachment):
  • 18. Using multiple DNA sequences, a consensus sequence is created. A logo is used to visually represent the sequence, where the height of the residue represents its appearance at the given position. The taller the residue, the more often it appears in that position.
Figure 54
DNA Phylogeny Tree (See Attachment): This is a phylogeny tree created from the DNA nucleotide sequences. The phylogenetic trees were created using the Average Distance % Identity.
Figure 55
Master Phylogeny Tree (See Attachment): This is the phylogeny tree created from all the sequences collected during the project. It is annotated with color for easier navigation. The first tree contains the propane monooxygenase and the primary monooxygenase gene clusters along with the phenol 2-monooxygenase. This tree does not include the alcohol dehydrogenase.
Figure 56
The last figure shows the complete phylogeny tree with all the sequences used for the project.
Figure 57
Conclusion
At the start of my project, I found literature indicating that Pseudonocardia dioxanivorans is an organism with the ability to degrade 1,4-dioxane. Furthermore, the gene that makes this possible was identified. Monooxygenase component MmoB/DmpM was the target gene which started much of the research. After more research, I found that the monooxygenase component MmoB/DmpM works within a gene complex. This complex also contains a reductase, an alpha subunit, and a beta subunit. The complex was then analyzed both as individual components and as a whole. Part of the reason for analyzing parts
  • 19. individually was to find sequences that might be missed when searching with the overall gene cluster. When this was done, only the alpha subunit of the monooxygenase yielded new results. The initial analysis provided me with the organisms Pseudonocardia sp. K1, Pseudonocardia sp. ENV478, and Rhodococcus sp. YYL. It is also important to note that in the DNA sequence results the organism is listed as Pseudonocardia sp. K1, whereas in the protein results it is listed as Pseudonocardia tetrahydrofuranoxydans. These are the same organism. Looking at the percent identity scores, you can see that these organisms have the strongest identity with our target organism, Pseudonocardia dioxanivorans. The gene of interest was analyzed along with the individual components of the gene complex and the complex as a whole. For the monooxygenase complex, both as individual components and as a whole, the percent identity score never drops below 90%. This is a very strong indicator of functional similarity. The percent identity scores of the propane monooxygenase stayed mostly in the 70% range, which is strong enough to be relevant. Alcohol dehydrogenase percent identity scores ranged from the high 60s to the mid-80s. This range gave me results significant enough to continue with. The phenol 2-monooxygenase scores never rose above 70% but never fell below 60%. This, coupled with the suggestion from my literature review to investigate them, is what kept these genes in for further evaluation. The alcohol dehydrogenase, phenol 2-monooxygenase, and propane monooxygenase were all evaluated using the same analytical techniques. The propane monooxygenase was the only gene found to have a gene cluster similar to the primary monooxygenase gene cluster.
This prompted me to exclude propane monooxygenases that were not part of such a cluster, because overall they had low percent identity, or they were related to only one part of the cluster and not to the gene cluster as a whole. The only reason the monooxygenase alpha subunits were allowed to keep their single-component matches was that their percent identity scores were still much too high to exclude. The overall identity scores of the propane monooxygenase, phenol 2-monooxygenase, and alcohol
  • 20. dehydrogenase were high enough to be significant, but not as high as those of the monooxygenase gene cluster discussed above. The consensus logo is a way to visualize the results of the MSA and the percent identity score. In DNA, the gene cluster for the monooxygenase shared the strongest consistency, with multiple positions matching at a 100% frequency. This was found in the individual components. Oddly enough, while the gene cluster as a whole still holds a percent identity score above 90%, its consensus sequence frequently varies between two different nucleotides. In proteins, the consensus sequence varied, with at least two amino acids sharing a 50% frequency each. The phenol 2-monooxygenase showed a mix of strong and mixed consensus in its DNA logo. Its protein logo showed mixed consensus, with three amino acids usually competing at a position. The alcohol dehydrogenase showed strong consensus in its protein sequence, with many positions at a 100% frequency. For the propane monooxygenase, a logo of the entire cluster was produced from the nucleotide sequences; however, since creating a logo of the same cluster was problematic for the proteins, only the propane monooxygenase itself was given a protein logo. The DNA logo of the cluster shows plenty of conflicting consensus and very little 100% frequency. The protein logo showed much stronger consensus among its sequences, with most positions at a 50% frequency. The alcohol dehydrogenase only has a protein logo because not all of its nucleotide sequences could be found. Multiple sequence alignments were also produced. In typical fashion, a "*" represents a completely conserved residue, a ":" indicates a conserved residue, and a "." represents a semi-conserved residue. A blank represents a position with no conserved match.
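The conservation symbols described above can be tallied to give a rough numerical summary of an alignment. This is a minimal sketch rather than a step that was performed in the project; the function name summarize_conservation and the example line are hypothetical, and the input is assumed to be only the part of the symbol line that sits under the aligned columns.

```python
from collections import Counter

# Meaning of each symbol on the conservation line of a Clustal-style alignment.
SYMBOLS = {
    "*": "fully conserved",
    ":": "conserved",
    ".": "semi-conserved",
    " ": "no conservation",
}


def summarize_conservation(line: str) -> dict:
    """Count how many alignment columns fall into each conservation class."""
    counts = Counter(line)
    unknown = set(counts) - set(SYMBOLS)
    if unknown:
        raise ValueError(f"unexpected symbols: {sorted(unknown)}")
    return {label: counts.get(symbol, 0) for symbol, label in SYMBOLS.items()}


# Hypothetical conservation line covering eight alignment columns.
example_line = "**:*. **"
summary = summarize_conservation(example_line)
fraction_full = summary["fully conserved"] / len(example_line)
```

A high fraction of fully conserved columns would be expected for groups such as the monooxygenase components, whose percent identity scores stay above 90%.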
MSAs were produced for all sequences; however, it was impractical to present the MSAs for the complete gene clusters of the monooxygenase and the propane monooxygenase because of the enormous size of that data. Individual DNA and protein MSAs for each component of the monooxygenase have been provided. Both the individual components and the gene cluster show very strong fully conserved regions. This matches their percent identity scores. The propane monooxygenase
  • 21. portion of its gene cluster has been provided. It shows a mixture of fully conserved and, to a lesser extent, conserved regions. The phenol 2-monooxygenase and the alcohol dehydrogenase show similar results, with a mix of fully conserved, conserved, and semi-conserved regions, most of them fully conserved. Phylogenetic trees were constructed for every type of gene; however, only results from genes with more than three entries are provided. This is because a tree with three or fewer entries gives little to no practical information, especially within the scope of this project. For the monooxygenase, it is important to notice that the genes of our target organism, Pseudonocardia dioxanivorans, are usually most closely related to those of Rhodococcus sp. YYL. This can be observed in both the individual genes and the gene cluster. The alcohol dehydrogenase genes show considerable diversity among themselves. An important point is that an alcohol dehydrogenase from Pseudonocardia dioxanivorans was located. If you look at the phylogeny tree, you can see that it is more closely related to those of other Pseudonocardia than to those of Rhodococcus. Comparing this to the monooxygenase tree may suggest that while these two organisms have strong similarities in the monooxygenase genes, there are still many respects in which they differ. The final tree shows all the genes examined in this project, based on their protein sequences. As before, Pseudonocardia dioxanivorans and Rhodococcus sp. YYL are the most closely related to each other. The exception to this is the alpha subunit. The alcohol dehydrogenase genes are placed furthest away from the rest of the genes. This could suggest that their role in 1,4-dioxane degradation is entirely different from that of the other genes. The results of this tree prompted me to make another tree without the alcohol dehydrogenase. This tree uses the entire gene clusters of the primary and propane monooxygenases.
I did this because I thought that the results for the gene complex split into its components were mostly redundant. These results showed that Pseudonocardia dioxanivorans and Rhodococcus sp. YYL are still the most closely related. There were errors and limitations during the project. A major limitation I faced was the number of sequences available. I had to work with sequences that
  • 22. were available in the databases, and within my ability to find those sequences. This means that I could have missed a sequence, or that there could be more organisms which can degrade 1,4-dioxane but whose genetic sequences are not yet available. Another source of error could be human error, specifically my judgment of what counts as valuable information. I wanted to include sequences that I thought were relevant, but I fear that I might have wrongly excluded some sequences based on my own exclusion criteria. The information gained in this project has many useful applications. The first is that it increases the number of organisms suspected to be 1,4-dioxane degraders. While more experiments are needed to evaluate their effectiveness, phylogenetic analysis does provide evidence to support further study. Having multiple organisms which can perform this task makes using them for bioremediation more feasible. Furthermore, while researching 1,4-dioxane degradation, I found articles about fungi which could perform this task. This means that more information about different organisms with this ability may still be out there (Kinne, M., Poraj-Kobielska, M., Ralph, S. A., Ullrich, R., Hofrichter, M., & Hammel, K. E. (2009)). This information can also be evaluated alongside the results of this project to examine the different or similar processes by which both kinds of organisms degrade 1,4-dioxane. In summation, propane monooxygenase and phenol 2-monooxygenase have the ability to degrade 1,4-dioxane, but only when their particular substrates are available. This makes them less optimal than the other monooxygenase examined in this project. Alcohol dehydrogenase is the least related to any of the other genes, which would suggest that its role in dioxane degradation is not as direct. The key piece of information obtained from the results is that three organisms, Pseudonocardia sp. K1, Pseudonocardia sp.
ENV478, and Rhodococcus sp. YYL, have genes most closely related to the gene of interest. The gene complex, which contains the monooxygenase component as well as the alpha subunit, beta subunit, and reductase, is what gives these organisms more affinity for the degradation task. This supports the idea that these organisms are true 1,4-dioxane degraders. Furthermore, of the three, the genes of Rhodococcus sp. YYL are the most closely related to those of Pseudonocardia dioxanivorans.
  • 23. References
1. Sales, C. M., Mahendra, S., Grostern, A., Parales, R. E., Goodwin, L. A., Woyke, T., . . . Alvarez-Cohen, L. (2011). Genome Sequence of the 1,4-Dioxane-Degrading Pseudonocardia dioxanivorans Strain CB1190. Journal of Bacteriology, 193(17), 4549-4550. doi:10.1128/jb.00415-11
2. Gedalanga, P. B., Pornwongthong, P., Mora, R., Chiang, S. D., Baldwin, B., Ogles, D., & Mahendra, S. (2014). Identification of Biomarker Genes To Predict Biodegradation of 1,4-Dioxane. Applied and Environmental Microbiology, 80(10), 3209-3218. doi:10.1128/aem.04162-13
3. 1,4-Dioxane (1,4-Diethyleneoxide). (n.d.). Retrieved April 30, 2016, from https://www3.epa.gov/airtoxics/hlthef/dioxane.html
4. Kasai, T., Kano, H., Umeda, Y., Sasaki, T., Ikawa, N., Nishizawa, T., . . . Fukushima, S. (2009). Two-year inhalation study of carcinogenicity and chronic toxicity of 1,4-dioxane in male rats. Inhalation Toxicology, 21(11), 889-897. doi:10.1080/08958370802629610
5. Stevenson, E., & Turnbull, M. (2013, April 17). 1,4-Dioxane Pathway Map. Retrieved May 09, 2016, from http://eawag-bbd.ethz.ch/diox/diox_map.html
6. Kinne, M., Poraj-Kobielska, M., Ralph, S. A., Ullrich, R., Hofrichter, M., & Hammel, K. E. (2009). Oxidative Cleavage of Diverse Ethers by an Extracellular Fungal Peroxygenase. Journal of Biological Chemistry, 284(43), 29343-29349. doi:10.1074/jbc.m109.040857
Attachments
Alcohol Dehydrogenase
Protein Multiple Sequence Alignment (See Attachment)
  • 26. Protein Phylogeny Tree (See Attachment):
Figure 4
Phenol 2-monooxygenase
DNA Multiple Sequence Alignment (See Attachment):
  • 28. DNA nucleotide sequence consensus logo (See Attachment):
Figure 7
  • 33. Protein Multiple Sequence Alignment (See Attachment):
Figure 18
Protein sequence consensus logo (See Attachment):
  • 34. Figure 19
Protein Phylogeny Tree (See Attachment):
  • 35. Figure 20
Multi-component Monooxygenase
DNA Multiple Sequence Alignment (See Attachment):
Figure 22
DNA nucleotide sequence consensus logo (See Attachment):
  • 36. Figure 23
DNA Phylogeny Tree (See Attachment):
  • 37. Figure 24
Protein Multiple Sequence Alignment (See Attachment):
Figure 26
Protein sequence consensus logo (See Attachment):
  • 38. Figure 27
Protein Phylogeny Tree (See Attachment):
  • 39. Figure 28
Alpha Subunit
DNA Multiple Sequence Alignment (See Attachment):
  • 40. Figure 30
DNA nucleotide sequence consensus logo (See Attachment):
  • 41.
  • 42. Figure 31
DNA Phylogeny Tree (See Attachment):
Figure 32
Protein Multiple Sequence Alignment (See Attachment):
  • 45. Protein Phylogeny Tree (See Attachment):
Figure 36
Beta Subunit
DNA Multiple Sequence Alignment (See Attachment):
  • 46.
  • 47. Figure 38
DNA nucleotide sequence consensus logo (See Attachment):
  • 48. Figure 39
DNA Phylogeny Tree (See Attachment):
  • 49. Figure 40
Protein Multiple Sequence Alignment (See Attachment):
  • 52. Protein Phylogeny Tree (See Attachment):
Figure 44
Reductase
DNA Multiple Sequence Alignment (See Attachment):
  • 53.
  • 54. Figure 46
DNA nucleotide sequence consensus logo (See Attachment):
  • 55. Figure 47
DNA Phylogeny Tree (See Attachment):
  • 56. Figure 48
Protein Multiple Sequence Alignment (See Attachment):
  • 59. Protein Phylogeny Tree (See Attachment):
Figure 52
Multi-component Gene Complex
DNA nucleotide sequence consensus logo (See Attachment):
  • 61. DNA Phylogeny Tree (See Attachment):
Figure 55
  • 63. Figure 56 Master Sequence Collection (Protein) Propane Monooxygenase >gi|323461805|dbj|BAJ76721.1| phenol andpropane monooxygenasecouplingprotein [Mycobacteriumgoodii] >gi|38678096|dbj|BAD03959.1| propane monooxygenase couplingprotein[Gordoniasp.TY-5] >gi|115511403|dbj|BAF34311.1| propane monooxygenase couplingprotein[Pseudonocardiasp.TY-7] Alcohol Dehydrogenase >gi|503437687|ref|WP_013672348.1| alcohol dehydrogenase[Pseudonocardiadioxanivorans] >gi|739178298|ref|WP_037042195.1| alcohol dehydrogenase[Pseudonocardiaautotrophica] >gi|502712806|ref|WP_012947904.1| alcohol dehydrogenase[Geodermatophilusobscurus] >gi|655587148|ref|WP_028934354.1| alcohol dehydrogenase[Pseudonocardiaspinosispora] >gi|655567920|ref|WP_028921610.1| alcohol dehydrogenase[Pseudonocardiaacaciae]
  • 64. >gi|657222806|ref|WP_029336517.1| alcohol dehydrogenase [Geodermatophilaceae bacterium URHB0048]
    >gi|655577420|ref|WP_028928735.1| alcohol dehydrogenase [Pseudonocardia asaccharolytica]
    >gi|664302843|ref|WP_030832244.1| alcohol dehydrogenase [Streptomyces hygroscopicus]
    >gi|739374752|ref|WP_037235711.1| alcohol dehydrogenase [Rhodococcus wratislaviensis]
    >gi|1005622338|ref|WP_061698619.1| alcohol dehydrogenase [Rhodococcus sp. LB1]
    >gi|522114292|ref|WP_020625501.1| alcohol dehydrogenase [Pseudonocardia sp. P2]
    >gi|983563300|ref|WP_060713791.1| alcohol dehydrogenase [Pseudonocardia sp. HH130629-09]
    >gi|517141860|ref|WP_018330678.1| alcohol dehydrogenase [Actinomycetospora chiangmaiensis]
    >gi|652459636|ref|WP_026854440.1| alcohol dehydrogenase [Geodermatophilaceae bacterium URHB0062]
    Alpha-Subunit
    >gi|10443292|emb|CAC10506.1| alpha-subunit of multicomponent tetrahydrofuran monooxygenase [Pseudonocardia tetrahydrofuranoxydans]
    >gi|338794148|gb|AEI99544.1| tetrahydrofuran monooxygenase oxygenase component alpha subunit [Pseudonocardia sp. ENV478]
    >gi|193888337|gb|ACF28534.1| multicomponent tetrahydrofuran-degrading monooxygenase alhpa-subnit [Rhodococcus sp. YYL]
    >gi|975830114|dbj|BAU36821.1| soluble di-iron monooxygenase alpha subunit, partial [Rhodococcus ruber]
    >gi|975830092|dbj|BAU36810.1| soluble di-iron monooxygenase alpha subunit, partial [Pseudonocardia dioxanivorans]
    >gi|975830110|dbj|BAU36819.1| soluble di-iron monooxygenase alpha subunit, partial [Pseudonocardia sp. D17]
    Beta-Subunit
    >gi|315936315|gb|ADU55885.1| ThmB [Rhodococcus sp. YYL]
    >gi|10443295|emb|CAC10509.1| beta-subunit of multicomponent tetrahydrofuran monooxygenase [Pseudonocardia tetrahydrofuranoxydans]
    >gi|338794150|gb|AEI99546.1| tetrahydrofuran monooxygenase oxygenase component beta subunit [Pseudonocardia sp. ENV478]
    >gi|326955346|gb|AEA29039.1| methane/phenol/toluene hydroxylase (plasmid) [Pseudonocardia dioxanivorans CB1190]
  • 65. Monooxygenase component
    >gi|503969935|ref|WP_014203929.1| monooxygenase component MmoB/DmpM [Pseudonocardia dioxanivorans]
    >gi|10443296|emb|CAC10510.1| regulatory protein of multicomponent tetrahydrofuran monooxygenase [Pseudonocardia tetrahydrofuranoxydans]
    >gi|315936316|gb|ADU55886.1| ThmC [Rhodococcus sp. YYL]
    >gi|338794151|gb|AEI99547.1| tetrahydrofuran monooxygenase coupling protein [Pseudonocardia sp. ENV478]
    Phen2-monooxygenase
    >gi|326949330|gb|AEA23027.1| Phenol 2-monooxygenase [Pseudonocardia dioxanivorans CB1190]
    >gi|93354574|gb|ABF08663.1| Phenol hydroxylase P3 protein (Phenol 2-monooxygenase P3 component) [Cupriavidus metallidurans CH34]
    >gi|187728549|gb|ACD29713.1| methane/phenol/toluene hydroxylase [Ralstonia pickettii 12J]
    Reductase
    >gi|10443294|emb|CAC10508.1| reductase component of multicomponent terahydrofuran monooxygenase [Pseudonocardia tetrahydrofuranoxydans]
    >gi|338794149|gb|AEI99545.1| tetrahydrofuran monooxygenase reductase component [Pseudonocardia sp. ENV478]
    >gi|193888338|gb|ACF28535.1| multicomponent terahydrofuran-degrading monooxygenase reductase component [Rhodococcus sp. YYL]
    >gi|375129130|ref|YP_004991225.1| Ferredoxin--NAD(+) reductase (plasmid) [Pseudonocardia dioxanivorans CB1190]
    (DNA)
    Gene Cluster
    >gi|10443289|emb|AJ296087.1| Pseudonocardia sp. K1 ORF y, thmS gene, thma gene, ORF x, thmD gene, thmB gene, thmC gene, ORF Q, ORF Z and thm H gene
    >gi|338794146|gb|HQ699618.1| Pseudonocardia sp. ENV478 tetrahydrofuran degradation gene cluster, complete sequence
    >gi|315936312|gb|EU732588.2| Rhodococcus sp. YYL tetrahydrofuran-degrading gene cluster, partial sequence
    >gb|CP002597.1|:28891-38076 Pseudonocardia dioxanivorans CB1190 plasmid pPSED02 genomic sequence
    Alpha Subunit
  • 66. >gi|10443289:2945-3326 Pseudonocardia sp. K1 ORF y, thmS gene, thma gene, ORF x, thmD gene, thmB gene, thmC gene, ORF Q, ORF Z and thm H gene
    >gb|HQ699618.1|:3382-3763 Pseudonocardia sp. ENV478 tetrahydrofuran degradation gene cluster, complete sequence
    >gb|EU732588.2|:2916-3297 Rhodococcus sp. YYL tetrahydrofuran-degrading gene cluster, partial sequence
    >gi|975830109|dbj|LC114144.1| Pseudonocardia sp. D17 SDIMO gene for soluble di-iron monooxygenase alpha subunit, partial cds
    >gi|975830113|dbj|LC114146.1| Rhodococcus ruber SDIMO gene for soluble di-iron monooxygenase alpha subunit, partial cds, strain: T5
    >gb|CP002597.1|:31833-32214 Pseudonocardia dioxanivorans CB1190 plasmid pPSED02 genomic sequence
    Beta Subunit
    >gi|10443289:5108-6148 Pseudonocardia sp. K1 ORF y, thmS gene, thma gene, ORF x, thmD gene, thmB gene, thmC gene, ORF Q, ORF Z and thm H gene
    >gi|315936312>gb|HQ699618.1|:5547-6590 Pseudonocardia sp. ENV478 tetrahydrofuran degradation gene cluster, complete sequence
    >gb|EU732588.2|:5083-6123 Rhodococcus sp. YYL tetrahydrofuran-degrading gene cluster, partial sequence
    >gb|CP002597.1|:34003-35043 Pseudonocardia dioxanivorans CB1190 plasmid pPSED02 genomic sequence
    Monooxygenase Component
    >gb|CP002597.1|:35043-35396 Pseudonocardia dioxanivorans CB1190 plasmid pPSED02 genomic sequence
    >gb|EU732588.2|:6123-6476 Rhodococcus sp. YYL tetrahydrofuran-degrading gene cluster, partial sequence
    >gb|HQ699618.1|:6587-6940 Pseudonocardia sp. ENV478 tetrahydrofuran degradation gene cluster, complete sequence
    >gi|10443289:6148-6501 Pseudonocardia sp. K1 ORF y, thmS gene, thma gene, ORF x, thmD gene, thmB gene, thmC gene, ORF Q, ORF Z and thm H gene
    Reductase
    >gi|10443289:3995-5077 Pseudonocardia sp. K1 ORF y, thmS gene, thma gene, ORF x, thmD gene, thmB gene, thmC gene, ORF Q, ORF Z and thm H gene
  • 67. >gb|HQ699618.1|:4434-5516 Pseudonocardia sp. ENV478 tetrahydrofuran degradation gene cluster, complete sequence
    >gb|EU732588.2|:3964-5052 Rhodococcus sp. YYL tetrahydrofuran-degrading gene cluster, partial sequence
    >gi|375129105:32884-33972 Pseudonocardia dioxanivorans CB1190 plasmid pPSED02, complete sequence
    Phen2-Monooxygenase
    >gb|CP002593.1|:820783-822312 Pseudonocardia dioxanivorans CB1190, complete genome
    >gb|CP000352.1|:1935574-1937121 Cupriavidus metallidurans CH34, complete genome
    >gb|CP001069.1|:954525-956069 Ralstonia pickettii 12J chromosome 2, complete sequence
    Propane Monooxygenase
    >gi|115511399|dbj|AB250942.1| Pseudonocardia sp. TY-7 prm2A, prm2B, prm2C, prm2D genes for propane monooxygenase
    >gi|38678092|dbj|AB112920.1| Gordonia sp. TY-5 prmA, prmB, prmC, prmD, orf1, orf2, adh1, orf3 genes for propane monooxygenase
    >gi|323461801|dbj|AB568291.1| Mycobacterium goodii mimA, mimB, mimC, mimD genes, complete cds, strain: 12523
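The headers in the Master Sequence Collection come in a few shapes: a GI number followed by a database tag and accession (`>gi|...|dbj|...|`), an accession followed by a coordinate range (`>gb|CP002597.1|:31833-32214`), or a bare GI number followed by a range (`>gi|10443289:2945-3326`). The following is a minimal sketch, not part of the original project, of how such headers could be split into their parts and how the length of an extracted region could be checked. The function name `parse_header`, the regular expression, and the assumption that ranges are 1-based and inclusive are all illustrative choices.

```python
import re

# Hypothetical pattern (an assumption, not from the original project) covering
# the three header shapes that appear in the collection above.
HEADER_RE = re.compile(
    r"^>(?:gi\|(?P<gi>\d+)\|?)?"                                # optional GI number
    r"(?:(?P<db>[a-z]+)\|(?P<acc>[A-Za-z_]+\d+(?:\.\d+)?)\|)?"  # optional db|accession|
    r"(?::(?P<start>\d+)-(?P<end>\d+))?"                        # optional :start-end range
    r"\s*(?P<desc>.*)$"                                         # free-text description
)


def parse_header(line):
    """Split one FASTA header line into gi, db, acc, start, end, desc, length."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError("not a FASTA header: %r" % line)
    fields = match.groupdict()
    # Convert coordinates to integers when a range is present.
    for key in ("start", "end"):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    # Region length, assuming 1-based inclusive coordinates.
    if fields["start"] is not None:
        fields["length"] = fields["end"] - fields["start"] + 1
    else:
        fields["length"] = None
    return fields
```

Applied to the alpha subunit entries, for instance, this gives a quick check that the regions taken from each gene cluster have the same length before they are aligned.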