15. Definitions
Identity: The extent to which two (nucleotide or amino acid) sequences are invariant.
Homology: Similarity attributed to descent from a common ancestor.
RBP:        26 RVKENFDKARFSGTWYAMAKKDPEGLFLQDNIVAEFSVDETGQMSATAKGRVRLLNNWD- 84
               + K ++ + + GTW ++ MA + L + A V T + + L + W +
glycodelin: 23 QTKQDLELPKLAGTWHSMAMA-TNNISLMATLKAPLRVHITSLLPTPEDNLEIVLHRWEN 81
16. Definitions
Orthologous: Homologous sequences in different species that arose from a common ancestral gene during speciation; may or may not be responsible for a similar function.
Paralogous: Homologous sequences within a single species that arose by gene duplication.
54. Other similarity scoring matrices might be constructed from any property of amino acids that can be quantified:
- partition coefficients between hydrophobic and hydrophilic phases
- charge
- molecular volume
Unfortunately, …
91. A brief history of time (BYA, billions of years ago). [Figure: timeline with an axis from 4 to 0 BYA. Labels: origin of life, earliest fossils, origin of eukaryotes, plant/animal, fungi/animal, insects.]
92. Margaret Dayhoff's 34 protein superfamilies
Protein            PAMs per 100 million years
Ig kappa chain     37
Kappa casein       33
Lactalbumin        27
Hemoglobin         12
Myoglobin          8.9
Insulin            4.4
Histone H4         0.10
Ubiquitin          0.00
96. BLOSUM (BLOck-SUM) scoring
Block = ungapped alignment. E.g., amino acids D, N, V, A:
     a b c d e f
  1  D D N A A V
  2  D N A V D D
  3  N N V A V V
S = 3 sequences, W = 6 aa (columns a to f)
N = (W*S*(S-1))/2 = 18 pairs
97. A. Observed pairs
     a b c d e f
  1  D D N A A V
  2  D N A V D D
  3  N N V A V V
Observed pair counts, f_ij:
       D     N     A     V
  D    1
  N    4     1
  A    1     1     1
  V    3     1     4     1
Relative frequency table, g_ij = f_ij/18:
       D     N     A     V
  D  .056
  N  .222  .056
  A  .056  .056  .056
  V  .167  .056  .222  .056
Probability of obtaining a pair if randomly choosing pairs from the block.
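The observed-pair count can be reproduced with a short Python sketch. It assumes the three-sequence block shown on the slide; the names `block` and `observed_pairs` are illustrative and not taken from the slides.

```python
from collections import Counter
from itertools import combinations

# The ungapped block from the slide: 3 sequences, 6 columns.
block = ["DDNAAV", "DNAVDD", "NNVAVV"]

def observed_pairs(block):
    """Count unordered residue pairs within each column of the block."""
    counts = Counter()
    for column in zip(*block):               # walk columns a..f
        for x, y in combinations(column, 2):  # every pair of sequences
            counts[tuple(sorted((x, y)))] += 1
    return counts

f = observed_pairs(block)
total = sum(f.values())                       # W*S*(S-1)/2 pairs
g = {pair: n / total for pair, n in f.items()}  # relative frequencies

for pair in sorted(f):
    print(pair, f[pair], round(g[pair], 3))
```

Pairs are stored with the two residues sorted alphabetically, so the D/N cell of the table is the key `('D', 'N')`.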
98. B. Expected pairs
     a b c d e f
  1  D D N A A V
  2  D N A V D D
  3  N N V A V V
Residues pooled from the block: DDDDD NNNN AAAA VVVVV
P_i: D = 5/18, N = 4/18, A = 4/18, V = 5/18
P{Draw DN pair} = P{Draw D, then N, or draw N, then D}
P{Draw DN pair} = P_D P_N + P_N P_D = 2 * (5/18) * (4/18) = .123
Random relative frequency table, e_ij (P_i^2 on the diagonal, 2 P_i P_j elsewhere):
       D     N     A     V
  D  .077
  N  .123  .049
  A  .123  .099  .049
  V  .154  .123  .123  .077
Probability of obtaining a pair of each amino acid drawn independently from the block.
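The expected-pair table follows directly from the residue frequencies. Below is a minimal Python sketch, assuming the same three-sequence block; the variable names are illustrative only.

```python
from collections import Counter
from itertools import combinations_with_replacement

# The ungapped block from the slide.
block = ["DDNAAV", "DNAVDD", "NNVAVV"]

# Residue frequencies P_i over all 18 positions in the block.
residue_counts = Counter("".join(block))
n_residues = sum(residue_counts.values())
p = {aa: n / n_residues for aa, n in residue_counts.items()}

# Expected frequency of each unordered pair under independent draws:
# P_i * P_i for identical residues, 2 * P_i * P_j otherwise.
e = {}
for x, y in combinations_with_replacement(sorted(p), 2):
    e[(x, y)] = p[x] * p[y] if x == y else 2 * p[x] * p[y]

for pair in sorted(e):
    print(pair, round(e[pair], 4))
```

Because every unordered pair is covered exactly once, the ten expected frequencies sum to 1, which is a quick sanity check on the table.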
105. [Figure panels: rat versus mouse RBP; rat versus bacterial lipocalin.]
114. Reduction of Dot Plot Noise: self-alignment of ACCTGAGCTCACCTGAGTTA
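The slide does not say which noise-reduction method it illustrates. One common approach is a sliding window with a match threshold, sketched below in Python under that assumption. The function names and the `window` and `min_matches` parameters are illustrative, not taken from the slides.

```python
def dot_plot(seq_x, seq_y, window=1, min_matches=1):
    """Return the set of (i, j) positions that receive a dot.

    A dot is placed at (i, j) when the windows starting at seq_x[i]
    and seq_y[j] agree in at least min_matches positions.
    window=1, min_matches=1 gives the raw, unfiltered dot plot.
    """
    dots = set()
    for i in range(len(seq_x) - window + 1):
        for j in range(len(seq_y) - window + 1):
            matches = sum(seq_x[i + k] == seq_y[j + k] for k in range(window))
            if matches >= min_matches:
                dots.add((i, j))
    return dots

def render(dots, n_rows, n_cols):
    """Draw the dots as a text grid, one row per position in seq_x."""
    return "\n".join(
        "".join("*" if (i, j) in dots else "." for j in range(n_cols))
        for i in range(n_rows)
    )

seq = "ACCTGAGCTCACCTGAGTTA"
raw = dot_plot(seq, seq)                                # every single-base match
filtered = dot_plot(seq, seq, window=4, min_matches=4)  # exact 4-base matches only

print(render(raw, len(seq), len(seq)))
print()
print(render(filtered, len(seq) - 3, len(seq) - 3))
```

With a 4-base window the scattered single-base matches disappear, while the main diagonal and the off-diagonal run produced by the repeated ACCTGAG segment remain.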
Mutation probability matrix for the evolutionary distance of 1 PAM (i.e., one Accepted Point Mutation per 100 amino acids). An element of this matrix, M_ij, gives the probability that the amino acid in column j will be replaced by the amino acid in row i after a given evolutionary interval, in this case 1 PAM. Thus, there is a 0.56% probability that Asp will be replaced by Glu. To simplify the appearance, the elements are shown multiplied by 10,000. (Adapted from Figure 82, Atlas of Protein Sequence and Structure, Suppl. 3, 1978, M.O. Dayhoff, ed., National Biomedical Research Foundation, 1979.)
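As a worked reading of the scaling, take the Asp to Glu example from the caption, with row i = Glu (the replacement) and column j = Asp (the original). The subscript labels are notation chosen here for illustration:

```latex
M_{\mathrm{Glu},\,\mathrm{Asp}} = 0.56\% = 0.0056
\quad\Longrightarrow\quad
10{,}000 \times M_{\mathrm{Glu},\,\mathrm{Asp}} = 56
```

So an entry printed as 56 in the displayed matrix corresponds to a replacement probability of 0.0056 over 1 PAM.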