[This discussion pertains to two papers: Birth of a unique enzyme from an alternative reading frame by Ohno and Evolutionary Adaptation of Plasmid Encoded Enzymes by Okada. Readers without institutional access might try this link for Okada’s paper: OKADA DOCSLIDE]
Because Ohno’s PR.C sequence doesn’t cover the entire sequence published by Okada in Okada’s 1983 paper, this supplement provides information relating Ohno’s PR.C sequence with Okada’s RS-IIA sequence.
Okada’s sequence is also recorded in GenBank under accession number X00046.1.
Provided below is the section from Okada’s paper that Ohno used to create the start of his PR.C sequence. The red marking and text were added for clarity to suggest how to take the sequence from X00046.1 and modify it into the PR.C sequence. Constructing PR.C this way is less error-prone than manually retyping the sequence from Ohno’s 1984 paper.
There is one problem, however. Ohno reported the end of his sequence as “GCGGCTGA” without explaining why it deviates from the sequence reported by Okada (and thus GenBank). According to GenBank, the end of PR.C should be “GCGGCGTGA”, not “GCGGCTGA”, where the “G” in the sixth position (marked in red) is the guanine base Ohno omitted without any explanation.
To illustrate the problem, here is the end sequence in Okada’s 1983 paper with the end sequence circled in green.
See how this contrasts with the end sequence in Ohno’s paper, with the mistake circled in red.
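The difference between the two endings can also be checked mechanically. The following is a minimal Python sketch, not taken from either paper; the function name `single_base_deletions` is hypothetical, and the two strings are the endings quoted above. It lists every position in the GenBank ending whose removal yields Ohno's ending.

```python
def single_base_deletions(reference, observed):
    """Return the 0-based positions in `reference` whose deletion gives `observed`."""
    if len(reference) != len(observed) + 1:
        return []
    return [
        i
        for i in range(len(reference))
        if reference[:i] + reference[i + 1:] == observed
    ]


genbank_end = "GCGGCGTGA"  # ending per Okada / GenBank X00046.1, as quoted above
ohno_end = "GCGGCTGA"      # ending as printed in Ohno's paper, as quoted above

positions = single_base_deletions(genbank_end, ohno_end)
print(positions)  # [5]
for i in positions:
    # Bracket the base that is missing from Ohno's version.
    print(genbank_end[:i] + "[" + genbank_end[i] + "]" + genbank_end[i + 1:])  # GCGGC[G]TGA
```

Only one position qualifies, so the discrepancy is a single missing guanine, the sixth base of the GenBank ending.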
Because there are two repeated sequences, RS-IIA (carrying nylB) and RS-IIB (carrying nylB’), Ohno’s typo also implicitly affects the corresponding frameshift mutation that would have to take place in RS-IIB. Ohno made a correspondence between PR.C and nylB, but there should also be some sort of PR.C’ that corresponds to nylB’, so his typo effectively creates two problems, not just one.
Below is Ohno’s PR.C sequence. As mentioned, the sequence can be reconstructed by taking the GenBank sequence (accession number X00046.1) and editing it to match Ohno’s sequence. That technique was used here because it was deemed more accurate than retyping the sequence from Ohno’s paper. The position of the yet-to-be-inserted thymine is marked in red. Lowercase letters mark the actual nylB gene locus. Spacing was added for clarity.
ATGGGCTACATCGATCTCTCCGCCCCCGTCGCGATGATCGTCAGC
GGTGGCCTCTACTATCTCTTCACCCGCCGCGGCTACACCTTCGGAGACACT
CG agaacgcacgttccacc
ggccagcaccccgccaggtatcccggagccgcggccggggagccgacactcgacagctgg
caggaggccccgcacaaccgctgggccttcgcccgcctgggcgagctgctgcccacggcg
gcggtctcccggcgcgacccggcgacgcccgcggagcccgtcgtgcggctcgacgcgctc
gcgacgcggctccccgatctcgagcagcggctcgaggagacctgcaccgacgcattcctc
gtgctgcgcggctccgaggtcctcgccgagtactaccgggcgggtttcgcacccgacgac
cgtcacctgctgatgagcgtctcgaagtcgctgtgcggcacggtcgtcggcgcgctgatc
gacgaggggcgcatcgatcccgcgcagcccgtcaccgagtatgtacccgagctcgcgggc
tccgtctacgacgggccctccgtgctgcaggtgctcgacatgcagatctcgatcgactac
aacgaggactacgtcgatccggcctcggaggtgcagacccacgatcgctccgccggctgg
cgcacgcggcgagacggggaccccgccgacacctacgagttcctcaccaccctccgcggc
gacggcggcaccggcgagttccagtactgctcggcgaacaccgacgtgctcgcctggatc
gtcgagcgggtcaccggtctgcgctacgtcgaagcgctctccacgtacctgtgggcgaag
ctcgacgccgatcgggatgcgaccatcacggtcgaccagaccggcttcggcttcgcgaac
gggggcgtctcctgcaccgcgcgggatctcgcacgcgtgggccgcatgatgctcgacggc
ggcgtcgctcccggcggacgggtcgtatcgcagggctgggtggaaagcgtgctggccggc
ggctcccgcgaagccatgaccgacgagggtttcacctccgcattccccgagggcagctac
acgcgccagtggtggtgcacgggcaacgagcgcggcaacgtgagcggcatcggcatccac
ggccagaacctctggctcgatccgcgcaccgactcggtgatcgtcaagctctcgtcgtgg
cccgatcccgacacccggcactggcacgggctgcagagcgggatcctgctcgacgtcagc
cgtgccctcgacgcggtgtag GCGGCTGA
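Because the listing above mixes upper and lower case and contains spaces and line breaks, it needs to be normalized before it is pasted into a translation or search tool. The sketch below shows one way to do that in Python. The helper names `normalize_dna` and `lowercase_region` are hypothetical, and the two sample lines are copied from the listing above.

```python
def normalize_dna(text):
    """Remove all whitespace, uppercase the bases, and reject non-ACGT characters."""
    seq = "".join(text.split()).upper()
    unexpected = sorted(set(seq) - set("ACGT"))
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {unexpected}")
    return seq


def lowercase_region(text):
    """Return only the lowercase bases (the nylB locus in the listing), uppercased."""
    return "".join(ch for ch in text if ch.islower()).upper()


first_mixed_line = "CG agaacgcacgttccacc"           # third line of the listing
last_line = "cgtgccctcgacgcggtgtag GCGGCTGA"        # final line of the listing

print(normalize_dna(first_mixed_line))   # CGAGAACGCACGTTCCACC
print(lowercase_region(last_line))       # CGTGCCCTCGACGCGGTGTAG
```

Applying `normalize_dna` to the whole listing gives a single uppercase string; a quick sanity check before translation is that its length is a multiple of three.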
OHNO’S PR.C POLYPEPTIDE
The polypeptide hypothetically encoded by PR.C can be generated by pasting the PR.C sequence into the ExPASy Translate tool:
http://web.expasy.org/translate/
Here is the resulting polypeptide sequence:
MGYIDLSAPVAMIVSGGLYYLFTRRGYTFGDTRERTFHRPAPRQVSRSRGRGADTRQLAG
GPAQPLGLRPPGRAAAHGGGLPARPGDARGARRAARRARDAAPRSRAAARGDLHRRIPRA
ARLRGPRRVLPGGFRTRRPSPADERLEVAVRHGRRRADRRGAHRSRAARHRVCTRARGLR
LRRALRAAGARHADLDRLQRGLRRSGLGGADPRSLRRLAHAARRGPRRHLRVPHHPPRRR
RHRRVPVLLGEHRRARLDRRAGHRSALRRSALHVPVGEARRRSGCDHHGRPDRLRLRERG
RLLHRAGSRTRGPHDARRRRRSRRTGRIAGLGGKRAGRRLPRSHDRRGFHLRIPRGQLHA
PVVVHGQRARQRERHRHPRPEPLARSAHRLGDRQALVVARSRHPALARAAERDPARRQPC
PRRGVGG
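For readers who prefer to check the translation locally rather than through the website, here is a minimal Python sketch. It is not the ExPASy implementation; it assumes the standard genetic code, reads frame 1 only, and stops at the first stop codon. The names `CODON_TABLE` and `translate` are hypothetical. The test fragments are the first 45 bases and the final 30 bases of the PR.C listing above, plus the same final window with the GenBank ending quoted earlier.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate frame 1 of `dna`, stopping at the first stop codon.

    Whitespace and case are ignored; trailing bases that do not fill a codon are dropped.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


head = "ATGGGCTACATCGATCTCTCCGCCCCCGTCGCGATGATCGTCAGC"  # first line of the listing
ohno_tail = "ccgtgccctcgacgcggtgtagGCGGCTGA"             # last 30 bases of the listing
genbank_tail = "ccgtgccctcgacgcggtgtagGCGGCGTGA"         # same window, GenBank ending

print(translate(head))          # MGYIDLSAPVAMIVS
print(translate(ohno_tail))     # PCPRRGVGG
print(translate(genbank_tail))  # PCPRRGVGGV
```

The first two results match the start and end of the polypeptide above. Within this short window, the version with the extra guanine no longer places TGA in frame, which is why the single omitted base matters.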
Lastly, Ohno’s abstract and one of his footnotes list the number of residues as “472”. This appears to be a typo, with two digits transposed: the above polypeptide is 427 residues long, so the figure should be “427”.
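The residue count is easy to confirm. The short Python sketch below joins the eight lines of the polypeptide exactly as printed above and reports the length; the variable names are hypothetical.

```python
# The polypeptide as listed above, one string per printed line.
polypeptide_lines = [
    "MGYIDLSAPVAMIVSGGLYYLFTRRGYTFGDTRERTFHRPAPRQVSRSRGRGADTRQLAG",
    "GPAQPLGLRPPGRAAAHGGGLPARPGDARGARRAARRARDAAPRSRAAARGDLHRRIPRA",
    "ARLRGPRRVLPGGFRTRRPSPADERLEVAVRHGRRRADRRGAHRSRAARHRVCTRARGLR",
    "LRRALRAAGARHADLDRLQRGLRRSGLGGADPRSLRRLAHAARRGPRRHLRVPHHPPRRR",
    "RHRRVPVLLGEHRRARLDRRAGHRSALRRSALHVPVGEARRRSGCDHHGRPDRLRLRERG",
    "RLLHRAGSRTRGPHDARRRRRSRRTGRIAGLGGKRAGRRLPRSHDRRGFHLRIPRGQLHA",
    "PVVVHGQRARQRERHRHPRPEPLARSAHRLGDRQALVVARSRHPALARAAERDPARRQPC",
    "PRRGVGG",
]

polypeptide = "".join(polypeptide_lines)
print(len(polypeptide))  # 427 (seven lines of 60 residues plus a final line of 7)
```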
The above sequences can then be used for BLASTN and BLASTP searches.
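BLAST searches generally take their query in FASTA format, so it can help to wrap each sequence with a header line first. The following Python sketch is one way to do that; the function name `to_fasta` and the record names in the usage example are hypothetical, and the sample sequences are short fragments from the listings above rather than the full sequences.

```python
def to_fasta(name, sequence, width=60):
    """Format one sequence as a FASTA record with lines of at most `width` characters."""
    seq = "".join(sequence.split()).upper()
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Fragments only, for illustration; substitute the full PR.C and polypeptide sequences.
print(to_fasta("PRC_fragment_nucleotide", "ATGGGCTACATCGATCTCTCCGCCCCCGTCGCGATGATCGTCAGC"))
print(to_fasta("PRC_fragment_protein", "MGYIDLSAPVAMIVS"))
```

The nucleotide record would go to BLASTN and the protein record to BLASTP.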