[This discussion pertains to two papers: “Birth of a unique enzyme from an alternative reading frame” by Ohno and “Evolutionary Adaptation of Plasmid Encoded Enzymes” by Okada. For those without institutional access, one might try this link for Okada’s paper: OKADA DOCSLIDE]

Because Ohno’s PR.C sequence doesn’t cover the entire sequence published in Okada’s 1983 paper, this supplement provides information relating Ohno’s PR.C sequence to Okada’s RS-IIA sequence.

Okada’s sequence is also recorded in GenBank under accession number X00046.1.

Provided below is the section from Okada’s paper that Ohno used to create the start of his PR.C sequence. The red marking and text were added to suggest how to take the sequence from X00046.1 and modify it into the PR.C sequence. Constructing PR.C this way is more reliable than manually retyping the sequence from Ohno’s 1984 paper.

[Image: section of Okada’s paper used for the start of PR.C, with red markings]

There is one problem, however. Ohno reported the end of his sequence as “GCGGCTGA” without explaining why it deviates from the sequence reported by Okada (and thus GenBank). According to GenBank, the end of PR.C should be “GCGGCGTGA”, not “GCGGCTGA”: Ohno omitted, without explanation, the guanine “G” that sits between “GCGGC” and “TGA”.
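The discrepancy can be located mechanically. The following is a minimal sketch, assuming only the two end strings quoted above; the helper name `first_difference` is hypothetical and introduced here for illustration.

```python
def first_difference(a, b):
    """Return the 0-based index of the first position where a and b differ.

    Returns None if the strings are identical.
    """
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


genbank_end = "GCGGCGTGA"  # end of the sequence as recorded in GenBank
ohno_end = "GCGGCTGA"      # end of the sequence as printed by Ohno

i = first_difference(genbank_end, ohno_end)

# Deleting the single base at index i from the GenBank ending
# should reproduce Ohno's ending if the difference is a one-base omission.
restored = genbank_end[:i] + genbank_end[i + 1:]

print(i, genbank_end[i], restored == ohno_end)
```

The index is 0-based, so a reported index of 5 refers to the sixth base of the GenBank ending.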

To illustrate the problem, here is the end sequence in Okada’s 1983 paper with the end sequence circled in green.

[Image: end sequence in Okada’s 1983 paper, circled in green]

See how this contrasts with the end sequence in Ohno’s paper, with the mistake circled in red.

[Image: end sequence in Ohno’s paper, with the mistake circled in red]

Because there are both the RS-IIA (of nylB) and RS-IIB (of nylB’) sequences, Ohno’s typo also implies a mistake in the supposed corresponding frameshift mutation that must take place in RS-IIB. Ohno made a correspondence between PR.C and nylB, but there should be some sort of PR.C’ that corresponds to nylB’ as well, so his typo effectively creates two problems, not just one.

Below is Ohno’s PR.C sequence. As mentioned, the sequence can be reconstructed by taking the GenBank sequence (accession number X00046.1) and editing it to match Ohno’s sequence. That technique was used here because it was deemed more accurate than retyping the sequence from Ohno’s paper. The position of the yet-to-be thymine is marked in red. The lowercase letters indicate the location of the actual nylB gene locus. Spacing was added for clarity.

ATGGGCTACATCGATCTCTCCGCCCCCGTCGCGATGATCGTCAGC

GGTGGCCTCTACTATCTCTTCACCCGCCGCGGCTACACCTTCGGAGACACT

CG agaacgcacgttccacc

ggccagcaccccgccaggtatcccggagccgcggccggggagccgacactcgacagctgg

caggaggccccgcacaaccgctgggccttcgcccgcctgggcgagctgctgcccacggcg

gcggtctcccggcgcgacccggcgacgcccgcggagcccgtcgtgcggctcgacgcgctc

gcgacgcggctccccgatctcgagcagcggctcgaggagacctgcaccgacgcattcctc

gtgctgcgcggctccgaggtcctcgccgagtactaccgggcgggtttcgcacccgacgac

cgtcacctgctgatgagcgtctcgaagtcgctgtgcggcacggtcgtcggcgcgctgatc

gacgaggggcgcatcgatcccgcgcagcccgtcaccgagtatgtacccgagctcgcgggc

tccgtctacgacgggccctccgtgctgcaggtgctcgacatgcagatctcgatcgactac

aacgaggactacgtcgatccggcctcggaggtgcagacccacgatcgctccgccggctgg

cgcacgcggcgagacggggaccccgccgacacctacgagttcctcaccaccctccgcggc

gacggcggcaccggcgagttccagtactgctcggcgaacaccgacgtgctcgcctggatc

gtcgagcgggtcaccggtctgcgctacgtcgaagcgctctccacgtacctgtgggcgaag

ctcgacgccgatcgggatgcgaccatcacggtcgaccagaccggcttcggcttcgcgaac

gggggcgtctcctgcaccgcgcgggatctcgcacgcgtgggccgcatgatgctcgacggc

ggcgtcgctcccggcggacgggtcgtatcgcagggctgggtggaaagcgtgctggccggc

ggctcccgcgaagccatgaccgacgagggtttcacctccgcattccccgagggcagctac

acgcgccagtggtggtgcacgggcaacgagcgcggcaacgtgagcggcatcggcatccac

ggccagaacctctggctcgatccgcgcaccgactcggtgatcgtcaagctctcgtcgtgg

cccgatcccgacacccggcactggcacgggctgcagagcgggatcctgctcgacgtcagc

cgtgccctcgacgcggtgtag GCGGCTGA
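Because the sequence above is printed with spacing and mixed case, it needs to be flattened before it is pasted into other tools. The following is a minimal sketch of that cleanup; the helper names `normalize` and `is_dna` are hypothetical, and the short fragment is only there to show the behavior.

```python
def normalize(seq_text):
    """Remove all whitespace and convert to upper case.

    Note: upper-casing discards the lower-case marking of the nylB locus,
    so keep the original text if that annotation is still needed.
    """
    return "".join(seq_text.split()).upper()


def is_dna(seq):
    """True if the sequence contains only the four unambiguous DNA bases."""
    return set(seq) <= set("ACGT")


# A short fragment copied from the listing above, spacing included.
fragment = """CG agaacgcacgttccacc

ggccag"""

clean = normalize(fragment)
print(clean, is_dna(clean))
```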

OHNO’S PR.C POLYPEPTIDE

The polypeptide hypothetically coded by PR.C can be generated by entering the PR.C sequence into the ExPASy Translate tool:

http://web.expasy.org/translate/

Here is the resulting polypeptide sequence:

MGYIDLSAPVAMIVSGGLYYLFTRRGYTFGDTRERTFHRPAPRQVSRSRGRGADTRQLAG

GPAQPLGLRPPGRAAAHGGGLPARPGDARGARRAARRARDAAPRSRAAARGDLHRRIPRA

ARLRGPRRVLPGGFRTRRPSPADERLEVAVRHGRRRADRRGAHRSRAARHRVCTRARGLR

LRRALRAAGARHADLDRLQRGLRRSGLGGADPRSLRRLAHAARRGPRRHLRVPHHPPRRR

RHRRVPVLLGEHRRARLDRRAGHRSALRRSALHVPVGEARRRSGCDHHGRPDRLRLRERG

RLLHRAGSRTRGPHDARRRRRSRRTGRIAGLGGKRAGRRLPRSHDRRGFHLRIPRGQLHA

PVVVHGQRARQRERHRHPRPEPLARSAHRLGDRQALVVARSRHPALARAAERDPARRQPC

PRRGVGG
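The translation can also be cross-checked locally. The following is a minimal sketch that assumes the standard genetic code and a reading frame starting at the first base; it is not the ExPASy implementation, and the function name `translate` is chosen here for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 1, stopping at the first stop codon.

    Whitespace and case are ignored; trailing bases that do not
    form a complete codon are dropped.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the polypeptide
            break
        residues.append(aa)
    return "".join(residues)


# First line of the PR.C listing above.
start = "ATGGGCTACATCGATCTCTCCGCCCCCGTCGCGATGATCGTCAGC"
print(translate(start))
```

Passing the full, whitespace-stripped PR.C sequence to `translate` should give the same polypeptide as the web tool, provided the sequence was copied without error.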

Lastly, Ohno’s abstract and one of his footnotes list the number of residues as “472”. This appears to be a typo: it should be “427”, since the above polypeptide is 427 residues long.
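The residue count is easy to verify. The following is a minimal sketch that counts the polypeptide exactly as it is listed above; the variable names are hypothetical.

```python
# The polypeptide as listed above, one string per printed line.
polypeptide_lines = [
    "MGYIDLSAPVAMIVSGGLYYLFTRRGYTFGDTRERTFHRPAPRQVSRSRGRGADTRQLAG",
    "GPAQPLGLRPPGRAAAHGGGLPARPGDARGARRAARRARDAAPRSRAAARGDLHRRIPRA",
    "ARLRGPRRVLPGGFRTRRPSPADERLEVAVRHGRRRADRRGAHRSRAARHRVCTRARGLR",
    "LRRALRAAGARHADLDRLQRGLRRSGLGGADPRSLRRLAHAARRGPRRHLRVPHHPPRRR",
    "RHRRVPVLLGEHRRARLDRRAGHRSALRRSALHVPVGEARRRSGCDHHGRPDRLRLRERG",
    "RLLHRAGSRTRGPHDARRRRRSRRTGRIAGLGGKRAGRRLPRSHDRRGFHLRIPRGQLHA",
    "PVVVHGQRARQRERHRHPRPEPLARSAHRLGDRQALVVARSRHPALARAAERDPARRQPC",
    "PRRGVGG",
]

polypeptide = "".join(polypeptide_lines)
residue_count = len(polypeptide)
print(residue_count)  # 427
```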

The above sequences can then be used for BLASTN and BLASTP searches.