Petit RJ, Demesure B, Dumolin-Lapègue S (1996) CpDNA and plant mtDNA primers. In Molecular tools for screening biodiversity: Plants and Animals. A Karp, PG Isaac, D Ingram eds, Chapman & Hall (in press)
Introduction
Because they are haploid and evolve clonally, chloroplast DNA (cpDNA) and plant mitochondrial DNA (mtDNA) provide distinctive markers that are very useful in phylogeny and in population genetics. Both are large molecules, but they evolve very differently. cpDNA is a circular molecule (155844 bp in Nicotiana tabacum; ref. 1) that is highly conserved in size and structure. It usually possesses two long inverted repeats (IR), which separate a large single copy region (LSC) from a small single copy region (SSC). Plant mtDNA genomes vary enormously in size and gene arrangement, but the nucleotide substitution rate of mtDNA is much lower than that of cpDNA.
The conservation of gene arrangement in cpDNA has led to the design of numerous 'consensus' or 'universal' chloroplast primers, which facilitate phylogenetic and population genetic studies. In particular, the availability of primers that allow amplification and direct sequencing of the gene rbcL has revolutionized plant taxonomy (ref. 2). At lower taxonomic levels, consensus primers that amplify non-coding chloroplast DNA lying between highly conserved regions are also extremely useful. In this case, either the amplified sequence is short enough to allow direct sequencing (ref. 3), or it is longer and PCR is followed by restriction analysis (ref. 4). The degree of 'universality' of these cpDNA primers within the plant kingdom varies, but in some cases they are conserved enough to amplify virtually any land plant and many algae (ref. 3).
Plant mtDNA primers that amplify non-coding sequences are more difficult to design because the order of the genes in the mtDNA molecule is usually not conserved. However, there are introns that can be amplified, as well as a few short intergenic regions separating genes that have remained associated throughout evolution (such as the ribosomal genes, or the genes rps14 and cob).
Because only tiny amounts of (possibly degraded) DNA are required for successful amplification, and because cpDNA is present at many more copies per cell than most nuclear DNA sequences, cpDNA primers have been used to amplify rbcL sequences from plant remains several million years old (ref. 5). Such straightforward, efficient and rapid PCR approaches seem extremely attractive compared with the more traditional RFLP studies, in which cpDNA is transferred to a membrane and hybridized with a labelled heterologous probe before autoradiography. They are likely to become the standard way to obtain chloroplast or mitochondrial genetic markers.
Method
The availability of complete cpDNA sequences for tobacco (a dicotyledon; ref. 1), rice (a monocotyledon; ref. 6) and Marchantia (a bryophyte; ref. 7) in molecular databases such as GenBank or EMBL (accession numbers Z00044|CHNTXX, X15901|CHOSXX, X04465|CHMXX) has been decisive in the design of chloroplast primers that work across different taxonomic groups: the sequences are aligned and conserved regions are identified. Transfer RNA genes are particularly helpful: their number (30 in the cpDNA molecule of tobacco), their distribution all over the chloroplast genome and their high level of conservation make them ideal sites for anchoring conserved primers, but protein-coding genes may also be used, as in the case of rbcL. For mtDNA primers, conserved sequences in protein genes are easily identified, but conserved associations of genes are rare.
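The search for conserved stretches in an alignment can be sketched as follows. This is only a minimal illustration, not the procedure actually used to design the primers: the function name `conserved_windows`, the window size and the three toy sequences are all assumptions made for the example, and the toy sequences are invented, not taken from the tobacco, rice or Marchantia genomes.

```python
def conserved_windows(aligned, size):
    """Return the 0-based start positions of all windows of `size` columns
    in which every aligned sequence carries the same base and no gap."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    # A column counts as conserved when all sequences agree and none has a gap.
    identical = [len(set(col)) == 1 and col[0] != "-" for col in zip(*aligned)]
    return [i for i in range(length - size + 1) if all(identical[i:i + size])]


# Hypothetical toy alignment (invented sequences, one variable column).
toy = [
    "TTGACCGTAAGCTCTGGATC",
    "TTGACCGTAAGCTCAGGATC",
    "TTGACCGTAAGCTC-GGATC",
]

print(conserved_windows(toy, 10))  # [0, 1, 2, 3, 4]
```

Each reported position is a candidate site for a primer of the chosen length; in practice the windows would be searched in tRNA or protein-coding genes flanking a variable region.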
In Fig. 1, the complete cpDNA molecule of tobacco is represented along with the positions of the cpDNA primers that have been described so far. Table 1 lists the names of the genes in which the primers are anchored, followed in brackets by the name given to each primer by the authors who described it (if different from the name of the gene). The number of each primer refers to Fig. 1. Although the Nicotiana sequence was often used to design the primers, there is not always 100% homology between the sequence of a primer and the tobacco cpDNA sequence. In these cases, the mismatches have been underlined. These sites could be made degenerate, partially or totally, to ensure amplification over a wide taxonomic range. The expected length of the product in Nicotiana tabacum is then given, along with the exact 5' and 3' positions of the amplified product on the tobacco cpDNA sequence, followed by the reference source, where amplification conditions can be found. Altogether, over 40 kb of the tobacco cpDNA can be amplified. This corresponds to about 30% of the genome (counting the inverted repeat only once). Note that most primers amplify portions of the large single copy region (LSC in Fig. 1). Indeed, this region is often more variable than the rest of the genome, especially when compared with the inverted repeat. Tables 2 and 3 provide the sequences of additional primers that are useful for direct sequencing of rbcL and of more variable non-coding regions.
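Making a mismatched site degenerate can be illustrated with the standard IUPAC ambiguity codes. The sketch below is an assumption-laden example rather than a described method: the helper names `degenerate_primer` and `degeneracy` are hypothetical, and the three primer-site sequences are invented for illustration.

```python
from math import prod

# IUPAC nucleotide ambiguity codes, keyed by the set of bases observed.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_primer(sites):
    """Collapse equally long, gap-free primer sites into one IUPAC-coded primer."""
    sites = [s.replace(" ", "").upper() for s in sites]
    if len({len(s) for s in sites}) != 1:
        raise ValueError("primer sites must have equal length")
    return "".join(IUPAC[frozenset(column)] for column in zip(*sites))


def degeneracy(primer):
    """Number of distinct oligonucleotides encoded by an IUPAC-coded primer."""
    sizes = {code: len(bases) for bases, code in IUPAC.items()}
    return prod(sizes[base] for base in primer)


# Hypothetical primer sites from three taxa (invented sequences).
sites = ["GGTCGTGACC", "GGTCGAGACC", "GGCCGTGACC"]
primer = degenerate_primer(sites)
print(primer)              # GGYCGWGACC
print(degeneracy(primer))  # 4
```

Partial degeneracy would correspond to applying the ambiguity code at only some of the mismatched positions, which keeps the number of oligonucleotides in the mix small.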
Sequences for mtDNA primers are provided in Table 4. Additional primers useful for sequencing the ribosomal region of mtDNA and the coxII intron are described in refs. 11 and 13.
References
1. Shinozaki K and 22 authors (1986) The complete nucleotide sequence of the tobacco chloroplast genome: its gene organization and expression. The EMBO Journal 5: 2043-2049
2. Clegg MT (1993) Chloroplast gene sequences and the study of plant evolution. Proceedings of the National Academy of Sciences of the USA 90: 363-367
3. Taberlet P, Gielly L, Pautou G, Bouvet J (1991) Universal primers for amplification of three non-coding regions of chloroplast DNA. Plant Molecular Biology 17: 1105-1109
4. Arnold ML, Buckner CM, Robinson JJ (1991) Pollen-mediated introgression and hybrid speciation in Louisiana irises. Proceedings of the National Academy of Sciences of the USA 88: 1398-1402
5. Soltis PS, Soltis DE, Smiley CJ (1992) An rbcL sequence from a Miocene Taxodium (bald cypress). Proceedings of the National Academy of Sciences of the USA 89: 449-451
6. Hiratsuka J and 15 authors (1989) The complete sequence of the rice (Oryza sativa) chloroplast genome: Intermolecular recombination between distinct tRNA genes accounts for a major plastid DNA inversion during the evolution of cereals. Molecular and General Genetics 217: 185-194
7. Ohyama K and 12 authors (1986) Complete nucleotide sequence of liverwort Marchantia polymorpha chloroplast DNA. Plant Molecular Biology Reporter 4: 148-175
8. Demesure B, Sodzi N, Petit RJ (1995) A set of universal primers for amplification of polymorphic non-coding regions of mitochondrial and chloroplast DNA in plants. Molecular Ecology 4: 129-131
9. Liston A (1992) Variation in the chloroplast genes rpoC1 and rpoC2 of the genus Astragalus (Fabaceae): evidence from restriction site mapping of a PCR-amplified fragment. American Journal of Botany 79: 953-961
10. Morton BR, Clegg MT (1993) A chloroplast DNA mutational hotspot and gene conversion in a noncoding region near rbcL in the grass family (Poaceae). Current Genetics 24: 357-365
11. Al-Janabi SM, McClelland M, Petersen C, Sobral BWS (1994) Phylogenetic analysis of organellar DNA sequences in the Andropogoneae: Saccharinae. Theoretical and Applied Genetics 88: 933-944
12. Bousquet J, Strauss SH, Li P (1992) Complete congruence between morphological and rbcL-based molecular phylogenies in birches and related species (Betulaceae). Molecular Biology and Evolution 9: 1076-1088
13. Rabbi MF, Wilson KG (1993) The mitochondrial coxII intron has been lost in two different lineages of dicots and altered in others. American Journal of Botany 80: 1216-1223
Table 1. Compilation of conserved PCR primers for amplification of chloroplast DNA sequences.
Primer 1 Primer 2 Length in Nicotiana (5'-3' position) Reference
1. trnH [tRNA-His (GUG)] 2. trnK [tRNA-Lys (UUU) 3' exon] 1831 bp Ref.8
5'-ACG GGA ATT GAA CCC GCG CA-3' 5'-CCG ACT AGT TCC GGG TTC GA-3' (14-1844)
3. trnK [tRNA-Lys (UUU) 3' exon] 4. trnK [tRNA-Lys (UUU) 5' exon] 2585 bp Ref.8
5'-GGG TTG CCC GGG ACT CGA AC-3' 5'-CAA CGG TAG AGT ACT CGG CTT TTA-3' (1811-4395)
5. trnQ [tRNA-Gln (UUG)] 6. trnR [tRNA-Arg (UCU)] 3086 bp unpublished
5'-GGG ACG GAA GGA TTC GAA CC-3' 5'-ATT GCG TCC AAT AGG ATT TGA A-3' (7418-10503)
7. rpoC2 8. rpoC1 4105 bp Ref.9
5'-TAG ACA TCG GTA CTC CAG TGC-3' 5'-AAG CGG AAT TTG TGC TTG TG-3' (19967-24071)
9. trnC [tRNA-Cys(GCA)] 10. trnD [tRNA-Asp (GUC)] 3169 bp Ref.8
5'-CCA GTT CAA ATC TGG GTG TC-3' 5'-GGG ATT GTA GTT CAA TTG GT-3' (28831-31999)
11. trnD [tRNA-Asp (GUC)] 12. trnT [tRNA-Thr (GGU)] 1213 bp Ref.8
5'-ACC AAT TGA ACT ACA ATC CC-3' 5'-CTA CCA CTG AGT TAA AAG GG-3' (31980-33192)
13. trnT [tRNA-Thr (GGU)] 14. psbC [psII 44 kd protein] 3236 bp unpublished
5'-GCC CTT TTA ACT CAG TGG TA-3' 5'-GAG CTT GAG AAG CTT CTG GT-3' (33172-36407)
15. psbC [psII 44 kd protein] 16. trnS [tRNA-Ser (UGA)] 1611 bp Ref.8
5'-GGT CGT GAC CAA GAA ACC AC-3' 5'-GGT TCG AAT CCC TCT CTC TC-3' (35543-37153)
17. trnS [tRNA-Ser (UGA)] 18. trnfM [tRNA-fMet (CAU)] 1254 bp Ref.8
5'-GAG AGA GAG GGA TTC GAA CC-3' 5'-CAT AAC CTT GAG GTC ACG GG-3' (37134-38387)
19. psaA [PS I (P 700 apoprotein A1)] 20. trnS [tRNA-Ser (GGA)] 3681 bp Ref.8
5'-ACT TCT GGT TCC GGC GAA CGA A-3' 5'-AAC CAC TCG GCC ATC TCT CCT A-3' (43450-47130)
21. trnS [tRNA-Ser(GGA)] 22. trnT [tRNA-Thr(UGU)] 1386 bp Ref.8
5'-CGA GGG TTC GAA TCC CTC TC-3' 5'-AGA GCA TCG CAT TTG TAA TG-3' (47172-48557)
23. trnT [tRNA-Thr (UGU)] (a (B48557)) 24. trnF [tRNA-Phe (GAA)] (f (A50272)) 1754 bp Ref.3
5'-CAT TAC AAA TGC GAT GCT CT-3' 5'-ATT TGA ACT GGT GAC ACG AG-3' (48538-50291)
25. trnF [tRNA-Phe (GAA)] 26. trnV [tRNA-Val (UAC) 3' exon] 3511 bp unpublished
5'-CTC GTG TCA CCA GTT CAA AT-3' 5'-CCG AGA AGG TCT ACG GTT CG-3' (50272-53782)
27. trnV [tRNA-Val (UAC) 3' exon] 29. rbcL [RuBisCO large subunit] 3850 bp unpublished
5'-CGA ACC GTA GAC CTT CTC GG-3' 5'-GCT TTA GTC TCT GTT TGT GG-3' (53763-57612)
28. trnM [tRNA-Met(CAU)] idem 3005 bp Ref.8
5'-TGC TTT CAT ACG GCG GGA GT-3' (54608-57612)
30. rbcL [RuBisCO large subunit] (Z1) 31. rbcL [RuBisCO large subunit] (Z1351R) 1381 bp Ref.6
5'-ATG TCA CCA CAA ACA GAA ACT AAA GCA AGT-3' 5'-CTT CAC AAG CAG CAG CTA GTT CAG GAC TCC-3' (57587-58967)
idem 33. ORF 512 (ORF106) 3274 bp Ref.4
5'-ACT ACA GAT CTC ATA CTA CCC C-3' (57587-60860)
32. rbcL [RuBisCO large subunit] (Z1204) 34. psaI 3350 bp Ref.10
5'-TTT GGT GGA GGA ACT TTA GGA CAC CCT TGG GG-3' 5'-GCA ATT GCC GGA AAT ACT AAG C-3' (58790-62139)
35. trnV [tRNA-Val (GAC)] (cpval3P2) 36. 16S rRNA (cp16S5P1) 297 bp Ref.11
5'-AGT TCG AGC CTG ATT ATC CC-3' 5'-GCA TGC CGC CAG CGT TCA TC-3' (102509-102805)
Total 40076 bp
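The total is smaller than the sum of the individual product lengths because neighbouring fragments overlap at their shared anchor genes. A minimal sketch of the arithmetic, assuming each fragment is described by its 5' position and length as listed in Table 1 (so that its 3' position is start + length - 1), merges overlapping fragments before counting. The function name `covered_length` is hypothetical. The inverted repeat length of 25339 bp used for the genome fraction is an assumption taken from outside this text.

```python
# (5' position, product length in bp) for each fragment of Table 1,
# on the tobacco cpDNA sequence.
FRAGMENTS = [
    (14, 1831), (1811, 2585), (7418, 3086), (19967, 4105),
    (28831, 3169), (31980, 1213), (33172, 3236), (35543, 1611),
    (37134, 1254), (43450, 3681), (47172, 1386), (48538, 1754),
    (50272, 3511), (53763, 3850), (54608, 3005), (57587, 1381),
    (57587, 3274), (58790, 3350), (102509, 297),
]


def covered_length(fragments):
    """Number of distinct bases covered by a set of (start, length) fragments."""
    intervals = sorted((start, start + length - 1) for start, length in fragments)
    total = 0
    cur_start, cur_end = intervals[0]
    for start, end in intervals[1:]:
        if start <= cur_end + 1:        # overlapping or adjacent: extend the block
            cur_end = max(cur_end, end)
        else:                           # gap: close the current block
            total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
    return total + cur_end - cur_start + 1


GENOME_BP = 155844   # tobacco cpDNA length given in the Introduction
IR_BP = 25339        # ASSUMPTION: length of one inverted repeat, not stated here

covered = covered_length(FRAGMENTS)
print(covered)                                          # 40076
print(round(100 * covered / (GENOME_BP - IR_BP), 1))    # 30.7
```

Under these inputs the merged length reproduces the 40076 bp total of Table 1, and the fraction agrees with the figure of about 30% quoted in the Method section.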
Table 2. Sequencing primers for the gene rbcL in Dicotyledons. After ref. 12.
Position (strand) Primer sequence
334 (A) 5'-TCT GTT ACT AAC ATG TTT ACT TC-3'
691 (A) 5'-GAA ACA GGT GAA ATC AAA GGG CAT TA-3'
1144 (A) 5'-GGT ATT CAC GTT TGG CAT ATG CCT GC-3'
216 (B) 5'-TCG GTC CAC ACA GTT GTC CAT GT-3'
537 (B) 5'-CCC AAT TTA GGT TTA ATA GTA CAT CC-3'
979 (B) 5'-AAT ATG ATC TCC ACC AGA CAA ACG TAA-3'
1303 (B) 5'-TCC CTC ATT ACG AGC TTG TAC ACA-3'
Table 3. Sequencing primers for the non-coding regions located between trnT and trnF in plants. After ref. 3.
Primer Gene (strand) Primer Sequence Position
a (B48557) trnT (UGU) (B) 5'-CAT TAC AAA TGC GAT GCT CT-3' 48538
b (A49291) trnL (UAA) 5' exon (A) 5'-TCT ACC GAT TTC GCC ATA TC-3' 49291
c (B49317) trnL (UAA) 5' exon (B) 5'-CGA AAT CGG TAG ACG CTA CG-3' 49298
d (A49855) trnL (UAA) 3' exon (A) 5'-GGG GAT AGA GGG ACT TGA AC-3' 49855
e (B49873) trnL (UAA) 3' exon (B) 5'-GGT TCA AGT CCC TCT ATC CC-3' 49854
f (A50272) trnF (GAA) (A) 5'-ATT TGA ACT GGT GAC ACG AG-3' 50272
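Primers b and c, and primers d and e, are anchored in the same exon on opposite strands, so the reverse complement of one should overlap the other. The following sketch checks this for the sequences of Table 3. It assumes that each listed position is the lowest coordinate of the primer's binding site on the tobacco sequence. The helper names `clean` and `reverse_complement` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(primer):
    """Strip the spaces used in the tables to group bases in threes."""
    return primer.replace(" ", "").upper()


def reverse_complement(primer):
    """Sequence of the opposite strand, read 5' to 3'."""
    return clean(primer).translate(COMPLEMENT)[::-1]


# Primer sequences copied from Table 3.
b = "TCT ACC GAT TTC GCC ATA TC"   # strand A, position 49291
c = "CGA AAT CGG TAG ACG CTA CG"   # strand B, position 49298
d = "GGG GAT AGA GGG ACT TGA AC"   # strand A, position 49855
e = "GGT TCA AGT CCC TCT ATC CC"   # strand B, position 49854

print(reverse_complement(b))                        # GATATGGCGAAATCGGTAGA
# c starts 7 bases after b (49298 - 49291), so they share 13 bases.
print(reverse_complement(b)[7:] == clean(c)[:13])   # True
# e starts 1 base before d (49855 - 49854), so they share 19 bases.
print(reverse_complement(d)[:19] == clean(e)[1:])   # True
```

The offsets recovered from the sequences match the differences between the listed positions, which is a quick consistency check on the table.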
Table 4. Compilation of conserved PCR primers for amplification of plant mitochondrial DNA sequences.
Primer 1 Primer 2 Size (accession no.) Reference
nad 1B nad 1C 1184 bp Ref.8
5'-GCA TTA CGA TCT GCA GCT CA-3' 5'-GGA GCT CGA TTA GTT TCT GC-3' (X60401)
nad 4 exon 1 nad 4 exon 2 2100 bp Ref. 8
5'-CAG TGG GTT GGT CTG GTA TG-3' 5'-TCA TAT GGG CTA CTG AGG AG-3' (X60794)
nad 4 exon 2 nad 4 exon 3(*) 2840 bp Ref.8 and unpublished(*)
5'-TGT TTC CCG AAG CGA CAC TT-3' 5'-AAC CAG TCC ATG ACT TAA CA-3' (X60794)
rpS 14 cob 1396 bp Ref.8
5'-CAC GGG TCG CCC TCG TTC CG-3' 5'-GTG TGG AGG ATA TAG GTT GT-3' (X07237)
coxII (A (5'-amplimer)) coxII (I (3'-amplimer)) 1531 bp Ref. 13
5'-AAT CCA ATC CCG CAA AGG ATT-3' 5'-AGA AGA TGA TCC AGA ATT GGG-3' (X01088)
18S rRNA (mt18S1170 (7B)) 5S rRNA (mt5S5P2 (4B)) 1177 bp Ref. 11
5'-GTG TTG CTG AGA CAT GCG CC-3' 5'-ATA TGG CGC AAG ACG ATT CC-3' (Z11512)
*This pair of primers should be used in place of the nad4 exon 2-nad4 exon 4 pair described in Ref. 8.