Abstract

Revealing phylogeographic structure is important for accurate subspecies delineation and understanding a species’ evolutionary history. In leopards (Panthera pardus), there are currently nine subspecies recognized. On the African continent, only one subspecies occurs (P. p. pardus), although mitochondrial DNA from historical samples suggests the presence of three putative continental clades: (1) West Africa (WA), (2) Central Africa (CA), and (3) Southern Africa (SA). So far, whole genome data did not recover this phylogeographic structure, although leopards in the southern periphery of their distribution range in Africa have not yet been investigated in detail. The Mpumalanga province of South Africa is of particular interest, as here, the CA and the SA clade possibly meet. The aim of this study was to characterize the first mitogenomes of African leopards from Mpumalanga, to help clarifying how South African leopards fit into continental patterns of genetic differentiation. Complete mitogenomes from nine leopards, including a strawberry leopard, were assembled de novo and included in phylogenetic analysis, in combination with 32 publicly available mitogenomes. Bayesian inference and maximum likelihood analyses identified two deeply diverged putative clades within South Africa, which were more genetically distinct than two subspecies in Asia. The clades dated back to 0.76-0.86 million years ago, indicating that they originated during the climatically unstable Mid-Pleistocene, as seen in other large mammals. The Pleistocene refuge theory states that the maintenance of savanna refugia in East and Southern Africa promoted the divergence between populations. As such, leopards may reflect the unique climatic history of southern Africa, which has resulted in eminent and endemic genetic diversity.

1. Introduction

Spatial patterns of phylogenetic diversity represent a species’ evolutionary potential and are a key element of biodiversity [1]. Revealing phylogeographic structure is also essential for accurate population and subspecies delineation, and understanding a species’ demographic history [2]. Many phylogeographic studies have used mitochondrial DNA, which is a separate, circular, haploid genome in the mitochondrion. Mitochondrial DNA (mtDNA) has several unique characteristics, such as a relatively constant molecular clock and maternal inheritance, which allows the reconstruction of female genealogies without the confounding effect of recombination [3]. Other benefits are the high mutation rate, 5–10 times that of nuclear DNA, and the simple conserved genome structure [4]. This combination makes mtDNA useful for evolutionary and population genetic studies, showing high congruence with known relationships based on morphology and nuclear genomes [5, 6]. Although most studies have used individual protein-coding genes as markers, advances in sequencing technology have made it feasible to sequence entire mitochondrial genomes or mitogenomes [7]. The reconstruction of complete mitogenomes is highly advantageous since it can clarify species relationships when individual mitochondrial genes are inconclusive [810]. Although studies based on single genes can still provide accurate topologies and offer a powerful first insight [11], mitogenomes are widely considered to offer a more holistic approach for studying part of the evolutionary history of wild species [1214]. Nonetheless, the nature of mtDNA can obscure spatial genetic differentiation, making this marker not very suitable for testing recent and male-biased dispersal or isolation-by-distance [15, 16]. For a full understanding of phylogenetic relationships, the addition of nuclear markers is therefore essential [17].

One species with an inconclusive phylogeographic structure is the leopard Panthera pardus, in part due to its far-ranging habits across the globe [18, 19]. The leopard is one of the most versatile large carnivores in the world, with a distribution that spreads from the southern tip of Africa to the Russian Far East in Asia, and historically even Europe, across different habitat types [2]. Although their numbers have declined severely over the past decades, levels of genetic diversity seem to have remained relatively high in most areas [1921]. Studies showed that the genomic diversity of leopards is two to five times higher than other big cats, likely as a result of their habitat flexibility and the long-term maintenance of large effective population sizes [19]. Nonetheless, the collective impact of habitat loss and unnatural mortality did already lead to local losses of genetic diversity [22] and increased population structure in some populations [23]. Genetic impoverishment has been reported in the critically endangered Amur leopard P. p. orientalis [24] and isolated island populations in South East Asia [25]. As suitable habitat is decreasing at a fast rate, and leopards are facing additional threats such as a lack of natural prey, unsustainable hunting, and poaching [26], a more in-depth understanding of the evolutionary history is needed to elaborate management plans for this species.

Leopards originated approximately three million years ago (mya) in Africa [2]. There are nine subspecies recognized, or eight according to a recent assessment [27], which have resulted from adaptive radiation and isolation-by-distance, of which six occur in Asia [25]. Leopards likely colonised Asia approximately 0.5 to 0.6 mya, following an out-of-African dispersal event [26]. The leopard populations in Africa have traditionally been placed into three continental clades based on mtDNA diversity: (1) West Africa (WA), representing the western extent of the leopard habitat; (2) Central Africa (CA), consisting of clades in coastal West-Central, Central East, and Central Southern Africa across equatorial leopard habitat; and (3) Southern Africa (SA) in the southern extent of the leopard habitat [18]. Although whole genomes did not reflect the mtDNA phylogeographic structure [19, 20], populations from South Africa and Mozambique were largely missing from these studies, leaving the phylogenomic placement of the SA individuals uncertain. The population in northern South Africa is of particular interest since it forms a possible contact zone between two diverged CA and SA clades [18, 23, 25].

Historically, South African leopards are considered to be a single population that consists of several geographically distinct mtDNA lineages, which were likely shaped by isolation-by-distance [23], habitat fragmentation [28], and anthropogenic mortality. Typically, lineages are described as lines of divergent descendants, whereas clades are composed of several lineal descendants that share a common ancestor and cluster together during phylogenetic analysis [29]. It has previously been suggested that a cryptic subspecies may occur within South Africa, based on some unique phenotypic traits in this population, such as the small-sized Cape leopard [30], although no genetic data has yet supported this [23]. Furthermore, a unique colour morph occurs in northern South Africa, primarily in the North West province and Mpumalanga, referred to as the red or strawberry leopard [31]. It is marked by light fur, pink skin, and blue eyes, caused by a 1 bp deletion in the tyrosinase-related protein 1 [22]. Additionally, it is known that two genetic clades occur within the Mpumalanga province [32], although the evolutionary relationship of leopards in the area remains uncertain. While one study refers to these clades as the East and West Mpumalanga clades and points to the presence of a historic dispersal barrier across the Drakensberg escarpment [32], others suggest that the observed pattern may be the remnant of a historical event that took place on a much larger geographic scale [18, 25].

Due to the geological diversity of South Africa, it would be interesting to further investigate the evolutionary history of leopards in this area in response to past climatic events. For instance, during the Pliocene and Pleistocene, savanna habitat contracted, and aridification in the Namib and Kalahari deserts increased, which promoted genetic divergence between refugia of many savanna-adapted species [33]. High levels of endemism in the Kalahari-Namibian-Cape region suggest that this area may have been isolated for extended periods [34]. This may have resulted in the origin of the SA mitochondrial clade found in this region [18]. Alternatively, it may also result from dispersal limitations associated with being located at the edge of a species’ range [35]. This study is aimed at assembling, annotating, and characterizing nine mitogenomes of South African leopards and at reconstructing the phylogenetic relationship of these individuals compared to other populations across the globe. This includes a sample of the rare strawberry leopard that occurs in Mpumalanga. By assembling the first leopard mitogenomes from South Africa and reconstructing a more accurate taxonomic status of the unique population from Mpumalanga province, our results can help to build the baseline for future phylogenetic studies [36]. This may contribute further in the conservation of the species. To our knowledge, our study is the first to describe the distinction of the central and southern clades in South Africa that have since come into secondary contact.

2. Methods

2.1. Sample Collection and DNA Extraction

We collected nine samples from various locations in the Mpumalanga province, South Africa (Figure 1), including Middelburg (Mid), Lydenburg (Lyd), Sabi Sands (Sab), Malelane (Mal), Skukuza (Sku), Piet Retief (Pie), Loskop Dam (Los), and Makubu (Mak) (Table S1). We also collected a sample of a regional colour morph, referred to as the “red” strawberry leopard (Red). Tissue samples were opportunistically collected from dead leopards, commonly from a roadkill. The samples were moved to the University of Johannesburg Wildlife Genomics laboratory and stored at −20°C before DNA extraction. DNA was isolated using the QIAgen Blood and Tissue kit (Qiagen, Valencia, CA, United States) applying the support protocol for tissue samples. Double-stranded DNA was quantified using QuBit®, and samples were run on an agarose gel to assess possible DNA fragmentation visually. DNA extractions of molecular  kb were then sequenced on an Illumina HiSeq 2500 platform by a commercial provider using paired-end ( bp) chemistry following the manufacturer’s instruction.

2.2. Mitogenome Assembly and Gene Annotation

Before mitogenome assembly, sequencing adapters were removed from raw sequences using Trimmomatic v.0.39 [37]. We used NOVOPlasty v.4.3.1 [38] to assemble mitogenomes de novo from each individual. The assembly was performed using a kmer size of 33, and COI sequences from the publicly available leopard reference mitogenome (GenBank accession number NC010641, [39]) were used as the initiating seed. The assembled mitogenomes were annotated using MITOS Web Server (http://mitos2.bioinf.uni-leipzig.de/index.py, accessed on 12 May 2023). The exact locations of start and stop codons for each mitogenomic feature were further adjusted in MEGA v.11.0 [40]. When the annotation pipeline identified a noncanonical start codon or an incomplete stop codon, the genes’ boundaries were determined by aligning the corresponding gene from the reference genome. The average number of variable sites was measured using MEGA [40], including standard deviation (SD). Finally, we used MitoZ v.3.6 [41] to draw mitochondrial genome maps.

2.3. Phylogenetic Reconstruction

We reconstructed the phylogenetic relationship between leopards using two different phylogenetic approaches and two separate datasets. In the first dataset, nine Mpumalanga leopards in combination with eight individuals from NCBI GenBank were used, of publicly available full mitogenomes from the Russian Far East and China (Supplementary Table S1), to date the South African clades more accurately. The lion, Panthera leo (NC028302), was used as an outgroup. In the second dataset, 22 partial leopard mitogenomes were added, as well as the tiger Panthera tigris (NC010642) as additional outgroup (Table S2). The second dataset included nine mitogenomes that were assembled from historical museum specimens (collected during ~1800s) across Africa, and seven ancient specimens (Pleistocene) from Europe [2]. The reason for having a separate dataset for historical mitogenomes is that constructing the mitogenome from museum samples brings many analytical challenges, such as non-target DNA contamination, degradation, and potential sequencing errors. Furthermore, the exact origin of specimens is occasionally unknown (see e.g. [42, 43]), so we considered it necessary to incorporate these inevitable limitations when interpreting the results. Both datasets were aligned separately using MAFFT v.7 [44] before phylogenetic reconstruction. In the first method, we applied a Bayesian phylogenetic inference in BEAST2 v.2 [45], and in the second method, a maximum likelihood approach in PhyML v.3.0 [46] with 1000 bootstrap repetitions. We ran both methods on two datasets for 13 protein-coding genes and complete sequences separately, resulting in a total of eight phylogenetic trees. The trees were visualised in Figtree v.1.4 [47].

2.4. Bayesian Phylogenetic Reconstruction

For the Bayesian phylogenetic reconstruction, the optimal sequence substitution model for each gene was predicted using the Bayesian phylogenetic site model averaging method implemented in BModelTest package [48]. Similarly, the marginal likelihood of the three different molecular clock models, which are strict, relax log normal, and random local, were estimated using a series of path-sampling analysis in the BEAST2 package, with 95% highest posterior density (HPD). The best model for the evolution of the molecular clock based on the estimated Bayes factor was selected for the downstream analysis. To calibrate the phylogenetic tree, we defined two additional monophyletic constraints. For the first constraint, a log normal prior distribution that includes the timing of the split between the ancestors of the lion and tiger (~6.8 MYA, 95% CI 4.8-17.4) was selected [49, 50]. In the second constraint, the prior distribution for the split between all current leopard populations from their most recent common ancestor (MRCA) was set as the timing of the split between African and Eurasian leopards (mean~710 Ka) estimated in earlier studies [2, 37, 51].

We ran the BEAST2 package for ten independent chains, each with 500 million iterations with an initial 150 million as burn-in. The trace file for each chain was first individually inspected for convergence using Tracer v.1.7 [52], after which all trace files were combined in Log Combiner v.2.1 [45] and convergence between different chains verified. A maximum clade credibility tree using a median height and 40% burn-in steps was reconstructed in TreeAnnotator [53] and visualised in Figtree v.1.4 [47]. In the second method, we used the Reverse-Jump Based (RB) method implemented in BEAST2. For this purpose, the 32 complete mitogenomes were first reoriented in the same direction before being aligned using MAFFT v.7 aligner [54]. We removed low-quality segments of the multiple alignments using Gblocks v.0.91 [55]. Briefly, in this method, the mitogenome is partitioned into k consecutive parts, and the boundaries of these partitions as well as the optimal substitution model for each partition sample are estimated during the MCMC steps [56]. The phylogenetic tree’s prior constraints and the calibration ranges remained similar to the first method.

2.5. Haplotype Network of NADH 5 and Cytochrome B

Complete mitogenomes from African specimens in public databases are rare, and these low numbers of sequences do not reflect the complete distribution range of the species across the continent. Our search of the public databases showed that DNA sequences from two particular genes, NADH dehydrogenase subunit 5 (NADH 5) and cytochrome B (CYTB), are comparatively more abundant and cover the widest distribution range of leopards in Africa. As a result, we retrieved a total of 127 CYTB and NADH 5 (Table S3) sequences from Tensen et al. [28], Ropiquet et al. [23], Anco et al. [18], and Paijmans et al. [2], which were concatenated to reconstruct a haplotype network. We used PopArt v1.7 [57] to produce a median-joining haplotype network [58] and identify the number of mutation steps between haplotypes. Base pairs with undefined states (genotyping errors) were masked, and sequences containing significantly more undefined states than others were removed. A separate haplotype network was built that included sequences with many undefined states and excluded sequences from Tensen et al. [28], who generated short sequences from northern South Africa, thereby limiting the sequence length of the network analysis.

3. Results

3.1. Genome Structure

The sequencing run resulted in an average of 79 million sequences per individual, with an average GC content of 42%. NOVOPlasty assembled a single circular genome from all nine individuals (Figure 2). The annotation pipeline identified a total of 13 protein-coding genes, 22 transfer RNA (tRNA), two ribosomal RNA (rRNA), and one misc-feature region (putative control region) as is typical of all mammals. When the nine mitogenomes were annotated to the reference genome, we found an average of 15,921 conserved sites (95.5%) and a total of 507 (SD 10) single nucleotide polymorphisms (SNPs), of which 436 (SD 27) were placed in protein-coding genes (PCGs). Little variation existed within the SA clade, with only 2-14 pairwise differences between individuals, whereas variation in the CA clade was considerably higher (2-178 SNPs). Among leopards of the CA clade, the strawberry leopard showed a comparatively high number of variable sites (), of which 122 were in PCGs. The pairwise distance between the SA clade (Mid, Los, Mak) and the remaining leopards belonging to the CA clade (Mal, Sku, Pie, Sab, Lyd, Red), was on average 0.020 (0.017-0.023), equivalent to 2%, which is larger than the average pairwise distance among Amur and North Chinese leopards (0.005-0.014). When comparing SA and CA leopards to the reference genome, no significant difference in base pair composition (31% A, 28% C, 14% G, and 28% T) that could point to mitogenome assembly artifacts was observed, even though such a presence cannot be conclusively dismissed.

3.2. Bayesian Phylogenetic Reconstruction

In both phylogenetic methods, the trace file of each individual run as well of that of combined chains resulted in a very similar value and had a very high effective sample size (), confirming that the probability space has been effectively searched. The Bayesian phylogenetic analysis based on the 13 protein-coding genes and the whole mitogenome supported an uncorrelated relax molecular clock as the most likely model of the clock evolution in the phylogenetic tree. However, both the coefficient of variation and the standard deviation of the clock rate in log space across different branches were bigger than one, making the posterior sampling of both the clock rate and divergence time challenging. As a consequence, we repeated the Bayesian phylogenetic analysis using a strict molecular clock. The topology of the tree and the divergence analysis remained unchanged, pointing that the tree topology was mainly robust to the selection of clock model. Because the results based on 13 PCGs and full mitogenomes yielded identical tree topologies, we only presented the results for 13 PCGs.

The phylogenetic tree that did not include the historical museum specimens identified two putative clades within South Africa, a southern and central clade (Figure 3(a)). The strawberry leopard clustered with the central clade, although it had a unique haplotype terminal to the other haplotypes. The two clades were more genetically distinct than the two subspecies in Asia. The divergence time between Amur and North Chinese subspecies was 0.22-0.68 mya and between the two African clades was 0.56-0.87 mya. The African clades also diverged in an earlier geographic time period than the Asian subspecies. The split between the putatively distinct clades from South Africa was dated back to 0.8 mya (95% ). However, when including the genes annotated from historical partial mitogenomes of leopards from Africa, a slightly different picture emerged (Figure 3(b)). The Southern African clade (SA) was still visible but clustered with an old specimen from Burundi (CA). The rest of Central Africa appeared to be a single lineage with no apparent population structure across the landscape. Algeria unexpectedly clustered with a leopard from South Africa. The mitogenomic phylogeny based on the ML method also identified the CA and SA clades within South Africa (Figure S1). The ML branch length estimation of the phylogenetic tree showed that the two African clades were 2% divergent, with a large number of variable sites () placed in PCGs.

3.3. Haplotype Network

To contextualize our samples with prior studies, we constructed a haplotype network based on concatenated CYTB and NADH5 sequences. The network () also illustrated the divergence between the CA clade and SA clade, with 7 mutational steps between them, which were almost as deep as the Africa/Asia divide (Figure 4). Leopards from the Mpumalanga province, Eastern and Western Cape, Kruger NP, and Mozambique were present in both clades. The Waterberg and northern Limpopo samples did not appear to be admixed in the network but instead shared some haplotypes with the CA clade. A geographic divide was not apparent from partial mitogenomes retrieved from ancient leopard samples from Africa when looking at both the phylogenetic tree (Figure 3(b)) and the haplotype network (Figure 4), due to the unexpected position of a sample from Burundi. Results were consistent with a haplotype network that included all historic samples and excluded sequences from Tensen et al. [28], thereby increasing the sequence length from 492 to 1736 base pairs. In this network, the SA and CA clades differed by 9 (NAHD5) to 11 (CYTB) mutational steps (Figure S2).

4. Discussion

In this study, we sequenced the first complete mitogenomes of African leopards. They originate from South Africa, where according to our data, two unique clades occur. Our results show a deep divergence between the two clades, which dated back to approximately 0.8 million years ago. We hypothesize that these clades originated during the climatically unstable Mid-Pleistocene, as seen in other savanna-dwelling mammals [33, 34, 5961]. Thus, the genetic clades seen in Mpumalanga province are not the result of a regional, historical dispersal barrier, per Morris et al. [32], but instead remnants of past climatic events that took place on a larger, continental scale [18]. As such, leopards likely reflect the unique climatic history of southern Africa, which has resulted in past isolation of the Cape region and subsequent dispersal, causing eminent and endemic genetic diversity. Genomic nuclear data from our study region are needed to confirm any phylogenetic hypotheses.

This is the first mitogenomic study on African leopards that confirms the existence of two deeply diverged mitochondrial clades. The Southern African clade (SA) is restricted to South Africa and Mozambique, whereas the Central African clade (CA) extends far into West-Central and East Africa [18]. Thus, the admixture in Mpumalanga may illustrate that the separately evolved female clades have since come into secondary contact across a vast area. The two clades of African leopard have previously been found by Uphyrkina et al. [25] and Ropiquet et al. [23], using mitochondrial sequences, in which the latter study also showed admixing in Kruger NP and Mozambique, in the area bordering Mpumalanga. Microsatellite-based analyses could not resolve the diverged clades [23, 32, 62]. Nevertheless, a few neutral microsatellite markers are not representative of genome-wide variation, which is subject to natural selection [63]. Furthermore, due to hybridization and recombination, we expect that introgression has led to homogenization of the diverged lineages on the nuclear level [3]. One study that applied a whole genome approach to study leopard population genetic structure has not yet included South African leopards but did find a lack of deep divergence on the continent [19]. Another mitogenomic study only included one historic sample from Durban and illustrated a particularly high heterozygosity [2], which could potentially be a sign of hybridization [64].

Based on the calculated divergence time, the SA clade likely evolved due to climatic shifts during the Mid-Pleistocene, which lasted from 1.6 to 0.52 mya. Bearing in mind that mitochondrial DNA holds the risk of overestimating divergence times [65], our results support previous studies on genetic diversification worldwide [66, 67]. In sub-Saharan Africa, the Pleistocene was marked by cool and dry cycles alternating with warm and wet cycles, which drove repeated expansions and contractions of savanna grasslands [68]. The Pleistocene refuge theory states that maintaining habitat patches in South Africa enabled large mammal persistence through repeated ice ages and promoted the divergence between populations [69]. Southwest Botswana and northwest South Africa underwent a period of intense aridification from about 2.5 to 1.7 mya, during which sand dunes were repositioned across the Makgadikgadi Palaeo-lake [70]. We expect that this caused the historical isolation of the Kalahari-Namibian-Cape region (Figure 1), which has been well-documented in other African taxa [33, 71]. Although leopards are extremely versatile habitat generalists with large ranging habits [2], their prey has a more patchy distribution due to specific habitat requirements, such as savanna woodlands and the availability of permanent water sources [60]. Several studies have documented similar spatial genetic patterns among African ungulates as a result of Pleistocene climatic shifts (see [69, 7274]).

The coalescence time between African and Asian leopard mitochondrial clades was ~0.9 mya, similar to previous mitochondrial studies [20, 25]. When comparing South African leopard mitogenomes with Asian conspecifics, it became evident that African leopards exhibit a level of genetic differentiation comparable to, or even more extensive than, Amur and North Chinese subspecies, which has previously been suggested [18]. Because a subspecies is traditionally defined as a “phenotypically distinct and discrete geographic population that differs in heritable genetic variation” [75], we consider the diverged clades not to represent subspecies, although caution must be given to the fact that our sample size is small. Leopards within the Mpumalanga province (whether part of the SA or CA clade) are not phenotypically distinct, and they are admixed across a sizeable hybrid zone that extends well into Mozambique and most likely across the Northern Cape [33]. Nevertheless, the delimitation of a subspecies is still widely debated, and it is now acknowledged that subspecies can live in sympatry with ongoing admixture [76]. Our data mainly highlights that leopards did not maintain low population differentiation and lack of migration barriers across Africa during the Pleistocene, as suggested by Paijmans et al. [20] and Pečnerová et al. [19], but contain deeply diverged clades of maternal lineages. The divergence between the SA and CA was calculated to be 2%, which is higher than what was measured for the Amur and North Chinese subspecies (1.1%). Therefore, it is justified to refer to the South African leopard as a unique genetic entity [77], even though they are currently part of the continental metapopulation [78].

We also report on the first mitogenome of the strawberry leopard, which is presumed to be unique to South Africa, and their phylogenetic relationships to wild-type leopards. The coat dilution is caused by a recessive mutation in TYRP1 [22], which downregulates the tyrosinase (TYR) protein-coding gene, better known as the albino gene, causing the loss of pigments in cats [79]. This colour morph has recently started to prevail in northern provinces, which was thought to be a result of a local reduction in effective population size [22]. The recessively inherited nature of the TYRP1 mutation means that increased relatedness can cause an increased expression of this phenotype [80]. The strawberry leopard can offer a valuable experimental platform for phenotype-genotype-assisted research and theoretical studies on pigmentation mechanisms in the wild [81]. Interestingly, we found that the individual was quite distinct from other leopards, positioned at the base of the CA clade. Mitochondrial lineages are normally incongruent with nuclear candidate genes that relate to the occurrence of colour morphs [82]. Nonetheless, colour morphs often speciate more rapidly, as a result of niche segregation and assortative mating, and can give rise to monomorphic daughter species [83]. Furthermore, polymorphic species are more often ancestral in phylogenetic trees, consistent with our results and seen in multiple taxa [8486], which grants for further research. Therefore, the mitogenomic data from this study could be a fundamental basis to address genetic identity and diversity of this rare, naturally occurring colour morph.

Surprisingly, we did not see clear a subspecies distinction in Amur and North Chinese leopards using mitogenomes, which could mean several things. First, the NCBI GenBank database does not always provide a sample location or subspecies delineation, which means that some samples may have been mislabelled. Secondly, the two subspecies occurring in northern China are not as distinct as we considered them to be. The latter would be supported by the minor differentiation among any of the Asian samples, regardless of their origin, and by the clustering of subspecies that were clearly labelled as belonging to either the North Chinese leopard P. p. japonensis or Amur leopard P. p. orientalis. The first study on phylogeographic structure in leopards [87] already assigned these to the same East Asian group and proposed that only leopards in Central Asia are distinct (P. p. saxicolor and P. p. sindica). They mentioned that further sampling is required for subspecies recognition in this region. The genetic study that followed [25] has suggested subspecies status of North Chinese leopards from combined NADH 5 (611 bp) and control region (116 bp) sequences. However, the latter study also concluded that they could not be significantly differentiated using microsatellites. Our mitogenomic data seems to confirm this. More exhaustive sampling and nuclear genomic analysis would be required for conclusive insights. That leopards in Northern China and the Amur region show lower divergence than previously assumed which might be beneficial to their conservation. Even though conservation management would normally allow for the protection of historical differences among unique lineages, as well as adaptive differences that may be present across the species range [88], there has been an increasing acknowledgment that hybridization could also benefit species when their conservation status is dire [89].

When looking at other historical samples, one unexpected result in our phylogenetic analysis is that SA leopards cluster with Burundi in Central-East Africa and Algeria in Northern Africa. With regards to Algeria, no obvious ecological scenario will be able to explain this observation, which leads us to look into other directions, such as sequence errors, possible DNA contamination, and the authenticity of the specimens. For instance, leopard skins have been traded for several decennia, which might have resulted in discrepancies in museum collections [90]. The sample originated from the Museum of Denmark and was collected in 1850. When mapped to the reference genome, 1514 base pairs were missing, and 131 base pairs were unambiguous, which is indicative of some level of DNA degradation. Although basic steps were taken to avoid sequencing artifacts, such as 2X serial captures, we cannot verify whether contamination has taken place, which is common in museum collections [42, 43]. Alternatively, the unexpected clustering could be explained by ancestral polymorphism, which relates to an acceleration of the molecular clock and long-term balancing selection [91], which can be accelerated in mtDNA, thus leading to molecular clock mismatches with the nuclear genome [92]. This can change mutation rates on a short timescale, causing unexpected spatial patterns of genetic diversity, commonly during the segregation of lineages in the ancestral population [93]. Based on the leopard’s genomic diversity and basal position of Burundi, it was suggested that Northwest and East Africa form the potential origin of leopards [2]. It is, therefore, highly interesting that this sample clusters with our samples from the SA clade. Considering the presence of leopards for at least 2 million years in East Africa, it is possible that the SA clade was once part of this ancestral population and split afterward, with the loss of intermediate lineages and accelerated speciation [2]. Further research is required, preferably using modern samples from areas surrounding Burundi. More importantly, we still consider our results strong enough to support the existence of the Southern African clade.

Our data on African leopards illustrates that unique evolutionary histories can be maintained over large geographic timescales. Speciation has long been assumed to require geographic barriers to gene flow, but there has been growing recognition that adaptive radiation can occur under sympatric conditions as well [94]. Although phylogenetic clades may have originated in response to past climatic events and temporal fluctuations in population size, leopards can cross hundreds of kilometres through unsuitable habitat and human-modified landscapes [95], and dispersal is currently considered to occur across South Africa [77], primarily due to long-distanced male dispersal [78]. Therefore, despite their distinct evolutionary histories, we consider the clades to be part of the same, interconnected metapopulation, which do not require separate conservation approaches [77]. We expect that hybridization of the African leopard clades has been occurring since the Holocene, which is likely to have caused significant homogenization in the nuclear genome [96]. Nevertheless, every genome contains some “barrier loci” that seem to resist the homogenizing effects of gene flow [97], due to which we still expect to find remnants of the CA and SA division when a whole genome approach is applied. This would be particularly interesting for the Cape leopard and strawberry leopard, which are morphologically distinct.

In conclusion, our results show the unique history of South African leopards and support the hypothesis of a southern glacial refuge during the Pleistocene, after which they likely spread northeast during the Holocene and came into secondary contact with the Central African leopard clade. Although the mitochondrial clades are deeply diverged and date back to approximately 0.8 million years ago, a whole genome approach is advised to confirm this hypothesis.

Data Availability

The sequence data have been made available via the online repository NCBI GenBank under accession numbers OR777682, OR817738-41, and PP420034-36.

Ethical Approval

Ethical clearance was granted by the University of Johannesburg (2023-02-03/Tensen). Samples were collected as per Employee No. 151051, where under Section 87, subsections c to f, the Board allows appointed staff to catch, move, and do research on any wild animal. The samples were brought to the University of Johannesburg under TOPS permit number MPB.5862 from Mpumalanga Tourism & Park Agency and O-3264 from Gauteng Province. No animal was exclusively killed for this study.

Disclosure

A preprint of this manuscript is available at doi:10.21203/rs.3.rs-3258041/v1.

Conflicts of Interest

The authors declare no conflict of interests.

Authors’ Contributions

The study conception and design were done by L.T. Sample collection was performed by L.S. and G.C. Data collection and analysis were performed by L.T., A.E.K., and A.K. The first draft of the manuscript was written by L.T. The project was supervised by K.F. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Acknowledgments

We thank Professor Peter Teske and John Power for their insights and professional advice.

Supplementary Materials

Table S1: dataset 1 of full mitochondrial genomes of leopard Panthera pardus collected in the Mpumalanga province (), South Africa, and from the NCBI GenBank database (). These samples were used to conduct Bayesian and maximum likelihood phylogenetic inferences. Table S2: dataset 2 of partial mitochondrial genomes of leopard Panthera pardus collected in the Mpumalanga province (), South Africa, and from the NCBI Genbank database (). The lion Panthera leo and tiger Panthera tigris were used as outgroup taxa. These samples were used to conduct Bayesian and maximum likelihood phylogenetic inferences. Table S3: additional NAHD5 and cytochrome B sequences of leopards Panthera pardus from Southern Africa used to construct median-joining haplotype networks. n.a. = not applicable. KZN = Kwazulu-Natal (province). Figure S1: phylogenetic relationships of African and Asian leopards. The trees show a consensus maximum likelihood (ML) phylogeny with the most common recent ancestor (MRCA) tree overlaid, inferred from a mitogenome dataset of 13 protein-coding genes. Numbers on the nodes correspond to ML divergence times measured with the Bayesian analysis method, of which all values >0.001 are illustrated. All nodes that had bootstrap values of <100% are also illustrated. (a) Phylogenetic tree of all available complete mitogenomes: 9 leopards from this study and 32 sequences from NCBI GenBank. (b) Phylogenetic tree of partial mitogenomes, including ancient DNA samples from Africa. The lion (Panthera leo) and tiger (P. tigris) were used as outgroup. Figure S2: median-joining haplotype network of leopards Panthera pardus from South Africa, based on concatenated NADH5 and cytochrome B mitochondrial sequences (1736 bp). Circles represent unique haplotypes, circle sizes are relative to the sample size, and the numbers on the branches show the number of mutational steps when >1. Dots positioned on the nodes represent lost or nonsampled haplotypes. (Supplementary Materials)