Complete Mitochondrial Genome of Contracaecum Sp. (Nematoda: Ascarididae) from Night Herons in China

Abstract Contracaecum species are zooparasitic anisakid nematodes and occur in gastrointestinal tracts of vertebrate/invertebrate animals, including humans, causing gastrointestinal pain, diarrhea, and increasingly severe vomiting. Although the complete mitochondrial (mt) genome (mitogenome) of Contracaecum sp. isolated from night herons in Beijing has been reported, the detailed information about this mt sequence is still puzzling. In the present study, we described the detailed characteristics across the complete mt DNA of Contracaecum sp., which includes 36 genes consisting of 12 protein genes, 22 transfer RNAs (tRNAs), 2 ribosomal RNAs (rRNAs), and 2 noncoding regions (NCRs), and all genes have the same orientation of transcription. The AT content in the complete mitogenome of Contracaecum sp. was 72.2%, and it was the least value (66.7%) in the cox1 gene but was the highest rate (84.1%) in NCRs. The highest nucleotide diversity (Pi) among the genus Contracaecum was nad4 (0.190) and the least was cox1 (0.125), which indicates that nad4 might have the potential ability as useful markers to detect cryptic species in the genus Contracaecum or subspecies. Based on the maximum likelihood (ML) and Bayesian inference (BI) computational algorithms within subfamilies Ascaridoidea and Heterakoidea, the results supported that Contracaecum sp. was a new species and the family Ascaridiidae was paraphyletic. The complete mitogenome sequence of Contracaecum sp. supported a clear recognition of Contracaecum species and provided the potential existence of cryptic species in the genus Contracaecum. Our findings would better contribute to the surveillance, molecular epidemiology, and control of Contracaecum.

Contracaecum species are nematodes that parasitize throughout the world, causing severe pathogenic influences on vertebrate and invertebrate animals, including humans   (Zhang et al., 2021). In Australia, the first anisakid nematode was detected in the human body coupled with gastrointestinal pain, diarrhea, and increasingly severe vomiting, although it was not identified at the species level (Shamsi and Butcher, 2011;. Over the years, anisakidosis caused by Contracaecum was diagnosed mainly based on its morphological characteristics. In the past decades, morphological features and molecular sequences of >100 Contracaecum species have been described Zhang et al., 2021). However, it was challenging for nonexperts to distinguish and identify specific helminths based only on morphological features, especially for detecting cryptic species. Studies show that there are considerable differences between cryptic subspecies/species which are isolated from different hosts or geography (Shamsi et al., 2009;Mattiucci et al., 2014;Timi et al., 2014;Liu et al., 2016). The complete mitochondrial (mt) genome (mitogenome) has been evidenced as a useful molecular marker to identify and distinguish different/ similar species between related taxa, even cryptic species, especially for Contracaecum (Mohandas et al., 2014). However, only three Contracaecum species published complete mitogenome sequences, C. ogmorhini, C. osculatum, and C. rudolphii, which made it difficult to detect new/cryptic species within the genus Contracaecum. Although the complete mitogenome of Contracaecum sp., which was collected from black night herons from Beijing, China, has been published (GenBank no. MN892395), the published sequence was not characterized and detailed features were not recorded, which caused inconvenience to use it.
Therefore, in the present study, we aim to (i) reassemble and annotate the complete mitogenome of Contracaecum sp., which was isolated from black herons in Beijing, China, and describe detailed information of this sequence; (ii) based on uploaded annotated sequences of Ascaridoidea and Heterakoidea species, conduct phylogenetic analyses to verify Zhang et al. (2021) hypothesis; and (iii) provide more detailed molecular features and new useful markers for detect cryptic species within genus Contracaecum for successive studies.

Parasites and molecular identification
Helminth specimens were obtained from the digestive tracts of gray and night herons in Beijing Zoo, China. The species were washed with ultrapure water and physiological saline solution, fixed in 75% ethanol, and stored at -40°C. The specimens were preliminarily identified as Contracaecum based on hosts and primary characteristic morphology (Zhang et al., 2021). For additional examination of molecules, the total genomic DNA was extracted using a QIAampÒ DNA Micro Kit as per the manufacturer's instructions. Based on polymerase chain reaction (PCR) amplification of partial cox1 (with primers JB3 -JB4.5) (Bowles et al., 1992;Bowles and McManus, 1994) and ITS (including ITS-1, 5.8S, and ITS-2) (with primers NC5 -NC2) (Newton et al., 1998;Chilton et al., 2001), the worms were recognized at the species level. The obtained ITS sequence was totally matched with published Contracaecum sp. (GenBank no. MW538933~36), and the partial cox1 sequence showed 99.7% identity with Porrocaecum reticulatum (GenBank no. MF113244).

Sequencing, assembling, and annotation
The genomic DNA sample was fragmented to a size of 350 bp. The DNA libraries were sequenced using high-throughput sequencing (HTS) on an Illumina Hiseq 6000 platform (Novogene Co. Ltd., Tianjin, China), and 250-bp paired-end reads were generated. The raw data were obtained and recorded in FASTQ format. Then, the reads with low-quality bases (Phred quality <5) or uncertain reads with repetitive "N" bases were filtered to acquire clean data. The partial cox1 sequence was used as the initial reference to assemble complete mt sequence of Contracaecum sp. using Geneious Prime 2022.0.1 (Kearse et al., 2012). The assembly was operated with the following parameters: (i) minimum overlap within the range of 150 bp to 200 bp; (ii) minimum overlap identity among 98% to 100%; and (iii) maximum gap of 5 bp. The assembled mitogenome was verified by long PCR with designed primers (Table S1 and Fig. S1 in Supplementary Materials).

Nucleotide variation in mtDNA genomes among Contracaecum spp.
Based on available mitogenome sequences of the genus Contracaecum in the NCBI, mt sequences were aligned using Clustal X1.83 to a single alignment dataset, including C. osculatum, C. rudolphii, C. ogmorhini, and Contracaecum sp. The nucleotide diversity of Contracaecum species was computed by DnaSP v5 using sliding windows (Librado and Rozas, 2009). The parameters of the sliding window were followed with 300-bp window length and a default 25bp step site to calculate the nucleotide diversity (Pi or p). Each boundary of protein genes was identified due to mid-point position, and we then graphed nucleotide diversity for 12 protein genes from Contracaecum.

Phylogenetic analyses
A total of 41 mitogenomes of species from families Ascaridoidea and Heterakoidea were applied to analyze phylogeny with outgroups Enterobius vermicularis (GenBank accession no. EU281143) and Wellcomia siamensis (GenBank accession no. NC_016129) ( Table S2 in Supplementary Material). Each amino acid sequence was aligned using a MAFFT computational algorithm (Katoh et al., 2019). The aligned sequences were then concatenated to a single alignment dataset. The ambiguous gaps in the alignment were excluded by Gblocks 0.91b with default parameters "less stringent" (Dereeper et al., 2008). Computational algorithm maximum likelihood (ML) (Guindon et al., 2010) was conducted to perform a phylogenetic tree with the best model "JTT+I+G+F" screened by ProtTest 3.4.2 (Darriba et al., 2011) and applied 1,000 replicates. Bayesian analysis was operated with MrBayes 3.2 (Ronquist et al., 2012), and "GTR + F + G" was selected as the most suitable model by ModelFinder in IQTree v.2.1.3 (Kalyaanamoorthy et al., 2017). Four Markov chains were progressed with 1,000,000 MCMC generations, with sampling analysis tree each 100 generations. The residual trees were calculated with Bayesian posterior probabilities (BPP), burning first 250 trees.

Mitogenome organization and composition
The clean data of Contracaecum sp. are nearly 2 GB with a total of 8,677,194 ´ 2 clean reads for further assembling. The circular mt genome of Contracaecum sp. (GenBank accession: ON149889) assembled was 14,082 bp in size, shorter than that Zhang et al. (2021) published, with 12 PCGs, 22 tRNAs, 2 rRNAs, and 2 noncoding regions (NCRs) ( Table 1 and Fig. 1). A total of 36 genes were transcribed in the forward direction and gene arrangement was recognized as the typical GA3 pattern, which is mostly observed in the worms (Liu et al., 2013). Consistent with previous reports, there was an obvious bias of A + T bases (71.2%). A total of 10 intergenic regions were found among the complete mt genome of Contracaecum sp. ranging from 1 bp to 16 bp (Table 1). One short NCR (122 bp) was located between nad4 and cox1, and one long NCR (691 bp) was placed in tRNA-Ser2 and tRNA-Asn. The values of AT skew were negative from -0.475 (nad6) to -0.111 (NCRs), and inversely, the values of GC-skew were positive with scope 0.226 (nad4) to 0.674 (nad3), suggesting Ts and Gs were more frequently used in the genome.

Protein-coding genes
TTG was the most common initial codon in this study, followed by ATT. TTG was used as the start codon for nine genes (cox1-3, cytb, nad1-4, and nad6) ( Table 1). The rest three PCGs (atp6, nad4L, and nad5) used ATT as the initial codon. Generally, TAG and TAA were common stop codons in metazoans (Hu et al., 2004). In this study, TAA was the most frequent termination among nad6, nad4L, nad4, cytb, and nad2. The genes nad1 and nad3 used TAG as their stop codon. The rest genes used incomplete stop codons T (cox1 and cox3) or TA (atp6, cox2 and nad5), respectively.
A total of 3,422 amino acids were translated by 12 PCGs. TTT (480) was the most common codon used in encoding Phe, followed by GTT (219, Val), TTG (216, Leu), and ATT (214, Ile). Leu (519) and Phe (499) were the most frequent amino acids, while Arg (34) was the least. There was a tendency of Gs and Ts in the same amino acid by comparing the relative synonymous codon usage (RSCU) ( Table 2). The AT content of 12 protein genes ranged from 66.7% (cox1) to 78.9% (nad6) ( Table 3). The values of AT skew ranged from -0.475 (nad6) to -0.373 (cox2), while the values of GC skew were 0.226 (nad4) to 0.674 (nad3), suggesting the bias of T and G bases.

Transfer RNA genes, rRNA genes, and noncoding region
The length of 22 tRNAs ranged from 51 bp (tRNA-Ser1) to 65 bp (tRNA-His). The typical structure of tRNA consisted of one acceptor stem, a dihydrouridine loop (D-loop), an anticodon loop, TYC loop, and related    (Su et al., 2020). However, the TψC loop was always replaced by a TV replacement loop in nematodes. In our study, 16 of 20 tRNAs (excluding tRNA-Ser1 and tRNA-Ser2) lacked a TYC loop, replaced by several nucleotide residues, which compromised the TV replacement loop (Hu et al., 2004). The tRNA-His, tRNA-Ile, and tRNA-Met were observed in a relatively standard cloverleaf structure with a TYC loop, although the latter two (tRNA-Ile and tRNA-Met) lacked DHU stem. The tRNA-Ser1 and tRNA-Ser2 were similar to previous reports with one TYC-loop but lacked D-loop (Su et al., 2020). Ribosomal RNAs of Contracaecum sp. were fixed as a GA3 pattern. The rrnL was located between tRNA-His and nad3 with a size of 959 bp, and the rrnS gene was located between tRNA-Glu and tRNA-Ser2 with a size of 711 bp ( Table 1). The content of A + T for rrnL and rrnS was 75.6% and 70.6%, respectively. There were two NCRs among the mt genome of Contracaecum sp. One short region was placed in nad4 and cox1 with a length of 122 bp, and the long region was situated between tRNA-Ser2 and tRNA-Asn with a length of 691 bp.

Nucleotide variation of genus Contracaecum
Based on aligned nucleotide sequences among species C. osculatum, C. rudolphii, C. ogmorhini, and Contracaecum sp., nucleotide diversities (Pi) were calculated based on the sliding window. The values of Pi ranged from 0.124 to 0.181 by analyzing a window of 300 bp and a default step of 25 bp (Fig. 2). The most variable genes were cytb (0.178), nad2 (0.181), nad4 (0.179), and nad6 (0.172), and the most conserved genes were cox1 (0.124) and cox2 (0.130) in Contracaecum (Fig. 2). Protein genes cox1 and cox2 seemed to be the most stable genes in Contracaecum nematodes with the least variation, which could be used as molecular markers to identify species from Contracaecum. Results also supported that nad2 and nad4 could act as alternative markers among nematodes isolated from different distributions.

Phylogenetic analyses
The present phylogenetic trees were constructed based on the 12 PCGs of 41 available mt genome sequences from the superfamilies Ascaridoidea and Heterakoidea (Table S2 in Supplementary Material). Two phylogenetic trees, both Bayesian inference (BI) and ML, had similar topologies, excluding species within the superfamily Heterakoidea. The topologies of ML and BI phylogenetic trees were highly similar to those of previous studies (Liu et al., 2016;Zhang et al., 2021;Zhao et al., 2021). Contracaecum sp. formed a branch with Contracaecum nematodes, indicating a closer relationship within the genus with strong support (Fig. 3); however, the long distance between Contracaecum sp. and the other three Contracaecum species (C. osculatum, C. rudolphii, and C. ogmorhini) was longer than the branch distance within other anisakid nematodes, which further indicated Contracaecum sp. was a novel species and verified the hypothesis of Zhang et al. (2021) proposed. According to the structure of phylogenetic trees, results supported that the superfamilies Ascaridoidea and Heterakoidea were monophyletic and evidenced families, including Ascarididae, Anisakidae, Heterocheiidae, Toxocaridae,   and Cucullanidae, were monophyletic, consistent with previous studies Zhao et al., 2021). Within the superfamily Ascaridoidea, both BI and ML showed identical topologies. Among the family Ascarididae, the genera Ascaris, Baylisascaris, Toxascaris, and Parascaris had a closer relationship than Ophidascaris, which was similar to Zhou et al. (2021) reported. Based on morphological descriptions, the genus Ophidascaris was classified as a member of the superfamily Ascaridoidea (Pinto et al., 2010), and phylogenetic analyses suggested the genus Ophidascaris was more related to the family Ascaridae. However, the distance between the genus Ophidascaris and other Ascaridae genera was longer, suggesting there was systematic controversy in the Ascaridae. In the present study, the family Ascarididae was closely related to the family Anisakidae (Fig. 3), different from the previous study where the family Ascarididae was closely related to Toxocaridae (Zhou et al., 2021). In addition, results also supported the monophyly of all 5 families and all 11 genera within the superfamily Ascaridoidea with strong support (BPP = 1, Bf >70, Fig. 3), consistent with records (Liu et al., 2016;Zhao et al., 2018). Liu et al. (2013) confirmed that Ascaridia columbae was more related to Ascaridia sp. than A. galli. The phylogenetic analyses in the present study also confirmed this. In ML analysis, the topology showed that A. galli was more related to Heterakis species with high statistical support (Bf = 87), in line with Liu et al. (2016) studied. However, BI analysis presented a totally different topology from that of ML tree. A. galli formed a distinct branch from genera Heterakis and Ascaridia with strong support (BPP = 1), hypothesizing A. galli might be another genus. Results also showed the family Heterakidae was a sister taxon to Ascaridiidae, and phylogenetic analyses (BI and ML) suggested the family Ascaridiidae might be paraphyly.

Conclusion
In the present study, we annotated the complete mitogenome sequence of Contracaecum sp. isolated from night herons and described its characteristics. Based on available mitogenome sequences of Contracaecum species, we also calculated the nucleotide diversity, indicating cox1 and cox2 could be used as effective markers to distinguish and identify other Contracaecum species. Results also supported the hypothesis of Zhang et al. (2021) proposed that Contracaecum sp. was a novel species, and evidenced that families Heterakidae + Ascaridiidae were closely related and all genera and families (excluding genus Ascaridia and family Ascaridiidae) were monophyletic.

Author's Contributions
G-HL and Y-PD conceived and designed the study, and critically revised the manuscript. Y-T provided the sample worms and provided initial identification. Y-PD performed the experiments and analyzed the data. G-HL and Y-PD drafted the manuscript. R-L and H-MW helped in study design, study implementation, and manuscript preparation. All authors read and approved the final manuscript.

Conflict of Interest
None.

Financial Support
This study was supported by the Training Programme for Excellent Young Innovators of Changsha (Grant No. KQ2106044).

Ethical Standards
Not applicable.