De novo transcriptome sequencing and analysis of Anisakis pegreffii (Nematoda: Anisakidae) third-stage and fourth stage larvae
Publié en ligne: 16 avr. 2020
Pages: 1 - 16
Reçu: 26 nov. 2019
DOI: https://doi.org/10.21307/jofnem-2020-041
Mots clés
© 2020 U-Hwa Nam et al., published by Sciendo.
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The family Anisakidae comprises the nematode species whose adult stages can be found in aquatic animals, while the third stage larvae (L3) generally exist in the body cavity, visceral organs and muscles of various fish and squid species. Although humans are not the final hosts of these nematodes, the larval forms of anisakid nematodes, particularly those of the genera
Of genus
Recent impressive progress in genome-wide analyses of socio-economically important nematode parasites helped us to better understand the genetic information of them covering the general biology, host-parasite interaction, pathogenicity and development of drug or vaccine candidates. Moreover, the increase of sequencing data have opened a new era in comparative studies in parasitic nematodes, thereby offering many useful genetic information from different species or closely related species with which to compare pathogenesis, conduct differential diagnosis and a large scale epidemiology (Cantacessi et al., 2015, Lv et al., 2015). Currently, 134 parasitic or free-living nematodes genomes or transcriptomes are available in databases (
Alive APL3 were obtained from chub mackerel (
The collected nematodes were placed in a Petri dish filled with sterile PBS and washed several times to remove any tissue debris. Then, they were observed under a stereomicroscope and actively moving nematodes without any injury were selected for in vitro culture to obtain
Each individual APL3 and APL4 were washed with sterile PBS, then individually placed in a 1.5 ml Eppendorf tube and stored at −80˚C until use. Each larva was identified by PCR-RFLP and subsequent sequencing for the mitochondrial cox2 gene according to the method described elsewhere (D’Amelio et al., 2000; Nadler and Hudspeth, 2000). The caudal end of each larva was cut and used for molecular identification, and the rest of them were used for next generation sequencing. The larvae identified as
Total RNA (1 μg per each sample set) was prepared by homogenizing approximately 50 larvae per each developmental stage with Trizol, following the manufacturer’s protocol (Invitrogen, USA). Prior to mRNA isolation, total RNA samples were treated with DNAse. And the RNA yield and integrity was measured by NanoDrop 1000 (Thermo Scientific, Wilmington, USA) and BioAnalyzer 2100 (Agilent Technology, Santa Clara, USA), respectively.
The TruSeq stranded mRNA sample preparation kit (RS-122-2101, Illumina, San Diego, USA) was used for preparing mRNA sequencing libraries. Poly-A-containing mRNA was purified from 1 μg of total RNA, by using Oligo dT attached magnetic beads. Then the purified mRNA was disrupted into short fragments, and first-stranded cDNAs were synthesized using SuperScript∏— reverse transcriptase (Invitrogen, USA) and random hexamers. cDNA with adaptors ligated to both ends were enriched by PCR. The cDNA library size and quality were electrophoretically evaluated by using the Agilent DNA 1000 kit on a BioAnalyzer 2100. The libraries were subsequently sequenced on an Illumina HiSeq 2500. Image analysis was performed using the HiSeq control software version 2.2.58. Raw data were processed and base calling was performed using the standard Illumina pipeline (CASAVA version 1.8.2 and RTA version 1.18.64).
The complete sequences of two sample sets were subjected to the adaptor and quality trimming by FastQC method. The contigs were obtained by the assembly of preprocessed sequences assembly using Trinity (Haas et al., 2013). Then, they were clustered using the CD-HIT and TGIGL, to obtain the unique genes as a reference transcriptome (Pertea et al., 2003; Li and Godzik, 2006). The preprocessed reads were further mapped to the reference transcriptome using BWA and the expression patterns (i.e. FPKM) of each genes/transcripts were obtained using RSEM method (Li and Durbin, 2010; Li and Dewey, 2011). Normalization of the FPKM values was conducted by edgeR using TCC R package (Sun et al., 2013). The reference transcriptomes were annotated by mapping against to NR database using BLASTx and SwissProt/UniProt database using InterProScan. The protein coding regions were also annotated with TransDecoder (Haas et al., 2013). The complete annotations were merged together to improve the annotations of each genes/transcripts using in-house scripting.
In total, six genomes of nematodes were selected for whole-genome orthologous cluster analysis, i.e.
For DEG analyses, complete FPKM and edgeR scores were taken, and the statistical significance (
DEGs in either APL3 or APL4 were validated by real-time PCR analysis. cDNA targeting highly ranked DEGs in each ASL3 and ASL4 transcriptomes were synthesized by using PrimeScript™ RT reagent Kit with gDNA Eraser(Perfect Real Time; Takara, Japan), according to the manufacturer’s instructions. As a reference gene, GAPDH (Glyceraldehyde-3-phosphate dehydrogenase) gene was selected. Total RNA was extracted from each APL3 and APL4 samples as mentioned above; 1 μl of total RNA extracted using Trizol (Sigma. USA) was mixed with 2 μl of 5 X gDNA Eraser Buffer, 1 μl of gDNA Eraser, 6 μl of RNase-free dH2O, and incubated at 42˚C for 2 min. Then, cDNA was synthesized by adding 4 μl of 5 X PrimeScript™ Buffer 2, 1 μl of PrimeScript™ RT Enzyme Mix I, 1 μl of RT Primer Mix, 4 μl of RNase-free dH2O, and incubated at 37˚C for 15 min.
The selected
Primer information for real-time PCR analysis.
Stage | Gene | Primer sequences (5´ − 3´) | Length (bp) |
---|---|---|---|
APL3 | Acin1 | F: 5´-TGAATGACGAAGAAGGGAGATGA-3´ | 108 |
R: 5´-CAGTGTCAAATGAGAATAGCGTTTC-3´ | |||
SI | F: 5´-GAGGCGATTGCTGGAAACAT-3´ | 111 | |
R: 5´-CTTCGTTGGTTCTTTTTGTCGTT-3´ | |||
lact-2 | F: 5´-CACCCCACCATCCTCTCCTT-3´ | 114 | |
R: 5´-CCGAGTGTATCTGCGAGGAAA-3´ | |||
sptl-2 | F: 5´-GGCAACCAAGAACGAGTGATG-3´ | 108 | |
R: 5´-TCAGCATGGGATTTGCAACA-3´ | |||
Macf1 | F: 5´-ATGCTTGCTCCAGTTCTCCTAAA-3´ | 125 | |
R: 5´-ATGGACAGGACGCTCTTCTTG-3´ | |||
ergic2 | F: 5´-CTGTTGAGTTGATGGCTGGAAA-3´ | 106 | |
R: 5´-ATTTTGGCACCACTGCTTTTATTAG-3´ | |||
mgat4b | F: 5´-AATGCCAGCGATTGAAGAGTTT-3´ | 115 | |
R: 5´-CGCTGGTGTACGCAAATCATA-3´ | |||
B036.11 | F: 5´-TAATGATTGCTGAATGCGTCTGT-3´ | 120 | |
R: 5´-ACACCGAAAGATAACAACGAATACG-3´ | |||
nas-13 | F: 5´-AGCAATAGCAGCAGCGATGA-3´ | 131 | |
R: 5´-GTAGCCAATGCTTTT-3´ | |||
EF-TSMT | F: 5´-TTCTTCTACTCCTCATCGCATCTG-3´ | 100 | |
R: 5´-TTATGCTCCTCCAGTGCTTTTG-3´ | |||
SFXN2 | F: 5´-TTAGAATGGCGTTGAAGCAGTAGTAG-3´ | 130 | |
R: 5´-AGTATCGGTTCTGACCAGTTTTTTG-3´ | |||
hyd | F: 5´-CCATCCAGTGAAGAAGGATTCC-3´ | 116 | |
R: 5´-GAATAGAGCGGTAAGTAGAGCCTTGA-3´ | |||
dhs-27 | F: 5´-ACGATGAACGATTCGGGTAAGT-3´ | 140 | |
R: 5´-GCGTTATCCCCACATTTTCAA-3´ | |||
APL4 | vps-11 | F: 5´-CTCAACACACCACAATACTCATCAAT | 105 |
R: 5´-CAGCAACATCAACATCACAACTCA | |||
EGF1 | F: 5´-CACATCAGCCCAGAAACATACG | 106 | |
R: 5´-ATTTAGACCGTCGCCAGGAA | |||
CLINT1 | F: 5´-AACGCAAGAATAGGCAAACACA-3´ | 113 | |
R: 5´-AGGAGCATCAACAGAAAACAATGAG-3´ | |||
let-805 | F: 5´-CACCGAGCACCGCTATCAA-3´ | 120 | |
R: 5´-GAAGGAGGCATCAAGGAAAGTG-3´ | |||
VIT | F: 5´-AGGCTTATCATTACTTCGTTCACTCA-3´ | 105 | |
R: 5´-GTTTGGTTGTCGGTTCAGGTGTA-3´ | |||
F09E10.7 | F: 5´-CCACGCTCAAAAGCCCATT-3´ | 116 | |
R: 5´-GGACCAGCGGAAGTTCAGAA-3´ | |||
ATP8B4 | F: 5´-CGACACATTCTTCGTCAGCATT-3´ | 135 | |
R: 5´-CCGAACCCAAGCATGAAAA-3´ | |||
pept-1 | F: 5´-TGCTCAAGGAGAAGGAAGTTTACA-3´ | 113 | |
R: 5´-TACCCCACTTCACCGATTCG-3´ | |||
col-34 | F: 5´-GCCGAGTAAGCCACAAAACG-3´ | 121 | |
R: 5´-TACACCCCGTCCACCTTCA-3´ | |||
col-40 | F: TTCAGCAGGTGGGCAGTCA | 98 | |
R: ATGGAACTCCGGGTGAGGAT | |||
SAM-MT | F: 5´-CCATCAAGGCGGCGTTAC-3´ | 93 | |
R: 5´-CAGCATGCCATAGATCCAGTGT-3´ |
To make a list of putative allergens from
In total, 57,831,158 (4,395,168,008 bases) and 61,963,202 (4,709,203,352 bases) raw reads were obtained for APL3 and APL4 by Illumina sequencing. The generated raw reads were deposited in the Sequence Read Archive database of NCBI (accession number: PRJNA602791, PRJNA602795). After cleansing and removing low quality reads (
Preprocessing statistics of
Name | Raw reads | Clean reads | ||
---|---|---|---|---|
Reads | Base pairs | Reads (%) | Base pairs (%) | |
APL3 | 57,831,158 | 4,395,168,008 | 53,790,942 (93.0%) | 4,079,160,122 (92.8%) |
APL4 | 61,963,202 | 4,709,203,352 | 56,726,910 (91.5%) | 4,301,151,997 (91.3%) |
In total, 47, 243 and 43,660 genes were expressed in APL3 and APL4, respectively (> fpkm 1.0). Of these genes, 28,490 and 24,664 genes were novel for APL3 and APL4 (Table 3). The predicted gene number of
Gene expression overview of
Gene (fpkm < 1.0) | ||||
---|---|---|---|---|
Name | Expressed | Known | Novel | Unexpressed |
APL3 | 47,243 | 18,753 | 28,490 | 3,030 |
APL4 | 43,660 | 18,996 | 24,664 | 6,613 |
The number of orthologous transcripts of
The abundance of each gene in APL3 and APL4 transcriptome was calculated; ubiquitin genes (UBB, UBIQP_XENLA) were highly expressed in APL3, while collagen genes (col-34, col-138, col-40) were highly expressed in APL4. Several mitochondrial enzyme genes (COI, COII, COIII) were highly expressed both in APL3 and APL4 (Tables 4 and 5).
Top 20 most abundant unigenes in
No. | ID | Name | Description |
---|---|---|---|
1 | TBIU006603 | COIII | Cytochrome c oxidase subunit 3 |
2 | TBIU013810 | COI | Cytochrome c oxidase subunit 1 |
3 | TBIU015558 | UBB | Polyubiquitin-B |
4 | TBIU017350 | ZK970.7 | Protein ZK970.7 |
5 | TBIU017923 | act-2b | Actin-2 |
6 | TBIU015560 | UBB | Polyubiquitin-B |
7 | TBIU017925 | act-2b | Actin-2 |
8 | TBIU008478 | COII | Cytochrome c oxidase subunit 2 |
9 | TBIU015561 | UBIQP_XENLA | Polyubiquitin |
10 | TBIU002664 | UBIQP_XENLA | Polyubiquitin |
11 | TBIU022916 | HSP70 | Heat shock 70 kDa protein |
12 | TBIU007173 | NAP1L4 | Nucleosome assembly protein 1-like 4 |
13 | TBIU018580 | ART2 | Putative uncharacterized protein ART2 |
14 | TBIU001952 | R09H10.3 | Probable 5-hydroxyisourate hydrolase R09H10.3 |
15 | TBIU003351 | PCCB | Propionyl-CoA carboxylase beta chain, mitochondrial |
16 | TBIU022918 | HSP70 | Heat shock 70 kDa protein |
17 | TBIU030005 | ASP2_ANISI | Serine protease inhibitor 2 |
18 | TBIU005085 | Mmadhc | Methylmalonic aciduria and homocystinuria type D homolog, mitochondrial |
19 | TBIU032617 | CLPH_ONCVO | Calponin homolog OV9M |
20 | TBIU030096 | Mcee | Methylmalonyl-CoA epimerase, mitochondrial |
Top 20 most abundant unigenes in
No. | ID | Name | Description |
---|---|---|---|
1 | TBIU007721 | PI-3 | Major pepsin inhibitor 3 |
2 | TBIU033076 | col-145 | Putative cuticle collagen 145 |
3 | TBIU014455 | col-34 | Cuticle collagen 34 |
4 | TBIU001741 | col-34 | Cuticle collagen 34 |
5 | TBIU018580 | ART2 | Putative uncharacterized protein ART2 |
6 | TBIU006603 | COIII | Cytochrome c oxidase subunit 3 |
7 | TBIU020343 | col-138 | Protein COL-138 |
8 | TBIU020344 | Bm1_54705 | Nematode cuticle collagen N-terminal domain containing protein |
9 | TBIU013810 | COI | Cytochrome c oxidase subunit 1 |
10 | TBIU000598 | col-40 | Cuticle collagen 40 |
11 | TBIU008478 | COII | Cytochrome c oxidase subunit 2 |
12 | TBIU003870 | Y51F10.7 | Protein Y51F10.7 |
13 | TBIU015232 | IXCI_RHIMP | Chymotrypsin-elastase inhibitor ixodidin |
14 | TBIU017350 | ZK970.7 | Protein ZK970.7 |
15 | TBIU001757 | Bm1_23600 | hypothetical protein |
16 | TBIU015233 | IXCI_RHIMP | Chymotrypsin-elastase inhibitor ixodidin |
17 | TBIU006107 | col-182 | Protein COL-182 |
18 | TBIU029263 | CRVP | Venom allergen 3 |
19 | TBIU004632 | col-34 | Cuticle collagen 34 |
20 | TBIU030958 | LGALS8 | Galectin-8 |
Different gene expression profiles between APL3 and APL4 transcriptomes were investigated by DEG analyses. In total, 1,923 genes were up-regulated either in APL3 or APL4. Among them, 614 were up-regulated in APL3 while 1,309 were up-regulated in APL4 (data not shown). The top 20 up-regulated genes either in APL3 or APL4 were listed in Tables 6 and 7.
Top 20 most differentially expressed genes in
No. | Name | Description | Log FC |
|
|
---|---|---|---|---|---|
1 | ergic2 | Endoplasmic reticulum-Golgi intermediate compartment protein 2 | 16.8 | 0.0001179 | 0.247384 |
2 | mgat4b | Alpha-1,3-mannosyl-glycoprotein 4-beta-N-acetylglucosaminyltransferase B | 16.7 | 0.0001274 | 0.247384 |
3 | Acin1 | Apoptotic chromatin condensation inducer in the nucleus | 16.3 | 0.0002229 | 0.255429 |
4 | SFXN2 | Sideroflexin-2 | 16.1 | 0.0003069 | 0.265269 |
5 | ANKHD1 | Ankyrin repeat and KH domain-containing protein 1 | 16.1 | 0.0002901 | 0.265269 |
6 | RALGAPA1 | Ral GTPase-activating protein subunit alpha-1 | 15.9 | 0.0003816 | 0.278843 |
7 | SI | Sucrase-isomaltase, intestinal | 15.9 | 0.0004035 | 0.278843 |
8 | B0361.11 | Putative transporter B0361.11 | 15.9 | 0.0004042 | 0.278843 |
9 | lact-2 | Beta-lactamase domain-containing protein 2 | 15.8 | 0.0004392 | 0.291452 |
10 | sptl-2 | Serine palmitoyltransferase 2 | 15.8 | 0.0004306 | 0.291452 |
11 | hyd | E3 ubiquitin-protein ligase hyd | 15.7 | 0.0005016 | 0.292532 |
12 | Macf1 | Microtubule-actin cross-linking factor 1 | 15.7 | 0.0005216 | 0.292532 |
13 | ZK370.4 | Uncharacterized NTE family protein ZK370.4 | 15.7 | 0.000554 | 0.292532 |
14 | LOAG_00120 | Hypothetical protein | 15.7 | 0.0005268 | 0.292532 |
15 | FZD10 | Frizzled-10 | 15.6 | 0.0005786 | 0.292532 |
16 | cdgs-1 | Phosphatidate cytidylyltransferase | 15.6 | 0.0005679 | 0.292532 |
17 | Pcsk1 | Neuroendocrine convertase 1 | 15.6 | 0.000601 | 0.292532 |
18 | Bm1_49625 | Hypothetical protein | 15.6 | 0.0006287 | 0.292532 |
19 | GLSN_MEDSA | Glutamate synthase [NADH], amyloplastic | 15.6 | 0.0006022 | 0.292532 |
20 | Moe | Moesin/ezrin/radixin homolog 1 | 15.6 | 0.0005762 | 0.292532 |
Top 20 most differentially expressed genes in
No | Name | Description | Log FC | p-value | q-value |
---|---|---|---|---|---|
1 | F09E10.7 | Protein F09E10.7 | 20.7 | 0.0000115 | 0.171197 |
2 | ATP8B4 | Probable phospholipid-transporting ATPase IM | 18.7 | 0.0000172 | 0.171197 |
3 | pept-1 | Peptide transporter family 1 | 18.5 | 0.0000214 | 0.171197 |
4 | vps-11 | Vacuolar protein sorting-associated protein 11 homolog | 18.5 | 0.0000214 | 0.171197 |
5 | col-34 | Cuticle collagen 34 | 18.4 | 0.0000229 | 0.171197 |
6 | EGF1 | Fibropellin-1 | 17.5 | 0.000063 | 0.206261 |
7 | CLINT1 | Clathrin interactor 1 | 17.4 | 0.0000751 | 0.218706 |
8 | let-805 | Protein LET-805 | 17.2 | 0.0000989 | 0.247384 |
9 | Bm1_33530 | Hypothetical protein | 17 | 0.0001257 | 0.247384 |
10 | VIT | Vitrin | 17 | 0.0001194 | 0.247384 |
11 | col-40 | Cuticle collagen 40 | 16.9 | 0.0001524 | 0.255429 |
12 | VIT | Vitrin | 16.7 | 0.0002081 | 0.255429 |
13 | unc-104 | Kinesin-like protein unc-104 | 16.7 | 0.0001838 | 0.255429 |
14 | SAM-MT | Probable fatty acid methyltransferase | 16.7 | 0.000202 | 0.255429 |
15 | sulp-3 | Sulfate permease family protein 3 | 16.6 | 0.0002432 | 0.258155 |
16 | CLINT1 | Clathrin interactor 1 | 16.6 | 0.0002241 | 0.255429 |
17 | Oplah | 5-oxoprolinase | 16.5 | 0.0002634 | 0.265269 |
18 | VIT | Vitrin | 16.4 | 0.0003171 | 0.265269 |
19 | LOAG_09410 | Hypothetical protein | 16.4 | 0.0003238 | 0.265269 |
20 | ZAN | Zonadhesin | 16.4 | 0.0003204 | 0.265269 |
DEGs were subjected to GO analyses. In total 1,184 DEGs were mapped to either of three main categories (cellular component, biological process and molecular functions). GO analysis of the DEG revealed that 14 cellular component, 41 biological process, and 10 molecular function categories (
Figure 1:
GO analysis of DEGs in

The mRNA expression level of the DEGs either in APL3 or APL4 were validated by real-time PCR. For APL3, the top 5 DEGs (Acin1, SI, lact-2, sptl-2, Macf1) were analyzed by real-time PCR, and only one gene (SI) was validated as significantly highly expressed in APL3 (
Figure 2:
The expression level of selected DEGs either in APL3 or in APL4 by real-time PCR. A: the real-time PCR results of five genes with significantly higher expression level in APL3 compared to APL4; B: the real-time PCR results of five genes with significantly higher expression level in APL4 compared to APL3; C and D are 14 additional genes selected to compare the expression levels of ASL3 and APL4 (C: eight gene for APL3 compared to APL4, D: six gene for APL4 compared to APL3). GAPDH genes were used as the reference genes. The results are given as mean of samples (

Comparative analyses of
Mitochondrial enzymes-related (COI, COII, COIII) and polyubiquitin-related genes (UNN, UBIQP_XENLA) were highly expressed in L3, while collagen-related genes (col-34, -145, 138, Bm1_54705) were highly expressed in L4. Mitochondrial cytochrome c oxidase is the key enzyme of aerobic cell respiration in all eukaryotes and many prokaryotes, and different level of this enzyme at different developmental stages of a parasitic nematode
Collagens are ubiquitous structural proteins constituting the cuticle, a multi-layered flexible exoskeleton for protecting nematodes from adverse environmental conditions (Page, 2013). A new cuticle is synthesized for each developmental stage of nematodes, with many enzymes being involved in this process. Large families of the cuticle collagen genes are known in
Differential metabolic profiles among different developmental stages of several parasitic nematodes were shown by transcriptome analysis. A filarial nematode
Allergic reactions are one of the clinical symptoms caused by
While 15 allergens (Ani s 1 to Ani s 14 and Ani s troponin C) have been described from anisakid nematodes (Pomés et al., 2018), 8
There are several classes of proteases in nematodes (e.g. cysteine, serine, aspartic, and metalloproteases), with a variety of functions, and some of them are thought to be well conserved across the nematode species such as molting, embryogenesis or cellular survival. On the other hand, proteases of parasitic nematodes are also involved in diverse adaptive functions including host tissue penetration, larval migration, immune evasion, host tissue digestion, and so forth (Britton, 2013 and the references therein). Different parasitic nematode species and different developmental stages in a certain nematode species may have different repertoire of proteases for adapting themselves to a specific niche, and comparative analyses of these differences would be essential to understand their pathogenesis and develop potential control strategies. For example, transcriptome analysis of a parasitic nematode
While we report comparative transcriptomic analyses of different developmental stages of