Open Access

Nematode Genome Announcement: A Draft Genome for Rice Root-Knot Nematode, Meloidogyne graminicola

Rice is the second most important food crop in the world after corn, based on total production. In 2016, rice was cultivated on 161.1 million ha, and global production was 482 million metric tons (World Rice Statistics, International Rice Research Institute, Manila, Philippines, http://ricestat.irri.org:8080/wrsv3/entrypoint.htm). The rice root-knot nematode, Meloidogyne graminicola, has emerged as a devastating pest of rice in South-East Asia (Dutta et al., 2012; Mantelin et al., 2016), where it is highly damaging under upland, rainfed lowland (Prot et al., 1994), and irrigated (Netscher and Erlan, 1993) cultivation conditions. Severe M. graminicola infection is known to cause 100% damage in rice nurseries. Here, we report the sequencing and assembly of the genome of the M. graminicola IARI strain. This resource should help researchers investigate and understand the unique biology of this nematode and discover new strategies for its management.

Considering the ~30 Mb genome size of M. graminicola predicted by Feulgen densitometry (Lapp and Triantaphyllou, 1972), we planned to generate two libraries of differing insert length, each with ~150× depth of data (~4.5 Gb per library), using paired-end sequencing to achieve a comprehensive assembly. The M. graminicola population was collected from infected rice fields at the Indian Agricultural Research Institute farm, New Delhi, and multiplied from a single egg mass in pots under greenhouse conditions. Freshly hatched second-stage juveniles were used for genomic DNA extraction with the Gentra Puregene Tissue Kit (Cat. No. 158667, Qiagen, Valencia, CA, USA). The short (150–200 bp) and long (300–500 bp) DNA fragments were obtained by diluting 1 µg of genomic DNA in 100 µl nuclease-free water (Ambion, Waltham, MA, USA) and sonicating it in a Bioruptor (Diagenode, Seraing (Ougrée), Belgium) for 20 and 13 pulses, respectively, of 30 sec ON and 30 sec OFF. The resulting fragmented DNA was cleaned using QIAquick columns (Qiagen, Valencia, CA, USA). The size distribution was checked by running an aliquot of the fragmented DNA sample on an Agilent high-sensitivity Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Subsequently, the libraries for whole-genome sequencing were constructed as per the Illumina TruSeq DNA sample preparation guide (Illumina, San Diego, CA, USA). Sequencing was performed on the Illumina GAIIx platform at Genotypic Technology Pvt. Ltd., Bengaluru, India.
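The planning arithmetic above (genome size × target depth = bases required) can be reproduced with a minimal sketch. The function names are illustrative and not part of any tool used in this study; the 100 bp read length is the one reported for the sequencing run, and the genome size is the approximate Feulgen estimate.

```python
import math


def required_bases(genome_size_bp: int, depth: float) -> float:
    """Bases needed to cover a genome of the given size at the given depth."""
    return genome_size_bp * depth


def required_reads(genome_size_bp: int, depth: float, read_length_bp: int) -> int:
    """Reads of a fixed length needed to reach the target depth (rounded up)."""
    return math.ceil(genome_size_bp * depth / read_length_bp)


# Values taken from the text; the genome size is an approximate estimate.
GENOME_SIZE_BP = 30_000_000   # ~30 Mb (Feulgen densitometry)
TARGET_DEPTH = 150            # ~150x per library
N_LIBRARIES = 2               # short- and long-insert libraries
READ_LENGTH_BP = 100          # read length of the sequencing run

per_library = required_bases(GENOME_SIZE_BP, TARGET_DEPTH)
planned_total = per_library * N_LIBRARIES
reads_per_library = required_reads(GENOME_SIZE_BP, TARGET_DEPTH, READ_LENGTH_BP)

print(f"{per_library / 1e9:.1f} Gb per library, {planned_total / 1e9:.1f} Gb in total")
print(f"{reads_per_library:,} reads of {READ_LENGTH_BP} bp per library")
```

With these inputs the sketch gives 4.5 Gb per library and 9 Gb across both libraries.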

A total of ~130 million raw reads, comprising 13 Gb of sequence data, were generated using 100 bp paired-end sequencing. Approximately 120 million high-quality (HQ) reads were obtained from the raw data using NGS QC Toolkit v.2.3.3 (Patel and Jain, 2012). These 120 million HQ reads (~12 Gb) exceeded the 9 Gb expected under our planned strategy. The HQ reads from both the short- and long-insert libraries were used to generate a primary assembly with the Platanus assembler v.1.2.4 (Kajitani et al., 2014), and the resulting contigs were scaffolded with the Platanus scaffolding module to generate a secondary assembly. The secondary assembly was further refined with the Redundans pipeline (Pryszcz and Gabaldón, 2016) to generate the final genome assembly, with a minimum sequence length of 500 bp. Contaminating mitochondrial and bacterial sequences were identified by the NCBI servers and removed from the draft genome assembly prior to submission to NCBI GenBank. The mitochondrial genome was assembled separately from the complete set of HQ reads using the SPAdes assembler (Bankevich et al., 2012) with a coverage cutoff of 500; the available M. graminicola mitochondrial genome sequences (accession nos. HG529223, KJ139963), obtained from GenBank, were provided to SPAdes as trusted contigs. This yielded only four scaffolds. The resulting SPAdes scaffolds were merged using the EMBOSS merger tool (Rice et al., 2000) to construct the full-length mitochondrial genome. The assembled mitochondrial genome was annotated using the MITOS (Bernt et al., 2013) and ARWEN (Laslett and Canbäck, 2008) servers.
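As a rough check on the read yield reported above, the following sketch converts read counts into bases, the fraction of reads retained after quality filtering, and nominal depth. It assumes every read is exactly 100 bp and uses the approximate counts from the text, so the results are estimates only; the helper names are illustrative.

```python
def yield_bases(n_reads: int, read_length_bp: int) -> int:
    """Total bases for a number of reads of fixed length."""
    return n_reads * read_length_bp


def nominal_depth(total_bases: int, genome_size_bp: int) -> float:
    """Average depth implied by a data volume and a genome size."""
    return total_bases / genome_size_bp


# Approximate values from the text.
RAW_READS = 130_000_000
HQ_READS = 120_000_000
READ_LENGTH_BP = 100
PREDICTED_GENOME_BP = 30_000_000   # ~30 Mb Feulgen estimate
PLANNED_BASES = 9_000_000_000      # 2 libraries x ~4.5 Gb

raw_bases = yield_bases(RAW_READS, READ_LENGTH_BP)
hq_bases = yield_bases(HQ_READS, READ_LENGTH_BP)
retained_fraction = HQ_READS / RAW_READS
hq_depth = nominal_depth(hq_bases, PREDICTED_GENOME_BP)
surplus_bases = hq_bases - PLANNED_BASES

print(f"raw: {raw_bases / 1e9:.0f} Gb, HQ: {hq_bases / 1e9:.0f} Gb")
print(f"reads retained after QC: {retained_fraction:.1%}")
print(f"nominal HQ depth on a 30 Mb genome: {hq_depth:.0f}x")
print(f"surplus over plan: {surplus_bases / 1e9:.0f} Gb")
```

Under these assumptions, about 92% of reads pass filtering and the HQ data correspond to roughly 400× nominal depth on a 30 Mb genome, 3 Gb more than planned.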

The final M. graminicola genome assembly was 38.18 Mb in size and comprised 4,304 scaffolds with an average scaffold length of 8.87 kb. The minimum and maximum scaffold lengths were 501 bp and 145 kb, respectively. The N50 and N90 lengths for the final assembly were 20.4 kb and 4.2 kb, respectively. The GC content of the assembled genome was 23.05%, and N's made up 1.88% of the assembly. The Core Eukaryotic Genes Mapping Approach (CEGMA) (Parra et al., 2007) was used to assess the completeness of the M. graminicola genome assembly; out of 248 core eukaryotic genes (CEGs), 209 (84.27%) were found as complete and 225 (90.73%) as partial genes. Protein-coding genes were identified using the GeneMark-ES tool (Borodovsky and McIninch, 1993), which predicted 10,196 protein-coding genes, as compared with 6,712 to 20,317 in other plant-parasitic nematode genomes (summarized in Kikuchi et al., 2017). Functional annotation of the predicted M. graminicola protein-coding genes using OrthoMCL (Li et al., 2003) identified 5,427 proteins that shared high homology with other Meloidogyne spp. In addition, 245 tRNA genes were predicted. The mitochondrial genome sequence of the M. graminicola IARI strain was 19,019 bp long and contained 12 protein-coding genes, 22 tRNA genes, and two ribosomal RNA genes. Based on the mitochondrial genome sequence, the M. graminicola IARI strain appears phylogenetically closer to the M. graminicola strain from the Philippines (HG529223, 20,030 bp, Besnard et al., 2014) than to the Chinese strain (KJ139963, 19,589 bp, Sun et al., 2014).
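For readers unfamiliar with the summary statistics above, the sketch below shows how N50/N90, the mean scaffold length, and the CEGMA percentages can be derived. It is an illustration under stated assumptions, not the software used to produce the reported values: the `toy_lengths` list is hypothetical, whereas the totals (38.18 Mb, 4,304 scaffolds, 209 and 225 of 248 CEGs) are taken from the text.

```python
def nx(lengths, x):
    """Return the Nx length: the scaffold length at which the cumulative
    size of scaffolds, sorted longest first, first reaches x% of the total."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length
    return ordered[-1]


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to 2 decimals."""
    return round(100 * part / whole, 2)


# Hypothetical scaffold lengths, only to demonstrate the Nx definition.
toy_lengths = [100, 80, 60, 40, 20]
print("toy N50:", nx(toy_lengths, 50))
print("toy N90:", nx(toy_lengths, 90))

# Values reported in the text.
ASSEMBLY_SIZE_BP = 38_180_000
N_SCAFFOLDS = 4_304
mean_scaffold_kb = round(ASSEMBLY_SIZE_BP / N_SCAFFOLDS / 1_000, 2)
complete_pct = percent(209, 248)
partial_pct = percent(225, 248)

print("mean scaffold length (kb):", mean_scaffold_kb)
print("CEGMA complete/partial (%):", complete_pct, partial_pct)
```

Applied to the full list of scaffold lengths from the deposited assembly, `nx(lengths, 50)` and `nx(lengths, 90)` would be expected to reproduce the reported N50 and N90, provided the same definition was used.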

The present assembly size (38.18 Mb) deviates from the ~30 Mb predicted by Feulgen densitometry (Lapp and Triantaphyllou, 1972), exceeding it by roughly 27%. Using sequencing technologies that produce longer reads, such as PacBio, or mate-pair sequencing to obtain better genome assemblies, and inbreeding the nematode strain before sequencing to reduce possible heterozygosity, might help resolve the mismatch between the predicted and assembled genome sizes. However, the N50 value, the complete and partial CEG scores, and other genome statistics for our M. graminicola assembly are comparable to those of closely related, published plant-parasitic nematode genomes sequenced using similar platforms (Supplementary Table A1).

A comparison of Meloidogyne graminicola genome information with the published plant-parasitic nematode genomes.

| Sl. no. | Nematode | Sequencing platform | Assembly approach/assembler/assembly pipeline | Assembly size (Mb) | No. of scaffolds | N50 value (kbp) | CEGMA score (complete/partial %, or complete %) | Reference | Year |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Meloidogyne incognita | Sanger, ABI 3730xl DNA analyzer | De novo, Arachne | 86.1 | 2,995 | 62.5 | 77/80.6 | Abad et al. (2008) | 2008 |
| 2 | Meloidogyne hapla | ABI3730, MegaBACE sequence analyzer | De novo, Arachne v2.0.1 | 53.0 | 3,452 | 37.6 | 94.8/96.8 | Opperman et al. (2008) | 2008 |
| 3 | Bursaphelenchus xylophilus | 454 FLX, Illumina GAI | De novo, Newbler v2.3, Velvet v1.0.12, AbacasII, IMAGE, iCORN | 74.6 | 5,527 | 949.8 | 97.6/98.4 | Kikuchi et al. (2011) | 2011 |
| 4 | Meloidogyne floridensis | Illumina HiSeq2000 | De novo, Velvet v1.1.04 | 96.7 | 58,696 | 3.7 | 58.1/77.4 | Lunt et al. (2014) | 2014 |
| 5 | Globodera pallida | ABI 3730 Capillary DNA Analyser, 454 GS-20 and GS-FLX sequencer, Illumina GAIIx | De novo, Celera, Newbler, Abyss v1.2.7, SSPACE v1 | 124.6 | 6,873 | 122 | 74.19/80.65 | Cotton et al. (2014) | 2014 |
| 6 | Pratylenchus coffeae | Roche 454 | De novo, Newbler | 19.7 | 5,821 | 10 | NA | Burke et al. (2015) | 2015 |
| 7 | Globodera rostochiensis | Illumina HiSeq2000 | De novo, SGA v0.9.7, Velvet v1.3.7, SSPACE, Gapfiller | 95.9 | 4,377 | 89 | 93.55/95.56 | Eves-van den Akker et al. (2016) | 2016 |
| 8 | Meloidogyne enterolobii | Illumina L30 | De novo, Platanus | 162.4 | 46,090 | 9.2 | 81/NA | Szitenberg et al. (2017) | 2017 |
| 9 | Meloidogyne floridensis | Illumina SJF1 | De novo, Platanus | 74.9 | 9,134 | 13.2 | 84 | Szitenberg et al. (2017) | 2017 |
| 10 | Meloidogyne incognita | Illumina W1 | De novo, Platanus | 122.1 | 33,735 | 16.4 | 83 | Szitenberg et al. (2017) | 2017 |
| 11 | Globodera ellingtonae | HiSeq, MiSeq, PacBio | De novo, ALLPATHS-LG, PBJelly | 119.1 | 2,248 | 360 | 92/96 | Phillips et al. (2017) | 2017 |
| 12 | Meloidogyne javanica | Illumina VW4 | De novo, Platanus | 142.6 | 34,394 | 14.2 | 90 | Szitenberg et al. (2017) | 2017 |
| 13 | Meloidogyne arenaria | Illumina HarA | De novo, Platanus | 163.7 | 46,509 | 10.5 | 91 | Szitenberg et al. (2017) | 2017 |
| 14 | Ditylenchus destructor | Illumina HiSeq2500, PacBio RSII | De novo, ALLPATHS-LG, SSPACE, PBJelly, Gapfiller | 112 | 1,761 | 570 | 91 | Zheng et al. (2016) | 2016 |
| 15 | Meloidogyne incognita | 454, Illumina HiSeq2000 | De novo, MIRA, SSPACE, GapCloser (SOAPdenovo 2) | 183.5 | 12,091 | 38.6 | 97 | Blanc-Mathieu et al. (2017) | 2017 |
| 16 | Meloidogyne javanica | 454, Illumina HiSeq2000 | De novo, MIRA, SSPACE, GapCloser (SOAPdenovo 2) | 256.3 | 31,341 | 10.4 | 96 | Blanc-Mathieu et al. (2017) | 2017 |
| 17 | Meloidogyne arenaria | 454, Illumina HiSeq2000 | De novo, MIRA, SSPACE, GapCloser (SOAPdenovo 2) | 235.5 | 26,196 | 16.4 | 95 | Blanc-Mathieu et al. (2017) | 2017 |
| 18 | Meloidogyne graminicola | Illumina GAIIx | De novo, Platanus, Redundans | 38.18 | 4,304 | 20.4 | 84.27/90.73 | This study | 2018 |

This draft genome sequence should be useful to researchers working on comparative genomics of Meloidogyne and other tylenchid nematodes, and should enable functional genomics in M. graminicola. We recognize that the present M. graminicola draft genome is incomplete and expect to improve it in the near future. The present assembly will serve as a base for further improvement of the M. graminicola genome sequence.

GenBank accession numbers: The Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession NXFT00000000. The raw DNA sequence data were deposited in GenBank under BioSample no. SAMN04041660, BioProject no. PRJNA411966, and SRX1224028 (long-insert library) and SRX1223928 (short-insert library). The mitochondrial genome was submitted to GenBank under accession no. MG763561.

eISSN:
2640-396X
Language:
English
Publication frequency:
Volume Open
Journal subjects:
Life Sciences, other