Open Access

In Silico Analysis of Novel Titin Non-Synonymous Missense Variants Detected by Targeted Next-Generation Sequencing in a Cohort of Romanian Index Patients with Hypertrophic Cardiomyopathy



INTRODUCTION

Titin, encoded by the titin gene (TTN), is the largest human protein and a key component of the sarcomere, involved in sarcomere assembly, mechano-sensing, and signal transduction1,2,3. Considering its essential role in both the structure and function of the cardiac sarcomere, it is not surprising that TTN mutations are involved in the pathogenesis of various cardiomyopathies. Of note, owing to its vast size, TTN is particularly prone to variation: sequence variants are relatively common in the general population, being found in up to 6% of reference subjects4,5,6,7. Indeed, almost 25,000 unique TTN variants have been identified in the gnomAD database (https://gnomad.broadinstitute.org).

The advent of high-throughput next-generation sequencing (NGS) technology has undeniably enhanced the efficiency of genetic testing for inherited cardiac conditions by enabling the identification of both novel disease-related genes and novel causative variants in already known genes. However, the increased genetic information brings challenges related to the accurate interpretation and classification of the hundreds, or even thousands, of variants detected in an NGS run. Despite the clear standards and guidelines available, a definite clinical classification of sequence variants is not always possible, and the majority of variants detected in cardiogenetic panels are still classified as variants of unknown significance (VUS)8,9. Although VUS reclassification is of paramount importance, with a substantial impact on the clinical management of patients and their relatives, it is hindered by the time and costs required to gain additional evidence, such as allele segregation within large pedigrees or in vivo and/or in vitro functional assessment.

Our group recently reported a number of rare variants detected by targeted NGS in core and putative genes associated with hypertrophic cardiomyopathy (HCM) in a cohort of Romanian adult probands10. For the majority of the identified mutations, the clinical significance is yet to be established. In the current study, we focused explicitly on the novel TTN nonsynonymous missense variants identified in our cohort. Herein we present our strategy for prioritizing these variants for subsequent experimental investigation to enable a definite classification.

MATERIAL AND METHODS
Study population

The study was approved by the Ethics Committee of the Clinical Emergency Hospital of Bucharest, and performed in compliance with the principles of the Declaration of Helsinki. Written informed consent was obtained from all participants prior to enrolment.

A total of 45 unrelated adult HCM index patients fulfilling the diagnostic criteria recommended by the European Society of Cardiology (ESC)11 were included in this study. All patients underwent a comprehensive clinical work-up, including personal and family medical history, physical examination, 12-lead electrocardiogram, two-dimensional transthoracic echocardiography, and genetic testing, as previously described10.

Genetic testing

Patients underwent genetic testing following a methodology that has been detailed elsewhere10,12. Briefly, blood samples were collected at enrolment and total DNA was isolated using the MagCore Genomic DNA Whole Blood Kit (RBC Bioscience) following the manufacturer's protocol. Targeted next-generation sequencing (NGS) was performed on an Illumina MiSeq platform using the TruSight Cardio Sequencing Kit (Illumina) according to the manufacturer's instructions. An initial amount of 50 ng of genomic DNA was used for optimal gene enrichment.

Variant assessment

Data files generated during sequencing runs were processed with MiSeq Reporter software (Illumina) to generate FASTQ files and to map reads against the human reference genome (GRCh37) using the Burrows-Wheeler Aligner-Maximal Exact Match (BWA-MEM) algorithm13. Variant calling was performed with the Genome Analysis Toolkit (GATK), and Variant Call Format (VCF) files were further analysed with VariantStudio v3.0 software (Illumina).

A stepwise strategy was used to select and prioritize the candidate variants for further analysis.

First, a filtering approach was used to select TTN protein-coding variants with high-quality calls (PASS filter) and an allele frequency (AF) <0.1% in population databases (1000 Genomes Project, gnomAD, and the Exome Variant Server from the NHLBI Exome Sequencing Project). The cut-off of 0.1% was chosen considering the disease prevalence in the general population (1 in 500 individuals, or 1/1000 chromosomes)14. Of these, only novel (i.e., not previously reported) nonsynonymous missense mutations were retained for downstream analysis.
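The arithmetic behind the 0.1% cut-off and the logic of this first filter can be sketched in a few lines of Python. This is a minimal illustration, not the VariantStudio workflow used in the study: the record fields (`gene`, `filter`, `coding`, `max_af`) and the example records are hypothetical, and the sketch assumes a dominant model in which each affected individual carries a single risk allele.

```python
from fractions import Fraction

# HCM prevalence in the general population, as cited in the text: 1 in 500.
PREVALENCE = Fraction(1, 500)

# Assumption: dominant model, one risk allele per affected (diploid) individual,
# so 1 affected person in 500 corresponds to 1 risk allele in 1000 chromosomes.
AF_CUTOFF = PREVALENCE / 2  # 1/1000 = 0.1%


def passes_first_filter(variant: dict) -> bool:
    """Keep high-quality, rare, protein-coding TTN variants.

    `max_af` is the highest allele frequency across the queried population
    databases, or None when the variant is absent from all of them.
    """
    max_af = variant["max_af"]
    return (
        variant["gene"] == "TTN"
        and variant["filter"] == "PASS"
        and variant["coding"]
        and (max_af is None or max_af < AF_CUTOFF)
    )


# Hypothetical records for illustration only (not data from this study).
examples = [
    {"id": "v1", "gene": "TTN", "filter": "PASS", "coding": True, "max_af": None},
    {"id": "v2", "gene": "TTN", "filter": "PASS", "coding": True, "max_af": 0.004},
    {"id": "v3", "gene": "TTN", "filter": "LowQual", "coding": True, "max_af": 0.0001},
]
kept = [v["id"] for v in examples if passes_first_filter(v)]
print(float(AF_CUTOFF), kept)
```

In this sketch only the first record is kept: the second exceeds the frequency cut-off and the third fails the quality filter.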

Secondly, the Mendelian Clinically Applicable Pathogenicity (M-CAP) score15 was used to predict the pathogenicity of each novel TTN missense variant. Thirdly, the variants predicted to be possibly pathogenic by M-CAP were analysed individually using a dedicated computational tool, TITINdb16. The quotient solvent accessible surface area [Q(SASA)]17 and the mutation Cutoff Scanning Matrix (mCSM) class18 were retrieved; variants predicted to be destabilizing were considered candidate risk variants for subsequent experimental investigation.
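The sequential use of the two predictors can be summarised as a short decision function. The sketch below is illustrative only: the M-CAP cut-off of 0.025 is assumed to be the default suggested by the tool's authors, the mCSM class labels are assumed strings rather than TITINdb's exact output, and the example inputs are placeholders, not scores obtained in this study.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: 0.025 is the default pathogenicity cut-off suggested for M-CAP.
MCAP_THRESHOLD = 0.025


@dataclass
class Prediction:
    variant: str                      # hypothetical label, not a study variant
    mcap_score: float                 # M-CAP score between 0 and 1
    q_sasa: Optional[float] = None    # recorded as supporting evidence only
    mcsm_class: Optional[str] = None  # assumed labels, e.g. "destabilizing"


def prioritize(p: Prediction) -> str:
    """Apply the two prediction steps in sequence and return a label."""
    # Step 2: variants at or below the M-CAP cut-off leave the pipeline.
    if p.mcap_score <= MCAP_THRESHOLD:
        return "likely benign (M-CAP)"
    # Step 3: only M-CAP-positive variants are inspected in TITINdb.
    if p.mcsm_class is not None and "destabilizing" in p.mcsm_class.lower():
        return "candidate"
    return "possibly pathogenic, not prioritized"


# Placeholder inputs for illustration only; these are not scores from the study.
demo = [
    Prediction("example_1", mcap_score=0.010),
    Prediction("example_2", mcap_score=0.200, q_sasa=0.10, mcsm_class="destabilizing"),
    Prediction("example_3", mcap_score=0.200, q_sasa=0.60, mcsm_class="stabilizing"),
]
labels = {p.variant: prioritize(p) for p in demo}
print(labels)
```

Running the cheaper genome-wide classifier first means that the structure-based TITINdb lookup is needed only for the variants that M-CAP does not dismiss.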

Variants were reported using the Human Genome Variation Society standardized nomenclature19. Interpretation of clinical significance followed the joint consensus recommendations of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP), which classify variants into one of five categories: benign (B), likely benign (LB), variant of uncertain significance (VUS), likely pathogenic (LP), or pathogenic (P)20.

Variant databases and in silico tools

We queried the following variant databases (accessed in August 2020): 1000 Genomes Project (https://www.internationalgenome.org/1000-genomes-browsers), the Exome Variant Server from the NHLBI Exome Sequencing Project (ESP) (https://esp.gs.washington.edu/EVS/), NCBI dbSNP (https://www.ncbi.nlm.nih.gov/snp/), the Genome Aggregation Database (gnomAD; http://gnomad.broadinstitute.org), ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/), and the Human Gene Mutation Database (5-day trial license, HGMD Professional 2020.2; http://www.biobase-international.com/).

The in silico tools used in this study were M-CAP (http://bejerano.stanford.edu/mcap/) and TITINdb (http://fraternalilab.kcl.ac.uk/TITINdb/).

Statistical analysis

Data were analysed using SPSS Statistics (version 23.0); results were presented as mean ± standard deviation for continuous variables and n (%) for categorical variables.

RESULTS
Study Population

Forty-five unrelated index patients (33 men and 12 women) with HCM were studied, as previously reported10; genetic testing was inconclusive, owing to either known or novel variants, in 31 cases (68.9%). The mean age at enrolment was 51 years (SD 15.5, range 21 to 87 years). Maximal LV wall thickness was 20.8 ± 5.2 mm (range 15 to 38 mm) in the overall cohort. Summary characteristics of TTN carriers, noncarriers, and the entire cohort are presented in Table 1. Subjects with and without TTN mutations were comparable with respect to age, sex, and echocardiographic findings.

Table 1

General and echocardiographic characteristics of HCM subjects

| Variable | Overall cohort (n = 45) | TTN+ (n = 13) | TTN− (n = 32) | p |
|---|---|---|---|---|
| Age at inclusion, years | 51±15.5 | 54.31±14 | 49.34±15.84 | 0.34 |
| Male sex, n (%) | 33 (73.3%) | 10 (76.9%) | 23 (71.9%) | 0.72 |
| Family history of HCM, n (%) | 7 (15.56%) | 4 (30.77%) | 3 (9.4%) | 0.47 |
| Family history of SCD, n (%) | 14 (31.1%) | 3 (23.1%) | 11 (34.4%) | 0.45 |
| ICD, n (%) | 7 (15.56%) | 3 (23.1%) | 4 (12.5%) | 0.56 |
| Atrial fibrillation, n (%) | 22 (48.9%) | 7 (53.8%) | 15 (46.9%) | 0.67 |
| Maron classification, n (%) | | | | 0.32 |
| Type 1 | 7 (15.56%) | 1 (7.7%) | 6 (18.8%) | |
| Type 2 | 5 (11.1%) | 2 (15.4%) | 3 (9.4%) | |
| Type 3 | 32 (71.1%) | 9 (69.2%) | 23 (71.9%) | |
| Type 4 | 1 (2.2%) | 1 (7.7%) | 0 | |
| Presence of LVOTO, n (%) | 20 (44.44%) | 7 (53.9%) | 13 (40.6%) | 0.53 |
| LV maximal wall thickness, mm | 20.8±5.2 | 20.61±6.58 | 20.69±4.58 | 0.32 |
| LV mass, g | 279.91±90.72 | 297±90.62 | 272.04±91.6 | 0.51 |
| LVEDD, mm | 41.21±7.87 | 42.72±7.49 | 40.59±8.07 | 0.44 |
| LVESD, mm | 25.31±9.77 | 24.18±5.27 | 25.83±111.32 | 0.98 |
| LVEDV, ml | 111.42±39.6 | 125.55±45.8 | 106.12±36.68 | 0.41 |
| LVESV, ml | 55.55±30.13 | 55.69±32.56 | 56.74±29.63 | 0.98 |
| LVEF, % | 57.09±7.64 | 56.35±5.86 | 57.39±7.12 | 0.66 |
| LAD, mm | 40.92±6.86 | 41.09±8.69 | 40.85±6.13 | 0.88 |
| LAV, ml | 90.32±41.75 | 87.5±41.06 | 91.42±42.75 | 0.68 |

HCM, hypertrophic cardiomyopathy; ICD, implantable cardioverter-defibrillator; LAD, left atrium diameter; LAV, left atrium volume; LV, left ventricular; LVEDD, left ventricular end-diastolic diameter; LVEDV, left ventricular end-diastolic volume; LVEF, left ventricular ejection fraction; LVESD, left ventricular end-systolic diameter; LVESV, left ventricular end-systolic volume; LVOTO, left ventricular outflow tract obstruction; SCD, sudden cardiac death; TTN+, titin variant carriers; TTN−, titin variant noncarriers.

Genes and variants

Of the 174 genes covered by the TruSight Cardio Sequencing Kit, only TTN was considered in this analysis. After initial variant calling, a total of 1604 variants passed quality filters and were used for downstream analysis. Subsequent filtering (Figure 1) yielded 17 distinct rare TTN coding variants found in 13 of 45 probands: 1 stop-gain variant, 1 splice-site variant, and 15 variants (the vast majority) annotated as missense.

Figure 1

Bioinformatic filtering strategy to identify variants for further prioritization. From all TTN variants passing quality filters, the coding variants with allele frequency <0.1% in population databases were retrieved. Of these, only nonsynonymous missense variants were retained and manually searched in population databases and repositories. Seven novel TTN nonsynonymous missense variants were selected for prioritization.

All 17 variants were detected in the heterozygous state, and each was identified only once in our database. The mean depth of sequence coverage across target regions was 84× (range 25× to 251×). Of the 15 variants annotated as missense, 3 proved to be synonymous and were discarded from further analysis. The remaining 12 variants were searched extensively in the population databases and public archives listed in the Methods; 7 proved to be novel (i.e., absent from the queried databases and repositories), all located in exons encoding I-band, Z-disk, or near-Z-disk domains, as shown in Table 2.

Table 2

Novel TTN nonsynonymous missense variants detected in our cohort

| Gene | HGVSc | HGVSp | Exon | Region |
|---|---|---|---|---|
| TTN | c.44530G>T | p.Ala14844Ser | 242 | I-band |
| TTN | c.30392G>T | p.Cys10131Phe | 108 | I-band |
| TTN | c.25185G>T | p.Lys8395Asn | 88 | I-band |
| TTN | c.16783G>T | p.Val5595Leu | 58 | I-band |
| TTN | c.11927A>G | p.Lys3976Arg | 49 | I-band |
| TTN | c.2518G>T | p.Ala840Ser | 16 | near Z-disk |
| TTN | c.49G>T | p.Val17Leu | 2 | Z-disk |

The pathogenic potential of the variants was further assessed with two in silico prediction tools used sequentially (Figure 2). Four variants (c.30392G>T, c.25185G>T, c.2518G>T, c.49G>T) were predicted to be possibly pathogenic by M-CAP and were subjected to TITINdb analysis. All mutated residues except p.Ala840Ser (c.2518G>T) were predicted to be buried [Q(SASA) ≤0.3], and the corresponding variants were also predicted to be destabilizing (mCSM class).

Figure 2

Variant prioritization strategy for further experimental investigation. Of the 7 novel TTN nonsynonymous missense variants, 4 were predicted to be possibly pathogenic by M-CAP and entered the second prediction step (TITINdb). Finally, 3 TTN variants were identified as likely function-impacting.

DISCUSSION

While the role of TTN as a causative gene is widely acknowledged for dilated cardiomyopathy (DCM)21,22,23,24,25, its implication in HCM is less well established. In particular, studies have focused on the assessment and clinical interpretation of TTN truncating mutations26, overlooking missense variants even though these are the most commonly observed. Missense variants are also the most challenging in terms of defining clinical significance27, requiring expensive and time-consuming additional work, such as in vivo/in vitro functional studies and sequencing of the identified variants in family members. Hence, limiting the number of mutations to be analysed further, without missing disease-causing ones, is of paramount importance. Herein we present our strategy to prioritize the TTN variants detected in our HCM cohort.

The general characteristics of our study cohort were reported in detail earlier10, with a mean age at enrolment of 51 years and a male predominance. No statistically significant differences were found in terms of clinical presentation or general characteristics between TTN-positive and TTN-negative individuals.

Screening for TTN variation identified 17 TTN coding variants with AF <0.1% in reference populations, the majority being nonsynonymous missense variants (70%, n=12), of which over half (n=7) were novel (Table 2) and were the subject of further in silico analysis. As detailed in our prior study10, all 7 variants were classified as VUS; none of the patients harbouring one of the 7 novel TTN variants carried an LP/P variant in other HCM-associated genes.

In silico prediction is an important step in the assessment of newly detected sequence variants, as it is one of the criteria recommended for variant interpretation by ACMG/AMP20. PolyPhen28, SIFT29, MutationTaster30, CADD31, and PROVEAN32 are the most frequently used algorithms in daily practice, with 79% concordance for pathogenic variants and only 33% for benign variants33. After an extensive review of the computational tools that can identify the missense variants most likely to have a pathogenic effect, we chose the M-CAP score15 as the first-line tool. M-CAP is a highly sensitive pathogenicity classifier for rare missense variants in the human genome, designed to misclassify no more than 5% of pathogenic variants while significantly reducing the number of VUS. As opposed to the aforesaid algorithms, which misclassified 26 to 38% of known pathogenic mutations, M-CAP has been shown to outperform existing methods at all thresholds and correctly dismissed 60% of the rare missense VUS detected in a typical genome at 95% sensitivity15. Three of the 7 TTN variants in our study were predicted to be LB, thus reducing the list of VUS to be analysed further to 4 (57%) (Figure 2). These 4 variants (c.30392G>T, c.25185G>T, c.2518G>T, and c.49G>T) entered TITINdb prediction, with retrieval of Q(SASA) values and mCSM class.

Solvent accessible surface area (SASA) is a critical attribute of proteins for determining their folding and stability. It was defined in the early 1970s as the surface described around a protein by the hypothetical centre of a solvent sphere in contact with the van der Waals surface of the molecule34. Based on SASA values, the amino acid residues of a protein can be classified as “buried” or “exposed”. Laddach et al. showed that disease-associated mutations tend to be located at residues with a significantly lower Q(SASA)16. Indeed, by mapping single amino acid variants (SAVs) onto a curated database of human protein structures, Savojardo and colleagues35 found that disease-related SAVs are less accessible to solvent than those involved in polymorphisms, suggesting that pathogenicity is more frequently associated with buried than with exposed residues.
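To make the buried/exposed distinction concrete, the following Python sketch computes a quotient SASA and applies the 0.3 cut-off used in this study. It rests on an assumption not spelled out in the text: that Q(SASA) is the residue's solvent accessible area in the folded structure divided by the total surface of the isolated residue. The function names and the input areas are hypothetical.

```python
def q_sasa(sasa_in_structure: float, surface_isolated: float) -> float:
    """Quotient SASA: exposed area divided by the residue's total surface.

    Assumption: the reference is the surface of the isolated residue, so the
    result lies between 0 (fully buried) and 1 (fully exposed).
    """
    if surface_isolated <= 0:
        raise ValueError("reference surface must be positive")
    return sasa_in_structure / surface_isolated


def burial_class(q: float, cutoff: float = 0.3) -> str:
    """Label a residue 'buried' at or below the cutoff (0.3 in this study)."""
    return "buried" if q <= cutoff else "exposed"


# Hypothetical areas in square angstroms, for illustration only.
print(burial_class(q_sasa(20.0, 200.0)))   # quotient 0.1
print(burial_class(q_sasa(120.0, 200.0)))  # quotient 0.6
```

Expressing accessibility as a quotient rather than an absolute area allows residues of different sizes to be compared against a single cut-off.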

Moreover, the mCSM algorithm within TITINdb enabled assessment of the effect of SAVs on protein stability. It has been shown that disease-associated single-point mutations are predicted to be significantly more destabilizing than neutral ones16.

In our study, 3 of the 4 affected residues (all except p.Ala840Ser, c.2518G>T) had Q(SASA) values ≤0.3, indicating that they were buried and that the respective variants might cause disease through disruption of the underlying domain. Additionally, computational saturation mutagenesis performed by mCSM predicted destabilizing effects for the same 3 sequence variants, strengthening the hypothesis derived from the Q(SASA) analysis that the respective altered coding sequences could lead to disease through changes in protein stability.

Finally, 3 TTN missense variants (c.30392G>T, c.25185G>T, and c.49G>T) were designated as likely function-impacting and considered for further experimental studies.

CONCLUSIONS

Herein we presented our strategy to prioritize the novel TTN missense variants detected in a cohort of HCM patients. By applying in silico tools sequentially, we restricted the list of VUS to be investigated further to those most likely to be disease-associated, thus reducing the need for expensive and time-consuming additional studies.

eISSN:
2734-6382
Language:
English
Publication frequency:
4 issues per year
Journal subject areas:
Medicine, Clinical Medicine, General Medicine, Internal Medicine, Cardiology, Paediatric and Adolescent Medicine, Paediatric Cardiology, Surgery, Cardiac Surgery