Single-Nucleotide Polymorphisms in Exonic and Promoter Regions of Transcription Factors of Second Heart Field Associated with Sporadic Congenital Cardiac Anomalies
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
INTRODUCTION
Congenital cardiac anomalies are the main cause of infant death and the most common birth defect worldwide (1). Patients with sporadic congenital heart disease (CHD) account for approximately 80% of CHD patients (2). Depending on the different anatomic or pathophysiological changes, CHD can be divided into 21 different forms, including simple common forms of CHD (1) and moderate and severe forms of CHD (1). Although the incidence rate of CHD is high and its clinical symptoms are obvious, its etiology is still unclear in most patients (3).
Cardiac progenitor cells from the second heart field (SHF) participate in the development of the linear heart tube as it becomes the four-chambered heart (4). More than 10 transcription factors, including GATA5 (GenBank accession no. NM_080473), MEF2C (GenBank accession no. NM_002397), SMYD1 (GenBank accession no. NM_198274), and TBX20 (GenBank accession no. NM_001166220), contribute to SHF development by controlling the proliferation and differentiation of cardiac progenitor cells (5). Knockout of these genes can lead to different types of CHD in mice (6–9). Many exonic mutations of SHF transcription factors (GATA5, MEF2C, SMYD1, and TBX20) are related to CHD in humans (10–13). However, the underlying genetic pathogenesis of CHD remains unclear. In this study, we show that minor alleles of ten exonic and promoter single-nucleotide polymorphisms (SNPs) located in SHF transcription factor genes increase sporadic CHD risk.
MATERIALS AND METHODS
Patient information
From January 13, 2012 to May 5, 2012, a total of 383 patients were enrolled in this study. These patients were suffering from sporadic CHD and were scheduled for surgery in our hospital (Table 1). The average age of the patients was 1 year old (3 months – 9 years). Patients were classified into simple CHD [ventricular septal defects (VSD), atrial septal defects (ASD), and patent ductus arteriosus (PDA)] (33%), right ventricular outflow tract obstruction (RVOTO) [tetralogy of Fallot (TOF), pulmonary atresia (PA), and pulmonary stenosis (PS)] (58%), and single ventricle (SV) (9%) (Table 1). A total of 383 healthy children were also recruited from our hospital as a control group. No significant differences were observed in age or sex between the CHD patients and control subjects (Table S1, S2). The diagnosis and inclusion and exclusion criteria for the subjects are described in the “Methods section” of the supplementary material. The study complied with the 1964 Declaration of Helsinki and its subsequent amendments and was approved by the Medical Ethics Committee of Fuwai Hospital. All patients or their legal guardians signed informed consent forms.
In all subjects, genomic DNA was extracted from leukocytes with a Wizard® Genomic DNA Purification Kit (Promega, WI, USA). Ten SNPs in exonic and promoter regions of four genes (GATA5, SMYD1, TBX20, and MEF2C) (from unpublished sequencing data for CHD) were genotyped by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) in both CHD patients and control subjects (Figures 1 and 2). The steps of MALDI-TOF-MS included multiplex polymerase chain reaction (PCR) amplification, shrimp alkaline phosphatase digestion, iPLEX primer extension, resin cleaning, MALDI-TOF-MS analysis, and data analysis (14). Analyses were repeated in 10% of randomly selected samples for quality control.
Plasmids, site-directed mutagenesis, cell transfection, and luciferase assays
MEF2C: rs80043958 A>G, MEF2C: rs304154 A>G, and TBX20: rs336284 A>G were all located in promoter elements. The promoter fragments of MEF2C containing the A allele of rs80043958 or rs304154 and the TBX20 promoter fragment containing the A allele of rs336284 were amplified from genomic DNA. The PCR products were subcloned into the KpnI/XhoI restriction sites of the GV238-basic vector (GeneChem, Shanghai, China). Transcription factors, including ZFX (GenBank accession no. NM_001330327), CEBPA (GenBank accession no. NM_001287424), HLTF (GenBank accession no. NM_003071), FOXC1 (GenBank accession no. NM_001453), and GATA1 (GenBank accession no. NM_002049), were also amplified and subcloned into the GV141-basic vector. Plasmids carrying the corresponding G allele were generated by site-directed mutagenesis with the MutanBEST kit (Takara, Berkeley, CA, USA) to ensure a uniform backbone sequence. All recombinant clones were verified by DNA sequencing. Human embryonic kidney HEK 293T cells (4×10^5) were seeded in 24-well culture plates. After 24 h, the cells were transfected with 1.0 μg of the wild-type or mutant promoter construct and the corresponding transcription factors, according to the manufacturer's instructions. After an additional 24 h of culture, the transfected cells were assayed for luciferase activity using the Dual-Luciferase Reporter Assay System (Promega). There were eight experimental groups each for rs80043958 and rs304154 and four experimental groups for rs336284. Each luciferase assay was performed in triplicate.
Statistical analyses
The means ± standard deviations (SD) were used for the continuous variables. Continuous variables were compared between the two groups by Student's t test. Pearson's χ2 test or Fisher's exact test was used to compare the categorical variables between the two groups. The odds ratios (ORs) and the corresponding 95% confidence intervals (CIs) were estimated for the risk of CHD. Differences were considered significant if p<0.05. The statistical analyses were performed using the SPSS version 17.0 software package (SPSS, Inc., Chicago, IL, USA).
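As a minimal sketch of the odds-ratio calculation described above, the following Python snippet computes an OR with a 95% CI and a χ2 p value for a 2×2 table. The analyses in this study were run in SPSS; the function names, the choice of a Wald interval on the log scale, and the omission of a continuity correction are assumptions made for illustration only. The example counts are the GATA5: rs6061243 GC versus GG counts reported in Table 2.

```python
import math


def odds_ratio_ci(case_exposed, control_exposed, case_ref, control_ref, z=1.959964):
    """Odds ratio with a Wald confidence interval computed on the log scale.

    'exposed' is the genotype of interest (e.g. GC), 'ref' the reference
    genotype (e.g. GG). z=1.959964 gives a two-sided 95% interval.
    """
    odds_ratio = (case_exposed * control_ref) / (case_ref * control_exposed)
    se_log_or = math.sqrt(
        1 / case_exposed + 1 / control_exposed + 1 / case_ref + 1 / control_ref
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


def pearson_chi2_p(a, b, c, d):
    """Uncorrected Pearson chi-square p value for the 2x2 table [[a, b], [c, d]].

    With 1 degree of freedom the upper tail of the chi-square distribution
    equals erfc(sqrt(chi2 / 2)), so no external library is needed.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return math.erfc(math.sqrt(chi2 / 2))


# GATA5: rs6061243, GC vs GG (Table 2): cases 200 vs 103, controls 82 vs 182
or_gc, ci_low, ci_high = odds_ratio_ci(200, 82, 103, 182)
p_gc = pearson_chi2_p(200, 82, 103, 182)
print(f"OR={or_gc:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f}), p={p_gc:.2e}")
```

Under these assumptions the sketch gives values close to the reported OR of 4.31 (95% CI 3.03–6.13) and a p value on the order of 10^-16. For sparse tables, Fisher's exact test would be used instead, as stated above.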
RESULTS
Ten SNPs located in exon and promoter regions were genotyped
There were seven SNPs in exons and three SNPs in promoters (Table S3). Among the seven exonic SNPs, SMYD1: rs88387557 T>G was a nonsynonymous variant, while the remaining six SNPs were all synonymous variants (Table S3). The bioinformatics analysis suggested that the three SNPs located in promoters caused promoter loss, which might influence MEF2C and TBX20 mRNA transcription (Table S3). The genotyping success rates were 98–100%. The genotype frequencies of the controls were in accordance with Hardy-Weinberg equilibrium (p>0.05) (Table S4).
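The Hardy-Weinberg check mentioned above can be sketched as a χ2 goodness-of-fit test with one degree of freedom. The text does not state which test was used for Table S4, so the method and the function name below are assumptions; the example counts are the control genotypes reported for TBX20: rs336283 in Table 2 (AA 112, AG 185, GG 84).

```python
import math


def hwe_chi2_test(n_major_hom, n_het, n_minor_hom):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium."""
    n = n_major_hom + n_het + n_minor_hom
    # Major-allele frequency estimated from the observed genotypes
    p = (2 * n_major_hom + n_het) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_major_hom, n_het, n_minor_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


hwe_chi2, hwe_p = hwe_chi2_test(112, 185, 84)
print(f"chi2={hwe_chi2:.2f}, p={hwe_p:.2f}")
```

A p value above 0.05 means the observed genotype counts do not deviate detectably from the Hardy-Weinberg expectation, which is the criterion applied to the control group.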
The risk of CHD was increased by minor alleles of rs6061243 and rs336283
The minor alleles of two SNPs, the C allele of GATA5: rs6061243 G>C and the G allele of TBX20: rs336283 A>G, significantly increased the risk of CHD. Subjects carrying GATA5: rs6061243 GC had a 4.31-fold increased risk of CHD (OR=4.31, 95% CI 3.03–6.13, p=1.03×10^-16). Compared with subjects carrying wild-type TBX20: rs336283 AA, AG carriers had a 1.54-fold increased risk of CHD (OR=1.54, 95% CI 1.08–2.19, p<0.05) (Table 2), and GG carriers had a 1.91-fold increased risk (OR=1.91, 95% CI 1.27–2.87, p=0.002). The minor alleles of the remaining 8 polymorphisms were not associated with overall CHD risk (Table 2).
Table 2. Main effects of SNPs on CHD risk

| SNP | Homozygotes (common allele): Case | Homozygotes (common allele): Control | Heterozygotes: Case | Heterozygotes: Control | Heterozygotes: OR (95% CI) | Heterozygotes: p value | Homozygotes (rarer allele): Case | Homozygotes (rarer allele): Control | Homozygotes (rarer allele): OR (95% CI) | Homozygotes (rarer allele): p value |
| GATA5: rs6061243 G>C | 103 | 182 | 200 | 82 | 4.31 (3.03–6.13) | 1.03×10^-16 | 70 | 119 | 1.04 (0.71–1.52) | 0.84 |
| TBX20: rs336283 A>G | 76 | 112 | 193 | 185 | 1.54 (1.08–2.19) | 0.017 | 109 | 84 | 1.91 (1.27–2.87) | 0.002 |
| SMYD1: rs1542088 T>G | 292 | 310 | 83 | 73 | 1.21 (0.85–1.72) | 0.3 | 6 | 0 | – | – |
| MEF2C: rs80043958 A>G | 280 | 294 | 96 | 83 | 1.21 (0.87–1.70) | 0.26 | 5 | 3 | 1.75 (0.41–7.39) | 0.44 |
| GATA5: rs6587239 T>C | 88 | 90 | 196 | 203 | 0.99 (0.69–1.41) | 0.94 | 96 | 88 | 1.12 (0.74–1.69) | 0.6 |
| GATA5: rs41305803 G>A | 128 | 147 | 188 | 178 | 1.21 (0.89–1.66) | 0.23 | 62 | 58 | 1.23 (0.78–1.89) | 0.35 |
| MEF2C: rs304154 A>G | 134 | 151 | 188 | 162 | 1.31 (0.96–1.79) | 0.09 | 58 | 69 | 0.95 (0.62–1.44) | 0.8 |
| SMYD1: rs2919881 A>G | 257 | 250 | 106 | 116 | 0.89 (0.65–1.22) | 0.47 | 18 | 16 | 1.09 (0.55–2.19) | 0.8 |
| TBX20: rs336284 A>G | 126 | 131 | 169 | 181 | 0.97 (0.70–1.34) | 0.86 | 77 | 66 | 1.21 (0.81–1.83) | 0.36 |
| SMYD1: rs88387557 T>G | 350 | 357 | 31 | 24 | 1.32 (0.76–2.29) | 0.33 | 1 | 1 | 1.02 (0.06–16.37) | 1 |

ORs and p values are relative to homozygotes for the common allele.
Subgroup analyses were then implemented to assess the impacts of GATA5: rs6061243 G>C and TBX20: rs336283 A>G polymorphisms on CHD subtypes. The minor C allele of GATA5 rs6061243 G>C increased the risk of different CHD subtypes (p<0.05) (Table S5). However, GG in TBX20: rs336283 A>G increased the risk of various CHD subtypes but not simple CHD (p<0.05) (Table S6).
The minor alleles of SMYD1: rs1542088 T>G, MEF2C: rs80043958 A>G and GATA5: rs6587239 T>C increase the risk of simple CHD
The minor alleles in SMYD1: rs1542088 T>G, MEF2C: rs80043958 A>G and GATA5: rs6587239 T>C had no associations with the risk of CHD but were associated with the risk of simple CHD, including VSD, ASD, and PDA. Additionally, these three SNPs did not change the RVOTO or SV risk. For MEF2C: rs80043958 A>G, subjects carrying GG+GA had an increased simple CHD risk in comparison with patients carrying AA homozygotes (OR=1.59, 95% CI 1.02–2.48, p=0.04) (Table 3). For SMYD1: rs1542088 T>G, subjects carrying GT were associated with an increased risk of simple CHD compared with TT subjects (OR=1.62, 95% CI 1.01–2.60, p=0.043) (Table S7). GG+GT increased the risk of simple CHD compared with TT (OR = 1.72, 95% CI 1.08–2.73, p=0.021) (Table S7). GATA5: rs6587239 CC was associated with an increased risk of simple CHD compared with TT+TC (OR=1.59, 95% CI 1.02–2.48, p=0.042) (Table S8).
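Table 3. MEF2C: rs80043958 A>G

| Genotype | Simple CHD (n=125): No. | Simple CHD (n=125): (%) | Control (n=381): No. | Control (n=381): (%) | OR (95% CI) | p value |
| AA | 85 | 68 | 294 | 77 | 1 | |
| GA | 37 | 30 | 84 | 22 | 1.52 (0.97–2.40) | 0.07 |
| GG | 3 | 2 | 3 | 1 | 3.46 (0.69–17.45) | 0.11 |
| GG+GA | 40 | 32 | 87 | 23 | 1.59 (1.02–2.48) | 0.04 |
| AA+GA | 122 | 98 | 378 | 99 | 1 | |
| GG | 3 | 10 | 3 | 4 | 3.10 (0.62–15.55) | 0.15 |
| A allele | 207 | 83 | 672 | 88 | 1 | |
| G allele | 43 | 17 | 90 | 12 | 1.55 (1.05–2.30) | 0.029 |

CHD: congenital heart disease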
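The rows of Table 3 correspond to different genetic-model contrasts built from the same three genotype counts. The following Python sketch, with hypothetical class and function names, shows how the dominant (GG+GA vs AA), recessive (GG vs AA+GA), and allelic (G vs A) comparisons can be derived from the genotype counts reported for MEF2C: rs80043958. It reproduces only the point estimates; the confidence intervals and p values in the table come from the authors' SPSS analysis.

```python
from dataclasses import dataclass


@dataclass
class GenotypeCounts:
    hom_major: int  # e.g. AA
    het: int        # e.g. GA
    hom_minor: int  # e.g. GG

    def allele_counts(self):
        """Return (major, minor) allele counts; each subject carries two alleles."""
        return 2 * self.hom_major + self.het, 2 * self.hom_minor + self.het


def odds_ratio(case_exposed, control_exposed, case_ref, control_ref):
    """Cross-product odds ratio for a 2x2 table."""
    return (case_exposed * control_ref) / (case_ref * control_exposed)


# Genotype counts from Table 3 (simple CHD vs control)
cases = GenotypeCounts(hom_major=85, het=37, hom_minor=3)
controls = GenotypeCounts(hom_major=294, het=84, hom_minor=3)

# Dominant model: carriers of at least one G allele (GG+GA) vs AA
dominant_or = odds_ratio(
    cases.het + cases.hom_minor, controls.het + controls.hom_minor,
    cases.hom_major, controls.hom_major,
)

# Recessive model: GG vs AA+GA
recessive_or = odds_ratio(
    cases.hom_minor, controls.hom_minor,
    cases.hom_major + cases.het, controls.hom_major + controls.het,
)

# Allelic model: G allele vs A allele
case_a, case_g = cases.allele_counts()
control_a, control_g = controls.allele_counts()
allelic_or = odds_ratio(case_g, control_g, case_a, control_a)

print(f"dominant={dominant_or:.2f}, recessive={recessive_or:.2f}, allelic={allelic_or:.2f}")
```

The derived allele counts (207 A and 43 G in cases, 672 A and 90 G in controls) match the allele rows of the table, and the three point estimates agree with the reported ORs of 1.59, 3.10, and 1.55.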
The minor A allele of GATA5: rs41305803 G>A and the minor G allele of MEF2C: rs304154 A>G increase the risk of RVOTO or TOF
The minor G allele of MEF2C: rs304154 A>G only increased TOF risk. GA increased the TOF risk by 1.67-fold in comparison with AA (OR=1.67, 95% CI 1.03–2.69, p=0.036) (Table 4). The minor A allele of GATA5: rs41305803 G>A increased the risk of RVOTO. The RVOTO group was then divided into three subgroups: TOF, PA or PS with VSD, and PA or PS with intact ventricular septum (IVS). Within these subgroups, the minor A allele of GATA5: rs41305803 G>A only increased the risk of TOF (Table S9).
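Table 4. MEF2C: rs304154 A>G

| Genotype | TOF (n=105): No. | TOF (n=105): (%) | Control (n=382): No. | Control (n=382): (%) | OR (95% CI) | p value |
| AA | 33 | 31 | 151 | 40 | 1 | |
| GA | 59 | 56 | 162 | 42 | 1.67 (1.03–2.69) | 0.036 |
| GG | 13 | 12 | 69 | 18 | 0.86 (0.43–1.74) | 0.68 |
| GG+GA | 72 | 69 | 231 | 60 | 1.43 (0.90–2.26) | 0.13 |
| AA+GA | 92 | 88 | 313 | 82 | 1 | |
| GG | 13 | 10 | 69 | 4 | 0.64 (0.34–1.21) | 0.17 |
| A allele | 125 | 60 | 464 | 61 | 1 | |
| G allele | 85 | 40 | 300 | 39 | 1.05 (0.77–1.44) | 0.75 |

TOF: tetralogy of Fallot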
The minor alleles of TBX20: rs336284 A>G, SMYD1: rs2919881 A>G, and SMYD1: rs88387557 T>G increase the risk of other CHD types
For TBX20: rs336284 A>G, GG subjects had a significantly increased risk of SV compared with AA+GA subjects (OR=2.26, 95% CI 1.05–4.86, p=0.033) (Table 5). For SMYD1: rs2919881 A>G, the risk of PA or PS with IVS was significantly increased in GG subjects compared with AA subjects (OR=3.19, 95% CI 1.19–8.59, p=0.039) (Table S10). For SMYD1: rs88387557 T>G, the G allele was associated with a 2.66-fold increased SV risk compared with the T allele (OR=2.66, 95% CI 1.06–6.70, p=0.03) (Table S11).
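Table 5. TBX20: rs336284 A>G

| Genotype | SV (n=34): No. | SV (n=34): (%) | Control (n=378): No. | Control (n=378): (%) | OR (95% CI) | p value |
| AA | 9 | 26 | 131 | 35 | 1 | |
| GA | 14 | 41 | 181 | 48 | 1.13 (0.47–2.68) | 0.79 |
| GG | 11 | 32 | 66 | 17 | 2.42 (0.96–6.14) | 0.06 |
| GG+GA | 25 | 74 | 247 | 65 | 1.47 (0.67–3.25) | 0.33 |
| AA+GA | 23 | 68 | 312 | 83 | 1 | |
| GG | 11 | 10 | 66 | 4 | 2.26 (1.05–4.86) | 0.033 |
| A allele | 32 | 47 | 443 | 59 | 1 | |
| G allele | 36 | 53 | 313 | 41 | 1.59 (0.97–2.62) | 0.07 |

SV: single ventricle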
Luciferase assays of MEF2C: rs80043958 A>G, MEF2C: rs304154 A>G, and TBX20: rs336284 A>G
For rs80043958, luciferase expression from the G allele plasmid did not differ significantly from that of its A allele counterpart in HEK 293T cells (p>0.05) (Figure 3A). When combined with HLTF, the G promoter showed a higher expression level than the A promoter (p<0.01) (Figure 3A). No such increase occurred when CEBPA or CEBPA+HLTF was added to the MEF2C promoter (Figure 3A).
For rs304154, the G promoter displayed a significantly lower luciferase expression than the A promoter (p<0.01) (Figure 3B). When the MEF2C promoter was combined with GATA1, FOXC1, or GATA1+FOXC1, the two groups still exhibited a significant difference (p<0.01) (Figure 3B).
For rs336284, the G promoter exhibited a significantly lower level of luciferase expression than the A promoter (p<0.01) (Figure 3C). In the presence of ZFX, the G promoter still showed a lower expression level than the A promoter (p<0.01).
DISCUSSION
SNPs located in exon or promoter regions play important roles in CHD development. Our previous study revealed that polymorphisms located in the exons and promoters of growth factors influence the risk of CHD (15). We then revealed that the minor alleles of intronic SNPs in the transcription factors regulating the second heart field increased the risk of CHD (14). In this study, we evaluated the associations between CHD and polymorphisms in the exons and promoters of SHF transcription factors.
Mutations in SMYD1 (16), MEF2C (11), GATA5 (17), and TBX20 (10) were found to be critical for the transcriptional regulation of gene expression for CHD development.
Mutations in the exons of these genes can inactivate an allele or cause gene dysfunction by interfering with DNA interactions. In this study, we found that ten SNPs located in the exons and promoters of SMYD1, MEF2C, GATA5, and TBX20, genes that regulate SHF development, were significantly associated with CHD or its subtypes. The minor alleles of two SNPs, GATA5: rs6061243 G>C and TBX20: rs336283 A>G, were significantly associated with an increased risk of CHD overall. Additionally, the minor alleles of the remaining SNPs increased the risk of different CHD types.
SHF regulation involves numerous signaling and transcriptional cascades. The ISL1-GATA-MEF2C pathway plays a central role in the transcription factor network of the SHF. When heart looping occurs, Mef2c transcripts are robustly expressed in the outflow tract and right ventricle and are less abundant in the left ventricle and atria (18). Mouse embryos lacking Mef2c exhibit severe defects of the outflow tract and hypoplasia of the right ventricle (19). These observations imply a crucial role for MEF2C in the transcription factor networks regulating myoblast differentiation in the SHF. There were two SNPs located in the promoter region of MEF2C that were associated with CHD in this study (15). The minor G allele of MEF2C: rs80043958 A>G was associated with an increased risk of VSD, ASD, and PDA, while the G allele of MEF2C: rs304154 A>G only increased TOF risk. The bioinformatic analysis suggested that both SNPs cause promoter loss in MEF2C, which might disrupt binding with other transcription factors and influence MEF2C transcription. Subsequent luciferase assays showed that the minor G allele of rs304154 decreased the transcription level of MEF2C both alone and in combination with FOXC1, GATA1, or FOXC1+GATA1. Although the G allele of rs80043958 alone did not influence the transcription level of MEF2C, it increased the transcription level of MEF2C when HLTF interacted with the promoter containing rs80043958.
Epigenetic factors may also be involved in cardiac morphogenesis. The disturbance of the chromatin remodeling protein Smyd1 in mice results in a phenotype with a decreased right ventricle and dysplasia of the left ventricle (20). There have been few reports of SMYD1 mutations associated with CHD. One study revealed that one de novo exonic mutation in SMYD1 was associated with hypertrophic cardiomyopathy (12). In the present study, we found that both identified SNPs of SMYD1 were located in the exon. The minor G allele of SMYD1: rs2919881 A>G (p.I253I) increased the risk of PA or PS with IVS. The minor G allele of SMYD1: rs1542088 T>G (p.R131R) increased the risk of simple CHD. Both SNPs are synonymous variants. There is a SET domain in Smyd1 that has methyltransferase activity and histone deacetylase (HDAC) activity, which can repress genes (20). p.I253I is located in the ET domain of SMYD1, which is a pivotal SET domain that functions as the primary catalytic domain. p.R131R is located in the SET-I domain, which contributes to the binding of cofactors with substrates and protein stability (21). Therefore, we infer that the p.I253I variant might influence methyltransferase activity, while p.R131R could interfere with cofactor and substrate binding with SMYD1.
Three GATA factors, GATA 4, 5, and 6, are expressed in the heart in a partially overlapping pattern. GATA transcription factors are key regulators of cardiac development. In contrast to GATA 4 and 6, GATA 5 expression is more restricted to endocardial cushions in the outflow tract during cardiac development. Laforest et al. found that Gata4+/−Gata5+/− and Gata5+/−Gata6+/− double heterozygous mice die in the embryonic or perinatal periods due to dysplasia of the outflow tract (OFT), including double outlet right ventricle (DORV) and VSD. These studies reveal the existence of important genetic interactions between GATA5 and the other two GATA factors in outflow tract morphogenesis (22). Later studies in humans revealed GATA5 mutations associated with VSD, bicuspid aortic valve, DORV, and TOF (13, 23–25). In our study, the minor A allele of GATA5: rs41305803 G>A (p.D203D) increased the risk of RVOTO, which is an outflow tract malformation. The minor C allele of GATA5: rs6587239 T>C (p.K284K) increased the risk of simple CHD. Subjects carrying GATA5 rs6061243 (p.S327S) GC presented a 4.31-fold increase in the risk of CHD. These results are consistent with previous studies and suggest that subtle alterations in the activity of the GATA5 factor might cause CHD in humans.
TBX20 is critical for heart chamber formation, especially the outflow tract and right ventricle, which are the anterior derivatives of the SHF (26). Targeted disruption of TBX20 leads to unlooped and severely hypoplastic myocardial tubes in mice (27). Incomplete knockdown of TBX20 results in severely compromised valve formation, hypoplastic right ventricle, and persistent truncus arteriosus (9). In cardiac development, TBX20 functions as a dosage-dependent moderator (9). Either loss- or gain-of-function in TBX20 results in abnormal heart development (28). TBX20 mutations are associated with VSD, ASD, TOF, DORV, persistent truncus arteriosus (PTA), and adult dilated cardiomyopathy in humans (29, 30). Mutations in either the exon or promoter regions of TBX20 can lead to CHD (10). This study revealed that TBX20: rs336283 A>G in the promoter increased the risk of CHD, while TBX20: rs336284 A>G (p.S13S) in an exon increased the risk of SV. Further luciferase assays showed that the G allele of rs336284 decreased the transcription level of TBX20, even with the interaction with ZFX.
This study has limitations. First, the small sample size limits the statistical power of the findings. A larger number of CHD patients and healthy controls will be genotyped in our next study. Second, the mechanisms by which these ten SNPs affect the risk of CHD still require further investigation. Knock-in mouse models will be needed to explore how these SNPs affect CHD formation at the genetic and molecular levels.
In this study, the associations of exonic and promoter SNPs in SHF transcription factors with an increased risk of CHD were evaluated. These results suggest that SNPs in the exon and promoter regions of SHF transcription factors play roles in the pathogenesis of CHD.