EARLY DETECTION OF CANDIDATE GENES FOR BODY WEIGHT IN INDONESIAN CATTLE BREEDS WITH GENOME-WIDE ASSOCIATION STUDY (GWAS)

Genome-wide association studies (GWAS) are used to detect candidate genes affecting economic traits in livestock, because they can detect single nucleotide polymorphisms (SNPs) across all chromosome regions. This study aimed to identify genetic markers for body weight by GWAS in native cattle breeds of Indonesia. The Illumina Bovine 50K BeadChip was used to determine the candidate genes in three mixed-sex Indonesian cattle breeds: Bali (16 animals), Madura (16 animals), and Ongole grade (13 animals). All animals were raised at the breeding station in Pasuruan Regency, East Java, Indonesia. The GWAS was performed on the pooled sample (45 animals) with the general linear model (GLM) method in TASSEL 5.0 software, using SNP markers with a minor allele frequency (MAF) greater than 0.05. The body weight of each animal at 1 to 3 years of age was recorded and used to compute Manhattan plots. The SUGT1, SF3A3, and DSCAM genes were detected as potential genetic markers for body weight in cattle breeds of Indonesia. The SUGT1 and DSCAM genes were monomorphic in Bali cattle (Bos javanicus). Both genes were significantly associated (P<0.05) with the body weight of Ongole-grade cattle (Bos indicus) at three years of age, whereas the SF3A3 gene was significantly (P<0.05) associated with the body weight of Madura cattle (Bos indicus) at 2 and 3 years of age. In conclusion, the GWAS of the pooled animals revealed three candidate genes significantly associated with body weight in Indonesian cattle breeds. Further study to detect SNPs in the candidate genes by sequencing is essential before these findings can be applied in practice.


INTRODUCTION
Indonesia has many native cattle breeds for meat production. In 2022, Indonesia had 18,610,148 head of cattle with 498,923.14 tons of beef production [1]. However, the total beef consumption in Indonesia in 2022 was about 440,706 tons [2]. Hence, a deficit of 58,217.14 tons (12%) was imported from other countries. Beef production can be increased through selection in breeding programs for the native cattle breeds. Bali (Bos javanicus), Madura (Bos indicus), and Ongole grade (Bos indicus) cattle are three Indonesian native breeds kept for meat production. Wiyatna [3] reported that carcasses obtained from Bali, Madura, and Ongole grade bulls reached weights of 182.68 kg, 138.26 kg, and 180.76 kg, respectively.
Genetic improvement in livestock can be assessed with the genome-wide association study (GWAS) method [4]. GWAS can detect many single nucleotide polymorphisms (SNPs) on all chromosomes that are associated with economic traits, including body and carcass weights. Previous reports used GWAS to detect genetic markers for birth weight in Ongole grade [5] and Bali [6] cattle of Indonesia. Moreover, many previous studies reported genetic markers for body weight with GWAS in Nellore (Bos indicus) [7,8], Braunvieh (Bos taurus) [9], and Charolais (Bos taurus) [10] cattle breeds.
Unfortunately, no studies have reported new candidate genes for body weight in native cattle breeds of Indonesia using GWAS. Hence, exploring new genetic markers is essential to obtain new candidate genes controlling the economic traits of cattle. This study aimed to determine the genetic markers for body weight of Indonesian native cattle at 1 to 3 years of age using the Illumina Bovine 50K BeadChip. The results of the present study are essential for developing a molecular selection program in the native cattle of Indonesia based on genomic information.

Ethical approval
The study was approved by the Animal Ethics Committee of the Indonesian Agency for Agricultural Research and Development (Balitbangtan/Lolitsapi/Rm/ll/2018).

Management of animals
The animals were kept in the barn with a natural mating system. Each stall held 1 bull and 15 to 20 cows. Forage was given at about 3-5 kg/head, consisting of Elephant grass (Pennisetum purpureum), and rice straw was given ad libitum. From birth to weaning, the cattle received feed with a standard nutritional content of 9-10% crude protein (CP), 58-60% total digestible nutrient (TDN), and 19-22% crude fiber (CF). From weaning to adulthood, the cattle received feed with 10-11% CP, 58-60% TDN, and 17-19% CF. About 30% of concentrate (3% of body weight) and 70% of forages were combined in the feed ration, while fresh water was given ad libitum. Regular medical examinations and vaccinations were carried out. The composition of the concentrate feed used for the animals during the study is shown in Table 1.

Body weight
The body weights of the animals at one year of age (BW1), two years of age (BW2), and three years of age (BW3) were collected using a digital weighing scale (Sonic NI-7, China). The average body weight of the experimental animals is presented in Table 2. The body weight of female animals was corrected for sex using a mathematical formula [11], where $CF_{sex}$ is the correction factor for sex, $BW_c$ is the corrected body weight, $BW$ is the actual body weight, $X_{male}$ is the average body weight in males, and $X_{female}$ is the average body weight in females.
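A common form of the sex correction that is consistent with these symbol definitions takes the correction factor as the ratio of the male mean to the female mean and multiplies each female record by it. This form is an assumption, not necessarily the exact formula of [11]. A minimal Python sketch follows; the function names and the weights are hypothetical.

```python
from statistics import mean


def sex_correction_factor(male_weights, female_weights):
    """Assumed form: CF_sex = X_male / X_female (ratio of group means)."""
    return mean(male_weights) / mean(female_weights)


def corrected_weight(bw, cf_sex):
    """Assumed form: BW_c = BW * CF_sex, applied to female records only."""
    return bw * cf_sex


# Hypothetical weights in kg, not data from the study.
males = [210.0, 230.0]
females = [190.0, 210.0]

cf = sex_correction_factor(males, females)  # 220 / 200
female_corrected = [corrected_weight(bw, cf) for bw in females]
```

Under this form, the corrected female weights have the same mean as the male weights, so sex no longer contributes to the between-animal variation used in the association analysis.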

Genomic analysis
A blood sample of 5 μL was collected from each animal by jugular venepuncture using venoject and vacutainer tubes containing EDTA (BD Vacutainer, USA). DNA was then extracted from the collected blood samples with a DNA Extraction Kit (Geneaid, Taiwan) following the manufacturer's protocols. The DNA samples with a purity (260/280 nm absorbance ratio) of 1.8-2.0 were selected for genome analysis with the Illumina Bovine 50K BeadChip (Macrogen, South Korea).

Bioinformatics
Two PLINK-format files (ped and map) for the pooled animals were exported from GenomeStudio software (Illumina, USA) [12]. Quality control of the total SNP markers (53,218 sites) was performed with TASSEL 5.0 software (The Buckler Lab at Cornell University, USA) [13]. SNP markers with a minor allele frequency (MAF) of less than 5% were excluded from the genome analysis [14]. This study selected 24,347 SNP markers (MAF>5%) for analysis. The quantile-quantile plots (QQ-plots) and Manhattan plots of three traits (BW1, BW2, BW3) were computed with the general linear model (GLM) method using TASSEL 5.0 software. In addition, the SNP marker with the highest $-\log_{10}(p)$ value relative to the Bonferroni-corrected threshold was taken as the best SNP marker for the evaluated trait. Following Becker [15], the candidate genes were identified using the Bos taurus genome sequence (assemblies Btau_4.6.1 and Btau_5.0.1) accessed at the National Center for Biotechnology Information (NCBI) website (https://www.ncbi.nlm.nih.gov). Moreover, the gene interaction network among the candidate genes was constructed using STRING v.11 software (Global Core Biodata Resource, USA) [16].
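The MAF filter and the Bonferroni-corrected threshold described above can be illustrated with a short Python sketch. It is not the TASSEL implementation: the function names and genotype counts are hypothetical, and the family-wise significance level of 0.05 is an assumption, since the level used for the correction is not stated.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF of a biallelic SNP from genotype counts (AA, AB, BB)."""
    n_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / n_alleles
    return min(freq_a, 1.0 - freq_a)


def passes_maf(n_aa, n_ab, n_bb, threshold=0.05):
    """Keep a marker only if its MAF exceeds the threshold (5% here)."""
    return minor_allele_frequency(n_aa, n_ab, n_bb) > threshold


def bonferroni_neg_log10(alpha, n_tests):
    """Bonferroni-corrected significance threshold on the -log10(p) scale."""
    return -math.log10(alpha / n_tests)


# Hypothetical genotype counts for 45 animals.
kept = passes_maf(40, 4, 1)      # MAF = 6/90, above 5%
dropped = passes_maf(44, 1, 0)   # MAF = 1/90, below 5%

# With the 24,347 retained markers and an assumed alpha of 0.05,
# the threshold is roughly 5.69 on the -log10(p) scale.
threshold = bonferroni_neg_log10(0.05, 24347)
```

A less strict level or a smaller effective number of tests lowers the threshold, which is one reason reported Bonferroni thresholds differ between studies.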

Data analysis
The genetic diversity of the selected SNP markers was analyzed in terms of genotype frequency, allele frequency, observed heterozygosity ($H_o$), expected heterozygosity ($H_e$), number of effective alleles ($n_e$), polymorphic informative content (PIC), and Chi-square ($\chi^2$) values [17][18][19][20]. The body weight records were used for the association study with SPSS 16.0 software [21] using the following model:

$$Y_{ijk} = \mu + G_i + B_j + \varepsilon_{ijk}$$

where $Y_{ijk}$ is the observed trait, $\mu$ is the overall mean, $G_i$ is the effect of the $i$-th genotype, $B_j$ is the effect of the $j$-th breed, and $\varepsilon_{ijk}$ is the experimental error.
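For a biallelic SNP, the diversity indices listed above can be computed from genotype counts as in the following Python sketch. It uses the standard textbook definitions, which are assumed here to match those of [17][18][19][20]; the function name and the example counts are hypothetical.

```python
def snp_diversity(n_aa, n_ab, n_bb):
    """Diversity indices of a biallelic SNP from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    homozygosity = p ** 2 + q ** 2

    ho = n_ab / n                     # observed heterozygosity
    he = 1.0 - homozygosity           # expected heterozygosity (2pq)
    ne = 1.0 / homozygosity           # number of effective alleles
    pic = 1.0 - homozygosity - 2.0 * p ** 2 * q ** 2

    # Chi-square test of Hardy-Weinberg proportions.
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    return {"p": p, "q": q, "Ho": ho, "He": he, "ne": ne, "PIC": pic, "chi2": chi2}


# Hypothetical counts for 16 animals (AA, AB, BB), not data from the study.
result = snp_diversity(4, 8, 4)
```

In this example the genotype counts match Hardy-Weinberg proportions exactly, so the Chi-square value is zero and the marker would be judged to be in genetic equilibrium.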

RESULTS
In the QQ-plots, the points for the BW2 trait fell below the threshold line, while the points for the BW1 and BW3 traits rose above it (Figure 1). However, many SNP marker points for BW1 also fell below the threshold line.
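The coordinates of such a QQ-plot can be sketched in Python as follows. The expected values use the common convention $-\log_{10}((i-0.5)/n)$ for the $i$-th smallest of $n$ p-values, which is an assumption and may differ from the convention used by TASSEL; the function name and the p-values are hypothetical.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs, sorted ascending."""
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    # Under the null hypothesis p-values are uniform on (0, 1), so the
    # i-th smallest p-value is expected near (i - 0.5) / n.
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


# Hypothetical p-values, not results from the study.
points = qq_points([0.5, 0.05, 0.005, 0.9])
top_expected, top_observed = points[-1]
```

Points above the diagonal (observed greater than expected) indicate markers with stronger evidence of association than the null hypothesis predicts.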

DISCUSSION
A Bonferroni-corrected threshold value of 3.8 to 6.0 was used to select the SNP markers for body weight in the animals under study. Previous studies reported that GWAS can determine the genetic markers of body weight with a Bonferroni-corrected threshold of $-\log_{10}(p)$ = 4.0 to 5.0 in Nellore [8], Charolais [22], and Braunvieh [23] beef cattle. A GWAS in Russian cattle used a Bonferroni-corrected threshold of $-\log_{10}(p)$ = 5.0 to 6.0 for body weight [24]. The Bonferroni-corrected threshold can be affected by the statistical analysis model used to select SNP markers [25]. The SNP markers with the highest $-\log_{10}(p)$ values are also the most significant markers of the observed traits. In this study, the best SNP markers for the body weight of Indonesian native cattle were located on BTA1, BTA3, and BTA12.
This study obtained three novel candidate genes, SUGT1, SF3A3, and DSCAM, based on GWAS. The SUGT1 gene is essential to the immune response [31]. According to the BTA12 sequence (GenBank: NC_037339.1), the bovine SUGT1 gene is 41,878 bp long with 14 exons. In cattle, a mutation in the SUGT1 gene (g.11102143A>G) is associated with the embryonic mortality rate, with the G allele being undesirable [32]. The SF3A3 gene is essential in pre-mRNA splicing or transcriptional control [33]. According to the BTA3 sequence (GenBank: NC_037330.1), the bovine SF3A3 gene is 26,317 bp long with 17 exons. In humans, mutations in the SF3A3 gene can influence the risk of ovarian cancer [34]. The DSCAM gene plays a role in neuronal self-avoidance and promotes repulsion between specific neuronal processes of either the same cell or the same subtype of cells [35]. According to the BTA1 sequence (GenBank: NC_037328.1), the bovine DSCAM gene is 690,467 bp long with 33 exons. In humans, mutations in the DSCAM gene are associated with Hirschsprung disease [36]. In mice, mutations in the DSCAM gene affect nervous system formation and cause several neurological defects, such as ataxia and seizures [37]. DSCAM has also been implicated in the migration of embryonic cephalic cells destined to become neuroectoderm in zebrafish [38].
The three candidate genes for body weight in the animals under study are survival-related genes that play essential roles in survivability traits (immune response, transcriptional activity, and nervous system). Interestingly, the proteins encoded by the SUGT1, SF3A3, and DSCAM genes interact with the products of many related genes. SUGT1 and HSP90AA1 are immune-associated genes with related protein expression [39]. In general, the Heat Shock family genes can interact with the BCAS2 gene to regulate the protein stability of cells [40]. DSCAM and UNC5C are crucial genes for developing the nervous system and neuronal growth [41]. The PIC value can be categorized as low (<0.10), moderate (0.11-0.30), or high (>0.30) [18]. In general, the PIC values of the SUGT1, SF3A3, and DSCAM genes in Indonesian cattle fall in the moderate to high categories. Hence, the genetic diversity of these genes is sufficient to make improvement of body weight by molecular selection possible.
This early investigation with a limited sample suggests that the three survival-related genes in this study could influence the body weight of the animals under study. In the tropical climate, survival-related genes in Bali, Madura, and Ongole grade cattle can influence growth traits. A study in Bali cattle revealed that the HSP70 gene affects body weight and body measurement traits [42]. Additionally, the Interleukin-2 gene is one of the immune-related genes that can affect milk production in cattle [43]. Hence, survival-related genes can indirectly influence the growth traits of tropical cattle breeds.
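The PIC categories can be applied as in the following Python sketch, using the PIC values reported for the ARS-BFGL-NGS-43764 marker as an example. The function name is hypothetical, and placing values between 0.10 and 0.11 in the moderate category is an assumption, because the stated ranges leave that interval undefined.

```python
def pic_category(pic):
    """Classify a PIC value as low (<0.10), moderate, or high (>0.30)."""
    if pic > 0.30:
        return "high"
    if pic < 0.10:
        return "low"
    # Assumption: everything from 0.10 to 0.30 counts as moderate.
    return "moderate"


# PIC values reported for ARS-BFGL-NGS-43764 in the three breeds.
reported = {"Bali": 0.23, "Madura": 0.19, "Ongole grade": 0.33}
categories = {breed: pic_category(value) for breed, value in reported.items()}
```

For this marker, the Bali and Madura values are moderate and the Ongole grade value is high.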

CONCLUSION
Three survival-related genes, SUGT1, SF3A3, and DSCAM, were significantly associated with body weight in native cattle of Indonesia. These genes have moderate to high PIC values. Therefore, the survival-related genes under study have potential as candidate genes for growth traits in tropical cattle of Indonesia. Designing specific primer pairs to detect the SNP markers by sequencing analysis is therefore important for implementing molecular selection easily and at low cost. However, in-depth research involving larger samples is essential to identify candidate genes for growth traits accurately.

Figure 1. QQ-plots of body weight at one year (BW1), two years (BW2), and three years (BW3) of age in the pooled animals. Dots indicate $-\log_{10}(p)$ values for individual SNPs. The line indicates the values expected under the null hypothesis of no association.

Figure 2. The best SNP markers in the Manhattan plots for body weight at one year (BW1), two years (BW2), and three years (BW3) of age in the pooled animals. The X-axis shows chromosomal positions. The Y-axis shows $-\log_{10}(p)$ values. The colored dots indicate the SNP markers on different chromosomes.

Figure 3. Gene interaction network between the candidate genes under study (star symbol) and other related genes.

Table 1. Composition of the concentrate feed

Table 2. Average body weight in three cattle breeds of Indonesia

Table 3. Genetic diversity of three selected SNP markers in three cattle breeds of Indonesia

Ongole grade), respectively. The SNP marker ARS-BFGL-NGS-43764 was polymorphic in all cattle breeds, with PIC values of 0.23 (Bali), 0.19 (Madura), and 0.33 (Ongole grade). In addition, the SNP marker ARS-BFGL-NGS-39460 was polymorphic in the animals under study, with PIC values of 0.23 (Madura) and 0.29 (Ongole grade). In general, the three selected SNP markers in the present study were in genetic equilibrium ($\chi^2$<5.99).

Table 5. Association between the three selected SNP markers and body weight in three cattle breeds of Indonesia
N: number of animals; BW1: body weight (kg) at one year of age; BW2: body weight (kg) at two years of age; BW3: body weight (kg) at three years of age. Different superscripts in the same column indicate significant differences (P<0.05).