Article Category: Perspectives
Published Online: Aug 01, 2024
Page range: 29 - 43
Received: May 21, 2024
Accepted: Jul 16, 2024
DOI: https://doi.org/10.2478/jdis-2024-0023
© 2024 Yu Zhao et al., published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
Solid Earth geoscience is immensely important, encompassing numerous directions and accumulating vast amounts of data ripe for machine learning analysis to assist scientists in clarifying their research paths (Bergen et al., 2019; Wang et al., 2021). This discipline faces significant challenges such as the sheer volume of data, the diversity of research directions, the multitude of researchers involved, and the inherent complexity of the field. Recent advances in data-driven solid Earth science discoveries illustrate the potential of these techniques in various areas, including the identification of mineral diversity patterns (Hazen & Morrison, 2022), high-resolution marine invertebrate biodiversity curves (Fan et al., 2020), a refined Cenozoic atmospheric CO2 record (CenCO2PIP Consortium, 2023), and a global landscape evolution model revealing stable Cenozoic sedimentation rates (Salles et al., 2023). As geosciences enter the era of big data, emerging research directions are continually appearing, presenting numerous new problems that urgently require solutions. However, in a crowded field of concerns, how can researchers prioritize the most significant questions?
To address this challenge, bibliometric analysis and science mapping offer powerful techniques (Pessin et al., 2022). Through the application of mathematics and statistical methods to books and other media of communication (Pritchard, 1969), bibliometric analysis effectively facilitates identifying and tracking research trends (Aksnes & Sivertsen, 2023; Mazov et al., 2020; Small, 2003; Upham & Small, 2010). Currently, scholars have employed bibliometric analysis to investigate research hotspots within the geosciences (Ai et al., 2022; Hazen, 1980; Ren et al., 2023; Wang et al., 2022; Xiao & Sun, 2005; Zhao et al., 2024). However, despite its advantages in exploring research trends, bibliometric analysis is limited by data source biases, citation delays, and metric standardization issues. These limitations underscore the necessity of incorporating expert insights for comprehensive trend evaluation (“Experts still needed”, 2009).
Expert insights are indispensable in geoscience research for interpreting and validating bibliometric data, providing a nuanced understanding of research trends beyond what quantitative analysis alone can offer. These insights deliver contextual knowledge, enable early identification of emerging trends, validate statistical findings, integrate qualitative perspectives, and facilitate comprehensive cross-disciplinary assessments, thus addressing the inherent limitations of bibliometric analysis (Allen et al., 2009; Iivari, 2008; Laurens et al., 2010). The Deep-time Digital Earth (DDE) initiative, recognized as the first big science program by the International Union of Geological Sciences (IUGS) (Normile, 2019), exemplifies an effective approach to overcoming these limitations. It leverages the expertise of internationally renowned members of the DDE Science Committee to enhance the robustness and depth of geoscience research.
In this paper, we integrate bibliometric analysis with expert insights to navigate the complexities of big data and prioritize the research questions that will drive Solid Earth geoscience forward.
We define research trends at two levels: conceptual and practical. At the conceptual level, research trends are issues or subjects that are extensively discussed and studied within the current academic community. They reflect the latest developments in the academic field and have the potential to address real-world problems closely associated with society, science, and technology. They attract a substantial number of high-quality papers within a short period, which produces a significant increase in citation frequency within the specific field.
At the practical level, research trends are identified by clustering highly co-cited papers and papers with significant keyword co-occurrence. These papers were obtained from the Web of Science and Dimensions, with the time range set from 2014 to 2023.
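To make the two signals concrete, the following is a minimal sketch of how co-citation and keyword co-occurrence counts can be derived from paper records. It is an illustration only, not the pipeline used in this study: the function name `pair_counts` and the toy records are hypothetical, and real data from the Web of Science or Dimensions would first need reference and keyword normalization.

```python
from collections import Counter
from itertools import combinations


def pair_counts(item_lists):
    """Count how often each unordered pair of items appears in the same list."""
    counts = Counter()
    for items in item_lists:
        # Deduplicate and sort so that (a, b) and (b, a) share one key.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: the reference list of each citing paper.
# Two references are co-cited whenever one paper cites both of them.
reference_lists = [
    ["A", "B", "C"],
    ["A", "B"],
    ["B", "C", "D"],
]
cocitation = pair_counts(reference_lists)

# Hypothetical toy data: the keyword list of each paper.
keyword_lists = [
    ["machine learning", "seismology"],
    ["seismology", "machine learning", "tomography"],
]
cooccurrence = pair_counts(keyword_lists)

# Strongest pairs first; these weights would become edges in a network
# that a clustering step can then partition.
strongest = cocitation.most_common(2)
```

The same counting routine serves both signals because each reduces to "how many records contain both items"; the resulting pair weights form the edge list of the network to be clustered.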
Our methodology is tailored to explore evolving research trends in Solid Earth Sciences from both a data perspective and an expert view. It identifies key areas where innovative concepts and practical applications intersect within this scientific domain.
Data collection for this article drew on two primary bibliometric sources: the Web of Science and Dimensions databases. The publication time window is a 10-year period. Inspired by the paper “A century of physics” (Sinatra et al., 2015), we collected data by two distinct methods to provide a comprehensive overview of research trends in Geosciences. First, the study drew on over 400,000 papers from 466 Geosciences journals, covering diverse sub-fields such as Geochemistry & Geophysics and Geology. These journals were selected from a pool of about 21,000 journals indexed in the Web of Science. Second, data were extracted from interdisciplinary journals: approximately 5,800 papers from 93 interdisciplinary journals, including Nature and Science. Notably, a significant portion of the references in these interdisciplinary journals originated from the 466 Geosciences journals analyzed earlier. By merging insights from 41 thousand articles in these two robust databases, we were able to conduct a thorough analysis of research trends in Geosciences, encompassing a broad spectrum of academic publications and interdisciplinary references.
We gathered metadata such as the titles, abstracts, keywords, and references of the articles. We then used CiteSpace (Chen, 2006) to construct citation networks and visualize the data. Using spectral clustering, we grouped the articles by network structure so that papers on similar or related topics fell together, resulting in 407 clusters. For each cluster, we applied the Log-Likelihood Ratio (LLR) algorithm to extract research terms from the titles and abstracts of the articles and calculated the relevance among the papers. The highest-scoring terms were selected as the cluster’s label keywords. We then computed the number of papers, average publication year, and average citation count for each cluster.
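The labeling step can be illustrated with a short sketch. CiteSpace's internal LLR implementation is not described here, so the code below substitutes a standard log-likelihood ratio (the G² statistic on a 2×2 contingency table of term presence inside versus outside a cluster) as an assumption. The function names `g2` and `label_cluster`, the pre-tokenized toy documents, and the cluster assignments are all hypothetical.

```python
import math
from collections import Counter


def g2(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table.

    k11: cluster docs with the term     k12: cluster docs without it
    k21: other docs with the term       k22: other docs without it
    """
    n = k11 + k12 + k21 + k22
    row1, row2 = k11 + k12, k21 + k22
    col1, col2 = k11 + k21, k12 + k22
    cells = ((k11, row1, col1), (k12, row1, col2),
             (k21, row2, col1), (k22, row2, col2))
    total = 0.0
    for observed, row, col in cells:
        if observed > 0:  # a zero cell contributes nothing to the sum
            total += observed * math.log(observed * n / (row * col))
    return 2.0 * total


def label_cluster(docs, cluster_ids, target, top_k=3):
    """Return the top_k terms most characteristic of cluster `target`."""
    inside = [set(d) for d, c in zip(docs, cluster_ids) if c == target]
    outside = [set(d) for d, c in zip(docs, cluster_ids) if c != target]
    df_in = Counter(t for d in inside for t in d)
    df_out = Counter(t for d in outside for t in d)
    scores = {}
    for term, k11 in df_in.items():
        k21 = df_out[term]
        # G^2 is symmetric, so keep only terms over-represented in the cluster.
        if k11 * len(outside) > k21 * len(inside):
            scores[term] = g2(k11, len(inside) - k11, k21, len(outside) - k21)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Hypothetical toy corpus: tokenized titles and their cluster assignments.
docs = [
    ["zircon", "dating"],
    ["zircon", "isotope"],
    ["seismic", "tomography"],
    ["seismic", "mantle"],
]
cluster_ids = [0, 0, 1, 1]
top_label = label_cluster(docs, cluster_ids, target=0, top_k=1)
```

In this toy corpus, "zircon" appears in every document of cluster 0 and in none of cluster 1, so it receives the highest score and becomes the cluster label. The over-representation check matters because G² alone cannot distinguish a term that is unusually frequent in a cluster from one that is unusually rare.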
The trends were selected by the DDE Science Committee based on their representativeness, relevance, and impact within the Geosciences domain. The Committee manually labeled each trend, with careful consideration of the content and implications of the associated research. It also reviewed and validated the literature datasets intended for in-depth analysis to ensure their accuracy and comprehensiveness, and ensured that the selected trends are not only relevant to current scientific inquiries but also have significant potential to influence future research directions.
Leveraging the data-enhanced and scientific community-driven methods, 30 trends were identified as significant in the field of Geosciences, spanning five domains: deep space, deep time, deep Earth, habitable Earth, and big earth data (Figure 1). In this section, we will delve into one trend from each domain to highlight the diverse and influential research that is shaping the field of Geosciences.

Figure 1. 30 Trends in Geosciences.
For a detailed exploration of each trend, we invite you to visit our website (
Looking ahead,
Figure 2 (a) illustrates a steady increase in the number of articles within this trend, accompanied by a rapid surge in cumulative citation counts between 2018 and 2021. This trend significantly contributes to “SDG 9 Industry, Innovation, and Infrastructure”, “SDG 11 Sustainable Cities and Communities”, and “SDG 13 Climate Action” (Figure 2 (b)).

Trends in
Figure 3 (a) demonstrates a rapid increase in both the number of articles and the cumulative citation counts within this trend. This trend plays a crucial role in advancing “SDG 2 Zero Hunger”, “SDG 6 Gender Equality”, “SDG 11 Sustainable Cities and Communities”, and “SDG 13 Climate Action” (Figure 3 (b)).

Trends in
The exploration of
Trend in

Trends in
The number of articles and total citations in this trend have doubled over the past decade (Figure 5 (a)). This research significantly contributes to “SDG 3 Good Health and Well-being”, “SDG 6 Clean Water and Sanitation”, “SDG 7 Affordable and Clean Energy”, “SDG 11 Sustainable Cities and Communities”, “SDG 12 Responsible Consumption and Production”, “SDG 13 Climate Action”, “SDG 14 Life Below Water” and “SDG 15 Life on Land” (Figure 5 (b)).

Trends in
The field has witnessed remarkable advancements in predictive modeling, particularly in mineral prospecting and groundwater mapping, employing advanced algorithms like Random Forest and Support Vector Machines. These approaches enhance our ability to identify mineral deposits and evaluate groundwater potential with greater accuracy. In disciplines such as seismology and hydrology, the adoption of machine learning techniques, including cutting-edge models like LSTM networks, has revolutionized earthquake detection, seismic tomography, and streamflow forecasting. Hybrid models that blend data-driven analytics with traditional physical modeling are increasingly being recognized as essential for comprehensive predictions.
Progress in climate and weather prediction is notable, with significant strides made in refining bias correction methods for extreme temperature forecasts and implementing innovative approaches like Generative Adversarial Networks for stochastic parameterization. However, the field faces challenges related to data integration and the interpretability of complex models, essential for fully harnessing the potential of machine learning in geoscience.
Looking ahead, the trajectory of
In conclusion, the ongoing evolution of machine learning and big data analytics holds immense potential to transform our capacity to address the multifaceted challenges in geoscience. Through innovative solutions and enhanced stewardship of Earth’s resources, these technologies promise to shape a more sustainable and resilient future.
Following breakthroughs in machine learning, its application in the Earth sciences has steadily increased, as reflected in a rise in both the number of publications and their impact (Figure 6 (a)). This trend has contributed significantly to “SDG 3 Good Health and Well-being”, “SDG 4 Quality Education”, and “SDG 8 Decent Work and Economic Growth” (Figure 6 (b)).

Trends in
The list of trends was curated through a combination of expert knowledge, insights from literature, and co-citation methods. This approach aimed to achieve a moderate level of granularity in defining and characterizing the identified trends, providing a comprehensive overview of the evolving landscape in Geosciences.
These 30 topics reflect the latest trends and advancements in Geosciences and have the potential to address real-world problems that are closely related to society, science, and technology. Moreover, they have the potential to provide valuable insights into major research breakthroughs, research methods, and solutions to significant scientific problems in Geosciences.