Journal Details
Format
Journal
eISSN
2543-683X
First Published
30 Mar 2017
Publication timeframe
4 times per year
Languages
English

Volume 1 (2016): Issue 1 (February 2016)

8 Articles

Editorial

access type Open Access

Introduction of the Inaugural Issue of the Journal of Data and Information Science

Published Online: 01 Sep 2017
Page range: 1 - 2

Abstract

Perspective

access type Open Access

Let the Data Speak for Themselves: Opportunities and Caveats

Published Online: 01 Sep 2017
Page range: 3 - 5

Abstract

Research Paper

access type Open Access

How Users Search the Mobile Web: A Model for Understanding the Impact of Motivation and Context on Search Behaviors

Published Online: 01 Sep 2017
Page range: 98 - 122

Abstract

Purpose

This study explores how search motivation and context influence mobile Web search behaviors.

Design/methodology/approach

We studied 30 experienced mobile Web users via questionnaires, semi-structured interviews, and an online diary tool that participants used to record their daily search activities. SQLite Developer was used to extract data from the users’ phone logs for correlation analysis in Statistical Product and Service Solutions (SPSS).

Findings

One quarter of mobile search sessions were driven by two or more search motivations. In certain users’ reports, it was especially difficult to distinguish curiosity from time killing. Multi-dimensional contexts and motivations influenced mobile search behaviors. Among the context dimensions, gender, place, activities users engaged in while searching, task importance, portal, and interpersonal relations (whether accompanied or alone when searching) correlated with each other.

Research limitations

The sample consisted entirely of college students, so our findings may not generalize to other populations. More participants and a longer experimental duration would improve the accuracy and objectivity of the research.

Practical implications

Motivation analysis and search context recognition can help mobile service providers design applications and services for particular mobile contexts and usages.

Originality/value

Most current research focuses on specific contexts, such as studies on place, or other contextual influences on mobile search, and lacks a systematic analysis of mobile search context. Based on analysis of the impact of mobile search motivations and search context on search behaviors, we built a multi-dimensional model of mobile search behaviors.

Keywords

  • Mobile search
  • Query formulation
  • Search motivation
  • Search context
  • Contextual dimension

access type Open Access

Dpaper: An Authoring Tool for Extractable Digital Papers

Published Online: 01 Sep 2017
Page range: 86 - 97

Abstract

Purpose

To develop a structured, rich media digital paper authoring tool with an object-based model that enables interactive, playable, and convertible functions.

Design/methodology/approach

We propose Dpaper to organize the content (text, data, rich media, etc.) of dissertation papers as XML and HTML5 files by means of digital objects and digital templates.

Findings

Dpaper provides a structured-paper editorial platform for authors of PhD dissertations to organize research materials and to generate various digital paper objects that are playable and reusable. The dissertations are represented as Web pages and structured XML files, which are marked with semantic tags.

Research limitations

The proposed tool provides access to only a limited number of digital objects. For instance, the tool cannot create equations and graphs, and typesetting is not yet as flexible as in MS Word.

Practical implications

The Dpaper tool is designed to break through the patterns of unstructured content organization of traditional papers, and makes the paper accessible not only for reading but also for exploitation as data, so that the document is extractable and reusable. As a result, Dpaper can make the digital publishing of dissertation texts more flexible and efficient, and their data more assessable.

Originality/value

The Dpaper tool solves the challenge of making a paper structured and object-based at the authoring stage, and has practical value for semantic publishing.

Keywords

  • Dpaper
  • Extractable paper
  • Digital object
  • Authoring tool

access type Open Access

A Bootstrapping-based Method to Automatically Identify Data-usage Statements in Publications

Published Online: 01 Sep 2017
Page range: 69 - 85

Abstract

Purpose

Our study proposes a bootstrapping-based method to automatically extract data-usage statements from academic texts.

Design/methodology/approach

The method for data-usage statements extraction starts with seed entities and iteratively learns patterns and data-usage statements from unlabeled text. In each iteration, new patterns are constructed and added to the pattern list based on their calculated score. Three seed-selection strategies are also proposed in this paper.
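As a rough illustration of the iterative loop described above, the Python sketch below alternates between learning patterns from known seed entities and applying accepted patterns to find new entities. The two-word context pattern, the count-based score, the `min_score` threshold, and the example sentences are all assumptions made for illustration; the paper's own sentence representation, scoring function, and seed-selection strategies are not specified in this abstract.

```python
import re
from collections import Counter


def bootstrap(sentences, seeds, iterations=2, min_score=2):
    """Toy bootstrapping loop (illustrative only, not the paper's method)."""
    entities = set(seeds)
    patterns = set()
    for _ in range(iterations):
        # Step 1: build candidate patterns from the two words that
        # immediately precede a known entity (an assumed pattern shape).
        scores = Counter()
        for sentence in sentences:
            for entity in entities:
                match = re.search(r"(\w+ \w+) " + re.escape(entity) + r"\b", sentence)
                if match:
                    scores[match.group(1)] += 1
        # Step 2: accept patterns whose score (here, a simple match count)
        # reaches the assumed threshold.
        patterns |= {p for p, count in scores.items() if count >= min_score}
        # Step 3: apply accepted patterns to unlabeled text to find new
        # capitalized tokens, which become entities for the next iteration.
        for sentence in sentences:
            for pattern in patterns:
                match = re.search(re.escape(pattern) + r" ([A-Z][\w-]*)", sentence)
                if match:
                    entities.add(match.group(1))
    return patterns, entities


# Hypothetical input sentences and seed entities.
sentences = [
    "We evaluate on the MNIST dataset.",
    "Experiments run on the CIFAR benchmark.",
    "We test on the Reuters corpus.",
    "The authors thank the reviewers.",
]
patterns, entities = bootstrap(sentences, seeds={"MNIST", "CIFAR"})
```

In this toy run, the context shared by both seeds is learned as a pattern and then picks up one further dataset name from the third sentence, while the fourth sentence contributes nothing.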

Findings

The performance of the method is verified by means of experiments on real data collected from computer science journals. The results show that the method can achieve satisfactory performance regarding precision of extraction and extensibility of obtained patterns.

Research limitations

While the triple representation of sentences is effective and efficient for extracting data-usage statements, it is unable to handle complex sentences. Additional features that can address complex sentences should thus be explored in the future.

Practical implications

Extracting data-usage statements is beneficial for data-repository construction and facilitates research on data-usage tracking, dataset-based scholar search, and dataset evaluation.

Originality/value

To the best of our knowledge, this paper is among the first to address the important task of automatically extracting data-usage statements from real data.

Keywords

  • Data-usage statements extraction
  • Information extraction
  • Bootstrapping
  • Unsupervised learning
  • Academic text-mining

access type Open Access

A Bibliometric Framework for Identifying “Princes” Who Wake up the “Sleeping Beauty” in Challenge-type Scientific Discoveries

Published Online: 01 Sep 2017
Page range: 50 - 68

Abstract

Purpose

This paper develops and validates a bibliometric framework for identifying the “princes” (PR) who wake up the “sleeping beauty” (SB) in challenge-type scientific discoveries, in order to understand the awakening mechanisms and to promote potentially valuable but not readily accepted innovative research. (A PR is a research study.)

Design/methodology/approach

We propose that PR candidates must meet the following four criteria: (1) be published near the time when the SB began to attract a lot of citations; (2) be highly cited papers themselves; (3) receive a substantial number of co-citations with the SB; and (4) within the challenge-type discoveries which contradict established theories, the “pulling effect” of the PR on the SB must be strong. We test the usefulness of the bibliometric framework through a case study of a key publication by the 2014 chemistry Nobel laureate Stefan W. Hell, who negated Ernst Abbe’s diffraction limit theory, one of the most prominent paradigms in the natural sciences.

Findings

The first-ranked candidate PR article identified by the bibliometric framework is in line with historical facts. An SB may need one or more PRs and even “retinues” to be “awakened.” Documents with potential awakening functionality tend to be published in prestigious multidisciplinary journals with higher impact and wider scope than the journals publishing SBs.

Research limitations

The above framework is applicable only to transformative innovations, and the conclusions are drawn from the analysis of one typical SB and its awakening process. Therefore, the generality of our work might be limited.

Practical implications

Publications belonging to so-called transformative research, even when less frequently cited, should be given special attention as early as possible, because they may suddenly attract many citations after a period of sleep, as reflected in our case study.

Originality/value

The definition of PR(s) as the first paper(s) that cited the SB article (self-citations excluded) has its limitations. Instead, the SB-PR co-citations should be given priority in the current environment of scholarly communication. Since the “premature” or “transformative” breakthroughs in the challenge-type SB documents are either beyond the current knowledge domain, or violate established paradigms, people’s psychological distance from the SB is larger than that from the PR, which explains why the annual citations of the PR are usually higher than those of the SB, especially prior to or during the SB’s citation boom period.

Keywords

  • Citation history
  • Delayed recognition
  • Awakening mechanisms
  • Transformative innovation
  • Nobel Prize

access type Open Access

Understanding the Correlations between Social Attention and Topic Trends of Scientific Publications

Published Online: 01 Sep 2017
Page range: 28 - 49

Abstract

Purpose

We propose and apply a simplified nowcasting model to understand the correlations between social attention and topic trends of scientific publications.

Design/methodology/approach

First, topics are generated from the obesity corpus by using the latent Dirichlet allocation (LDA) algorithm and time series of keyword search trends in Google Trends are obtained. We then establish the structural time series model using data from January 2004 to December 2012, and evaluate the model using data from January 2013. We employ a state-space model to separate different non-regression components in an observational time series (i.e. the tendency and the seasonality) and apply the “spike and slab prior” and stepwise regression to analyze the correlations between the regression component and the social media attention. The two parts are combined using Markov-chain Monte Carlo sampling techniques to obtain our results.

Findings

The results of our study show that (1) the number of publications on child obesity increases at a lower rate than that of diabetes publications; (2) the number of publications on a given topic may exhibit a relationship with the season or time of year; and (3) there exists a correlation between the number of publications on a given topic and its social media attention, i.e. the search frequency related to that topic as identified by Google Trends. We found that our model is also able to predict the number of publications related to a given topic.

Research limitations

First, we study correlation rather than causality between topics’ trends and social media. As a result, the relationships might not be robust, so we cannot predict the future in the long run. Second, we cannot identify the reasons or conditions that are driving obesity topics to present such tendencies and seasonal patterns, so we might need to conduct a “field” study in the future. Third, we need to improve the efficiency of our model by finding more efficient variable selection models, because the stepwise regression method is time consuming, especially for a large number of variables.

Practical implications

This paper analyzes publication topic trends from three perspectives: tendency, seasonality, and correlation with social media attention, providing a new perspective for identifying and understanding topical themes in academic publications.

Originality/value

To the best of our knowledge, we are the first to apply the state-space model to the relationships between healthcare-related publications and social media, investigating how a topic’s evolution relates to people’s search behavior in social media. This paper thus provides a new viewpoint in the correlation analysis area, and demonstrates the value of considering social media attention in the analysis of publication topic trends.

Keywords

  • Social media
  • Publication topic trends
  • Correlation
  • State-space model
  • Variable selection
  • Nowcasting

access type Open Access

Gauging a Firm’s Innovative Performance Using an Integrated Structural Index for Patents

Published Online: 01 Sep 2017
Page range: 6 - 27

Abstract

Purpose

In this contribution we try to find new indicators to measure characteristics of a firm’s patents and their influence on a company’s profits.

Design/methodology/approach

We recognize that patent evaluation and the influence of patents on a company’s profits are complicated issues requiring different perspectives. For this reason we design two types of structural h-indices, derived from the International Patent Classification (IPC). In a case study we apply not only basic statistics but also a nested case-control methodology.
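The abstract does not give the formulas for the two structural h-indices, so the sketch below shows only the classical h-index that such variants take as their starting point: the largest h such that h items each have at least h citations. Applying it to per-patent citation counts, and the example numbers themselves, are assumptions made for illustration rather than the authors’ definition.

```python
def h_index(citations):
    """Classical h-index: largest h with h items cited at least h times each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` items all have at least `rank` citations
        else:
            break  # counts only decrease from here, so h cannot grow further
    return h


# Hypothetical citation counts for five patents held by one firm.
example_h = h_index([10, 8, 5, 4, 3])
```

For these hypothetical counts the four most-cited patents each have at least four citations, but there are not five patents with at least five, so the index is 4.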

Findings

The resulting indicator values based on a large dataset (19,080 patents in total) from the pharmaceutical industry show that the new structural indices are significantly correlated with a firm’s profits.

Research limitations

The new structural index and the synthetic structural index have so far been applied in only one case study in the pharmaceutical industry.

Practical implications

Our study has useful implications for patentometric studies and suggests that firms of different sizes should include sound research and development (R&D) policy management. The structural h-index can be used to gauge the profits resulting from the innovative performance of a firm’s patent portfolio.

Originality/value

Traditionally, the breadth and depth of a firm’s patents and their citations are considered separately. This approach, however, does not provide an integrated insight into the major characteristics of a firm’s patents. The Sh(Y) index, proposed in our investigation, can reflect a firm’s innovation activities, its technological breadth, and its influence in an integrated way.

Keywords

  • Patent analysis
  • Structural h-index
  • Market value of patents
  • Technological value of patents
  • Pharmaceutical industry
  • Nested case-control
