- Journal details
- First published
- 30 Mar 2017
- Publication frequency
- 4 times per year
- Open access
Introduction of the Inaugural Issue of the Journal of Data and Information Science
Published online: 01 Sep 2017
Pages: 1 - 2
- Open access
Let the Data Speak for Themselves: Opportunities and Caveats
Pages: 3 - 5
- Open access
How Users Search the Mobile Web: A Model for Understanding the Impact of Motivation and Context on Search Behaviors
Pages: 98 - 122
This study explores how search motivation and context influence mobile Web search behaviors.
We studied 30 experienced mobile Web users via questionnaires, semi-structured interviews, and an online diary tool that participants used to record their daily search activities. SQLite Developer was used to extract data from the users’ phone logs for correlation analysis in Statistical Product and Service Solutions (SPSS).
One quarter of mobile search sessions were driven by two or more search motivations. Curiosity was especially difficult to distinguish from time killing in some participants’ reports. Multi-dimensional contexts and motivations influenced mobile search behaviors. Among the context dimensions, gender, place, the activities participants engaged in while searching, task importance, portal, and interpersonal relations (whether accompanied or alone when searching) correlated with each other.
The sample consisted entirely of college students, so our findings may not generalize to other populations. More participants and a longer experimental duration would improve the accuracy and objectivity of the research.
Motivation analysis and search context recognition can help mobile service providers design applications and services for particular mobile contexts and usages.
Most current research focuses on specific contexts, such as studies on place, or other contextual influences on mobile search, and lacks a systematic analysis of mobile search context. Based on analysis of the impact of mobile search motivations and search context on search behaviors, we built a multi-dimensional model of mobile search behaviors.
- Mobile search
- Query formulation
- Search motivation
- Search context
- Contextual dimension
- Open access
Dpaper: An Authoring Tool for Extractable Digital Papers
Pages: 86 - 97
To develop a structured, rich media digital paper authoring tool with an object-based model that enables interactive, playable, and convertible functions.
We propose Dpaper to organize the content (text, data, rich media, etc.) of dissertation papers as XML and HTML5 files by means of digital objects and digital templates.
Dpaper provides a structured-paper editorial platform on which authors of PhD dissertations can organize research materials and generate various digital paper objects that are playable and reusable. The PhD papers are represented as Web pages and structured XML files, which are marked with semantic tags.
The proposed tool only provides access to a limited number of digital objects. For instance, the tool cannot create equations and graphs, and typesetting is not yet flexible compared to MS Word.
The Dpaper tool is designed to break away from the unstructured content organization of traditional papers, and makes a paper accessible not only for reading but also for exploitation as data, so that the document is extractable and reusable. As a result, Dpaper can make the digital publishing of dissertation texts more flexible and efficient, and their data more assessable.
The Dpaper tool solves the challenge of making a paper structured and object-based at the authoring stage, and has practical value for semantic publishing.
- Extractable paper
- Digital object
- Authoring tool
- Open access
A Bootstrapping-based Method to Automatically Identify Data-usage Statements in Publications
Pages: 69 - 85
Our study proposes a bootstrapping-based method to automatically extract data-usage statements from academic texts.
The method for data-usage statements extraction starts with seed entities and iteratively learns patterns and data-usage statements from unlabeled text. In each iteration, new patterns are constructed and added to the pattern list based on their calculated score. Three seed-selection strategies are also proposed in this paper.
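The iterative loop described here can be illustrated with a toy sketch. It is not the authors' implementation: the abstract mentions a triple representation of sentences and three seed-selection strategies, none of which are specified, so this version swaps in a much simpler pattern (the two words preceding an entity) and a hypothetical score (the share of a pattern's matches that are already known entities). The function name `bootstrap`, the `min_score` threshold, and the example sentences are all assumptions made for illustration.

```python
from collections import defaultdict


def bootstrap(sentences, seeds, iterations=5, min_score=0.5):
    """Toy bootstrapping: alternate between accepting patterns and harvesting entities.

    A pattern is the pair of tokens immediately to the left of a token
    (an assumption; the paper uses a triple representation instead).
    """
    # Index every two-word left context by the tokens that follow it.
    fillers_by_context = defaultdict(set)
    for sentence in sentences:
        tokens = sentence.split()
        for i in range(2, len(tokens)):
            fillers_by_context[(tokens[i - 2], tokens[i - 1])].add(tokens[i])

    entities = set(seeds)
    patterns = set()
    for _ in range(iterations):
        # Score each context by the share of its fillers that are known entities.
        for context, fillers in fillers_by_context.items():
            known = fillers & entities
            if known and len(known) / len(fillers) >= min_score:
                patterns.add(context)
        # Apply every accepted pattern to the unlabeled text to find new entities.
        harvested = set()
        for context in patterns:
            harvested |= fillers_by_context[context]
        if harvested <= entities:
            break  # nothing new was learned, so stop early
        entities |= harvested
    return patterns, entities


if __name__ == "__main__":
    # Hypothetical sentences, invented for this example.
    corpus = [
        "we evaluate on the MNIST dataset",
        "we evaluate on the CIFAR dataset",
        "experiments use the MNIST corpus",
        "experiments use the Reuters corpus",
        "we thank the reviewers",
    ]
    learned_patterns, learned_entities = bootstrap(corpus, {"MNIST"})
    print(sorted(learned_patterns))
    print(sorted(learned_entities))
```

Starting from a single seed, the loop accepts the contexts that precede it and then uses those contexts to pick up further dataset names; a real system would need stricter scoring to limit semantic drift.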
The performance of the method is verified by means of experiments on real data collected from computer science journals. The results show that the method can achieve satisfactory performance regarding precision of extraction and extensibility of obtained patterns.
While the triple representation of sentences is effective and efficient for extracting data-usage statements, it is unable to handle complex sentences. Additional features that can address complex sentences should thus be explored in the future.
Data-usage statements extraction is beneficial for data-repository construction and facilitates research on data-usage tracking, dataset-based scholar search, and dataset evaluation.
To the best of our knowledge, this paper is among the first to address the important task of automatically extracting data-usage statements from real data.
- Data-usage statements extraction
- Information extraction
- Unsupervised learning
- Academic text-mining
- Open access
A Bibliometric Framework for Identifying “Princes” Who Wake up the “Sleeping Beauty” in Challenge-type Scientific Discoveries
Pages: 50 - 68
This paper develops and validates a bibliometric framework for identifying the “princes” (PR) who wake up the “sleeping beauty” (SB) in challenge-type scientific discoveries, so as to figure out the awakening mechanisms, and promote potentially valuable but not readily accepted innovative research. (A PR is a research study.)
We propose that PR candidates must meet the following four criteria: (1) be published near the time when the SB began to attract a lot of citations; (2) be highly cited papers themselves; (3) receive a substantial number of co-citations with the SB; and (4) within the challenge-type discoveries which contradict established theories, the “pulling effect” of the PR on the SB must be strong. We test the usefulness of the bibliometric framework through a case study of a key publication by the 2014 chemistry Nobel laureate Stefan W. Hell, who negated Ernst Abbe’s diffraction limit theory, one of the most prominent paradigms in the natural sciences.
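The first three criteria can be read as a filter over candidate papers, sketched below. This is only an illustration under stated assumptions: the abstract gives no numeric thresholds, so `window`, `min_citations`, and `min_cocitations` are hypothetical parameters, the `Paper` record and the sample data are invented, and criterion (4), the "pulling effect", is left out because the abstract does not define how it is measured. Ranking the survivors by co-citations with the SB is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    year: int
    citations: int
    cocitations_with_sb: int


def prince_candidates(papers, awakening_year, window=2,
                      min_citations=100, min_cocitations=20):
    """Keep papers that satisfy criteria (1)-(3); all thresholds are hypothetical."""
    kept = [
        p for p in papers
        if abs(p.year - awakening_year) <= window        # (1) published near the awakening
        and p.citations >= min_citations                 # (2) highly cited itself
        and p.cocitations_with_sb >= min_cocitations     # (3) often co-cited with the SB
    ]
    # Rank by co-citations with the SB, strongest first (an assumed ordering).
    return sorted(kept, key=lambda p: p.cocitations_with_sb, reverse=True)


if __name__ == "__main__":
    # Invented records, for illustration only.
    sample = [
        Paper("A", 2000, 900, 150),
        Paper("B", 2001, 400, 60),
        Paper("C", 1990, 2000, 300),  # too early: fails criterion (1)
        Paper("D", 2000, 30, 25),     # too few citations: fails criterion (2)
    ]
    for paper in prince_candidates(sample, awakening_year=2000):
        print(paper.title, paper.cocitations_with_sb)
```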
The first-ranked candidate PR article identified by the bibliometric framework is in line with historical facts. An SB may need one or more PRs and even “retinues” to be “awakened.” Documents with potential awakening functionality tend to be published in prestigious multidisciplinary journals with higher impact and wider scope than the journals publishing SBs.
The above framework is only applicable to transformative innovations, and the conclusions are drawn from the analysis of one typical SB and her awakening process. Therefore the generality of our work might be limited.
Publications belonging to so-called transformative research, even when less frequently cited, should be given special attention as early as possible, because they may suddenly attract many citations after a period of sleep, as reflected in our case study.
The definition of PR(s) as the first paper(s) that cited the SB article (self-citations excluded) has its limitations. Instead, the SB-PR co-citations should be given priority in the current environment of scholarly communication. Since the “premature” or “transformative” breakthroughs in the challenge-type SB documents are either beyond the current knowledge domain, or violate established paradigms, people’s psychological distance from the SB is larger than that from the PR, which explains why the annual citations of the PR are usually higher than those of the SB, especially prior to or during the SB’s citation boom period.
- Citation history
- Delayed recognition
- Awakening mechanisms
- Transformative innovation
- Nobel Prize
- Open access
Understanding the Correlations between Social Attention and Topic Trends of Scientific Publications
Pages: 28 - 49
We propose and apply a simplified nowcasting model to understand the correlations between social attention and topic trends of scientific publications.
First, topics are generated from the obesity corpus by using the latent Dirichlet allocation (LDA) algorithm, and time series of keyword search trends in Google Trends are obtained. We then establish the structural time series model using data from January 2004 to December 2012, and evaluate the model using data from January 2013. We employ a state-space model to separate different non-regression components in an observational time series (i.e., the tendency and the seasonality) and apply the “spike and slab prior” and stepwise regression to analyze the correlations between the regression component and the social media attention. The two parts are combined using Markov-chain Monte Carlo sampling techniques to obtain our results.
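The decomposition described here can be written out as a structural time series in state-space form. The abstract does not give the model equations, so the following is the commonly used textbook form with a local linear trend and a seasonal component, offered as an assumption rather than the authors' exact specification. The symbols are introduced here: $y_t$ is the monthly publication count, $\mu_t$ the tendency, $\tau_t$ the seasonality with period $S$, $x_t$ the vector of search-trend regressors, and $\beta$ the coefficients that would receive the spike-and-slab prior.

```latex
\begin{align}
  y_t        &= \mu_t + \tau_t + \beta^{\top} x_t + \varepsilon_t,
             & \varepsilon_t &\sim \mathcal{N}(0, \sigma_\varepsilon^2) \\
  \mu_{t+1}  &= \mu_t + \delta_t + u_t,
             & u_t &\sim \mathcal{N}(0, \sigma_u^2) \\
  \delta_{t+1} &= \delta_t + v_t,
             & v_t &\sim \mathcal{N}(0, \sigma_v^2) \\
  \tau_{t+1} &= -\sum_{s=0}^{S-2} \tau_{t-s} + w_t,
             & w_t &\sim \mathcal{N}(0, \sigma_w^2)
\end{align}
```

In this form the first equation is the observation equation and the remaining three are state transitions; the non-regression part $\mu_t + \tau_t$ and the regression part $\beta^{\top} x_t$ are the two pieces that the sampling step combines.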
The results of our study show that (1) the number of publications on child obesity increases at a lower rate than that of diabetes publications; (2) the number of publications on a given topic may exhibit a relationship with the season or time of year; and (3) there exists a correlation between the number of publications on a given topic and its social media attention, i.e., the search frequency related to that topic as identified by Google Trends. We found that our model is also able to predict the number of publications related to a given topic.
First, we study a correlation rather than causality between topics’ trends and social media. As a result, the relationships might not be robust, so we cannot predict the future in the long run. Second, we cannot identify the reasons or conditions that are driving obesity topics to present such tendencies and seasonal patterns, so we might need to conduct a “field” study in the future. Third, we need to improve the efficiency of our model by finding more efficient variable selection models, because the stepwise regression method is time consuming, especially for a large number of variables.
This paper analyzes publication topic trends from three perspectives: tendency, seasonality, and correlation with social media attention, providing a new perspective for identifying and understanding topical themes in academic publications.
To the best of our knowledge, we are the first to apply the state-space model to examine the relationship between healthcare-related publications and social media, that is, between a topic’s evolution and people’s search behavior in social media. This paper thus provides a new viewpoint in the correlation analysis area, and demonstrates the value of considering social media attention in the analysis of publication topic trends.
- Social media
- Publication topic trends
- State-space model
- Variable selection
- Open access
Gauging a Firm’s Innovative Performance Using an Integrated Structural Index for Patents
Pages: 6 - 27
In this contribution we try to find new indicators to measure characteristics of a firm’s patents and their influence on a company’s profits.
We realize that patent evaluation and influence on a company’s profits is a complicated issue requiring different perspectives. For this reason we design two types of structural h-indices, derived from the International Patent Classification (IPC). In a case study we apply not only basic statistics but also a nested case-control methodology.
The resulting indicator values based on a large dataset (19,080 patents in total) from the pharmaceutical industry show that the new structural indices are significantly correlated with a firm’s profits.
The new structural index and the synthetic structural index have so far been applied in only one case study, in the pharmaceutical industry.
Our study offers useful implications for patentometric studies and leads to suggestions for firms of different sizes to adopt sound research and development (R&D) policy management. The structural h-index can be used to gauge the profits resulting from the innovative performance of a firm’s patent portfolio.
Traditionally, the breadth and depth of patents of a firm and their citations are considered separately. This approach, however, does not provide an integrated insight into the major characteristics of a firm’s patents. The
- Patent analysis
- Structural h-index
- Market value of patents
- Technological value of patents
- Pharmaceutical industry
- Nested case-control