rss_2.0Mathematics FeedSciendo RSS Feed for Mathematics Feed and Adjusting Bias Due to Mixed-Mode in Aspect of Daily Life Survey<abstract><title style='display:none'>Abstract</title><p>The mixed-mode (MM) designs are adopted by NSIs both to contrast declining response and coverage rates and to reduce the cost of the surveys. However, MM introduces several issues that must be addressed both at the design phase, by defining the best collection instruments to contain the measurement error, and at the estimation phase, by assessing and adjusting the mode effect. In the MM surveys, the mode effect refers to the introduction of bias effects on the estimate of the parameters of interest due to the difference in the selection and measurement errors specific to each mode. The switching of a survey from single to mixed-mode is a delicate operation: the accuracy of the estimates must be ensured in order to preserve their consistency and comparability over time. This work focuses on the methods chosen for the evaluation of the mode effect in the Italian National Institute of Statistics (ISTAT) mixed-mode survey “Aspects of Daily Life – 2017”, in the experimental context for which an independent control single-mode (SM) PAPI sample was planned to assess the introduction of the sequential web/PAPI survey. The presented methods aim to analyze the causes that can determine significant differences in the estimates obtained with the SM and MM surveys.</p></abstract>ARTICLE2021-06-22T00:00:00.000+00:00Applying Machine Learning for Automatic Product Categorization<abstract><title style='display:none'>Abstract</title><p>Every five years, the U.S. Census Bureau conducts the Economic Census, the official count of US businesses and the most extensive collection of data related to business activity. Businesses, policymakers, governments and communities use Economic Census data for economic development, business decisions, and strategic planning. 
The Economic Census provides key inputs for economic measures such as the Gross Domestic Product and the Producer Price Index. The Economic Census requires businesses to fill out a lengthy questionnaire, including an extended section about the goods and services provided by the business.</p><p>To address the challenges of high respondent burden and low survey response rates, we devised a strategy to automatically classify goods and services based on product information provided by the business. We asked several businesses to provide a spreadsheet containing Universal Product Codes and associated text descriptions for the products they sell. We then used natural language processing to classify the products according to the North American Product Classification System. This novel strategy classified text with very high accuracy: our best algorithms exceeded 90%.</p></abstract>ARTICLE2021-06-22T00:00:00.000+00:00Preface Hybrid Technique for the Multiple Imputation of Survey Data<abstract><title style='display:none'>Abstract</title><p>Most of the background variables in MICS (Multiple Indicator Cluster Surveys) are categorical with many categories. Like many other survey data, the MICS 2014 women’s data suffers from a large number of missing values. Additionally, complex dependencies may exist among a large number of categorical variables in such surveys. The most commonly used parametric multiple imputation (MI) approaches based on log-linear models or chained equations (MICE) become problematic in these situations, and the implemented algorithms often fail. On the other hand, nonparametric MI techniques based on Bayesian latent class models work very well if only categorical variables are considered. This article describes how chained equations MI for continuous variables can be made dependent on categorical variables which have been imputed beforehand by using latent class models. 
Root mean square errors (RMSEs) and coverage rates of 95% confidence intervals (CIs) for generalized linear models (GLMs) with binary response are estimated in a simulation study, and the proposed method is compared with various existing MI methods. The proposed method outperforms the MICE algorithms in most cases, with less computational time. The results obtained by the simulation study are supported by a real data example.</p></abstract>ARTICLE2021-06-22T00:00:00.000+00:00A Diagnostic for Seasonality Based Upon Polynomial Roots of ARMA Models<abstract><title style='display:none'>Abstract</title><p>Methodology for seasonality diagnostics is extremely important for statistical agencies, because such tools are necessary for deciding whether to seasonally adjust a given series, and whether such an adjustment is adequate. This methodology must be statistical, in order to furnish quantification of Type I and II errors, and also to provide understanding about the requisite assumptions. We connect the concept of seasonality to a mathematical definition regarding the oscillatory character of the moving average (MA) representation coefficients, and define a new seasonality diagnostic based on autoregressive (AR) roots. The diagnostic is able to assess different forms of seasonality: dynamic versus stable, of arbitrary seasonal periods, for both raw data and seasonally adjusted data. An extension of the AR diagnostic to an MA diagnostic allows for the detection of over-adjustment. Joint asymptotic results are provided for the diagnostics as they are applied to multiple seasonal frequencies, allowing for a global test of seasonality. 
We illustrate the method through simulation studies and several empirical examples.</p></abstract>ARTICLE2021-06-22T00:00:00.000+00:00The Evolution of the Italian Framework to Measure Well-Being<abstract><title style='display:none'>Abstract</title><p>Recently, a new approach for measuring well-being was developed by eighteen European countries in the wake of the “Beyond GDP movement” started in the 1990s and continued by the Stiglitz Commission. Eleven of these European economies use measures of well-being for monitoring public policy. The Italian Statistical Institute (Istat) jointly with the National Council for Economics and Labor (CNEL) developed a multi-dimensional framework for measuring “equitable and sustainable well-being” (Bes), and since 2013 Istat has published an annual report on well-being. The Bes framework is continuously updated to take into account new challenges: the exploitation of new data sources, to produce better indicators; new ways of making communication more effective and fostering public awareness; the inclusion of well-being indicators in the budget documents, as established by law. Especially for the latter, the Italian Bes can be considered a forerunner and, more generally, the Italian experience is one of the most relevant at the European level, showing the potential to become a benchmark for other countries. This article illustrates the development of the Italian Bes, focusing on its recent progress and challenges.</p></abstract>ARTICLE2021-06-22T00:00:00.000+00:00Measuring and Communicating the Uncertainty in Official Economic Statistics<abstract><title style='display:none'>Abstract</title><p>Official economic statistics are uncertain even if not always interpreted or treated as such. From a historical perspective, this article reviews different categorisations of data uncertainty, specifically the traditional typology that distinguishes sampling from nonsampling errors and a newer typology of Manski (2015). 
Throughout, the importance of measuring and communicating these uncertainties is emphasised, as hard as it can prove to measure some sources of data uncertainty, especially those relevant to administrative and big data sets. Accordingly, this article both seeks to encourage further work into the measurement and communication of data uncertainty in general and to introduce the Comunikos (COMmunicating UNcertainty In Key Official Statistics) project at Eurostat. Comunikos is designed to evaluate alternative ways of measuring and communicating data uncertainty specifically in contexts relevant to official economic statistics.</p></abstract>ARTICLE2021-06-22T00:00:00.000+00:00Improving Time Use Measurement with Personal Big Data Collection – The Experience of the European Big Data Hackathon 2019<abstract><title style='display:none'>Abstract</title><p>This article assesses the experience with i-Log at the European Big Data Hackathon 2019, a satellite event of the New Techniques and Technologies for Statistics (NTTS) conference, organised by Eurostat. i-Log is a system that enables capturing personal big data from smartphones’ internal sensors to be used for time use measurement. It allows the collection of heterogeneous types of data, enabling new possibilities for sociological urban field studies. Sensor data such as those related to the location or the movements of the user can be used to investigate and gain insights into the time diaries’ answers and assess their overall quality. The key idea is that the users’ answers are used to train machine-learning algorithms, allowing the system to learn from the user’s habits and to generate new time diaries’ answers. In turn, these new labels can be used to assess the quality of existing ones, or to fill the gaps when the user does not provide an answer. 
The aim of this paper is to introduce the pilot study, the i-Log system and the methodological evidence that emerged during the survey.</p></abstract>ARTICLE2021-06-22T00:00:00.000+00:00A Structural Equation Model for Measuring Relative Development of Hungarian Counties in the Years 1994–2016<abstract><title style='display:none'>Abstract</title><p>The relative development of Hungarian counties is generally described by the GDP per capita indicator, but this figure does not cover the knowledge gap on the liveability of the regions. The other frequently used method, the indicator system, does not emphasize the structure of causes and consequences of regional development, and so does not provide information on which factors are more likely to be the causes or, conversely, the consequences of differing regional development. To overcome the shortcomings of the above-mentioned methods, we created a structural equation model (SEM) at NUTS 3 level for the years 1994–2016 based on the LISREL estimation procedure. The applied model can be classified as experimental statistics, but it uses data only from official statistics, namely the regional indicators published by the Hungarian Central Statistical Office. The model assumes that economic development depends on observable economic indicators and in turn determines regional development. In addition, regional development is also explained by non-economic indicators: social, demographic, cultural and infrastructural. The variable selection and the classification into causes and consequences was a three-step process, and the factors were classified by analysis of correlations, cross-correlations and Granger causality. 
The estimation results provided a basis for a deeper analysis of how regional development has changed in Hungary since the regime change, and how these variables were influenced by the country’s integration into the global value chain.</p></abstract>ARTICLE2021-06-22T00:00:00.000+00:00Measuring the Accuracy of Aggregates Computed from a Statistical Register<abstract><title style='display:none'>Abstract</title><p>The Italian National Statistical Institute (Istat) is currently engaged in a modernization programme that foresees a significant revision of the methods traditionally used for the production of official statistics. The main concept behind this transformation is the use of the Integrated System Statistical Registers, created by a massive integration of administrative archives and survey data. In this article, we focus on how to measure the accuracy of register estimates of a population total from measurements calculated at the unit level. We propose the global mean squared error (GMSE) as a statistical quantity suitable for measuring accuracy in the context of the production of official statistics. It can be defined to explicitly consider the main sources of uncertainty that may affect registers. The article suggests a feasible calculation strategy for the GMSE that allows National Statistical Institutes to build algorithms that can promptly be applied for each user request, thus improving the relevance, transparency and confidence of official statistics. Through a simulation study, we verified the efficacy of the proposed strategy.</p></abstract>ARTICLE2021-06-22T00:00:00.000+00:00Variance Estimation after Mass Imputation Based on Combined Administrative and Survey Data<abstract><title style='display:none'>Abstract</title><p>This article discusses methods for evaluating the variance of estimated frequency tables based on mass imputation. We consider a general set-up in which data may be available from both administrative sources and a sample survey. 
Mass imputation involves predicting the missing values of a target variable for the entire population. The motivating application for this article is the Dutch virtual population census, for which it has been proposed to use mass imputation to estimate tables involving educational attainment. We present a new analytical design-based variance estimator for a frequency table based on mass imputation. We also discuss a more general bootstrap method that can be used to estimate this variance. Both approaches are compared in a simulation study on artificial data and in an application to real data of the Dutch census of 2011.</p></abstract>ARTICLE2021-06-22T00:00:00.000+00:00A Product Match Adjusted R Squared Method for Defining Products with Transaction Data<abstract><title style='display:none'>Abstract</title><p>The occurrence of relaunches of consumer goods at the barcode (GTIN) level is a well-known phenomenon in transaction data of consumer purchases. GTINs of disappearing and reintroduced items have to be linked in order to capture possible price changes.</p><p>This article presents a method that groups GTINs into strata (‘products’) by balancing two measures: an explained variance (R squared) measure for the ‘homogeneity’ of GTINs within products, while the second expresses the degree to which products can be ‘matched’ over time with respect to a comparison period. The resulting product ‘match adjusted R squared’ (MARS) combines explained variance in product prices with product match over time, so that different stratification schemes can be ranked according to the combined measure.</p><p>MARS has been applied to a broad range of product types. Individual GTINs are suitable as products for food and beverages, but not for product types with higher rates of churn, such as clothing, pharmacy products and electronics. In these cases, products are defined as combinations of characteristics, so that GTINs with the same characteristics are grouped into the same product. 
Future research focuses on further developments of MARS, such as attribute selection when data sets contain large numbers of variables.</p></abstract>ARTICLE2021-06-22T00:00:00.000+00:00Bounds on the Number of Edges of Edge-Minimal, Edge-Maximal and L-Hypertrees<abstract> <title style='display:none'>Abstract</title> <p>In their paper, Bounds on the number of edges in hypertrees, G.Y. Katona and P.G.N. Szabó introduced a new, natural definition of hypertrees in k-uniform hypergraphs and gave lower and upper bounds on the number of edges. They also defined edge-minimal, edge-maximal and l-hypertrees and proved an upper bound on the edge number of l-hypertrees. </p> <p>In the present paper, we verify the asymptotic sharpness of the <inline-graphic xmlns:xlink="" xlink:href="graphic/Untitled-5.jpg"/> upper bound on the number of edges of k-uniform hypertrees given in the above-mentioned paper. We also make an improvement on the upper bound of the edge number of 2-hypertrees and give a general extension construction with its consequences. </p> <p>We give lower and upper bounds on the maximal number of edges of k-uniform edge-minimal hypertrees and a lower bound on the number of edges of k-uniform edge-maximal hypertrees. In the former case, the sharp upper bound is conjectured to be asymptotically <inline-graphic xmlns:xlink="" xlink:href="graphic/Untitled-6.jpg"/>. </p> </abstract>ARTICLE2016-04-15T00:00:00.000+00:00Distance Magic Cartesian Products of Graphs<abstract> <title style='display:none'>Abstract</title> <p>A distance magic labeling of a graph G = (V,E) with |V| = n is a bijection ℓ : V → {1, . . . , n} such that the weight of every vertex v, computed as the sum of the labels on the vertices in the open neighborhood of v, is a constant. </p> <p>In this paper, we show that hypercubes with dimension divisible by four are not distance magic. 
We also provide some positive results by proving necessary and sufficient conditions for the Cartesian product of certain complete multipartite graphs and the cycle on four vertices to be distance magic.</p> </abstract>ARTICLE2016-04-15T00:00:00.000+00:00New Bounds on the Signed Total Domination Number of Graphs<abstract> <title style='display:none'>Abstract</title> <p>In this paper, we study the signed total domination number in graphs and present new sharp lower and upper bounds for this parameter. For example, by making use of the classic theorem of Turán [8], we present a sharp lower bound on K<sub>r+1</sub>-free graphs for r ≥ 2. Applying the concept of total limited packing we bound the signed total domination number of G with δ(G) ≥ 3 from above by <inline-graphic xmlns:xlink="" xlink:href="graphic/Untitled-1.jpg"/>. Also, we prove that γ<sub>st</sub>(T) ≤ n − 2(s − s′) for any tree T of order n, with s support vertices and s′ support vertices of degree two. Moreover, we characterize all trees attaining this bound.</p> </abstract>ARTICLE2016-04-15T00:00:00.000+00:00End Simplicial Vertices in Path Graphs<abstract> <title style='display:none'>Abstract</title> <p>A graph is a path graph if there is a tree, called a UV-model, whose vertices are the maximal cliques of the graph and for each vertex x of the graph the set of maximal cliques that contains it induces a path in the tree. A graph is an interval graph if there is a UV-model that is a path, called an interval model. Gimbel [3] characterized those vertices in interval graphs for which there is some interval model where the interval corresponding to those vertices is an end interval. 
In this work, we give a characterization of those simplicial vertices x in path graphs for which there is some UV-model where the maximal clique containing x is a leaf in this UV-model.</p> </abstract>ARTICLE2016-04-15T00:00:00.000+00:00The Steiner Wiener Index of A Graph<abstract> <title style='display:none'>Abstract</title> <p>The Wiener index W(G) of a connected graph G, introduced by Wiener in 1947, is defined as W(G) = ∑<sub>u,v∈V(G)</sub> d<sub>G</sub>(u, v), where d<sub>G</sub>(u, v) is the distance between vertices u and v of G. The Steiner distance in a graph, introduced by Chartrand et al. in 1989, is a natural generalization of the concept of classical graph distance. For a connected graph G of order at least 2 and S ⊆ V(G), the Steiner distance d(S) of the vertices of S is the minimum size of a connected subgraph whose vertex set contains S. We now introduce the concept of the Steiner Wiener index of a graph. The Steiner k-Wiener index SW<sub>k</sub>(G) of G is defined by <inline-graphic xmlns:xlink="" xlink:href="graphic/Untitled-2.jpg"/>. Expressions for SW<sub>k</sub> for some special graphs are obtained. We also give sharp upper and lower bounds of SW<sub>k</sub> of a connected graph, and establish some of its properties in the case of trees. An application of the Steiner Wiener index in chemistry is reported in another paper of ours.</p> </abstract>ARTICLE2016-04-15T00:00:00.000+00:00The Quest for A Characterization of Hom-Properties of Finite Character<abstract> <title style='display:none'>Abstract</title> <p>A graph property is a set of (countable) graphs. A homomorphism from a graph G to a graph H is an edge-preserving map from the vertex set of G into the vertex set of H; if such a map exists, we write G → H. Given any graph H, the hom-property →H is the set of H-colourable graphs, i.e., the set of all graphs G satisfying G → H. 
A graph property P is of finite character if, whenever we have that F ∈ P for every finite induced subgraph F of a graph G, then we have that G ∈ P too. We explore some of the relationships of the property attribute of being of finite character to other property attributes such as being finitely-induced-hereditary, being finitely determined, and being axiomatizable. We study the hom-properties of finite character, and prove some necessary and some sufficient conditions on H for →H to be of finite character. A notable (but known) sufficient condition is that H is a finite graph, and our new model-theoretic proof of this compactness result extends from hom-properties to all axiomatizable properties. In our quest to find an intrinsic characterization of those H for which →H is of finite character, we find an example of an infinite connected graph with no finite core and chromatic number 3 but with hom-property not of finite character.</p> </abstract>ARTICLE2016-04-15T00:00:00.000+00:00The Existence of Quasi Regular and Bi-Regular Self-Complementary 3-Uniform Hypergraphs<abstract> <title style='display:none'>Abstract</title> <p>A k-uniform hypergraph H = (V ;E) is called self-complementary if there is a permutation σ : V → V , called a complementing permutation, such that for every k-subset e of V , e ∈ E if and only if σ(e) ∉ E. In other words, H is isomorphic with H′ = (V ; V<sup>(k)</sup> − E). In this paper we define a bi-regular hypergraph and prove that there exists a bi-regular self-complementary 3-uniform hypergraph on n vertices if and only if n is congruent to 0 or 2 modulo 4. We also prove that there exists a quasi regular self-complementary 3-uniform hypergraph on n vertices if and only if n is congruent to 0 modulo 4.</p> </abstract>ARTICLE2016-04-15T00:00:00.000+00:00Heavy Subgraph Conditions for Longest Cycles to Be Heavy in Graphs<abstract> <title style='display:none'>Abstract</title> <p>Let G be a graph on n vertices. 
A vertex of G with degree at least n/2 is called a heavy vertex, and a cycle of G which contains all the heavy vertices of G is called a heavy cycle. In this note, we characterize graphs which contain no heavy cycles. For a given graph H, we say that G is H-heavy if every induced subgraph of G isomorphic to H contains two nonadjacent vertices with degree sum at least n. We find all the connected graphs S such that a 2-connected graph G being S-heavy implies any longest cycle of G is a heavy cycle.</p> </abstract>ARTICLE2016-04-15T00:00:00.000+00:00