Open Access

A Product Match Adjusted R Squared Method for Defining Products with Transaction Data

Journal of Official Statistics
Special Issue on New Techniques and Technologies for Statistics

The occurrence of relaunches of consumer goods at the barcode (GTIN) level is a well-known phenomenon in transaction data of consumer purchases. GTINs of disappearing and reintroduced items have to be linked in order to capture possible price changes.

This article presents a method that groups GTINs into strata (‘products’) by balancing two measures: the first is an explained variance (R squared) measure of the ‘homogeneity’ of GTINs within products, and the second expresses the degree to which products can be ‘matched’ over time with respect to a comparison period. The resulting product ‘match adjusted R squared’ (MARS) combines explained variance in product prices with product match over time, so that different stratification schemes can be ranked according to the combined measure.
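The trade-off between homogeneity and match can be illustrated with a minimal Python sketch. It is not the article's implementation: the abstract does not give the formula, so the sketch assumes an unweighted R squared (between-product share of price variance), a match measure equal to the share of current-period products also present in the comparison period, and a combined score equal to their product. The function names, the toy prices and the GTIN labels are all hypothetical.

```python
from collections import defaultdict


def r_squared(items, product_of):
    """Share of price variance explained by the product grouping.

    items: list of (gtin, price) pairs for the current period.
    product_of: function mapping a GTIN to its product identifier.
    Assumption: unweighted; quantities sold are ignored in this sketch.
    """
    prices = [price for _, price in items]
    grand_mean = sum(prices) / len(prices)
    total_ss = sum((p - grand_mean) ** 2 for p in prices)
    groups = defaultdict(list)
    for gtin, price in items:
        groups[product_of(gtin)].append(price)
    between_ss = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return between_ss / total_ss if total_ss > 0 else 0.0


def match_share(current_gtins, base_gtins, product_of):
    """Share of current-period products that also exist in the comparison period."""
    current = {product_of(g) for g in current_gtins}
    base = {product_of(g) for g in base_gtins}
    return len(current & base) / len(current) if current else 0.0


def mars_score(items, base_gtins, product_of):
    """Assumed combination: R squared multiplied by the match share."""
    gtins = [g for g, _ in items]
    return r_squared(items, product_of) * match_share(gtins, base_gtins, product_of)


# Toy data: "A2", "B1" and "B2" are new GTINs; "B0" has disappeared.
items = [("A1", 10.0), ("A2", 12.0), ("B1", 30.0), ("B2", 32.0)]
base_gtins = ["A1", "B0"]

# Scheme 1: every GTIN is its own product (perfect homogeneity, poor match).
score_gtin = mars_score(items, base_gtins, lambda g: g)      # 0.25
# Scheme 2: group by the first character (slightly less homogeneous, full match).
score_group = mars_score(items, base_gtins, lambda g: g[0])  # about 0.990
```

In this toy case the coarser scheme ranks higher, because the small loss in explained variance is outweighed by the gain in match over time.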

MARS has been applied to a broad range of product types. Individual GTINs are suitable as products for food and beverages, but not for product types with higher rates of churn, such as clothing, pharmacy products and electronics. In these cases, products are defined as combinations of characteristics, so that GTINs with the same characteristics are grouped into the same product. Future research will focus on further developments of MARS, such as attribute selection when data sets contain large numbers of variables.
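Defining products as combinations of characteristics, as described above, can be sketched as a simple grouping step. The snippet below is only an illustration: the attribute names (brand, type, colour), the GTIN labels and the helper function are hypothetical and do not come from the article.

```python
from collections import defaultdict


def group_by_characteristics(gtin_attributes, keys):
    """Group GTINs that share the same values for the selected characteristics.

    gtin_attributes: dict mapping a GTIN to a dict of its characteristics.
    keys: the characteristics that together define a product.
    Returns a dict mapping each combination of values to a list of GTINs.
    """
    products = defaultdict(list)
    for gtin, attrs in gtin_attributes.items():
        product_id = tuple(attrs.get(k) for k in keys)
        products[product_id].append(gtin)
    return dict(products)


# Hypothetical relaunch: "gtin_old" disappears and "gtin_new" replaces it.
gtin_attributes = {
    "gtin_old": {"brand": "X", "type": "T-shirt", "colour": "blue"},
    "gtin_new": {"brand": "X", "type": "T-shirt", "colour": "blue"},
    "gtin_other": {"brand": "Y", "type": "T-shirt", "colour": "red"},
}

products = group_by_characteristics(gtin_attributes, ["brand", "type", "colour"])
```

Here the old and the relaunched GTIN end up in the same product, so a price change between them can be captured. Choosing fewer keys yields broader products that match better over time but are less homogeneous.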

eISSN: 2001-7367
Language: English
Publication frequency: 4 issues per year
Journal subjects: Mathematics, Probability and Statistics