Big Metadata, Smart Metadata, and Metadata Capital: Toward Greater Synergy Between Data Science and Metadata
22 Aug 2017
About this article
Article category: Expert Review
Published online: 22 Aug 2017
Pages: 19 - 36
Received: 10 Jun 2017
Accepted: 26 Jul 2017
DOI: https://doi.org/10.1515/jdis-2017-0012
© 2017 Jane Greenberg
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.



The five Vs of big metadata
Five Vs | Definition |
---|---|
Volume | The quantity and usefulness of metadata generated daily confirm the existence of big metadata. At times the metadata is no larger in size (bytes) than the data it describes. At other times the metadata exceeds the data being described or tracked, owing to the complexity of the data lifecycle activity. Linked data offers an example, with metadata renderings that can be larger than the data object(s) being represented. As with big data, not all big metadata is useful, and a challenge is to identify the big metadata that is useful for data science and analytic endeavors. |
Velocity | Metadata is generated via automatic processes at immense speed, correlating with the rate of digital transactions. For example, searching Google, answering an email, purchasing an item online, and day-to-day office activities such as word processing all log data, as well as associated metadata. |
Variety | Metadata reflects the wide variety of data formats, types, and genres, along with the extensive range of data and metadata lifecycles. In addition, the different types of metadata (e.g., discovery, technical, preservation) and unique domain-specific metadata requirements intensify the variety. |
Variability | There is an unmistakable unevenness of metadata across the digital ecosystem. Data descriptions lack uniformity across different domains, systems, and processes. This unevenness can be profound even within a domain, given economic factors supporting metadata generation, competing standards, or simply differing adoption policies. For example, two organizations may use the same metadata standard but have different implementation practices. Even when standardization is imposed, an organization, a process, or human activity can still contribute to inconsistencies. |
Value | |
Metadata, as the |