What is metadata?
Metadata, in its shortest definition, is data about data. It is a collection of information that defines and describes data and explains its quality, source, format and other characteristics. Metadata acts as a bridge between data producers and data users.
What is Statistical Metadata?
Statistical metadata is organized information that explains the definition, production, publication, access options and legal basis of statistical data and presents its quality and source, making the data easier to use and manage.
Why is Metadata necessary?
• End-to-end standardization is achieved across all statistical production processes, and a common language is created through the use of standardized processes, terminology (concepts and definitions), variables and code lists.
• The names, descriptions, value domains and meanings of values are defined for all objects used in the statistical production process.
• Coordination and integration within the National and International Statistical System is ensured by describing statistical information in a consistent way. Collaboration is strengthened by preventing duplication.
• Data and metadata are recorded and versioned as soon as they are created, which builds institutional memory.
• When data exists together with its metadata, it becomes easily accessible and understandable for anyone who wants to use it.
• Statistical quality can be enhanced with a powerful metadata system.
Statistical Metadata Components
Basically, statistical metadata consists of two parts:
Structural metadata: It is the metadata used to describe the structure of the data and what it measures. Database tables (data sets), questions, variables and their definitions, code lists, and the data types (numeric, character, date, etc.) and lengths of variables are components of structural metadata.
Reference metadata: It is the metadata that defines the content and quality of statistical data. The purpose and scope of the research, data collection and processing methods, and quality and dissemination indicators (timeliness, announcement of possible changes in the publishing schedule, measurement of sampling errors and non-sampling errors, etc.) are within the scope of reference metadata. The classifications used in statistical production, the statistical unit (household, business, etc.), geographic coverage, documents related to methodology, quality indicators, institutional quality reports, and seasonal adjustment and revision policies can be given as examples of reference metadata. The metadata regarding the press releases on the TURKSTAT website are reference metadata.
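To make the distinction more concrete, the following is a minimal sketch in Python of how the two parts might be represented side by side. All class names, field names and sample values (CodeList, Variable, ReferenceMetadata, DataSet, validate, HOUSEHOLD_MEMBERS) are hypothetical and chosen only for illustration; they are not taken from GSIM, DDI, SDMX or any TURKSTAT system. The sketch shows structural metadata (variables, data types, lengths, code lists) being used to check a record, while reference metadata is carried alongside as descriptive information.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class CodeList:
    """A named set of permitted codes and their meanings (hypothetical)."""
    name: str
    codes: dict[str, str]


@dataclass
class Variable:
    """Structural metadata for one variable of a data set (hypothetical)."""
    name: str
    definition: str
    data_type: str                      # "numeric", "character" or "date"
    length: int                         # maximum number of characters
    code_list: Optional[CodeList] = None


@dataclass
class ReferenceMetadata:
    """Reference metadata: describes content and quality, not structure."""
    purpose: str
    statistical_unit: str
    geographic_coverage: str
    collection_method: str


@dataclass
class DataSet:
    name: str
    variables: list[Variable]
    reference: ReferenceMetadata


def validate(record: dict[str, str], dataset: DataSet) -> list[str]:
    """Check one record against the structural metadata; return any problems."""
    problems = []
    for var in dataset.variables:
        value = record.get(var.name)
        if value is None:
            problems.append(f"{var.name}: missing")
            continue
        if len(value) > var.length:
            problems.append(f"{var.name}: longer than {var.length}")
        if var.data_type == "numeric" and not value.isdigit():
            problems.append(f"{var.name}: not numeric")
        if var.code_list and value not in var.code_list.codes:
            problems.append(
                f"{var.name}: code {value!r} not in {var.code_list.name}"
            )
    return problems


# Illustrative sample values only.
sex = CodeList("SEX", {"1": "Male", "2": "Female"})
household_members = DataSet(
    name="HOUSEHOLD_MEMBERS",
    variables=[
        Variable("AGE", "Age in completed years", "numeric", 3),
        Variable("SEX", "Sex of the household member", "character", 1, sex),
    ],
    reference=ReferenceMetadata(
        purpose="Illustrative household survey",
        statistical_unit="household",
        geographic_coverage="national",
        collection_method="face-to-face interview",
    ),
)

print(validate({"AGE": "34", "SEX": "2"}, household_members))
print(validate({"AGE": "34", "SEX": "9"}, household_members))
```

In this sketch the first record passes with no problems, while the second is flagged because the code "9" is not in the SEX code list. The reference metadata plays no part in the check; it only documents what the data set is for and how it was collected.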
International Model and Standards
Metadata standards enable both the users and the owners of data to understand and use the data correctly, and aim to establish a common meaning and understanding of the data. In this context, links to international models and standards (GSIM, DDI, SDMX, GSBPM, etc.) used in statistical offices are presented below.
Generic Statistical Information Model (GSIM)
Data Documentation Initiative (DDI)
Statistical Data and Metadata Exchange (SDMX) Knowledge Area
Generic Statistical Business Process Model (GSBPM)
OECD Glossary of Statistical Terms
Statistics Explained – Eurostat
RAMON - Eurostat’s Metadata Server
Dublin Core Metadata Initiative (DCMI)