DDI RDF Vocabularies

Current status: The vocabularies underwent public review in 2014. XKOS was officially published in June 2019. Disco has been updated for public review and publication in 2019, and PHDD has been postponed because of similar developments in related vocabularies.

The DDI Alliance has supported work on three RDF vocabularies: XKOS, an RDF vocabulary for describing statistical classifications, which extends the popular SKOS vocabulary; the DDI-RDF Discovery vocabulary (Disco) for publishing metadata about datasets into the Web of Linked Data; and PHDD, a vocabulary for describing existing data in rectangular format.


XKOS - Extended Knowledge Organization System

Current Status: Published

XKOS builds on the widely used Simple Knowledge Organization System (SKOS) to manage statistical classifications and concept management systems. Linked Open Data (LOD) is used to create Web artifacts that machines can interpret, so it is desirable to publish machine-readable statistical classifications and other concept management systems as SKOS instances. The XKOS developers found SKOS insufficient for this purpose: no aspect of SKOS was wrong, but it was incomplete. XKOS was therefore proposed as an extension to SKOS.

XKOS extends SKOS for the needs of statistical classifications. It does so in two main directions. First, it defines a number of terms that enable the representation of statistical classifications with their structure and textual properties, as well as the relations between classifications. Second, it refines SKOS semantic properties to allow the use of more specific relations between concepts. Those specific relations can be used for the representation of classifications or for any other case where SKOS is employed. XKOS adds the extensions that are desirable to meet the requirements of the statistical community.
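The two directions described above can be illustrated with a small sketch. It writes a two-level classification as N-Triples using only the Python standard library. The SKOS and XKOS namespace IRIs and the terms xkos:ClassificationLevel, xkos:depth, and xkos:specializes are taken from the published vocabularies as best understood here and should be checked against the XKOS specification. The example.org classification, its codes and labels, and the integer datatype on the depth value are assumptions made for illustration.

```python
# Sketch only: a tiny two-level classification written out as N-Triples.
# The example.org classification, its codes, and its labels are invented.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XKOS = "http://rdf-vocabulary.ddialliance.org/xkos#"
EX = "http://example.org/activities/"  # hypothetical base IRI


def iri(value):
    """Format an IRI as an N-Triples term."""
    return f"<{value}>"


def literal(text, datatype=None):
    """Format a literal as an N-Triples term, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"' + (f"^^<{datatype}>" if datatype else "")


def build_classification():
    """Return (subject, predicate, object) triples for the example scheme."""
    a = iri(RDF_TYPE)
    scheme = iri(EX + "scheme")
    sections = iri(EX + "level/sections")
    divisions = iri(EX + "level/divisions")
    section_a = iri(EX + "A")
    division_01 = iri(EX + "01")
    return [
        (scheme, a, iri(SKOS + "ConceptScheme")),
        # Structure: classification levels, each with a depth
        # (the integer datatype is an assumption of this sketch).
        (sections, a, iri(XKOS + "ClassificationLevel")),
        (sections, iri(XKOS + "depth"), literal("1", XSD_INTEGER)),
        (divisions, a, iri(XKOS + "ClassificationLevel")),
        (divisions, iri(XKOS + "depth"), literal("2", XSD_INTEGER)),
        # Classification items are ordinary SKOS concepts.
        (section_a, a, iri(SKOS + "Concept")),
        (section_a, iri(SKOS + "inScheme"), scheme),
        (section_a, iri(SKOS + "notation"), literal("A")),
        (section_a, iri(SKOS + "prefLabel"), literal("Agriculture")),
        (division_01, a, iri(SKOS + "Concept")),
        (division_01, iri(SKOS + "inScheme"), scheme),
        (division_01, iri(SKOS + "notation"), literal("01")),
        (division_01, iri(SKOS + "prefLabel"), literal("Crop production")),
        # Each level groups the items that belong to it.
        (sections, iri(SKOS + "member"), section_a),
        (divisions, iri(SKOS + "member"), division_01),
        # Plain SKOS hierarchy plus a more specific XKOS relation.
        (division_01, iri(SKOS + "broader"), section_a),
        (division_01, iri(XKOS + "specializes"), section_a),
    ]


def to_ntriples(triples):
    """Serialize triples, one N-Triples statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(build_classification()))
```

The levels carry the structure of the classification, while the last two triples show the same parent-child link stated once in plain SKOS and once with a more specific XKOS relation.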


Disco - DDI-RDF Discovery Vocabulary

Current Status: In Development

This specification is designed to support the discovery of microdata sets and related metadata using RDF technologies in the Web of Linked Data. The vocabulary draws on the DDI specification to create a simplified version of the DDI model for the discovery of data files. It is based on a subset of the DDI XML formats of DDI Codebook and DDI Lifecycle. It supports the programmatic identification of datasets relevant to a specific research purpose. Existing DDI XML instances can be transformed into this RDF format and thereby exposed in the Web of Linked Data. The reverse process is not intended, because the developers of the RDF discovery vocabulary have defined DDI-RDF components and reused components of other RDF vocabularies that make sense only in the Linked Data field.
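The one-way transformation from DDI XML to RDF can be sketched as follows. This is a minimal illustration, not the Alliance's own mapping: the sample XML is a heavily simplified, namespace-free fragment modeled on DDI Codebook element names, the example.org IRIs are hypothetical, and the choice of Disco and SKOS terms (disco:Study, disco:Variable, disco:variable, skos:notation, skos:prefLabel) reflects one reading of the vocabulary that should be verified against the Disco specification.

```python
import xml.etree.ElementTree as ET

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DISCO = "http://rdf-vocabulary.ddialliance.org/discovery#"
DCTERMS = "http://purl.org/dc/terms/"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Simplified stand-in for a DDI Codebook instance; real instances use
# an XML namespace and carry far more content.
SAMPLE_XML = """
<codeBook>
  <stdyDscr>
    <citation><titlStmt><titl>Example Household Survey</titl></titlStmt></citation>
  </stdyDscr>
  <dataDscr>
    <var ID="V1" name="age"><labl>Age of respondent</labl></var>
    <var ID="V2" name="sex"><labl>Sex of respondent</labl></var>
  </dataDscr>
</codeBook>
"""


def lit(text):
    """Format a plain literal as an N-Triples term."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def ddi_to_ntriples(xml_text, base):
    """Map the study title and variables to N-Triples lines."""
    root = ET.fromstring(xml_text.strip())
    study = f"<{base}study>"
    lines = [f"{study} <{RDF_TYPE}> <{DISCO}Study> ."]

    title = root.findtext("stdyDscr/citation/titlStmt/titl")
    if title:
        lines.append(f"{study} <{DCTERMS}title> {lit(title)} .")

    for var in root.iterfind("dataDscr/var"):
        node = f"<{base}variable/{var.get('ID')}>"
        lines.append(f"{node} <{RDF_TYPE}> <{DISCO}Variable> .")
        lines.append(f"{study} <{DISCO}variable> {node} .")
        if var.get("name"):
            lines.append(f"{node} <{SKOS}notation> {lit(var.get('name'))} .")
        label = var.findtext("labl")
        if label:
            lines.append(f"{node} <{SKOS}prefLabel> {lit(label)} .")
    return lines


if __name__ == "__main__":
    # http://example.org/ is a hypothetical base IRI.
    print("\n".join(ddi_to_ntriples(SAMPLE_XML, "http://example.org/")))
```

Only discovery-level information (title, variable names, labels) is carried over, which is one reason the mapping does not round-trip back to full DDI XML.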


PHDD - Physical Data Description

Current Status: Postponed

PHDD describes the physical properties of existing or published data (tables) in a rectangular format. The data can be represented either as records with character-separated values (CSV) or as fixed-length records.

PHDD could be used standalone or together with related vocabularies like Data Catalog Vocabulary (DCAT) or DDI-RDF Discovery (Disco). Descriptions in PHDD could be added to Web pages which provide tables in rectangular format. This would enable processing of this data by programs.
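To illustrate why a physical description makes such tables processable by programs, the sketch below reads both delimited and fixed-length records from a small description object. It does not use PHDD terms: the TableDescription and Column classes and the read_records function are hypothetical stand-ins for the kind of information a PHDD description would carry (column order, delimiter, column widths).

```python
import csv
import io
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Column:
    name: str
    width: Optional[int] = None  # needed only for fixed-length records


@dataclass
class TableDescription:
    columns: List[Column]
    delimiter: Optional[str] = None  # None means fixed-length records


def read_records(text: str, desc: TableDescription) -> List[Dict[str, str]]:
    """Parse rectangular data according to its physical description."""
    names = [c.name for c in desc.columns]

    if desc.delimiter is not None:
        # Character-separated values: split each record on the delimiter.
        reader = csv.reader(io.StringIO(text, newline=""), delimiter=desc.delimiter)
        return [dict(zip(names, row)) for row in reader if row]

    # Fixed-length records: slice each line by the declared column widths.
    records = []
    for line in text.splitlines():
        if not line:
            continue
        record, pos = {}, 0
        for col in desc.columns:
            record[col.name] = line[pos:pos + col.width].strip()
            pos += col.width
        records.append(record)
    return records


if __name__ == "__main__":
    csv_desc = TableDescription([Column("id"), Column("name")], delimiter=",")
    print(read_records("1,Ada\n2,Bob\n", csv_desc))

    fixed_desc = TableDescription([Column("id", 3), Column("name", 5)])
    print(read_records("001Ada  \n002Bob  \n", fixed_desc))
```

The same reader handles both layouts because everything it needs comes from the description rather than from assumptions built into the code.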

The combined usage of PHDD, DDI-RDF Discovery, and DCAT would support the creation of data repositories which provide metadata for the description of collections, for data discovery, and for processing of the data.


Development history

The work on both vocabularies began in a workshop on "Semantic Statistics for Social, Behavioural, and Economic Sciences: Leveraging the DDI Model for the Linked Data Web" at Schloss Dagstuhl - Leibniz Center for Informatics, Germany, in September 2011. It continued at three meetings: a follow-up working meeting (Discovery vocabulary only) at the 3rd Annual European DDI Users Group Meeting (EDDI11) in Gothenburg, Sweden, in December 2011; a second workshop on "Semantic Statistics for Social, Behavioural, and Economic Sciences: Leveraging the DDI Model for the Linked Data Web" at Schloss Dagstuhl in October 2012; and a follow-up meeting (Discovery vocabulary only) at GESIS - Leibniz Institute for the Social Sciences in Mannheim, Germany, in February 2013.