Controlled Vocabularies

About the Controlled Vocabularies

Controlled vocabularies play a critical role in metadata standards in terms of (1) semantics -- definition of the meaning of metadata elements, and (2) content -- declaration of instructions for what and how values should be assigned to elements.

A first set of controlled vocabularies is now being developed for the DDI standard, to be used to describe specific aspects of research data across the data life cycle. These DDI-CVs, which are being created by the DDI Controlled Vocabularies Group (CVG), may be used for other purposes and by other applications as well. The group will add new vocabularies to the site as soon as they are finalized, in addition to the core set published at this point.

Select DDI Alliance vocabularies are already in use at organizations such as the Finnish Social Science Data Archive (FSD), GESIS - Leibniz Institute for the Social Sciences, the Inter-university Consortium for Political and Social Research (ICPSR), Mathematica Policy Research, the UK Data Archive (UKDA), and Bielefeld University, Germany. Nesstar Publisher (http://www.nesstar.com/) now incorporates the controlled vocabularies for Analysis Unit and Time Method.

A paper on "Controlled Vocabularies for DDI 3: Enhancing Machine-Actionability" provides additional background on this effort.

Formats

The vocabularies are published independently of the DDI schemas in an XML format called Genericode, an OASIS specification. The Genericode format provides a tabular model for code lists.
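As a rough illustration of the tabular model, the sketch below reads the rows of a small Genericode document with Python's standard library. The XML fragment is simplified and made up for this example (it is not an actual DDI-CV file), and the column identifiers "code" and "term" are assumptions; the published files may use different identifiers and carry more metadata.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up Genericode fragment (not an actual DDI-CV file).
# The column Ids "code" and "term" are assumptions for illustration.
SAMPLE = """<gc:CodeList xmlns:gc="http://docs.oasis-open.org/codelist/ns/genericode/1.0/">
  <ColumnSet>
    <Column Id="code"><ShortName>Code</ShortName></Column>
    <Column Id="term"><ShortName>Term</ShortName></Column>
  </ColumnSet>
  <SimpleCodeList>
    <Row>
      <Value ColumnRef="code"><SimpleValue>Alpha</SimpleValue></Value>
      <Value ColumnRef="term"><SimpleValue>Alpha term</SimpleValue></Value>
    </Row>
    <Row>
      <Value ColumnRef="code"><SimpleValue>Beta</SimpleValue></Value>
      <Value ColumnRef="term"><SimpleValue>Beta term</SimpleValue></Value>
    </Row>
  </SimpleCodeList>
</gc:CodeList>"""


def read_rows(xml_text):
    """Return one dict per Row, keyed by the ColumnRef of each Value."""
    root = ET.fromstring(xml_text)
    rows = []
    for row in root.iter("Row"):
        cells = {}
        for value in row.findall("Value"):
            # Each Value points at a column and wraps a SimpleValue.
            cells[value.get("ColumnRef")] = value.findtext("SimpleValue", default="")
        rows.append(cells)
    return rows


rows = read_rows(SAMPLE)
```

Each row becomes a dictionary keyed by column, which mirrors the table-like layout that the other published formats (HTML, XLS) present directly.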

The vocabularies are also made available in HTML and XLS (Excel) form.

Download

This overview page lists the CVs currently available and provides download links for each format. It also provides a zipped package of the most recent version of all the available CVs and formats.

Usage

DDI Codebook and DDI Lifecycle handle controlled vocabularies in different ways. However, the published DDI-CVs work with both versions of the specification. Usage information for each controlled vocabulary is available in the vocabulary documentation. Usage instructions specific to DDI Lifecycle, as well as recommendations for citing the CVs outside DDI, are also available, along with examples.

Publication, Maintenance, and Management

The CVG functions as the management team for the vocabularies. Comments, as well as suggestions for amendments or additions, are welcome from all users. To provide feedback or submit proposals for changes, please contact the CVG at ddi-cvg [at] ddialliance [dot] org. Updated vocabularies will be published with new version numbers. Please note that the production, publication, and maintenance of translations are currently outside the CVG's scope, but the CVG would appreciate being informed if other agencies are interested in, or planning to undertake, such tasks.

Versioning Policy

The DDI CV versioning policy described below was approved by the DDI Alliance in November 2012 and has been published and implemented since February 2013. This protocol supersedes the previous policy, which was based on a three-digit version numbering system. Users who referenced these vocabularies prior to February 1, 2013 will need to retroactively change any reference to V. 1.0.0 into V. 1.0. From that point on, new versions can be used and referenced normally.

The controlled vocabularies versioning policy is based on an intellectual, or logical, assessment of the nature of change, which distinguishes between substantive and non-substantive changes in the CVs, as described further below. To reflect this distinction, the version numbering system is based on a two-level structure (examples: 1.0, 1.1, 1.2, 2.0, etc.). A change in the integral part of the decimal number will indicate a substantive change in the controlled vocabulary. A change in the fractional part will indicate a non-substantive change. All version levels (i.e. the full decimal number, even when the fractional part is zero) will always be mentioned.

Versioning of the CVs is done at the level of each published controlled vocabulary, and not at the item level.
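The two-level numbering rule can be sketched as a small helper. This is only an illustration, not an official DDI tool: the function names are hypothetical, and two details are assumptions not stated in the policy text, namely that the fractional part resets to 0 after a substantive change (suggested by the example sequence 1.0, 1.1, 1.2, 2.0) and that the two parts are compared as integers.

```python
import re

# A version is exactly two non-negative integers separated by a period.
_VERSION = re.compile(r"(\d+)\.(\d+)")


def parse_version(text):
    """Split a two-level version such as '1.2' into (1, 2)."""
    match = _VERSION.fullmatch(text)
    if match is None:
        raise ValueError(f"not a two-level version number: {text!r}")
    return int(match.group(1)), int(match.group(2))


def next_version(current, substantive):
    """Return the version number that follows `current`.

    A substantive change increments the integral part; a
    non-substantive change increments the fractional part.
    """
    integral, fractional = parse_version(current)
    if substantive:
        # Assumption: the fractional part restarts at 0.
        return f"{integral + 1}.0"
    return f"{integral}.{fractional + 1}"
```

For example, next_version("1.2", substantive=True) yields "2.0" and next_version("1.0", substantive=False) yields "1.1". A legacy three-digit reference such as "1.0.0" is rejected by parse_version, in line with the requirement that such references be changed to the two-level form.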

An item in a CV list consists of the following parts:

Code: The specific content that is entered into the DDI specification to identify the item. In hierarchical lists, all of the levels are always mentioned in each code, and are separated by a period.
Term: The display label associated with the code. This may be available in multiple languages.
Definition: The definition of the code. This may be available in multiple languages.
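The three parts of an item map naturally onto a small record type. The sketch below is one possible in-memory representation, not a structure defined by the DDI Alliance: the class name, the field names, the use of language tags as dictionary keys, and the sample code "Alpha.One" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CvItem:
    """One entry of a controlled vocabulary (hypothetical structure)."""
    code: str
    # Language tag -> text; both parts may exist in several languages.
    terms: dict = field(default_factory=dict)
    definitions: dict = field(default_factory=dict)

    def levels(self):
        """Hierarchical codes list every level, separated by a period."""
        return self.code.split(".")

    def parent_code(self):
        """Return the code one level up, or None for a top-level code."""
        parts = self.levels()
        return ".".join(parts[:-1]) if len(parts) > 1 else None


# "Alpha.One" is a made-up hierarchical code, not taken from a real CV.
item = CvItem(
    code="Alpha.One",
    terms={"en": "Alpha, first kind"},
    definitions={"en": "An illustrative definition."},
)
```

Because every level appears in the code itself, the parent of an item can be derived from the code alone, without consulting the rest of the list.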

Changes in version will be made according to the following rules:

Substantive changes: any change in (list) content or (code) meaning

  • Addition of new code(s) (change in list content)
  • Deletion of existing code(s) (change in list content)
  • Widening the definition of a code (change in meaning)
  • Narrowing the definition of a code (change in meaning)
  • Change in the “value” or “name” of a code, including change in spelling. Since the codes are the “official” or “legal” entries (“terms” and “definitions” are documentation for the codes), a change in name really amounts to a change in code, i.e. a change in list content, and is therefore a substantive change.
  • Merging codes (amounts to deleting codes and adding new one(s))
  • Splitting codes (amounts to deleting codes and adding new ones)

Non-substantive changes: Changes in wording, spelling, etc. (i.e. “form”) that do not involve changes in content or meaning:

  • Rephrasing a definition to make it clearer, or adding examples without changing the meaning of the code
  • Rephrasing a “term” (the natural language “label” for the code) for clarity without changing the meaning of the code
  • Correcting spelling errors in either “term” or “definition”.

In addition to a change in the version number, each new version of a CV will contain documentation about how the new CV compares with the previous version. In the Genericode XML, the changes will be documented using the following notations:

UNCHANGED: X -- Code X and its definition have remained unchanged.

RENAMED: X-Y -- The definition for code X has remained the same but the code itself has been changed (renamed) to Y.

REDEFINED: X -- The definition for code X has been changed to reflect a change in meaning for code X.

DEFINITION REPHRASED: X -- The definition for code X has been rephrased for clarity or edited for accuracy without a change in meaning for code X.

TERM REPHRASED: X -- The term describing code X has been changed or edited for clarity or accuracy without a change in meaning for code X.

WIDENED: X -- The definition of code X has been changed to expand the meaning of the code.

NARROWED: X -- The definition for code X has been changed to reflect a narrowing in the meaning of the code.

MERGED: X, Y, (n)-Y -- The old code X has been removed and all the data classified with it are included in code Y.

SPLIT: X-X, Y,(n) -- The meaning of code X has been narrowed and code Y has been added to cover the remainder of the meaning previously held by code X.

REMOVED: X -- Code X has been deleted.

ADDED: Y -- A new code Y has been added to the CV.
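Read together with the versioning rules, each notation keyword implies whether a release is substantive. The sketch below derives that from a list of change notes. It assumes each note is available as a plain "KEYWORD: detail" string; how the notes are actually serialized in the Genericode XML is not specified here, and the function names are hypothetical.

```python
# Keywords that imply a change in list content or code meaning.
SUBSTANTIVE = {
    "RENAMED", "REDEFINED", "WIDENED", "NARROWED",
    "MERGED", "SPLIT", "REMOVED", "ADDED",
}
# Keywords that only affect wording.
NON_SUBSTANTIVE = {"DEFINITION REPHRASED", "TERM REPHRASED"}
NO_CHANGE = {"UNCHANGED"}
KNOWN = SUBSTANTIVE | NON_SUBSTANTIVE | NO_CHANGE


def keyword_of(note):
    """Extract the notation keyword that precedes the first colon."""
    keyword, separator, _detail = note.partition(":")
    keyword = keyword.strip().upper()
    if not separator or keyword not in KNOWN:
        raise ValueError(f"unrecognized change note: {note!r}")
    return keyword


def release_kind(notes):
    """Classify a release as 'substantive', 'non-substantive', or 'none'."""
    keywords = {keyword_of(note) for note in notes}
    if keywords & SUBSTANTIVE:
        return "substantive"
    if keywords & NON_SUBSTANTIVE:
        return "non-substantive"
    return "none"
```

A single substantive note is enough to make the whole release substantive, since versioning is done per vocabulary and not per item. For instance, the notes "UNCHANGED: A", "TERM REPHRASED: B", and "ADDED: C" together describe a substantive release.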

Note: The DDI-CVG has also produced a set of guidelines to support controlled vocabulary users in retrofitting their collections following the publication of new CV versions. Please note that these are intended only as recommendations and are not enforced as part of the versioning policy.