DDI Controlled Vocabulary for Summary Statistic Type


Description

Specifies the type of summary statistic. A summary statistic is a single-number representation of a characteristic of a set of values.


Details

Short Name:
SummaryStatisticType
Long Name:
Summary Statistic Type
Version:
2.0
Version Notes:
Version 2.0 incorporates two new codes, PercentageOfValidCases and PercentageOfInvalidCases, which were added to make the list more complete and enhance usability.
Version Changes:
ADDED CODES: PercentageOfValidCases, PercentageOfInvalidCases.
Canonical URI:
urn:ddi-cv:SummaryStatisticType
Canonical URI of this version:
urn:ddi-cv:SummaryStatisticType:2.0
Location URI:
http://www.ddialliance.org/Specification/DDI-CV/SummaryStatisticType_2.0_Genericode1.0_DDI-CVProfile1.0.xml
Alternate format location URI:
http://www.ddialliance.org/Specification/DDI-CV/SummaryStatisticType_2.0.html
Alternate format location URI:
http://www.ddialliance.org/Specification/DDI-CV/SummaryStatisticType_2.0_InputSheet_Excel2003.xls
Agency Name:
DDI Alliance

Code List
Value of the Code Descriptive Term of the Code Definition of the Code
ArithmeticMean Arithmetic mean (X) Mathematical average of a set of values. The mean is calculated by adding up two or more values and dividing the total by their number. In social/political science, it is usually the sum of the measurements divided by the number of subjects, or cases.
GeometricMean Geometric mean The nth root of the product of all n values. Rarely used in social sciences.
HarmonicMean Harmonic mean The reciprocal of the arithmetic mean of the reciprocals of the values. Rarely used in social sciences.
TrimmedMean Trimmed mean The (arithmetic) mean calculated after discarding given parts of observations at the high and low end (e.g., interquartile mean when the lowest 25% and the highest 25% are discarded, and the mean of the remaining values is calculated).
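StandardErrorOfMean Standard error of the mean The standard error of the mean value, i.e., the standard deviation of its sampling distribution, usually estimated as the sample standard deviation divided by the square root of the number of cases.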
Mode Mode (Mo) The most frequently observed data value (Statistics Canada).
Median Median (Mdn) The value below which, and above which, half of the values in a distribution fall (50th percentile).
ValidCases Valid Cases Cases with observations which are considered to be valid, i.e., providing substantial information and to be included for calculation.
InvalidCases Invalid cases Cases which are considered/defined as "missing" (e.g., not ascertained, not applicable, etc.), usually excluded from calculation.
Minimum Minimum The lowest valid value in a variable.
Maximum Maximum The highest valid value in a variable.
Range Range The range of valid values, i.e., all values that fall between the lowest and highest valid values.
Sum Sum The sum or total of the values, across all valid cases.
Variance Variance (s²) The variance is the mean square deviation of the variable around the average value. It reflects the dispersion of a frequency distribution around its mean (OECD Glossary of Statistics).
StandardDeviation Standard deviation (s) The positive square root of the variance. The most widely used measure of dispersion of a frequency distribution.
CoefficientOfVariation Coefficient of variation (CV) Standard deviation divided by the mean.
AverageAbsoluteDeviation Average absolute deviation (AAD) The average of the absolute differences between each value and the overall mean. Measure of statistical dispersion around the mean, alternative to Standard Deviation.
MedianAbsoluteDeviation Median absolute deviation (MAD) The median absolute deviation from the median. Measure of statistical dispersion around the median.
FirstQuartile First quartile The first of three values which separate the total frequency of a distribution into four equal parts.
SecondQuartile Second quartile The second of three values which separate the total frequency of a distribution into four equal parts (= median).
ThirdQuartile Third quartile The third of three values which separate the total frequency of a distribution into four equal parts.
InterquartileRange Interquartile range The range between the first and third quartile values.
FirstQuintile First quintile The first of four values which separate the total frequency of a distribution into five equal parts.
SecondQuintile Second quintile The second of four values which separate the total frequency of a distribution into five equal parts.
ThirdQuintile Third quintile The third of four values which separate the total frequency of a distribution into five equal parts.
FourthQuintile Fourth quintile The fourth of four values which separate the total frequency of a distribution into five equal parts.
InterquintileRange Interquintile range The range between the first and fourth quintile values.
FirstDecile First decile The first of nine values which separate the total frequency of a distribution into ten equal parts.
SecondDecile Second decile The second of nine values which separate the total frequency of a distribution into ten equal parts.
ThirdDecile Third decile The third of nine values which separate the total frequency of a distribution into ten equal parts.
FourthDecile Fourth decile The fourth of nine values which separate the total frequency of a distribution into ten equal parts.
FifthDecile Fifth decile The fifth of nine values which separate the total frequency of a distribution into ten equal parts (= median).
SixthDecile Sixth decile The sixth of nine values which separate the total frequency of a distribution into ten equal parts.
SeventhDecile Seventh decile The seventh of nine values which separate the total frequency of a distribution into ten equal parts.
EighthDecile Eighth decile The eighth of nine values which separate the total frequency of a distribution into ten equal parts.
NinthDecile Ninth decile The ninth of nine values which separate the total frequency of a distribution into ten equal parts.
InterdecileRange Interdecile range The range between the first and ninth decile values.
OtherPercentile Other percentile A percentile not covered by any of the other percentile terms.
Beta1 Skewness A measure for the asymmetry of a probability distribution of a variable.
Beta2 Kurtosis A measure for the "peakedness" of a probability distribution of a variable.
ShapiroWilk Shapiro-Wilk The test statistic of the Shapiro-Wilk test of normality.
PercentageOfValidCases Percentage of valid cases Indicates the percentage of valid cases of the total number of cases.
PercentageOfInvalidCases Percentage of invalid cases Indicates the percentage of invalid cases of the total number of cases.
Other Other Use if the summary statistic type is known, but not found in the list.
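
As an illustration of how some of the definitions above translate into calculations, the following minimal Python sketch computes a handful of the statistics and keys the results by the code values from the list. The function name summary_statistics is hypothetical, and the input is assumed to contain only valid (non-missing), strictly positive numbers so that the geometric and harmonic means are defined. Quartile conventions differ between software packages; the sketch simply uses the default method of Python's statistics.quantiles, which is an assumption and not something the vocabulary prescribes.

```python
import math
import statistics


def summary_statistics(values):
    """Compute selected statistics, keyed by SummaryStatisticType code values.

    Hypothetical helper: `values` is assumed to hold at least two valid,
    strictly positive numbers (missing cases already excluded).
    """
    n = len(values)
    mean = statistics.fmean(values)
    # Three cut points that split the data into four equal parts.
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return {
        "ValidCases": n,
        "Minimum": min(values),
        "Maximum": max(values),
        "Sum": math.fsum(values),
        "ArithmeticMean": mean,
        # nth root of the product of all n values
        "GeometricMean": math.prod(values) ** (1 / n),
        # reciprocal of the arithmetic mean of the reciprocals
        "HarmonicMean": n / math.fsum(1 / v for v in values),
        "Median": statistics.median(values),
        "FirstQuartile": q1,
        "SecondQuartile": q2,
        "ThirdQuartile": q3,
        "InterquartileRange": q3 - q1,
        # mean of absolute differences from the overall mean
        "AverageAbsoluteDeviation": math.fsum(abs(v - mean) for v in values) / n,
    }


stats = summary_statistics([1, 2, 4, 8])
```

Variance and standard deviation are left out of the sketch on purpose, because the definitions above do not say whether the divisor is the number of cases or that number minus one.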

Usage

A classification of the type of summary statistic provided. Supports the use of an external controlled vocabulary. DDI strongly recommends the use of a widely shared controlled vocabulary to support interoperability.

Module Name Element Name
physicalinstance TypeOfSummaryStatistic
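
A metadata tool that fills TypeOfSummaryStatistic might check a term against the vocabulary before writing it. The sketch below is one possible approach in Python. The names SUMMARY_STATISTIC_CODES and is_known_code are hypothetical, and the code values are transcribed by hand from the code list above (assumed to be CamelCase without spaces) instead of being loaded from the published Genericode file.

```python
# Code values transcribed from the code list of this vocabulary (version 2.0).
# Hypothetical constant: a production tool would more likely parse the
# published Genericode file than hard-code the list.
SUMMARY_STATISTIC_CODES = frozenset({
    "ArithmeticMean", "GeometricMean", "HarmonicMean", "TrimmedMean",
    "StandardErrorOfMean", "Mode", "Median", "ValidCases", "InvalidCases",
    "Minimum", "Maximum", "Range", "Sum", "Variance", "StandardDeviation",
    "CoefficientOfVariation", "AverageAbsoluteDeviation",
    "MedianAbsoluteDeviation", "FirstQuartile", "SecondQuartile",
    "ThirdQuartile", "InterquartileRange", "FirstQuintile", "SecondQuintile",
    "ThirdQuintile", "FourthQuintile", "InterquintileRange", "FirstDecile",
    "SecondDecile", "ThirdDecile", "FourthDecile", "FifthDecile",
    "SixthDecile", "SeventhDecile", "EighthDecile", "NinthDecile",
    "InterdecileRange", "OtherPercentile", "Beta1", "Beta2", "ShapiroWilk",
    "PercentageOfValidCases", "PercentageOfInvalidCases", "Other",
})


def is_known_code(term):
    """Return True if `term` is exactly one of the vocabulary's code values."""
    return term in SUMMARY_STATISTIC_CODES


checks = {term: is_known_code(term) for term in ("Median", "median", "Beta1")}
```

The comparison is deliberately case-sensitive, on the assumption that code values are meant to be matched exactly as published.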

This vocabulary cannot be used for the element sumStat (4.3.14) in DDI 2.1, because that element already has a hard-coded controlled vocabulary in its "type" attribute. To use the vocabulary in DDI 2.5, select the value "other" in the "type" attribute and insert the appropriate value from the external CV in the "otherType" attribute. Use the complex element controlledVocabUsed (in the docDscr section) to identify the controlled vocabulary to which the selected term belongs.

Element Number in DDI 2.1 Element/Attribute Name
4.3.14 sumStat
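
The DDI 2.5 approach described above, with "other" in the "type" attribute and the CV term in "otherType", can be sketched with Python's standard xml.etree.ElementTree module. The helper name make_sum_stat and the sample value are hypothetical. The sketch leaves out XML namespaces, the surrounding variable description, and the controlledVocabUsed element, so it shows only the attribute pattern and is not a complete DDI instance.

```python
import xml.etree.ElementTree as ET


def make_sum_stat(cv_code, value):
    """Build a bare sumStat element carrying a term from the external CV.

    Hypothetical helper: namespaces and parent elements are omitted.
    """
    elem = ET.Element("sumStat")
    elem.set("type", "other")         # hard-coded list does not cover the term
    elem.set("otherType", cv_code)    # code value from this vocabulary
    elem.text = str(value)
    return elem


sum_stat = make_sum_stat("GeometricMean", 2.83)
xml_text = ET.tostring(sum_stat, encoding="unicode")
```

In a full document, the controlledVocabUsed element in docDscr would additionally identify this vocabulary as the source of the "otherType" term.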

Copyright and License

Copyright © DDI Alliance 2014.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.


Page generated by gc_ddi-cv2html.xslt.