A checkFunction to be called from check that identifies outlier values in a numeric/integer/Date variable using the Tukey boxplot method (consistent with the boxplot function).

identifyOutliersTBStyle(v, nMax = 10, maxDecimals = 2)

Arguments

v

A numeric, integer or Date variable to check.

nMax

The maximum number of problematic values to report. Default is 10. Set to Inf to include all problematic values in the output message, or to 0 for no output.

maxDecimals

A positive integer or Inf. Number of decimals used when printing numerical values in the data summary and in problematic values from the data checks. If Inf, no rounding is performed.

Value

A checkResult with three entries: $problem (a logical indicating whether outliers were found), $message (a message describing which values are outliers) and $problemValues (the outlier values).

Details

Outliers are defined in the style of Tukey boxplots (consistent with the boxplot function, whose default whisker length is 1.5 times the interquartile range), i.e. as values that are smaller than the first quartile minus 1.5 times the interquartile range (IQR) or greater than the third quartile plus 1.5 times the IQR.
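
As a rough illustration of the rule, the fences can be computed by hand in base R. This is only a sketch and not the package's implementation: the helper name tukeyFences and its coef argument are hypothetical, the multiplier of 1.5 is assumed from the boxplot default, and quartiles are taken from quantile() with its default settings (boxplot itself derives hinges via fivenum(), so the two can differ slightly on some inputs).

```r
# Hypothetical helper, not part of the package: flags values outside
# the fences Q1 - coef * IQR and Q3 + coef * IQR.
tukeyFences <- function(v, coef = 1.5) {
  v <- v[!is.na(v)]                          # ignore missing values
  qs <- unname(quantile(v, c(0.25, 0.75)))   # first and third quartile
  iqr <- qs[2] - qs[1]
  lower <- qs[1] - coef * iqr
  upper <- qs[2] + coef * iqr
  list(lower = lower, upper = upper,
       outliers = sort(unique(v[v < lower | v > upper])))
}

x <- c(1:10, 200, 200, 700)
res <- tukeyFences(x)
res$lower     # 4 - 1.5 * 6 = -5
res$upper     # 10 + 1.5 * 6 = 19
res$outliers  # 200 700
```

For this input the quartiles are 4 and 10, so the IQR is 6 and anything above 19 is flagged, which matches the values reported in the example at the end of this page.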

For Date variables, the calculations are done on their raw numeric format (as obtained by using unclass), after which they are translated back to Dates. Note that no rounding is performed for Dates, no matter the value of maxDecimals.
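
The Date handling described above can be sketched in base R as follows. This is an illustrative reconstruction under assumptions, not the package's code: the variable names are made up, and the 1.5 multiplier and the use of quantile() with default settings are assumed.

```r
# Eleven dates: ten consecutive days plus one far-off date.
d <- as.Date("2020-01-01") + c(0:9, 400)

raw <- unclass(d)                            # days since 1970-01-01
qs <- unname(quantile(raw, c(0.25, 0.75)))   # quartiles on the raw scale
iqr <- qs[2] - qs[1]
flagged <- raw < qs[1] - 1.5 * iqr | raw > qs[2] + 1.5 * iqr

# Translate the flagged raw values back to Dates; no rounding is involved.
outlierDates <- as.Date(raw[flagged], origin = "1970-01-01")
format(outlierDates)  # "2021-02-04"
```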

Examples

identifyOutliersTBStyle(c(1:10, 200, 200, 700))
#> Note that a check function found the following problematic values: 200, 700.