R/identifyLoners.R
identifyLoners.Rd
A checkFunction to be called from check that identifies values that occur fewer than 6 times in factor, labelled, or character variables (that is, loners).
identifyLoners(v, nMax = 10)
v | A character, labelled, or factor variable to check. |
---|---|
nMax | The maximum number of problematic values to report. Default is 10. |
A checkResult with three entries: $problem (a logical indicating whether loners were found), $message (a message describing which values in v were loners) and $problemValues (the problematic values in their original format). Note that only unique problematic values are listed and they are presented in alphabetical order.
For character, labelled, and factor variables, identify values that have only a very low number of observations, as these categories might be problematic when conducting an analysis. Unused factor levels are not considered "loners". "Loners" are defined as values with 5 or fewer observations, reflecting the commonly used rule of thumb for performing chi-squared tests.
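The rule described above can be sketched in base R. This is only an illustration of the definition, not the package's actual implementation: the helper name `find_loners_sketch` and its `threshold` argument are hypothetical, and the sketch assumes that `v` is a character or factor vector (labelled variables are not handled here).

```r
# Hypothetical helper, NOT part of the package: a base-R sketch of the
# "loner" rule (observed at least once, but at most `threshold` times).
find_loners_sketch <- function(v, threshold = 5) {
  # Count observations per value. For factors, table() also reports
  # unused levels, with a count of 0.
  counts <- table(v)
  # Drop unused levels (count 0) and keep values with few observations.
  loners <- names(counts)[counts > 0 & counts <= threshold]
  # Return the unique loner values in alphabetical order.
  sort(loners)
}

# "d" occurs twice and is a loner; "e" is an unused level and is ignored.
v <- factor(c(rep(c("a", "b", "c"), 10), "d", "d"),
            levels = c("a", "b", "c", "d", "e"))
find_loners_sketch(v)
#> [1] "d"
```

The `counts > 0` condition is what keeps unused factor levels from being reported, matching the statement that such levels are not considered loners.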
identifyLoners(c(rep(c("a", "b", "c"), 10), "d", "d"))
#> Note that the following levels have at most five observations: d.