Convert a function, f, into an S3 summaryFunction object. This adds f to the overview list returned by an allSummaryFunctions() call.

summaryFunction(f, description, classes = NULL)

Arguments

f

A function. See details and examples below for the exact requirements of this function.

description

A character string describing the summary returned by f. If NULL (the default), the name of f will be used instead.

classes

The classes for which f is intended to be called. If NULL (the default), one of two things happens. If f is not an S3 generic function, the classes attribute of f will be an empty character string. If f is an S3 generic function, its methods are looked up automatically and the classes attribute is filled in accordingly. Note that the function allClasses (listing all classes used in dataMaid) might be useful.
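The automatic look-up described for classes can be pictured with base R alone. The sketch below is not dataMaid's implementation: describeVar, its two methods, and the helper classesWithMethods are hypothetical names made up for illustration, and the candidate class list is copied by hand from the allSummaryFunctions() output shown in the examples. It only illustrates the idea of checking, class by class, whether an S3 generic has a method.

```r
# Candidate classes, copied by hand from the allSummaryFunctions() table
# (an assumption made to keep the sketch free of dataMaid calls).
candidateClasses <- c("character", "Date", "factor", "integer",
                      "labelled", "logical", "numeric")

# Hypothetical S3 generic with two methods, used only for illustration.
describeVar <- function(v, ...) UseMethod("describeVar")
describeVar.numeric <- function(v, ...) "a number"
describeVar.character <- function(v, ...) "a string"

# Hypothetical helper: keep the candidate classes for which the generic
# has an S3 method. getS3method() returns NULL when optional = TRUE and
# no method is found.
classesWithMethods <- function(genericName, candidates) {
  hasMethod <- vapply(candidates, function(cl) {
    !is.null(utils::getS3method(genericName, cl, optional = TRUE))
  }, logical(1))
  candidates[hasMethod]
}

foundClasses <- classesWithMethods("describeVar", candidateClasses)
print(foundClasses)
```

Here only the classes with an explicitly defined method are kept; inheritance between classes is not considered in this sketch.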

Value

A function of class summaryFunction which has two attributes, namely classes and description.
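To show how such a return value can be inspected, the sketch below builds a stand-in object by hand with base R. This is only an illustration: in practice the object comes from summaryFunction(), and the class vector, the name countNA, and the attribute values used here are assumptions.

```r
# Stand-in built by hand for illustration; summaryFunction() would normally
# create this object. The exact class vector is an assumption.
countNA <- structure(
  function(v, ...) list(feature = "No. missing", result = sum(is.na(v))),
  class = c("summaryFunction", "function"),
  description = "Count missing values",
  classes = c("numeric", "integer")
)

# The object is still callable like an ordinary function ...
countNA(c(1, NA, 3))$result

# ... and carries its metadata as attributes.
inherits(countNA, "summaryFunction")
attr(countNA, "description")
attr(countNA, "classes")
```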

Details

summaryFunction represents the functions used in summarize and makeDataReport for summarizing the features of variables in a dataset.

An example of defining a new summaryFunction is given below. Note that the minimal requirements for such a function (in order for it to be compatible with summarize() and makeDataReport()) concern its input/output structure: it must accept at least the two arguments v (a vector variable) and ... (dots). Additional arguments passed on from summarize() and makeDataReport() include maxDecimals; see e.g. the pre-defined summaryFunction minMax for more details about how this argument should be used. The output must be a list with at least the two entries $feature (a short character string describing what was summarized) and $result (a value or a character string with the result of the summarization). If the result of a summaryFunction is additionally converted to a summaryResult object, a print() method also becomes available for consistent formatting of summaryFunction results.
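The input/output structure described above can be sketched in base R without any dataMaid calls. The function name meanSummary is hypothetical, and using maxDecimals to round the result is an assumption about how that argument is meant to be used; the pre-defined minMax function is the reference for its actual handling.

```r
# Minimal sketch of the required structure: arguments v and ..., plus an
# optional maxDecimals; returns a list with $feature and $result.
meanSummary <- function(v, ..., maxDecimals = 2) {
  avg <- mean(v, na.rm = TRUE)
  list(feature = "Mean",
       # Rounding with maxDecimals is an assumed use of this argument.
       result = round(avg, maxDecimals))
}

out <- meanSummary(c(1, 2, 4))
names(out)
out$result

# Unused extra arguments are absorbed by the dots.
outExtra <- meanSummary(c(1, 2, 4), someOtherArgument = TRUE, maxDecimals = 0)
outExtra$result
```

Accepting ... matters because the calling functions may pass arguments that a particular summary does not use; without the dots such a call would fail with an unused-argument error.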

Note that all available summaryFunctions are listed by the call allSummaryFunctions(), and we recommend looking into these functions if more knowledge about summaryFunctions is required.

Examples

#Define a valid summaryFunction that can be called from summarize()
#and makeDataReport(). This function counts how many zero entries a given
#variable has:
countZeros <- function(v, ...) {
  res <- length(which(v == 0))
  summaryResult(list(feature = "No. zeros", result = res, value = res))
}

#Convert it to a summaryFunction object. We don't count zeros for
#logical variables, as they have a different meaning here (FALSE):
countZeros <- summaryFunction(countZeros, description = "Count number of zeros",
                              classes = setdiff(allClasses(), "logical"))
#> Error in get(fName): object 'countZeros' not found

#Call it directly:
countZeros(c(0, 0, 0, 1:100))
#> No. zeros: 3

#Call it via summarize():
data(cars)
summarize(cars, numericSummaries = c(defaultNumericSummaries(), "countZeros"))
#> Error in countZeros(v = c(4, 4, 7, 7, 8, 9, 10, 10, 10, 11, 11, 12, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 14, 15, 15, 15, 16, 16, 17, 17, 17, 18, 18, 18, 18, 19, 19, 19, 20, 20, 20, 20, 20, 22, 23, 24, 24, 24, 24, 25)): could not find function "countZeros"

#Note that countZeros now appears in an allSummaryFunctions() call:
allSummaryFunctions()
#>
#> ----------------------------------------------------------------------------
#> name           description                     classes
#> -------------- ------------------------------- -----------------------------
#> centralValue   Compute median for numeric      character, Date, factor,
#>                variables, mode for             integer, labelled, logical,
#>                categorical variables           numeric
#>
#> countMissing   Compute proportion of missing   character, Date, factor,
#>                observations                    integer, labelled, logical,
#>                                                numeric
#>
#> minMax         Find minimum and maximum        integer, numeric, Date
#>                values
#>
#> quartiles      Compute 1st and 3rd quartiles   Date, integer, numeric
#>
#> uniqueValues   Count number of unique values   character, Date, factor,
#>                                                integer, labelled, logical,
#>                                                numeric
#>
#> variableType   Data class of variable          character, Date, factor,
#>                                                integer, labelled, logical,
#>                                                numeric
#> ----------------------------------------------------------------------------
#>