This function identifies outlier observations in the trajectories, and allows users to replace the observations or remove trajectories entirely.

outlier_detect(traj, id_field = FALSE, method = 1, threshold = 0.95,
count = 1, replace_with = 1, verbose=TRUE)

Arguments

traj

[matrix (numeric)]: longitudinal data. Each row represents an individual trajectory (of observations). The columns show the observations at consecutive time points.

id_field

[numeric or character] Whether the first column of traj is a unique (id) field. Default: FALSE. If TRUE, the function treats the second column as the first time step.

method

[integer (numeric)] indicating the method for identifying outliers. Options are: '1': quantile method (default), and '2': manual method. The manual method requires a user-defined threshold value.

threshold

[numeric] A cut-off value for outliers. If method is set to '1' (quantile), the threshold should be a numeric vector of probabilities in the interval [0, 1]; if method is set to '2' (manual), the threshold can be any numeric vector. Default: 0.95.

count

[integer (numeric)] indicating the number of observations (in a trajectory) that must exceed the threshold in order for the trajectory to be considered an outlier. Default is 1.

replace_with

[integer (numeric)] indicating the technique to use for calculating a replacement for an outlier observation. The remaining observations on the row or the column in which the outlier observation is located are used to calculate the replacement. The replacement options are: '1': Mean value of the column, '2': Mean value of the row and '3': remove the row (trajectory) completely from the data. Default value is the '1' option.

verbose

[logical] Whether to print output messages to the console. Set to FALSE to suppress them. Default: TRUE.
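The interaction of method, threshold and count can be illustrated with a minimal base R sketch. It is not the package's implementation: flag_outlier_rows is a hypothetical helper, the toy matrix is made up, and the sketch assumes that the quantile cut-off is computed over all observations pooled together.

```r
# Toy matrix: 5 trajectories (rows) observed at 4 time points (columns).
toy <- matrix(c(1, 2, 3, 4,
                2, 3, 4, 5,
                1, 1, 2, 2,
                3, 3, 4, 50,
                2, 2, 3, 3),
              nrow = 5, byrow = TRUE)

# Hypothetical helper (not part of the package).
flag_outlier_rows <- function(x, method = 1, threshold = 0.95, count = 1) {
  cutoff <- if (method == 1) {
    # Assumption: the quantile is taken over all values pooled together.
    as.numeric(stats::quantile(x, probs = threshold, na.rm = TRUE))
  } else {
    # Manual method: the user-supplied value is used directly as the cut-off.
    threshold
  }
  exceed <- x > cutoff
  # A row is flagged when at least `count` of its observations exceed the cut-off.
  which(rowSums(exceed, na.rm = TRUE) >= count)
}

flagged_quantile <- flag_outlier_rows(toy, method = 1, threshold = 0.95, count = 1)
flagged_manual <- flag_outlier_rows(toy, method = 2, threshold = 15, count = 4)
```

Under these assumptions, the quantile call flags only the trajectory containing the value 50, while the manual call with count = 4 flags nothing, because no row has four observations above 15.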

Value

A dataframe with outlier observations replaced or removed.

Details

Given a matrix, this function identifies observations that exceed the threshold and replaces each outlier with an estimate calculated from the other observations in the row or the column in which the outlier is located. An option is also provided to remove the trajectories containing outliers from the data.
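The three replace_with options can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's code: replace_outliers is a hypothetical helper, the cut-off is passed in directly, and the outlier itself is excluded when the column or row mean is computed.

```r
# Toy matrix with one outlier (50) in row 3, column 3.
toy <- matrix(c(1, 2, 3,
                2, 4, 6,
                3, 5, 50),
              nrow = 3, byrow = TRUE)

# Hypothetical helper (not part of the package).
replace_outliers <- function(x, cutoff, replace_with = 1) {
  is_out <- x > cutoff
  if (replace_with == 3) {
    # Option 3: drop every trajectory that contains an outlier.
    return(x[rowSums(is_out) == 0, , drop = FALSE])
  }
  clean <- x
  clean[is_out] <- NA  # keep outliers out of the means
  fill <- if (replace_with == 1) {
    colMeans(clean, na.rm = TRUE)[col(x)]  # option 1: mean of the column
  } else {
    rowMeans(clean, na.rm = TRUE)[row(x)]  # option 2: mean of the row
  }
  x[is_out] <- fill[is_out]
  x
}

by_col <- replace_outliers(toy, cutoff = 15, replace_with = 1)
by_row <- replace_outliers(toy, cutoff = 15, replace_with = 2)
dropped <- replace_outliers(toy, cutoff = 15, replace_with = 3)
```

In this toy case the outlier becomes 4.5 under the column-mean option and 4 under the row-mean option, and the third option returns a matrix with the offending row removed.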

Examples

data(traj)
trajectry <- data_imputation(traj, id_field=TRUE, method = 1, replace_with = 1, verbose=FALSE)
trajectry <- props(trajectry$CompleteData, id_field=TRUE)
outp <- outlier_detect(trajectry, id_field = TRUE, method = 1, threshold = 0.95, count = 1, replace_with = 1, verbose=TRUE)
#> [1] "1 trajectories were found to contain outlier observations and replaced accordingly!"
#> [1] "Summary:"
#> [1] "*--Outlier observation(s) was found in trajectory 10 --*"
outp <- outlier_detect(trajectry, id_field = TRUE, method = 2, threshold = 15, count = 4, replace_with = 3, verbose=TRUE)
#> [1] "No outlier(s) found!"