This function fills any missing entries (NA, Inf, null) in a matrix or data.frame, according to a specified method. By default, 0 is treated as an observed value, not as a missing entry.

data_imputation(traj, id_field = FALSE, method = 2,
replace_with = 1, fill_zeros = FALSE, verbose=TRUE)

Arguments

traj

[matrix (numeric)]: longitudinal data. Each row represents an individual trajectory (of observations). The columns show the observations at consecutive time points.

id_field

[numeric or character] Whether the first column of traj is a unique (id) field. Default: FALSE. If TRUE, the function treats the second column as the first time step.

method

[an integer] indicating the method for calculating the missing values. Options are: '1': arithmetic method, and '2': regression method. The default shown in the usage above is '2': regression method.

replace_with

[an integer from 1 to 6] indicating the technique, within the chosen method, for calculating the missing entries. For '1': arithmetic method, the replace_with options are: '1': Mean value of the corresponding column; '2': Minimum value of the corresponding column; '3': Maximum value of the corresponding column; '4': Mean value of the corresponding row; '5': Minimum value of the corresponding row; or '6': Maximum value of the corresponding row. For '2': regression method, the only available replace_with option is '1': linear. The regression method fits a linear regression line to each trajectory that has one or more missing entries and estimates the missing values from that line. Note: only the missing data points take their new values from the regression line; all other data points retain their original values. The function terminates if any trajectory has only one observation. The default is '1', which means the mean value of the corresponding column under the arithmetic method and linear under the regression method.

fill_zeros

[TRUE or FALSE] Whether to treat zeros (0) as missing values when the regression method ('2') is used. The default is FALSE.

verbose

[TRUE or FALSE] Whether to print output messages to the console. Set to FALSE to suppress them. Default: TRUE.

Value

A data.frame with missing values (NA, Inf, null) imputed according to the specified technique.

Details

Given a matrix or data.frame with some missing values, indicated by NA, Inf, or null, this function imputes each missing value either from a summary (mean, minimum, or maximum) of the corresponding row or column, or from a linear regression fitted to the trajectory that contains it.
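The two approaches described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper names impute_col_mean and impute_linear and the toy matrix are hypothetical, the time points are assumed to be equally spaced (1, 2, 3, ...), and only the column-mean variant of the arithmetic method is shown.

```r
# Hypothetical helpers written in base R; NOT the code used by data_imputation().

# Arithmetic sketch: replace each missing entry with the mean of its column.
impute_col_mean <- function(m) {
  m[!is.finite(m)] <- NA                    # treat NA, NaN, Inf and -Inf as missing
  col_means <- colMeans(m, na.rm = TRUE)    # column means over observed values
  idx <- which(is.na(m), arr.ind = TRUE)    # (row, col) position of each gap
  m[idx] <- col_means[idx[, "col"]]
  m
}

# Regression sketch: fit a straight line to each trajectory (row) and take
# only the missing points from that line; observed points are left unchanged.
impute_linear <- function(m, fill_zeros = FALSE) {
  m[!is.finite(m)] <- NA
  if (fill_zeros) m[which(m == 0)] <- NA    # optionally treat zeros as missing
  time_idx <- seq_len(ncol(m))              # assumed equally spaced time points
  for (i in seq_len(nrow(m))) {
    y <- m[i, ]
    miss <- is.na(y)
    if (!any(miss)) next
    if (sum(!miss) < 2) {
      stop("trajectory ", i, " has fewer than two observations")
    }
    fit <- lm(y ~ time_idx)                 # lm() drops the NA rows by default
    m[i, miss] <- predict(fit, newdata = data.frame(time_idx = time_idx[miss]))
  }
  m
}

# Toy data: two trajectories observed at four time points.
toy <- matrix(c(2, 4, NA, 8,
                1, NA, 3, Inf),
              nrow = 2, byrow = TRUE)

impute_col_mean(toy)   # row 1: 2 4 3 8; row 2: 1 4 3 8
impute_linear(toy)     # row 1: 2 4 6 8; row 2: 1 2 3 4
```

In the toy data, the column-mean version borrows information from the other trajectory, while the regression version uses only the trajectory's own trend, which is why the two results differ.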

Examples

# Using the example 'traj' datasets
imp_data <- data_imputation(traj, id_field = TRUE, method = 2,
replace_with = 1, fill_zeros = FALSE, verbose=FALSE)