This function fills any missing entries (NA, Inf, null) in a matrix or
data.frame according to a specified method. By default, '0' is
considered a value.
data_imputation(traj, id_field = FALSE, method = 2,
replace_with = 1, fill_zeros = FALSE, verbose=TRUE)
Arguments
traj |
[matrix (numeric)]: longitudinal data. Each
row represents an individual trajectory (of observations). The
columns show the observations at consecutive time points. |
id_field |
[TRUE or FALSE] Whether the first column
of traj is a unique id field. Default:
FALSE. If TRUE, the function treats the second
column as the first time step. |
method |
[an integer] indicating the method for calculating
the missing values. Options are: '1': arithmetic
method, and '2': regression method. The default
is '2': regression method, as shown in the usage above. |
replace_with |
[an integer from 1 to 6] indicating the technique,
based on the specified method, for calculating the missing entries.
For method '1' (arithmetic method), the replace_with options are:
'1': mean value of the corresponding column;
'2': minimum value of the corresponding column;
'3': maximum value of the corresponding column;
'4': mean value of the corresponding row;
'5': minimum value of the corresponding row;
'6': maximum value of the corresponding row.
For method '2' (regression method), the only available
replace_with option is '1': linear.
The regression method fits a linear regression line to a trajectory
with one or more missing entries and estimates the missing values
from that line. Note: only the missing data points derive their new
values from the regression line, while the rest of the data points
retain their original values. The function terminates if there are
trajectories with only one observation. The default is '1': the mean
value of the corresponding column under the arithmetic method, or
linear under the regression method. |
fill_zeros |
[TRUE or FALSE] Whether to treat zeros (0)
as missing values when method '2' (regression method) is used.
Default: FALSE. |
verbose |
[TRUE or FALSE] Whether to print output messages to the console.
Set to FALSE to suppress them. Default: TRUE. |
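The arithmetic options listed for replace_with can be illustrated with a small base R sketch. This is not the function's actual implementation: impute_arithmetic is a hypothetical helper written only for illustration, and it assumes that NA, NaN, Inf and -Inf all count as missing.

```r
# Hypothetical helper (an assumption, not part of the documented function):
# replace missing entries with the mean, minimum or maximum of the
# corresponding column (replace_with 1 to 3) or row (replace_with 4 to 6).
impute_arithmetic <- function(m, replace_with = 1) {
  stopifnot(is.matrix(m), is.numeric(m), replace_with %in% 1:6)

  # Options 1/4 use the mean, 2/5 the minimum, 3/6 the maximum
  stat <- list(mean, min, max)[[(replace_with - 1) %% 3 + 1]]

  src <- m
  src[!is.finite(src)] <- NA   # treat NA, NaN, Inf and -Inf as missing
  out <- src

  # Row and column index of every missing cell
  miss <- which(is.na(src), arr.ind = TRUE)
  for (k in seq_len(nrow(miss))) {
    i <- miss[k, 1]
    j <- miss[k, 2]
    # Statistics are always computed from the original observed values,
    # never from values imputed earlier in the loop
    vals <- if (replace_with <= 3) src[, j] else src[i, ]
    out[i, j] <- stat(vals, na.rm = TRUE)
  }
  out
}

m <- matrix(c(2, NA, 4,
              6, 8, Inf), nrow = 2, byrow = TRUE)
impute_arithmetic(m, replace_with = 1)  # column means
impute_arithmetic(m, replace_with = 4)  # row means
```

In this sketch a row or column with no observed values at all yields NaN (for the mean) or an infinite value with a warning (for the minimum and maximum), so such cases would need separate handling.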
Value
A data.frame with missing values (NA, Inf, null) imputed according to
the specified technique.
Details
Given a matrix or data.frame with some missing values
(indicated by NA, Inf or null), this function
imputes each missing value either with an estimate computed from the
corresponding row or column, or with a value predicted by a regression
line fitted to the trajectory.
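The regression approach can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the function's own code: impute_linear is a hypothetical helper, time points are assumed to be equally spaced (1, 2, 3, ...), and non-finite values are treated as missing.

```r
# Hypothetical helper (an assumption, not part of the documented function):
# fit a straight line to the observed points of one trajectory and fill
# only the missing time points from that line.
impute_linear <- function(y) {
  t <- seq_along(y)            # assumed equally spaced time points
  missing <- !is.finite(y)     # NA, NaN, Inf and -Inf count as missing
  if (!any(missing)) return(y)

  # A line cannot be fitted to fewer than two observations
  if (sum(!missing) < 2) {
    stop("trajectory needs at least two observations to fit a line")
  }

  obs <- data.frame(t = t[!missing], y = y[!missing])
  fit <- lm(y ~ t, data = obs)

  # Observed values are left untouched; only the gaps are predicted
  y[missing] <- predict(fit, newdata = data.frame(t = t[missing]))
  y
}

impute_linear(c(1, NA, 3, 4, Inf))

# Applied row by row to a matrix of trajectories
m <- matrix(c(1, NA, 3, 4,
              2, 4, NA, 8), nrow = 2, byrow = TRUE)
t(apply(m, 1, impute_linear))
```

Stopping on trajectories with fewer than two observed points mirrors the documented behaviour of terminating when a trajectory has only one observation.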
Examples
# Using the example 'traj' dataset
imp_data <- data_imputation(traj, id_field = TRUE, method = 2,
                            replace_with = 1,
                            fill_zeros = FALSE, verbose = FALSE)