Validate data based on specified rules
Arguments
- files_data
A list of file paths for the datasets to be validated.
- data_names
(Optional) A character vector of names for the datasets. If not provided, names will be extracted from the file paths.
- file_rules
A file path for the rules file, in either .csv or .xlsx format.
- zip_data
A file path to a zip archive containing unstructured data to be validated.
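When data_names is omitted, the dataset names are taken from the file paths. The package's exact logic is not shown here. As a minimal sketch of one plausible approach, using only base R and hypothetical file paths, a name could be taken as the file name without its directory or extension:

```r
# Hypothetical file paths; the files need not exist for this sketch.
files_data <- c("data/methodology.csv", "data/particles.csv", "data/samples.csv")

# Assumption: a dataset name is the file name stripped of directory and extension.
# This illustrates the idea only and may differ from what validate_data() does.
derived_names <- tools::file_path_sans_ext(basename(files_data))
print(derived_names)
```

Passing data_names explicitly avoids relying on this kind of derivation when file names are not meaningful.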
Value
A list containing the following elements:
- data_formatted: A list of data frames with the validated data.
- data_names: A character vector of dataset names.
- report: A list of validation report objects for each dataset.
- results: A list of validation result data frames for each dataset.
- rules: A list of validator objects for each dataset.
- status: A character string indicating the overall validation status ("success" or "error").
- issues: A logical vector indicating if there are any issues in the validation results.
- message: A data.table containing information about any issues encountered.
Examples
# Validate data with specified rules
data("valid_example")
data("invalid_example")
data("test_rules")
result_valid <- validate_data(files_data = valid_example,
                              data_names = c("methodology", "particles", "samples"),
                              file_rules = test_rules)
result_invalid <- validate_data(files_data = invalid_example,
                                data_names = c("methodology", "particles", "samples"),
                                file_rules = test_rules)
#> Warning: All variables in the rules csv should be in the data csv and vice versa for the validation to work correctly. Download the Data Template for an example of correctly formatted upload. Ignoring these unmatched variables FilterDiameter, FilterPoreSize, ImageFile, ImageType, SampleID, SampleSize, Project, Affiliation
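The returned list can be inspected through the elements described under Value. The sketch below uses a hypothetical helper, summarise_validation(), which is not part of the package, and a mock list with made-up values in place of a real result. It assumes that issues holds one logical value per dataset, in the same order as data_names.

```r
# Hypothetical helper, not part of the package: summarise a validation result.
# Assumes `result` has the `status`, `issues`, and `data_names` elements
# described under Value, with one `issues` entry per dataset.
summarise_validation <- function(result) {
  flagged <- result$data_names[result$issues]
  if (length(flagged) == 0) {
    sprintf("status: %s; no datasets with issues", result$status)
  } else {
    sprintf("status: %s; datasets with issues: %s",
            result$status, paste(flagged, collapse = ", "))
  }
}

# Mock result with made-up values, used only to exercise the helper.
mock_result <- list(
  status = "success",
  issues = c(FALSE, TRUE, FALSE),
  data_names = c("methodology", "particles", "samples")
)
summary_line <- summarise_validation(mock_result)
print(summary_line)
```

With a real result, the same helper could be called as summarise_validation(result_invalid), provided the assumptions above hold.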