Channel: r software hub

Easy data validation with the validate package


By mark


The validate package is our attempt to make checking data against domain knowledge as easy as possible. Here is an example.

library(magrittr)
library(validate)

iris %>% check_that(
  Sepal.Width > 0.5 * Sepal.Length,                         # record-wise rule
  mean(Sepal.Width) > 0,                                    # aggregate rule
  if (Sepal.Width > 0.5 * Sepal.Length) Sepal.Length > 10   # conditional rule
) %>% summary()

#   rule items passes fails nNA error warning                                              expression
# 1   V1   150     66    84   0 FALSE   FALSE                        Sepal.Width > 0.5 * Sepal.Length
# 2   V2     1      1     0   0 FALSE   FALSE                                   mean(Sepal.Width) > 0
# 3   V3   150     84    66   0 FALSE   FALSE !(Sepal.Width > 0.5 * Sepal.Length) | Sepal.Length > 10

The summary gives an overview of the number of items checked by each rule. For an aggregated test, such as the one on the mean of a variable, only one item is tested: the whole Sepal.Width column. The other rules are tested on each record in iris. Furthermore, the number of items that pass, fail …
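To make the item counts concrete, here is a minimal base R sketch that evaluates the same three rules by hand, without the validate package. The variable names (v1, v2, v3) are hypothetical and only mirror the rule labels in the summary. The rewriting of the conditional rule as !(A) | B is taken from the expression column of the output above. This is an illustration of what is being counted, not a description of how validate works internally.

```r
# Base R only; iris ships with R in the datasets package.

# Rule V1: record-wise, yields one logical value per row of iris
v1 <- iris$Sepal.Width > 0.5 * iris$Sepal.Length

# Rule V2: aggregate, yields a single logical value for the whole column
v2 <- mean(iris$Sepal.Width) > 0

# Rule V3: "if (A) B" expressed as the implication !(A) | B, per row
v3 <- !(iris$Sepal.Width > 0.5 * iris$Sepal.Length) | iris$Sepal.Length > 10

# Items, passes and fails per rule (assembled by hand for comparison)
counts <- data.frame(
  rule   = c("V1", "V2", "V3"),
  items  = c(length(v1), length(v2), length(v3)),
  passes = c(sum(v1), sum(v2), sum(v3)),
  fails  = c(sum(!v1), sum(!v2), sum(!v3))
)
print(counts)
```

The items column is simply the length of each result vector, which is why the aggregate rule reports a single item while the record-wise rules report one item per row.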

Source: r-bloggers.com

