An iterative process that polishes the data to minimise the loss, stopping once the loss reaches a tolerance value.

  • na_polish_auto() returns the polished data.

  • na_polish_autotrace() returns a tibble for documenting the steps and metrics.

na_polish_auto(data, cutoff, tol = 0.1, funs = na_polish_funs(),
  quiet = FALSE)

na_polish_autotrace(data, cutoff, tol = 0.1, funs = na_polish_funs(),
  quiet = FALSE)

Arguments

data

A tsibble.

cutoff

A numeric vector of values between 0 and 1, either of length 1 or of the same length as funs.

tol

A tolerance value close or equal to zero, used as the stopping rule. It is compared with the loss to be minimised, defined as (1 - prop_na) * prop_removed. See na_polish_metrics() for details.

funs

A list of na_polish_*() functions to go through. By default, na_polish_funs() contains "measures", "key", "index", and "index2".

quiet

If FALSE, report metrics at each step and pass of the polishing process. Reporting requires the "cliapp" package to be installed.

See also

Other missing value polishing functions: na_polish_measures, na_polish_metrics

Examples

# \dontrun{
wdi_ts <- tsibble::as_tsibble(wdi, key = country_code, index = year)
wdi_after <- na_polish_auto(wdi_ts, cutoff = .8)
#>
#> ── Iteration 1 ─────────────────────────────────────────────────────────────────
#> 1. `na_polish_index()`    (1 - 0.671) x 0.352 = 0.116
#> 2. `na_polish_index2()`   (1 - 0.558) x 0.126 = 0.056
#> 3. `na_polish_measures()` (1 - 0.878) x 0.109 = 0.013
#> ✔ 0.160
#>
#> ── Iteration 2 ─────────────────────────────────────────────────────────────────
#> 1. `na_polish_index()`  (1 - 0.415) x 0.157 = 0.092
#> 2. `na_polish_index2()` (1 - 0.240) x 0.053 = 0.041
#> ✔ 0.126
#>
#> ── Iteration 3 ─────────────────────────────────────────────────────────────────
#> 1. `na_polish_index()`  (1 - 0.328) x 0.114 = 0.077
#> 2. `na_polish_index2()` (1 - 0.184) x 0.042 = 0.034
#> ✔ 0.107
#>
#> ── Iteration 4 ─────────────────────────────────────────────────────────────────
#> 1. `na_polish_index()`  (1 - 0.282) x 0.091 = 0.066
#> 2. `na_polish_index2()` (1 - 0.167) x 0.035 = 0.029
#> ✔ 0.092
na_polish_metrics(wdi_ts, wdi_after)
#> # A tibble: 1 x 6
#>   prop_na nobs_na prop_removed nobs_removed nrows_removed ncols_removed
#>     <dbl>   <int>        <dbl>        <int>         <int>         <int>
#> 1   0.575  240327        0.700       417900          7200             6
# Trace down `na_polish_auto()`
na_polish_autotrace(wdi_ts, cutoff = .8, quiet = TRUE)
#> # A tibble: 9 x 8
#>   iteration polisher prop_na prop_removed eval_loss iter_loss final_prop_na
#>       <int> <chr>      <dbl>        <dbl>     <dbl>     <dbl>         <dbl>
#> 1         1 na_poli…   0.671       0.352     0.116     0.160          0.575
#> 2         1 na_poli…   0.558       0.126     0.0556    0.160          0.575
#> 3         1 na_poli…   0.878       0.109     0.0133    0.160          0.575
#> 4         2 na_poli…   0.415       0.157     0.0918    0.126          0.575
#> 5         2 na_poli…   0.240       0.0533    0.0405    0.126          0.575
#> 6         3 na_poli…   0.328       0.114     0.0768    0.107          0.575
#> 7         3 na_poli…   0.184       0.0419    0.0342    0.107          0.575
#> 8         4 na_poli…   0.282       0.0913    0.0655    0.0918         0.575
#> 9         4 na_poli…   0.167       0.0346    0.0289    0.0918         0.575
#> # … with 1 more variable: final_prop_removed <dbl>
# }
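
The per-step loss reported above can be checked by hand from the definition given for tol, (1 - prop_na) * prop_removed. The following is a minimal base R sketch, not part of the package: polish_loss() is a hypothetical helper, and the numbers are copied from the example output. The last lines assume that polishing stops at the first iteration whose loss falls below tol, which is consistent with the trace but is not stated explicitly in this page.

```r
# Hypothetical helper (not exported by the package): the loss as defined
# in the `tol` argument, (1 - prop_na) * prop_removed.
polish_loss <- function(prop_na, prop_removed) {
  (1 - prop_na) * prop_removed
}

# Iteration 1 values copied from the example output above.
steps <- data.frame(
  polisher = c("na_polish_index", "na_polish_index2", "na_polish_measures"),
  prop_na = c(0.671, 0.558, 0.878),
  prop_removed = c(0.352, 0.126, 0.109)
)
steps$eval_loss <- polish_loss(steps$prop_na, steps$prop_removed)
print(round(steps$eval_loss, 3))
#> [1] 0.116 0.056 0.013

# Iteration losses from the trace, compared with the default tolerance.
# Assumption: the process stops at the first iteration with loss < tol.
iter_loss <- c(0.160, 0.126, 0.107, 0.0918)
tol <- 0.1
first_below <- which(iter_loss < tol)[1]
print(first_below)
#> [1] 4
```

The rounded values match the three step losses printed for iteration 1, and the fourth iteration is the first whose loss is below the default tol = 0.1, which is where the example run ends.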