Get / Set SPSS missing values

na_values(x)

na_values(x) <- value

na_range(x)

na_range(x) <- value

set_na_values(.data, ..., .values = NA, .strict = TRUE)

set_na_range(.data, ..., .values = NA, .strict = TRUE)

user_na_to_na(x)

Arguments

x

A vector.

value

A vector of values that should also be considered as missing (for na_values()), or a numeric vector of length two giving the (inclusive) extents of the range (for na_range(); use -Inf and Inf if you want the range to be open-ended).

.data

a data frame

...

name-value pairs of missing values (see examples)

.values

missing values to be applied to the data.frame, using the same syntax as value in na_values(df) <- value or na_range(df) <- value.

.strict

should an error be returned if some names don't correspond to a column of .data?
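The following is a minimal sketch of how ..., .values and .strict might be combined. The data frame and its column names (q1, q2, and the non-existent q3) are hypothetical, and the sketch assumes that .values accepts a named list keyed by column name, as suggested by the description of .values above.

```r
library(labelled)

# Hypothetical data frame, for illustration only
df <- tibble::tibble(q1 = c(1, 2, 9), q2 = c(1, 8, 9))
df <- set_value_labels(df, q1 = c(yes = 1, no = 2), q2 = c(yes = 1, no = 2))

# Name-value pairs passed through `...`
df1 <- set_na_values(df, q1 = 9, q2 = c(8, 9))

# Assumed equivalent: the same specification as a named list in `.values`
df2 <- set_na_values(df, .values = list(q1 = 9, q2 = c(8, 9)))

# With .strict = FALSE, a name matching no column ("q3") is skipped
# instead of stopping with an error
df3 <- set_na_values(df, q1 = 9, q3 = 9, .strict = FALSE)

na_values(df1$q2)
na_values(df3$q1)
```

Keeping the default .strict = TRUE is usually the safer choice, since a misspelled column name then fails loudly rather than being silently ignored.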

Value

na_values() will return a vector of values that should also be considered as missing. na_range() will return a numeric vector of length two giving the (inclusive) extents of the range.

set_na_values() and set_na_range() will return an updated copy of .data.

Details

See haven::labelled_spss() for a presentation of SPSS's user-defined missing values. Note that base::is.na() will return TRUE for user-defined missing values as well as for regular NA. You can use user_na_to_na() to convert user-defined missing values to NA.
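As a small illustration of the point above, the sketch below counts missing values before and after conversion. The vector is made up for the example, and unclass() is used here only as one possible way to look at the underlying values without the labelled_spss method for is.na().

```r
library(labelled)

# Made-up vector: 9 is declared as a user-defined missing value
v <- labelled(c(1, 2, 9, NA), c(yes = 1, no = 2, "don't know" = 9))
na_values(v) <- 9

# is.na() flags both the user-defined missing value and the regular NA
sum(is.na(v))

# On the underlying values, only the regular NA is flagged
sum(is.na(unclass(v)))

# After conversion, 9 has been replaced by a regular NA
w <- user_na_to_na(v)
sum(is.na(unclass(w)))
```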

Note

set_na_values() and set_na_range() can be used with dplyr syntax.
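A short sketch of what this could look like in a pipeline, here with set_na_range(). The data frame and the s2_clean column name are hypothetical and chosen only for illustration.

```r
library(labelled)
library(dplyr)

# Hypothetical data: every value from 5 upwards is declared missing
df <- tibble(s2 = c(1, 1, 2, 9)) %>%
  set_value_labels(s2 = c(yes = 1, no = 2)) %>%
  set_na_range(s2 = c(5, Inf))

na_range(df$s2)

# Convert the user-defined missing values to regular NA in a new column
df <- df %>% mutate(s2_clean = user_na_to_na(s2))
is.na(df$s2_clean)
```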

See also

haven::labelled_spss(), user_na_to_na()

Examples

v <- labelled(c(1,2,2,2,3,9,1,3,2,NA), c(yes = 1, no = 3, "don't know" = 9))
v
#> <labelled<double>[10]>
#>  [1]  1  2  2  2  3  9  1  3  2 NA
#>
#> Labels:
#>  value      label
#>      1        yes
#>      3         no
#>      9 don't know
na_values(v) <- 9
na_values(v)
#> [1] 9
v
#> <labelled_spss<double>[10]>
#>  [1]  1  2  2  2  3  9  1  3  2 NA
#> Missing values: 9
#>
#> Labels:
#>  value      label
#>      1        yes
#>      3         no
#>      9 don't know
is.na(v)
#> [1] FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE TRUE
user_na_to_na(v)
#> <labelled<double>[10]>
#>  [1]  1  2  2  2  3 NA  1  3  2 NA
#>
#> Labels:
#>  value      label
#>      1        yes
#>      3         no
#>      9 don't know
na_values(v) <- NULL
v
#> <labelled<double>[10]>
#>  [1]  1  2  2  2  3  9  1  3  2 NA
#>
#> Labels:
#>  value      label
#>      1        yes
#>      3         no
#>      9 don't know
na_range(v) <- c(5, Inf)
na_range(v)
#> [1] 5 Inf
v
#> <labelled_spss<double>[10]>
#>  [1]  1  2  2  2  3  9  1  3  2 NA
#> Missing range: [5, Inf]
#>
#> Labels:
#>  value      label
#>      1        yes
#>      3         no
#>      9 don't know
user_na_to_na(v)
#> <labelled<double>[10]>
#>  [1]  1  2  2  2  3 NA  1  3  2 NA
#>
#> Labels:
#>  value      label
#>      1        yes
#>      3         no
#>      9 don't know
if (require(dplyr)) {
  # setting value labels
  df <- tibble(s1 = c("M", "M", "F", "F"), s2 = c(1, 1, 2, 9)) %>%
    set_value_labels(s2 = c(yes = 1, no = 2)) %>%
    set_na_values(s2 = 9)
  na_values(df)

  # removing missing values
  df <- df %>% set_na_values(s2 = NULL)
  df$s2
}
#> <labelled<double>[4]>
#> [1] 1 1 2 9
#>
#> Labels:
#>  value label
#>      1   yes
#>      2    no