Get / Set SPSS missing values
na_values(x)
na_values(x) <- value
na_range(x)
na_range(x) <- value
get_na_values(x)
get_na_range(x)
set_na_values(.data, ..., .values = NA, .strict = TRUE)
set_na_range(.data, ..., .values = NA, .strict = TRUE)
is_user_na(x)
is_regular_na(x)
user_na_to_na(x)
user_na_to_regular_na(x)
user_na_to_tagged_na(x)
x: A vector (or a data frame).
value: A vector of values that should also be considered as missing (for
na_values) or a numeric vector of length two giving the (inclusive) extents
of the range (for na_range; use -Inf and Inf if you want the range to be
open ended).
.data: A data frame or a vector.
...: Name-value pairs of missing values (see examples).
.values: Missing values to be applied to the data frame, using the same syntax
as value in na_values(df) <- value or na_range(df) <- value.
.strict: Should an error be raised if some names do not correspond to a
column of .data?
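The effect of .strict can be illustrated with a minimal sketch. The data frame df and the column name s3 are made up for illustration, and the behaviour shown follows the argument description above rather than a guarantee for every version of the package.

```r
library(labelled)

# Hypothetical data: 9 is meant to be a user defined missing value in s2
df <- data.frame(s1 = c(1, 2, 3), s2 = c(1, 9, 9))

# s3 is not a column of df, so the default .strict = TRUE raises an error
res <- tryCatch(
  set_na_values(df, s3 = 9),
  error = function(e) e
)
inherits(res, "error")

# With .strict = FALSE the unknown name is skipped and s2 is still updated
df2 <- set_na_values(df, s2 = 9, s3 = 9, .strict = FALSE)
na_values(df2$s2)
```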
na_values() will return a vector of values that should also be considered as
missing. na_range() will return a numeric vector of length two giving the
(inclusive) extents of the range.
set_na_values() and set_na_range() will return an updated copy of .data.
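As a small sketch of the "updated copy" behaviour, the following uses a made-up data frame df and checks that the original object is left untouched while the returned copy carries the missing range.

```r
library(labelled)

# Hypothetical data frame with a single numeric column
df <- data.frame(s2 = c(1, 2, 9))

# set_na_range() returns a modified copy; df itself is not changed
df_rg <- set_na_range(df, s2 = c(5, Inf))

is.null(na_range(df$s2)) # the original column has no missing range
na_range(df_rg$s2)       # the copy carries the range that was set
```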
See haven::labelled_spss() for a presentation of SPSS's user defined missing
values.
Note that base::is.na() will return TRUE for user defined missing values. It
will also return TRUE for regular NA values. If you want to test if a specific
value is a user NA but not a regular NA, use is_user_na().
If you want to test if a value is a regular NA, and neither a user NA nor a
tagged NA, use is_regular_na().
You can use user_na_to_na() to convert user defined missing values to regular
NA. Note that any value label attached to a user defined missing value will be
lost. user_na_to_regular_na() is a synonym of user_na_to_na().
The method user_na_to_tagged_na() will convert user defined missing values
into haven::tagged_na(), preserving value labels. Please note that
haven::tagged_na() are defined only for double vectors. Therefore, integer
haven_labelled_spss vectors will be converted into double haven_labelled
vectors; and user_na_to_tagged_na() cannot be applied to a character
haven_labelled_spss vector.
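The integer-to-double conversion described above can be checked with a short sketch. The vector x is made up for illustration; depending on the package version, the call may also emit a message or warning about the conversion.

```r
library(labelled)

# Hypothetical integer vector where 9 is a user defined missing value
x <- labelled_spss(
  c(1L, 2L, 9L),
  labels = c(yes = 1L, no = 2L, "don't know" = 9L),
  na_values = 9L
)
typeof(x) # storage type before conversion

# Tagged NAs only exist for doubles, so the result is stored as double
y <- user_na_to_tagged_na(x)
typeof(y)
is_tagged_na(y) # flags the value that used to be 9
```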
tagged_na_to_user_na() is the opposite of user_na_to_tagged_na() and converts
tagged NA into user defined missing values.
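A minimal sketch of this conversion, using a made-up double vector x that mixes two tagged NAs with a regular NA. The numeric codes chosen for the new user defined missing values are picked by the function and are not asserted here.

```r
library(labelled)

# Hypothetical vector: two tagged NAs followed by one regular NA
x <- c(1, 2, 3, tagged_na("a"), tagged_na("z"), NA)

# Tagged NAs become user defined missing values
y <- tagged_na_to_user_na(x)

is_user_na(y)    # expected to flag the two former tagged NAs
is_regular_na(y) # the regular NA should remain a regular NA
na_values(y)     # codes now declared as missing
```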
get_na_values() is identical to na_values() and get_na_range() to na_range().
set_na_values() and set_na_range() can be used with dplyr syntax.
haven::labelled_spss(), user_na_to_na()
v <- labelled(
c(1, 2, 2, 2, 3, 9, 1, 3, 2, NA),
c(yes = 1, no = 3, "don't know" = 9)
)
v
#> <labelled<double>[10]>
#> [1] 1 2 2 2 3 9 1 3 2 NA
#>
#> Labels:
#> value label
#> 1 yes
#> 3 no
#> 9 don't know
na_values(v) <- 9
na_values(v)
#> [1] 9
v
#> <labelled_spss<double>[10]>
#> [1] 1 2 2 2 3 9 1 3 2 NA
#> Missing values: 9
#>
#> Labels:
#> value label
#> 1 yes
#> 3 no
#> 9 don't know
is.na(v) # TRUE for the 6th and 10th values
#> [1] FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE TRUE
is_user_na(v) # TRUE only for the 6th value
#> [1] FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
user_na_to_na(v)
#> <labelled<double>[10]>
#> [1] 1 2 2 2 3 NA 1 3 2 NA
#>
#> Labels:
#> value label
#> 1 yes
#> 3 no
na_values(v) <- NULL
v
#> <labelled<double>[10]>
#> [1] 1 2 2 2 3 9 1 3 2 NA
#>
#> Labels:
#> value label
#> 1 yes
#> 3 no
#> 9 don't know
na_range(v) <- c(5, Inf)
na_range(v)
#> [1] 5 Inf
v
#> <labelled_spss<double>[10]>
#> [1] 1 2 2 2 3 9 1 3 2 NA
#> Missing range: [5, Inf]
#>
#> Labels:
#> value label
#> 1 yes
#> 3 no
#> 9 don't know
user_na_to_na(v)
#> <labelled<double>[10]>
#> [1] 1 2 2 2 3 NA 1 3 2 NA
#>
#> Labels:
#> value label
#> 1 yes
#> 3 no
user_na_to_tagged_na(v)
#> <labelled<double>[10]>
#> [1] 1 2 2 2 3 NA(a) 1 3 2 NA
#>
#> Labels:
#> value label
#> 1 yes
#> 3 no
#> NA(a) don't know
# it is not recommended to mix user NAs and tagged NAs
x <- c(NA, 9, tagged_na("a"))
na_values(x) <- 9
x
#> <labelled_spss<double>[3]>
#> [1] NA 9 NA(a)
#> Missing values: 9
is.na(x)
#> [1] TRUE TRUE TRUE
is_user_na(x)
#> [1] FALSE TRUE FALSE
is_tagged_na(x)
#> [1] FALSE FALSE TRUE
is_regular_na(x)
#> [1] TRUE FALSE FALSE
if (require(dplyr)) {
  # setting value label and user NAs
  df <- tibble(s1 = c("M", "M", "F", "F"), s2 = c(1, 1, 2, 9)) %>%
    set_value_labels(s2 = c(yes = 1, no = 2)) %>%
    set_na_values(s2 = 9)
  na_values(df)
  # removing missing values
  df <- df %>% set_na_values(s2 = NULL)
  df$s2
  # example with a vector
  v <- 1:10
  v <- v %>% set_na_values(5, 6, 7)
  v
  v %>% set_na_range(8, 10)
  v %>% set_na_range(.values = c(9, 10))
  v %>% set_na_values(NULL)
}
#> <labelled<integer>[10]>
#> [1] 1 2 3 4 5 6 7 8 9 10