The base function base::as.factor() is not a generic, but this variant is. By default, to_factor() is a wrapper for base::as.factor().
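To illustrate what being a generic means here, the following is a minimal base R sketch of an S3 generic whose default method delegates to base::as.factor(). The names my_to_factor() and the "temperature" class are hypothetical and exist only for this illustration; this is not the package's actual source.

```r
# Hypothetical generic, named my_to_factor() to avoid any clash with the package.
my_to_factor <- function(x, ...) {
  UseMethod("my_to_factor")
}

# Default method: simply delegate to base::as.factor().
my_to_factor.default <- function(x, ...) {
  base::as.factor(x)
}

# Hypothetical class-specific method for objects of class "temperature":
# bin the underlying numbers into three named levels.
my_to_factor.temperature <- function(x, ...) {
  cut(
    unclass(x),
    breaks = c(-Inf, 0, 20, Inf),
    labels = c("cold", "mild", "hot")
  )
}

temps <- structure(c(-5, 12, 30), class = "temperature")

res_default <- my_to_factor(c("b", "a", "b")) # dispatches to the default method
res_class <- my_to_factor(temps)              # dispatches to the class method
```

Because dispatch happens on the class of x, other classes can supply their own conversion rules without changing the default behaviour.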
Please note that to_factor() differs slightly from the haven::as_factor() method provided by the haven package.
unlabelled(x) is a shortcut for to_factor(x, strict = TRUE, unclass = TRUE, labelled_only = TRUE).
to_factor(x, ...)
# S3 method for haven_labelled
to_factor(
  x,
  levels = c("labels", "values", "prefixed"),
  ordered = FALSE,
  nolabel_to_na = FALSE,
  sort_levels = c("auto", "none", "labels", "values"),
  decreasing = FALSE,
  drop_unused_labels = FALSE,
  user_na_to_na = FALSE,
  strict = FALSE,
  unclass = FALSE,
  explicit_tagged_na = FALSE,
  ...
)
# S3 method for data.frame
to_factor(
  x,
  levels = c("labels", "values", "prefixed"),
  ordered = FALSE,
  nolabel_to_na = FALSE,
  sort_levels = c("auto", "none", "labels", "values"),
  decreasing = FALSE,
  labelled_only = TRUE,
  drop_unused_labels = FALSE,
  strict = FALSE,
  unclass = FALSE,
  explicit_tagged_na = FALSE,
  ...
)
unlabelled(x, ...)
x: Object to coerce to a factor.
...: Other arguments passed down to method.
levels: What should be used for the factor levels: the labels, the values, or the labels prefixed with values?
ordered: TRUE for ordinal factors, FALSE (default) for nominal factors.
nolabel_to_na: Should values with no label be converted to NA?
sort_levels: How should the factor levels be sorted? (see Details)
decreasing: Should levels be sorted in decreasing order?
drop_unused_labels: Should unused value labels be dropped? (applied only if strict = FALSE)
user_na_to_na: Convert user-defined missing values into NA?
strict: Convert to factor only if all values have a defined label?
unclass: If not converted to a factor (when strict = TRUE), convert to a character or numeric vector by applying base::unclass()?
explicit_tagged_na: Should tagged NA (cf. haven::tagged_na()) be kept as explicit factor levels?
labelled_only: For a data.frame, convert only labelled variables to factors?
If some values don't have a label, automatic labels will be created, except if nolabel_to_na is TRUE.
If sort_levels == 'values', the levels will be sorted according to the values of x.
If sort_levels == 'labels', the levels will be sorted according to the names of the labels.
If sort_levels == 'none', the levels will be in the order the value labels are defined in x. If some labels are automatically created, they will be added at the end.
If sort_levels == 'auto', sort_levels == 'none' will be used, except if some values don't have a defined label. In that case, sort_levels == 'values' will be applied.
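The sorting rules above can be compared side by side on a vector where every observed value has a label. This is a small sketch on made-up data, with the labels deliberately defined out of value order; it assumes the labelled package is installed.

```r
library(labelled)

# Hypothetical vector: every observed value has a label,
# and the labels are defined in the order medium, low, high.
v <- labelled(c(1, 2, 3, 2), labels = c(medium = 2, low = 1, high = 3))

lev_auto <- levels(to_factor(v))                           # "auto": all values labelled, so behaves like "none"
lev_none <- levels(to_factor(v, sort_levels = "none"))     # order in which the labels are defined
lev_values <- levels(to_factor(v, sort_levels = "values")) # sorted by underlying value
lev_labels <- levels(to_factor(v, sort_levels = "labels")) # sorted by label text

print(list(auto = lev_auto, none = lev_none, values = lev_values, labels = lev_labels))
```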
When applied to a data.frame, only labelled vectors are converted by default to a factor. Use labelled_only = FALSE to convert all variables to factors.
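As a quick illustration of the labelled_only argument on a data.frame, here is a short sketch using made-up data. It assumes the labelled package is installed; the column names are arbitrary.

```r
library(labelled)

# Hypothetical example data: one labelled column and one plain integer column.
df <- data.frame(
  a = labelled(c(1, 2, 1), labels = c(No = 1, Yes = 2)),
  d = 1:3
)

# Default (labelled_only = TRUE): only the labelled column becomes a factor.
only_labelled <- to_factor(df)
print(sapply(only_labelled, is.factor))

# labelled_only = FALSE: the plain integer column is converted as well.
all_cols <- to_factor(df, labelled_only = FALSE)
print(sapply(all_cols, is.factor))
```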
unlabelled() is a shortcut for quickly removing value labels of a vector or of a data.frame. If all observed values have a value label, then the vector will be converted into a factor. Otherwise, the vector will be unclassed.
If you want to remove value labels in all cases, use remove_val_labels().
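The difference between the two functions can be seen on a pair of small vectors. This is a sketch on made-up data, assuming the labelled package is installed; the object names are arbitrary.

```r
library(labelled)

# Hypothetical vectors: in `full` every observed value is labelled,
# in `partial` the value 3 has no label.
full <- labelled(c(1, 2, 1), labels = c(No = 1, Yes = 2))
partial <- labelled(c(1, 2, 3), labels = c(No = 1, Yes = 2))

u_full <- unlabelled(full)               # all values labelled: converted to a factor
u_partial <- unlabelled(partial)         # not all values labelled: unclassed, not a factor
r_partial <- remove_val_labels(partial)  # value labels dropped regardless

print(class(u_full))
print(class(u_partial))
print(attributes(r_partial))
```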
v <- labelled(
  c(1, 2, 2, 2, 3, 9, 1, 3, 2, NA),
  c(yes = 1, no = 3, "don't know" = 9)
)
to_factor(v)
#> [1] yes 2 2 2 no don't know
#> [7] yes no 2 <NA>
#> Levels: yes 2 no don't know
to_factor(v, nolabel_to_na = TRUE)
#> [1] yes <NA> <NA> <NA> no don't know
#> [7] yes no <NA> <NA>
#> Levels: yes no don't know
to_factor(v, "p")
#> [1] [1] yes [2] 2 [2] 2 [2] 2 [3] no
#> [6] [9] don't know [1] yes [3] no [2] 2 <NA>
#> Levels: [1] yes [2] 2 [3] no [9] don't know
to_factor(v, sort_levels = "v")
#> [1] yes 2 2 2 no don't know
#> [7] yes no 2 <NA>
#> Levels: yes 2 no don't know
to_factor(v, sort_levels = "n")
#> [1] yes 2 2 2 no don't know
#> [7] yes no 2 <NA>
#> Levels: yes no don't know 2
to_factor(v, sort_levels = "l")
#> [1] yes 2 2 2 no don't know
#> [7] yes no 2 <NA>
#> Levels: 2 don't know no yes
x <- labelled(c("H", "M", "H", "L"), c(low = "L", medium = "M", high = "H"))
to_factor(x, ordered = TRUE)
#> [1] high medium high low
#> Levels: low < medium < high
# Strict conversion
v <- labelled(c(1, 1, 2, 3), labels = c(No = 1, Yes = 2))
to_factor(v)
#> [1] No No Yes 3
#> Levels: No Yes 3
to_factor(v, strict = TRUE) # Not converted because 3 does not have a label
#> <labelled<double>[4]>
#> [1] 1 1 2 3
#>
#> Labels:
#> value label
#> 1 No
#> 2 Yes
to_factor(v, strict = TRUE, unclass = TRUE)
#> [1] 1 1 2 3
#> attr(,"labels")
#> No Yes
#> 1 2
df <- data.frame(
  a = labelled(c(1, 1, 2, 3), labels = c(No = 1, Yes = 2)),
  b = labelled(c(1, 1, 2, 3), labels = c(No = 1, Yes = 2, DK = 3)),
  c = labelled(
    c("a", "a", "b", "c"),
    labels = c(No = "a", Maybe = "b", Yes = "c")
  ),
  d = 1:4,
  e = factor(c("item1", "item2", "item1", "item2")),
  f = c("itemA", "itemA", "itemB", "itemB"),
  stringsAsFactors = FALSE
)
if (require(dplyr)) {
  glimpse(df)
  glimpse(unlabelled(df))
}
#> Rows: 4
#> Columns: 6
#> $ a <dbl+lbl> 1, 1, 2, 3
#> $ b <dbl+lbl> 1, 1, 2, 3
#> $ c <chr+lbl> "a", "a", "b", "c"
#> $ d <int> 1, 2, 3, 4
#> $ e <fct> item1, item2, item1, item2
#> $ f <chr> "itemA", "itemA", "itemB", "itemB"
#> Rows: 4
#> Columns: 6
#> $ a <dbl> 1, 1, 2, 3
#> $ b <fct> No, No, Yes, DK
#> $ c <fct> No, No, Maybe, Yes
#> $ d <int> 1, 2, 3, 4
#> $ e <fct> item1, item2, item1, item2
#> $ f <chr> "itemA", "itemA", "itemB", "itemB"