![Open Stata 13 in R](https://miro.medium.com/max/1000/1*SCCMnSn6chmDfe6Yax6yFQ.png)
Let's start with an example of reading in a SAS dataset. We will use the National Youth Tobacco Survey (NYTS) data from 2017, which can be downloaded from the CDC website. The data comes in two files: the raw data in a `.sas7bdat` file and the formats/labels in a `.sas7bcat` file. Both will be specified as arguments in `read_sas()`. For this example we keep only the first few columns with `select(Month:Q3)`.

Notice that each column is a chr+lbl type, meaning the character value from the `.sas7bdat` file is what is shown in the tibble by default, but the data also contain a label from the `.sas7bcat` file that is stored behind the scenes. These chr+lbl columns imported by haven are labeled categorical variables in SAS and treated as categorical variables in R, but show up as numeric values (e.g. 1, 2, 3) in the R output.

If you want to convert the chr+lbl variables to a traditional categorical variable that is internally stored and represented in the output as the labels, use the `as_factor()` function. Factor (fct) variables are what R calls categorical variables.

Another thing to clean up is the blank cells. We would prefer these to show as NA in R instead of an empty character string.

```r
nyts %>%
  mutate_all(zap_empty) %>%                # convert any blank cells to NA
  mutate_at(vars(Month, Q1:Q3), as_factor) # convert Month, Q1, Q2 and Q3 from chr+lbl to fct column type
```

Now we have the value labels showing up, but what do Q1, Q2, and Q3 mean? These have a label attribute that we can extract and show in our tibble using the `get_label()` function from the sjlabelled package.

```r
nyts %>%
  sjlabelled::get_label() %>% # pull variable labels from tibble
  enframe()                   # convert the named vector of labels to a new tibble

# A tibble: 5 x 2
# 5 Q3 What grade are you in?
```

```r
# convert column names to variable label using purrr::set_names()
nyts %>%
  set_names(nyts %>% sjlabelled::get_label() %>% enframe() %>% pull(value))

# `Month of admin... `Day of adminis... `How old are yo... `What is your s...
# ... with 1 more variable: `What grade are you in?`
```

Even better. Now the tibble in R contains the descriptive variable names and values from the labels stored in the original SAS data files.

To write the file back out to SAS, use `write_sas()`. Note that at the time of this writing, the `write_sas()` function is still experimental and only works for limited datasets.

As an example of an SPSS file, we will use the titanic.sav dataset, which can be downloaded from Butler University's PS 310 course datasets. Notice that each variable is a dbl+lbl type. We can use `as_factor()` to convert these variables to fct variables, keeping and showing only the labels. We'll also convert the variable names to their more descriptive labels where applicable, using `sjlabelled::get_label()`, and clean up the names to be all lowercase with spaces replaced by underscores using `janitor::clean_names()`.

```r
titanic %>%
  mutate_all(as_factor) %>%                 # convert all variables from dbl+lbl to fct type
  set_names(titanic %>%                     # convert variable names to informative labels
    sjlabelled::get_label() %>%             # pull labels from original tibble
    enframe() %>%                           # convert the named vector of labels to a new tibble
    mutate(value = na_if(value, "")) %>%    # if variable label is empty string, convert to NA
    mutate(value = coalesce(value, name)) %>% # fill in NA with original if missing
    pull(value)) %>%                        # extract new variable name into a character vector
  janitor::clean_names()                    # clean names to be all lowercase, replacing spaces with "_"

# age_of_passaenger sex travel_class survival_state women_children_vs_men
```

To export the file back out to SPSS, use `write_sav()`.

As an example of a Stata file, we will use data from the British Election Study 2017 Face-to-face Post-election Survey Version 1.3 (the bes_f2f_2017_v1.3.dta dataset), which can be downloaded from the British Election Study's website.
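The label handling that haven provides can also be tried without downloading any dataset. Below is a minimal sketch, assuming the haven and dplyr packages are installed. The toy tibble, its column names (`Q2`, `note`) and its values are hypothetical stand-ins for an imported survey file, not part of the NYTS, Titanic, or BES data. It builds a labelled column by hand, converts it with `as_factor()`, turns blank strings into NA with `zap_empty()`, and round-trips the data through a temporary SPSS file.

```r
library(haven)
library(dplyr)

# Hypothetical toy data standing in for an imported survey file
toy <- tibble(
  Q2 = labelled(
    c(1, 2, NA, 1),
    labels = c(Male = 1, Female = 2), # value labels
    label  = "What is your sex?"      # variable label
  ),
  note = c("a", "", "c", "")          # contains blank cells
)

cleaned <- toy %>%
  mutate(note = zap_empty(note)) %>%  # blank strings become NA
  mutate(Q2 = as_factor(Q2))          # labelled column becomes a factor showing the labels

# haven stores the variable label in the "label" attribute of the column
q2_label <- attr(toy$Q2, "label")

# Round trip through a temporary SPSS file
path <- tempfile(fileext = ".sav")
write_sav(toy, path)
roundtrip <- read_sav(path)

print(cleaned)
print(q2_label)
```

Printing `roundtrip` should show `Q2` as a dbl+lbl column again, since the value labels and the variable label are saved in the `.sav` file.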