---
title: "Furniture"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Furniture}
  \usepackage[utf8]{inputenc}
  %\VignetteEngine{knitr::rmarkdown}
date: "`r Sys.Date()`"
editor_options:
  chunk_output_type: console
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

This vignette is current as of `furniture` `r packageVersion("furniture")`.

## Using `furniture`

```{r}
library(furniture)
```

We will first make a fictitious data set:

```{r data}
df <- data.frame(a = rnorm(1000, 1.5, 2),
                 b = seq(1, 1000, 1),
                 c = c(rep("control", 400), rep("Other", 70), rep("treatment", 500), rep("None", 30)),
                 d = c(sample(1:1000, 900, replace = TRUE), rep(-99, 100)))
```

We'll demonstrate the following functions here:

1. `washer`
2. `table1`
3. `tableC`
4. `tableF`
5. `tableX`

## Washer

`washer` is a handy function for quick data cleaning. It is useful when there are placeholders, extra levels in a factor, or several values that need to be changed to another.

```{r washer, message=FALSE, warning=FALSE}
library(dplyr)

df <- df %>%
  mutate(d = washer(d, -99),  ## changes the placeholder -99 to NA
         c = washer(c, "Other", "None", value = "control"))  ## changes "Other" and "None" to "control"
```

## Table1

Now that the data is "washed", we can start exploring and reporting.

```{r table1}
table1(df, a, b, factor(c), d)
```

The variables must be numeric or factor. Because `table1()` uses non-standard evaluation, we can transform the variables inside the function call (e.g., `factor(c)`). This extends to creating a whole new variable in the call as well.

```{r table1.2}
table1(df, a, b, d, ifelse(a > 1, 1, 0))
```

This is just the beginning, though. Two powerful things the function can do are shown below:

```{r table1.3}
table1(df, a, b, d, ifelse(a > 1, 1, 0),
       splitby = ~factor(c),
       test = TRUE)
```

The `splitby = ~factor(c)` argument stratifies the means and counts by a factor variable (in this case either control or treatment).
When we use this, we can also automatically compute tests of significance using `test = TRUE`.

We can also use it intuitively within the pipe (for more about this, see the "Table 1" vignette):

```{r table1.3.2, message=FALSE, warning=FALSE}
df %>%
  group_by(c) %>%
  table1(a, b, d, ifelse(a > 1, 1, 0),
         test = TRUE)
```

In this case, we used the `group_by()` function from `dplyr` (within the `tidyverse`), and `table1()` knows to use that as the grouping variable in place of the `splitby` argument.

If the parametric tests (the default) are not appropriate, you can set `param = FALSE`.

```{r table1.3.3, message=FALSE, warning=FALSE}
df %>%
  group_by(c) %>%
  table1(a, b, d, ifelse(a > 1, 1, 0),
         test = TRUE,
         param = FALSE)
```

Finally, you can polish the table quite a bit using a few other options. For example, you can do the following:

```{r table1.4}
table1(df, a, b, d, ifelse(a > 1, 1, 0),
       splitby = ~factor(c),
       test = TRUE,
       var_names = c("A", "B", "D", "New Var"),
       type = c("simple", "condensed"))
```

Note that `var_names` can be used for more complex naming (e.g., with spaces or brackets) that otherwise cannot be used with data frames. Alternatively, for simpler naming, we can name the variables directly.

```{r table1.4.2}
table1(df, A = a, B = b, D = d, A2 = ifelse(a > 1, 1, 0),
       splitby = ~factor(c),
       test = TRUE,
       type = c("simple", "condensed"))
```

You can also format the numbers, adding a comma to big numbers (e.g., 20,000 instead of 20000):

```{r table1.5}
table1(df, a, b, d, ifelse(a > 1, 1, 0),
       splitby = ~factor(c),
       test = TRUE,
       var_names = c("A", "B", "D", "New Var"),
       format_number = TRUE)
```

The table can be exported directly to a folder in the working directory called "Table1". Using `export`, we provide a string that will be the name of the CSV containing the formatted table.
```{r table1.6, eval=FALSE}
table1(df, a, b, d, ifelse(a > 1, 1, 0),
       splitby = ~factor(c),
       test = TRUE,
       var_names = c("A", "B", "D", "New Var"),
       format_number = TRUE,
       export = "example_table1")
```

The table can also be output as a LaTeX, markdown, or pandoc table (matching all the output types of `knitr::kable`). The example below produces a LaTeX table, not through `kable` but through a built-in function that provides the variable name at the top of the table:

```{r table1.7}
table1(df, a, b, d, "new var" = ifelse(a > 1, 1, 0),
       splitby = ~factor(c),
       test = TRUE,
       output = "latex2")
```

The last thing to show regarding `table1()` is that it can be printed in a simplified and condensed form. Instead of reporting counts and percentages for categorical variables, it reports only percentages, and the table has much less white space.

```{r simple_table1.1}
table1(df, a, b, d, "new var" = ifelse(a > 1, 1, 0),
       splitby = ~factor(c),
       test = TRUE,
       type = c("simple", "condensed"))
```

## Table C

This function creates simple, beautiful correlation tables. The syntax is just like `table1()` in most respects. Below we include all the numeric variables to see their correlations. Since there are missing values in `d`, we use `na.rm = TRUE`.

```{r tableC.1}
tableC(df, a, b, d, na.rm = TRUE)
```

All the adjustments that you can make in `table1()` can be made here as well. For example:

```{r tableC.2}
tableC(df, "A" = a, "B" = b, "D" = d,
       na.rm = TRUE,
       output = "html")
```

## Table F

This function creates simple frequency tables. The syntax is just like `table1()` and `tableC()` in most respects, except that it uses only one variable instead of many.

```{r tableF.1}
tableF(df, a)
```

As with `table1()`, we can use a `splitby` argument (or `group_by()`).
```{r tableF.2}
tableF(df, d, splitby = c)
```

```{r tableF.3}
df %>%
  group_by(c) %>%
  tableF(d)
```

## Table X

Lastly, `tableX()` is a pipe-able two-way version of `table()` with a syntax similar to that of the rest of the `furniture` functions.

```{r tableX.1}
df %>%
  tableX(c, ifelse(d > 500, 1, 0))
```

By default, it provides the total counts for the rows and columns, with flexibility as to what is displayed and where.

## Conclusion

The functions shown here, `washer()`, `table1()`, `tableC()`, `tableF()`, and `tableX()`, add simplicity to cleaning up and understanding your data. Use these pieces of furniture to make your quantitative life a bit easier.
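
For readers who want some intuition for the cleaning step in the Washer section, here is a minimal base R sketch of the same idea: replace a set of listed values with a single value, `NA` by default. The helper `wash_values()` and the small vectors `d` and `grp` are hypothetical and exist only for this illustration. This is not how `furniture::washer()` is implemented, and the sketch assumes plain (non-factor) vectors.

```r
# Hypothetical helper for illustration only; NOT the source of furniture::washer().
# Replaces every element of `x` that matches one of the values in `...` with `value`.
wash_values <- function(x, ..., value = NA) {
  targets <- c(...)            # the values to be washed out
  x[x %in% targets] <- value   # overwrite matching elements
  x
}

# Toy data mirroring the placeholder and extra-level cases from the vignette
d   <- c(12, -99, 857, -99)
grp <- c("control", "Other", "treatment", "None")

d_clean   <- wash_values(d, -99)                               # -99 becomes NA
grp_clean <- wash_values(grp, "Other", "None", value = "control")

print(d_clean)
print(grp_clean)
```

In practice `washer()` itself is the better choice inside `mutate()`; the sketch only makes explicit what "washing" a placeholder or an extra level means.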