How to create violin plots using SAS9API and R

  • Date August 30, 2019
  • Written by Vasilij Nevlev
  • Category R

This example shows how to create a violin plot for a SAS dataset using SAS9API.

Prerequisites

Before you start, you’ll need the following:

  • Access to a SAS9API proxy,
  • R and RStudio installed.

Step 1 – Getting the libraries needed

We will create our violin plot using the ggplot2 package, with colours from RColorBrewer. We also need the rsas9api package to send requests to SAS9API, and the devtools package to install rsas9api from GitHub.

The devtools, ggplot2 and RColorBrewer packages are available on CRAN, so if you don’t already have them installed, run the following code:

install.packages("ggplot2")
install.packages("RColorBrewer")
install.packages("devtools")
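
If some of these packages may already be present, a small check can avoid reinstalling them. The following is a minimal sketch, not part of the original guide; the helper name missing_packages is hypothetical, and it relies only on base R's requireNamespace.

```r
# Return the subset of `pkgs` that is not installed yet.
# `missing_packages` is a hypothetical helper name used for illustration.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("ggplot2", "RColorBrewer", "devtools"))
print(to_install)

# Uncomment to install only what is missing:
# if (length(to_install) > 0) install.packages(to_install)
```

The install call is left commented out so that running the sketch only reports what is missing.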

To get rsas9api from GitHub, run:

devtools::install_github("analytium/rsas9api")

Now let’s load our packages:

library(rsas9api)
library(ggplot2)
library(RColorBrewer)

The packages are loaded, so we can move on to the next step.

Step 2 – Defining connection properties for SAS9API proxy

To send requests to SAS9API endpoints, you need to define:

  • the URL of the SAS9API proxy,
  • the SAS workspace server name.

You can do that by replacing your_url and your_server in the following code and then running it.

sas9api_url <- "your_url" 
sas_workspace_server_name <- "your_server"
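
If you would rather not hard-code these values in a script, one option is to read them from environment variables. This is only a sketch: the variable names SAS9API_URL and SAS9API_SERVER are assumptions made for illustration, not names that SAS9API or rsas9api define.

```r
# Read connection properties from environment variables, falling back to the
# placeholders used above. SAS9API_URL and SAS9API_SERVER are hypothetical names.
sas9api_url <- Sys.getenv("SAS9API_URL", unset = "your_url")
sas_workspace_server_name <- Sys.getenv("SAS9API_SERVER", unset = "your_server")
```

The variables could be set in your shell or in an .Renviron file, which keeps environment-specific values out of the script itself.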

After specifying connection properties we can send requests to SAS9API.

Step 3 – Getting SAS dataset data

We will use the retrieve_data function from the rsas9api package, which fetches data from a SAS dataset so that we can store it in an R object.
To send a request with retrieve_data, you need to define:

  • library name of the dataset (“SASHELP” in our case),
  • dataset name (“CARS” in our case),
  • limit: the number of records to get from the dataset (we will use the maximum value of 10000),
  • offset: the number of records to skip from the beginning of the dataset (we will leave it at 0),
  • asDataFrame flag (TRUE in our case, as we want the request to return a data frame).

Let’s run the code:

data_cars <- retrieve_data(url = sas9api_url, 
                           serverName = sas_workspace_server_name, 
                           libraryName = "SASHELP", 
                           datasetName = "CARS", 
                           limit = 10000, offset = 0, 
                           asDataFrame = TRUE)
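
A dataset with more records than the limit would need several requests, each with a larger offset. The sketch below shows one way the limit and offset parameters could be combined into a paging loop. It is not part of rsas9api: fetch_all_pages and fake_fetch are hypothetical helpers, and the built-in mtcars data stands in for a remote dataset so that the example runs without a SAS server.

```r
# Collect all rows by requesting consecutive pages until a short page arrives.
# `fetch_page` is a hypothetical function(limit, offset) returning a data frame.
fetch_all_pages <- function(fetch_page, page_size = 10000) {
  pages <- list()
  offset <- 0
  repeat {
    page <- fetch_page(limit = page_size, offset = offset)
    if (nrow(page) == 0) break                 # nothing left to read
    pages[[length(pages) + 1]] <- page
    if (nrow(page) < page_size) break          # last, partially filled page
    offset <- offset + page_size
  }
  do.call(rbind, pages)
}

# Local stand-in for a remote dataset: returns `limit` rows after skipping `offset`.
fake_fetch <- function(limit, offset) {
  rows <- seq_len(nrow(mtcars))
  rows <- rows[rows > offset & rows <= offset + limit]
  mtcars[rows, , drop = FALSE]
}

all_rows <- fetch_all_pages(fake_fetch, page_size = 10)
print(nrow(all_rows) == nrow(mtcars))
```

With rsas9api, fetch_page could be a small wrapper that passes limit and offset through to retrieve_data together with the other arguments shown above. Whether the endpoint returns a short or empty page at the end of a dataset is an assumption worth checking against your own proxy.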

Let’s have a look at the top rows of the data frame we received:

head(data_cars)
##   Cylinders DriveTrain EngineSize Horsepower Invoice Length  Make                   Model MPG_City MPG_Highway  MSRP Origin  Type Weight Wheelbase
## 1         6        All        3.5        265   33337    189 Acura                     MDX       17          23 36945   Asia   SUV   4451       106
## 2         4      Front        2.0        200   21761    172 Acura          RSX Type S 2dr       24          31 23820   Asia Sedan   2778       101
## 3         4      Front        2.4        200   24647    183 Acura                 TSX 4dr       22          29 26990   Asia Sedan   3230       105
## 4         6      Front        3.2        270   30299    186 Acura                  TL 4dr       20          28 33195   Asia Sedan   3575       108
## 5         6      Front        3.5        225   39014    197 Acura              3.5 RL 4dr       18          24 43755   Asia Sedan   3880       115
## 6         6      Front        3.5        225   41100    197 Acura 3.5 RL w/Navigation 4dr       18          24 46100   Asia Sedan   3893       115
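
Before plotting, it can help to confirm that the two columns the plot relies on are present and usable. The following sketch uses a tiny stand-in data frame built from the first three rows shown above; check_plot_columns is a hypothetical helper, not an rsas9api function.

```r
# Stand-in with the two plotting columns, copied from the first rows of the output.
sample_cars <- data.frame(
  Type = c("SUV", "Sedan", "Sedan"),
  MPG_City = c(17, 24, 22)
)

# Stop with an error if a column is missing or MPG_City is not numeric,
# then return the number of records per vehicle type.
check_plot_columns <- function(df) {
  stopifnot(all(c("Type", "MPG_City") %in% names(df)))
  stopifnot(is.numeric(df$MPG_City))
  table(df$Type)
}

type_counts <- check_plot_columns(sample_cars)
print(type_counts)
```

Calling check_plot_columns(data_cars) on the retrieved data would show how many records each violin is drawn from.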

Now that we have the data we can start creating the plot.

Step 4 – Creating violin plot

For our plot we will put the type of vehicle on the x axis and city mileage on the y axis. To create a violin plot in ggplot2 we will use the geom_violin geometry. We will set trim = FALSE so that the tails of each violin extend beyond the range of the data instead of being cut off at the minimum and maximum (you can try trim = TRUE as well). To fill our violins with colour we will use the scale_fill_brewer function with palette = "Pastel1".
Let’s run the code and see our plot!

ggplot(data_cars, 
    aes(x = Type, y = MPG_City, fill = Type)) +
    geom_violin(trim = FALSE, lwd = 0.75) +
    scale_fill_brewer(palette = "Pastel1") +
    labs(title = "City Mileage per Type of Vehicle",
         x = "Type of Vehicle",
         y = "City Mileage") +
    theme_bw() +
    theme(legend.position = "none")
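
A common variation is to draw a narrow box plot inside each violin so that the median and quartiles are visible alongside the density shape. The sketch below is an illustration rather than part of the original example: it uses the mpg dataset that ships with ggplot2 as a stand-in for the SAS data, and the object name violin_with_box is arbitrary.

```r
library(ggplot2)

# ggplot2's built-in `mpg` data stands in for the SAS data here:
# `class` plays the role of Type and `cty` the role of MPG_City.
violin_with_box <- ggplot(mpg, aes(x = class, y = cty, fill = class)) +
  geom_violin(trim = FALSE) +
  # Narrow white box plot on top of each violin; outliers hidden to reduce clutter.
  geom_boxplot(width = 0.1, fill = "white", outlier.shape = NA) +
  scale_fill_brewer(palette = "Pastel1") +
  labs(x = "Class of vehicle", y = "City mileage") +
  theme_bw() +
  theme(legend.position = "none")
```

Running print(violin_with_box) draws the plot. The same two geometries could be applied to data_cars by swapping in Type and MPG_City.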

Conclusion

In this article, we showed how to create a violin plot for SAS data using SAS9API.
We used the retrieve_data function from the rsas9api package to get data from a SAS dataset as a data frame. The SAS9API proxy allows you to send different requests to a SAS server, including getting and posting data. With the help of the rsas9api package, you can send requests to all SAS9API endpoints from R.

If you want to know how to create a similar violin plot using Python, check this article.

To see other examples for SAS9API and R, go to the Examples > R page.

Feel free to contact us if you have any questions or comments!