# 13 Association Plots in R

## Bivariate Analysis - Single Numeric Outcome against a Categorical Independent Variable

For illustration, we will use the ggplot2::diamonds dataset.

``> diamonds = ggplot2::diamonds``
``> price.means = aggregate(price ~ cut, diamonds, mean)``
``> head(price.means)``
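
As a rough cross-check of what `aggregate()` does with a formula, the sketch below computes group means and compares them with `tapply()`. It uses the built-in mtcars data instead of diamonds (an assumption made only so the example runs without ggplot2), and the names `mpg.means` and `mpg.by.cyl` are illustrative.

```r
# Group means with a formula: one row per level of the grouping variable.
# mtcars stands in for diamonds here so that no extra package is needed.
mpg.means = aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
print(mpg.means)

# The same means computed with tapply(), returned as a named vector
mpg.by.cyl = tapply(mtcars$mpg, mtcars$cyl, mean)

# Both approaches should agree
stopifnot(isTRUE(all.equal(mpg.means$mpg, as.numeric(mpg.by.cyl))))
```

The formula interface returns a data frame, which is convenient to sort and pass to `barplot()`.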

Plot the mean price for each cut

``> barplot(price.means$price, names.arg = price.means$cut)``

Let's make the visual better:

• It is a good idea to sort the data before plotting
• Add title
• Add x and y labels
• Add a color to the bars

``> price.means = price.means[order(price.means$price, decreasing = TRUE),]``
``> barplot(price.means$price, ``
``    names.arg = price.means$cut, ``
``    col = "Steelblue", ``
``    xlab = "Cut", ``
``    ylab = "Avg Price", ``
``    main = "Association plot between Cut and Average Price",``
``    border = "white")``

## Distribution of Outcome (e.g. Price) against Common Categorical Variable

We are going to use a grouped boxplot. For illustration, we will use the built-in mtcars dataset.

``> data(mtcars)``
``> boxplot(mpg ~ cyl, data = mtcars)``

Let's modify this plot:

1. Add x and y labels
2. Add Title
3. Add color to each group to differentiate

``> library(RColorBrewer) # provides brewer.pal()``
``> boxplot(mpg ~ cyl, ``
``        data = mtcars, ``
``        col = brewer.pal(3, "Paired"),``
``        xlab = "Number of Cylinders",``
``        ylab = "Mileage",``
``        main = "Mileage by Number of Cylinders \n mtcars dataset",``
``        outpch = 16,``
``        outcol = brewer.pal(3, "Paired"),``
``        staplelty = 0,``
``        whisklty = 1)``

For more practice, use the MASS::painters dataset and plot Expression ~ School.
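
One possible starting point for this exercise is sketched below. It assumes the MASS package is available (it ships with standard R installations). The color, labels, and title are illustrative choices, not part of the original exercise.

```r
# painters has one row per painter, with a School factor and four scores
painters = MASS::painters

# Grouped boxplot of Expression score by School.
# boxplot() invisibly returns the summary statistics, kept here as bp.
bp = boxplot(Expression ~ School,
             data = painters,
             col = "Steelblue",
             xlab = "School",
             ylab = "Expression Score",
             main = "Expression by School \n painters dataset")

# One box per school
print(bp$names)
```

From here, the same styling arguments used for the mtcars plot (`outpch`, `staplelty`, `whisklty`) can be added one at a time.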

## Scatter Plot

Load the RCurl package for reading data from the web.

``> require(RCurl)``

Load Pearson's Height dataset

``> url = "http://www.math.uah.edu/stat/data/Pearson.csv"``
``> pearson = read.csv(url) ``

Plot the data

``> plot(Son ~ Father, pearson)``

``> plot(Son ~ Father, pearson, ``
``    pch = 16, ``
``    col = "Darkgrey", ``
``    main = "Father's Height vs Son's Height \n Pearson Dataset", ``
``    xlab = "Father's Height", ``
``    ylab = "Son's Height")``

Add a linear fitted line to the plot.

``> abline(lm(Son ~ Father, data = pearson), col = "Blue", lwd = 2)``

Add a locally weighted scatterplot smoothing (lowess) line.

``> lines(lowess(pearson$Father, pearson$Son), col = "Darkred", lwd = 2) # lowess(x, y): Father on x, Son on y``

A more advanced scatterplot is available in the car package.

``> require(car) #Companion to Applied Regression``

``> scatterplot(Son ~ Father, pearson, ``
``    pch = 16, ``
``    col = "Darkgrey", ``
``    main = "Father's Height vs Son's Height \n Pearson Dataset", ``
``    xlab = "Father's Height", ``
``    ylab = "Son's Height")``

For more practice, draw a scatter plot for the built-in cars dataset.
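
A minimal sketch of that exercise, using only base R, might look like the following. The title and axis labels are illustrative choices. `fit` and `smooth` are names introduced here for the example.

```r
# cars is a built-in dataset: speed (mph) and stopping distance (ft)
plot(dist ~ speed, data = cars, pch = 16, col = "Darkgrey",
     main = "Stopping Distance vs Speed \n cars dataset",
     xlab = "Speed (mph)", ylab = "Stopping Distance (ft)")

# Linear fitted line
fit = lm(dist ~ speed, data = cars)
abline(fit, col = "Blue", lwd = 2)

# lowess() takes the x variable first, then y
smooth = lowess(cars$speed, cars$dist)
lines(smooth, col = "Darkred", lwd = 2)
```

Comparing the straight line with the lowess curve gives a quick visual check of whether a linear fit is reasonable.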

Among the numeric columns, you can view all pairwise correlations together:

``require(GGally)``
``data(diamonds, package = "ggplot2")``
``ggcorr(diamonds)``

Observe correlations across the entire dataset:

``require(psych)``
``data("iris")``
``pairs.panels(iris)`` 
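
The correlations that `pairs.panels()` displays can also be inspected numerically. The sketch below is a base-R approximation, not a replacement for the psych plot, and the names `iris.num` and `iris.cor` are illustrative.

```r
# Keep only the numeric columns; Species is a factor and is excluded
iris.num = iris[sapply(iris, is.numeric)]

# Pearson correlation matrix of the four measurements
iris.cor = cor(iris.num)
print(round(iris.cor, 2))

# Base-R scatterplot matrix of the same columns, coloured by species
pairs(iris.num, col = as.integer(iris$Species), pch = 16)
```

Reading the matrix alongside the plot makes it easier to spot which pairs of measurements move together.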