12 Plotting Variables in R
Plot Single Discrete / Qualitative Variables
You can plot discrete or qualitative variables using the following techniques:
barplot
pie (though it is not a good charting method)
We are going to use the diamonds dataset from the ggplot2 package for illustration.
Summarize the data: find the frequency of each diamond color
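As a minimal sketch of this step, a one-way frequency table can be built with table() and passed straight to barplot(). The variable name color_counts is an assumption for illustration and is not from the original code.

```r
library(ggplot2)  # provides the diamonds dataset

# Count how many diamonds fall into each color grade (D through J)
color_counts <- table(diamonds$color)  # color_counts is a hypothetical name
print(color_counts)

# Draw a basic barplot of the frequencies
barplot(color_counts)
```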
Order the barplot
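One way to order the bars, sketched here under the assumption that the frequencies come from table(), is to sort the table before plotting. The names color_counts and sorted_counts are hypothetical.

```r
library(ggplot2)  # provides the diamonds dataset

color_counts <- table(diamonds$color)

# sort() orders the table by its counts; decreasing = TRUE puts the
# most frequent color first
sorted_counts <- sort(color_counts, decreasing = TRUE)

barplot(sorted_counts)
```

Dropping decreasing = TRUE sorts in ascending order instead, which often reads better once the bars are drawn horizontally.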
Create a palette of 7 colors from RColorBrewer.
For more information about RColorBrewer, see http://blog.einext.com/r-1/working-with-colors-in-r
library(RColorBrewer)
blues = brewer.pal(7, "Blues")
Apply the color palette to the barplot. The rev function reverses the order of the palette values
Tidy up the graph a little bit
barplot(sort(table(ggplot2::diamonds$color)), # Sorted frequency table (assumed data argument)
col = rev(blues), # Color of the bars, palette reversed
horiz = TRUE, # Draw the bars horizontally
las = 1, # Keep all axis labels horizontal
border = NA, # No borders on bars
main = "Frequencies of Different Colors of Diamond") # Title of the graph
Display Categorical Variable using Pie Chart
(Not recommended; use a bar chart instead. See why below.)
It is hard to judge relative sizes from a pie chart, while a bar chart clearly shows the differences. Below is an excerpt from the help page of the pie function.
Pie charts are a very bad way of displaying information. The eye is good at judging linear measures and bad at judging relative areas. A bar chart or dot chart is a preferable way of displaying this type of data.
Cleveland (1985), page 264: “Data that can be shown by pie charts always can be shown by a dot chart. This means that judgements of position along a common scale can be made instead of the less accurate angle judgements.” This statement is based on the empirical investigations of Cleveland and McGill as well as investigations by perceptual psychologists.
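To see the point for yourself, the following sketch draws a pie chart and a dot chart of the same diamond color frequencies side by side. The side-by-side layout and the name color_counts are assumptions made for illustration.

```r
library(ggplot2)  # provides the diamonds dataset

color_counts <- table(diamonds$color)

# Split the plotting area into one row and two columns
old_par <- par(mfrow = c(1, 2))

# Pie chart: differences between similar slices are hard to judge
pie(color_counts, main = "Pie chart")

# Dot chart: the same counts compared along a common scale
dotchart(as.numeric(color_counts),
         labels = names(color_counts),
         main = "Dot chart",
         xlab = "Frequency")

# Restore the previous plotting layout
par(old_par)
```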
Plot Quantitative or Continuous Variables
You can plot continuous or quantitative variables using the following techniques:
histogram
boxplot
Specify the number of buckets (bins) you want to create across the x axis, which holds the values of the continuous variable
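A minimal sketch of setting the number of buckets with hist() follows. The variable used in the original is not shown, so this example assumes the Temp column of the built-in airquality dataset. Note that a single number passed to breaks is only a suggestion, because R rounds to "pretty" breakpoints; passing a vector of bin edges gives exact control.

```r
# Assumed example data: daily temperatures from the built-in airquality dataset
temp <- airquality$Temp

# breaks = 10 asks for roughly 10 buckets; R may choose a nearby number
h <- hist(temp, breaks = 10,
          main = "Histogram of Temperature", xlab = "Temperature (F)")

# For exactly 10 buckets, supply the 11 bin edges explicitly
edges <- seq(min(temp), max(temp), length.out = 11)
h_exact <- hist(temp, breaks = edges,
                main = "Histogram with Exactly 10 Buckets",
                xlab = "Temperature (F)")
```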
Plot density or relative frequency
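As a sketch, setting freq = FALSE in hist() switches the y axis from counts to densities, so that the bar areas sum to 1. The data (airquality$Temp) is again an assumption, not the original's variable.

```r
# Assumed example data: daily temperatures from the built-in airquality dataset
temp <- airquality$Temp

# freq = FALSE plots densities instead of raw counts
h <- hist(temp, freq = FALSE,
          main = "Density Histogram of Temperature",
          xlab = "Temperature (F)")

# The total area of the bars (density times bin width) is 1
total_area <- sum(h$density * diff(h$breaks))
print(total_area)
```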
Add a normal distribution curve to the histogram
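One possible way to overlay the curve, assuming the same airquality$Temp data as a stand-in for the original variable, is to draw a density histogram and then call curve() with dnorm(), using the sample mean and standard deviation.

```r
# Assumed example data: daily temperatures from the built-in airquality dataset
temp <- airquality$Temp
mu <- mean(temp)     # sample mean
sigma <- sd(temp)    # sample standard deviation

# The histogram must be on the density scale to match the curve
hist(temp, freq = FALSE,
     main = "Temperature with Normal Curve", xlab = "Temperature (F)")

# Inside curve(), x is the plotting variable, not a data vector
curve(dnorm(x, mean = mu, sd = sigma),
      add = TRUE, col = "blue", lwd = 2)
```

The data vector is deliberately not named x, because curve() evaluates its expression over its own x grid.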
A boxplot is useful for spotting outliers and judging the symmetry of a distribution. For this illustration, let's use the iris dataset that comes with R.
Take a subset of the iris dataset
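The exact subset used in the original is not shown. As one hedged possibility, the sketch below keeps only the four numeric measurement columns and draws a boxplot for each. The name measurements is hypothetical.

```r
# Keep only the four numeric columns (drop the Species factor)
measurements <- iris[, 1:4]

# One box per column; points beyond the whiskers are drawn as outliers
boxplot(measurements,
        main = "Iris Measurements",
        ylab = "Centimeters")

# A boxplot can also compare one variable across groups
boxplot(Sepal.Length ~ Species, data = iris,
        main = "Sepal Length by Species",
        ylab = "Centimeters")
```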