We read through the second half of the third chapter, and the fourth. I chaired the session. Here are the topics we discussed, along with my code for reference.

So many categories, so few colours

In Section 3.3, ggplot warns against mapping a discrete, categorical variable (in this case, the class of vehicle) to the size of points in a scatterplot.

ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy, size = class))
## Warning: Using size for a discrete variable is not advised.

If told to use shape to represent the third variable, ggplot uses at most six shapes by default. In this case, it left the seventh class (SUVs) unplotted.

ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy, shape = class))
## Warning: The shape palette can deal with a maximum of 6 discrete values
## because more than 6 becomes difficult to discriminate; you have 7.
## Consider specifying shapes manually if you must have them.
## Warning: Removed 62 rows containing missing values (geom_point).
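
The warning itself suggests specifying shapes manually. A minimal sketch of that, assuming the seven shape codes 0 to 6 are acceptable (the choice of codes is mine, not from the book):

```r
library(ggplot2)

# One shape code per level of class; the codes 0:6 are an arbitrary choice.
shape_codes <- 0:6

p <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, shape = class)) +
  scale_shape_manual(values = shape_codes)

p
```

With one shape supplied per class, all seven classes, SUVs included, should be plotted. Whether seven shapes are easy to tell apart is another matter.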

A similar warning does not appear if one uses colour, which surprises me. In my eyes, the plot below with different colours for the seven categories looks confusing. How many colours can the human eye comfortably distinguish? Not that many, in my opinion.

ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy, color = class))
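
If one does stay with colour, a different qualitative palette might help a little. The sketch below swaps in the ColorBrewer "Dark2" palette, which offers up to eight colours; the palette is my own pick for illustration, and I make no claim that it solves the problem.

```r
library(ggplot2)

# Count the categories first: a qualitative palette needs at least this many colours.
n_classes <- length(unique(mpg$class))

p <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class)) +
  scale_colour_brewer(palette = "Dark2")  # "Dark2" is an assumption, not from the book

p
```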

Sometimes, one should consider using faceting instead of putting everything in a single plot. Note that I used show.legend = FALSE to hide the legend, which would be redundant because each panel already has its own header.

ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy, color = class), show.legend = FALSE) +
  facet_wrap(~class)
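
One drawback of faceting is that it becomes harder to compare a class with the rest of the data. A possible variation, sketched here under my own assumptions (the helper name mpg_all and the grey shade are hypothetical choices), draws all points in grey behind each panel. It relies on ggplot2 repeating a layer in every panel when that layer's data lacks the faceting variable.

```r
library(ggplot2)

# Copy of mpg without the faceting variable, so these points repeat in every panel.
mpg_all <- mpg[, setdiff(names(mpg), "class")]

p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(data = mpg_all, colour = "grey85") +                  # background: all cars
  geom_point(mapping = aes(color = class), show.legend = FALSE) +  # foreground: this class
  facet_wrap(~class)

p
```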