We read through the first half of the third chapter about data visualisation using ggplot2.

Questions for review (answers in the book)

  1. What is an aesthetic? Name three examples of aesthetics in ggplot2. (section 3.3)

  2. What are facets?
    • Distinguish between facet_wrap() and facet_grid(). (section 3.5)
  3. What is a geom? (section 3.6)
    • What’s the difference between global and local mapping?
  4. What is a stat? (section 3.7)
    • Differentiate between the stats "identity" and "count" in the context of geom_bar().
    • Which is the default?
  5. What is a position adjustment? (section 3.8)
    • Describe what the position adjustments "stack", "identity", "dodge" and "fill" do in the context of geom_bar().
    • Which is the default?
    • How does position = "jitter" help with overplotting in scatterplots?
  6. What is the default coordinate system in ggplot2?
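
For anyone who wants to try the review questions at the console, here is a minimal sketch using the diamonds and mpg data sets that ship with ggplot2. The object names (base, p_stack and so on) and the choice of variables are assumptions made up for this example; the sketch only sets up the plots to compare and does not replace the answers in the book.

```r
library(ggplot2)

# Facets: facet_wrap() wraps one variable into panels,
# facet_grid() crosses two variables as rows ~ columns
scatter <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
p_wrap <- scatter + facet_wrap(~ class, nrow = 2)
p_grid <- scatter + facet_grid(drv ~ cyl)

# Position adjustments for geom_bar(), with bars split by clarity
base <- ggplot(data = diamonds, mapping = aes(x = cut, fill = clarity))
p_stack    <- base + geom_bar(position = "stack")     # segments on top of each other
p_dodge    <- base + geom_bar(position = "dodge")     # segments side by side
p_fill     <- base + geom_bar(position = "fill")      # stacked, every bar rescaled to height 1
p_identity <- base + geom_bar(position = "identity", alpha = 1 / 5)  # segments overlap

# Jitter adds a little random noise to each point
p_jitter <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), position = "jitter")
```

Typing the name of any of these objects at the console draws the plot, so the variants can be compared one at a time.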

Questions to ponder (answers in the brain)

  1. Distinguish between discrete and continuous data. For each of these aesthetics, discuss whether it suits discrete data, continuous data, both or neither.
    • size
    • shape
    • colour
    • linetype
    • alpha
    • fill

    Consider: discrete data can be categorical or non-categorical.

  2. With respect to geom_point objects, what is the maximum number of dimensions of our data that could be represented simultaneously within a single ggplot object?
    • Consider that we have two axes, various aesthetics and faceting at our disposal.
  3. Human eyes are positioned horizontally, giving us a wider stereoscopic vision.
    • Is that why we prefer to have more columns than rows when arranging subplots in a plot (related to question 6 in section 3.5.1)?
    • Perhaps that’s why computer screens were designed to be longer horizontally than vertically? Why did modern smartphones reverse the trend?
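
As a starting point for question 2 above, here is a rough sketch that packs eight variables of the mpg data set into one geom_point plot. The particular pairing of variables and aesthetics is an arbitrary assumption for illustration, not a recommendation, and whether the result is still readable is a separate matter.

```r
library(ggplot2)

# Eight variables in one plot: two axes, four aesthetics, two facet variables
p_many <- ggplot(data = mpg) +
  geom_point(mapping = aes(
    x = displ,     # axis 1
    y = hwy,       # axis 2
    colour = cty,  # numeric column, so it gets a continuous colour scale
    size = cyl,    # stored as a number, so ggplot2 treats it as continuous
    shape = drv,   # character column with three values, so it is discrete
    alpha = year   # numeric column
  )) +
  facet_grid(fl ~ class)  # facet rows ~ facet columns
```

Swapping variables between aesthetics here is a quick way to test the answers to question 1 as well.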

Interesting discussions

Random code chunks from the book and meeting

Here is a template for any type of plot with ggplot2. The template emphasises the layered grammar of graphics; the “gg” in ggplot2 stands for “grammar of graphics”. Confession: I only realised this after half a year of using ggplot2.

ggplot(data = <DATA>) + 
  <GEOM_FUNCTION>(
     mapping = aes(<MAPPINGS>),
     stat = <STAT>, 
     position = <POSITION>
  ) +
  <COORDINATE_FUNCTION> +
  <FACET_FUNCTION>
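
To make the template concrete, here is one possible way of filling in every slot, using the diamonds data set that ships with ggplot2. The choice of variables, position adjustment, coordinate function and facet is an assumption for illustration only; the object name p is made up.

```r
library(ggplot2)

p <- ggplot(data = diamonds) +              # <DATA>
  geom_bar(                                 # <GEOM_FUNCTION>
    mapping = aes(x = cut, fill = clarity), # <MAPPINGS>
    stat = "count",                         # <STAT>
    position = "fill"                       # <POSITION>
  ) +
  coord_flip() +                            # <COORDINATE_FUNCTION>
  facet_wrap(~ color)                       # <FACET_FUNCTION>
```

Typing p at the console draws the plot. In everyday use most of these slots are left out, because ggplot2 supplies defaults for the stat, the position, the coordinate system and the faceting.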

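Every geom has a default stat, and every stat has a default geom. The following two code chunks generate the same plot, because geom_bar() uses stat_count() by default and stat_count() uses geom_bar() by default. If desired, the default statistical transformation of a geom can be overridden through its stat argument.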

ggplot(data = diamonds) + 
    geom_bar(mapping = aes(x = cut))

ggplot(data = diamonds) + 
    stat_count(mapping = aes(x = cut))
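
One way to check the equivalence of the two chunks above without comparing pictures by eye is layer_data(), which returns the data frame a layer draws after its stat has run. The sketch below also overrides the default stat with "identity". The object names and the pre-counted table cut_counts are made up for this example.

```r
library(ggplot2)

p_geom <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut))
p_stat <- ggplot(data = diamonds) +
  stat_count(mapping = aes(x = cut))

# Both layers should end up with the same computed bar heights
bars_geom <- layer_data(p_geom)
bars_stat <- layer_data(p_stat)
same_counts <- isTRUE(all.equal(bars_geom$count, bars_stat$count))

# Overriding the default stat: the heights are already in the data,
# so stat = "identity" tells geom_bar() to use the y values as they are
cut_counts <- as.data.frame(table(cut = diamonds$cut), responseName = "n")
p_identity <- ggplot(data = cut_counts) +
  geom_bar(mapping = aes(x = cut, y = n), stat = "identity")
```

If same_counts is TRUE, the two chunks computed identical bars. p_identity should then draw the same bar heights again, this time from counts computed outside ggplot2.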

dplyr::filter()
plyr::filter()