We went through the first chapter of the book, which is an introduction to data science and its core principles.

Questions for review (answers in the book)

  1. Name the six packages that form the core of the tidyverse.

  2. How large (in GB) are the datasets that count as big data?

  3. Define visualisation and modelling in the context of data science. What are their respective strengths and weaknesses?

  4. Give three examples of non-rectangular data. Do you think these are categorised as structured or unstructured data?

Questions to ponder (answers in the brain)

  1. What is data?

  2. What is data science? How is it different from statistics?

  3. Why do you want to join the book club? What do you aim to achieve from it?

Interesting discussions

Logistics of the meeting (to be decided as we go)

  1. Aim to have rotating chairs, so everyone has a say.

  2. How long should each meeting be? We need to account for the busy schedules of people who work in the lab.

  3. Attendees are encouraged to post topics of interest on Slack at least 2 days before a meeting, so the chair can review them beforehand.

  4. Should we have a separate Slack channel for exercises?

  5. Aim to pace the sessions so that no one falls behind. Please shout (or type) if you need more time for a chapter.