For many years I have been writing down when and how much I work in a spreadsheet. I have already evaluated this data in many ways, but never made a heatmap out of it. So, time to understand heatmaps in R.

The structure of my spreadsheet looks like this:

Observations: 9,191
Variables: 14
$ Arbeit    <chr> "Coursera_Internet", "Introduction to Statistics", "Diagnostik", "Hiwi", "Introduction to Statistics", "Makroproje...
$ Typ       <chr> "MOOC", "MOOC", "Uni", "Job", "MOOC", "Uni", "Uni", "MOOC", "MOOC", "MOOC", "Uni", "MOOC", "Uni", "MOOC", "MOOC", ...
$ Datum     <chr> "2013-10-30", "2013-10-30", "2013-10-31", "2013-10-31", "2013-10-31", "2013-10-31", "2013-10-31", "2013-11-1", "20...
$ Anfang    <time> 21:51:00, 22:16:00, 11:39:00, 12:11:00, 01:08:00, 11:55:00, 11:17:00, 20:30:00, 22:48:00, 23:28:00, 14:23:00, 00:...
$ Ende      <time> 22:15:00, 23:28:00, 11:55:00, 12:14:00, 02:22:00, 12:11:00, 11:33:00, 22:33:00, 23:25:00, 23:59:00, 16:39:00, 00:...
$ Dauer     <time> 00:24:00, 01:12:00, 00:16:00, 00:03:00, 01:14:00, 00:16:00, 00:16:00, 02:03:00, 00:37:00, 00:31:00, 02:16:00, 00:...
$ Minuten   <int> 24, 72, 16, 3, 74, 16, 16, 123, 37, 31, 136, 35, 122, 98, 52, 134, 120, 60, 37, 105, 25, 67, 70, 231, 90, 65, 20, ...
  • Arbeit: Refers to what I do, for example, I might work on a webpage or I work for my PhD.
  • Typ: This is the generic term for what I do. For example, websites or working as a research associate.
  • Datum: The date on which I perform an activity.
  • Anfang: A to-the-minute record of when I start a task.
  • Ende: A to-the-minute record of when I end a task.
  • Dauer: The duration of the task.
  • Minuten: How many minutes the task took.

I want to visualize the amount of work per month over the past couple of years. I started writing down my work since the end of 2013. What I would need then is a summarised data of the amount of work in hours grouped by the years and months. But first, we have to do some data cleaning:

arbeit_cleaned <- arbeit %>%
  select(Arbeit, Typ, Datum, Anfang, Ende,
         Dauer, Minuten) %>%
  unite(
    beginning_datetime, Datum, Anfang, remove = FALSE, sep = " "
  ) %>%
  unite(
    end_datetime, Datum, Ende, sep = " "
  ) %>%
  select(-Anfang) %>%
  mutate(
    beginning_datetime = ymd_hms(beginning_datetime),
    end_datetime = ymd_hms(end_datetime)
  ) %>%
  mutate(
    month = month(beginning_datetime, label = TRUE),
    year = year(beginning_datetime),
    wday = wday(beginning_datetime, label = TRUE),
    week = week(beginning_datetime)
  )

I do not really have a datetime object in the dataset. To extract the year and month more easily I use lubridate, just because it is so easy.

Then, I’ll have to group the data by year and month:

work_per_month <- arbeit_cleaned %>% 
  group_by(year, month) %>% 
  summarise(sum = sum(Minuten, na.rm = TRUE) / 60) %>%
  na.omit()

Now, I can visualize my work as a heatmap:

ggplot(work_per_month, aes(x = year, y = month)) +
  geom_tile(aes(fill = sum), color = "white") +
  scale_fill_gradient(low = "#d8e1cf", high = "#438484") +
  labs(title = "Heatplot of the amount of work per month",
       x = "Year", y = "Month") +
  theme_minimal() +
  labs(fill = "Hours of work") 

Let’s see how this pattern looks like if I compare my regular job to all other activites I do.

work_per_month <- arbeit_cleaned %>% 
  mutate(
    Typ = fct_lump(Typ, 1)
  ) %>%
  group_by(year, month, Typ) %>% 
  summarise(sum = sum(Minuten, na.rm = TRUE) / 60) %>%
  na.omit()

ggplot(work_per_month, aes(x = year, y = month)) +
  geom_tile(aes(fill = sum), color = "white") +
  scale_fill_gradient(low = "#d8e1cf", high = "#438484") +
  labs(title = "Heatplot of the amount of work per month",
       x = "Year", y = "Month") +
  theme_minimal() +
  labs(fill = "Hours of work") +
  facet_wrap(~ Typ)

If I compare the work I did for my job and the stuff I do in my freetime, it is apparent how little time is left aside from my regular job. Before I started working I had plenty of time for my own education. During the past years these hours of personal education went down to about 40 hours per month.

I am also interested to see how much I work during the weekdays. I usually do not work over the weekend, but let’s have a look at the data:

work_per_week <- arbeit_cleaned %>%
  filter(Typ == "WissMit") %>%
  group_by(week, wday, year) %>%
  summarise(sum = sum(Minuten, na.rm = TRUE) / 60) %>%
  na.omit() %>%
  ungroup() %>%
  mutate(
    wday = fct_relevel(wday, c("So", "Sa", "Fr", "Do", "Mi", "Di", "Mo"))
  )


ggplot(work_per_week, aes(x = week, y = wday)) +
  geom_tile(aes(fill = sum), color = "white") +
  scale_fill_gradient(low = "#d8e1cf", high = "#438484") +
  labs(title = "Heatplot of the amount of work per wday",
       x = "Week", y = "Wday") +
  theme_minimal() +
  labs(fill = "Hours of work") +
  facet_wrap(~ year)

Indeed, weekends are sacret to me.