I create learning content for the web: Online courses, data visualizations, and tutorials.
November 23rd, 2018
For many years I have been writing down when and how much I work in a spreadsheet. I have already evaluated this data in many ways, but never made a heatmap out of it. So, time to understand heatmaps in R.
The structure of my spreadsheet looks like this:
Observations: 9,191
Variables: 14
$ Arbeit <chr> "Coursera_Internet", "Introduction to Statistics", "Diagnostik", "Hiwi", "Introduction to Statistics", "Makroproje...
$ Typ <chr> "MOOC", "MOOC", "Uni", "Job", "MOOC", "Uni", "Uni", "MOOC", "MOOC", "MOOC", "Uni", "MOOC", "Uni", "MOOC", "MOOC", ...
$ Datum <chr> "2013-10-30", "2013-10-30", "2013-10-31", "2013-10-31", "2013-10-31", "2013-10-31", "2013-10-31", "2013-11-1", "20...
$ Anfang <time> 21:51:00, 22:16:00, 11:39:00, 12:11:00, 01:08:00, 11:55:00, 11:17:00, 20:30:00, 22:48:00, 23:28:00, 14:23:00, 00:...
$ Ende <time> 22:15:00, 23:28:00, 11:55:00, 12:14:00, 02:22:00, 12:11:00, 11:33:00, 22:33:00, 23:25:00, 23:59:00, 16:39:00, 00:...
$ Dauer <time> 00:24:00, 01:12:00, 00:16:00, 00:03:00, 01:14:00, 00:16:00, 00:16:00, 02:03:00, 00:37:00, 00:31:00, 02:16:00, 00:...
$ Minuten <int> 24, 72, 16, 3, 74, 16, 16, 123, 37, 31, 136, 35, 122, 98, 52, 134, 120, 60, 37, 105, 25, 67, 70, 231, 90, 65, 20, ...
I want to visualize the amount of work per month over the past couple of years. I started writing down my work since the end of 2013. What I would need then is a summarised data of the amount of work in hours grouped by the years and months. But first, we have to do some data cleaning:
arbeit_cleaned <- arbeit %>%
select(Arbeit, Typ, Datum, Anfang, Ende,
Dauer, Minuten) %>%
unite(
beginning_datetime, Datum, Anfang, remove = FALSE, sep = " "
) %>%
unite(
end_datetime, Datum, Ende, sep = " "
) %>%
select(-Anfang) %>%
mutate(
beginning_datetime = ymd_hms(beginning_datetime),
end_datetime = ymd_hms(end_datetime)
) %>%
mutate(
month = month(beginning_datetime, label = TRUE),
year = year(beginning_datetime),
wday = wday(beginning_datetime, label = TRUE),
week = week(beginning_datetime)
)
I do not really have a datetime object in the dataset. To extract the year and month more easily I use lubridate, just because it is so easy.
Then, I’ll have to group the data by year and month:
work_per_month <- arbeit_cleaned %>%
group_by(year, month) %>%
summarise(sum = sum(Minuten, na.rm = TRUE) / 60) %>%
na.omit()
Now, I can visualize my work as a heatmap:
ggplot(work_per_month, aes(x = year, y = month)) +
geom_tile(aes(fill = sum), color = "white") +
scale_fill_gradient(low = "#d8e1cf", high = "#438484") +
labs(title = "Heatplot of the amount of work per month",
x = "Year", y = "Month") +
theme_minimal() +
labs(fill = "Hours of work")
Let’s see how this pattern looks like if I compare my regular job to all other activites I do.
work_per_month <- arbeit_cleaned %>%
mutate(
Typ = fct_lump(Typ, 1)
) %>%
group_by(year, month, Typ) %>%
summarise(sum = sum(Minuten, na.rm = TRUE) / 60) %>%
na.omit()
ggplot(work_per_month, aes(x = year, y = month)) +
geom_tile(aes(fill = sum), color = "white") +
scale_fill_gradient(low = "#d8e1cf", high = "#438484") +
labs(title = "Heatplot of the amount of work per month",
x = "Year", y = "Month") +
theme_minimal() +
labs(fill = "Hours of work") +
facet_wrap(~ Typ)
If I compare the work I did for my job and the stuff I do in my freetime, it is apparent how little time is left aside from my regular job. Before I started working I had plenty of time for my own education. During the past years these hours of personal education went down to about 40 hours per month.
I am also interested to see how much I work during the weekdays. I usually do not work over the weekend, but let’s have a look at the data:
work_per_week <- arbeit_cleaned %>%
filter(Typ == "WissMit") %>%
group_by(week, wday, year) %>%
summarise(sum = sum(Minuten, na.rm = TRUE) / 60) %>%
na.omit() %>%
ungroup() %>%
mutate(
wday = fct_relevel(wday, c("So", "Sa", "Fr", "Do", "Mi", "Di", "Mo"))
)
ggplot(work_per_week, aes(x = week, y = wday)) +
geom_tile(aes(fill = sum), color = "white") +
scale_fill_gradient(low = "#d8e1cf", high = "#438484") +
labs(title = "Heatplot of the amount of work per wday",
x = "Week", y = "Wday") +
theme_minimal() +
labs(fill = "Hours of work") +
facet_wrap(~ year)
Indeed, weekends are sacret to me.