Introduction

This homework assignment is designed to test and improve your skills in dataframe manipulation and data visualization using R. You will work with various datasets and apply different techniques to analyze and visualize data. The exercises increase in difficulty, challenging you to apply and combine different concepts.

Before you begin, make sure you have the following packages installed and loaded:

# Install packages if not already installed
# install.packages(c("tidyverse", "ggplot2", "dplyr", "tidyr", "lubridate", "scales"))

# Load required packages
library(tidyverse)
library(ggplot2)
library(dplyr)
library(tidyr)
library(lubridate)
library(scales)

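Submission Instructions

  - Create an R Markdown file (.Rmd).
  - Submit both the .Rmd file and the compiled .html file.
  - Make sure all your code runs without errors and all plots are visible in the compiled document.
  - Include comments in your code to explain your thought process and any assumptions you made.

Good luck!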

Exercise 1: Basic Data Exploration

Using the built-in mtcars dataset, perform the following tasks:

  1. Display the first 5 rows of the dataset.
  2. Calculate and print the mean and median MPG for all cars.
  3. Create a scatter plot of mpg vs hp (Miles per Gallon vs Horsepower).
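
The three tasks above could be approached along the following lines. This is only a sketch: the variable names (mpg_mean, mpg_median, p1) are placeholders, and it assumes "mpg vs hp" means mpg on the y-axis.

```r
library(ggplot2)

# First five rows of the built-in mtcars data frame
head(mtcars, 5)

# Mean and median miles per gallon across all cars
mpg_mean <- mean(mtcars$mpg)
mpg_median <- median(mtcars$mpg)
cat("Mean MPG:", mpg_mean, "\nMedian MPG:", mpg_median, "\n")

# Scatter plot with horsepower on the x-axis and mpg on the y-axis
p1 <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  labs(x = "Horsepower", y = "Miles per Gallon")
p1
```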

Exercise 2: Data Filtering and Grouping

Using the iris dataset, complete the following:

  1. Create a new dataframe containing only the flowers of the species “versicolor” and “virginica”.
  2. Calculate the mean sepal length and sepal width for each species in this new dataframe.
  3. Create a bar plot showing the mean sepal length for each species, with error bars representing the standard deviation.
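
One possible way to structure these steps with dplyr and ggplot2 is sketched below. The object and column names (iris_sub, iris_summary, sd_sepal_length) are illustrative choices, and the error bars are assumed to span one standard deviation on either side of the mean.

```r
library(dplyr)
library(ggplot2)

# Keep two species; droplevels() removes the now-empty "setosa" factor level
iris_sub <- iris %>%
  filter(Species %in% c("versicolor", "virginica")) %>%
  droplevels()

# Per-species means, plus the standard deviation needed for the error bars
iris_summary <- iris_sub %>%
  group_by(Species) %>%
  summarise(
    mean_sepal_length = mean(Sepal.Length),
    mean_sepal_width = mean(Sepal.Width),
    sd_sepal_length = sd(Sepal.Length)
  )

# Bars show the mean; error bars extend one standard deviation each way
p2 <- ggplot(iris_summary, aes(x = Species, y = mean_sepal_length)) +
  geom_col() +
  geom_errorbar(
    aes(ymin = mean_sepal_length - sd_sepal_length,
        ymax = mean_sepal_length + sd_sepal_length),
    width = 0.2
  ) +
  labs(y = "Mean sepal length")
p2
```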

Exercise 3: Data Reshaping

Load the tidyr package and use the relig_income dataset:

  1. Reshape the data from wide to long format, creating columns for religion, income_level, and count.
  2. Calculate the total count for each religion and sort in descending order.
  3. Create a horizontal bar plot showing the top 10 religions by total count.
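
A minimal sketch of the reshaping step, assuming pivot_longer() is the intended tool (the exercise does not name one). The names relig_long, relig_totals and total are placeholders.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Every column except religion holds an income bracket; stack them into rows
relig_long <- relig_income %>%
  pivot_longer(cols = -religion, names_to = "income_level", values_to = "count")

# Total respondents per religion, largest first
relig_totals <- relig_long %>%
  group_by(religion) %>%
  summarise(total = sum(count)) %>%
  arrange(desc(total))

# reorder() sorts the bars by total; coord_flip() makes them horizontal
p3 <- relig_totals %>%
  slice_head(n = 10) %>%
  ggplot(aes(x = reorder(religion, total), y = total)) +
  geom_col() +
  coord_flip() +
  labs(x = "Religion", y = "Total count")
p3
```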

Exercise 4: Time Series Analysis

Use the economics dataset from ggplot2:

  1. Convert the date column to a proper date format using lubridate.
  2. Calculate the year-over-year percentage change in the unemployment rate.
  3. Create a line plot showing the unemployment rate over time, with the line color changing based on whether the year-over-year change was positive or negative.
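
The sketch below shows one way these steps might fit together. Note two assumptions: economics has no ready-made rate column, so the rate is taken here as unemploy divided by pop, and because the data are monthly, "year-over-year" is computed against the value 12 rows earlier. The column names unemp_rate, yoy_change and direction are invented for illustration.

```r
library(ggplot2)
library(dplyr)
library(lubridate)

econ <- economics %>%
  mutate(
    # date is already stored as a Date; as_date() just makes that explicit
    date = as_date(date),
    # Assumption: unemployment rate = unemployed / population, in percent
    unemp_rate = unemploy / pop * 100,
    # Monthly data, so the same month one year earlier is 12 rows back
    yoy_change = (unemp_rate / dplyr::lag(unemp_rate, 12) - 1) * 100,
    direction = if_else(yoy_change >= 0, "Increase", "Decrease")
  ) %>%
  # The first 12 months have no prior-year value to compare against
  filter(!is.na(yoy_change))

# group = 1 keeps a single connected line while the color varies along it
p4 <- ggplot(econ, aes(x = date, y = unemp_rate, color = direction, group = 1)) +
  geom_line() +
  labs(x = "Date", y = "Unemployment rate (%)", color = "Year-over-year change")
p4
```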

Exercise 5: Advanced Data Manipulation

Using the diamonds dataset from ggplot2:

  1. Create a new column price_per_carat by dividing price by carat.
  2. For each cut category, calculate the median price_per_carat and the count of diamonds.
  3. Create a scatter plot of cut vs median price_per_carat, with the point size representing the count of diamonds in each category.
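
As a rough illustration, the mutate, group and summarise steps can be chained in a single pipeline. The names cut_summary, median_ppc and n are placeholders rather than required names.

```r
library(ggplot2)
library(dplyr)

cut_summary <- diamonds %>%
  mutate(price_per_carat = price / carat) %>%
  group_by(cut) %>%
  summarise(
    median_ppc = median(price_per_carat),
    n = n()  # number of diamonds in this cut category
  )

# One point per cut; point area scales with the number of diamonds
p5 <- ggplot(cut_summary, aes(x = cut, y = median_ppc, size = n)) +
  geom_point() +
  labs(x = "Cut", y = "Median price per carat", size = "Diamonds")
p5
```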

Exercise 6: Faceted Plots

Continue working with the diamonds dataset:

  1. Create a new column price_category that categorizes diamonds as “Low”, “Medium”, or “High” based on their price (you decide the thresholds).
  2. Create a faceted histogram of carat for each cut, with different color fills based on the price_category.
  3. Adjust the facet layout to display the plots in two columns.
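
A possible starting point is sketched below. The thresholds of 1000 and 5000 and the bin width of 0.1 are arbitrary choices made for this example, since the exercise leaves them up to you.

```r
library(ggplot2)
library(dplyr)

diamonds_cat <- diamonds %>%
  mutate(
    # Thresholds are illustrative assumptions, not prescribed values
    price_category = case_when(
      price < 1000 ~ "Low",
      price < 5000 ~ "Medium",
      TRUE ~ "High"
    ),
    # Fix the level order so the legend reads Low, Medium, High
    price_category = factor(price_category, levels = c("Low", "Medium", "High"))
  )

# One histogram panel per cut, arranged in two columns
p6 <- ggplot(diamonds_cat, aes(x = carat, fill = price_category)) +
  geom_histogram(binwidth = 0.1) +
  facet_wrap(~ cut, ncol = 2) +
  labs(x = "Carat", y = "Count", fill = "Price category")
p6
```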

Exercise 7: Custom Functions and Apply

Create a custom function and apply it to the msleep dataset from ggplot2:

  1. Write a function that calculates the ratio of sleep_total to awake time for an animal.
  2. Use apply() or a dplyr function to calculate this ratio for all animals in the dataset.
  3. Create a scatter plot of body weight vs. brain weight, with point colors representing the sleep ratio, and size representing total sleep time.
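
The following sketch takes the dplyr route rather than apply(). The function name sleep_ratio and the column name sleep_to_awake are hypothetical, and the log-scaled axes are an optional choice made here because body and brain weights span several orders of magnitude.

```r
library(ggplot2)
library(dplyr)

# Ratio of hours asleep to hours awake; vectorised, so it works on whole columns
sleep_ratio <- function(sleep_total, awake) {
  sleep_total / awake
}

msleep_ratio <- msleep %>%
  mutate(sleep_to_awake = sleep_ratio(sleep_total, awake))

# brainwt is missing for some animals, so drop those rows before plotting
p7 <- msleep_ratio %>%
  filter(!is.na(brainwt)) %>%
  ggplot(aes(x = bodywt, y = brainwt, color = sleep_to_awake, size = sleep_total)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10() +
  labs(x = "Body weight (kg)", y = "Brain weight (kg)",
       color = "Sleep / awake", size = "Total sleep (h)")
p7
```

Because the function is vectorised, mutate() can call it directly on the columns; apply() over rows would also work but converts each row to a single type first.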

Exercise 8: Working with Dates and Summarizing Data

Use the txhousing dataset from ggplot2:

  1. Create a new column for the year and another for the month name.
  2. Calculate the average sales and average volume for each year.
  3. Create a dual-axis plot showing the trend of average sales and average volume over the years.
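
One way to sketch this is shown below. txhousing already carries numeric year and month columns, so the example only adds a month name from base R's month.name constant. ggplot2 has no independent second axis, so the sketch rescales volume onto the sales axis and labels the secondary axis with the inverse transformation; scale_factor and the other names are illustrative.

```r
library(ggplot2)
library(dplyr)

tx <- txhousing %>%
  mutate(
    year = as.integer(year),
    month_name = month.name[month]  # 1 -> "January", ..., 12 -> "December"
  )

# Some city-months have missing values, hence na.rm = TRUE
tx_yearly <- tx %>%
  group_by(year) %>%
  summarise(
    avg_sales = mean(sales, na.rm = TRUE),
    avg_volume = mean(volume, na.rm = TRUE)
  )

# Factor that maps the volume range onto the sales range
scale_factor <- max(tx_yearly$avg_volume) / max(tx_yearly$avg_sales)

p8 <- ggplot(tx_yearly, aes(x = year)) +
  geom_line(aes(y = avg_sales, color = "Average sales")) +
  geom_line(aes(y = avg_volume / scale_factor, color = "Average volume")) +
  scale_y_continuous(
    name = "Average sales",
    # The secondary axis undoes the rescaling so its labels are in volume units
    sec.axis = sec_axis(~ . * scale_factor, name = "Average volume")
  ) +
  labs(x = "Year", color = NULL)
p8
```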

Exercise 9: Animated Plots

Install and load the gganimate package. Using the gapminder dataset:

  1. Create an animated scatter plot of GDP per capita vs. life expectancy over time.
  2. Color the points by continent and size them by population.
  3. Add appropriate labels and animations to make the visualization informative and engaging.
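
A minimal sketch of the animation is given below. It assumes the gapminder data frame comes from the gapminder package, which the exercise does not state, and it uses transition_time() as one of several possible gganimate transitions. Rendering is left commented out because it needs a renderer package such as gifski.

```r
library(ggplot2)
library(gganimate)
library(gapminder)  # assumption: source of the gapminder data frame

anim <- ggplot(gapminder,
               aes(x = gdpPercap, y = lifeExp, color = continent, size = pop)) +
  geom_point(alpha = 0.7) +
  scale_x_log10() +
  labs(
    title = "Year: {frame_time}",  # gganimate fills in the current year
    x = "GDP per capita (log scale)",
    y = "Life expectancy",
    color = "Continent",
    size = "Population"
  ) +
  transition_time(year) +  # one animation state per year
  ease_aes("linear")       # move points at constant speed between years

# Uncomment to render the animation:
# animate(anim)
```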