Introduction
This homework assignment is designed to test and improve your skills
in dataframe manipulation and data visualization using R. You will work
with various datasets and apply different techniques to analyze and
visualize data. The exercises increase in difficulty, challenging you to
apply and combine different concepts.
Before you begin, make sure you have the following packages installed
and loaded:
# Install packages if not already installed (uncomment and run once)
# install.packages(c("tidyverse", "ggplot2", "dplyr", "tidyr", "lubridate", "scales"))
# Load required packages
library(tidyverse)  # attaches ggplot2, dplyr, and tidyr
library(lubridate)
library(scales)
Submission Instructions
- Create an R Markdown file (.Rmd).
- Submit both the .Rmd file and the compiled .html file.
- Make sure all your code runs without errors and all plots are
visible in the compiled document.
- Include comments in your code to explain your thought process and
any assumptions you made.
Good luck!
Exercise 1: Basic Data Exploration
Using the built-in mtcars dataset, perform the following tasks:
- Display the first 5 rows of the dataset.
- Calculate and print the mean and median MPG for all cars.
- Create a scatter plot of mpg vs hp (Miles per Gallon vs
Horsepower).
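As a starting point, the three tasks could be sketched as follows. This is only one possible approach; the object names (first_rows, mpg_mean, mpg_median, p_mpg_hp) and the choice to put hp on the x-axis are illustrative assumptions, not requirements of the assignment.

```r
library(ggplot2)

# First five rows of the built-in mtcars data frame
first_rows <- head(mtcars, 5)
print(first_rows)

# Mean and median miles per gallon across all cars
mpg_mean <- mean(mtcars$mpg)
mpg_median <- median(mtcars$mpg)
cat("Mean MPG:", round(mpg_mean, 2), "\n")
cat("Median MPG:", mpg_median, "\n")

# Scatter plot: horsepower on the x-axis, miles per gallon on the y-axis
p_mpg_hp <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  labs(x = "Horsepower (hp)", y = "Miles per gallon (mpg)")
print(p_mpg_hp)
```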
Exercise 2: Data Filtering and Grouping
Using the iris dataset, complete the following:
- Create a new dataframe containing only the flowers of the species
“versicolor” and “virginica”.
- Calculate the mean sepal length and sepal width for each species in
this new dataframe.
- Create a bar plot showing the mean sepal length for each species,
with error bars representing the standard deviation.
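A minimal sketch of the filter, group, and summarise pattern these tasks rely on is shown here. The names iris_subset, sepal_summary, mean_length, sd_length, and mean_width are hypothetical, and drawing the error bars as the mean plus or minus one standard deviation is an assumption about what the task intends.

```r
library(dplyr)
library(ggplot2)

# Keep only the two species of interest
iris_subset <- iris %>%
  filter(Species %in% c("versicolor", "virginica"))

# Per-species means, plus the standard deviation needed for the error bars
sepal_summary <- iris_subset %>%
  group_by(Species) %>%
  summarise(
    mean_length = mean(Sepal.Length),
    sd_length   = sd(Sepal.Length),
    mean_width  = mean(Sepal.Width),
    .groups = "drop"
  )

# Bars show the mean; error bars span mean - SD to mean + SD
p_sepal <- ggplot(sepal_summary, aes(x = Species, y = mean_length)) +
  geom_col() +
  geom_errorbar(
    aes(ymin = mean_length - sd_length, ymax = mean_length + sd_length),
    width = 0.2
  ) +
  labs(x = "Species", y = "Mean sepal length (cm)")
print(p_sepal)
```

Computing the standard deviation in the same summarise() call keeps the bar heights and error bars in one data frame, which is what geom_errorbar() expects.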
Exercise 3: Data Reshaping
Load the tidyr package and use the relig_income dataset:
- Reshape the data from wide to long format, creating columns for
religion, income_level, and count.
- Calculate the total count for each religion and sort in descending
order.
- Create a horizontal bar plot showing the top 10 religions by total
count.
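The reshaping step can be sketched with tidyr's pivot_longer(), followed by a grouped sum. This is a hedged illustration rather than a reference solution; relig_long, religion_totals, top10, and total are made-up names, and coord_flip() is just one of several ways to get horizontal bars.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Wide to long: every column except religion becomes an
# income_level / count pair
relig_long <- relig_income %>%
  pivot_longer(
    cols = -religion,
    names_to = "income_level",
    values_to = "count"
  )

# Total respondents per religion, largest first
religion_totals <- relig_long %>%
  group_by(religion) %>%
  summarise(total = sum(count), .groups = "drop") %>%
  arrange(desc(total))

# Keep the ten largest groups and draw horizontal bars
top10 <- slice_head(religion_totals, n = 10)

p_relig <- ggplot(top10, aes(x = reorder(religion, total), y = total)) +
  geom_col() +
  coord_flip() +
  labs(x = "Religion", y = "Total count")
print(p_relig)
```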
Exercise 4: Time Series Analysis
Use the economics dataset from ggplot2:
- Convert the date column to a proper date format using lubridate.
- Calculate the year-over-year percentage change in the unemployment
rate.
- Create a line plot showing the unemployment rate over time, with the
line color changing based on whether the year-over-year change was
positive or negative.
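One way to approach the year-over-year calculation is to compare each month with the same month twelve rows earlier. Note the assumptions in this sketch: economics has no unemployment-rate column, so unemploy / pop is used here as a stand-in (a share of total population, not the official labor-force rate), and the names unemp_rate, yoy_change, and direction are hypothetical.

```r
library(dplyr)
library(ggplot2)
library(lubridate)

econ <- economics %>%
  mutate(
    # economics$date is normally already a Date; as_date() makes that explicit
    date = as_date(date),
    # ASSUMPTION: unemployed persons as a percentage of total population
    unemp_rate = unemploy / pop * 100,
    # The data are monthly, so a lag of 12 rows is the same month last year
    yoy_change = (unemp_rate / dplyr::lag(unemp_rate, 12) - 1) * 100,
    # An exact zero change, if any, is grouped with "Negative" here
    direction = if_else(yoy_change > 0, "Positive", "Negative")
  ) %>%
  # The first 12 months have no prior-year value to compare against
  filter(!is.na(yoy_change))

# group = 1 keeps a single continuous line whose colour varies by segment
p_unemp <- ggplot(econ, aes(x = date, y = unemp_rate,
                            colour = direction, group = 1)) +
  geom_line() +
  labs(x = "Date", y = "Unemployed (% of population)",
       colour = "Year-over-year change")
print(p_unemp)
```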
Exercise 5: Advanced Data Manipulation
Using the diamonds dataset from ggplot2:
- Create a new column price_per_carat by dividing price by carat.
- For each cut category, calculate the median price_per_carat and the
count of diamonds.
- Create a scatter plot of cut vs median price_per_carat, with the
point size representing the count of diamonds in each category.
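The mutate-then-summarise pattern behind these tasks might look like the sketch below. The summary column names median_ppc and n_diamonds, and the object names, are illustrative assumptions.

```r
library(dplyr)
library(ggplot2)

# Add the derived column, then collapse to one row per cut
cut_summary <- diamonds %>%
  mutate(price_per_carat = price / carat) %>%
  group_by(cut) %>%
  summarise(
    median_ppc = median(price_per_carat),
    n_diamonds = n(),
    .groups = "drop"
  )

# One point per cut; point area scales with the number of diamonds
p_cut <- ggplot(cut_summary, aes(x = cut, y = median_ppc, size = n_diamonds)) +
  geom_point() +
  labs(x = "Cut", y = "Median price per carat (USD)",
       size = "Number of diamonds")
print(p_cut)
```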
Exercise 6: Faceted Plots
Continue working with the diamonds dataset:
- Create a new column price_category that categorizes diamonds as
“Low”, “Medium”, or “High” based on their price (you decide the
thresholds).
- Create a faceted histogram of carat for each cut, with different
color fills based on the price_category.
- Adjust the facet layout to display the plots in two columns.
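A possible sketch of the categorisation and faceting follows. The thresholds of 1000 and 5000 USD and the bin width of 0.1 carat are arbitrary placeholders chosen for illustration, since the assignment leaves them to you; diamonds_cat is a hypothetical name.

```r
library(dplyr)
library(ggplot2)

# ASSUMPTION: thresholds of 1000 and 5000 USD are placeholders
diamonds_cat <- diamonds %>%
  mutate(
    price_category = case_when(
      price < 1000 ~ "Low",
      price < 5000 ~ "Medium",
      TRUE         ~ "High"
    ),
    # Fix the level order so the legend reads Low, Medium, High
    price_category = factor(price_category,
                            levels = c("Low", "Medium", "High"))
  )

# One histogram panel per cut, arranged in two columns
p_facets <- ggplot(diamonds_cat, aes(x = carat, fill = price_category)) +
  geom_histogram(binwidth = 0.1) +
  facet_wrap(~ cut, ncol = 2) +
  labs(x = "Carat", y = "Number of diamonds", fill = "Price category")
print(p_facets)
```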
Exercise 7: Custom Functions and Apply
Create a custom function and apply it to the msleep dataset from
ggplot2:
- Write a function that calculates the ratio of sleep_total to awake
time for an animal.
- Use apply() or a dplyr function to calculate this ratio for all
animals in the dataset.
- Create a scatter plot of body weight vs. brain weight, with point
colors representing the sleep ratio, and size representing total sleep
time.
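The idea of writing a small vectorised function and applying it inside mutate() can be sketched as below. The function name calc_sleep_ratio and the column name sleep_ratio are hypothetical; dropping rows with a missing brainwt and using log scales are assumptions made so the plot is readable, not requirements of the task.

```r
library(dplyr)
library(ggplot2)

# Hours asleep divided by hours awake; works on single values or vectors
calc_sleep_ratio <- function(sleep_total, awake) {
  sleep_total / awake
}

# mutate() passes whole columns, so the function runs once for all animals
msleep_ratio <- msleep %>%
  mutate(sleep_ratio = calc_sleep_ratio(sleep_total, awake))

# brainwt is missing for some animals; drop those rows before plotting
p_sleep <- msleep_ratio %>%
  filter(!is.na(brainwt)) %>%
  ggplot(aes(x = bodywt, y = brainwt,
             colour = sleep_ratio, size = sleep_total)) +
  geom_point(alpha = 0.7) +
  scale_x_log10() +
  scale_y_log10() +
  labs(x = "Body weight (kg, log scale)", y = "Brain weight (kg, log scale)",
       colour = "Sleep ratio", size = "Total sleep (h)")
print(p_sleep)
```

Because division is already vectorised in R, no explicit apply() loop is needed here; an apply()-based version would give the same values row by row.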
Exercise 8: Working with Dates and Summarizing Data
Use the txhousing dataset from ggplot2:
- Create a new column for the year and another for the month
name.
- Calculate the average sales and average volume for each year.
- Create a dual-axis plot showing the trend of average sales and
average volume over the years.
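ggplot2 has no independent second y-axis, so a common workaround is to rescale one series onto the other's range and label the secondary axis with the inverse transformation. The sketch below assumes that approach. It also relies on txhousing already containing numeric year and month columns, so only a month-name column is added; tx, yearly, month_name, and scale_factor are made-up names.

```r
library(dplyr)
library(ggplot2)

# txhousing already has numeric year and month; add a month-name column
tx <- txhousing %>%
  mutate(month_name = month.name[month])

# sales and volume contain missing values, hence na.rm = TRUE
yearly <- tx %>%
  group_by(year) %>%
  summarise(
    avg_sales  = mean(sales, na.rm = TRUE),
    avg_volume = mean(volume, na.rm = TRUE),
    .groups = "drop"
  )

# Ratio used to squeeze volume onto the sales axis and to undo that
# rescaling on the secondary axis labels
scale_factor <- max(yearly$avg_volume) / max(yearly$avg_sales)

p_dual <- ggplot(yearly, aes(x = year)) +
  geom_line(aes(y = avg_sales, colour = "Average sales")) +
  geom_line(aes(y = avg_volume / scale_factor, colour = "Average volume")) +
  scale_y_continuous(
    name = "Average sales",
    sec.axis = sec_axis(~ . * scale_factor, name = "Average volume (USD)")
  ) +
  labs(x = "Year", colour = NULL)
print(p_dual)
```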
Exercise 9: Animated Plots
Install and load the gganimate package. Using the gapminder dataset
(from the gapminder package):
- Create an animated scatter plot of GDP per capita vs. life
expectancy over time.
- Color the points by continent and size them by population.
- Add appropriate labels and animations to make the visualization
informative and engaging.
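A rough outline of how such an animation is typically assembled with gganimate is given here. It assumes the gapminder and gganimate packages are installed; the log scale on the x-axis, the alpha value, and the object name anim are illustrative choices. Rendering is left commented out because it needs an additional renderer package such as gifski.

```r
library(ggplot2)
library(gganimate)
library(gapminder)

# Static layers first: one point per country, coloured by continent
# and sized by population
anim <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp,
                              colour = continent, size = pop)) +
  geom_point(alpha = 0.7) +
  scale_x_log10() +
  labs(title = "Year: {frame_time}",
       x = "GDP per capita (log scale)",
       y = "Life expectancy (years)",
       colour = "Continent", size = "Population") +
  # Animation layers: step through the year column with linear easing
  transition_time(year) +
  ease_aes("linear")

# Uncomment to render; frame count and speed are arbitrary here
# animate(anim, nframes = 100, fps = 10)
```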