HW03: Wrangling Data

Overview

Due: Thursday, July 31st by 11:59 PM (Chicago Time)

In the first assignment, you demonstrated knowledge of your software setup, Git/GitHub via RStudio, and R Markdown. In the second assignment, you practiced transforming and exploring data with dplyr and ggplot2.

The goal of this final assignment is to reinforce and deepen your understanding of all these tools, with a focus on data wrangling tasks.

Accessing and Cloning Your hw03 Repository

Important

This will work only if you have submitted your GitHub username (see Lecture 1) and accepted the email invitation to join our GitHub organization (check the email linked to your GitHub account). Make sure you’ve completed the setup under Software Setup and confirmed everything is working properly.

Accessing the repo:

  1. Go to this link to accept the invitation and create your private hw03 repository on GitHub. Your repository will be named hw03-<USERNAME> and will be ready within a few seconds. If this does not work, see the Important note above.

  2. Once the repository has been created, click the link provided to access it.

Cloning the repo into your Workbench:

  1. Log into Workbench and start a new project:
    File > New Project > Version Control > Git

  2. In the Repository URL field:

    • Paste the URL of your GitHub repository. To find the URL, go to your hw03 repo on GitHub, click the green Code button, and copy the SSH or HTTPS link. Use the method you configured during setup; Workbench users must select SSH.
    • The project name should auto-fill. If not, type hw03-<USERNAME> manually.
  3. Choose where to save the project directory locally.
    Tip: Use a dedicated folder, such as homeworks, to keep things organized and to avoid moving folders after cloning them.

  4. Check the box “Open in new session” (optional, but recommended for keeping projects separate and organized).

  5. Click Create Project. This will:

    • Create a new folder on your machine
    • Initialize a Git repository linked to GitHub
    • Open an RStudio Project in a new session
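
For reference, the two link types offered by the green Code button differ only in their prefix. The sketch below builds both forms for a placeholder username; the organization name is taken from the example URL in the submission instructions, and `yourusername` is a stand-in you would replace. The `git clone` line is left commented out because it needs network access and your own credentials.

```shell
# Placeholders: swap in your own GitHub username. The organization name
# comes from the example URL in the submission instructions.
org="csp-summer25"
username="yourusername"
repo="hw03-${username}"

# The two forms offered by the green Code button on GitHub
ssh_url="git@github.com:${org}/${repo}.git"
https_url="https://github.com/${org}/${repo}.git"

echo "SSH:   ${ssh_url}"
echo "HTTPS: ${https_url}"

# Terminal equivalent of File > New Project > Version Control > Git
# (commented out: requires network access and a configured SSH key):
# git clone "${ssh_url}"
```

Cloning through the New Project dialog remains the recommended route, since it also creates the RStudio Project file for you.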

General Homework Workflow

See Homework 1 for details. The workflow is the same for all assignments (e.g., accept the repo, edit the files, save, then Pull – Stage – Commit – Push).
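
If you want to see what the Pull – Stage – Commit – Push buttons do behind the scenes, the sketch below runs the same cycle from a terminal. It is a practice sandbox, not part of the assignment: a throwaway local repository stands in for GitHub, and the names `hw03-demo`, `Demo Student`, and the commit messages are made up for illustration.

```shell
# Practice sandbox: a local bare repository stands in for GitHub,
# so nothing here touches your real hw03 repository.
set -e
sandbox="$(mktemp -d)"
git init --quiet --bare "$sandbox/remote.git"
git clone --quiet "$sandbox/remote.git" "$sandbox/hw03-demo"
cd "$sandbox/hw03-demo"
git config user.name "Demo Student"        # hypothetical identity
git config user.email "demo@example.com"

# First commit, so that the branch exists on the stand-in remote
echo "# demo" > court.Rmd
git add court.Rmd
git commit --quiet -m "Add court.Rmd skeleton"
git push --quiet -u origin HEAD

# The regular cycle after editing and saving a file:
echo "more work" >> court.Rmd
git pull --quiet                            # Pull
git add court.Rmd                           # Stage
git commit --quiet -m "Start question 1"    # Commit
git push --quiet                            # Push

git log --oneline                           # two commits, newest first
```

In RStudio, the Git pane performs these same steps; the terminal version is shown only to make the order of operations explicit.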

Assignment Description

For this assignment you will work with data from the U.S. Supreme Court. Your repository includes specific and open-ended questions. Answer all of them.

Note
  • The questions, especially the open-ended ones, are designed to help you think like a university student and future researcher. They won’t walk you through every step; instead, they encourage you to apply what you’ve learned and work independently.

  • Plan to complete this over multiple sessions. Start early, and remember to save, commit, and push your work regularly.

  • Do not use AI tools to generate R code from scratch. You may use them to debug (after trying on your own) or to look up how functions work (check course materials first). Only submit code you fully understand.

Submit the Assignment

To submit your assignment on Canvas, follow these steps:

  1. Before the deadline, push the final version of your work to your GitHub repository. Tip: don’t wait until the end to make your first commit; stage, commit, and push your work regularly as you go.

    Your GitHub repository should include (make sure to stage-commit-push all these files):

    • court.Rmd: main file you will work on
    • court.md: you will generate this file from the .Rmd by simply knitting it, like you did for HW01 (do not modify the output format)
    • the folder that contains all the graphs that you generated in your .Rmd (we need this folder to be able to see your graphs and grade your homework)
  2. When you’re ready to submit:

    • Copy the URL of your GitHub repository. It will look something like: https://github.com/csp-summer25/hw03-yourusername
    • Submit that URL on Canvas under HW03. Do not upload any files to Canvas. We only need the repository link.
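
Before copying the URL, it can help to confirm that the required files are actually tracked by Git and that nothing is left uncommitted. The function below is a minimal sketch of such a check. The name `check_hw03` is hypothetical, and the figure folder name `court_files` is an assumption based on R Markdown's usual `<file>_files` naming; check the actual folder name in your own repository.

```shell
# check_hw03 DIR: report whether the required files are tracked by Git in DIR.
# Returns 0 when everything looks ready, 1 otherwise.
check_hw03() {
  dir="$1"
  problems=0

  # The two files named in the submission checklist
  for f in court.Rmd court.md; do
    if ! git -C "$dir" ls-files --error-unmatch "$f" >/dev/null 2>&1; then
      echo "not tracked: $f"
      problems=1
    fi
  done

  # ASSUMPTION: knitting court.Rmd writes graphs under court_files/
  if [ -z "$(git -C "$dir" ls-files court_files)" ]; then
    echo "no tracked files under court_files/"
    problems=1
  fi

  # Anything modified or untracked has not been committed yet
  if [ -n "$(git -C "$dir" status --porcelain)" ]; then
    echo "uncommitted changes present"
    problems=1
  fi

  return "$problems"
}

# Usage, from inside your project folder:
#   check_hw03 . && echo "ready to submit"
```

This only inspects your local copy. You should still open the repository on GitHub and confirm that court.md displays your graphs, since that is what will be graded.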

Assessment

All homework assignments are evaluated using a rubric (for more details, see the Assessment section of the Syllabus). Below are further guidelines for this specific homework to help you assess your work before submitting it.

In the past, “Excellent” or “Very Good” work included submissions that:

  • Completed all parts of the assignment correctly and accurately.
  • Included code that:
    • Ran without errors
    • Followed good style (clean, readable, well-organized)
    • Was well-documented but not over-commented
    • Used appropriate complexity and matched the prompt and course content
    • Demonstrated understanding of key concepts and functions
  • Included graphs and tables that were:
    • Well-chosen for the variable types
    • Clearly labeled, visually polished, and customized beyond defaults
    • Interpreted in a clear and concise analysis
  • Showed strong use of:
    • Required packages, showing curiosity to use them beyond the basics
    • R Markdown syntax (e.g., no rendering errors, appropriate use of syntax)
  • Followed good GitHub practices:
    • Multiple informative commits showing progress over time
    • All required files present and displaying correctly, including rendered graphs