Project

By the end of this course, you will have completed an original research project! As part of this project, you will collect real data, explore the data to discover patterns, and fit statistical models that describe these patterns mathematically. You’ll also have the opportunity to communicate your findings to your classmates in the course-wide project showcase.

Project Milestones

Final Project Follow-Through:

Please remember to submit the two surveys assigned in Canvas. The first lets you provide individual input on your overall collaboration experience and on the relative contributions of each project group member. The second is an exit survey for the course as a whole. All students must complete both surveys before the end of the quarter.

Final Project Showcase: Present your findings

We have canceled the in-person final project showcase.

Please plan on recording your 5-minute group presentation (e.g., over Zoom or other similar software) and submitting the recording via Canvas.

This recorded presentation should meet all the same criteria as the live presentation (e.g., providing every group member an opportunity to speak; making direct reference to the visual content of the poster; lasting no more than 5 minutes).

Project Milestone 6: Submit final project poster

Each group is responsible for designing a scientific poster to communicate their results at the final project showcase and submitting an electronic copy of this poster as a PDF via Canvas.

Here are some additional resources that explain what makes a scientific poster effective and offer practical tips for designing one.

You are welcome to design your own poster in any application you like. However, to help you out, we have developed this poster template in Google Slides, which you are welcome to use (“Make a Copy” first, then edit your copy). If you choose to design your own using a different application, it is still a good idea to consult the template so you know what information to include in your poster.

Here are some additional general recommendations:

  • Omit unnecessary words. Only essential text should be retained in your poster. Complete sentences are not required.
  • Use large font sizes and consistent font styling throughout. For example, use bold or italics sparingly and for consistent reasons throughout the poster (e.g., only for “take-home” messages).
  • Prioritize graphics over text. If you can replace some text with a graphic that conveys the same information, do so. If more than one data visualization is important for understanding your project/results, find a way to include each of them.

Project Milestone 5: Submit final project report

Project Milestone 5 synthesizes all of the work you have done throughout the quarter on your final project and will be graded for clarity and accuracy. Here is a link to the grading rubric.

Only ONE submission per group is required.

Project Milestone 4: Fit statistical model to data and construct key visualizations

Please fetch & complete the “Project Milestone 4: Preprocessing and Model Fitting” Jupyter notebook. This activity will help you practice preparing your data to be fit with a linear model by applying any needed preprocessing (e.g., filtering out obvious outliers, recoding variables). Only ONE submission per group is required. Please designate one project group member to act as the corresponding author for this and subsequent project milestones.
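
To give a rough sense of what this preprocessing and model fitting can look like, here is a minimal sketch in Python. It is not taken from the milestone notebook: the variable names (sleep_hours, coffee, mood), the data values, and the helper functions preprocess and fit_line are all hypothetical and invented for illustration. It filters out an impossible value, recodes a yes/no variable as 1/0, and fits a line by ordinary least squares.

```python
from statistics import mean

# Hypothetical survey responses; names and values are invented for illustration.
responses = [
    {"sleep_hours": 7.0, "coffee": "yes", "mood": 6.0},
    {"sleep_hours": 8.0, "coffee": "no", "mood": 7.0},
    {"sleep_hours": 5.0, "coffee": "yes", "mood": 4.0},
    {"sleep_hours": 6.0, "coffee": "no", "mood": 5.0},
    {"sleep_hours": 80.0, "coffee": "no", "mood": 5.0},  # impossible value, likely a typo
]


def preprocess(rows):
    """Drop rows with impossible sleep values and recode coffee as 1 (yes) or 0 (no)."""
    cleaned = []
    for row in rows:
        # Filter out obvious outliers: nobody sleeps more than 24 hours per day.
        if not 0 <= row["sleep_hours"] <= 24:
            continue
        # Recode the categorical answer as a number.
        cleaned.append({**row, "coffee": 1 if row["coffee"] == "yes" else 0})
    return cleaned


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


cleaned = preprocess(responses)
intercept, slope = fit_line(
    [r["sleep_hours"] for r in cleaned],
    [r["mood"] for r in cleaned],
)
print(f"mood = {intercept:.2f} + {slope:.2f} * sleep_hours")
```

In your own project, the rule for what counts as an "obvious outlier" should come from your knowledge of the variable (here, the 24-hour limit on sleep), and it is a good idea to note how many rows each filtering step removes.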

Project Milestone 3: Preregistration of research question

Please fetch & complete the “Project Milestone 3: Preregistration” Jupyter notebook from Canvas, and submit via Canvas. These activities will help you practice articulating the research question (i.e., about the relationship between two variables) for your final project, thinking about different potential data-generating processes (DGPs), and constructing useful data visualizations to answer targeted questions. Only ONE submission per group is required. Please designate one project group member to act as the corresponding author for this and subsequent project milestones.

Project Milestone 2: Data visualization

Please fetch & complete the “Project Milestone 2A: Exploratory Visualization” and “Project Milestone 2B: Refining Data Visualizations” Jupyter notebooks from Canvas, and submit them via Canvas. ONE submission per student is required.

Project Milestone 1: Complete survey

Please complete & submit the class-wide survey while participating in discussion section to receive full credit for completing this milestone. The survey responses from the whole class will provide the dataset your final project group will explore and analyze. ONE submission per student is required.

Project Milestone 0: Generate variables

Please complete & submit the worksheet from discussion section to receive full credit for completing this milestone. You do NOT need to upload anything to Canvas to receive full credit for this assignment. ONE submission per student is required.

Final Project Bonus Datasets

We are offering these “bonus” datasets in case your group is interested in exploring and analyzing a different dataset for your final project. Please know that it is neither expected nor required that you use one of these “bonus” real-world datasets.

Fast Food Nutrition Dataset

This dataset provides a comprehensive breakdown of the nutritional content of various fast food products from popular fast food chains. It contains 515 rows and 16 variables that will help you identify patterns and make comparisons across menus.

You can find this dataset here

State-level Information Dataset

This dataset provides a comprehensive overview of various socio-economic and environmental factors for the 50 U.S. states and the District of Columbia. It contains 51 rows and 23 variables that will allow you to uncover and address critical questions about the diverse challenges and opportunities across states.

You can find this dataset here

Mental Health Data

This dataset provides an overview of mental health and disorder rates for 217 regions/countries worldwide, from 1990 to 2015. Each country has about 25 rows, so the dataset contains 5488 rows and 10 variables in total, which will help you understand trends in the mental health landscape over time across different countries.

You can find this dataset here