Students should be able to do each of the following by the end of this course:
What is statistics? Overview of the course
- Describe the central goals and fundamental concepts of statistics.
- Describe the difference between experimental and observational research with regard to what can be inferred about causality
- Explain how randomization provides the ability to make inferences about causation.
R lab: Basics
- Interact with an R Markdown notebook in RStudio
- Describe the difference between a variable and a function
- Create a vector, matrix, or data frame and access its elements
- Load a data file into a data frame and plot its contents
Probability
- Describe the difference between a probability and a conditional probability
- Describe the concept of statistical independence
- Use Bayes’ theorem to compute the inverse conditional probability.
R lab: Probability
- Intro to R Markdown notebooks
- Compute probabilities of combinations of events
- Compute an empirical probability distribution
- Describe the R functions available for the normal distribution (dnorm, pnorm, qnorm, rnorm) and how each is used
Working with data (make-up for session 2)
- Distinguish between different types of variables (quantitative/qualitative, discrete/continuous, scales of measurement)
- Describe the concept of measurement error
- Distinguish between the concepts of reliability and validity and apply each concept to a particular dataset
- Compute absolute, relative, and cumulative frequency distributions for a given dataset
- Generate a graphical representation of frequency distributions
- Describe the difference between a normal and a long-tailed distribution, and describe the situations that give rise to each
R lab: Data wrangling and visualization
- Describe the concept of tidy data
- Load a data file and prepare it for analysis
- Plot summary graphs using ggplot
Fitting models (central tendency)
- Describe the basic equation for statistical models (outcome = model + error)
- Describe different measures of central tendency and dispersion, how they are computed, and how to determine which is most appropriate in any given circumstance.
- Describe the principles that distinguish between good and bad graphs, and use them to identify good versus bad graphs.
Sampling
- Distinguish between a population and a sample, and between population parameters and statistics
- Describe the concepts of sampling error and sampling distribution
- Describe how the Central Limit Theorem determines the nature of the sampling distribution of the mean
Resampling and simulation
- Describe the statistical concept of a random number
- Describe the concept of Monte Carlo simulation
- Describe the concept of the bootstrap and use it to estimate the sampling distribution of a statistic
R lab: Simulation and resampling
- Demonstrate the ability to implement a Monte Carlo simulation in R
Hypothesis testing
- Describe how resampling can be used to compute a p-value.
- Define the concept of statistical power, and compute statistical power for a given statistical test.
- Describe the main criticisms of null hypothesis statistical testing
Confidence intervals and effect sizes
- Describe the proper interpretation of a confidence interval, and compute a confidence interval for the mean of a given dataset.
- Define the concept of effect size, and compute the effect size for a given test.
Modeling categorical relationships
- Describe the concept of a contingency table for categorical data.
- Describe the concept of the chi-squared test for association and compute it for a given contingency table.
Modeling continuous relationships (RP Gone - need guest lecturer)
- Describe the correlation coefficient and its interpretation, and compute it for a bivariate dataset
- Describe the potential causal influences that can give rise to a correlation.
The general linear model
- Describe the concept of linear regression and apply it to a bivariate dataset
- Describe the concept of the general linear model and provide examples of its application
Comparing means
- Determine whether a one-sample t-test or two-sample t-test is appropriate for a given hypothesis.
- Compute one-sample and two-sample t-tests on relevant datasets, and compute the effect size and confidence interval associated with each test.
R lab: Statistical inference
- Demonstrate the ability to apply statistical models to real data in R
Statistical modeling: Practical examples
- Describe how to determine what kind of model to apply to a dataset
Doing reproducible research
- Describe the concept of p-hacking and its effects on scientific practice
- Describe the concept of positive predictive value and its relation to statistical power