Ref: Coursera - Duke Courses.
Link to Coursera - Duke Courses: Link.
Link to Coursera - Duke Data Analysis Course: Link.
ROUGH NOTES (!)
Updated: 11/4/26
INTRODUCTION TO PROBABILITY AND DATA WITH R
[Introduction to Statistics with R]
How does a doctor decide that a new drug is more effective than an existing drug?
How does Google use search terms to decide that a new flu season is starting?
How confident should a politician be in their latest poll numbers?
How does Netflix make personalized movie recommendations?
These are the types of questions you can answer with Statistical Data Analysis.
[Introduction to Data]
This unit will introduce you to the basics of collecting, analyzing, and visualizing data, as well as making data-based decisions.
The goal of this course is to teach you to make sense of data using statistical tools, in order to be able to explore relationships between variables and make informed decisions.
When faced with a new study or a data set, the first questions you should always ask yourself are:
- What is the population of interest?
- What is the sample?
E.g.: Consider the study titled “Alcohol brand use and injury in the emergency department” (2013).
The study explored the question:
Are consumers of certain alcohol brands more likely to end up in the emergency room with injuries?
Based on this question alone, it appears that the population of interest is everyone who consumes alcohol. In other words, ideally, the researchers would like to find an answer that can result in a recommendation for all alcohol consumers. However, a closer look at this study reveals that the sample was only a group of emergency room patients at the Johns Hopkins Hospital in Baltimore in the US.
Alcohol brand consumption data were collected from patients who drank within six hours of presentation at the hospital. Therefore, the results of the study can really only be generalized to residents of Baltimore, since certain brands may be more easily available in this area than in others, so local consumption may not reflect national brand market share.
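The gap between a sample and the population of interest can be illustrated with a small simulation. This is a minimal sketch in Python, not data from the study: the population size, the 10% "local" fraction, the brand X shares (0.60 locally, 0.30 elsewhere), and the function names are all invented for illustration. It compares an estimate from a sample drawn only in one area with an estimate from a random sample of the whole population.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible


def make_population(n=100_000, local_fraction=0.10,
                    share_local=0.60, share_elsewhere=0.30):
    """Build a hypothetical population of (is_local, drinks_brand_x) pairs.

    All numbers are invented: brand X is assumed to be more popular
    in the local area than elsewhere.
    """
    population = []
    for _ in range(n):
        is_local = random.random() < local_fraction
        share = share_local if is_local else share_elsewhere
        drinks_brand_x = random.random() < share
        population.append((is_local, drinks_brand_x))
    return population


def brand_share(people):
    """Proportion of people in the list who drink brand X."""
    return sum(drinks for _, drinks in people) / len(people)


population = make_population()
true_share = brand_share(population)

# Sample 1: only people from the local area (like a single-hospital sample).
local_people = [person for person in population if person[0]]
local_sample = random.sample(local_people, 2000)

# Sample 2: a simple random sample from the whole population.
random_sample = random.sample(population, 2000)

local_estimate = brand_share(local_sample)
random_estimate = brand_share(random_sample)

print(f"True share in population:   {true_share:.3f}")
print(f"Estimate from local sample:  {local_estimate:.3f}")
print(f"Estimate from random sample: {random_estimate:.3f}")
```

Under these assumed numbers, the local-only estimate lands near the local share rather than the population share, while the random sample tracks the population. The local sample is still a fair description of the local area, which is why its results generalize only that far.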
In this unit:
We will start by defining populations of interest, discussing methods of taking samples from these populations, and designing studies that can best answer particular research questions.
We will also learn to identify the scope of inference for a study (such as whether we can make causal or only correlational statements, and whether we can generalize our conclusions to the population at large).
We will also learn methods of exploratory data analysis such as data visualizations and summary statistics.
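The scope-of-inference idea mentioned above can be written as a small lookup. This is a rough sketch in Python under the standard rule of thumb that random sampling supports generalizing to the population, and random assignment to treatments supports causal conclusions. The function name and the wording of the returned strings are hypothetical.

```python
def scope_of_inference(random_sampling: bool, random_assignment: bool) -> str:
    """Describe which conclusions a study design supports.

    random_sampling:   were subjects randomly sampled from the population?
    random_assignment: were subjects randomly assigned to treatments?
    """
    # Random assignment decides causal vs. correlational statements.
    statement = "causal" if random_assignment else "correlational only"
    # Random sampling decides whether results extend beyond the sample.
    reach = "population at large" if random_sampling else "the sample only"
    return f"{statement}; generalizable to {reach}"


# The four possible combinations of design choices.
for sampling in (True, False):
    for assignment in (True, False):
        print(sampling, assignment, "->", scope_of_inference(sampling, assignment))
```

For example, a study with neither random sampling nor random assignment supports only correlational statements about the people actually sampled.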