Course Description
This course is an introduction to the statistical ideas and tools underlying the foundations of data science. The course is broadly divided into five modules:
- Module 1 Descriptive Statistics
- Module 2 Probability & Random Variables
- Module 3 Estimation & Inference
- Module 4 Statistical Modeling
- Module 5 Statistical Computing
Course Syllabus
Elements of descriptive statistics, averages, dispersion, skewness, quantiles; graphical displays, pie charts, bar charts, histograms, scatter plots, box plots, stem-and-leaf plots.
Probability spaces, conditional probability, independence; Random variables, distribution functions, probability mass and density functions, functions of random variables, standard univariate discrete and continuous distributions; Mathematical expectations, moments, moment generating functions, inequalities; Multidimensional random variables, joint, marginal and conditional distributions, conditional expectations, independence, covariance, correlation, standard multivariate distributions, functions of multidimensional random variables; Forms of convergence, law of large numbers, central limit theorem.
Sampling distributions; Point estimation - estimators, minimum variance unbiased estimation, maximum likelihood estimation, method of moments estimation, Cramer-Rao inequality, consistency; Interval estimation; Testing of hypotheses - tests and critical regions, Neyman-Pearson lemma, uniformly most powerful tests, likelihood ratio tests.
Linear regression, ANOVA, discriminant analysis.
Computing techniques, cross-validation, bootstrap re-sampling.
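As a rough illustration of the bootstrap re-sampling listed above, here is a minimal sketch in Python (the course's own linked codes are in R). The sample data, the function name `bootstrap_se`, and the choice of 2000 resamples are assumptions made for illustration; they are not taken from the course material.

```python
import random
import statistics


def bootstrap_se(data, stat=statistics.mean, n_boot=2000, seed=0):
    """Bootstrap estimate of the standard error of stat(data).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Resample n observations with replacement from the original sample.
        resample = rng.choices(data, k=n)
        replicates.append(stat(resample))
    # The spread of the replicates approximates the sampling variability.
    return statistics.stdev(replicates)


# Assumed sample: 50 simulated draws, generated here only for the demo.
data_rng = random.Random(1)
sample = [data_rng.gauss(10.0, 2.0) for _ in range(50)]

se_boot = bootstrap_se(sample)
se_formula = statistics.stdev(sample) / len(sample) ** 0.5
print(f"bootstrap SE of mean: {se_boot:.3f}")
print(f"formula s/sqrt(n):    {se_formula:.3f}")
```

For the sample mean the two numbers should be close, which makes the mean a convenient sanity check; the same function can be called with another statistic, such as `statistics.median`, for which no simple formula is available.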
Course Logistics
- Schedule: Slot B, 9:00 am - 9:55 am Monday, 10:00 am - 10:55 am Tuesday, 11:00 am - 11:55 am Wednesday
- Venue: 5103, Core 5.
Course Evaluation
There will be 5 surprise quizzes, 5 assignments, a mid-semester examination and an end-semester examination, with the following weightage:
- Quizzes: 20%
- Assignments: 20%
- Mid-semester exam: 30%
- End-semester exam: 30%
Some references (not an exhaustive list)
- Hogg, R.V., McKean, J. and Craig, A.T., Introduction to mathematical statistics, 7th edition, Pearson Education, 2012.
- Rice, J.A., Mathematical statistics and data analysis, 3rd edition, Cengage Learning, 2006.
- Wasserman, L., All of statistics: a concise course in statistical inference, Volume 26, Springer, New York, 2004.
- Rohatgi, V.K. and Saleh, A.M.E., An introduction to probability and statistics, 3rd edition, John Wiley & Sons, 2015.
- DeGroot, M.H. and Schervish, M.J., Probability and statistics, 4th edition, Pearson Education, 2010.
Topics Covered during the weeks
Lecture | Date | Topic | Resources | R codes |
---|---|---|---|---|
1 | 26-Jul-2023 | | ||
2 | 31-Jul-2023 | Data and Questions | ||
3 | 1-Aug-2023 | Graphical Displays: Pie charts, Bar graphs, Histograms, Scatter plots | Data | Codes |
Thanks to Karan Kumawat, we have a nice bar chart of the nuclear powers of different countries, with proper labels, here: Code, Barchart | ||||
4 | 2-Aug-2023 | Measures of centre and spread; Skewness; Five-figure summary and Box and Whiskers Plots | Book: Elements of Statistics by Daly, F. et al. | |
5 | 7-Aug-2023 | Probability as a mathematical framework; Setting up a Probabilistic Model: Random experiment; Sample Space; Probability Law | ||
6 | 8-Aug-2023 | Probability axioms; Consequences of Probability axioms; Birthday Problem | Birthday Problem | |
7 | 9-Aug-2023 | Independent events; Newton-Pepys problem; Conditional Probability; Multiplication Rule; Bayes' Rule | ||
8, 9 | 12-Aug-2023 (Make-up class for 16, 17-Aug-2023) | Conditional Probability; Total Probability Law; Tree Diagrams; Monty Hall Problem | Monty Hall Problem Code 1, Code 2 | |
10 | 14-Aug-2023 | Quiz 1 | Quiz 1 solutions and marking scheme | |
15-Aug-2023 | Assignment 1 | We have a Teams group now: Grp_DA241-July-Nov-2023. If you are a part of this class and have not been added to the group, please write to me immediately. The assignment is uploaded there and the due date is 11:59 pm, 20-Aug-2023. Late turn-ins will stop after 11:59 pm, 21-Aug-2023. Type the solutions (handwritten will not be acceptable) in Word/LaTeX and submit the PDF only. | ||
11 | 28-Aug-2023 | Random Variables; Discrete Random Variables; Probability Mass Function; Examples | ||
12, 13 | 29-Aug-2023 | Review Discrete Random Variables and PMFs; Special Discrete Random Variables: Bernoulli, Indicator, Binomial, Hypergeometric, Geometric, Poisson; Poisson Paradigm and Binomial Convergence to Poisson | ||
14 | 30-Aug-2023 | Continuous Random Variables; Probability Density Functions; Cumulative Distribution Functions and their Properties; Discrete Example | ||
3-Sep-2023 | Assignment 2 | Please check Teams group: Grp_DA241-July-Nov-2023. The assignment is uploaded there and the due date is 11:59 pm, 9-Sep-2023. Late turn-ins will stop after 11:59 pm, 10-Sep-2023. Type the solutions (handwritten will not be acceptable) in Word/LaTeX and submit the PDF only. All the other rules can be found on Teams. | Question 5 solution | |
15 | 4-Sep-2023 | Continuous Random Variables; Special r.v.s: Uniform, Piecewise Constant, Exponential, Normal; Universality of Uniform; Simulations | ||
16, 17 | 5-Sep-2023 | Normal random variables; Calculating Normal probabilities; Standardising a Normal random variable; Joint Distributions; Marginals; Conditional Density; Independence of random variables; Some examples | ||
18 | 11-Sep-2023 | Expectation of a random variable; Properties of Expectations; Expectations of famous discrete random variables | Practice Assignment 1 | |
19, 20 | 12-Sep-2023 | Quiz 2; Expectations continued; Law of the Unconscious Statistician (LOTUS); Expected value of local maxima in a random permutation of integers; Finding means and variances of common continuous distributions; Memorylessness property of Exponentials | LOTUS | |
21 | 13-Sep-2023 | Quiz 2 solutions; Moment Generating Functions; Examples | Quiz 2 solutions and marking scheme; Practice Assignment 2 | |
19-Sep-2023 | Mid-semester Examination | Mid-semester exam solutions | ||
22 | 25-Sep-2023 | Covariance and Correlation; Conditional Expectation; Examples: Two envelope paradox; Patterns in repeated coin flips | The Other Person's Envelope is Always Greener by Barry Nalebuff | |
23 | 26-Sep-2023 | Conditional Expectation continued | ||
24 | 27-Sep-2023 | Inequalities: Cauchy-Schwarz inequality; Jensen's inequality; Markov's inequality; Chebyshev's inequality; Convergence of random variables in probability | ||
25 | 3-Oct-2023 | Weak Law of Large Numbers (WLLN); Pollster's problem; Demonstration of WLLN in R | WLLN Demonstration; WLLN_animation1; WLLN_animation2 | |
26 | 4-Oct-2023 | Central Limit Theorem (CLT); Pollster's problem revisited; Demonstration of CLT in R | Check out some nice demonstrations here: http://www.randomservices.org/random/apps/index.html . Also, some cool animations can be found here: https://yihui.org/animation/ | CLT Demonstration; CLT Animation |
27 | 9-Oct-2023 | Statistical inference problems: Types of problems; Point Estimation | ||
28 | 10-Oct-2023 | Point Estimation: Some examples | ||
29 | 11-Oct-2023 | Desirable properties of estimators: Unbiasedness; Consistency; "small" Mean Squared Errors; Methods of estimation: Least Squares method; Method of Moments. | ||
30, 31 | 16-Oct-2023 | Maximum Likelihood Estimation; Examples; Quiz 3 | Quiz 3 solutions and marking scheme | |
32 | 25-Oct-2023 | Recall MLEs; Statistical Properties of MLEs; Fisher Information; Cramer-Rao lower bound | ||
28-Oct-2023 | Assignment 3 | Please check Teams group: Grp_DA241-July-Nov-2023. The assignment is uploaded there and the due date is 11:59 pm, 3-Nov-2023. Late turn-ins will stop after 11:59 pm, same day. You can find the details of submission on Teams. | ||
33 | 30-Oct-2023 | Confidence Intervals; Examples | ||
34 | 31-Oct-2023 | Hypothesis Testing: Big picture; Likelihood Ratio Tests; Examples | ||
35 | 6-Nov-2023 | Hypothesis Testing: More general scenarios; Quiz 4 | ||
36 | 7-Nov-2023 | Simple linear regression; Least Squares estimation; Maximum Likelihood Estimation; Probabilistic setting of the problem | ||
37 | 8-Nov-2023 | Multiple Linear Regression Model; Geometric Interpretation | ||
38, 39 | 14-Nov-2023 | A look ahead |
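
The Birthday Problem from Lecture 6 can be checked numerically. The following is a minimal sketch in Python (the course's own linked codes are in R), assuming 365 equally likely birthdays and independence between people; the function name `birthday_collision_prob` is hypothetical and not taken from the course material.

```python
def birthday_collision_prob(n, days=365):
    """Probability that at least two of n people share a birthday.

    Assumes `days` equally likely birthdays and independent people.
    """
    if n > days:
        return 1.0  # pigeonhole principle: a match is certain
    p_all_distinct = 1.0
    for k in range(n):
        # Person k+1 must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


# Smallest group size for which a shared birthday is more likely than not.
smallest_n = next(n for n in range(1, 367) if birthday_collision_prob(n) >= 0.5)
print(smallest_n, round(birthday_collision_prob(smallest_n), 4))
```

The calculation uses the complement rule: it is easier to compute the probability that all birthdays are distinct and subtract it from 1 than to enumerate the ways in which a match can occur.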