Course Description

This course introduces the statistical ideas and tools that underlie the foundations of data science. The course is broadly divided into 5 modules:

  • Module 1 Descriptive Statistics
  • Module 2 Probability & Random variables
  • Module 3 Estimation & Inference
  • Module 4 Statistical Modeling
  • Module 5 Statistical Computing

Course Syllabus

Elements of descriptive statistics, averages, dispersion, skewness, quantiles; graphical displays, pie charts, bar charts, histograms, scatter plots, box plots, stem-and-leaf plots.
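As a small illustration of the numerical summaries listed above, the following sketch computes an average, a measure of dispersion, quartiles, and a moment-based skewness coefficient. The data values are made up for the example, and Python's standard `statistics` module is used here only for convenience; it is not a prescribed course tool.

```python
import statistics

# Hypothetical sample, invented purely for illustration.
data = [2, 4, 4, 5, 7, 9, 11]

mean = statistics.mean(data)       # arithmetic average
median = statistics.median(data)   # middle value of the sorted sample
stdev = statistics.stdev(data)     # sample standard deviation (divisor n - 1)

# Quartiles: cut points that split the sorted sample into four parts
# (the default "exclusive" interpolation method is used).
q1, q2, q3 = statistics.quantiles(data, n=4)

# Moment-based skewness: third central moment divided by the cube of the
# population standard deviation (divisor n in both moments).
n = len(data)
m2 = sum((x - mean) ** 2 for x in data) / n
m3 = sum((x - mean) ** 3 for x in data) / n
skewness = m3 / m2 ** 1.5

print(mean, median, stdev, (q1, q2, q3), skewness)
```

A positive skewness value indicates a longer right tail, which matches this sample: its largest values lie further from the mean than its smallest ones.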

Probability spaces, conditional probability, independence; Random variables, distribution functions, probability mass and density functions, functions of random variables, standard univariate discrete and continuous distributions; Mathematical expectations, moments, moment generating functions, inequalities; Multidimensional random variables, joint, marginal and conditional distributions, conditional expectations, independence, covariance, correlation, standard multivariate distributions, functions of multidimensional random variables; Forms of convergence, law of large numbers, central limit theorem.

Sampling distributions; Point estimation - estimators, minimum variance unbiased estimation, maximum likelihood estimation, method of moments estimation, Cramér-Rao inequality, consistency; Interval estimation; Testing of hypotheses - tests and critical regions, Neyman-Pearson lemma, uniformly most powerful tests, likelihood ratio tests.

Linear regression, ANOVA, discriminant analysis.

Computing techniques, cross-validation, bootstrap re-sampling.
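To give a flavour of bootstrap re-sampling, here is a minimal sketch that estimates the standard error of a sample mean by repeatedly resampling the data with replacement. The function name `bootstrap_se`, the sample values, the number of replicates, and the seed are all assumptions made for this example, not material taken from the course.

```python
import random
import statistics

def bootstrap_se(sample, stat=statistics.mean, n_boot=2000, seed=0):
    """Bootstrap estimate of the standard error of `stat` (illustrative helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Each replicate: draw n observations with replacement, recompute the statistic.
    replicates = [stat(rng.choices(sample, k=n)) for _ in range(n_boot)]
    # The spread of the replicates approximates the sampling variability of `stat`.
    return statistics.stdev(replicates)

# Hypothetical data, invented purely for illustration.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.8, 2.7, 4.9, 3.3, 2.5]

se_boot = bootstrap_se(sample)
# Textbook formula for the standard error of the mean, for comparison.
se_formula = statistics.stdev(sample) / len(sample) ** 0.5

print(se_boot, se_formula)
```

For the mean, the two numbers should be close. The appeal of the bootstrap is that the same recipe applies to statistics with no simple standard-error formula, for example by passing `stat=statistics.median`.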

Course Logistics

  • This semester the first half of the course was taught by Dr. Amulya Kumar Mahato. We will start the next half after the mid-semester examination week.
  • Schedule: Slot C, 10:00 am - 10:55 am Monday, 11:00 am - 11:55 am Tuesday, 9:00 am - 9:55 am Friday
  • Venue: 5201, Core 5.

Course Evaluation

There will be 3 quizzes and an end-semester examination with the following weightage:

  • Quizzes: 15%
  • Attendance: 5%
  • End semester exam: 30%

Some references (not an exhaustive list)

  • Hogg, R.V., McKean, J. and Craig, A.T., Introduction to mathematical statistics, 7th edition, Pearson Education, 2012.
  • Rice, J.A., Mathematical statistics and data analysis, 3rd edition, Cengage Learning, 2006.
  • Wasserman, L., All of statistics: a concise course in statistical inference, Volume 26, New York: Springer, 2004.
  • Rohatgi, V.K. and Saleh, A.M.E., An introduction to probability and statistics, 3rd edition, John Wiley & Sons, 2015.
  • DeGroot, M.H. and Schervish, M.J., Probability and statistics, 4th edition, Pearson Education, 2010.

Topics Covered during the weeks

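Lecture | Date | Topic | Resources | R codes
1 | 23-Sep-2024 | | |
2 | 27-Sep-2024 | | |
3 | 30-Sep-2024 | | |
4 | 1-Oct-2024 | | |
5 | 4-Oct-2024 | | |
6 | 7-Oct-2024 | | |
7 | 8-Oct-2024 | | |
8 | 14-Oct-2024 | | |
9 | 15-Oct-2024 | | |
10 | 18-Oct-2024 | | |
11 | 21-Oct-2024 | | |
12 | 22-Oct-2024 | | |
13 | 25-Oct-2024 | | |
14 | 29-Oct-2024 | | |
15 | 1-Nov-2024 | | |
16 | 4-Nov-2024 | | |
17 | 5-Nov-2024 | | |
18 | 8-Nov-2024 | | |
19 | 11-Nov-2024 | | |
20 | 12-Nov-2024 | | |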