5 steps in the data science process

Ask a question → data collection → data exploration → data modeling → data interpretation and conclusions

Data Exploration

Sample Mean: the arithmetic average of the data set, $\bar x = \frac{1}{n}\sum_{i=1}^{n} x_i$. It is usually viewed as a “typical” value and can be used for interpolation.

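Sample Median: the “mid-point” of the data set, i.e. the middle value once the data are sorted (or the average of the two middle values when the number of points is even).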
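
As a small illustration of these two summaries, the sketch below computes both on a made-up sample using Python's standard `statistics` module. The list `data` is hypothetical and chosen to include one outlier, which pulls the mean upward while leaving the mid-point unchanged.

```python
from statistics import mean, median

# Hypothetical sample (not from the notes); 100 is a deliberate outlier.
data = [2, 3, 5, 7, 100]

sample_mean = mean(data)      # (2 + 3 + 5 + 7 + 100) / 5
sample_median = median(data)  # middle value of the sorted data

print("mean:", sample_mean)
print("median:", sample_median)
```

On this sample the mean is much larger than the mid-point, which is one reason to look at both during exploration.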

Measuring the spread/concentration: using variance

$\frac{1}{n}\sum_{i=1}^{n} (x_i-\bar x)^2$ if we have $n$ data points $x_1, \dots, x_n$ with sample mean $\bar x$
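
The variance formula can be checked by hand on a small example. The sketch below is a minimal version, assuming the $\frac{1}{n}$ divisor used above; the function name `variance_1_over_n` and the list `data` are hypothetical and only serve the illustration.

```python
from statistics import pvariance

def variance_1_over_n(xs):
    """Average squared deviation from the mean, dividing by n."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / n

# Hypothetical sample: the mean is 5 and the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

var = variance_1_over_n(data)  # 32 / 8
print("variance:", var)

# statistics.pvariance also divides by n, so it should agree.
print("pvariance:", pvariance(data))
```

Note that `statistics.variance` (without the `p`) divides by $n-1$ instead of $n$, so it returns a slightly larger value on the same data.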