DEV Community

Ola Sk

Some basic differences between maths and statistics naming conventions

I'm writing this note because I was confused by the different names used for the same concepts in maths vs statistics, and maybe someone else is too. At the very least it will serve me as a small reminder.

Say we have a multidimensional dataset. In linear algebra, each data point has particular coordinates; in statistics, those coordinates are called 'variables' (or 'features'). The data points themselves, which live in n-dimensional space with the n variables as its dimensions, are called 'individuals', 'cases' or sometimes 'observations'.
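To make the two vocabularies concrete, here is a minimal sketch in plain Python. The numbers and the variable names (`height_cm`, `weight_kg`) are made up for illustration; the only point is that the same table can be read as "points with coordinates" or as "individuals with variables".

```python
# Hypothetical toy dataset: 3 data points in 2-dimensional space.
# Linear algebra view: each row is a point, each entry is a coordinate.
points = [
    [170.0, 65.0],
    [182.0, 80.0],
    [165.0, 58.0],
]

# Statistics view: each column is a variable (feature),
# each row is an individual (case, observation).
variable_names = ["height_cm", "weight_kg"]  # made-up names

individuals = [dict(zip(variable_names, row)) for row in points]

n_individuals = len(points)   # number of data points
n_variables = len(points[0])  # number of dimensions, i.e. n

# One variable is the same coordinate taken across all individuals.
heights = [row[0] for row in points]

print(n_individuals, n_variables)
print(individuals[1])
```

So a row is an individual, a column is a variable, and the number of columns is the dimension n of the space.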

Variables (or features) can be categorical, discrete or continuous.
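As a rough illustration of the three kinds, the sketch below guesses the kind of each variable from the Python type of its values. Both the data and the `variable_kind` helper are hypothetical, and the type-based rule is a simplification: in real datasets a categorical variable is often coded with integers, so the type alone does not settle the question.

```python
# Hypothetical individuals, each with three variables of different kinds.
individuals = [
    {"eye_colour": "brown", "siblings": 2, "height_cm": 170.5},
    {"eye_colour": "blue", "siblings": 0, "height_cm": 182.3},
    {"eye_colour": "brown", "siblings": 1, "height_cm": 165.0},
]


def variable_kind(values):
    """Guess the kind of a variable from its Python types (a simplification)."""
    if all(isinstance(v, str) for v in values):
        return "categorical"  # labels with no numeric meaning
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "discrete"     # countable whole numbers
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "continuous"   # measurements on a continuous scale
    return "unknown"


kinds = {
    name: variable_kind([person[name] for person in individuals])
    for name in individuals[0]
}
print(kinds)
```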

Data points (individuals) sometimes don't have identities (names, identifiers). A data person usually cleans the data first (orders it, looks for missing or funny values and handles them in a sensible way) and then starts performing some statistical testing on the data set (graphs the data, looks at averages, standard deviations etc.). Afterwards there may be a need to get the data points back in their original order, so they need to be 'tagged' beforehand.
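A minimal sketch of the tagging idea, using only the Python standard library. The values are made up, and using the original position as the tag is just one assumption; any unique identifier would do.

```python
from statistics import mean, stdev

raw_heights = [170.0, None, 165.0, 182.0]  # made-up values, one missing

# Tag each individual with an identifier: here, its original position.
tagged = list(enumerate(raw_heights))

# Clean: drop missing values, then order by value.
cleaned = sorted(
    [(tag, value) for tag, value in tagged if value is not None],
    key=lambda pair: pair[1],
)

# Some basic statistics on the cleaned values.
values = [value for _, value in cleaned]
average = mean(values)
spread = stdev(values)  # sample standard deviation

# Thanks to the tags, the original order can be restored.
restored = sorted(cleaned, key=lambda pair: pair[0])

print(cleaned)
print(restored)
```

Without the tags, once the values are sorted there would be no way to tell which cleaned value belonged to which original individual.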
