## What does the term Combining data mean?

Using multiple data sets together to improve the accuracy of your results, or to expand the number of things you can compare. Also known as aggregating data. Combining data requires the data sets to have at least one point in common, for example a date range, so that their records can be matched up.
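As a rough illustration of matching two data sets on a shared point, the sketch below joins two small dictionaries on their date keys. The data, the variable names and the `combine_on_key` helper are all made up for this example; real projects often use a library or database join instead.

```python
# Two hypothetical data sets that share a common point: the date.
sales = {"2024-01-01": 120, "2024-01-02": 95, "2024-01-03": 130}
visitors = {"2024-01-02": 400, "2024-01-03": 520, "2024-01-04": 610}


def combine_on_key(left, right):
    """Return {key: (left_value, right_value)} for keys found in both data sets."""
    common_keys = sorted(left.keys() & right.keys())
    return {key: (left[key], right[key]) for key in common_keys}


combined = combine_on_key(sales, visitors)
print(combined)
# {'2024-01-02': (95, 400), '2024-01-03': (130, 520)}
```

Only the dates that appear in both data sets survive, which is why the common point matters: records without a match cannot be compared.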

## What does the term Correlation mean?

A technique used to show whether, and how strongly, variables are related. For example, taller people tend to be heavier than shorter people. Correlation describes a general tendency rather than a rule, so there will always be exceptions, such as a shorter person who is very overweight.
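One common way to put a number on a correlation is Pearson's correlation coefficient, which this glossary does not name but which serves as a reasonable illustration. The sketch below computes it by hand; the height and weight figures are invented for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / sqrt(spread_x * spread_y)


# Invented sample: heights in cm and weights in kg for five people.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 61, 66, 80, 85]

r = pearson(heights_cm, weights_kg)
print(round(r, 3))  # close to +1: taller people in this sample tend to be heavier
```

A result near +1 or -1 indicates a strong relationship, and a result near 0 indicates a weak or absent linear relationship.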

## What does the term Data cleaning/cleansing mean?

The removal or correction of invalid values in dirty data sets. This can be done manually or with automated processes, depending on the type of invalid data.
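A minimal sketch of one automated approach is shown below: a simple range check that separates plausible birth years from implausible ones. The function name, the sample values and the `earliest` and `latest` limits are assumptions chosen for illustration, not a standard rule.

```python
def split_valid_invalid(birth_years, earliest=1910, latest=2024):
    """Separate plausible birth years from invalid ones using a range check.

    The earliest/latest limits are illustrative assumptions, not a standard.
    """
    valid, invalid = [], []
    for year in birth_years:
        if isinstance(year, int) and earliest <= year <= latest:
            valid.append(year)
        else:
            invalid.append(year)
    return valid, invalid


# Hypothetical raw input: a likely typo, a missing value and an extra digit.
raw_years = [1987, 2002, 1902, None, 1995, 20002]

valid, invalid = split_valid_invalid(raw_years)
print(valid)    # [1987, 2002, 1995]
print(invalid)  # [1902, None, 20002]
```

Keeping the rejected values in a separate list, rather than discarding them silently, makes it possible to review them manually afterwards.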

## What does the term Dirty data mean?

A data set that contains values that are incorrect.

For example, a user has entered their date of birth incorrectly as 1902 instead of 2002.

Such values can skew the data set and make any analysis based on it unreliable.
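To see how much a single incorrect value can distort a result, the sketch below compares the average age of a small, invented group with and without the 1902 typo. The reference year of 2024 and the sample values are assumptions for the example.

```python
from statistics import mean, median

REFERENCE_YEAR = 2024  # assumed year used to turn birth years into ages

# Invented sample: one person typed 1902 instead of 2002.
dirty_birth_years = [2001, 2002, 2003, 1902, 2002]
clean_birth_years = [2001, 2002, 2003, 2002, 2002]

dirty_ages = [REFERENCE_YEAR - year for year in dirty_birth_years]
clean_ages = [REFERENCE_YEAR - year for year in clean_birth_years]

print(mean(dirty_ages), mean(clean_ages))      # 42 22
print(median(dirty_ages), median(clean_ages))  # 22 22
```

One wrong entry nearly doubles the mean age of this group, while the median is unaffected. How badly dirty data distorts an analysis therefore depends partly on which statistic is used.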

## What does the term Spearman rank correlation coefficient mean?

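A measure of the relationship between two variables that is based on the ranks of their values rather than the values themselves. Each variable is sorted from its smallest value to its largest, and every value is given a rank. The coefficient, usually written as ρ (rho), is then calculated from the two sets of ranks.

The coefficient ranges from -1 to +1. If one variable increases whilst the other also increases, the coefficient is positive; if one variable shrinks while the other grows, it is negative; and if the variables have no relationship at all, it is close to 0. The closer the coefficient is to -1 or +1, the stronger the relationship.

A p-value is a separate number, calculated alongside the coefficient, that is used to judge whether the relationship is statistically significant. P-values below 0.05 are conventionally considered significant, meaning the relationship is unlikely to be due to chance alone. A small p-value does not by itself mean the relationship is strong; the strength is given by the size of the coefficient.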
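The ranking idea can be sketched in a few lines: rank each variable, then compute an ordinary (Pearson) correlation on the ranks. The helper names below are hypothetical, and the sketch computes only the coefficient, not the p-value. In practice a statistics library is normally used for both; SciPy's `scipy.stats.spearmanr`, for instance, returns the coefficient together with a p-value.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from smallest (rank 1) to largest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1  # average of the 1-based positions
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


def spearman_rho(xs, ys):
    """Spearman coefficient: the Pearson correlation of the two sets of ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# y always grows as x grows, but not in a straight line.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]
rho = spearman_rho(xs, ys)
print(rho)  # 1.0
```

Because only the ordering of the values matters, the curved relationship above still gives a coefficient of exactly 1.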