Using multiple data sets to improve the accuracy of your results, or to expand the number of things you can compare. Also known as aggregating data. The data sets must have at least one point in common, for example a date range, so that their records can be matched.
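As a rough illustration of matching two data sets on a shared point, the sketch below joins two small sets of daily records on their date. The data, the variable names and the `combine_on_date` helper are all hypothetical, and keeping only the dates present in both sets is one possible choice, not the only one.

```python
# Hypothetical daily records from two separate sources, keyed by ISO date.
sales = {"2024-01-01": 120, "2024-01-02": 95, "2024-01-03": 130}
temperature_c = {"2024-01-02": 4.5, "2024-01-03": 6.0, "2024-01-04": 7.2}


def combine_on_date(left, right):
    """Return one (date, left_value, right_value) row per date found in both sets."""
    shared_dates = sorted(left.keys() & right.keys())
    return [(date, left[date], right[date]) for date in shared_dates]


combined = combine_on_date(sales, temperature_c)
for row in combined:
    print(row)
```

Dates that appear in only one data set are dropped here; depending on the analysis, it may be preferable to keep them and mark the missing value instead.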
The technique used to show if and how strongly variables are related. For example, taller people tend to be heavier than shorter people, but there will always be exceptions, such as shorter people who are very overweight.
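One common way to put a number on how strongly two variables are related is the Pearson correlation coefficient, which ranges from -1 to 1. The definition above does not name a specific measure, so treat this as one possible sketch. The height and weight figures are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical sample: taller people in this sample are also heavier.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 95]
r = pearson_r(heights_cm, weights_kg)
print(round(r, 3))
```

A value close to 1 means the two variables tend to rise together, a value close to -1 means one tends to fall as the other rises, and a value near 0 means there is little linear relationship.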
The removal of invalid values from dirty data sets. This can be done manually or with automated processes, depending on the type of invalid data.
A data set that contains values that are incorrect. For example, a user has entered their date of birth incorrectly as 1902 instead of 2002. Such values skew the data set and make any analysis invalid.
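A minimal sketch of removing invalid values automatically, using the date-of-birth example: records whose birth year is missing or falls outside a plausible range are set aside. The records, the `remove_invalid_birth_years` helper and the chosen range are assumptions made for illustration.

```python
def remove_invalid_birth_years(records, min_year, max_year):
    """Split records into those with a plausible birth year and those without."""
    clean, rejected = [], []
    for record in records:
        year = record.get("birth_year")
        if isinstance(year, int) and min_year <= year <= max_year:
            clean.append(record)
        else:
            rejected.append(record)
    return clean, rejected


# Hypothetical user records; 1902 is the mistyped value, None is a missing one.
users = [
    {"name": "A", "birth_year": 2002},
    {"name": "B", "birth_year": 1902},
    {"name": "C", "birth_year": None},
    {"name": "D", "birth_year": 1998},
]

clean, rejected = remove_invalid_birth_years(users, min_year=1920, max_year=2024)
print([u["name"] for u in clean])
print([u["name"] for u in rejected])
```

Keeping the rejected records, rather than discarding them silently, makes it possible to review them by hand, which matches the manual side of the process.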
The size of a country’s economy, calculated by working out how much it produces, including taxes and minus subsidies.
Using ranks to describe the relationship between data. Each set of values is sorted from smallest to largest and every value is given a rank; the ranks are then compared to produce a rank correlation coefficient (for example Spearman's ρ), which is used to determine whether there is a significant relationship between the data. Data that increases whilst any data related to it also increases is ranked as a[…]
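The ranking approach can be sketched as follows, assuming Spearman's method: each value is replaced by its rank (tied values share the average of their positions) and the Pearson formula is applied to the ranks. The function names and the sample data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from smallest (rank 1) to largest; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [2, 4, 8, 16, 32]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering matters, any pair of variables that always rise together gives the maximum coefficient, even when the relationship is not a straight line.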
The process of correcting any data that is in an incorrect unit when compared to the rest of your data. When aggregating data, some records may have different units of measurement even though they represent the same thing. For example, temperature measured in both Celsius and Fahrenheit.
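For the temperature example, a minimal sketch might convert every reading to Celsius before any analysis. The `(value, unit)` record layout and the function names are assumptions made for illustration.

```python
def fahrenheit_to_celsius(value_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (value_f - 32) * 5 / 9


def normalise_to_celsius(readings):
    """Return every (value, unit) reading as a Celsius value."""
    result = []
    for value, unit in readings:
        if unit == "C":
            result.append(float(value))
        elif unit == "F":
            result.append(fahrenheit_to_celsius(value))
        else:
            raise ValueError(f"unknown temperature unit: {unit!r}")
    return result


# Hypothetical aggregated readings with mixed units.
readings = [(20.0, "C"), (68.0, "F"), (212.0, "F")]
celsius = normalise_to_celsius(readings)
print(celsius)
```

Raising an error for an unrecognised unit, instead of guessing, prevents a wrongly labelled record from passing through unnoticed.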