Data Exploration

Characterization

Data exploration refers to the analysis of datasets by summarizing their main characteristics, often with visual methods. Data exploration is used throughout the data framework. Descriptive statistics play a principal role in data profiling, from identifying valid attribute values to checking for semantic consistency. Visual techniques such as boxplots support iteration between data cleaning and transformation during data preparation. Distributional characterizations of the data help identify needs and opportunities for data linkage.
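
As an illustration of the profiling steps described above, the following minimal sketch computes the five-number summary that a boxplot displays, flags values outside the usual 1.5 x IQR whiskers, and checks a categorical attribute against a set of valid values. The records, the field names (age, state), and the list of valid state codes are hypothetical and chosen only for illustration; a real profiling pass would depend on the dataset and its documentation.

```python
import statistics
from collections import Counter

# Hypothetical records; field names and values are illustrative only.
records = [
    {"age": 34, "state": "VA"},
    {"age": 29, "state": "MD"},
    {"age": 41, "state": "VA"},
    {"age": 38, "state": "XX"},   # not in the assumed set of valid codes
    {"age": 230, "state": "DC"},  # implausible age, likely a data-entry error
    {"age": 36, "state": "MD"},
    {"age": 31, "state": "VA"},
]

VALID_STATES = {"VA", "MD", "DC"}  # assumed domain of valid attribute values


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the quantities a boxplot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


ages = [r["age"] for r in records]
summary = five_number_summary(ages)
outliers = iqr_outliers(ages)

# Frequency table of the categorical attribute, then the values outside the domain.
state_counts = Counter(r["state"] for r in records)
invalid_states = sorted(s for s in state_counts if s not in VALID_STATES)

print("five-number summary of age:", summary)
print("age values outside the whiskers:", outliers)
print("state codes outside the valid set:", invalid_states)
```

Flagged values are candidates for review rather than automatic deletions: whether an outlying age is corrected, dropped, or kept is a cleaning decision, and the summary is typically recomputed after each such change.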