Good Quality Data

Good Data

Good data has been cleaned and curated to remove noise (the stuff we don’t need). We may also have performed some feature engineering, such as scaling features or reducing dimensionality.
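As a rough illustration of the two feature engineering steps mentioned above, here is a minimal sketch using NumPy only. The helper names `standardize` and `reduce_dimensions` and the random data are hypothetical, chosen for this example; dimensionality reduction is done here with PCA via SVD, which is only one of several possible techniques.

```python
import numpy as np

def standardize(X):
    """Scale each column to zero mean and unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    return (X - mean) / std

def reduce_dimensions(X, n_components):
    """Project centred data onto its top principal components (PCA via SVD)."""
    X_centred = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X_centred, full_matrices=False)
    return X_centred @ vt[:n_components].T

rng = np.random.default_rng(0)
# Hypothetical data: 100 rows, 5 features on very different scales.
X = rng.normal(size=(100, 5)) * np.array([1.0, 10.0, 100.0, 0.1, 5.0])

X_scaled = standardize(X)
X_reduced = reduce_dimensions(X_scaled, n_components=2)
print(X_scaled.shape, X_reduced.shape)
```

Scaling comes first here so that the feature with the largest raw values does not dominate the principal components.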

Data curation is a time-intensive activity.

Right Data

When the Data Scientist forms a hypothesis, we need enough relevant data to be able to build a predictive model.

The Problem Description is very important: this is where we define the hypothesis or the model that we plan to explore. The choice of algorithmic approach is equally important, for example Support Vector Machines or Logistic Regression.
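To make the link between a hypothesis and an algorithmic approach concrete, here is a small sketch of Logistic Regression written from scratch with NumPy. Everything in it is an assumption for illustration: the synthetic data, the hypothesis (that the outcome depends on the sum of two features), and the function names. In practice you would more likely reach for an established library than hand-roll the training loop.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression(X, y, learning_rate=0.1, n_steps=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(n_steps):
        predicted = sigmoid(X @ weights + bias)
        error = predicted - y  # gradient of the log loss w.r.t. the logits
        weights -= learning_rate * (X.T @ error) / n_samples
        bias -= learning_rate * error.mean()
    return weights, bias

def predict(X, weights, bias):
    """Return 1 where the predicted probability is at least 0.5, else 0."""
    return (sigmoid(X @ weights + bias) >= 0.5).astype(int)

rng = np.random.default_rng(42)
# Hypothetical data: the label is 1 when the two features sum to more than zero.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

weights, bias = fit_logistic_regression(X, y)
accuracy = (predict(X, weights, bias) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

The accuracy here is measured on the training data only, so it says nothing about how the model would generalise; a held-out test set would be the next step.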

Just-in-Time Data

We’re trying to predict the future here. We need the data as soon as possible.

Historical data is important, and great things can come from it. But learning that changing a single feature would have led to a better decision about an event that has already taken place is not as good as knowing what action to take before the event happens.

We need a platform that brings data together so that we have it ready when we need it!