Most data innovation projects fail because organizations ignore basic steps that should drive the project. After doing several data science projects in the industry, I have learnt that the basics are as important as the science behind the innovation. The following steps will help any business initiate a successful data analytics project.
CONDUCT A DATA AUDIT
Having good data is a way of creating accountability in an organization. Accountable employees exercise due diligence in capturing and storing the data generated by the day-to-day operations of the business. Often, though, organizations find themselves in a fix because their existing data is not usable. Poor data quality leads to wrong insights and, ultimately, wrong decisions.
With poor quality data, data projects are bound to be valueless.
Nothing is as damaging as making wrong decisions because of wrong data. Such decisions can affect everything from business focus and brand representation to product development.
Data audits should ultimately ensure that all data required to make decisions is available in an analyzable format.
To conduct a data audit:
- Establish overall business or departmental objectives
- Establish and gain access to all existing data sources
- Establish departmental business processes
- Retrace the business process in the data process
- Evaluate data storage formats
- Evaluate the data to identify missing, erratic, redundant, or duplicate values, poor design, etc.
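The last audit step can be partly automated. Here is a minimal sketch in Python, using only the standard library, that counts missing values per field and exact duplicate records. The field names (`customer_id`, `amount`, `region`), the sample rows, and the function name `audit_records` are hypothetical and serve only as illustration.

```python
from collections import Counter


def audit_records(records, fields):
    """Count missing values per field and exact duplicate rows.

    `records` is a list of dicts; `fields` lists the keys every record
    should have. A value counts as missing if it is absent, None, or a
    blank string.
    """
    missing = {field: 0 for field in fields}
    for record in records:
        for field in fields:
            value = record.get(field)
            if value is None or (isinstance(value, str) and not value.strip()):
                missing[field] += 1

    # Two records are duplicates when all audited fields match exactly.
    row_counts = Counter(tuple(record.get(f) for f in fields) for record in records)
    duplicates = sum(count - 1 for count in row_counts.values() if count > 1)

    return {"rows": len(records), "missing": missing, "duplicates": duplicates}


# Hypothetical sample data, for illustration only.
sample = [
    {"customer_id": 1, "amount": 250.0, "region": "Nairobi"},
    {"customer_id": 2, "amount": None, "region": "Mombasa"},
    {"customer_id": 1, "amount": 250.0, "region": "Nairobi"},
    {"customer_id": 3, "amount": 120.0, "region": ""},
]
report = audit_records(sample, ["customer_id", "amount", "region"])
print(report)
```

A report like this only covers missing and duplicated values. Erratic values and poor design still need a person who knows the business process to review them.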
Takeaways from a data audit may necessitate one or more of the following:
- Collection of more data
- Restructuring of the database
- Cleaning the existing database to make it analyzable
- Data integration to combine data from disparate sources into meaningful and valuable information
DEVELOP DESCRIPTIVE ANALYSIS OF DATA (DATA VISUALIZATION)
Also referred to as summary statistics, descriptive statistics are used to describe the basic features of the data. They provide simple summaries about the data. Together with simple graphical analysis, they form the basis of virtually every quantitative analysis of data.
Descriptive analysis will help you construct a mental picture of your business, so that you can make the right decisions about people and events.
Descriptive analysis uncovers interesting patterns in the database. The key statistics to consider are measures of dispersion and measures of central tendency. Dispersion measures bring an interesting outlook: rather than showing how data points are similar, they show how they differ (their variation or spread). The best-known measures of dispersion are range, variance, and standard deviation. From metrics such as the change in variance over time, the range, and the standard deviation, you can derive indicators to track and sustain productivity.
Measures of central tendency, on the other hand, describe the central point of a data set; they include the mean, median, and mode. The mode can be used to identify the most frequent elements, such as customers, payments, or times. Together with measures of dispersion, measures of central tendency are important for identifying outliers: extreme data values that differ notably from the other data points.
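The measures described above can be computed with Python's standard `statistics` module. This is a minimal sketch: the `payments` values are invented, the helper names `describe` and `iqr_outliers` are hypothetical, and the interquartile-range rule is only one common convention for flagging outliers, not the only one.

```python
import statistics


def describe(values):
    """Return basic measures of central tendency and dispersion."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance
        "stdev": statistics.stdev(values),        # sample standard deviation
    }


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical daily payment amounts, for illustration only.
payments = [120, 130, 125, 130, 128, 135, 130, 900]
summary = describe(payments)
outliers = iqr_outliers(payments)
print(summary)
print(outliers)
```

In this sample the single payment of 900 pulls the mean up to 224.75 while the median and mode both stay at 130. That gap between mean and median is often the first hint that an outlier is present.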
The output of this stage is a data visualization system that can answer the daily operational questions of a business. Several commercial and open source tools can be used to implement data visualizations, such as Tableau, Power BI, and JavaScript libraries. To find out more on how to do this, reach out to Nakala Analytics.
IDENTIFY INDICATORS AND VARIABLES THAT MATTER TO YOUR BUSINESS
Descriptive analytics generates a lot of insights. It delivers an overwhelming amount of information that can guide a business in understanding patterns. Many of the insights are fascinating, but after running a data business for months, I have met people who would like to streamline their performance indicators.
Fancy and obvious indicators lose their appeal over time because the information is not used to identify specific instances that call for action.
And why should it? Once historical performance is clearly understood from descriptive statistics, the conversion rates in a business process, customer/user exit points, and general trends and behavior can form a baseline of what needs to be measured to monitor performance effectively. Indicators, also called KPIs, will guide your business in assessing whether you are meeting goals or not.
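Conversion rates and exit points can be read straight off stage counts in a business process. The sketch below assumes a simple three-stage funnel; the stage names, the counts, and the function name `funnel_conversion` are hypothetical.

```python
def funnel_conversion(stage_counts):
    """Compute step-to-step conversion and drop-off for an ordered funnel.

    `stage_counts` is an ordered list of (stage_name, user_count) pairs.
    """
    results = []
    for (prev_name, prev_count), (name, count) in zip(stage_counts, stage_counts[1:]):
        # Guard against an empty stage to avoid dividing by zero.
        rate = count / prev_count if prev_count else 0.0
        results.append({
            "from": prev_name,
            "to": name,
            "conversion_rate": rate,
            "drop_off": prev_count - count,
        })
    return results


# Hypothetical counts, for illustration only.
funnel = [("visited", 1000), ("added_to_cart", 200), ("checked_out", 50)]
steps = funnel_conversion(funnel)

# The step that loses the most users is the main exit point.
biggest_exit = max(steps, key=lambda step: step["drop_off"])
print(steps)
print(biggest_exit["from"], "->", biggest_exit["to"])
```

Tracked over time, the conversion rate of each step is a candidate KPI, and the step with the largest drop-off is a natural place to start investigating.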
Well-defined indicators can be tested. "Simulation models can be used to prototype a physical model so as to predict its performance in the real world," says Prof. Ddembe Willeese Williams.
Simulation models will help you choose critical success factors. Critical success factors provide a list of activities that an entity should focus on to be successful. SMART success factors facilitate the meeting of business goals and objectives within set time frames.
The final stage of data understanding should result in variable identification. There are many variables, yes, but which ones are important? Several machine learning models exist that can help you bypass the curse of dimensionality.
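As a first pass before reaching for a machine learning model, candidate variables can be ranked by how strongly each one correlates with the outcome. Note that this correlation filter is a much cruder technique than the models mentioned above: it only captures linear, one-variable-at-a-time relationships. The column names, the values, and the function names `pearson` and `rank_variables` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_variables(columns, target):
    """Rank candidate variables by absolute correlation with the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical candidate variables and outcome, for illustration only.
columns = {
    "visits": [1, 2, 3, 4, 5],
    "discount": [5, 3, 4, 1, 2],
    "store_id": [7, 7, 7, 7, 7],  # constant, so it carries no information
}
sales = [10, 20, 30, 40, 50]
ranking = rank_variables(columns, sales)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Variables near the bottom of such a ranking are candidates for removal, though a low linear correlation alone does not prove a variable is unimportant.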
COLLECT MORE DATA IF NECESSARY
In the statistical world, missing data is a common phenomenon, though one that can often be avoided. Even in well-designed and controlled studies, missing data can reduce statistical power and produce biased estimates, leading to invalid conclusions. While this holds for ad hoc research, businesses are also dynamic as a result of both internal and external factors, which means changes are bound to happen. Collecting more data helps you understand your market so that you can offer more competitive and effective products and services.
Example of a question that may necessitate collection of more data: Does tribe matter when marketing household products to Kenyan communities?
DEVELOP & IMPROVE PREDICTIVE MODELS
The final and most valuable stage in executing a data analytics project is the modelling stage. This stage depends entirely on having good indicators, quality data, and well-mapped success factors. Success factors in turn become variables. Predictive models assign a probability score to an outcome using a set of predictor variables.
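To make "assigning a probability score" concrete, here is a minimal sketch of one such model, logistic regression, fitted by plain gradient descent in standard-library Python. It is an illustration under stated assumptions rather than a production approach: the two predictor variables (monthly visits and complaints), the renewal labels, the learning rate, and the epoch count are all hypothetical, and a real project would normally use an established library and hold out data for validation.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(x, weights, bias):
    """Probability score for one record of predictor values."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_probability(x, weights, bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        # Step against the average gradient of the log loss.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Hypothetical training data: [visits_last_month, complaints] per customer,
# with label 1 if the customer renewed and 0 otherwise.
rows = [[1, 3], [2, 2], [2, 3], [6, 0], [7, 1], [8, 0]]
labels = [0, 0, 0, 1, 1, 1]
weights, bias = train_logistic(rows, labels)

p_engaged = predict_probability([8, 0], weights, bias)
p_unhappy = predict_probability([1, 3], weights, bias)
print(f"engaged customer: {p_engaged:.2f}, unhappy customer: {p_unhappy:.2f}")
```

The score is only as good as the predictor variables fed into it, which is why this stage depends on the indicators and data quality established in the earlier steps.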
At this point, data analytics / data science will make more sense to your organization.
Written by Enock Keya, Chief Data Analytics Consultant