Modeling always makes use of historical “predictive” data – the data on which predictions are based – and “outcome” data – the data representing the result or event you are attempting to predict. Both of these must be present in sufficient quantities for analytic efforts to move forward.
Data sets are often missing observations. In some circumstances, “missing” might carry a meaning all its own; in others, it may not. The analyst must determine whether a missing observation has importance and, if so, how it will be treated in predictive work.
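One way to preserve whatever meaning “missing” might carry is to record missingness as its own feature before filling in the value. The sketch below is illustrative (the field names and records are invented), showing a flag-then-impute approach so a downstream model can learn whether missingness itself is predictive:

```python
from statistics import median

def add_missing_flag(records, field):
    """Add a 0/1 flag for missingness, then impute the median.

    Keeping the flag preserves any signal carried by the fact that
    the value was missing; imputing alone would erase that signal.
    """
    present = [r[field] for r in records if r[field] is not None]
    med = median(present)
    for r in records:
        r[field + "_missing"] = 1 if r[field] is None else 0
        if r[field] is None:
            r[field] = med
    return records

# Illustrative records: income is missing for one applicant.
applicants = [
    {"id": 1, "income": 52000},
    {"id": 2, "income": None},
    {"id": 3, "income": 61000},
]
add_missing_flag(applicants, "income")
```

Whether to impute, flag, or drop is the judgment call the text describes; the point of the flag column is simply that the decision is recorded rather than lost.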
Data obtained from multiple sources must often be linked, but if the “key” that links the sources is ambiguous, the linkage may fail. Sometimes the sources do not line up “one to one.” The collection time frames for the various data sources may not be consistent. These types of circumstances call for understanding and judgment calls about what is the appropriate course for the modeling effort.
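Before committing to a merge, it helps to profile how the two sources actually line up on the shared key. This is a minimal sketch with invented identifiers; it counts matched keys, keys present in only one source, and keys that link “one to many”:

```python
def diagnose_linkage(left_keys, right_keys):
    """Summarize how two sources line up on a shared key before merging.

    Returns counts of keys matched, keys found only on one side, and
    keys that appear more than once on the right (one-to-many links).
    """
    left, right = set(left_keys), set(right_keys)
    dupes = {k for k in right if right_keys.count(k) > 1}
    return {
        "matched": len(left & right),
        "left_only": len(left - right),
        "right_only": len(right - left),
        "one_to_many": len(dupes & left),
    }

# Illustrative keys: customer IDs from a CRM and a billing system.
crm_ids = ["A1", "A2", "A3", "A4"]
billing_ids = ["A2", "A3", "A3", "B9"]  # A3 billed twice; B9 unknown to CRM
report = diagnose_linkage(crm_ids, billing_ids)
```

A report like this makes the judgment calls concrete: unmatched keys may reflect differing collection time frames, and one-to-many links force a decision about aggregation before modeling proceeds.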
Analysts can become hamstrung, and efforts slowed significantly, when the right development data is not available. And “right” means not only data with potential predictive value, but also data transformed appropriately for use in your analytic environment.
It can be tempting to simply erase the outliers. But any good analyst will dig into those outliers to find out why they are there before deciding how to treat them.
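A common first step in that investigation is to surface outliers mechanically and then trace each one back to its source. The sketch below uses Tukey's fences (quartiles plus or minus 1.5 times the interquartile range), one of several standard choices; the claim amounts are invented:

```python
from statistics import quantiles

def flag_outliers(values, k=1.5):
    """Flag values outside Tukey's fences (Q1 - k*IQR, Q3 + k*IQR).

    The goal is to surface outliers for investigation, not to delete
    them: each flagged value should be traced back to its source
    before any treatment is chosen.
    """
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

# Illustrative claim amounts; 9800 stands far outside the rest.
claims = [120, 135, 110, 150, 140, 9800, 125, 130]
suspects = flag_outliers(claims)
```

Whether a flagged value is a data-entry error, a legitimate extreme, or a sign of a different underlying process is exactly the question the analyst must answer before modeling.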
Especially for the smaller organization, a common problem is knowing where to start. If you believe you could leverage predictive analytics but are not, or are doing so only at a crude level, you face a few questions about where and how to begin.
In some industries and settings where the technology is still being adopted, initial efforts in predictive analytics can yield big rewards. Educational institutions and fundraising organizations, for example, are areas where the use of predictive analytics is relatively new. But in industries where the competitive environment is more advanced, a much larger investment is needed to enter, because analytics represents a formidable competitive moat.
A company trying PA for the first time might hesitate, thinking it is not big enough. But in reality, what matters is taking the step, not the size of the step.
Analytic talent tends to be scarce. Advanced education focused on the field is relatively new, and the base of experienced professionals is low relative to demand. Until the last decade there were no advanced degree programs in predictive analytics. Earlier practitioners often came from statistics, operations research and related areas of applied mathematics. Fresh graduates from these disciplines could be recruited and then schooled and molded in the particulars of predictive analytic efforts “on the job.” Over the course of years they would also develop expertise in the domain of their practice – for example, banking, financial services or health care.
All that is changing. Today, there are 49 master’s degree programs in predictive analytics in the U.S. The oldest of these programs began in 2007, and 19 of them graduated their first class in 2014. This growth is evidence of the heightened demand for the specialized skills necessary to drive PA success. These degree programs allow companies to identify graduates who already have education in the particulars of predictive analytics.
Experienced practitioners in highly developed industries (such as banking and health care) are also a rare commodity. Managing the complex interactions of an analytic effort often requires experienced talent. So finding the right individuals with the combination of analytic background and business acumen, in the face of growing demand for these skills, is a big challenge.
You’ve acquired and assembled the data and talent necessary to build models, and the projected results are stellar. At this point, though, your work may be far from done.
In most cases, the models need to be programmed and tested in the environment where they will have an impact on business decisions. Treatments need to be assigned to the various model ranges. Perhaps some new treatments will affect customer service approaches and projected inbound call volumes. If you are in a regulated industry, all of these changes may be subject to regulatory scrutiny. Finally, the performance of the models needs to be tracked to ensure they are accurately predicting the intended outcomes in the real world, and adjustments made if and when performance slips.

Companies can underestimate the effort required to implement predictive models in the setting that delivers decisions and business results. And many analytic projects fail, or result in wasted effort, because they do not anticipate the implementation hurdles.
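The tracking step above can be sketched very simply. Assuming (illustratively) that the model predicts a monthly outcome rate, one can compare each period's actual rate against the predicted rate and flag periods where the gap exceeds a tolerance; the threshold, names, and figures below are invented:

```python
def track_model(periods, tolerance=0.02):
    """Flag periods where the actual outcome rate drifts from the
    model's predicted rate by more than `tolerance`.

    `periods` maps a period label to (predicted_rate, actual_rate).
    Flagged periods signal that the model may need recalibration.
    """
    return [
        label
        for label, (predicted, actual) in periods.items()
        if abs(actual - predicted) > tolerance
    ]

# Illustrative monthly default rates: predicted vs. observed.
history = {
    "2024-01": (0.050, 0.048),
    "2024-02": (0.050, 0.055),
    "2024-03": (0.050, 0.081),  # actual drifted well above prediction
}
alerts = track_model(history)
```

Production monitoring is usually richer than this (population stability, segment-level checks), but even a comparison this simple catches the case where a model keeps scoring while the world it was trained on has moved.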