Month: May 2016
When building predictive models, a data scientist can spend a large amount of time examining the models produced, how they work, and how they perform on hold-out sample data sets. They do this to understand whether the model gives a good general representation of the data and can identify and predict many different scenarios. When the “best” model has been selected, it is typically deployed in some sort of reporting environment, where a list is produced. This is the typical deployment method, but it is far from ideal. A better approach is to build the predictive models into the everyday applications that the company uses. For example, a model can be built into the call centre application, so that the staff get live, real-time feedback and predictions as they are talking to the customer.
But what kind of live, real-time feedback and predictions are possible? Traditionally, these applications return a predicted outcome (will they be a good customer or a bad customer?) or some indication of the customer's value (for example, lifetime value or possible claim payout value).
But can we get any more information, such as the reason for the prediction? This is sometimes called prediction insight. Can we get some details of what the predictive model used to decide on the predicted value? In most predictive analytics products this is not possible, as all you are told is the final outcome.
What would be useful is to know some of the reasoning that the predictive model used to make its prediction. The reasons why one customer may be a “bad customer” might be different from the reasons for another customer. Knowing this kind of information can be very useful to the staff who are dealing with the customers. Those who design the workflows can then build more advanced workflows to support the staff when dealing with the customers.
Oracle has a unique feature that allows us to see some of the details that the predictive model used to make the prediction. This function (available when you use the Oracle Advanced Analytics option and Oracle Data Mining to build your predictive model) is called PREDICTION_DETAILS.
When you use PREDICTION_DETAILS you need to be careful, as it works differently in the 11g Release 2 (11.2) and 12c versions of Oracle Database (Enterprise Edition). In Oracle Database 11.2 the PREDICTION_DETAILS function only works for Decision Tree models. In 12c (and above) it has been extended to return details for models created using all the classification algorithms, all the regression algorithms, and anomaly detection.
The following gives an example of using the PREDICTION_DETAILS function.
select cust_id,
       prediction(clas_svm_1_27 using *) pred_value,
       prediction_probability(clas_svm_1_27 using *) pred_prob,
       prediction_details(clas_svm_1_27 using *) pred_details
from   mining_data_apply_v;
The PREDICTION_DETAILS function produces its output in XML. This XML lists the attributes, and their values, that determined why a record had the predicted value. The following gives some examples of the XML produced for some of the records.
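To give a feel for how an application might consume this XML, here is a minimal Python sketch that parses the details for two customers and lists the top-ranked attributes behind each prediction. The XML strings, customer IDs, attribute names, and the `top_reasons` helper are all illustrative assumptions; they only mimic the general shape of the output (a `Details` element containing `Attribute` elements with `name`, `actualValue`, `weight` and `rank`), so check them against what your own model returns.

```python
import xml.etree.ElementTree as ET

# Hypothetical PREDICTION_DETAILS output for two customers.
# The values are made up for illustration; they are not real model output.
SAMPLE_DETAILS = {
    101001: (
        '<Details algorithm="Support Vector Machines" class="1">'
        '<Attribute name="HOUSEHOLD_SIZE" actualValue="3" weight=".152" rank="2"/>'
        '<Attribute name="EDUCATION" actualValue="Bach." weight=".211" rank="1"/>'
        '<Attribute name="YRS_RESIDENCE" actualValue="4" weight=".097" rank="3"/>'
        '</Details>'
    ),
    101002: (
        '<Details algorithm="Support Vector Machines" class="0">'
        '<Attribute name="AGE" actualValue="19" weight=".340" rank="1"/>'
        '<Attribute name="OCCUPATION" actualValue="Other" weight=".120" rank="2"/>'
        '</Details>'
    ),
}


def top_reasons(details_xml, limit=3):
    """Return up to `limit` attributes from the details XML, ordered by rank."""
    root = ET.fromstring(details_xml)
    reasons = []
    for attr in root.findall("Attribute"):
        reasons.append(
            {
                "name": attr.get("name"),
                "value": attr.get("actualValue"),
                "weight": float(attr.get("weight")),
                "rank": int(attr.get("rank")),
            }
        )
    # Sort explicitly by rank rather than relying on the element order.
    reasons.sort(key=lambda r: r["rank"])
    return reasons[:limit]


if __name__ == "__main__":
    for cust_id, xml_text in SAMPLE_DETAILS.items():
        predicted_class = ET.fromstring(xml_text).get("class")
        print(f"Customer {cust_id}, predicted class {predicted_class}")
        for reason in top_reasons(xml_text, limit=2):
            print(f"  {reason['rank']}. {reason['name']} = {reason['value']} "
                  f"(weight {reason['weight']})")
```

In this made-up data the two customers are predicted for different reasons, which is exactly the kind of information a call centre screen or workflow could surface to staff.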
I’ve used this particular function in lots of my projects, particularly when building applications for a particular business unit. Oracle too has built this functionality into many of its applications. The images below are from the HCM application, where you can examine the details of why an employee may or may not leave/churn. You can then perform real-time what-if analysis by changing some of the attribute values to see if the predicted outcome changes.
Oracle Data Visualization comes with all the typical features you have with Visual Analyzer that is part of BICS, DVCS and OBIEE.
One additional step you may want to take is to install the R language for Oracle Data Visualization Desktop. This is required to enable the Advanced Analytics features of the tool.
After installing Data Visualization Desktop, when you open the Advanced Analytics section and try to add one of the Advanced Analytics graphing options, you will get an error message, as shown below.
In Windows, click on the Start button, then go to Programs and then Oracle. There you will see a menu item called install Advanced Analytics, which installs Oracle R Distribution on your machine.
When you click on this menu option, a new command line window will open and proceed with the installation of Oracle R Distribution (in this case version 3.1.1, which is not the current version of Oracle R Distribution).
By accepting the defaults and clicking next, Oracle R Distribution will be installed. The following images will step you through the installation.
The final part of the installation is to download and install a large number of supporting R packages.
When these supporting R packages have been installed, you can use the Advanced Analytics features of Oracle Data Visualization Desktop.
If you had the tool open during this installation, you will need to close it and restart it.