Oracle Technology Network
At 5:20pm today (Friday 2nd December), I received an email from the Oracle ACE program. I had been nominated for the award of Oracle ACE.
“You have been chosen based on your significant contribution and activity in the Oracle technical community. Like your fellow Oracle ACEs, you have demonstrated a proficiency in Oracle technology as well as a willingness to share your knowledge and experiences with the community.”
I am so honoured, considering the experts from around the world that are members of the Oracle ACE program.
The Oracle ACE Award is issued by the Oracle Corporation and the award is made to people who are know for their strong credentials in the Oracle community as enthusiasts, advocates and technical knowledge.
In a previous blog post I talked about how you can take a copy of a workflow developed in Oracle Data Miner, and load it into a new schema.
When you data mining project gets to a mature stage and you need to productionalise the data mining process and model updates, you will need to use a different set of tools.
As you gather more and more data and cases, you will be updating/refreshing your models to reflect this new data. The new update data mining model needs to be moved from the development/test environment to the production environment. As with all things in IT we would like to automate this updating of the model in production.
There are a number of database features and packages that we can use to automate the update and it involves the setting up of some scripts on the development/test database and also on the production database.
These steps include:
- Creation of a directory on the development/test database
- Exporting of the updated Data Mining model
- Copying of the exported Data Mining model to the production server
- Removing the existing Data Mining model from production
- Importing of the new Data Mining model.
- Rename the imported mode to the standard name
The DBMS_DATA_MINING PL/SQL package has 2 functions that allow us to export a model and to import a model. These functions are an API to the Oracle Data Pump. The function to export a model is DBMS_DATA_MINING.EXPORT_MODEL and the function to import a model is DBMS_DATA_MINING.IMPORT_MODEL.The parameters to these function are what you would expect use if you were to use Data Pump directly, but have been tailored for the data mining models.
Lets start with listing the models that we have in our development/test schema:
SQL> connect dmuser2/dmuser2
SQL> SELECT model_name FROM user_mining_models;
Create/define the directory on the server where the models will be exported to.
CREATE OR REPLACE DIRECTORY DataMiningDir_Exports AS ‘c:\app\Data_Mining_Exports’;
The schema you are using will need to have the CREATE ANY DIRECTORY privilege.
Now we can export our mode. In this example we are going to export the Decision Tree model (CLAS_DT_1_6)
The function has the following structure
filename IN VARCHAR2,
directory IN VARCHAR2,
model_filter IN VARCHAR2 DEFAULT NULL,
filesize IN VARCHAR2 DEFAULT NULL,
operation IN VARCHAR2 DEFAULT NULL,
remote_link IN VARCHAR2 DEFAULT NULL,
jobname IN VARCHAR2 DEFAULT NULL);
If we wanted to export all the models into a file called Exported_DM_Models, we would run:
If we just wanted to export our Decision Tree model to file Exported_CLASS_DT_Model, we would run:
DBMS_DATA_MINING.EXPORT_MODEL(‘Exported_CLASS_DT_Model’, ‘DataMiningDir’, ‘name in (”CLAS_DT_1_6”)’);
Before you can load the new update data mining model into your production database we need to drop the existing model. Before we do this we need to ensure that this is done when the model is not in use, so it would be advisable to schedule the dropping of the model during a quiet time, like before or after the nightly backups/processes.
Warning : When importing the data mining model, you need to import into a tablespace that has the same name as the tablespace in the development/test database. If the USERS tablespace is used in the development/test database, then the model will be imported into the USERS tablespace in the production database.
Hint : Create a DATAMINING tablespace in your development/test and production databases. This tablespace can be used solely for data mining purposes.
To import the decision tree model we exported previously, we would run
DBMS_DATA_MINING.IMPORT_MODEL(‘Exported_CLASS_DT_Model’, ‘DataMiningDir’, ‘name=’CLAS_DT_1_6”’, ‘IMPORT’, null, null, ‘dmuser2:dmuser3’);
We now have the new updated data mining model loaded into the production database.
The final step before we can start using the new updated model in our production database is to rename the imported model to the standard name that is being used in the production database.
Scheduling of these steps
We can wrap most of this up into stored procedures and have schedule it to run on a semi-regular bases, using the DBMS_JOB function. The following example schedules a procedure that controls the importing, dropping and renaming of the models.
DBMS_JOB.SUBMIT(jobnum.nextval, ‘import_new_data_mining_model’, trunc(sysdate), add_month(trunc(sysdate)+1);
This schedules the the running of the procedure to import the new data mining models, to run immediately and then to run every month.
The Oracle 11.2 database contains 3 PL/SQL packages that allow you to perform all (well almost all) of your data mining functions.
So instead of using the Oracle Data Miner tool you can write some PL/SQL code that will you to do the same things.
Before you can start using these PL/SQL packages you need to ensure that the schema that you are going to use has been setup with the following:
- Create a schema or use and existing one
- Grant the schema all the data mining privileges: see my earlier posting on how to setup an Oracle schema for data mining – Click here and YouTube video
- Grant all necessary privileges to the data that you will be using for data mining
The first PL/SQL package that you will use is the DBMS_DATA_MINING_TRANSFORM. This PL/SQL package allows you to transform the data to make it suitable for data mining. There are a number of functions in this package that allows you to transform the data, but depending on the data you may need to write your own code to perform the transformations. When you apply your data model to the test or the apply data sets, ODM will automatically take the transformation functions defined using this package and apply them to the new data sets.
The second PL/SQL package is DBMS_DATA_MINING. This is the main data mining PL/SQL package. It contains functions to allow you to:
- To create a Model
- Describe the Model
- Exploring and importing of Models
- Computing costs and text metrics for classification Models
- Applying the Model to new data
- Administration of Models, like dropping, renaming, etc
The next (and last) PL/SQL package is DBMS_PREDICTIVE_ANALYTICS.The routines included in this package allows you to prepare data, build a model, score a model and return results of model scoring. The routines include EXPLAIN which ranks attributes in order of influence in explaining a target column. PREDICT which predicts the value of a target attribute based on values in the input. PROFILE which generates rules that describe the cases from the input data.
Over the coming weeks I will have separate blog posts on each of these PL/SQL packages. These will cover the functions that are part of each packages and will include some examples of using the package and functions.
The PL/SQL API interface for Oracle Data Miner has had a number of new features. These are listed below along with the new API features added with the 11.1 release.
- Support for Native Transactional Data with Association Rules: you can build association rule models without first transforming the transactional data.
- SVM class weights specified with CLAS_WEIGHTS_TABLE_NAME: including the GLM class weights
- FORCE argument to DROP_MODEL: you can now force a drop model operation even if a serious system error has interrupted the model build process
- GET_MODEL_DETAILS_SVM has a new REVERSE_COEF parameter: you can obtain the transformed attribute coefficients used internally by an SVM model by setting the new REVERSE_COEF parameter to 1
11.1g API New Features
- Mining Model schema objects: previous releases, DM models were implemented as a collection of tables and metadata within the DMSYS schema. in 11.1 models are implemented as data dictionary objects in the SYS schema. A new set of DD views present DM models and their properties
- Automatic and Embedded Data Preparation: previously data preparation was the responsibility of the user. Now it can be automated
- Scoping of Nested Data: supports nested data types for both categorical and numerical data. Most algorithms require multi-record case data to the presented as columns of nested rows, each containing an attribute name/value pair. ODM processes each nested row as a separate attribute.
- Standardised Handling of Sparse Data & Missing Values: standardised across all algorithms.
- Generalised Linear Models: has a new algorithm and supports classification (logistic regression) and regression (linear regression)
- New SQL Data Mining Function: PREDICTION_BOUNDS has been introduced for Generalised Linear Models. This returns the confidence bounds on predicted values (regression models) or predicted probabilities (classification)
- Enhanced Support for Cost-Sensitive Decision Making: can be added or removed using DATA_MINING.ADD_COST_MATRIX and DBMS_DATA_MINING_REMOVE_COST_MATRIX.
The new/updated SQL Developer 3.1 Early Adopter has just been released.
For the Data Miner, there are no major changes and it appears that there has been some bug fixes and some minor enhancements to so parts.
The main ODM features, apart from bug fixes, in this release include:
- Globalization support, including translated error messages and GUI for all languages supported by SQL Developer
- Improved accessibility features including the addition of a Structure navigator that lists all the nodes and links displayed in a workflow
Bug / Feature
After unzipping the download I opened SQL Developer. With each new release you will have to upgrade the existing ODM repository. The easiest way of doing this is to open the ODM connections pane and double click on one of your ODM schemas. SQL Developer will then run the necessary scripts to upgrade the repository.
I discovered a bug/feature with SQL Developer 3.1 EA1 upgrade script. The repository upgrade does not complete and an error is report.
I logged this error on the ODM forum on OTN. Mark Kelly who is the Development Manager for ODM and monitors the ODM forum, and his team, were quickly onto investigating the error. Mark has posted an update on the ODM form and give a script that needs to be run before you upgrade your existing repository.
You can download the pre-upgrade script from here.
If you don’t have an existing repository then you don’t have to run the script.
Check out the message on the ODM forum.
How to Upgrade SQL Developer & ODM
You will have to download the new SQL Developer 3.1 EA install files.
- Unzip this into your SQL Developer directory
- Create a shortcut for sqldeveloper.exe on your desktop and relabel it SQL Developer 3.1 EA
- Double-click this short cut
- You should be presented with the above window. Select the Yes button to migrate you previous install settings
- SQL Developer should now open and contains all your previous connections
If you have an existing ODM repository, you need to run the pre-upgrade script (see above) at this point
- You will now have to upgrade the ODM repository in the database. The simplest way of doing this is to allow SQL Developer to run the necessary scripts.
- From the View Menu, select Oracle Data Miner –> Connections
- In the ODM Connections pane double click one of your ODM schemas. Enter the username and password and click OK
- You will then be prompted to migrate/update the ODM repository to the new version. Click Yes.
- Enter the SYS username and Password
- Click Start button, to start the migrate/upgrade scripts
- On my laptop this migrate/upgrade step took less than 1 minute
- The upgrade is now finished and you can start using ODM.
ODM – SQL Developer 3.1 EA – Release Notes
The ODM release notes can be found at
Over the coming months (Q4 2011) there are a number of Oracle related events being run in Ireland. The schedule for these is below with the relevant links to the agenda webpages or to where you can book your place.
The OUG BI SIG meetings you can book your place with the UKOUG.
Venue Address – Dublin:
Oracle Block H, East Point Business Park, Dublin 3
Venue Address – Belfast:
The Mount Conference Center, 2 Woodstock Link, Belfast BT6 8DD
For questions about logistics please contact the marketing team on firstname.lastname@example.org
If you have any question about the content please contact: email@example.com
If you know of any other events that are not listed, let me know and I’ll update the list
Oracle Open World has a huge number of sessions commencing on Sunday and run until Thursday. To help attendees and non-attendees work out what sessions are available you can work your way through the schedule builder.
This can be a bit difficult to find the sessions that you might be interested in. So this year they have produced a set of Focus On documents that contain all the session related to particular areas.
The following are the available Focus On areas and documents:
Let me know if I have missed any Focus On documents and I will update the list.
Oh and don’t forget the Oracle Data Miner sessions.
If you are not able to attend OOW, you can check out the OOW Live channel on YouTube to watch the keynote and main session