Oracle 12.2]
ODM Model View Details Views in Oracle 12.2
A new feature for Oracle Data Mining in Oracle 12.2 is the new Model Details views.
In Oracle 11.2.0.3 and up to Oracle 12.1 you needed to use a range of PL/SQL functions (in DBMS_DATA_MINING package) to inspect the details of a data mining/machine learning model using SQL.
Check out these previous blog posts for some examples of how to use and extract model details in Oracle 12.1 and earlier versions of the database
Association Rules in ODM-Part 3
Extracting the rules from an ODM Decision Tree model
Instead of these functions there are now a lot of DB views available to inspect the details of a model. The following table summarises these various DB Views. Check out the DB views I’ve listed after the table, as these views might some some of the ones you might end up using most often.
I’ve now chance of remembering all of these and this table is a quick reference for me to find the DB views I need to use. The naming method used is very confusing but I’m sure in time I’ll get the hang of them.
NOTE: For the DB Views I’ve listed in the following table, you will need to append the name of the ODM model to the view prefix that is listed in the table.
table, th, td {
border: 1px solid black;
border-collapse: collapse;
text-align: left;
}
Data Mining Type | Algorithm & Model Details | 12.2 DB View | Description |
---|---|---|---|
Association | Association Rules | DM$VR | generated rules for Association Rules |
Frequent Itemsets | DM$VI | describes the frequent itemsets | |
Transaction Itemsets | DM$VT | describes the transactional itemsets view | |
Transactional Rules | DM$VA | describes the transactional rule view and transactional itemsets | |
Classification | (General views for Classification models) | DM$VT
DM$VC |
describes the target distribution for Classification models
describes the scoring cost matrix for Classification models |
Decision Tree | DM$VP
DM$VI DM$VO DM$VM |
describes the DT hierarchy & the split info for each level in DT
describes the statistics associated with individual tree nodes Higher level node description describes the cost matrix used by the Decision Tree build |
|
Generalized Linear Model | DM$VD
DM$VA |
describes model info for Linear Regres & Logistic Regres
describes row level info for Linear Regres & Logistic Regres |
|
Naive Bayes | DM$VP
DM$VV |
describes the priors of the targets for Naïve Bayes
describes the conditional probabilities of Naïve Bayes model |
|
Support Vector Machine | DM$VL | describes the coefficients of a linear SVM algorithm | |
Regression ??? | Doe | 80 | 50 |
Clustering | (General views for Clustering models) | DM$VD
DM$VA DM$VH DM$VR |
Cluster model description
Cluster attribute statistics Cluster historgram statistics Cluster Rule statistics |
k-Means | DM$VD
DM$VA DM$VH DM$VR |
k-Means model description
k-Means attribute statistics k-Means historgram statistics k-Means Rule statistics |
|
O-Cluster | DM$VD
DM$VA DM$VH DM$VR |
O-Cluster model description
O-Cluster attribute statistics O-Cluster historgram statistics O-Cluster Rule statistics |
|
Expectation Minimization | DM$VO
DM$VB DM$VI DM$VF DM$VM DM$VP |
describes the EM components
the pairwise Kullback–Leibler divergence attribute ranking similar to that of Attribute Importance parameters of multi-valued Bernoulli distributions mean & variance parameters for attributes by Gaussian distribution the coefficients used by random projections to map nested columns to a lower dimensional space |
|
Feature Extraction | Non-negative Matrix Factorization | DM$VE
DM$VI |
Encoding (H) of a NNMF model
H inverse matrix for NNMF model |
Singular Value Decomposition | DM$VE
DM$VV DM$VU |
Associated PCA information for both classes of models
describes the right-singular vectors of SVD model describes the left-singular vectors of a SVD model |
|
Explicit Semantic Analysis | DM$VA
DM$VF |
ESA attribute statistics
ESA model features |
|
Feature Section | Minimum Description Length | DM$VA | describes the Attribute Importance as well as the Attribute Importance rank |
Normalizing and Error Handling views created by ODM Automatic Data Processing (ADP)
- DM$VN : Normalization and Missing Value Handling
- DM$VB : Binning
Global Model Views
- DM$VG : Model global statistics
- DM$VS : Computed model settings
- DM$VW :Alerts issued during model creation
Each one of these new DB views needs their own blog post to explain what informations is being explained in each. I’m sure over time I will get round to most of these.
12.2 DBaaS (Extreme Edition) possible bug/issue with the DB install/setup
A few weeks ago the 12.2 Oracle Database was released on the cloud. I immediately set an account and got my 12.2 DBaaS setup. This was a relatively painless process and quick.
For me I wanted to test out all the new Oracle Advanced Analytics new features and the new features in SQL Developer 4.2 that only become visible when you are using the 12.2 Database.
When you are go to use the Oracle Data Miner (GUI tool) in SQL Developer 4.2, it will check to see if the ODMr repository is installed in the database. If it isn’t then you will be promoted for the SYS password.
This is the problem. In previous version of the DBaaS (12.1, etc) this was not an issue.
When you go to create your DBaaS you are asked for a password that will be used for the admin accounts of the database.
But when I entered the password for SYS, I got an error saying invalid password.
After using ssh to create a terminal connection to the DBaaS I was able to to connect to the container using
sqlplus / as sysdba
and also using
sqlplus sys/ as sysdba
Those worked fine. But when I tried to connect to the PDB1 I got the invalid username and/or password error.
sqlplus sys/@pdb1 as sysdba
I reconnected as follows
sqlplus / as sysdba
and then changed the password for SYS with containers=all
This command completed without errors but when I tried using the new password to connect the the PDB1 I got the same error.
After 2 weeks working with Oracle Support they eventually pointed me to the issue of the password file for the PDB1 was missing. They claim this is due to an error when I was creating/installing the database.
But this was a DBaaS and I didn’t install the database. This is a problem with how Oracle have configured the installation.
The answer was to create a password file for the PDB1 using the following
New OAA features in Oracle 12.2 Database
The Oracle 12.2c Database has been released and is currently available as a Cloud Service. The on-site version should be with us soon.
A few weeks ago I listed some of the new features that you will find in the Oracle Data Miner GUI tool (check out that blog post). I’ll have another blog post soon that looks a bit closer at how the new OAA features are exposed in this tool.
In this blog post I will list most of the new database related features in Oracle 12.2. There is a lot of new features and a lot of updated features. Over the next few months (yes it will take that long) I’ll have blog posts on most of these.
The Oracle Advanced Analytics Option new features include:
- The first new feature is one that you cannot see. Yes that sound a bit odd. But the underlying architecture of OAA has been rebuilt to allow for the algorithms to scale significantly. This is also future proofing OAA for new features coming in future releases of the database.
- Explicit Semantic Analysis. This is a new algorithm allows us to perform text similarity comparison. This is a great new addition and and much, much easier now compared to what we may have had to do previously.
- Using R models using SQL. Although we have been able to do this in the previous version of the database, the framework and supports have been extended to allow for greater and easier usage of user defined R scripts and R models with the in-database environment.
- Partitioned Models. We can now build partitioned mining models. This is where you can specify an attribute and a separate model will be created based on each value in the attribute.
- Partitioned scoring. Similarly we can now dynamically score the data based on an partition attribute.
- Extentions to Association Rules. Over the past few releases of the database, additional insights to the workings and decision making of the algorithms have been included. In 12.2 we now have some additional insights for the Association Rules aglorithm where we can now get to see the calculation of values associated with rules.
- DBMS_DATA_MINING package extended. This PL/SQL package has been extended to include the functionality for the new features listed above. Additional it can now process R algorithms and models.
- SQL Function changes: Change to the followi ODM related SQL functions to allow for partitioned models. CLUSTER_DETAILS, CLUSTER_DISTANCE, CLUSTER_ID, CLUSTER_PROBABILITY, CLUSTER_SET, FEATURE_COMPARE, FEATURE_DETAILS, FEATURE_ID, FEATURE_SET, FEATURE_VALUE, ORA_DM_PARTITION_NAME, PREDICTION, PREDICTION_BOUNDS, PREDICTION_COST, PREDICTION_DETAILS, PREDICTION_PROBABILITY, PREDICTION_SET
- New SQL Hint for ODM models. We have had hints in SQL for many, many versions now, but with 12.2c we now have a hint for partitioned models, called GROUPING hint.
- New CREATE_MODEL function. With the existing CREATE_MODEL function the input data set for the function needed to be defined in a table or accessed using a view. Basically the data needed to resist somewhere. With CREAETE_MODEL2 you can now define the input data set based on a SELECT statement.
In addition to all of these changes there are also some new interesting DB, SQL and PL/SQL new features that are of particular interest for your data science, machine learning, advanced analytics (or whatever the current favourite marketing term is today) projects.
It is going to be a busy few months ahead, working through all of these new features and write blog posts on how to use each of them.