Oracle
Oracle Magazine–March/April 1999
The headline articles for the March/April 1999 edition of Oracle Magazine were on the evolving world of the DBA. With so much new technology available in the database, the role of the DBA is moving from a back-office role to one having a significant strategic influence in the organisation.
Other articles included:
- Oracle releases a web-based version of their Oracle Strategic Procurement application that includes three key parts: Strategic Sourcing, Internet Procurement and Process Automation.
- Sun and Oracle announce a strategic agreement that allows both companies to enhance their product offerings by exchanging key technologies. Oracle will use the core of the Sun Solaris operating environment to deliver the industry’s first database server appliances.
- Oracle releases version 2.5 of the Oracle Data Mart Suite. It includes Oracle Data Mart Builder, Oracle Data Mart Designer, Oracle 8 Enterprise Edition, Oracle Discoverer, Oracle Application Server, and Oracle Reports and Reports Server.
- New integration between Oracle Reports release 6.0 and Oracle Express Server release 6.2 gives users the ability to distribute high-quality reports of information held in a multi-dimensional database across the enterprise.
- The need for the DBA to know and understand the V$ views has been increasing over the later releases of 7.3 and 8i. These can be used for a variety of purposes, including understanding locked users, system resources, licensing and parameter settings (see the first sketch after this list).
- One thing that all DBAs need to plan for is a database recovery. Planning it is one thing, but practising it is another. A typical recovery drill will include choosing a data file, creating a backup, taking the damaged tablespace offline, restoring the damaged data file, recovering the tablespace, bringing the tablespace back online, and testing it (see the second sketch after this list).
- Avoiding trigger errors, including Mutating and constraining table errors.
- There is an article by Bryan Laplante on using Histograms to Optimize Data Mart Performance.
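As a taster of the kind of V$ queries the article covers, here is a hedged sketch (the V$ views and columns below are standard, but treat the exact queries as illustrative rather than the article's own scripts):

-- Which sessions are currently holding or waiting on locks
SELECT s.username, s.sid, s.serial#, l.type, l.lmode, l.request
FROM v$session s, v$lock l
WHERE s.sid = l.sid
AND s.username IS NOT NULL;

-- Session licensing high-water marks
SELECT sessions_max, sessions_current, sessions_highwater
FROM v$license;

-- The current value of an initialisation parameter
SELECT name, value
FROM v$parameter
WHERE name = 'shared_pool_size';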
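And a minimal sketch of that recovery drill in SQL (the tablespace name is hypothetical, and restoring the damaged data file itself is done at the operating system level from your backup):

-- Take the damaged tablespace offline (IMMEDIATE assumes ARCHIVELOG mode)
ALTER TABLESPACE users OFFLINE IMMEDIATE;

-- Restore the damaged data file from the backup copy (OS-level step),
-- then apply redo to bring it up to date
ALTER DATABASE RECOVER TABLESPACE users;

-- Bring the tablespace back online and test it
ALTER TABLESPACE users ONLINE;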
To view the cover page and the table of contents click on the image at the top of this post or click here.
My Oracle Magazine Collection can be found here. You will find links to my blog posts on previous editions and a PDF for the very first Oracle Magazine from June 1987.
Clustering in ODM–Part 1
This is the first part of a five part blog post series on building and using Clustering in Oracle Data Miner. The following outlines the contents of each post in this series on Clustering.
- In this first post we will look at what clustering features exist in ODM and how to set up the data that we will be using in the examples.
- The second part will focus on how to build Clusters and examine the clusters produced in ODM.
- The third post will focus on applying the Clusters to new data using ODM.
- The fourth post will look at how you can build and evaluate a Clustering model using the ODM SQL and PL/SQL functions.
- The fifth and final post will look at how you can apply your Clustering model to new data using the ODM SQL and PL/SQL functions.
Clustering is an unsupervised technique designed to find groupings of related data that are more similar to each other than they are to the records in other groups. Typically clustering is used in customer segmentation analysis to try and better understand what types of customers you have.
As with all data mining techniques, Clustering will not tell you or give you some magic insight into your data. Instead it gives you more information that you can interpret and add the business meaning to. With Clustering you can explore the data that forms each cluster to understand what it really means.
The Clusters given by Oracle Data Miner are just patterns that it has found in the data.
Oracle has two Clustering algorithms:
K-Means : Oracle Data Miner runs an enhanced version of the typical k-means algorithm. ODM builds models in a hierarchical manner, using a top-down approach with binary splits and refinements of all nodes at the end. The centroids of the inner nodes in the hierarchy are updated to reflect changes as the tree grows. The tree grows one node at a time. The node with the largest variance is split to increase the size of the tree until the desired number of clusters is reached.
O-Cluster : O-Cluster is an Orthogonal Partitioning Clustering algorithm that creates a hierarchical grid-based clustering model. It operates recursively, generating a hierarchical structure. The resulting clusters define dense areas.
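Although the SQL and PL/SQL interface is not covered until parts 4 and 5, as a taster here is a minimal sketch of how you tell the database which of these two algorithms to use when building a model (the setting constants are the standard DBMS_DATA_MINING ones; the table name is my own):

CREATE TABLE clus_sample_settings (
setting_name VARCHAR2(30),
setting_value VARCHAR2(4000));

BEGIN
-- Pick the algorithm: algo_kmeans or algo_o_cluster
INSERT INTO clus_sample_settings (setting_name, setting_value) VALUES
(dbms_data_mining.algo_name, dbms_data_mining.algo_kmeans);
-- The desired number of clusters to grow
INSERT INTO clus_sample_settings (setting_name, setting_value) VALUES
(dbms_data_mining.clus_num_clusters, 10);
COMMIT;
END;
/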
The Data Set for our Clustering examples
I’m going to use a data set that is available on OTN (somewhere) and has been used for demos in the versions of ODM prior to the 11gR2 version (SQL Developer 3). It has gone by many names, but the table name we are going to use is INSURANCE_CUST_LTV.
The file is in CSV format and we will use the Import feature in SQL Developer to import it.
1. In the schema you are using for Oracle Data Miner, right click Tables in the Connections tab. The Import option will appear on the menu. Select this.
2. Go to the directory where you saved the file, select it and then click on the Open button.
3. You need to set the file Format to ‘Delimited’ and the Delimiter to ‘|’.
4. In the next step give the table name as INSURANCE_CUST_LTV
5. In the next step select all the Attributes. It should default to this. Click next.
6. In Step 4 of the Wizard you can set the data types for each attribute. The simplest way is to set the character attributes to VARCHAR2(50):
CUSTOMER_ID, LAST, FIRST, STATE, REGION, SEX, PROFESSION, BUY_INSURANCE (set this one to VARCHAR2(3)), MARITAL_STATUS, LTV_BIN
Set all the number attributes (all the others) to NUMBER without any precision or scale.
7. Click the next button and then the finish button. SQL Developer will now load 15,342 records into the INSURANCE_CUST_LTV table, with no errors (hopefully!)
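Once the import has finished, a quick sanity check along the following lines will confirm the load (a sketch; the expected record count comes from the import above):

-- Confirm the number of records loaded
SELECT COUNT(*) FROM insurance_cust_ltv; -- expect 15,342

-- Eyeball the spread of values in one of the character attributes
SELECT buy_insurance, COUNT(*)
FROM insurance_cust_ltv
GROUP BY buy_insurance;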
We are now ready to start our work with the Clustering algorithms in ODM.
In the next blog post we will look at exploring the data, building our Clustering models and examining the clusters that were produced by ODM.
OUG Ireland 2013 Agenda is now live
The agenda for the OUG Ireland 2013 event is now live. The event will be in the Dublin Convention Centre on the 12th March. There are lots of excellent sessions, across 7 tracks!! So there will be something (or lots of things) for everyone who works in the Oracle world here in Ireland.
I’m sure the Oracle Database track will be very popular. I wonder why!!!
Agenda : http://www.ukoug.org/2013-events/oug-ireland-2013/agenda/
Remember registration is FREE. You don’t have to be a member of the User Group to come to this event. It is open to everyone and did I mention that it is FREE. Registration is now open.
I’ll be there. Well I suppose I have to, as I’ll be presenting!
I hope to see you there.
Oracle ACE Director
Towards the end of last week I received an email from Oracle saying that I had been nominated and accepted, by Oracle, to become an Oracle ACE Director.
This is something that makes me very proud, and it honours the work I have been doing over the past few years on Data Mining in Oracle (Advanced Analytics Option, Data Science/Predictive Analytics, or whatever you want to call it). Thank you to everyone who nominated me.
If you are not familiar with the Oracle ACE Program, it is a way for Oracle to acknowledge not only technical skills but also personal engagement with the Oracle Community and Technology overall. There is even a FAQ that explains how this program works.
There are a few perks that come with the title, and Oracle have a few expectations too. Most of these expectations I’m already meeting!! What I’m looking forward to later this year is my first Oracle ACE Director briefing at Oracle Open World (22-26 September).
Oracle Magazine-Nov/Dec. 1998
The headline articles for the Nov/Dec 1998 edition of Oracle Magazine were on building web based applications and thin client computing. A large part of the magazine was dedicated to these topics. This was a bumper edition with a total of 152 pages of content.
Other articles included:
- There were a few articles on using Oracle 8i, including how to use Java in the Database, the Internet File System, interMedia and Data Warehousing. Oracle 8i comes with over 150 new features.
- There were a couple of articles on the Millennium Bug and how to approach such projects. There was also some advice for organisations that would have to look at how to deal with the introduction of the Euro currency in Europe.
- There was a section for articles on new product announcements from Oracle partners, including Quest, Nextek, Maxager, ObjectShare, Constellar (Warehouse Builder), Prism, DataMetrics, IQ Software, Eventus, DataMirror, Precise, Saville, DataShark, J-Database Exchange, Andataco, GeoMedia
- Oracle makes available Oracle 8i and the Application Server on a Linux platform for the first time.
- With Oracle 8i we have a number of ways of managing our constraints (see the sketch after this list), including:
- Deferrable integrity constraints
- Non-unique indexes for primary key and unique constraints
- Immediate constraint enabling
- Detecting locked and waiting transactions was always a task that consumed a lot of time for a DBA. A number of scripts were given to help you identify these and to resolve these problems.
- For all of the Oracle Certified DBAs out there, there was an article promoting the OCP DBA program and exam. Some hints and tips about the exam were given, along with some practice questions.
- Plus there were 12 pages of adverts at the back of the magazine.
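Going back to the constraint-management bullet above, here is a hedged sketch of what these Oracle 8i features look like in practice (the table, column and index names are hypothetical):

CREATE TABLE orders (
order_id NUMBER,
cust_id NUMBER NOT NULL);

-- Pre-create a non-unique index on the key column; Oracle can use it
-- to enforce the primary key instead of creating its own unique index
CREATE INDEX orders_pk_idx ON orders (order_id);

-- A deferrable constraint is checked at COMMIT time rather than per statement
ALTER TABLE orders ADD CONSTRAINT orders_pk
PRIMARY KEY (order_id)
DEFERRABLE INITIALLY DEFERRED;

-- Immediate constraint enabling: switch checking back to per statement
-- within the current transaction
SET CONSTRAINT orders_pk IMMEDIATE;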
To view the cover page and the table of contents click on the image at the top of this post or click here.
My Oracle Magazine Collection can be found here. You will find links to my blog posts on previous editions and a PDF for the very first Oracle Magazine from June 1987.
OUG Norway Agenda is now live
The OUG Norway spring conference (17th April – 19th April) agenda is now live and is open for registrations.
Click here for the Conference Agenda
Click here for the Conference Registration
This is a 3 day conference. The first day (17th April) will be held in the Radisson BLU Scandinavia ( Holbergsplass ) and the next two (and a bit) days will be on the Color Magic boat that will be travelling between Oslo and Kiel in Germany and back to Oslo. The boat will be arriving back in Oslo on the Saturday morning (20th April).
There will be some presentations in Norwegian, but it looks like most of the presentations will be in English. There will also be some well known names from the Oracle world presenting at this conference.
In addition to these people, I will be giving two presentations on using Predictive Analytics in Oracle using the Oracle Data Miner tool and in-database functionality.
My first presentation will be an overview of the advanced analytics option and a demonstration of what you can do using the Oracle Data Miner tool (part of SQL Developer). This presentation is currently scheduled for Thursday (18th April) at 5pm.
My second presentation will be at 9:30am on the Friday morning (19th April). In this presentation we will look at the in-database features, what we can do in SQL and PL/SQL, and what you need to do to deploy your Oracle Data Mining models in a production environment.
If possible we might be able to review some new 12c features for Oracle Data Miner.
BIWA Oracle Data Scientist Certificate
Last week I had the opportunity to present at the BIWA Summit conference. This was held in the Sofitel Hotel beside the Oracle HQ buildings at Redwood Shores, just outside of San Francisco.
This conference was a busy 2 days, with 4 parallel streams of presentations and another stream for Hands-on Labs. The streams covered Big Data, Advanced Analytics, Business Intelligence and Data Warehousing. There were lots of great presentations from well known names in the subject areas.
The BIWA Oracle Data Scientist Certificate was launched at the summit. The requirements for this certificate were to attend my presentation on ‘The Oracle Data Scientist’ (this was compulsory) and then to attend a number of other data science related presentations and hands-on labs. In addition to these presentations there is a short exam to take. This consists of some 30-ish questions, which were based on my presentation and some of the other presentations and hands-on labs. The main topic areas covered in the exam include what data science is about, Oracle Data Miner, Oracle R Enterprise, and some questions based on the keynotes, in particular the keynote by Ari Kaplan.
There are a few days left to take the exam. Your answers to the questions will be reviewed and you should receive an email within a couple of days with your result and hopefully your certificate.
https://www.surveymonkey.com/s/BiwaSummitDataScientistCertificate
This was my first trip to Redwood Shores and I had some time to go for a walk around the Oracle HQ campus. Hopefully it won’t be my last. Here is a photo I took of some of the Oracle buildings.
The BIWA Summit conference returns to Redwood Shores again in 2014 around the 14th and 15th January. It will be in the Oracle Conference centre that is part of the Oracle HQ campus.
Maybe I’ll see you there in 2014.
The ‘Oh No You Don’t’ of (Oracle) Data Science
Over the past couple of weeks I’ve had conversations with a large number of people about Data Science in the Oracle arena.
A few things have stood out. The first, and perhaps the most important, is that there is confusion about what Data Science actually means. Some think it is just another name for Statistics or Advanced Statistics, some for Predictive Analytics or Data Mining, or Data Analysis, Data Architecture, etc. The reality is that it is not. It is more than what any of these terms mean, and this is a topic for discussion for another day.
During these conversations the same questions or topics keep coming up and the simplest answer to all of these is taken from a Pantomime (Panto).
We need to have lots of statisticians
‘Oh No You Don’t !’
We can only do Data Science if we have Big Data
‘Oh No You Don’t !’
We can only do data mining/data science if we have 10s or 100s of millions of records
‘Oh No You Don’t !’
We need to have an Exadata machine
‘Oh No You Don’t !’
We need to have an Exalytics machine
‘Oh No You Don’t !’
We need extra servers to process the data
‘Oh No You Don’t !’
We need to buy lots of Statistical and Predictive Analytics software
‘Oh No You Don’t !’
We need to spend weeks statistically analysing a predictive model
‘Oh No You Don’t !’
We need to have unstructured data to do Data Science
‘Oh No You Don’t !’
Data Science is only for large companies
‘Oh No You Don’t !’
Data Science is very complex, I cannot do it
‘Oh No You Don’t !’
Let us all say it together for one last time ‘Oh No You Don’t’
In its simplest form, performing Data Science using the Oracle stack just involves learning and using some simple SQL and PL/SQL functions in the database.
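For example, once a model exists in the database, scoring new data can be a single query. A minimal sketch (the model name, table name and columns here are hypothetical):

-- Score new records with an in-database classification model
SELECT cust_id,
PREDICTION(my_clas_model USING *) AS predicted_class,
PREDICTION_PROBABILITY(my_clas_model USING *) AS probability
FROM new_customers;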
Maybe we (in the Oracle Data Science world and those looking to get into it) need to adopt a phrase used by Barack Obama, ‘Yes We Can’, or as he said it in Irish when he visited Ireland back in 2011, ‘Is Feidir Linn’.
Remember it is just SQL.
My Blog Stats for 2012
Here are the stats from my blog for 2012.
In total I’ve had almost 28,000 blog post views. This is a 7-fold increase on the number of blog post views I had in 2011.
I had 92 blog posts in 2012 and the most popular blog posts were:
- Celtic Knot Mirror Wood Carving
- Oracle Magazine Volume 1 Number 1
- Oracle Database next release (12c) new features
- Oracle Advanced Analytics Option in Oracle 12c
- Exalytics: How much will it cost me?
- Update on Exalytics Pricing
- Data Science is Multidisciplinary

Top search keywords used to find my blog:
- exalytics pricing
- oracle data mining
- oracle data miner
- data science
- brendan tierney

Top Countries:
- United States 52%
- Ireland 8%
- United Kingdom 8%
- India 4%
- Russia 4%
- Germany 3%
- France 3%
- Netherlands 1%
- Canada 1%
- Turkey 1%

Top OS:
- Windows 59%
- Macintosh 28%
- Linux 5%
- iPhone 2%
- iPad 1%

Top Browsers:
- Firefox 47%
- Internet Explorer 26%
- Chrome 15%
- Safari 4%
OUG Norway April 2013 – New Year’s News
I received an email at 23:24 on the 1st January from the OUG in Norway telling me that I’ve had two presentations accepted for the Annual OUG Norway seminar event. This will be on during the 17th-19th April.
The first day of this event (17th April) will be held in a hotel in Oslo. Then on the morning of 18th April we board the Color Magic cruise for the next two days of the conference. The ferry/cruise will go from Oslo to Kiel in Germany and then back again to Oslo, returning around 10am on Saturday 20th April.
I will be giving two presentations on the Oracle Advanced Analytics Option. The first presentation, ‘Using Predictive Analytics in Oracle’, will give an overview of the Oracle Advanced Analytics Option and will then focus on the Oracle Data Miner work-flow tool. This presentation will include a live demo of using Oracle Data Miner to create some data mining models.
The second presentation, ‘How to Deploy and Use your Oracle Data Miner Models in Production’, builds on the examples given in the first presentation and will show how you can migrate, use and update your Oracle Data Miner models using the features available in SQL and PL/SQL. Again a demo will be given.
Articles wanted for Oracle Scene–Spring 2013
The Call for Articles is now open for the Spring edition of Oracle Scene magazine. This is a publication of the UKOUG.
We are looking for technical articles covering all product offerings from Oracle.
Typically articles will range from 3 pages to 8 pages (MS Word format). These will convert into 2 to 5 page articles in Oracle Scene.
Check out the Article Formatting Guidelines before submitting.
All pictures and images should be 300dpi.
Include a bio (100 words max) and your photo.
Email your article and images to
For more details about submitting an article, check out
http://www.ukoug.org/what-we-offer/oracle-scene/article-submissions/
Association Rules in ODM-Part 4
This is the final part of a four part blog post series on building and using Association Rules in the Oracle Database using Oracle Data Miner. The following outlines the contents of each post in the series on Association Rules:
- The first part focused on how to build an Association Rule model
- The second post was on examining the Association Rules produced by ODM
- The third post focused on using the Association Rules on your data
- The final post looks at how you can do some of the above steps using the ODM SQL and PL/SQL functions – This blog post
In my previous posts I showed how you can go about setting up for Association Rule analysis in Oracle Data Miner and how to examine the rules that are generated.
This post will focus on how we build and use association rules using the functionality that is available in SQL and PL/SQL.
Step 1 – Build the Settings Table
As with all Oracle Data Mining functions in SQL and PL/SQL, you will need to set up or build a settings table. This table contains all the settings necessary to run the model build functions. It is a good idea to create a separate settings table for each model build that you complete.
CREATE TABLE assoc_sample_settings (
setting_name VARCHAR2(30),
setting_value VARCHAR2(4000));
Step 2 – Define the Settings for the Model
Before you go to generate your model you need to set some of the parameters for the algorithm. To start with, you need to define that we are going to generate an Association Rules model and turn off Automatic Data Preparation.
We can also set 3 additional settings for Association Rules.
The ASSO_MIN_SUPPORT setting has a default of 0.1, or 10%. That means that only rules that exist in 10% or more of the cases will be generated. That is really too high a figure. In the code below we will set this to 1%. This matches the settings that we used in SQL Developer in my previous posts.
BEGIN
INSERT INTO assoc_sample_settings (setting_name, setting_value) VALUES
(dbms_data_mining.algo_name, dbms_data_mining.ALGO_APRIORI_ASSOCIATION_RULES);
INSERT into assoc_sample_settings (setting_name, setting_value) VALUES
(dbms_data_mining.prep_auto, dbms_data_mining.prep_auto_off);
INSERT into assoc_sample_settings (setting_name, setting_value) VALUES
(dbms_data_mining.ODMS_ITEM_ID_COLUMN_NAME, 'PROD_ID');
INSERT into assoc_sample_settings (setting_name, setting_value) VALUES
(dbms_data_mining.ASSO_MIN_SUPPORT, 0.01);
COMMIT;
END;
/
Step 3 – Prepare the Data
In our example scenario we are using the SALES data that is part of the SH schema. The CREATE_MODEL function needs to have an attribute (CASE_ID) that identifies the key of the shopping basket. In our case we have two attributes, so we will need to use a combined key. This combined key consists of the CUST_ID and the TIME_ID. This links all the transaction records related to the one shopping event together.
We also just need the attribute that holds the information we are interested in. In our Association Rules (Market Basket Analysis) scenario, we will need to include the PROD_ID attribute. This contains the product key of each product that was included in the basket.
CREATE VIEW ASSOC_DATA_V AS (
SELECT RANK() OVER (ORDER BY CUST_ID, TIME_ID) CASE_ID,
t.PROD_ID
FROM SH.SALES t );
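A quick check of the view before building the model (a sketch; the counts quoted later in this post suggest just over 918K records and just over 143K baskets):

-- How many transaction records and how many distinct baskets we have
SELECT COUNT(*) AS num_records,
COUNT(DISTINCT case_id) AS num_baskets
FROM assoc_data_v;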
Step 4 – Create the Model
We will need to use the DBMS_DATA_MINING.CREATE_MODEL function. This will use the settings in our ASSOC_SAMPLE_SETTINGS table. We will use the view created in Step 3 above, and the CASE_ID attribute we created as the Case ID in the function call.
BEGIN
DBMS_DATA_MINING.CREATE_MODEL(
model_name => 'ASSOC_MODEL_2',
mining_function => DBMS_DATA_MINING.ASSOCIATION,
data_table_name => 'ASSOC_DATA_V',
case_id_column_name => 'CASE_ID',
target_column_name => null,
settings_table_name => 'assoc_sample_settings');
END;
/
On my laptop this took approximately 5 seconds to run on just over 918K records involving just over 143K cases or baskets.
Now that is quick!!!
Step 5 – View the Model Outputs
There are a couple of functions that can be used to extract the rules produced in our previous step. These include:
GET_ASSOCIATION_RULES : This returns the rules from an association model.
SELECT rule_id,
antecedent,
consequent,
rule_support,
rule_confidence
FROM TABLE(DBMS_DATA_MINING.GET_ASSOCIATION_RULES('assoc_model_2', 10));
The 10 here returns the top 10 records or rules.
GET_FREQUENT_ITEMSETS : returns a set of rows that represent the frequent item sets from an association model. In the following code we want the top 30 item sets to be returned, but filtered to only display item sets that contain 2 or more items.
SELECT itemset_id,
items,
support,
number_of_items
FROM TABLE(DBMS_DATA_MINING.GET_FREQUENT_ITEMSETS('assoc_model_2', 30))
WHERE number_of_items >= 2;