oracle big data
I received an email today to say that I had a presentation accepted for the BIWA Summit. This conference will be in the Sofitel Hotel beside the Oracle HQ in Redwood City.
The title of the presentation is “The Oracle Data Scientist” and the abstract is:
Over the past 18 months we have seen a significant increase in the demand for Data Scientists. But how does someone become a data scientist? If we examine the requirements and job descriptions of this role we can see that being able to understand and process data are fundamental skills. So an Oracle developer is ideally suited to being a Data Scientist. The presentation will show how an Oracle developer can evolve into a data scientist through a number of stages, including BI developer, OBIEE developer, statistical analysis, data miner and data scientist. The tasks and tools will be discussed and explored through each of these roles. The second half of the presentation will focus on the data mining functionality available in SQL and PL/SQL. This will consist of a demonstration of an Analytics Development environment and how you can migrate (and use) your models in a Production environment.
For some reason Simon Cowell of XFactor fame kept on popping into my head and it now looks like he will be making an appearance in the presentation too. You will have to wait until the conference to find out what Simon Cowell and Being an Oracle Data Scientist have in common.
I’ll see you there.
November (2012) is going to be a busy month for Oracle users in Ireland. There is a mixture of Oracle User Group events, with Oracle Day and the OTN Developer Days. To round off the year we have the UKOUG Conference during the first week in December.
Here are the dates and web links for each event.
Oracle User Group
The BI & EPM SIG will be having their next meeting on Tuesday 20th November. This is almost a full day event, with presentations from End Users, Partners and Oracle product management. The main focus of the day will be on EPM, but it will also be of interest to BI people.
As with all SIG meetings, this SIG will be held in the Oracle office in East Point (Block H). Things kick off at 9am and are due to finish around 4pm with plenty of tea/coffee and a free lunch too.
Remember to follow OUG Ireland on twitter using #oug_ire
Oracle will be having their Oracle Day 2012, on Thursday 15th, in Croke Park. Here is some of the blurb about the event, “…to learn how Oracle simplifies IT, whether it’s by engineering hardware and software to work together or making new technologies work for the modern enterprise. Sessions and keynotes feature an elite roster of Oracle solutions experts, partners and business associates, as well as fascinating user case studies and live demos.”
This is a full day event from 9am to 5pm with 3 parallel streams focusing on Big Data, Enterprise Applications and the Cloud.
OTN Developer Days
Oracle run their developer days about 3 times a year in Dublin. These events are run like a Hands-on Lab, so most of the work during the day is done by yourself. You are provided with a workbook, a laptop and a virtual machine configured for the hands-on lab. This November we have the following developer days in the Oracle office in East Point, Dublin.
As you can see we have almost a full week of FREE training from Oracle. So there is no reason not to sign up for these days.
UKOUG Conference – in Birmingham
In December we have the annual UKOUG Conference. This is the largest Oracle User Group conference in Europe and the largest outside of the USA. At this conference you will have some of the main speakers and presentations from Oracle Open World, along with a range of speakers from all over the world.
In keeping with previous years there will be the OakTable Sunday and new this year there will be a Middleware Sunday. You need to register separately for these events. Here are the links
The main conference kicks off on the Monday morning with a very full agenda for Monday, Tuesday and Wednesday. There are a number of social events on the Monday and Tuesday, so come well rested.
On the Monday evening there are the focus pubs. This year they seem to have an Irish Pub theme. At the focus pub event there will be a table for each of the user group SIGs.
Come and join me at the Ireland table on the Monday evening.
I will be giving a presentation on the Tuesday afternoon titled Getting Real Business Value from Predictive Analytics (OBIEE and Oracle Data Mining). This is a joint presentation with Antony Heljula of Peak Indicators.
At Oracle Open World a few weeks ago there were a large number of presentations on Big Data and Analytics. Most of these were marketing type presentations, with a couple of presentations on using R and how it can now be integrated into the Oracle Database 11.2.
In addition to these there was one presentation that focused on the Oracle Advanced Analytics (OAA) Option.
The Oracle Advanced Analytics Option covers the Oracle Data Mining features and the Oracle R Enterprise features in the Database.
The purpose of this blog post is to outline and summarise what was mentioned at these presentations, including what changes are, or may be, coming in the “Next Release” of the database, i.e. Oracle 12c.
Health Warning: As with all the presentations at OOW that talked about what may or may not be in the next release, there is no guarantee that the features will actually be in the release version of the database. Here is the slide that gives the Safe Harbor statement.
- 12c will come with R embedded into it. So there will be no need for any configurations.
- Oracle R client will come as part of the server install.
- Oracle R client will be able to use the Analytics functions that exist in the database.
- Will be able to run R code in the database.
- The database (12c) will be able to spawn multiple R engines.
- Will be able to emulate map-reduce style algorithms.
- There will be a new PREDICTION function, replacing the existing (11g) functionality. This will combine a number of steps of building a model and applying it to the data to be scored into one function. But we will still need the functionality of the existing PREDICTION function that is in 11g. So it will be interesting to see how this functionality will be kept in addition to the new functionality being proposed in 12c.
- The Oracle Data Miner tool will still exist and will have many new features, although it was also referred to as the ‘OAA Workflow’. So does this indicate a potential name change? We will have to wait and see.
- Oracle Data Miner will come with a new additional graphing feature. This will be in addition to the Explore Node and will allow us to produce more typical attribute related graphs. From what I could see these would be similar to the type of box plot, scatter, bar chart, etc. graphs that you can get from R.
- There will be a number of new algorithms too, including a useful One Class Support Vector Machine. This can be used when we have a data set with just one class value. This algorithm will work out what records/cases are more important than others.
- There will be a new SQL node. This will allow us to write our own data transformation code.
- There will be a new node to allow the calling of R code.
- The tool also comes with a slightly modified layout and colour scheme.
Again, the points that I have given above are just my observations. They may or may not appear in 12c, or maybe I misunderstood what was being said.
It certainly looks like we will have an integrated analytics environment in 12c with full integration of R and the ODM in-database features.
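To give a feel for what a “map-reduce style” algorithm means in the list above, here is a minimal sketch in plain Python. It is only an illustration of the pattern and not how Oracle R Enterprise or the database implements it. The function names (`split`, `map_chunk`, `combine`, `mapreduce_mean`) and the sample values are made up for this example. The data is split into chunks, each chunk is summarised independently (the part that separate engines could work on in parallel), and the partial results are then combined.

```python
from functools import reduce


def split(rows, n_chunks):
    """Split rows into at most n_chunks slices of roughly equal size."""
    if not rows:
        return []
    size = -(-len(rows) // n_chunks)  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def map_chunk(chunk):
    """Map step: compute a partial (count, sum) for one chunk."""
    return (len(chunk), sum(chunk))


def combine(a, b):
    """Reduce step: merge two partial (count, sum) results."""
    return (a[0] + b[0], a[1] + b[1])


def mapreduce_mean(rows, n_chunks=3):
    """Mean of rows, computed from per-chunk partial results."""
    partials = map(map_chunk, split(rows, n_chunks))
    count, total = reduce(combine, partials, (0, 0))
    return total / count if count else None


# Hypothetical sample data, for illustration only
values = [12, 7, 3, 18, 10, 4, 9, 1]
print(mapreduce_mean(values))
```

The map step here runs sequentially to keep the sketch simple. The point is that each chunk is processed without needing to see the others, so the same shape of code could be handed out to multiple engines.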
One of the most interesting and important aspects of a Decision Tree model is that we as users can get to see what rules the machine learning algorithm has generated for our data.
I’ve given a number of examples in various blog posts over the past few years on how to generate a number of classification models. An example of the workflow is below.
In the Class Build node we get four models being generated. These include a Generalised Linear Model, Support Vector Machine, Naive Bayes and a Decision Tree model.
We can explore the Decision Tree model by right clicking on the Class Build Node, selecting View Models and then the Decision Tree model, which will be labelled with a ‘DT’ in the name.
As we explore the nodes and branches of the Decision Tree we can see the rule that was generated for a node in the lower pane of the application. So by clicking on each node we get a different rule appearing in this pane.
Sometimes there is a need to extract these rules so that they can be presented to a number of different types of users, to explain to them what is going on.
How can we extract the Decision Tree rules?
To do this, you will need to complete the following steps:
- From the Models section of the Component Palette select the Model Details node.
- Click on the Workflow pane and the Model Details node will be created
- Connect the Class Build node to the Model Details node. To do this right click on the Class Build node and select Connect. Then move the mouse to the Model Details node and click. The two nodes should now be connected.
- Edit the Model Details node, uncheck the Auto Settings, select Model Type to be Decision Tree, Output to be Full Tree and all the columns.
- Run the Model Details node. Right click on the node and select Run. When complete you will have the little green box with a tick mark on the top right hand corner.
- To view the details produced, right click on the Model Details node and select View Data
- The rules for each node will now be displayed. You will need to scroll to the right of this pane to get to the rules and you will need to expand the columns for the rules to see the full details
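If you are wondering where these per-node rules come from, the idea can be sketched in a few lines of Python. This is not what Oracle Data Miner does internally, and the `Node` structure, the attribute names (AGE, INCOME) and the predicted values are all hypothetical, chosen just for illustration. Each rule is simply the list of conditions on the path from the root of the tree down to that node.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One node of a toy decision tree (hypothetical, not the ODM structure)."""
    node_id: int
    condition: Optional[str] = None   # test on the branch leading to this node
    prediction: Optional[str] = None  # predicted class value at this node
    children: Optional[List["Node"]] = None


def extract_rules(node, path=()):
    """Return a (node_id, rule_text) pair for every node, root first."""
    conditions = path + ((node.condition,) if node.condition else ())
    antecedent = " AND ".join(conditions) if conditions else "TRUE"
    rules = [(node.node_id, f"IF {antecedent} THEN {node.prediction}")]
    for child in node.children or []:
        rules.extend(extract_rules(child, conditions))
    return rules


# A small made-up tree with a root, two branches and two leaves
tree = Node(0, prediction="0", children=[
    Node(1, condition="AGE <= 35", prediction="0"),
    Node(2, condition="AGE > 35", prediction="1", children=[
        Node(3, condition="INCOME <= 50000", prediction="0"),
        Node(4, condition="INCOME > 50000", prediction="1"),
    ]),
])

for node_id, rule in extract_rules(tree):
    print(node_id, rule)
```

The output has one row per node, much like the Full Tree output of the Model Details node, with the rule for a node getting longer the deeper it sits in the tree.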
I’ve recently compiled my list of presentations on the Oracle Analytics Option. All these presentations are for a 45 minute period.
I have two versions of the presentation ‘How to do Data Mining in SQL & PL/SQL’: one is for 45 minutes and the second version is for 2 hours.
I have given most of these presentations at conferences or SIGs.
Let me know if you are interested in having one of these presentations at your SIG or conference.
- Oracle Analytics Option – 12c New Features – available 2013
- Real-time prediction in SQL & Oracle Analytics Option – Using the 12c PREDICTION function – available 2013
- How to do Data Mining in SQL & PL/SQL
- From BIG Data to Small Data and Everything in Between
- Oracle R Enterprise : How to get started
- Oracle Analytics Option : R vs Oracle Data Mining
- Building Predictive Analytics into your Forms Applications
- Getting Real Business Value from OBIEE and Oracle Data Mining (This is a cut down and merged version of the following two presentations)
- Getting Real Business Value from OBIEE and Oracle Data Mining – Part 1 : The Oracle Data Miner part
- Getting Real Business Value from OBIEE and Oracle Data Mining – Part 2 : The OBIEE part
- How to Deploy and Use your Oracle Data Miner Models in Production
- Oracle Analytics Option 101
- From SQL Programmer to Data Scientist: evolving roles of an Oracle programmer
- Using an Oracle Data Mining Model in SQL & PL/SQL
- Getting Started with Oracle Data Mining
- You don’t need a PhD to do Data Mining
Check out the ‘My Presentations’ page for updates on new presentations.
Oracle Open World is fast approaching. Over the past couple of weeks I have been using the schedule builder tool to work out what sessions I would like to attend. Unfortunately there are LOTS of sessions I would love to attend but I haven’t worked out a way to be in 10 places at the same time.
When attending a conference I try to achieve a number of things. These are: find out about new topics/features, benchmark my knowledge of existing topics, try some of the hands-on labs, try something new and do something completely different. This will be my challenge at Oracle Open World.
There are a number of other people from Ireland who will be attending OOW, and there are some plans to have an Ireland social event. Plus there are lots of meetings/catch-ups planned too with people I know from the virtual Oracle world.
There will be some people from AIB who will be presenting at OOW. Their presentation will be on the Tuesday morning 10:15-11:00. I’ll be there.
Other social things that are on include the Oracle ACE dinner and the Oracle Music festival, with Kings of Leon playing support to Pearl Jam on the Wednesday night.
It is going to be a busy week, an enjoyable week, a week of learning new things and finding out lots of what the 12c database will be like.
Will I get time to go to everything? The simple answer is NO. So I will just have to try to get to another Oracle Open World soon.
Here are the links to the 2 different sets of Big Data videos that Oracle have produced over the past 12 months
Oracle Big Data Videos – Version 1
Oracle Big Data Videos – Version 2
Other videos include