Oracle 18c DBaaS Cloud Setup
The 18c Oracle DBaaS is now available. This is the only place that Oracle 18c will be available until later in 2018. So if you want to try it out, then you are going to need to get some Oracle Cloud credits, or you may already have a paying account for Oracle Cloud.
The following outlines the steps you need to go through to get Oracle 18c set up.
1. Log into your Oracle Cloud
Log into your Oracle Cloud environment. Depending on your access path you will get to your dashboard.
Select Create Instance from the dashboard.

2. Create a new Database
From the list of services to create, select Database.

3. Click ‘Create Instance’

4. Enter the Database Instance details
Enter the details for your new Oracle 18c Database. I’ve called mine ‘db18c’.
Then for the Software Release dropdown list, select ‘Oracle Database 18c’.
Next select the Software Edition from the dropdown list.

5. Fill in the Instance Details
Fill in the details for ‘DB Name’, ‘PDB Name’, ‘Administration Password’, ‘Confirm Password’, setup the SSH Public Key, and then decide if you need the Backup and Recovery option.
6. Create the DBaaS
Double check everything and when ready click on the ‘Create’ button.
7. Wait for Everything to be Created
Now is the time to be patient and wait while your cloud service is created.

I’ve created two different versions of the 18c Oracle DBaaS. The Enterprise Edition took 30 minutes to complete and the High Performance service took 47 minutes.
Now it’s time to go play.
18c is now available (but only on the Cloud)
On Friday afternoon (16th February) we started to see tweets and blog posts from people in Oracle saying that Oracle 18c was now available. But it is only available on Oracle Cloud and Engineered Systems.
It looks like we will have to wait until the Autumn before we can install it ourselves on our own servers 😦
Here is the link to the official announcement for Oracle 18c.
Oracle 18c is really Oracle 12.2.0.2. The next full new release of the Oracle database is expected to be Oracle 19.
The new features and incremental enhancements in Oracle 18c are:
- Multitenant
- In-Memory
- Sharding
- Memory Optimized Fetches
- Exadata RAC Optimizations
- High Availability
- Security
- Online Partition Merge
- Improved Machine Learning (OAA)
- Polymorphic Table Functions
- Spatial and Graph
- More JSON improvements
- Private Temporary Tablespaces
- New mode for Connection Manager
And now the all important links to the documentation.
To give Oracle 18c a try you will need to go to cloud.oracle.com and select Database from the drop down list from the Platform menu. Yes you are going to need an Oracle Cloud account and some money or some free credit. Go and get some free cloud credits at the upcoming Oracle Code events.
If you want a ‘free’ way of trying out Oracle 18c, you can use Oracle Live SQL. They have setup some examples of the new features for you to try.
NOTE: Oracle 18c is not Autonomous. Check out Tim Hall’s blog posts about this. The Autonomous Oracle Database is something different, and we will be hearing more about this going forward.
Oracle and Python setup with cx_Oracle
Is Python the new R?
Maybe, maybe not, but what I’m finding in recent months is that more companies are asking me to use Python instead of R for some of my work.
In this blog post I will walk through the steps of setting up the Oracle driver for Python, called cx_Oracle. The documentation for this driver is good and detailed, with plenty of examples available on GitHub. Hopefully there isn’t anything new in this post, but it describes my experiences and what I did.
1. Install Oracle Client
The Python driver requires Oracle Client software to be installed. Go here, download and install. It’s a straightforward install. Make sure the directories are added to the search path.
2. Download and install cx_Oracle
You can use pip3 to do this.
pip3 install cx_Oracle
Collecting cx_Oracle
Downloading cx_Oracle-6.1.tar.gz (232kB)
100% |████████████████████████████████| 235kB 679kB/s
Building wheels for collected packages: cx-Oracle
Running setup.py bdist_wheel for cx-Oracle ... done
Stored in directory: /Users/brendan.tierney/Library/Caches/pip/wheels/0d/c4/b5/5a4d976432f3b045c3f019cbf6b5ba202b1cc4a36406c6c453
Successfully built cx-Oracle
Installing collected packages: cx-Oracle
Successfully installed cx-Oracle-6.1
3. Create a connection in Python
Now we can create a connection. When you see some text enclosed in angled brackets <>, you will need to enter the details for your schema and database server.
# import the Oracle Python library
import cx_Oracle
# define the login details
p_username = "<username>"
p_password = "<password>"
p_host = "<host>"
p_service = "<service name>"
p_port = "1521"
# create the connection (the Easy Connect format is host:port/service_name)
con = cx_Oracle.connect(user=p_username, password=p_password, dsn=p_host+":"+p_port+"/"+p_service)
# an alternative way to create the connection
# con = cx_Oracle.connect('<username>/<password>@<host>:1521/<service name>')
# print some details about the connection and the library
print("Database version:", con.version)
print("Oracle Python version:", cx_Oracle.version)
Database version: 12.1.0.1.0
Oracle Python version: 6.1
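The Easy Connect string passed as the dsn has the form host:port/service_name, and it is easy to get the order of the parts wrong. As a minimal sketch, the helper below assembles and sanity-checks that string. The name build_easy_connect is hypothetical (it is not part of cx_Oracle), and the host and service values are made-up placeholders.

```python
def build_easy_connect(host, service, port=1521):
    """Return an Easy Connect string of the form host:port/service_name."""
    if not host or not service:
        raise ValueError("host and service must both be non-empty")
    port = int(port)  # accept "1521" as well as 1521
    if not 0 < port < 65536:
        raise ValueError("port out of range: %d" % port)
    return "%s:%d/%s" % (host, port, service)


# hypothetical values, for illustration only
dsn = build_easy_connect("dbhost.example.com", "pdb1.example.com", "1521")
print(dsn)
```

The resulting string can then be passed as the dsn argument, for example cx_Oracle.connect(user=p_username, password=p_password, dsn=dsn).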
4. Query some data and return results to Python
In this example the query returns the list of tables in the schema.
# define a cursor to use with the connection
cur = con.cursor()
# execute a query returning the results to the cursor
cur.execute('select table_name from user_tables')
# for each row returned to the cursor, print the record
for row in cur:
    print("Table: ", row)
Table: ('DECISION_TREE_MODEL_SETTINGS',)
Table: ('INSUR_CUST_LTV_SAMPLE',)
Table: ('ODMR_CARS_DATA',)
Now list the Views available in the schema.
# define a second cursor
cur2 = con.cursor()
# return the list of Views in the schema to the cursor
cur2.execute('select view_name from user_views')
# display the list of Views
for result_name in cur2:
    print("View: ", result_name)
View: ('MINING_DATA_APPLY_V',)
View: ('MINING_DATA_BUILD_V',)
View: ('MINING_DATA_TEST_V',)
View: ('MINING_DATA_TEXT_APPLY_V',)
View: ('MINING_DATA_TEXT_BUILD_V',)
View: ('MINING_DATA_TEXT_TEST_V',)
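Each row comes back as a tuple, which is why the output above shows ('NAME',) rather than the bare name. Here is a small sketch of pulling out just the first column. It relies only on the standard Python DB-API cursor interface. So that it runs without an Oracle database, the demo uses an in-memory sqlite3 connection as a stand-in; the helper name first_column and the demo table are assumptions for illustration.

```python
import sqlite3


def first_column(cursor, sql):
    """Run sql and return the first column of every row as a plain list."""
    cursor.execute(sql)
    return [row[0] for row in cursor]


# stand-in database (sqlite3, not Oracle) so the sketch is runnable anywhere
demo = sqlite3.connect(":memory:")
demo_cur = demo.cursor()
demo_cur.execute("create table demo_tables (table_name text)")
demo_cur.executemany("insert into demo_tables values (?)",
                     [("ODMR_CARS_DATA",), ("INSUR_CUST_LTV_SAMPLE",)])

names = first_column(demo_cur,
                     "select table_name from demo_tables order by table_name")
for name in names:
    print("Table:", name)

demo_cur.close()
demo.close()
```

With a cx_Oracle cursor the call would look the same, for example first_column(cur, 'select table_name from user_tables').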
5. Query some data and return it to a pandas DataFrame in Python
The pandas library is commonly used for storing, structuring and processing data in Python, using a data frame format. The following runs a query and stores the results in a pandas DataFrame.
# in this example the results of a query are loaded into a pandas DataFrame
# load the pandas library
import pandas as pd
# execute the query and return the results into the DataFrame called df
df = pd.read_sql_query("SELECT * from INSUR_CUST_LTV_SAMPLE", con)
# print the first few records returned by the query and stored in the DataFrame
print(df.head())
CUSTOMER_ID LAST FIRST STATE REGION SEX PROFESSION \
0 CU13388 LEIF ARNOLD MI Midwest M PROF-2
1 CU13386 ALVA VERNON OK Midwest M PROF-18
2 CU6607 HECTOR SUMMERS MI Midwest M Veterinarian
3 CU7331 PATRICK GARRETT CA West M PROF-46
4 CU2624 CAITLYN LOVE NY NorthEast F Clerical
BUY_INSURANCE AGE HAS_CHILDREN ... MONTHLY_CHECKS_WRITTEN \
0 No 70 0 ... 0
1 No 24 0 ... 9
2 No 30 1 ... 2
3 No 43 0 ... 4
4 No 27 1 ... 4
MORTGAGE_AMOUNT N_TRANS_ATM N_MORTGAGES N_TRANS_TELLER \
0 0 3 0 0
1 3000 4 1 1
2 980 4 1 3
3 0 2 0 1
4 5000 4 1 2
CREDIT_CARD_LIMITS N_TRANS_KIOSK N_TRANS_WEB_BANK LTV LTV_BIN
0 2500 1 0 17621.00 MEDIUM
1 2500 1 450 22183.00 HIGH
2 500 1 250 18805.25 MEDIUM
3 800 1 0 22574.75 HIGH
4 3000 2 1500 17217.25 MEDIUM
[5 rows x 31 columns]
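The column headings shown above come from the metadata of the query result. If you want labelled rows without loading pandas, the standard DB-API cursor.description attribute exposes the same column names. The following is a minimal sketch under a few assumptions: rows_as_dicts is a hypothetical helper, sqlite3 is used as a stand-in for the Oracle connection so the code can run anywhere, and the small demo table and its values are made up.

```python
import sqlite3


def rows_as_dicts(cursor, sql, limit=5):
    """Return up to `limit` rows as dictionaries keyed by column name."""
    cursor.execute(sql)
    # the first item of each description entry is the column name
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchmany(limit)]


# stand-in database (sqlite3, not Oracle) with made-up demo data
demo = sqlite3.connect(":memory:")
demo_cur = demo.cursor()
demo_cur.execute("create table cust (customer_id text, state text, ltv real)")
demo_cur.executemany("insert into cust values (?, ?, ?)",
                     [("CU1", "MI", 100.0), ("CU2", "OK", 250.5)])

records = rows_as_dicts(
    demo_cur, "select customer_id, state, ltv from cust order by customer_id")
for record in records:
    print(record)

demo_cur.close()
demo.close()
```

For anything beyond a quick look at the data, the pandas data frame is still the more convenient structure.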
6. Wrapping it up and closing things
Finally we need to wrap things up and close our cursors and our connection to the database.
# close the cursors
cur2.close()
cur.close()
# close the connection to the database
con.close()
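If an error occurs part way through a script, explicit close() calls at the end may never run. One way to guard against that is contextlib.closing from the Python standard library, which calls close() on any object when the with block exits. This is a sketch rather than the only way to do it, and it again uses an in-memory sqlite3 connection as a stand-in so that it runs without an Oracle database; the variable names are my own.

```python
import sqlite3
from contextlib import closing

# closing() calls .close() on exit, even if the body raises an exception
with closing(sqlite3.connect(":memory:")) as con_demo:
    with closing(con_demo.cursor()) as cur_demo:
        cur_demo.execute("select 1")
        value = cur_demo.fetchone()[0]

print(value)
```

The same pattern should work with any connection or cursor object that has a close() method, which includes those created by cx_Oracle.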
Useful links
Watch out for more blog posts on using Python with Oracle, Oracle Data Mining and Oracle R Enterprise.
Relational Theory videos by CJ Date
Here is a series of videos by CJ Date on Relational Theory.
Introduction video
An Introduction to set theory (videos on O’Reilly website)
Nullology (videos on O’Reilly website)
Nulls, Three-Valued Logic, and Missing Information (videos on O’Reilly website)
The Closed World Assumption (videos on O’Reilly website)
View Updating (videos on O’Reilly website)
Oracle Code Online December 2017
This week Oracle Code will be having an online event consisting of 5 tracks, with 3 presentations on each track.
This online Oracle Code event will be given in 3 different geographic regions on 12th, 13th and 14th December.

I’ve been selected to give one of these talks, and I’ve given this talk at some live Oracle Code events and at JavaOne back in October.
The presentation is pre-recorded; I recorded the video back in September.
I hope to be online at the end of some of these presentations to answer any questions, but unfortunately due to changes with my work commitments I may not be able to be online for all of them.
The moderator for these events will take your questions (or you can send them to me here) and I will write a blog post answering all your questions.
Make SQL Great Again baseball cap
Make SQL great again baseball cap

Let me know if you would like to order one.
They cost €15 + P&P
Irish people presenting at OOW
Here is a list of presentations at Oracle Open World and JavaOne in 2017, that will be given by people and partners based in Ireland.
(I’ll update this list if I find additional presentations)
| Day | Time | Presentation | Location |
|---|---|---|---|
| Sunday | 13:45-14:30 | SQL: One Language to Rule All Your Data [OOW SUN1238]. Brendan Tierney, Oralytics. SQL is a very powerful language that has been in use for almost 40 years. SQL comes with many powerful techniques for analyzing your data, and you can analyze data outside the database using SQL as well. Using the new Oracle Big Data SQL it is now possible to analyze data that is stored in a database, in Hadoop, and in NoSQL all at the same time. This session explores the capabilities in Oracle Database that allow you to work with all your data. Discover how SQL really is the unified language for processing all your data, allowing you to analyze, process, run machine learning, and protect all your data. Hopefully this presentation will be a bit of fun! Those of us who have been working with the database for a long time can sometimes forget what we can really do. Those starting out in their career may not realise what the database can do. The presentation delivers an important message while having a laugh or two (probably at me). | Marriott Marquis (Golden Gate Level) – Golden Gate C1/C2 |
| Monday | 16:30-17:15 | ESB Networks Automates Core IT Infrastructure and Grid Operations [CON7878]. Simon Holt, DBA / Technical Architect, ESB Networks; Andrew Walsh, OMS Application Support, ESB. In this session learn how ESB Networks deployed Oracle Utilities Network Management System Release 1.12 on a complete Oracle SuperCluster. Hear about the collaboration between multiple Oracle business units and the in-house expertise that delivered an end-to-end solution. This upgrade is an important step toward expanding ESB Networks’ future network operations vision. Hear about the challenges, the process of choosing a COTS solution, cybersecurity, and implementation. The session also explores the benefits the new system delivered when managing the effects of large-scale weather events, as well as the technical challenges of deploying a combined hardware and software solution. | Park Central (Floor 2) – Metropolitan I |
| Monday | 16:45-17:30 | Automation and Innovation for Application Management and Support [CON7862]. Raja Roy, Associate Partner, IBM Ireland. Automation and innovation are transforming the way application support and development projects are being executed. Market trends show three fundamental shifts: innovation to improve quality of service delivery, the emergence of knowledge-based systems with capabilities for self-service and self-heal, and leveraging the power of the cloud to move capital expenditures to operating expenditures for enhanced functionality. In this session see how IBM introduced innovation in deployments globally to help customers achieve employee and business productivity and enhanced quality of services. | Moscone West – Room 3022 |
| Tuesday | 12:15-13:00 | Migrating Oracle E-Business Suite to Oracle IaaS: A Customer Journey [CON1848]. Ken MacMahon, IT, Version 1; Ken Lynch, Head of IT, Irish Life; Simon Joyce, Consultant / Contractor, Version 1 Software. In this session hear about a leading global insurance provider’s experience of migrating Oracle E-Business Suite to Oracle Cloud. This session includes a discussion of the considerations for Oracle IaaS/PaaS vs. alternatives, the total cost of ownership for Oracle IaaS vs. on-premises solutions, the key project and support issues, the benefits of IaaS, and tips and tricks. Gain insights that can help others on their journey with Oracle IaaS generally and with Oracle E-Business Suite specifically. | Moscone West – Room 2001 |
| Wednesday | 14:00-14:45 | Ireland’s An Post: Customer Analytics Using Oracle Analytics Cloud [CON7176]. Tony Cassidy, CEO, Vertice; John Hagerty, Oracle. An Post, the Republic of Ireland’s state-owned provider of postal services, is an organization in transformation. It has used data and analytics to create innovations that led to cost savings and better sustainability. The current focus, customer analytics for a new line of business called Parcels and Packets, utilizes Oracle Analytics Cloud to externalize pertinent data to clients through a portal in a secure, effective, and easy-to-manage environment. In this session hear from An Post and its partner, Vertice, as they discuss the architecture and solution, along with recommendations for ensuring success using Oracle Analytics Cloud. | Moscone West – Room 3009 |
| Thursday | 13:45-14:30 | Is SQL the Best Language for Statistics and Machine Learning? [OOW and JavaOne CON7350]. Brendan Tierney, Oralytics. Did you know that Oracle Database comes with more than 300 statistical functions? And most of these statistical functions are available in all versions of Oracle Database? Most people do not seem to know this. When we hear about people performing statistical analytics, we hear them talking about Excel and R, but what if we could do statistical analysis in the database without having to extract any data onto client machines? This presentation explores the various statistical areas available in Oracle Database and gives several demonstrations. We can also greatly expand our statistical capabilities by using Oracle R Enterprise with the embedded capabilities in SQL. This presentation is just one of the 14 presentations that are scheduled for the Thursday! I believe this session is already fully booked, but you can still add yourself to the wait list. | Marriott Marquis (Golden Gate Level) – Golden Gate B |
My Oracle Open World 2017 Presentations
Oracle Open World 2017 will be happening very soon (1st-5th October). Still lots to do before I can get on that plane to San Francisco.
This year I’ll be giving 2 presentations (see table below). One on the Sunday during the User Groups Sunday sessions. I’ve been accepted on the EMEA track. I then get a few days off to enjoy and experience OOW until Thursday when I have my second presentation that is part of JavaOne (I think!)
My OOW kicks off on Friday 29th September with the ACE Director briefing at Oracle HQ, after flying to SFO on Thursday 28th. This year it is only for one day instead of two days. I really enjoy this event as we get to learn and see what Oracle will be announcing at OOW as well as some things that will be coming out during the following few months.
| Day | Time | Presentation | Location |
|---|---|---|---|
| Sunday | 13:45-14:30 | SQL: One Language to Rule All Your Data [OOW SUN1238]. SQL is a very powerful language that has been in use for almost 40 years. SQL comes with many powerful techniques for analyzing your data, and you can analyze data outside the database using SQL as well. Using the new Oracle Big Data SQL it is now possible to analyze data that is stored in a database, in Hadoop, and in NoSQL all at the same time. This session explores the capabilities in Oracle Database that allow you to work with all your data. Discover how SQL really is the unified language for processing all your data, allowing you to analyze, process, run machine learning, and protect all your data. Hopefully this presentation will be a bit of fun! Those of us who have been working with the database for a long time can sometimes forget what we can really do. Those starting out in their career may not realise what the database can do. The presentation delivers an important message while having a laugh or two (probably at me). | Marriott Marquis (Golden Gate Level) – Golden Gate C1/C2 |
| Thursday | 13:45-14:30 | Is SQL the Best Language for Statistics and Machine Learning? [OOW and JavaOne CON7350]. Did you know that Oracle Database comes with more than 300 statistical functions? And most of these statistical functions are available in all versions of Oracle Database? Most people do not seem to know this. When we hear about people performing statistical analytics, we hear them talking about Excel and R, but what if we could do statistical analysis in the database without having to extract any data onto client machines? This presentation explores the various statistical areas available in Oracle Database and gives several demonstrations. We can also greatly expand our statistical capabilities by using Oracle R Enterprise with the embedded capabilities in SQL. This presentation is just one of the 14 presentations that are scheduled for the Thursday! I believe this session is already fully booked, but you can still add yourself to the wait list. | Marriott Marquis (Golden Gate Level) – Golden Gate B |
My flights and hotel have been paid by OTN as part of the Oracle ACE Director program. Yes this costs a lot of money and there is no way I’d be able to pay these costs. Thank you.
My diary for OOW is really full. No, it is completely overbooked. It is just mental. Between attending conference sessions, meeting with various product teams (we only get to meet at OOW), attending various community meet-ups, this year I get to attend some events for OUG leaders (representing UKOUG), spending some time on the EMEA User Group booth, various meetings with people to discuss how they can help or contribute to the UKOUG, then there is Oak Table World, trying to check out the exhibition hall, spending some time at the OTN/ODC hangout area, getting a few OTN t-shirts, doing some book promotions at the Oracle Press shop, etc., etc., etc. I’m exhausted just thinking about it. Most days start at 7am and then finish around 10pm.
I’ll need a holiday when I get home, but it will be straight back to work 😦
If you are at OOW and want to chat then contact me via DM on Twitter or WhatsApp (these two are best) or via email (this will be the slowest way).
I’ll have another blog post listing the presentations from various people and partners from the Republic of Ireland who are speaking at OOW.
How to speed up your Oracle Data Mining with in-memory and parallel
Have you found running a workflow in Oracle Data Miner slow, or running the scripts in the database slow?
No. Good, because I haven’t found it slow.
But (there is always a but) it really depends on the volume of data you are dealing with. The vast majority of us, who aren’t the size of Google, Amazon, etc., have data volumes that are not really that large, and a basic server can process many millions of records extremely quickly using Oracle Data Mining.
But what if we have a large volume of data? In one recent project I had a data set containing over 3.5 billion records. Now that is big data. All of this data was sitting in an Oracle Database.
So how can we process over 3.5 billion records in a couple of seconds, building 4 machine learning models in that time? Is that really possible with just using an Oracle Database? Yes is the answer, and very easily. (Surely I needed Hadoop and Spark to process this data? Nope!)
The Oracle Data Miner (ODMr) tool in SQL Developer 4 (and higher) comes with a feature that allows you to manage the use of Parallel execution and the in-memory DB features. These can be accessed on the ODMr Worksheet tool bar.

The best time to look at these settings is when you have created your workflow and are ready to run it for the first time. When you click on the ‘Performance Options’ link, you will get the following window. It will display the list of nodes you have in the workflow and will indicate whether the Degree of Parallel and the In-Memory options can be set for each of the nodes.

The default values are shown and you can change these. For example, in a lot of scenarios you might prefer to leave the Degree of Parallel as System Determined. This will then use whatever the default is for the database, as controlled by the DBA. If you want to specify a particular value then you can, for example by setting the degree of parallel to 4 for the ‘Class Build’ node in the above image. Similarly, the in-memory option will only be available for nodes where it would be applicable. This will be where there is a lot of data processing (preparing data, transforming data, performing specific statistics, etc.) and for storing any data that is generated by Oracle Data Mining.
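The same two ideas, a degree of parallelism and an in-memory flag, also exist as ordinary table attributes in Oracle SQL. The sketch below only builds the ALTER TABLE statements as strings, to show what those settings look like outside the GUI. It is not necessarily what ODMr generates behind the scenes. The function name performance_ddl is hypothetical, the table name is just the sample table used elsewhere on this blog, and the INMEMORY clause needs the separately licensed Database In-Memory option.

```python
import re


def performance_ddl(table_name, degree=None, in_memory=False):
    """Build ALTER TABLE statements for a parallel degree and in-memory flag.

    degree=None leaves the degree of parallelism to the database default.
    """
    # allow only simple unquoted identifiers, to avoid building unsafe SQL
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_$#]*", table_name):
        raise ValueError("unexpected table name: %r" % table_name)
    statements = []
    if degree is None:
        statements.append("ALTER TABLE %s PARALLEL" % table_name)
    else:
        statements.append("ALTER TABLE %s PARALLEL %d" % (table_name, int(degree)))
    if in_memory:
        statements.append("ALTER TABLE %s INMEMORY" % table_name)
    return statements


for stmt in performance_ddl("INSUR_CUST_LTV_SAMPLE", degree=4, in_memory=True):
    print(stmt)
```

Each statement could then be run through a cursor's execute() method by a user with the privileges to alter the table.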
But what if you want to change the default values? You can change these at a global level within the SQL Developer Preferences. Here you can set the default to be used for each of the different types of Oracle Data Mining nodes.

I mentioned at the start that I’ve been able to build 4 machine learning models using Oracle Data Mining on a data set of over 3.5 billion records, all in a couple of seconds. In my scenario Parallel was set to 16 and we didn’t use in-memory as we didn’t have the licence for it. You can see that machine learning at lightning speed (ish) is possible. This timing is only for building the models, which is the step that consumes the most resources and time. When it comes to scoring the data, that is lightning fast. In my scenario, scoring over 300,000 records took less than a second, and I didn’t use parallel or anything else to speed things up, because we didn’t need to.
Go give it a try!






