Python-Connecting to multiple Oracle Autonomous DBs in one program

More and more people are using the FREE Oracle Autonomous Database for building new applications, or are migrating to it.

I’ve previously written about connecting to an Oracle Database using Python. Check out that post for details of how to set up Oracle Client and the Oracle Python library cx_Oracle.

In that blog post I gave examples of connecting to an Oracle Database using the HostName (or IP address), the Service Name or the SID.

But with the Autonomous Oracle Database things are a little bit different. With the Autonomous Oracle Database (ADW or ATP) you will need to use an Oracle Wallet file. This file contains some of the connection details, but you don’t have access to ServiceName/SID, HostName, etc.  Instead you have the name of the Autonomous Database. The Wallet is used to create a secure connection to the Autonomous Database.

You can download the Wallet file from the Database console on Oracle Cloud.

Screenshot 2020-01-10 12.24.10

Most people end up working with multiple databases. Sometimes these can be combined into one TNSNAMES file, which keeps things simple. To use the downloaded TNSNAMES file you will need to set the TNS_ADMIN environment variable to the directory containing it. This allows Python and the cx_Oracle library to pick up the file automatically, and you can connect to the ATP/ADW Database.

But most people don’t work with just a single database or use a single TNSNAMES file. In most cases you need to switch between different database connections and hence need to use multiple TNSNAMES files.

The question is how can you switch between ATP/ADW Databases using different TNSNAMES files while inside one Python program?

Use the os.environ setting in Python. This allows you to reassign the TNS_ADMIN environment variable to point to a new directory containing the TNSNAMES file. This is a temporary assignment that overrides the TNS_ADMIN environment variable for the running Python process only.

For example,

import cx_Oracle
import os

os.environ['TNS_ADMIN'] = "/Users/brendan.tierney/Dropbox/wallet_ATP"

p_username = ''
p_password = ''
p_service = 'atp_high'
con = cx_Oracle.connect(p_username, p_password, p_service)


I can now easily switch to another ATP/ADW Database, in the same Python program, by changing the value of os.environ and opening a new connection.

import cx_Oracle
import os

os.environ['TNS_ADMIN'] = "/Users/brendan.tierney/Dropbox/wallet_ATP"
p_username = ''
p_password = ''
p_service = 'atp_high'
con1 = cx_Oracle.connect(p_username, p_password, p_service)

os.environ['TNS_ADMIN'] = "/Users/brendan.tierney/Dropbox/wallet_ADW2"
p_username = ''
p_password = ''
p_service = 'ADW2_high'
con2 = cx_Oracle.connect(p_username, p_password, p_service)

As mentioned previously, setting and resetting TNS_ADMIN using os.environ is only temporary. When your Python program exits or completes, the original value of this environment variable remains unchanged.
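If you switch wallets often, the set-and-connect pattern above can be wrapped up so that TNS_ADMIN is put back to its previous value after each connection is opened. The following is a minimal sketch of that idea, not part of cx_Oracle: the names tns_admin and connect_with_wallet are made up for this example, and the connect argument is assumed to be any callable that takes (username, password, service), such as cx_Oracle.connect.

```python
import os
from contextlib import contextmanager


@contextmanager
def tns_admin(wallet_dir):
    """Temporarily point TNS_ADMIN at wallet_dir, then restore the old value."""
    previous = os.environ.get("TNS_ADMIN")
    os.environ["TNS_ADMIN"] = wallet_dir
    try:
        yield
    finally:
        # Put the environment back exactly as it was before the block
        if previous is None:
            os.environ.pop("TNS_ADMIN", None)
        else:
            os.environ["TNS_ADMIN"] = previous


def connect_with_wallet(connect, wallet_dir, username, password, service):
    """Open a connection while TNS_ADMIN points at the given wallet directory.

    `connect` is assumed to accept (username, password, service),
    for example cx_Oracle.connect.
    """
    with tns_admin(wallet_dir):
        return connect(username, password, service)
```

With cx_Oracle installed, this could be called as con1 = connect_with_wallet(cx_Oracle.connect, "/path/to/wallet_ATP", p_username, p_password, "atp_high"), and again with a different wallet directory and service name for the second database.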

Applying a Machine Learning Model in OAC

There are a number of different tools and languages available for machine learning projects. One such tool is Oracle Analytics Cloud (OAC).  Check out my article for Oracle Magazine that takes you through the steps of using OAC to create a Machine Learning workflow/dataflow.

Screenshot 2019-12-19 14.31.24

Oracle Analytics Cloud provides a single unified solution for analyzing data and delivering analytics solutions to businesses. Additionally, it provides functionality for processing data, allowing for data transformations, data cleaning, and data integration. Oracle Analytics Cloud also enables you to build a machine learning workflow, from loading, cleaning, and transforming data and creating a machine learning model to evaluating the model and applying it to new data—without the need to write a line of code. My Oracle Magazine article takes you through the various tasks for using Oracle Analytics Cloud to build a machine learning workflow.

That article covers the various steps for creating a machine learning model. This post will bring you through the steps of using that model to score/label new data.

In the Data Flows screen (accessed via Data->Data Flows) click on Create. We are going to create a new Data Flow to process the scoring/labeling of new data.

Screenshot 2019-12-19 15.08.39

Select Data Flow from the pop-up menu. The ‘Add Data Set’ window will open listing your available data sets. In my example, I’m going to use the same data set that I used in the Oracle Magazine article to build the model.  Click on the data set and then click on the Add button.

Screenshot 2019-12-19 15.14.44

The initial Data Flow will be created with the node for the Data Set. The screen will display all the attributes for the data set, and from this you can select which attributes to include or remove. For example, if you want a subset of the attributes to be used as input to the machine learning model, you can select these attributes at this stage. These can be adjusted at a later stage, but the data flow will need to be re-run to pick up the changes.

Screenshot 2019-12-19 15.17.48

The next step is to create the Apply Model node. To add this to the data flow, click on the small plus symbol to the right of the Data Set node. This opens a window from which you select Apply Model.

Screenshot 2019-12-19 15.22.40

A pop-up window will appear listing the various machine learning models that exist in your OAC environment. Select the model you want to use and click the OK button.

Screenshot 2019-12-19 15.24.42

Screenshot 2019-12-19 15.25.22

The next node to add to the data flow is to save the results/outputs from the Apply Model node. Click on the small plus icon to the right of the Apply Model node and select Save Results from the popup window.

Screenshot 2019-12-19 15.27.50.png

We now have a completed data flow. Before you finish, edit the Save Data node to give a name to the saved data set. You can also choose which attributes/features you want in the result set.

Screenshot 2019-12-19 15.30.25.png

You can now save and run the Data Flow, and view the outputs from applying the machine learning model. The saved data set results can be viewed in the Data menu.

Screenshot 2019-12-19 15.35.11


Oracle Magazine articles

Over the past few weeks I’ve had a couple of articles published with Oracle Magazine and these can be viewed on their website.

The first article is titled ‘Quickly Create Charts and Graphs of Your Query Data‘ using Oracle Machine Learning Notebooks.

Screenshot 2019-11-05 10.51.58

The second article is titled ‘REST-Enabling Oracle Machine Learning Models‘.

Screenshot 2019-11-05 10.55.13

Click on the above links to check out those articles and check out the Oracle Magazine website for lots more articles and content.

There will be a few more Oracle Magazine articles coming out over the next few months.


Creating a VM on Oracle Always Free

I’m going to create a new Cloud VM to host some of my machine learning work. The first step is to create the VM before installing the machine learning software.

That’s what I’m going to do in this blog post and the next one. In this post I’ll step through how to set up the VM using the Oracle Always Free cloud offering. In the next I’ll go through the machine learning software install and setup.

Step 1 – Create a ssh key/file

Whatever your preferred platform for your day-to-day computer, there will be software available for you to generate an ssh key file. You will need this when creating the VM and when you want to log in to the VM on the command line. My day-to-day workhorse is a Mac, and I used the following command to create the ssh key file.

ssh-keygen -t rsa -N "" -b 2048 -C "myOracleCloudkey" -f myOracleCloudkey

Step 2 – Login and Select create VM

Log into your Oracle Cloud Always Free account.

Screenshot 2019-10-21 17.17.00

Select Create a VM Instance.

Screenshot 2019-10-21 17.20.09

Step 3 – Configure the VM

Give the instance a name. I called mine ‘b01-vm-1’.

Screenshot 2019-10-21 17.23.02

Expand the networks section by clicking on Show Shape, Network and Storage Options. Set the IP address to be public.

Screenshot 2019-10-21 17.37.33

Scroll down to the ssh section. Select the ssh file you created earlier.

Screenshot 2019-10-21 17.24.22

Click on the Create button.

That’s it, all done. Just wait for the VM to be created. This will take a few seconds.

Screenshot 2019-10-21 17.26.15

After the VM is created the IP address will be listed on this screen. Take note of it.

Step 4 – Connect and log into the VM

We can now log into the VM using ssh, to prove that it exists, using the command

ssh -i <name of ssh file> opc@<ip address of VM>

When I use this command I get the following:

The authenticity of host 'XXX.XXX.XXX.XXX (XXX.XXX.XXX.XXX)' can't be established.
ECDSA key fingerprint is SHA256:fX417Z1yFoQufm7SYfxNi/RnMH5BvpvlOb2gOgnlSCs.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'XXX.XXX.XXX.XXX' (ECDSA) to the list of known hosts.
Enter passphrase for key 'XXXXXXXXXX': 
[opc@b1-vm-01 ~]$ pwd

[opc@b1-vm-01 ~]$ df
Filesystem   1K-blocks     Used Available Use%  Mounted on
devtmpfs        469092        0    469092   0%  /dev
tmpfs           497256        0    497256   0%  /dev/shm
tmpfs           497256     6784    490472   2%  /run
tmpfs           497256        0    497256   0%  /sys/fs/cgroup
/dev/sda3     40223552  1959816  38263736   5%  /
/dev/sda1       204580     9864    194716   5%  /boot/efi
tmpfs            99452        0     99452   0%  /run/user/1000

And there we have it. A VM set up on Oracle Always Free.

Next step is to install some Machine Learning software.

Changing PDB/CDB spfile parameters

When working with an Oracle database hosted on the Oracle cloud (not an Autonomous DB), I recently had the need to change/increase the number of processes for the database. After a bit of researching it looked like I just had to make the change to the SPFILE and that would be it.

I needed to change/increase the PROCESSES parameter for the CDB and the PDB. Following the multitude of advice on the internet, I ssh’d into the DB server, found the SPFILE and changed it.

I bounced the DB and when I connected to the PDB, I found the number for PROCESSES was still the same as the old/original value. Nothing had changed.

By default the initialization parameters for the PDB inherit their values from the parameters for the CDB. But this didn’t seem to be the case.

After a bit more research, I needed to set this parameter for the CDB and the PDB. But no luck finding a parameter file for the PDB. It turns out the parameters for the PDB are set at the metadata level, and I needed to change the parameter there.

What I had to do was to change the value when connected to the database using SQL*Plus, SQL Developer, etc.  So, how did I change the parameter value?

Using SQL Developer as my tool, I connected as SYSDBA to my PDB. Then ran,

alter session set container = cdb$root

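Now change the parameter value. PROCESSES is a static parameter, so the change has to be written to the SPFILE and only takes effect after a restart.

alter system set processes = 1200 scope=spfile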

I then bounced the database, logged back into my PDB as system and checked the parameter values. It worked. This was such a simple solution and it worked for me, but there were way too many articles, blog posts, etc. out there that didn’t work. Something I’ll need to investigate later is whether I needed to connect to the CDB at all. Could I have just run the second command only?  I need to set up a different/test DB and see.


OML Workspace Permissions

When working with Oracle Machine Learning (OML) you are creating notebooks which focus on a particular data exploration and possibly some machine learning. Despite its name, OML is used extensively for data discovery and data exploration.

One of the aims of using OML, or notebooks in general, is that these can be easily shared with other people, either within the same team or beyond. Something to consider when sharing notebooks is what you are allowing other people to do with your notebook. Without any permissions you are allowing people to inspect, run and modify the notebooks. This can be a problem because the people you are sharing with may or may not be allowed to make modifications. Some people should be able to just view the notebook, and others should be able to perform more advanced tasks.

With OML Notebooks there are four primary types of people who can access Notebooks and these can have different privileges. These are defined as

  • Developer : Can create new notebooks within a project and workspace but cannot create a workspace or a project. Can create and run a notebook as a scheduled job.
  • Viewer : Can only view projects, workspaces and notebooks. Not allowed to create or run anything.
  • Manager : Can create new notebooks and projects, but can only view workspaces. Additionally, can schedule notebook jobs.
  • Administrator : Administrators of the OML environment do not have any edit capabilities on notebooks, but they can view them.

Screenshot 2019-09-14 05.24.18

Screenshot 2019-09-14 10.40.23

OML Notebooks Interpreter Bindings

When using Oracle Machine Learning notebooks, you can export and import these between different projects and different environments (from ADW to ATP).

But something to watch out for when you import a notebook into your ADW or ATP environment is to reset the Interpreter Bindings.

When you create a new OML Notebook and build it up, the various Interpreter Bindings are automatically set or turned on. But for Imported OML Notebooks they are not turned on.

I’m assuming this will be fixed at some future point.

If you import an OML Notebook and do not turn on the Interpreter Bindings, you may find the code in your notebook cells running very slowly.

To turn on these bindings, click on the options icon, as indicated by the red box in the following image.

Screenshot 2019-08-19 21.04.58

You will get something like the following being displayed. None of the bindings are highlighted.

Screenshot 2019-08-19 21.08.03

To enable the Interpreter Bindings just click on each of these boxes. When you do this each one will be highlighted and will turn a blue color.

Screenshot 2019-08-19 21.07.20

All done!  You can now run your OML Notebooks without any problems or delays.


ADW – Loading data using Object Storage

There are a number of different ways to load data into your Autonomous Data Warehouse (ADW) environment. I’ll have posts about these alternatives.

In this blog post I’ll go through the steps needed to load data using Object Storage. This might appear to have a large-ish number of steps, but once you have gone through it and have some of the parts already set up and configured from your first time, the second and subsequent times will be easier.

After logging into your Oracle Cloud dashboard, select Object Storage from the side menu.

Screenshot 2019-07-29 14.58.46

Then click on the Create Bucket button.

Screenshot 2019-07-29 15.01.40

Enter a name for the Object Storage bucket, take the defaults for the rest, and click on the Create Bucket button at the bottom. In my example, I’ve called the bucket ‘ADW_Bucket’.

Screenshot 2019-07-29 15.04.42

Click on the name of the bucket in the list.

Screenshot 2019-07-29 15.05.15

And then click the Upload Objects button.

Screenshot 2019-07-29 15.06.29

In the Upload Objects window, browse for the file(s) you want to upload.

Screenshot 2019-07-29 15.08.00  Screenshot 2019-07-29 15.09.12

Then click on the Upload Objects button in the Upload Objects window. After a few moments you will see a message saying the file(s) have been uploaded. Click on the Close button.

Screenshot 2019-07-29 15.13.02

Click into the Object details and take a note/copy of the URL Path. You will need this later.

To load data from the Oracle Cloud Infrastructure (OCI) Object Storage you will need an OCI user with the appropriate privileges to read data from (or upload data to) the Object Store. The communication between the database and the object store relies on the Swift protocol and the OCI user Auth Token. Go back to the menu in the upper left and select Users.

Screenshot 2019-07-29 15.20.51  Screenshot 2019-07-29 15.22.41

Then click on the user name to view the details. This is probably your OCI username.

On the left-hand side of the page click Auth Tokens, and then click on the Generate Token button. Give a name for the token, e.g. ADW_TOKEN, and then generate the token.

Screenshot 2019-07-29 15.29.42

Save the generated token to use later.

Screenshot 2019-07-29 15.33.33

Open SQL Developer and set up a connection to your OML user/schema. Once connected, the next step is to authenticate with Object Storage using your OCI username and the Auth Token generated above.

BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'ADW_TOKEN',
    username => '<your cloud username>',
    password => '<generated auth token>');
END;

If successful you should get the following message. If not, then you probably entered something incorrectly. Go back and review the previous steps.

PL/SQL procedure successfully completed.

Next, create a table to store the data you want to import. For my table the create table is the following. [It is one of the sample data sets for OML, and I’ve made the create table statement compact to save space in this post]

create table credit_scoring_100k 
( customer_id number(38,0), age number(4,0), income number(38,0), marital_status varchar2(26 byte), 
number_of_liables number(3,0), wealth varchar2(4000 byte), education_level varchar2(26 byte), 
tenure number(4,0), loan_type varchar2(26 byte), loan_amount number(38,0), 
loan_length number(5,0), gender varchar2(26 byte), region varchar2(26 byte), 
current_address_duration number(5,0), residental_status varchar2(26 byte), 
number_of_prior_loans number(3,0), number_of_current_accounts number(3,0), 
number_of_saving_accounts number(3,0), occupation varchar2(26 byte), 
has_checking_account varchar2(26 byte), credit_history varchar2(26 byte), 
present_employment_since varchar2(26 byte), fixed_income_rate number(4,1), 
debtor_guarantors varchar2(26 byte), has_own_phone_no varchar2(26 byte), 
has_same_phone_no_since number(4,0), is_foreign_worker varchar2(26 byte), 
number_of_open_accounts number(3,0), number_of_closed_accounts number(3,0), 
number_of_inactive_accounts number(3,0), number_of_inquiries number(3,0), 
highest_credit_card_limit number(7,0), credit_card_utilization_rate number(4,1), 
delinquency_status varchar2(26 byte), new_bankruptcy varchar2(26 byte), 
number_of_collections number(3,0), max_cc_spent_amount number(7,0), 
max_cc_spent_amount_prev number(7,0), has_collateral varchar2(26 byte), 
family_size number(3,0), city_size varchar2(26 byte), fathers_job varchar2(26 byte), 
mothers_job varchar2(26 byte), most_spending_type varchar2(26 byte), 
second_most_spending_type varchar2(26 byte), third_most_spending_type varchar2(26 byte), 
school_friends_percentage number(3,1), job_friends_percentage number(3,1), 
number_of_protestor_likes number(4,0), no_of_protestor_comments number(3,0), 
no_of_linkedin_contacts number(5,0), average_job_changing_period number(4,0), 
no_of_debtors_on_fb number(3,0), no_of_recruiters_on_linkedin number(4,0), 
no_of_total_endorsements number(4,0), no_of_followers_on_twitter number(5,0), 
mode_job_of_contacts varchar2(26 byte), average_no_of_retweets number(4,0), 
facebook_influence_score number(3,1), percentage_phd_on_linkedin number(4,0), 
percentage_masters number(4,0), percentage_ug number(4,0), 
percentage_high_school number(4,0), percentage_other number(4,0), 
is_posted_sth_within_a_month varchar2(26 byte), most_popular_post_category varchar2(26 byte), 
interest_rate number(4,1), earnings number(4,1), unemployment_index number(5,1), 
production_index number(6,1), housing_index number(7,2), consumer_confidence_index number(4,2), 
inflation_rate number(5,2), customer_value_segment varchar2(26 byte), 
customer_dmg_segment varchar2(26 byte), customer_lifetime_value number(8,0), 
churn_rate_of_cc1 number(4,1), churn_rate_of_cc2 number(4,1), 
churn_rate_of_ccn number(5,2), churn_rate_of_account_no1 number(4,1), 
churn_rate__of_account_no2 number(4,1), churn_rate_of_account_non number(4,2), 
health_score number(3,0), customer_depth number(3,0), 
lifecycle_stage number(38,0), credit_score_bin varchar2(100 byte));

After creating the table, you are ready to import the data from Object Storage. To do this you will need to use the DBMS_CLOUD PL/SQL package.

BEGIN
   DBMS_CLOUD.COPY_DATA(
   table_name =>'credit_scoring_100k',
   credential_name =>'ADW_TOKEN',
   file_uri_list => '<url of file in your Object Store bucket, see comment earlier in post>',
   format => json_object('ignoremissingcolumns' value 'true', 'removequotes' value 'true', 'dateformat' value 'YYYY-MM-DD HH24:MI:SS', 'blankasnull' value 'true', 'delimiter' value ',', 'skipheaders' value '1'));
END;

All done.

You can now query the data and use with Oracle Machine Learning, etc.
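The format options used in the load can be easier to reason about with a small local illustration. The following Python sketch is only a rough approximation of what skipheaders, delimiter, removequotes, blankasnull and ignoremissingcolumns do to a CSV file. It uses the standard csv module and does not call DBMS_CLOUD at all; the function name parse_like_copy_data and the sample data are made up for this example, and the actual behaviour in the database is defined by the DBMS_CLOUD documentation.

```python
import csv
import io


def parse_like_copy_data(text, column_count, delimiter=",", skipheaders=1):
    """Approximate, locally, the effect of the load format options on CSV text."""
    # csv.reader removes the quotes that enclose a field (roughly 'removequotes')
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    rows = []
    for line_number, fields in enumerate(reader):
        if line_number < skipheaders:
            continue  # 'skipheaders': ignore the first N lines
        # 'blankasnull': empty or whitespace-only fields become NULL (None here)
        fields = [None if field.strip() == "" else field for field in fields]
        # 'ignoremissingcolumns': columns missing from the file are loaded as NULL
        fields += [None] * (column_count - len(fields))
        rows.append(fields)
    return rows
```

For example, calling parse_like_copy_data on a file with a header line and the data line 3,41 for a three-column table gives the row ['3', '41', None].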

[I said at the top of the post there are other methods available. More on this in other posts]


Oracle ADW how to load new OML notebooks

Oracle Autonomous Data Warehouse (ADW) has been out a while now and has had several behind-the-scenes improvements and new/additional features added.

If you have used the Oracle Machine Learning (OML) component of ADW you will have seen the various sample OML Notebooks that come pre-loaded. These are easy to open, use and to try out the various OML features.

Screenshot 2019-07-29 13.07.01

The above image shows the top part of the login screen for OML. To see the available sample notebooks click on the Examples icon. When you do, you will get the following sample OML Notebooks.

Screenshot 2019-07-29 13.08.44

But what if you have a notebook you have used elsewhere? These can be exported in JSON format and loaded as a new notebook in OML.

To load a new notebook into OML, select the icon (three horizontal lines) in the top left-hand corner of the screen. Then select Notebooks from the menu.

Screenshot 2019-07-29 13.11.41           Screenshot 2019-07-29 13.21.07

Screenshot 2019-07-29 13.21.49

Then select the Import button located at the top of the Notebooks screen. This will open a File window, where you can select the json file from your file system.

Screenshot 2019-07-29 13.24.58

A couple of seconds later the notebook will be available and listed alongside any other notebooks you may have created.

Screenshot 2019-07-29 13.26.13

All done!

You have now imported a new notebook into OML and can now use it to process your data and perform machine learning using the in-database features.







Machine Learning on Mobile Devices

You: What? You can’t be serious?  Machine Learning on Mobile Devices?

Me: The simple answer is ‘Yes you can!’

You: But, what about all the complex data processing, CPU or GPU, and everything else that is needed for machine learning?

Me: Yes you are correct, those things might not be needed. What’s the answer to everything in IT?

You: It Depends ?

Me: Exactly. Yes It Depends on what you are doing. In most cases you don’t need large amounts of machine processing power to do machine learning. Except if you are doing image processing. Then you do need a bit of power to support that work.

You: But how can a mobile device be used for machine learning?

Screenshot 2019-07-19 14.24.22

Me: It Depends! 🙂  It depends on what you are doing. Most of the data processing power needed is for creating the models. That is what most people talk about. Very few people talk about the deployment of machine learning. Deployment, as in, using the machine learning models in your applications.

You: But why mobile devices? That sounds a bit silly?

Me: It does a bit. But when you think about it, how much do you use your mobile phone and tablet?  Where else have you seen mobile devices being used?

You: I use these all the time, to do nearly everything. Just like everyone else I know.

Me: Exactly!  and where else have you seen mobile devices being used?

You: Everywhere! hotels, bars, shops, hospitals, everywhere!

Me: Exactly. And it kind of makes sense to have machine learning scoring done at the point of capture of the data and not some hours or days or weeks later in some data warehouse or something else.

You: But what about the processing power of these devices. They aren’t powerful enough to run the machine learning models? Or are they?

Me: What is a machine learning model? In a simple way it is a mathematical formula of the data that calculates a particular outcome. Something that is a bit more complicated than using a sum function.  Could a mobile device do that easily?

You: Yes. That should be really easy and fast for mobile devices? But machine learning is complex. People keep telling me how complex it is and how difficult it is!

Me: True, it can be, but for most problems it can be as simple as writing a few lines of code to create a model. 3-4 lines of code in some languages. Applying the machine learning model can be an even simpler task (maybe 1 line of code), although some simple data formatting might be needed, but that is a simple task too.

You: So, how can a machine learning model be run on a mobile device?

Me: Programmers write code to run applications on mobile devices. This code can be extended to include the machine learning model. This can be used to score or label the data or do some other processing. A few lines of code.  A good alternative is to create a web service to allow the remote scoring of the data.

You: The programming languages used for mobile development are a bit different to most other applications. Surely those mobile device languages don’t support machine learning.

Me: You’d be surprised by what’s available.

You: OK, What languages can I try? Where can I get started?

Me: Check out Firebase ML Kit, Apple CoreML and TensorFlow Lite. Those should be more than enough for you to get started with. There are a few others. But start with those.

You: Brilliant, thank you Brendan. I’ll let you know how I get on with those.
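To make the point about a model being "a mathematical formula of the data" a little more concrete, here is a minimal sketch of scoring with a logistic regression model once it has been built somewhere else. The coefficients and attribute names are invented for illustration and are not taken from any real model. The arithmetic is small enough to run on any device.

```python
import math

# Hypothetical coefficients, as if exported from a model trained elsewhere
INTERCEPT = -3.0
WEIGHTS = {"age": 0.04, "visits_per_month": 0.25}


def score(record):
    """Return the predicted probability for one record (logistic regression)."""
    z = INTERCEPT + sum(WEIGHTS[name] * record[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def label(record, threshold=0.5):
    """Turn the probability into a yes/no label."""
    return "yes" if score(record) >= threshold else "no"
```

Applying the model really is one line, for example label({"age": 60, "visits_per_month": 8}). Everything expensive happened earlier, when the coefficients were learned.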



GoLang: Querying records from Oracle Database using goracle

Continuing my series of blog posts on using Go Lang with Oracle, in this post I’ll look at how to set up a query, run the query and parse the query results. I’ll give some examples that include setting up the query as a prepared statement and how to run a query and retrieve the first record returned. Another version of this last example is a query that returns one row.

Check out my previous post on how to create a connection to an Oracle Database.

Let’s start with a simple example. This is the same example from the blog I’ve linked to above, with the Database connection code omitted.

    dbQuery := "select table_name from user_tables where table_name not like 'DM$%' and table_name not like 'ODMR$%'"
    rows, err := db.Query(dbQuery)
    if err != nil {
        fmt.Println(".....Error processing query", err)
        return
    }
    defer rows.Close()

    fmt.Println("... Parsing query results")
    var tableName string
    for rows.Next() {
        // Read the single column of the current row into tableName
        rows.Scan(&tableName)
        fmt.Println(tableName)
    }

Processing a query and its results involves a number of steps and these are:

  1. Use the Query() function to send the query to the database and check the returned error. You can also check for errors when processing each row.
  2. Iterate over the rows using Next()
  3. Read the columns for each row into variables using Scan(). These need to be defined because Go is strongly typed.
  4. Close the query results using Close(). You might want to defer the use of this function, but that depends on whether the query will be reused. The result set will auto close the query after it reaches the last record (in the loop). The Close() is there just in case there is an error and cleanup is needed.

You should never use * as a wildcard in your queries. Always explicitly list the attributes you want returned and only list the attributes you want/need. Never list all attributes unless you are going to use all of them. There can be major query performance benefits with doing this.

Now let us have a look at using prepared statements. With these we can parameterize the query, giving us greater flexibility and reuse of the statements. Additionally, these give us better query execution and performance when run in the database, as the execution plans can be reused.

    dbQuery, err := db.Prepare("select cust_first_name, cust_last_name, cust_city from sh.customers where cust_gender = :1")
    if err != nil {
        fmt.Println(".....Error preparing query", err)
        return
    }
    defer dbQuery.Close()

    // Bind values are passed as Go strings, so use "M" rather than the rune 'M'
    rows, err := dbQuery.Query("M")
    if err != nil {
        fmt.Println(".....Error processing query", err)
        return
    }
    defer rows.Close()

    var CustFname, CustSname, CustCity string
    for rows.Next() {
        rows.Scan(&CustFname, &CustSname, &CustCity)
        fmt.Println(CustFname, CustSname, CustCity)
    }

Sometimes you may have queries that return only one row, or you only want the first row returned by the query. In cases like this you can reduce the code to something like the following, using QueryRow().

var CustFname, CustSname, CustCity string
// QueryRow() returns at most one row; Scan() returns sql.ErrNoRows if there is none
err := db.QueryRow("select cust_first_name, cust_last_name, cust_city from sh.customers where cust_gender = :1", "M").Scan(&CustFname, &CustSname, &CustCity)
if err != nil {
    fmt.Println(".....Error processing query", err)
    return
}
fmt.Println(CustFname, CustSname, CustCity)

Alternatively, prepare the statement first and then call QueryRow() on the prepared statement.

dbQuery, err := db.Prepare("select cust_first_name, cust_last_name, cust_city from sh.customers where cust_gender = :1")
if err != nil {
    fmt.Println(".....Error preparing query", err)
    return
}
defer dbQuery.Close()

var CustFname, CustSname, CustCity string
err = dbQuery.QueryRow("M").Scan(&CustFname, &CustSname, &CustCity)
if err != nil {
    fmt.Println(".....Error processing query", err)
    return
}
fmt.Println(CustFname, CustSname, CustCity)





Embedding Transformation Data Pipeline into ML Model using Oracle Data Mining

I’ve written several blog posts about how to use the DBMS_DATA_MINING_TRANSFORM package to create various data transformations and how to apply these to your data. All of these steps can be simple enough to follow and re-run in a lab environment. But the real value with data science and machine learning comes when you deploy the models into production and have the ML models scoring data as it is being produced, and your applications acting upon these predictions immediately, and not some hours or days later when the data finally arrives in the lab environment.

It would be useful to be able to bundle all the transformations into the same process that creates the model. The transformations and model become one, together.  If this is possible, then that greatly simplifies how the ML model can be deployed into production. It then becomes a simple function or REST call. We need to keep this simple (KISS).

Using the examples from my previous blog posts performing various data transformations, the following example shows how you can bundle these up into one defined set of transformations and then embed these transformations as part of the ML model. To do this we need to define a list of transformations. We can do this using:

xform_list            IN TRANSFORM_LIST DEFAULT NULL

Where TRANSFORM_LIST is a table of TRANSFORM_REC records, each with the following structure:

TRANSFORM_REC IS RECORD (
     attribute_name       VARCHAR2(4000),
     attribute_subname    VARCHAR2(4000),
     expression           EXPRESSION_REC,
     reverse_expression   EXPRESSION_REC,
     attribute_spec       VARCHAR2(4000));

You can use the DBMS_DATA_MINING_TRANSFORM.SET_TRANSFORM procedure to define the transformations. The following example illustrates the transformation of converting the BOOKKEEPING_APPLICATION attribute from a number data type to a character data type.

   transform_stack   dbms_data_mining_transform.TRANSFORM_LIST;

Alternatively you can use the SET_EXPRESSION function and then create the transformation using it.

You can stack the transforms together. Using the above example you could express a number of transformations and have these stored in the TRANSFORM_STACK variable. You can then pass this variable into your CREATE_MODEL procedure and have these transformations embedded in your ML model.


DECLARE
   transform_stack   dbms_data_mining_transform.TRANSFORM_LIST;
BEGIN
   -- Define the transformation list (SET_TRANSFORM calls go here)

   -- Create the data mining model
   dbms_data_mining.CREATE_MODEL(
      model_name           => 'DEMO_TRANSFORM_MODEL',
      mining_function      => dbms_data_mining.classification,
      data_table_name      => 'MINING_DATA_BUILD_V',
      case_id_column_name  => 'cust_id',
      target_column_name   => 'affinity_card',
      settings_table_name  => 'demo_class_dt_settings',
      xform_list           => transform_stack);
END;

My previous blog posts showed how to create various types of transformations. These transformations were then used to create a view of the data set that included these transformations. To embed these transformations in the ML model we need to use the STACK procedures. The following example illustrates the stacking of the transformations created in the previous blog posts. These transformations are added (or stacked) to a transformation list, which is then passed to the CREATE_MODEL procedure, embedding the transformations in the model.


DECLARE
   transform_stack   dbms_data_mining_transform.TRANSFORM_LIST;
BEGIN
   -- Stack the missing numeric transformations
   dbms_data_mining_transform.STACK_MISS_NUM (
          miss_table_name   => 'TRANSFORM_MISSING_NUMERIC',
          xform_list        => transform_stack);

   -- Stack the missing categorical transformations
   dbms_data_mining_transform.STACK_MISS_CAT (
          miss_table_name   => 'TRANSFORM_MISSING_CATEGORICAL',
          xform_list        => transform_stack);

   -- Stack the outlier treatment for AGE
   dbms_data_mining_transform.STACK_CLIP (
          clip_table_name   => 'TRANSFORM_OUTLIER',
          xform_list        => transform_stack);

   -- Stack the normalization transformation
   dbms_data_mining_transform.STACK_NORM_LIN (
          norm_table_name   => 'MINING_DATA_NORMALIZE',
          xform_list        => transform_stack);

   -- Create the data mining model
   dbms_data_mining.CREATE_MODEL(
      model_name           => 'DEMO_STACKED_MODEL',
      mining_function      => dbms_data_mining.classification,
      data_table_name      => 'MINING_DATA_BUILD_V',
      case_id_column_name  => 'cust_id',
      target_column_name   => 'affinity_card',
      settings_table_name  => 'demo_class_dt_settings',
      xform_list           => transform_stack);
END;

To view the embedded transformations in your data mining model you can use the GET_MODEL_TRANSFORMATIONS function.

SELECT TO_CHAR(expression)
FROM   TABLE(dbms_data_mining.GET_MODEL_TRANSFORMATIONS('DEMO_STACKED_MODEL'));

(CASE  WHEN (NVL("AGE",38.892)<18) THEN 18 WHEN (NVL("AGE",38.892)>70) THEN 70 ELSE NVL("AGE",38.892) END -18)/52

08867)>8) THEN 8 ELSE NVL("YRS_RESIDENCE",4.08867) END -1)/7

NVL("COUNTRY_NAME",'United States of America')
NVL("CUST_INCOME_LEVEL",'J: 190,000 - 249,999')
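The embedded expression for AGE shown above combines three of the stacked steps in one formula: missing-value replacement, outlier clipping and linear normalization. As a way of reading it, the following Python sketch re-expresses that same formula outside the database. It is only an illustration of the arithmetic, using the constants that appear in the expression. The function name transform_age is made up here, and this is not what the database itself executes.

```python
AGE_DEFAULT = 38.892          # value substituted for a missing AGE (the NVL)
AGE_LOW, AGE_HIGH = 18, 70    # clipping bounds from the outlier treatment


def transform_age(age):
    """Apply missing-value, clipping and normalization steps to one AGE value."""
    value = AGE_DEFAULT if age is None else age        # NVL("AGE", 38.892)
    value = min(max(value, AGE_LOW), AGE_HIGH)         # CASE ... <18 / >70 clipping
    return (value - AGE_LOW) / (AGE_HIGH - AGE_LOW)    # (x - 18) / 52
```

For example, an AGE of 44 becomes 0.5, any AGE below 18 becomes 0.0, and any AGE above 70 becomes 1.0. Because the model carries this logic with it, the application only ever supplies the raw AGE value.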