Go Lang

Connecting Go Lang to Oracle Database

Posted on Updated on

It seems like more and more people are using Go. With that comes the need to  access a database or databases. This blog will show you how to get connected to an Oracle Database and to perform some basic operations using Go.

The first thing you need is to have Go installed. There are a couple of options for you. The first is go download from the Go Lang website, or if you are an Oracle purist they provide some repositories for you and these can be installed using yum.

Next you need to install Oracle Instant Client.

Screenshot 2019-04-22 10.48.42

Unzip the contents of the downloaded file. Copy the extracted directory (and it’s contents) to your preferred location and then add the path to this directory to the search PATH. Depending on your configuration and setup, you may need to configure some environment variables. Some people report having to create a ‘.pc’ file and having to change the symlinks for libraries. I didn’t have to do any of this.

The final preparation steps, after installing Go and Oracle Instant Client, is to download the ‘goracle’ package. This package provides a GO database/sql driver for connecting to Oracle Database using ODPI-C. To install the ‘goracle’ package run:

go get gopkg.in/goracle.v2

This takes a few seconds to run. There is no display updates or progress updates when this command is running.

See below for the full Go code for connecting to Oracle Database, executing a query and gathering some database information. To this code, with file name ‘ora_db.go’

go run ora_db.go

I’ll now break this code down into steps.

Import the packages that will be used in you application. In this example I’m importing four packages. ‘fmt’ is the formatting package and allows us to write to standard output. the ‘time’ package allows us to capture and print some date, time and how long things take. The ‘database/sql’ package to allow SQL execution and is needed for the final package ‘goracle’.

import (
    "fmt"
    "time"
    "database/sql"
    goracle "gopkg.in/goracle.v2"
)

Next we can define the values needed for connecting the Oracle Database. These include the username, password, the host string and the database name.

    username := "odm_user";
    password := "odm_user";
    host := ".....";
    database := "....";

Now test the database connection. This doesn’t actually create a connection. This is deferred until you run the first command against the database. This tests the connection

db, err := sql.Open("goracle", username+"/"+password+"@"+host+"/"+database)
if err != nil {
    fmt.Println("... DB Setup Failed") 
    fmt.Println(err)
    return
}
defer db.Close()

If an error is detected, the error message will be printed and the application will exit (return). the ‘defer db.Close’ command sets up to close the connection, but defers it to the end of the application. This allows you to keep related code together and avoid having to remember to add the close command at the end of your code.

Now force the connection to open using a Ping command

if err = db.Ping(); err != nil {
    fmt.Println("Error connecting to the database: %s\n", err)
    return
}

Our database connection is now open!

The ‘goracle’ package allows us to get some metadata about the connection, such as details of the client and server configuration. Here we just gather the details of what version of the database we are connected to. The Oracle Database I’m using is 18c Extreme Edition host on Oracle Cloud.

var serverVersion goracle.VersionInfo 
serverVersion, err = goracle.ServerVersion(db);
if err != nil {
    fmt.Printf("Error getting Database Information: %s\n", err)
    return
}
fmt.Println("DB Version : ",serverVersion)

First we define the variable used to store the server details in. This is defined with data type as specified in the ‘goracle’ package. Then gather the server version details, check for an error and finally print out the details.

To execute a query, we define the query (dbQuery) and then use the connection (db) to run this query (db.Query). The variable ‘rows’ points to the result set from the query. Then defer the closing of the results set point. We need to keep this results set open, as we will parse it in the next code segment.

dbQuery := "select table_name from user_tables where table_name not like 'DM$%' and table_name not like 'ODMR$%'"
rows, err := db.Query(dbQuery)
if err != nil {
   fmt.Println(".....Error processing query")
   fmt.Println(err)
   return
}
defer rows.Close()

To parse the result set, we can use a FOR loop. Before the loop we define a variable to contain the value returned from the result set (tableName). The FOR loop will extract each row returned and assign the value returned to the variable tableName. This variable is then printed.

var tableName string
for rows.Next() {
   rows.Scan(&tableName)
   fmt.Println(tableName)
}

That’s it.

We have connected to Oracle Database, retrieved the version of the database, executed a query and processed the result set.

Here is the full code and the output from running it.

package main

import (
    "fmt"
    "time"
    "database/sql"
    goracle "gopkg.in/goracle.v2"
)

func main(){
    username := "odm_user";
    password := "odm_user";
    host := ".....";
    database := "....";

    currentTime := time.Now()
    fmt.Println("Starting at : ", currentTime.Format("03:04:05:06 PM"))

    fmt.Println("... Setting up Database Connection") 
    db, err := sql.Open("goracle", username+"/"+password+"@"+host+"/"+database)
    if err != nil {
        fmt.Println("... DB Setup Failed") 
        fmt.Println(err)
        return
    }
    defer db.Close()

    fmt.Println("... Opening Database Connection") 
    if err = db.Ping(); err != nil {
        fmt.Println("Error connecting to the database: %s\n", err)
        return
    }
    fmt.Println("... Connected to Database")

    var serverVersion goracle.VersionInfo 
    serverVersion, err = goracle.ServerVersion(db);
    if err != nil {
        fmt.Printf("Error getting Database Information: %s\n", err)
        return
    }
    fmt.Println("DB Version : ",serverVersion)

    dbQuery := "select table_name from user_tables where table_name not like 'DM$%' and table_name not like 'ODMR$%'"
    rows, err := db.Query(dbQuery)
    if err != nil {
        fmt.Println(".....Error processing query")
        fmt.Println(err)
        return
    }
    defer rows.Close()

    fmt.Println("... Parsing query results") 
    var tableName string
    for rows.Next() {
        rows.Scan(&tableName)
        fmt.Println(tableName)
    }

    fmt.Println("... Closing connection") 
    finishTime := time.Now()
    fmt.Println("Finished at ", finishTime.Format("03:04:05:06 PM"))
}

Screenshot 2019-04-22 11.30.10

Advertisements

Migrating Python ML Models to other languages

Posted on Updated on

I’ve mentioned in a previous blog post about experiencing some performance issues with using Python ML in production. We needed something quicker and the possible languages we considered were C, C++, Java and Go Lang.

But the data science team used R and Python, with just a few more people using Python than R on the team.

One option was to rewrite everything into the language used in production. As you can imagine no-one wanted to do that and there was no way of ensure a bug free solution and one that gave similar results to the R and Python models. The other option was to look for some code to convert the models from one language to another.

The R users was well versed in using PMML. Predictive Model Markup Language (PMML) has been around a long time and well known and used by certain groups of data scientists who have been around a while. It is also widely supported by many analytics vendors, and provides an inter-change format to allow predictive models to be described and exchanged. For newer people, they hadn’t heard of it. PMML is an XML based interchange specification.

But with PMML there are some limitation. Not with the specification but how it is implemented by the various vendors that support it. PMML supports the exchange of the model pipeline including the data transformations as well as the model specification. Most vendors only support some elements of this and maybe just a couple of models. And there-in lies the problem. How can a ML pipeline be migrated from, as Python, to some other language and/or tool. There are limitations.

If you do want to explore PMML with Python check out the sklearn2pmml package and is also available on PyPl. This package allows you to export the ML pipeline and the model specification. As with most other implementations of PMML there are some parts of the PMML specification not implement, but it is better than post of the other implementation out there.

An alternative is to look at code translations options. With these we want something that will take our ML pipeline and convert it to another programming language like C++, JAVA, Go, etc. There aren’t too many solutions available to do this. One such solution we’ve explored over the past couple of weeks is called m2cgen.

m2cgen (Model 2 Code Generator) is a lightweight library which provides an easy way to transpile trained statistical models into a native code (Python, C, Java, Go). You can supply M2cgen with a range of models (linear, SVM, tree, random forest, or boosting, etc) and the tool will output code in the chosen language that will represent the trained model. The code generated will generated into native code without dependencies. Other packages or libraries are not dependent or required in the translated language. For example here is an example Decision Tree translated into a number of different languages.

 

C

#include <string.h>
void score(double * input, double * output) {
    double var0[3];
    if ((input[2]) <= (2.6)) {
        memcpy(var0, (double[]){1.0, 0.0, 0.0}, 3 * sizeof(double));
    } else {
        if ((input[2]) <= (4.8500004)) {
            if ((input[3]) <= (1.6500001)) {
                memcpy(var0, (double[]){0.0, 1.0, 0.0}, 3 * sizeof(double));
            } else {
                memcpy(var0, (double[]){0.0, 0.3333333333333333, 0.6666666666666666}, 3 * sizeof(double));
            }
        } else {
            if ((input[3]) <= (1.75)) {
                memcpy(var0, (double[]){0.0, 0.42857142857142855, 0.5714285714285714}, 3 * sizeof(double));
            } else {
                memcpy(var0, (double[]){0.0, 0.0, 1.0}, 3 * sizeof(double));
            }
        }
    }
    memcpy(output, var0, 3 * sizeof(double));
}

Java

public class Model {

    public static double[] score(double[] input) {
        double[] var0;
        if ((input[2]) <= (2.6)) {
            var0 = new double[] {1.0, 0.0, 0.0};
        } else {
            if ((input[2]) <= (4.8500004)) {
                if ((input[3]) <= (1.6500001)) {
                    var0 = new double[] {0.0, 1.0, 0.0};
                } else {
                    var0 = new double[] {0.0, 0.3333333333333333, 0.6666666666666666};
                }
            } else {
                if ((input[3]) <= (1.75)) {
                    var0 = new double[] {0.0, 0.42857142857142855, 0.5714285714285714};
                } else {
                    var0 = new double[] {0.0, 0.0, 1.0};
                }
            }
        }
        return var0;
    }
}

Go Lang

func score(input []float64) []float64 {
    var var0 []float64
    if (input[2]) <= (2.6) {
        var0 = []float64{1.0, 0.0, 0.0}
    } else {
        if (input[2]) <= (4.8500004) {
            if (input[3]) <= (1.6500001) {
                var0 = []float64{0.0, 1.0, 0.0}
            } else {
                var0 = []float64{0.0, 0.3333333333333333, 0.6666666666666666}
            }
        } else {
            if (input[3]) <= (1.75) {
                var0 = []float64{0.0, 0.42857142857142855, 0.5714285714285714}
            } else {
                var0 = []float64{0.0, 0.0, 1.0}
            }
        }
    }
    return var0
}

 

Machine Learning with Go Lang

Posted on Updated on

Recently I’ve been having a number of conversations with people in several countries about using Go Lang for machine learning. Most of these people have been struggling with using Python for machine learning and are looking for an alternative that will give them better performance. We have been experimenting with C++ and Go Lang to see what the performance differences are. Most of these are with the execution of the ML code. This is great and everyone is very happy with execution timings, compared to Python.

But, there is a flip side to this. Although we have faster execution timings, there is a down side in that the coding effort is higher, with more lines of code and fewer libraries/packages to support the various ML tasks. But most of these can be easily coded ourselves .

We also looked at some frameworks for converting ML models developed in one language but deployed in production using a different language. More on that in another post.

Overall the extra development work was considered worthwhile for the performance improvement and deployment gains.

Go Lang doesn’t really come with it’s own set of libraries/packages for ML, but those have a number of these that can be used to code up the necessary functions we need for our everyday ML needs.

But are there any Go Lang libraries/packages developed for ML, just like we have for the R Language, etc?  The simple answer is YES we have. But the number of these is small in comparison to R and Python. Both of these languages are interpreted languages. But those available for Go are slowly growing.

Here is list of the Go Lang libraries/packages that we examined and evaluated for these projects. Some are available from the Go Lang website/wiki and others are available on Github.

  • Anna – Artificial Neural Network Aspiration, aims to be self-learning and self-improving software.
  • bayesian – A naive bayes classifier.
  • Dialex – Dialex is a smart pipe that unscrambles text and makes it machine-readable.
  • Cloudforest – Ensembles of decision trees
  • ctw – Context Tree Weighting and Rissanen-Langdon Arithmetic Coding
  • eaopt – An evolutionary optimization library.
  • evo – a framework for implementing evolutionary algorithms in Go.
  • gobrain – Neural Networks
  • Go Learn – Machine Learning for Go
  • go-algs/maxflow Maxflow (graph-cuts) energy minimization library.
  • go-graph – Graph library for Go/Golang language
  • go-galib – Genetic algorithms.
  • go-pr – Pattern recognition package in Go lang
  • golinear – Linear SVM and logistic regression.
  • go-mind – A neural network library built in Go
  • go_ml – Linear Regression, Logistic Regression, Neural Networks, Collaborative Filtering, Gaussian Multivariate Distribution.
  • go-ml-transpiler – An open source Go transpiler for machine learning models.
  • go-mxnet-predictor – Go binding for MXNet c_predict_api to do inference with pre-trained model.
  • gorgonia – Neural network primitives library (like Theano or Tensorflow but for Go)
  • go-porterstemmer – An efficient native Go clean room implementation of the Porter Stemming algorithm.
  • go-pr – Gaussian classifier.
  • ntmNeural Turing Machines implementation
  • paicehusk – Go implementation of the Paice/Husk Stemmer
  • RF – Random forests implementation in Go
  • tfgo – Tensorflow + Go, the gopher way.