The ability to create and use Neural Networks on business data has been available in Oracle Database since Oracle 18c (18c and 19c are just slightly extended versions of Oracle 12c). With each minor database release we get some small improvements and minor features added. I’ve written other blog posts about other 20c new machine learning features (see here, here and here).
With Oracle 20c they have added a new neural network solver. This is called Adam Solver and the original research was conducted by Diederik Kingma from OpenAI and Jimmy Ba from the University of Toronto and they presented they work at ICLR 2015. The name Adam is derived from ‘adaptive moment estimation‘. This algorithm, research and paper has gathered some attention in the research community over the past few years. Most of this has been focused on the benefits of using it.
But care is needed. As with most machine learning (and deep learning) algorithms, they work up to a point. They may be good on certain problems and input data sets, and then for others they may not be as good or as efficient at producing an optimal outcome. Although using this solver may be beneficial to your problem, using the concept of ‘No Free Lunch’, you will need to prove the solver is beneficial for your problem.
With Oracle Machine Learning there are two Optimization Solver available for the Neural Network algorithm. The default solver is call L-BFGS (Limited memory Broyden-Fletch-Goldfarb-Shanno). This is one of the most popular solvers in use in most algorithms. The is a limited version of BFGS, using less memory (hence the L in the name) This solver finds the descent direction and line search is used to find the appropriate step size. The solver searches for the optimal solution of the loss function to find the extreme value (maximum or minimum) of the loss (cost) function
The Adam Solver uses an extension to stochastic gradient descent. It uses the squared gradients to scale the learning rate and it takes advantage of momentum by using moving average of the gradient instead of gradient. This allows the solver to work quickly by seeing less data and can work well with larger data sets.
With Oracle Data Mining the Adam Solver has the following parameters.
- ADAM_ALPHA : Learning rate for solver. Default value is 0.001.
- ADAM_BATCH_ROWS : Number of rows per batch. Default value is 10,000
- ADAM_BETA1 : Exponential decay rate for 1st moment estimates. Default value is 0.9.
- ADAM_BETA2 : Exponential decay rate for the 2nd moment estimates. Default value is 0.99.
- ADAM_GRADIENT_TOLERANCE : Gradient infinity norm tolerance. Default value is 1E-9.
The parameters ADAM_ALPHA and ADAM_BATCH_ROWS can have an effect on the timing for the neural network algorithm to produce the model. Some exploration is needed to determine the optimal values for this parameters based on the size of the data set. For example having a larger value for ADAM_ALPHA results in a faster initial learning before the rates is updated. Small values than the default slows learning down during training.
To tell Oracle Machine Learning to use the Adam Solver the DMSSET_NN_SOLVER parameter needs to be set. The default setting for a neural network is DMSSET_NN_SOLVER_LGFGS. But to use the Adam solver set it to DMSSET_NN_SOLVER_ADAM.
The following is an example of setting the parameters for the Adam solver and creating a neural network.
BEGIN DELETE FROM BANKING_NNET_SETTINGS; INSERT INTO BANKING_NNET_SETTINGS (setting_name, setting_value) VALUES (dbms_data_mining.algo_name, dbms_data_mining.algo_neural_network); INSERT INTO BANKING_NNET_SETTINGS (setting_name, setting_value) VALUES (dbms_data_mining.prep_auto, dbms_data_mining.prep_auto_on); INSERT INTO BANKING_NNET_SETTINGS (setting_name, setting_value) VALUES (dbms_data_mining.nnet_nodes_per_layer, '20,10,6'); INSERT INTO BANKING_NNET_SETTINGS (setting_name, setting_value) VALUES (dbms_data_mining.nnet_iterations, 10); INSERT INTO BANKING_NNET_SETTINGS (setting_name, setting_value) VALUES (dbms_data_mining.NNET_SOLVER, 'NNET_SOLVER_ADAM'); END; The addition of the last parameter overrides the default solver for building a neural network model.
To build the model we can use the following.
DECLARE v_start_time TIMESTAMP; BEGIN begin DBMS_DATA_MINING.DROP_MODEL('BANKING_NNET_72K_1'); exception when others then null; end; v_start_time := current_timestamp; DBMS_DATA_MINING.CREATE_MODEL( model_name. => 'BANKING_NNET_72K_1', mining_function => dbms_data_mining.classification, data_table_name => 'BANKING_72K', case_id_column_name => 'ID', target_column_name => 'TARGET', settings_table_name => 'BANKING_NNET_SETTINGS'); dbms_output.put_line('Time take to create model = ' || to_char(extract(second from (current_timestamp-v_start_time))) || ' seconds.'); END;
For me on my Oracle 20c Preview Database, this takes 1.8 seconds to run and create the neural network model ob a data set of 72,000 records.
Using the default solver, the model is created in 5.2 seconds. With using a small data set of 72,000 records, we can see the impact of using an Adam Solver for creating a neural network model.
These timings and the timings shown below (in seconds) are based on the Oracle 20c Preview Database, using a minimum VM sizing and specification available.
I recently had an article published on Oracle Developer Community website about Understanding, Building and Using Neural Network Machine Learning Models with Oracle 18c. I’ve also had a 2 Minute Tech Tip (2MTT) video about this topic and article. Oracle 18c Database brings prominent new machine learning algorithms, including Neural Networks and Random Forests. While many articles are available on machine learning, most of them concentrate on how to build a model. Very few talk about how to use these new algorithms in your applications to score or label new data. This article will explain how Neural Networks work, how to build a Neural Network in Oracle Database, and how to use the model to score or label new data. What are Neural Networks? Over the past couple of years, Neural Networks have attracted a lot of attention thanks to their ability to efficiently find patterns in data—traditional transactional data, as well as images, sound, streaming data, etc. But for some implementations, Neural Networks can require a lot of additional computing resources due to the complexity of the many hidden layers within the network. Figure 1 gives a very simple representation of a Neural Network with one hidden layer. All the inputs are connected to a neuron in the hidden layer (red circles). A neuron takes a set of numeric values as input and maps them to a single output value. (A neuron is a simple multi-input linear regression function, where the output is passed through an activation function.) Two common activation functions are logistic and tanh functions. There are many others, including logistic sigmoid function, arctan function, bipolar sigmoid function, etc. Continue reading the rest of the article here.