Is Python the new R?
Maybe, maybe not, but that I’m finding in recent months is more companies are asking me to use Python instead of R for some of my work.
In this blog post I will walk through the steps of setting up the Oracle driver for Python, called cx_Oracle. The documentation for this drive is good and detailed with plenty of examples available on GitHub. Hopefully there isn’t anything new in this post, but it is my experiences and what I did.
1. Install Oracle Client
The Python driver requires Oracle Client software to be installed. Go here, download and install. It’s a straightforward install. Make sure the directories are added to the search path.
2. Download and install cx_Oracle
You can use pip3 to do this.
pip3 install cx_Oracle Collecting cx_Oracle Downloading cx_Oracle-6.1.tar.gz (232kB) 100% |████████████████████████████████| 235kB 679kB/s Building wheels for collected packages: cx-Oracle Running setup.py bdist_wheel for cx-Oracle ... done Stored in directory: /Users/brendan.tierney/Library/Caches/pip/wheels/0d/c4/b5/5a4d976432f3b045c3f019cbf6b5ba202b1cc4a36406c6c453 Successfully built cx-Oracle Installing collected packages: cx-Oracle Successfully installed cx-Oracle-6.1
3. Create a connection in Python
Now we can create a connection. When you see some text enclosed in angled brackets <>, you will need to enter your detailed for your schema and database server.
# import the Oracle Python library import cx_Oracle # define the login details p_username = "" p_password = "" p_host = "" p_service = "" p_port = "1521" # create the connection con = cx_Oracle.connect(user=p_username, password=p_password, dsn=p_host+"/"+p_service+":"+p_port) # an alternative way to create the connection # con = cx_Oracle.connect('/@/:1521') # print some details about the connection and the library print("Database version:", con.version) print("Oracle Python version:", cx_Oracle.version) Database version: 220.127.116.11.0 Oracle Python version: 6.1
4. Query some data and return results to Python
In this example the query returns the list of tables in the schema.
# define a cursor to use with the connection cur = con.cursor() # execute a query returning the results to the cursor cur.execute('select table_name from user_tables') # for each row returned to the cursor, print the record for row in cur: print("Table: ", row) Table: ('DECISION_TREE_MODEL_SETTINGS',) Table: ('INSUR_CUST_LTV_SAMPLE',) Table: ('ODMR_CARS_DATA',)
Now list the Views available in the schema.
# define a second cursor cur2 = con.cursor() # return the list of Views in the schema to the cursor cur2.execute('select view_name from user_views') # display the list of Views for result_name in cur2: print("View: ", result_name) View: ('MINING_DATA_APPLY_V',) View: ('MINING_DATA_BUILD_V',) View: ('MINING_DATA_TEST_V',) View: ('MINING_DATA_TEXT_APPLY_V',) View: ('MINING_DATA_TEXT_BUILD_V',) View: ('MINING_DATA_TEXT_TEST_V',)
5. Query some data and return to a Panda in Python
Pandas are commonly used for storing, structuring and processing data in Python, using a data frame format. The following returns the results from a query and stores the results in a panda.
# in this example the results of a query are loaded into a Panda # load the pandas library import pandas as pd # execute the query and return results into the panda called df df = pd.read_sql_query("SELECT * from INSUR_CUST_LTV_SAMPLE", con) # print the records returned by query and stored in panda print(df.head()) CUSTOMER_ID LAST FIRST STATE REGION SEX PROFESSION \ 0 CU13388 LEIF ARNOLD MI Midwest M PROF-2 1 CU13386 ALVA VERNON OK Midwest M PROF-18 2 CU6607 HECTOR SUMMERS MI Midwest M Veterinarian 3 CU7331 PATRICK GARRETT CA West M PROF-46 4 CU2624 CAITLYN LOVE NY NorthEast F Clerical BUY_INSURANCE AGE HAS_CHILDREN ... MONTHLY_CHECKS_WRITTEN \ 0 No 70 0 ... 0 1 No 24 0 ... 9 2 No 30 1 ... 2 3 No 43 0 ... 4 4 No 27 1 ... 4 MORTGAGE_AMOUNT N_TRANS_ATM N_MORTGAGES N_TRANS_TELLER \ 0 0 3 0 0 1 3000 4 1 1 2 980 4 1 3 3 0 2 0 1 4 5000 4 1 2 CREDIT_CARD_LIMITS N_TRANS_KIOSK N_TRANS_WEB_BANK LTV LTV_BIN 0 2500 1 0 17621.00 MEDIUM 1 2500 1 450 22183.00 HIGH 2 500 1 250 18805.25 MEDIUM 3 800 1 0 22574.75 HIGH 4 3000 2 1500 17217.25 MEDIUM [5 rows x 31 columns]
6. Wrapping it up and closing things
Finally we need to wrap thing up and close our cursors and our connection to the database.
# close the cursors cur2.close() cur.close() # close the connection to the database con.close()
Watch out for more blog posts on using Python with Oracle, Oracle Data Mining and Oracle R Enterprise.