Annual Look at Database Trends (Jan 2023)
Monitoring trends in the popularity and usage of different Database vendors can be a interesting exercise. The marketing teams from each vendor do an excellent job of promoting their Database, along with the sales teams, developer advocates, and the user communities. Some of these are more active than others and it varies across the Database market on what their choice is for promoting their products. One of the problems with these various types of marketing, is how can be believe what they are saying about how “awesome” their Database is, and then there are some who actively talk about how “rubbish” (or saying something similar) other Databases area. I do wonder how this really works for these people and vendors when to go negative about their competitors. A few months ago I wrote about “What does Legacy Really Mean?“. That post was prompted by someone from one Database Vendor calling their main competitor Database a legacy product. They are just name calling withing providing any proof or evidence to support what they are saying.
Getting back to the topic of this post, I’ve gathered some data and obtained some league tables from some sites. These will help to have a closer look at what is really happening in the Database market throughout 2022. Two popular sites who constantly monitor the wider internet and judge how popular Databases area globally. These sites are DB-Engines and TOPDB Top Database index. These are well know and are frequently cited. Both of these sites give some details of how they calculate their scores, with one focused mainly on how common the Database appears in searches across different search engines, while the other one, in addition to search engine results/searches, also looks across different websites, discussion forms, social media, job vacancies, etc.
The first image below is a comparison of the league tables from DB-Engines taken in January 2022 and January 2023. I’ve divided this image into three sections/boxes. Overall for the first 10 places, not a lot has changed. The ranking scores have moved slightly in most cases but not enough to change their position in the rank. Even with a change of score by 30+ points is a very small change and doesn’t really indicate any great change in the score as these scores are ranked in a manner where, “when system A has twice as large a value in the DB-Engines Ranking as system B, then it is twice as popular when averaged over the individual evaluation criteria“. Using this explanation, Oracle would be twice as popular when compared to PostgreSQL. This is similar across 2022 and 2023.
Next we’ll look a ranking from TOPDB Top Database index. The image below compares January 2022 and January 2023. TOPDB uses a different search space and calculation for its calculation. The rankings from TOPDB do show some changes in the ranks and these are different to those from DB-Engines. Here we see the top three ranks remain the same with some small percentage changes, and nothing to get excited about. In the second box covering ranks 4-7 we do some changes with PostgreSQL improving by two position and MongoDB. These changes do seem to reflect what I’ve been seeing in the marketplace with MongoDB being replaced by PostgreSQL and MySQL, with this multi-model architecture where you can have relational, document, and other data models in the one Database. It’s important to note Oracle and SQL Server also support this. Over the past couple of years there has been a growing awareness of and benefits of having relation and document (and others) data models in the one database. This approach makes sense both for developer productivity, and for data storage and management.
The next gallery of images is based on some Python code I’ve written to look a little bit closer at the top five Databases. In this case these are Oracle, MySQL, SQL Server, PostgreSQL and MongoDB. This gallery plots a bar chart for each Database for their top 15 Counties, and compares them with the other four Databases. The results are interesting and we can see some geographic aspects to the popularity of the Databases.
What does Legacy really mean?
In the IT industry we hear the term “legacy” being using, but that does it mean? It can mean a lot of different things and it really depends on the person who is saying it, their context, what they want to portray and their intended meaning. In a lot of cases people seem to use it without knowing the meaning or the impact it can have. This can result in negative impact and not in the way the person intended.
Before looking at some (and there can be lots) possible meanings, lets have a look at what one person said recently.
“Migrating away from legacy databases like Oracle can seem like a daunting undertaking for businesses. But it doesn’t have to be.”
To give context to this quote, the person works for a company selling products, services, support, etc for PostgreSQL and wants everyone to move to ProtgreSQL (Postgres), which is understandable given their role. There’s nothing wrong with trying to convince people/companies to use software that you sell lots of services and additional software to support it. What is interesting is they used the work “legacy”.
Legacy can mean lots of different things to different people. Here are some examples of how legacy is used within the IT industry.
- The product is old and out of date
- The product has no relevancy in software industry today
- Software or hardware that has been superseded
- Any software that has just been released (yes I’ve come across this use)
- Outdated computing software and/or hardware that is still in use. The system still meets the needs it was originally designed for, but doesn’t allow for growth
- Anything in production
- Software that has come to an end of life with no updates, patching and/or no product roadmap
Going back to the quote given above, let’s look a little closer at their intended use. As we can see from the list above the use of the word “legacy” can be used in derogatory way and can try to make one software appear better then it’s old, out of date, not current, hard to use, etc competitor.
If you were to do a side-by-side comparison of PostgreSQL and Oracle, there would be a lot of the same or very similar features. But there are differences too and this, in PostgreSQL case, we see various vendors offering add-on software you can pay for. This is kind of similar with Oracle where you need to license various add-ons, or if you are using a Cloud offering it may come as part of the package. On a features comparison level when these are similar, saying one is “legacy” doesn’t seem right. Maybe its about how old the software is, as in legacy being old software. The first release of Oracle was 1979 and we now get yearly update releases (previously it could be every 2-4 years). PostgresSQL, or its previous names date back to 1974 with the first release of Ingres, which later evolved to Postgres in early 1980s, and took on the new name of PostgreSQL in 1996. Are both products today still the same as what they had in the 1970s, 1980s, 1990s, etc. The simple answer is No, they have both evolved and matured since then. Based on this can we say PostgreSQL is legacy or is more of a Legacy product than Oracle Database which was released in 1979 (5 years after Ingres)? Yes we can.
I’m still very confused by the quote (given above) as to what “legacy” might mean, in their scenario. Apart from and (trying) to ignore the derogatory aspect of “they” are old and out of date, and look at us we are new and better, it is difficult to see what they are trying to achieve.
Getting back to saying MongoDB is legacy, again comes back to the person saying it. They work at a company who is selling cloud based data engineering and analytic services. Is using cloud services the only thing people should be using? For me it is No but a hybrid cloud and on-premises approach will work based for most. Some of the industry analysts are now promoting this, saying vendors offering both will succeed into the future, where does only offering cloud based services will have limited growth, unless the adapt now.
What about other types legacy software applications. Here is an example Stew Ashton posted on Twitter. “I once had a colleague who argued, in writing, that changing the dev stack had the advantage of forcing a rewrite of “legacy applications” – which he had coded the previous year! Either he thought he had greatly improved, or he wanted guaranteed job security”
There are lots and lots of more examples out there and perhaps you will encounter some when you are attending presentations or sales pitches from various vendors. If you hear, then saying one product is “legacy” get them to define their meaning of it and to give specific examples to illustrate it. Does their meaning match with one from the list given above, or something else. Are they just using the word to make another product appear inferior without knowing the meaning or the differences in the product? Their intended meaning within their context is what defines their meaning, which may be different to yours.
Postgres on Docker
Prostgres is one of the most popular databases out there, being used in Universities, open source projects and also widely used in the corporate marketplace. I’ve written a previous post on running Oracle Database on Docker. This post is similar, as it will show you the few simple steps to have a persistent Postgres Database running on Docker.
The first step is go to Docker Hub and locate the page for Postgres. You should see something like the following. Click through to the Postgres page.
There are lots and lots of possible Postgres images to download and use. The simplest option is to download the latest image using the following command in a command/terminal window. Make sure Docker is running on your machine before running this command.
docker pull postgres
Although, if you needed to install a previous release, you can do that.
After the docker image has been downloaded, you can now import into Docker and create a container.
docker run --name postgres -p 5432:5432 -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=pgPassword -e POSTGRES_DB=postgres -d postgres
Important: I’m using Docker on a Mac. If you are using Windows, the format of the parameter list is slightly different. For example, remove the = symbol after POSTGRES_DB
If you now check with Docker you’ll see Postgres is now running on post 5432.
Next you will need pgAdmin to connect to the Postgres Database and start working with it. You can download and install it, or run another Docker container with pgAdmin running in it.
First, let’s have a look at installing pgAdmin. Download the image and run, accepting the initial requirements. Just let it run and finish installing.
When pgAdmin starts it looks for you to enter a password. This can be anything really, but one that you want to remember. For example, I set mine to pgPassword.
Then create (or Register) a connection to your Postgres Database. Enter the details you used when creating the docker image including username=postgres, password=pgPassword and IP address=0.0.0.0.
The IP address on your machine might be a little different, and to check what it is, run the following
docker ps -a
When your (above) connection works, the next step is to create another schema/user in the database. The reason we need to do this is because the user we connected to above (postgres) is an admin user. This user/schema should never be used for database development work.
Let’s setup a user we can use for our development work called ‘student’. To do this, right click on the ‘postgres’ user connection and open the query tool.
Then run the following.
After these two commands have been run successfully we can now create a connection to the postgres database, open the query tool and you’re now all set to write some SQL.
oracledb Python Library – Connect to DB & a few other changes
Oracle have released a new Python library for connecting to Oracle Databases on-premises and on the Cloud. It’s called (very imaginatively, yet very clearly)
oracledb. This new Python library replaces the previous library called
cx_Oracle. Just consider
cx_oracle as obsolete, and use
oracledb going forward, as all development work on new features and enhancements will be done to
cx_oracle has been around a long time, and it’s about time we have a new and enhanced library that is more flexible and will suit many different deployment scenarios. The previous library (
cx_Oracle) was great, but it did require additional software installation with Oracle Client, and some OS environment settings, which at times took a bit of debugging. This makes it difficult/challenging to deploy in different environments, for example IOTs, CI/CD, containers, etc. Deployment environments have changed and the new
oracledb library makes it simpler.
To check out the following links for a full list of new features and other details.
Home page: oracle.github.io/python-oracledb
Installation instructions: python-oracledb.readthedocs.io/en/latest/installation.html
One of the main differences between the two libraries is how you connect to the Database. With
oracledb you need to use named the parameters, and the new library uses a thin connection. If you need the thick connection you can switch to that easily enough.
The following examples will illustrate how to connect to Oracle Database (local and cloud ADW/ATP) and how these are different to using the cx_Oracle library (which needed Oracle Client software installed). Remember the new
oracledb library does not need Oracle Client.
To get started, install
pip3 install oracledb
Local Database (running in Docker)
To test connection to a local Database I’m using a Docker image of 21c (hence localhost in this example, replace with IP address for your database). Using the previous library (
cx_Oracle) you could concatenate the connection details to form a string and pass that to the connection. With
oracledb, you need to use named parameters and specify each part of the connection separately.
This example illustrates this simple connection and prints out some useful information about the connection, do we have a healthy connection, are we using thing database connection and what version is the connection library.
p_username = "..." p_password = "..." p_dns = "localhost/XEPDB1" p_port = "1521" con = oracledb.connect(user=p_username, password=p_password, dsn=p_dns, port=p_port) print(con.is_healthy()) print(con.thin) print(con.version) --- True True 220.127.116.11.0
Having created the connection we can now query the Database and close the connection.
cur = con.cursor() cur.execute('select table_name from user_tables') for row in cur: print(row) --- ('WHISKIES_DATASET',) ('HOLIDAY',) ('STAGE',) ('DIRECTIONS',) --- cur.close() con.close()
The code I’ve given above is simple and straight forward. And if you are converting from
cx_Oracle, you will probably have minimal changes as you probably had your parameter keywords defined in your code. If not, some simple editing is needed.
To simplify the above code even more, the following does all the same steps without the explicit open and close statements, as these are implicit in this example.
import oracledb con = oracledb.connect(user=p_username, password=p_password, dsn=p_dns, port=p_port) with con.cursor() as cursor: for row in cursor.execute('select table_name from user_tables'): print(row)
(Basic) Oracle Cloud – Autonomous Database, ATP/ADW
Everyone is using the Cloud, Right? If you believe the marketing they are, but in reality most will be working in some hybrid world using a mixture of on-premises and cloud storage. The example given in the previous section illustrated connecting to a local/on-premises database. Let’s now look at connecting to a database on Oracle Cloud (Autonomous Database, ATP/ADW).
oracledb library things have been simplified a little. In this section I’ll illustrate a simple connection to a ATP/ADW using a thin connection.
What you need is the location of the directory containing the unzipped wallet file. No Oracle client is needed. If you haven’t downloaded a Wallet file in a while, you should go download a new version of it. The Wallet will contain a pem file which is needed to securely connect to the DB. You’ll also need the password for the Wallet, so talk nicely with your DBA. When setting up the connection you need to provide the directory for the tnsnames.ora file and the ewallet.pem file. If you have downloaded and unzipped the Wallet, these will be in the same directory
import oracledb p_username = "..." p_password = "..." p_walletpass = '...' #This time we specify the location of the wallet con = oracledb.connect(user=p_username, password=p_password, dsn="student_high", config_dir="/Users/brendan.tierney/Dropbox/5-Database-Wallets/Wallet_student-Full", wallet_location="/Users/brendan.tierney/Dropbox/5-Database-Wallets/Wallet_student-Full", wallet_password=p_walletpass) print(con) con.close()
This method allows you to easily connect to any Oracle Cloud Database.
(Thick Connection) Oracle Cloud – Autonomous Database, ATP/ADW
If you have Oracle Client already installed and set up, and you want to use a thick connection, you will need to initialize the function
import oracledb p_username = "..." p_password = "..." #point to directory containing tnsnames.ora oracledb.init_oracle_client(config_dir="/Applications/instantclient_19_8/network/admin") con = oracledb.connect(user=p_username, password=p_password, dsn="student_high") print(con) con.close()
Warning: Some care is needed with using init_oracle_client. If you use it once in your Python code or App then all connections will use it. You might need to do a code review to look at when this is needed and if not remove all occurrences of it from your Python code.
(Additional Security) Oracle Cloud – Autonomous Database, ATP/ADW
There are a few other additional ways of connecting to a database, but one of my favorite ways to connect involves some additional security, particularly when working with IOT devices, or in scenarios that additional security is needed. Two of these involve using One-way TLS and Mututal TLS connections. The following gives an example of setting up One-Way TLS. This involves setting up the Database to only received data and connections from one particular device via an IP address. This requires you to know the IP address of the device you are using and running the code to connect to the ATP/ADW Database.
To set this up, go to the ATP/ADW details in Oracle Cloud, edit the Access Control List, add the IP address of the client device, disable mutual TLS and download the DB Connection. The following code gives and example of setting up a connection
import oracledb p_username = "..." p_password = "..." adw_dsn = '''(description= (retry_count=20)(retry_delay=3)(address=(protocol=tcps)(port=1522) (host=adb.us-ashburn-1.oraclecloud.com))(connect_data=(service_name=a8rk428ojzuffy_student_high.adb.oraclecloud.com)) (security=(ssl_server_cert_dn="CN=adwc.uscom-east-1.oraclecloud.com,OU=Oracle BMCS US,O=Oracle Corporation,L=Redwood City,ST=California,C=US")))''' con4 = oracledb.connect(user=p_username, password=p_password, dsn=adw_dsn)
This sets up a secure connection between the client device and the Database.
From my initial testing of existing code/applications (although no formal test cases) it does appear the new
oracledb library is processing the queries and data quicker than
cx_Oracle. This is good and hopefully we will see more improvements with speed in later releases.
Also don’t forget the impact of changing the buffer size for your database connection. This can have a dramatic effect on speeding up your database interactions. Check out this post which illustrates this.
Database Vendors on Twitter, Slack, downloads, etc.
Each year we see some changes in the positioning of the most popular databases on the market. “The most popular” part of that sentence can be the most difficult to judge. There are lots and lots of different opinions on this and ways of judging them. There are various sites giving league tables, and even with those some people don’t agree with how they perform their rankings.
The following table contains links for some of the main Database engines including download pages, social media links, community support sites and to the documentation.
One of the most common sites is DB-Engines, and another is TOPDB Top Database index. The images below show the current rankings/positions of the database vendors (in January 2022).
I’ve previously written about using the Python pytrends package to explore the relative importance of the different Database engines. The results from pytrends gives results based on number of searches etc in Google. Check out that Blog Post. I’ve rerun the same code for 2021, and the following gallery displays charts for each Database based on their popularity. This will allow you to see what countries are most popular for each Database and how that relates to the other databases. For these charts I’ve included Oracle, MySQL, SQL Server, PostgreSQL and MongoDB, as these are the top 5 Databases from DB-Engines.
Oracle 21c XE Database and Docker setup
You know when you are waiting for the 39 bus for ages, and then two of them turn up at the same time. It’s a bit like this with Oracle 21c XE Database Docker image being released a few days after the 18XE Docker image!
Again we have Gerald Venzi to thank for putting these together and making them available.
Are you running an Apple M1 chip Laptop? If so, follow these instructions (and ignore the rest of this post)
If you want to install Oracle 21c XE yourself then go to the download page and within a few minutes you are ready to go. Remember 21c XE is a fully featured version of their main Enterprise Database, with a few limitations, basically on size of deployment. You’d be surprised how many organisations who’s data would easily fit within these limitations/restrictions. The resource limits of Oracle Database 21 XE include:
- 2 CPU threads
- 2 GB of RAM
- 12GB of user data (Compression is included so you can store way way more than 12G)
- 3 pluggable Databases
It is important to note, there are some additional restrictions on feature availability, for example Parallel Query is not possible, etc.
Remember the 39 bus scenario I mentioned above. A couple of weeks ago the Oracle 18c XE Docker image was released. This is a full installation of the database and all you need to do is to download it and run it. Nothing else is required. Check out my previous post on this.
To download, install and run Oracle 21c XE Docker image, just run the following commands.
docker pull gvenzl/oracle-xe:21-full docker run -d -p 1521:1521 -e ORACLE_PASSWORD=SysPassword1 -v oracle-volume:/opt/oracle/XE21CFULL/oradata gvenzl/oracle-xe:21-full docker rename da37a77bb436 21cFull sqlplus system/SysPassword1@//localhost/XEPDB1
It’s a good idea to create a new schema for your work. Here is an example to create a schema called ‘demo’. First log into system using sqlplus, as shown above, and then run these commands.
create user demo identified by demo quota unlimited on users; grant connect, resource to demo;
To check that schema was created you can connect to it using sqlplus.
Then to stop the image from running and to restart it, just run the following
docker stop 21cFull docker start 21cFull
Check out my previous post on Oracle 18c XE setup for a few more commands.
SQL Developer Connection Setup
An alternative way to connect to the Database is to use SQL Developer. The following image shows and example of connecting to a schema called DEMO, which I created above. See the connection details in this image. They are the same as what is shown above when connecting using sqlplus.
Oracle 18c XE Docker setup
During August (2021) Gerald Venzi of Oracle released a new set of Docker images and these included Oracle 18c XE Database. Check out Gerald’s blog post about this for a lot more details on these images. Great work Gerald, and it’s way simpler to set this up compared to previous.
The following is really just a reminder to myself of the commands needed to install and run one of the 18c XE docker images.
Gerald has provided 3 different versions of 18c XE Database. Check out his blog post for more details of what is included/excluded in each image.
I decided to go with the FULL docker image (oracle-xe-full), just because I use most of the DB features and like to play around with the rest. If you just want a Database then go with the medium or small sized docker images
|Docker Image Name||Description|
|oracle-xe-full||Contains full Oracle 18c XE Database installation. Containing all the bells and whistles. This is the largest docker image.|
|oralce-xe||This medium sized image has some things stripped out from the installation. Contains most of the functionality from the full image, but some of the edge case functionality has been removed.|
|oracle-xe-slim||This is the smallest image and has a lot of extra features remove. Probably only suitable if you want a basic Database.|
Before you run the following commands you will need to install Docker.
Step 1: Download the 18c XE image
docker pull gvenzl/oracle-xe
Step 2: Check the image exist in your Docker env
Step 3: Run the image
docker run -d -p 1521:1521 -e ORACLE_PASSWORD=SysPassword1 -v oracle-volume:/opt/oracle/oradata gvenzl/oracle-xe
This command remaps the 1521 port to local 1521, changed/set the password and gives volume details to all any changes to the database and image to be persisted i.e. when you restart the image your previous work will be there
Step 4: Rename image [you can skip this step if you want. I just wanted a different name]
docker rename d95a3db95747 18XE
NB: Use the code/reference for your docker image. It will be different to mine (d95a3db95747)
Step 5: Connect to the Database as DBA/Admin schema
You can use SQL*Plus or some other client side tool to connect to the database
A simple query to check we are connected to the database.
select username from dba_users;
Step 6: Create your own (developer) Schema
create user demo identified by demo quota unlimited on users;
grant connect, resource to demo;
Exit SQL*Plus and log back into the Database using the DEMO schema you just created.
Step 7: Create a Table and enter some Records
create table test (col1 NUMBER, col2 VARCHAR2(10));
insert into test values (1, 'Brendan');
Step 8: Test the Docker image persists the data
Stop the docker image
docker stop 18XE
Check it is no-longer running
Nothing will be displayed
Step 9: Start the 18XE Docker image and Check data was persisted
docker start 18XE
You should see the docker image is running
sqlplus demo/demo@//localhost/XEPDB1 select table_name from user_tables; select * from test;
These last two commands should show the table and the record in the table. This means the data was persisted.
All done you now have a working Docker image of Oracle 18XE running.
Just remember to stop the image when you don’t need it on your computer. These will save you some resource usage.
Collection of Oracle 21c posts on new Machine Learning and Statistical functions
Oracle 21c was officially released a few days about and this post contains links to some blog posts I’ve written on new machine learning and statistical functions in the new Oracle 21c.
- Adam Optimization Solver for Neural Network Algorithm
- MSET-SPRT Algorithm
- XGBoost Algorithm
- Measuring SKEWNESS Function
- Measuring tailedness of data with KURTOSIS Function
I also have posts on new OML4Py and AutoML too, and I’ll have a different set of posts for those, so look out them.
Also check out my previous blog post that covers new machine learning feature introduced in Oracle 19c.
18c is now available (but only on the Cloud)
On Friday afternoon (16th February) we started to see tweets and blog posts from people in Oracle saying that Oracle 18c was now available. But is only available on Oracle Cloud and Engineered Systems.
It looks like we will have to wait until the Autumn before we can install it ourselves on our own servers 😦
Here is the link to the official announcement for Oracle 18c.
Oracle 18c is really Oracle 18.104.22.168. The next full new release of the Oracle database is expected to be Oracle 19.
The new features and incremental enhancements in Oracle 18c are:
- Memory Optimized Fetches
- Exadata RAC Optimizations
- High Availability
- Online Partition Merge
- Improved Machine Learning (OAA)
- Polymorphic Table Functions
- Spatial and Graph
- More JSON improvements
- Private Temporary Tablespaces
- New mode for Connection Manager
And now the all important links to the documentation.
To give Oracle 18c a try you will need to go to cloud.oracle.com and select Database from the drop down list from the Platform menu. Yes you are going to need an Oracle Cloud account and some money or some free credit. Go and get some free cloud credits at the upcoming Oracle Code events.
If you want a ‘free’ way of trying out Oracle 18c, you can use Oracle Live SQL. They have setup some examples of the new features for you to try.
NOTE: Oracle 18c is not Autonomous. Check out Tim Hall’s blog posts about this. The Autonomous Oracle Database is something different, and we will be hearing more about this going forward.
12.2 DBaaS (Extreme Edition) possible bug/issue with the DB install/setup
A few weeks ago the 12.2 Oracle Database was released on the cloud. I immediately set an account and got my 12.2 DBaaS setup. This was a relatively painless process and quick.
For me I wanted to test out all the new Oracle Advanced Analytics new features and the new features in SQL Developer 4.2 that only become visible when you are using the 12.2 Database.
When you are go to use the Oracle Data Miner (GUI tool) in SQL Developer 4.2, it will check to see if the ODMr repository is installed in the database. If it isn’t then you will be promoted for the SYS password.
This is the problem. In previous version of the DBaaS (12.1, etc) this was not an issue.
When you go to create your DBaaS you are asked for a password that will be used for the admin accounts of the database.
But when I entered the password for SYS, I got an error saying invalid password.
After using ssh to create a terminal connection to the DBaaS I was able to to connect to the container using
sqlplus / as sysdba
and also using
sqlplus sys/ as sysdba
Those worked fine. But when I tried to connect to the PDB1 I got the invalid username and/or password error.
sqlplus sys/@pdb1 as sysdba
I reconnected as follows
sqlplus / as sysdba
and then changed the password for SYS with containers=all
This command completed without errors but when I tried using the new password to connect the the PDB1 I got the same error.
After 2 weeks working with Oracle Support they eventually pointed me to the issue of the password file for the PDB1 was missing. They claim this is due to an error when I was creating/installing the database.
But this was a DBaaS and I didn’t install the database. This is a problem with how Oracle have configured the installation.
The answer was to create a password file for the PDB1 using the following
You must be logged in to post a comment.