Importance of setting Fetched Rows size for Database Query using Golang

Posted on Updated on

When issuing queries to the database one of the challenges every developer faces is how to get the results quickly. If your queries are only returning a small number of records, eg. < 5, then you don’t really have to worry about execution time. That is unless your query is performing some complex processing, joining lots of tables, etc.

Most of the time developers are working with one or a small number of records, using a simple query. Everything runs quickly.

But what if your query is returning several tens or thousands of records. Assuming we have a simple query and no query optimization is needed, the challenge facing the developer is how can you get all of those records quickly into your environment and process them. Typically the database gets blamed for the query result set being returned slowly. But what if this wasn’t the case? In most cases developers take the default parameter settings of the functions and libraries. For database connection libraries and their functions, you can change some of the parameters and affect how your code, your query, gets executed on the Database server and can affect how quickly the data is shipped from the database to your code.

One very important parameter to consider is the query array size. This is the number of records the database will send to your code in each batch. The database will keep sending batches until you tell it to stop. It makes sense to have the size of this batch set to a small value, as most queries return one or a small number of records. But when we get onto returning a larger number of records it can affect the response time significantly.

I tested the effect of changing the size of the returning buffer/array using Golang and querying data in an Oracle Database, hosted on Oracle Cloud, and using goracle library to connect to the database.

[ I did a similar test using Python. The results can be found here. You will notices that Golang is significantly quicker than Python, as you would expect. ]

The database table being queried contains 55,000 records and I just executed a SELECT * FROM … on this table. The results shown below contain the timing the query took to process this data for different buffer/array sizes by setting the FetchRowCount value.

rows, err := db.Query(dbQuery, goracle.FetchRowCount(arraySize))

Screenshot 2019-05-22 14.52.48

As you can see, as the size of the buffer/array size increases the timing it takes to process the data drops. This is because the buffer/array is returning a larger number of records, and this results in a reduced number of round trips to/from the database i.e. fewer packets of records are sent across the network.

The challenge for the developer is to work out the optimal number to set for the buffer/array size. The default for the goracle libary, using Oracle client is 256 row/records.

When that above query is run, without the FetchRowCount setting, it will use this default 256 value. When this is used we get the following timings.

Screenshot 2019-05-22 15.00.00

We can see, for the data set being used in this test case the optimal setting needs to be around 1,500.

What if we set the parameter to be very large?  That would no necessarily make it quicker. You can see from the first table the timing starts to increase for the last two settings. There is an overhead in gathering and sending the data.

Here is a subset of the Golang code I used to perform the tests.

var currentTime = time.Now()

var i int

var custId int

arrayOne := [11] int{5, 10, 30, 50, 100, 200, 500, 1000, 1500, 2000, 2500}


currentTime = time.Now()


fmt.Println("Array Size = ", arraySize, " : ", currentTime.Format("03:04:05:06 PM"))


for index, arraySize := range arrayOne {

    currentTime = time.Now()

    fmt.Println(index, " Array Size = ", arraySize, " : ", currentTime.Format("03:04:05:06 PM"))


    db, err := sql.Open("goracle", username+"/"+password+"@"+host+"/"+database)

    if err != nil {

        fmt.Println("... DB Setup Failed")

        fmt.Println(err)

        return

    }

    defer db.Close()



    if err = db.Ping(); err != nil {

        fmt.Printf("Error connecting to the database: %s\n", err)

        return

    }



    currentTime = time.Now()

    fmt.Println("...Executing Query", currentTime.Format("03:04:05:06 PM"))

    dbQuery := "select cust_id from sh.customers"

    rows, err := db.Query(dbQuery, goracle.FetchRowCount(arraySize))

    if err != nil {

        fmt.Println(".....Error processing query")

        fmt.Println(err)

        return

    }

    defer rows.Close()



    i = 0

    currentTime = time.Now()

    fmt.Println("... Parsing query results", currentTime.Format("03:04:05:06 PM"))
 
   for rows.Next() {

        rows.Scan(&custId)

        i++

        if i% 10000 == 0 {

            currentTime = time.Now()

            fmt.Println("...... ",i, " customers processed", currentTime.Format("03:04:05:06 PM"))

        }

    }


    currentTime = time.Now()

    fmt.Println(i, " customers processed", currentTime.Format("03:04:05:06 PM"))


    fmt.Println("... Closing connection")

    finishTime := time.Now()

    fmt.Println("Finished at ", finishTime.Format("03:04:05:06 PM"))


}

 

 

 

 

Advertisement

2 thoughts on “Importance of setting Fetched Rows size for Database Query using Golang

    Batch read from DBs - TechTalk7 said:
    March 27, 2022 at 9:21 am

    […] idea is to have the driver packages handle this. As mentioned in the issue and also in this blog https://oralytics.com/2019/06/17/importance-of-setting-fetched-rows-size-for-database-query-using-go&#8230;, I found out that golangs oracle driver package has this option that I can pass for batching. But […]

    Like

    #Kscope19 week – OBIEE News said:
    June 23, 2019 at 8:27 am

    […] 6. Importance of setting Fetched Rows size for Database Query using Golang […]

    Like

Comments are closed.