In a conversation that I was involved with yesterday, someone mentioned how useful cfquery maxRows is and how they like using it. I was pretty sure at the time that this cfquery cross-database attribute comes at a cost, but I thought I had better double check it. As it turns out, I was right and if you are using this attribute then you should definitely read this post to understand what is happening under the hood.
History and usage of maxRows
This is a typical use of the maxRows attribute:
The result of this query is that even though you are selecting all the records in the table article, ColdFusion returns only the first 5 in the recordset. The power of this is that it works across all databases, no matter if you are using SQL Server, mySQL, Oracle and so on.
Before ColdFusion MX, the maxRows attribute would actually return all the records from the database but would give you only those limited by the maxRows. This was fixed in subsequent versions, and ColdFusion handles this now appropriately, returning but also fetching the correct records.
However, is this attribute really so magic as it seems? Is it the blue pill that allows you to go on as if everything is fine, or is it the red pill that leads to maychem? (just a little diversion here from my favorite film).
Problem number 1. Database Vs ColdFusion.
When you run a query using the maxRows attribute, the ColdFusion JDBC driver hands off the query to the database without any filtering whatsoever for the top rows. This is very important. It’s as if you are running the query inside the cfquery tag without the maxRows. As the database is processing the results, the JDBC driver is smart enough to know that it has reached the limit of the maxRows and stops the connection from the database and returns the results to the ColdFusion query at the right time. However, the database keeps on going, like the Energizer bunny. If the query actually runs through thousands or maybe millions of records that take a while to execute, what you are inadvertently doing is unnecessarily stressing your database.
Taking the lead from Jochem van Dieten’s article, I wrote some code to test for this using SQL Server and ColdFusion 9. The test is simple. We create our source table, populate it with some sample data, and then use the two methods (the efficient and native to SQL Server ‘TOP’ attribute versus the ColdFusion ‘MAXROWS’) to copy from one table and insert in the other. The point is to try to limit how many are copied over using the two notations and then see how many are actually copied.
Here’s the code (included for you to test as a download at the bottom of the post):
The results are very clear. Even though ColdFusion seems to return the correct number of rows and does so pretty fast (because it drops the connection to the database once the maxRows are reached), the database actually runs the complete query:
|CFQUERY returned records
|SQL Server actual records
|SQL Server ‘TOP’ attribute
|ColdFusion ‘MAXROWS’ attribute
The last row in the table is the one that really counts, and clarifies the fact that even though ColdFusion fetches 5 records in its query using maxRows, the database actually runs all 100 of them.
Problem number 2. Issues with cachedWithin.
Let’s say you do this:
ColdFusion fetches the results and caches the query for 1 minute. Fair enough. But what if you edit the file and change the maxRows from 5 to 15, save and run again?
You would probably expect to get 15 rows back. But, you still get 5 results, and only when the cachedWithin time expires the query will start returning 15 rows. This happens because nothing has actually changed inside the cfquery tag. The query that gets cached is actually the content of the cfquery and has nothing to do with the tag attributes. If you wanted to make the change effective immediately, then you can simple add some line breaks to your query, like so:
Use this attribute with great care. Just as long as you know what’s happening under the hood, then you can make the right decisions. Imagine if you had audit triggers in the database to log how many records were selected and you use maxRows. You would then have a mismatch between the database logs and your ColdFusion application logs. Which is bad. Really bad.