Database developers like to use the ISNULL() function to check two conditions and compare them with a search value. This is often done with the intention of finding records that do not have a value stored in the attribute. Sometimes they can get lucky – but not always.
Inhaltsverzeichnis
SARGable – what does that mean?
- A query is SARGable if it allows the database engine to use indexes effectively, resulting in faster data retrieval
- SARGable queries use simple comparisons directly on columns, avoiding transformations or functions that prevent index usage
- Non-SARGable queries force full table scans instead of indexed searches
- Writing SARGable queries is a best practice in SQL optimization
ISNULL()
- The function
ISNULL(expression, replacement_value)
checks whether the specified expressionexpression
is NULL and, if so, replaces it with the specified valuereplacement_value
. - It is often used to provide a default value when a column or expression contains
NULL
, for exampleISNULL(Price, 0)
. - The return value of
ISNULL
has the same data type as the first expression (expression
). - ISNULL can be more efficient than comparable functions such as COALESCE when only a default value is needed, because ISNULL expects exactly two arguments.
Initial situation
A database developer was given the task of optimizing a query from the business. He was supposed to examine the query and optimize it if necessary using an index or rewriting the query. Unfortunately, he forgot a special feature during testing, which then led to a deterioration in performance in the production environment.
Task
The query provided should output both customers from the „IT SERVICE“ sector and customers without assignment from the [dbo].[customers] table.
/*
After the development team has created a test table they startet immediately
with the test query given by the business users.
*/
SELECT c_custkey,
c_mktsegment,
c_nationkey,
c_name,
c_address,
c_phone,
c_acctbal,
c_comment
FROM test.customers
WHERE ISNULL(c_mktsegment, 'IT SERVICE') = 'IT SERVICE'
ORDER BY
c_name ASC;
GO
The execution plan suggested that an index was missing that could optimize the query, so that index was implemented…
/* The performance is really bad so decided to create an index on c_mktsegment ... */
CREATE NONCLUSTERED INDEX nix_test_customers_c_mktsegment
ON test.customers (c_mktsegment)
WITH (DATA_COMPRESSION = PAGE);
GO
… and they noticed that it was good
Since the index was obviously good, it was decided to implement it in the production environment. Unfortunately, it did not produce the desired result – success was largely lacking.
A small success (no parallelization) that still completely scanned the new index.
What is going wrong here?
The developers also asked themselves this question and together we looked at and analyzed the statistics using https://statisticsparser.com.
--------------------------------------------------------
Test
--------------------------------------------------------
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'customers'. Scan count 1, logical reads 55, physical reads 29, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 77 ms.
--------------------------------------------------------
Prod
--------------------------------------------------------
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'customers'. Scan count 1, logical reads 1642, physical reads 24, page server reads 0, read-ahead reads 1615, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
SQL Server Execution Times:
CPU time = 843 ms, elapsed time = 957 ms.
causal research
The reason for the serious difference is a small but subtle feature of the attribute in which a value is to be searched for.
In the test environment, the developer declared the attribute as NOT NULL, while in the production environment it allowed NULL.
The query optimizer of Microsoft SQL Server is smart and recognized in the test environment that the ISNULL function is not needed because the attribute can never be empty. This setting does not exist in the production environment, and so a SARGable query in the test environment became in the production environment a NON-SARGable query. The ISNULL() function had to check for each data record whether a value was present.
Solution(s)
Fortunately, the developers had the freedom to rewrite the query. So, solutions were sought to rewrite the query in such a way that the ISNULL() function could be avoided.
ISNULL() = OR
- ISNULL can be interpreted as „OR“
- ISNULL = NULL or Value
SELECT c_custkey,
c_mktsegment,
c_nationkey,
c_name,
c_address,
c_phone,
c_acctbal,
c_comment
FROM prod.customers
WHERE c_mktsegment IS NULL
OR c_mktsegment = 'IT SERVICE'
ORDER BY
c_name ASC;
GO
ISNULL() = OR = UNION ALL
- When ISNULL() works like an OR than
- an OR is Set #1 combined with Set #2
- Set #1 looks for rows with NULL
- Set #2 looks for rows with „IT SERVICE“
SELECT c_custkey,
c_mktsegment,
c_nationkey,
c_name,
c_address,
c_phone,
c_acctbal,
c_comment
FROM prod.customers
WHERE c_mktsegment IS NULL
UNION ALL
SELECT c_custkey,
c_mktsegment,
c_nationkey,
c_name,
c_address,
c_phone,
c_acctbal,
c_comment
FROM prod.customers
WHERE c_mktsegment = 'IT SERVICE'
ORDER BY
c_name ASC;
GO
Thank you for reading!