As we have discussed lot of points about features of Sql
server 2016. One of the best and most useful features in SQL server 2016 is
JSON Support.
Now here, I am discussing how we can apply indexing on JSON
String Columns.
Suppose we have a table “User” that has Address and Contact
Number where these columns are NVARCHAR/VARCHAR. In these columns I am storing
JSON string value of user. If we fetch
all users records match on the Contact Number.
To handle this case, creating an index on this column is not the best
approach because if we create indexes on this column then it also is apply on
the JSON values that will cause the performance issue and will take lots of
space.
I am
creating “Base_User” table below is the simple script
CREATE TABLE dbo.Base_Users(
Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
Name NVARCHAR(50),
Address NVARCHAR(MAX))
GO
Now, I
am inserting 5000 records in this table. Below is the script for this.
INSERT INTO dbo.Base_Users ( Name, Address )
SELECT TOP 50000 NEWID(),
REPLACE('{"Address":{"State":"UP","Country":"India"},
"Contact_Number":"@Contact"}',
'@Contact', 8999999999-ROW_NUMBER() OVER (ORDER BY SC1.object_id))
FROM SYS.all_columns SC1
CROSS JOIN SYS.all_columns SC2
We can
get the table
storage usage details at this point of time as below query
SP_SPACEUSED Base_Users
OUTPUT:-
name
|
rows
|
reserved
|
data
|
index_size
|
unused
|
Base_Users
|
50000
|
12744 KB
|
12504 KB
|
56 KB
|
184 KB
|
We can also measure
the performance of the queries. By Enable the IO and TIME statistics using the
following script
--Enable
Statistics
SET STATISTICS IO ON
SET STATISTICS TIME ON
GO
SELECT *
FROM dbo.Base_Users
WHERE JSON_VALUE(address,'$.Contact_Number') = '8999999982'
Result:-
Id
|
Name
|
Address
|
17
|
3AE4AF7F-BAFB-4770-803F-5341C8F1DF8A
|
{"Address":{"State":"UP","Country":"India"},
"Contact_Number":"8999999982"}
|
Here is a performance
detail of the above query
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
(1 row(s)
affected)
Table
'Base_Users'. Scan count 1, logical
reads 1570, physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 453 ms, elapsed time
= 462 ms.
You can see that above query resulting in higher number of
IO and CPU time
To solve this problem, we need to create a non-persisted
computed column ContactNumber and
store the contact number from JSON string in this column.
To add ContactNumber
columns in “Base_User” table,
following is the query.
ALTER TABLE dbo.Base_Users
ADD ContactNumber AS JSON_VALUE(address,'$.Contact_Number')
Now, I am checking
the storage usage details after adding the new columns
SP_SPACEUSED Base_Users
OUTPUT:-
name
|
rows
|
reserved
|
data
|
index_size
|
unused
|
Base_Users
|
50000
|
12744 KB
|
12504 KB
|
56 KB
|
184 KB
|
Here, we can see that
it doesn’t take any additional storage space.
Now I am measuring the
performance of the queries
SELECT *
FROM dbo.Base_Users
WHERE ContactNumber = 8999999982
SQL
Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL
Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
(1 row(s)
affected)
Table
'Base_Users'. Scan count 1, logical reads 1570, physical reads 0, read-ahead
reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 469 ms, elapsed time = 467 ms.
Here you can see
there is not any improvement in the performance.
Now I am going to
create Index on ContactNumber that we have added by Alter query.
CREATE INDEX
IX_Base_Users_ContactNumber
ON dbo.Base_Users(ContactNumber)
Now, I am checking
the storage usage details after adding the new columns
SP_SPACEUSED Base_Users
OUTPUT:-
name
|
rows
|
reserved
|
data
|
index_size
|
unused
|
Base_Users
|
50000
|
15056 KB
|
12504 KB
|
1864 KB
|
688 KB
|
Here, we can see that
it doesn’t take any additional storage space.
Now I am measuring the
performance of the queries
SELECT *
FROM dbo.Base_Users
WHERE ContactNumber = 8999999982
SQL
Server parse and compile time:
CPU time = 0 ms, elapsed time = 4 ms.
SQL
Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
(1 row(s)
affected)
Table
'Base_Users'. Scan count 1, logical
reads 6, physical reads 6, read-ahead reads 0, lob logical reads 0, lob
physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time =
3 ms.
Now, you can see the
improved the IO and CPU time of the above query.
You can Disable
statistics by using below query.
--Disable statistics
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
GO
No comments:
Post a comment