Advertisement

Latest Post

Thursday, 27 July 2017

Indexing for JSON Value in Sql Server 2016

As we have discussed lot of points about features of Sql server 2016. One of the best and most useful features in SQL server 2016 is JSON Support.
Now here, I am discussing how we can apply indexing on JSON String Columns.       
Suppose we have a table “User” that has Address and Contact Number where these columns are NVARCHAR/VARCHAR. In these columns I am storing JSON string value of user.  If we fetch all users records match on the Contact Number.  To handle this case, creating an index on this column is not the best approach because if we create indexes on this column then it also is apply on the JSON values that will cause the performance issue and will take lots of space.

I am creating “Base_User” table below is the simple script

CREATE TABLE dbo.Base_Users(
    Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Name NVARCHAR(50),
       Address NVARCHAR(MAX))
GO

Now, I am inserting 5000 records in this table. Below is the script for this.
INSERT INTO dbo.Base_Users ( Name, Address )
       SELECT TOP 50000 NEWID(),
        REPLACE('{"Address":{"State":"UP","Country":"India"},
        "Contact_Number":"@Contact"}',
        '@Contact', 8999999999-ROW_NUMBER() OVER (ORDER BY SC1.object_id))
FROM SYS.all_columns SC1
CROSS JOIN SYS.all_columns SC2

We can get the table storage usage details at this point of time as below query
SP_SPACEUSED Base_Users

OUTPUT:-
name
rows
reserved
data
index_size
unused
Base_Users
50000
12744 KB
12504 KB
56 KB
184 KB

We can also measure the performance of the queries. By Enable the IO and TIME statistics using the following script

--Enable Statistics
SET STATISTICS IO ON
SET STATISTICS TIME ON
GO
SELECT *
FROM dbo.Base_Users
WHERE JSON_VALUE(address,'$.Contact_Number') = '8999999982'

Result:-
Id
Name
Address
17
3AE4AF7F-BAFB-4770-803F-5341C8F1DF8A
{"Address":{"State":"UP","Country":"India"},    "Contact_Number":"8999999982"}

Here is a performance detail of the above query

SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 1 ms.

(1 row(s) affected)
Table 'Base_Users'. Scan count 1, logical reads 1570, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 453 ms,  elapsed time = 462 ms.


You can see that above query resulting in higher number of IO and CPU time
To solve this problem, we need to create a non-persisted computed column ContactNumber and store the contact number from JSON string in this column.

To add ContactNumber columns in “Base_User” table, following is the query.
ALTER TABLE dbo.Base_Users
ADD ContactNumber  AS JSON_VALUE(address,'$.Contact_Number')

Now, I am checking the storage usage details after adding the new columns
 SP_SPACEUSED Base_Users
OUTPUT:-
name
rows
reserved
data
index_size
unused
Base_Users
50000
12744 KB
12504 KB
56 KB
184 KB

Here, we can see that it doesn’t take any additional storage space.
Now I am measuring the performance of the queries

SELECT *
FROM dbo.Base_Users
WHERE ContactNumber  = 8999999982

SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.

(1 row(s) affected)
Table 'Base_Users'. Scan count 1, logical reads 1570, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 469 ms,  elapsed time = 467 ms.

Here you can see there is not any improvement in the performance.
Now I am going to create Index on ContactNumber that we have added by Alter query.

CREATE INDEX IX_Base_Users_ContactNumber
 ON dbo.Base_Users(ContactNumber)

Now, I am checking the storage usage details after adding the new columns

SP_SPACEUSED Base_Users

OUTPUT:-
name
rows
reserved
data
index_size
unused
Base_Users
50000
15056 KB
12504 KB
1864 KB
688 KB

Here, we can see that it doesn’t take any additional storage space.
Now I am measuring the performance of the queries

SELECT *
FROM dbo.Base_Users
WHERE ContactNumber  = 8999999982

SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 4 ms.
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.

(1 row(s) affected)
Table 'Base_Users'. Scan count 1, logical reads 6, physical reads 6, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 3 ms.

Now, you can see the improved the IO and CPU time of the above query.
You can Disable statistics by using below query.
--Disable statistics
SET STATISTICS IO OFF
SET STATISTICS TIME OFF

GO

Government Jobs