Optimizing numeric Range Search in SQL Server












3














This question is similar to Optimizing IP Range Search? but that one is restricted to SQL Server 2000.



Suppose I have 10 million ranges provisionally stored in a table structured and populated as below.



CREATE TABLE MyTable
(
Id INT IDENTITY PRIMARY KEY,
RangeFrom INT NOT NULL,
RangeTo INT NOT NULL,
CHECK (RangeTo > RangeFrom),
INDEX IX1 (RangeFrom,RangeTo),
INDEX IX2 (RangeTo,RangeFrom)
);

WITH RandomNumbers
AS (SELECT TOP 10000000 ABS(CRYPT_GEN_RANDOM(4)%100000000) AS Num
FROM sys.all_objects o1,
sys.all_objects o2,
sys.all_objects o3,
sys.all_objects o4)
INSERT INTO MyTable
(RangeFrom,
RangeTo)
SELECT Num,
Num + 1 + CRYPT_GEN_RANDOM(1)
FROM RandomNumbers


I need to know all ranges containing the value 50,000,000. I try the following query



SELECT *
FROM MyTable
WHERE 50000000 BETWEEN RangeFrom AND RangeTo


SQL Server shows that there were 10,951 logical reads and nearly 5 million rows were read to return the 12 matching ones.



enter image description here



Can I improve on this performance? Any restructuring of the table or additional indexes is fine.










share|improve this question



























    3














    This question is similar to Optimizing IP Range Search? but that one is restricted to SQL Server 2000.



    Suppose I have 10 million ranges provisionally stored in a table structured and populated as below.



    CREATE TABLE MyTable
    (
    Id INT IDENTITY PRIMARY KEY,
    RangeFrom INT NOT NULL,
    RangeTo INT NOT NULL,
    CHECK (RangeTo > RangeFrom),
    INDEX IX1 (RangeFrom,RangeTo),
    INDEX IX2 (RangeTo,RangeFrom)
    );

    WITH RandomNumbers
    AS (SELECT TOP 10000000 ABS(CRYPT_GEN_RANDOM(4)%100000000) AS Num
    FROM sys.all_objects o1,
    sys.all_objects o2,
    sys.all_objects o3,
    sys.all_objects o4)
    INSERT INTO MyTable
    (RangeFrom,
    RangeTo)
    SELECT Num,
    Num + 1 + CRYPT_GEN_RANDOM(1)
    FROM RandomNumbers


    I need to know all ranges containing the value 50,000,000. I try the following query



    SELECT *
    FROM MyTable
    WHERE 50000000 BETWEEN RangeFrom AND RangeTo


    SQL Server shows that there were 10,951 logical reads and nearly 5 million rows were read to return the 12 matching ones.



    enter image description here



    Can I improve on this performance? Any restructuring of the table or additional indexes is fine.










    share|improve this question

























      3












      3








      3







      This question is similar to Optimizing IP Range Search? but that one is restricted to SQL Server 2000.



      Suppose I have 10 million ranges provisionally stored in a table structured and populated as below.



      CREATE TABLE MyTable
      (
      Id INT IDENTITY PRIMARY KEY,
      RangeFrom INT NOT NULL,
      RangeTo INT NOT NULL,
      CHECK (RangeTo > RangeFrom),
      INDEX IX1 (RangeFrom,RangeTo),
      INDEX IX2 (RangeTo,RangeFrom)
      );

      WITH RandomNumbers
      AS (SELECT TOP 10000000 ABS(CRYPT_GEN_RANDOM(4)%100000000) AS Num
      FROM sys.all_objects o1,
      sys.all_objects o2,
      sys.all_objects o3,
      sys.all_objects o4)
      INSERT INTO MyTable
      (RangeFrom,
      RangeTo)
      SELECT Num,
      Num + 1 + CRYPT_GEN_RANDOM(1)
      FROM RandomNumbers


      I need to know all ranges containing the value 50,000,000. I try the following query



      SELECT *
      FROM MyTable
      WHERE 50000000 BETWEEN RangeFrom AND RangeTo


      SQL Server shows that there were 10,951 logical reads and nearly 5 million rows were read to return the 12 matching ones.



      enter image description here



      Can I improve on this performance? Any restructuring of the table or additional indexes is fine.










      share|improve this question













      This question is similar to Optimizing IP Range Search? but that one is restricted to SQL Server 2000.



      Suppose I have 10 million ranges provisionally stored in a table structured and populated as below.



      CREATE TABLE MyTable
      (
      Id INT IDENTITY PRIMARY KEY,
      RangeFrom INT NOT NULL,
      RangeTo INT NOT NULL,
      CHECK (RangeTo > RangeFrom),
      INDEX IX1 (RangeFrom,RangeTo),
      INDEX IX2 (RangeTo,RangeFrom)
      );

      WITH RandomNumbers
      AS (SELECT TOP 10000000 ABS(CRYPT_GEN_RANDOM(4)%100000000) AS Num
      FROM sys.all_objects o1,
      sys.all_objects o2,
      sys.all_objects o3,
      sys.all_objects o4)
      INSERT INTO MyTable
      (RangeFrom,
      RangeTo)
      SELECT Num,
      Num + 1 + CRYPT_GEN_RANDOM(1)
      FROM RandomNumbers


      I need to know all ranges containing the value 50,000,000. I try the following query



      SELECT *
      FROM MyTable
      WHERE 50000000 BETWEEN RangeFrom AND RangeTo


      SQL Server shows that there were 10,951 logical reads and nearly 5 million rows were read to return the 12 matching ones.



      enter image description here



      Can I improve on this performance? Any restructuring of the table or additional indexes is fine.







      sql-server optimization






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked 1 hour ago









      Martin Smith

      61.5k10166245




      61.5k10166245






















          1 Answer
          1






          active

          oldest

          votes


















          3














          One alternative way of representing a range would be as points on a line.



          The below migrates all the data into a new table with the range represented as a geometry datatype.



          CREATE TABLE MyTable2
          (
          Id INT IDENTITY PRIMARY KEY,
          Range GEOMETRY NOT NULL,
          RangeFrom AS Range.STPointN(1).STX,
          RangeTo AS Range.STPointN(2).STX,
          CHECK (Range.STNumPoints() = 2 AND Range.STPointN(1).STY = 0 AND Range.STPointN(2).STY = 0)
          );

          SET IDENTITY_INSERT MyTable2 ON

          INSERT INTO MyTable2
          (Id,
          Range)
          SELECT ID,
          geometry::STLineFromText(CONCAT('LINESTRING(', RangeFrom, ' 0, ', RangeTo, ' 0)'), 0)
          FROM MyTable

          SET IDENTITY_INSERT MyTable2 OFF


          CREATE SPATIAL INDEX index_name
          ON MyTable2 ( Range )
          USING GEOMETRY_GRID
          WITH (
          BOUNDING_BOX = ( xmin=0, ymin=0, xmax=110000000, ymax=1 ),
          GRIDS = (HIGH, HIGH, HIGH, HIGH),
          CELLS_PER_OBJECT = 16);


          The equivalent query to find ranges containing the value 50,000,000 is below.



          SELECT Id,
          RangeFrom,
          RangeTo
          FROM MyTable2
          WHERE Range.STContains(geometry::STPointFromText ('POINT (50000000 0)', 0)) = 1


          The reads for this show an improvement on the 10,951 from the original query.



          Table 'MyTable2'. Scan count 0, logical reads 505, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
          Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
          Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
          Table 'extended_index_1797581442_384000'. Scan count 4, logical reads 17, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


          However there is no significant improvement over the original in terms of time elapsed. Typical execution results are 250 ms vs 252 ms.



          The execution plan is more complex as below



          enter image description here



          The only case where the rewrite reliably performs better for me is with a cold cache.



          So disappointing in this case and difficult to recommend this rewrite but publication of negative results can also be useful.






          share|improve this answer





















            Your Answer








            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "182"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: false,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f225953%2foptimizing-numeric-range-search-in-sql-server%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            3














            One alternative way of representing a range would be as points on a line.



            The below migrates all the data into a new table with the range represented as a geometry datatype.



            CREATE TABLE MyTable2
            (
            Id INT IDENTITY PRIMARY KEY,
            Range GEOMETRY NOT NULL,
            RangeFrom AS Range.STPointN(1).STX,
            RangeTo AS Range.STPointN(2).STX,
            CHECK (Range.STNumPoints() = 2 AND Range.STPointN(1).STY = 0 AND Range.STPointN(2).STY = 0)
            );

            SET IDENTITY_INSERT MyTable2 ON

            INSERT INTO MyTable2
            (Id,
            Range)
            SELECT ID,
            geometry::STLineFromText(CONCAT('LINESTRING(', RangeFrom, ' 0, ', RangeTo, ' 0)'), 0)
            FROM MyTable

            SET IDENTITY_INSERT MyTable2 OFF


            CREATE SPATIAL INDEX index_name
            ON MyTable2 ( Range )
            USING GEOMETRY_GRID
            WITH (
            BOUNDING_BOX = ( xmin=0, ymin=0, xmax=110000000, ymax=1 ),
            GRIDS = (HIGH, HIGH, HIGH, HIGH),
            CELLS_PER_OBJECT = 16);


            The equivalent query to find ranges containing the value 50,000,000 is below.



            SELECT Id,
            RangeFrom,
            RangeTo
            FROM MyTable2
            WHERE Range.STContains(geometry::STPointFromText ('POINT (50000000 0)', 0)) = 1


            The reads for this show an improvement on the 10,951 from the original query.



            Table 'MyTable2'. Scan count 0, logical reads 505, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
            Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
            Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
            Table 'extended_index_1797581442_384000'. Scan count 4, logical reads 17, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


            However there is no significant improvement over the original in terms of time elapsed. Typical execution results are 250 ms vs 252 ms.



            The execution plan is more complex as below



            enter image description here



            The only case where the rewrite reliably performs better for me is with a cold cache.



            So disappointing in this case and difficult to recommend this rewrite but publication of negative results can also be useful.






            share|improve this answer


























              3














              One alternative way of representing a range would be as points on a line.



              The below migrates all the data into a new table with the range represented as a geometry datatype.



              CREATE TABLE MyTable2
              (
              Id INT IDENTITY PRIMARY KEY,
              Range GEOMETRY NOT NULL,
              RangeFrom AS Range.STPointN(1).STX,
              RangeTo AS Range.STPointN(2).STX,
              CHECK (Range.STNumPoints() = 2 AND Range.STPointN(1).STY = 0 AND Range.STPointN(2).STY = 0)
              );

              SET IDENTITY_INSERT MyTable2 ON

              INSERT INTO MyTable2
              (Id,
              Range)
              SELECT ID,
              geometry::STLineFromText(CONCAT('LINESTRING(', RangeFrom, ' 0, ', RangeTo, ' 0)'), 0)
              FROM MyTable

              SET IDENTITY_INSERT MyTable2 OFF


              CREATE SPATIAL INDEX index_name
              ON MyTable2 ( Range )
              USING GEOMETRY_GRID
              WITH (
              BOUNDING_BOX = ( xmin=0, ymin=0, xmax=110000000, ymax=1 ),
              GRIDS = (HIGH, HIGH, HIGH, HIGH),
              CELLS_PER_OBJECT = 16);


              The equivalent query to find ranges containing the value 50,000,000 is below.



              SELECT Id,
              RangeFrom,
              RangeTo
              FROM MyTable2
              WHERE Range.STContains(geometry::STPointFromText ('POINT (50000000 0)', 0)) = 1


              The reads for this show an improvement on the 10,951 from the original query.



              Table 'MyTable2'. Scan count 0, logical reads 505, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
              Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
              Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
              Table 'extended_index_1797581442_384000'. Scan count 4, logical reads 17, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


              However there is no significant improvement over the original in terms of time elapsed. Typical execution results are 250 ms vs 252 ms.



              The execution plan is more complex as below



              enter image description here



              The only case where the rewrite reliably performs better for me is with a cold cache.



              So disappointing in this case and difficult to recommend this rewrite but publication of negative results can also be useful.






              share|improve this answer
























                3












                3








                3






                One alternative way of representing a range would be as points on a line.



                The below migrates all the data into a new table with the range represented as a geometry datatype.



                CREATE TABLE MyTable2
                (
                Id INT IDENTITY PRIMARY KEY,
                Range GEOMETRY NOT NULL,
                RangeFrom AS Range.STPointN(1).STX,
                RangeTo AS Range.STPointN(2).STX,
                CHECK (Range.STNumPoints() = 2 AND Range.STPointN(1).STY = 0 AND Range.STPointN(2).STY = 0)
                );

                SET IDENTITY_INSERT MyTable2 ON

                INSERT INTO MyTable2
                (Id,
                Range)
                SELECT ID,
                geometry::STLineFromText(CONCAT('LINESTRING(', RangeFrom, ' 0, ', RangeTo, ' 0)'), 0)
                FROM MyTable

                SET IDENTITY_INSERT MyTable2 OFF


                CREATE SPATIAL INDEX index_name
                ON MyTable2 ( Range )
                USING GEOMETRY_GRID
                WITH (
                BOUNDING_BOX = ( xmin=0, ymin=0, xmax=110000000, ymax=1 ),
                GRIDS = (HIGH, HIGH, HIGH, HIGH),
                CELLS_PER_OBJECT = 16);


                The equivalent query to find ranges containing the value 50,000,000 is below.



                SELECT Id,
                RangeFrom,
                RangeTo
                FROM MyTable2
                WHERE Range.STContains(geometry::STPointFromText ('POINT (50000000 0)', 0)) = 1


                The reads for this show an improvement on the 10,951 from the original query.



                Table 'MyTable2'. Scan count 0, logical reads 505, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
                Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
                Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
                Table 'extended_index_1797581442_384000'. Scan count 4, logical reads 17, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


                However there is no significant improvement over the original in terms of time elapsed. Typical execution results are 250 ms vs 252 ms.



                The execution plan is more complex as below



                enter image description here



                The only case where the rewrite reliably performs better for me is with a cold cache.



                So disappointing in this case and difficult to recommend this rewrite but publication of negative results can also be useful.






                share|improve this answer












                One alternative way of representing a range would be as points on a line.



                The below migrates all the data into a new table with the range represented as a geometry datatype.



                CREATE TABLE MyTable2
                (
                Id INT IDENTITY PRIMARY KEY,
                Range GEOMETRY NOT NULL,
                RangeFrom AS Range.STPointN(1).STX,
                RangeTo AS Range.STPointN(2).STX,
                CHECK (Range.STNumPoints() = 2 AND Range.STPointN(1).STY = 0 AND Range.STPointN(2).STY = 0)
                );

                SET IDENTITY_INSERT MyTable2 ON

                INSERT INTO MyTable2
                (Id,
                Range)
                SELECT ID,
                geometry::STLineFromText(CONCAT('LINESTRING(', RangeFrom, ' 0, ', RangeTo, ' 0)'), 0)
                FROM MyTable

                SET IDENTITY_INSERT MyTable2 OFF


                CREATE SPATIAL INDEX index_name
                ON MyTable2 ( Range )
                USING GEOMETRY_GRID
                WITH (
                BOUNDING_BOX = ( xmin=0, ymin=0, xmax=110000000, ymax=1 ),
                GRIDS = (HIGH, HIGH, HIGH, HIGH),
                CELLS_PER_OBJECT = 16);


                The equivalent query to find ranges containing the value 50,000,000 is below.



                SELECT Id,
                RangeFrom,
                RangeTo
                FROM MyTable2
                WHERE Range.STContains(geometry::STPointFromText ('POINT (50000000 0)', 0)) = 1


                The reads for this show an improvement on the 10,951 from the original query.



                Table 'MyTable2'. Scan count 0, logical reads 505, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
                Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
                Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
                Table 'extended_index_1797581442_384000'. Scan count 4, logical reads 17, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


                However there is no significant improvement over the original in terms of time elapsed. Typical execution results are 250 ms vs 252 ms.



                The execution plan is more complex as below



                enter image description here



                The only case where the rewrite reliably performs better for me is with a cold cache.



                So disappointing in this case and difficult to recommend this rewrite but publication of negative results can also be useful.







                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered 1 hour ago









                Martin Smith

                61.5k10166245




                61.5k10166245






























                    draft saved

                    draft discarded




















































                    Thanks for contributing an answer to Database Administrators Stack Exchange!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    To learn more, see our tips on writing great answers.





                    Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


                    Please pay close attention to the following guidance:


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f225953%2foptimizing-numeric-range-search-in-sql-server%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    Михайлов, Христо

                    Гороховецкий артиллерийский полигон

                    Центральная группа войск