Complex query optimization


-4

My stored procedure in SQL Server 2014 takes far too long to execute: it needs 8 minutes to return 118 records. Here is my code:

USE [testdb] 
GO 
SET ANSI_NULLS ON 
GO 
SET QUOTED_IDENTIFIER ON 
GO 

create PROCEDURE [dbo].[X] (
    @AccountID BIGINT, 
    @DateFrom DATE, 
    @DateTo DATE, 
    @Status INT = 0, 
    @FirstName VARCHAR(25) = '', 
    @LastName VARCHAR(25) = '', 
    @NPI VARCHAR(15) = '', 
    @SPI VARCHAR(15) = '', 
    @TaxID VARCHAR(15) = '', 
    @MedicaidID VARCHAR(50) = '' 
) AS BEGIN 
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED 
DECLARE @Restrict BIT = 0 
DECLARE @ManagerID BIGINT = 0; 
DECLARE @RecCount AS INT = 0; 

SELECT * FROM (
SELECT 
    s.ScheduleID, 
    s.ClientID, 
    s.AdmissionID, 
    s.StaffID, 
    CASE WHEN st.DisplayName IS NOT NULL THEN 
     st.DisplayName 
    ELSE 
     ISNULL(Call.StaffNo, '') 
    END as StaffName, 
    s.Date, 
    s.ServiceCode, 
    s.ActivityCode, 
    s.TimeIND, 
    s.TimeOUTD, 
    s.DurationD, 
    s.DurationP, 
    s.DurationS, 
    CASE 
     WHEN s.Status = 1 THEN 'Pending' 
     WHEN s.Status = 2 THEN 'Confirmed' 
     WHEN s.Status = 3 THEN 'Exception' 
     WHEN s.Status = 5 THEN 'Cancelled' 
     ELSE 'Closed' END as Status, 
    --vi.Name as Status, 
    s.Status as StatusCode, 
    s.Flags, 
    s.CallFlags, 
    CASE WHEN c.DisplayName IS NOT NULL THEN 
     c.DisplayName 
    ELSE 
     ISNULL('(' + SUBSTRING(Call.CallFrom, 1, 3) + ') ' + 
       SUBSTRING(Call.CallFrom, 4, 3) + '-' + 
       SUBSTRING(Call.CallFrom, 7, 4), '') 
    END AS ClientName, 
    CASE WHEN ad.ChartID IS NOT NULL THEN 
     ad.ChartID 
    ELSE 
     ISNULL(Call.ClientNo, '') 
    END AS ChartID, 
    s.Comments, 
    s.TimeZone, 
    s.InDST, 
    Address.Address, 
    Address.Address2, 
    Address.City, 
    Address.State, 
    Address.Zip, 
    CASE WHEN s.TimeINS IS NULL THEN NULL ELSE DATEADD(HOUR, -1 * (s.TimeZone - s.InDST), s.TimeINS) END as TimeINS, 
    CASE WHEN s.TimeOUTS IS NULL THEN NULL ELSE DATEADD(HOUR, -1 * (s.TimeZone - s.InDST), s.TimeOUTS) END as TimeOUTS, 
    CASE WHEN s.TimeIN IS NULL THEN NULL ELSE DATEADD(HOUR, -1 * (s.TimeZone - s.InDST), s.TimeIN) END as TimeIN, 
    CASE WHEN s.TimeOUT IS NULL THEN NULL ELSE DATEADD(HOUR, -1 * (s.TimeZone - s.InDST), s.TimeOUT) END as TimeOUT, 
    p.PayorName as Payors, 
    REPLACE(c.MedicaidID, ' ', '') as MedicaidID, 
    REPLACE(cm.TaxID, '-', '') as TaxID, 
    --cn.NPI, 
    --cn.SPI, 
    REPLACE(REPLACE(CASE WHEN (pn.NPI IS NULL OR pn.NPI = '') THEN CASE WHEN (cn.NPI IS NULL OR cn.NPI = '') THEN p.NPI ELSE cn.NPI END ELSE pn.NPI END, '-', ''), ' ', '') as NPI, 
    REPLACE(REPLACE(CASE WHEN (pn.SPI IS NULL OR pn.SPI = '') THEN ISNULL(cn.SPI, '') ELSE pn.SPI END, '-', ''), ' ', '') as SPI, 
    mp.AccountID as MasterAccountID, 
    CASE WHEN ser.ID IS NOT NULL THEN 'Y' ELSE 'N' END as HasReasonCode, 
    RANK() OVER (PARTITION BY s.scheduleid order by call.callid) as Ranking, 
    dbo.ParseCallExceptions(s.Flags, s.CallFlags) as Exceptions 
FROM Schedule AS s 
    INNER JOIN Account ac ON s.AccountID = ac.AccountID AND ac.BeginDate IS NOT NULL 
    LEFT OUTER JOIN Call on Call.ScheduleID = s.ScheduleID and Call.AccountID = s.AccountID 
    INNER JOIN ScheduleItem si ON si.AccountID = s.AccountID AND si.ScheduleID = s.ScheduleID AND si.Owner = 1 AND si.PayorID is not null and ItemOrder = 1 
    INNER JOIN Admission AS ad ON ad.AdmissionID = s.AdmissionID and ad.accountid = s.accountid 
    INNER JOIN Client AS c ON c.ClientID = s.ClientID and c.accountid = s.AccountID 
    LEFT OUTER JOIN Address ON address.accountid = c.accountid and Address.Addressid = c.AddressID 
    LEFT OUTER JOIN Staff AS st ON s.accountid = st.accountid and s.StaffID = st.StaffID 
    --INNER JOIN ValueItem AS vi ON vi.AccountID = s.AccountID AND vi.Category = 'Schedule Status' AND vi.Code = Convert(VARCHAR(10), s.Status) 
    LEFT OUTER JOIN ScheduleEditReason ser on ser.ScheduleID = s.ScheduleID 
    INNER JOIN Company cm ON ad.CompanyCode = cm.CompanyCode AND c.AccountID = cm.AccountID 
    INNER JOIN Payor p on p.AccountID = si.AccountID AND p.PayorID = si.PayorID 
    INNER JOIN Payor mp ON p.ParentID = mp.PayorID 
    LEFT OUTER JOIN CompanyNumber as cn ON s.CompanyCode = cn.CompanyCode AND s.LocationCode = cn.LocationCode AND s.AccountID = cn.AccountID 
    LEFT OUTER JOIN PayorNumber as pn ON pn.PayorID = p.PayorID AND s.CompanyCode = pn.CompanyCode AND s.LocationCode = pn.LocationCode 
    --OUTER APPLY (SELECT top 1 REPLACE(REPLACE(CASE WHEN (pn.NPI IS NULL OR pn.NPI = '') THEN CASE WHEN (cn.NPI IS NULL OR cn.NPI = '') THEN p.NPI ELSE cn.NPI END ELSE pn.NPI END, '-', ''), ' ', '') as NPI 
    -- , REPLACE(REPLACE(CASE WHEN ((pn.NPI IS NULL OR pn.NPI = '') AND (pn.SPI IS NULL OR pn.SPI = '')) THEN CASE WHEN (cn.SPI IS NULL OR cn.SPI = '') THEN ac.ExternalID ELSE cn.SPI END ELSE pn.SPI END, '-', ''), ' ', '') as SPI 
    -- , REPLACE(REPLACE(CASE WHEN ((pn.NPI IS NULL OR pn.NPI = '') AND (pn.API IS NULL OR pn.API = '')) THEN CASE WHEN (cn.API IS NULL OR cn.API = '') THEN pn.API ELSE cn.API END ELSE pn.API END, '-', ''), ' ', '') as API 
    -- FROM CompanyNumber as cn INNER JOIN PayorNumber pn ON pn.AccountID = ad.AccountID and pn.PayorID = p.PayorID and pn.CompanyCode = ad.CompanyCode and pn.LocationCode = ad.LocationCode 
    -- where cn.AccountID = ad.AccountID AND ad.CompanyCode = cn.CompanyCode AND ad.LocationCode = cn.LocationCode) as cn 
WHERE 
    --(c.AccountID = @AccountID or mp.AccountID = @AccountID or @AccountID = 1) AND 
    (s.Date BETWEEN @DateFrom AND @DateTo) AND 
    (@FirstName = '' OR (@FirstName <> '' AND (c.FirstName LIKE @FirstName + '%'))) AND 
    (@LastName = '' OR (@LastName <> '' AND (c.LastName LIKE @LastName + '%'))) AND 
    (s.Flags & 32) <> 32 
) innertable 
WHERE Ranking = 1 
    AND (@MedicaidID = '' OR (@MedicaidID <> '' AND (MedicaidID = @MedicaidID))) 
    --AND (@TaxID = '' OR (@TaxID <> '' AND (TaxID = @TaxID))) 
    AND (@NPI = '' OR (@NPI <> '' AND (NPI = @NPI))) 
    AND (@SPI = '' OR (@SPI <> '' AND (SPI = @SPI))) 
--OPTION (MAXDOP 8) 
END 

Here is the SET STATISTICS IO, TIME ON output:

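Table 'PayorNumber'. Scan count 3, logical reads 1124, physical reads 0, read-ahead reads 277, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Account'. Scan count 3, logical reads 352, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'CompanyNumber'. Scan count 3, logical reads 52, physical reads 0, read-ahead reads 10, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Payor'. Scan count 3, logical reads 11611, physical reads 0, read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'ScheduleItem'. Scan count 3, logical reads 127117, physical reads 192, read-ahead reads 145953, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Schedule'. Scan count 0, logical reads 54850853, physical reads 156982, read-ahead reads 469149, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Worktable'. Scan count 5, logical reads 302, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Admission'. Scan count 3, logical reads 3777, physical reads 3, read-ahead reads 3419, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Client'. Scan count 3, logical reads 10536, physical reads 2, read-ahead reads 9586, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Company'. Scan count 3, logical reads 46, physical reads 0, read-ahead reads 14, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'ScheduleEditReason'. Scan count 5455, logical reads 27192, physical reads 131, read-ahead reads 517, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Staff'. Scan count 0, logical reads 17259, physical reads 79, read-ahead reads 28, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Address'. Scan count 5455, logical reads 20191, physical reads 112, read-ahead reads 61, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Call'. Scan count 5426, logical reads 35834, physical reads 126, read-ahead reads 374, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.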

SQL Server Execution Times: CPU time = 76579 ms, elapsed time = 192952 ms.

+1

What does the function dbo.ParseCallExceptions do? Try commenting it out and see whether the query gets faster. (2017-02-17 19:37:02)

  0

It made no difference. It now takes 7 minutes. Any other suggestions? (2017-02-17 20:59:52)

  0

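On the face of it, 8 minutes is a very long time for 118 records. I would expect the number of records processed by the source tables and joins to be much larger, and that you have a poorly optimized join somewhere. Are any of the tables participating in the joins large? (2017-02-17 21:15:38)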

  0

A few of the tables are large. The biggest table has 46511175 rows in total, and 3-4 of the tables are similarly large. (2017-02-17 21:19:47)

+1

I think that to help answer this, people will need to see your table definitions, in particular which columns are indexed and which indexes are clustered. (2017-02-18 02:08:01)

  0

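A few questions: Did you try Joe's OPTION (RECOMPILE) suggestion, and what was the result? What execution time would be acceptable? Can you provide the actual execution plan (captured in the same run as the STATISTICS IO output)? See the instructions at https://www.brentozar.com/pastetheplan/instructions/ and paste the plan at https://www.brentozar.com/pastetheplan/. (2017-02-22 09:49:15)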

2

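It looks like you have a bit of a kitchen sink query going there. SQL Server can have a hard time creating a good query plan when a query carries many filters that may not apply to a given execution. As a test, try running your query with the OPTION (RECOMPILE) hint. Note that adding the hint means SQL Server will create a new query plan every time the query runs, so it may not be an appropriate long-term solution to your problem. Even so, adding the hint can provide useful data.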
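As a rough illustration of that test, here is a cut-down sketch of the optional-filter pattern from dbo.X with the hint appended. The parameter values are placeholders, and the column list is trimmed for brevity; only the tables, columns, and predicates that appear in the question are used.

```sql
-- Sketch only: a reduced version of the "kitchen sink" filters in dbo.X.
-- The date range and name values below are placeholders, not real data.
DECLARE @DateFrom  DATE        = '2017-01-01';
DECLARE @DateTo    DATE        = '2017-01-31';
DECLARE @FirstName VARCHAR(25) = '';
DECLARE @LastName  VARCHAR(25) = '';

SELECT s.ScheduleID, s.Date, c.DisplayName
FROM Schedule AS s
    INNER JOIN Client AS c
        ON c.ClientID = s.ClientID AND c.AccountID = s.AccountID
WHERE s.Date BETWEEN @DateFrom AND @DateTo
    AND (@FirstName = '' OR c.FirstName LIKE @FirstName + '%')
    AND (@LastName = '' OR c.LastName LIKE @LastName + '%')
    AND (s.Flags & 32) <> 32
OPTION (RECOMPILE);  -- compile a fresh plan for this execution's parameter values
```

In the procedure itself the hint would go at the very end of the outer SELECT, in the spot where the commented-out `--OPTION (MAXDOP 8)` currently sits. With the hint in place the optimizer can take the current parameter values into account, so filters whose parameter is an empty string may be simplified away; whether that helps here can only be judged by comparing the timings and plans.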

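Beyond that, help us help you. We don't know anything about the table definitions or data volumes, which makes it difficult to troubleshoot a large query. Since the query runs for minutes, you can gather a lot of performance information and include it in your question. Try running SET STATISTICS IO, TIME ON before executing the query and edit the output into the question. Also get an actual execution plan and upload the XML to Paste the Plan.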
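One way to collect both pieces of information in a single run from a query window is sketched below. The argument values passed to dbo.X are placeholders; substitute the values that reproduce the slow run.

```sql
-- Sketch only: capture I/O and timing statistics plus the actual plan
-- for one execution of the procedure. Argument values are placeholders.
SET STATISTICS IO, TIME ON;   -- per-table reads and CPU/elapsed time go to the Messages tab
SET STATISTICS XML ON;        -- the actual execution plan comes back as an extra result set

EXEC dbo.X
    @AccountID = 1,
    @DateFrom  = '2017-01-01',
    @DateTo    = '2017-01-31';

SET STATISTICS XML OFF;
SET STATISTICS IO, TIME OFF;
```

The XML plan returned by SET STATISTICS XML ON is the text that can be pasted into Paste the Plan. Enabling "Include Actual Execution Plan" in SQL Server Management Studio is an alternative way to obtain the same plan.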

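Now that you've added the SET STATISTICS IO, TIME information, let's look at the output. First, let's focus on IO. The table with the most logical reads is Schedule, with 54850853 logical reads. The table with the second most is ScheduleItem, with 127117 logical reads. That is a huge difference, so the bottleneck of this query is likely related to the Schedule table. Have you tried adding an index to that table that would help your query? That could reduce the logical reads and speed up the whole query. What is the clustered key of Schedule? How many rows are in the table? How many rows remain after the Date and Flags filters are applied?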
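The questions about Schedule could be answered with something like the sketch below. It assumes the table lives in the dbo schema, the date values are placeholders, and the index at the end is purely hypothetical: its name, key, and included columns are guesses that would need to be checked against the actual execution plan and the indexes that already exist.

```sql
-- Sketch only. Assumes the table is dbo.Schedule; date values are placeholders.
DECLARE @DateFrom DATE = '2017-01-01';
DECLARE @DateTo   DATE = '2017-01-31';

-- 1. List the existing indexes on Schedule with their key and included columns.
SELECT i.name      AS IndexName,
       i.type_desc AS IndexType,        -- CLUSTERED / NONCLUSTERED / HEAP
       i.is_primary_key,
       ic.key_ordinal,
       ic.is_included_column,
       col.name    AS ColumnName
FROM sys.indexes AS i
    INNER JOIN sys.index_columns AS ic
        ON ic.object_id = i.object_id AND ic.index_id = i.index_id
    INNER JOIN sys.columns AS col
        ON col.object_id = ic.object_id AND col.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID('dbo.Schedule')
ORDER BY i.index_id, ic.is_included_column, ic.key_ordinal;

-- 2. Compare the total row count with the rows left after the Date and Flags filters.
SELECT COUNT(*) AS TotalRows,
       SUM(CASE WHEN s.[Date] BETWEEN @DateFrom AND @DateTo
                 AND (s.Flags & 32) <> 32
                THEN 1 ELSE 0 END) AS RowsAfterFilters
FROM dbo.Schedule AS s;

-- 3. Hypothetical candidate index to support the date range filter.
--    IX_Schedule_Date is an invented name; the INCLUDE list is a guess.
CREATE NONCLUSTERED INDEX IX_Schedule_Date
    ON dbo.Schedule ([Date])
    INCLUDE (AccountID, Flags);
```

Because the query selects many Schedule columns, an index like this would still need lookups into the clustered index for the remaining columns, so it is worth testing on a non-production copy before adopting it.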

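There is also a large difference between the elapsed time (192952 ms) and the CPU time (76579 ms). Is your query running in parallel? Is there a lot of other activity on the server? To find out where that difference comes from, you may need to investigate the wait events for this query.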
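A minimal sketch of such an investigation, run from a second connection while the procedure is executing, might look like this. The session id is a placeholder for the session running dbo.X, and querying these views requires the VIEW SERVER STATE permission.

```sql
-- Sketch only: run from a separate session while dbo.X is executing.
-- 53 is a placeholder; use the session_id of the connection running the procedure.
DECLARE @SessionID INT = 53;

-- Current state of the request: what it is waiting on now, and CPU versus elapsed time so far.
SELECT r.session_id,
       r.status,
       r.wait_type,
       r.wait_time,            -- ms spent in the current wait
       r.last_wait_type,
       r.cpu_time,             -- ms
       r.total_elapsed_time,   -- ms
       r.logical_reads
FROM sys.dm_exec_requests AS r
WHERE r.session_id = @SessionID;

-- Per-task waits for the same session.
SELECT wt.session_id,
       wt.exec_context_id,
       wt.wait_type,
       wt.wait_duration_ms,
       wt.blocking_session_id
FROM sys.dm_os_waiting_tasks AS wt
WHERE wt.session_id = @SessionID
ORDER BY wt.exec_context_id;
```

Several rows with different exec_context_id values in the second result suggest a parallel plan, and a non-NULL blocking_session_id points at blocking by another session. Sampling these queries a few times during the run gives a rough picture of the dominant wait types.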

+1

@user2011956 Can you add any more information to your question? What about the output of SET STATISTICS IO, TIME ON? Why can't you paste the execution plan? (2017-02-21 02:33:36)

+1

@user2011956 See https://www.brentozar.com/pastetheplan/instructions/ (2017-02-21 11:46:34)

  0

There are indexes on every table. For Schedule, the primary key is ScheduleID, AccountID is a foreign key, and the row count is 13498976. (2017-02-21 16:57:54)

+1

@user2011956 Are there any other indexes on the Schedule table? (2017-02-22 02:13:13)

  0

There are more indexes on the table. (2017-02-23 15:59:31)