分析_数据库其它_脚本之家

The Hidden Costs of INSERT EXEC

INSERT and EXEC: relational brothers in arms, helping you get your data
and then put it somewhere. But like all brothers, their relationship has
its ups and downs and sometimes you need to look just below the surface
to see the real issues. In this post I will cover hidden penalties that
a decision–seemingly purely aesthetic–can bring to the fore.

To illustrate the first of these examples I will be using the following
table:

USE tempdbGOCREATE TABLE dbo.MyTable( [SalesOrderID] [int] NOT NULL, [SalesOrderDetailID] [int] NOT NULL, [CarrierTrackingNumber] [nvarchar] NULL, [OrderQty] [smallint] NOT NULL, [ProductID] [int] NOT NULL, [SpecialOfferID] [int] NOT NULL, [UnitPrice] [money] NOT NULL, [UnitPriceDiscount] [money] NOT NULL, [LineTotal] [money] NOT NULL, [rowguid] [uniqueidentifier] NOT NULL, [ModifiedDate] [datetime] NOT NULL)GO

Consider the following queries, disregarding the fact that this isn’t a
good dynamic SQL example (I’m trying to keep it simple), and take a
moment to think about the differences between them from a behavioral
point of view:

EXEC(' INSERT dbo.MyTable SELECT * FROM AdventureWorks.Sales.SalesOrderDetail')GOINSERT dbo.MyTableEXEC (' SELECT * FROM AdventureWorks.Sales.SalesOrderDetail')GO

金沙js8331,分析_数据库其它_脚本之家。It’s quite common to insert some data into a table from dynamic SQL, and
for as long as I can remember, I’ve believed the choice between these
two forms of the query to be primarily an issue of style. My preference
has been for the latter style, as I feel that it’s a bit more readable.
It also has the benefit of better supporting ownership chains, but
that’s something that, in most cases, we don’t have to worry about any
longer in SQL Server 2005 or 2008. But let’s move beyond style and
potential security issues and get to the substance. What actually
happens when we run these two queries?

In the first case–insert done inside of the EXEC–the first step is
that the outer statement–the EXEC itself–is parsed and validated. Now
we jump down one level of context, into the string that was passed to
EXEC, and that string is parsed and validated. The referenced objects
are resolved, and the plan cache is checked. If there is no plan, the
query is compiled. And then it’s executed. Data is streamed from
SalesOrderDetail into MyTable and when the query inside of the EXEC is
done control returns to the calling context–EXEC–and assuming there
were no errors, the EXEC is now completed.

In the second case, something quite a bit different occurs. The insert
is parsed and validated, and an INSERT EXEC plan is generated. This plan
does not involve SalesOrderDetail, as the query optimizer doesn’t know
at this point in the process where the data will be inserted from, since
that step is dynamic. So the plan references something called the
“Parameter Table” (plan truncated for simplicity):

|--Table Insert(OBJECT:([tempdb].[dbo].[MyTable])) |--Top |--Parameter Table Scan

Once this plan has been generated, we again jump down one level of
context, and the inner string is parsed and validated, the referenced
object resolved, the cache checked, and the compiled plan executed. But
what is this Parameter Table thing?

Here’s where things start really diverging. Data is not, at this point,
streamed from SalesOrderDetail directly into MyTable. Rather, it is
streamed from SalesOrderDetail into the Parameter Table. And the
Parameter Table, as it turns out, is in actuality a hidden temporary
table. And not until all of the data has streamed into that hidden
temporary table is control resumed by the EXEC context, and only then
does the data start moving into its final home.

The natural question you might be asking yourself at this point is, just
how much overhead does this Parameter Table introduce into the equation?
The answer might just send you racing to check your stored procedures:
the additional cost for the Parameter Table is well over 100% as
compared with doing the insert inside of the EXEC. The fact that all of
the data must be spooled to the Parameter Table before the insert can
even begin tells us that this must be true, and it can be verified using
a simple check against the sys.dm_exec_requests DMV, as in the
following example:

EXEC(' INSERT dbo.MyTable SELECT * FROM AdventureWorks.Sales.SalesOrderDetail')SELECT r.writes AS [INSERT inside EXEC writes]FROM sys.dm_exec_requests rWHERE r.session_id = @@SPIDGOINSERT dbo.MyTableEXEC(' SELECT * FROM AdventureWorks.Sales.SalesOrderDetail')SELECT r.writes AS [INSERT EXEC writes]FROM sys.dm_exec_requests rWHERE r.session_id = @@SPIDGO

So that’s that. We should avoid INSERT EXEC and try to do our inserts in
the same context in which the SELECT is running–right?

Well, yes and no. There is another element at play here which I haven’t
yet mentioned. What if we were only inserting a few rows, and the table
we were inserting into looked something like the following:

CREATE TABLE #MyTable( [SalesOrderID] [int] NOT NULL, [SalesOrderDetailID] [int] NOT NULL, [CarrierTrackingNumber] [nvarchar] NULL, [OrderQty] [smallint] NOT NULL, [ProductID] [int] NOT NULL, [SpecialOfferID] [int] NOT NULL, [UnitPrice] [money] NOT NULL, [UnitPriceDiscount] [money] NOT NULL, [LineTotal] [money] NOT NULL, [rowguid] [uniqueidentifier] NOT NULL, [ModifiedDate] [datetime] NOT NULL)GO

The only difference between this table and the previous one is that this
is a temporary table and the other is not. But temporary tables have
their own interesting little twists, especially when it comes down to
one of the key enemies in a highly-transactional system: recompilation.
As it turns out, doing the insert inside the EXEC will cause the
internal statement to recompile every time a new temp table is
encountered. This means that if you have a stored procedure that creates
a temp table, puts together a bit of dynamic SQL, and does an insert
inside of that dynamic SQL, you’ll now have yourself a recompilation
problem.

To see this illustrated, try the following script. Here only a single
row is inserted as a result of the dynamic query, but it’s complex
enough that the compile time more than overshadows the overhead of the
Parameter Table:

USE tempdbGODBCC FREEPROCCACHEGOCREATE TABLE #AvgTimes( CPU_time DECIMAL NOT NULL, insert_type VARCHARGOCREATE TABLE #MyTable( [SalesOrderID] [int] NOT NULL, [SalesOrderDetailID] [int] NOT NULL, [CarrierTrackingNumber] [nvarchar] NULL, [OrderQty] [smallint] NOT NULL, [ProductID] [int] NOT NULL, [SpecialOfferID] [int] NOT NULL, [UnitPrice] [money] NOT NULL, [UnitPriceDiscount] [money] NOT NULL, [LineTotal] [money] NOT NULL, [rowguid] [uniqueidentifier] NOT NULL, [ModifiedDate] [datetime] NOT NULL)EXEC(' INSERT #MyTable SELECT TOP sod.* FROM AdventureWorks.Sales.SalesOrderDetail sod WHERE sod.UnitPrice > 10 AND sod.LineTotal > 100 AND EXISTS ( SELECT * FROM AdventureWorks.Sales.SalesOrderHeader soh JOIN AdventureWorks.Person.Contact c ON c.ContactID = soh.CustomerID WHERE soh.SalesOrderID = sod.SalesOrderID AND c.LastName LIKE ''Ab%'' ) AND EXISTS ( SELECT * FROM AdventureWorks.Production.Product p WHERE p.ProductID = sod.ProductID AND p.Color IS NULL ) AND NOT EXISTS ( SELECT * FROM AdventureWorks.Sales.SalesOrderHeader soh JOIN AdventureWorks.Sales.SalesPerson sp ON soh.SalesPersonID = sp.SalesPersonID WHERE soh.SalesOrderID = sod.SalesOrderID AND sp.CommissionPct > 50 )')INSERT #AvgTimesSELECT r.cpu_time, 'INSERT inside EXEC'FROM sys.dm_exec_requests rWHERE r.session_id = @@SPIDDROP TABLE #MyTableGO 5CREATE TABLE #MyTable( [SalesOrderID] [int] NOT NULL, [SalesOrderDetailID] [int] NOT NULL, [CarrierTrackingNumber] [nvarchar] NULL, [OrderQty] [smallint] NOT NULL, [ProductID] [int] NOT NULL, [SpecialOfferID] [int] NOT NULL, [UnitPrice] [money] NOT NULL, [UnitPriceDiscount] [money] NOT NULL, [LineTotal] [money] NOT NULL, [rowguid] [uniqueidentifier] NOT NULL, [ModifiedDate] [datetime] NOT NULL)INSERT #MyTableEXEC sod.* FROM AdventureWorks.Sales.SalesOrderDetail sod WHERE sod.UnitPrice > 10 AND sod.LineTotal > 100 AND EXISTS ( SELECT * FROM AdventureWorks.Sales.SalesOrderHeader soh JOIN AdventureWorks.Person.Contact c ON c.ContactID = soh.CustomerID WHERE soh.SalesOrderID = sod.SalesOrderID AND c.LastName LIKE ''Ab%'' ) AND EXISTS ( SELECT * FROM AdventureWorks.Production.Product p WHERE p.ProductID = sod.ProductID AND p.Color IS NULL ) AND NOT EXISTS ( SELECT * FROM AdventureWorks.Sales.SalesOrderHeader soh JOIN AdventureWorks.Sales.SalesPerson sp ON soh.SalesPersonID = sp.SalesPersonID WHERE soh.SalesOrderID = sod.SalesOrderID AND sp.CommissionPct > 50 )')INSERT #AvgTimesSELECT r.cpu_time, 'INSERT EXEC'FROM sys.dm_exec_requests rWHERE r.session_id = @@SPIDDROP TABLE #MyTableGO 5SELECT AVG ( CASE a.insert_type WHEN 'INSERT inside EXEC' THEN a.CPU_time ELSE NULL END ) AS [CPU time - INSERT inside EXEC], AVG ( CASE a.insert_type WHEN 'INSERT EXEC' THEN a.CPU_time ELSE NULL END ) AS [CPU time - INSERT EXEC]FROM #AvgTimes aGODROP TABLE #AvgTimesGO

So what have we learned today? The choice between INSERT EXEC and INSERT
inside of EXEC is not purely stylistic and has definite performance
implications. Here are the rules I’ll be following from now on:

When working with permanent tables, always avoid INSERT EXEC if
possible. There are some cases where it won’t be possible to avoid. I
only showed dynamic SQL in this post, but INSERT EXEC also applies to
stored procedures. Can you safely rip apart all of the stored procedures
in your system in order to avoid this issue? Maybe not quite as easily
as you can rip apart the dynamic SQL within stored procedures. When
working with temporary tables, evaluate the complexity of the
operations, the amount of data to be inserted, and most importantly,
test every solution! The more rows that are inserted as a result of the
INSERT EXEC, the more the overhead you’ll get from the Parameter Table.
On the flip side, the more complex the dynamic statement, the more
overhead you’ll get from recompilations. Every scenario is bound to be
different and you may just learn something about your processes by doing
this extra bit of analysis.

发表评论

电子邮件地址不会被公开。 必填项已用*标注