-
September 16th, 2024, 09:15 PM
#1
Cross Apply, Outer Apply some unique scenarios.
A deeper look at Cross Apply and Outer Apply trying to resolve a performance issue with my query which was challenging to get with the traditional joins. CROSS APPLY allows for more detailed control on individual rows, while INNER JOIN operates on sets of data. It becomes very useful when we need to perform row-wise operations or apply subqueries dynamically for each row.
Introduction
CROSS APPLY and OUTER APPLY can be used when working with calculated fields in complex/nested queries and make them simpler and more readable which could be challenging with traditional joins. Cross Apply works like a cursor, one record at a time not the set operations done by joins. In most of the scenarios APPLY performs as fast as a join, but in certain scenarios, it can be slow. In this article, I?ll discuss a few scenarios where we could use the APPLY clause.
Without Cross Apply and Outer Apply, we have limited options:
The CROSS APPLY statement behaves like to a correlated sub-query but allows us to use ORDER BY statements within the subquery. This is very useful when we want to extract the top record from a sub-query to use in an outer query.
SQL Server User Defined Functions tend to prevent parallel operations. It doesn't inline the function. As each cumulative update for SQL 2022 has been released more and more restrictions as to when inlining will occur have been introduced.
Intermediate tables - Usually works and is often a good option as they can be indexed and fast, but performance can also drop due to UPDATE statements not being parallel and not allowing to cascade or reuse results to update several fields within the same statement. Junction tables can work, but soon enough you're joining subqueries with many UNION ALL statements.
Nesting queries and Repeating code - You can put parenthesis on your query and use it as a subquery upon which you can manipulate source data and calculated fields alike. It can't include a COMPUTE or FOR BROWSE clause and can only include an ORDER BY clause when a TOP clause is also specified.
Values introduced through CROSS APPLY can be used to create one or multiple calculated fields without adding performance, complexity and thus, the CROSS APPLY is like an INNER JOIN, or, more precisely, like a CROSS JOIN with a correlated sub-query with an implicit join condition of 1=1.
Derived Values
This example shows how you can perform "procedural" calculations on the chain of derived values. JOIN subqueries cannot introduce new values in the dataset based on data in the parent query (it must stand on its own).
Code:
SELECT calc1.val1,
ROW_NUMBER() OVER (PARTITION BY calc1.val1 ORDER BY calc5.price_gross) pos_number,
calc1.price1,
calc2.price2,
calc3.price3,
calc4.price4,
calc5.price_gross
FROM tbl t
CROSS APPLY (SELECT CASE t.col1 WHEN 1 THEN 'A' WHEN 2 THEN 'B' WHEN 3 THEN 'C' END val1, t.price * (100 - t.discount1) / 100 AS price1) as calc1
CROSS APPLY (SELECT calc1.price1 * (100 - t.discount2) / 100 AS price2) as calc2
CROSS APPLY (SELECT calc2.price2 * (100 - t.discount3) / 100 AS price3) as calc3
CROSS APPLY (SELECT calc3.price3 * (100 - t.discount4) / 100 AS price4) as calc4
CROSS APPLY (SELECT calc4.price4 * (100 + t.VAT) / 100 AS price_gross) as calc5
INNER JOIN tbl2 t2 ON t2.val1 = calc1.val1
ORDER BY calc1.val1
Top N Values
Another usage is let?s say you have two tables, Customer and Order. Customers have many Orders. If I want to create a view that gives me details about customers, and last 10 most recent order they've made. With just JOINS, this would require some self-joins and aggregation which isn't pretty. But with Cross Apply, its super easy and friendly on the eyes.
Code:
SELECT *
FROM Customer cust
CROSS APPLY (
SELECT TOP 10 *
FROM Order o
WHERE o.CustomerId = cust.CustomerId
ORDER BY OrderDate DESC
) T
DRY Principle
CROSS APPLY to prevent duplicate code, when I must use the same CASE in multiple parts of a query. Furthermore, it is nice to use multiple CROSS APPLY's when you have staggered calculations where the second value depends on the first result, the third is using the second resutld, the forth the third result and so on.
Code:
SELECT
webTable.*,
LEFT(noPrefix.myURL, domain_end.pos) AS domain
FROM webTable
CROSS APPLY
(SELECT REPLACE(webTable.myURL, 'www.', '') AS myURL) AS [noPrefix]
CROSS APPLY
(SELECT CHARINDEX('.', noPrefix.MyURL) AS pos) AS [suffix_start]
CROSS APPLY
(SELECT CASE WHEN suffix_start.pos = 0 THEN LEN(noPrefix.myURL) ELSE suffix_start.pos - 1 END AS pos) AS [domain_end]
Using Table Valued Functions to Inner Joins
The cross apply is really a cross-product operator. The CROSS APPLY operator returns only those rows from left table expression if it matches with right table expression is not the whole story. It serves to function as inner and outer joins by chance. Cross apply sometimes enables you to do things that you cannot do with inner join.
The below query will throw a syntax error:
Code:
SELECT fn.* from sys.objects sobj
INNER JOIN dbo.myTableFunc(sobj.name) fn
ON fn.schema_id = sobj.schema_id
There is a syntax error in the above code, because when used with inner join, table functions can only take variables or constants as parameters. i.e. the table function parameter cannot depend on another table's column.
However we can refer to the table alias column name using a CROSS Apply
Code:
SELECT fn.* from sys.objects sobj
CROSS APPLY dbo.myTableFunc(sobj.name) fn
WHERE fn.schema_id= sobj.schema_id
XML Fields
CROSS APPLY has its obvious usage in allowing a set to depend on another (unlike the JOIN operator), but that doesn't come without a cost: it behaves like a function that operates over each member of the left set, so, in SQL Server terms it always perform a Loop Join, which rarely is the best way to join sets. So, use APPLY when you need to, but don't overuse it against JOIN. Below is a simple XML snippet for Orders, we use the CROSS Apply to join the OrderDetail to Order xml .
Code:
DECLARE @x1 xml =
'<Orders>
<Order>
<OrderID>13000</OrderID>
<CustomerID>ALFKI</CustomerID>
<OrderDate>2006-09-20Z</OrderDate>
<EmployeeID>2</EmployeeID>
</Order>
<Order>
<OrderID>13001</OrderID>
<CustomerID>VINET</CustomerID>
<OrderDate>2006-09-20Z</OrderDate>
<EmployeeID>1</EmployeeID>
</Order>
</Orders>'
DECLARE @x2 xml =
'<Orders>
<OrderDetails>
<OrderID>13000</OrderID>
<ProductID>76</ProductID>
<Price>123</Price>
<Qty>10</Qty>
</OrderDetails>
<OrderDetails>
<OrderID>13000</OrderID>
<ProductID>16</ProductID>
<Price>3.23</Price>
<Qty>20</Qty>
</OrderDetails>
</Orders>'
And the query to join these two xmls
Code:
SELECT a.OrderID,c.ProductID,c.Price,c.Qty FROM
(SELECT OrderID = O.n.value('(OrderID/text())[1]', 'int'),
CustomerID = O.n.value('(CustomerID/text())[1]', 'nchar(5)'),
OrderDate = O.n.value('(OrderDate/text())[1]', 'datetime'),
EmployeeId = O.n.value('(EmployeeID/text())[1]', 'smallint')
FROM @x1.nodes('/Orders/Order') AS O(n)) a
CROSS APPLY
(
SELECT * FROM (
SELECT OrderID = D.n.value('(OrderID/text())[1]', 'int'),
ProductID = D.n.value('(ProductID/text())[1]', 'int'),
Price = D.n.value('(Price/text())[1]', 'decimal(10,2)'),
Qty = D.n.value('(Qty/text())[1]', 'int')
FROM @x2.nodes('/Orders/OrderDetails') AS D(n)) b
WHERE a.OrderID=b.OrderID
) c
Replacement for UnPivot
APPLY can be used as a replacement for UNPIVOT. Either CROSS APPLY or OUTER APPLY can be used here, which are interchangeable.
Consider you have the below table.
Code:
CREATE TABLE tblOrders (OrderId int identity, DVD int, HardDisk int, RAM int)
INSERT INTO tblOrders
SELECT 1, 3, 3
UNION ALL
SELECT 2, 5, 4
UNION ALL
SELECT 1, 3, 10
/******************************************************************/
SELECT DISTINCT OrderId, ProductName, ProductQty
FROM tblOrders
CROSS APPLY(VALUES ('DVD', DVD),('HardDisk', HardDisk), ('RAM', RAM))
COLUMNNAMES(ProductName, ProductQty)
Result of the Pivoting operation with Cross Apply.
Code:
OrderId ProductName ProductQty
----------- ----------- -----------
1 DVD 1
1 HardDisk 3
1 RAM 3
2 DVD 2
2 HardDisk 5
2 RAM 4
3 DVD 1
3 HardDisk 3
3 RAM 10
Performance Common Table Expression vs Cross Apply
Whether you use APPLY, IN, or EXISTS, the query optimizer does not take the query text literally. It transforms your code into logic operations, and optimizes from there, using a wide variety of transformations. It is possible to write the same logical requirement in many ways, and often they will produce the same or trivially different execution plan. Identical execution plans produce identical results all things being equal.
Using two sub queries
Code:
SELECT
(
SELECT MAX(to1.Qty1) as mQty1
FROM tblOrder to1
WHERE to1.OrderId = orderDetail.Id
),
(
SELECT MAX(to2.Qty2) as mQty2
FROM tblOrder to2
WHERE to2.OrderId = orderDetail.Id
)
FROM tblDetails d
Using Cross Apply for the same operation
Code:
SELECT t.mQty1, t.mQty2
FROM tblDetails d
CROSS APPLY
(
SELECT MAX(to1.Qty1) as mQty1, MAX(to1.Qty2) as mQty2
FROM tblOrder to1
WHERE to1.OrderId = orderDetail.Id
) as t
Through the Apply operator cross or outer you will be able to write a query that return the same results as the Query that use the sub-Query. But what are the differences between these two ways to write the query? You will have just noticed that in the first Query it was necessary to use two sub-queries while using the cross apply it is possible to read data from the tblOrder table only once.
The pros is in cross apply it actually reads half of the pages of the tblOrder table. But the con is SQL Server create a temporary table in the tempdb which contains the data returned from Cross Apply query. Using tempdb may affect performance. However, if the number of subqueries increased, the performance would most likely change in favor of cross apply.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|