Using FOR XML with Sql server

May 1, 2009

In SQL Server 2000, the FOR XML clause was added as an extension to SELECT, which transforms the rowset output to an XML stream. This is a very powerful extension to the SELECT command. The XML stream thus formed can have the following formants: RAW, AUTO or

EXPLICIT. The new syntax of the SELECT statement is as follows:

====================================================

SELECT

column list

FROM

table list

WHERE

filter criteria

FOR XML RAW | AUTO | EXPLICIT [, XMLDATA] [, ELEMENTS] [, BINARY BASE64]

=======================================================

Out of the RAW / AUTO / EXPLICIT, explicit is the most complex one, but it has many options which are very useful to us. Explicit contains RAW/ AUTO, both. Thus we will pick it at last.

FYI: I have used AdventureWorks database in the examples, which I think is much more better then Northwind database.

The XML RAW is the most simplest of the forms and probably the most fastest of all the options. This format provides the output of each row within a generic node called “row”. All the columns of data are transformed as attributes.

****************************************************************

Select ProductID, [Name] from  Production.Product for XML RAW

[Output]

<row ProductID=”1″ />

<row ProductID=”879″ />

<row ProductID=”712″ />

<row ProductID=”3″ />

<row ProductID=”2″ />

<row ProductID=”877″ />

****************************************************************

The XML AUTO is the next form. It provides more control over XML output then RAW mode. In this case, each row is named after the table name from where that row comes from.

************************************************************

Select ProductID, [Name] from  Production.Product for XML AUTO

[Output]

<Production.Product ProductID=”1″ />

<Production.Product ProductID=”879″ />

<Production.Product ProductID=”712″ />

<Production.Product ProductID=”3″ />

<Production.Product ProductID=”2″ />

<Production.Product ProductID=”877″ />

******************************************************************

There is one more difference between XML RAW and XML AUTO, is the way they handles JOINS. Have a look at the difference :

****************************************************************

XML ROW

SELECT  top 2  PSC.Name AS ProductSubCategoryName, PP.Name AS ProductName

FROM  Production.ProductSubcategory AS PSC INNER JOIN Production.Product AS PP

ON PSC.ProductSubcategoryID = PP.ProductSubcategoryID for XML RAW

[Output]

<row ProductSubCategoryName=”Road Frames” ProductName=”HL Road Frame – Black, 58″ />

<row ProductSubCategoryName=”Road Frames” ProductName=”HL Road Frame – Red, 58″ />

XML AUTO

SELECT  top 2  PSC.Name AS ProductSubCategoryName, PP.Name AS ProductName

FROM  Production.ProductSubcategory AS PSC INNER JOIN Production.Product AS PP

ON PSC.ProductSubcategoryID = PP.ProductSubcategoryID for XML AUTO

[OUTPUT]

<PSC ProductSubCategoryName=”Road Frames”>

<PP ProductName=”HL Road Frame – Black, 58″ />

<PP ProductName=”HL Road Frame – Red, 58″ />

</PSC>

*********************************************************************

It is quite easy to observe that XML RAW format does not show anything in the output that a JOIN has happened. All the columns are rendered as attribute.

In case of XML AUTO, the output is nested within the parent and child elements, which provides a clear view about the relationship between the tables involved in JOIN.

XMLDATA option will output a whole schema. We will pick it later otherwise, it will confuse everybody.

Just for an example :

**********************************************************************

XML AUTO

select ProductID, [Name] from  Production.Product for XML AUTO, XMLDATA

[OUTPUT]

<Schema xmlns=”urn:schemas-microsoft-com:xml-data” xmlns:dt=”urn:schemas-microsoft-com:datatypes”>

<ElementType content=”empty” model=”closed”>

<AttributeType dt:type=”i4″ />

<AttributeType dt:type=”string” />

<attribute />

<attribute />

</ElementType>

</Schema>

<Production.Product xmlns=”x-schema:#Schema4″ ProductID=”1″ />

<Production.Product xmlns=”x-schema:#Schema4″ ProductID=”879″ />

<Production.Product xmlns=”x-schema:#Schema4″ ProductID=”712″ />

<Production.Product xmlns=”x-schema:#Schema4″ ProductID=”3″ />

<Production.Product xmlns=”x-schema:#Schema4″ ProductID=”2″ />

**********************************************************************

We would try for the ELEMENTS option, which makes the out much more understandable then others. Look below for the ELEMENTS option:

*********************************************************************

XML RAW

SELECT  top 2  PSC.Name AS ProductSubCategoryName, PP.Name AS ProductName

FROM  Production.ProductSubcategory AS PSC INNER JOIN Production.Product AS PP

ON PSC.ProductSubcategoryID = PP.ProductSubcategoryID for XML RAW, ELEMENTS

[output]

<row>

<ProductSubCategoryName>Road Frames</ProductSubCategoryName>

<ProductName>HL Road Frame – Black, 58</ProductName>

</row>

<row>

<ProductSubCategoryName>Road Frames</ProductSubCategoryName>

<ProductName>HL Road Frame – Red, 58</ProductName>

</row>

XML AUTO

SELECT  top 2  PSC.Name AS ProductSubCategoryName, PP.Name AS ProductName

FROM  Production.ProductSubcategory AS PSC INNER JOIN Production.Product AS PP

ON PSC.ProductSubcategoryID = PP.ProductSubcategoryID for XML AUTO, ELEMENTS

[OUTPUT]

<PSC>

<ProductSubCategoryName>Road Frames</ProductSubCategoryName>

<PP>

<ProductName>HL Road Frame – Black, 58</ProductName>

</PP>

<PP>

<ProductName>HL Road Frame – Red, 58</ProductName>

</PP>

</PSC>

******************************************************************