Creating a Generic GetXmlReader Data Access Method For XML Explicit
By Jon Wojtowicz

  Download Source Code
Microsoft added the powerful FOR XML EXPLICIT statement to SQL Server 2000. This is a convenient means of getting XML data from the database. But what if you are not using SQL Sever 2000? This was a problem I encountered when on an assignment where we wanted to reproduce the Microsoft Data Access Application Block for OleDb. The person that originally wrote the method simply obtained a DataSet and used the GetXml method to create the XmlReader. This was lacking as it could not produce nested XML.
After reading Robbe D. Morris's article Demystify SQL Server FOR XML EXPLICIT I realized that the query was a select query providing standard, tokenized column names that described the structure. At that point I decided to implement this behavior in code.
Query Structure
I needed the query to return the column names with the structure of Node Name!Tag Index!Attribute Name!Directive. I created a query that followed the UNION ALL pattern without using the FOR XML EXPLICIT statement as in Listing 1 to return the order information from the Northwind database. This query returned the data as in Listing 2. The target XML output is given in Listing 3.
Listing 1.
 
   SELECT 1                    as Tag, 
         NULL                as Parent,
         C.CustomerID        as [C!1!customerid],
         C.ContactName       as [C!1],
         C.PostalCode        as [C!1!postalcode],
         NULL                as [O!2!orderid],
         NULL                as [O!2],
         NULL                as [O!2!employeeid!element],
         NULL                as [OD!3!orderdetailid],
         NULL                as [OD!3!productid]
FROM Customers as C
UNION ALL
SELECT 
       2 as tag, 
       1 as parent, 
       C.CustomerID as customerid,
       C.ContactName,
       C.PostalCode,
       O.OrderID,
       O.ShipAddress,
       O.EmployeeID,
       null,
       null
    FROM Customers C, Orders O
    WHERE C.CustomerID = O.CustomerID
UNION  ALL
SELECT 
       3 as tag, 
       2 as parent, 
       C.CustomerID,
       C.ContactName,
       C.PostalCode,
       O.OrderID,
       O.ShipAddress,
       O.EmployeeID,
       OD.OrderID as OrderDetailID,
       OD.ProductID as ProductID
    FROM Customers C, Orders O, [Order Details] OD
    WHERE (C.CustomerID = O.CustomerID)
      AND (O.OrderID = OD.OrderID)
     order by 3,4,5,6,7,8,9,10
 
   


Listing 2.
 
Tag         Parent      C!1!customerid   C!1                      C!1!postalcode    O!2!orderid   O!2    
----------- ----------- --------------       -------------------- -------------------  -------------  --------------------  
1              NULL       ALFKI                Maria Anders      12209                 NULL           NULL            
2              1             ALFKI                Maria Anders       12209                10643          Obere Str. 57 
3              2             ALFKI                Maria Anders       12209                10643          Obere Str. 57 
3              2             ALFKI                Maria Anders       12209                10643          Obere Str. 57 

wrapped...

Tag         O!2!employeeid!element  OD!3!orderdetailid  OD!3!productid 
-----------  ---------------------------- ---------------------- -------------- 
1              NULL                             NULL                      NULL
2              6                                    NULL                      NULL
3              6                                    10643                     28
3              6                                    10643                     39
  
   
Listing 3.
 
<C customerid="ALFKI" postalcode="12209">
    Maria Anders
    <O orderid="10643" employeeid="6">
        Obere Str. 57
        <OD orderdetailid="10643" productid="28" />
        <OD orderdetailid="10643" productid="39" />
        <OD orderdetailid="10643" productid="46" />
    </O>
    <O orderid="10692" employeeid="4">
        Obere Str. 57
        <OD orderdetailid="10692" productid="63" />
    </O>
    <O orderid="10702" employeeid="4">
        Obere Str. 57<OD orderdetailid="10702" productid="3" />
        <OD orderdetailid="10702" productid="76" />
    </O>
    <O orderid="10835" employeeid="1">
        Obere Str. 57
       <OD orderdetailid="10835" productid="59" />
       <OD orderdetailid="10835" productid="77" />
    </O>
    <O orderid="10952" employeeid="1">
        Obere Str. 57
        <OD orderdetailid="10952" productid="6" />
        <OD orderdetailid="10952" productid="28" />
    </O>
    <O orderid="11011" employeeid="3">
        Obere Str. 57
        <OD orderdetailid="11011" productid="58" />
        <OD orderdetailid="11011" productid="71" />
    </O>
</C>
   
The Approach
The data was returned in a DataReader since this would be the fatest way of getting to the data. The data was then moved into managed space as an ArrayList of object arrays. I also captured the maximum tag depth as I would need this for later array lengths.
The next step was obtaining and parsing the column names. Since the name and directive are optional I had to take this into account when parsing the column names with a default mapping directive of attribute..
The data was then grouped by tag number and a hash value was computed for the key fields of the row. The hash value for the maximun tag number was not computed since it would not have any child elements. The parent hash for each row was also computed. In this case the rows with tag number 1 would not have a parent as they would be directly under the root element.
Using the hash values computed earlier, the parent for each row was mapped. Once the parent-child relations were mapped, the XML could be built. I chose to use a stack and enumerators since it proved to be faster than recursion. I also used an XmlWriter since it would alert me to any errors in building the XML.
A few differences exisit between this implementation and the SQL Server FOR XML EXPLICIT. First is that this method returns a complete XML document. This was done for convenience. It also supports returning multiple result sets and having them parsed into the same XML document.
The sample application provided with the class is to benchmark the performance. It will also show you how to use the class in your own code. This will also allow you to incorporate the GetXmlReader method into your own data access class. This can then be used with any data source that supports creating the data schema expected by this method.
Performance
The FOR XML EXPLICIT from SQL Server is by far the fastest approach. While the method I used is not as fast as SQL Server it is about 25% faster than using a multi-tabled DataSet with data relations to created nested XML.
Last Update: 6/18/2005