Recursive Hierarchical DataSet Techniques
By Peter A. Bromberg, Ph.D.

Recently I saw one of our forum posts where the poster was asking how to iterate over the child folders of a specified folder, and put the hierarchical results into a DataSet. I thought this would be an interesting exercise, so I decided to code up one approach to this fairly common problem. The code I present here illustrates two useful concepts:

1) How to recursively call a method, passing in one or more of the results of the previous method call, and

2) How to add data to a DataTable and set a DataRelation that makes the table "self-referencing".



First, let's breeze through the code, and then I'll comment on it:

using System;
using System.Data;
using System.IO;

class FolderTraverserWithHierarchicalDataSet
{
    public static DataSet ds = null;

    [STAThread]
    static void Main(string[] args)
    {
        if (args.Length < 1)
        {
            Console.WriteLine("Must supply initial path \n(e.g. \"C:\\temp\")");
            return;
        }

        // One table, two string columns: a folder's full path and its parent's path.
        ds = new DataSet();
        DataTable dt = new DataTable("directories");
        dt.Columns.Add(new DataColumn("DirName"));
        dt.Columns.Add(new DataColumn("Parent"));
        dt.AcceptChanges();
        ds.Tables.Add(dt);

        string InitialPath = args[0];
        DirectoryInfo dir = new DirectoryInfo(InitialPath);
        TraverseFolder(dir, InitialPath);

        // Relate the table to itself: DirName is the parent key, Parent the child key.
        DataTable tbl = ds.Tables[0];
        ds.Relations.Add("SelfReferencing", tbl.Columns["DirName"],
            tbl.Columns["Parent"], false);

        Console.WriteLine("Folder hierarchy");
        foreach (DataRow row in tbl.Rows)
        {
            // Start only from top-level folders. DisplayRow recurses into the rest,
            // so starting from every row would print nested folders more than once.
            if (row.GetParentRow("SelfReferencing") == null)
                DisplayRow(row, " ");
        }
        Console.ReadLine();
    }

    private static void TraverseFolder(DirectoryInfo dir, string parentName)
    {
        DirectoryInfo[] directories = dir.GetDirectories();
        foreach (DirectoryInfo newDir in directories)
        {
            // Store the folder's full path together with its parent's path.
            object[] rowData = { newDir.FullName, parentName };
            ds.Tables[0].Rows.Add(rowData);
            TraverseFolder(newDir, newDir.FullName); // recursive call
        }
    }

    static void DisplayRow(DataRow row, string strIndent)
    {
        Console.WriteLine(strIndent + row["DirName"]);
        foreach (DataRow rowChild in row.GetChildRows("SelfReferencing"))
            DisplayRow(rowChild, strIndent + "\t");
    }
}

As you can see, this is a Console app. I don't usually write these, but for a short proof of concept a console app is a quick and dirty way to flesh out and test some code. First we declare a public static DataSet; then, in our Main method, we read a command-line argument containing the target path, which should be a folder on your filesystem that has subfolders.

Then we instantiate our DataSet and add a DataTable with two columns, "DirName" and "Parent". We then create a DirectoryInfo instance on the target folder, which is the starting point for the recursive calls to the TraverseFolder method. For each directory in the child folder collection (obtained via dir.GetDirectories()), we add a new row to our DataTable consisting of the folder's full path and the full path of its parent folder (similar to the "ReportsTo" column in the Northwind database "Employees" table). Note that inside that loop the TraverseFolder method calls itself, passing in the new directory and its FullName, as that directory is now the "Parent" for the next level of recursion.
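On a real filesystem, GetDirectories can throw when it reaches a folder the current user is not allowed to read. The following is a minimal sketch of the same recursion with that case handled. The class and method names (SafeTraversalSketch, CreateDirectoryTable, Traverse) are hypothetical, and passing the table as a parameter instead of using the static DataSet is my own assumption, not something the original sample does.

```csharp
using System;
using System.Data;
using System.IO;

static class SafeTraversalSketch
{
    // Hypothetical helper: builds the same two-column table the article uses.
    public static DataTable CreateDirectoryTable()
    {
        DataTable dt = new DataTable("directories");
        dt.Columns.Add("DirName", typeof(string));
        dt.Columns.Add("Parent", typeof(string));
        return dt;
    }

    // Same recursion as TraverseFolder, but the table is passed in explicitly
    // and folders that cannot be listed are skipped instead of aborting the walk.
    public static void Traverse(DirectoryInfo dir, string parentName, DataTable table)
    {
        DirectoryInfo[] children;
        try
        {
            children = dir.GetDirectories();
        }
        catch (UnauthorizedAccessException)
        {
            return; // no permission to list this folder
        }
        catch (DirectoryNotFoundException)
        {
            return; // folder disappeared before it could be listed
        }

        foreach (DirectoryInfo child in children)
        {
            table.Rows.Add(child.FullName, parentName);
            Traverse(child, child.FullName, table); // recursive call
        }
    }
}
```

A caller would create the table once, call Traverse with the starting DirectoryInfo and its path, and then add the table to a DataSet before setting up the relation.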

What we end up with is a DataTable that holds all the folder names in the hierarchy, each row also having the name of its Parent folder.

Next, we add a self-referencing DataRelation for our DataTable with this call:

ds.Relations.Add("SelfReferencing", tbl.Columns["DirName"], tbl.Columns["Parent"], false);

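This DataRelation is the magic that gives us our self-referencing hierarchical DataSet all in a single table: "DirName" acts as the parent key and "Parent" as the child key, so a row's children are the rows whose "Parent" value equals its "DirName". The final argument, false, tells ADO.NET not to create the unique and foreign-key constraints it would otherwise add along with the relation. That matters here because the top-level rows name the starting folder as their parent, and the starting folder has no row of its own, so a foreign-key constraint could not be satisfied.

One of the main uses of the DataRelation is to locate related data in different DataTable objects. This functionality is available through the DataRow object’s GetChildRows, GetParentRow, and GetParentRows methods. When you call any of these methods on the DataRow object, you specify a DataRelation (or its name) as a parameter of the method. So in this case, we simply call the GetChildRows method of each DataRow and supply the name of the DataRelation object that we set, and the GetChildRows method returns the related data as an array of DataRow objects.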
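To see the relation at work without touching the filesystem, here is a small sketch that fills the table by hand. The class name SelfReferenceSketch, its methods, and the three sample paths are made up for illustration; only the table layout and the Relations.Add call come from the article.

```csharp
using System;
using System.Data;

static class SelfReferenceSketch
{
    // Builds a tiny hand-filled table and relates it to itself.
    public static DataSet Build()
    {
        DataSet ds = new DataSet();
        DataTable dt = ds.Tables.Add("directories");
        dt.Columns.Add("DirName", typeof(string));
        dt.Columns.Add("Parent", typeof(string));

        // Made-up sample data; C:\temp itself has no row, as in the article.
        dt.Rows.Add(@"C:\temp\a", @"C:\temp");
        dt.Rows.Add(@"C:\temp\a\b", @"C:\temp\a");
        dt.Rows.Add(@"C:\temp\a\c", @"C:\temp\a");

        // false = do not create unique / foreign-key constraints.
        ds.Relations.Add("SelfReferencing",
            dt.Columns["DirName"], dt.Columns["Parent"], false);
        return ds;
    }

    // Number of direct children of the row at the given index.
    public static int CountChildren(DataTable table, int rowIndex)
    {
        return table.Rows[rowIndex].GetChildRows("SelfReferencing").Length;
    }

    // Path of the parent row, or null when the parent is not in the table.
    public static string ParentPath(DataTable table, int rowIndex)
    {
        DataRow parent = table.Rows[rowIndex].GetParentRow("SelfReferencing");
        return parent == null ? null : (string)parent["DirName"];
    }
}
```

With this data, the first row has two child rows, the second row's parent is the first row, and the first row has no parent row at all because its "Parent" value does not appear in "DirName".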

Finally, we show the hierarchy with the helper method "DisplayRow", which should be self-explanatory if you've been able to follow up to this point: it prints the row, then calls itself for each row returned by GetChildRows, adding one more tab of indentation at each level. The full solution is downloadable for your viewing pleasure at the link below. Enjoy!
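If you would rather have the listing as a string than written straight to the console (for a log file or a test, say), the same recursion can be sketched as below. The names HierarchyPrinterSketch, Render and Append are hypothetical. This version starts only from rows that have no parent row in the table, so each folder appears once, and it indents with two spaces per level instead of a tab, which is an arbitrary choice of mine.

```csharp
using System;
using System.Data;
using System.Text;

static class HierarchyPrinterSketch
{
    // Returns the whole hierarchy as indented text, one folder per line.
    public static string Render(DataTable table, string relationName)
    {
        StringBuilder sb = new StringBuilder();
        foreach (DataRow row in table.Rows)
        {
            // A row with no parent row in the table is a top-level folder.
            if (row.GetParentRow(relationName) == null)
                Append(sb, row, relationName, 0);
        }
        return sb.ToString();
    }

    private static void Append(StringBuilder sb, DataRow row,
                               string relationName, int depth)
    {
        sb.Append(' ', depth * 2);                      // two spaces per level
        sb.AppendLine(Convert.ToString(row["DirName"]));
        foreach (DataRow child in row.GetChildRows(relationName))
            Append(sb, child, relationName, depth + 1); // recursive call
    }
}
```

Calling HierarchyPrinterSketch.Render(ds.Tables[0], "SelfReferencing") after the relation has been added would return the full listing.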

Download the Source Code that accompanies this article

 

 

Peter Bromberg is a C# MVP, MCP, and .NET consultant who has worked in the banking and financial industry for 20 years. He has architected and developed web-based corporate distributed application solutions since 1995, and focuses exclusively on the .NET Platform. Pete's samples at GotDotNet.com have been downloaded over 41,000 times. You can read Peter's UnBlog here.