ASP.NET Fractured Latin Phrases: Self-decompressing Database in an Assembly with Web Site

By Peter Bromberg

For quite some time I have taken databases of items that would normally be read-only (ZIP codes, quotations, etc.), compressed them with various data compression algorithms, and stored them in a class library assembly as an embedded resource. At runtime, the resource is read out, decompressed, and used to construct a generic Dictionary or DataTable for lookups.

This can be a very efficient way to store data that an application needs, and it removes any external dependency on a database. Here, I show how to store an XML document representing a DataTable of Latin phrases and their English meanings (some of them are very funny) and then use the resulting class library to build a whole web site out of it.

The first thing we need is a small Windows Forms app that lets us select a file, read it into a byte array, compress it with one of the many available compression classes, and save the result to the file system. We can then bring the compressed file into our class library as an embedded resource.

Next, we need to be able to read the compressed data out of the assembly at runtime, decompress it, and build whatever collection class or DataTable / DataSet we will use for our lookups.

The final stage is to build some sort of consumer for this arrangement. In this case, I built a one-page ASP.NET site.

The Forms app is really simple. Here is the code:

using System;
using System.IO;
using System.Windows.Forms;
using BrainTechLLC; // MiniLzo port: Compress() / Decompress() extension methods on byte[]

namespace DataCompressor
{
    public partial class Form1 : Form
    {
        private byte[] sourceFileBytes;
        private byte[] compressedFileContents;

        public Form1()
        {
            InitializeComponent();
        }

        private void btnLoad_Click(object sender, EventArgs e)
        {
            // Let the user pick the source file; do nothing if the dialog is cancelled.
            if (openFileDialog1.ShowDialog() != DialogResult.OK)
                return;

            sourceFileBytes = File.ReadAllBytes(openFileDialog1.FileName);
            compressedFileContents = sourceFileBytes.Compress();

            // Ask where to save the compressed output.
            if (saveFileDialog1.ShowDialog() == DialogResult.OK)
            {
                File.WriteAllBytes(saveFileDialog1.FileName, compressedFileContents);
            }
        }
    }
}

The BrainTechLLC namespace is a port of the MiniLzo compression algorithm that provides compression and decompression as extension methods on byte arrays (byte[]). MiniLzo provides one of the best combinations of good compression and very fast decompression speed. In the case of my LatinPhrases.xml DataSet, which is 203 KB, the resulting LatinPhrases.dat compressed file is just 89 KB. Of course, you don't need to save a DataSet as XML; you can use a pipe-delimited, CSV-style string, which would make the file even smaller.
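The pipe-delimited alternative mentioned above could look roughly like the sketch below. It is not the code from the download: GZipStream from the framework stands in for the MiniLzo port, and the class names GZipBytes and PhraseFile, the "phrase|meaning" line layout, and the two sample phrases are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical stand-ins for the MiniLzo extension methods, built on GZipStream.
public static class GZipBytes
{
    public static byte[] Compress(this byte[] data)
    {
        using (MemoryStream output = new MemoryStream())
        {
            // Dispose the GZipStream before reading the buffer so all data is flushed.
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    public static byte[] Decompress(this byte[] data)
    {
        using (MemoryStream input = new MemoryStream(data))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            gzip.CopyTo(output); // Stream.CopyTo requires .NET 4.0 or later
            return output.ToArray();
        }
    }
}

public static class PhraseFile
{
    // Parses compressed "phrase|meaning" lines into a case-insensitive lookup.
    // Lines that do not have exactly two fields are skipped.
    public static Dictionary<string, string> Parse(byte[] compressed)
    {
        string text = Encoding.UTF8.GetString(compressed.Decompress());
        var lookup = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in text.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries))
        {
            string[] parts = line.TrimEnd('\r').Split('|');
            if (parts.Length == 2)
                lookup[parts[0]] = parts[1];
        }
        return lookup;
    }
}

public static class Program
{
    public static void Main()
    {
        // Sample data for illustration only; not taken from LatinPhrases.xml.
        byte[] raw = Encoding.UTF8.GetBytes(
            "carpe diem|seize the day\nveni, vidi, vici|I came, I saw, I conquered\n");
        byte[] packed = raw.Compress();
        Dictionary<string, string> lookup = PhraseFile.Parse(packed);
        Console.WriteLine(lookup["Carpe Diem"]); // prints "seize the day"
    }
}
```

A Dictionary only supports exact-key lookups, so this shape suits "what does this phrase mean" queries; the substring searches used later in the article are easier to express against a DataTable.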

So after embedding the LatinPhrases.dat binary compressed file in the assembly with a Build Action of Embedded Resource, here is how we handle "rehydrating" our object:

using System;
using System.Data;
using System.IO;
using System.Reflection;
using System.Text;
using BrainTechLLC;

namespace LatinPhraser
{
    public class Phraser
    {
        public DataSet DsLatinPhrases;

        public Phraser()
        {
            DsLatinPhrases = new DataSet();
            Assembly asm = Assembly.GetExecutingAssembly();
            // Resource name: the project's default namespace plus the file name.
            string strName = "LatinPhraser.LatinPhrases.dat";
            Stream stm = asm.GetManifestResourceStream(strName);
            byte[] b = new byte[stm.Length];
            stm.Read(b, 0, (int)stm.Length);
            byte[] decomp = b.Decompress();
            MemoryStream ms = new MemoryStream(decomp);
            DsLatinPhrases.ReadXml(ms);
        }

        // Latin phrases that contain searchTerm.
        public DataTable SearchForPhrase(string searchTerm)
        {
            return Search("phrase LIKE '%" + EscapeLikeValue(searchTerm) + "%'");
        }

        // Latin phrases that start with searchTerm (used for the A to Z menu).
        public DataTable SearchPhraseByLetter(string searchTerm)
        {
            return Search("phrase LIKE '" + EscapeLikeValue(searchTerm) + "%'");
        }

        // Phrases whose English meaning contains searchTerm.
        public DataTable SearchForMeaning(string searchTerm)
        {
            return Search("meaning LIKE '%" + EscapeLikeValue(searchTerm) + "%'");
        }

        // Runs the filter and copies the matching rows into a new two-column table.
        private DataTable Search(string filter)
        {
            DataTable dt = DsLatinPhrases.Tables[1];
            DataRow[] rows = dt.Select(filter);
            DataTable dt2 = new DataTable();
            dt2.Columns.Add("phrase");
            dt2.Columns.Add("meaning");
            foreach (DataRow r in rows)
            {
                DataRow nr = dt2.NewRow();
                nr[0] = r[0];
                nr[1] = r[1];
                dt2.Rows.Add(nr);
            }
            return dt2;
        }

        // Escapes quotes and LIKE wildcards so that user input (for example
        // an apostrophe) cannot break the filter expression.
        private static string EscapeLikeValue(string value)
        {
            StringBuilder sb = new StringBuilder(value.Length);
            foreach (char c in value)
            {
                switch (c)
                {
                    case '[':
                    case ']':
                    case '%':
                    case '*':
                        sb.Append('[').Append(c).Append(']');
                        break;
                    case '\'':
                        sb.Append("''");
                        break;
                    default:
                        sb.Append(c);
                        break;
                }
            }
            return sb.ToString();
        }
    }
}

As you can see above, the pertinent code is in the constructor:

public Phraser()
{
    DsLatinPhrases = new DataSet();
    Assembly asm = Assembly.GetExecutingAssembly();
    string strName = "LatinPhraser.LatinPhrases.dat";
    Stream stm = asm.GetManifestResourceStream(strName);
    byte[] b = new byte[stm.Length];
    stm.Read(b, 0, (int)stm.Length);
    byte[] decomp = b.Decompress();
    MemoryStream ms = new MemoryStream(decomp);
    DsLatinPhrases.ReadXml(ms);
}

We create a new DataSet, get the manifest resource stream for the embedded resource, and read it into a byte array. Then we decompress it into a new byte array via the MiniLzo extension method. Finally, we wrap the result in a MemoryStream, which lets us use the ReadXml method of the DataSet class.
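One thing to watch for in this step: GetManifestResourceStream returns null when the name does not match an embedded resource exactly, which surfaces as a NullReferenceException on the next line. The sketch below is a slightly more defensive way to read the bytes. It is not part of the download; the EmbeddedResource class and its ReadAllBytes method are hypothetical names introduced here.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class EmbeddedResource
{
    // Returns the raw bytes of an embedded resource. If the name is wrong,
    // throws an exception that lists the resource names the assembly does contain.
    public static byte[] ReadAllBytes(Assembly asm, string resourceName)
    {
        using (Stream stm = asm.GetManifestResourceStream(resourceName))
        {
            if (stm == null)
            {
                string available = string.Join(", ", asm.GetManifestResourceNames());
                throw new FileNotFoundException(
                    "Embedded resource '" + resourceName + "' not found. Available: " + available);
            }

            using (MemoryStream buffer = new MemoryStream())
            {
                stm.CopyTo(buffer); // reads until the stream is exhausted (.NET 4.0 or later)
                return buffer.ToArray();
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        try
        {
            // This demo assembly has no such resource, so the helper throws.
            EmbeddedResource.ReadAllBytes(
                Assembly.GetExecutingAssembly(), "LatinPhraser.LatinPhrases.dat");
        }
        catch (FileNotFoundException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

Inside the Phraser constructor, the returned array would simply take the place of the byte array that is passed to Decompress().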

The rest of the code consists of simple SQL-like search methods that use the DataTable's Select method. And that's it!
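For readers who have not used Select with a LIKE filter, here is a small self-contained sketch of how the prefix and substring patterns behave. The table and its three rows are sample data invented for this example, not the contents of LatinPhrases.xml, and SelectDemo is a hypothetical class name.

```csharp
using System;
using System.Data;

public static class SelectDemo
{
    // Builds a small in-memory table with the same two column names the article uses.
    public static DataTable BuildSample()
    {
        DataTable dt = new DataTable("phrases");
        dt.Columns.Add("phrase");
        dt.Columns.Add("meaning");
        dt.Rows.Add("carpe diem", "seize the day");
        dt.Rows.Add("cave canem", "beware of the dog");
        dt.Rows.Add("veni, vidi, vici", "I came, I saw, I conquered");
        return dt;
    }

    public static void Main()
    {
        DataTable dt = BuildSample();

        // Trailing wildcard only: "starts with". String comparisons are
        // case-insensitive unless DataTable.CaseSensitive is set to true.
        DataRow[] startsWithC = dt.Select("phrase LIKE 'C%'");

        // Wildcards on both sides: "contains".
        DataRow[] mentionsDog = dt.Select("meaning LIKE '%dog%'");

        Console.WriteLine(startsWithC.Length); // 2
        Console.WriteLine(mentionsDog.Length); // 1
    }
}
```

Because the filter is a string expression, a search term containing an apostrophe or a wildcard character has to be escaped before it is concatenated into the filter.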


I won't go into the details of the ASP.NET project, since it's available in the downloadable solution for you to play with. Basically, it's just a constructed menu that gets the phrases by letter (from A to Z), along with a search box where you can search by either Latin phrase or English meaning. You can pretty much see what's going on with the page by looking at the code-behind:

using System;
using System.Data;

namespace LatinPhraserWeb
{
    public partial class _Default : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            if (Request.QueryString["let"] != null)
            {
                // A letter was chosen from the A to Z menu: list phrases starting with it.
                string letter = Request.QueryString["let"];
                LatinPhraser.Phraser p = new LatinPhraser.Phraser();
                DataTable dt = p.SearchPhraseByLetter(letter);
                this.DataList1.DataSource = dt;
                DataList1.DataBind();
            }
            else if (!IsPostBack)
            {
                // First visit with no letter selected: show a default search.
                LatinPhraser.Phraser p = new LatinPhraser.Phraser();
                DataTable dt = p.SearchForMeaning("God");
                this.DataList1.DataSource = dt;
                DataList1.DataBind();
            }
        }

        protected void Button1_Click(object sender, EventArgs e)
        {
            // The drop-down decides whether to search the Latin phrase or the English meaning.
            string s = this.DropDownList1.SelectedValue;
            DataTable dt;
            LatinPhraser.Phraser p = new LatinPhraser.Phraser();
            if (s.IndexOf("Phrase") > -1)
                dt = p.SearchForPhrase(this.txtSearch.Text);
            else
                dt = p.SearchForMeaning(this.txtSearch.Text);
            this.DataList1.DataSource = dt;
            DataList1.DataBind();
        }
    }
}
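Note that the page creates a new Phraser on every request, so the resource is decompressed and parsed each time. One possible way to pay that cost only once per application domain is to hold a single instance behind Lazy<T>. The sketch below is an assumption about how that could be done, not code from the download: PhraserCache is a hypothetical name, and the Phraser class here is a stand-in that only counts how often its constructor runs.

```csharp
using System;
using System.Threading;

// Stand-in for LatinPhraser.Phraser; counts how often the expensive constructor runs.
public class Phraser
{
    public static int Constructed;

    public Phraser()
    {
        Interlocked.Increment(ref Constructed);
    }
}

public static class PhraserCache
{
    // The factory runs at most once, even if several requests arrive at the same time.
    private static readonly Lazy<Phraser> instance =
        new Lazy<Phraser>(() => new Phraser(), LazyThreadSafetyMode.ExecutionAndPublication);

    public static Phraser Instance
    {
        get { return instance.Value; }
    }
}

public static class Program
{
    public static void Main()
    {
        Phraser a = PhraserCache.Instance;
        Phraser b = PhraserCache.Instance;
        Console.WriteLine(ReferenceEquals(a, b)); // True
        Console.WriteLine(Phraser.Constructed);   // 1
    }
}
```

Whether one shared instance is appropriate depends on how the DataSet is used. This sketch assumes the data is only ever read after loading; if concurrent searches against the shared tables turn out to be a problem, access would need to be synchronized.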

You can download the complete Visual Studio 2010 Solution, which includes the Windows Forms compression utility. I hope this gives you some interesting ideas on how you can use compressed resources in your assemblies.


Biography - Peter Bromberg
Peter Bromberg is a C# MVP, MCP, and .NET expert who has worked in banking, finance, and telephony for over 20 years. Pete focuses exclusively on the .NET Platform, and currently develops SOA and other .NET applications for a Fortune 500 clientele. Peter enjoys producing digital photo collage with Maya, playing jazz flute, the beach, and fine wines. You can view Peter's UnBlog and IttyUrl sites.
Article Discussion: ASP.NET Fractured Latin Phrases: Self-decompressing Database in an Assembly with Web Site
Peter Bromberg posted at Thursday, May 12, 2011 12:23 PM
Understanding fully the intent of the article
Robbe Morris replied to Peter Bromberg at Saturday, May 14, 2011 7:05 PM
and what you are demonstrating. I guess my question would be: why would anyone want this brought in as a resource in the first place? What advantage would there be versus just putting the file on the file system, perhaps named differently or in a folder that no one knew existed? Why take the overhead of decompressing data when initially loading it into the DataSet?

Should changes in the data be needed, you couldn't update the ASP.NET application on the fly without at least triggering an app pool recycle and the potential of interrupted sessions in a live application.

I just can't think of any scenario where I'd want to restrict xref data from being updated without recycling the app pool and interrupting a live web site.