|
At EggheadCafe.com, we have resources that are added on a daily basis,
both by member visitors and by the staff. These resources are searchable
and are broken down by categories such as Hotlinks, Articles, Tips &
Tricks, etc. Each evening, a script kicked off by the NT Scheduler service
at a time when traffic is normally low scours our resource database for
the most recent items in each category. It then assembles a 100% HTML
include page and caches it on the server's filesystem. This file is picked
up as an ASP include when you view our home page at http://www.eggheadcafe.com.
That block you see in the middle of the page with the blue section heading
graphics and lists of links comes from this HTML include file. This is
an excellent low-bandwidth way to generate dynamic content that's fresh
daily while keeping your server resource usage to a minimum.
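
To make the caching idea concrete, here is a minimal sketch of a nightly job that renders the latest items into a static HTML fragment and writes it to disk. It is written in Python rather than the VBScript our scheduled script uses, purely for illustration. The function names, the shape of the input data and the file name are all assumptions, not our production code.

```python
import html
import os
import tempfile


def build_include(items_by_category):
    """Render {category: [(title, url), ...]} as a static HTML fragment.

    The input shape is hypothetical; a real job would fill it from the
    resource database.
    """
    parts = []
    for category, items in items_by_category.items():
        parts.append("<h3>%s</h3>" % html.escape(category))
        parts.append("<ul>")
        for title, url in items:
            parts.append('<li><a href="%s">%s</a></li>'
                         % (html.escape(url, quote=True), html.escape(title)))
        parts.append("</ul>")
    return "\n".join(parts)


def write_include(path, fragment):
    """Write to a temp file in the same directory, then swap it into place,
    so a page request never sees a half-written include file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(fragment)
    os.replace(tmp_path, path)
```

The page that serves visitors then only has to include the finished file. All of the database work happens once, when the scheduler runs the job.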

Some time ago I had engineered a couple of client-side functions that
zoom out to Moreover.com, grab the XML for specific newsfeed categories
that are of interest to the developer community that is our primary
audience, and apply an XSLT transform to render this as a scrolling news
marquee. I thought, wouldn't it be cool if we could have some of our own
custom newsfeed links to our own featured resources that we could COMBINE
with the Moreover feed, and have all of it in a scrolling marquee? For
example, Robbe recently released his TurnkeyTools.com service, and wanted
to have a way to "feature it" in an unobtrusive way for a period of time.

I went through some development "learning exercises" that revolved around
the difficulties involved in loading Moreover's custom feed XML, which
has a DTD declaration and an ISO-8859-1 encoding declaration. The MSXML
parser was not particularly fond of either one if you were attempting to
do your stuff on the server side, load a second file and do a transform.
When I say "not particularly fond of", what I mean is simply that if you
tried to load this either with XMLHTTP or with the MSXML DOMDocument
"Load" method, the parser just PUKED. (And frankly, Microsoft, the error
messages weren't very helpful either.)

Not to be daunted, I eventually came up with an overall solution that's
not only much more elegant and extensible, but also conforms better to
our philosophy of conserving server resources by caching data that may
change, but needs to be served up continuously, and FAST.

In a nutshell, what I did was create two VBScript ASP functions in a new
file that is now included in the scheduled script that creates the content
page I described above. The script now SAVES the XML file from Moreover
AND the 100% HTML result of the XSLT transform that creates the scrolling
news marquee, all to the filesystem. So it's now all part of that same
HTML include file: the CPU is not tied up doing XML loads and XSL
transformations every time the page is requested by a new visitor. And
we still get fresh new content every single day. On top of that, I created
a SECOND XML file with a schema similar to the feed from Moreover that
contains the custom EggheadCafe.com items that we want to display, and
MERGED that with the Moreover XML using DOM methods before applying the
transform to create the final HTML. The result? An easy-to-use,
server-friendly and extensible method to create custom dynamic newsfeed
displays containing both outside and proprietary customized content.
Flexibility, code re-use, portability, and speed all wrapped up in one
concept.

Let's take a look at the script and see how this was done:
<%
Function GetMoreoverXML(ByVal sCategory)
    Dim sChoice, sSource, sDest, oHTTP, body8209, sOut, oTS, oFSO, i

    ' Default to the Top Internet category when none is given
    If sCategory = "" Then sCategory = "topinternet"
    sSource = "http://p.moreover.com/cgi-local/page?index_" & sCategory & "+xml"
    sDest = Application("FileRoot") & "\moreover.xml"

    Set oHTTP = Server.CreateObject("Microsoft.XMLHTTP")
    Set oFSO = Server.CreateObject("Scripting.FileSystemObject")

    ' Fetch the feed synchronously and keep the raw response bytes
    oHTTP.open "GET", sSource, False
    oHTTP.send
    body8209 = oHTTP.responseBody
    Set oHTTP = Nothing

    ' Decode the byte array into a string, one byte per character
    sOut = ""
    For i = 0 To UBound(body8209)
        sOut = sOut & ChrW(AscW(Chr(AscB(MidB(body8209, i + 1, 1)))))
    Next

    ' Strip the encoding declaration and the DOCTYPE that the parser chokes on
    sOut = Replace(sOut, "encoding=""iso-8859-1""", "")
    sOut = Replace(sOut, "<!DOCTYPE moreovernews SYSTEM ""http://p.moreover.com/xml_dtds/moreovernews.dtd"">", "")

    ' Cache the cleaned XML on the filesystem
    Set oTS = oFSO.CreateTextFile(sDest, True)
    oTS.Write sOut
    oTS.Close
    Set oTS = Nothing
    Set oFSO = Nothing
    GetMoreoverXML = True
End Function

Function showNews()
    On Error Resume Next
    Dim xmlDoc
    Dim xmlDoc2
    Dim xsl
    Dim sChoice
    Dim tempNodes
    Dim docFragment

    ' xmlDoc2 holds the cached Moreover feed, xmlDoc holds our custom items
    Set xmlDoc2 = Server.CreateObject("MSXML2.DOMDocument.3.0")
    xmlDoc2.async = False
    xmlDoc2.preserveWhiteSpace = False
    xmlDoc2.validateOnParse = False
    Set xmlDoc = Server.CreateObject("MSXML2.DOMDocument.3.0")
    xmlDoc.async = False
    xmlDoc2.load Application("FileRoot") & "\moreover.xml"
    xmlDoc.load Application("FileRoot") & "\article.xml"

    ' Append every Moreover <article> node to the custom document
    Set tempNodes = xmlDoc2.selectNodes("//article")
    For i = 0 To tempNodes.length - 1
        Set docFragment = xmlDoc.createDocumentFragment
        docFragment.appendChild(xmlDoc.createElement("article"))
        Set docFragment = tempNodes(i)
        xmlDoc.documentElement.appendChild(docFragment)
    Next

    ' Transform the merged document to HTML
    Set xsl = Server.CreateObject("MSXML2.DOMDocument.3.0")
    xsl.async = False
    xsl.load Application("FileRoot") & "\business2.xsl"
    showNews = xmlDoc.transformNode(xsl)
    Err.Clear

    ' Wrap the transform output in the table and MARQUEE markup
    sMarqy = "<TABLE CELLSPACING=0 CELLPADDING=0 BORDERCOLOR=#858BFD BORDER=0 ALIGN=CENTER WIDTH=550 class=contentPurple>"
    sMarqy = sMarqy & "<TR><TD ALIGN=CENTER><IMG SRC=/images/newheadlines.jpg border=0></TD> </TR><tr><td><br></td></tr>" & vbCrLf
    sMarqy = sMarqy & "<TR><TD align=left>"
    sMarqy = sMarqy & " <MARQUEE BGCOLOR=#FFFFFF BORDERCOLOR=#858BFD BEHAVIOR=SCROLL " & _
        "DATAFORMATAS=HTML DIRECTION=UP HEIGHT=80 ID=News SCROLLAMOUNT=1 SCROLLDELAY=100 " & _
        "TITLE=Headlines WIDTH=536 HSPACE=0>" & showNews & "</marquee>"
    sMarqy = sMarqy & "</TD></TR></TABLE>"
    showNews = sMarqy
End Function
%>

Note that in the first function, "GetMoreoverXML", we can pass in an
optional category name that is fitted into the URL to get their
autogenerated XML stream. If this parameter is passed as an empty string
("") it defaults to "topinternet", which is the listing of news items
for Top Internet related stories. Note we set an sSource and an sDest
variable, one for the URL to get the feed and the second telling where
to save the resulting XML document as a file. Next we create an instance
of the FileSystemObject, which we'll use to save our file. We use XMLHTTP
to "GET" our Moreover file, and we obtain our document result from the
XMLHTTP responseBody property. Normally I would simply load this into
an MSXML2.DOMDocument object and do a SAVE, but you remember I described
some problems above regarding the fact that Moreover's XML has both a
DTD (Document Type Definition) reference and an ISO-8859-1 encoding
declaration:
<?xml version="1.0" encoding="iso-8859-1" ?>
<!DOCTYPE moreovernews (View Source for full doctype...)>

The above were causing the parser to choke on load, so rather than going
through the newsgroups and docs looking for a possible fix, I reasoned
that since they were not required for my purposes, and since we are saving
this stuff to the filesystem anyway, we could just strip them out before
we save it. Hence the following code:
sOut = Replace(sOut, "encoding=""iso-8859-1""", "")
sOut = Replace(sOut, "<!DOCTYPE moreovernews SYSTEM ""http://p.moreover.com/xml_dtds/moreovernews.dtd"">", "")
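
The same stripping step can be sketched with regular expressions, which tolerate small variations in spacing or quoting that an exact string Replace would miss. This is a Python illustration under stated assumptions, not the code our script runs: the function name is hypothetical, and the DOCTYPE pattern only handles a declaration without an internal subset (no square brackets), which is the form Moreover sends.

```python
import re

# The XML declaration at the very start of the document
XML_DECL = re.compile(r'^\s*<\?xml[^>]*\?>')
# An encoding pseudo-attribute, with either quote style
ENCODING_ATTR = re.compile(r'\s+encoding\s*=\s*("[^"]*"|\'[^\']*\')')
# A DOCTYPE without an internal subset
DOCTYPE = re.compile(r'<!DOCTYPE\s[^>\[]*>')


def strip_declarations(xml_text):
    """Remove the encoding declaration and the DOCTYPE from feed text."""
    def drop_encoding(match):
        # Only touch the declaration itself, never the document body
        return ENCODING_ATTR.sub("", match.group(0))

    xml_text = XML_DECL.sub(drop_encoding, xml_text, count=1)
    return DOCTYPE.sub("", xml_text, count=1)
```

Restricting the encoding pattern to the XML declaration means a headline that happens to contain the text encoding="..." is left alone.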

You'll also note I have the following code:
For i = 0 To UBound(body8209)
    sOut = sOut & ChrW(AscW(Chr(AscB(MidB(body8209, i + 1, 1)))))
Next

Why do we do this? Well, the responseBody property of the XMLHTTP object
returns the result as an array of unsigned bytes, namely a SAFEARRAY of
type VT_ARRAY | VT_UI1. This contains the raw undecoded bytes, and we
need to decode it in order to be able to write the results to the
filesystem as text. Reading from the inside of the expression out: the
MidB function works on byte data, so instead of specifying a position
and length in characters, its arguments specify them in bytes, and here
it pulls out the single byte at position i+1. AscB returns the numeric
value of that byte, Chr turns that value into the corresponding character,
and AscW and ChrW take the Unicode character code of that character and
convert it back into a Unicode (wide) character. We add that character
to the result string until we are done with our loop through the byte
array.
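
For comparison, here is the same one-byte-to-one-character idea sketched in Python. The function names are hypothetical. One simplification to be aware of: the sketch maps each byte straight to the code point with the same number, which is exactly what ISO-8859-1 decoding does, whereas VBScript's Chr goes through the server's ANSI code page, so the two may not agree for every byte above 127.

```python
def decode_bytes_one_to_one(raw):
    """Map each byte value (0-255) to the character with the same code point,
    mirroring the byte-by-byte loop."""
    return "".join(chr(b) for b in raw)


def decode_latin1(raw):
    """The built-in equivalent: ISO-8859-1 assigns byte N to code point N."""
    return raw.decode("iso-8859-1")
```

Either form turns the raw response body into text that can be searched, cleaned up and written out.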

Then we SAVE our Moreover XML file to the filesystem. In the second
function, we do the loads, the merge and the transform. The merging of
the <article> nodes is done as follows:
Set tempNodes = xmlDoc2.selectNodes("//article")
For i = 0 To tempNodes.length - 1
    Set docFragment = xmlDoc.createDocumentFragment
    docFragment.appendChild(xmlDoc.createElement("article"))
    Set docFragment = tempNodes(i)
    xmlDoc.documentElement.appendChild(docFragment)
Next

We select every <article> node from the Moreover document and loop through
them, adding each one to the root element of our custom "article.xml"
document with appendChild. (The createDocumentFragment and createElement
calls set up a placeholder fragment holding an empty <article> element,
but docFragment is then pointed at the actual Moreover node, and that
node is what gets appended.) We now have a single XML Document object
containing not only the Moreover news items, but our own custom items
as well. We apply the transform with the XSL stylesheet, surround the
result with the desired HTML to create the MARQUEE control, and return
the whole shebang from our function as a string. The calling script merges
it inline with the larger HTML document and SAVES that to the filesystem,
ready to be included in any ASP page using the normal ASP
<!--#include file= directive.
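
The merge-then-render flow can be sketched in a few lines of Python. Two things here are assumptions rather than what our script does: the function names are hypothetical, and the rendering step builds the links by hand because Python's standard library has no XSLT processor, so it stands in for the business2.xsl transform rather than reproducing it.

```python
import copy
import xml.etree.ElementTree as ET
from html import escape


def merge_articles(custom_xml, feed_xml):
    """Append a copy of every <article> in the feed to the custom root."""
    custom_root = ET.fromstring(custom_xml)
    feed_root = ET.fromstring(feed_xml)
    for article in feed_root.iter("article"):
        custom_root.append(copy.deepcopy(article))
    return custom_root


def render_marquee(root):
    """Stand-in for the XSL transform: one link per article, in a marquee."""
    links = []
    for article in root.iter("article"):
        url = article.findtext("url", default="")
        headline = article.findtext("headline_text", default="")
        links.append('<a href="%s">%s</a><br>'
                     % (escape(url, quote=True), escape(headline)))
    return '<marquee direction="up">%s</marquee>' % "".join(links)
```

Because the custom document is the one being appended to, our own featured items come first in the output and the Moreover headlines follow.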

You can take the first block of code above (surrounded by <% %> script
tags) and pretty much use it as is for the basis of most any such file
caching system. The article.xml "Custom" XML file is pretty much a clone
of the Moreover XML, except we leave out the unnecessary tags:
<moreovernews>
  <article>
    <url>http://www.turnkeytools.com/eggheadbeta.asp</url>
    <headline_text>TurnkeyTools: Polls Beta Released</headline_text>
    <document_url>http://www.turnkeytools.com/eggheadbeta.asp</document_url>
    <harvest_time>May 12 2001 9:38PM</harvest_time>
  </article>
  <article>
    <url>http://www.eggheadcafe.com/articles/20010510.asp</url>
    <headline_text>EggheadCafe: Posting Form Data with XMLHTTP</headline_text>
    <document_url>http://www.eggheadcafe.com/articles/20010510.asp</document_url>
    <harvest_time>May 12 2001 9:39PM</harvest_time>
  </article>
</moreovernews>
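
If you would rather generate a file like this than edit it by hand, a small helper along these lines would do it. This is a Python sketch with a hypothetical function name. It assumes, as in the sample above, that document_url simply repeats url.

```python
import xml.etree.ElementTree as ET


def build_custom_feed(items):
    """Build the custom feed from (url, headline, harvest_time) tuples."""
    root = ET.Element("moreovernews")
    for url, headline, harvest_time in items:
        article = ET.SubElement(root, "article")
        ET.SubElement(article, "url").text = url
        ET.SubElement(article, "headline_text").text = headline
        # Assumption: document_url mirrors url, as in the sample file
        ET.SubElement(article, "document_url").text = url
        ET.SubElement(article, "harvest_time").text = harvest_time
    return ET.tostring(root, encoding="unicode")
```

Building the elements through an XML library also takes care of escaping, so a headline containing an ampersand or an angle bracket cannot break the file.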
So there you have it. An extensible, scalable way to
construct custom content display on a scheduled basis and cache it on
the filesystem.

Peter Bromberg is an independent consultant specializing in distributed
.NET solutions, a Senior Programmer/Analyst in Orlando, and a co-developer
of the EggheadCafe.com developer website. He can be reached at pbromberg@yahoo.com
|