Be cautious using XPathDocument and DTDs.

From the docs:

“There are two ways to read an XML document in the System.Xml.XPath namespace. One is to read an XML document using the read-only XPathDocument class and the other is to read an XML document using the editable XmlDocument class in the System.Xml namespace.”

Further on, on the same page, it says:

“The following example illustrates using the XPathDocument class’s string constructor to read an XML document.”

However, there is no example. Perhaps that’s the reason for an average rating of 2.31 out of 9. BTW, this string constructor in fact interprets its argument as a URI. Why they didn’t choose to implement a constructor that takes the Uri class is beyond me.
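Since the string overload treats its argument as a location rather than as XML text, markup held in memory has to go through one of the other overloads. The following is a minimal sketch, not the missing example from the docs: the class name XPathLoadSketch, the method ReadFirstTitle and the inline sample XML are made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

// Hypothetical helper, for illustration only.
public static class XPathLoadSketch
{
    // Parses XML text held in memory. The TextReader overload treats its
    // input as content, whereas XPathDocument(string) treats the string
    // as a file path or URI to load from.
    public static string ReadFirstTitle(string xmlText)
    {
        using (var reader = new StringReader(xmlText))
        {
            var doc = new XPathDocument(reader);
            XPathNavigator nav = doc.CreateNavigator();
            XPathNavigator title = nav.SelectSingleNode("/Library/family/title");
            return title == null ? null : title.Value;
        }
    }

    public static void Main()
    {
        // Assumed sample content, deliberately without a DOCTYPE.
        const string xml =
            "<Library><family><title>Our Family</title></family></Library>";
        Console.WriteLine(ReadFirstTitle(xml)); // prints: Our Family
    }
}
```

Passing the same markup to new XPathDocument(xml) would not parse it; the constructor would try to open it as a location instead.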

What it fails to mention is that you need to be extra careful when reading an XML document with XPathDocument, especially when it references a DTD, and even more so when you don’t control the location where this DTD is hosted.

Consider this simple XML file:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE Library SYSTEM "http://www.devtips.net/bijlagen/sample.dtd">
<Library>
  <family>
    <title>Our Family</title>
    <parent role="mother">Christina</parent>
    <parent role="father">Jean Luc</parent>
    <child role="daughter">Sofia</child>
    <child role="son">Pedro</child>
  </family>
</Library>

Will this load with:

XPathDocument doc = new XPathDocument(@"d:\temp\test.xml");

? Yes, it will. But when an administrator decides to rename the DTD, or the server running www.devtips.net goes offline, or, as happened recently, a firewall blocks you from accessing the server, it will fail! That is because the XPathDocument constructor issues an HTTP GET request for the DTD. It does not validate the document against the DTD; it fetches it just in case. So although XPathDocument is meant to be a faster alternative to XmlDocument, you pay the additional overhead of an HTTP request that has to be resolved. Imagine that server being on the other side of the globe!

You will also get an exception (e.g. a WebException) if the DTD cannot be retrieved for any reason. The XML file will then not load, which breaks your app.
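If you still want XPathDocument for such a file, one possible way around the download is to hand the constructor an XmlReader that has been told to skip the DOCTYPE. This is a sketch under assumptions, not something the docs page describes: it relies on XmlReaderSettings.DtdProcessing, which exists only in .NET Framework 4.0 and later, and the class name OfflineXPathLoader is made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

// Hypothetical helper, for illustration only.
public static class OfflineXPathLoader
{
    // Builds an XPathDocument without touching the DTD named in the DOCTYPE.
    public static XPathDocument Load(TextReader input)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Ignore, // skip the DOCTYPE entirely
            XmlResolver = null                    // never resolve external resources
        };
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            return new XPathDocument(reader);
        }
    }

    public static void Main()
    {
        // Same shape as the sample file above, trimmed to the child elements.
        string xml =
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>" +
            "<!DOCTYPE Library SYSTEM \"http://www.devtips.net/bijlagen/sample.dtd\">" +
            "<Library><family>" +
            "<child role=\"daughter\">Sofia</child>" +
            "<child role=\"son\">Pedro</child>" +
            "</family></Library>";

        XPathDocument doc = Load(new StringReader(xml));
        int children = doc.CreateNavigator().Select("/Library/family/child").Count;
        Console.WriteLine(children); // prints: 2
    }
}
```

For a file on disk, XmlReader.Create also accepts a path or URI together with the same settings object. The trade-off is that anything the DTD would have contributed, such as entity declarations or default attribute values, is not available, so this only suits documents that do not depend on them.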

What about when an XML Schema is referenced? Well, then there’s no problem: it does not try to fetch the remote schema.

In conclusion, be careful using XPathDocument: it may not be as fast as you thought. And avoid it when your XML file references a DTD at a location that you have no control over.