Friday, September 29, 2006

Injecting XML input into XQuery using Spring

I recently went to an XML Access Languages event co-hosted by W3C. Presentations centered on XQuery, XSLT 2.0, XPath 2.0 and SPARQL. The whole event was tremendously interesting and I will probably blog further about it at a later date.

Liam Quin gave a very enjoyable talk on XQuery which particularly caught my eye. Michael Kay (of SAXONICA) also spoke about the relationship of XQuery to XSLT 2.0, XPath 2.0 and XML Schema, so I feel particularly well informed on the subject now (there is nothing like hearing it from the horse's mouth!). XQuery looks a bit like a hybrid of SQL and XPath (with FLWOR [For-Let-Where-Order by-Return] syntax thrown in) and is particularly useful for accessing XML data across disparate sources.

Following the conference I have been doing some experiments using SAXON, starting with trying out the examples in Bob DuCharme's article Getting Started with XQuery.

XQuery 1.0 and XSLT 2.0 both support XPath 2.0's doc() and collection() functions for accessing external input XML documents. These are potentially extremely powerful facilities. Having recent experience with the Spring Framework, I found this way of doing things a concern. XQuery et al appeared, at first glance anyway, to be advocating close coupling of documents with processing. This flies in the face of the inversion of control (dependency injection) design pattern which I have learnt to love.

To further illustrate my point, let's say I have some XML source documents that I want to run some XQuery against:

  • they might be on the filesystem
  • they might be in an XML database
  • they might be in a CLOB on a relational database
  • they might be on the web, accessed via a URI
  • they might have been returned by a web service
  • they might be in DSML format, returned from an LDAP server
  • they might come from some combination of the above; I could go on...

The document could be coming from almost anywhere and would therefore need to be accessed using very different mechanisms depending on the situation. Does that mean we need as many XQuery implementations as there are access mechanisms? I would hope not. That said, this seems to be the current situation, with multiple XML database vendors supplying their own implementations of XQuery for their particular databases. This is wrong, surely? I would argue that where the source XML originates is none of the XQuery processor's business, and arguably the precise source should not even be detectable from the URI!

SAXON (the free version) includes support for XQuery. The SAXON XQuery processor has native support for accessing XML documents from the filesystem, but as an experiment I thought it would be good to see if I could make use of the Spring Framework to feed the SAXON XQuery processor.

There is already a Spring XML Database Framework that enables Spring to access eXist and Apache Xindice XML database datasources to interact with XQuery but it looked a little complicated for my needs.

I discovered that the URI in the doc() and collection() functions is merely a reference to an external document; it need not imply a specific access mechanism. To fool SAXON into accepting my Spring-accessed input data, all I needed to do was implement a Spring-aware URIResolver and CollectionURIResolver. I could then configure SAXON to use those resolvers to access the documents and collections referenced in the XQueries.
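The document half of that idea can be sketched with nothing more than the JDK's javax.xml.transform.URIResolver interface. This is only an illustration under stated assumptions: the class name MapBackedURIResolver, the register method and the plain-string map are hypothetical, and it is not the actual SpringURIResolver bean. In a Spring setup the map values would more likely be Spring Resource objects injected from the application context.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

/** Resolves logical document URIs against a configured map instead of the filesystem. */
class MapBackedURIResolver implements URIResolver {

    // Logical URI -> XML text (hypothetical stand-in for injected Spring resources).
    private final Map<String, String> documents = new LinkedHashMap<>();

    void register(String logicalUri, String xml) {
        documents.put(logicalUri, xml);
    }

    @Override
    public Source resolve(String href, String base) {
        String xml = documents.get(href);
        if (xml == null) {
            // Per the URIResolver contract, null lets the processor
            // fall back to resolving the URI itself.
            return null;
        }
        // A fresh reader per call, so the same document can be resolved repeatedly.
        return new StreamSource(new StringReader(xml), href);
    }

    public static void main(String[] args) {
        MapBackedURIResolver resolver = new MapBackedURIResolver();
        resolver.register("recipes.xml", "<recipes><recipe>Soup</recipe></recipes>");
        Source source = resolver.resolve("recipes.xml", null);
        System.out.println(source.getSystemId());
    }
}
```

The query only ever sees the logical name "recipes.xml"; where the bytes actually come from is decided entirely by whatever was registered with the resolver.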

What follows is by no means fully featured (it is hard-wired to read from a Spring resource) but it could be extended to support multiple data access mechanisms. I achieved my ends via two fairly simple Java beans and a test program.

SpringXQuery performs the XQuery itself:

  • it is responsible for loading the XQuery query file (using the Spring resource loader)
  • it is where the collection and document URI resolvers are configured

SpringURIResolver provides the URI resolution:

  • you can configure a map of collections to use, using a map of maps
  • you can configure a map of documents
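To make the "map of maps" idea concrete, here is a rough sketch of how a collection URI might be looked up. It assumes the outer map is keyed by collection URI and each inner map by member document URI; the class name CollectionLookup and its methods are hypothetical, and the XML is held as plain strings to keep the example JDK-only. Adapting the result to SAXON's CollectionURIResolver interface is deliberately left out, since that signature depends on the SAXON version.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;

/** Looks up a collection in a map of maps: collection URI -> (document URI -> XML text). */
class CollectionLookup {

    private final Map<String, Map<String, String>> collections = new LinkedHashMap<>();

    // Setter-style configuration, as a Spring bean property would use (assumption).
    void setCollections(Map<String, Map<String, String>> configured) {
        collections.clear();
        collections.putAll(configured);
    }

    /** Returns one Source per member document, or an empty list for an unknown collection. */
    List<Source> resolveCollection(String collectionUri) {
        Map<String, String> members = collections.get(collectionUri);
        if (members == null) {
            return Collections.emptyList();
        }
        List<Source> sources = new ArrayList<>();
        for (Map.Entry<String, String> member : members.entrySet()) {
            // The member's key doubles as the system id of the resulting Source.
            sources.add(new StreamSource(new StringReader(member.getValue()), member.getKey()));
        }
        return sources;
    }

    public static void main(String[] args) {
        Map<String, String> recipes = new LinkedHashMap<>();
        recipes.put("soup.xml", "<recipe>Soup</recipe>");
        recipes.put("stew.xml", "<recipe>Stew</recipe>");
        Map<String, Map<String, String>> configured = new LinkedHashMap<>();
        configured.put("recipes", recipes);

        CollectionLookup lookup = new CollectionLookup();
        lookup.setCollections(configured);
        for (Source s : lookup.resolveCollection("recipes")) {
            System.out.println(s.getSystemId());
        }
    }
}
```

Using insertion-ordered maps keeps the members of a collection in the order they were configured, which makes query results easier to predict when testing.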

Other files are:

  • Spring's application context configuration file
  • a simple test program

and also the XQuery files and example XML files used in Bob DuCharme's XQuery article.

Incidentally I made use of Spring's MapFactoryBean in order to make my Spring configuration a little bit cleaner. I also made use of a tip I found, Spring: Locating Application Relative Resources, to ensure the Spring resource loader works.

Surprisingly enough it works; this is despite the fact that I do not fully understand all the intricacies of what I am implementing!

It looks to me as though performance is always likely to be an issue when using XQuery across multiple datasources, and this partly explains why all these database vendors have vendor-specific implementations. I would argue that performance issues should be addressed at a separate level and, IMHO, are not a sufficient argument for re-implementing an entire language! Roll on javax.xml.xquery...


ismjml said...

Do you have any example of this???

Thanks in advance.
Note: Comment imported. Original by gengis at 2007-12-05 15:36

ismjml said...

Hi gengis,

I have thrown together a quick distribution of my code with the examples from Bob's XQuery article. If you want to understand the XQuery side of this you should read Bob's article. I have used Maven 2 to keep my distribution lean.


You will need to install Maven 2 and run:

mvn compile exec:exec

Hopefully that will gather the dependencies, build my code and run some example XQueries.

Note: Comment imported. Original by markmc website: at 2007-12-05 21:16