How to Validate XML Documents in Database Queries

September 13, 2012

In DB2, validation of XML documents against XML Schemas is optional. If you choose to validate XML documents in DB2, the most typical scenario is to validate XML documents when they are inserted or loaded into the database. This makes sense: if you ensure that the XML that enters the database is valid, then subsequent queries can assume that the data is valid and complies with a particular XML Schema.

Likewise, validation in XML updates statements ensures that document replacement or document modifications do not violate your XML Schema.

Here is a simple example for document validation in INSERT and UPDATE statements, based on an XML Schema that was registered in the DB2 XML Schema Repository (XSR) under the SQL name db2admin.myschema:

CREATE TABLE mytable (id INTEGER NOT NULL PRIMARY KEY, doc XML);

INSERT INTO mytable
VALUES(?, XMLVALIDATE(? ACCORDING TO XMLSCHEMA ID db2admin.myschema));

UPDATE mytable
SET doc = XMLVALIDATE(? ACCORDING TO XMLSCHEMA ID db2admin.myschema)
WHERE id = ?;

UPDATE mytable
SET doc = XMLVALIDATE( XMLQUERY(‘copy $new := $DOC
                                 modify do insert <status>delivered</activated>
                                           into $new/message/header
                                 return $new’)
             ACCORDING TO XMLSCHEMA ID db2admin.myschema)
WHERE id = ?;

There are also cases when you might want to validate XML as part of a query. There can be several reasons for that:

  •  Documents were inserted or updated without validation and you need to validate them before consumptions.
  •  You wish to validate XML documents against a different schema than the one was used for validation upon insert or update.
  •  You are extracting fragments of stored XML documents and wish to validate them against a specific schema.
  •  Your queries are constructing entirely new XML documents and you wish to vaidate that the constructed XML complies with a given schema.

Regardless of the motivation, XML validation in a query is simple.

You can simply use the XMLVALIDATE function in a SELECT statement. All the same options for XMLVALIDATE are allowed as if you would use it in an INSERT or UPDATE statement. Let’s look at several examples:

SELECT XMLVALIDATE(doc ACCORDING TO XMLSCHEMA ID db2admin.myschema)
FROM mytable
WHERE id = 5;

This query above reads a specific document and performs schema validation against the XML Schema that was registered as db2admin.myschema.
If the selected document is valid for the specified schema, the document is returned.
If the selected document is not valid for the specified schema, the query fails and produces an error code that points to why the schema is violated.

Instead of the XML column name doc, the XMLVALIDATE function can take any argument of type XML, such the result of an XMLQUERY function. The following query uses the XMLQUERY function to extract just the message body from an XML document and validates it against the XML Schema db2admin.msgbodyXSD:

SELECT XMLVALIDATE( XMLQUERY(‘$DOC/message/body’)
            ACCORDING TO XMLSCHEMA ID db2admin.msgbodyXSD )
FROM mytable
WHERE id = 5;

The next query constructs a new XML document and validates it as part of the query:

SELECT XMLVALIDATE(        
         XMLQUERY(‘document{
                    <newdocument>
                      <header>{$DOC/party/identity}</header>
                      <body>
                          {$DOC/party/name}
                          {$DOC/party/details/address}
                      </body>
                   </newdocument>}’)
       ACCORDING TO XMLSCHEMA ID db2admin.newdocXSD)
FROM mytable
WHERE id = 5;

These examples give you an idea of the capabilities for validating XML query results against an XML Schema.

 

7 Responses to “How to Validate XML Documents in Database Queries”

  1. Roger Says:

    hi matthias,

    how would recommend an automatic validation, assuming we don’t explicitly do an xmlvalidate() call ? Can it be done via triggers, or is there a setting that will do this validation automatically?

    thanks a lot,
    miches


    • Hello,
      if you mean validation upon insert or update, yes, this can be done nicely with triggers. For example:


      CREATE TABLE mytable (id INTEGER NOT NULL PRIMARY KEY, doc XML);

      CREATE TRIGGER validate_on_insert
      BEFORE INSERT ON mytable
      REFERENCING NEW AS newrow
      FOR EACH ROW MODE DB2SQL
      BEGIN ATOMIC
      SET newrow.doc = XMLVALIDATE(newrow.doc ACCORDING TO XMLSCHEMA ID db2admin.myxsd);
      END
      #

      In DB2 10 for z/OS you also have the option to assign an XML Schema to a column as a column option, which also automates the validation similar to how the trigger does it.

  2. Roger Says:

    Hi Matthias,

    Thanks a lot for your input, yes, we’re now using the trigger approach, as our application runs in LUW.

    Best regards,
    Miches

  3. dries willem Says:

    Hey, Matthias. How do I know which documents are not added to my table if I use the load function? I can see in my job output what is wrong with some XML’s but not which specific XML’s.


    • Hey….. are you using DB2 LUW or DB2 for z/OS?
      For example, in DB2 LUW the LOAD messages provide the line numbers in the input file for those rows that failed to load.
      Can you share an example of the LOAD output that you are seeing? That would also clarify the type of errors that you need help with.
      Thx!

      • dries willem Says:

        Hey mattias, thx for your fast reply. I had the scroll all the way down in my joboutput so now I found them. The problem I have now is that when I try to do a query on the table I get the following message:
        QUERY MESSAGES:
        Unsuccessful execution caused by an unavailable resource. (Reason code:
        00C900A3; type of resource: 00000210; and resource name: SIDDRWM
        .XMAT0002.00000001).

  4. dries willem Says:

    I was able to fix the problem by adding nocopypend in my load. But what if I add the schema after I did the Load? My tablespace is in checkpending status but i can’t find how to delete this.


Leave a comment