Kednos' XML Support Run-Time Library User Guide & Release Notes


June 2009

This manual provides a complete guide for users, developers and system managers. It covers installation, release notes and the application programming interface reference.

Revision/Update Information: This is a new manual

Operating System and Version: OpenVMS VAX V7.3

OpenVMS Alpha V7.3-2

OpenVMS I64 V8.3

Software Version: XMLRTL V1.0


March 2009

Copyright ©1999, 2000 Clark Cooper <coopercc@netheaven.com>

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Copyright ©2009 Kednos Enterprises


Preface

XMLRTL is an implementation of a number of XML APIs. The recommendations and standards either fully or partially implemented in this run-time library are:

Links to the specific documents are available in the Associated Documents section below.

XMLRTL provides an OpenVMS-specific interface to the APIs, allowing XML parsers to be written in any native language. It achieves this by defining the interfaces in an OpenVMS-friendly format. All arguments are passed by reference, as is typical of most language-independent APIs. All string arguments are passed using string descriptors, and all string descriptor formats are supported. The intention is to make it easier for higher-level languages, such as BASIC, PL/I and Pascal, to call (and be called from) the XML parser.

XMLRTL is based on the eXpat library, the creation of James Clark, who's also given us groff (an nroff look-alike), Jade (an implementation of ISO's DSSSL stylesheet language for SGML), XP (a Java XML parser package) and XT (a Java XSL engine). James was also the technical lead on the XML Working Group at W3C that produced the XML specification.

Portions of this manual are derived from the eXpat XML Parser Reference Manual. In turn, that document was originally an article commissioned by XML.com. They graciously allowed Clark Cooper to retain copyright and to distribute it with eXpat. It was substantially extended to include documentation on features which had been added since the original article was published, along with additional information on using the original interface.

Intended Audience

This manual is intended for both system managers and XML application developers. It contains release notes, installation instructions and an application programming interface specification.

Associated Documents

Conventions


Chapter 1
Getting Started

This chapter covers the installation and removal of the XML Support Run-Time Library including software and hardware dependencies.

1.1 Software Prerequisites

The following software is required to successfully install and use XMLRTL:

It is also recommended that all available software patches be applied to these products prior to attempting installation.

1.2 Hardware Requirements

There are no minimum hardware requirements for this software product.

1.3 Installation

The XMLRTL software product installation kit presents a number of options that control exactly what is installed. Table 1-1 describes each of these options and its default. Example 1-1 shows a sample installation of the XMLRTL product.

Table 1-1 XMLRTL Installation Options
Description                                        Default
User Guide & Release Notes in PostScript format    Yes
User Guide & Release Notes in PDF format           Yes
User Guide & Release Notes in HTML format          Yes
Example programs                                   Yes

Example 1-1 XMLRTL Product Installation


1.4 Post Installation

There are no post-installation tasks required by XMLRTL.

1.5 Removal

To remove the software product, use the PCSI command PRODUCT REMOVE. Example 1-2 demonstrates removal of the XMLRTL product.

Example 1-2 XMLRTL Product Removal



Chapter 2
Building an Application

XMLRTL consists of a collection of APIs for working with XML documents. These APIs are based on the following XML standards and recommendations:

The remaining sections in the chapter contain brief overviews of each API, including example source code, explaining the basics of working with the particular API.

2.1 Compiling and Linking Against XMLRTL

Before using any of the XML APIs it is necessary to know how to build the application. XMLRTL currently consists of four separate APIs. These are:

Although the APIs differ, the compiling and linking instructions are the same for all of them.

Note

When writing OpenVMS-specific parsers in languages such as PL/I, BASIC and Pascal, it is recommended (and in some cases it is the only choice) that the XML$ API be used. When writing parsers in a language such as C, where the code is expected to be ported to a non-OpenVMS system such as Linux or Windows, it is recommended that the XML_ API be used.

Table 2-1 lists all available include libraries for all supported languages. In the event that there is a requirement for a language not currently supported, the SDL intermediate representation is also shipped and can be used (after modules are extracted) as input to the SDL/NOPARSE command.

Table 2-1 Language Specific Include Libraries
Language    Library
BASIC       SYS$LIBRARY:XML$RTLDEF_BASIC.TLB
BLISS       SYS$LIBRARY:XML$RTLDEF_BLISS.REQ
C           SYS$LIBRARY:XML$RTLDEF_C.TLB
Fortran     SYS$LIBRARY:XML$RTLDEF_FORTRAN.TLB
Pascal      SYS$LIBRARY:XML$RTLDEF_PASCAL.PEN
PL/I        SYS$LIBRARY:XML$RTLDEF_PLI.TLB
SDL         SYS$LIBRARY:XML$RTLDEF_SDI.TLB

Example 2-1 demonstrates the process of extracting an SDI module and using it as input to SDL/NOPARSE. In this example it is used to generate DCL symbols that represent XML$ message codes.

Example 2-1 Generating DCL Symbols

$ LIBRARY/EXTRACT=$XMLMSG/OUTPUT=XMLMSG.SDI SYS$LIBRARY:XML$RTLDEF_SDI.TLB 
$ SDL/NOPARSE/LANGUAGE=DCL XMLMSG.SDI 

In the case of the native interfaces, there are five include modules that define the APIs. Table 2-2 describes each of the include modules.

Table 2-2 XMLRTL Native Interface Include Modules
Module              Description
$XMLDEF [1]         Contains constant and bit mask definitions.
$XMLMSG [1]         Contains all OpenVMS message code constants.
XML$ROUTINES        All external SAX routine definitions.
XML$DOM_ROUTINES    All external DOM routine definitions.
XML$RPC_ROUTINES    All external XML-RPC routine definitions.

[1] There is no leading dollar sign for C header files.

The examples below demonstrate how to include the XMLRTL header files in a number of languages. They also include example compiler commands.

%include $xmldef; 
%include $xmlmsg; 
%include xml$routines; 
%include xml$dom_routines; 
%include xml$rpc_routines; 
 
$ PLI ELEMENTS+SYS$LIBRARY:XML$RTLDEF_PLI/LIBRARY 
      

Including XMLRTL in a PL/I program.


#include <xmldef.h> 
#include <xmlmsg.h> 
#include <xml$routines.h> 
#include <xml$dom_routines.h> 
#include <xml$rpc_routines.h> 
 
$ CC ELEMENTS+SYS$LIBRARY:XML$RTLDEF_C/LIBRARY 
      

Including XMLRTL in a C program.


        include 'SYS$LIBRARY:XML$RTLDEF_FORTRAN.TLB($XMLDEF)' 
        include 'SYS$LIBRARY:XML$RTLDEF_FORTRAN.TLB($XMLMSG)' 
        include 'SYS$LIBRARY:XML$RTLDEF_FORTRAN.TLB(XML$ROUTINES)' 
        include 'SYS$LIBRARY:XML$RTLDEF_FORTRAN.TLB(XML$DOM_ROUTINES)' 
        include 'SYS$LIBRARY:XML$RTLDEF_FORTRAN.TLB(XML$RPC_ROUTINES)' 
 
$ FORTRAN ELEMENTS+SYS$LIBRARY:XML$RTLDEF_FORTRAN/LIBRARY 
      

Including XMLRTL in a Fortran program.


%include "$XMLDEF" %from %library "SYS$LIBRARY:XML$RTLDEF_BASIC.TLB" 
%include "$XMLMSG" %from %library "SYS$LIBRARY:XML$RTLDEF_BASIC.TLB" 
%include "XML$ROUTINES" %from %library "SYS$LIBRARY:XML$RTLDEF_BASIC.TLB" 
%include "XML$DOM_ROUTINES" %from %library "SYS$LIBRARY:XML$RTLDEF_BASIC.TLB" 
%include "XML$RPC_ROUTINES" %from %library "SYS$LIBRARY:XML$RTLDEF_BASIC.TLB" 
 
$ BASIC ELEMENTS 
      

Including XMLRTL in a BASIC program.


[inherit ('sys$library:xml$rtldef_pascal')] 
 
$ PASCAL ELEMENTS 
      

Including XMLRTL in a Pascal program.

To include the portable interface in a C program it is necessary to include the EXPAT.H header file, like so:


#include <expat.h> 

The header library is specified on the compile command in the same way as shown in the C example above.

When linking, it is necessary to link against the XML$SHR run-time library. This single RTL includes all APIs. For native routines there are upper and lower case definitions of all symbols. For the portable interface there are upper case and mixed-case (as is) definitions. This eases porting any code that is compiled with the qualifier /NAMES=AS_IS. Example 2-2 demonstrates how to link an application against the XMLRTL run-time library.

Example 2-2 Linking an XMLRTL Application

      $ LINK ELEMENTS,SYS$INPUT/OPTIONS 
      SYS$LIBRARY:XML$SHR/SHARE 

2.2 The Simple API for XML (SAX)

The SAX parser is a stream-oriented parser. The user is required to register callback (handler) functions with the parser and then begin feeding it the document. As the parser recognizes different parts of the document it will call the appropriate handler, if one is registered. The document itself is fed to the parser in pieces making it possible to start parsing a document before the entire document is available. This also makes it very easy to parse very large documents that may not easily fit into memory.
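
The push model described above can be illustrated with a small, self-contained C sketch. It is a toy tag scanner, not XMLRTL or eXpat code: every name in it (toy_scanner, toy_init, toy_feed, toy_count) is hypothetical, and it recognizes only bare tags such as <a> and </a>. What it shows is that scanner state survives between calls, so a tag split across two buffers is still reported, and that an event with no registered handler is simply skipped.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy scanner for illustration; NOT the XMLRTL or eXpat API. */
typedef void (*tag_handler)(void *user_data, const char *name);

typedef struct {
    tag_handler on_start;   /* may be NULL: start events are then skipped */
    tag_handler on_end;     /* may be NULL: end events are then skipped */
    void       *user_data;  /* passed unchanged to every handler */
    int         in_tag;     /* nonzero while between '<' and '>' */
    char        name[64];   /* tag text collected so far */
    size_t      len;        /* number of characters held in name */
} toy_scanner;

void toy_init(toy_scanner *s, tag_handler on_start, tag_handler on_end,
              void *user_data)
{
    memset(s, 0, sizeof *s);
    s->on_start  = on_start;
    s->on_end    = on_end;
    s->user_data = user_data;
}

/* Feed one piece of the document; a tag may begin in one piece and
 * finish in the next because in_tag, name and len persist between calls. */
void toy_feed(toy_scanner *s, const char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        char c = buf[i];
        if (!s->in_tag) {
            if (c == '<') {
                s->in_tag = 1;
                s->len = 0;
            }
        } else if (c == '>') {
            s->name[s->len] = '\0';
            s->in_tag = 0;
            if (s->name[0] == '/') {
                if (s->on_end != NULL)
                    s->on_end(s->user_data, s->name + 1);
            } else if (s->on_start != NULL) {
                s->on_start(s->user_data, s->name);
            }
        } else if (s->len + 1 < sizeof s->name) {
            s->name[s->len++] = c;  /* over-long names are truncated */
        }
    }
}

/* Example handler: user_data points at an int that counts events. */
void toy_count(void *user_data, const char *name)
{
    (void)name;
    ++*(int *)user_data;
}
```

Feeding the two pieces "<a><b" and "></b></a>" to a scanner initialised with toy_count as the start handler and no end handler leaves the counter at 2, even though the second tag straddles the buffer boundary.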

Due to the large number of different handlers and configurable options the SAX API may at first appear intimidating. However, there are only really four functions necessary to successfully parse most documents. These are listed in Table 2-3. A full reference of these functions and others can be found in Chapter 3.

Table 2-3 Minimum Functions Necessary to Set Up a Basic Parser
Function                          Description
XML$CREATE_PARSER                 Creates a new parser instance.
XML$SET_ELEMENT_HANDLER           Configures the handlers for processing opening and closing tags.
XML$SET_CHARACTER_DATA_HANDLER    Configures the handler for processing character data (text that is not markup).
XML$PARSE                         Parses a buffer containing the XML document.

Example 2-3 contains a very basic program that uses three of the functions shown in Table 2-3. It is not necessary to specify a character data handler, so it is left out for the moment. The example program, along with other example software, can be found in SYS$COMMON:[SYSHLP.EXAMPLES.XMLRTL]. These examples are written in a range of languages to demonstrate XMLRTL's support of all native OpenVMS languages.

Example 2-3 Simple PL/I SAX Parser


XMLRTL actually contains two separate APIs. The first has already been mentioned and is for use by OpenVMS native high-level languages. The other is the portable interface. This interface provides an API consistent with the eXpat parser library, allowing easy porting of applications using eXpat to and from UNIX, Windows and OpenVMS. These two APIs are not interchangeable. The developer must pick one or the other, because any attempt to use the parser context from one API with the other will result in application failure.

2.3 XMLRTL Basics

As shown in Example 2-3, the first step in parsing an XML document with XMLRTL is to create a parser instance. There is a single routine for

2.4 Communicating Between Handlers

In order to be able to pass information between different handlers without resorting to using globals it is necessary to define a data structure to hold the shared storage. The XML parser can then be told to pass the address of this structure to all handlers using the routine XML$SET_USER_DATA. This is the first argument received by most handlers. It is possible to tell the XML parser, during creation, to pass the parser instance as the handler argument. In this case the routine XML$GET_USER_DATA can be used to retrieve the user data.

One common case where multiple calls to a single handler may need to communicate using an application data structure is when content passed to the character data handler needs to be accumulated. A common first-time mistake with any of the event-oriented interfaces to an XML parser is to expect all the text contained in an element to be reported by a single call to the character data handler. XMLRTL, like many other XML parsers, reports such data as a sequence of calls. There is no way to know when the end of the sequence is reached until a different callback is made. A buffer referenced by the user data structure proves both an effective and convenient place to accumulate character data.

The BASIC example, Example 2-4, demonstrates the use of XML$SET_USER_DATA as a method for passing a string accumulator buffer.

Example 2-4 Configuring User Data Pointer

declare string accumulator 
declare long parser 
 
call xml$set_user_data(parser by ref, 
                       loc(accumulator) by value) 
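
In C, the accumulation pattern might look like the following sketch. The accumulator structure and both function names are assumptions made for illustration. Only the handler's argument list (user data pointer, text pointer, length) follows eXpat's character data handler, whose text argument is not NUL-terminated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user data structure; the names are illustrative only. */
typedef struct {
    char  *text;   /* accumulated character data, NUL-terminated once set */
    size_t len;    /* bytes used, excluding the terminator */
    size_t cap;    /* bytes allocated */
} accumulator;

/* Same argument shape as eXpat's character data handler.
 * s is NOT NUL-terminated; exactly len bytes are valid. */
void on_character_data(void *user_data, const char *s, int len)
{
    accumulator *acc = user_data;
    assert(acc != NULL && len >= 0);

    size_t need = acc->len + (size_t)len + 1;   /* +1 for the terminator */
    if (need > acc->cap) {
        size_t cap = acc->cap ? acc->cap : 64;
        while (cap < need)
            cap *= 2;
        char *grown = realloc(acc->text, cap);
        if (grown == NULL)
            return;                 /* out of memory: drop this piece */
        acc->text = grown;
        acc->cap  = cap;
    }
    memcpy(acc->text + acc->len, s, (size_t)len);
    acc->len += (size_t)len;
    acc->text[acc->len] = '\0';
}

/* Intended to be called from the end-element handler, the point at
 * which the sequence of character data calls is known to be complete. */
void accumulator_reset(accumulator *acc)
{
    acc->len = 0;
    if (acc->text != NULL)
        acc->text[0] = '\0';
}
```

The end-element handler is the natural place to consume the buffer and then reset it, since that callback is the first indication that no further pieces of the element's text will arrive.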
 

2.5 XML Version

XMLRTL is an XML 1.0 parser, and as such never complains based on the value of the version parameter in the XML declaration, if present.

If an application needs to check the version number (to support alternate processing), it should use XML$SET_XML_DECL_HANDLER to configure a handler that uses the information in the XML declaration to determine what to do. Example 2-5 shows a handler that requires the document to specify a version number of "1.0".

Example 2-5 XML Version Number Handler

        %sbttl 'Handle XML declaration'; 
 
xmldecl_handler: procedure(userData, 
                           version, 
                           encoding, 
                           standalone); 
/*++ 
 * Functional Description: 
 *      This handler is called by the XML parser as a result of encountering 
 *      an XML document declaration, similar to: 
 * 
 *          <?xml version="1.0" encoding="UTF-8"?> 
-*/ 
 
%include $stsdef; 
 
    declare userData            pointer value; 
    declare version             character(*); 
    declare encoding            character(*); 
    declare standalone          bit(1) aligned; 
 
    put skip list('XML Declaration: version = ' || version || '; ' || 
                                   'encoding = ' || encoding || '; ' || 
                                   'standalone = ' || standalone); 
    if (version ^= '1.0') then do; 
        put skip list('Error: XML version is not 1.0'); 
 
        sts$value = xml$stop_parser(parser); 
    end; 
 
end xmldecl_handler; 
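
For code written against eXpat's C interface, the same check could be sketched as below. It assumes eXpat's documented conventions for the declaration handler arguments: version is NULL for a text declaration, and standalone is -1, 0 or 1 for absent, "no" and "yes". The helper names check_xml_decl and standalone_text are hypothetical, and whether XMLRTL's portable interface reports these values identically is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: decode eXpat's standalone argument (-1, 0 or 1). */
const char *standalone_text(int standalone)
{
    if (standalone < 0)
        return "not specified";
    return standalone ? "yes" : "no";
}

/* Hypothetical helper: returns 1 when the declaration is acceptable and
 * 0 when the caller should stop parsing. encoding may be NULL. */
int check_xml_decl(const char *version, const char *encoding, int standalone)
{
    /* A text declaration in an external entity carries no version. */
    if (version == NULL)
        return 1;

    printf("XML Declaration: version = %s; encoding = %s; standalone = %s\n",
           version,
           encoding != NULL ? encoding : "(none)",
           standalone_text(standalone));

    if (strcmp(version, "1.0") != 0) {
        fprintf(stderr, "Error: XML version is not 1.0\n");
        return 0;
    }
    return 1;
}
```

A real handler would call check_xml_decl with the arguments it receives and stop the parser when the result is 0.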

2.6 Namespace Processing

When the parser is created using XML_ParserCreateNS, or by specifying the separator arguments to the XML$CREATE_PARSER function, XMLRTL performs namespace processing. Under namespace processing, XMLRTL consumes xmlns and xmlns:... attributes, which declare namespaces for the scope of the element in which they occur. This means that your start handler will not see these attributes. Your application can still be informed of these declarations by setting namespace declaration handlers with <REFERENCE>(nat_setnamespacedeclhandler) or <REFERENCE>(nat_set_handler).
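
With namespace processing enabled, eXpat reports a name that belongs to a namespace as the namespace URI, the separator character and the local name joined together; names outside any namespace are passed unchanged. Assuming XMLRTL follows the same convention, a handler could split such a name as sketched below. The expanded_name type and the split_expanded_name function are hypothetical, and the sketch assumes a separator that cannot occur in a URI.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: both pointers refer into the caller's string. */
typedef struct {
    const char *uri;      /* start of the namespace URI, or NULL if none */
    size_t      uri_len;  /* length of the URI (it is not NUL-terminated) */
    const char *local;    /* local name, NUL-terminated */
} expanded_name;

/* Split "URI<sep>local" at the first separator. A name without a
 * separator, or a NUL separator, is treated as having no namespace. */
expanded_name split_expanded_name(const char *name, char sep)
{
    expanded_name out = { NULL, 0, name };
    const char *p = (sep != '\0') ? strchr(name, sep) : NULL;

    if (p != NULL) {
        out.uri     = name;
        out.uri_len = (size_t)(p - name);
        out.local   = p + 1;
    }
    return out;
}
```

Because the result only points into the original string, it remains valid just as long as the name passed to the handler does; a handler that needs the parts later must copy them.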

2.7 Character Encodings

While XML is based on Unicode, and every XML processor is required to recognize UTF-8 and UTF-16 (encodings of Unicode built on 8-bit and 16-bit code units respectively), other encodings may be declared in XML documents or entities. For the main document, an XML declaration may contain an encoding parameter such as this:


<?xml version="1.0" encoding="ISO-8859-2"?> 

External parsed entities may begin with a text declaration, which looks like an XML declaration with just an encoding parameter. It could appear similar to the following:


<?xml encoding="Big5"?> 

XMLRTL supports four built-in encodings. These are listed here:

Anything else discovered in an encoding parameter, or in the protocol encoding specified in the parser constructor, triggers a call to the handler configured by <REFERENCE>(nat_set_unknown).

