org.w3c.dom.ls
Interface DOMParser


public interface DOMParser

An interface to an object that is able to build, or augment, a DOM tree from various input sources.

DOMParser provides an API for parsing XML and building the corresponding DOM document structure. A DOMParser instance can be obtained by invoking the DOMImplementationLS.createDOMParser() method.
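
The following is a minimal sketch of obtaining a parser and parsing a string, not a definitive usage of the interface on this page. It assumes the finalized DOM Level 3 Load and Save names that ship with the JDK (LSParser, LSInput, DOMImplementationLS.createLSParser()) in place of the draft names DOMParser, DOMInput, and createDOMParser() documented here. The class and method names ObtainParserSketch, loadSave, and parseString are illustrative only.

```java
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

class ObtainParserSketch {

    /** Looks up a Load and Save capable DOM implementation through the bootstrap registry. */
    static DOMImplementationLS loadSave() {
        try {
            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            return (DOMImplementationLS) registry.getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM implementation registry available", e);
        }
    }

    /** Parses an in-memory XML string with a synchronous parser (assumed finalized API). */
    static Document parseString(String xml) {
        DOMImplementationLS ls = loadSave();
        // Second argument is the schema type; null means no schema is requested.
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        LSInput input = ls.createLSInput();
        input.setStringData(xml);
        return parser.parse(input);
    }

    public static void main(String[] args) {
        Document doc = parseString("<root><child/></root>");
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```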

As specified in [DOM Level 3 Core], when a document is first made available via the DOMParser:

Asynchronous DOMParser objects are expected to also implement the events::EventTarget interface so that event listeners can be registered on asynchronous DOMParser objects.

Events supported by asynchronous DOMParser objects are:

load
The DOMParser has finished loading the document. See also the definition of the LSLoadEvent interface.
progress
The DOMParser signals progress as a document is parsed. See also the definition of the LSProgressEvent interface.

Note: All events defined in this specification use the namespace URI "http://www.w3.org/2002/DOMLS".

While parsing an input source, errors are reported to the application through the error handler (DOMParser.config's "error-handler" parameter). This specification in no way tries to define all possible errors that can occur while parsing XML, or any other markup, but some common error cases are defined. The types (DOMError.type) of errors and warnings defined by this specification are:

"unsupported-media-type" [fatal]
Raised if the configuration parameter "supported-media-types-only" is set to true and an unsupported media type is encountered.
"unsupported-encoding" [fatal]
Raised if an unsupported encoding is encountered.
"doctype-not-allowed" [fatal]
Raised if the configuration parameter "disallow-doctype" is set to true and a doctype is encountered.
"unknown-character-denormalization" [fatal]
Raised if the configuration parameter "ignore-unknown-character-denormalizations" is set to false and a character is encountered for which the processor cannot determine the normalization properties.
"unbound-namespace-in-entity" [warning]
Raised if the configuration parameter "entities" is set to true and an unbound namespace prefix is encountered in an entity declaration.
"pi-base-uri-not-preserved" [warning]
Raised if a processing instruction is encountered in a location where the base URI of the processing instruction cannot be preserved. One example of a case where this warning will be raised is if the configuration parameter "entities" is set to false and the following XML file is parsed:
<!DOCTYPE root [ <!ENTITY e SYSTEM 'subdir/myentity.ent'> ]>
<root> &e; </root>
And subdir/myentity.ent looks like this:
<one> <two/> </one> <?pi 3.14159?> <more/>

In addition to raising the defined errors and warnings, implementations are expected to raise implementation-specific errors and warnings for any other error and warning cases, such as I/O errors (file not found, permission denied, ...), XML well-formedness errors, and so on.
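
The error reporting described above can be sketched as follows. This is a hedged illustration that assumes the finalized JDK API (LSParser, LSInput, getDomConfig(), and org.w3c.dom.DOMErrorHandler) rather than the draft names on this page; ErrorHandlerSketch and collectErrors are hypothetical names. The exact DOMError.type strings and messages reported for a given input are implementation specific, so the sketch only records them.

```java
import java.util.ArrayList;
import java.util.List;
import org.w3c.dom.DOMError;
import org.w3c.dom.DOMErrorHandler;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

class ErrorHandlerSketch {

    static DOMImplementationLS loadSave() {
        try {
            return (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM implementation registry available", e);
        }
    }

    /** Parses the given XML and returns one line per error or warning that was reported. */
    static List<String> collectErrors(String xml) {
        DOMImplementationLS ls = loadSave();
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);

        List<String> seen = new ArrayList<>();
        DOMErrorHandler handler = error -> {
            seen.add("severity=" + error.getSeverity()
                    + " type=" + error.getType()
                    + " message=" + error.getMessage());
            // Ask the parser to stop on fatal errors and to continue otherwise.
            return error.getSeverity() != DOMError.SEVERITY_FATAL_ERROR;
        };
        parser.getDomConfig().setParameter("error-handler", handler);

        LSInput input = ls.createLSInput();
        input.setStringData(xml);
        try {
            parser.parse(input);
        } catch (RuntimeException e) {
            // A failed parse may also surface as an unchecked exception (for example LSException).
        }
        return seen;
    }

    public static void main(String[] args) {
        collectErrors("<root><unclosed></root>").forEach(System.out::println);
    }
}
```

Returning false from the handler for fatal errors is one possible policy; an application that wants to gather as much diagnostic output as possible might return true instead, with the caveat that behavior after a fatal error is implementation dependent.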

See also the Document Object Model (DOM) Level 3 Load and Save Specification.


Field Summary
static short ACTION_APPEND_AS_CHILDREN
          Append the result of the parse operation as children of the context node.
static short ACTION_INSERT_AFTER
          Insert the result of the parse operation as the immediately following sibling of the context node.
static short ACTION_INSERT_BEFORE
          Insert the result of the parse operation as the immediately preceding sibling of the context node.
static short ACTION_REPLACE
          Replace the context node with the result of the parse operation.
static short ACTION_REPLACE_CHILDREN
          Replace all the children of the context node with the result of the parse operation.
 
Method Summary
 void abort()
          Abort the loading of the document that is currently being loaded by the DOMParser.
 boolean getAsync()
          true if the DOMParser is asynchronous, false if it is synchronous.
 boolean getBusy()
          true if the DOMParser is currently busy loading a document, otherwise false.
 org.apache.xerces.dom3.DOMConfiguration getConfig()
          The DOMConfiguration object used when parsing an input source.
 DOMParserFilter getFilter()
          When a filter is provided, the implementation will call out to the filter as it is constructing the DOM tree structure.
 Document parse(DOMInput is)
          Parse an XML document from a resource identified by a DOMInput.
 Document parseURI(java.lang.String uri)
          Parse an XML document from a location identified by a URI reference [IETF RFC 2396].
 Node parseWithContext(DOMInput input, Node context, short action)
          Parse an XML fragment from a resource identified by a DOMInput and insert the content into an existing document at the position specified with the context and action arguments.
 void setFilter(DOMParserFilter filter)
          When a filter is provided, the implementation will call out to the filter as it is constructing the DOM tree structure.
 

Field Detail

ACTION_APPEND_AS_CHILDREN

public static final short ACTION_APPEND_AS_CHILDREN
Append the result of the parse operation as children of the context node. For this action to work, the context node must be an Element or a DocumentFragment.

ACTION_REPLACE_CHILDREN

public static final short ACTION_REPLACE_CHILDREN
Replace all the children of the context node with the result of the parse operation. For this action to work, the context node must be an Element, a Document, or a DocumentFragment.

ACTION_INSERT_BEFORE

public static final short ACTION_INSERT_BEFORE
Insert the result of the parse operation as the immediately preceding sibling of the context node. For this action to work the context node's parent must be an Element or a DocumentFragment.

ACTION_INSERT_AFTER

public static final short ACTION_INSERT_AFTER
Insert the result of the parse operation as the immediately following sibling of the context node. For this action to work the context node's parent must be an Element or a DocumentFragment.

ACTION_REPLACE

public static final short ACTION_REPLACE
Replace the context node with the result of the parse operation. For this action to work, the context node must have a parent, and the parent must be an Element or a DocumentFragment.
Method Detail

getConfig

public org.apache.xerces.dom3.DOMConfiguration getConfig()
The DOMConfiguration object used when parsing an input source. This DOMConfiguration is specific to the parse operation and no parameter values from this DOMConfiguration object are passed automatically to the DOMConfiguration object on the Document that is created, or used, by the parse operation. The DOM application is responsible for passing any needed parameter values from this DOMConfiguration object to the DOMConfiguration object referenced by the Document object.
In addition to the parameters recognized in [DOM Level 3 Core], the DOMConfiguration objects for DOMParser add or modify the following parameters:
"charset-overrides-xml-encoding"
true
[required] (default) If a higher level protocol such as HTTP [IETF RFC 2616] provides an indication of the character encoding of the input stream being processed, that will override any encoding specified in the XML declaration or the Text declaration (see also section 4.3.3, "Character Encoding in Entities", in [XML 1.0]). Explicitly setting an encoding in the DOMInput overrides any encoding from the protocol.
false
[required] The parser ignores any character set encoding information from higher-level protocols.
"disallow-doctype"
true
[optional] Throw a fatal "doctype-not-allowed" error if a doctype node is found while parsing the document. This is useful when dealing with things like SOAP envelopes where doctype nodes are not allowed.
false
[required] (default) Allow doctype nodes in the document.
"ignore-unknown-character-denormalizations"
true
[required] (default) If, while verifying full normalization when [XML 1.1] is supported, a processor encounters characters for which it cannot determine the normalization properties, then the processor will ignore any possible denormalizations caused by these characters. This parameter is ignored for [XML 1.0].
false
[optional] Report a fatal "unknown-character-denormalization" error if a character is encountered for which the processor cannot determine the normalization properties.
"infoset"
See the definition of DOMConfiguration for a description of this parameter. Unlike in [DOM Level 3 Core], this parameter will default to true for DOMParser.
"namespaces"
true
[required] (default) Perform the namespace processing as defined in [XML Namespaces].
false
[optional] Do not perform the namespace processing.
"supported-media-types-only"
true
[optional] Check that the media type of the parsed resource is a supported media type. If an unsupported media type is encountered, a fatal error of type "unsupported-media-type" will be raised. The media types defined in [IETF RFC 3023] must always be accepted.
false
[required] (default) Accept any media type.

The parameter "well-formed" cannot be set to false.
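
As a hedged sketch of how the parameters above might be set, the example below toggles "disallow-doctype" and queries whether "well-formed" may be set to false. It assumes the finalized JDK API (LSParser.getDomConfig() in place of the draft getConfig()); ParserConfigSketch, canDisallowDoctype, and parses are hypothetical names. Because "disallow-doctype" set to true is marked optional, the code checks canSetParameter before setting it.

```java
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.DOMError;
import org.w3c.dom.DOMErrorHandler;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

class ParserConfigSketch {

    static DOMImplementationLS loadSave() {
        try {
            return (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM implementation registry available", e);
        }
    }

    /** Reports whether this implementation supports the optional "disallow-doctype" = true. */
    static boolean canDisallowDoctype() {
        LSParser parser = loadSave().createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        return parser.getDomConfig().canSetParameter("disallow-doctype", Boolean.TRUE);
    }

    /** Returns true if the XML parses to a Document under the requested doctype policy. */
    static boolean parses(String xml, boolean disallowDoctype) {
        DOMImplementationLS ls = loadSave();
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        DOMConfiguration config = parser.getDomConfig();

        // Only set the parameter when the implementation says the value is supported.
        if (config.canSetParameter("disallow-doctype", disallowDoctype)) {
            config.setParameter("disallow-doctype", disallowDoctype);
        }
        // Quiet handler: continue after warnings, stop on errors and fatal errors.
        DOMErrorHandler quiet = error -> error.getSeverity() == DOMError.SEVERITY_WARNING;
        config.setParameter("error-handler", quiet);

        LSInput input = ls.createLSInput();
        input.setStringData(xml);
        try {
            return parser.parse(input) != null;
        } catch (RuntimeException e) {
            return false; // the parse was rejected
        }
    }

    public static void main(String[] args) {
        LSParser parser = loadSave().createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        System.out.println("well-formed=false settable: "
                + parser.getDomConfig().canSetParameter("well-formed", Boolean.FALSE));
        System.out.println("doctype accepted: " + parses("<!DOCTYPE root><root/>", false));
    }
}
```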

getFilter

public DOMParserFilter getFilter()
When a filter is provided, the implementation will call out to the filter as it is constructing the DOM tree structure. The filter can choose to remove elements from the document being constructed, or to terminate the parsing early.
The filter is invoked after the operations requested by the DOMConfiguration parameters have been applied. For example, if "validate" is set to true, the validation is done before invoking the filter.

setFilter

public void setFilter(DOMParserFilter filter)
When a filter is provided, the implementation will call out to the filter as it is constructing the DOM tree structure. The filter can choose to remove elements from the document being constructed, or to terminate the parsing early.
The filter is invoked after the operations requested by the DOMConfiguration parameters have been applied. For example, if "validate" is set to true, the validation is done before invoking the filter.
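
A filter that removes elements during tree construction might look like the sketch below. It assumes the finalized JDK interface LSParserFilter (the counterpart of the draft DOMParserFilter named on this page) and its FILTER_ACCEPT and FILTER_REJECT constants; ParserFilterSketch, DropByName, and parseWithout are hypothetical names chosen for illustration.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;
import org.w3c.dom.ls.LSParserFilter;
import org.w3c.dom.traversal.NodeFilter;

class ParserFilterSketch {

    /** Rejects every element with the given name, together with its subtree. */
    static final class DropByName implements LSParserFilter {
        private final String name;

        DropByName(String name) {
            this.name = name;
        }

        @Override
        public short startElement(Element element) {
            // Called when the start tag has been seen, before the children are parsed.
            return name.equals(element.getNodeName()) ? FILTER_REJECT : FILTER_ACCEPT;
        }

        @Override
        public short acceptNode(Node node) {
            return FILTER_ACCEPT; // keep every completed node that reaches this point
        }

        @Override
        public int getWhatToShow() {
            return NodeFilter.SHOW_ELEMENT;
        }
    }

    static DOMImplementationLS loadSave() {
        try {
            return (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM implementation registry available", e);
        }
    }

    /** Parses the XML while dropping all elements named {@code dropped}. */
    static Document parseWithout(String xml, String dropped) {
        DOMImplementationLS ls = loadSave();
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        parser.setFilter(new DropByName(dropped));
        LSInput input = ls.createLSInput();
        input.setStringData(xml);
        return parser.parse(input);
    }

    public static void main(String[] args) {
        Document doc = parseWithout("<root><keep/><secret><inner/></secret><keep/></root>", "secret");
        System.out.println(doc.getElementsByTagName("secret").getLength());
    }
}
```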

getAsync

public boolean getAsync()
true if the DOMParser is asynchronous, false if it is synchronous.

getBusy

public boolean getBusy()
true if the DOMParser is currently busy loading a document, otherwise false.

parse

public Document parse(DOMInput is)
               throws DOMException
Parse an XML document from a resource identified by a DOMInput.
Parameters:
is - The DOMInput from which the source of the document is to be read.
Returns:
If the DOMParser is a synchronous DOMParser, the newly created and populated Document is returned. If the DOMParser is asynchronous, null is returned since the document object may not yet be constructed when this method returns.
Throws:
DOMException - INVALID_STATE_ERR: Raised if the DOMParser.busy attribute is true.

parseURI

public Document parseURI(java.lang.String uri)
                  throws DOMException
Parse an XML document from a location identified by a URI reference [IETF RFC 2396]. If the URI contains a fragment identifier (see section 4.1 in [IETF RFC 2396]), the behavior is not defined by this specification; future versions of this specification may define the behavior.
Parameters:
uri - The location of the XML document to be read.
Returns:
If the DOMParser is a synchronous DOMParser, the newly created and populated Document is returned. If the DOMParser is asynchronous, null is returned since the document object may not yet be constructed when this method returns.
Throws:
DOMException - INVALID_STATE_ERR: Raised if the DOMParser.busy attribute is true.
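
A hedged sketch of parsing by URI follows. It assumes the finalized JDK method LSParser.parseURI(String) as the counterpart of the draft method above, and it writes a temporary file purely so that the example has a URI to load; ParseUriSketch and parseTempFile are hypothetical names. The URI deliberately carries no fragment identifier, since that behavior is undefined.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSParser;

class ParseUriSketch {

    static LSParser newParser() {
        try {
            DOMImplementationLS ls = (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
            return ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM implementation registry available", e);
        }
    }

    /** Writes the XML to a temporary file and parses it back through its file: URI. */
    static Document parseTempFile(String xml) {
        try {
            Path file = Files.createTempFile("dom-ls-sketch", ".xml");
            try {
                Files.write(file, xml.getBytes(StandardCharsets.UTF_8));
                // A synchronous parser returns the populated Document directly.
                return newParser().parseURI(file.toUri().toString());
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseTempFile("<catalog><entry/></catalog>");
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```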

parseWithContext

public Node parseWithContext(DOMInput input,
                             Node context,
                             short action)
                      throws DOMException
Parse an XML fragment from a resource identified by a DOMInput and insert the content into an existing document at the position specified with the context and action arguments. When parsing the input stream, the context node is used for resolving unbound namespace prefixes. The context node's ownerDocument node (or the node itself if the node is of type DOCUMENT_NODE) is used to resolve default attributes and entity references.
As the new data is inserted into the document, at least one mutation event is fired per new immediate child or sibling of the context node.
If the context node is a Document node and the action is ACTION_REPLACE_CHILDREN, then the document that is passed as the context node will be changed such that its xmlEncoding, documentURI, xmlVersion, actualEncoding, xmlStandalone, and all other such attributes are set to what they would be set to if the input source was parsed using DOMParser.parse().
If the DOMParser is asynchronous then the insertion of the resulting DOM structure is atomic, i.e. the whole structure is inserted only once the whole input stream is completely parsed without errors.
If an error occurs while parsing, the caller is notified through the ErrorHandler instance associated with the "error-handler" parameter of the DOMConfiguration.
When calling parseWithContext, the values of the following configuration parameters will be ignored and their default values will always be used instead: "validate", "validate-if-schema", and "whitespace-in-element-content".
Parameters:
input - The DOMInput from which the source document is to be read. The source document must be an XML fragment, i.e. anything except a complete XML document (except in the case where the context node is of type DOCUMENT_NODE, and the action is ACTION_REPLACE_CHILDREN), a DOCTYPE (internal subset), entity declaration(s), notation declaration(s), or XML or text declaration(s).
context - The node that is used as the context for the data that is being parsed. This node must be a Document node, a DocumentFragment node, or a node of a type that is allowed as a child of an Element node, e.g. it cannot be an Attribute node.
action - This parameter describes which action should be taken between the new set of nodes being inserted and the existing children of the context node. The set of possible actions is defined in ACTION_TYPES above.
Returns:
Return the node that is the result of the parse operation. If the result is more than one top-level node, the first one is returned.
Throws:
DOMException - NOT_SUPPORTED_ERR: Raised if the DOMParser doesn't support this method.
NO_MODIFICATION_ALLOWED_ERR: Raised if the context node is a read only node.
INVALID_STATE_ERR: Raised if the DOMParser.busy attribute is true.
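
Because a parser is allowed to raise NOT_SUPPORTED_ERR for this method, callers may want a fallback. The sketch below assumes the finalized JDK API (LSParser.parseWithContext and LSParser.ACTION_APPEND_AS_CHILDREN); the fallback of wrapping the fragment in a synthetic element and importing its children is an illustrative workaround of our own, not behavior defined by this specification, and it does not reproduce the namespace resolution against the context node. ParseWithContextSketch and appendFragment are hypothetical names.

```java
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

class ParseWithContextSketch {

    static DOMImplementationLS loadSave() {
        try {
            return (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM implementation registry available", e);
        }
    }

    static Document parse(DOMImplementationLS ls, String xml) {
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        LSInput input = ls.createLSInput();
        input.setStringData(xml);
        return parser.parse(input);
    }

    /** Appends the parsed fragment as children of the host document's root element. */
    static Element appendFragment(String hostXml, String fragmentXml) {
        DOMImplementationLS ls = loadSave();
        Document host = parse(ls, hostXml);
        Element context = host.getDocumentElement();

        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        LSInput fragment = ls.createLSInput();
        fragment.setStringData(fragmentXml);
        try {
            parser.parseWithContext(fragment, context, LSParser.ACTION_APPEND_AS_CHILDREN);
        } catch (DOMException e) {
            if (e.code != DOMException.NOT_SUPPORTED_ERR) {
                throw e;
            }
            // Fallback (assumption): wrap the fragment so it parses as a document,
            // then import each top-level node into the host document.
            Node wrapper = parse(ls, "<wrapper>" + fragmentXml + "</wrapper>").getDocumentElement();
            for (Node child = wrapper.getFirstChild(); child != null; child = child.getNextSibling()) {
                context.appendChild(host.importNode(child, true));
            }
        }
        return context;
    }

    public static void main(String[] args) {
        Element list = appendFragment("<list><item/></list>", "<item/><item/>");
        System.out.println(list.getElementsByTagName("item").getLength());
    }
}
```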

abort

public void abort()
Abort the loading of the document that is currently being loaded by the DOMParser. If the DOMParser is currently not busy, a call to this method does nothing.
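
The interplay of the async and busy attributes with abort() can be sketched as below, again assuming the finalized JDK API (LSParser.getAsync(), getBusy(), and abort()) rather than the draft names on this page; AbortIdleSketch and its methods are hypothetical names. The sketch only exercises the idle case, where abort() is documented to do nothing.

```java
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

class AbortIdleSketch {

    static DOMImplementationLS loadSave() {
        try {
            return (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM implementation registry available", e);
        }
    }

    /** A parser created in synchronous mode is expected to report async == false. */
    static boolean synchronousParserIsAsync() {
        return loadSave()
                .createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null)
                .getAsync();
    }

    /** Calls abort() on an idle parser and checks that a later parse still succeeds. */
    static boolean idleAbortIsHarmless() {
        DOMImplementationLS ls = loadSave();
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);

        boolean busyBefore = parser.getBusy();
        parser.abort(); // not busy, so this should be a no-op

        LSInput input = ls.createLSInput();
        input.setStringData("<root/>");
        Document doc = parser.parse(input);

        return !busyBefore && !parser.getBusy() && doc != null;
    }

    public static void main(String[] args) {
        System.out.println("async: " + synchronousParserIsAsync());
        System.out.println("idle abort harmless: " + idleAbortIsHarmless());
    }
}
```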


Copyright © 1999-2003 Apache XML Project. All Rights Reserved.