DOMSerializer provides an API for serializing (writing) a DOM document out into XML. The XML data is written to a string or an output stream.
During serialization of XML data, namespace fixup is done as defined in [DOM Level 3 Core], Appendix B. [DOM Level 2 Core] allows empty strings as a real namespace URI. If the namespaceURI of a Node is the empty string, the serialization will treat it as null, ignoring the prefix if any.
DOMSerializer accepts any node type for serialization. For nodes of type Document or Entity, well-formed XML will be created when possible (well-formedness is guaranteed if the document or entity comes from a parse operation and is unchanged since it was created). The serialized output for these node types is either an XML document or an External XML Entity, respectively, and is acceptable input for an XML parser. For all other types of nodes the serialized form is not specified, but should be something useful to a human for debugging or diagnostic purposes.
Within a Document, DocumentFragment, or Entity being serialized, Nodes are processed as follows:
Document nodes are written, including the XML declaration (unless the parameter "xml-declaration" is set to false) and a DTD subset, if one exists in the DOM. Writing a Document node serializes the entire document.
Entity nodes, when written directly by DOMSerializer.write, output the entity expansion, but no namespace fixup is done. The resulting output will be valid as an external entity.
EntityReference nodes are serialized as an entity reference of the form "&entityName;" in the output. Child nodes (the expansion) of the entity reference are ignored.
CDATA sections: if the parameter "split-cdata-sections" is set to true, CDATA sections are split, and the unrepresentable characters are serialized as numeric character references in ordinary content. The exact position and number of splits is not specified. If the parameter is set to false, unrepresentable characters in a CDATA section are reported as "invalid-data-in-cdata-section" errors. The error is not recoverable: there is no mechanism for supplying alternative characters and continuing with the serialization.
DocumentFragment nodes are serialized by serializing the children of the document fragment in the order they appear in the document fragment.
Note: The serialization of a Node does not always generate a well-formed XML document, i.e. a DOMParser might throw fatal errors when parsing the resulting serialization.
Within the character data of a document (outside of markup), any characters that cannot be represented directly are replaced with character references. Occurrences of '<' and '&' are replaced by the predefined entities &lt; and &amp;. The other predefined entities (&gt;, &apos;, and &quot;) might not be used, except where needed (e.g. using &gt; in cases such as ']]>'). Any characters that cannot be represented directly in the output character encoding are serialized as numeric character references.
To allow attribute values to contain both single and double quotes, the apostrophe or single-quote character (') may be represented as "&apos;", and the double-quote character (") as "&quot;". New line characters and other characters that cannot be represented directly in attribute values in the output character encoding are serialized as numeric character references.
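The escaping rules above can be observed with a small sketch. It does not use the draft DOMSerializer interface itself; as an assumption, it substitutes org.w3c.dom.ls.LSSerializer, the finalized DOM Level 3 Load and Save serializer shipped with the JDK, whose behavior may differ in detail from this draft. The class and method names (EscapingSketch, serialize) are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

class EscapingSketch {
    /** Builds <root attr="...">text</root> and returns its serialized form. */
    static String serialize(String text, String attrValue) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            root.setAttribute("attr", attrValue);
            root.setTextContent(text);
            doc.appendChild(root);

            // Assumption: the JDK's LSSerializer stands in for DOMSerializer.
            DOMImplementationLS ls = (DOMImplementationLS)
                    doc.getImplementation().getFeature("LS", "3.0");
            LSSerializer serializer = ls.createLSSerializer();
            return serializer.writeToString(doc);
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        // '<' and '&' in character data and '"' in the attribute value
        // should appear as predefined entities in the printed XML.
        System.out.println(serialize("a < b & c", "say \"hi\""));
    }
}
```

The input strings are held unescaped in the DOM; the entity references only appear in the serialized text.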
Within markup, but outside of attributes, any occurrence of a character that cannot be represented in the output character encoding is reported as an error. An example would be serializing the element <LaCañada/> with encoding="us-ascii".
When requested by setting the parameter "normalize-characters" on DOMSerializer to true, character normalization is performed according to the rules defined in [CharModel] on all data to be serialized, both markup and character data. The character normalization process affects only the data as it is being written; it does not alter the DOM's view of the document after serialization has completed.
When outputting Unicode data, whether or not a byte order mark is serialized, and whether the output is big-endian or little-endian, is implementation dependent.
Namespaces are fixed up during serialization: the serialization process verifies that namespace declarations, namespace prefixes and the namespace URIs associated with elements and attributes are consistent. If inconsistencies are found, the serialized form of the document will be altered to remove them. The method used for doing the namespace fixup while serializing a document is the algorithm defined in Appendix B.1, "Namespace normalization", of [DOM Level 3 Core].
Any changes made affect only the namespace prefixes and declarations appearing in the serialized data. The DOM's view of the document is not altered by the serialization operation, and does not reflect any changes made to namespace declarations or prefixes in the serialized output. We may take back what we say in the above paragraph depending on feedback from implementors, but for now the belief is that the DOM's view of the document is not changed during serialization.
While serializing a document, the parameter "discard-default-content" controls whether or not non-specified data is serialized.
While serializing, errors are reported to the application through the error handler (the "error-handler" parameter of DOMSerializer.config). This specification in no way tries to define all possible errors that can occur while serializing a DOM node, but some common error cases are defined. The types (DOMError.type) of errors and warnings defined by this specification are:
"invalid-data-in-cdata-section" [fatal]: Raised if the "split-cdata-sections" parameter is set to false and invalid data is encountered in a CDATA section.
"unsupported-encoding" [fatal]
"unbound-namespace-in-entity" [warning]: Raised if the relevant parameter is set to true and an unbound namespace prefix is encountered in a referenced entity.
"no-output-specified" [fatal]: Raised when writing to a DOMOutput if no output is specified in the DOMOutput.
In addition to raising the defined errors and warnings, implementations are expected to raise implementation-specific errors and warnings for any other error and warning cases, such as IO errors (file not found, permission denied, ...).
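As an illustration of the error reporting described above, the following sketch registers a handler through the "error-handler" configuration parameter and records the type and severity of anything reported. It assumes the JDK's org.w3c.dom.ls.LSSerializer and org.w3c.dom.DOMErrorHandler as stand-ins for the draft interfaces; ErrorHandlerSketch, reported, and serializeWithHandler are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMError;
import org.w3c.dom.DOMErrorHandler;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

class ErrorHandlerSketch {
    /** Entries of the form "type [severity]" collected during serialization. */
    static final List<String> reported = new ArrayList<>();

    static String severityName(short severity) {
        switch (severity) {
            case DOMError.SEVERITY_WARNING: return "warning";
            case DOMError.SEVERITY_ERROR: return "error";
            default: return "fatal";
        }
    }

    static String serializeWithHandler() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement("root"));

            // Assumption: LSSerializer stands in for the draft DOMSerializer.
            DOMImplementationLS ls = (DOMImplementationLS)
                    doc.getImplementation().getFeature("LS", "3.0");
            LSSerializer serializer = ls.createLSSerializer();

            DOMErrorHandler handler = error -> {
                reported.add(error.getType()
                        + " [" + severityName(error.getSeverity()) + "]");
                // Ask the serializer to continue only after warnings.
                return error.getSeverity() == DOMError.SEVERITY_WARNING;
            };
            serializer.getDomConfig().setParameter("error-handler", handler);
            return serializer.writeToString(doc);
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializeWithHandler());
        System.out.println("reported: " + reported);
    }
}
```

For a trivially well-formed document nothing should be reported; the handler matters when, for example, a CDATA section contains data that cannot be written.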
See also the Document Object Model (DOM) Level 3 Load and Save Specification.
Method Summary
org.apache.xerces.dom3.DOMConfiguration | getConfig() | The DOMConfiguration object used by the DOMSerializer when serializing a DOM node.
DOMSerializerFilter | getFilter() | When the application provides a filter, the serializer will call out to the filter before serializing each Node.
java.lang.String | getNewLine() | The end-of-line sequence of characters to be used in the XML being written out.
void | setFilter(DOMSerializerFilter filter) | When the application provides a filter, the serializer will call out to the filter before serializing each Node.
void | setNewLine(java.lang.String newLine) | The end-of-line sequence of characters to be used in the XML being written out.
boolean | write(Node node, DOMOutput destination) | Serialize the specified node as described above in the general description of the DOMSerializer interface.
java.lang.String | writeToString(Node node) | Serialize the specified node as described above in the general description of the DOMSerializer interface.
boolean | writeURI(Node node, java.lang.String URI) | Serialize the specified node as described above in the general description of the DOMSerializer interface.
Method Detail
public org.apache.xerces.dom3.DOMConfiguration getConfig()
The DOMConfiguration object used by the DOMSerializer when serializing a DOM node.
DOMConfiguration objects for DOMSerializer add, or modify, the following parameters:
"canonical-form"
    true: Setting this parameter to true will set the parameter "format-pretty-print" to false.
    false
"discard-default-content"
    true: Use the Attr.specified attribute to decide what attributes should be discarded. Note that some implementations might use whatever information is available to the implementation (i.e. XML schema, DTD, the Attr.specified attribute, and so on) to determine what attributes and content to discard if this parameter is set to true.
    false
"format-pretty-print"
    true: Setting this parameter to true will set the parameter "canonical-form" to false.
    false
"ignore-unknown-character-denormalizations"
    true: Raise an "unknown-character-denormalization" warning (instead of raising an error, if this parameter is not set) and ignore any possible denormalizations caused by these characters. IMO it would make sense to move this parameter into the DOM Level 3 Core spec, and the error/warning should be defined there.
    false
"normalize-characters"
    As defined for DOMConfiguration in [DOM Level 3 Core]. Unlike in the Core, the default value for this parameter is true. While DOM implementations are not required to support fully normalizing the characters in the document according to the rules defined in [CharModel] supplemented by the definitions of relevant constructs from Section 2.13 of [XML 1.1], this parameter must be activated by default if supported.
"xml-declaration"
    true: If a Document, Element, or Entity node is serialized, the XML declaration, or text declaration, should be included. The version (Document.xmlVersion if the document is a Level 3 document and the version is non-null, otherwise the value "1.0"), and possibly an encoding (DOMSerializer.encoding, or Document.actualEncoding or Document.xmlEncoding if the document is a Level 3 document), is specified in the serialized XML declaration.
    false: An "xml-declaration-needed" warning is reported if this will cause problems (i.e. the serialized data is of an XML version other than [XML 1.0], or an encoding would be needed to be able to re-parse the serialized data).
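A minimal sketch of setting these parameters follows. It assumes the JDK's org.w3c.dom.ls.LSSerializer, whose getDomConfig() plays the role of getConfig() in this draft; the set of supported parameters varies by implementation, so "format-pretty-print" is guarded with canSetParameter. ConfigSketch and serializeWithoutDeclaration are hypothetical names.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

class ConfigSketch {
    /** Serializes <root><child/></root> with the XML declaration suppressed. */
    static String serializeWithoutDeclaration(boolean pretty) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            root.appendChild(doc.createElement("child"));
            doc.appendChild(root);

            // Assumption: LSSerializer stands in for the draft DOMSerializer.
            DOMImplementationLS ls = (DOMImplementationLS)
                    doc.getImplementation().getFeature("LS", "3.0");
            LSSerializer serializer = ls.createLSSerializer();
            DOMConfiguration config = serializer.getDomConfig();

            // Omit the <?xml ...?> declaration from the output.
            config.setParameter("xml-declaration", Boolean.FALSE);

            // Pretty printing is optional; only enable it where supported.
            if (pretty && config.canSetParameter("format-pretty-print", Boolean.TRUE)) {
                config.setParameter("format-pretty-print", Boolean.TRUE);
            }
            return serializer.writeToString(doc);
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializeWithoutDeclaration(true));
    }
}
```

Checking canSetParameter first avoids a DOMException on implementations that do not recognize an optional parameter.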
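public java.lang.String getNewLine()
The end-of-line sequence of characters to be used in the XML being written out. A value of null selects a default end-of-line sequence. The default value is null.
public void setNewLine(java.lang.String newLine)
The end-of-line sequence of characters to be used in the XML being written out. A value of null selects a default end-of-line sequence. The default value is null.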
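The accessor pair can be exercised as in the sketch below, which assumes the JDK's org.w3c.dom.ls.LSSerializer as a stand-in for this draft interface. NewLineSketch and configuredNewLine are hypothetical names, and the sketch makes no claim about where a given implementation actually emits the sequence.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

class NewLineSketch {
    /** Sets the end-of-line sequence and reads it back from the serializer. */
    static String configuredNewLine(String newLine) {
        try {
            DOMImplementation impl = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().getDOMImplementation();
            // Assumption: LSSerializer stands in for the draft DOMSerializer.
            DOMImplementationLS ls =
                    (DOMImplementationLS) impl.getFeature("LS", "3.0");
            LSSerializer serializer = ls.createLSSerializer();
            serializer.setNewLine(newLine);
            return serializer.getNewLine();
        } catch (Exception e) {
            throw new IllegalStateException("could not create serializer", e);
        }
    }

    public static void main(String[] args) {
        String eol = configuredNewLine("\r\n");
        // Print the sequence in a visible form.
        System.out.println(eol.replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```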
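public DOMSerializerFilter getFilter()
When the application provides a filter, the serializer will call out to the filter before serializing each Node. The filter is invoked before the operations requested by the DOMConfiguration parameters have been applied. For example, CDATA sections are passed to the filter even if "cdata-sections" is set to false.
public void setFilter(DOMSerializerFilter filter)
When the application provides a filter, the serializer will call out to the filter before serializing each Node. The filter is invoked before the operations requested by the DOMConfiguration parameters have been applied. For example, CDATA sections are passed to the filter even if "cdata-sections" is set to false.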
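A filter that drops comments from the output might look like the following sketch. It assumes the JDK's org.w3c.dom.ls.LSSerializerFilter in place of the draft DOMSerializerFilter; note that the finalized interface may invoke the filter at a different point relative to the configuration parameters than this draft describes. FilterSketch and SkipComments are hypothetical names.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.w3c.dom.ls.LSSerializerFilter;
import org.w3c.dom.traversal.NodeFilter;

class FilterSketch {
    /** Rejects comment nodes and accepts everything else. */
    static final class SkipComments implements LSSerializerFilter {
        @Override
        public short acceptNode(Node node) {
            return node.getNodeType() == Node.COMMENT_NODE
                    ? NodeFilter.FILTER_REJECT
                    : NodeFilter.FILTER_ACCEPT;
        }

        @Override
        public int getWhatToShow() {
            // Ask to be consulted for every node type.
            return NodeFilter.SHOW_ALL;
        }
    }

    static String serializeWithoutComments() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            root.appendChild(doc.createComment("internal note"));
            root.appendChild(doc.createElement("item"));
            doc.appendChild(root);

            // Assumption: LSSerializer stands in for the draft DOMSerializer.
            DOMImplementationLS ls = (DOMImplementationLS)
                    doc.getImplementation().getFeature("LS", "3.0");
            LSSerializer serializer = ls.createLSSerializer();
            serializer.setFilter(new SkipComments());
            return serializer.writeToString(doc);
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializeWithoutComments());
    }
}
```

The filter only affects what is written; the comment node remains in the DOM tree.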
public boolean write(Node node, DOMOutput destination)
Serialize the specified node as described above in the general description of the DOMSerializer interface. The output is written to the supplied DOMOutput.
When writing to a DOMOutput, the encoding is found by looking at the encoding information that is reachable through the DOMOutput and the item to be written (or its owner document) in this order:
    DOMOutput.encoding,
    Document.actualEncoding,
    Document.xmlEncoding.
If no output is specified in the DOMOutput, a "no-output-specified" error is raised.
Parameters:
    node - The node to serialize.
    destination - The destination for the serialized DOM.
Returns:
    true if node was successfully serialized and false in case the node couldn't be serialized.
public boolean writeURI(Node node, java.lang.String URI)
Serialize the specified node as described above in the general description of the DOMSerializer interface. The output is written to the supplied URI.
The encoding is found by looking at the encoding information of the item to be written (or its owner document) in this order:
    Document.actualEncoding,
    Document.xmlEncoding.
Parameters:
    node - The node to serialize.
    URI - The URI to write to.
Returns:
    true if node was successfully serialized and false in case the node couldn't be serialized.
public java.lang.String writeToString(Node node) throws DOMException
Serialize the specified node as described above in the general description of the DOMSerializer interface. The output is written to a DOMString that is returned to the caller (this method completely ignores all the encoding information available).
Parameters:
    node - The node to serialize.
Returns:
    The serialized data, or null in case the node couldn't be serialized.
Throws:
    DOMException - DOMSTRING_SIZE_ERR: Raised if the resulting string is too long to fit in a DOMString.
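The difference between writing to an output object with an explicit encoding and writing to a string can be sketched as follows. As an assumption, the JDK's org.w3c.dom.ls.LSSerializer and LSOutput stand in for the draft DOMSerializer and DOMOutput; WriteSketch and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;

class WriteSketch {
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            root.setTextContent("hello");
            doc.appendChild(root);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("could not build document", e);
        }
    }

    static DOMImplementationLS loadSave(Document doc) {
        // Assumption: the JDK's LS implementation stands in for the draft API.
        return (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
    }

    /** Writes to a byte stream; the encoding set on the output is used. */
    static String writeAsUtf8(Document doc) {
        DOMImplementationLS ls = loadSave(doc);
        LSOutput output = ls.createLSOutput();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        output.setByteStream(bytes);
        output.setEncoding("UTF-8");
        LSSerializer serializer = ls.createLSSerializer();
        if (!serializer.write(doc, output)) {
            throw new IllegalStateException("node could not be serialized");
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Writes to a string; encoding information on the document is ignored. */
    static String writeAsString(Document doc) {
        return loadSave(doc).createLSSerializer().writeToString(doc);
    }

    public static void main(String[] args) {
        Document doc = sampleDocument();
        System.out.println(writeAsUtf8(doc));
        System.out.println(writeAsString(doc));
    }
}
```

With the JDK implementation, the string form declares UTF-16 in its XML declaration, which reflects the in-memory string encoding rather than any encoding recorded on the document.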