public class BaseFilter extends BaseParser implements XMLFilter, ContentHandler, DTDHandler, ErrorHandler, LexicalHandler
Modifier and Type | Field and Description |
---|---|
protected XMLReader | m_parent |
ACCEPT_CHARSET, CONTENT_CHARSET, DEFAULT_CHARSET, m_contentHandler, m_dtdHandler, m_entityResolver, m_errorHandler, m_features, m_lexicalHandler, m_properties, m_recognizedFeatures, m_recognizedProperties, PROPERTY_LEXICAL_HANDLER
Constructor and Description |
---|
BaseFilter() |
Modifier and Type | Method and Description |
---|---|
void | characters(char[] ch, int start, int length): Receive notification of character data. |
void | comment(char[] ch, int start, int length): Report an XML comment anywhere in the document. |
void | endCDATA(): Report the end of a CDATA section. |
void | endDocument(): Receive notification of the end of a document. |
void | endDTD(): Report the end of DTD declarations. |
void | endElement(String uri, String localName, String qName): Receive notification of the end of an element. |
void | endEntity(String name): Report the end of an entity. |
void | endPrefixMapping(String prefix): End the scope of a prefix-URI mapping. |
void | error(SAXParseException exception): Receive notification of a recoverable error. |
void | fatalError(SAXParseException exception): Receive notification of a non-recoverable error. |
XMLReader | getParent(): Get the parent reader. |
void | ignorableWhitespace(char[] ch, int start, int length): Receive notification of ignorable whitespace in element content. |
void | notationDecl(String name, String publicId, String systemId): Receive notification of a notation declaration event. |
void | parse(InputSource source): Parse an XML document. |
void | parse(String systemId): Parse an XML document from a system identifier (URI). |
void | processingInstruction(String target, String data): Receive notification of a processing instruction. |
void | setContentHandler(ContentHandler handler): Allow an application to register a content event handler. |
void | setDocumentLocator(Locator locator): Receive an object for locating the origin of SAX document events. |
void | setDTDHandler(DTDHandler handler): Allow an application to register a DTD event handler. |
void | setEntityResolver(EntityResolver resolver): Allow an application to register an entity resolver. |
void | setErrorHandler(ErrorHandler handler): Allow an application to register an error event handler. |
void | setParent(XMLReader parent): Set the parent reader. |
void | setProperty(String name, Object value): Set the value of a property. |
void | skippedEntity(String name): Receive notification of a skipped entity. |
void | startCDATA(): Report the start of a CDATA section. |
void | startDocument(): Receive notification of the beginning of a document. |
void | startDTD(String name, String publicId, String systemId): Report the start of DTD declarations, if any. |
void | startElement(String uri, String localName, String qName, Attributes atts): Receive notification of the beginning of an element. |
void | startEntity(String name): Report the beginning of some internal and external XML entities. |
void | startPrefixMapping(String prefix, String uri): Begin the scope of a prefix-URI Namespace mapping. |
void | unparsedEntityDecl(String name, String publicId, String systemId, String notationName): Receive notification of an unparsed entity declaration event. |
void | warning(SAXParseException exception): Receive notification of a warning. |
getCharacterStream, getContentHandler, getDTDHandler, getEntityResolver, getErrorHandler, getFeature, getProperty, setFeature
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
getContentHandler, getDTDHandler, getEntityResolver, getErrorHandler, getFeature, getProperty, setFeature
protected XMLReader m_parent
public void parse(InputSource source) throws IOException, SAXException
The application can use this method to instruct the XML reader to begin parsing an XML document from any valid input source (a character stream, a byte stream, or a URI).
Applications may not invoke this method while a parse is in progress (they should create a new XMLReader instead for each nested XML document). Once a parse is complete, an application may reuse the same XMLReader object, possibly with a different input source.
During the parse, the XMLReader will provide information about the XML document through the registered event handlers.
This method is synchronous: it will not return until parsing has ended. If a client application wants to terminate parsing early, it should throw an exception.
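The early-termination rule can be sketched as follows. This is a minimal illustration, not BaseFilter's own code: it uses the JDK's bundled SAX parser as the XMLReader, and the names EarlyStopDemo, StopParsing and rootName are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class EarlyStopDemo {
    /** Hypothetical marker exception, used only to abort the parse. */
    static final class StopParsing extends SAXException {
        StopParsing() { super("stop requested by application"); }
    }

    /** Returns the qualified name of the root element and stops parsing right after it. */
    static String rootName(String xml) {
        final String[] found = new String[1];
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts)
                        throws SAXException {
                    found[0] = qName;
                    throw new StopParsing(); // parse() is synchronous, so this unwinds it
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (StopParsing expected) {
            // Deliberate early exit: nothing to do.
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return found[0];
    }

    public static void main(String[] args) {
        System.out.println(rootName("<order><item/></order>"));
    }
}
```

Using a dedicated exception type lets the caller tell a deliberate stop apart from a genuine parse failure.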
parse
in interface XMLReader
source
- The input source for the top-level of the XML document.
SAXException
- Any SAX exception, possibly wrapping another exception.
IOException
- An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
InputSource, parse(String), setEntityResolver(org.xml.sax.EntityResolver), setDTDHandler(org.xml.sax.DTDHandler), setContentHandler(org.xml.sax.ContentHandler), setErrorHandler(org.xml.sax.ErrorHandler)
public void parse(String systemId) throws IOException, SAXException
This method is a shortcut for the common case of reading a document from a system identifier. It is the exact equivalent of the following:
parse(new InputSource(systemId));
If the system identifier is a URL, it must be fully resolved by the application before it is passed to the parser.
parse
in interface XMLReader
parse
in class BaseParser
systemId
- The system identifier (URI).
SAXException
- Any SAX exception, possibly wrapping another exception.
IOException
- An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
parse(InputSource)
public void setProperty(String name, Object value) throws SAXNotRecognizedException, SAXNotSupportedException
The property name is any fully-qualified URI. It is possible for an XMLReader to recognize a property name but to be unable to change the current value. Some property values may be immutable or mutable only in specific contexts, such as before, during, or after a parse.
XMLReaders are not required to recognize setting any specific property names, though a core set is defined by SAX2.
This method is also the standard mechanism for setting extended handlers.
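As a hedged sketch of registering an extended handler through setProperty, the example below installs a LexicalHandler under the standard SAX2 property URI. It assumes the JDK's bundled parser rather than BaseFilter, and it is only an assumption that the inherited PROPERTY_LEXICAL_HANDLER constant names the same URI. LexicalPropertyDemo and comments are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

class LexicalPropertyDemo {
    // Standard SAX2 property identifier for the lexical handler.
    static final String LEXICAL_HANDLER = "http://xml.org/sax/properties/lexical-handler";

    /** Collects the text of every comment reported by the reader. */
    static List<String> comments(String xml) {
        List<String> found = new ArrayList<>();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // Extended handlers have no dedicated setter; they are set as properties.
            reader.setProperty(LEXICAL_HANDLER, new DefaultHandler2() {
                @Override
                public void comment(char[] ch, int start, int length) {
                    found.add(new String(ch, start, length));
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Includes SAXNotRecognizedException and SAXNotSupportedException.
            throw new IllegalStateException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(comments("<a><!-- note --></a>"));
    }
}
```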
setProperty
in interface XMLReader
setProperty
in class BaseParser
name
- The property name, which is a fully-qualified URI.
value
- The requested value for the property.
SAXNotRecognizedException
- If the property value can't be assigned or retrieved.
SAXNotSupportedException
- When the XMLReader recognizes the property name but cannot set the requested value.
public void setEntityResolver(EntityResolver resolver)
If the application does not register an entity resolver, the XMLReader will perform its own default resolution.
Applications may register a new or different resolver in the middle of a parse, and the SAX parser must begin using the new resolver immediately.
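A minimal sketch of a custom resolver, assuming the JDK's bundled parser as the reader: it records each requested system identifier and substitutes an empty stream so that no external resource is fetched. ResolverDemo, requestedSystemIds and the example.invalid URL are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class ResolverDemo {
    /** Parses the document and returns the system identifiers the parser asked to resolve. */
    static List<String> requestedSystemIds(String xml) {
        List<String> requested = new ArrayList<>();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setEntityResolver(new EntityResolver() {
                @Override
                public InputSource resolveEntity(String publicId, String systemId) {
                    requested.add(systemId);
                    // Supply an empty replacement instead of opening the URL.
                    return new InputSource(new StringReader(""));
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return requested;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE a SYSTEM \"http://example.invalid/a.dtd\"><a/>";
        System.out.println(requestedSystemIds(xml));
    }
}
```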
setEntityResolver
in interface XMLReader
setEntityResolver
in class BaseParser
resolver
- The entity resolver.
BaseParser.getEntityResolver()
public void setDTDHandler(DTDHandler handler)
If the application does not register a DTD handler, all DTD events reported by the SAX parser will be silently ignored.
Applications may register a new or different handler in the middle of a parse, and the SAX parser must begin using the new handler immediately.
setDTDHandler
in interface XMLReader
setDTDHandler
in class BaseParser
handler
- The DTD handler.
BaseParser.getDTDHandler()
public void setContentHandler(ContentHandler handler)
If the application does not register a content handler, all content events reported by the SAX parser will be silently ignored.
Applications may register a new or different handler in the middle of a parse, and the SAX parser must begin using the new handler immediately.
setContentHandler
in interface XMLReader
setContentHandler
in class BaseParser
handler
- The content handler.
BaseParser.getContentHandler()
public void setErrorHandler(ErrorHandler handler)
If the application does not register an error handler, all error events reported by the SAX parser will be silently ignored; however, normal processing may not continue. It is highly recommended that all SAX applications implement an error handler to avoid unexpected bugs.
Applications may register a new or different handler in the middle of a parse, and the SAX parser must begin using the new handler immediately.
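The recommendation to always install an error handler can be illustrated with a collecting handler. This sketch assumes the JDK's bundled parser with DTD validation switched on; CollectingErrorHandlerDemo and validityProblems are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

class CollectingErrorHandlerDemo {
    /** Validates against the document's DTD and returns the recoverable problems reported. */
    static List<String> validityProblems(String xml) {
        List<String> problems = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // validity violations are reported via error()
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) {
                    problems.add("warning: " + e.getMessage());
                }
                @Override public void error(SAXParseException e) {
                    problems.add("error: " + e.getMessage()); // record and let parsing continue
                }
                @Override public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // the document is unusable, so give up
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        // <a> is declared to contain a <b>, but is empty: a validity error.
        String xml = "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b EMPTY>]><a/>";
        validityProblems(xml).forEach(System.out::println);
    }
}
```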
setErrorHandler
in interface XMLReader
setErrorHandler
in class BaseParser
handler
- The error handler.
BaseParser.getErrorHandler()
public void setParent(XMLReader parent)
This method allows the application to link the filter to a parent reader (which may be another filter). The argument may not be null.
public XMLReader getParent()
This method allows the application to query the parent reader (which may be another filter). It is generally a bad idea to perform any operations on the parent reader directly: they should all pass through this filter.
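The parent/filter relationship can be sketched as below. Because this page does not show how BaseFilter is packaged, the example substitutes the standard org.xml.sax.helpers.XMLFilterImpl as a stand-in filter and the JDK's bundled parser as the parent; FilterChainDemo and upperCasedNames are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class FilterChainDemo {
    /** Parses through a filter that upper-cases element names before the application sees them. */
    static List<String> upperCasedNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader parent = factory.newSAXParser().getXMLReader();

            XMLFilterImpl filter = new XMLFilterImpl() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts)
                        throws SAXException {
                    super.startElement(uri, localName.toUpperCase(Locale.ROOT),
                            qName.toUpperCase(Locale.ROOT), atts);
                }
                @Override
                public void endElement(String uri, String localName, String qName) throws SAXException {
                    super.endElement(uri, localName.toUpperCase(Locale.ROOT),
                            qName.toUpperCase(Locale.ROOT));
                }
            };
            filter.setParent(parent); // link the filter to its event source

            // Handlers are registered on the filter, never on the parent directly.
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    names.add(localName);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(upperCasedNames("<a><b/></a>"));
    }
}
```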
public void setDocumentLocator(Locator locator)
SAX parsers are strongly encouraged (though not absolutely required) to supply a locator. A parser that does so must supply the locator to the application by invoking this method before invoking any of the other methods in the ContentHandler interface.
The locator allows the application to determine the end position of any document-related event, even if the parser is not reporting an error. Typically, the application will use this information for reporting its own errors (such as character content that does not match an application's business rules). The information returned by the locator is probably not sufficient for use with a search engine.
Note that the locator will return correct information only during the invocation of the events in this interface. The application should not attempt to use it at any other time.
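A sketch of the usual locator pattern: keep the reference passed to setDocumentLocator and query it only from inside other callbacks. It assumes the JDK's bundled parser, which supplies a locator; LocatorDemo and elementLines are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class LocatorDemo {
    /** Returns "name@line" for each start tag, or line -1 if no locator was supplied. */
    static List<String> elementLines(String xml) {
        List<String> lines = new ArrayList<>();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                private Locator locator; // stays null if the parser provides none

                @Override
                public void setDocumentLocator(Locator locator) {
                    this.locator = locator;
                }

                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    // Only valid while a callback is running.
                    int line = (locator == null) ? -1 : locator.getLineNumber();
                    lines.add(qName + "@" + line);
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(elementLines("<a>\n<b/>\n</a>"));
    }
}
```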
setDocumentLocator
in interface ContentHandler
locator
- An object that can return the location of any SAX document event.
Locator
public void startDocument() throws SAXException
The SAX parser will invoke this method only once, before any other event callbacks (except for setDocumentLocator).
startDocument
in interface ContentHandler
SAXException
- Any SAX exception, possibly wrapping another exception.
endDocument()
public void endDocument() throws SAXException
The SAX parser will invoke this method only once, and it will be the last method invoked during the parse. The parser shall not invoke this method until it has either abandoned parsing (because of an unrecoverable error) or reached the end of input.
endDocument
in interface ContentHandler
SAXException
- Any SAX exception, possibly wrapping another exception.
startDocument()
public void startPrefixMapping(String prefix, String uri) throws SAXException
The information from this event is not necessary for normal Namespace processing: the SAX XML reader will automatically replace prefixes for element and attribute names when the http://xml.org/sax/features/namespaces feature is true (the default).
There are cases, however, when applications need to use prefixes in character data or in attribute values, where they cannot safely be expanded automatically; the start/endPrefixMapping event supplies the information to the application to expand prefixes in those contexts itself, if necessary.
Note that start/endPrefixMapping events are not guaranteed to be properly nested relative to each other: all startPrefixMapping events will occur immediately before the corresponding startElement event, and all endPrefixMapping events will occur immediately after the corresponding endElement event, but their order is not otherwise guaranteed.
There should never be start/endPrefixMapping events for the "xml" prefix, since it is predeclared and immutable.
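The case of a prefix used inside an attribute value can be sketched as follows, assuming the JDK's bundled parser in namespace-aware mode. The document shape (a type attribute holding a prefixed name) and the names PrefixScopeDemo and expandedType are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class PrefixScopeDemo {
    /** Expands the first "type" attribute value, e.g. "t:widget", to "{namespace-uri}widget". */
    static String expandedType(String xml) {
        final String[] result = new String[1];
        final Map<String, Deque<String>> scopes = new HashMap<>(); // prefix -> stack of URIs
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // prefix mapping events require namespace processing
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startPrefixMapping(String prefix, String uri) {
                    scopes.computeIfAbsent(prefix, k -> new ArrayDeque<>()).push(uri);
                }

                @Override
                public void endPrefixMapping(String prefix) {
                    Deque<String> uris = scopes.get(prefix);
                    if (uris != null && !uris.isEmpty()) uris.pop();
                }

                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    String value = atts.getValue("type");
                    if (value == null || result[0] != null) return;
                    int colon = value.indexOf(':');
                    String prefix = colon < 0 ? "" : value.substring(0, colon);
                    Deque<String> uris = scopes.get(prefix);
                    String ns = (uris == null || uris.isEmpty()) ? "" : uris.peek();
                    result[0] = "{" + ns + "}" + value.substring(colon + 1);
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(expandedType("<root xmlns:t=\"urn:types\"><item type=\"t:widget\"/></root>"));
    }
}
```

A stack per prefix keeps the lookup correct when an inner element redeclares a prefix that an outer element already bound.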
startPrefixMapping
in interface ContentHandler
prefix
- The Namespace prefix being declared. An empty string is used for the default element namespace, which has no prefix.
uri
- The Namespace URI the prefix is mapped to.
SAXException
- The client may throw an exception during processing.
endPrefixMapping(java.lang.String), startElement(java.lang.String, java.lang.String, java.lang.String, org.xml.sax.Attributes)
public void endPrefixMapping(String prefix) throws SAXException
See startPrefixMapping for details. These events will always occur immediately after the corresponding endElement event, but the order of endPrefixMapping events is not otherwise guaranteed.
endPrefixMapping
in interface ContentHandler
prefix
- The prefix that was being mapped. This is the empty string when a default mapping scope ends.
SAXException
- The client may throw an exception during processing.
startPrefixMapping(java.lang.String, java.lang.String), endElement(java.lang.String, java.lang.String, java.lang.String)
public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException
The Parser will invoke this method at the beginning of every element in the XML document; there will be a corresponding endElement event for every startElement event (even when the element is empty). All of the element's content will be reported, in order, before the corresponding endElement event.
This event allows up to three name components for each element: the Namespace URI, the local name, and the qualified (prefixed) name.
Any or all of these may be provided, depending on the values of the http://xml.org/sax/features/namespaces and the http://xml.org/sax/features/namespace-prefixes properties: the Namespace URI and local name are required when the namespaces property is true (the default), and are optional when it is false (if one is specified, both must be); the qualified name is required when the namespace-prefixes property is true, and is optional when it is false (the default).
Note that the attribute list provided will contain only attributes with explicit values (specified or defaulted): #IMPLIED attributes will be omitted. The attribute list will contain attributes used for Namespace declarations (xmlns* attributes) only if the http://xml.org/sax/features/namespace-prefixes property is true (it is false by default, and support for a true value is optional).
Like characters(), attribute values may have characters that need more than one char value.
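To make the three name components concrete, the sketch below prints them for a root element. It assumes the JDK's bundled parser, which happens to supply the qualified name in both modes; other readers may leave it empty. ElementNamesDemo and rootNames are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ElementNamesDemo {
    /** Returns "uri|localName|qName" as reported for the root element. */
    static String rootNames(String xml, boolean namespaceAware) {
        final String[] out = new String[1];
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(namespaceAware); // toggles Namespace processing
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    if (out[0] == null) {
                        out[0] = uri + "|" + localName + "|" + qName;
                    }
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out[0];
    }

    public static void main(String[] args) {
        String xml = "<p:a xmlns:p=\"urn:x\"/>";
        System.out.println(rootNames(xml, true));
        System.out.println(rootNames(xml, false));
    }
}
```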
startElement
in interface ContentHandler
uri
- The Namespace URI, or the empty string if the element has no Namespace URI or if Namespace processing is not being performed.
localName
- The local name (without prefix), or the empty string if Namespace processing is not being performed.
qName
- The qualified name (with prefix), or the empty string if qualified names are not available.
atts
- The attributes attached to the element. If there are no attributes, it shall be an empty Attributes object.
SAXException
- Any SAX exception, possibly wrapping another exception.
endElement(java.lang.String, java.lang.String, java.lang.String), Attributes
public void endElement(String uri, String localName, String qName) throws SAXException
The SAX parser will invoke this method at the end of every element in the XML document; there will be a corresponding startElement event for every endElement event (even when the element is empty).
For information on the names, see startElement.
endElement
in interface ContentHandler
uri
- The Namespace URI, or the empty string if the element has no Namespace URI or if Namespace processing is not being performed.
localName
- The local name (without prefix), or the empty string if Namespace processing is not being performed.
qName
- The qualified XML 1.0 name (with prefix), or the empty string if qualified names are not available.
SAXException
- Any SAX exception, possibly wrapping another exception.
public void characters(char[] ch, int start, int length) throws SAXException
The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information.
The application must not attempt to read from the array outside of the specified range.
Individual characters may consist of more than one Java char value. There are two important cases where this happens, because characters can't be represented in just sixteen bits. In one case, characters are represented in a Surrogate Pair, using two special Unicode values. Such characters are in the so-called "Astral Planes", with a code point above U+FFFF. A second case involves composite characters, such as a base character combining with one or more accent characters.
Your code should not assume that algorithms using char-at-a-time idioms will be working in character units; in some cases they will split characters. This is relevant wherever XML permits arbitrary characters, such as attribute values, processing instruction data, and comments as well as in data reported from this method. It's also generally relevant whenever Java code manipulates internationalized text; the issue isn't unique to XML.
Note that some parsers will report whitespace in element content using the ignorableWhitespace method rather than this one (validating parsers must do so).
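Two consequences of the above can be sketched in code: text may arrive in several chunks, so it should be accumulated, and one character may occupy two char values, so lengths should be counted in code points. The example assumes the JDK's bundled parser; CharactersDemo and text are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class CharactersDemo {
    /** Concatenates all character data in the document, however the parser chunks it. */
    static String text(String xml) {
        final StringBuilder buffer = new StringBuilder();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    // Read only the specified range; the rest of the array is off limits.
                    buffer.append(ch, start, length);
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(text("<t>a&amp;b</t>"));

        String astral = text("<t>&#x1F600;</t>"); // one character above U+FFFF
        System.out.println(astral.length());                            // char values
        System.out.println(astral.codePointCount(0, astral.length()));  // characters
    }
}
```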
characters
in interface ContentHandler
ch
- The characters from the XML document.
start
- The start position in the array.
length
- The number of characters to read from the array.
SAXException
- Any SAX exception, possibly wrapping another exception.
ignorableWhitespace(char[], int, int), Locator
public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException
Validating Parsers must use this method to report each chunk of whitespace in element content (see the W3C XML 1.0 recommendation, section 2.10): non-validating parsers may also use this method if they are capable of parsing and using content models.
SAX parsers may return all contiguous whitespace in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity, so that the Locator provides useful information.
The application must not attempt to read from the array outside of the specified range.
ignorableWhitespace
in interface ContentHandler
ch
- The characters from the XML document.
start
- The start position in the array.
length
- The number of characters to read from the array.
SAXException
- Any SAX exception, possibly wrapping another exception.
characters(char[], int, int)
public void processingInstruction(String target, String data) throws SAXException
The Parser will invoke this method once for each processing instruction found: note that processing instructions may occur before or after the main document element.
A SAX parser must never report an XML declaration (XML 1.0, section 2.8) or a text declaration (XML 1.0, section 4.3.1) using this method.
Like characters(), processing instruction data may have characters that need more than one char value.
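A short sketch, assuming the JDK's bundled parser, showing that processing instructions before and after the document element are reported while the XML declaration is not. ProcessingInstructionDemo and targets are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ProcessingInstructionDemo {
    /** Returns the target of every processing instruction in document order. */
    static List<String> targets(String xml) {
        List<String> seen = new ArrayList<>();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void processingInstruction(String target, String data) {
                    seen.add(target); // data may be null or empty when none was supplied
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?><?style href=\"a.css\"?><r/><?after?>";
        System.out.println(targets(xml));
    }
}
```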
processingInstruction
in interface ContentHandler
target
- The processing instruction target.
data
- The processing instruction data, or null if none was supplied. The data does not include any whitespace separating it from the target.
SAXException
- Any SAX exception, possibly wrapping another exception.
public void skippedEntity(String name) throws SAXException
The Parser will invoke this method each time an entity is skipped. Non-validating processors may skip entities if they have not seen the declarations (because, for example, the entity was declared in an external DTD subset). All processors may skip external entities, depending on the values of the http://xml.org/sax/features/external-general-entities and the http://xml.org/sax/features/external-parameter-entities properties.
skippedEntity
in interface ContentHandler
name
- The name of the skipped entity. If it is a parameter entity, the name will begin with '%', and if it is the external DTD subset, it will be the string "[dtd]".
SAXException
- Any SAX exception, possibly wrapping another exception.
public void notationDecl(String name, String publicId, String systemId) throws SAXException
It is up to the application to record the notation for later reference, if necessary; notations may appear as attribute values and in unparsed entity declarations, and are sometimes used with processing instruction target names.
At least one of publicId and systemId must be non-null. If a system identifier is present, and it is a URL, the SAX parser must resolve it fully before passing it to the application through this event.
There is no guarantee that the notation declaration will be reported before any unparsed entities that use it.
notationDecl
in interface DTDHandler
name
- The notation name.
publicId
- The notation's public identifier, or null if none was given.
systemId
- The notation's system identifier, or null if none was given.
SAXException
- Any SAX exception, possibly wrapping another exception.
unparsedEntityDecl(java.lang.String, java.lang.String, java.lang.String, java.lang.String), Attributes
public void unparsedEntityDecl(String name, String publicId, String systemId, String notationName) throws SAXException
Note that the notation name corresponds to a notation reported by the notationDecl event.
It is up to the application to record the entity for later reference, if necessary; unparsed entities may appear as attribute values.
If the system identifier is a URL, the parser must resolve it fully before passing it to the application.
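Recording notation and unparsed entity declarations for later reference might look like the sketch below. It assumes the JDK's bundled parser; the DTD content and the names DtdHandlerDemo and declarations are hypothetical. System identifiers are left out of the recorded strings because the parser may resolve them to absolute URIs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.DTDHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class DtdHandlerDemo {
    /** Returns a description of each notation and unparsed entity declaration reported. */
    static List<String> declarations(String xml) {
        List<String> seen = new ArrayList<>();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setDTDHandler(new DTDHandler() {
                @Override
                public void notationDecl(String name, String publicId, String systemId) {
                    seen.add("notation " + name);
                }

                @Override
                public void unparsedEntityDecl(String name, String publicId,
                                               String systemId, String notationName) {
                    seen.add("entity " + name + " uses " + notationName);
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE r ["
                + "<!NOTATION png SYSTEM \"image/png\">"
                + "<!ENTITY logo SYSTEM \"logo.png\" NDATA png>"
                + "<!ELEMENT r EMPTY>"
                + "]><r/>";
        declarations(xml).forEach(System.out::println);
    }
}
```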
unparsedEntityDecl
in interface DTDHandler
name
- The unparsed entity's name.
publicId
- The entity's public identifier, or null if none was given.
systemId
- The entity's system identifier.
notationName
- The name of the associated notation.
SAXException
- Any SAX exception, possibly wrapping another exception.
notationDecl(java.lang.String, java.lang.String, java.lang.String), Attributes
public void warning(SAXParseException exception) throws SAXException
SAX parsers will use this method to report conditions that are not errors or fatal errors as defined by the XML 1.0 recommendation. The default behaviour is to take no action.
The SAX parser must continue to provide normal parsing events after invoking this method: it should still be possible for the application to process the document through to the end.
Filters may use this method to report other, non-XML warnings as well.
warning
in interface ErrorHandler
exception
- The warning information encapsulated in a SAX parse exception.
SAXException
- Any SAX exception, possibly wrapping another exception.
SAXParseException
public void error(SAXParseException exception) throws SAXException
This corresponds to the definition of "error" in section 1.2 of the W3C XML 1.0 Recommendation. For example, a validating parser would use this callback to report the violation of a validity constraint. The default behaviour is to take no action.
The SAX parser must continue to provide normal parsing events after invoking this method: it should still be possible for the application to process the document through to the end. If the application cannot do so, then the parser should report a fatal error even if the XML 1.0 recommendation does not require it to do so.
Filters may use this method to report other, non-XML errors as well.
error
in interface ErrorHandler
exception
- The error information encapsulated in a SAX parse exception.
SAXException
- Any SAX exception, possibly wrapping another exception.
SAXParseException
public void fatalError(SAXParseException exception) throws SAXException
This corresponds to the definition of "fatal error" in section 1.2 of the W3C XML 1.0 Recommendation. For example, a parser would use this callback to report the violation of a well-formedness constraint.
The application must assume that the document is unusable after the parser has invoked this method, and should continue (if at all) only for the sake of collecting additional error messages: in fact, SAX parsers are free to stop reporting any other events once this method has been invoked.
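The sketch below feeds a document that is not well-formed to the JDK's bundled parser (an assumption; other readers may behave differently after a fatal error). The handler only counts the callback and does not rethrow, and the parse call still ends with an exception. FatalErrorDemo and parseMalformed are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class FatalErrorDemo {
    /** Parses the input and summarizes how the fatal error surfaced. */
    static String parseMalformed(String xml) {
        final int[] fatalErrors = new int[1];
        boolean parseThrew = false;
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void fatalError(SAXParseException e) {
                    fatalErrors[0]++; // record the problem without rethrowing
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            parseThrew = true; // the parser gave up on the document
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return "fatalErrors=" + fatalErrors[0] + ", parseThrew=" + parseThrew;
    }

    public static void main(String[] args) {
        System.out.println(parseMalformed("<a><b></a>")); // mismatched end tag
    }
}
```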
fatalError
in interface ErrorHandler
exception
- The error information encapsulated in a
SAX parse exception.SAXException
- Any SAX exception, possibly
wrapping another exception.SAXParseException
public void startDTD(String name, String publicId, String systemId) throws SAXException
This method is intended to report the beginning of the DOCTYPE declaration; if the document has no DOCTYPE declaration, this method will not be invoked.
All declarations reported through DTDHandler or DeclHandler events must appear between the startDTD and endDTD events. Declarations are assumed to belong to the internal DTD subset unless they appear between startEntity and endEntity events. Comments and processing instructions from the DTD should also be reported between the startDTD and endDTD events, in their original order of (logical) occurrence; they are not required to appear in their correct locations relative to DTDHandler or DeclHandler events, however.
Note that the start/endDTD events will appear within the start/endDocument events from ContentHandler and before the first startElement event.
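The ordering described above can be observed with a handler that logs both lexical and content events. This sketch assumes the JDK's bundled parser and the standard lexical-handler property URI; DtdEventsDemo and events are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

class DtdEventsDemo {
    /** Logs startDTD, endDTD and startElement events in the order they arrive. */
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            DefaultHandler2 handler = new DefaultHandler2() {
                @Override
                public void startDTD(String name, String publicId, String systemId) {
                    log.add("startDTD " + name);
                }

                @Override
                public void endDTD() {
                    log.add("endDTD");
                }

                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    log.add("startElement " + qName);
                }
            };
            reader.setContentHandler(handler);
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(events("<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]><note>x</note>"));
    }
}
```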
startDTD
in interface LexicalHandler
name
- The document type name.
publicId
- The declared public identifier for the external DTD subset, or null if none was declared.
systemId
- The declared system identifier for the external DTD subset, or null if none was declared. (Note that this is not resolved against the document base URI.)
SAXException
- The application may raise an exception.
endDTD(), startEntity(java.lang.String)
public void endDTD() throws SAXException
This method is intended to report the end of the DOCTYPE declaration; if the document has no DOCTYPE declaration, this method will not be invoked.
endDTD
in interface LexicalHandler
SAXException
- The application may raise an exception.
startDTD(java.lang.String, java.lang.String, java.lang.String)
public void startEntity(String name) throws SAXException
The reporting of parameter entities (including the external DTD subset) is optional, and SAX2 drivers that report LexicalHandler events may not implement it; you can use the http://xml.org/sax/features/lexical-handler/parameter-entities feature to query or control the reporting of parameter entities.
General entities are reported with their regular names, parameter entities have '%' prepended to their names, and the external DTD subset has the pseudo-entity name "[dtd]".
When a SAX2 driver is providing these events, all other events must be properly nested within start/end entity events. There is no additional requirement that events from DeclHandler or DTDHandler be properly ordered.
Note that skipped entities will be reported through the skippedEntity event, which is part of the ContentHandler interface.
Because of the streaming event model that SAX uses, some entity boundaries cannot be reported under any circumstances: general entities within attribute values, and parameter entities within declarations.
These will be silently expanded, with no indication of where the original entity boundaries were.
Note also that the boundaries of character references (which are not really entities anyway) are not reported.
All start/endEntity events must be properly nested.
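As a hedged illustration of entity boundary reporting, the sketch below logs start/endEntity events for an internal general entity used in element content. It assumes the JDK's bundled parser reports these boundaries through the standard lexical-handler property; EntityBoundaryDemo and entityEvents are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

class EntityBoundaryDemo {
    /** Logs the entity boundaries reported while parsing. */
    static List<String> entityEvents(String xml) {
        List<String> log = new ArrayList<>();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            DefaultHandler2 handler = new DefaultHandler2() {
                @Override
                public void startEntity(String name) {
                    log.add("start " + name);
                }

                @Override
                public void endEntity(String name) {
                    log.add("end " + name);
                }
            };
            reader.setContentHandler(handler);
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE r [<!ENTITY who \"World\">]><r>Hello &who;</r>";
        System.out.println(entityEvents(xml));
    }
}
```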
startEntity
in interface LexicalHandler
name
- The name of the entity. If it is a parameter entity, the name will begin with '%', and if it is the external DTD subset, it will be "[dtd]".
SAXException
- The application may raise an exception.
endEntity(java.lang.String), DeclHandler.internalEntityDecl(java.lang.String, java.lang.String), DeclHandler.externalEntityDecl(java.lang.String, java.lang.String, java.lang.String)
public void endEntity(String name) throws SAXException
endEntity
in interface LexicalHandler
name
- The name of the entity that is ending.
SAXException
- The application may raise an exception.
startEntity(java.lang.String)
public void startCDATA() throws SAXException
The contents of the CDATA section will be reported through the regular characters event; this event is intended only to report the boundary.
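Since the boundary events carry no text themselves, a handler typically toggles a flag and lets characters do the work. A minimal sketch, assuming the JDK's bundled parser and the standard lexical-handler property URI; CdataDemo and cdataText are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

class CdataDemo {
    /** Returns only the character data that appeared inside CDATA sections. */
    static String cdataText(String xml) {
        final StringBuilder inside = new StringBuilder();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            DefaultHandler2 handler = new DefaultHandler2() {
                private boolean inCdata;

                @Override
                public void startCDATA() {
                    inCdata = true;
                }

                @Override
                public void endCDATA() {
                    inCdata = false;
                }

                @Override
                public void characters(char[] ch, int start, int length) {
                    if (inCdata) {
                        inside.append(ch, start, length);
                    }
                }
            };
            reader.setContentHandler(handler);
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return inside.toString();
    }

    public static void main(String[] args) {
        System.out.println(cdataText("<r>a<![CDATA[<b>]]>c</r>"));
    }
}
```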
startCDATA
in interface LexicalHandler
SAXException
- The application may raise an exception.
endCDATA()
public void endCDATA() throws SAXException
endCDATA
in interface LexicalHandler
SAXException
- The application may raise an exception.
startCDATA()
public void comment(char[] ch, int start, int length) throws SAXException
This callback will be used for comments inside or outside the document element, including comments in the external DTD subset (if read). Comments in the DTD must be properly nested inside start/endDTD and start/endEntity events (if used).
comment
in interface LexicalHandler
ch
- An array holding the characters in the comment.
start
- The starting position in the array.
length
- The number of characters to use from the array.
SAXException
- The application may raise an exception.