API Docs for SAX and DOM
 


SAXParser.hpp

/*
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 1999-2001 The Apache Software Foundation.  All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * 3. The end-user documentation included with the redistribution,
 *    if any, must include the following acknowledgment:
 *       "This product includes software developed by the
 *        Apache Software Foundation (http://www.apache.org/)."
 *    Alternately, this acknowledgment may appear in the software itself,
 *    if and wherever such third-party acknowledgments normally appear.
 *
 * 4. The names "Xerces" and "Apache Software Foundation" must
 *    not be used to endorse or promote products derived from this
 *    software without prior written permission. For written
 *    permission, please contact apache\@apache.org.
 *
 * 5. Products derived from this software may not be called "Apache",
 *    nor may "Apache" appear in their name, without prior written
 *    permission of the Apache Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED.  IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
 * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
 * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation, and was
 * originally based on software copyright (c) 1999, International
 * Business Machines, Inc., http://www.ibm.com .  For more information
 * on the Apache Software Foundation, please see
 * <http://www.apache.org/>.
 */

/*
 * $Log: SAXParser.hpp,v $
 * Revision 1.20  2001/08/01 19:11:02  tng
 * Add full schema constraint checking flag to the samples and the parser.
 *
 * Revision 1.19  2001/07/27 20:24:21  tng
 * put getScanner() back as they were there before, not to break existing apps.
 *
 * Revision 1.18  2001/07/16 12:52:09  tng
 * APIDocs fix: default for schema processing in DOMParser, IDOMParser, and SAXParser should be false.
 *
 * Revision 1.17  2001/06/23 14:13:16  tng
 * Remove getScanner from the Parser headers as this is not needed and Scanner is not internal class.
 *
 * Revision 1.16  2001/06/03 19:26:20  jberry
 * Add support for querying error count following parse; enables simple parse without requiring error handler.
 *
 * Revision 1.15  2001/05/11 13:26:22  tng
 * Copyright update.
 *
 * Revision 1.14  2001/05/03 19:09:25  knoaman
 * Support Warning/Error/FatalError messaging.
 * Validity constraints errors are treated as errors, with the ability by user to set
 * validity constraints as fatal errors.
 *
 * Revision 1.13  2001/03/30 16:46:57  tng
 * Schema: Use setDoSchema instead of setSchemaValidation which makes more sense.
 *
 * Revision 1.12  2001/03/21 21:56:09  tng
 * Schema: Add Schema Grammar, Schema Validator, and split the DTDValidator into DTDValidator, DTDScanner, and DTDGrammar.
 *
 * Revision 1.11  2001/02/15 15:56:29  tng
 * Schema: Add setSchemaValidation and getSchemaValidation for DOMParser and SAXParser.
 * Add feature "http://apache.org/xml/features/validation/schema" for SAX2XMLReader.
 * New data field  fSchemaValidation in XMLScanner as the flag.
 *
 * Revision 1.10  2001/01/12 21:23:41  tng
 * Documentation Enhancement: explain values of Val_Scheme
 *
 * Revision 1.9  2000/08/02 18:05:15  jpolast
 * changes required for sax2
 * (changed private members to protected)
 *
 * Revision 1.8  2000/04/12 22:58:30  roddey
 * Added support for 'auto validate' mode.
 *
 * Revision 1.7  2000/03/03 01:29:34  roddey
 * Added a scanReset()/parseReset() method to the scanner and
 * parsers, to allow for reset after early exit from a progressive parse.
 * Added calls to new Terminate() call to all of the samples. Improved
 * documentation in SAX and DOM parsers.
 *
 * Revision 1.6  2000/02/17 03:54:27  rahulj
 * Added some new getters to query the parser state and
 * clarified the documentation.
 *
 * Revision 1.5  2000/02/16 03:42:58  rahulj
 * Finished documenting the SAX Driver implementation.
 *
 * Revision 1.4  2000/02/15 04:47:37  rahulj
 * Documenting the SAXParser framework. Not done yet.
 *
 * Revision 1.3  2000/02/06 07:47:56  rahulj
 * Year 2K copyright swat.
 *
 * Revision 1.2  1999/12/15 19:57:48  roddey
 * Got rid of redundant 'const' on boolean return value. Some compilers choke
 * on this and its useless.
 *
 * Revision 1.1.1.1  1999/11/09 01:07:51  twl
 * Initial checkin
 *
 * Revision 1.6  1999/11/08 20:44:54  rahul
 * Swat for adding in Product name and CVS comment log variable.
 *
 */

#if !defined(SAXPARSER_HPP)
#define SAXPARSER_HPP

#include <sax/Parser.hpp>
#include <internal/VecAttrListImpl.hpp>
#include <framework/XMLDocumentHandler.hpp>
#include <framework/XMLElementDecl.hpp>
#include <framework/XMLEntityHandler.hpp>
#include <framework/XMLErrorReporter.hpp>
#include <validators/DTD/DocTypeHandler.hpp>

class DocumentHandler;
class EntityResolver;
class XMLPScanToken;
class XMLScanner;
class XMLValidator;


class SAXParser :

    public Parser
    , public XMLDocumentHandler
    , public XMLErrorReporter
    , public XMLEntityHandler
    , public DocTypeHandler
{
public :
    // -----------------------------------------------------------------------
    //  Class types
    // -----------------------------------------------------------------------
    enum ValSchemes
    {
        Val_Never
        , Val_Always
        , Val_Auto
    };


    // -----------------------------------------------------------------------
    //  Constructors and Destructor
    // -----------------------------------------------------------------------

    SAXParser(XMLValidator* const valToAdopt = 0);

    ~SAXParser();


    // -----------------------------------------------------------------------
    //  Getter methods
    // -----------------------------------------------------------------------

    DocumentHandler* getDocumentHandler();

    const DocumentHandler* getDocumentHandler() const;

    EntityResolver* getEntityResolver();

    const EntityResolver* getEntityResolver() const;

    ErrorHandler* getErrorHandler();

    const ErrorHandler* getErrorHandler() const;

    const XMLScanner& getScanner() const;

    const XMLValidator& getValidator() const;

    ValSchemes getValidationScheme() const;

    bool getDoSchema() const;

    bool getValidationSchemaFullChecking() const;

    int getErrorCount() const;

    bool getDoNamespaces() const;

    bool getExitOnFirstFatalError() const;

    bool getValidationConstraintFatal() const;


    // -----------------------------------------------------------------------
    //  Setter methods
    // -----------------------------------------------------------------------

    void setDoNamespaces(const bool newState);

    void setValidationScheme(const ValSchemes newScheme);

    void setDoSchema(const bool newState);

    void setValidationSchemaFullChecking(const bool schemaFullChecking);

    void setExitOnFirstFatalError(const bool newState);

    void setValidationConstraintFatal(const bool newState);


    // -----------------------------------------------------------------------
    //  Advanced document handler list maintenance methods
    // -----------------------------------------------------------------------

    void installAdvDocHandler(XMLDocumentHandler* const toInstall);

    bool removeAdvDocHandler(XMLDocumentHandler* const toRemove);


    // -----------------------------------------------------------------------
    //  Implementation of the SAX Parser interface
    // -----------------------------------------------------------------------

    virtual void parse(const InputSource& source, const bool reuseGrammar = false);

    virtual void parse(const XMLCh* const systemId, const bool reuseGrammar = false);

    virtual void parse(const char* const systemId, const bool reuseGrammar = false);

    virtual void setDocumentHandler(DocumentHandler* const handler);

    virtual void setDTDHandler(DTDHandler* const handler);

    virtual void setErrorHandler(ErrorHandler* const handler);

    virtual void setEntityResolver(EntityResolver* const resolver);


    // -----------------------------------------------------------------------
    //  Progressive scan methods
    // -----------------------------------------------------------------------

    bool parseFirst
    (
        const   XMLCh* const    systemId
        ,       XMLPScanToken&  toFill
        , const bool            reuseGrammar = false
    );

    bool parseFirst
    (
        const   char* const     systemId
        ,       XMLPScanToken&  toFill
        , const bool            reuseGrammar = false
    );

    bool parseFirst
    (
        const   InputSource&    source
        ,       XMLPScanToken&  toFill
        , const bool            reuseGrammar = false
    );

    bool parseNext(XMLPScanToken& token);

    void parseReset(XMLPScanToken& token);


    // -----------------------------------------------------------------------
    //  Implementation of the DocTypeHandler Interface
    // -----------------------------------------------------------------------

    virtual void attDef
    (
        const   DTDElementDecl& elemDecl
        , const DTDAttDef&      attDef
        , const bool            ignoring
    );

    virtual void doctypeComment
    (
        const   XMLCh* const    comment
    );

    virtual void doctypeDecl
    (
        const   DTDElementDecl& elemDecl
        , const XMLCh* const    publicId
        , const XMLCh* const    systemId
        , const bool            hasIntSubset
    );

    virtual void doctypePI
    (
        const   XMLCh* const    target
        , const XMLCh* const    data
    );

    virtual void doctypeWhitespace
    (
        const   XMLCh* const    chars
        , const unsigned int    length
    );

    virtual void elementDecl
    (
        const   DTDElementDecl& decl
        , const bool            isIgnored
    );

    virtual void endAttList
    (
        const   DTDElementDecl& elemDecl
    );

    virtual void endIntSubset();

    virtual void endExtSubset();

    virtual void entityDecl
    (
        const   DTDEntityDecl&  entityDecl
        , const bool            isPEDecl
        , const bool            isIgnored
    );

    virtual void resetDocType();

    virtual void notationDecl
    (
        const   XMLNotationDecl&    notDecl
        , const bool                isIgnored
    );

    virtual void startAttList
    (
        const   DTDElementDecl& elemDecl
    );

    virtual void startIntSubset();

    virtual void startExtSubset();

    virtual void TextDecl
    (
        const   XMLCh* const    versionStr
        , const XMLCh* const    encodingStr
    );


    // -----------------------------------------------------------------------
    //  Implementation of the XMLDocumentHandler interface
    // -----------------------------------------------------------------------

    virtual void docCharacters
    (
        const   XMLCh* const    chars
        , const unsigned int    length
        , const bool            cdataSection
    );

    virtual void docComment
    (
        const   XMLCh* const    comment
    );

    virtual void docPI
    (
        const   XMLCh* const    target
        , const XMLCh* const    data
    );

    virtual void endDocument();

    virtual void endElement
    (
        const   XMLElementDecl& elemDecl
        , const unsigned int    urlId
        , const bool            isRoot
    );

    virtual void endEntityReference
    (
        const   XMLEntityDecl&  entDecl
    );

    virtual void ignorableWhitespace
    (
        const   XMLCh* const    chars
        , const unsigned int    length
        , const bool            cdataSection
    );

    virtual void resetDocument();

    virtual void startDocument();

    virtual void startElement
    (
        const   XMLElementDecl&         elemDecl
        , const unsigned int            urlId
        , const XMLCh* const            elemPrefix
        , const RefVectorOf<XMLAttr>&   attrList
        , const unsigned int            attrCount
        , const bool                    isEmpty
        , const bool                    isRoot
    );

    virtual void startEntityReference
    (
        const   XMLEntityDecl&  entDecl
    );

    virtual void XMLDecl
    (
        const   XMLCh* const    versionStr
        , const XMLCh* const    encodingStr
        , const XMLCh* const    standaloneStr
        , const XMLCh* const    actualEncodingStr
    );


    // -----------------------------------------------------------------------
    //  Implementation of the XMLErrorReporter interface
    // -----------------------------------------------------------------------

    virtual void error
    (
        const   unsigned int                errCode
        , const XMLCh* const                msgDomain
        , const XMLErrorReporter::ErrTypes  errType
        , const XMLCh* const                errorText
        , const XMLCh* const                systemId
        , const XMLCh* const                publicId
        , const unsigned int                lineNum
        , const unsigned int                colNum
    );

    virtual void resetErrors();


    // -----------------------------------------------------------------------
    //  Implementation of the XMLEntityHandler interface
    // -----------------------------------------------------------------------

    virtual void endInputSource(const InputSource& inputSource);

    virtual bool expandSystemId
    (
        const   XMLCh* const    systemId
        ,       XMLBuffer&      toFill
    );

    virtual void resetEntities();

    virtual InputSource* resolveEntity
    (
        const   XMLCh* const    publicId
        , const XMLCh* const    systemId
    );

    virtual void startInputSource(const InputSource& inputSource);


    bool getDoValidation() const;

    void setDoValidation(const bool newState);


protected :
    // -----------------------------------------------------------------------
    //  Unimplemented constructors and operators
    // -----------------------------------------------------------------------
    SAXParser(const SAXParser&);
    void operator=(const SAXParser&);


    // -----------------------------------------------------------------------
    //  Protected data members
    //
    //  fAttrList
    //      A temporary implementation of the basic SAX attribute list
    //      interface. We use this one over and over on each startElement
    //      event to allow SAX-like access to the element attributes.
    //
    //  fDocHandler
    //      The installed SAX doc handler, if any. Null if none.
    //
    //  fDTDHandler
    //      The installed SAX DTD handler, if any. Null if none.
    //
    //  fElemDepth
    //      This is used to track the element nesting depth, so that we can
    //      know when we are inside content. This is so we can ignore char
    //      data outside of content.
    //
    //  fEntityResolver
    //      The installed SAX entity resolver, if any. Null if none.
    //
    //  fErrorHandler
    //      The installed SAX error handler, if any. Null if none.
    //
    //  fAdvDHCount
    //  fAdvDHList
    //  fAdvDHListSize
    //      This is an array of pointers to XMLDocumentHandlers, which is
    //      how we see installed advanced document handlers. There will
    //      usually not be very many at all, so a simple array is used
    //      instead of a collection, for performance. It will grow if needed,
    //      but that is unlikely.
    //
    //      The count is how many handlers are currently installed. The size
    //      is how big the array itself is (for expansion purposes.) When
    //      count == size, it is time to expand.
    //
    //  fParseInProgress
    //      This flag is set once a parse starts. It is used to prevent
    //      multiple entrance or reentrance of the parser.
    //
    //  fScanner
    //      The scanner being used by this parser. It is created internally
    //      during construction.
    //
    // -----------------------------------------------------------------------
    VecAttrListImpl         fAttrList;
    DocumentHandler*        fDocHandler;
    DTDHandler*             fDTDHandler;
    unsigned int            fElemDepth;
    EntityResolver*         fEntityResolver;
    ErrorHandler*           fErrorHandler;
    unsigned int            fAdvDHCount;
    XMLDocumentHandler**    fAdvDHList;
    unsigned int            fAdvDHListSize;
    bool                    fParseInProgress;
    XMLScanner*             fScanner;
};


// ---------------------------------------------------------------------------
//  SAXParser: Getter methods
// ---------------------------------------------------------------------------
inline DocumentHandler* SAXParser::getDocumentHandler()
{
    return fDocHandler;
}

inline const DocumentHandler* SAXParser::getDocumentHandler() const
{
    return fDocHandler;
}

inline EntityResolver* SAXParser::getEntityResolver()
{
    return fEntityResolver;
}

inline const EntityResolver* SAXParser::getEntityResolver() const
{
    return fEntityResolver;
}

inline ErrorHandler* SAXParser::getErrorHandler()
{
    return fErrorHandler;
}

inline const ErrorHandler* SAXParser::getErrorHandler() const
{
    return fErrorHandler;
}

inline const XMLScanner& SAXParser::getScanner() const
{
    return *fScanner;
}

#endif
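
The data member comments in the header describe the advanced document handler list as a plain array with a separate count and size, expanded when count == size. The sketch below illustrates that bookkeeping in isolation. It is not the Xerces implementation: Handler is a stand-in for XMLDocumentHandler, HandlerList and its method names are hypothetical, and the doubling growth policy is an assumption, since the header only says the array "will grow if needed".

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for XMLDocumentHandler (hypothetical, for illustration only).
struct Handler {};

// Hypothetical helper mirroring the fAdvDHCount / fAdvDHList / fAdvDHListSize
// trio described in the header comments.
class HandlerList
{
public :
    explicit HandlerList(const std::size_t initialSize = 2)
        : fCount(0)
        , fList(new Handler*[initialSize > 0 ? initialSize : 1]())
        , fListSize(initialSize > 0 ? initialSize : 1)
    {
    }

    ~HandlerList()
    {
        delete [] fList;    // the list does not own the handlers themselves
    }

    // Append a handler, expanding the array first when count == size.
    void install(Handler* const toInstall)
    {
        if (fCount == fListSize)
            expand();
        fList[fCount++] = toInstall;
    }

    // Remove the first matching handler and close the gap so that the
    // remaining handlers keep their installation order.
    bool remove(Handler* const toRemove)
    {
        for (std::size_t i = 0; i < fCount; ++i)
        {
            if (fList[i] == toRemove)
            {
                for (std::size_t j = i + 1; j < fCount; ++j)
                    fList[j - 1] = fList[j];
                fList[--fCount] = 0;
                return true;
            }
        }
        return false;
    }

    std::size_t count() const { return fCount; }
    std::size_t size() const { return fListSize; }

    Handler* at(const std::size_t index) const
    {
        assert(index < fCount);
        return fList[index];
    }

private :
    // Unimplemented copy operations, as in the header above.
    HandlerList(const HandlerList&);
    void operator=(const HandlerList&);

    // Assumed growth policy: double the array and copy the old pointers.
    void expand()
    {
        const std::size_t newSize = fListSize * 2;
        Handler** newList = new Handler*[newSize]();
        for (std::size_t i = 0; i < fCount; ++i)
            newList[i] = fList[i];
        delete [] fList;
        fList = newList;
        fListSize = newSize;
    }

    std::size_t fCount;
    Handler**   fList;
    std::size_t fListSize;
};
```

With an initial size of 2, a third install() call triggers one expansion. The bool result of remove() matches the shape of removeAdvDocHandler(), which also returns bool; treating it as "found and removed" is an assumption of this sketch.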


Copyright © 2000 The Apache Software Foundation. All Rights Reserved.