Cocoon Documentation
This document follows the operations of Cocoon from a "document point of view", while the javadoc documentation describes them from a "procedural point of view". It is therefore meant to complement the javadoc rather than simply repeat what is already stated there. Furthermore, since the ultimate documentation is the source code itself, this document does not go too deep and is instead meant to be read alongside the comments in the code. In fact, some people may find that reading the source code directly sheds more light than this (significantly incomplete) overview.
Unless otherwise specified, for the sake of brevity any class name is assumed to have the org.apache.cocoon prefix prepended to it.
The Cocoon class is the "main" class, whether Cocoon is used as a servlet or from the command line. Accordingly, it contains the init method for the servlet case and the main method for the command-line case.
The operations in the two common cases, command-line execution (typically used for offline site creation) and servlet usage, are described below.
When Cocoon is invoked from the command line, it requires as arguments the location of the cocoon.properties file, the name of the file containing the XML to be processed, and the name of the output file. After reading the properties file, it creates a new EngineWrapper initialized with those properties, then calls its handle method, passing it an output Writer and an input File. There is no good reason for this asymmetry - the command-line operation mode of Cocoon was coded quickly as a temporary hack to meet a popular need, in lieu of the better, more integrated and well-designed command-line support planned for Cocoon 2.
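The command-line flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Cocoon's actual code: the class names CommandLineSketch and EngineWrapperSketch, the constructor, and the exact handle signature are invented for the example. Only the order of steps (load the properties, create the wrapper, call handle with a Writer and a File) comes from the text.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Properties;

public class CommandLineSketch {

    /** Hypothetical stand-in for EngineWrapper; the real class drives the Engine. */
    public static class EngineWrapperSketch {
        // Held for illustration; a real wrapper would configure the Engine with it.
        private final Properties config;

        public EngineWrapperSketch(Properties config) {
            this.config = config;
        }

        /** Mirrors the asymmetry noted in the text: a Writer for output, a File for input. */
        public void handle(Writer out, File input) throws IOException {
            // A real wrapper would build fake request/response objects and call the Engine.
            // Here the input is simply copied through so that the sketch is runnable.
            byte[] bytes = Files.readAllBytes(input.toPath());
            out.write(new String(bytes, StandardCharsets.UTF_8));
            out.flush();
        }
    }

    /** Loads a properties file, the first step of the command-line entry point. */
    public static Properties loadProperties(File file) throws IOException {
        Properties props = new Properties();
        try (InputStream in = new FileInputStream(file)) {
            props.load(in);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("Usage: CommandLineSketch <properties> <input.xml> <output>");
            return;
        }
        Properties props = loadProperties(new File(args[0]));
        EngineWrapperSketch wrapper = new EngineWrapperSketch(props);
        try (Writer out = new FileWriter(args[2])) {
            wrapper.handle(out, new File(args[1]));
        }
    }
}
```

The three positional arguments follow the order given in the text: properties file, XML input file, output file.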
The EngineWrapper class is a "hack" which provides a "fake" implementation of the Servlet API methods that Cocoon needs, in the inner classes HttpServletRequestImpl and HttpServletResponseImpl. When Cocoon gets integrated with Stylebook, this class will probably need to be cleaned up. Basically, this class instantiates an Engine and passes it the "fake" request and response objects mentioned above.
As for any servlet, upon startup the init method is invoked. In Cocoon, this tries to load the cocoon.properties file and, if that is successful, creates an Engine instance.
Cocoon provides a service method, which accepts all incoming requests, whatever their type. Servlet programmers may be accustomed to writing doGet or doPost methods to handle different types of requests, which is fine for simple servlets; however, a service method is the best way to implement a fully generic servlet like Cocoon.
This class implements the engine that does all the document processing.
What better definition of the function of the Engine class than the words of its author (Stefano Mazzocchi)? This otherwise lapidary definition conveys how important the class is to Cocoon's operations, so it is worth reading the class carefully in order to understand the "big picture" of how Cocoon works.
Whether Cocoon is started from the command line or as a servlet, the Engine is instantiated at startup by the private Engine constructor. To understand Cocoon's operations, it is important to know that at this point (and only at this point in the whole lifespan of the Cocoon servlet) the objects that initialize the various components are instantiated with the parameters contained in the Configuration object. This is why changes to the cocoon.properties file have no effect on Cocoon until the engine is stopped and then restarted.
These objects either directly represent the components (such as logger.ServletLogger) or are Factories that provide the correct components for a particular request (such as processor.ProcessorFactory).
The long-winded setup code involved here reads class names from the cocoon.properties file and dynamically loads and configures the classes, thus allowing for easy "swapping in and out" of components without recompiling the whole of Cocoon.
In general, all components referenced here must be loadable at startup; otherwise Cocoon will refuse to initialize, even if the missing components are not actually used in the web application. Still, this is exactly the same situation as with a more conventional Java application which does not store class names in configuration files.
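The idea of reading a class name from configuration and loading it reflectively can be sketched as below. This is a minimal illustration, not Cocoon's setup code: the Component interface, the PlainLogger class, and the "logger.class" key are all invented for the example. It only shows the general technique (Class.forName plus reflective instantiation) and why a missing class stops initialization.

```java
import java.util.Properties;

public class ComponentLoaderSketch {

    /** Hypothetical component contract; Cocoon's own interfaces differ. */
    public interface Component {
        String name();
    }

    /** Example implementation that a properties file could name. */
    public static class PlainLogger implements Component {
        public String name() {
            return "plain-logger";
        }
    }

    /**
     * Looks up a class name under the given key and instantiates it reflectively.
     * Any failure is turned into an exception, so startup stops if a class is missing.
     */
    public static Component load(Properties config, String key) {
        String className = config.getProperty(key);
        if (className == null) {
            throw new IllegalStateException("Missing configuration key: " + key);
        }
        try {
            Class<?> type = Class.forName(className);
            return (Component) type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot load component " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        // The key name "logger.class" is invented for this example.
        config.setProperty("logger.class", PlainLogger.class.getName());
        System.out.println(load(config, "logger.class").name());
    }
}
```

Swapping a component then means changing one line of configuration rather than recompiling, at the cost of discovering a missing class only when the application starts.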
The handle method has already been mentioned and is indeed the focal point for all the runtime operations of Cocoon. It is invoked with two objects, the input HttpServletRequest and the output HttpServletResponse (just as in a servlet).
Until the whole page is done, it repeats the following process up to 10 times (the pipeline only needs to be repeated if an OutOfMemoryError occurs, in which case the cache is partially cleared and the pipeline restarted):
- Create a Page wrapper for caching purposes.
- Get a Producer from the ProducerFactory. The HTTP parameter "producer=myproducer" can be used to select the producer; if this parameter is not present, the default producer is used.
- Obtain an org.w3c.dom.Document.
- Set up the environment to pass various parameters to the processor pipeline.
- Apply the Processors (obtained from the ProcessorFactory), one for each processor invoked in the Document.
- Get the Formatter requested by the Document from the FormatterFactory.
- Fill the Page bean with content.
Finally,
Now, I suggest you take a deep breath and read the above steps again, since the simplicity of the algorithm is so beautiful that it deserves to be appreciated in depth and breadth.
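The retry behaviour mentioned above (restart the pipeline up to 10 times if an OutOfMemoryError occurs, clearing some of the cache in between) might look roughly like the following. This is a sketch under assumptions: the runWithRetry helper, the Supplier standing in for the pipeline, and the Runnable standing in for the cache cleanup are all hypothetical, and the real handle method is structured differently.

```java
import java.util.function.Supplier;

public class RetrySketch {

    /** Upper bound on attempts, as stated in the text. */
    static final int MAX_ATTEMPTS = 10;

    /**
     * Runs the pipeline and returns its result. If the pipeline throws
     * OutOfMemoryError, the cache is cleared and the pipeline is restarted,
     * up to MAX_ATTEMPTS times in total.
     */
    public static <T> T runWithRetry(Supplier<T> pipeline, Runnable clearCache) {
        OutOfMemoryError last = null;
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try {
                return pipeline.get();
            } catch (OutOfMemoryError e) {
                last = e;
                clearCache.run(); // free some memory before the next attempt
            }
        }
        throw last; // every attempt failed; report the most recent error
    }

    public static void main(String[] args) {
        String page = runWithRetry(() -> "<page/>", () -> { });
        System.out.println(page);
    }
}
```

In the common case the pipeline succeeds on the first attempt and the loop body runs exactly once.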
At this point the key elements are therefore the processors and the formatters, which operate directly upon the content of the Document. We are going to investigate them in detail. It should already be clear that one can have more than one Processor per Document and that these are applied sequentially, one after the other. This "chaining" of Processors is implemented in five lines of code (including debugging information). Again, simplicity and good coding style are assets of this implementation.
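To illustrate how small such a chaining loop can be, here is a sketch. It is not the code from Engine: the Processor interface shown here is hypothetical, and a plain String stands in for the org.w3c.dom.Document to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

public class ChainSketch {

    /** Hypothetical processor contract: take a document, return the transformed one. */
    public interface Processor {
        String process(String document);
    }

    /** Applies each processor in order, feeding each result to the next one. */
    public static String chain(String document, List<Processor> processors) {
        String current = document;
        for (Processor p : processors) {
            current = p.process(current);
        }
        return current;
    }

    public static void main(String[] args) {
        List<Processor> processors = new ArrayList<>();
        processors.add(doc -> doc.trim());                   // first stage
        processors.add(doc -> "<page>" + doc + "</page>");   // second stage
        System.out.println(chain("  hello  ", processors));  // prints <page>hello</page>
    }
}
```

Because every stage has the same input and output type, adding a new Processor does not require any change to the loop itself.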
Let us then have a look at what Processors and Formatters are, since these could be leveraged further and are likely to be extended with new components for specific needs.
For each source there must be an appropriate Producer implemented. Currently (version 1.8), only ProducerFromFile is implemented. This is because XSP provides the best solution (both in terms of ease-of-use and forward-compatibility with Cocoon 2) for nearly all dynamic content solutions, so there is usually no need to write a Producer explicitly.
For each processing instruction type there must be an appropriate Processor implemented. Currently (version 1.8), the following ones are implemented:
For each format in which the output should be delivered (e.g. PDF, TEXT, HTML, XML, XHTML), there must be an appropriate Formatter implemented. Currently (version 1.8), the following ones are distributed:
Clearly, one might imagine many more formatters such as
In Cocoon 1.8 all of the formatters provided are in fact implemented as simple "wrapper" classes (as can be easily seen by examining the source code in the formatters directory) which merely set the parameters to the Apache Serializers, or in the case of FO2PDF, Apache FOP, and then delegate the actual formatting to those classes. In a way, no "real work" actually goes on in the Formatter classes themselves. As you can see, Cocoon is a framework which tries not to reinvent the wheel too often!
If you're wondering why FO2PDF isn't a Processor instead of a Formatter, the answer is simple - it is conceptually more of a Processor (it transforms the entire document), but for one vital difference - it does not output XML. Yes, there is the workaround that XSP uses internally, which is to output one XML element with all the content inside that as a text node - but this method would be rather clunky for FO2PDF and would provide no real benefit.
Note that the CPU-intensive processing required for FO2PDF can be obviated by the use of newer XML-compliant graphics and document markup languages on the client side, such as SVG (Scalable Vector Graphics), or XSL:FO itself, which can just be written out as XML. This is definitely the future for dynamic web publishing, since the "rendering" of dozens of concurrent users' documents into PDF all on the server does not make any sense from a performance point of view - it is advantageous today of course because current popular browsers do not support XSL:FO or SVG natively, but in the future this will change.
In fact, XML markup languages like VoiceXML are supported by Cocoon by returning XML, and indeed in that case the parameter to cocoon-format is text/xml! In the case of VRML, the cocoon format is model/vrml, which in the cocoon.properties configuration file is mapped to TextFormatter.
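The mapping from a cocoon-format value to a formatter can be pictured as a simple configuration lookup. The sketch below is only an illustration: the "formatter.type." key prefix and the formatterFor helper are assumptions made for the example and are not taken from the actual cocoon.properties file. Only the two format values and the TextFormatter name come from the text.

```java
import java.util.Properties;

public class FormatLookupSketch {

    /** Key prefix for formatter entries; invented for this example. */
    static final String PREFIX = "formatter.type.";

    /** Returns the formatter name configured for the given format, or fails if none is mapped. */
    public static String formatterFor(Properties config, String format) {
        String name = config.getProperty(PREFIX + format);
        if (name == null) {
            throw new IllegalArgumentException("No formatter mapped for " + format);
        }
        return name;
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        // VRML is plain text on the wire, so it can reuse the text formatter.
        config.setProperty(PREFIX + "model/vrml", "TextFormatter");
        System.out.println(formatterFor(config, "model/vrml")); // prints TextFormatter
    }
}
```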
Copyright © 1999-2000 The Apache Software Foundation.
All rights reserved.