
HSSF Serializer

The HSSF Serializer catches SAX events and creates a spreadsheet in the XLS format used by Microsoft Excel (but the output looks just dandy in Gnumeric or OpenOffice.org as well). The HSSF Serializer expects data in the same tag language as the popular spreadsheet program Gnumeric. You can locate schemas and DTDs for this format on the Gnumeric website or in its CVS repository. One of the easiest ways to use the HSSF Serializer is to create a spreadsheet using Gnumeric and then refactor it into an XSLT page.

The HSSF Serializer supports most of the functionality supplied by the HSSF API which is part of the Jakarta POI project.

  • Name : xls
  • Class: org.apache.cocoon.serialization.HSSFSerializer
  • Cacheable: ?
Usage

Using the HSSF Serializer is fairly simple. You'll need a sitemap, of course. Once you have that, well, you're halfway there. Add

                        <map:serializer name="xls" src="org.apache.cocoon.serialization.HSSFSerializer" mime-type="application/vnd.ms-excel" locale="us"/>
                        

to the sitemap between the map:serializers tags. The locale is optional and is used only to validate numbers. Please note that numbers not in US-default format may not be compatible with Gnumeric (it's less cosmopolitan than the HSSF Serializer ;-) ). Setting the locale lets you use default number formats from other locales. Set this to a two-letter lowercase country code. See java.util.Locale for details.
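
For orientation, here is a sketch of where that declaration sits in a sitemap, this time with a non-US locale. The enclosing map:components element and the locale value "de" are assumptions for illustration only, and the other serializers are omitted.

                        <map:components>
                          <map:serializers>
                            <!-- other serializers omitted -->
                            <!-- locale="de" is an illustrative value, not taken from the Cocoon samples -->
                            <map:serializer name="xls" src="org.apache.cocoon.serialization.HSSFSerializer" mime-type="application/vnd.ms-excel" locale="de"/>
                          </map:serializers>
                        </map:components>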

Next, set up an entry for each URL or set of URLs (via matching rules) resembling this:

                         <map:match pattern="hello.xls">
                                <map:generate src="docs/samples/hello-page.xml"/>
                                <map:transform src="stylesheets/page/simple-page2xls.xsl"/>
                                <map:serialize type="xls"/>
                         </map:match>
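
If several spreadsheets share one stylesheet, a wildcard pattern can cover them with a single entry. This is only a sketch: it assumes Cocoon's wildcard matcher is the default matcher, in which case {1} stands for whatever the * matched, and it reuses the docs/samples/ layout from the example above.

                         <map:match pattern="*.xls">
                                <!-- {1} is replaced by the part of the request matched by * -->
                                <map:generate src="docs/samples/{1}.xml"/>
                                <map:transform src="stylesheets/page/simple-page2xls.xsl"/>
                                <map:serialize type="xls"/>
                         </map:match>

With this entry, a request for a hypothetical report.xls would read docs/samples/report.xml.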
                        

As for the stylesheets, it's best to look at the sources for the Cocoon samples. You'll find the HSSF Serializer examples under "Legacy Formats" from the main samples page. You can find the source under xml-cocoon2/src/webapp/samples/poi in the Cocoon sources.

Required XML format

It is suggested that you use the sample stylesheets as a template until you master the format. You'll probably want to graduate from that to using Gnumeric to create templates and adapt them into stylesheets.

HSSFSerializer assumes the XML it is serializing uses the same tag library as the Gnumeric spreadsheet from the Gnome project (http://www.gnome.org). At this time the serializer supports the tags used in Gnumeric versions 0.7 through 1.04. The schema for this format was created by POI committer Marc Johnson and can be found here. While HSSFSerializer ignores the bulk of the tags, it is suggested you provide at a minimum the basic tags in the example above. As a general rule, if Gnumeric 0.7-1.04 will load the XML (provided it is tar'd and gzipped as expected) then the HSSFSerializer should be able to handle it.

While you can simply output an XML document in this format via a file or some other process (servlet, etc.), you'll probably have existing documents you'd like to format into spreadsheets. The best way to do this is to create a stylesheet. There are a number of good books and references on XSLT. A good one is Michael Kay's aptly named XSLT Programmer's Reference from Wrox. Another is the XSLT book from O'Reilly. The first is good as a lookup reference; the second is good for concepts. You can also find a wealth of information at http://www.xml.org. The XSLT spec is surprisingly readable.
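
To give a feel for such a stylesheet, here is a minimal sketch, not a replacement for the Cocoon samples. It assumes a hypothetical input document shaped like page/title plus page/content/para, and it assumes the gmr prefix is bound to the Gnumeric namespace http://www.gnome.org/gnumeric/v7; check both against the sample stylesheets before relying on them. Each para becomes a string cell in column 0 of its own row.

                        <?xml version="1.0"?>
                        <xsl:stylesheet version="1.0"
                            xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                            xmlns:gmr="http://www.gnome.org/gnumeric/v7">

                          <!-- Assumed input: a page element with a title and content/para children -->
                          <xsl:template match="page">
                            <gmr:Workbook>
                              <gmr:Sheets>
                                <gmr:Sheet>
                                  <gmr:Name><xsl:value-of select="title"/></gmr:Name>
                                  <!-- Sheet dimension hint; the value here is illustrative -->
                                  <gmr:MaxCol>1</gmr:MaxCol>
                                  <gmr:Cells>
                                    <xsl:apply-templates select="content/para"/>
                                  </gmr:Cells>
                                </gmr:Sheet>
                              </gmr:Sheets>
                            </gmr:Workbook>
                          </xsl:template>

                          <!-- One cell per para; rows are zero-based, so subtract 1 from position().
                               ValueType="60" is assumed to mean "string" in Gnumeric's numbering. -->
                          <xsl:template match="para">
                            <gmr:Cell Col="0" Row="{position() - 1}" ValueType="60">
                              <gmr:Content><xsl:value-of select="."/></gmr:Content>
                            </gmr:Cell>
                          </xsl:template>
                        </xsl:stylesheet>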

As a general guideline, these tags are not ignored in this release:

  • gmr:Workbook - Required, basically the root element
  • gmr:Sheets - Required
  • gmr:Sheet - Required for each sheet
  • gmr:Name - Defines the sheet's name as it appears on the little tabs under the workbook in your favorite GUI spreadsheet application.
  • gmr:MaxCol - Used to set the dimensions for the sheet. This can be wrong and your spreadsheet program may not care but some other ports depend upon this, so we set it to be compatible.
  • gmr:Cols - Used to determine the default column width
  • gmr:ColInfo - Used to set the column width for a specific column. (the attribute is a misnomer, the unit is characters)
  • gmr:Rows - Used to set the default row height in points.
  • gmr:RowInfo - Used to set the row height for a specific row in points.
  • gmr:Cells - Required
  • gmr:Cell - defines the actual column and row number as well as the data type.
  • gmr:Content - defines the start of the value contained in the cell. This is obsolete as of Gnumeric 1.03. It's still supported by HSSF, but may not be in future versions.
  • gmr:Styles - required if StyleRegion is used. (parent collection node)
  • gmr:StyleRegion - defines a style for a region of the spreadsheet.
  • gmr:Style - Specifies the actual style for a StyleRegion. The HAlign attribute specifies the horizontal alignment and may have a value of 1 (General), 2 (Left), 4 (Right), 8 (Center), 16 (Fill), 32 (Justify), or 64 (Center across selection). VAlign specifies the vertical alignment (Top, Bottom, Center, and Justify are 1, 2, 4, and 8 respectively). WrapText specifies whether to wrap text or not (0 = don't, 1 = do). Shade is a kinda stupid flag: if you're setting a background color and want it filled, use Shade="1". Format is the number format to use. Generally, Excel and Gnumeric have the same formats.
  • Font - Font is a child of the StyleRegion. The attributes are pretty darned obvious. If you don't know what Bold or Italic mean then, well, we can't help you here.
  • StyleBorder - StyleBorder is a child of the Style tag. It contains child elements for each border a cell could have (Top, Bottom, Left, Right, Diagonal), and each of these specifies a border Style and optionally a Color.
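
Put together, the tags above might be arranged as in the following sketch of a one-sheet workbook holding a label and a number. The namespace URI, the attribute names that the list does not spell out (DefaultSizePts, No, Unit, Col, Row, ValueType, startCol and friends), the element order, and the ValueType codes (60 for a string, 40 for a float) follow the Gnumeric file format as best recalled and should be treated as assumptions; Font and StyleBorder are left out for brevity. Verify the details against the Gnumeric schema or the sample stylesheets.

                        <gmr:Workbook xmlns:gmr="http://www.gnome.org/gnumeric/v7">
                          <gmr:Sheets>
                            <gmr:Sheet>
                              <gmr:Name>Totals</gmr:Name>
                              <gmr:MaxCol>2</gmr:MaxCol>
                              <gmr:Styles>
                                <!-- Center the first row horizontally (HAlign 8) and vertically (VAlign 4) -->
                                <gmr:StyleRegion startCol="0" startRow="0" endCol="1" endRow="0">
                                  <gmr:Style HAlign="8" VAlign="4" WrapText="0" Shade="0" Format="General"/>
                                </gmr:StyleRegion>
                              </gmr:Styles>
                              <!-- Default column width, then an explicit width for column 0 -->
                              <gmr:Cols DefaultSizePts="48">
                                <gmr:ColInfo No="0" Unit="20"/>
                              </gmr:Cols>
                              <!-- Default row height in points, then an explicit height for row 0 -->
                              <gmr:Rows DefaultSizePts="12.8">
                                <gmr:RowInfo No="0" Unit="15"/>
                              </gmr:Rows>
                              <gmr:Cells>
                                <gmr:Cell Col="0" Row="0" ValueType="60"><gmr:Content>Total</gmr:Content></gmr:Cell>
                                <gmr:Cell Col="1" Row="0" ValueType="40"><gmr:Content>42.5</gmr:Content></gmr:Cell>
                              </gmr:Cells>
                            </gmr:Sheet>
                          </gmr:Sheets>
                        </gmr:Workbook>
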
Future Features

So HSSF Serializer is well on its way to being darn near everything you need to create fancy-schmancy reports in Excel or OpenOffice. (And you can just XML-serialize the output from your stylesheets for a Gnumeric version.)

  • Add support for formulas. (not yet supported by HSSF)
  • Add support for custom data formats. (not yet supported by HSSF)
Copyright © 1999-2002 The Apache Software Foundation. All Rights Reserved.