MS RFC 62: Support Additional WFS GetFeature Output Formats

Date:

2010/10/07

Author:

Frank Warmerdam

Contact:

warmerdam@pobox.com

Status:

Adopted

Description: This RFC proposes to extend the WFS GetFeature request to support formats other than GML. Support for template query formatting (RFC 36) is introduced, as well as support for output through OGR. Output formats are controlled via appropriate OUTPUTFORMAT declarations.

WFS GetFeature Changes

1) If the WFS OUTPUTFORMAT value is not one of the existing supported values, then the set of outputFormatObjs for the map will be searched for one with a matching name or mime type and an image mode of MS_IMAGEMODE_FEATURE.

2) If the wfs_getfeature_formatlist metadata exists on the layer, it should be a comma-delimited list of the output formats permitted for this layer. Any format not in the list will not be selectable with the OUTPUTFORMAT parameter to GetFeature. Each value in the list is the “NAME” from an OUTPUTFORMAT declaration. As a fallback, the metadata may be set on the map instead of the layer.

3) The existing GML2/GML3 generation support will be retained in essentially its current form within mapwfs.c, but its preamble and postfix generation will be moved into separate functions to reduce the “gml clutter” in the main msWFSGetFeature() function.

4) If an outputFormatObj is being used instead of GML output, the output will be generated by a call to msReturnTemplateQuery(), which already supports output via the new template engine (RFC 36) as well as callouts to other renderers.
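
As a minimal sketch of item 2, the fragment below restricts one layer to two output formats and sets a map-level fallback for layers without their own list. The layer name and data source are assumptions chosen for illustration, and the format names must match the NAME of OUTPUTFORMAT declarations present in the same mapfile.

```mapfile
# Fragment only: other MAP and LAYER settings are omitted.
WEB
  METADATA
    # Map-level fallback, used by layers without their own list.
    "wfs_getfeature_formatlist" "CSV"
  END
END

LAYER
  NAME "popplace"   # hypothetical layer name
  TYPE POINT
  DATA "popplace"   # hypothetical shapefile
  METADATA
    # Only these formats may be requested for this layer.
    "wfs_getfeature_formatlist" "CSV,OGRGML"
  END
END
```

With a configuration like this, a GetFeature request for the popplace layer with OUTPUTFORMAT=CSV or OUTPUTFORMAT=OGRGML would be served through the matching outputFormatObj, while any other OUTPUTFORMAT declaration would not be selectable for that layer.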

WFS GetCapabilities Changes

In WFS 1.1.0 mode the GetCapabilities document lists the set of formats allowed for the OUTPUTFORMAT parameter. As part of this development it will be extended to show all the legal output formats for all of the layers (based on wfs_getfeature_formatlist metadata).

Note that in WFS 1.0.0 there is no mechanism to discover the format list on a per-layer basis, only the overall list. WFS 1.1.0 supports an overall list and a per-layer list.

outputFormatObj

A new image mode value will be added, MS_IMAGEMODE_FEATURE, intended for formats that are feature-oriented and cannot be meaningfully seen as an image. RFC 36 style template output format declarations will now default to this image mode. The WFS GetFeature request will only support output formats of this image mode.

In addition, a new renderer value, MS_RENDER_WITH_OGR, will be added for OGR output.
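
For illustration, a template-based format declaration in the style of RFC 36 might look like the following. The format name, mime type and template file path are hypothetical. The IMAGEMODE line is written out only to make the new mode visible, since under this proposal it would be the default for template formats.

```mapfile
OUTPUTFORMAT
  NAME "JSON_TEMPLATE"                          # hypothetical format name
  DRIVER "TEMPLATE"
  MIMETYPE "application/json"                   # assumed mime type
  IMAGEMODE FEATURE                             # corresponds to MS_IMAGEMODE_FEATURE
  FORMATOPTION "FILE=templates/features.json"   # hypothetical template file
END
```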

OGR OUTPUTFORMAT Declarations

The OGR renderer will support the following FORMATOPTION declarations:

DSCO:*

Anything prefixed by DSCO: is used as a dataset creation option with the OGR driver.

LCO:*

Anything prefixed by LCO: is used as a layer creation option.

FORM=simple/zip/multipart

Indicates whether the result should be a simple single file (simple), a mime multipart attachment (multipart) or a zip file (zip). “zip” is the default.

STORAGE=memory/filesystem/stream

Indicates where the datasource should be stored while being written. “filesystem” is the default.

If “memory” then it will be created in /vsimem/ - but this is only suitable for drivers supporting VSI*L, which cannot easily be determined automatically.

If “filesystem” then a temporary directory will be created under the IMAGEPATH where the file(s) will be written and then read back to stream to the client.

If “stream” then the datasource will be created with the name “/vsistdout/” as an attempt to write directly to stdout. Only a few OGR drivers will work properly in this mode (e.g. CSV, perhaps KML and GML).

COMPRESSION=none/gzip

Indicates whether gzip compression should be applied as the result is returned. This is generally not suitable for use with FORM=zip, which already compresses the result. “none” is the default.

GEOMTYPE=None/Unknown/Point/LineString/Polygon/GeometryCollection/MultiPoint/MultiLineString/MultiPolygon

This sets the OGR geometry type of the output layer created. It defaults to “Unknown”.

FILENAME=name

Provides a name for the datasource created, default is “result.dat”.

Examples:

OUTPUTFORMAT
  NAME "CSV"
  DRIVER "OGR/CSV"
  MIMETYPE "text/csv"
  FORMATOPTION "LCO:GEOMETRY=AS_WKT"
  FORMATOPTION "STORAGE=memory"
  FORMATOPTION "FORM=simple"
  FORMATOPTION "FILENAME=result.csv"
END

OUTPUTFORMAT
  NAME "OGRGML"
  DRIVER "OGR/GML"
  FORMATOPTION "STORAGE=filesystem"
  FORMATOPTION "FORM=multipart"
  FORMATOPTION "FILENAME=result.gml"
END

OUTPUTFORMAT
  NAME "SHAPEZIP"
  DRIVER "OGR/ESRI Shapefile"
  FORMATOPTION "STORAGE=memory"
  FORMATOPTION "FORM=zip"
  FORMATOPTION "FILENAME=result.zip"
END
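
The examples above do not cover STORAGE=stream. A hedged sketch of a streamed CSV format follows; the format name is hypothetical, and whether a given OGR driver writes correctly to stdout depends on the driver, as noted under the STORAGE option.

```mapfile
OUTPUTFORMAT
  NAME "CSVSTREAM"                    # hypothetical format name
  DRIVER "OGR/CSV"
  MIMETYPE "text/csv"
  FORMATOPTION "LCO:GEOMETRY=AS_WKT"  # layer creation option passed to the CSV driver
  FORMATOPTION "STORAGE=stream"       # write directly to stdout, no intermediate file
  FORMATOPTION "FORM=simple"          # single file, no zip or multipart wrapper
  FORMATOPTION "FILENAME=result.csv"
END
```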

OGR Renderer Implementation

The OGR Renderer will be implemented as the function msOGRWriteFromQuery() in the new file mapogroutput.c. It will create a new OGR datasource based on the information in the output format declaration. It will then create an output layer for each map layer with an active, non-empty query resultcache. Those resultcache shapes will be written.

  • The gml_include_items and gml_exclude_items rules will be used to decide what attributes should be written to the output layers.

  • The gml field name aliasing mechanism will be supported for OGR output.

  • The gml_[item]_type metadata will be supported for OGR output.

  • The gml_[item]_width (new) metadata will be supported for OGR output.

  • The gml_[item]_precision (new) metadata will be supported for OGR output.

  • When gml_[item]_type metadata is not available then all fields will be created with a type of OFTString.

  • Some features, such as zip output, will only be supported with GDAL/OGR 1.8 or later. These features will be made conditional on the GDAL/OGR version.

Geometry Types Supported

In MapServer we have POINT, LINE and POLYGON layers, which also allow for features with multiple points, lines or polygons. However, in the OGC Simple Feature geometry model used by OGR, a point layer and a multipoint layer are quite distinct. The same applies to LineString and MultiLineString layers, and to Polygon and MultiPolygon layers. When features written to a layer could be of distinct types like this, the only way we can create the layer is with geometry type “wkbUnknown” - that is, with no distinct geometry type.

Some drivers, such as the OGR shapefile driver, will establish a specific geometry type based on the first geometry encountered and will discard, with an error, features of an unsupported geometry type. Other drivers support mixtures of geometry types.

The GEOMTYPE FORMATOPTION makes it possible to force creation of the layer with a particular geometry type.
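
As a sketch of how this could be used, the declaration below forces the output layer to a fixed geometry type for shapefile output rather than leaving the driver to infer it from the first feature. The format name is hypothetical, and the choice of MultiPolygon is only an example that assumes the source layers are polygon layers.

```mapfile
OUTPUTFORMAT
  NAME "SHAPEZIP_POLY"                  # hypothetical format name
  DRIVER "OGR/ESRI Shapefile"
  FORMATOPTION "STORAGE=memory"
  FORMATOPTION "FORM=zip"
  FORMATOPTION "FILENAME=result.zip"
  FORMATOPTION "GEOMTYPE=MultiPolygon"  # overrides the default of Unknown
END
```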

Attribute Field Definitions

For OGR output it is highly desirable to be able to create the output fields with the appropriate datatype, width and precision to reflect the source feature definition. In the past, for GML output, the gml_[item]_type metadata provided a mechanism for users to manually set the field type in generated GML, as there was no supported mechanism to discover it automatically.

As a further step, for the OGR output mechanism it is planned to offer support for two new metadata items, gml_[item]_width and gml_[item]_precision, which set the width and precision (number of decimal places) of fields.

It is not immediately planned to utilize the field width and precision information in the GML output though that may prove useful. The width and precision metadata will be parsed by msGMLGetItems() and added into the gmlItemObj structure. Values of “0” indicate a value is not known.
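
A hedged example of manually declared field definitions is shown below. The item names (name, population, area_km2) and the widths are invented for illustration; the type values are those already accepted by gml_[item]_type, and the width and precision items are the new ones proposed here.

```mapfile
# Fragment only: belongs inside a LAYER definition.
METADATA
  "gml_include_items"      "name,population,area_km2"  # hypothetical item names
  "gml_name_type"          "Character"
  "gml_name_width"         "40"
  "gml_population_type"    "Integer"
  "gml_population_width"   "10"
  "gml_area_km2_type"      "Real"
  "gml_area_km2_width"     "12"
  "gml_area_km2_precision" "3"    # number of decimal places
END
```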

gml_types auto

Setting the field type information for all fields in a layer is tedious and error-prone. As a further extension, it is intended to support a mechanism to automatically discover field type, width and precision for some data sources. This will be requested by setting the “gml_types” metadata item to a value of “auto”. Layer type implementations will take this as a hint to set the various metadata items on the layer that define each field’s type, width and precision when the GetItems layer method is called.

As part of the development effort for this RFC, support for automatic field definition discovery will be implemented for “OGR”, “POSTGIS”, “ORACLE” and “SHAPEFILE” layers. The maintainers of other feature sources can implement support if and when it is convenient.
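
In a mapfile, requesting automatic discovery would look roughly like the following layer metadata fragment. The use of gml_include_items "all" is an assumption made so that every discovered field is written.

```mapfile
# Fragment only: belongs inside a LAYER definition.
METADATA
  "gml_include_items" "all"
  "gml_types"         "auto"   # ask the layer implementation to discover type, width and precision
END
```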

Use of CPL Services

Support for WFS OGR output depends on USE_OGR being defined. When it is available, the GDAL/OGR CPL (Common Portability Library) services are also available. These include the virtual file system interface (VSI*L), which provides services like “in memory” files. The /vsimem/ services will be used to avoid writing intermediate files to disk if STORAGE=memory is configured.

Note that the /vsimem/ service is already being used for WMS layers to avoid having to write images from remote WMS servers to disk, in WCS to avoid writing GDAL-written images to disk, and in WMS when using GDAL-based output formats. So this isn’t really new.

In addition, CPL includes support for gzip compression via the /vsigzip/ virtual filesystem handler.

In the past GDAL has not had support for writing zip files; a feature that is very desirable as a mechanism for both compressing and grouping multi-file sets produced by many OGR drivers. As part of the development undertaken for this RFC it is intended that GDAL will be extended with a CPL API for writing zip files, or alternatively a /vsizip/ virtual file system mechanism to write files.

Backwards Compatibility Issues

There are no apparent backward compatibility issues with this proposal. Its effect is limited primarily to cases where new OUTPUTFORMAT values are used in a WFS GetFeature request.

Security Implications

There are no apparent security implications to this proposal.

Further Considerations

  • It should be noted that the DescribeFeatureType WFS operation will still only support GML-formatted descriptions of feature types.

  • A prototype implementation has been developed (by Frank Warmerdam) and is working though it is not quite feature complete, and will need some refinement before it can be committed.

  • It is convenient to use the metadata and other structure originally intended to drive GML production but it does cause some confusion when “gml” metadata items are used to drive non-GML output.

Outstanding Issues

  • How do we alter msReturnTemplateQuery() to support the WFS limit on the number of features, and the offset to the first feature?

  • What are the implications of queries in circumstances other than WFS GetFeature selecting an OGR output format?

Testing

The msautotest/wxs suite will be extended with a few tests of OGR output. These will likely revolve around producing text output formats so that results are easy to compare to previous results, as we do with GML output. Some advanced features, such as zipped output, may not be convenient to add regression tests for.

Documentation

Documentation updates will be required in the following documents:

  • WFS Server: extensive discussion of new gml items, and how OGR and templated output is invoked and controlled.

  • MapFile Reference: Add information on OGR OUTPUTFORMAT declarations.

  • Ideally we ought to document the new “gml_types auto” stuff in the document on how to write feature layer implementations, but the only document on this is RFC 3 which does not seem to define what the functions are supposed to do for the most part.

Ticket Id

Voting history

Adopted on 2010/10/13 with +1 from SteveW, SteveL, AssefaY and FrankW.