An Introduction to HDF


What is HDF?

What types of data does HDF support?

Which version of HDF should I use?

Where can I get additional and detailed information on HDF?



What is HDF?

HDF, which stands for Hierarchical Data Format, is a common data format developed to help scientists and programmers store, transfer, and distribute data sets and products created on various machines and with different software. HDF has been selected by the NASA ESDIS project as the format of choice for the standard product distribution that will be part of the Earth Observing System Data and Information System (EOSDIS).

In addition, HDF refers to the collection of software, application interfaces, and utilities that make up the HDF library and allow users to work with HDF files. The HDF library is discussed in detail in Section 3 - The HDF Library: Software and Hardware.

Features of HDF

HDF is a multi-object file format for the sharing and storing of scientific data. Some of the most important features of HDF are the following:

  1. Self-describing: For each data object in an HDF file, there is also information (or metadata) about the data type, size, dimensions and location found within the file itself.
  2. Extensibility: HDF is designed to accommodate future (new) data types and data models.
  3. Versatility: Currently, HDF supports several different data types (listed under "What types of data does HDF support?") and provides software and applications to read and write these data types in HDF.
  4. Flexibility: HDF lets the user group, store, and read/write different data types in the same file or in more than one file.
  5. Portability: HDF software is mainly platform independent and can be shared across most computer platforms (not all platforms have been tested).
  6. Standardization: HDF standardizes the formats and descriptions of many commonly used data types (e.g., arrays and images).
  7. Availability: HDF is available in the public domain.
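
To make the self-describing idea in item 1 more concrete, the sketch below stores a small array together with a header that records its type code, rank, and dimensions, and then recovers the array from the bytes alone. This is a toy layout invented for illustration. It is not the HDF file format, and the names pack_array and unpack_array are hypothetical.

```python
import struct

# Toy self-describing container (NOT the real HDF layout).
# Header: 1-byte type code, 1-byte rank, then one 4-byte unsigned int per dimension.
# Body:   the array elements in row-major order. Everything is big-endian.


def pack_array(type_code, dims, values):
    """Serialize a flat list of values together with its own description."""
    count = 1
    for d in dims:
        count *= d
    if count != len(values):
        raise ValueError("dims do not match the number of values")
    header = struct.pack(">cB", type_code.encode("ascii"), len(dims))
    header += struct.pack(">%dI" % len(dims), *dims)
    body = struct.pack(">%d%s" % (count, type_code), *values)
    return header + body


def unpack_array(blob):
    """Recover type code, dimensions, and values using only the bytes in blob."""
    type_char, rank = struct.unpack_from(">cB", blob, 0)
    type_code = type_char.decode("ascii")
    dims = struct.unpack_from(">%dI" % rank, blob, 2)
    count = 1
    for d in dims:
        count *= d
    values = struct.unpack_from(">%d%s" % (count, type_code), blob, 2 + 4 * rank)
    return type_code, list(dims), list(values)


# A 2 x 3 array of 32-bit floats ("f" is the struct module's float32 code).
blob = pack_array("f", [2, 3], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
print(unpack_array(blob))
```

The reader needs no outside knowledge of the array's shape or type, which is the property the feature list calls self-describing. A real HDF file carries far richer metadata than this header.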


What types of data does HDF support?

As of the latest release (HDF 4.1r3, July 2000), the HDF library supports working with raster images, color or grayscale palettes, multi-dimensional arrays, text strings, and statistical data (in the form of tables). The HDF library supports the following data types:

  1. Scientific Data sets -- Multi-dimensional integer or floating point arrays
  2. Vertex Data (Vdata and Vgroups) -- Multi-variate data stored as records in a table
  3. General Raster (GR) -- Raster images
  4. Annotation -- Text strings to describe files and parts of files (metadata)
  5. 8-bit Raster images
  6. 24-bit Raster images
  7. Palette -- 8-bit color palettes (accompany images)
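
As a rough picture of how the first two models in this list differ, the sketch below holds measurements in two shapes: a multi-dimensional array addressed by index (in the spirit of a Scientific Data set) and a table of fixed-field records (in the spirit of a Vdata). It uses plain Python structures rather than the HDF library, and every name and value in it is made up for illustration.

```python
from collections import namedtuple

# Array-style data: a 2 x 3 grid of floats, stored flat in row-major order
# alongside its dimensions.
grid_dims = (2, 3)
grid_values = [10.0, 11.0, 12.0,
               20.0, 21.0, 22.0]


def grid_at(row, col):
    """Look up one grid element by its (row, col) index."""
    n_rows, n_cols = grid_dims
    if not (0 <= row < n_rows and 0 <= col < n_cols):
        raise IndexError("index outside the grid")
    return grid_values[row * n_cols + col]


# Table-style data: each record holds several named fields of different types.
Station = namedtuple("Station", ["station_id", "latitude", "temperature"])
station_table = [
    Station(1, 40.1, 10.0),
    Station(2, 41.5, 11.5),
    Station(3, 39.8, 9.0),
]


def column(table, field):
    """Pull a single field out of every record in the table."""
    return [getattr(record, field) for record in table]


print(grid_at(1, 2))
print(column(station_table, "temperature"))
```

The array form suits values laid out on a regular grid, while the record form suits observations where each entry carries several different variables.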

In addition to the data types supported by the base HDF library, a sub-library called HDF-EOS has been developed to support various data types from the Earth Observing System (EOS) and other satellite missions. The HDF-EOS data models include point data, satellite swath data, and gridded data. HDF-EOS files are already being routinely generated by the instruments aboard the EOS Terra platform, as well as the TRMM satellite. More information on HDF-EOS and the differences between HDF and HDF-EOS is provided in the following sections.

As mentioned in the Welcome section, this HDF component of the tutorial will concentrate on the Scientific Data and raster image Data Models as a means of teaching the essentials of HDF. More information on the other data models can be obtained in the various documents (particularly the HDF User's Guide) provided by NCSA through their anonymous ftp server or World Wide Web home page.



Which version of HDF should I use?

The most current release of HDF is the best place to begin. As of summer 2000, the current version of the HDF library is HDF 4.1r3, with a new release slated for December 2000. An extension of the HDF library, called HDF-EOS (HDF-EOS2.6), is based on this version of HDF and is designed specifically to work with data from EOS satellite missions. This tutorial focuses on the releases of HDF 4.1 (i.e., r1, r2, or r3). One important feature of HDF4, especially for experienced users of HDF, is its backward compatibility: HDF 4.1r3 is compatible with earlier versions such as HDF 4.1r1 and with the data sets they generated.

A second version of HDF, called HDF5, has also recently been developed to address the shortcomings of HDF4. The new HDF library offers simpler source code, fewer and more consistent data models, and the ability to work with large data sets (> 2GB). Although HDF5 and its associated software are not covered in this tutorial, a few words about the transition and compatibility between HDF4 and HDF5 are in order. For complete information on HDF5, the user is directed to NCSA's HDF5 Page.
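
To get a feel for when the 2GB figure matters, the sketch below estimates the uncompressed size of an array from its dimensions and element type and compares it with 2GB. It assumes that 2GB means 2^31 bytes and that the data are stored uncompressed. The helper names, the type-name table, and the example grid sizes are all hypothetical.

```python
# Bytes per element for a few common numeric types (illustrative table).
BYTES_PER_ELEMENT = {
    "int8": 1,
    "int16": 2,
    "int32": 4,
    "float32": 4,
    "float64": 8,
}

TWO_GB = 2 ** 31  # assumption: "2GB" taken as 2^31 bytes


def array_size_bytes(dims, type_name):
    """Uncompressed size of an array with the given dimensions and type."""
    count = 1
    for d in dims:
        count *= d
    return count * BYTES_PER_ELEMENT[type_name]


def exceeds_two_gb(dims, type_name):
    """True if the uncompressed array is larger than 2GB."""
    return array_size_bytes(dims, type_name) > TWO_GB


# One 3600 x 1800 float32 grid is about 26 MB, well under the boundary.
print(array_size_bytes((3600, 1800), "float32"))   # 25920000
print(exceeds_two_gb((3600, 1800), "float32"))     # False

# A year of daily grids of the same size is about 9.5 GB.
print(array_size_bytes((365, 3600, 1800), "float32"))  # 9460800000
print(exceeds_two_gb((365, 3600, 1800), "float32"))    # True
```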



HDF4 vs HDF5

Although HDF4 is the basic underpinning of HDF-EOS and will continue to be supported by both NCSA and HDF-EOS, the new HDF5 library is slowly emerging as the new standard and is the HDF library that will be developed in the future. However, the transition will take many years, because investigators and science teams are making individual decisions about which HDF format/library (4 or 5) to use for their data. Data providers and instrument teams from the current Terra platform (MODIS, MISR, etc.) and the upcoming Aqua platform (2001) are using the HDF4 library.

Because both HDF libraries exist and significant scientific data sets are expected to be created in both HDF4 and HDF5, a major task facing NASA and NCSA is to facilitate interoperability and conversion between HDF4 and HDF5 data sets. Source code, documentation, and software are being (or have been) written for this purpose. The reader is directed to an NCSA HDF4 to HDF5 White Paper for further information.


Where can I get additional and detailed information on HDF?

The best place to find detailed information on all aspects of HDF is the NCSA HDF Information Server, available through the Internet. Another good place to start is the HDF FAQ (Frequently Asked Questions) document. For complete documentation on HDF, the user is directed to the NCSA HDF Documentation page (most current) and the NCSA anonymous ftp server. Inquiries and further questions should be sent to hdfhelp@ncsa.uiuc.edu.

The following documents and information can be obtained through the sources mentioned above:

  1. The most current HDF 4.1 Reference Manual (4.1r3 as of Summer 2000)
  2. The most current HDF 4.1 User's Guide (4.1r3 as of Summer 2000)
  3. HDF Specifications and Developers Guide v3.2 (mainly for the programmers/developers)
  4. HDF Newsletters
  5. HDF Frequently Asked Questions (FAQ)
  6. Java Products
  7. Frequently Asked Questions about Java and HDF
  8. Release Notes and Man Pages provide information on items that are not covered in the above documents
  9. HDF software contributions from non-NCSA users

In addition, users may wish to join the hdfnews mailing list for discussions and updates on HDF. To subscribe, email ncsalist@ncsa.uiuc.edu with "subscribe hdfnews" in the body of the message.
