The following sections detail how a user may utilize the HDF library and the SD API within a computer program to write a data file in HDF. As a teaching tool, this tutorial will concentrate on using the FORTRAN programming language. However, examples of the appropriate C code will also be given for certain steps.
Does the current version of HDF support your computer platform?
As outlined in Section 4, the HDF library cannot be run on every computer platform or operating system. Before downloading the HDF library software, the user should make sure that the current release of HDF supports his or her computer and operating system; otherwise, the user will be unable to work with the HDF library and files. It is also possible that a previous release of HDF supports the user's computer platform while the latest version does not. In this event, the user may wish to obtain the earlier software.
Downloading and installing the HDF library
The HDF library and software are in the public domain and available free to all users. The library and code can be downloaded from the NCSA anonymous ftp server. Directions on how to install the HDF library can also be found at this location.
Are all libraries and programs properly linked and compiled?
In order to run the HDF software, the library and the needed application routines and programs must first be properly compiled and linked. As of the current release of HDF (4.1r3), four separate libraries must be compiled and linked. These are the libmfhdf.a, libdf.a, libjpeg.a, and libz.a libraries. Provided below are examples of the command(s) that can be used for this action. It must be noted that the order in which the libraries are linked is important and should not vary from the order shown below:
For C programs:
For FORTRAN programs:
For the various commands needed to link and compile the HDF library on each individual platform, please see Section 4 "Compiling the HDF library".
Writing a short program to write a scientific data set in HDF
As mentioned previously, the HDF library and programs can only be run by using either the C or FORTRAN programming language. This choice is up to the user depending on availability and the language he or she feels most familiar and comfortable with. All SD API routines which allow the user to work with scientific data sets (SDS) either have the "sf" prefix (FORTRAN) or the "SD" prefix (C). Examples of the routines used to open, create, read, write, etc. SDS are given in the following sections.
Make sure all include files are in place
In Section 4, The HDF Library: Software and Hardware, it was noted that a series of standard HDF definitions and declarations of file access codes (e.g., read, write) and data types (e.g., integer, character) must be included in the programs that the user writes to call the various application routines. In C programs, this is accomplished simply by adding the line #include "hdf.h" at the beginning of the program. This line includes all the needed constants and definitions from the HDF software. In FORTRAN programs, the same can be done with an include statement that brings in only the needed definitions and declarations (constants.f) from the hdf.h header file: include 'constants.f'. However, not all FORTRAN compilers (particularly the older ones) support include statements. In this event, the user must declare the needed constants and definitions from the constants.f file by hand. All declarations, whether made through include statements or not, should appear at the beginning of the program.
Example:
FORTRAN:
C DFACC_RDONLY is defined in hdf.h
C if not available for FORTRAN then add
integer DFACC_RDONLY
parameter (DFACC_RDONLY = 1)
C:
#include "hdf.h"
main()
Make all variable and parameter declarations
As with any program, the scientist/user should declare and initialize all variables and parameters at the beginning of the program. This includes all variables and arguments that will be used by the HDF commands to follow. The variable and parameter declarations needed for each call will be provided in the example boxes of the individual steps. These statements always belong at the top of the program.
Open file containing existing non-HDF data set and store in array
Before writing any data into HDF, the actual data first has to be accessed within the program. As is normally done in non-HDF applications, the file containing the data that the user wishes to convert into HDF must first be opened. After opening the file, the user reads and stores the data into a multi-dimensional array that can be accessed by the HDF commands.
For the purpose of this tutorial, the non-HDF data set will be read from an existing file called wind.dat into a multi-dimensional real array called rwind(XL,YL) where XL= 30 and YL = 30.
Example:
C:
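#include <stdio.h>
#define XL 30
#define YL 30
int main(void) {
    FILE *infile;
    int i, j;
    float rwind[XL][YL];
    infile = fopen("wind.dat", "r");
    if (infile == NULL)
        return 1;   /* wind.dat could not be opened */
    for (i = 0; i < XL; i++)
        for (j = 0; j < YL; j++)
            fscanf(infile, "%f", &rwind[i][j]);   /* fscanf needs the element's address */
    fclose(infile);
    return 0;
}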
FORTRAN:
integer XL, YL, I, J
real rwind(30,30)
XL = 30
YL = 30
open(unit=15, file='wind.dat', form='formatted', status='old')
c list-directed read: one value per record
do I = 1, XL
do J = 1, YL
read(15,*) rwind(I,J)
enddo
enddo
close(15)
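Reading can fail partway through a file (a missing value, a stray character), and a partly filled array would then be written into the HDF file unnoticed. The following is a minimal sketch of a checked reader in C, assuming the values in wind.dat are separated by whitespace. The name read_grid is a hypothetical helper introduced here for illustration and is not part of the HDF library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of HDF): read up to `count`
   whitespace-separated floats from the text file `path` into `out`.
   Returns the number of values actually read, or -1 if the file
   could not be opened. */
long read_grid(const char *path, float *out, size_t count)
{
    FILE *fp;
    size_t n = 0;

    assert(path != NULL && out != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    /* Stop at the first value that cannot be parsed or at end of file. */
    while (n < count && fscanf(fp, "%f", &out[n]) == 1)
        n++;
    fclose(fp);
    return (long)n;
}
```

For the 30 by 30 array of this tutorial, the call would look like read_grid("wind.dat", &rwind[0][0], 30 * 30), and any result other than 900 indicates that the array is incomplete.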
Initialize access to the SD interface and open new HDF file
The first real HDF programming step accomplishes two things: it initializes the SD interface, and it opens (or, for a new file, creates) the HDF file.
This is done by the following code:
sd_id = sfstart(filename, access_mode) (FORTRAN)
or
sd_id = SDstart(filename, access_mode); ( C )
where
sd_id = HDF file id returned by the sfstart/SDstart command
filename = the name of the new HDF file (character string)
access_mode = Type of access required for this file
All available options for the access_mode argument are defined in the hdf.h header file mentioned previously, so they need only be named in C programs and in most FORTRAN programs. All options begin with the prefix "DFACC_" and include:
DFACC_CREATE (File Creation Access)
DFACC_RDONLY (Read Access)
DFACC_RDWR (Read and Write Access)
If the user's FORTRAN compiler cannot handle include statements, and so cannot bring in these definitions from the hdf.h header file, the DFACC_ constant being used must be defined, along with its assigned value, at the beginning of the program. This is done by a line of code such as:
parameter (DFACC_RDONLY = 1) (For FORTRAN only)
For the purpose of this tutorial, the new HDF file will be called wind.hdf.
Example:
FORTRAN:
integer*4 sd_id
integer sfstart
integer DFACC_CREATE
parameter (DFACC_CREATE = 4)
sd_id = sfstart('wind.hdf', DFACC_CREATE)
C:
#include "mfhdf.h"
/* mfhdf.h declares the SD routines and includes hdf.h, which holds all the access_mode definitions */
int32 sd_id;
sd_id = SDstart("wind.hdf", DFACC_CREATE);
Define characteristics of new HDF data set(s)
After initializing the SD interface and opening and assigning a file id (sd_id) to the HDF file to be used, the next step is to define a new HDF Scientific Data Set (SDS) to which the existing non-HDF data will be written. This is done by the following code:
sds_id = sfcreate (sd_id, name, number_type, rank, dim_sizes) (FORTRAN)
or
sds_id = SDcreate (sd_id, name, number_type, rank, dim_sizes); ( C )
It should be noted that sfselect/SDselect may be used instead to obtain the id of a previously defined HDF data set, so that data can be written to it.
where
sds_id = HDF SDS array id returned by the sfcreate/SDcreate command
sd_id = the new HDF file id created in the previous step (sfstart/SDstart)
name = name of new SDS (in ASCII character string)
number_type = data type of data set
This argument always takes the form of DFNT_X, where X is the data type to be used. A list of all the data types supported by the API can be found in the HDF User's Guide. For most of the data types, 8,16,32 and 64-bit types are supported. A few of the available options are provided below:
| HDF Data Type | Description |
| DFNT_FLOAT32 | 32 bit floating point real |
| DFNT_DOUBLE | double precision reals |
| DFNT_CHAR8 | 8 bit character type |
| DFNT_UCHAR8 | 8 bit unsigned character type |
| DFNT_INT16 | 16 bit integer type |
| DFNT_UINT16 | 16 bit unsigned integer type |
| DFNT_NINT16 | 16 bit native integer |
| DFNT_NUINT16 | 16 bit native unsigned integer |
| DFNT_NFLOAT32 | 32 bit native floating point real |
Similar to the DFACC_ argument, all data types are defined in hdf.h. Once again, for FORTRAN compilers unable to process these include files, the DFNT_ constant, and its assigned value, must be defined at the beginning of the program using code like this:
parameter (DFNT_INT16 = 22) (taken from constants.f within the hdf.h file)
rank = number of dimensions in array to be written (integer)
This value is best specified at the beginning of the program along with the other various declarations with a simple line of code:
rank = 2, 3,....
dim_sizes = An array defining the size of each dimension of the data array (integer)
As with the "rank" argument, this variable is best specified with the other variable declarations at the top of the program. For a 2-D, 30 X 30 array, an example would be:
dimsizes(1) = 30 (FORTRAN)
dimsizes(2) = 30
or
dimsizes[0] = 30; ( C )
dimsizes[1] = 30;
EXAMPLE: For an existing data set to be written as a 2-D array of 30 (x direction) by 30 (y direction), and as an 8-bit integer type, the following code needs to be used:
rank = 2 (FORTRAN)
dimsizes(1) = 30
dimsizes(2) = 30
sds_id = sfcreate(sd_id, 'newarray_1', DFNT_INT8, rank, dimsizes)
or
rank = 2; ( C )
dimsizes[0] = 30;
dimsizes[1] = 30;
sds_id = SDcreate(sd_id, "newarray_1", DFNT_INT8, rank, dimsizes);
Example:
FORTRAN:
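integer DFNT_FLOAT32
parameter (DFNT_FLOAT32 = 5)
integer sds_id, rank, XL, YL
integer dims(2), sfcreate
rank = 2
XL = 30
YL = 30
dims(1) = XL
dims(2) = YL
c rwind holds 32-bit reals, so the SDS is created as DFNT_FLOAT32
sds_id = sfcreate(sd_id, 'winds', DFNT_FLOAT32, rank, dims)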
C:
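int32 sds_id;
int32 dims[2], rank;
rank = 2;
/* rwind is declared as float rwind[XL][YL], so dims[0] is XL */
dims[0] = XL;
dims[1] = YL;
/* rwind holds 32-bit floats, so the SDS is created as DFNT_FLOAT32 */
sds_id = SDcreate(sd_id, "winds", DFNT_FLOAT32, rank, dims);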
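Dimension order deserves care when the two languages are mixed. C stores arrays in row-major order and FORTRAN in column-major order, and, as the HDF documentation describes it, the FORTRAN interface presents dimensions in the reverse of the C order, so an SDS created from FORTRAN with dims (XL, YL) is seen from a C program as [YL][XL]. The sketch below shows that conversion. The name reverse_dims is a hypothetical helper, not an HDF routine, and plain long is used in place of int32 so the fragment compiles without the HDF headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an HDF routine): reverse a dimension list
   in place, converting between FORTRAN order (first index varies
   fastest) and C order (last index varies fastest). */
void reverse_dims(long *dims, size_t rank)
{
    size_t lo, hi;

    assert(dims != NULL || rank == 0);
    if (rank < 2)
        return;   /* nothing to swap for 0 or 1 dimensions */
    for (lo = 0, hi = rank - 1; lo < hi; lo++, hi--) {
        long tmp = dims[lo];
        dims[lo] = dims[hi];
        dims[hi] = tmp;
    }
}
```

For the square 30 by 30 array used in this tutorial the reversal has no visible effect, which is why the issue only shows up once XL and YL differ.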
Write existing data set/array to a new data array in a new HDF file
After initializing the API and defining the new HDF file and the new SDS to be written to, the next step is to write the existing non-HDF data into the HDF file by using the SDwritedata (sfwdata) command. This command writes either all or part of the existing n-dimensional data set (termed a "slab") into the SDS identified by sds_id, which must have the same number of dimensions. In addition, the size of each dimension of the data "slab" must be the same as or smaller than the corresponding dimension of the SDS. The SDwritedata/sfwdata command is used in the following fashion:
ret=sfwdata (sds_id, start, stride, edge, data) (FORTRAN)
or
ret=SDwritedata (sds_id, start, stride, edge, data); ( C )
It should be noted that there are two versions of the write routine in FORTRAN: "sfwdata" is used for numeric data, while "sfwcdata" is used for writing character data.
where
sds_id = the SDS id (identifier) determined and returned by using SDcreate (sfcreate)
start = An array which identifies where in the SDS that the writing will begin
The start array identifies the location or position in the SDS where the writing of the data "slab" will begin. This array must have one element for each dimension of the SDS (its length equals the rank), and each element must lie within the corresponding dimension of the SDS. As the examples show, positions are counted from zero in both C and FORTRAN. The start values can be set at the top of the program or just before the call to sfwdata (SDwritedata). As an example, to write the existing data set to the beginning of a new 2-dimensional SDS the following must be specified:
start(1) = 0 (FORTRAN)
start(2) = 0
Or
start[0] = 0; ( C )
start[1] = 0;
If the user wishes to begin writing the data at a location other than the beginning of the new data set, say at position 15 in the first dimension (X), the declarations would be:
start(1) = 15 (FORTRAN)
start(2) = 0
Or
start[0] = 15; ( C )
start[1] = 0;
stride = An array specifying the interval between written values in each dimension
The stride argument specifies, for each dimension, the interval between consecutive written values of the data set: a stride of 1 writes to every location, a stride of 2 to every other location, and so on. Like the start array, the stride array must be set before calling the sfwdata (SDwritedata) command, either directly before the call or at the top of the program.
If the user does not wish to skip any array locations in a new 2-dimensional SDS, the following is to be declared:
stride(1) = 1 (FORTRAN)
stride(2) = 1
or
stride[0] = 1; ( C )
stride[1] = 1;
However, if the user wishes to write to only every other X (dimension 1) location, the following would be used:
stride(1) = 2 (FORTRAN)
stride(2) = 1
Or
stride[0] = 2; ( C )
stride[1] = 1;
edge = An array defining the number of data values to be written in each dimension
The edge array defines the number of data values/elements that will be written along each dimension of the multi-dimensional SDS array. In plain terms, this argument defines the size, in each dimension, of the data slab (all or part of the data) to be written to the new SDS array.
edge must be specified for each dimension of the data set and SDS array, and cannot be larger than the length of the corresponding dimension of the newly defined (from sfcreate) array it is being written to.
The edge is affected by the stride. If stride = 2, the edge must be halved, because values are written to only every other location along that dimension.
Like stride and start, the edge array needs to be defined before calling the sfwdata (SDwritedata) command, whether at the top of the program or directly before the routine call.
As an example, most often, the user will wish to write the entire non-HDF data set into a new array that starts from the beginning and does not contain any missing data or blanks. For a 2-dimensional array of 30X30, read and stored into the data array "rwind", this can be done, in FORTRAN, by:
start(1) = 0
start(2) = 0
stride(1) = 1
stride(2) = 1
edge(1) = 30
edge(2) = 30
retn = sfwdata(sds_id, start, stride, edge, rwind)
Or in C by:
start[0] = 0;
start[1] = 0;
stride[0] = 1;
stride[1] = 1;
edge[0] = 30;
edge[1] = 30;
retn = SDwritedata(sds_id, start, stride, edge, (VOIDP)rwind);
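The relationship between start, stride and edge described above can be checked with a little arithmetic before the write routine is called. The following is a minimal sketch for a single dimension, assuming that the last written position is start + (edge - 1) * stride and that it must fall inside the SDS dimension. The names max_edge and slab_fits are hypothetical helpers, not HDF routines.

```c
#include <assert.h>

/* Hypothetical helpers (not HDF routines) for one dimension of an SDS.
   dim    = length of the SDS dimension
   start  = zero-based position of the first written value
   stride = interval between written positions (1 = every location) */

/* Largest edge (count of written values) that still fits. */
long max_edge(long dim, long start, long stride)
{
    assert(dim > 0 && stride > 0 && start >= 0);
    if (start >= dim)
        return 0;
    /* Integer ceiling of (dim - start) / stride. */
    return (dim - start + stride - 1) / stride;
}

/* Nonzero if a slab of `edge` values stays inside the dimension. */
int slab_fits(long dim, long start, long stride, long edge)
{
    return edge >= 1 && edge <= max_edge(dim, start, stride);
}
```

With a dimension of 30, a start of 0 and a stride of 2, max_edge gives 15, which matches the rule that the edge is halved when every other location is written.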
data = The array or buffer of data to be written
The file containing this data should be opened at the beginning of the program and the data read in and stored into the necessary arrays before beginning the HDF operations.
Example:
FORTRAN:
integer start(2), edges(2), stride(2)
integer retn, XL, YL
integer sfwdata
c Define the location, pattern and size of data set that
c will be written to.
XL = 30
YL = 30
start(1) = 0
start(2) = 0
edges(1) = XL
edges(2) = YL
stride(1) = 1
stride(2) = 1
c write the data
retn = sfwdata(sds_id, start, stride, edges, rwind)
C:
int32 retn;
int32 start[2], edges[2];
int i;
/* Define the location and size of the data set to be written */
for (i = 0; i < rank; i++) {
    start[i] = 0;
    edges[i] = dims[i];
}
/* Write the stored data in rwind. Passing NULL for stride writes to every
   location. The 5th argument must be explicitly cast to a generic pointer
   to conform to the API definition for SDwritedata. */
retn = SDwritedata(sds_id, start, NULL, edges, (VOIDP)rwind);
Optional operation: Provide metadata for HDF files or data sets
Using the general attribute routines for user-defined attributes described in section 6, attributes can be written and attached to the file itself, to the data set, and to the dimension in question. This step is not required and is left to the user's discretion.
After opening the file and obtaining the file id (sd_id) using the sfstart/SDstart command, the following can be done:
1) FILE ATTRIBUTES:
To assign attributes to a file, the following code is used:
SDsetattr (sd_id,attr_name, data_type, count, value); ( C )
sfsnatt(sd_id, attr_name, data_type, count, value) (FORTRAN)
(Please note that there are two FORTRAN versions of the routine: sfsnatt writes numeric attribute data, while sfscatt writes character attribute data.)
where
sd_id= file identifier
attr_name = ASCII string containing the name of the attribute (e.g., "units")
data_type = data type of attribute values (e.g., DFNT_INT32)
count = total number of values/characters in the attribute
value = the attribute value(s), such as a text string or label
2) ARRAY ATTRIBUTES
After each data set identifier (sds_id) is obtained through the SDselect/sfselect command, the following is used:
SDsetattr (sds_id, attr_name, data_type, count, value); ( C )
sfsnatt(sds_id, attr_name, data_type, count, value) (FORTRAN)
where
sds_id= data set identifier
rest as above
3) DIMENSION ATTRIBUTES
After getting the identifier for a dimension using the sfdimid/SDgetdimid command, the following is used:
SDsetattr (dim_id, attr_name, data_type, count, value); ( C )
sfsnatt (dim_id, attr_name, data_type, count, value) (FORTRAN)
where
dim_id= Dimension identifier
rest as above
4) CLOSING ATTRIBUTES
After setting/writing the attributes, the user must terminate access to the data array (using the SDendaccess/sfendacc commands) and the file and SD interface (using the SDend/sfend commands).
Example:
1) FILE ATTRIBUTES:
FORTRAN:
sd_id = sfstart('wind.hdf', DFACC_RDWR)
retn = sfscatt(sd_id, 'Contents of file', DFNT_CHAR8, 16, 'horizontal winds')
C:
sd_id = SDstart("wind.hdf", DFACC_RDWR);
retn = SDsetattr(sd_id, "Contents of file", DFNT_CHAR8, 16, (VOIDP)"horizontal winds");
2) ARRAY ATTRIBUTES
FORTRAN:
sds_id = sfselect(sd_id, 0)
retn = sfscatt(sds_id, 'format', DFNT_CHAR8, 4, 'F8.2')
C:
sds_id = SDselect(sd_id, 0);
retn = SDsetattr(sds_id, "format", DFNT_CHAR8, 4, (VOIDP)"F8.2");
3) DIMENSION ATTRIBUTES
FORTRAN:
dim_id = sfdimid(sds_id, 0)
retn = sfscatt(dim_id, 'dim_metric', DFNT_CHAR8, 10, 'meters/sec')
C:
dim_id = SDgetdimid(sds_id, 0);
retn = SDsetattr(dim_id, "dim_metric", DFNT_CHAR8, 10, (VOIDP)"meters/sec");
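For character attributes, the count argument is the number of characters in the value, and a hand-counted number that disagrees with the string is an easy mistake to make. A minimal sketch in C, assuming the count excludes the terminating null character, takes the number from the string itself. The name char_attr_count is a hypothetical helper, not an HDF routine.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not an HDF routine): the `count` argument for a
   character attribute, taken as the number of characters in the value
   excluding the terminating '\0'. */
long char_attr_count(const char *value)
{
    assert(value != NULL);
    return (long)strlen(value);
}
```

For the values used above, "horizontal winds" gives 16, "F8.2" gives 4 and "meters/sec" gives 10, so the result can be passed directly as the fourth argument of SDsetattr.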
Terminate / close access to all files, data sets, and APIs
After writing the data to the new SDS array within the new HDF file, it is necessary to terminate or close access to the new data set in order to prevent any possible loss of data. This is done by the following:
retn = sfendacc(sds_id) (FORTRAN)
or
retn = SDendaccess(sds_id); ( C )
In addition, the API called within the program must also be closed to prevent any data loss:
retn = sfend(sd_id) (FORTRAN)
or
retn = SDend(sd_id); ( C )
Example:
FORTRAN:
integer sfendacc, sfend
retn = sfendacc(sds_id)
retn = sfend(sd_id)
C:
retn = SDendaccess(sds_id);
retn = SDend(sd_id);
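The retn values returned by these calls are worth testing, since a failed close can mean that data never reached the file. The sketch below assumes the HDF convention that a routine returns FAIL, defined as -1 in hdf.h, when it does not succeed. The names HDF_FAIL and hdf_ok are hypothetical and introduced only for illustration; they stand in for the library's own definitions so that the fragment compiles without the HDF headers.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: mirrors the FAIL value (-1) defined by hdf.h. */
#define HDF_FAIL (-1)

/* Hypothetical helper: returns 1 if `status` indicates success and 0
   if it equals HDF_FAIL, printing the name of the failed call. */
int hdf_ok(long status, const char *what)
{
    assert(what != NULL);
    if (status == HDF_FAIL) {
        fprintf(stderr, "HDF call failed: %s\n", what);
        return 0;
    }
    return 1;
}
```

In a program linked against HDF, the check would wrap each call, for example hdf_ok(SDendaccess(sds_id), "SDendaccess") followed by hdf_ok(SDend(sd_id), "SDend").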
Execute like a normal FORTRAN or C program.