IDFStoPDS is a program that generates PDS compliant data and label files from data stored in the Instrument Description File System (IDFS) format. The IDFS format is a data storage format designed to be general enough to handle the majority of scientific data sets, including raw telemetry, processed data, simulation data and theoretical data. IDFS data sources are defined as either scalar instruments or vector instruments. A scalar instrument returns single data quantities that depend only upon time and position. A vector instrument returns one-dimensional data quantities that have a functional dependence on a single variable, which in IDFS terminology is called the scanning variable.
With the IDFStoPDS program, the user may access any of the physical units defined for the IDFS data source being processed. The IDFS format performs a real-time conversion of telemetry data into physical units as the data is accessed. This allows for the refinement of calibration factors and processing algorithms without having to reprocess the original data set.
The IDFStoPDS program can be invoked in one of two modes:
(1) interactive mode or (2) batch mode. In interactive mode, the program
utilizes a GUI-based definition session to define the data items to be
processed. The definition phase of the IDFStoPDS program
has many options so that the user can tailor the processing to meet their
individual needs. Once this definition session has been completed, the
selected data items can then be retrieved from the IDFS data files and
returned in PDS compliant form for the selected time range. To invoke the
program in interactive mode, type IDFStoPDS
at the command line.
In batch mode, the interactive GUI-based definition session is bypassed and
the data requested is immediately processed based upon information contained in
the named layout file. To invoke the program in batch mode, type
IDFStoPDS -FName filename
at the command line. The argument
filename is the name of the layout file that is to be
utilized during the current session. Note that the name of the layout file
does not include the .I2P extension, which is appended to the filename provided
by the user during the GUI-based definition session. If the named layout file
does not exist, an error is displayed and processing terminates. In order to
modify the time range that is processed for a specific layout file, the user
may utilize the begin time (BTime) and/or end time
(ETime) command line options. For example, the following
call
IDFStoPDS -FName TEST -BTime 2003/191:21:47:00.000 -ETime 2003/191:21:50:00.000
invokes the IDFStoPDS program in order to process the selected data items defined in the layout file TEST for the specified time period.
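The batch-mode call can also be assembled from a script. The following is a minimal sketch in Python, not part of the IDFStoPDS distribution. The helper name build_batch_command is hypothetical, and stripping a trailing .I2P from the layout name is a convenience assumed here. Only the -FName, -BTime and -ETime options described above are used.

```python
def build_batch_command(layout, btime=None, etime=None):
    """Return the argument list for a batch-mode IDFStoPDS run."""
    # The layout name is given without the .I2P extension; strip it if a
    # caller supplied it anyway (assumption made for this sketch).
    if layout.upper().endswith(".I2P"):
        layout = layout[:-4]
    cmd = ["IDFStoPDS", "-FName", layout]
    # BTime and ETime are optional and may be given independently.
    if btime is not None:
        cmd += ["-BTime", btime]
    if etime is not None:
        cmd += ["-ETime", etime]
    return cmd


cmd = build_batch_command("TEST",
                          btime="2003/191:21:47:00.000",
                          etime="2003/191:21:50:00.000")
print(" ".join(cmd))
# To actually run it (assumes IDFStoPDS is on the PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems if the layout name contains unusual characters.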
In order to generate the PDS compliant data and label files, certain information must first be ascertained from the user. One of these pieces of information is the time range for which the IDFS data is to be processed. The time range can be set by selecting the "Time" button. This action will invoke the "Set Time" GUI. Next, the IDFS data source which contains the data items to be processed must be identified. This can be achieved by selecting the "Data Source" button. This action will invoke the "IDFS Source" GUI.
Once a valid IDFS data source has been selected, the "Data Items" button becomes visible and the data items to be processed must be defined. The data items to be processed are referred to as IDFS sensors. An IDFS sensor is defined as a primary data source returned by the virtual instrument in question. When the "Data Items" button is selected, the "Data Items" GUI is invoked. On this GUI resides a list which indicates the IDFS data item(s) to be processed from the selected IDFS data source. Initially, this list is empty.
To add a data item to the list, the pull-down Insertion menu is utilized. Once the position for the data item to be added has been determined, the actual data item must be selected using the "Data Attributes" GUI. This GUI is automatically invoked when a new item is added to the list. The new item will be defaulted to the first IDFS sensor defined for the IDFS data source based upon information contained in the PIDF file. Once a data item has been added to the list, the "Attributes" button becomes visible and accessible. If changes to any of the attributes for a specific data item need to be made, the data item should be selected from the list and the "Attributes" button should be activated in order to invoke the "Data Attributes" GUI.
At this point, the PDS compliant data and label files can be generated, since all other information is defaulted based upon information found in the PIDF and VIDF files for the selected IDFS data source.
Once the IDFS data source, data items, and time range have been defined, the selected data items can then be processed. To generate the PDS compliant data and label files, select the pull-down Action menu from the main menubar and select the Create PDS File option. Upon activation, the local database is checked to see if the requested data files are online. If data for the requested time range is not online, the missing data is promoted to the local disk from the archive. Once the data has been placed online, the data files are opened and the data is extracted, converted to the appropriate physical units and written to a data file in PDS compliant form. Data will continue to be processed until the user-requested end time has been reached or until an error condition is raised. When an error condition is encountered, a message is displayed, the partially created data file is purged and processing terminates. Upon completion of the task, successful or unsuccessful, any promoted IDFS data files are removed from the local disk. Upon successful completion of the task, a PDS label file is generated to describe the PDS data file that was created. At least one data item from the selected IDFS data source must be specified; otherwise, an error message will be displayed when the "Create PDS File" action is selected.
The IDFStoPDS program utilizes the PDS Spreadsheet object (see PDS Standards) to describe the IDFS data items that are placed within the PDS data file. When using the PDS Spreadsheet object, the number of items returned for the data source must stay constant throughout the data file, since the PDS label file defines the number of items that are returned. Depending upon the time range being processed, this may not be the case for the IDFS data source selected. Within the IDFS paradigm, vector instruments may return a subset of the maximum number of vector elements defined for the virtual instrument. When the IDFStoPDS application is started, the number of items being returned for a vector instrument is determined. For every IDFS data record processed thereafter, a check is made to determine if the number of items has changed. If it has, the open PDS data and label files are closed and a new pair of PDS data and label files is generated, starting with the data record that returned a different number of items. As an example, consider a vector instrument that can return either 32, 64, or 128 energy steps, depending upon the mode in which it is operating. If the time period being processed covers an interval when the instrument is stable in a specific operational mode, then one pair of PDS data and label files will be generated. However, if the time period covers an interval when the instrument changes state, for example, from a 32-step mode to a 64-step mode, then two pairs of PDS data and label files will be generated.
If the time period covers an interval where the instrument changes state more than once, for example, from a 32-step mode to a 64-step mode and back to a 32-step mode, then three pairs of PDS data and label files will be generated, with the first pair reflecting a 32-item object, the second a 64-item object and the last a 32-item object.
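The splitting rule can be illustrated with a short Python sketch. This is not the IDFStoPDS implementation: the function name split_by_item_count is hypothetical, and each data record is modelled simply as a list of values. Consecutive records with the same number of items form one segment, and each segment corresponds to one pair of PDS data and label files.

```python
from itertools import groupby


def split_by_item_count(records):
    """Group consecutive records that return the same number of items.

    Each returned group stands for one pair of PDS data and label files.
    """
    # groupby only merges adjacent records, so a return to an earlier
    # item count starts a new group instead of rejoining the old one.
    return [list(group) for _, group in groupby(records, key=len)]


# Records from a 32-step mode, then a 64-step mode, then 32-step again.
records = [[0] * 32, [0] * 32, [0] * 64, [0] * 64, [0] * 32]
segments = split_by_item_count(records)
print([len(segment[0]) for segment in segments])  # prints [32, 64, 32]
```

Because the grouping is by adjacency, the 32, 64, 32 sequence yields three segments, matching the three pairs of files described above.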
According to PDS standards, filename extensions are limited to 3 characters and filenames must not exceed a 27.3 format (at most 27 characters for the base name and 3 for the extension). In order to comply with this restriction, the filenames that are generated by the IDFStoPDS application use the following format:
VVVVVVVVYYYYDDDHHMMXXXXX[S]NN
VVVVVVVV represents the virtual instrument name for the IDFS data source selected (up to 8 characters). The next 11 characters represent the time of the first data sample written into the data file, where YYYY represents the 4-digit year, DDD represents the 3-digit day of year, HH represents the 2-digit hour, and MM represents the 2-digit minute. XXXXX represents the units label (up to 5 characters) for the data unit selected by the user. If the virtual instrument selected is a vector instrument, the letter S is inserted immediately before the last 2 characters (NN) of the filename. The last 2 characters NN represent a 2-digit file version number. The file version number is defaulted to 01 but can be modified by selecting the "PDS File Attributes" button and modifying the File Version Number option. The extension ".CSV" is appended to the generated filename for the PDS data file and the extension ".LBL" for the PDS label file. An example of a filename generated by the IDFStoPDS application is NPINORM20031912147RAW01.CSV, which indicates that the PDS data file contains data in RAW units, starting at hour 21, minute 47 on day 191 of year 2003, for the NPINORM virtual instrument. This data file is flagged as the first version generated (01) by the IDFStoPDS application.
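As a worked illustration of the naming scheme, the following Python sketch assembles a base filename from its components. The function name pds_base_name and its parameters are hypothetical. How IDFStoPDS treats over-long instrument names or units labels is not stated here, so the sketch simply rejects them.

```python
def pds_base_name(instrument, year, day_of_year, hour, minute, units,
                  is_vector=False, version=1):
    """Build VVVVVVVVYYYYDDDHHMMXXXXX[S]NN (without the extension)."""
    # Reject rather than truncate: the truncation rule is not documented.
    if not 1 <= len(instrument) <= 8:
        raise ValueError("instrument name must be 1 to 8 characters")
    if not 1 <= len(units) <= 5:
        raise ValueError("units label must be 1 to 5 characters")
    name = f"{instrument}{year:04d}{day_of_year:03d}{hour:02d}{minute:02d}{units}"
    if is_vector:
        name += "S"  # marks a vector instrument
    return f"{name}{version:02d}"


base = pds_base_name("NPINORM", 2003, 191, 21, 47, "RAW")
print(base + ".CSV")  # prints NPINORM20031912147RAW01.CSV
print(base + ".LBL")  # prints NPINORM20031912147RAW01.LBL
```

The longest possible base name is 8 + 11 + 5 + 1 + 2 = 27 characters, which is how the scheme stays within the 27.3 limit.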
When the IDFStoPDS program is run in interactive mode, a check is made to see if the PDS data file to be generated already exists in the current working directory. If it does, the user will be asked if they wish to overwrite the data file. If the user answers yes, the data file and the associated label file are removed and an attempt is made to create new data and label files. If the user answers no, the current request is aborted. When run in batch mode, no query is made; the files are removed and an attempt is made to create new data and label files.
Since the IDFStoPDS program has the potential to generate large data files, a clean-up mechanism is utilized. Whether or not the clean-up mechanism is invoked depends upon the user running the IDFStoPDS program. If a ".guest" file exists in the user's home directory, the data and label files will be scheduled for removal 30 minutes after the data file has been closed. The user will be informed of this situation. If a ".guest" file does not exist in the user's home directory, the generated data and label files will be left untouched. This scheme was designed for those sites that set up a public guest account through which outside users are given access to the named local system. The contents of the ".guest" file are not important; only the existence of the file is checked.
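A site administrator can check in advance whether the clean-up mechanism will apply to a given account. The Python sketch below is only an illustration of the rule stated above. The function name cleanup_applies is hypothetical, and a temporary directory stands in for a home directory.

```python
import tempfile
from pathlib import Path

CLEANUP_DELAY_MINUTES = 30  # removal delay stated in the text


def cleanup_applies(home=None):
    """Return True if a .guest file exists in the given home directory."""
    home_dir = Path(home) if home is not None else Path.home()
    return (home_dir / ".guest").exists()


# Demonstration in a throwaway directory standing in for a home directory.
with tempfile.TemporaryDirectory() as fake_home:
    before = cleanup_applies(fake_home)
    (Path(fake_home) / ".guest").touch()  # the contents do not matter
    after = cleanup_applies(fake_home)

print(before, after)  # prints False True
```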
Once all the information has been defined, it may be saved to a layout file for future retrieval. This is achieved by selecting the pull-down File menu and selecting the Save As option. The information defined is not saved by the program unless the user explicitly does so. When providing the name of the layout file, do not specify the .I2P extension. The IDFStoPDS program automatically appends the .I2P extension to the name of the layout file upon creation of the file.
The remainder of this document gives an in-depth explanation of the options that appear on the various GUIs utilized by the IDFStoPDS program.
In order to set the time values, enter the values in the boxes that appear next to the time component being set or use the increment / decrement arrows. The stop time must be later than the start time. The time is initially set to the current time. Days are given as day of year; by Julian convention, January 1 is day 1.
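To relate a day-of-year value to a calendar date, the time strings used in the batch-mode example (year/day-of-year:hour:minute:second.millisecond) can be parsed with the Python standard library. This is a sketch for checking values by hand; the helper names are hypothetical and it is not how IDFStoPDS itself handles time.

```python
from datetime import datetime

# %j is the day of year (001 to 366), so January 1 is day 1.
TIME_FORMAT = "%Y/%j:%H:%M:%S.%f"


def parse_time(text):
    """Parse a YYYY/DDD:HH:MM:SS.mmm string into a datetime."""
    return datetime.strptime(text, TIME_FORMAT)


def check_range(start_text, stop_text):
    """Return (start, stop), requiring the stop time to be later."""
    start, stop = parse_time(start_text), parse_time(stop_text)
    if stop <= start:
        raise ValueError("stop time must be later than start time")
    return start, stop


start, stop = check_range("2003/191:21:47:00.000", "2003/191:21:50:00.000")
print(start.date())                # prints 2003-07-10
print(start.timetuple().tm_yday)   # prints 191
```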
The user must select a project, satellite, experiment, instrument and virtual instrument from which data is to be extracted. To change any of the selected options, click on the buttons on the right hand side. Note that all lineage information under the branch being changed is no longer applicable and must be re-selected. When the IDFS data source is changed, any previous data item definitions are deleted from the list and must be re-defined.
The data items to be processed are referred to as IDFS sensors. An IDFS sensor is defined as a primary data source returned by the virtual instrument in question. To add a data item to the list, the pull-down Insertion menu is utilized. The menu options indicate the position within the list at which the current data item definition is to be inserted. These options include:
To delete a data item from the list, the pull-down Removal menu is utilized. Currently, this pull-down menu contains just one option
If all of the sensors defined for an IDFS source are to be written to a PDS compliant data file, the user has two ways to indicate this. One way is to keep adding data items to the list by selecting the Insertion menu, then selecting the specific IDFS sensor using the Sensor Group and Sensor options on the Data Attributes GUI. With this approach, the user can select different scientific units and can individually select/de-select the ancillary information that will be processed for each sensor. The second way is to utilize the Select All Sensors checkbox option. When this option is selected, only one data item may exist on the list. When the data is processed, the "Data Items" list is temporarily expanded to hold a definition for each IDFS sensor defined. The same set of ancillary data is processed for all sensors, and the units selected by the user for the sensor and scan data are applied to all sensors.
The primary data items (IDFS sensors) returned by the selected IDFS source are presented in two lists entitled Sensor Group and Sensor. The PIDF file utilizes these two groupings to allow an additional level of subdivision within the primary data sources. This scheme is useful when the IDFS data source contains a large number of primary data sources representing a diverse set of measurements.
In some cases, there may be only one data unit defined. In other cases, a list
of data units will be presented. In either case, the Data Units
option is defaulted to the last data unit defined for the selected data item.
Instrument state values are pertinent to the instrument as a whole, not to any one
sensor definition. Data quality flags and the instrument state values are
automatically processed and output to the PDS data and label files by the
IDFStoPDS program, if the PIDF defines these data
parameters. Based upon the IDFS source selected, the last three items listed
may or may not apply. If they do apply, the user can select any or all of these
items to be included with the primary data in the PDS compliant data file. The
default is set so that none of these last three secondary data sources are
returned; in other words, the user must "check" the box in order to include these
secondary data sources.
The start azimuthal angle values are always returned as values between 0 and 360 degrees.
The stop azimuthal angle values could be negative or could be greater than 360 degrees.
The stop azimuthal angle values are computed by adding the degrees covered by the
accumulation time of each sample to the start azimuthal angle values.
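The reason stop angles can fall outside the 0 to 360 degree range is easiest to see with numbers. The Python sketch below assumes, purely for illustration, that the degrees covered equal a signed spin rate multiplied by the accumulation time. How IDFS actually derives the degrees covered is not described here, and the function name stop_azimuth is hypothetical.

```python
def stop_azimuth(start_deg, spin_rate_deg_per_s, accumulation_s):
    """Return the stop angle: start angle plus the degrees covered.

    Assumption for this sketch: degrees covered is the signed spin rate
    times the accumulation time. The result is not wrapped into the
    0 to 360 degree range.
    """
    return start_deg + spin_rate_deg_per_s * accumulation_s


print(stop_azimuth(350.0, 60.0, 0.5))   # prints 380.0
print(stop_azimuth(5.0, -60.0, 0.5))    # prints -25.0
```

A sample that starts near 360 degrees ends above 360, and with a negative rate a sample that starts near 0 ends below 0, since no wrapping is applied to the stop value.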
The IDFStoPDS program creates a PDS label file for each PDS data file that is generated. Some of the values contained within the PDS label file can be modified by the user by selecting the PDS File Attributes button. This action will invoke the "PDS Label File Information" GUI. The values for these PDS label fields are defaulted by the IDFStoPDS program. A brief explanation of the options is given below. In all cases where a list is utilized, the selectable options are those defined as Standard Values in the Data Dictionary Elements documentation provided by PDS.
This value is part of the filename generated for the PDS data and label files.
It is the last 2 characters prior to the ".CSV" and ".LBL" filename extensions
for the PDS data and label files, respectively. When the IDFStoPDS
program is run in interactive mode, a check is made to see if the PDS data file to
be generated already exists in the current working directory. If it does, the user
will be asked if they wish to overwrite the data file. If the user answers yes,
the data file and associated label file are removed and an attempt is made to
create new data and label files. If the user answers no, the current request is
aborted. When run in batch mode, no query is made; the data file and associated
label file are removed and an attempt is made to create new data and label files.
If the File Version Number option is modified, the PDS data file name that is
generated will be different since the last 2 characters prior to the filename
extensions will have changed; therefore, no PDS data and label files from previous
runs of the IDFStoPDS program will be purged and the new PDS
data and label files will be generated.
A brief default description is provided, which names the virtual instrument
and the time period that is contained within the PDS data file. The tokens
$START_TIME and $STOP_TIME are utilized and these are replaced at run time
with the start time of the first sample contained in the PDS data file and
the stop time of the last sample contained in the PDS data file, respectively.
A brief default value is provided, which names the virtual instrument
and the time period that is contained within the named PDS data file. The
tokens $FILE_NAME, $START_TIME and $STOP_TIME are utilized and these are
replaced at run time. $FILE_NAME is the name of the PDS data file that
is created, $START_TIME represents the start time of the first sample
contained in the PDS data file and $STOP_TIME represents the stop time of
the last sample contained in the PDS data file. Since this field is placed
before the DESCRIPTION field within the PDS label file, the PDS web pages
will pick up the contents of this keyword and display it.
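The run-time token replacement can be mimicked with the Python standard library, whose string.Template class uses the same $NAME syntax. This is only a sketch of the idea: the function name fill_label_text and the sample text are hypothetical and are not the defaults supplied by IDFStoPDS.

```python
from string import Template


def fill_label_text(text, file_name, start_time, stop_time):
    """Replace $FILE_NAME, $START_TIME and $STOP_TIME in a label field."""
    # safe_substitute leaves any other $token in the text untouched
    # instead of raising an error.
    return Template(text).safe_substitute(
        FILE_NAME=file_name,
        START_TIME=start_time,
        STOP_TIME=stop_time,
    )


# Hypothetical field text, not the program's actual default.
text = "NPINORM data in $FILE_NAME from $START_TIME to $STOP_TIME."
filled = fill_label_text(text,
                         "NPINORM20031912147RAW01.CSV",
                         "2003/191:21:47:00.000",
                         "2003/191:21:50:00.000")
print(filled)
```

A field that uses only $START_TIME and $STOP_TIME works with the same function, since unused values are simply ignored.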
The IDFStoPDS program utilizes the PDS Spreadsheet object (see PDS Standards) to describe the IDFS data items that are placed within the PDS data file. The PDS data file is simply an ASCII file which contains the selected IDFS data items, along with secondary data sources (ancillary data) and any instrument state values. The layout of the data file is row-oriented, with each row in the format of
Start time | Stop time | Data type name | Data type id | Data name | Data unit label | Data value(s) |
The primary or sensor data (Data type name = SENSOR) is the first row written, followed by any selected secondary data products associated with the IDFS data item, with each secondary data product written on a separate row. This pattern of sensor data and secondary data is repeated for each selected IDFS data item. There may be multiple calibration variables defined for the virtual instrument (IDFS data source) in question; in that case, one row is written for each defined calibration variable.
If the IDFS data source selected is a vector instrument, the scan values which correspond to the returned sensor data values are also written. If all sensors utilize the same scan range, the scan values are written as the last row for the time period being processed; that is, after all data for all selected IDFS data items have been written. However, if the IDFS sensors do not all utilize the same scan range, the scan values are written as the last row for each individual IDFS data item.
The last data type to be written is the instrument state values (Data type name = MODE), as they pertain to the instrument as a whole. For vector instruments, the instrument state values are written once for each time interval processed for the primary data. However, for scalar instruments, this may or may not be true. The IDFS paradigm allows for the "packing" of multiple scalar values into a single group (referred to as an IDFS sensor set) in order to cut down on the size of the data files. The instrument state values stay constant throughout the IDFS sensor set. The IDFStoPDS program writes these packed scalars one value at a time; however, the instrument state values are only written once per IDFS sensor set since they stay constant. The time range indicates the duration for which the instrument state values are valid. If the scalar instrument does not pack the primary data, then the instrument state values are written once for each time interval processed for the primary data.
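The row ordering described above can be summarized in a short Python sketch. It models the ordering rules only, for a single time interval of an instrument whose PIDF defines instrument state values. SENSOR and MODE are the data type names given in this document, while SCAN and CAL are placeholder labels assumed for illustration, and the function name row_sequence is hypothetical.

```python
def row_sequence(items, is_vector, shared_scan_range):
    """Return (data type name, item name) pairs in output order."""
    rows = []
    for item in items:
        # Sensor row first, then one row per selected secondary product.
        rows.append(("SENSOR", item["name"]))
        for product in item.get("secondary", []):
            rows.append((product, item["name"]))
        # Differing scan ranges: scan values follow each data item.
        if is_vector and not shared_scan_range:
            rows.append(("SCAN", item["name"]))
    # Shared scan range: scan values are written once, after all items.
    if is_vector and shared_scan_range:
        rows.append(("SCAN", None))
    # Instrument state values come last; they apply to the whole instrument.
    rows.append(("MODE", None))
    return rows


items = [{"name": "S0", "secondary": ["CAL"]}, {"name": "S1"}]
for row in row_sequence(items, is_vector=True, shared_scan_range=True):
    print(row)
```

The packing of scalar values into sensor sets is not modelled; in that case the MODE row would appear once per sensor set rather than once per interval.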