DataSet Standard Data Object for use with MATLAB
Version 5.10 - Released
November 18, 2008
The DataSet object is a MATLAB object designed for
any data that requires storing auxiliary information
along with the data itself. A typical set of data contains many
parts. In spectroscopy, for instance, a data set can contain the
matrix of spectra, the wavelength axis, sample labels, sample numbers,
class variables, reference variables, etc. MATLAB supports a variety
of data types such as double arrays, character arrays, structures
and cell arrays that can accommodate these pieces. Until now, however,
there has been no standard way to associate all the parts of a data
set that go together, including the sample and variable labels,
class variables, time and wavelength axes, etc. In order to facilitate
data set handling, Eigenvector has created a standard object, the
DATASET Object (DSO). When added to a MATLAB installation,
DataSet creates a new object in MATLAB that integrates all of the
separate components associated with a data set into a single variable
in the MATLAB workspace.
An example of the MATLAB command
window after the DataSet files have been installed is shown to the
right. Data consisting of three different variables has been loaded
into the workspace: the data matrix dat, the sample
labels names, and the variable labels vars. These can
be combined into a single data object h using the commands
shown. When displayed, the separate parts of the data are listed.
(The data object should look familiar to those who have been using
MATLAB structures.)
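For readers without the screenshot, the combination step can be sketched roughly as follows. This is a minimal sketch, not a transcript of the commands in the figure: the example values are made up, and it assumes the Eigenvector @dataset class is installed and is the dataset class MATLAB finds first on the path (other toolboxes may define a class with the same name).

```matlab
% Hypothetical example data: 3 samples (rows) by 4 variables (columns).
dat   = [1 2 3 4; 5 6 7 8; 9 10 11 12];
names = char('sample A', 'sample B', 'sample C');   % one row per sample
vars  = char('var 1', 'var 2', 'var 3', 'var 4');   % one row per variable

% Assumption: the Eigenvector DataSet class is on the MATLAB path.
h = dataset(dat);      % wrap the data matrix in a DataSet object
h.label{1} = names;    % labels for mode 1 (rows / samples)
h.label{2} = vars;     % labels for mode 2 (columns / variables)

h                      % no semicolon: display the object's parts
```

The fields are addressed with the same dot notation used for MATLAB structures, which is why the display looks familiar. The technical manual remains the authority on the available fields.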
Eigenvector Research is making the DataSet object freely
available and hopes that MATLAB users everywhere
will use it when writing routines that are data intensive. Existence
of the object will greatly enhance the exchange of data sets, the
translation of data sets between file types (such as JCAMP) and
the handling of data sets within MATLAB. Of course, the PLS_Toolbox
takes full advantage of the DataSet object and also provides several
data import routines.
Changes in Version 5.1
- This minor revision of the DataSet Object adds
four new functions (listed below) and several bug fixes. Read more about the history
of the DataSet Object here.
* RESHAPE: Change size of a DataSet object. This function generally follows the I/O of the standard MATLAB reshape command but some options may not be available here.
* SORTBY: Sort a DataSet by given field, dim, and set. Allows sorting of a DataSet on any field associated with the data (e.g. axisscale, class, label...).
* SORTROWS: Sort data rows of a DataSet. Overload of the standard SORTROWS command. Input (column) allows specification of which columns should be used to sort. A negative value indicates that the given column should be sorted in decreasing order.
* SORTCOLS: Sort data columns of a DataSet. Analog of the standard SORTROWS command. Input (col) allows specification of which columns should be used to sort. A negative value indicates that the given column should be sorted in decreasing order.
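Because the new functions are described as following the standard MATLAB commands, the built-in behavior on a plain matrix gives a feel for the conventions. The sketch below uses only base MATLAB reshape and sortrows on a made-up matrix. It does not call the DataSet methods, whose exact inputs and options should be checked in the technical manual.

```matlab
% Hypothetical 3-by-2 matrix used only for illustration.
A = [3 10; 1 20; 2 30];

% reshape keeps column-major element order while changing the size.
B = reshape(A, 2, 3);              % B = [3 2 20; 1 10 30]

% sortrows with a positive index: ascending sort on that column.
byFirstAsc = sortrows(A, 1);       % [1 20; 2 30; 3 10]

% sortrows with a negative index: descending sort on that column.
bySecondDesc = sortrows(A, -2);    % [2 30; 1 20; 3 10]
```

When rows of a DataSet are sorted, the point of the overload is that labels and other per-row information travel with the data rather than being reordered by hand.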
To get the DataSet object, download the compressed
ZIP file. You can also access the DataSet
technical manual here.
For installation, the @dataset
directory must be a sub-directory of a directory which is on the
MATLAB path. The demo datasetdemo.m file can be placed anywhere
on the path.
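As a rough sketch of the installation step, the parent folder (not the @dataset folder itself) is what goes on the path. The folder name below is hypothetical and should be replaced with wherever the ZIP file was unpacked.

```matlab
% Hypothetical location: the folder that CONTAINS the @dataset directory.
parentDir = 'C:\toolboxes\dso';

addpath(parentDir);     % add the parent folder, never @dataset itself

% List every "dataset" MATLAB can see. The first entry is the one that
% will be used, so it should point into parentDir\@dataset.
which dataset -all
```

If another toolbox also provides a class named dataset, the order of folders on the path decides which one MATLAB uses.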
For more information on DataSet, please contact
our helpdesk. Comments and ideas would be especially appreciated.