DataSet Standard Data Object for use with MATLAB
Version 5.10
- Released November 18, 2008

The DataSet object is a MATLAB object written to be applicable to any data that requires storing auxiliary information along with the data itself. A typical set of data contains many parts. In spectroscopy, for instance, a data set can contain the matrix of spectra, the wavelength axis, sample labels, sample numbers, class variables, reference variables, etc. MATLAB supports a variety of data types, such as double arrays, character arrays, structures and cell arrays, that can accommodate these pieces. Until now, however, there has been no standard way to associate all the parts of a data set that go together, including the sample and variable labels, class variables, time and wavelength axes, etc. In order to facilitate data set handling, Eigenvector has created a standard object, the DataSet Object (DSO). When added to a MATLAB installation, DataSet creates a new object in MATLAB that integrates all of the separate components associated with a data set into a single variable in the MATLAB workspace.

An example of the MATLAB command window after the DataSet files have been installed is shown to the right. Data consisting of three different variables have been loaded into the workspace: the data matrix dat, the sample labels names, and the variable labels vars. These can be combined into a single data object h using the commands shown. When the object is displayed, the separate parts of the data are listed. (The data object should look familiar to those who have been using MATLAB structures.)
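Since the command window is shown only as an image, here is a minimal sketch of what combining the three variables might look like. The variable names dat, names, vars and h come from the text above; the sample values are made up for illustration, and the `label` field indexed by mode (1 for samples, 2 for variables) is an assumption that should be checked against the DataSet technical manual for the installed version.

```matlab
% Illustrative stand-ins for the three workspace variables (values are made up).
dat   = [1 2 3; 4 5 6];                 % 2 samples x 3 variables
names = char('sample A', 'sample B');   % one row of text per sample
vars  = char('var 1', 'var 2', 'var 3'); % one row of text per variable

% Labels only make sense if there is one per row/column of the data.
assert(size(names, 1) == size(dat, 1), 'Need one sample label per row.');
assert(size(vars, 1)  == size(dat, 2), 'Need one variable label per column.');

% Requires the DataSet object on the MATLAB path.
h = dataset(dat);       % wrap the data matrix in a DataSet object

% ASSUMPTION: labels are stored in a 'label' field indexed by mode.
h.label{1} = names;     % mode 1: sample (row) labels
h.label{2} = vars;      % mode 2: variable (column) labels

disp(h)                 % lists the separate parts of the object
```

Keeping the size checks next to the assignments catches mismatched label lists before they are attached to the data.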

Eigenvector Research is making the DataSet object freely available and hopes that MATLAB users everywhere will use it when writing data-intensive routines. A standard object will simplify the exchange of data sets, the translation of data sets between file types (such as JCAMP), and the handling of data sets within MATLAB. The PLS_Toolbox takes full advantage of the DataSet object and also provides several data import routines.

Changes in Version 5.1 - This minor revision of the DataSet Object includes the addition of four new functions (listed below) and several bug fixes. Read more about the history of the DataSet Object here.

* RESHAPE: Change size of a DataSet object. This function generally follows the I/O of the standard MATLAB reshape command but some options may not be available here.

* SORTBY: Sort a DataSet by given field, dim, and set. Allows sorting of a DataSet on any field associated with the data (e.g. axisscale, class, label...).

* SORTROWS: Sort data rows of a DataSet. Overload of the standard SORTROWS command. Input (column) allows specification of which columns should be used to sort. A negative value indicates that the given column should be sorted in decreasing order.

* SORTCOLS: Sort data columns of a DataSet. Column-wise analog of the standard SORTROWS command. The input allows specification of which rows should be used to sort. A negative value indicates that the given row should be sorted in decreasing order.
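The sign convention used by the sorting functions can be illustrated with the standard MATLAB sortrows command on a plain numeric matrix. This sketch does not call the DataSet overloads themselves; the matrices A and B are made-up stand-ins for the data block of a DataSet, and the transpose trick is only one way to picture what a column-sorting analog does.

```matlab
% Plain numeric matrix standing in for the data block of a DataSet.
A = [3 10; 1 20; 2 30];

% Sort rows using column 1 as the key, increasing order.
byCol1 = sortrows(A, 1);        % [1 20; 2 30; 3 10]

% A negative index sorts on that column in decreasing order.
byCol2Desc = sortrows(A, -2);   % [2 30; 1 20; 3 10]

% Column-sorting analog: transpose, sort the rows, transpose back.
% Here the columns of B are ordered by the values in row 1, decreasing.
B = [1 3 2; 10 30 20];
colsByRow1Desc = sortrows(B.', -1).';   % [3 2 1; 30 20 10]

% With the DataSet object installed, the descriptions above suggest that
% sortrows(h, -2) applies the same rule to a DataSet h (not verified here).
```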

To get the DataSet object: download the compressed ZIP file. You can also access the DataSet technical manual here.

For installation, the @dataset directory must be a sub-directory of a directory that is on the MATLAB path; the @dataset directory itself should not be added to the path. The demo file, datasetdemo.m, can be placed anywhere on the path.
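As a rough sketch of the path setup, the following adds the parent directory of @dataset to the MATLAB path and checks that the constructor can be found. The location in dsoParent is a hypothetical example, not a path prescribed by the distribution; substitute wherever the ZIP file was unpacked.

```matlab
% HYPOTHETICAL location: the directory that CONTAINS the @dataset directory.
dsoParent = 'C:\toolboxes\dso';

% Check that @dataset really sits directly under the chosen directory.
hasClassDir = exist(fullfile(dsoParent, '@dataset'), 'dir') == 7;

if hasClassDir
    addpath(dsoParent);         % add the parent, never @dataset itself
    disp(which('dataset'));     % should point into ...\@dataset\
else
    warning('No @dataset directory found under %s.', dsoParent);
end
```

Calling savepath afterwards keeps the setting for future MATLAB sessions.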

For more information on DataSet, please contact our helpdesk. Comments and ideas would be especially appreciated.

Eigenvector Research, Inc., 3905 West Eaglerock Drive, Wenatchee, WA 98801
B.M. Wise, bmw@eigenvector.com, Phone: 509.662.9213, Fax: 509.662.9214
N.B. Gallagher, nealg@eigenvector.com, Phone: 509.687.1039