Testing Guidelines for Custom Datastores

All datastores that are derived from the custom datastore classes share some common behaviors. This test procedure provides guidelines to test the minimal set of behaviors and functionalities that all custom datastores should have. You will need additional tests to qualify any unique functionalities of your custom datastore.

If you have developed your custom datastore based on the instructions in Develop Custom Datastore, then follow these test procedures to qualify your custom datastore. First perform the unit tests, followed by the workflow tests:

  • Unit tests qualify the datastore constructor and methods.

  • Workflow tests qualify the datastore usage.

For all these test cases:

  • Unless specified in the test description, assume that you are testing a nonempty datastore ds.

  • Verify the test cases on the file extensions, file encodings, and data locations (like Hadoop®) that your custom datastore is designed to support.

Unit Tests

Construction

The unit test guidelines for the datastore constructor are as follows.

Test Case Description Expected Output

Check if your custom datastore constructor works with the minimal required inputs.

Datastore object of your custom datastore type with the minimal expected properties and methods

Check if your datastore object ds has matlab.io.Datastore as one of its superclasses.

Run this command:

isa(ds,'matlab.io.Datastore')

1 or true

Call your custom datastore constructor with the required inputs and any supported input arguments and name-value pair arguments.

Datastore object of your custom datastore type with the minimal expected properties and methods
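The construction checks above can be sketched as a short script. This is only an illustration: arrayDatastore (a built-in datastore, available in R2020b and later) is used as a stand-in so that the snippet runs on its own, and the name MyDatastore in the comment is hypothetical. Replace the constructor line with a call to your own class.

```matlab
% Stand-in for illustration only: replace this line with a call to your
% own constructor, for example ds = MyDatastore(location) (hypothetical).
ds = arrayDatastore((1:10)');

% The object must have matlab.io.Datastore among its superclasses.
tf = isa(ds,'matlab.io.Datastore');
assert(tf,'Datastore must derive from matlab.io.Datastore.')

% The core methods should be available on the constructed object.
required = {'read','hasdata','reset'};
hasAll = all(ismember(required,methods(ds)));
assert(hasAll,'Datastore is missing a required method.')
```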

read

Unit test guidelines for the read method

Test Case Description Expected Output

Call the read method on a datastore object ds.

t = read(ds);

Data from the beginning of the datastore

If you specify read size, then the size of the returned data is equivalent to read size.

Call the read method again on the datastore object.

t = read(ds);

Data starting from the end point of the previous read operation

If you specify read size, then the size of the returned data is equivalent to read size.

Continue calling the read method on the datastore object in a while loop.

while hasdata(ds)
    t = read(ds);
end

No errors

Correct data in the correct format

When data is available to read, check the info output (if any) of the read method.

Call the read method on a datastore object ds and request the info output.

[t,info] = read(ds);

No error

info contains the expected information

t contains the expected data

When no more data is available to read, call read on the datastore object.

Either expected output or an error message based on your custom datastore implementation.
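As a rough sketch of the sequential read checks, the script below reads a small known data set in blocks and compares each block with the rows it should contain. It assumes arrayDatastore as a stand-in for your custom datastore, with a read size of 3 chosen only for illustration; your own datastore may express read size differently.

```matlab
% Known reference data and a stand-in datastore (replace with your own).
data = (1:10)';
ds = arrayDatastore(data,'OutputType','same','ReadSize',3);

% The first read starts at the beginning of the datastore.
first = read(ds);

% The second read continues from where the first one stopped.
second = read(ds);

% Read the rest in a loop and collect everything that was returned.
collected = [first; second];
while hasdata(ds)
    collected = [collected; read(ds)]; %#ok<AGROW>
end

assert(size(first,1) == 3)          % matches the requested read size
assert(isequal(first,data(1:3)))
assert(isequal(second,data(4:6)))
assert(isequal(collected,data))     % nothing skipped or duplicated
```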

readall

Unit test guidelines for the readall method

Test Case Description Expected Output

Call the readall method on the datastore object.

All data

Call the readall method on the datastore object when hasdata(ds) is false.

Read from the datastore until hasdata(ds) is false, and then call the readall method.

while hasdata(ds)
    t = read(ds);
end
readall(ds)

All data
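A minimal sketch of both readall cases, again assuming arrayDatastore as a stand-in for your custom datastore and a small in-memory data set as the reference:

```matlab
% Known reference data and a stand-in datastore (replace with your own).
data = (1:10)';
ds = arrayDatastore(data,'OutputType','same','ReadSize',3);

% Case 1: readall on a fresh datastore returns all data.
allAtStart = readall(ds);

% Case 2: exhaust the datastore, then call readall again.
while hasdata(ds)
    read(ds);
end
allAtEnd = readall(ds);

assert(isequal(allAtStart,data))
assert(isequal(allAtEnd,data))
```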

hasdata

Unit test guidelines for the hasdata method

Test Case Description Expected Output

Call the hasdata method on the datastore object before making any calls to read.

true

Call the hasdata method on the datastore object after making a few calls to read, but before all the data is read.

true

When more data is available to read, call the readall method, and then call the hasdata method.

true

When no more data is available to read, call the hasdata method.

false

false
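The four hasdata cases can be exercised in one pass. The sketch below assumes arrayDatastore as a stand-in for your custom datastore; the variable names are illustrative.

```matlab
% Stand-in datastore with 10 rows, read 3 rows at a time.
ds = arrayDatastore((1:10)','OutputType','same','ReadSize',3);

beforeRead = hasdata(ds);      % before any call to read

read(ds);
midway = hasdata(ds);          % after a partial read

readall(ds);
afterReadall = hasdata(ds);    % readall should not consume the data

while hasdata(ds)
    read(ds);
end
atEnd = hasdata(ds);           % after all data is read

assert(beforeRead && midway && afterReadall)
assert(~atEnd)
```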

reset

Unit test guidelines for the reset method

Test Case Description Expected Output

Call the reset method on the datastore object before making any calls to the read method.

Verify that the read method returns the appropriate data after a call to the reset method.

reset(ds);
t = read(ds);

No errors

The read method returns data from the beginning of the datastore.

If you specify read size, then the size of the returned data is equivalent to read size.

When more data is available to read, call the reset method after making a few calls to the read method.

Verify that the read method returns the appropriate data after making a call to the reset method.

No errors

The read method returns data from the beginning of the datastore.

If you specify read size, then the size of the returned data is equivalent to read size.

When more data is available to read, call the reset method after making a call to the readall method.

Verify that the read method returns the appropriate data after making a call to the reset method.

No errors

The read method returns data from the beginning of the datastore.

If you specify read size, then the size of the returned data is equivalent to read size.

When no more data is available to read, call the reset method on the datastore object and then call the read method.

Verify that read returns the appropriate data after a call to the reset method.

No errors

The read method returns data from the beginning of the datastore.

If you specify read size, then the size of the returned data is equivalent to read size.
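One way to check the reset cases is to record the first block of data once and compare every read that follows a reset against it. The sketch assumes arrayDatastore as a stand-in for your custom datastore.

```matlab
% Stand-in datastore with 10 rows, read 3 rows at a time.
ds = arrayDatastore((1:10)','OutputType','same','ReadSize',3);

% Reset before any read, then record the first block as the reference.
reset(ds);
firstBlock = read(ds);

% Reset after a few reads while more data is available.
read(ds);
reset(ds);
afterPartial = read(ds);

% Reset after a call to readall.
readall(ds);
reset(ds);
afterReadall = read(ds);

% Reset after all data has been read.
while hasdata(ds)
    read(ds);
end
reset(ds);
afterFull = read(ds);

assert(isequal(afterPartial,firstBlock))
assert(isequal(afterReadall,firstBlock))
assert(isequal(afterFull,firstBlock))
```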

progress

Unit test guidelines for the progress method

Test Case Description Expected Output

Call the progress method on the datastore object before making any calls to the read method.

0 or an expected output based on your custom datastore implementation.

Call the progress method on the datastore object after making a call to readall, but before making any calls to read.

readall(ds);
progress(ds)

0 or an expected output based on your custom datastore implementation.

Call the progress method on the datastore object after making a few calls to read and while more data is available to read.

A fraction between 0 and 1, or an expected output based on your custom datastore implementation.

Call the progress method on the datastore object when no more data is available to read.

1 or an expected output based on your custom datastore implementation.

preview

Unit test guidelines for the preview method

Test Case Description Expected Output

Call preview on the datastore object before making any calls to read.

The preview method returns the expected data from the beginning of the datastore, based on your custom datastore implementation.

Call preview on the datastore object after making a few calls to read and while more data is available to read.

The preview method returns the expected data from the beginning of the datastore, based on your custom datastore implementation.

Call preview on the datastore object after making a call to readall and while more data is available to read.

The preview method returns the expected data from the beginning of the datastore, based on your custom datastore implementation.

Call preview on the datastore object after making a few calls to read and a call to reset.

The preview method returns the expected data from the beginning of the datastore, based on your custom datastore implementation.

Call preview on the datastore object when no more data is available to read.

The preview method returns the expected data from the beginning of the datastore, based on your custom datastore implementation.

Call preview after making a few calls to the read method, and then call read again.

The read method returns data starting from the end point of the previous read operation.

If you specify read size, then the size of the returned data is equivalent to read size.

Call preview, and then call readall on the datastore.

The readall method returns all the data from the datastore.

While the datastore has data available to read, call preview, and then call hasdata.

The hasdata method returns true.
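The key property in these cases is that preview always returns the same data and never moves the read position. A minimal sketch, assuming arrayDatastore as a stand-in for your custom datastore:

```matlab
% Known reference data and a stand-in datastore (replace with your own).
data = (1:10)';
ds = arrayDatastore(data,'OutputType','same','ReadSize',3);

% Preview before any read is the reference result.
previewAtStart = preview(ds);

% Preview after a few reads must return the same data.
read(ds);
read(ds);
previewMidway = preview(ds);

% The next read continues from the previous read, not from the preview.
nextBlock = read(ds);

% Preview must not affect hasdata or readall.
preview(ds);
stillHasData = hasdata(ds);
allData = readall(ds);

assert(isequal(previewMidway,previewAtStart))
assert(isequal(nextBlock,data(7:9)))
assert(stillHasData)
assert(isequal(allData,data))
```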

partition

Unit test guidelines for the partition method

Test Case Description Expected Output

Call partition on the datastore object ds with a valid number of partitions and a valid partition index.

Call read on a partition of the datastore and verify the data.

subds = partition(ds,n,index)
read(subds)

Verify that the partition is valid.

isequal(properties(ds),properties(subds))
isequal(methods(ds),methods(subds))

The partition method partitions the datastore into n partitions and returns the partition corresponding to the specified index.

The returned partition subds must be a datastore object of your custom datastore.

The partitioned datastore subds must have the same methods and properties as the original datastore.

The isequal statements return true.

Calling read on the partition returns data starting from the beginning of the partition.

If you specify read size, then the size of the returned data is equivalent to read size.

Call partition on the datastore object ds with the number of partitions specified as 1 and the index of the returned partition specified as 1.

Verify the data returned by calling read and preview on a partition of the partitioned datastore.

subds = partition(ds,1,1)
isequal(properties(ds),properties(subds))
isequal(methods(ds),methods(subds))
isequaln(read(subds),read(ds))
isequaln(preview(subds),preview(ds))

The partition subds must be a datastore object of your custom datastore.

The partition subds must have the same methods and properties as the original datastore ds.

The isequal and isequaln statements return true.

Call partition on the partition subds with a valid number of partitions and a valid partition index.

The repartitioning of a partition of the datastore should work without errors.
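The partition checks can be combined into one script. The sketch below assumes arrayDatastore as a stand-in for a custom datastore that inherits from matlab.io.datastore.Partitionable; the partition count of 2 is an arbitrary choice for illustration.

```matlab
% Stand-in partitionable datastore (replace with your own).
ds = arrayDatastore((1:10)','OutputType','same','ReadSize',3);

% Partition into n parts and take one of them.
n = 2;
index = 1;
subds = partition(ds,n,index);

% The partition must be the same kind of object as the original.
sameClass = isequal(class(subds),class(ds));
sameProps = isequal(properties(ds),properties(subds));
sameMethods = isequal(methods(ds),methods(subds));

% A single partition covering everything must behave like the original.
whole = partition(ds,1,1);
samePreview = isequaln(preview(whole),preview(ds));
sameRead = isequaln(read(whole),read(ds));

% Repartitioning a partition should run without errors.
subsub = partition(subds,2,1);

assert(sameClass && sameProps && sameMethods)
assert(samePreview && sameRead)
assert(isa(subsub,'matlab.io.Datastore'))
```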

initializeDatastore

If your datastore inherits from matlab.io.datastore.HadoopFileBased, then verify the behavior of initializeDatastore using the guidelines in this table.

Test Case Description Expected Output

Call initializeDatastore on the datastore object ds with a valid info struct.

The info struct contains these fields:

  • FileName

  • Offset

  • Size

FileName is of data type char, and the fields Offset and Size are of data type double.

For example, initialize the info struct, and then call initializeDatastore on the datastore object ds.

info = struct('FileName','myFileName.ext', ...
    'Offset',0,'Size',500)
initializeDatastore(ds,info)

Verify the initialization by examining the properties of your datastore object.

ds

The initializeDatastore method initializes the custom datastore object ds with the necessary information from the info struct.

getLocation

If your datastore inherits from matlab.io.datastore.HadoopFileBased, then verify the behavior of getLocation using these guidelines.

Test Case Description Expected Output

Call getLocation on the datastore object.

location = getLocation(ds)

Based on your custom datastore implementation, the location output is either of these:

  • List of files or directories

  • A matlab.io.datastore.DsFileSet object

If location is a matlab.io.datastore.DsFileSet object, then call resolve to verify the files in the location output.

resolve(location)

The getLocation method returns the location of files in Hadoop.

isfullfile

If your datastore inherits from matlab.io.datastore.HadoopFileBased, then verify the behavior of isfullfile using these guidelines.

Test Case Description Expected Output

Call isfullfile on the datastore object.

Based on your custom datastore implementation, the isfullfile method returns true or false.

Workflow Tests

Verify your workflow tests in the appropriate environment.

  • If your datastore inherits only from matlab.io.Datastore, then verify all workflow tests in a local MATLAB® session.

  • If your datastore has parallel processing support (inherits from matlab.io.datastore.Partitionable), then verify your workflow tests in parallel execution environments, such as Parallel Computing Toolbox™ and MATLAB Parallel Server™.

  • If your datastore has Hadoop support (inherits from matlab.io.datastore.HadoopFileBased), then verify your workflow tests in a Hadoop cluster.

Tall Workflow

Testing guidelines for the tall workflow

Test Case Description Expected Output

Create a tall array by calling tall on the datastore object ds.

t = tall(ds)

The tall function returns an output that is the same data type as the output of the read method of the datastore.

For this test step, create a datastore object with data that fits in your system memory. Then, create a tall array using this datastore object.

t = tall(ds)

If your data is numeric, then apply an appropriate function, such as the mean function, to both the data in ds and to t, then compare the results.

If your data is of the data type string or categorical, then apply the unique function on a column of ds and a column of t, then compare the results.

Apply gather and verify the result.

For examples, see Big Data Workflow Using Tall Arrays and Datastores (Parallel Computing Toolbox).

No errors

The function returns an output of the correct data type (not of a tall data type).

The function returns the same result whether it is applied to ds or to t.
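For numeric data, the comparison described above might look like the following sketch. It assumes arrayDatastore as a stand-in for your custom datastore and assumes that the stand-in is accepted by tall. The call to mapreducer(0) keeps the evaluation in the local MATLAB session so that no parallel pool is started.

```matlab
% Evaluate tall expressions in the local MATLAB session.
mapreducer(0);

% Small in-memory data set and a stand-in datastore (replace with your own).
data = (1:10)';
ds = arrayDatastore(data,'OutputType','same');

% Create the tall array and compute the same statistic both ways.
t = tall(ds);
tallMean = gather(mean(t));
inMemoryMean = mean(readall(ds));

assert(isa(t,'tall'))
assert(~isa(tallMean,'tall'))                 % gather returns in-memory data
assert(abs(tallMean - inMemoryMean) < 1e-12)  % same result on both paths
```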

MapReduce Workflow

Testing guidelines for the MapReduce workflow

Test Case Description Expected Output

Call mapreduce on the datastore object ds.

outds = mapreduce(ds,@mapper,@reducer)
For more information, see mapreduce.

To support the use of the mapreduce function, the read method of your custom datastore must return both the info and the data output arguments.

No error

The MapReduce operation returns the expected result

Next Steps

Note

This test procedure provides guidelines to test the minimal set of behaviors and functionalities for custom datastores. Additional tests are necessary to qualify any unique functionalities of your custom datastore.

After you complete the implementation and validation of your custom datastore, your custom datastore is ready to use.
