Import HDF5 Files
Overview
Hierarchical Data Format, Version 5 (HDF5) is a general-purpose, machine-independent standard for storing scientific data in files, developed by the National Center for Supercomputing Applications (NCSA). HDF5 is used by a wide range of engineering and scientific fields that want a standard way to store data so that it can be shared. For more information about the HDF5 file format, read the HDF5 documentation available at The HDF Group website (https://www.hdfgroup.org).
MATLAB® provides two methods to import data from an HDF5 file:
High-level functions that make it easy to import data, when working with numeric data sets
Low-level functions that enable more complete control over the importing process, by providing access to the routines in the HDF5 C library
Note
For information about importing HDF4 files, which have a separate, incompatible format, see Import HDF4 Files Programmatically.
Import Data Using High-Level HDF5 Functions
MATLAB includes several functions that you can use to examine the contents of an HDF5 file and import data from the file into the MATLAB workspace.
Note
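You can use the high-level functions only to read numeric data sets or attributes. To read non-numeric data sets or attributes, you must use the low-level interface.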
h5disp: View the contents of an HDF5 file.
h5info: Create a structure that contains all the metadata defining an HDF5 file.
h5read: Read data from a variable in an HDF5 file.
h5readatt: Read data from an attribute associated with a variable in an HDF5 file or with the file itself (a global attribute).
For details about how to use these functions, see their reference pages, which include examples. The following sections illustrate some common usage scenarios.
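As a rough orientation, the four functions can be combined as in the sketch below. It assumes that the sample file example.h5 shipped with MATLAB is on the path. The group, data set, and attribute names come from that file's h5disp listing.

```matlab
% Sketch: the four high-level HDF5 import functions applied to the sample file.
filename = 'example.h5';                    % sample file shipped with MATLAB

h5disp(filename, '/g2');                    % print the contents of group /g2
info = h5info(filename);                    % metadata for the whole file, as a struct
data = h5read(filename, '/g2/dset2.1');     % read one numeric data set
attr = h5readatt(filename, '/', 'attr2');   % read an attribute of the root group
```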
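Determine Contents of HDF5 File
HDF5 files can contain data and metadata, called attributes. HDF5 files organize the data and metadata in a hierarchical structure similar to the hierarchical structure of a UNIX® file system.
In an HDF5 file, the directories in the hierarchy are called groups. A group can contain other groups, data sets, attributes, links, and data types. A data set is a collection of data, such as a multidimensional numeric array or a string. An attribute is any data that is associated with another entity, such as a data set. A link is similar to a UNIX file system symbolic link. Links are a way to reference objects without having to make a copy of the object.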
Data types are a description of the data in the data set or attribute. Data types tell how to interpret the data in the data set.
To get a quick view into the contents of an HDF5 file, use the h5disp function.
    h5disp('example.h5')

    HDF5 example.h5
    Group '/'
        Attributes:
            'attr1':  97 98 99 100 101 102 103 104 105 0
            'attr2':  2x2 H5T_INTEGER
        Group '/g1'
            Group '/g1/g1.1'
                Dataset 'dset1.1.1'
                    Size:  10x10
                    MaxSize:  10x10
                    Datatype:   H5T_STD_I32BE (int32)
                    ChunkSize:  []
                    Filters:  none
                    Attributes:
                        'attr1':  49 115 116 32 97 116 116 114 105 ...
                        'attr2':  50 110 100 32 97 116 116 114 105 ...
                Dataset 'dset1.1.2'
                    Size:  20
                    MaxSize:  20
                    Datatype:   H5T_STD_I32BE (int32)
                    ChunkSize:  []
                    Filters:  none
            Group '/g1/g1.2'
                Group '/g1/g1.2/g1.2.1'
                    Link 'slink'
                        Type:  soft link
        Group '/g2'
            Dataset 'dset2.1'
                Size:  10
                MaxSize:  10
                Datatype:   H5T_IEEE_F32BE (single)
                ChunkSize:  []
                Filters:  none
            Dataset 'dset2.2'
                Size:  5x3
                MaxSize:  5x3
                Datatype:   H5T_IEEE_F32BE (single)
                ChunkSize:  []
                Filters:  none
        .
        .
        .
To explore the hierarchical organization of an HDF5 file, use the h5info function. h5info returns a structure that contains various information about the HDF5 file, including the name of the file.
    info = h5info('example.h5')

    info =

          Filename: 'matlabroot\matlab\toolbox\matlab\demos\example.h5'
              Name: '/'
            Groups: [4x1 struct]
          Datasets: []
         Datatypes: []
             Links: []
        Attributes: [2x1 struct]
By looking at the Groups and Attributes fields, you can see that the file contains four groups and two attributes. The Datasets, Datatypes, and Links fields are all empty, indicating that the root group does not contain any data sets, data types, or links. To explore the contents of the sample HDF5 file further, examine one of the structures in Groups. The following example shows the contents of the second structure in this field.
    level2 = info.Groups(2)

    level2 =

              Name: '/g2'
            Groups: []
          Datasets: [2x1 struct]
         Datatypes: []
             Links: []
        Attributes: []
In the sample file, the group named /g2 contains two data sets. The following figure illustrates this part of the sample HDF5 file organization.
To get information about a data set, such as its name, dimensions, and data type, look at either of the structures returned in the Datasets field.
    dataset1 = level2.Datasets(1)

    dataset1 =

          Filename: 'matlabroot\example.h5'
              Name: '/g2/dset2.1'
              Rank: 1
          Datatype: [1x1 struct]
              Dims: 10
           MaxDims: 10
            Layout: 'contiguous'
        Attributes: []
             Links: []
         ChunkSize: []
         FillValue: []
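The same field-by-field exploration can be automated. The sketch below walks the Groups field of the h5info structure and collects the full path of every data set. It is one possible approach and not part of the HDF5 interface itself. The variable names pending and paths are illustrative.

```matlab
% Sketch: list the full path of every data set in the sample file by
% walking the Groups field of the h5info structure (depth-first, iterative).
info = h5info('example.h5');
pending = {info};               % groups still to visit, starting at the root
paths = {};                     % collected data set paths

while ~isempty(pending)
    grp = pending{1};
    pending(1) = [];
    for k = 1:numel(grp.Datasets)
        % Group names hold the full path; data set names are relative.
        % strrep collapses the double slash produced for the root group.
        paths{end+1} = strrep([grp.Name '/' grp.Datasets(k).Name], '//', '/'); %#ok<AGROW>
    end
    % Visit child groups before the remaining siblings.
    pending = [num2cell(grp.Groups(:)); pending(:)];
end

disp(paths(:));
```

Each collected path can be passed directly to h5read or h5disp as the data set name.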
Import Data from HDF5 File
To read data or metadata from an HDF5 file, use the h5read function. As arguments, specify the name of the HDF5 file and the name of the data set. (To read the value of an attribute, you must use h5readatt.)
To illustrate, this example reads the data set /g2/dset2.1 from the HDF5 sample file example.h5.
    data = h5read('example.h5','/g2/dset2.1')

    data =

        1.0000
        1.1000
        1.2000
        1.3000
        1.4000
        1.5000
        1.6000
        1.7000
        1.8000
        1.9000
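h5read also accepts optional start and count arguments to read only part of a data set, and h5readatt reads attributes. The sketch below shows both against the sample file. The data set and attribute names are taken from the h5disp listing, and the particular start and count values are arbitrary choices for illustration.

```matlab
% Sketch: read part of a data set, then read an attribute.
% start and count are 1-based and are given per dimension.
start = 3;      % index of the first element to read
count = 4;      % number of elements to read
part = h5read('example.h5', '/g2/dset2.1', start, count);   % elements 3 through 6

% Attributes are read with h5readatt; the second argument names the
% object (here a data set) that the attribute is attached to.
attr = h5readatt('example.h5', '/g1/g1.1/dset1.1.1', 'attr1');
```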
Map HDF5 Data Types to MATLAB Data Types
When the h5read function reads data from an HDF5 file into the MATLAB workspace, it maps HDF5 data types to MATLAB data types, as shown in the table below.
HDF5 Data Type | h5read Returns |
---|---|
Bitfield | Array of packed 8-bit integers |
Float | MATLAB single and double types, provided that they occupy 64 bits or fewer |
Integer types, signed and unsigned | Equivalent MATLAB integer types, signed and unsigned |
Opaque | Array of uint8 values |
Reference | Returns the actual data pointed to by the reference, not the value of the reference. |
Strings, fixed-length and variable length | String arrays. |
Enums | Cell array of character vectors, where each enumerated value is replaced by the corresponding member name |
Compound | 1-by-1 struct array; the dimensions of the data set are expressed in the fields of the structure. |
Arrays | Array of values using the same data type as the HDF5 array. For example, if the array is of signed 32-bit integers, the MATLAB array will be of type int32. |
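The float and integer rows of the table can be checked against the sample file with a short sketch like the following. The two data sets are the ones that the h5disp listing reports as H5T_IEEE_F32BE and H5T_STD_I32BE.

```matlab
% Sketch: confirm the MATLAB classes that h5read returns for two data sets.
f = h5read('example.h5', '/g2/dset2.1');          % stored as H5T_IEEE_F32BE
n = h5read('example.h5', '/g1/g1.1/dset1.1.1');   % stored as H5T_STD_I32BE

fprintf('dset2.1 is %s, dset1.1.1 is %s\n', class(f), class(n));
```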
The example HDF5 file included with MATLAB includes examples of all these data types.
For example, the data set /g3/string is a string.
    h5disp('example.h5','/g3/string')

    HDF5 example.h5
    Dataset 'string'
        Size:  2
        MaxSize:  2
        Datatype:   H5T_STRING
            String Length: 3
            Padding: H5T_STR_NULLTERM
            Character Set: H5T_CSET_ASCII
            Character Type: H5T_C_S1
        ChunkSize:  []
        Filters:  none
        FillValue:  ''
Now read the data from the file. MATLAB returns it as a cell array of character vectors.
    s = h5read('example.h5','/g3/string')

    s =

        'ab '
        'de '

    >> whos s
      Name      Size            Bytes  Class    Attributes

      s         2x1               236  cell
Compound data types are always returned as a 1-by-1 struct. The dimensions of the data set are expressed in the fields of the struct. For example, the data set /g3/compound2D is a compound data type.
    h5disp('example.h5','/g3/compound2D')

    HDF5 example.h5
    Dataset 'compound2D'
        Size:  2x3
        MaxSize:  2x3
        Datatype:   H5T_COMPOUND
            Member 'a': H5T_STD_I8LE (int8)
            Member 'b': H5T_IEEE_F64LE (double)
        ChunkSize:  []
        Filters:  none
        FillValue:  H5T_COMPOUND
Now read the data from the file. MATLAB returns it as a 1-by-1 struct.
    data = h5read('example.h5','/g3/compound2D')

    data =

        a: [2x3 int8]
        b: [2x3 double]
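Because each member of the compound type arrives as its own array, individual members are reached through ordinary struct field access. The following sketch illustrates this. The pairs variable is only an illustration of how the members might be recombined element by element.

```matlab
% Sketch: work with the members of a compound data set after reading it.
data = h5read('example.h5', '/g3/compound2D');

a = data.a;                      % member 'a', same 2x3 shape as the data set
b = data.b;                      % member 'b', same 2x3 shape as the data set

% One row per data set element: [a b], with 'a' converted to double.
pairs = [double(a(:)) b(:)];
```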
Import Data Using Low-Level HDF5 Functions
MATLAB provides direct access to dozens of functions in the HDF5 library with low-level functions that correspond to the functions in the HDF5 library. In this way, you can access the features of the HDF5 library from MATLAB, such as reading and writing complex data types and using the HDF5 subsetting capabilities. For more information, see Export Data Using MATLAB Low-Level HDF5 Functions.
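As a minimal sketch of the low-level route, the following reads the same /g2/dset2.1 data set used earlier by opening the file and the data set explicitly. It assumes the sample file example.h5 is on the MATLAB path. Every identifier that is opened is also closed.

```matlab
% Sketch: read a data set with the low-level HDF5 functions.
fid = H5F.open('example.h5', 'H5F_ACC_RDONLY', 'H5P_DEFAULT');  % open read-only
dset_id = H5D.open(fid, '/g2/dset2.1');                         % open the data set

data = H5D.read(dset_id);      % read the whole data set with default settings

H5D.close(dset_id);            % release the data set identifier
H5F.close(fid);                % release the file identifier
```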
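Read HDF5 Data Set Using Dynamically Loaded Filters
MATLAB supports reading and writing HDF5 data sets using dynamically loaded filters. The HDF Group maintains a list of registered filters at Filters on their website.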
To read a data set that has been written using a user-defined, third-party filter, follow these steps:
Install the HDF5 filter plugin on your system as a shared library or DLL.
Set the HDF5_PLUGIN_PATH environment variable to the folder containing the installed plugin binary file. On a Windows® system, use the setenv command in MATLAB. On a Linux® or Mac system, perform this action in a terminal window before you start MATLAB.
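After you complete these steps, you can use the high-level or low-level MATLAB HDF5 functions to read and access data sets that have been compressed using the third-party filter. For more information, see HDF5 Dynamically Loaded Filters on The HDF Group website.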
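On Windows, the second step might look like the sketch below. The folder path is a hypothetical placeholder, not a location that MATLAB or the HDF5 library defines. Replace it with the folder where the plugin was actually installed.

```matlab
% Sketch: point the HDF5 library at a filter plugin folder (Windows).
% The folder below is hypothetical; use the real plugin location instead.
pluginFolder = 'C:\hdf5\plugins';
setenv('HDF5_PLUGIN_PATH', pluginFolder);

% Confirm the value that MATLAB now reports for the variable.
disp(getenv('HDF5_PLUGIN_PATH'));
```

Once the variable is set, data sets written with that filter are read with the same h5read call used for any other data set.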
Linux Users Only: Rebuild Filter Plugins Using MATLAB HDF5 Shared Library
Starting in R2021b, in certain cases, Linux users using a filter plugin with callbacks to core HDF5 library functions must rebuild the plugin using the shipping MATLAB HDF5 shared library, /matlab/bin/glnxa64/libhdf5.so.x.x.x. If you do not rebuild the plugin using this version of the shared library, you might experience issues ranging from undefined behavior to crashes. For more information, see Build HDF5 Filter Plugins on Linux Using MATLAB HDF5 Shared Library or GNU Export Map.