What is the difference between tables and datasets?
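In my MATLAB code I mostly use datasets to store data of different types, together with metadata, in a single container variable. However, I have noticed that my colleagues use tables. The two data types look very similar to me: both can be accessed by column name or by index, both support the summary function, and so on.
What is the difference between these two data types?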
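Without going into detail: table is a fairly new data type that ships with base MATLAB, whereas the older dataset is part of the Statistics and Machine Learning Toolbox.
As you noticed, they are very similar, although I cannot tell you exactly how similar. The documentation, however, is quite clear about which one you should use: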
The dataset data type might be removed in a future release. To work with heterogeneous data, use the MATLAB® table data type instead. See MATLAB table documentation for more information.
So table is the replacement for dataset, and it is available to everyone with MATLAB. Use table and your code stays future-proof.
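As a minimal sketch of what working with a table looks like, the snippet below builds one from made-up sample data (the variable names Name, Age and Smoker are purely illustrative, not from the question) and shows the access patterns the question mentions: by column name, by index, and through summary. It also shows one way to attach metadata through the Properties field.

```matlab
% Hypothetical sample data; names and values are made up for illustration.
Name   = {'Ann'; 'Bob'; 'Cid'};
Age    = [34; 27; 45];
Smoker = [false; true; false];

% table is part of core MATLAB, so no toolbox is needed here.
T = table(Name, Age, Smoker);

ages     = T.Age;         % access a column by name
firstRow = T(1, :);       % parentheses return a (smaller) table
ageOfBob = T{2, 'Age'};   % braces return the underlying value

% Metadata lives in the Properties field of the table.
T.Properties.Description   = 'Example patients';
T.Properties.VariableUnits = {'', 'yr', ''};

summary(T)                % prints a per-variable summary
```

Parentheses keep the container type while braces extract the contents, which mirrors how cell arrays are indexed.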
As brodroll mentioned in the comments, there is also a statement from MathWorks on MATLAB Central:
Broadly speaking, Tables and datasets essentially serve the same
functionality. Following are some of the differences:
1) Tables are included as part of core MATLAB, and do not need the
installation of Statistics Toolbox to use them. Moreover, their design
and terminology makes them a bit more accessible for non-statistical
users, though they remain just as useful for statistics.
2) TABLE is ultimately meant to replace DATASET over time. Hence it is
recommended to use TABLE in place of DATASET. Please note that this
transition will not happen immediately and upcoming releases will
provide more details and strategies for making the transition.
3) You still need to use DATASET in the Statistics Toolbox while using
classes such as ‘LinearModel’ and ‘LinearMixedModel’ (which is new in
MATLAB R2013b). It is recommended to use TABLE and converting to
DATASET only when needed, using TABLE2DATASET.
4) The TABLE class is currently sealed. Hence it is not possible to
subclass from it unlike the DATASET class which can be inherited by a
subclass.
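The conversion mentioned in point 3 can be sketched as follows. This assumes the Statistics and Machine Learning Toolbox is installed, since dataset lives there. The variable names Id and Label are hypothetical.

```matlab
% Requires the Statistics and Machine Learning Toolbox (for dataset).
% Id and Label are made-up variable names used only for illustration.
ds = dataset([1; 2; 3], {'a'; 'b'; 'c'}, 'VarNames', {'Id', 'Label'});

T   = dataset2table(ds);   % migrate existing dataset code to table
ds2 = table2dataset(T);    % convert back only where legacy code needs it
```

Keeping data in a table and converting at the boundary, as the statement recommends, confines the toolbox dependency to the few calls that still expect a dataset.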