Non-image data augmentation

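I'm looking for algorithms and/or tutorials on data augmentation, but everything I find is about image augmentation. Can it be done on other kinds of datasets? I'm working on the Parkinson's disease dataset (https://archive.ics.uci.edu/ml/datasets/parkinsons) and want to create augmented data samples with Python. Is that possible? Or should I use something like MNIST/Fashion-MNIST instead?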

If you have access to the actual recordings, you could apply some of the augmentation techniques used in speech recognition and then re-extract the features, such as fundamental frequency. However, since you're working directly with the extracted features, augmentation is trickier. It is possible to generate synthetic samples by interpolating between existing ones or by adding noise, but because the features are highly correlated, you need a smart way of doing that (see this paper for a simple approach and this one for a more advanced technique). If you have a class imbalance problem, you can also try simple oversampling or undersampling.
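
To make the interpolation and noise ideas concrete, here is a minimal sketch using only NumPy. It is not the method from either paper mentioned above. The function names `interpolate_samples` and `jitter_samples`, the parameters `k` and `scale`, and the toy data are all made up for illustration. Interpolating only between nearby samples of the same class, and drawing noise from the empirical covariance rather than independently per feature, are two simple ways to avoid breaking the correlations between features.

```python
import numpy as np

def interpolate_samples(X, y, n_new, k=5, seed=0):
    """Create n_new synthetic rows, each lying on the line segment between
    a random sample and one of its k nearest same-class neighbours."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Standardise features for the distance computation only,
    # so large-valued features do not dominate.
    std = X.std(axis=0)
    std[std == 0] = 1.0
    new_X, new_y = [], []
    for _ in range(n_new):
        i = rng.integers(len(X))
        same = np.flatnonzero(y == y[i])
        same = same[same != i]
        if same.size == 0:
            # No same-class partner available: fall back to a plain copy.
            new_X.append(X[i].copy())
            new_y.append(y[i])
            continue
        dist = np.linalg.norm((X[same] - X[i]) / std, axis=1)
        nearest = same[np.argsort(dist)[:k]]
        j = rng.choice(nearest)
        lam = rng.random()  # interpolation weight in [0, 1)
        new_X.append(X[i] + lam * (X[j] - X[i]))
        new_y.append(y[i])
    return np.array(new_X), np.array(new_y)

def jitter_samples(X, n_new, scale=0.05, seed=0):
    """Resample rows of X and add Gaussian noise whose covariance is
    `scale` times the empirical covariance of X (correlated noise)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    cov = np.cov(X, rowvar=False)
    idx = rng.integers(len(X), size=n_new)
    noise = rng.multivariate_normal(np.zeros(X.shape[1]), scale * cov, size=n_new)
    return X[idx] + noise

# Toy stand-in data (NOT the Parkinson's dataset): 40 rows, 4 features,
# with an imbalanced binary label.
demo_rng = np.random.default_rng(42)
X_demo = demo_rng.normal(size=(40, 4))
y_demo = np.repeat([0, 1], [30, 10])

X_interp, y_interp = interpolate_samples(X_demo, y_demo, n_new=20)
X_noisy = jitter_samples(X_demo[y_demo == 1], n_new=10)  # one class at a time
print(X_interp.shape, y_interp.shape, X_noisy.shape)
```

Whichever variant you try, generate synthetic rows from the training split only and keep the test split untouched, otherwise the evaluation will look better than it really is.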