如何在 python 中使用交叉验证?

How can I use cross validation in python?

我有一大堆数据。我想将此列表拆分为训练和测试列表。我可以通过应用

拆分它
cutoff = int(.7 * len(data_list)) # 70% of the data is used for training
training_list = data_list[:cutoff]
test_list = data_list[cutoff:]

但我认为这不是评估我的标注器的好策略。如何将我的列表按此百分比划分但在不同的地方并获得可靠的评估分数?谢谢!

sklearn.model_selection中有一个名为train_test_split()的函数 文档:Train Test Split

您也可以在link中找到示例。

>>> from sklearn.model_selection import train_test_split
>>> aaa = list(range(20))
>>> aaa
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> train_test_split(aaa, test_size=0.3)
[[7, 14, 2, 1, 5, 13, 3, 8, 9, 17, 15, 0, 10, 16], [11, 6, 4, 19, 12, 18]]