[Translation] train_test_split() - 白红宇's Personal Blog

[Translation] train_test_split()

Published: 2021-05-07 14:32:29 Views: 24 Category: Technical articles


sklearn.model_selection.train_test_split(*arrays, **options)

Splits arrays or matrices into random train and test subsets.

Parameters

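Parameter	Type	Meaning
*arrays	sequence of indexables with same length / shape[0]	The datasets to split
test_size	float, int or None, optional (default=None)	float: proportion of the dataset to put in the test split (between 0.0 and 1.0); int: absolute number of test samples; None: the complement of train_size, or 0.25 if train_size is also None
train_size	float, int, or None, (default=None)	Same as above, for the train split; None means the complement of the test split
random_state	int, RandomState instance or None, optional (default=None)	int: seed used by the random number generator; RandomState: the random number generator itself; None: the RandomState instance used by np.random serves as the generator
shuffle	boolean, optional (default=True)	Whether or not to shuffle the data before splitting; if False, stratify must be None
stratify	array-like or None (default=None)	If not None, the data is split in a stratified fashion, using this array as the class labels, e.g. stratify=y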
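The difference between a float and an int `test_size` is easiest to see on a small array. The following is a minimal sketch, assuming a toy dataset of 20 samples; the array contents and variable names are illustrative only.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (an assumption for illustration): 20 samples, 2 features each.
X = np.arange(40).reshape((20, 2))

# float test_size: a proportion of the samples (0.25 * 20 = 5 test samples).
X_train_f, X_test_f = train_test_split(X, test_size=0.25, random_state=0)

# int test_size: an absolute number of test samples.
X_train_i, X_test_i = train_test_split(X, test_size=4, random_state=0)

# Both sizes left as None: the test split defaults to 0.25 of the data.
X_train_d, X_test_d = train_test_split(X, random_state=0)

print(len(X_train_f), len(X_test_f))  # 15 5
print(len(X_train_i), len(X_test_i))  # 16 4
print(len(X_train_d), len(X_test_d))  # 15 5
```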

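About the seed used by the random number generator: think of it as a record of the random number generation process. The seed fixes the generator's starting state, so the same seed always produces the same sequence of random numbers.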
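As a quick check of this behaviour, the sketch below splits the same array twice with the same `random_state` and compares the results. The data and the seed value 42 are arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

data = np.arange(10)

# Two independent calls that share the same integer seed.
a_train, a_test = train_test_split(data, random_state=42)
b_train, b_test = train_test_split(data, random_state=42)

# With an identical seed, the shuffling and therefore the split are identical.
same_split = np.array_equal(a_train, b_train) and np.array_equal(a_test, b_test)
print(same_split)  # True
```

Fixing `random_state` this way is what makes an experiment's split reproducible from run to run.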

Returns: a list containing the train-test split of the inputs, with length 2 * len(arrays).
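To illustrate the shape of the return value, here is a small sketch that passes three arrays at once. The third array `w` is a hypothetical set of per-sample weights, introduced only to show that each input contributes one train part and one test part, in input order.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(12).reshape((6, 2))  # row i is [2*i, 2*i + 1]
y = list(range(6))                 # label i belongs to row i
w = np.linspace(0.1, 0.6, 6)       # hypothetical per-sample weights

parts = train_test_split(X, y, w, test_size=2, random_state=0)
print(len(parts))  # 6, i.e. 2 * (number of input arrays)

# The parts come back as (train, test) pairs in the order of the inputs.
X_train, X_test, y_train, y_test, w_train, w_test = parts

# Every array is split with the same indices, so rows stay aligned.
rows_aligned = np.array_equal(X_train[:, 0], 2 * np.array(y_train))
print(rows_aligned)  # True
```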

If the input is sparse, the output is a scipy.sparse.csr_matrix; otherwise the output type is the same as the input type.
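The type rule can be checked with a short sketch. It assumes a small COO-format matrix as the sparse input and a plain Python list as the labels; both are illustrative choices, not part of the original text.

```python
import numpy as np
from scipy import sparse
from sklearn.model_selection import train_test_split

# A sparse input in COO format (chosen for illustration) and list labels.
X_coo = sparse.coo_matrix(np.eye(6))
y = [0, 1, 0, 1, 0, 1]

X_train, X_test, y_train, y_test = train_test_split(
    X_coo, y, test_size=2, random_state=0)

# The sparse input comes back in CSR format, whatever format went in.
print(sparse.issparse(X_train), X_train.format)  # True csr

# The non-sparse input keeps its type: a list in, a list out.
print(type(y_train).__name__)  # list
```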

Examples

>>> import numpy as np
>>> from sklearn.model_selection import train_test_split
>>> X, y = np.arange(10).reshape((5, 2)), range(5)
>>> X
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])
>>> list(y)
[0, 1, 2, 3, 4]

>>> X_train, X_test, y_train, y_test = train_test_split(
...     X, y, test_size=0.33, random_state=42)
>>> X_train
array([[4, 5],
       [0, 1],
       [6, 7]])
>>> y_train
[2, 0, 3]
>>> X_test
array([[2, 3],
       [8, 9]])
>>> y_test
[1, 4]

>>> train_test_split(y, shuffle=False)
[[0, 1, 2], [3, 4]]
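The examples above do not exercise `stratify`. The sketch below is one way to see its effect, assuming an imbalanced toy label array with 15 samples of class 0 and 5 of class 1; the data is made up for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (an assumption): 20 samples, 75% class 0 and 25% class 1.
X = np.arange(40).reshape((20, 2))
y = np.array([0] * 15 + [1] * 5)

# stratify=y asks for the class proportions of y in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

print(np.bincount(y_train))  # [12  4]
print(np.bincount(y_test))   # [3 1]
```

Both subsets keep the original 3:1 class ratio. Since stratification relies on shuffling, `shuffle=False` cannot be combined with a non-None `stratify`.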
