Split a dataset
This topic describes how to use the API to split a dataset into a training dataset and a test dataset.
Function path
fascia.data.horizontal.dataframe.train_test_split
Function definition
def train_test_split(data: HDataFrame,
                     ratio: float,
                     random_state: int = None,
                     shuffle: bool = True) -> (HDataFrame, HDataFrame):
Parameters
| Parameter | Type | Description |
|---|---|---|
| data | HDataFrame | The federated dataset to split. |
| ratio | Float | The split ratio. The value must be between 0 and 1, inclusive, and can be specified to three decimal places. |
| random_state | Integer | The random number seed. If specified, the same seed always produces the same split. The default value is None. |
| shuffle | Bool | Specifies whether to shuffle the data before splitting. The default value is True. |
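The effect of these parameters can be illustrated with a small local sketch that splits an ordinary Python list. This is not the fascia implementation: `split_rows` is a hypothetical helper written only for illustration, and it assumes that `ratio` is the fraction of rows placed in the first (training) part, which the parameter table does not state explicitly.

```python
import random
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_rows(rows: Sequence[T],
               ratio: float,
               random_state: Optional[int] = None,
               shuffle: bool = True) -> Tuple[List[T], List[T]]:
    """Local stand-in for a ratio-based split; not the fascia implementation."""
    if not 0 <= ratio <= 1:
        raise ValueError("ratio must be between 0 and 1, inclusive")
    items = list(rows)
    if shuffle:
        # A dedicated Random instance leaves the global random state untouched.
        random.Random(random_state).shuffle(items)
    # Assumption: ratio is the fraction of rows that goes to the first part.
    cut = round(len(items) * ratio)
    return items[:cut], items[cut:]


rows = list(range(10))
train, test = split_rows(rows, 0.7, random_state=42)
again, _ = split_rows(rows, 0.7, random_state=42)   # same seed, same split
ordered, _ = split_rows(rows, 0.7, shuffle=False)   # keeps the original order
print(len(train), len(test))  # 7 3
```

With a fixed `random_state`, repeated calls return the same partition, and with `shuffle=False` the first part is simply the leading rows in their original order.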
Example
from fascia.data.horizontal.dataframe import train_test_split
# Split an existing federated dataset and save the two resulting datasets.
# Assume that fed_df is a pre-existing federated dataset.
train_set, test_set = train_test_split(fed_df, 0.7)
save_fed_dataframe(train_set, '$output1')
save_fed_dataframe(test_set, '$output2')
Return value
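A tuple containing two federated tables (HDataFrame). As in the example, the first is used as the training dataset and the second as the test dataset.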