Bases: BaseSynthesizer
Source code in ydata/sdk/synthesizers/regular.py
fit(X, privacy_level=PrivacyLevel.HIGH_FIDELITY, entities=None, generate_cols=None, exclude_cols=None, dtypes=None, target=None, anonymize=None, condition_on=None)
Fit the synthesizer.
The synthesizer accepts either a pandas DataFrame or a YData DataSource as the training dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`X` | `Union[DataSource, DataFrame]` | Training dataset | required |
`privacy_level` | `PrivacyLevel` | Synthesizer privacy level (defaults to high fidelity) | `HIGH_FIDELITY` |
`entities` | `Union[str, List[str]]` | (optional) columns representing entity IDs | `None` |
`generate_cols` | `List[str]` | (optional) columns that should be synthesized | `None` |
`exclude_cols` | `List[str]` | (optional) columns that should not be synthesized | `None` |
`dtypes` | `Dict[str, Union[str, DataType]]` | (optional) datatype mapping that overrides the column datatypes in the datasource metadata | `None` |
`target` | `Optional[str]` | (optional) Target column | `None` |
`name` | `Optional[str]` | (optional) Synthesizer instance name | required |
`anonymize` | `Optional[str]` | (optional) fields to anonymize and the anonymization strategy | `None` |
`condition_on` | `Optional[List[str]]` | (optional) list of features to condition upon | `None` |
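The column-related parameters above are easy to mistype, so it can help to check them against the dataset before calling `fit`. The following is a minimal sketch, not part of the SDK: `build_fit_kwargs` is a hypothetical helper, and the rule that a column may not appear in both `generate_cols` and `exclude_cols` is an assumption that this reference does not state.

```python
from typing import Any, Dict, List, Optional


def build_fit_kwargs(
    columns: List[str],
    privacy_level: Any = None,
    generate_cols: Optional[List[str]] = None,
    exclude_cols: Optional[List[str]] = None,
    target: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for fit(), checking names against the dataset columns."""
    known = set(columns)
    for label, names in (("generate_cols", generate_cols), ("exclude_cols", exclude_cols)):
        unknown = [c for c in (names or []) if c not in known]
        if unknown:
            raise ValueError(f"{label} refers to unknown columns: {unknown}")
    if target is not None and target not in known:
        raise ValueError(f"target column not in dataset: {target!r}")

    # Assumption (not stated in the reference): listing a column in both
    # generate_cols and exclude_cols is treated as a caller mistake.
    overlap = set(generate_cols or []) & set(exclude_cols or [])
    if overlap:
        raise ValueError(f"columns both generated and excluded: {sorted(overlap)}")

    kwargs = {
        "privacy_level": privacy_level,
        "generate_cols": generate_cols,
        "exclude_cols": exclude_cols,
        "target": target,
    }
    # Drop unset options so fit() falls back to its documented defaults.
    return {k: v for k, v in kwargs.items() if v is not None}
```

Given an existing `RegularSynthesizer` instance `synth` and a pandas DataFrame `df`, the result could be passed on as `synth.fit(df, **build_fit_kwargs(list(df.columns), exclude_cols=["id"]))`, where the `id` column is only an illustration.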
sample(n_samples=1, condition_on=None)
Sample from a RegularSynthesizer instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`n_samples` | `int` | number of rows in the sample | `1` |
`condition_on` | `Optional[dict]` | (optional) conditional sampling parameters | `None` |
Returns:
Type | Description |
---|---|
`DataFrame` | synthetic data |
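Since `sample` returns one DataFrame per call, a large request can be split into several smaller calls. The sketch below shows one way to do that using only the documented `n_samples` argument. `sample_in_batches` is a hypothetical helper, and whether batching is needed at all depends on SDK limits that this reference does not describe.

```python
from typing import Any, List


def sample_in_batches(synth: Any, total_rows: int, batch_size: int) -> List[Any]:
    """Call synth.sample() repeatedly until total_rows rows have been requested."""
    if total_rows < 0 or batch_size <= 0:
        raise ValueError("total_rows must be >= 0 and batch_size must be > 0")

    batches = []
    remaining = total_rows
    while remaining > 0:
        n = min(batch_size, remaining)
        # Only the documented n_samples argument is used here.
        batches.append(synth.sample(n_samples=n))
        remaining -= n
    return batches
```

With a fitted synthesizer, the returned list of DataFrames could then be combined with `pandas.concat(batches, ignore_index=True)`.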
PrivacyLevel
Bases: StringEnum
Privacy level exposed to the end-user.
BALANCED_PRIVACY_FIDELITY = 'BALANCED_PRIVACY_FIDELITY'
class-attribute instance-attribute
Balanced privacy/fidelity
HIGH_FIDELITY = 'HIGH_FIDELITY'
class-attribute instance-attribute
High fidelity
HIGH_PRIVACY = 'HIGH_PRIVACY'
class-attribute instance-attribute
High privacy
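Because each member's value equals its name, a privacy level read from a configuration file or command line can be mapped to a member by value. The sketch below illustrates this with a stand-in enum built on the standard library. `PrivacyLevelSketch` and `parse_privacy_level` are hypothetical names, and the stand-in only mirrors the three documented members; it is not the SDK's `PrivacyLevel` class, and the behaviour of its `StringEnum` base is not assumed here.

```python
from enum import Enum


class PrivacyLevelSketch(str, Enum):
    """Stand-in mirroring the documented members; not the SDK class."""

    BALANCED_PRIVACY_FIDELITY = "BALANCED_PRIVACY_FIDELITY"
    HIGH_FIDELITY = "HIGH_FIDELITY"
    HIGH_PRIVACY = "HIGH_PRIVACY"


def parse_privacy_level(value: str) -> PrivacyLevelSketch:
    """Map a user-supplied string to a member, ignoring case and surrounding whitespace."""
    key = value.strip().upper()
    try:
        return PrivacyLevelSketch(key)
    except ValueError:
        valid = ", ".join(member.value for member in PrivacyLevelSketch)
        raise ValueError(
            f"unknown privacy level {value!r}; expected one of: {valid}"
        ) from None
```

Rejecting unknown strings up front gives a clearer error than passing an invalid value through to `fit`.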