Preprocessors

TableConfig pydantic-config

TableConfig(
    description: str | None = None,
    column_descriptions: dict[str, str] = {},
    config: dict[str, Any] = ConfigDict(),
)

A configuration for a table preprocessor.

Parameters:

description (str | None, default: None): A description of the table, used in the corresponding JSON schema.

column_descriptions (dict[str, str], default: {}): A dictionary with an optional description for the table columns, used in the corresponding JSON schema.

config (dict[str, Any], default: ConfigDict()): A pydantic.ConfigDict to be used in the corresponding Pydantic model.
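
As a rough illustration of how description and column_descriptions could surface in a JSON schema, the sketch below builds a schema-like dictionary by hand with the standard library only. The helper table_json_schema, the players table, and its column types are hypothetical; the library generates its schema through a Pydantic model, and the real output may differ.

```python
from __future__ import annotations

from typing import Any


def table_json_schema(
    name: str,
    columns: dict[str, str],
    description: str | None = None,
    column_descriptions: dict[str, str] | None = None,
) -> dict[str, Any]:
    """Hypothetical helper: build a JSON-schema-like dict for one table."""
    column_descriptions = column_descriptions or {}
    properties: dict[str, Any] = {}
    for col, json_type in columns.items():
        prop: dict[str, Any] = {"type": json_type}
        # Column descriptions are optional, so only annotate the ones given.
        if col in column_descriptions:
            prop["description"] = column_descriptions[col]
        properties[col] = prop
    schema: dict[str, Any] = {
        "title": name,
        "type": "object",
        "properties": properties,
    }
    # The table description is optional as well.
    if description is not None:
        schema["description"] = description
    return schema


schema = table_json_schema(
    "players",
    {"name": "string", "age": "integer"},
    description="One row per registered player.",
    column_descriptions={"age": "Age in years at registration."},
)
```

Columns without an entry in column_descriptions simply carry no description, which matches the "optional" wording above.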

TablePreproc

from_table classmethod

from_table(
    table: Table,
    *,
    config: TableConfig | None = None,
    overwrites: dict[str, ColumnPreproc | None]
    | None = None,
) -> TablePreproc

Create the table preprocessor from a Table object.

Parameters:

table (Table, required): The Table object with the column information.

config (TableConfig | None, default: None): The table preprocessor configuration.

overwrites (dict[str, ColumnPreproc | None] | None, default: None): Optional overwrites for each column. A value of None means the column is ignored.

Returns:

TablePreproc: The table preprocessor.
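
The overwrites semantics (a replacement per column, with None meaning "ignore this column") can be sketched as a plain dictionary merge. This is a minimal stand-in, not the library's code: resolve_columns is a hypothetical helper, and strings take the place of real ColumnPreproc objects.

```python
from __future__ import annotations


def resolve_columns(
    defaults: dict[str, str],
    overwrites: dict[str, str | None] | None = None,
) -> dict[str, str]:
    """Hypothetical helper: merge per-column defaults with overwrites."""
    resolved = dict(defaults)  # copy, so the defaults are left untouched
    for name, preproc in (overwrites or {}).items():
        if preproc is None:
            # An explicit None drops the column from preprocessing.
            resolved.pop(name, None)
        else:
            # Any other value replaces the default for that column.
            resolved[name] = preproc
    return resolved


defaults = {"id": "Categorical", "age": "Numeric", "notes": "Text"}
resolved = resolve_columns(defaults, {"age": "Categorical", "notes": None})
```

Note the difference between leaving a column out of overwrites (the default is kept) and mapping it to None (the column is dropped).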

get_default_column classmethod

get_default_column(
    column: Column | str,
) -> type[ColumnPreproc] | None

Get the type of the default column preprocessor for the given Column.

Parameters:

column (Column | str, required): The column type, as a Column instance or a string.

Returns:

type[ColumnPreproc] | None: The type of the column preprocessor. Returns None if the column is not supported.
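
Because the return value is a class (or None), callers need to check for None before instantiating it. The sketch below shows that pattern with a small registry lookup. All names here (default_preproc_for, the stand-in classes, and the type names "categorical" and "numeric") are hypothetical and only mimic the documented return contract.

```python
from __future__ import annotations


class ColumnPreproc:
    """Stand-in base class for column preprocessors."""


class CategoricalPreproc(ColumnPreproc):
    pass


class NumericPreproc(ColumnPreproc):
    pass


# Hypothetical registry mapping a column type name to its default class.
_DEFAULTS: dict[str, type[ColumnPreproc]] = {
    "categorical": CategoricalPreproc,
    "numeric": NumericPreproc,
}


def default_preproc_for(column: str) -> type[ColumnPreproc] | None:
    """Return the default preprocessor class, or None if unsupported."""
    return _DEFAULTS.get(column)


preproc_type = default_preproc_for("numeric")
# Instantiate only when the column type is supported.
preproc = preproc_type() if preproc_type is not None else None
```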

RelSynthPreproc

from_schema classmethod

from_schema(
    schema: Schema,
    *,
    config: dict[str, TableConfig] | None = None,
    overwrites: dict[str, TablePreproc | None]
    | None = None,
) -> RelSynthPreproc

Build the preprocessor from a Schema object.

Parameters:

schema (Schema, required): The dataset Schema.

config (dict[str, TableConfig] | None, default: None): A dictionary with optional table preprocessor configurations.

overwrites (dict[str, TablePreproc | None] | None, default: None): Optional overwrites for each table. A value of None means the table is ignored.

Returns:

RelSynthPreproc: The preprocessor.

from_data classmethod

from_data(
    data: RelationalData,
    *,
    config: dict[str, TableConfig] | None = None,
    overwrites: dict[str, TablePreproc | None]
    | None = None,
) -> RelSynthPreproc

Build and fit the preprocessor from a RelationalData.

Parameters:

data (RelationalData, required): The data used to build and fit the preprocessor.

config (dict[str, TableConfig] | None, default: None): A dictionary with optional table preprocessor configurations.

overwrites (dict[str, TablePreproc | None] | None, default: None): Optional overwrites for each table. A value of None means the table is ignored.

Returns:

RelSynthPreproc: The fitted preprocessor.
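
The difference between from_schema (build only) and from_data (build and fit) can be illustrated with a toy class. The sketch assumes that from_data behaves like building from the structure of the data and then calling fit; the source does not spell this out. ToyPreproc, its column-mean "fitting", and the dictionary data layout are all hypothetical.

```python
from __future__ import annotations

from statistics import mean


class ToyPreproc:
    """Stand-in preprocessor; not the library's RelSynthPreproc."""

    def __init__(self, columns: list[str]) -> None:
        self.columns = columns
        self.means: dict[str, float] | None = None  # set by fit()

    @classmethod
    def from_schema(cls, schema: list[str]) -> ToyPreproc:
        # Build only: the structure is known, but nothing is learned yet.
        return cls(list(schema))

    @classmethod
    def from_data(cls, data: dict[str, list[float]]) -> ToyPreproc:
        # Build from the structure of the data, then fit on its values.
        return cls.from_schema(list(data)).fit(data)

    def fit(self, data: dict[str, list[float]]) -> ToyPreproc:
        self.means = {col: mean(data[col]) for col in self.columns}
        return self  # returning self allows chaining


data = {"age": [20.0, 30.0, 40.0], "score": [1.0, 3.0]}
unfitted = ToyPreproc.from_schema(["age", "score"])
fitted = ToyPreproc.from_data(data)
```

In this sketch, an object built from the schema alone still needs a call to fit before it holds any learned state.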

fit

Fit the preprocessor to the input data.

Parameters:

data (RelationalData, required): The data to fit the preprocessor.

Returns:

RelSynthPreproc: The fitted preprocessor.

protect

protect(data: RelationalData, fit: bool) -> RelationalData

Protect the data and optionally fit the protectors.

Parameters:

data (RelationalData, required): The data to protect.

fit (bool, required): Whether to fit the protectors.

Returns:

RelationalData: The protected data.
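
The role of the fit flag can be sketched with a toy protector that either learns its state from the data it is given (fit=True) or reuses previously learned state (fit=False). The source does not say what the protectors actually do; the rare-category masking below, the class RareCategoryProtector, and its parameters are purely illustrative assumptions.

```python
from __future__ import annotations

from collections import Counter


class RareCategoryProtector:
    """Hypothetical protector: masks values seen fewer than min_count times."""

    def __init__(self, min_count: int = 2, mask: str = "<rare>") -> None:
        self.min_count = min_count
        self.mask = mask
        self.frequent: set[str] | None = None  # learned when fit=True

    def protect(self, values: list[str], fit: bool) -> list[str]:
        if fit:
            # Learn which categories are frequent enough to keep.
            counts = Counter(values)
            self.frequent = {v for v, n in counts.items() if n >= self.min_count}
        if self.frequent is None:
            raise RuntimeError("protect(fit=False) called before fitting")
        # Apply the learned state: anything not known as frequent is masked.
        return [v if v in self.frequent else self.mask for v in values]


protector = RareCategoryProtector()
train = protector.protect(["a", "a", "b", "c", "c"], fit=True)
new = protector.protect(["a", "b", "d"], fit=False)
```

With fit=False the second call reuses what was learned from the first, so the unseen value "d" is masked in the same way as the rare value "b".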

RelEventPreproc

from_schema classmethod

from_schema(
    schema: Schema,
    *,
    ord_cols: dict[str, str] | None = None,
    config: dict[str, TableConfig] | None = None,
    overwrites: dict[str, TablePreproc | None]
    | None = None,
    **kwargs: Any,
) -> RelEventPreproc

Build the preprocessor from a Schema object.

Parameters:

schema (Schema, required): The dataset Schema.

ord_cols (dict[str, str] | None, default: None): For each event table, the column that contains the order of the events. If None, only a single event table is allowed, and events are assumed to be ordered as they appear in that table. If a dictionary, every event table must have an order column.

config (dict[str, TableConfig] | None, default: None): A dictionary with optional table preprocessor configurations.

overwrites (dict[str, TablePreproc | None] | None, default: None): Optional overwrites for each table. A value of None means the table is ignored.

Returns:

RelEventPreproc: The preprocessor.
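
The two ord_cols modes can be sketched as a small validation-and-ordering helper. This is a standard-library illustration of the rules stated above, not the library's implementation: order_events, the list-of-dicts table layout, and the choice to sort rows by the order column are assumptions.

```python
from __future__ import annotations


def order_events(
    event_tables: dict[str, list[dict]],
    ord_cols: dict[str, str] | None = None,
) -> dict[str, list[dict]]:
    """Hypothetical helper mirroring the documented ord_cols rules."""
    if ord_cols is None:
        if len(event_tables) > 1:
            raise ValueError("ord_cols is required with more than one event table")
        # Rows are taken in the order in which they already appear.
        return {name: list(rows) for name, rows in event_tables.items()}
    missing = sorted(set(event_tables) - set(ord_cols))
    if missing:
        raise ValueError(f"no order column given for event tables: {missing}")
    # Sort each event table by its own order column.
    return {
        name: sorted(rows, key=lambda row: row[ord_cols[name]])
        for name, rows in event_tables.items()
    }


visits = [{"t": 3, "kind": "exit"}, {"t": 1, "kind": "enter"}]
ordered = order_events({"visits": visits}, {"visits": "t"})
as_given = order_events({"visits": visits})
```

Without ord_cols the single table keeps its row order; with ord_cols, every event table must name an order column or the helper raises.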

from_data classmethod

from_data(
    data: RelationalData,
    *,
    ord_cols: dict[str, str] | None = None,
    config: dict[str, TableConfig] | None = None,
    overwrites: dict[str, TablePreproc | None]
    | None = None,
    **kwargs: Any,
) -> RelEventPreproc

Build and fit the preprocessor from a RelationalData.

Parameters:

data (RelationalData, required): The data used to build and fit the preprocessor.

ord_cols (dict[str, str] | None, default: None): For each event table, the column that contains the order of the events. If None, only a single event table is allowed, and events are assumed to be ordered as they appear in that table. If a dictionary, every event table must have an order column.

config (dict[str, TableConfig] | None, default: None): A dictionary with optional table preprocessor configurations.

overwrites (dict[str, TablePreproc | None] | None, default: None): Optional overwrites for each table. A value of None means the table is ignored.

Returns:

RelEventPreproc: The fitted preprocessor.

fit

Fit the preprocessor to the input data.

Parameters:

data (RelationalData, required): The data to fit the preprocessor.

Returns:

RelSynthPreproc: The fitted preprocessor.

protect

protect(data: RelationalData, fit: bool) -> RelationalData

Protect the data and optionally fit the protectors.

Parameters:

data (RelationalData, required): The data to protect.

fit (bool, required): Whether to fit the protectors.

Returns:

RelationalData: The protected data.