dataframely.columns package

class dataframely.columns.Any(*, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)

Bases: Column

A column with arbitrary type.

Because a column of arbitrary type is commonly mapped to the Null type (the default in polars and pyarrow for empty columns), dataframely requires this column to be nullable. Hence, it cannot be used as a primary key.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) -> dict[str, Any]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) -> Self

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) -> bool

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) -> pa.Field

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) -> Series

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.
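
One way to read the ValueError condition: when a check is an opaque callable, a sampler can only draw candidates and retry, so the number of retries depends on how selective the check is. The sketch below illustrates this with plain rejection sampling using Python's random module. It is not dataframely's sampling implementation, and sample_with_check and max_tries are hypothetical names introduced for this illustration.

```python
import random
from typing import Callable


def sample_with_check(
    rng: random.Random,
    check: Callable[[int], bool],
    n: int = 1,
    max_tries: int = 1_000,
) -> list[int]:
    """Draw n integers in [0, 99] that satisfy an opaque check by retrying."""
    out: list[int] = []
    tries = 0
    while len(out) < n:
        if tries >= max_tries:
            # The check is too selective to satisfy within the retry budget.
            raise ValueError("could not satisfy the custom check")
        tries += 1
        candidate = rng.randrange(100)
        if check(candidate):
            out.append(candidate)
    return out


rng = random.Random(42)
evens = sample_with_check(rng, lambda v: v % 2 == 0, n=5)

# A check that never passes exhausts the retry budget.
try:
    sample_with_check(rng, lambda v: False, n=1)
    impossible_failed = False
except ValueError:
    impossible_failed = True
```

In this sketch the cost of sampling is unbounded unless a retry budget is imposed, which is the kind of guarantee the ValueError above refers to.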

sqlalchemy_column(name: str, dialect: sa.Dialect) -> sa.Column

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) -> sa_TypeEngine

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) -> bool

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) -> dict[str, Expr]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.
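
To make the shape of this return value concrete, the following is a library-independent sketch in plain Python (it does not call dataframely or polars). Each rule name maps to one boolean per item, and an item counts as valid only if no rule yields False for it. The rule names "nullability" and "check" and the helper valid_mask are assumptions made for this illustration.

```python
# Plain-Python model of a "rule name -> one boolean per item" mapping.
# Rule names and the helper below are hypothetical, for illustration only.

values = [3, None, -1, 7]

rules = {
    # True where the item is not null.
    "nullability": [v is not None for v in values],
    # True where a custom check (here: non-negative) passes; nulls pass.
    "check": [v is None or v >= 0 for v in values],
}


def valid_mask(rule_results: dict[str, list[bool]]) -> list[bool]:
    """Combine all rules: an item is valid only if no rule is False for it."""
    return [all(flags) for flags in zip(*rule_results.values())]


mask = valid_mask(rules)
print(mask)
```

Keeping the rules separate by name, rather than collapsing them into a single mask up front, makes it possible to report which rule an invalid item failed.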

class dataframely.columns.Array(inner: Column, shape: int | tuple[int, ...], *, nullable: bool = True, primary_key: Literal[False] = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)

Bases: Column

A fixed-shape array column.
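
As a rough illustration of what "fixed shape" means for the shape parameter (an int or a tuple of ints), the plain-Python sketch below checks whether a nested list has exactly a given shape. It does not use dataframely or polars, and has_shape is a hypothetical helper introduced only for this example.

```python
from typing import Any, Union


def has_shape(value: Any, shape: Union[int, tuple[int, ...]]) -> bool:
    """Return True if the nested list `value` has exactly the given shape."""
    dims = (shape,) if isinstance(shape, int) else shape
    if not dims:
        # No dimensions left: `value` must be a scalar, not a list.
        return not isinstance(value, list)
    if not isinstance(value, list) or len(value) != dims[0]:
        return False
    # Every element must match the remaining dimensions.
    return all(has_shape(item, dims[1:]) for item in value)


ok_1d = has_shape([1, 2, 3], 3)
ok_2d = has_shape([[1, 2], [3, 4], [5, 6]], (3, 2))
bad_ragged = has_shape([[1, 2], [3]], (2, 2))
```

Under this reading, an int shape describes a one-dimensional array of that length, and a tuple describes the length along each nesting level.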

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) -> dict[str, Any]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) -> Self

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) -> bool

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) -> pa.Field

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) -> Series

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) -> sa.Column

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) -> sa_TypeEngine

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) -> bool

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) -> dict[str, Expr]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Binary(*, nullable: bool | None = None, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)

Bases: Column

A column of binary values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) -> dict[str, Any]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) -> Self

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) -> bool

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) -> pa.Field

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) -> Series

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) -> sa.Column

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) -> sa_TypeEngine

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) -> bool

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) -> dict[str, Expr]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Bool(*, nullable: bool | None = None, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)

Bases: Column

A column of booleans.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) -> dict[str, Any]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) -> Self

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) -> bool

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) -> pa.Field

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) -> Series

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) -> sa.Column

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) -> sa_TypeEngine

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) -> bool

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) -> dict[str, Expr]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Categorical(*, nullable: bool | None = None, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)

Bases: Column

A column of categorical (string) values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) -> dict[str, Any]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) -> Self

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) -> bool

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) -> pa.Field

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) -> Series

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) -> sa.Column

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) -> sa_TypeEngine

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) -> bool

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) -> dict[str, Expr]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Column(*, nullable: bool | None = None, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)

Bases: ABC

Abstract base class for data frame column definitions.

This class is intended to be used only within Schema definitions.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) -> dict[str, Any]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

abstract property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) -> Self

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) -> bool

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

abstract property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) -> pa.Field

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) -> Series

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) -> sa.Column

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

abstractmethod sqlalchemy_dtype(dialect: sa.Dialect) -> sa_TypeEngine

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) -> bool

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) -> dict[str, Expr]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Date(*, nullable: bool | None = None, primary_key: bool = False, min: date | None = None, min_exclusive: date | None = None, max: date | None = None, max_exclusive: date | None = None, resolution: str | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)

Bases: OrdinalMixin[date], Column

A column of dates (without time).
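
The bound parameters in the signature can be read as inclusive (min, max) and exclusive (min_exclusive, max_exclusive) limits, as their names suggest. The plain-Python sketch below illustrates that reading with datetime.date. It is not dataframely's implementation: within_bounds is a hypothetical helper, and the resolution parameter is not modeled here.

```python
from datetime import date
from typing import Optional


def within_bounds(
    value: date,
    *,
    min: Optional[date] = None,
    min_exclusive: Optional[date] = None,
    max: Optional[date] = None,
    max_exclusive: Optional[date] = None,
) -> bool:
    """Check a date against optional inclusive and exclusive bounds.

    Parameter names mirror the Date signature and therefore shadow the
    built-in min/max inside this function only.
    """
    if min is not None and value < min:
        return False
    if min_exclusive is not None and value <= min_exclusive:
        return False
    if max is not None and value > max:
        return False
    if max_exclusive is not None and value >= max_exclusive:
        return False
    return True


new_year = date(2024, 1, 1)
inclusive_ok = within_bounds(new_year, min=new_year)
exclusive_ok = within_bounds(new_year, min_exclusive=new_year)
```

The only difference between the two calls is whether the boundary date itself is accepted.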

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.
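One way to see why a custom check rules out such guarantees: a generic sampler could only fall back to rejection sampling, whose runtime depends on how often the check happens to pass. The sketch below is a plain-Python illustration under that assumption, not dataframely code; `rejection_sample`, `max_attempts`, and the integer value range are hypothetical.

```python
import random
from typing import Callable


def rejection_sample(
    check: Callable[[int], bool],
    n: int,
    rng: random.Random,
    max_attempts: int = 10_000,
) -> list[int]:
    """Draw candidates until n of them pass `check` or the budget runs out."""
    accepted: list[int] = []
    for _ in range(max_attempts):
        if len(accepted) == n:
            break
        candidate = rng.randrange(1_000_000)
        if check(candidate):
            accepted.append(candidate)
    if len(accepted) < n:
        # This sketch gives up only after spending its budget. Per the
        # documentation above, sample() instead raises ValueError for any
        # column that has a custom check.
        raise ValueError("could not satisfy the check within the attempt budget")
    return accepted


rng = random.Random(0)

# A permissive check is satisfied after a handful of draws ...
evens = rejection_sample(lambda v: v % 2 == 0, n=5, rng=rng)

# ... while a check that never passes exhausts any finite budget.
try:
    rejection_sample(lambda v: v < 0, n=1, rng=rng)
    impossible_failed = False
except ValueError:
    impossible_failed = True
```

Because an arbitrary check can be as restrictive as the second one, no attempt budget is large enough in general, which is consistent with the error described above.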

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.
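To make the shape of this return value concrete, the following is a minimal pure-Python sketch that mimics a rule mapping for a date column with bounds. It is not dataframely's implementation: the helpers `build_rules` and `evaluate_rules`, the rule names, and the use of plain callables and lists in place of polars expressions are all assumptions made for illustration. Treating the lower bound as inclusive and the upper bound as exclusive is inferred from the parameter names `min` and `max_exclusive`.

```python
from datetime import date
from typing import Callable

# Hypothetical rule type: a predicate applied to each item.
# In dataframely the mapping values are polars expressions instead.
Rule = Callable[[date], bool]


def build_rules(lower: date, upper_exclusive: date) -> dict[str, Rule]:
    """Build illustrative rules for an inclusive lower and exclusive upper bound."""
    return {
        "min": lambda value: value >= lower,
        "max_exclusive": lambda value: value < upper_exclusive,
    }


def evaluate_rules(values: list[date], rules: dict[str, Rule]) -> dict[str, list[bool]]:
    """Return exactly one boolean per item and rule; False marks invalid data."""
    return {name: [rule(v) for v in values] for name, rule in rules.items()}


values = [date(2023, 12, 31), date(2024, 6, 1), date(2025, 1, 1)]
rules = build_rules(lower=date(2024, 1, 1), upper_exclusive=date(2025, 1, 1))
result = evaluate_rules(values, rules)
print(result)
```

Each rule yields one boolean per item, so a caller can tell which rule an individual value violated rather than only that the column as a whole failed.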

class dataframely.columns.Datetime(*, nullable: bool | None = None, primary_key: bool = False, min: datetime | None = None, min_exclusive: datetime | None = None, max: datetime | None = None, max_exclusive: datetime | None = None, resolution: str | None = None, time_zone: str | tzinfo | None = None, time_unit: Literal['ns', 'us', 'ms'] = 'us', check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: OrdinalMixin[datetime], Column

A column of datetimes.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Decimal(precision: int | None = None, scale: int = 0, *, nullable: bool | None = None, primary_key: bool = False, min: Decimal | None = None, min_exclusive: Decimal | None = None, max: Decimal | None = None, max_exclusive: Decimal | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: OrdinalMixin[Decimal], Column

A column of decimal values with given precision and scale.
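As a reminder of what the two parameters conventionally mean, precision is the total number of digits and scale is the number of digits after the decimal point. The sketch below checks both using Python's standard decimal module. The helper `fits` is hypothetical and does not reflect how dataframely validates values.

```python
from decimal import Decimal


def fits(value: Decimal, precision: int, scale: int) -> bool:
    """Check whether `value` is representable with the given precision and scale."""
    if not value.is_finite():
        return False
    # Trailing zeros do not count towards the scale, so drop them first.
    exponent = value.normalize().as_tuple().exponent
    fractional_digits = max(-exponent, 0)
    # At most (precision - scale) digits may appear before the decimal point.
    integer_limit = Decimal(10) ** (precision - scale)
    return fractional_digits <= scale and abs(value) < integer_limit


print(fits(Decimal("123.45"), precision=5, scale=2))
print(fits(Decimal("1234.5"), precision=5, scale=2))
print(fits(Decimal("0.001"), precision=5, scale=2))
```

With precision 5 and scale 2, three digits remain for the integer part, so 123.45 fits while 1234.5 (too many integer digits) and 0.001 (too many fractional digits) do not.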

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Duration(*, nullable: bool | None = None, primary_key: bool = False, min: timedelta | None = None, min_exclusive: timedelta | None = None, max: timedelta | None = None, max_exclusive: timedelta | None = None, resolution: str | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: OrdinalMixin[timedelta], Column

A column of durations.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Enum(categories: Series | Iterable[str] | type[Enum], *, nullable: bool | None = None, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A column of enum (string) values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Float(*, nullable: bool | None = None, primary_key: bool = False, allow_inf_nan: bool = False, min: float | None = None, min_exclusive: float | None = None, max: float | None = None, max_exclusive: float | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseFloat

A column of floats (with any number of bytes).

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 1.7976931348623157e+308

min_value = -1.7976931348623157e+308

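These two constants coincide with the largest finite IEEE 754 double-precision value and its negation. The short check below confirms this using only the standard library; the variable names are illustrative and not part of dataframely.

```python
import sys

# Largest finite 64-bit float, as reported by the interpreter.
float64_max = sys.float_info.max

# The lower bound is the negated maximum. Note that sys.float_info.min is the
# smallest positive normalized float, not the most negative value.
float64_min = -sys.float_info.max

print(float64_max)
print(float64_min)
```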
property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Float32(*, nullable: bool | None = None, primary_key: bool = False, allow_inf_nan: bool = False, min: float | None = None, min_exclusive: float | None = None, max: float | None = None, max_exclusive: float | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseFloat

A column of float32 (“float”) values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 3.4028234663852886e+38
min_value = -3.4028234663852886e+38
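
These two constants match the largest and smallest finite IEEE 754 single-precision values. The standard-library sketch below reproduces them from the bit pattern and from the closed form; it assumes nothing about how dataframely itself defines them.

```python
import struct

# Bit pattern of the largest finite binary32 number (exponent 0xFE, mantissa all ones).
FLOAT32_MAX_BITS = 0x7F7FFFFF

# Reinterpret the 32-bit integer as a float32, then widen it to a Python float.
float32_max = struct.unpack("<f", struct.pack("<I", FLOAT32_MAX_BITS))[0]
float32_min = -float32_max

# The same value in closed form: (2 - 2**-23) * 2**127.
closed_form = (2 - 2**-23) * 2.0**127

print(float32_max == closed_form)  # True
```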
property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.
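
One way to read this restriction: for an arbitrary custom check, the only generic strategy is rejection sampling, and the number of draws it needs depends entirely on how often the check accepts. The sketch below is plain Python with a hypothetical helper, rejection_sample. It is an illustration of the problem, not a description of how dataframely samples values.

```python
import random


def rejection_sample(check, n, rng, max_attempts=10_000):
    """Draw n floats in [0, 1) that pass `check`, giving up after max_attempts draws."""
    accepted = []
    for _ in range(max_attempts):
        candidate = rng.random()
        if check(candidate):
            accepted.append(candidate)
            if len(accepted) == n:
                return accepted
    # Without the cap, a check that never accepts would loop forever.
    raise ValueError("check rejected too many candidates")


rng = random.Random(0)
easy = rejection_sample(lambda x: x < 0.9, 5, rng)
print(len(easy))  # 5
```

A check such as `lambda x: False` exhausts the attempt budget, which is why no bound on the running time can be promised in general.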

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.
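
The shape of this return value can be pictured without polars: each rule name maps to one boolean per item, and an item is valid only if every rule yields True for it. The snippet below is a plain-Python analogy with made-up rule names; the real method returns polars expressions rather than lists.

```python
import math

values = [1.5, -2.0, float("inf")]

# Rule names are made up for illustration; each maps to one boolean per item.
rules = {
    "min": [v >= 0.0 for v in values],
    "finite": [math.isfinite(v) for v in values],
}

# An item is valid only if no rule reports False for it.
valid = [all(flags) for flags in zip(*rules.values())]

# Names of the rules that each item violates.
violations = [
    [name for name, flags in rules.items() if not flags[i]]
    for i in range(len(values))
]

print(valid)       # [True, False, False]
print(violations)  # [[], ['min'], ['finite']]
```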

class dataframely.columns.Float64(*, nullable: bool | None = None, primary_key: bool = False, allow_inf_nan: bool = False, min: float | None = None, min_exclusive: float | None = None, max: float | None = None, max_exclusive: float | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseFloat

A column of float64 (“double”) values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 1.7976931348623157e+308
min_value = -1.7976931348623157e+308
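
These constants match the finite range of an IEEE 754 double, which is also the range of a Python float on common platforms. The standard-library check below assumes such a platform. Note that min_value here is the most negative finite value, whereas sys.float_info.min is the smallest positive normal value.

```python
import sys

# Largest finite double; min_value in the listing is its negation.
float64_max = sys.float_info.max
float64_min = -float64_max

# Not to be confused with sys.float_info.min, which is a tiny positive number.
smallest_positive_normal = sys.float_info.min

print(float64_max)  # 1.7976931348623157e+308
```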
property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Int16(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of int16 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = False
matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 32767
min_value = -32768
property name: str

Get the name of the column in a schema.

num_bytes = 2
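
The class attributes are consistent with a signed two's-complement integer: with num_bytes = 2 and is_unsigned = False, 16 bits give a range of -2^15 to 2^15 - 1. The helper integer_bounds below is hypothetical and only reproduces that arithmetic; its is_unsigned branch is an assumption about how unsigned column types would be bounded.

```python
def integer_bounds(num_bytes, is_unsigned=False):
    """Return (min_value, max_value) for an integer stored in num_bytes bytes."""
    bits = 8 * num_bytes
    if is_unsigned:
        # Assumed convention for unsigned types: all bits encode magnitude.
        return 0, 2**bits - 1
    # Signed two's complement: one bit is spent on the sign.
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for width in (1, 2, 4, 8):
    print(width, integer_bounds(width))
```

For widths 1, 2, 4 and 8 this yields the min_value and max_value pairs listed for Int8, Int16, Int32 and Int64.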
property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Int32(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of int32 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = False
matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 2147483647
min_value = -2147483648
property name: str

Get the name of the column in a schema.

num_bytes = 4
property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Int64(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of int64 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = False
matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 9223372036854775807
min_value = -9223372036854775808
property name: str

Get the name of the column in a schema.

num_bytes = 8
property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Int8(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of int8 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = False
matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 127
min_value = -128
property name: str

Get the name of the column in a schema.

num_bytes = 1
property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Integer(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of integers (with any number of bytes).

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = False

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 9223372036854775807

min_value = -9223372036854775808

property name: str

Get the name of the column in a schema.

num_bytes = 8

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.
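
The class attributes listed above for this integer column (num_bytes = 8, is_unsigned = False, min_value, max_value) are consistent with ordinary two's complement arithmetic. A minimal sketch in plain Python, where integer_bounds is a hypothetical helper and not part of dataframely:

```python
def integer_bounds(num_bytes: int, is_unsigned: bool) -> tuple[int, int]:
    """Return (min_value, max_value) of a fixed-width integer.

    Hypothetical helper for illustration only; not a dataframely function.
    """
    bits = 8 * num_bytes
    if is_unsigned:
        # Unsigned integers use all bits for the magnitude.
        return 0, 2**bits - 1
    # Signed two's complement integers reserve one bit for the sign.
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


min_value, max_value = integer_bounds(num_bytes=8, is_unsigned=False)
print(min_value, max_value)
```

With num_bytes=8 and is_unsigned=False, the helper reproduces the min_value and max_value shown for this class.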

class dataframely.columns.List(inner: Column, *, nullable: bool | None = None, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, min_length: int | None = None, max_length: int | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A list column.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Object(*, nullable: bool = True, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A Python Object column.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.String(*, nullable: bool | None = None, primary_key: bool = False, min_length: int | None = None, max_length: int | None = None, regex: str | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A column of strings.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Struct(inner: dict[str, Column], *, nullable: bool | None = None, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A struct column.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.Time(*, nullable: bool | None = None, primary_key: bool = False, min: time | None = None, min_exclusive: time | None = None, max: time | None = None, max_exclusive: time | None = None, resolution: str | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: OrdinalMixin[time], Column

A column of times (without date).

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.UInt16(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of uint16 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = True

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 65535

min_value = 0

property name: str

Get the name of the column in a schema.

num_bytes = 2

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.
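The shape of this return value can be illustrated without polars. The following is a minimal sketch, not dataframely's implementation: plain callables stand in for polars expressions, and the rule names `nullability` and `range` as well as the helpers `uint16_rules` and `evaluate` are assumptions made up for this example.

```python
from typing import Callable, Optional

# Hypothetical stand-in for a rule: a function from column values to one
# boolean per value. In dataframely the mapping's values are polars
# expressions; callables are used here only to keep the sketch stdlib-only.
Rule = Callable[[list[Optional[int]]], list[bool]]


def uint16_rules() -> dict[str, Rule]:
    """Return illustrative rules for a non-nullable 16-bit unsigned column."""
    return {
        # Rule names are assumptions, not dataframely's actual rule names.
        "nullability": lambda values: [v is not None for v in values],
        "range": lambda values: [v is None or 0 <= v <= 65535 for v in values],
    }


def evaluate(
    rules: dict[str, Rule], values: list[Optional[int]]
) -> dict[str, list[bool]]:
    """Evaluate every rule, yielding exactly one boolean per column item."""
    return {name: rule(values) for name, rule in rules.items()}


# False marks the items that fail the respective rule.
results = evaluate(uint16_rules(), [0, 65535, None, 70000])
print(results)
```

Keeping one entry per rule, rather than a single combined boolean, makes it possible to report which constraint a given item violates.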

class dataframely.columns.UInt32(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of uint32 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as a dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = True
matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 4294967295
min_value = 0
property name: str

Get the name of the column in a schema.

num_bytes = 4
property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.
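The reason for this error can be sketched with the standard library. The example below is an illustration under stated assumptions, not dataframely's sampling code: the function `sample_uint32` and its `has_custom_check` flag are hypothetical, and `random.Random` stands in for the library's own generator type.

```python
import random


def sample_uint32(
    rng: random.Random, n: int = 1, *, has_custom_check: bool = False
) -> list[int]:
    """Draw n values from [0, 4294967295], refusing if a custom check exists."""
    if has_custom_check:
        # An arbitrary predicate could reject almost every candidate, so no
        # bound on the number of draws could be promised.
        raise ValueError("cannot sample a column with a custom check")
    # Without custom checks, every draw from the range is valid by construction.
    return [rng.randint(0, 4294967295) for _ in range(n)]


rng = random.Random(42)
values = sample_uint32(rng, n=5)
print(values)
```

Built-in constraints such as bounds can be satisfied directly by choosing the sampling range, whereas an opaque check can only be satisfied by drawing and rejecting, which has no guaranteed running time.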

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.UInt64(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of uint64 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as a dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = True
matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 18446744073709551615
min_value = 0
property name: str

Get the name of the column in a schema.

num_bytes = 8
property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.UInt8(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of uint8 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as a dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = True
matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 255
min_value = 0
property name: str

Get the name of the column in a schema.

num_bytes = 1
property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.
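The `min_value`, `max_value`, and `num_bytes` attributes listed for the unsigned integer columns above follow from the storage width alone: an unsigned integer stored in `num_bytes` bytes ranges from 0 to 2^(8 * num_bytes) - 1. The helper `unsigned_bounds` below is a hypothetical name introduced only to check the documented constants; it is not part of dataframely.

```python
def unsigned_bounds(num_bytes: int) -> tuple[int, int]:
    """Return (min_value, max_value) for an unsigned integer of the given width."""
    # Each byte contributes 8 bits; all bits set gives the largest value.
    return 0, 2 ** (8 * num_bytes) - 1


# Widths of the unsigned integer columns documented above, in bytes.
for num_bytes in (1, 2, 4, 8):
    print(num_bytes, unsigned_bounds(num_bytes))
```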

dataframely.columns.column_from_dict(data: dict[str, Any]) Column[source]

Dynamically read a column from a dictionary.

Args:
data: The dictionary that was created by calling as_dict() on a column object. The dictionary must contain a key "column_type" that indicates which column type to instantiate.

Returns:

The column object as read from data.
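The dispatch on the "column_type" key can be sketched as a name-to-class registry. This is a simplified illustration, not dataframely's implementation: the `Column`, `UInt8`, and `Bool` classes below are hypothetical stand-ins defined inside the example, `_REGISTRY` is an assumed name, and the stand-in `as_dict()` omits the `expr` argument that the documented method requires.

```python
from typing import Any


class Column:
    """Hypothetical minimal base class; not dataframely's Column."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs

    def as_dict(self) -> dict[str, Any]:
        # Record the class name so a reader knows which type to instantiate.
        return {"column_type": type(self).__name__, **self.kwargs}

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "Column":
        # Everything except the type marker is a constructor argument.
        return cls(**{k: v for k, v in data.items() if k != "column_type"})


class UInt8(Column):
    pass


class Bool(Column):
    pass


# Assumed registry mapping the stored type name to the class to build.
_REGISTRY = {cls.__name__: cls for cls in (UInt8, Bool)}


def column_from_dict(data: dict[str, Any]) -> Column:
    """Instantiate the class named by data["column_type"]."""
    return _REGISTRY[data["column_type"]].from_dict(data)


restored = column_from_dict(UInt8(nullable=False, max=100).as_dict())
print(type(restored).__name__, restored.kwargs)
```

Storing the type name in the dictionary is what allows a single entry point to restore any column type without the caller knowing the concrete class in advance.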

Submodules

dataframely.columns.any module

class dataframely.columns.any.Any(*, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A column with arbitrary type.

As a column with arbitrary type is commonly mapped to the Null type (this is the default in polars and pyarrow for empty columns), dataframely also requires this column to be nullable. Hence, it cannot be used as a primary key.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as a dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

dataframely.columns.array module

class dataframely.columns.array.Array(inner: Column, shape: int | tuple[int, ...], *, nullable: bool = True, primary_key: Literal[False] = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A fixed-shape array column.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as a dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

dataframely.columns.binary module

class dataframely.columns.binary.Binary(*, nullable: bool | None = None, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A column of binary values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as a dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

dataframely.columns.bool module

class dataframely.columns.bool.Bool(*, nullable: bool | None = None, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A column of booleans.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.
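The ValueError above reflects a general limitation: for an arbitrary predicate, the only generic strategy is to draw candidates and reject those that fail, and nothing bounds how many draws that takes. The plain-Python sketch below mimics the documented behavior for a boolean column. It does not call dataframely; `sample_bools` and its `check` parameter are hypothetical stand-ins chosen for illustration.

```python
import random
from typing import Callable, Optional


def sample_bools(
    rng: random.Random,
    n: int = 1,
    check: Optional[Callable[[bool], bool]] = None,
) -> list[bool]:
    """Toy analogue of ``sample``: draw ``n`` random booleans."""
    if check is not None:
        # An arbitrary predicate could reject almost every candidate, so no
        # bound on the number of draws can be promised. Refuse instead.
        raise ValueError("cannot sample values for a column with a custom check")
    return [rng.random() < 0.5 for _ in range(n)]


# Without a custom check, sampling returns exactly n elements.
values = sample_bools(random.Random(42), n=5)

# With a custom check, the sketch raises, mirroring the documented contract.
try:
    sample_bools(random.Random(42), n=5, check=lambda v: v)
    raised = False
except ValueError:
    raised = True
```

Raising early keeps the cost of sampling predictable, at the price of not supporting columns with custom checks.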

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

dataframely.columns.categorical module

class dataframely.columns.categorical.Categorical(*, nullable: bool | None = None, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A column of categorical (string) values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.
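One way to see why an expression is needed to compare custom checks: two callables with identical bodies are still distinct objects, whereas the expressions they produce from the same input can be compared. The sketch below is a plain-Python illustration under that assumption. `Sym` and `same_check` are hypothetical helpers; dataframely works with polars expressions, and its actual comparison logic is not shown in this reference.

```python
from typing import Callable


class Sym:
    """Toy expression that records the operations applied to it as text."""

    def __init__(self, text: str) -> None:
        self.text = text

    def __gt__(self, other: object) -> "Sym":
        return Sym(f"({self.text} > {other!r})")


def same_check(a: Callable[[Sym], Sym], b: Callable[[Sym], Sym], expr: Sym) -> bool:
    # Apply both callables to the same expression and compare the results.
    return a(expr).text == b(expr).text


check_a = lambda e: e > 0
check_b = lambda e: e > 0
check_c = lambda e: e > 1

# Comparing the callables directly only compares object identity.
callables_equal = check_a == check_b

# Comparing what they produce from a shared expression is meaningful.
results_equal = same_check(check_a, check_b, Sym("col"))
results_differ = not same_check(check_a, check_c, Sym("col"))
```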

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

dataframely.columns.datetime module

class dataframely.columns.datetime.Date(*, nullable: bool | None = None, primary_key: bool = False, min: date | None = None, min_exclusive: date | None = None, max: date | None = None, max_exclusive: date | None = None, resolution: str | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: OrdinalMixin[date], Column

A column of dates (without time).

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.
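The shape of this return value can be illustrated without polars. The sketch below is a plain-Python analogue for a date column with a lower bound: each rule maps the column's values to one boolean per item, and an item is valid only if every rule yields True for it. The rule names `nullability` and `min`, the helper `toy_validation_rules`, and the choice to let the `min` rule pass on missing values are all assumptions made for illustration; the names and semantics dataframely actually uses are not listed in this reference.

```python
from datetime import date
from typing import Callable, Optional

Value = Optional[date]
Rule = Callable[[list[Value]], list[bool]]


def toy_validation_rules(min_value: date) -> dict[str, Rule]:
    """Return one named rule per constraint; each yields one bool per item."""
    return {
        # False marks a missing value in a non-nullable column.
        "nullability": lambda values: [v is not None for v in values],
        # Missing values are left to the nullability rule (an assumption).
        "min": lambda values: [v is None or v >= min_value for v in values],
    }


values: list[Value] = [date(2024, 1, 1), None, date(2019, 5, 17)]
rules = toy_validation_rules(date(2020, 1, 1))

# Evaluate every rule: one list of booleans per rule name.
results = {name: rule(values) for name, rule in rules.items()}

# An item is valid only if all rules succeed for it.
row_is_valid = [all(flags) for flags in zip(*results.values())]
```

Keeping one boolean per item and per rule makes it possible to report which constraint a given value violated, rather than only that it is invalid.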

class dataframely.columns.datetime.Datetime(*, nullable: bool | None = None, primary_key: bool = False, min: datetime | None = None, min_exclusive: datetime | None = None, max: datetime | None = None, max_exclusive: datetime | None = None, resolution: str | None = None, time_zone: str | tzinfo | None = None, time_unit: Literal['ns', 'us', 'ms'] = 'us', check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: OrdinalMixin[datetime], Column

A column of datetimes.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.datetime.Duration(*, nullable: bool | None = None, primary_key: bool = False, min: timedelta | None = None, min_exclusive: timedelta | None = None, max: timedelta | None = None, max_exclusive: timedelta | None = None, resolution: str | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: OrdinalMixin[timedelta], Column

A column of durations.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.datetime.Time(*, nullable: bool | None = None, primary_key: bool = False, min: time | None = None, min_exclusive: time | None = None, max: time | None = None, max_exclusive: time | None = None, resolution: str | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: OrdinalMixin[time], Column

A column of times (without date).

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

dataframely.columns.decimal module

class dataframely.columns.decimal.Decimal(precision: int | None = None, scale: int = 0, *, nullable: bool | None = None, primary_key: bool = False, min: Decimal | None = None, min_exclusive: Decimal | None = None, max: Decimal | None = None, max_exclusive: Decimal | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: OrdinalMixin[Decimal], Column

A column of decimal values with given precision and scale.
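To make the roles of precision and scale concrete, the following sketch uses only Python's standard decimal module and does not call dataframely. It assumes the common convention that precision counts all digits and scale counts the digits after the decimal point; fits_decimal is a hypothetical helper, not a library function.

```python
from decimal import Decimal


def fits_decimal(value: Decimal, precision: int, scale: int) -> bool:
    """Hypothetical helper: can `value` be stored with this precision and scale?"""
    if not value.is_finite():
        return False  # NaN and infinities have no digit representation
    if value == 0:
        return True
    # Drop trailing zeros so that 1.50 is treated like 1.5.
    _, digits, exponent = value.normalize().as_tuple()
    fractional_digits = max(-exponent, 0)
    integer_digits = max(len(digits) + exponent, 0)
    return fractional_digits <= scale and integer_digits <= precision - scale


print(fits_decimal(Decimal("123.45"), precision=5, scale=2))  # True
print(fits_decimal(Decimal("1234.5"), precision=5, scale=2))  # False
```

Under this convention, precision=5 and scale=2 leave three digits for the integer part, so 1234.5 does not fit even though it has only five digits in total.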

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.
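The shape of this return value can be pictured with a plain-Python analogue that uses lists and callables instead of polars expressions. The rule names and bounds below are hypothetical and chosen only for illustration; the real method returns polars expressions, which this sketch does not attempt to reproduce.

```python
from decimal import Decimal

values = [Decimal("0.50"), Decimal("2.00"), Decimal("-1.00")]

# Hypothetical rule names; each rule maps one item to exactly one boolean.
rules = {
    "min": lambda v: v >= Decimal("0"),
    "max_exclusive": lambda v: v < Decimal("2"),
}

# One list of booleans per rule, aligned with `values`.
results = {name: [rule(v) for v in values] for name, rule in rules.items()}

# An item is valid only if every rule evaluates to True for it.
valid = [all(flags) for flags in zip(*results.values())]

print(results)
print(valid)  # [True, False, False]
```

Keeping one boolean per item and per rule makes it possible to report which rule a given row failed, rather than only that it failed.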

dataframely.columns.enum module

class dataframely.columns.enum.Enum(categories: Series | Iterable[str] | type[Enum], *, nullable: bool | None = None, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A column of enum (string) values.
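The categories argument accepts several input forms. The sketch below shows one plausible way the two stdlib-compatible forms (an iterable of strings and a Python enum class) could be reduced to a list of category strings. It is an assumption for illustration, not the library's code: category_names and Color are hypothetical, the use of member values is assumed, and the polars Series form is left out.

```python
from __future__ import annotations

import enum
from collections.abc import Iterable


class Color(enum.Enum):  # hypothetical example enum
    RED = "red"
    GREEN = "green"


def category_names(categories: Iterable[str] | type[enum.Enum]) -> list[str]:
    """Hypothetical normalization; assumes enum classes contribute their member values."""
    if isinstance(categories, type) and issubclass(categories, enum.Enum):
        return [str(member.value) for member in categories]
    return list(categories)


print(category_names(Color))           # ['red', 'green']
print(category_names(["low", "high"]))  # ['low', 'high']
```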

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

dataframely.columns.float module

class dataframely.columns.float.Float(*, nullable: bool | None = None, primary_key: bool = False, allow_inf_nan: bool = False, min: float | None = None, min_exclusive: float | None = None, max: float | None = None, max_exclusive: float | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseFloat

A column of floats (with any number of bytes).

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 1.7976931348623157e+308
min_value = -1.7976931348623157e+308
property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.float.Float32(*, nullable: bool | None = None, primary_key: bool = False, allow_inf_nan: bool = False, min: float | None = None, min_exclusive: float | None = None, max: float | None = None, max_exclusive: float | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseFloat

A column of float32 (“float”) values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 3.4028234663852886e+38
min_value = -3.4028234663852886e+38
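These bounds are the largest finite values of the corresponding IEEE 754 formats, which can be checked with the Python standard library alone. The sketch below decodes the bit pattern of the largest finite 32-bit float and compares it with the 64-bit limit reported by the interpreter; the constant names are chosen here for illustration and are not dataframely identifiers.

```python
import struct
import sys

# Largest finite IEEE 754 binary32 value: bit pattern 0x7F7FFFFF, little-endian bytes.
FLOAT32_MAX = struct.unpack("<f", bytes.fromhex("ffff7f7f"))[0]

# Largest finite binary64 value, as reported by the interpreter.
FLOAT64_MAX = sys.float_info.max

print(FLOAT32_MAX)  # 3.4028234663852886e+38
print(FLOAT64_MAX)  # 1.7976931348623157e+308
```

The minimum values listed for the float columns are the negations of these maxima, since IEEE 754 formats are symmetric around zero.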
property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.float.Float64(*, nullable: bool | None = None, primary_key: bool = False, allow_inf_nan: bool = False, min: float | None = None, min_exclusive: float | None = None, max: float | None = None, max_exclusive: float | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseFloat

A column of float64 (“double”) values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 1.7976931348623157e+308
min_value = -1.7976931348623157e+308
property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

dataframely.columns.integer module

class dataframely.columns.integer.Int16(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of int16 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = False
matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 32767
min_value = -32768
property name: str

Get the name of the column in a schema.

num_bytes = 2
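The class attributes listed for Int16 (is_unsigned, num_bytes, min_value, max_value) are related by the usual two's-complement arithmetic. The following sketch derives the bounds from the byte width and signedness; integer_bounds is a hypothetical helper written for this illustration and is not claimed to be how dataframely computes the values.

```python
def integer_bounds(num_bytes: int, is_unsigned: bool) -> tuple[int, int]:
    """Hypothetical helper: (min, max) of a fixed-width integer type."""
    bits = 8 * num_bytes
    if is_unsigned:
        return 0, 2**bits - 1
    # Two's complement: one bit is used for the sign.
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


print(integer_bounds(num_bytes=2, is_unsigned=False))  # (-32768, 32767)
```

With num_bytes = 2 and is_unsigned = False this reproduces the documented min_value of -32768 and max_value of 32767.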
property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.integer.Int32(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of int32 values.
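The min_value and max_value attributes listed for this class follow from its num_bytes and is_unsigned attributes. A minimal sketch of that arithmetic, using a hypothetical helper named integer_bounds that is not part of dataframely:

```python
def integer_bounds(num_bytes: int, is_unsigned: bool) -> tuple[int, int]:
    """Return the (min, max) values representable in a fixed-width integer.

    Hypothetical helper for illustration; not part of dataframely.
    """
    bits = 8 * num_bytes
    if is_unsigned:
        return 0, 2**bits - 1
    # Two's complement: one bit is used for the sign.
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


int32_min, int32_max = integer_bounds(num_bytes=4, is_unsigned=False)
print(int32_min, int32_max)  # -2147483648 2147483647
```

The same helper reproduces the bounds documented for the other fixed-width integer columns, for example (0, 65535) for a 2-byte unsigned integer.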

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = False

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 2147483647

min_value = -2147483648

property name: str

Get the name of the column in a schema.

num_bytes = 4

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.
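To illustrate the shape of this return value, the following sketch uses plain Python lists in place of polars expressions. The rule names "min" and "max", the bounds, and the helper failed_rules_per_item are assumptions made for this illustration and are not taken from dataframely.

```python
# Stand-in for a column of values; in dataframely this would be a polars column.
values = [5, -3, 12, 7]

# One boolean per item and per rule: True means the item passes that rule.
# The rule names are made up for this sketch.
rules = {
    "min": [v >= 0 for v in values],
    "max": [v <= 10 for v in values],
}


def failed_rules_per_item(rules: dict[str, list[bool]]) -> list[list[str]]:
    """List, for every item, the names of the rules it fails."""
    n_items = len(next(iter(rules.values())))
    return [
        [name for name, passed in rules.items() if not passed[i]]
        for i in range(n_items)
    ]


print(failed_rules_per_item(rules))  # [[], ['min'], ['max'], []]
```

An item is valid only if every rule yields True for it; a single False marks it as invalid with respect to that rule.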

class dataframely.columns.integer.Int64(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of int64 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = False

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 9223372036854775807

min_value = -9223372036854775808

property name: str

Get the name of the column in a schema.

num_bytes = 8

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.
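The reason for this error can be seen in a small stand-in sampler. The sketch below uses Python's random module rather than dataframely's generator; the function sample_ints and its parameters are hypothetical and only mirror the documented behavior.

```python
import random
from typing import Callable, Optional


def sample_ints(
    rng: random.Random,
    n: int = 1,
    min_value: int = -128,
    max_value: int = 127,
    check: Optional[Callable[[int], bool]] = None,
) -> list[int]:
    """Draw n integers from the inclusive range [min_value, max_value].

    Hypothetical stand-in for illustration; not dataframely's sampler.
    """
    if check is not None:
        # An arbitrary predicate could reject almost every candidate, so
        # rejection sampling would have no bound on its running time.
        raise ValueError("Cannot sample values for a column with a custom check.")
    return [rng.randint(min_value, max_value) for _ in range(n)]


samples = sample_ints(random.Random(0), n=5, min_value=0, max_value=10)
print(all(0 <= s <= 10 for s in samples))  # True
```

Range constraints can be satisfied by construction, which is why every sampled element is guaranteed to be valid; an opaque check offers no such shortcut.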

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.integer.Int8(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of int8 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = False

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 127

min_value = -128

property name: str

Get the name of the column in a schema.

num_bytes = 1

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.integer.Integer(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of integers (with any number of bytes).

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = False

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 9223372036854775807

min_value = -9223372036854775808

property name: str

Get the name of the column in a schema.

num_bytes = 8

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.integer.UInt16(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of uint16 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = True

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 65535

min_value = 0

property name: str

Get the name of the column in a schema.

num_bytes = 2

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.integer.UInt32(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of uint32 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:
expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = True

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 4294967295

min_value = 0

property name: str

Get the name of the column in a schema.

num_bytes = 4

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:
ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:
expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

class dataframely.columns.integer.UInt64(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of uint64 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as a dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = True

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 18446744073709551615

min_value = 0

property name: str

Get the name of the column in a schema.

num_bytes = 8

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.
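The class constants documented for UInt64 (num_bytes, min_value, max_value) are consistent with the usual range of an unsigned integer of that width. A quick plain-Python check follows; the helper unsigned_bounds is hypothetical and not part of dataframely.

```python
def unsigned_bounds(num_bytes: int) -> tuple[int, int]:
    """Return (min_value, max_value) for an unsigned integer of the given width."""
    bits = 8 * num_bytes
    return 0, 2**bits - 1


# UInt64 documents num_bytes = 8, min_value = 0 and max_value = 18446744073709551615.
uint64_min, uint64_max = unsigned_bounds(8)
print(uint64_min, uint64_max)
```

The same helper called with 1 byte yields the bounds documented for UInt8.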

class dataframely.columns.integer.UInt8(*, nullable: bool | None = None, primary_key: bool = False, min: int | None = None, min_exclusive: int | None = None, max: int | None = None, max_exclusive: int | None = None, is_in: Sequence[int] | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: _BaseInteger

A column of uint8 values.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as a dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

is_unsigned = True

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

max_value = 255

min_value = 0

property name: str

Get the name of the column in a schema.

num_bytes = 1

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.
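One way to see why a custom check rules out such guarantees: with an arbitrary predicate, the generic fallback is rejection sampling, and the number of attempts it needs is unbounded. The sketch below is plain Python and is not dataframely's implementation; sample_with_check and max_attempts are hypothetical names.

```python
import random
from typing import Callable


def sample_with_check(
    rng: random.Random,
    check: Callable[[int], bool],
    n: int,
    max_attempts: int = 1_000,
) -> list[int]:
    """Rejection-sample n values from the uint8 range that pass an arbitrary check."""
    accepted: list[int] = []
    for _ in range(max_attempts):
        candidate = rng.randint(0, 255)
        if check(candidate):
            accepted.append(candidate)
            if len(accepted) == n:
                return accepted
    # An opaque check may accept almost nothing, so no fixed budget is always enough.
    raise ValueError(f"only {len(accepted)} of {n} samples found in {max_attempts} attempts")


rng = random.Random(0)
print(sample_with_check(rng, lambda v: v % 2 == 0, n=3))  # permissive check
try:
    sample_with_check(rng, lambda v: v > 255, n=1)  # unsatisfiable check
except ValueError as exc:
    print(exc)
```

Built-in constraints such as min and max avoid this problem because the sampler can draw directly from the allowed range.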

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.
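To make the return shape of validation_rules() concrete, the following plain-Python sketch mimics it with lists of booleans in place of polars expressions. The rule names "min" and "max" and both helper functions are assumptions for illustration; the rule names dataframely actually emits are not stated in this reference.

```python
def toy_validation_rules(values: list[int], lo: int, hi: int) -> dict[str, list[bool]]:
    """Map each (hypothetical) rule name to one boolean per item; False means invalid."""
    return {
        "min": [v >= lo for v in values],
        "max": [v <= hi for v in values],
    }


def valid_items(rules: dict[str, list[bool]]) -> list[bool]:
    """An item is valid only if every rule evaluates to True for it."""
    return [all(flags) for flags in zip(*rules.values())]


# Bounds taken from the documented UInt8 constants (min_value = 0, max_value = 255).
rules = toy_validation_rules([0, 17, 255, 256, -1], lo=0, hi=255)
print(valid_items(rules))
```

Keeping one boolean per item and per rule makes it possible to report which rule an invalid item violated, rather than only that it is invalid.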

dataframely.columns.list module

class dataframely.columns.list.List(inner: Column, *, nullable: bool | None = None, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, min_length: int | None = None, max_length: int | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A list column.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as a dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

dataframely.columns.object module

class dataframely.columns.object.Object(*, nullable: bool = True, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A Python Object column.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as a dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

dataframely.columns.string module

class dataframely.columns.string.String(*, nullable: bool | None = None, primary_key: bool = False, min_length: int | None = None, max_length: int | None = None, regex: str | None = None, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A column of strings.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as a dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.

dataframely.columns.struct module

class dataframely.columns.struct.Struct(inner: dict[str, Column], *, nullable: bool | None = None, primary_key: bool = False, check: Callable[[Expr], Expr] | Sequence[Callable[[Expr], Expr]] | Mapping[str, Callable[[Expr], Expr]] | None = None, alias: str | None = None, metadata: dict[str, Any] | None = None)[source]

Bases: Column

A struct column.

Attributes:
col

Obtain a Polars column expression for the column.

dtype

The polars dtype equivalent of this column definition’s data type.

name

Get the name of the column in a schema.

pyarrow_dtype

The pyarrow dtype equivalent of this column data type.

Methods

as_dict(expr)

Turn the column definition into a dictionary.

from_dict(data)

Read the column definition from a dictionary.

matches(other, expr)

Check whether this column semantically matches another column.

pyarrow_field(name)

Obtain the pyarrow field of this column definition.

sample(generator[, n])

Sample random elements adhering to the constraints of this column.

sqlalchemy_column(name, dialect)

Obtain the SQL column specification of this column definition.

sqlalchemy_dtype(dialect)

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype)

Validate if the polars data type satisfies the column definition.

validation_rules(expr)

A set of rules evaluating whether a data frame column satisfies the column's constraints.

as_dict(expr: Expr) dict[str, Any][source]

Turn the column definition into a dictionary.

If the column definition references other column definitions, they will be turned into dictionaries recursively.

Args:

expr: An expression referencing the column to turn into a dictionary. This is required to properly encode custom checks.

Returns:

The column definition as a dictionary.

Note:

This method stores custom checks as expressions rather than callables to allow for serialization.

Note:

Do NOT use the returned object to evaluate semantic equality of two columns. It may yield different results than matches().

Attention:

This method is only intended for internal use.
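The recursive behaviour described for as_dict() can be pictured with a toy model. The sketch below is plain Python; ToyColumn and its dictionary layout are hypothetical and do not reflect dataframely's actual serialization format. It also omits the expr argument, since the toy has no custom checks to encode.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToyColumn:
    """Hypothetical stand-in for a column definition; not dataframely's Column."""

    kind: str
    nullable: bool = True
    inner: dict[str, "ToyColumn"] = field(default_factory=dict)

    def as_dict(self) -> dict[str, Any]:
        # Nested definitions are serialized by calling as_dict() on each of them.
        return {
            "kind": self.kind,
            "nullable": self.nullable,
            "inner": {name: col.as_dict() for name, col in self.inner.items()},
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ToyColumn":
        # Rebuild nested definitions first, then the enclosing one.
        inner = {name: cls.from_dict(d) for name, d in data["inner"].items()}
        return cls(kind=data["kind"], nullable=data["nullable"], inner=inner)


struct = ToyColumn(
    "struct",
    inner={"id": ToyColumn("uint64", nullable=False), "tag": ToyColumn("string")},
)
encoded = struct.as_dict()
print(encoded["inner"]["id"])
print(ToyColumn.from_dict(encoded) == struct)
```

The round trip works here because the toy holds only plain data; the notes above explain that the real method additionally has to encode custom checks as expressions.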

property col: Expr

Obtain a Polars column expression for the column.

property dtype: DataType

The polars dtype equivalent of this column definition’s data type.

This is primarily used for creating empty data frames with an appropriate schema. Thus, it should describe the default dtype equivalent if this data type encompasses multiple underlying data types.

classmethod from_dict(data: dict[str, Any]) Self[source]

Read the column definition from a dictionary.

Args:

data: The dictionary that was created via as_dict().

Returns:

The column definition read from the dictionary.

Attention:

This method is only intended for internal use.

matches(other: Column, expr: Expr) bool[source]

Check whether this column semantically matches another column.

Args:

other: The column to compare with.

expr: An expression referencing the column. This is required to properly evaluate the equivalence of custom checks.

Returns:

Whether the columns are semantically equal.

property name: str

Get the name of the column in a schema.

property pyarrow_dtype: pa.DataType

The pyarrow dtype equivalent of this column data type.

pyarrow_field(name: str) pa.Field[source]

Obtain the pyarrow field of this column definition.

Args:

name: The name of the column.

Returns:

The pyarrow field definition.

sample(generator: Generator, n: int = 1) Series[source]

Sample random elements adhering to the constraints of this column.

Args:

generator: The generator to use for sampling elements.

n: The number of elements to sample.

Returns:

A series with the predefined number of elements. All elements are guaranteed to adhere to the column’s constraints.

Raises:

ValueError: If this column has a custom check. In this case, random values cannot be guaranteed to adhere to the column’s constraints while providing any guarantees on the computational complexity.

sqlalchemy_column(name: str, dialect: sa.Dialect) sa.Column[source]

Obtain the SQL column specification of this column definition.

Args:

name: The name of the column.

dialect: The SQL dialect for which to generate the column specification.

Returns:

The column as specified in sqlalchemy.

sqlalchemy_dtype(dialect: sa.Dialect) sa_TypeEngine[source]

The sqlalchemy dtype equivalent of this column data type.

validate_dtype(dtype: DataType | DataTypeClass) bool[source]

Validate if the polars data type satisfies the column definition.

Args:

dtype: The dtype to validate.

Returns:

Whether the dtype is valid.

validation_rules(expr: Expr) dict[str, Expr][source]

A set of rules evaluating whether a data frame column satisfies the column’s constraints.

Args:

expr: An expression referencing the column of the data frame, i.e. an expression created by calling polars.col().

Returns:

A mapping from validation rule names to expressions that provide exactly one boolean value per column item indicating whether validation with respect to the rule is successful. A value of False indicates invalid data, i.e. unsuccessful validation.