py4ai.data.layer.fs.tables module
Module with abstractions for accessing data persisted in files and represented by TabularData.
- class py4ai.data.layer.fs.tables.CsvSerializer(path: Path, encoding: str = 'utf-8', sep: str = ';')
Bases: FileSerializer[str, TabularData]
DataSerializer to be used for serializing/deserializing CSV files.
Return instance of DataSerializer.
- Parameters
path – local folder to be used to construct filenames
encoding – type of IO serialization (text, binary) to be used when writing files
sep – separator used in the csv
- get_key(entity: TabularData) → str
Extract key for given entity.
- Parameters
entity – provided TabularData
- Returns
entity key
- mode: FileSerializerMode = ''
- to_entity(document: IndexedIO[str]) → TabularData
Deserialize raw content into domain object entity.
- Parameters
document – raw content
- Returns
domain object entity
- to_object(entity: TabularData) → IndexedIO[str]
Serialize domain object entity into raw content.
- Parameters
entity – domain object entity
- Returns
raw content
- to_object_key(key: str) → str
Transform entity key into raw key, to be used for indexing in the persistence layer.
- Parameters
key – entity key
- Returns
raw key
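The round trip described by to_object and to_entity can be illustrated with the standard library alone. The following is a minimal sketch, not the py4ai implementation: the helper names rows_to_csv_text and csv_text_to_rows are hypothetical, and a list of dicts stands in for the DataFrame held by TabularData. Only the ';' separator is taken from the documented default of sep.

```python
import csv
import io

SEP = ";"  # mirrors the documented default of `sep`


def rows_to_csv_text(rows, sep=SEP):
    """Hypothetical stand-in for to_object: rows -> delimited text.

    Assumes `rows` holds at least one dict and that all dicts share the same keys.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(rows[0].keys()),
        delimiter=sep,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def csv_text_to_rows(text, sep=SEP):
    """Hypothetical stand-in for to_entity: delimited text -> rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter=sep))


rows = [{"id": "1", "city": "Milan"}, {"id": "2", "city": "Rome"}]
text = rows_to_csv_text(rows)       # serialize with ';' as separator
restored = csv_text_to_rows(text)   # parse the same text back
```

Values come back as strings after parsing, which is why the sample rows use string values; a DataFrame-based serializer would typically infer column types instead.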
- class py4ai.data.layer.fs.tables.LocalDatabase(path: Path, serializer: Type[FileSerializer[str, TabularData]] = CsvSerializer)
Bases: FileSystemRepository[str, TabularData]
Archiver for persistence layers that store tabular data files.
Return an instance of the class.
- Parameters
path – path where to store the files.
serializer – serializer class used to convert between raw and domain objects (defaults to CsvSerializer).
- criteria: FileSystemCriteriaFactory[KE, E]
- class py4ai.data.layer.fs.tables.PickleSerializer(path: Path, encoding: str = 'utf-8', sep: str = ';')
Bases: CsvSerializer
DataSerializer to be used for serializing/deserializing pickle files.
Return instance of DataSerializer.
- Parameters
path – local folder to be used to construct filenames
encoding – type of IO serialization (text, binary) to be used when writing files
sep – separator used in the csv
- mode: FileSerializerMode = 'b'
- to_entity(document: IndexedIO[str]) → TabularData
Deserialize raw content into domain object entity.
- Parameters
document – raw content
- Returns
domain object entity
- to_object(entity: TabularData) → IndexedIO[str]
Serialize domain object entity into raw content.
- Parameters
entity – domain object entity
- Returns
raw content
- to_object_key(key: str) → str
Transform entity key into raw key, to be used for indexing in the persistence layer.
- Parameters
key – entity key
- Returns
raw key
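PickleSerializer declares mode = 'b', whereas CsvSerializer declares mode = ''. A plausible reading, stated here as an assumption rather than confirmed library behavior, is that 'b' selects binary file I/O, since pickled payloads are bytes. The sketch below uses only the standard library to show why a text stream cannot hold such a payload; the sample dict is a hypothetical stand-in for a TabularData object.

```python
import io
import pickle

# Hypothetical stand-in for a TabularData object.
table = {"name": "cities", "rows": [{"id": 1, "city": "Milan"}]}

# pickle.dumps returns bytes, so the target stream must be binary.
payload = pickle.dumps(table)

binary_stream = io.BytesIO()
binary_stream.write(payload)
binary_stream.seek(0)
restored = pickle.load(binary_stream)

# A text stream rejects the same payload with a TypeError.
try:
    io.StringIO().write(payload)
    text_write_failed = False
except TypeError:
    text_write_failed = True
```

As with any pickle-based storage, files should only be loaded from trusted sources, because unpickling can execute arbitrary code.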
- class py4ai.data.layer.fs.tables.TabularData(*, name: str = '', data: DataFrame)
Bases: BaseModel
Domain object to represent tabular data.
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- data: DataFrame
- name: str
- update(other: TabularData) → TabularData
Return a TabularData object obtained by concatenating this object with another TabularData object.
- Parameters
other – second TabularData
- Returns
merged TabularData
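The concatenation performed by update can be sketched without pandas. This is an illustration under stated assumptions, not the library's code: TableSketch is a hypothetical stand-in for TabularData, a list of dicts replaces the DataFrame, and keeping the name of the receiving object is an assumption the documentation does not confirm.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TableSketch:
    """Hypothetical stand-in for TabularData; `rows` replaces the DataFrame."""

    rows: List[Dict[str, object]]
    name: str = ""

    def update(self, other: "TableSketch") -> "TableSketch":
        # Rows of this object come first, followed by the rows of `other`.
        # Keeping self.name is an assumption made for this sketch.
        return TableSketch(rows=self.rows + other.rows, name=self.name)


first = TableSketch(rows=[{"id": 1}], name="cities")
second = TableSketch(rows=[{"id": 2}, {"id": 3}], name="more_cities")
merged = first.update(second)
```

With real DataFrames the analogous operation would presumably be pandas.concat, though whether update also resets the index or drops duplicates is not stated in this reference.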