Step Class for easy data loading and exporting. Also present at package level

| | Type | Default | Details |
|---|---|---|---|
| root | str \| None | ./data | Default root folder of the data path. Not exported in metadata. |
| attrs | str \| list[str] \| None | None | Default attributes part of the path. |
| version | str \| None | :default | Default version name (cannot use :last or other custom variables). |
| file_name | str \| None | :auto | Specify the file name. See file_name_in and file_name_out for more details on the :auto behaviour. |
| method_in | str \| object \| None | :auto | Default method to load the data. Can be a function with path as first argument or a string among [csv, excel, xlsx, xls, parquet, json, pickle, feather, hdf, sql, pkl]. |
| root_in | str \| None | :default | Default root folder when loading [not recommended, use root instead]. |
| attrs_in | str \| list[str] \| None | :default | Default attributes when loading. |
| step_in | str \| None | None | Default step name when loading. |
| version_in | str \| None | :default | Default version name when loading. |
| file_name_in | str \| None | :default | Default file name when loading. |
| method_out | str \| object \| None | :auto | Default method to save the data. Can be a function with path as first argument or a string among [csv, excel, xlsx, xls, parquet, json, pickle, feather, hdf, sql, pkl]. |
| root_out | str \| None | :default | Default root folder when saving [not recommended, use root instead]. |

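Several of the parameters above default to `:default`. A minimal sketch of how such a sentinel could be resolved is given here. The precedence order (call argument, then the direction-specific default such as `root_in`, then the general default such as `root`) is an assumption for illustration, and `resolve` is a hypothetical helper, not part of the library.

```python
DEFAULT = ":default"  # sentinel string shown in the Default column above


def resolve(call_value, directional_default, general_default):
    """Return the first candidate that is not the ':default' sentinel.

    Assumed precedence (not confirmed by the source): call argument, then the
    direction-specific default such as root_in, then the general default
    such as root.
    """
    for value in (call_value, directional_default, general_default):
        if value != DEFAULT:
            return value
    raise ValueError("no concrete value: every candidate is ':default'")


# root_in is left at ':default', so loading falls back to the general root.
load_root = resolve(DEFAULT, DEFAULT, "./data")

# An explicit call argument takes priority over both defaults.
explicit_root = resolve("../other", DEFAULT, "./data")

# None is a concrete value (for example attrs=None), so it is returned as-is.
no_attrs = resolve(DEFAULT, None, "lake")
```

Treating `None` as a concrete value keeps "no attributes" distinguishable from "use whatever default applies".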
Version name, converted to v_{version_name} in the path. If :default, uses :last. If :last, uses the last version based on its name. If :first, uses the first version based on its name.

| | Type | Default | Details |
|---|---|---|---|
| file_name | str \| Literal[':default', ':auto'] | :default | File name. Automatically inferred if there is only one file in the directory. |
| method | str \| object \| Literal[':default', ':auto'] | :default | Method to load the data. Can be a function with path as first argument or a string among [csv, excel, xlsx, xls, parquet, json, pickle, feather, hdf, sql, pkl]. |
| alias | str | :ignore | Alias of the dataset to document it and its columns (feature in development). |

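The `:last` / `:first` version selection and the single-file inference described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: "based on its name" is read as plain lexicographic ordering, and `resolve_version` and `infer_file_name` are hypothetical helper names.

```python
import os
import tempfile


def resolve_version(available, version=":default"):
    """Pick a version name; ':default' behaves like ':last'."""
    if version == ":default":
        version = ":last"
    if version in (":last", ":first"):
        if not available:
            raise FileNotFoundError("no version directory to choose from")
        ordered = sorted(available)  # assumption: plain lexicographic order
        return ordered[-1] if version == ":last" else ordered[0]
    return version  # an explicit version name is used unchanged


def infer_file_name(directory):
    """Return the only file in `directory`, or fail if the choice is ambiguous."""
    files = sorted(
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )
    if len(files) != 1:
        raise ValueError(
            f"cannot infer file name: found {len(files)} files in {directory}"
        )
    return files[0]


versions = ["202310111200", "202310121113", "202309010900"]
latest = resolve_version(versions)                   # same as ':last'
earliest = resolve_version(versions, ":first")
pinned = resolve_version(versions, "202310111200")   # explicit name

with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "countries.csv"), "w").close()
    inferred = infer_file_name(tmp)
```

Version names in the `%Y%m%d%H%M` format sort chronologically under lexicographic ordering, which is why a name-based rule is enough to find the most recent one.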
Step.save (data:Union[pandas.core.frame.DataFrame,Any],
           root:Union[str,Literal[':default']]=':default',
           attrs:Union[list,str,NoneType,Literal[':default']]=':default',
           step:Union[str,NoneType,Literal[':default']]=':default',
           version:Union[str,NoneType,Literal[':default'],stdflow.stdflow_types.strftime_type.Strftime]=':default',
           file_name:Union[str,Literal[':default',':auto']]=':default',
           method:Union[str,object,Literal[':default',':auto']]=':default',
           alias:str=':ignore', export_viz_tool:bool=False,
           verbose:bool=False, **kwargs)
Save data with path such as root/attrs/step/version/file_name

| | Type | Default | Details |
|---|---|---|---|
| data | pd.DataFrame \| Any | | Data to save. |
| root | str \| Literal[':default'] | :default | Root folder of the data. Not exported in metadata. |
| attrs | list \| str \| None \| Literal[':default'] | :default | Attributes part of the path. |
| step | str \| None \| Literal[':default'] | :default | Step name, converted to step_{step_name} in the path. |
| version | str \| None \| Literal[':default'] \| Strftime | :default | Version name, converted to v_{version_name} in the path. By default, uses the current date in the format %Y%m%d%H%M. |
| file_name | str \| Literal[':default', ':auto'] | :default | File name. Automatically inferred if there is only one input file. |
| method | str \| object \| Literal[':default', ':auto'] | :default | Method to save the data. Can be a function whose first argument is the path, or a string among [csv, excel, xlsx, xls, parquet, json, pickle, feather, hdf, sql, pkl]. |
| alias | str | :ignore | Alias of the dataset to document it and its columns (feature in development). |
| export_viz_tool | bool | False | If True, export an HTML view of the data and the pipeline it comes from. |

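The `method` argument accepts either a format string or a function whose first argument is the path. One way such a dispatch could work is sketched below. The mapping onto pandas-style `to_<format>` writers and the alias table are assumptions, `save_with` is a hypothetical helper, and `TinyTable` is a stand-in for a DataFrame so that the example runs without pandas.

```python
import csv
import os
import tempfile

# Format strings that do not match a pandas-style to_<format> writer name
# (pandas has to_excel and to_pickle, not to_xlsx or to_pkl).
ALIASES = {"xlsx": "excel", "xls": "excel", "pkl": "pickle"}


def save_with(data, path, method):
    """Save `data` to `path` using a callable or a format string."""
    if callable(method):
        # Assumption: the callable already knows the data (e.g. a bound
        # method) and receives the path as its first argument.
        return method(path)
    writer_name = "to_" + ALIASES.get(method, method)
    return getattr(data, writer_name)(path)


class TinyTable:
    """Stand-in for a DataFrame: exposes only a to_csv(path) writer."""

    def __init__(self, rows):
        self.rows = rows

    def to_csv(self, path):
        with open(path, "w", newline="") as handle:
            csv.writer(handle).writerows(self.rows)


table = TinyTable([["country", "region"], ["Chile", "Latin America"]])

with tempfile.TemporaryDirectory() as tmp:
    by_name = os.path.join(tmp, "by_name.csv")
    save_with(table, by_name, "csv")             # string method

    by_callable = os.path.join(tmp, "by_callable.csv")
    save_with(table, by_callable, table.to_csv)  # callable method

    with open(by_name, newline="") as handle:
        rows_by_name = list(csv.reader(handle))
    with open(by_callable, newline="") as handle:
        rows_by_callable = list(csv.reader(handle))
```

Both calls produce the same file content, which is the point of accepting a callable: any writer that takes the path first can replace the built-in formats.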
Set a variable which can be overwritten if specified in StepRunner / Pipeline
step.save(
    df,
    root="../demo_project/data",
    attrs="lake",
    file_name="countries of the world.csv",
    version=":default",
    method="csv",
    verbose=True,
)
sf.root = ../demo_project/data
sf.attrs = lake
sf.step = None
sf.version = %Y%m%d%H%M
sf.file_name = countries of the world.csv
sf.method = csv
Saving data to ../demo_project/data/lake/v_202310121113/countries of the world.csv
attrs=lake::step_name=None::version=202310121113::file_name=countries of the world.csv
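The output above suggests how the save path is assembled: the root, the attributes, an optional step_{step_name} folder, a v_{version_name} folder, then the file name. The sketch below reproduces that layout. `build_path` is a hypothetical helper, and leaving out the step or version folder when the value is None is an assumption (the output only shows this for the step).

```python
import posixpath
from datetime import datetime


def build_path(root, attrs, step, version, file_name, now=None):
    """Assemble root/attrs/step_{step}/v_{version}/file_name."""
    if version == ":default":
        # Default version: current date formatted as %Y%m%d%H%M.
        version = (now or datetime.now()).strftime("%Y%m%d%H%M")
    if isinstance(attrs, str):
        attrs = [attrs]
    parts = [root, *(attrs or [])]
    if step is not None:
        parts.append(f"step_{step}")
    if version is not None:  # assumption: None means no version folder
        parts.append(f"v_{version}")
    parts.append(file_name)
    return posixpath.join(*parts)


# Same inputs as the call above, with the clock pinned to the logged time.
path = build_path(
    root="../demo_project/data",
    attrs="lake",
    step=None,
    version=":default",
    file_name="countries of the world.csv",
    now=datetime(2023, 10, 12, 11, 13),
)
print(path)
# ../demo_project/data/lake/v_202310121113/countries of the world.csv

# With a step name, a step_ folder is inserted before the version folder.
with_step = build_path("./data", ["lake", "raw"], "clean", "1", "out.csv")
print(with_step)
# ./data/lake/raw/step_clean/v_1/out.csv
```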