Usage
Basic usage
import pandas as pd
from sktransf.transformer import LogColumnTransformer
df = pd.DataFrame(
    {
        "a": range(10),
        "b": range(10),
    }
)
logger = LogColumnTransformer()
# Fit once, then reuse the fitted transformer.
logger.fit(df)
df_transf = logger.transform(df)
Using common transformers
import pandas as pd
from sktransf.transformer import LogColumnTransformer, BoolColumnTransformer
from sktransf.selector import DropUniqueColumnSelector
df = pd.DataFrame(
    {
        "a": range(10),
        "b": range(10),
    }
)
df_bool = BoolColumnTransformer().fit_transform(df)
df_unique = DropUniqueColumnSelector().fit_transform(df)
df_logged = LogColumnTransformer().fit_transform(df)
Using a pipeline with a scikit-learn model
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LinearRegression
from sktransf.transformer import LogColumnTransformer, BoolColumnTransformer
from sktransf.selector import DropUniqueColumnSelector
pipe = Pipeline([
    ('bool', BoolColumnTransformer()),
    ('unique', DropUniqueColumnSelector()),
    ('log', LogColumnTransformer()),
    ('model', LinearRegression()),
])
X = pd.DataFrame(
    {
        "a": range(10),
        "b": range(10),
    }
)
y = range(10)
pipe.fit(X, y)
y_pred = pipe.predict(X)
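The pipeline pattern above works with any object that follows the scikit-learn transformer interface (`fit` returning `self`, plus `transform`). As a minimal sketch of that interface, the example below uses a stand-in class, `Log1pFrameTransformer`. This class is hypothetical and is not part of sktransf, and its use of `numpy.log1p` is an assumption made for illustration, not a description of how `LogColumnTransformer` is implemented.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline


class Log1pFrameTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in transformer; not sktransf's implementation."""

    def fit(self, X, y=None):
        # Stateless transformer: nothing to learn, but fit must return self.
        return self

    def transform(self, X):
        # log1p maps 0 to 0, so zero-valued cells stay finite.
        # Applying a NumPy ufunc to a DataFrame returns a DataFrame.
        return np.log1p(X)


X = pd.DataFrame({"a": range(10), "b": range(10, 20)})
y = list(range(10))

pipe = Pipeline([
    ("log", Log1pFrameTransformer()),
    ("model", LinearRegression()),
])
pipe.fit(X, y)
y_pred = pipe.predict(X)
```

Because the stand-in returns a `DataFrame`, column names survive the transformation step, which is the behaviour a pandas-oriented transformer would typically aim for.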
Notebooks
For more specific information, please refer to the notebooks:
- Transformers:
  - LogColumnTransformer notebook
  - BoolColumnTransformer notebook
- Selectors:
  - DropUniqueColumnSelector notebook
  - DropSkuColumnSelector notebook
- Pipelines:
  - Pipelines notebook