deeppavlov.models.sklearn
class deeppavlov.models.sklearn.sklearn_component.SklearnComponent(model_class: str, save_path: Union[str, pathlib.Path] = None, load_path: Union[str, pathlib.Path] = None, infer_method: str = 'predict', ensure_list_output: bool = False, **kwargs)
Wrapper for sklearn components used for feature extraction, feature selection, classification, regression, etc.
Parameters:
- model_class – string with the full name of the sklearn model to use, e.g. sklearn.linear_model:LogisticRegression
- save_path – path to save the model to, given either as a full file name, e.g. model_path/model.pkl, or as a prefix, e.g. model_path/model (the model is still saved to model_path/model.pkl)
- load_path – path to load the model from, given either as a full file name, e.g. model_path/model.pkl, or as a prefix, e.g. model_path/model (the model is still loaded from model_path/model.pkl)
- infer_method – name of the model method to use for inference, e.g. predict, predict_proba, predict_log_proba, transform
- ensure_list_output – whether to ensure that the output for each sample is iterable (but not a string)
- kwargs – dictionary with parameters for the sklearn model
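The `module:ClassName` form of model_class can be turned into a class object with the standard importlib module. The following is only a minimal sketch of that idea, not the component's actual code: the helper name `resolve_class` is hypothetical, and a standard-library class stands in for an sklearn model so that the snippet runs without sklearn installed.

```python
import importlib


def resolve_class(full_name: str):
    """Resolve a 'package.module:ClassName' string to the class object.

    Hypothetical helper for illustration only.
    """
    module_name, sep, class_name = full_name.partition(":")
    if not sep or not module_name or not class_name:
        raise ValueError(f"expected 'module:ClassName', got {full_name!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# A standard-library class is used here so the sketch has no dependencies;
# 'sklearn.linear_model:LogisticRegression' has the same 'module:ClassName' shape.
ordered_dict_cls = resolve_class("collections:OrderedDict")

# Keyword arguments play the role of the model parameters passed via kwargs.
instance = ordered_dict_cls(a=1)
```

Separating the module from the class name with a colon keeps the split unambiguous, since module paths themselves contain dots.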
Attributes:
- model – sklearn model instance
- model_class – string with the full name of the sklearn model to use, e.g. sklearn.linear_model:LogisticRegression
- model_params – dictionary with parameters for the sklearn model, excluding pipe parameters
- pipe_params – dictionary with parameters for the pipe: in, out, fit_on, main, name
- save_path – path to save the model to, given either as a full file name, e.g. model_path/model.pkl, or as a prefix, e.g. model_path/model (the model is still saved to model_path/model.pkl)
- load_path – path to load the model from, given either as a full file name, e.g. model_path/model.pkl, or as a prefix, e.g. model_path/model (the model is still loaded from model_path/model.pkl)
- infer_method – name of the model method to use for inference, e.g. predict, predict_proba, predict_log_proba, transform
- ensure_list_output – whether to ensure that the output for each sample is iterable (but not a string)
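The split between model_params and pipe_params can be pictured as partitioning the incoming keyword arguments by the pipe keys listed above. This is a hedged sketch under that assumption; `PIPE_KEYS`, `split_params` and the sample values are hypothetical and are not taken from the library.

```python
# Pipe keys as listed in the pipe_params description.
PIPE_KEYS = ("in", "out", "fit_on", "main", "name")


def split_params(kwargs: dict) -> tuple:
    """Partition kwargs into (model_params, pipe_params). Hypothetical helper."""
    pipe_params = {k: v for k, v in kwargs.items() if k in PIPE_KEYS}
    model_params = {k: v for k, v in kwargs.items() if k not in PIPE_KEYS}
    return model_params, pipe_params


# Example keyword arguments mixing pipe keys and estimator parameters.
config_kwargs = {
    "in": ["x"],
    "out": ["y_pred"],
    "fit_on": ["x", "y"],
    "C": 10,
    "penalty": "l2",
}
model_params, pipe_params = split_params(config_kwargs)
```

Only model_params would then be passed to the estimator constructor, since an sklearn model does not accept keys such as in or out.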
- __call__(*args)
Run inference on the given data using the infer method specified in the config, e.g. "predict", "predict_proba", "transform".
Parameters: *args – list of inputs
Returns: predictions, e.g. a list of labels, an array of probability distributions, or a sparse array of vectorized samples
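Choosing the inference method by name can be sketched with getattr. The example below is illustrative only: `ToyModel` and `infer` are hypothetical stand-ins, and the way ensure_list_output wraps non-list outputs is an assumption about the behaviour described above, not the component's actual code.

```python
class ToyModel:
    """Hypothetical stand-in for a fitted sklearn estimator."""

    def predict(self, x):
        return [int(v > 0) for v in x]

    def predict_proba(self, x):
        return [[0.0, 1.0] if v > 0 else [1.0, 0.0] for v in x]


def infer(model, infer_method: str, x, ensure_list_output: bool = False):
    """Call the method named by infer_method on the model."""
    method = getattr(model, infer_method)  # AttributeError for unknown names
    predictions = method(x)
    if ensure_list_output:
        # Assumed behaviour: wrap scalars and strings so that every
        # per-sample output is a list.
        predictions = [
            p if isinstance(p, (list, tuple)) else [p]
            for p in predictions
        ]
    return predictions


labels = infer(ToyModel(), "predict", [-1.5, 2.0])
wrapped = infer(ToyModel(), "predict", [-1.5, 2.0], ensure_list_output=True)
probas = infer(ToyModel(), "predict_proba", [-1.5, 2.0])
```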
- fit(*args) → None
Fit the model on the given data.
Parameters: *args – list of x-inputs and, optionally, one y-input (the last one) to fit on. Possible inputs are (x0, …, xK, y) or (x0, …, xK), where K is the number of input data elements (the length of the "in" list from the config). With several inputs (K > 1), the input features are stacked. For example, given x0: (n_samples, n_features0), …, xK: (n_samples, n_featuresK), the model is trained on x: (n_samples, n_features0 + … + n_featuresK).
Returns: None
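The feature stacking described for several x-inputs amounts to a column-wise concatenation per sample. The sketch below uses plain Python lists to stay dependency-free; `stack_features` is a hypothetical helper, and for real arrays numpy.hstack or scipy.sparse.hstack would be the usual counterparts.

```python
def stack_features(*inputs):
    """Concatenate per-sample feature rows from several inputs.

    Each input is a list of rows with shape (n_samples, n_features_k).
    """
    n_samples = len(inputs[0])
    if any(len(x) != n_samples for x in inputs):
        raise ValueError("all inputs must have the same number of samples")
    return [
        [value for x in inputs for value in x[i]]
        for i in range(n_samples)
    ]


x0 = [[1, 2], [3, 4], [5, 6]]     # (3 samples, 2 features)
x1 = [[10], [20], [30]]           # (3 samples, 1 feature)
stacked = stack_features(x0, x1)  # (3 samples, 2 + 1 features)
```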
- init_from_scratch() → None
Initialize self.model as an sklearn model from scratch, using the parameters given in self.model_params.
Returns: None
- load(fname: str = None) → None
Initialize self.model as an sklearn model from a saved file, re-initializing it with the self.model_params parameters. If warm_start is set to True in the newly given parameters and the model accepts a warm_start parameter, the model is initialized from the saved one so that fitting can be continued.
Parameters: fname – path of the file to load the model from
Returns: None
- save(fname: str = None) → None
Save self.model to the file given by fname or, if it is not given, by self.save_path. If self.save_path does not have the .pkl extension, it is replaced with str(Path(self.save_path).stem) + ".pkl".
Parameters: fname – path of the file to save the model to
Returns: None
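The path handling for saving and loading can be sketched with pathlib and pickle. This is a minimal illustration under stated assumptions: the helper names are hypothetical, pickle is assumed as the storage format because of the .pkl extension, and the parent directory is assumed to be kept (as the model_path/model → model_path/model.pkl examples suggest). A plain dictionary stands in for the model.

```python
import pickle
import tempfile
from pathlib import Path


def normalize_model_path(path) -> Path:
    """Return the path with a .pkl extension, keeping the parent directory."""
    path = Path(path)
    return path if path.suffix == ".pkl" else path.with_suffix(".pkl")


def save_model(model, path) -> Path:
    """Pickle the model to the normalized path (assumed storage format)."""
    target = normalize_model_path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    with open(target, "wb") as f:
        pickle.dump(model, f)
    return target


def load_model(path):
    """Load a pickled model from the normalized path."""
    with open(normalize_model_path(path), "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    # Saving with the prefix form and loading with the full name hit the same file.
    saved_to = save_model({"coef": [0.5, -1.0]}, Path(tmp) / "model_path" / "model")
    saved_name = saved_to.name
    restored = load_model(Path(tmp) / "model_path" / "model.pkl")
```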
- static compose_input_data(x: List[Union[Tuple[Union[numpy.ndarray, list, scipy.sparse.base.spmatrix, str]], List[Union[numpy.ndarray, list, scipy.sparse.base.spmatrix, str]], numpy.ndarray, scipy.sparse.base.spmatrix]]) → Union[scipy.sparse.base.spmatrix, numpy.ndarray]
Stack the given list of inputs of different types into one matrix. If any of the inputs is a sparse matrix, the output is also a sparse matrix.
Parameters: x – list of data elements
Returns: sparse or dense array of stacked data