{ "cells": [ { "cell_type": "markdown", "id": "6d5cd16b", "metadata": {}, "source": [ "#### Python pipelines" ] }, { "cell_type": "markdown", "id": "da10fd80", "metadata": {}, "source": [ "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/deeppavlov/DeepPavlov/blob/master/docs/intro/python.ipynb)\n" ] }, { "cell_type": "markdown", "id": "d55ebe35", "metadata": {}, "source": [ "Python models could be used without .json configuration files.\n", "\n", "The code below is an alternative to building [insults_kaggle_bert](https://github.com/deepmipt/DeepPavlov/blob/master/deeppavlov/configs/classifiers/insults_kaggle_bert.json) model and using it with\n", "\n", "```python\n", "from deeppavlov import build_model\n", "\n", "model = build_model('insults_kaggle_bert', download=True)\n", "```" ] }, { "cell_type": "markdown", "id": "fa1db63b", "metadata": {}, "source": [ "At first, define variables for model components and download model data." ] }, { "cell_type": "code", "execution_count": null, "id": "9d6671e2", "metadata": {}, "outputs": [], "source": [ "from deeppavlov.core.commands.utils import expand_path\n", "from deeppavlov.download import download_resource\n", "\n", "\n", "classifiers_path = expand_path('~/.deeppavlov/models/classifiers')\n", "model_path = classifiers_path / 'insults_kaggle_torch_bert'\n", "transformer_name = 'bert-base-uncased'\n", "\n", "download_resource(\n", " 'http://files.deeppavlov.ai/deeppavlov_data/classifiers/insults_kaggle_torch_bert_v5.tar.gz',\n", " {classifiers_path}\n", ")\n" ] }, { "cell_type": "markdown", "id": "332d644e", "metadata": {}, "source": [ "Then, initialize model components." ] }, { "cell_type": "code", "execution_count": null, "id": "809c31ad", "metadata": {}, "outputs": [], "source": [ "from deeppavlov.core.data.simple_vocab import SimpleVocabulary\n", "from deeppavlov.models.classifiers.proba2labels import Proba2Labels\n", "from deeppavlov.models.preprocessors.torch_transformers_preprocessor import TorchTransformersPreprocessor\n", "from deeppavlov.models.torch_bert.torch_transformers_classifier import TorchTransformersClassifierModel\n", "\n", "\n", "preprocessor = TorchTransformersPreprocessor(\n", " vocab_file=transformer_name,\n", " max_seq_length=64\n", ")\n", "\n", "classes_vocab = SimpleVocabulary(\n", " load_path=model_path/'classes.dict',\n", " save_path=model_path/'classes.dict'\n", ")\n", "\n", "classifier = TorchTransformersClassifierModel(\n", " n_classes=classes_vocab.len,\n", " return_probas=True,\n", " pretrained_bert=transformer_name,\n", " save_path=model_path/'model',\n", " optimizer_parameters={'lr': 1e-05}\n", ")\n", "\n", "proba2labels = Proba2Labels(max_proba=True)" ] }, { "cell_type": "markdown", "id": "87e8ec20", "metadata": {}, "source": [ "Finally, create model from components. ``Element`` is a wrapper for a component. ``Element`` receives the component and the names of the incoming and outgoing arguments. ``Model`` combines ``Element``s into pipeline." ] }, { "cell_type": "code", "execution_count": null, "id": "acfe29de", "metadata": {}, "outputs": [], "source": [ "from deeppavlov import Element, Model\n", "\n", "model = Model(\n", " x=['x'],\n", " out=['y_pred_labels'],\n", " pipe=[\n", " Element(component=preprocessor, x=['x'], out=['bert_features']),\n", " Element(component=classifier, x=['bert_features'], out=['y_pred_probas']),\n", " Element(component=proba2labels, x=['y_pred_probas'], out=['y_pred_ids']),\n", " Element(component=classes_vocab, x=['y_pred_ids'], out=['y_pred_labels'])\n", " ]\n", ")\n", "\n", "model(['you are stupid', 'you are smart'])" ] } ], "metadata": {}, "nbformat": 4, "nbformat_minor": 5 }