Few-shot Text Classification

1. Introduction to the task

Text classification is the task of assigning one of a set of predefined labels to an utterance. Here the label is one of N classes or “OOS” (out-of-scope: utterances that do not belong to any of the predefined classes). We consider the few-shot setting, where only a few examples (5 or 10) per intent class are given as the training set.
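To make the few-shot setting concrete, here is a minimal sketch of drawing at most k examples per class from a larger labeled set. The helper name sample_k_shot and the toy data are illustrative assumptions, not part of DeepPavlov.

```python
import random
from collections import defaultdict


def sample_k_shot(examples, k, seed=0):
    """Return at most k [text, label] pairs per label (hypothetical helper)."""
    # Group texts by their label.
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append(text)

    # Draw up to k texts per label with a fixed seed for reproducibility.
    rng = random.Random(seed)
    subset = []
    for label in sorted(by_label):
        texts = by_label[label]
        chosen = rng.sample(texts, min(k, len(texts)))
        subset.extend([text, label] for text in chosen)
    return subset


full_data = [
    ("please help me book a rental car for nashville", "car_rental"),
    ("how can i rent a car in boston", "car_rental"),
    ("help me get a rental car for march 2 to 6th", "car_rental"),
    ("please tell me how to ask for a taxi in french", "translate"),
    ("how would i say thank you if i were russian", "translate"),
    ("what is the spanish word for car", "translate"),
]

few_shot = sample_k_shot(full_data, k=2)
```

With k=2 the result holds two examples for each of the two classes; a k larger than the class size simply keeps every example of that class.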

2. Get started with the model

First make sure you have the DeepPavlov Library installed. More info about the first installation.

[ ]:
!pip install -q deeppavlov

Then make sure that all the required packages are installed.

[ ]:
!python -m deeppavlov install few_shot_roberta

few_shot_roberta is the name of the model’s config_file. What is a Config File?

A configuration file defines the model and describes its hyperparameters. To use another model, change the name of the config_file here and in the steps that follow. Some few-shot classification models and their config names are listed in the table below.

3. Models list

At the moment, only the few_shot_roberta config supports out-of-scope detection.

Config name       Dataset                  Shot  Model Size  In-domain accuracy  Out-of-scope recall  Out-of-scope precision
----------------  -----------------------  ----  ----------  ------------------  -------------------  ----------------------
few_shot_roberta  CLINC150-Banking-Domain  5     1.4 GB      84.1±1.9            93.2±0.8             97.8±0.3
few_shot_roberta  CLINC150                 5     1.4 GB      59.4±1.4            87.9±1.2             40.3±0.7
few_shot_roberta  BANKING77-OOS            5     1.4 GB      51.4±2.1            93.7±0.7             82.7±1.4
fasttext_logreg*  CLINC150-Banking-Domain  5     37 KB       24.8±2.2            98.2±0.4             74.8±0.6
fasttext_logreg*  CLINC150                 5     37 KB       13.4±0.5            98.6±0.2             20.5±0.1
fasttext_logreg*  BANKING77-OOS            5     37 KB       10.7±0.8            99.0±0.3             36.4±0.2

With a zero threshold, we get the classification accuracy without OOS detection:

Config name       Dataset                  Shot  Model Size  Accuracy
----------------  -----------------------  ----  ----------  --------
few_shot_roberta  CLINC150-Banking-Domain  5     1.4 GB      89.6
few_shot_roberta  CLINC150                 5     1.4 GB      79.6
few_shot_roberta  BANKING77-OOS            5     1.4 GB      55.1
fasttext_logreg*  CLINC150-Banking-Domain  5     37 KB       86.3
fasttext_logreg*  CLINC150                 5     37 KB       73.6
fasttext_logreg*  BANKING77-OOS            5     37 KB       51.6

* The config file was modified to predict OOS examples.
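For readers who want to reproduce metrics of this kind on their own data, the sketch below shows one common way to compute in-domain accuracy and out-of-scope recall and precision. It assumes that in-domain accuracy is measured only over examples whose true label is not OOS; the exact evaluation protocol behind the reported numbers is not stated here. The function name oos_metrics and the "oos" label string are illustrative assumptions.

```python
def oos_metrics(y_true, y_pred, oos_label="oos"):
    """Compute in-domain accuracy, OOS recall and OOS precision (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))

    # In-domain accuracy: only examples whose true label is a known class.
    in_domain = [(t, p) for t, p in pairs if t != oos_label]
    correct = sum(1 for t, p in in_domain if t == p)
    accuracy = correct / len(in_domain) if in_domain else 0.0

    # Treat OOS as the positive class for recall and precision.
    tp = sum(1 for t, p in pairs if t == oos_label and p == oos_label)
    fn = sum(1 for t, p in pairs if t == oos_label and p != oos_label)
    fp = sum(1 for t, p in pairs if t != oos_label and p == oos_label)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    return {
        "in_domain_accuracy": accuracy,
        "oos_recall": recall,
        "oos_precision": precision,
    }


y_true = ["car_rental", "car_rental", "translate", "oos", "oos"]
y_pred = ["car_rental", "oos", "translate", "oos", "car_rental"]
metrics = oos_metrics(y_true, y_pred)
```

In this toy example two of three in-domain utterances are classified correctly, one of two OOS utterances is caught, and one of two OOS predictions is right.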

4. Use the model for prediction

The base model few_shot_roberta is already pre-trained to recognize similar utterances, so you can use it off the shelf for prediction and evaluation. No additional training is needed.

4.1 Dataset format

The DNNC model compares the input text to every example in the dataset to determine which class the input belongs to. The dataset on which classification is based has the following format:

[
    ["text_1",  "label_1"],
    ["text_2",  "label_2"],
             ...
    ["text_n",  "label_n"]
]
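If your examples are grouped by label, a list in this format can be built with a few lines of Python. This is a minimal sketch; the helper name to_dataset and the input dictionary are illustrative assumptions, not part of DeepPavlov.

```python
def to_dataset(examples_by_label):
    """Flatten {label: [texts]} into a list of [text, label] pairs (hypothetical helper)."""
    dataset = []
    for label, texts in examples_by_label.items():
        for text in texts:
            dataset.append([text, label])
    return dataset


examples_by_label = {
    "car_rental": [
        "how can i rent a car in boston",
        "help me get a rental car for march 2 to 6th",
    ],
    "translate": [
        "please tell me how to ask for a taxi in french",
    ],
}

dataset = to_dataset(examples_by_label)
# Each item is a two-element list: the utterance first, then its label.
```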

4.2 Predict using Python

After installing the model, build it from the config and predict.

[ ]:
from deeppavlov import build_model

model = build_model("few_shot_roberta", download=True)

If you set the download flag to True, existing model weights will be overwritten.

Setting the install argument to True is equivalent to running the command-line install command: all the required packages are installed first.

Input: List[texts, dataset]

Output: List[labels]

[2]:
texts = [
    "what expression would i use to say i love you if i were an italian",
    "what's the currency conversion between krones and yen",
    "i'd like to reserve a high-end car"
]

dataset = [
    ["please help me book a rental car for nashville",                       "car_rental"],
    ["how can i rent a car in boston",                                       "car_rental"],
    ["help me get a rental car for march 2 to 6th",                          "car_rental"],

    ["how many pesos can i get for one dollar",                              "exchange_rate"],
    ["tell me the exchange rate between rubles and dollars",                 "exchange_rate"],
    ["what is the exchange rate in pesos for 100 dollars",                   "exchange_rate"],

    ["can you tell me how to say 'i do not speak much spanish', in spanish", "translate"],
    ["please tell me how to ask for a taxi in french",                       "translate"],
    ["how would i say thank you if i were russian",                          "translate"]
]

model(texts, dataset)
[2]:
['translate', 'exchange_rate', 'car_rental']

4.3 Predict using CLI

You can also get predictions in interactive mode through the CLI (Command Line Interface).

[ ]:
!python -m deeppavlov interact few_shot_roberta -d

-d is an optional download key (alternative to download=True in Python code). The key -d is used to download the pre-trained model along with all other files needed to run the model.

Or make predictions for samples from a file (if -f is omitted, samples are read from stdin).

[ ]:
!python -m deeppavlov predict few_shot_roberta -f <file-name>

5. Customize the model

Out-of-scope (OOS) examples are detected from the model's confidence, using the confidence_threshold parameter. For each input text, if the confidence of the model is lower than confidence_threshold, the input is considered out-of-scope. The higher the threshold, the more often the model predicts the “oos” class. By default it is set to 0, but you can change it in the configuration file.

[4]:
from deeppavlov import build_model
from deeppavlov.core.commands.utils import parse_config

model_config = parse_config('few_shot_roberta')
model_config['chainer']['pipe'][-1]['confidence_threshold'] = 0.1
model = build_model(model_config)
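The thresholding rule described above can be illustrated without the model. The sketch below is not DeepPavlov's implementation: it assumes each input comes with one non-negative confidence score per class, and the function name apply_threshold, the scores, and the "oos" label string are made up for illustration.

```python
def apply_threshold(scores, confidence_threshold=0.0, oos_label="oos"):
    """Pick the best-scoring label, or oos_label if its confidence is too low."""
    best_label = max(scores, key=scores.get)
    if scores[best_label] < confidence_threshold:
        return oos_label
    return best_label


# Hypothetical per-class confidences for one input text.
scores = {"car_rental": 0.08, "exchange_rate": 0.03, "translate": 0.05}

# With the default threshold of 0, a non-negative score is never rejected.
no_oos = apply_threshold(scores, confidence_threshold=0.0)

# Raising the threshold above the best score turns the prediction into "oos".
with_oos = apply_threshold(scores, confidence_threshold=0.1)
```

Here no_oos is "car_rental" and with_oos is "oos", which mirrors the statement that a higher threshold produces more out-of-scope predictions.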