deeppavlov.models.ranking
Ranking classes.

class deeppavlov.models.ranking.bilstm_siamese_network.BiLSTMSiameseNetwork(len_vocab: int, seed: int = None, shared_weights: bool = True, embedding_dim: int = 300, reccurent: str = 'bilstm', hidden_dim: int = 300, max_pooling: bool = True, triplet_loss: bool = True, margin: float = 0.1, hard_triplets: bool = False, *args, **kwargs)
The class implementing a siamese neural network with BiLSTM and max pooling.
The model can be trained with either a binary cross-entropy loss or a triplet loss with random or hard negative sampling.
Parameters:
- len_vocab – The size of the vocabulary, used to build the embedding layer.
- seed – Random seed.
- shared_weights – Whether to use shared weights in the model to encode contexts and responses.
- embedding_dim – Dimensionality of token (word) embeddings.
- reccurent – The type of the RNN cell. Possible values are lstm and bilstm.
- hidden_dim – Dimensionality of the hidden state of the RNN cell. If reccurent equals bilstm, the actual dimensionality is twice hidden_dim.
- max_pooling – Whether to use the max-pooling operation to get the context (response) vector representation. If False, the last hidden state of the RNN will be used.
- triplet_loss – Whether to use a model with triplet loss. If False, a model with cross-entropy loss will be used.
- margin – The margin parameter for the triplet loss. Only required if triplet_loss is set to True.
- hard_triplets – Whether to use hard triplet sampling to train the model, i.e. to choose negative samples close to positive ones. If set to False, random sampling will be used. Only required if triplet_loss is set to True.
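
The interplay of margin and hard_triplets can be illustrated with a minimal NumPy sketch. It is not the library's code: the Euclidean distance, the hinge form of the loss, and the helper names triplet_loss and closest_negative are assumptions made for illustration only.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge-style triplet loss; Euclidean distance is an assumption."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, margin + d_pos - d_neg)

def closest_negative(positive, candidates):
    """Hypothetical hard sampling: pick the negative closest to the positive."""
    distances = np.linalg.norm(candidates - positive, axis=-1)
    return int(np.argmin(distances))

context = np.array([1.0, 0.0])    # anchor (encoded context)
response = np.array([0.8, 0.0])   # positive (encoded true response)
negatives = np.array([[0.0, 1.0], [0.8, 0.1], [-1.0, 0.0]])

hard_idx = closest_negative(response, negatives)
loss_hard = triplet_loss(context, response, negatives[hard_idx])
loss_easy = triplet_loss(context, response, negatives[2])
print(hard_idx, loss_hard, loss_easy)
```

In this toy setup the hard negative still violates the margin and yields a non-zero loss, while the distant negative contributes nothing, which is the usual motivation for hard sampling.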

class deeppavlov.models.ranking.bilstm_gru_siamese_network.BiLSTMGRUSiameseNetwork(len_vocab: int, seed: int = None, shared_weights: bool = True, embedding_dim: int = 300, reccurent: str = 'bilstm', hidden_dim: int = 300, max_pooling: bool = True, triplet_loss: bool = True, margin: float = 0.1, hard_triplets: bool = False, *args, **kwargs)
The class implementing a siamese neural network with BiLSTM, GRU and max pooling.
GRU is used to take into account multi-turn dialogue context.
Parameters:
- len_vocab – The size of the vocabulary, used to build the embedding layer.
- seed – Random seed.
- shared_weights – Whether to use shared weights in the model to encode contexts and responses.
- embedding_dim – Dimensionality of token (word) embeddings.
- reccurent – The type of the RNN cell. Possible values are lstm and bilstm.
- hidden_dim – Dimensionality of the hidden state of the RNN cell. If reccurent equals bilstm, the actual dimensionality is twice hidden_dim.
- max_pooling – Whether to use the max-pooling operation to get the context (response) vector representation. If False, the last hidden state of the RNN will be used.
- triplet_loss – Whether to use a model with triplet loss. If False, a model with cross-entropy loss will be used.
- margin – The margin parameter for the triplet loss. Only required if triplet_loss is set to True.
- hard_triplets – Whether to use hard triplet sampling to train the model, i.e. to choose negative samples close to positive ones. If set to False, random sampling will be used. Only required if triplet_loss is set to True.
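
The difference between the two settings of max_pooling, and the doubled width for bilstm, can be sketched in NumPy as follows. This is an illustration under assumptions: the encode helper is hypothetical, and taking the final time step as "the last hidden state" is a simplification, since a real bidirectional layer handles the backward direction separately.

```python
import numpy as np

def encode(hidden_states, max_pooling=True):
    """Collapse per-token states of shape (seq_len, dim) into one vector."""
    if max_pooling:
        # Element-wise maximum over the time axis.
        return hidden_states.max(axis=0)
    # Simplification: use the states at the final time step.
    return hidden_states[-1]

hidden_dim = 3
forward = np.array([[0.1, 0.5, -0.2], [0.4, -0.1, 0.3]])
backward = np.array([[0.7, 0.0, 0.2], [-0.3, 0.6, 0.1]])

# With a bidirectional cell the two directions are concatenated,
# so the vector width is 2 * hidden_dim.
states = np.concatenate([forward, backward], axis=-1)

pooled = encode(states, max_pooling=True)
last = encode(states, max_pooling=False)
print(pooled.shape, last.shape)
```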

class deeppavlov.models.ranking.keras_siamese_model.KerasSiameseModel(learning_rate: float = 0.001, use_matrix: bool = True, emb_matrix: numpy.ndarray = None, max_sequence_length: int = None, dynamic_batch: bool = False, attention: bool = False, *args, **kwargs)
The class implementing base functionality for siamese neural networks in Keras.
Parameters:
- learning_rate – Learning rate.
- use_matrix – Whether to use a trainable matrix with token (word) embeddings.
- emb_matrix – An embeddings matrix to initialize the embeddings layer of a model. Only used if use_matrix is set to True.
- max_sequence_length – The maximum length of text sequences in tokens. Longer sequences will be truncated and shorter ones will be padded.
- dynamic_batch – Whether to use dynamic batching. If True, the maximum sequence length for a batch will be equal to the length of the longest sequence in that batch, but not higher than max_sequence_length.
- attention – Whether any attention mechanism is used in the siamese network.
- *args – Other parameters.
- **kwargs – Other parameters.
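
The rule described for max_sequence_length and dynamic_batch can be sketched in plain Python. The pad_batch helper, the pad_id value, and truncating and padding on the right are assumptions for illustration; the library may pad or truncate on a different side.

```python
def pad_batch(batch, max_sequence_length, dynamic_batch=False, pad_id=0):
    """Pad or truncate token-id sequences to a common length."""
    if dynamic_batch:
        # Longest sequence in this batch, capped by max_sequence_length.
        longest = max(len(seq) for seq in batch)
        target_len = min(longest, max_sequence_length)
    else:
        target_len = max_sequence_length
    padded = []
    for seq in batch:
        seq = list(seq[:target_len])  # truncate on the right (assumption)
        padded.append(seq + [pad_id] * (target_len - len(seq)))
    return padded

short_batch = [[1, 2, 3], [4]]
long_batch = [[1, 2, 3, 4, 5, 6, 7], [8, 9]]

static = pad_batch(short_batch, max_sequence_length=5)
dynamic = pad_batch(short_batch, max_sequence_length=5, dynamic_batch=True)
capped = pad_batch(long_batch, max_sequence_length=5, dynamic_batch=True)
print(static, dynamic, capped)
```

With dynamic batching a batch of short sequences is padded only up to its own longest member, which avoids computing over unnecessary padding.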

class deeppavlov.models.ranking.mpm_siamese_network.MPMSiameseNetwork(dense_dim: int = 50, perspective_num: int = 20, aggregation_dim: int = 200, recdrop_val: float = 0.0, inpdrop_val: float = 0.0, ldrop_val: float = 0.0, dropout_val: float = 0.0, *args, **kwargs)
The class implementing a siamese neural network with bilateral multi-perspective matching.
The network architecture is based on https://arxiv.org/abs/1702.03814.
Parameters:
- dense_dim – Dimensionality of the dense layer.
- perspective_num – Number of perspectives in multi-perspective matching layers.
- aggregation_dim – Dimensionality of the hidden state in the second BiLSTM layer.
- inpdrop_val – Float between 0 and 1. A dropout value for the linear transformation of the inputs.
- recdrop_val – Float between 0 and 1. A dropout value for the linear transformation of the recurrent state.
- ldrop_val – A dropout value of the dropout layer before the second BiLSTM layer.
- dropout_val – A dropout value of the dropout layer after the second BiLSTM layer.
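
The role of perspective_num can be illustrated with the multi-perspective cosine matching operation from the referenced paper, where each perspective k re-weights both vectors with its own weight row before taking the cosine similarity. The NumPy sketch below is a standalone illustration; the function name, the random weights, and the eps guard are assumptions, and it does not reproduce the library's layers.

```python
import numpy as np

def multi_perspective_match(v1, v2, weights, eps=1e-8):
    """Return m with m[k] = cosine(weights[k] * v1, weights[k] * v2)."""
    a = weights * v1  # shape: (perspective_num, dim)
    b = weights * v2
    numerator = (a * b).sum(axis=1)
    denominator = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return numerator / np.maximum(denominator, eps)

rng = np.random.default_rng(0)
perspective_num, dim = 20, 8
W = rng.normal(size=(perspective_num, dim))  # trainable in a real model
v1 = rng.normal(size=dim)
v2 = rng.normal(size=dim)

m = multi_perspective_match(v1, v2, W)
m_same = multi_perspective_match(v1, v1, W)
print(m.shape)
```

Each pair of vectors is thus compared from perspective_num different weighted views, giving a matching vector instead of a single similarity score.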

class deeppavlov.models.ranking.siamese_model.SiameseModel(batch_size: int, num_context_turns: int = 1, *args, **kwargs)
The class implementing base functionality for siamese neural networks.
Parameters:
- batch_size – The size of a batch.
- num_context_turns – The number of context turns in data samples.
- *args – Other parameters.
- **kwargs – Other parameters.

class deeppavlov.models.ranking.siamese_predictor.SiamesePredictor(model: deeppavlov.models.ranking.siamese_model.SiameseModel, batch_size: int, num_context_turns: int = 1, ranking: bool = True, attention: bool = False, responses: deeppavlov.core.data.simple_vocab.SimpleVocabulary = None, preproc_func: Callable = None, interact_pred_num: int = 3, *args, **kwargs)
The class for ranking or paraphrase identification using the trained siamese network in the interact mode.
Parameters:
- batch_size – The size of a batch.
- num_context_turns – The number of context turns in data samples.
- ranking – Whether to perform ranking. If set to False, paraphrase identification will be performed.
- attention – Whether any attention mechanism is used in the siamese network. If False, response vectors computed in advance will be used to obtain the similarity score for the input context; otherwise the whole siamese architecture will be used to obtain the similarity score for the input context and each particular response. Used only if ranking is set to True.
- responses – An instance of SimpleVocabulary with all possible responses to perform ranking. Used only if ranking is set to True.
- preproc_func – The __call__ function of the SiamesePreprocessor.
- interact_pred_num – The number of the most relevant responses to return. Used only if ranking is set to True.
- **kwargs – Other parameters.
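
The ranking mode without attention, where response vectors are computed in advance and the top interact_pred_num responses are returned, can be sketched as follows. The rank_responses helper and the use of cosine similarity as the score are assumptions for illustration; the trained network may score pairs differently.

```python
import numpy as np

def rank_responses(context_vec, response_vecs, responses, interact_pred_num=3):
    """Return the interact_pred_num responses most similar to the context."""
    # Cosine similarity between the context and every precomputed response
    # vector (an assumed scoring function).
    norms = np.linalg.norm(response_vecs, axis=1) * np.linalg.norm(context_vec)
    scores = response_vecs @ context_vec / norms
    top = np.argsort(-scores)[:interact_pred_num]
    return [responses[i] for i in top]

responses = ["hello", "bye", "maybe", "no"]
# Precomputed once, then reused for every incoming context.
response_vecs = np.array([[1.0, 0.1], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]])
context_vec = np.array([1.0, 0.0])

top3 = rank_responses(context_vec, response_vecs, responses)
top1 = rank_responses(context_vec, response_vecs, responses, interact_pred_num=1)
print(top3, top1)
```

Precomputing the response vectors means only the context has to be encoded at query time, which is not possible when attention couples the context with each response.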