deeppavlov.models.squad¶
class deeppavlov.models.squad.squad.SquadModel(word_emb: numpy.ndarray, char_emb: numpy.ndarray, context_limit: int = 450, question_limit: int = 150, char_limit: int = 16, train_char_emb: bool = True, char_hidden_size: int = 100, encoder_hidden_size: int = 75, attention_hidden_size: int = 75, keep_prob: float = 0.7, min_learning_rate: float = 0.001, noans_token: bool = False, **kwargs)[source]¶

SquadModel predicts answer start and end positions in a given context for a given question.
High-level architecture: Word embeddings -> Contextual embeddings -> Question-Context Attention -> Self-attention -> Pointer Network
If the noans_token flag is True, a special noans_token is appended to the output of the self-attention layer, and the Pointer Network can select it when the given context contains no answer.
- Parameters
word_emb – pretrained word embeddings
char_emb – pretrained character embeddings
context_limit – max context length in tokens
question_limit – max question length in tokens
char_limit – max number of characters in a token
train_char_emb – whether to train the character embeddings
char_hidden_size – hidden size of the character-level RNN
encoder_hidden_size – hidden size of the encoder RNN
attention_hidden_size – size of the projection layer in attention
keep_prob – dropout keep probability
min_learning_rate – minimal learning rate, used in learning rate decay
noans_token – whether to use a special no-answer token so that the model is able to decline to answer a question
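The token- and character-level inputs below must be padded to the fixed limits above. A minimal sketch of such preprocessing, assuming a hypothetical `pad_batch` helper (not part of DeepPavlov) and the default limits:

```python
import numpy as np

# Hypothetical helper (an illustration, not DeepPavlov's own preprocessing):
# pads a batch of tokenized texts to fixed token and character limits.
def pad_batch(batch_tokens, token_limit, char_limit=16):
    """Return (tokens, chars) arrays of shape (batch, token_limit) and
    (batch, token_limit, char_limit), padded with empty strings."""
    tokens = np.full((len(batch_tokens), token_limit), '', dtype=object)
    chars = np.full((len(batch_tokens), token_limit, char_limit), '', dtype=object)
    for i, toks in enumerate(batch_tokens):
        for j, tok in enumerate(toks[:token_limit]):
            tokens[i, j] = tok
            for k, ch in enumerate(tok[:char_limit]):
                chars[i, j, k] = ch
    return tokens, chars

# Batches shaped to context_limit=450 and question_limit=150.
c_tokens, c_chars = pad_batch([['The', 'sky', 'is', 'blue', '.']], token_limit=450)
q_tokens, q_chars = pad_batch([['What', 'color', 'is', 'the', 'sky', '?']], token_limit=150)
```

Tokens beyond context_limit or question_limit, and characters beyond char_limit, are truncated.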
__call__(c_tokens: numpy.ndarray, c_chars: numpy.ndarray, q_tokens: numpy.ndarray, q_chars: numpy.ndarray, *args, **kwargs) → Tuple[numpy.ndarray, numpy.ndarray, List[float]][source]¶

Predicts answer start and end positions for the given contexts and questions.
- Parameters
c_tokens – batch of tokenized contexts
c_chars – batch of tokenized contexts, each token split into characters
q_tokens – batch of tokenized questions
q_chars – batch of tokenized questions, each token split into characters
- Returns
answer_start and answer_end positions, and answer logits representing the model's confidence
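The returned positions index tokens in the context. A sketch of turning them back into answer strings, assuming the end position is inclusive (the `extract_answers` helper below is hypothetical, not part of the class):

```python
import numpy as np

# Hypothetical post-processing: map predicted start/end token positions
# back to the answer text, treating the end position as inclusive.
def extract_answers(contexts, starts, ends):
    return [' '.join(ctx[s:e + 1]) for ctx, s, e in zip(contexts, starts, ends)]

contexts = [['The', 'sky', 'is', 'blue', '.']]
starts = np.array([3])
ends = np.array([3])
answers = extract_answers(contexts, starts, ends)  # ['blue']
```

The logits can then be used to rank or threshold these candidate answers.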
train_on_batch(c_tokens: numpy.ndarray, c_chars: numpy.ndarray, q_tokens: numpy.ndarray, q_chars: numpy.ndarray, y1s: Tuple[List[int], …], y2s: Tuple[List[int], …]) → float[source]¶

This method is called by the trainer to perform one training step on one batch.
- Parameters
c_tokens – batch of tokenized contexts
c_chars – batch of tokenized contexts, each token split into characters
q_tokens – batch of tokenized questions
q_chars – batch of tokenized questions, each token split into characters
y1s – batch of ground truth answer start positions
y2s – batch of ground truth answer end positions
- Returns
value of the loss function on the batch
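A span-prediction loss of this kind is typically the summed negative log-likelihood of the true start and end positions under the model's predicted distributions. The following is an illustrative sketch of that computation, not the internals of this class:

```python
import numpy as np

# Illustrative span-prediction loss (an assumption about the loss form,
# not this class's implementation): sum of negative log-likelihoods of
# the true start (y1) and end (y2) positions, averaged over the batch.
def span_nll(start_probs, end_probs, y1, y2):
    batch = np.arange(len(y1))
    nll = -np.log(start_probs[batch, y1]) - np.log(end_probs[batch, y2])
    return float(nll.mean())

# One example with a 3-token context; true span is tokens 1..2.
start_probs = np.array([[0.1, 0.7, 0.2]])
end_probs = np.array([[0.05, 0.15, 0.8]])
loss = span_nll(start_probs, end_probs, y1=np.array([1]), y2=np.array([2]))
```

A confident, correct prediction drives both probabilities toward 1 and the loss toward 0.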