RNN¶
- 
class torch.nn.RNN(*args, **kwargs)[source]¶
- Applies a multi-layer Elman RNN with or non-linearity to an input sequence. - For each element in the input sequence, each layer computes the following function: - where is the hidden state at time t, is the input at time t, and is the hidden state of the previous layer at time t-1 or the initial hidden state at time 0. If - nonlinearityis- 'relu', then is used instead of .- Parameters
- input_size – The number of expected features in the input x 
- hidden_size – The number of features in the hidden state h 
- num_layers – Number of recurrent layers. E.g., setting - num_layers=2would mean stacking two RNNs together to form a stacked RNN, with the second RNN taking in outputs of the first RNN and computing the final results. Default: 1
- nonlinearity – The non-linearity to use. Can be either - 'tanh'or- 'relu'. Default:- 'tanh'
- bias – If - False, then the layer does not use bias weights b_ih and b_hh. Default:- True
- batch_first – If - True, then the input and output tensors are provided as (batch, seq, feature). Default:- False
- dropout – If non-zero, introduces a Dropout layer on the outputs of each RNN layer except the last layer, with dropout probability equal to - dropout. Default: 0
- bidirectional – If - True, becomes a bidirectional RNN. Default:- False
 
 - Inputs: input, h_0
- input of shape (seq_len, batch, input_size): tensor containing the features of the input sequence. The input can also be a packed variable length sequence. See - torch.nn.utils.rnn.pack_padded_sequence()or- torch.nn.utils.rnn.pack_sequence()for details.
- h_0 of shape (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch. Defaults to zero if not provided. If the RNN is bidirectional, num_directions should be 2, else it should be 1. 
 
- Outputs: output, h_n
- output of shape (seq_len, batch, num_directions * hidden_size): tensor containing the output features (h_t) from the last layer of the RNN, for each t. If a - torch.nn.utils.rnn.PackedSequencehas been given as the input, the output will also be a packed sequence.- For the unpacked case, the directions can be separated using - output.view(seq_len, batch, num_directions, hidden_size), with forward and backward being direction 0 and 1 respectively. Similarly, the directions can be separated in the packed case.
- h_n of shape (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t = seq_len. - Like output, the layers can be separated using - h_n.view(num_layers, num_directions, batch, hidden_size).
 
- Shape:
- Input1: tensor containing input features where and L represents a sequence length. 
- Input2: tensor containing the initial hidden state for each element in the batch. Defaults to zero if not provided. where If the RNN is bidirectional, num_directions should be 2, else it should be 1. 
- Output1: where 
- Output2: tensor containing the next hidden state for each element in the batch 
 
 - Variables
- ~RNN.weight_ih_l[k] – the learnable input-hidden weights of the k-th layer, of shape (hidden_size, input_size) for k = 0. Otherwise, the shape is (hidden_size, num_directions * hidden_size) 
- ~RNN.weight_hh_l[k] – the learnable hidden-hidden weights of the k-th layer, of shape (hidden_size, hidden_size) 
- ~RNN.bias_ih_l[k] – the learnable input-hidden bias of the k-th layer, of shape (hidden_size) 
- ~RNN.bias_hh_l[k] – the learnable hidden-hidden bias of the k-th layer, of shape (hidden_size) 
 
 - Note - All the weights and biases are initialized from where - Warning - There are known non-determinism issues for RNN functions on some versions of cuDNN and CUDA. You can enforce deterministic behavior by setting the following environment variables: - On CUDA 10.1, set environment variable - CUDA_LAUNCH_BLOCKING=1. This may affect performance.- On CUDA 10.2 or later, set environment variable (note the leading colon symbol) - CUBLAS_WORKSPACE_CONFIG=:16:8or- CUBLAS_WORKSPACE_CONFIG=:4096:2.- See the cuDNN 8 Release Notes for more information. - Orphan
 - Note - If the following conditions are satisfied: 1) cudnn is enabled, 2) input data is on the GPU 3) input data has dtype - torch.float164) V100 GPU is used, 5) input data is not in- PackedSequenceformat persistent algorithm can be selected to improve performance.- Examples: - >>> rnn = nn.RNN(10, 20, 2) >>> input = torch.randn(5, 3, 10) >>> h0 = torch.randn(2, 3, 20) >>> output, hn = rnn(input, h0)