Flatten¶
- 
class torch.nn.Flatten(start_dim=1, end_dim=-1)[source]¶
- Flattens a contiguous range of dims into a tensor. For use with Sequential.
- Shape:
- Input: (*, S_start, ..., S_i, ..., S_end, *), where S_i is the size at dimension i and * means any number of dimensions including none.
- Output: (*, S_start * ... * S_end, *) (for the default case).
 
 - Parameters
- start_dim – first dim to flatten (default = 1). 
- end_dim – last dim to flatten (default = -1). 
 
 - Examples:

>>> input = torch.randn(32, 1, 5, 5)
>>> m = nn.Sequential(
>>>     nn.Conv2d(1, 32, 5, 1, 1),
>>>     nn.Flatten()
>>> )
>>> output = m(input)
>>> output.size()
torch.Size([32, 288])
 - 
add_module(name, module)¶
- Adds a child module to the current module.
- The module can be accessed as an attribute using the given name.
- Parameters
- name (string) – name of the child module. The child module can be accessed from this module using the given name 
- module (Module) – child module to be added to the module. 
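 - Example (an illustrative sketch, not from the original reference; assumes torch.nn is imported as nn, and the name 'conv' is an arbitrary choice):

>>> net = nn.Sequential()
>>> net.add_module('conv', nn.Conv2d(1, 16, 3))
>>> net.conv
Conv2d(1, 16, kernel_size=(3, 3), stride=(1, 1))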
 
 
 - 
apply(fn)¶
- Applies fn recursively to every submodule (as returned by .children()) as well as self. Typical use includes initializing the parameters of a model (see also torch.nn.init).
- Parameters
- fn (Module -> None) – function to be applied to each submodule
- Returns
- self 
- Return type
- Module
 - Example:

>>> @torch.no_grad()
>>> def init_weights(m):
>>>     print(m)
>>>     if type(m) == nn.Linear:
>>>         m.weight.fill_(1.0)
>>>         print(m.weight)
>>> net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))
>>> net.apply(init_weights)
Linear(in_features=2, out_features=2, bias=True)
Parameter containing:
tensor([[ 1.,  1.],
        [ 1.,  1.]])
Linear(in_features=2, out_features=2, bias=True)
Parameter containing:
tensor([[ 1.,  1.],
        [ 1.,  1.]])
Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
 - 
bfloat16()¶
- Casts all floating point parameters and buffers to bfloat16 datatype.
- Returns
- self 
- Return type
- Module
 
 - 
buffers(recurse=True)¶
- Returns an iterator over module buffers.
- Parameters
- recurse (bool) – if True, then yields buffers of this module and all submodules. Otherwise, yields only buffers that are direct members of this module. 
- Yields
- torch.Tensor – module buffer 
 - Example:

>>> for buf in model.buffers():
>>>     print(type(buf), buf.size())
<class 'torch.Tensor'> (20L,)
<class 'torch.Tensor'> (20L, 1L, 5L, 5L)
 - 
children()¶
- Returns an iterator over immediate children modules.
- Yields
- Module – a child module 
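 - Example (an illustrative sketch; the Sequential here is just a stand-in for any module with children):

>>> model = nn.Sequential(nn.Linear(2, 4), nn.ReLU())
>>> for child in model.children():
>>>     print(child)
Linear(in_features=2, out_features=4, bias=True)
ReLU()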
 
 - 
cuda(device=None)¶
- Moves all model parameters and buffers to the GPU.
- This also makes associated parameters and buffers different objects, so it should be called before constructing the optimizer if the module will live on the GPU while being optimized.
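 - Example (an illustrative sketch, not from the original reference; requires a CUDA device, and the optimizer choice is arbitrary):

>>> model = nn.Linear(2, 2).cuda()                            # move to the GPU first
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1)   # then construct the optimizer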
 - 
double()¶
- Casts all floating point parameters and buffers to double datatype.
- Returns
- self 
- Return type
- Module
 
 - 
eval()¶
- Sets the module in evaluation mode.
- This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.
- This is equivalent to self.train(False).
- Returns
- self 
- Return type
- Module
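 - Example (an illustrative inference sketch; the model and input are placeholders, not from the original docs):

>>> model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5)).eval()
>>> with torch.no_grad():
>>>     out = model(torch.randn(1, 4))   # dropout is inactive in eval mode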
 
 - 
float()¶
- Casts all floating point parameters and buffers to float datatype.
- Returns
- self 
- Return type
- Module
 
 - 
half()¶
- Casts all floating point parameters and buffers to half datatype.
- Returns
- self 
- Return type
- Module
 
 - 
load_state_dict(state_dict, strict=True)¶
- Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module’s state_dict() function.
- Parameters
- state_dict (dict) – a dict containing parameters and persistent buffers. 
- strict (bool, optional) – whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function. Default: True
 
- Returns
- missing_keys is a list of str containing the missing keys 
- unexpected_keys is a list of str containing the unexpected keys 
 
- Return type
- NamedTuple with missing_keys and unexpected_keys fields
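 - Example (an illustrative sketch, not from the original reference; "checkpoint.pt" is a hypothetical file created earlier with torch.save(model.state_dict(), "checkpoint.pt")):

>>> model = nn.Linear(2, 2)
>>> state = torch.load("checkpoint.pt")
>>> result = model.load_state_dict(state, strict=False)
>>> result.missing_keys, result.unexpected_keys   # both empty if every key matched
([], [])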
 
 - 
modules()¶
- Returns an iterator over all modules in the network.
- Yields
- Module – a module in the network 
 - Note
- Duplicate modules are returned only once. In the following example, l will be returned only once.
 - Example:

>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.modules()):
        print(idx, '->', m)
0 -> Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
1 -> Linear(in_features=2, out_features=2, bias=True)
 - 
named_buffers(prefix='', recurse=True)¶
- Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
- Parameters
- prefix (str) – prefix to prepend to all buffer names.
- recurse (bool) – if True, then yields buffers of this module and all submodules. Otherwise, yields only buffers that are direct members of this module.
- Yields
- (string, torch.Tensor) – Tuple containing the name and buffer 
 - Example:

>>> for name, buf in self.named_buffers():
>>>     if name in ['running_var']:
>>>         print(buf.size())
 - 
named_children()¶
- Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
- Yields
- (string, Module) – Tuple containing a name and child module 
 - Example:

>>> for name, module in model.named_children():
>>>     if name in ['conv4', 'conv5']:
>>>         print(module)
 - 
named_modules(memo=None, prefix='')¶
- Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
- Yields
- (string, Module) – Tuple of name and module 
 - Note
- Duplicate modules are returned only once. In the following example, l will be returned only once.
 - Example:

>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.named_modules()):
        print(idx, '->', m)
0 -> ('', Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
))
1 -> ('0', Linear(in_features=2, out_features=2, bias=True))
 - 
named_parameters(prefix='', recurse=True)¶
- Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
- Parameters
- prefix (str) – prefix to prepend to all parameter names.
- recurse (bool) – if True, then yields parameters of this module and all submodules. Otherwise, yields only parameters that are direct members of this module.
- Yields
- (string, Parameter) – Tuple containing the name and parameter 
 - Example:

>>> for name, param in self.named_parameters():
>>>     if name in ['bias']:
>>>         print(param.size())
 - 
parameters(recurse=True)¶
- Returns an iterator over module parameters.
- This is typically passed to an optimizer.
- Parameters
- recurse (bool) – if True, then yields parameters of this module and all submodules. Otherwise, yields only parameters that are direct members of this module. 
- Yields
- Parameter – module parameter 
 - Example:

>>> for param in model.parameters():
>>>     print(type(param), param.size())
<class 'torch.Tensor'> (20L,)
<class 'torch.Tensor'> (20L, 1L, 5L, 5L)
 - 
register_backward_hook(hook)¶
- Registers a backward hook on the module.
- This function is deprecated in favor of nn.Module.register_full_backward_hook() and the behavior of this function will change in future versions.
- Returns
- a handle that can be used to remove the added hook by calling handle.remove()
- Return type
- torch.utils.hooks.RemovableHandle
 
 - 
register_buffer(name, tensor, persistent=True)¶
- Adds a buffer to the module.
- This is typically used to register a buffer that should not be considered a model parameter. For example, BatchNorm’s running_mean is not a parameter, but is part of the module’s state. Buffers, by default, are persistent and will be saved alongside parameters. This behavior can be changed by setting persistent to False. The only difference between a persistent buffer and a non-persistent buffer is that the latter will not be a part of this module’s state_dict.
- Buffers can be accessed as attributes using given names.
- Parameters
- name (string) – name of the buffer. The buffer can be accessed from this module using the given name 
- tensor (Tensor) – buffer to be registered. 
- persistent (bool) – whether the buffer is part of this module’s state_dict.
 
 - Example:

>>> self.register_buffer('running_mean', torch.zeros(num_features))
 - 
register_forward_hook(hook)¶
- Registers a forward hook on the module.
- The hook will be called every time after forward() has computed an output. It should have the following signature:
- hook(module, input, output) -> None or modified output
- The input contains only the positional arguments given to the module; keyword arguments won’t be passed to the hooks, only to the forward. The hook can modify the output. It can also modify the input in-place, but that will have no effect on forward since this hook is called after forward() has run.
- Returns
- a handle that can be used to remove the added hook by calling handle.remove()
- Return type
- torch.utils.hooks.RemovableHandle
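 - Example (an illustrative sketch, not from the original reference; the hook simply prints the output shape and is then removed via the returned handle):

>>> def print_shape_hook(module, input, output):
>>>     print(output.shape)
>>> m = nn.Linear(2, 3)
>>> handle = m.register_forward_hook(print_shape_hook)
>>> _ = m(torch.randn(1, 2))
torch.Size([1, 3])
>>> handle.remove()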
 
 - 
register_forward_pre_hook(hook)¶
- Registers a forward pre-hook on the module.
- The hook will be called every time before forward() is invoked. It should have the following signature:
- hook(module, input) -> None or modified input
- The input contains only the positional arguments given to the module; keyword arguments won’t be passed to the hooks, only to the forward. The hook can modify the input. The user can return either a tuple or a single modified value from the hook. A single returned value will be wrapped into a tuple (unless it is already a tuple).
- Returns
- a handle that can be used to remove the added hook by calling handle.remove()
- Return type
- torch.utils.hooks.RemovableHandle
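 - Example (an illustrative sketch, not from the original reference; the pre-hook scales the input before forward runs):

>>> def double_input_hook(module, input):
>>>     return (input[0] * 2,)
>>> m = nn.Linear(2, 2)
>>> handle = m.register_forward_pre_hook(double_input_hook)
>>> out = m(torch.ones(1, 2))   # forward actually receives a tensor of twos
>>> handle.remove()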
 
 - 
register_full_backward_hook(hook)¶
- Registers a backward hook on the module.
- The hook will be called every time the gradients with respect to module inputs are computed. The hook should have the following signature:
- hook(module, grad_input, grad_output) -> tuple(Tensor) or None
- The grad_input and grad_output are tuples that contain the gradients with respect to the inputs and outputs respectively. The hook should not modify its arguments, but it can optionally return a new gradient with respect to the input that will be used in place of grad_input in subsequent computations. grad_input will only correspond to the inputs given as positional arguments and all kwarg arguments are ignored. Entries in grad_input and grad_output will be None for all non-Tensor arguments.
 - Warning
- Modifying inputs or outputs in-place is not allowed when using backward hooks and will raise an error.
- Returns
- a handle that can be used to remove the added hook by calling handle.remove()
- Return type
- torch.utils.hooks.RemovableHandle
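 - Example (an illustrative sketch, not from the original reference; the hook just reports the gradient shapes flowing out of the module):

>>> def grad_hook(module, grad_input, grad_output):
>>>     print([g.shape for g in grad_output if g is not None])
>>> m = nn.Linear(2, 2)
>>> handle = m.register_full_backward_hook(grad_hook)
>>> m(torch.randn(1, 2)).sum().backward()
[torch.Size([1, 2])]
>>> handle.remove()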
 
 - 
register_parameter(name, param)¶
- Adds a parameter to the module.
- The parameter can be accessed as an attribute using the given name.
- Parameters
- name (string) – name of the parameter. The parameter can be accessed from this module using the given name 
- param (Parameter) – parameter to be added to the module. 
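 - Example (an illustrative sketch inside a custom Module subclass; the class and the name 'scale' are hypothetical):

>>> class Scaler(nn.Module):
>>>     def __init__(self):
>>>         super().__init__()
>>>         self.register_parameter('scale', nn.Parameter(torch.ones(1)))
>>>     def forward(self, x):
>>>         return x * self.scale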
 
 
 - 
requires_grad_(requires_grad=True)¶
- Changes whether autograd should record operations on parameters in this module.
- This method sets the parameters’ requires_grad attributes in-place.
- This method is helpful for freezing part of the module for finetuning or for training parts of a model individually (e.g., GAN training).
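 - Example (an illustrative freezing sketch, not from the original reference):

>>> model = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 1))
>>> model[0].requires_grad_(False)   # freeze the first layer
Linear(in_features=2, out_features=2, bias=True)
>>> [p.requires_grad for p in model.parameters()]
[False, False, True, True]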
 - 
state_dict(destination=None, prefix='', keep_vars=False)¶
- Returns a dictionary containing a whole state of the module.
- Both parameters and persistent buffers (e.g. running averages) are included. Keys are corresponding parameter and buffer names.
- Returns
- a dictionary containing a whole state of the module 
- Return type
- dict
 - Example:

>>> module.state_dict().keys()
['bias', 'weight']
 - 
to(*args, **kwargs)¶
- Moves and/or casts the parameters and buffers.
- This can be called as
 - 
to(device=None, dtype=None, non_blocking=False)¶
 - 
to(dtype, non_blocking=False)¶
 - 
to(tensor, non_blocking=False)¶
 - 
to(memory_format=torch.channels_last)¶
 - Its signature is similar to torch.Tensor.to(), but only accepts floating point or complex dtypes. In addition, this method will only cast the floating point or complex parameters and buffers to dtype (if given). The integral parameters and buffers will be moved to device, if that is given, but with dtypes unchanged. When non_blocking is set, it tries to convert/move asynchronously with respect to the host if possible, e.g., moving CPU Tensors with pinned memory to CUDA devices.
- See below for examples.
 - Note
- This method modifies the module in-place.
- Parameters
- device (torch.device) – the desired device of the parameters and buffers in this module
- dtype (torch.dtype) – the desired floating point or complex dtype of the parameters and buffers in this module
- tensor (torch.Tensor) – Tensor whose dtype and device are the desired dtype and device for all parameters and buffers in this module
- memory_format (torch.memory_format) – the desired memory format for 4D parameters and buffers in this module (keyword only argument)
 
- Returns
- self 
- Return type
- Module
 - Examples:

>>> linear = nn.Linear(2, 2)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]])
>>> linear.to(torch.double)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]], dtype=torch.float64)
>>> gpu1 = torch.device("cuda:1")
>>> linear.to(gpu1, dtype=torch.half, non_blocking=True)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16, device='cuda:1')
>>> cpu = torch.device("cpu")
>>> linear.to(cpu)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16)

>>> linear = nn.Linear(2, 2, bias=None).to(torch.cdouble)
>>> linear.weight
Parameter containing:
tensor([[ 0.3741+0.j,  0.2382+0.j],
        [ 0.5593+0.j, -0.4443+0.j]], dtype=torch.complex128)
>>> linear(torch.ones(3, 2, dtype=torch.cdouble))
tensor([[0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j]], dtype=torch.complex128)
 - 
train(mode=True)¶
- Sets the module in training mode.
- This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.
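 - Example (an illustrative sketch mirroring eval() above; the model is a placeholder):

>>> model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
>>> model.train()   # re-enable dropout before resuming training
Sequential(
  (0): Linear(in_features=4, out_features=4, bias=True)
  (1): Dropout(p=0.5, inplace=False)
)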
 - 
type(dst_type)¶
- Casts all parameters and buffers to dst_type.
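 - Example (an illustrative sketch; for a module with only floating point parameters this has the same effect as double()):

>>> m = nn.Linear(2, 2)
>>> m.type(torch.float64)
Linear(in_features=2, out_features=2, bias=True)
>>> m.weight.dtype
torch.float64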
 - 
xpu(device=None)¶
- Moves all model parameters and buffers to the XPU.
- This also makes associated parameters and buffers different objects, so it should be called before constructing the optimizer if the module will live on the XPU while being optimized.
 - 
zero_grad(set_to_none=False)¶
- Sets gradients of all model parameters to zero. See the similar function under torch.optim.Optimizer for more context.
- Parameters
- set_to_none (bool) – instead of setting to zero, set the grads to None. See torch.optim.Optimizer.zero_grad() for details.
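 - Example (an illustrative sketch of clearing gradients between steps; the model, input, and loss are placeholders):

>>> model = nn.Linear(2, 1)
>>> loss = model(torch.randn(4, 2)).sum()
>>> loss.backward()
>>> model.zero_grad(set_to_none=True)   # clear grads before the next backward pass
>>> model.weight.grad is None
True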