Attention Processor
Translator: 片刻小哥哥
Project page: https://huggingface.apachecn.org/docs/diffusers/api/attnprocessor
Original page: https://huggingface.co/docs/diffusers/api/attnprocessor
An attention processor is a class for applying different types of attention mechanisms.
AttnProcessor
class diffusers.models.attention_processor.AttnProcessor
[source](https://github.com/huggingface/diffusers/blob/v0.23.0/src/diffusers/models/attention_processor.py#L694)
( )
Default processor for performing attention-related computations.
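
The snippet below is a minimal sketch of explicitly switching a model back to this default processor. It assumes a Stable Diffusion UNet loaded from the `runwayml/stable-diffusion-v1-5` checkpoint, which is only an example choice.

```python
import torch
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import AttnProcessor

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
)

# Passing a single processor applies it to every attention layer in the model.
unet.set_attn_processor(AttnProcessor())
```
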
AttnProcessor2_0
class diffusers.models.attention_processor.AttnProcessor2_0
[source](https://github.com/huggingface/diffusers/blob/v0.23.0/src/diffusers/models/attention_processor.py#L1166)
( )
Processor for implementing scaled dot-product attention (enabled by default if you’re using PyTorch 2.0).
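
A minimal sketch of opting into this processor explicitly, guarded by a check for `torch.nn.functional.scaled_dot_product_attention`, which only exists in PyTorch 2.0 and later. The checkpoint name is illustrative.

```python
import torch
import torch.nn.functional as F
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import AttnProcessor2_0

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
)

# AttnProcessor2_0 relies on torch.nn.functional.scaled_dot_product_attention,
# so only install it when the running PyTorch version provides that function.
if hasattr(F, "scaled_dot_product_attention"):
    unet.set_attn_processor(AttnProcessor2_0())
```
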
LoRAAttnProcessor
class diffusers.models.attention_processor.LoRAAttnProcessor
[source](https://github.com/huggingface/diffusers/blob/v0.23.0/src/diffusers/models/attention_processor.py#L1693)
( hidden_size: int, cross_attention_dim: typing.Optional[int] = None, rank: int = 4, network_alpha: typing.Optional[int] = None, **kwargs )
Parameters
- hidden_size (int, optional) — The hidden size of the attention layer.
- cross_attention_dim (int, optional) — The number of channels in the encoder_hidden_states.
- rank (int, defaults to 4) — The dimension of the LoRA update matrices.
- network_alpha (int, optional) — Equivalent to alpha, but its usage is specific to Kohya (A1111) style LoRAs.
- kwargs (dict) — Additional keyword arguments to pass to the LoRALinearLayer layers.
Processor for implementing the LoRA attention mechanism.
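
A minimal training-style sketch of building one LoRA processor per attention layer. It mirrors the way the official LoRA training scripts derive hidden_size and cross_attention_dim from each processor's name; the checkpoint is only an example.

```python
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import LoRAAttnProcessor

unet = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet")

lora_attn_procs = {}
for name in unet.attn_processors.keys():
    # attn1 is self-attention, so it has no cross-attention dimension.
    cross_attention_dim = None if name.endswith("attn1.processor") else unet.config.cross_attention_dim
    if name.startswith("mid_block"):
        hidden_size = unet.config.block_out_channels[-1]
    elif name.startswith("up_blocks"):
        block_id = int(name[len("up_blocks.")])
        hidden_size = list(reversed(unet.config.block_out_channels))[block_id]
    else:  # down_blocks
        block_id = int(name[len("down_blocks.")])
        hidden_size = unet.config.block_out_channels[block_id]

    lora_attn_procs[name] = LoRAAttnProcessor(
        hidden_size=hidden_size,
        cross_attention_dim=cross_attention_dim,
        rank=4,
    )

unet.set_attn_processor(lora_attn_procs)
```
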
LoRAAttnProcessor2_0
class diffusers.models.attention_processor.LoRAAttnProcessor2_0
[source](https://github.com/huggingface/diffusers/blob/v0.23.0/src/diffusers/models/attention_processor.py#L1765)
( hidden_size: int, cross_attention_dim: typing.Optional[int] = None, rank: int = 4, network_alpha: typing.Optional[int] = None, **kwargs )
Parameters
- hidden_size (int) — The hidden size of the attention layer.
- cross_attention_dim (int, optional) — The number of channels in the encoder_hidden_states.
- rank (int, defaults to 4) — The dimension of the LoRA update matrices.
- network_alpha (int, optional) — Equivalent to alpha, but its usage is specific to Kohya (A1111) style LoRAs.
- kwargs (dict) — Additional keyword arguments to pass to the LoRALinearLayer layers.
Processor for implementing the LoRA attention mechanism using PyTorch 2.0’s memory-efficient scaled dot-product attention.
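
The constructor arguments match LoRAAttnProcessor; only the attention kernel differs. A small sketch of picking the 2.0 variant when scaled dot-product attention is available (the sizes shown are illustrative Stable Diffusion v1.5 values, not required ones):

```python
import torch.nn.functional as F
from diffusers.models.attention_processor import LoRAAttnProcessor, LoRAAttnProcessor2_0

# Use the PyTorch 2.0 variant when scaled_dot_product_attention exists.
lora_cls = LoRAAttnProcessor2_0 if hasattr(F, "scaled_dot_product_attention") else LoRAAttnProcessor

# Example sizes for one cross-attention layer of Stable Diffusion v1.5;
# derive them per layer as in the LoRAAttnProcessor example above.
processor = lora_cls(hidden_size=320, cross_attention_dim=768, rank=4)
```
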
CustomDiffusionAttnProcessor
class diffusers.models.attention_processor.CustomDiffusionAttnProcessor
[source](https://github.com/huggingface/diffusers/blob/v0.23.0/src/diffusers/models/attention_processor.py#L763)
( train_kv: bool = True, train_q_out: bool = True, hidden_size: typing.Optional[int] = None, cross_attention_dim: typing.Optional[int] = None, out_bias: bool = True, dropout: float = 0.0 )
Parameters
- train_kv (bool, defaults to True) — Whether to newly train the key and value matrices corresponding to the text features.
- train_q_out (bool, defaults to True) — Whether to newly train query matrices corresponding to the latent image features.
- hidden_size (int, optional, defaults to None) — The hidden size of the attention layer.
- cross_attention_dim (int, optional, defaults to None) — The number of channels in the encoder_hidden_states.
- out_bias (bool, defaults to True) — Whether to include the bias parameter in train_q_out.
- dropout (float, optional, defaults to 0.0) — The dropout probability to use.
Processor for implementing attention for the Custom Diffusion method.
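
A minimal sketch, loosely following the Custom Diffusion training setup, that trains new key/value projections only for the cross-attention layers. The checkpoint and the per-layer size derivation are assumptions carried over from the LoRA example above.

```python
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import CustomDiffusionAttnProcessor

unet = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet")


def hidden_size_for(name: str) -> int:
    # Derive the attention layer's hidden size from its position in the UNet,
    # the same way as in the LoRAAttnProcessor example above.
    if name.startswith("mid_block"):
        return unet.config.block_out_channels[-1]
    if name.startswith("up_blocks"):
        return list(reversed(unet.config.block_out_channels))[int(name[len("up_blocks.")])]
    return unet.config.block_out_channels[int(name[len("down_blocks.")])]


custom_diffusion_attn_procs = {}
for name in unet.attn_processors.keys():
    cross_attention_dim = None if name.endswith("attn1.processor") else unet.config.cross_attention_dim
    custom_diffusion_attn_procs[name] = CustomDiffusionAttnProcessor(
        # Only cross-attention layers get newly trained key/value projections.
        train_kv=cross_attention_dim is not None,
        train_q_out=False,
        hidden_size=hidden_size_for(name),
        cross_attention_dim=cross_attention_dim,
    )

unet.set_attn_processor(custom_diffusion_attn_procs)
```
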
CustomDiffusionAttnProcessor2_0
class diffusers.models.attention_processor.CustomDiffusionAttnProcessor2_0
[source](https://github.com/huggingface/diffusers/blob/v0.23.0/src/diffusers/models/attention_processor.py#L1370)
( train_kv: bool = True, train_q_out: bool = True, hidden_size: typing.Optional[int] = None, cross_attention_dim: typing.Optional[int] = None, out_bias: bool = True, dropout: float = 0.0 )
Parameters
- train_kv (bool, defaults to True) — Whether to newly train the key and value matrices corresponding to the text features.
- train_q_out (bool, defaults to True) — Whether to newly train query matrices corresponding to the latent image features.
- hidden_size (int, optional, defaults to None) — The hidden size of the attention layer.
- cross_attention_dim (int, optional, defaults to None) — The number of channels in the encoder_hidden_states.
- out_bias (bool, defaults to True) — Whether to include the bias parameter in train_q_out.
- dropout (float, optional, defaults to 0.0) — The dropout probability to use.
Processor for implementing attention for the Custom Diffusion method using PyTorch 2.0’s memory-efficient scaled dot-product attention.
AttnAddedKVProcessor
class diffusers.models.attention_processor.AttnAddedKVProcessor
[source](https://github.com/huggingface/diffusers/blob/v0.23.0/src/diffusers/models/attention_processor.py#L867)
( )
Processor for performing attention-related computations with extra learnable key and value matrices for the text encoder.
AttnAddedKVProcessor2_0
class diffusers.models.attention_processor.AttnAddedKVProcessor2_0
[source](https://github.com/huggingface/diffusers/blob/v0.23.0/src/diffusers/models/attention_processor.py#L931)
( )
Processor for performing scaled dot-product attention (enabled by default if you’re using PyTorch 2.0), with extra learnable key and value matrices for the text encoder.
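
The added-KV processors are used by UNets whose attention blocks learn extra key/value projections for the text encoder states (for example UnCLIP-style models). A minimal sketch that upgrades only those layers to the PyTorch 2.0 variant, assuming `unet` is such a model that has already been loaded:

```python
from diffusers.models.attention_processor import AttnAddedKVProcessor, AttnAddedKVProcessor2_0

# `unet` is assumed to be an already-loaded UNet whose attention blocks use
# extra key/value projections (an UnCLIP-style model, for instance).
procs = {}
for name, proc in unet.attn_processors.items():
    # Only swap layers that currently use the added-KV processor; leave the rest untouched.
    procs[name] = AttnAddedKVProcessor2_0() if isinstance(proc, AttnAddedKVProcessor) else proc

unet.set_attn_processor(procs)
```
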
LoRAAttnAddedKVProcessor
class diffusers.models.attention_processor.LoRAAttnAddedKVProcessor
[source](https://github.com/huggingface/diffusers/blob/v0.23.0/src/diffusers/models/attention_processor.py#L1919)
( hidden_size: int, cross_attention_dim: typing.Optional[int] = None, rank: int = 4, network_alpha: typing.Optional[int] = None )
Parameters
- hidden_size (int, optional) — The hidden size of the attention layer.
- cross_attention_dim (int, optional, defaults to None) — The number of channels in the encoder_hidden_states.
- rank (int, defaults to 4) — The dimension of the LoRA update matrices.
- network_alpha (int, optional) — Equivalent to alpha, but its usage is specific to Kohya (A1111) style LoRAs.
Processor for implementing the LoRA attention mechanism with extra learnable key and value matrices for the text encoder.
XFormersAttnProcessor
class diffusers.models.attention_processor.XFormersAttnProcessor
[source](https://github.com/huggingface/diffusers/blob/v0.23.0/src/diffusers/models/attention_processor.py#L1075)
( attention_op: typing.Optional[typing.Callable] = None )
Parameters
- attention_op (Callable, optional, defaults to None) — The base operator to use as the attention operator. It is recommended to set to None, and allow xFormers to choose the best operator.
Processor for implementing memory efficient attention using xFormers.
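
In practice this processor is usually installed through the pipeline helper rather than constructed by hand. A short sketch (the checkpoint is an example, and the xformers package must be installed):

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Swaps in the xFormers attention processor for the pipeline's attention layers.
# Leaving attention_op=None lets xFormers choose the best operator, as recommended above.
pipe.enable_xformers_memory_efficient_attention(attention_op=None)
```
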
LoRAXFormersAttnProcessor
class diffusers.models.attention_processor.LoRAXFormersAttnProcessor
[source](https://github.com/huggingface/diffusers/blob/v0.23.0/src/diffusers/models/attention_processor.py#L1840)
( hidden_size: int, cross_attention_dim: int, rank: int = 4, attention_op: typing.Optional[typing.Callable] = None, network_alpha: typing.Optional[int] = None, **kwargs )
Parameters
- hidden_size (int, optional) — The hidden size of the attention layer.
- cross_attention_dim (int, optional) — The number of channels in the encoder_hidden_states.
- rank (int, defaults to 4) — The dimension of the LoRA update matrices.
- attention_op (Callable, optional, defaults to None) — The base operator to use as the attention operator. It is recommended to set to None, and allow xFormers to choose the best operator.
- network_alpha (int, optional) — Equivalent to alpha, but its usage is specific to Kohya (A1111) style LoRAs.
- kwargs (dict) — Additional keyword arguments to pass to the LoRALinearLayer layers.
Processor for implementing the LoRA attention mechanism with memory efficient attention using xFormers.
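
A minimal sketch of constructing this processor directly. The constructor mirrors LoRAAttnProcessor with an additional attention_op argument; the sizes shown are illustrative Stable Diffusion v1.5 values and should be derived per layer as in the LoRAAttnProcessor example above.

```python
from diffusers.models.attention_processor import LoRAXFormersAttnProcessor

processor = LoRAXFormersAttnProcessor(
    hidden_size=320,          # example value for one SD v1.5 attention layer
    cross_attention_dim=768,  # example value for SD v1.5 cross-attention
    rank=4,
    attention_op=None,        # let xFormers choose the best operator
)
```
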
CustomDiffusionXFormersAttnProcessor
class diffusers.models.attention_processor.CustomDiffusionXFormersAttnProcessor
[source](https://github.com/huggingface/diffusers/blob/v0.23.0/src/diffusers/models/attention_processor.py#L1254)
( train_kv: bool = True, train_q_out: bool = False, hidden_size: typing.Optional[int] = None, cross_attention_dim: typing.Optional[int] = None, out_bias: bool = True, dropout: float = 0.0, attention_op: typing.Optional[typing.Callable] = None )
Parameters
- train_kv (bool, defaults to True) — Whether to newly train the key and value matrices corresponding to the text features.
- train_q_out (bool, defaults to False) — Whether to newly train query matrices corresponding to the latent image features.
- hidden_size (int, optional, defaults to None) — The hidden size of the attention layer.
- cross_attention_dim (int, optional, defaults to None) — The number of channels in the encoder_hidden_states.
- out_bias (bool, defaults to True) — Whether to include the bias parameter in train_q_out.
- dropout (float, optional, defaults to 0.0) — The dropout probability to use.
- attention_op (Callable, optional, defaults to None) — The base operator to use as the attention operator. It is recommended to set to None, and allow xFormers to choose the best operator.
Processor for implementing memory efficient attention using xFormers for the Custom Diffusion method.
SlicedAttnProcessor
class diffusers.models.attention_processor.SlicedAttnProcessor
[source](https://github.com/huggingface/diffusers/blob/v0.23.0/src/diffusers/models/attention_processor.py#L1484)
( slice_size: int )
Parameters
- slice_size (int, optional) — The number of steps to compute attention. Uses as many slices as attention_head_dim // slice_size, and attention_head_dim must be a multiple of slice_size.
Processor for implementing sliced attention.
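
Sliced attention is normally enabled through the pipeline helper rather than by constructing the processor yourself. A short sketch (the checkpoint is only an example):

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# "auto" computes attention in a couple of steps; passing an integer instead
# controls the slice size directly and installs SlicedAttnProcessor under the hood.
pipe.enable_attention_slicing("auto")
```
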
SlicedAttnAddedKVProcessor
class diffusers.models.attention_processor.SlicedAttnAddedKVProcessor
[source](https://github.com/huggingface/diffusers/blob/v0.23.0/src/diffusers/models/attention_processor.py#L1571)
( slice_size )
Parameters
- slice_size (int, optional) — The number of steps to compute attention. Uses as many slices as attention_head_dim // slice_size, and attention_head_dim must be a multiple of slice_size.
Processor for implementing sliced attention with extra learnable key and value matrices for the text encoder.
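
When attention slicing is enabled on a model whose attention blocks have added key/value projections, this variant is selected instead of SlicedAttnProcessor. A minimal sketch of installing it by hand, assuming `unet` is such a model that has already been loaded; the slice size shown is only an example and must evenly divide the attention head dimension.

```python
from diffusers.models.attention_processor import AttnAddedKVProcessor, SlicedAttnAddedKVProcessor

# `unet` is assumed to be an already-loaded UNet whose attention blocks use
# extra key/value projections. Smaller slice sizes lower peak memory at the
# cost of more sequential attention steps.
procs = {
    name: SlicedAttnAddedKVProcessor(slice_size=2) if isinstance(proc, AttnAddedKVProcessor) else proc
    for name, proc in unet.attn_processors.items()
}
unet.set_attn_processor(procs)
```
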
