BitsandbytesPrecision
- class lightning.fabric.plugins.precision.BitsandbytesPrecision(mode, dtype=None, ignore_modules=None)
Bases: Precision
Plugin for quantizing weights with bitsandbytes.
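As a rough illustration of how the plugin might be wired up, the sketch below builds a `Fabric` object with this plugin and creates an 8-bit optimizer by hand, since the plugin does not replace the optimizer itself. The helper names `build_quantized_fabric` and `make_8bit_optimizer` are hypothetical, the choice of `torch.bfloat16` for `dtype` is an assumption, and running it assumes `torch`, `lightning`, and `bitsandbytes` are installed (bitsandbytes typically also needs a supported GPU).

```python
def build_quantized_fabric():
    """Hypothetical helper: create a Fabric instance that uses the plugin."""
    # Imports live inside the function so this file can be imported even
    # where torch, lightning, or bitsandbytes are not installed.
    import torch
    from lightning.fabric import Fabric
    from lightning.fabric.plugins import BitsandbytesPrecision

    precision = BitsandbytesPrecision(
        mode="nf4-dq",               # one of the documented mode literals
        dtype=torch.bfloat16,        # assumption: a torch dtype is accepted here
        ignore_modules={"lm_head"},  # keep the output head unquantized
    )
    return Fabric(plugins=precision)


def make_8bit_optimizer(model, lr=1e-4):
    """Hypothetical helper: opt in to an 8-bit optimizer explicitly."""
    import bitsandbytes as bnb

    # The plugin leaves the optimizer alone, so it is constructed manually.
    return bnb.optim.Adam8bit(model.parameters(), lr=lr)
```

Neither helper is part of Lightning; they only group the calls so the two opt-in steps (quantized weights, 8-bit optimizer) are visible side by side.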
Warning
This is an experimental feature.
Note
The optimizer is not automatically replaced with
bitsandbytes.optim.Adam8bit or equivalent 8-bit optimizers.
- Parameters:
mode (Literal['nf4','nf4-dq','fp4','fp4-dq','int8','int8-training']) – The quantization mode to use.
ignore_modules (Optional[set[str]]) – The submodules whose Linear layers should not be replaced, for example {"lm_head"}. This might be desirable for numerical stability. Each string is checked as a prefix, so a value like "transformer.blocks" will ignore all linear layers in all of the transformer blocks.
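The prefix rule for `ignore_modules` can be pictured with a small pure-Python sketch. It only mimics the matching described above on a list of dotted layer names; the function name `linear_layers_to_replace` and the example layer names are made up for illustration and are not part of Lightning.

```python
def linear_layers_to_replace(layer_names, ignore_modules=None):
    """Return the layer names that do not start with any ignored prefix.

    Illustrative only: it models the documented prefix check on plain
    strings and does not touch any real modules.
    """
    prefixes = tuple(ignore_modules or ())
    # str.startswith accepts a tuple of prefixes; an empty tuple matches nothing.
    return [name for name in layer_names if not name.startswith(prefixes)]


# Hypothetical fully qualified names of Linear layers in a model.
names = [
    "transformer.blocks.0.attn.proj",
    "transformer.blocks.1.mlp.fc",
    "transformer.embed_proj",
    "lm_head",
]

only_head_ignored = linear_layers_to_replace(names, {"lm_head"})
blocks_ignored = linear_layers_to_replace(names, {"transformer.blocks"})
nothing_ignored = linear_layers_to_replace(names)
```

With `{"transformer.blocks"}`, every layer under any block is skipped while `transformer.embed_proj` is still replaced, because the string is compared against the start of the name and not against a whole path component.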
- convert_input(data)
Convert model inputs (forward) to the floating point precision type of this plugin.
This is a no-op in the base precision plugin, since we assume the data already has the desired type (default is torch.float32).
- Return type:
- convert_module(module)
Convert the module parameters to the precision type this plugin handles.
This is optional and depends on the precision limitations during optimization.
- Return type:
- convert_output(data)
Convert outputs to the floating point precision type expected after the model’s forward pass.
This is a no-op in the base precision plugin, since we assume the data already has the desired type (default is torch.float32).
- Return type:
- forward_context()
A context manager for managing the model's forward/training_step/evaluation_step/predict_step.
- Return type:
- module_init_context()
Instantiate module parameters or tensors in the precision type this plugin handles.
This is optional and depends on the precision limitations during optimization.
- Return type: