IBM Analog Hardware Acceleration Kit 1.1.0
[1.1.0] - 2026/02/03
Added
- Add newly uploaded resources for CPU-only wheels (#739)
- Add a new drift compensation mechanism which uses an ideal reference readout. In the default global drift compensation mechanism, all non-idealities (as set by the corresponding `rpu_config`) are modeled, potentially resulting in sub-optimal drift compensation scales in some scenarios, e.g., where the output noise is sufficiently large (see the sketch after this list)
- New example (`examples/36_gpt2_on_wikitext.py`) to run DistilGPT2 on aihwkit with text prediction (#754)
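A minimal sketch of how a drift compensation mechanism is attached to an inference config, using the long-standing default `GlobalDriftCompensation`; this entry does not name the class of the new ideal-reference mechanism, so only the default slot is shown:

```python
# Minimal sketch: selecting the drift compensation mechanism on an inference
# config. GlobalDriftCompensation is the long-standing default; the new
# ideal-reference mechanism from this release plugs into the same slot.
from aihwkit.simulator.configs import InferenceRPUConfig
from aihwkit.inference import PCMLikeNoiseModel, GlobalDriftCompensation

rpu_config = InferenceRPUConfig()
rpu_config.noise_model = PCMLikeNoiseModel(g_max=25.0)
rpu_config.drift_compensation = GlobalDriftCompensation()
```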
Removed/Deprecated
- Function `convert_to_analog_mapped` in `src/aihwkit/nn/conversion.py` deprecated and removed
Changed
- Replace legacy release-build workflow with the updated build process (#744)
- Point the online demo link to the correct destination (#743)
- Update bundled notebook wheel to the GPU-enabled 1.0.0 release (#741)
- Modifications, cleanup, and improvements to the time-stepped circuit IR-drop example and model implementation (#753)
IBM Analog Hardware Acceleration Kit 1.0.0
[1.0.0] - 2025/05/19
Added
- Add new Weight Programming Optimization feature (#703)
- Add floating-point preset for inference (#705) (see the sketch after this list)
- Add a new tutorial notebook on setting up analog device non-idealities (#682)
- Add new simulation for Analog Filamentary Conductive-Metal-Oxide (CMO)/HfOx ReRAM device noise models for inference (#702)
- Add new quantization library (#719)
- Add a new half-precision training section to the "Using the Simulator" documentation and a related example (#678)
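For reference, a sketch of an idealized floating-point analog layer using the long-standing `FloatingPointRPUConfig`; the inference preset added in #705 is not named in this entry, so this only illustrates the general config pattern:

```python
# Sketch: an idealized floating-point analog layer (no analog non-idealities).
from aihwkit.nn import AnalogLinear
from aihwkit.simulator.configs import FloatingPointRPUConfig

layer = AnalogLinear(4, 2, rpu_config=FloatingPointRPUConfig())
```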
Fixed
- Fix Hardware-Aware training tutorial notebooks (#700)
- Fix Post-Training Input Range Calibration notebook (#716)
Changed
- Change License from Apache 2.0 to MIT (#696)
- Update examples index for better navigation (#704)
- Update tutorial notebooks to run with latest builds of aihwkit (#713)
- Update and enhance API documentation on `IOParameters` for `is_perfect` (#718)
- Migrate from Travis CI/CD integration to GitHub Actions (#720)
IBM Analog Hardware Acceleration Kit 0.9.2
[0.9.2] - 2024/09/18
Added
- Added new Hermes noise model and related notebooks (#685)
- Added new conductance converters (#685) (see the sketch after this list)
- Make Conv layers also compatible with non-batched inputs (#685)
- Added per column drift compensation (#685)
- Added custom drifts (#685)
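A sketch of how a conductance converter is wired into a noise model; `SinglePairConductanceConverter` predates this release, and the new converters from #685 (not named in this entry) plug into the same `g_converter` slot:

```python
# Sketch: plugging a conductance converter into a PCM-like noise model.
from aihwkit.inference import PCMLikeNoiseModel
from aihwkit.inference.converter.conductance import SinglePairConductanceConverter

noise_model = PCMLikeNoiseModel(
    g_max=25.0,
    g_converter=SinglePairConductanceConverter(g_max=25.0),
)
```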
Changed
- Update requirements-examples.txt (#685)
IBM Analog Hardware Acceleration Kit 0.9.1
[0.9.1] - 2024/05/16
Added
- Added column-wise scaling logic to fusion import/export to improve accuracy (#652)
- Added a new example that demonstrates how to import and perform inference using a model which has been trained in a hardware-aware fashion using an external library (#648)
- Added a new `WeightModifierType.DISCRETIZE_PER_CHANNEL` type and a test case to validate its correctness against manual quantization in PyTorch (#618) (see the sketch after this list)
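A sketch of selecting the new per-channel discretization modifier for hardware-aware training; the `res` value is illustrative and the import path may differ slightly between versions:

```python
# Sketch: per-channel weight discretization during hardware-aware training.
from aihwkit.simulator.configs import InferenceRPUConfig, WeightModifierType

rpu_config = InferenceRPUConfig()
rpu_config.modifier.type = WeightModifierType.DISCRETIZE_PER_CHANNEL
rpu_config.modifier.res = 1 / 128  # discretization resolution (illustrative value)
```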
Fixed
- Fix sub-optimal mapping of conductances to weights for fusion by regressing weights per column (#653)
- Documentation correction: use `ADDITIVE_CONSTANT` instead of `ADD_NORMAL` in `WeightNoiseType` (#630)
- Fix continuing training from a checkpoint when using the torch tile (#626)
- Fix the support of different dtypes for the torch model (#625)
- Fix the fall-through to the default error message when using drop connect (#624)
- Update `analog_fusion` notebook (#611)
IBM Analog Hardware Acceleration Kit 0.9.0
[0.9.0] - 2024/01/26
Added
- On-the-fly change of some RPUConfig fields (#539)
- Fusion chip CSV file model weights exporter functionality (#538)
- Experimental support for RPU data types (#563) (see the sketch after this list)
- Optional AIHWKIT C++ extension module (#563)
- Variable mantissa / exponent tensor conversion operator (#563)
- To-digital feature for analog layers (#563)
- New PCM_NOISE type for hardware-aware training for inference (#563)
- Transfer compounds using torch implementation (TorchTransferTile) (#567)
- Weight programming error plotting utility (#572)
- Add optimizer checkpoint in example 20 (#573)
- Inference tile with time-dependent IR-drop (#587)
- Linear algebra module (#588)
- New Jupyter notebook for Fusion chip access (#601)
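A hedged sketch of the experimental data-type support via the new `runtime` field of `RPUConfig`; the enum location and member name are assumptions based on recent versions:

```python
# Sketch: selecting an experimental RPU data type (assumed enum location and
# member name; check your installed version).
from aihwkit.nn import AnalogLinear
from aihwkit.simulator.configs import SingleRPUConfig
from aihwkit.simulator.parameters.enums import RPUDataType

rpu_config = SingleRPUConfig()
rpu_config.runtime.data_type = RPUDataType.HALF  # assumption: HALF member exists
layer = AnalogLinear(4, 2, rpu_config=rpu_config)
```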
Fixed
- Repeated calls of `cuda()` reset the weights for `InferenceTile` (#540)
- Custom tile bugfixes (#563)
- Bug-fixes for specialized learning algorithms (#563)
- Bug-fix for data-parallel hardware-aware training for inference (#569)
- Fix docker build stubgen (#581)
- Fix readthedoc builds (#586)
- Fix the backward of the input ranges in the torch tile (#606)
Changed
- Parameter structure changed into separate files to reduce file sizes (#563)
- RPUConfig has a new runtime field and inherits from additional base classes (#563)
- AnalogWrapper now directly adds module classes to subclasses (#563)
- RNN linear layers more customizable (#563)
- Parameters for specialized learning algorithms changed somewhat (#563)
- RNN modules inherit from Module or AnalogContainerBase instead of AnalogSequential (#563)
- Adjustment of parameter to bindings for various number formats (#563)
- Documentation updates and fixes (#562, #564, #570, #575, #576, #580, #585, #586)
- Updated installation instructions in Readthedoc (#594)
IBM Analog Hardware Acceleration Kit 0.8.0
[0.8.0] - 2023/07/14
Added
- Added new tutorial notebooks to cover the concepts of training, hardware-aware training, post-training calibration, and extending aihwkit functionality (#518, #523, #526)
- Calibration of input ranges for inference (#512) (see the sketch after this list)
- New analog in-memory training algorithms: Chopped Tiki-taka II (#512)
- New analog in-memory training algorithms: AGAD (#512)
- New training presets: `ReRamArrayOMPresetDevice`, `ReRamArrayHfO2PresetDevice`, `ChoppedTTv2*`, `AGAD*` (#512)
- New correlation detection example for comparing specialized analog SGD algorithms (#512)
- Simplified `build_rpu_config` script for generating `RPUConfigs` for analog in-memory SGD (#512)
- `CustomTile` for customization of in-memory training algorithms (#512)
- Pulse counters for pulsed analog training (#512)
- `TorchInferenceTile` for a fully torch-based analog tile for inference (not using the C++ RPUCuda engine), supporting a subset of MVM nonidealities (#512)
- New inference preset `StandardHWATrainingPreset` (#512)
- New inference noise model `ReRamWan2022NoiseModel` (#512)
- Improved HWA-training for inference featuring input and output range learning and more (#512)
- Improved CUDA memory management (using torch cached GPU memory for internal RPUCuda buffers) (#512)
- New layer generator: `analog_layers()` loops over layer modules (except containers) (#512)
- `AnalogWrapper` for wrapping a full torch module (without using `AnalogSequential`) (#512)
- `convert_to_digital` utility (#512)
- `TileModuleArray` for logical weight matrices larger than a single tile (#512)
- Dumping of all C++ fields for accurate analog training saving and training continuation after checkpoint load (#512)
- `apply_write_noise_on_set` for pulsed devices (#512)
- Reset device now also for simple devices (#512)
- `SoftBoundsReference`, `PowStepReference` for explicit reference subtraction of the symmetry point in Tiki-taka (#512)
- Analog MVM with output-to-output std-deviation variability (`output_noise_std`) (#512)
- Plotting utility for weight errors (#512)
- `per_batch_sample` weight noise injections for `TorchInferenceRPUConfig` (#512)
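A hedged sketch of the input-range calibration flow on a converted model; the function and enum live in `aihwkit.inference.calibration` in recent versions, and the calibration type and data loader below are illustrative:

```python
# Sketch: post-training input-range calibration of a converted analog model.
from aihwkit.inference.calibration import (
    InputRangeCalibrationType,
    calibrate_input_ranges,
)

calibrate_input_ranges(
    analog_model,                              # a model converted to analog
    InputRangeCalibrationType.CACHE_QUANTILE,  # quantile-based type (assumption)
    calibration_loader,                        # iterable of representative inputs
)
```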
Fixed
- BERT example 24 using `AnalogWrapper` (#514)
- CUDA-supported testing in examples based on AIHWKIT compilation (#513)
- Fixed compilation error for CUDA 12.1. (#500)
- Realistic read weights could have applied the scales wrongly (#512)
Changed
- Major re-organization of `AnalogTiles` for increased modularity (`TileWithPeriphery`, `SimulatorTile`, `SimulatorTileWrapper`). Analog tile modules (possibly arrays of analog tiles) are now also torch `Modules`. (#512)
- Change in tile generators: `analog_model.analog_tiles()` now loops over all available tiles (in all modules) (#512)
- Import and file position changes. However, users can still import `RPUConfig`-related modules from `aihwkit.simulator.configs` (#512)
- `convert_to_analog` now also considers mapping. Set `mapping.max_input_size = 0` and `mapping.max_output_size = 0` to avoid this. (#512)
- Mapped layers now use `TileModuleArray` by default. (#512)
- Checkpoint structure differs from previous versions; `utils.legacy_load` provides a way to load old checkpoints. (#512)
Removed
- `realistic_read_write` is removed from some high-level functions. Use `program_weights` (after setting the weights) or `read_weights` for realistic reading (using a weight estimation technique), as in the sketch below. (#512)
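A sketch of the replacement workflow at the tile level, using the method names given in this entry (the exact return signature of `read_weights` may vary by version):

```python
# Sketch: realistic programming and reading, replacing realistic_read_write.
tile = next(analog_model.analog_tiles())       # any tile of a converted model
tile.set_weights(target_weights)               # ideal (digital) weight write
tile.program_weights()                         # realistic, noisy programming
est_weights, est_biases = tile.read_weights()  # estimation-based realistic read
```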
IBM Analog Hardware Acceleration Kit 0.7.1
Added
- Updated the CLI cloud runner code to support inference experiment results. (#491)
- Reading weights is now done with a least-squares estimation method. (#489)
Fixed
- Realistic read / write behavior was broken for some tiles. (#489)
Changed
- The minimal torch version has changed to 1.9. (#489)
- Realistic read / write is now achieved by read_weights and program_weights. (#489)
Removed
- The tile methods get/set_weights_realistic are removed. (#489)
IBM Analog Hardware Acceleration Kit 0.7.0
[0.7.0] - 2023/01/30
Added
- Reset tiles method (#456)
- Added many new analog MAC non-linearities (forward / backward pass) (#456)
- Polynomial weight noise for hardware-aware training (#456)
- Remap functionality for hardware-aware training (#456)
- Input range estimation for `InferenceRPUConfig` (#456) (see the sketch after this list)
- CUDA now always syncs, and a non-blocking option was added for when this is not desired (#456)
- Fitting utility for fitting any device model to conductance measurements (#456)
- Added `PowStepReferenceDevice` for easy subtraction of the symmetry point (#456)
- Added `SoftBoundsReferenceDevice` for easy subtraction of the symmetry point (#456)
- Added stand-alone functions for applying inference drift to any model (#419)
- Added Example 24: analog inference and hardware-aware training on BERT with the SQuAD task (#440)
- Added Example 23: how to use `AnalogTile` directly to implement an analog matrix-vector product without using pytorch modules. (#393)
- Added Example 22: 2-layer LSTM network trained on the War and Peace dataset. (#391)
- Added a new notebook for exploring analog sensitivities. (#380)
- Remapping functionality for `InferenceRPUConfig`. (#388)
- Inference cloud experiment and runners. (#410)
- Added `analog_modules` generator in `AnalogSequential`. (#410)
- Added `SKIP_CUDA_TESTS` to manually switch off the CUDA tests.
- Enabling comparisons of `RPUConfig` instances. (#410)
- Specific user-defined function for layer-wise setting of RPUConfigs in conversions. (#412)
- Added stochastic rounding options for `MixedPrecisionCompound`. (#418)
- New `remap` parameter field and functionality in `InferenceRPUConfig`. (#423)
- Tile-level weight getter and setter have an `apply_weight_scaling` argument. (#423)
- Pre- and post-update / backward / forward methods in `BaseTile` for easier user-defined modification of pre- and/or post-processing of a tile. (#423)
- Type-checking for `RPUConfig` fields. (#424)
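A sketch of enabling input-range estimation on an inference config; the `pre_post.input_range` fields shown are from recent versions and may have carried different names when #456 landed:

```python
# Sketch: input-range estimation for InferenceRPUConfig.
from aihwkit.simulator.configs import InferenceRPUConfig

rpu_config = InferenceRPUConfig()
rpu_config.pre_post.input_range.enable = True
rpu_config.pre_post.input_range.init_from_data = 100  # batches used for the estimate
```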
Fixed
- Decay fix for compound devices (#463)
- `RPUCuda` backend update with many fixes (#456)
- Missing zero-grad call in example 02 (#446)
- Indexing error in `OneSidedDevice` for CPU (#447)
- Analog summary error when the model is on a cuda device. (#392)
- Index error when loading the state dict with a previously used model. (#387)
- Weights that were not contiguous could have been set wrongly. (#388)
- Programming noise would not be applied if drift compensation was not used. (#389)
- Loading a new model state dict for inference no longer overwrites the noise model setting. (#410)
- Avoid `AnalogContext` copying of self pointers. (#410)
- Fix issue that drift compensation is not applied to conv-layers. (#412)
- Fix issue that noise modifiers are not applied to conv-layers. (#412)
- The CPU `AnalogConv2d` layer now uses unfolded convolutions instead of indexed convolutions (which are efficient only on GPUs). (#415)
- Fix issue that write noise hidden weights are not transferred to pytorch when using `get_hidden_parameters` in case of CUDA. (#417)
- Learning rate scaling due to output scales. (#423)
- `WeightModifiers` of the `InferenceRPUConfig` are no longer called in the forward pass, but instead in the `post_update_step` method, to avoid issues with repeated forward calls. (#423)
- Fix training `learn_out_scales` issue after checkpoint load. (#434)
Changed
- Pylint / mypy / pycodestyle / protobuf version bump (#456)
- All config-related classes can now be imported from `aihwkit.simulator.configs` (#456)
- Weight noise visualization now shows the programming noise and drift noise differences. (#389)
- Concatenate the gradients before applying to the tile update function (some speedup for CUDA expected). (#390)
- Drift compensation uses eye instead of ones for readout. (#412)
- `weight_scaling_omega_columnwise` parameter in `MappingParameter` is now called `weight_scaling_columnwise`. (#423)
- Tile-level weight getter and setter now use Tensors instead of numpy arrays. (#423)
- Output scaling and mapping scales are now distinguished; only the former is learnable. (#423)
- Renamed `learn_out_scaling_alpha` parameter in `MappingParameter` to `learn_out_scaling`; columnwise learning has a separate switch `out_scaling_columnwise`. (#423)
Deprecated
- Input `weight_scaling_omega` argument in analog layers is deprecated. (#423)
Removed
- The `_scaled` versions of the weight getter and setter methods are removed. (#423)
IBM Analog Hardware Acceleration Kit 0.6.0
[0.6.0] - 2022/05/16
Added
- Set weights can be used to re-apply the weight scaling omega. (#360)
- Out scaling factors can be learnt even if weight scaling omega was set to 0. (#360)
- Reverse up / down option for LinearStepDevice. (#361)
- Generic Analog RNN classes (LSTM, RNN, GRU) uni or bidirectional. (#358)
- Added new `PiecewiseStepDevice` where the update-step response function can be arbitrarily defined by the user in a piecewise-linear manner; it can be conveniently used to fit experimental device data (see the sketch after this list). (#356)
- Several enhancements to the public documentations: added a new section for hw-aware training, refreshed the reference API doc, and added the newly supported LSTM layers and the mapped conv layers. (#374)
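A sketch of a user-defined piecewise-linear update response with `PiecewiseStepDevice`; the node values are illustrative, giving relative step sizes across the weight range:

```python
# Sketch: fitting an arbitrary update-step response in a piecewise-linear way.
from aihwkit.simulator.configs import SingleRPUConfig
from aihwkit.simulator.configs.devices import PiecewiseStepDevice

device = PiecewiseStepDevice(
    dw_min=0.001,                         # mean minimal update step
    piecewise_up=[1.0, 0.8, 0.6, 0.4],    # up-pulse response nodes (illustrative)
    piecewise_down=[0.4, 0.6, 0.8, 1.0],  # down-pulse response nodes (illustrative)
)
rpu_config = SingleRPUConfig(device=device)
```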
Fixed
- Legacy checkpoint load with alpha scaling. (#360)
- Re-application of weight scaling omega when loading checkpoints. (#360)
- Write noise was not correctly applied for CUDA if dw_min_std=0. (#356)
IBM Analog Hardware Acceleration Kit 0.5.1
Added
- Load model state dict into a new model with modified `RPUConfig`. (#276)
- Visualization for noise models for analog inference hardware simulation. (#278)
- State-independent inference noise model. (#284)
- Transfer LR parameter for `MixedPrecisionCompound`. (#283)
- The bias term can now be handled either in the analog or the digital domain by controlling the `digital_bias` layer parameter. (#307)
- PCM short-term weight noise. (#312)
- IR-drop simulation across columns during analog mat-vec. (#312)
- Transposed-read for `TransferCompound`. (#312)
- `BufferedTransferCompound` and TTv2 presets. (#318)
- Stochastic rounding for `MixedPrecisionCompound`. (#318)
- Decay with arbitrary decay point (to reset bias). (#319)
- Linear layer `AnalogLinearMapped`, which maps a large weight matrix onto multiple analog tiles. (#302)
- Convolution layers `AnalogConvNdMapped`, which map a large weight matrix onto multiple tiles if necessary. (#331)
- In the new mapping field of `RPUConfig`, the max tile input and output sizes can be configured for the `*Mapped` layers. (#331)
- Notebooks directory with several notebook examples (#333, #334)
- Analog information summary function (see the sketch after this list). (#302)
- The alpha weight scaling factor can now be defined as a learnable parameter by switching `learn_out_scaling_alpha` in the `rpu_config.mapping` parameters. (#353)
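A sketch of the analog information summary on a converted model; the input size tuple is illustrative:

```python
# Sketch: print tile mapping and analog layer information for a model.
from aihwkit.utils.analog_info import analog_summary

analog_summary(analog_model, (1, 3, 32, 32))  # (batch, channels, height, width)
```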
Fixed
- Removed GPU warning during destruction when using multiple GPUs. (#277)
- Fixed issue in transfer counter for mixed precision in case of GPU. (#283)
- The `map_location` keyword for load / save is now observed. (#293)
- Fixed issue with CUDA buffer allocation when batch size changed. (#294)
- Fixed missing `load_state_dict` for `AnalogSequential`. (#295)
- Fixed issue with hierarchical hidden parameter settings. (#313)
- Fixed a serious issue where a loaded model would not update analog gradients. (#302)
- Fixed cuda import in examples. (#320)
Changed
- The inference noise models are now located in `aihwkit.inference`. (#281)
- Analog state dict structure has changed (shared weights are not saved). (#293)
- Some of the parameter names of the `TransferCompound` have changed. (#312)
- New fast learning rate parameter for `TransferCompound`; the SGD learning rate is then applied on the slow matrix. (#312)
- The `fixed_value` of `WeightClipParameter` is now applied for all clipping types if set larger than zero. (#318)
- The use of generators for analog tiles of an `AnalogModuleBase`. (#302)
- Digital bias is now accessible through `MappingParameter`. (#331)
- The aihwkit documentation: new content around analog AI concepts, training presets, analog AI optimizers, new references, and examples. (#348)
- The `weight_scaling_omega` can now be defined in `rpu_config.mapping`. (#353)
Deprecated
- The module `aihwkit.simulator.noise_models` has been deprecated in favor of `aihwkit.inference`. (#281)