SEM Decoder Model PTQ Quantization Results in Severe Output Collapse with Only 3 Unique Values #31

@tanjiong233

Description

Problem Description

I am experiencing a severe output-collapse issue when performing PTQ quantization on the SEM (Spatial Enhanced Manipulation) decoder model using the Horizon Robotics toolchain. The quantized HBM model produces only 3 unique values, while the original ONNX model generates 895 unique values.

Environment Information

  • Hardware Platform: RDK S100
  • Toolchain Version: Open Explorer (hb_compile)
  • Python Environment: Python 3.10 + miniconda
  • Model Architecture: SEM Robotwin (Encoder-Decoder separated structure)
  • Affected Model: Decoder (encoder quantization works normally)

Reproduction Steps

  1. Model Export

Following the official tutorial at https://forum.d-robotics.cc/t/topic/32657 for ONNX export:

  # Quantization-friendly modifications
  data["joint_relative_pos"] = data["joint_relative_pos"].to(torch.int8)
  timestep = timestep.to(torch.int16)
  # Replaced float("-inf") with a finite -15

  # Export ONNX
  python3 onnx_scripts/export_onnx.py \
      --config config_sem_robotwin.py \
      --model /path/to/model \
      --output_path /path/to/onnx \
      --num_joint 14 \
      --validate
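
For context, the float("-inf") replacement presumably concerns an additive attention mask; a minimal sketch of the idea with hypothetical names (the real SEM code differs):

  import torch

  # Hypothetical illustration: float("-inf") in an additive mask quantizes
  # badly, while a large finite negative value maps to ~0 after softmax
  # just the same.
  def build_additive_mask(valid: torch.Tensor) -> torch.Tensor:
      mask = torch.zeros(valid.shape, dtype=torch.float32)
      # mask = mask.masked_fill(~valid, float("-inf"))  # original, quantization-hostile
      mask = mask.masked_fill(~valid, -15.0)            # finite stand-in used for export
      return mask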
  2. Calibration Data Preparation

I prepared 100 calibration samples following the tutorial; each sample was preprocessed and passed through the encoder to generate the decoder's input features, as sketched below:

  # Calibration data shape validation
  noisy_action:       (1, 64, 14, 8)   float32  range: [-3.005, 3.252]
  image_feature:      (1, 3, 400, 256) float32  range: [-4.295, 4.137]
  robot_feature:      (1, 14, 1, 256)  float32  range: [-3.518, 3.075]
  timestep:           (1,)             int16    value: [999]
  joint_relative_pos: (1, 14, 14)      int8     range: [0, 13]
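
For reference, the dump step looked roughly like this; the encoder output names, file paths, and per-input subdirectory layout under cal_data_dir are assumptions from my setup, not toolchain requirements:

  import numpy as np
  import onnxruntime as ort

  # Sketch: run each preprocessed sample through the encoder and write the
  # resulting decoder inputs as raw float32 binaries for calibration.
  enc = ort.InferenceSession("encoder.onnx")
  for i, sample in enumerate(samples):  # `samples`: 100 preprocessed dicts (hypothetical)
      image_feat, robot_feat = enc.run(None, sample)
      image_feat.astype(np.float32).tofile(f"calibration_data_dir/image_feature/{i:04d}.bin")
      robot_feat.astype(np.float32).tofile(f"calibration_data_dir/robot_feature/{i:04d}.bin")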
  3. Attempted Quantization Configurations

I have tried multiple quantization configurations, all of which failed:

Configuration 1: Int16 Quantization (Recommended by Official Tutorial)

  calibration_parameters:
    cal_data_dir: ./onnx_cup_42k/calibration_data_dir/...
    quant_config: {
      "model_config": {
        "all_node_type": "int16"
      },
      "op_config": {
        "Resize": {"qtype": "float16"}
      }
    }
  compiler_parameters:
    optimize_level: O2
    compile_mode: latency

Result: Failed - 3 unique values

Issue: the quantized input values completely saturated the int16 range [-32768, 32767], causing severe clipping
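
The saturation is easy to confirm offline: quantize the float calibration tensors with the model's reported scale and count the values pinned at the type limits (a plain numpy check, not toolchain code):

  import numpy as np

  # Fraction of values pinned at the integer limits after quantization;
  # ~0 is healthy, values near 1 mean the scale clips almost everything.
  def saturation_ratio(x: np.ndarray, scale: float,
                       qmin: int = -32768, qmax: int = 32767) -> float:
      q = np.clip(np.round(x / scale), qmin, qmax)
      return float(np.mean((q == qmin) | (q == qmax)))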


Configuration 2: Int8 Quantization with O0 Optimization


  quant_config: {
    "model_config": {
      "all_node_type": "int8"
    },
    "op_config": {
      "Resize": {"qtype": "float16"}
    }
  }
  compiler_parameters:
    optimize_level: O0  # Changed from O2 to O0 to avoid over-optimization
    compile_mode: latency

Result: Failed - 3 unique values

Improvement: the input quantization range was normal ([-128, 127], no clipping), but the output still collapsed


Configuration 3: Int8 with Layerwise Search and Bias Correction (Most Thorough Configuration)

  quant_config: {
    "model_config": {
      "all_node_type": "int8",
      "activation": {
        "calibration_type": ["max", "kl"],
        "max_percentile": [0.99995, 0.99999, 1.0],
        "num_bin": [2048, 4096],
        "asymmetric": [false, true]
      },
      "weight": {
        "bias_correction": {
          "num_sample": 10,
          "metric": "cosine-similarity"
        }
      },
      "layerwise_search": {
        "metric": "cosine-similarity"
      }
    },
    "op_config": {
      "Resize": {"qtype": "float16"},
      "MatMul": {"qtype": "float16"},
      "Gemm": {"qtype": "float16"}
    }
  }
  compiler_parameters:
    optimize_level: O0
    compile_mode: bandwidth  # Changed to bandwidth mode, prioritizing accuracy

Result: Still failed - 3 unique values
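
For reference, the cosine-similarity metric that bias_correction and layerwise_search maximize reduces to the following (a plain numpy restatement, not the toolchain's implementation):

  import numpy as np

  # Cosine similarity between float and quantized outputs, flattened;
  # 1.0 means identical direction, so higher is better per layer.
  def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
      a, b = a.ravel(), b.ravel()
      return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))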


Test Results Comparison

I wrote a detailed comparison script that runs both the ONNX and HBM models on identical input data:
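
A condensed version of the script (file names are from my setup; the HBM output is dumped to .npy on the board by a separate hbm_runtime script, which I omit here):

  import numpy as np
  import onnxruntime as ort

  # Both backends see identical saved inputs; the HBM inference itself runs
  # on-device and its output is loaded from a dump.
  npz = np.load("test_inputs.npz")                      # hypothetical input dump
  inputs = {k: npz[k] for k in npz.files}
  onnx_out = ort.InferenceSession("decoder.onnx").run(None, inputs)[0]
  hbm_out = np.load("hbm_output.npy")                   # dumped on-device by the board script

  for name, out in (("ONNX", onnx_out), ("HBM ", hbm_out)):
      print(f"{name}: range=[{out.min():.6f}, {out.max():.6f}] "
            f"unique={np.unique(out).size}/{out.size}")
  print(f"Mean abs error: {np.abs(onnx_out - hbm_out).mean():.6f}")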

Test Output

  ======================================================================
  TESTING ONNX MODEL
  ======================================================================
  Output Analysis:
     Shape: (1, 64, 14, 8)
     Dtype: float32
     Range: [-0.971276, 1.467759]
     Unique position values: 895/896
     Status: GOOD DIVERSITY

  ======================================================================
  TESTING HBM MODEL (Int8 + Layerwise Search)
  ======================================================================
  Input Information:
     noisy_action: dtype=S8, scale=1.90911591e-02
     image_feature: dtype=S8, scale=3.66429798e-02
     robot_feature: dtype=S8, scale=3.07448395e-02

  Quantizing inputs:
     noisy_action (S8): range=[-128, 127]  (no clipping)
     image_feature (S8): range=[-117, 113] (no clipping)
     robot_feature (S8): range=[-114, 100] (no clipping)

  Running HBM inference...
  Output Analysis:
     Shape: (1, 64, 14, 8)
     Dtype: float32
     Range: [-6.037731, 1.848850]
     Unique position values: 3/896
     LOW DIVERSITY - All unique values:
     [-2.934769, -2.449742, -0.26831594]

  ======================================================================
  COMPARISON SUMMARY
  ======================================================================
  Output Diversity:
     ONNX unique values: 895/896
     HBM  unique values: 3/896
     Diversity loss: 892 values

  Numerical Difference:
     Mean absolute error: 1.534658
     Max absolute error: 6.541608
     Median absolute error: 0.916659

  Conclusion:
     HBM model has SEVERE output collapse
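
For completeness, the "Quantizing inputs" step above is plain symmetric per-tensor quantization with the model-reported scales (e.g. 1.90911591e-02 for noisy_action); restated in numpy:

  import numpy as np

  # Symmetric per-tensor int8 quantization with the scale reported by the
  # HBM model; this reproduces the input ranges shown above.
  def quantize_s8(x: np.ndarray, scale: float) -> np.ndarray:
      return np.clip(np.round(x / scale), -128, 127).astype(np.int8)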

Eliminated Issues

  1. Not an input data issue: with identical input data, the ONNX model produces normal output
  2. Not a quantization scale issue: the int8 quantization scales are reasonable, with no clipping
  3. Not a compilation configuration issue: tried O0/O1/O2 and latency/bandwidth in multiple combinations
  4. Not a calibration method issue: tried max/kl and multiple parameter combinations
  5. Not an output format issue: the output is correctly configured as float32 (Dequantize removed via remove_node_type)
  6. Compilation log is clean: no warnings or errors, compilation succeeded

Debug Information

HBM Model Input/Output Information

  # hbm_runtime query results
  Model: decoder_opt_resize
  Inputs:
    - noisy_action:      [1, 64, 14, 8]   INT8
    - image_feature:     [1, 3, 400, 256] INT8
    - robot_feature:     [1, 14, 1, 256]  INT8
    - timestep:          [1]              INT16
    - joint_relative_pos: [1, 14, 14]     INT8

  Output:
    - pred_action:       [1, 64, 14, 8]   FLOAT32
      Quant type: NONE
      Scale: []

Compilation Log Key Information

  2025-10-23 21:54:10 hb_compile completes running
  remove_node_type: Quantize;Cast;Softmax
  pred_action output [1, 64, 14, 8] FLOAT32
  No warnings or errors
