VIT: Optimized for Qualcomm Devices

VIT is a machine learning model that can classify images from the ImageNet dataset. It can also be used as a backbone in building more complex models for specific use cases.
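As a rough illustration of what "classify" means in practice, the sketch below turns a vector of raw class scores into ranked predictions. It assumes the model returns one logit per class (the exact output layout is not stated here and should be checked against the exported model), and the helper names `softmax` and `top_k` are hypothetical, not part of any Qualcomm library.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (max is subtracted for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(logits, k=5):
    """Return the k most probable (class_index, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]

# Toy 4-class logits standing in for the model's real output vector (assumption).
example_logits = [0.5, 2.0, -1.0, 1.0]
predictions = top_k(example_logits, k=2)
for class_index, probability in predictions:
    print(f"class {class_index}: {probability:.3f}")
```

Mapping a class index to a human-readable label requires the label list that matches the checkpoint, which is not included in this sketch.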

This model is based on the implementation of VIT found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export with custom configurations. More details on model performance across various devices can be found here.

Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.

Getting Started

There are two ways to deploy this model on your device:

Option 1: Download Pre-Exported Models

Below are pre-exported model assets ready for deployment.

Runtime | Precision | Chipset | SDK Versions | Download
ONNX | float | Universal | QAIRT 2.45, ONNX Runtime 1.25.0 | Download
ONNX | w8a16 | Universal | QAIRT 2.45, ONNX Runtime 1.25.0 | Download
ONNX | w8a8 | Universal | QAIRT 2.45, ONNX Runtime 1.25.0 | Download
ONNX | w8a8_mixed_int16 | Universal | QAIRT 2.45, ONNX Runtime 1.25.0 | Download
QNN_DLC | float | Universal | QAIRT 2.45 | Download
QNN_DLC | w8a16 | Universal | QAIRT 2.45 | Download
QNN_DLC | w8a8 | Universal | QAIRT 2.45 | Download
TFLITE | float | Universal | QAIRT 2.45 | Download
TFLITE | w8a8 | Universal | QAIRT 2.45 | Download

For more device-specific assets and performance metrics, visit VIT on Qualcomm® AI Hub.

Option 2: Export with Custom Configurations

Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:

  • Custom weights (e.g., fine-tuned checkpoints)
  • Custom input shapes
  • Target device and runtime configurations

This option is ideal if you need to customize the model beyond the default configuration provided here.

See the VIT repository on GitHub for usage instructions.
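A minimal sketch of how such an export might be scripted is shown below. It only builds the command line and does not run it. The module path `qai_hub_models.models.vit.export`, the `--target-runtime` and `--device` flags, and the device name are assumptions made for illustration; confirm the actual entry point and options in the GitHub repository before use.

```python
import shlex
import sys

def build_export_command(runtime="tflite", device=None, extra_args=()):
    """Assemble an export command as an argument list (nothing is executed here)."""
    # ASSUMPTION: module path and flag names are illustrative, not confirmed by this page.
    cmd = [
        sys.executable, "-m", "qai_hub_models.models.vit.export",
        "--target-runtime", runtime,
    ]
    if device is not None:
        cmd += ["--device", device]
    cmd += list(extra_args)
    return cmd

# Hypothetical device name; substitute one available to your account.
command = build_export_command(runtime="onnx", device="Samsung Galaxy S24")
print(shlex.join(command))
```

Once the library is installed and an AI Hub account is configured, the resulting list could be passed to `subprocess.run(command, check=True)`. Keeping the command as a list avoids shell quoting problems with device names that contain spaces.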

Model Details

Model Type: Image classification

Model Stats:

  • Model checkpoint: ImageNet
  • Input resolution: 224x224
  • Number of parameters: 86.6M
  • Model size (float): 330 MB
  • Model size (w8a16): 86.2 MB
  • Model size (w8a8): 83.2 MB
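The listed sizes can be sanity-checked from the parameter count. The sketch below estimates weight storage only, assuming 32 bits per float weight and 8 bits per quantized weight, and assuming the listed "MB" are binary megabytes (MiB). Both are assumptions, and the helper name `weight_size_mib` is hypothetical.

```python
PARAMS = 86.6e6  # parameter count from the stats above

def weight_size_mib(num_params, bits_per_weight):
    """Estimate weight storage in MiB, ignoring metadata and quantization parameters."""
    return num_params * bits_per_weight / 8 / 2**20

float_mib = weight_size_mib(PARAMS, 32)
int8_mib = weight_size_mib(PARAMS, 8)
print(f"float32 weights: {float_mib:.1f} MiB")
print(f"8-bit weights:   {int8_mib:.1f} MiB")
```

Under these assumptions the float estimate lands close to the listed 330 MB, and the 8-bit estimate lands slightly below the listed quantized sizes. The remainder would plausibly be scales, offsets, and any tensors kept at higher precision, though this page does not break that down.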

Performance Summary

Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit
VIT | ONNX | float | Snapdragon® X2 Elite | 3.055 ms | 181 - 181 MB | NPU
VIT | ONNX | float | Snapdragon® X Elite | 7.504 ms | 170 - 170 MB | NPU
VIT | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 5.023 ms | 1 - 295 MB | NPU
VIT | ONNX | float | Snapdragon® 8 Gen 1 Mobile | 12.142 ms | 1 - 272 MB | NPU
VIT | ONNX | float | Qualcomm® QCS8550 (Proxy) | 7.193 ms | 1 - 5 MB | NPU
VIT | ONNX | float | Qualcomm® QCS8450 | 12.142 ms | 1 - 272 MB | NPU
VIT | ONNX | float | Snapdragon® 8 Elite Mobile | 3.489 ms | 0 - 175 MB | NPU
VIT | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 2.801 ms | 1 - 159 MB | NPU
VIT | ONNX | float | Qualcomm® QCS9075 | 10.188 ms | 1 - 46 MB | NPU
VIT | ONNX | float | Qualcomm® QCS8750 | 3.489 ms | 0 - 175 MB | NPU
VIT | ONNX | float | Qualcomm® QCS7181 | 7.504 ms | 170 - 170 MB | NPU
VIT | ONNX | w8a16 | Snapdragon® X2 Elite | 2.01 ms | 183 - 183 MB | NPU
VIT | ONNX | w8a16 | Snapdragon® X Elite | 4.749 ms | 150 - 150 MB | NPU
VIT | ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | 3.108 ms | 0 - 304 MB | NPU
VIT | ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | 4.621 ms | 0 - 4 MB | NPU
VIT | ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 1.773 ms | 0 - 266 MB | NPU
VIT | ONNX | w8a16 | Snapdragon® 8 Elite Mobile | 2.426 ms | 0 - 260 MB | NPU
VIT | ONNX | w8a16 | Qualcomm® QCS9075 | 4.718 ms | 0 - 45 MB | NPU
VIT | ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | 5.892 ms | 0 - 393 MB | NPU
VIT | ONNX | w8a16 | Qualcomm® QCM6690 | 50.539 ms | 0 - 401 MB | NPU
VIT | ONNX | w8a16 | Qualcomm® QCS7790 | 5.892 ms | 0 - 393 MB | NPU
VIT | ONNX | w8a16 | Qualcomm® QCS8750 | 2.426 ms | 0 - 260 MB | NPU
VIT | ONNX | w8a16 | Qualcomm® QCS7181 | 4.749 ms | 150 - 150 MB | NPU
VIT | ONNX | w8a8 | Snapdragon® X2 Elite | 4.117 ms | 212 - 212 MB | NPU
VIT | ONNX | w8a8 | Snapdragon® X Elite | 9.751 ms | 181 - 181 MB | NPU
VIT | ONNX | w8a8 | Snapdragon® 8 Gen 3 Mobile | 6.321 ms | 0 - 323 MB | NPU
VIT | ONNX | w8a8 | Snapdragon® 8 Gen 1 Mobile | 13.243 ms | 0 - 330 MB | NPU
VIT | ONNX | w8a8 | Qualcomm® QCS6490 | 50.336 ms | 0 - 45 MB | NPU
VIT | ONNX | w8a8 | Qualcomm® QCS8550 (Proxy) | 9.369 ms | 0 - 100 MB | NPU
VIT | ONNX | w8a8 | Qualcomm® QCS8450 | 13.243 ms | 0 - 330 MB | NPU
VIT | ONNX | w8a8 | Qualcomm® QCS9075 | 9.243 ms | 0 - 45 MB | NPU
VIT | ONNX | w8a8 | Qualcomm® QCM6690 | 186.722 ms | 0 - 523 MB | NPU
VIT | ONNX | w8a8 | Snapdragon® 7 Gen 4 Mobile | 17.073 ms | 0 - 390 MB | NPU
VIT | ONNX | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 3.463 ms | 0 - 242 MB | NPU
VIT | ONNX | w8a8 | Snapdragon® 8 Elite Mobile | 5.456 ms | 0 - 239 MB | NPU
VIT | ONNX | w8a8 | Qualcomm® QCS7790 | 17.073 ms | 0 - 390 MB | NPU
VIT | ONNX | w8a8 | Qualcomm® QCS8750 | 5.456 ms | 0 - 239 MB | NPU
VIT | ONNX | w8a8 | Qualcomm® QCS7181 | 9.751 ms | 181 - 181 MB | NPU
VIT | ONNX | w8a8_mixed_int16 | Snapdragon® 8 Gen 3 Mobile | 70.212 ms | 59 - 410 MB | NPU
VIT | ONNX | w8a8_mixed_int16 | Snapdragon® 8 Elite Mobile | 56.595 ms | 61 - 327 MB | NPU
VIT | ONNX | w8a8_mixed_int16 | Snapdragon® 8 Elite Gen 5 Mobile | 48.199 ms | 60 - 333 MB | NPU
VIT | ONNX | w8a8_mixed_int16 | Qualcomm® QCS8750 | 56.595 ms | 61 - 327 MB | NPU
VIT | QNN_DLC | float | Snapdragon® X2 Elite | 3.52 ms | 1 - 1 MB | NPU
VIT | QNN_DLC | float | Snapdragon® X Elite | 8.3 ms | 1 - 1 MB | NPU
VIT | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 5.393 ms | 0 - 251 MB | NPU
VIT | QNN_DLC | float | Snapdragon® 8 Gen 1 Mobile | 13.078 ms | 1 - 229 MB | NPU
VIT | QNN_DLC | float | Qualcomm® QCS8275 | 34.807 ms | 1 - 168 MB | NPU
VIT | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 7.715 ms | 1 - 400 MB | NPU
VIT | QNN_DLC | float | Qualcomm® QCS8450 | 13.078 ms | 1 - 229 MB | NPU
VIT | QNN_DLC | float | Snapdragon® 8 Elite Mobile | 3.754 ms | 0 - 155 MB | NPU
VIT | QNN_DLC | float | Qualcomm® SA7255P | 34.807 ms | 1 - 168 MB | NPU
VIT | QNN_DLC | float | Qualcomm® SA8295P | 12.955 ms | 1 - 145 MB | NPU
VIT | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 2.931 ms | 1 - 157 MB | NPU
VIT | QNN_DLC | float | Qualcomm® QCS9075 | 10.721 ms | 1 - 3 MB | NPU
VIT | QNN_DLC | float | Qualcomm® QCS8750 | 3.754 ms | 0 - 155 MB | NPU
VIT | QNN_DLC | float | Qualcomm® QCS7181 | 8.3 ms | 1 - 1 MB | NPU
VIT | QNN_DLC | w8a16 | Snapdragon® X2 Elite | 3.585 ms | 0 - 0 MB | NPU
VIT | QNN_DLC | w8a16 | Snapdragon® X Elite | 8.26 ms | 0 - 0 MB | NPU
VIT | QNN_DLC | w8a16 | Snapdragon® 8 Gen 3 Mobile | 5.088 ms | 0 - 321 MB | NPU
VIT | QNN_DLC | w8a16 | Qualcomm® QCS8275 | 17.354 ms | 0 - 258 MB | NPU
VIT | QNN_DLC | w8a16 | Qualcomm® QCS8550 (Proxy) | 7.642 ms | 0 - 35 MB | NPU
VIT | QNN_DLC | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 2.85 ms | 0 - 277 MB | NPU
VIT | QNN_DLC | w8a16 | Snapdragon® 8 Elite Mobile | 3.887 ms | 0 - 260 MB | NPU
VIT | QNN_DLC | w8a16 | Qualcomm® QCS9075 | 7.974 ms | 0 - 2 MB | NPU
VIT | QNN_DLC | w8a16 | Snapdragon® 7 Gen 4 Mobile | 12.979 ms | 0 - 402 MB | NPU
VIT | QNN_DLC | w8a16 | Qualcomm® QCM6690 | 120.068 ms | 0 - 401 MB | NPU
VIT | QNN_DLC | w8a16 | Qualcomm® SA7255P | 17.354 ms | 0 - 258 MB | NPU
VIT | QNN_DLC | w8a16 | Qualcomm® QCS7790 | 12.979 ms | 0 - 402 MB | NPU
VIT | QNN_DLC | w8a16 | Qualcomm® QCS8750 | 3.887 ms | 0 - 260 MB | NPU
VIT | QNN_DLC | w8a16 | Qualcomm® QCS7181 | 8.26 ms | 0 - 0 MB | NPU
VIT | QNN_DLC | w8a8 | Snapdragon® X2 Elite | 4.472 ms | 0 - 0 MB | NPU
VIT | QNN_DLC | w8a8 | Snapdragon® X Elite | 10.216 ms | 0 - 0 MB | NPU
VIT | QNN_DLC | w8a8 | Snapdragon® 8 Gen 3 Mobile | 6.449 ms | 0 - 305 MB | NPU
VIT | QNN_DLC | w8a8 | Snapdragon® 8 Gen 1 Mobile | 13.271 ms | 0 - 312 MB | NPU
VIT | QNN_DLC | w8a8 | Qualcomm® QCS6490 | 50.709 ms | 0 - 2 MB | NPU
VIT | QNN_DLC | w8a8 | Qualcomm® QCS8275 | 28.517 ms | 0 - 201 MB | NPU
VIT | QNN_DLC | w8a8 | Qualcomm® QCS8550 (Proxy) | 9.585 ms | 0 - 118 MB | NPU
VIT | QNN_DLC | w8a8 | Qualcomm® QCS8450 | 13.271 ms | 0 - 312 MB | NPU
VIT | QNN_DLC | w8a8 | Qualcomm® QCS9075 | 9.413 ms | 0 - 2 MB | NPU
VIT | QNN_DLC | w8a8 | Qualcomm® SA7255P | 28.517 ms | 0 - 201 MB | NPU
VIT | QNN_DLC | w8a8 | Qualcomm® QCM6690 | 188.78 ms | 0 - 501 MB | NPU
VIT | QNN_DLC | w8a8 | Snapdragon® 7 Gen 4 Mobile | 19.356 ms | 0 - 358 MB | NPU
VIT | QNN_DLC | w8a8 | Qualcomm® SA8295P | 15.62 ms | 0 - 212 MB | NPU
VIT | QNN_DLC | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 3.498 ms | 0 - 209 MB | NPU
VIT | QNN_DLC | w8a8 | Snapdragon® 8 Elite Mobile | 5.377 ms | 0 - 204 MB | NPU
VIT | QNN_DLC | w8a8 | Qualcomm® QCS7790 | 19.356 ms | 0 - 358 MB | NPU
VIT | QNN_DLC | w8a8 | Qualcomm® QCS8750 | 5.377 ms | 0 - 204 MB | NPU
VIT | QNN_DLC | w8a8 | Qualcomm® QCS7181 | 10.216 ms | 0 - 0 MB | NPU
VIT | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 5.267 ms | 0 - 306 MB | NPU
VIT | TFLITE | float | Snapdragon® 8 Gen 1 Mobile | 12.961 ms | 0 - 290 MB | NPU
VIT | TFLITE | float | Qualcomm® QCS8275 | 34.607 ms | 0 - 169 MB | NPU
VIT | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 7.258 ms | 0 - 5 MB | NPU
VIT | TFLITE | float | Qualcomm® SA8775P | 281.845 ms | 3 - 47 MB | CPU
VIT | TFLITE | float | Qualcomm® SA8650P | 281.845 ms | 3 - 47 MB | CPU
VIT | TFLITE | float | Qualcomm® SA8255P | 281.845 ms | 3 - 47 MB | CPU
VIT | TFLITE | float | Qualcomm® QCS8450 | 12.961 ms | 0 - 290 MB | NPU
VIT | TFLITE | float | Snapdragon® 8 Elite Mobile | 3.656 ms | 0 - 177 MB | NPU
VIT | TFLITE | float | Qualcomm® SA7255P | 34.607 ms | 0 - 169 MB | NPU
VIT | TFLITE | float | Qualcomm® SA8295P | 13.301 ms | 0 - 157 MB | NPU
VIT | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 3.032 ms | 0 - 171 MB | NPU
VIT | TFLITE | float | Qualcomm® QCS9075 | 10.619 ms | 0 - 173 MB | NPU
VIT | TFLITE | float | Qualcomm® QCS8750 | 3.656 ms | 0 - 177 MB | NPU
VIT | TFLITE | w8a8 | Snapdragon® 8 Gen 3 Mobile | 8.688 ms | 0 - 487 MB | NPU
VIT | TFLITE | w8a8 | Snapdragon® 8 Gen 1 Mobile | 21.745 ms | 0 - 449 MB | NPU
VIT | TFLITE | w8a8 | Qualcomm® QCS6490 | 144.275 ms | 1 - 100 MB | NPU
VIT | TFLITE | w8a8 | Qualcomm® QCS8275 | 35.047 ms | 0 - 385 MB | NPU
VIT | TFLITE | w8a8 | Qualcomm® QCS8550 (Proxy) | 12.343 ms | 0 - 3 MB | NPU
VIT | TFLITE | w8a8 | Qualcomm® SA8775P | 99.096 ms | 0 - 52 MB | CPU
VIT | TFLITE | w8a8 | Qualcomm® SA8650P | 99.096 ms | 0 - 52 MB | CPU
VIT | TFLITE | w8a8 | Qualcomm® SA8255P | 99.096 ms | 0 - 52 MB | CPU
VIT | TFLITE | w8a8 | Qualcomm® QCS8450 | 21.745 ms | 0 - 449 MB | NPU
VIT | TFLITE | w8a8 | Qualcomm® QCS9075 | 13.592 ms | 0 - 88 MB | NPU
VIT | TFLITE | w8a8 | Qualcomm® SA7255P | 35.047 ms | 0 - 385 MB | NPU
VIT | TFLITE | w8a8 | Qualcomm® QCM6690 | 201.234 ms | 1 - 283 MB | NPU
VIT | TFLITE | w8a8 | Snapdragon® 7 Gen 4 Mobile | 29.823 ms | 1 - 215 MB | NPU
VIT | TFLITE | w8a8 | Qualcomm® SA8295P | 19.253 ms | 0 - 346 MB | NPU
VIT | TFLITE | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 4.553 ms | 0 - 383 MB | NPU
VIT | TFLITE | w8a8 | Snapdragon® 8 Elite Mobile | 6.924 ms | 0 - 380 MB | NPU
VIT | TFLITE | w8a8 | Qualcomm® QCS7790 | 29.823 ms | 0 - 215 MB | NPU
VIT | TFLITE | w8a8 | Qualcomm® QCS8750 | 6.924 ms | 0 - 380 MB | NPU
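To compare rows of the table, latency can be converted to an approximate throughput and to a speedup relative to a baseline. The sketch below uses three ONNX rows for the Snapdragon® 8 Elite Gen 5 Mobile copied from the table. The throughput figure assumes back-to-back single-image inference with no pre- or post-processing overhead, which is an idealization, and the function names are hypothetical.

```python
# Inference times in ms, copied from the ONNX rows for Snapdragon 8 Elite Gen 5 Mobile.
latency_ms = {
    "float": 2.801,
    "w8a16": 1.773,
    "w8a8": 3.463,
}

def throughput_per_second(ms_per_inference):
    """Idealized inferences per second, assuming no overhead between runs."""
    return 1000.0 / ms_per_inference

def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline (>1 means faster)."""
    return baseline_ms / candidate_ms

for precision, ms in latency_ms.items():
    print(
        f"{precision:>6}: {throughput_per_second(ms):7.1f} inf/s, "
        f"{speedup(latency_ms['float'], ms):.2f}x vs float"
    )
```

For these particular rows, w8a16 is faster than float while w8a8 is slower, so a lower-precision format is not automatically the faster choice for a given runtime and chipset.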

License

  • The license for the original implementation of VIT can be found here.
