VIT: Optimized for Qualcomm Devices
VIT is a machine learning model that classifies images from the ImageNet dataset. It can also serve as a backbone for more complex, task-specific models.
This model is based on the implementation of VIT found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export with custom configurations. More details on model performance across various devices can be found here.
Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.
Getting Started
There are two ways to deploy this model on your device:
Option 1: Download Pre-Exported Models
Below are pre-exported model assets ready for deployment.
| Runtime | Precision | Chipset | SDK Versions | Download |
|---|---|---|---|---|
| ONNX | float | Universal | QAIRT 2.45, ONNX Runtime 1.25.0 | Download |
| ONNX | w8a16 | Universal | QAIRT 2.45, ONNX Runtime 1.25.0 | Download |
| ONNX | w8a8 | Universal | QAIRT 2.45, ONNX Runtime 1.25.0 | Download |
| ONNX | w8a8_mixed_int16 | Universal | QAIRT 2.45, ONNX Runtime 1.25.0 | Download |
| QNN_DLC | float | Universal | QAIRT 2.45 | Download |
| QNN_DLC | w8a16 | Universal | QAIRT 2.45 | Download |
| QNN_DLC | w8a8 | Universal | QAIRT 2.45 | Download |
| TFLITE | float | Universal | QAIRT 2.45 | Download |
| TFLITE | w8a8 | Universal | QAIRT 2.45 | Download |
For more device-specific assets and performance metrics, visit VIT on Qualcomm® AI Hub.
Option 2: Export with Custom Configurations
Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:
- Custom weights (e.g., fine-tuned checkpoints)
- Custom input shapes
- Target device and runtime configurations
This option is ideal if you need to customize the model beyond the default configuration provided here.
See our repository for VIT on GitHub for usage instructions.
Model Details
Model Type: Image classification
Model Stats:
- Model checkpoint: ImageNet
- Input resolution: 224x224
- Number of parameters: 86.6M
- Model size (float): 330 MB
- Model size (w8a16): 86.2 MB
- Model size (w8a8): 83.2 MB
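As a rough sanity check on the sizes listed above, weight storage can be estimated from the parameter count and the weight bit width. The sketch below is illustrative only: it assumes that `float` means 32-bit weights and that a tag such as `w8a16` means 8-bit weights with 16-bit activations, and the helper names `weight_bits` and `estimate_size_mib` are hypothetical, not part of any Qualcomm library.

```python
import re

MIB = 1024 * 1024  # bytes per mebibyte


def weight_bits(precision: str) -> int:
    """Return the weight bit width implied by a precision tag.

    Assumption: "float" means 32-bit weights, and "w<N>a<M>" means
    N-bit weights with M-bit activations.
    """
    if precision == "float":
        return 32
    match = re.fullmatch(r"w(\d+)a(\d+)", precision)
    if match is None:
        raise ValueError(f"unrecognized precision tag: {precision!r}")
    return int(match.group(1))


def estimate_size_mib(num_params: float, precision: str) -> float:
    """Estimate weight storage in MiB, ignoring file-format overhead."""
    return num_params * weight_bits(precision) / 8 / MIB


NUM_PARAMS = 86.6e6  # "Number of parameters" from the Model Stats list

for tag in ("float", "w8a16", "w8a8"):
    print(f"{tag}: about {estimate_size_mib(NUM_PARAMS, tag):.1f} MiB")
```

Under these assumptions the float estimate comes out at roughly 330 MiB, which matches the listed float size if that figure is in binary units. The quantized estimates land slightly below the listed 86.2 MB and 83.2 MB; the difference is plausibly quantization parameters and tensors kept at higher precision, though the source does not say.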
Performance Summary
| Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit |
|---|---|---|---|---|---|---|
| VIT | ONNX | float | Snapdragon® X2 Elite | 3.055 ms | 181 - 181 MB | NPU |
| VIT | ONNX | float | Snapdragon® X Elite | 7.504 ms | 170 - 170 MB | NPU |
| VIT | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 5.023 ms | 1 - 295 MB | NPU |
| VIT | ONNX | float | Snapdragon® 8 Gen 1 Mobile | 12.142 ms | 1 - 272 MB | NPU |
| VIT | ONNX | float | Qualcomm® QCS8550 (Proxy) | 7.193 ms | 1 - 5 MB | NPU |
| VIT | ONNX | float | Qualcomm® QCS8450 | 12.142 ms | 1 - 272 MB | NPU |
| VIT | ONNX | float | Snapdragon® 8 Elite Mobile | 3.489 ms | 0 - 175 MB | NPU |
| VIT | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 2.801 ms | 1 - 159 MB | NPU |
| VIT | ONNX | float | Qualcomm® QCS9075 | 10.188 ms | 1 - 46 MB | NPU |
| VIT | ONNX | float | Qualcomm® QCS8750 | 3.489 ms | 0 - 175 MB | NPU |
| VIT | ONNX | float | Qualcomm® QCS7181 | 7.504 ms | 170 - 170 MB | NPU |
| VIT | ONNX | w8a16 | Snapdragon® X2 Elite | 2.01 ms | 183 - 183 MB | NPU |
| VIT | ONNX | w8a16 | Snapdragon® X Elite | 4.749 ms | 150 - 150 MB | NPU |
| VIT | ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | 3.108 ms | 0 - 304 MB | NPU |
| VIT | ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | 4.621 ms | 0 - 4 MB | NPU |
| VIT | ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 1.773 ms | 0 - 266 MB | NPU |
| VIT | ONNX | w8a16 | Snapdragon® 8 Elite Mobile | 2.426 ms | 0 - 260 MB | NPU |
| VIT | ONNX | w8a16 | Qualcomm® QCS9075 | 4.718 ms | 0 - 45 MB | NPU |
| VIT | ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | 5.892 ms | 0 - 393 MB | NPU |
| VIT | ONNX | w8a16 | Qualcomm® QCM6690 | 50.539 ms | 0 - 401 MB | NPU |
| VIT | ONNX | w8a16 | Qualcomm® QCS7790 | 5.892 ms | 0 - 393 MB | NPU |
| VIT | ONNX | w8a16 | Qualcomm® QCS8750 | 2.426 ms | 0 - 260 MB | NPU |
| VIT | ONNX | w8a16 | Qualcomm® QCS7181 | 4.749 ms | 150 - 150 MB | NPU |
| VIT | ONNX | w8a8 | Snapdragon® X2 Elite | 4.117 ms | 212 - 212 MB | NPU |
| VIT | ONNX | w8a8 | Snapdragon® X Elite | 9.751 ms | 181 - 181 MB | NPU |
| VIT | ONNX | w8a8 | Snapdragon® 8 Gen 3 Mobile | 6.321 ms | 0 - 323 MB | NPU |
| VIT | ONNX | w8a8 | Snapdragon® 8 Gen 1 Mobile | 13.243 ms | 0 - 330 MB | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS6490 | 50.336 ms | 0 - 45 MB | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS8550 (Proxy) | 9.369 ms | 0 - 100 MB | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS8450 | 13.243 ms | 0 - 330 MB | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS9075 | 9.243 ms | 0 - 45 MB | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCM6690 | 186.722 ms | 0 - 523 MB | NPU |
| VIT | ONNX | w8a8 | Snapdragon® 7 Gen 4 Mobile | 17.073 ms | 0 - 390 MB | NPU |
| VIT | ONNX | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 3.463 ms | 0 - 242 MB | NPU |
| VIT | ONNX | w8a8 | Snapdragon® 8 Elite Mobile | 5.456 ms | 0 - 239 MB | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS7790 | 17.073 ms | 0 - 390 MB | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS8750 | 5.456 ms | 0 - 239 MB | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS7181 | 9.751 ms | 181 - 181 MB | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Snapdragon® 8 Gen 3 Mobile | 70.212 ms | 59 - 410 MB | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Snapdragon® 8 Elite Mobile | 56.595 ms | 61 - 327 MB | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Snapdragon® 8 Elite Gen 5 Mobile | 48.199 ms | 60 - 333 MB | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Qualcomm® QCS8750 | 56.595 ms | 61 - 327 MB | NPU |
| VIT | QNN_DLC | float | Snapdragon® X2 Elite | 3.52 ms | 1 - 1 MB | NPU |
| VIT | QNN_DLC | float | Snapdragon® X Elite | 8.3 ms | 1 - 1 MB | NPU |
| VIT | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 5.393 ms | 0 - 251 MB | NPU |
| VIT | QNN_DLC | float | Snapdragon® 8 Gen 1 Mobile | 13.078 ms | 1 - 229 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS8275 | 34.807 ms | 1 - 168 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 7.715 ms | 1 - 400 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS8450 | 13.078 ms | 1 - 229 MB | NPU |
| VIT | QNN_DLC | float | Snapdragon® 8 Elite Mobile | 3.754 ms | 0 - 155 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® SA7255P | 34.807 ms | 1 - 168 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® SA8295P | 12.955 ms | 1 - 145 MB | NPU |
| VIT | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 2.931 ms | 1 - 157 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS9075 | 10.721 ms | 1 - 3 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS8750 | 3.754 ms | 0 - 155 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS7181 | 8.3 ms | 1 - 1 MB | NPU |
| VIT | QNN_DLC | w8a16 | Snapdragon® X2 Elite | 3.585 ms | 0 - 0 MB | NPU |
| VIT | QNN_DLC | w8a16 | Snapdragon® X Elite | 8.26 ms | 0 - 0 MB | NPU |
| VIT | QNN_DLC | w8a16 | Snapdragon® 8 Gen 3 Mobile | 5.088 ms | 0 - 321 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® QCS8275 | 17.354 ms | 0 - 258 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® QCS8550 (Proxy) | 7.642 ms | 0 - 35 MB | NPU |
| VIT | QNN_DLC | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 2.85 ms | 0 - 277 MB | NPU |
| VIT | QNN_DLC | w8a16 | Snapdragon® 8 Elite Mobile | 3.887 ms | 0 - 260 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® QCS9075 | 7.974 ms | 0 - 2 MB | NPU |
| VIT | QNN_DLC | w8a16 | Snapdragon® 7 Gen 4 Mobile | 12.979 ms | 0 - 402 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® QCM6690 | 120.068 ms | 0 - 401 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® SA7255P | 17.354 ms | 0 - 258 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® QCS7790 | 12.979 ms | 0 - 402 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® QCS8750 | 3.887 ms | 0 - 260 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® QCS7181 | 8.26 ms | 0 - 0 MB | NPU |
| VIT | QNN_DLC | w8a8 | Snapdragon® X2 Elite | 4.472 ms | 0 - 0 MB | NPU |
| VIT | QNN_DLC | w8a8 | Snapdragon® X Elite | 10.216 ms | 0 - 0 MB | NPU |
| VIT | QNN_DLC | w8a8 | Snapdragon® 8 Gen 3 Mobile | 6.449 ms | 0 - 305 MB | NPU |
| VIT | QNN_DLC | w8a8 | Snapdragon® 8 Gen 1 Mobile | 13.271 ms | 0 - 312 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS6490 | 50.709 ms | 0 - 2 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS8275 | 28.517 ms | 0 - 201 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS8550 (Proxy) | 9.585 ms | 0 - 118 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS8450 | 13.271 ms | 0 - 312 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS9075 | 9.413 ms | 0 - 2 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® SA7255P | 28.517 ms | 0 - 201 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCM6690 | 188.78 ms | 0 - 501 MB | NPU |
| VIT | QNN_DLC | w8a8 | Snapdragon® 7 Gen 4 Mobile | 19.356 ms | 0 - 358 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® SA8295P | 15.62 ms | 0 - 212 MB | NPU |
| VIT | QNN_DLC | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 3.498 ms | 0 - 209 MB | NPU |
| VIT | QNN_DLC | w8a8 | Snapdragon® 8 Elite Mobile | 5.377 ms | 0 - 204 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS7790 | 19.356 ms | 0 - 358 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS8750 | 5.377 ms | 0 - 204 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS7181 | 10.216 ms | 0 - 0 MB | NPU |
| VIT | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 5.267 ms | 0 - 306 MB | NPU |
| VIT | TFLITE | float | Snapdragon® 8 Gen 1 Mobile | 12.961 ms | 0 - 290 MB | NPU |
| VIT | TFLITE | float | Qualcomm® QCS8275 | 34.607 ms | 0 - 169 MB | NPU |
| VIT | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 7.258 ms | 0 - 5 MB | NPU |
| VIT | TFLITE | float | Qualcomm® SA8775P | 281.845 ms | 3 - 47 MB | CPU |
| VIT | TFLITE | float | Qualcomm® SA8650P | 281.845 ms | 3 - 47 MB | CPU |
| VIT | TFLITE | float | Qualcomm® SA8255P | 281.845 ms | 3 - 47 MB | CPU |
| VIT | TFLITE | float | Qualcomm® QCS8450 | 12.961 ms | 0 - 290 MB | NPU |
| VIT | TFLITE | float | Snapdragon® 8 Elite Mobile | 3.656 ms | 0 - 177 MB | NPU |
| VIT | TFLITE | float | Qualcomm® SA7255P | 34.607 ms | 0 - 169 MB | NPU |
| VIT | TFLITE | float | Qualcomm® SA8295P | 13.301 ms | 0 - 157 MB | NPU |
| VIT | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 3.032 ms | 0 - 171 MB | NPU |
| VIT | TFLITE | float | Qualcomm® QCS9075 | 10.619 ms | 0 - 173 MB | NPU |
| VIT | TFLITE | float | Qualcomm® QCS8750 | 3.656 ms | 0 - 177 MB | NPU |
| VIT | TFLITE | w8a8 | Snapdragon® 8 Gen 3 Mobile | 8.688 ms | 0 - 487 MB | NPU |
| VIT | TFLITE | w8a8 | Snapdragon® 8 Gen 1 Mobile | 21.745 ms | 0 - 449 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS6490 | 144.275 ms | 1 - 100 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS8275 | 35.047 ms | 0 - 385 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS8550 (Proxy) | 12.343 ms | 0 - 3 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® SA8775P | 99.096 ms | 0 - 52 MB | CPU |
| VIT | TFLITE | w8a8 | Qualcomm® SA8650P | 99.096 ms | 0 - 52 MB | CPU |
| VIT | TFLITE | w8a8 | Qualcomm® SA8255P | 99.096 ms | 0 - 52 MB | CPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS8450 | 21.745 ms | 0 - 449 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS9075 | 13.592 ms | 0 - 88 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® SA7255P | 35.047 ms | 0 - 385 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCM6690 | 201.234 ms | 1 - 283 MB | NPU |
| VIT | TFLITE | w8a8 | Snapdragon® 7 Gen 4 Mobile | 29.823 ms | 1 - 215 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® SA8295P | 19.253 ms | 0 - 346 MB | NPU |
| VIT | TFLITE | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 4.553 ms | 0 - 383 MB | NPU |
| VIT | TFLITE | w8a8 | Snapdragon® 8 Elite Mobile | 6.924 ms | 0 - 380 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS7790 | 29.823 ms | 1 - 215 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS8750 | 6.924 ms | 0 - 380 MB | NPU |
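To compare configurations for one chipset, the latencies in the table can be turned into an approximate throughput figure. The following is a minimal sketch using a handful of rows copied from the table for Snapdragon 8 Elite Gen 5 Mobile. The `Row` type and the helper names are hypothetical, and the throughput figure assumes back-to-back single-image inference with no pre- or post-processing cost.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    runtime: str
    precision: str
    chipset: str
    latency_ms: float


# The registered-trademark symbol is dropped from the chipset name for simplicity.
CHIPSET = "Snapdragon 8 Elite Gen 5 Mobile"

# A subset of rows copied from the Performance Summary table.
ROWS = [
    Row("ONNX", "float", CHIPSET, 2.801),
    Row("ONNX", "w8a16", CHIPSET, 1.773),
    Row("QNN_DLC", "w8a16", CHIPSET, 2.85),
    Row("TFLITE", "float", CHIPSET, 3.032),
    Row("TFLITE", "w8a8", CHIPSET, 4.553),
]


def throughput_per_second(latency_ms: float) -> float:
    """Approximate inferences per second for sequential single-image runs."""
    return 1000.0 / latency_ms


def fastest(rows: List[Row], chipset: str) -> Row:
    """Return the lowest-latency row for the given chipset."""
    candidates = [row for row in rows if row.chipset == chipset]
    if not candidates:
        raise ValueError(f"no rows for chipset {chipset!r}")
    return min(candidates, key=lambda row: row.latency_ms)


best = fastest(ROWS, CHIPSET)
print(best.runtime, best.precision, f"{throughput_per_second(best.latency_ms):.0f} inferences/s")
```

Latency is only one axis: the table also reports peak memory and the compute unit, and quantized precisions may change accuracy, so the lowest-latency row is not automatically the best choice for a given application.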
License
- The license for the original implementation of VIT can be found here.
References
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
- Source Model Implementation
Community
- Join our AI Hub Slack community to collaborate, post questions and learn more about on-device AI.
- For questions or feedback, please reach out to us.
