CodeHashira73 committed on
Commit 71be50a · 1 Parent(s): 6883b66

Add detailed readme

Files changed (1)
  1. README.md +180 -9
README.md CHANGED
@@ -1,12 +1,183 @@
  ---
- title: Visual Defect Inspector
- emoji: 📚
- colorFrom: green
- colorTo: indigo
- sdk: docker
- pinned: false
- license: mit
- short_description: A patchcore based model for detecting anomalies
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Visual Defect Inspector
+
+ An industrial anomaly detection API built on [PatchCore](https://arxiv.org/abs/2106.08265), trained on the [MVTec AD](https://www.mvtec.com/company/research/datasets/mvtec-ad) dataset. Upload an image of a bottle and the API returns an anomaly prediction, an anomaly score, and a heatmap overlay highlighting suspicious regions.
+
+ **Live API:** [codehashira73-visual-defect-inspector.hf.space/docs](https://codehashira73-visual-defect-inspector.hf.space/docs)
+
+ ---
+
+ ## Demo
+
+ | Normal | Defective | Anomaly Heatmap |
+ |--------|-----------|-----------------|
+ | ![normal](assets/normal.png) | ![defective](assets/defective.png) | ![heatmap](assets/heatmap.png) |
+
+ > **Note:** This model is trained on MVTec AD bottle images, which are professionally lit, shot against a white background, and taken from standardized angles. Images from this distribution give reliable results. Random internet images may return false positives due to distribution shift (different backgrounds, lighting, angles).
+
  ---
+
+ ## How It Works
+
+ PatchCore is a memory-based anomaly detection algorithm. Instead of learning what anomalies look like (which is impossible without labeled defect data), it learns what *normal* looks like and flags anything that deviates.
+
+ **Training (offline):**
+ 1. Pass all normal training images through a pretrained CNN backbone (`resnet18`)
+ 2. Extract intermediate feature maps from `layer2` and `layer3`, which capture both low-level textures and mid-level semantics
+ 3. Flatten these into patch-level embeddings, one per spatial location
+ 4. Apply coreset subsampling (ratio=0.1) to compress the embeddings into a representative memory bank, which keeps inference fast without sacrificing much accuracy
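
The coreset step in the training list above can be sketched as greedy farthest-point (k-center) selection. This is a minimal pure-Python illustration under stated assumptions, not the code used in this repository: the function name `greedy_coreset`, the `seed` parameter, and the use of Euclidean distance on plain tuples are all hypothetical choices made for the example. Library implementations typically operate on tensors and may reduce the embedding dimension before selecting.

```python
import math
import random


def greedy_coreset(embeddings, ratio=0.1, seed=0):
    """Select about `ratio` of the embeddings by farthest-point sampling.

    Hypothetical helper: each step adds the embedding that is currently
    farthest from everything already selected, so the subset spreads out
    over the whole set instead of clustering in dense regions.
    """
    n = len(embeddings)
    target = max(1, round(n * ratio))
    first = random.Random(seed).randrange(n)
    selected = [first]
    # min_dist[i] is the distance from embedding i to its closest selected point
    min_dist = [math.dist(e, embeddings[first]) for e in embeddings]
    while len(selected) < target:
        # Pick the embedding that the current selection covers worst
        idx = max(range(n), key=min_dist.__getitem__)
        selected.append(idx)
        # Update coverage distances with the newly selected point
        for i, e in enumerate(embeddings):
            d = math.dist(e, embeddings[idx])
            if d < min_dist[i]:
                min_dist[i] = d
    return [embeddings[i] for i in selected]
```

With two well-separated clusters of patch embeddings and `ratio=0.1`, the selection ends up with a representative from each cluster, which is the property that lets a small memory bank stand in for the full set.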
+
+ **Inference (at API call time):**
+ 1. Pass the uploaded image through the same backbone
+ 2. Extract patch embeddings from the same layers
+ 3. For each patch, find its nearest neighbor in the memory bank and compute the distance
+ 4. A large distance means the patch looks nothing like any normal patch, so it is flagged as anomalous
+ 5. Upsample per-patch distances back to image resolution to produce the anomaly heatmap
+ 6. Take the maximum patch distance as the image-level anomaly score
+ 7. Compare the score against a threshold computed during training to output `NORMAL` or `ANOMALOUS`
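
Inference steps 3, 6, and 7 above can be sketched in a few lines. This is an illustrative pure-Python version, not the deployed implementation: the names `patch_scores` and `classify` are hypothetical, the comparison `score >= threshold` is an assumption about how ties are handled, and the scores here are raw Euclidean distances rather than the 0 to 1 values the API returns. Upsampling to a heatmap (step 5) is omitted.

```python
import math


def patch_scores(patches, memory_bank):
    """Distance from each patch embedding to its nearest memory-bank entry."""
    return [min(math.dist(p, m) for m in memory_bank) for p in patches]


def classify(patches, memory_bank, threshold):
    """Return (label, image_score, per_patch_scores) for one image.

    The image-level score is the largest per-patch distance, so a single
    unusual patch is enough to flag the whole image.
    """
    scores = patch_scores(patches, memory_bank)
    image_score = max(scores)
    # Assumption: scores at or above the threshold count as anomalous
    label = "ANOMALOUS" if image_score >= threshold else "NORMAL"
    return label, image_score, scores
```

The brute-force `min` over the memory bank is what makes coreset subsampling matter: its cost grows with the number of stored embeddings for every patch of every request.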
+
+ ### Why resnet18 over wide_resnet50_2?
+
+ Both backbones performed nearly identically: `wide_resnet50_2` achieved `pixel_AUROC=0.986` vs `resnet18` at `pixel_AUROC=0.978`. Since PatchCore's performance is driven primarily by the coreset memory bank and nearest-neighbor search rather than backbone capacity, the heavier backbone offered no meaningful advantage here. `resnet18` was selected for its faster inference time and lower memory footprint at deployment, with negligible cost to detection performance.
+
  ---
 
+ ## Results
+
+ Trained and evaluated on the **bottle** category of MVTec AD.
+
+ | Backbone | Image AUROC | Pixel AUROC | Image F1 | Model Size |
+ |----------|-------------|-------------|----------|------------|
+ | wide_resnet50_2 | 1.000 | 0.986 | 0.992 | ~1.5 GB |
+ | **resnet18 (deployed)** | **1.000** | **0.978** | **0.992** | **42 MB** |
+
+ Both runs were logged with MLflow under the `visual-defect-inspector` experiment.
+
+ ---
+
+ ## API Usage
+
+ ### `GET /health`
+ Health check endpoint.
+
+ ```bash
+ curl https://codehashira73-visual-defect-inspector.hf.space/health
+ ```
+
+ **Response:**
+ ```json
+ {"status": "ok"}
+ ```
+
+ ---
+
+ ### `POST /inspect`
+ Upload an image and get an anomaly prediction.
+
+ ```bash
+ curl -X POST \
+   https://codehashira73-visual-defect-inspector.hf.space/inspect \
+   -F "file=@bottle.png"
+ ```
+
+ **Response:**
+ ```json
+ {
+   "prediction": "ANOMALOUS",
+   "anomaly_score": 0.9753,
+   "heatmap_base64": "/9j/4AAQSkZJRgAB..."
+ }
+ ```
+
+ | Field | Type | Description |
+ |-------|------|-------------|
+ | `prediction` | string | `"NORMAL"` or `"ANOMALOUS"` |
+ | `anomaly_score` | float | Score between 0 and 1. Higher = more anomalous |
+ | `heatmap_base64` | string | Base64-encoded JPEG of the original image overlaid with the anomaly heatmap (blue = normal, red = anomalous) |
+
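+ To call the endpoint and render the heatmap in Python (this example uses the `requests` package, which is an assumption; any HTTP client works):
+ ```python
+ import base64
+ import io
+
+ import requests
+ from PIL import Image
+
+ URL = "https://codehashira73-visual-defect-inspector.hf.space/inspect"
+
+ # Send the image as multipart form data, mirroring the curl example
+ with open("bottle.png", "rb") as f:
+     response = requests.post(URL, files={"file": f}, timeout=60).json()
+
+ # Decode the base64 JPEG and display it
+ heatmap_bytes = base64.b64decode(response["heatmap_base64"])
+ image = Image.open(io.BytesIO(heatmap_bytes))
+ image.show()
+ ```
+
+ ---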
+
+ ## Project Structure
+
+ ```
+ visual-defect-inspector/
+ ├── app/
+ │   ├── __init__.py
+ │   ├── main.py              # FastAPI app: routes, CORS, validation
+ │   └── inference.py         # Model loading (singleton) + predict logic
+ ├── saved_model/
+ │   └── weights/
+ │       └── torch/
+ │           └── model.pt     # PatchCore model with memory bank (via Git LFS)
+ ├── Anomaly_detection.ipynb  # Training, evaluation, MLflow logging
+ ├── Dockerfile
+ ├── requirements.txt
+ └── .gitignore
+ ```
+
+ ---
+
+ ## Tech Stack
+
+ | Component | Tool |
+ |-----------|------|
+ | Anomaly detection | [anomalib](https://github.com/openvinotoolkit/anomalib) |
+ | Backbone | ResNet18 (PyTorch) |
+ | Experiment tracking | MLflow |
+ | API framework | FastAPI + Uvicorn |
+ | Image processing | OpenCV, Pillow |
+ | Containerization | Docker |
+ | Deployment | Hugging Face Spaces |
+ | Model storage | Git LFS |
+
+ ---
+
+ ## Local Setup
+
+ **Prerequisites:** Python 3.10+, Git LFS installed
+
+ ```bash
+ # Clone the repo
+ git clone https://github.com/JeremiahAdebayo/visual-defect-inspector.git
+ cd visual-defect-inspector
+
+ # Install dependencies
+ pip install -r requirements.txt
+
+ # Run the API
+ uvicorn app.main:app --reload
+ ```
+
+ The API will be available at `http://127.0.0.1:8000/docs`.
+
+ **With Docker:**
+ ```bash
+ docker build -t visual-defect-inspector .
+ docker run -p 7860:7860 visual-defect-inspector
+ ```
+
+ ---
+
+ ## Limitations
+
+ - Trained on a single MVTec AD category (bottle). Does not generalize to other object types without retraining.
+ - Sensitive to distribution shift: images must closely resemble the MVTec training distribution (white background, controlled lighting, top-down angle) for reliable results.
+ - Anomaly threshold is fixed at training time. May need recalibration for production use cases with different defect types.
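
For the threshold recalibration mentioned in the last limitation, one common approach is to pick the threshold that maximizes F1 on a small labeled validation set. The sketch below assumes that approach; the source does not say how the deployed threshold was computed, and the function name `best_f1_threshold` and the `>=` comparison are hypothetical.

```python
def best_f1_threshold(scores, labels):
    """Return (threshold, f1) maximizing F1 over the candidate thresholds.

    scores: image-level anomaly scores
    labels: 1 for anomalous images, 0 for normal ones
    Assumption: a score at or above the threshold is predicted anomalous.
    """
    best_t, best_f1 = None, -1.0
    # Every distinct observed score is a candidate threshold
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        # Strict comparison keeps the lowest threshold among ties
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1
```

Running this on scores collected from the target production images, rather than the MVTec test split, is what adapts the decision boundary to a new defect distribution.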
+
+ ---
+
+ ## Author
+
+ **Jeremiah Adebayo**
+ 3rd Year Information Technology Student, University of Iloilo
+ [GitHub](https://github.com/JeremiahAdebayo) · [Hugging Face](https://huggingface.co/CodeHashira73)