Computer Vision in Industry: Machine Eyes
Computer Vision: When Machines Learn to "See"
Imagine standing on a bottle production line, inspecting 600 bottles per minute for cracks or air bubbles. Your eyes would fatigue after 20 minutes, but an industrial camera connected to an AI system can inspect every bottle at 99.5% accuracy around the clock without tiring.
Industrial Machine Vision is the technology that gives machines the ability to analyze images and make decisions — from defect detection to barcode reading to robot guidance.
Components of an Industrial Vision System
Any industrial vision system consists of five core elements:
┌──────────┐ ┌──────────┐ ┌──────────────┐ ┌──────────┐ ┌──────────┐
│ Lighting │──>│ Camera │──>│ Preprocessing│──>│ Algorithm│──>│ Decision │
└──────────┘ └──────────┘ └──────────────┘ └──────────┘ └──────────┘
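The flow above can be sketched end to end as a toy pipeline. This is a numpy-only sketch with hypothetical stage functions; a simple dark-pixel count stands in for a real inspection algorithm:

```python
import numpy as np

def lighting_and_camera():
    """Stand-in for acquisition: a synthetic 8-bit grayscale frame
    with one dark 'defect' blob on a bright background."""
    frame = np.full((64, 64), 200, dtype=np.uint8)
    frame[20:24, 30:34] = 30  # simulated defect
    return frame

def preprocess(frame):
    """Normalize pixel values to the 0-1 range."""
    return frame.astype(np.float32) / 255.0

def algorithm(img):
    """Toy rule: count pixels much darker than the background."""
    return int((img < 0.3).sum())

def decision(defect_pixels, max_allowed=0):
    """Final accept/reject call."""
    return "Defect" if defect_pixels > max_allowed else "OK"

frame = lighting_and_camera()
verdict = decision(algorithm(preprocess(frame)))
print(verdict)  # → Defect
```

Real systems replace each stage with hardware or a trained model, but the stage boundaries stay the same.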
Lighting: The Most Important Element Everyone Ignores
Consider trying to photograph a shallow scratch on a shiny metal surface — with standard lighting, nothing will show. But with low-angle side lighting, the scratch will stand out as a clear shadow.
| Lighting Type | Use Case | Principle |
|---|---|---|
| Backlighting | Edge inspection, hole detection | Object blocks light = silhouette |
| Diffuse front | Color inspection, print | Even light without reflections |
| Dark field (side) | Scratch and crack detection | Light reflects off defects only |
| Ring light | General inspection, OCR | Even illumination around lens |
| Structured light | 3D measurement | Laser lines projected on surface |
Industrial Cameras
These are not consumer cameras — they are engineered for harsh environments:
| Feature | Consumer Camera | Industrial Camera |
|---|---|---|
| Resolution | 12-48 MP | 0.3-150 MP depending on application |
| Speed | 30-60 fps | Up to 10,000 fps |
| Interface | USB/HDMI | GigE Vision / CoaXPress |
| Operation | Free-running | Triggered by production line |
| Lifespan | 2-3 years | 10+ years |
| Protection | None | IP65/IP67 |
Sensor Types:
- Area scan: Captures a complete frame at once — for discrete objects (bottles, cans)
- Line scan: Captures one line at a time — for continuous surfaces (fabric, paper, steel)
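The difference matters in software too: a line-scan camera delivers one row per trigger, and the application stacks rows into a 2D image as the material moves. A minimal numpy sketch, where `grab_line` is a hypothetical stand-in for the camera driver:

```python
import numpy as np

def grab_line(i, width=2048):
    """Stand-in for a line-scan camera driver: one 8-bit row per trigger."""
    row = np.full(width, 128, dtype=np.uint8)
    if i == 100:          # simulated streak defect on row 100
        row[500:510] = 0
    return row

# Stack rows while the material moves under the camera
n_lines = 256
image = np.stack([grab_line(i) for i in range(n_lines)])
print(image.shape)  # → (256, 2048)
```

The encoder on the line typically drives the trigger, so one row is captured per fixed increment of material travel.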
Image Preprocessing
Before an image reaches the AI algorithm, it needs improvement:
import cv2
import numpy as np

def preprocess_industrial_image(image_path):
    # 1. Read the image as grayscale
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # 2. Contrast enhancement (CLAHE - better than standard histogram equalization)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(img)

    # 3. Noise reduction while preserving edges
    denoised = cv2.bilateralFilter(enhanced, d=9, sigmaColor=75, sigmaSpace=75)

    # 4. Edge detection (useful for classical rule-based checks)
    edges = cv2.Canny(denoised, threshold1=50, threshold2=150)

    # 5. Resize to a standard size (224x224 for CNN input)
    resized = cv2.resize(denoised, (224, 224))

    # 6. Normalize values to the 0-1 range
    normalized = resized.astype(np.float32) / 255.0

    return normalized, edges
# Typical preprocessing pipeline steps
pipeline_steps = [
"Lens distortion correction",
"Brightness and contrast adjustment",
"Noise reduction (Bilateral / Gaussian filter)",
"Grayscale conversion (if color is not relevant)",
"Value normalization",
"Resize",
"Data augmentation (training only)",
]
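The last step, data augmentation, deserves a sketch of its own: during training, each image can be randomly flipped and brightness-shifted so the model does not overfit to one camera position or light level. A minimal numpy-only version (production pipelines usually use a dedicated augmentation library):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def augment(img):
    """Random horizontal flip plus a brightness shift. Training-time only -
    inference images must stay untouched."""
    if rng.random() < 0.5:
        img = img[:, ::-1]              # horizontal flip
    shift = rng.uniform(-0.1, 0.1)      # +/-10% brightness
    return np.clip(img + shift, 0.0, 1.0)

img = np.full((224, 224), 0.5, dtype=np.float32)
aug = augment(img)
print(aug.shape)  # → (224, 224)
```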
Convolutional Neural Networks (CNN)
Imagine teaching a child to distinguish between good and defective parts. You would not explain mathematical rules — you would show them hundreds of examples until they learn on their own. This is exactly what a CNN does.
How CNN Works:
Input Image (224x224x3)
|
v
+-------------------------+
| Conv2D Layer | <- Detects simple edges and lines
| Filters: 32, Size: 3x3 |
+-------------------------+
| MaxPool Layer | <- Reduces image size, keeps important features
| Size: 2x2 |
+-------------------------+
| Conv2D Layer | <- Detects more complex patterns
| Filters: 64 |
+-------------------------+
| Conv2D Layer | <- Detects complete shapes and structures
| Filters: 128 |
+-------------------------+
| Flatten Layer | <- Converts feature map to vector
+-------------------------+
| Dense Layer | <- Makes the final decision
| Outputs: 2 (OK/Defect) |
+-------------------------+
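The diagram above can be checked numerically. Assuming 'same' padding (so a 3x3 convolution keeps the spatial size) and only the single 2x2 pool shown, a few lines of plain Python give each layer's parameter count:

```python
def conv_params(in_ch, out_ch, k=3):
    """Weights + biases of a k x k convolution layer."""
    return (k * k * in_ch + 1) * out_ch

h = w = 224
# Conv2D, 32 filters, 3x3 -> 'same' padding keeps 224x224
p1 = conv_params(3, 32)          # 896
# MaxPool 2x2 -> halves height and width
h, w = h // 2, w // 2            # 112 x 112
# Conv2D, 64 filters
p2 = conv_params(32, 64)         # 18,496
# Conv2D, 128 filters
p3 = conv_params(64, 128)        # 73,856
# Flatten -> vector, then Dense with 2 outputs (OK/Defect)
flat = h * w * 128               # 1,605,632 values
dense = (flat + 1) * 2           # 3,211,266 parameters
print(p1, p2, p3, dense)
```

Note that almost all parameters end up in the dense layer, which is why real architectures pool more aggressively before flattening.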
Popular CNN Models in Industry:
| Model | Size | ImageNet Accuracy | Industrial Use |
|---|---|---|---|
| MobileNetV2 | 14 MB | 72% | Resource-limited edge devices |
| ResNet-50 | 98 MB | 76% | General quality inspection |
| EfficientNet-B3 | 48 MB | 82% | Balance of accuracy and speed |
| VGG-16 | 528 MB | 71% | Feature extraction |
YOLO: Real-Time Detection
Consider a production line where dozens of products pass per second — you need a system that identifies what is in the image and where exactly, at blazing speed.
YOLO (You Only Look Once) looks at the image once and identifies all objects simultaneously:
from ultralytics import YOLO

# Load a pretrained YOLO model
model = YOLO("yolov8n.pt")  # n=nano, s=small, m=medium, l=large

# Train on custom industrial data
model.train(
    data="factory_defects.yaml",  # Data definition file
    epochs=100,
    imgsz=640,
    batch=16,
    device="cuda",  # GPU training
)

# Detect on a new image
results = model("production_line_frame.jpg")
for box in results[0].boxes:
    cls = results[0].names[int(box.cls)]  # Class name
    conf = float(box.conf)  # Confidence score
    x1, y1, x2, y2 = box.xyxy[0].tolist()  # Bounding box coordinates
    print(f"Detected: {cls} (confidence {conf:.1%}) at [{x1:.0f},{y1:.0f},{x2:.0f},{y2:.0f}]")
| YOLO Version | Speed (FPS on GPU) | mAP (COCO, mAP50-95) | Use Case |
|---|---|---|---|
| YOLOv8-nano | 450+ | 37% | Edge devices, Raspberry Pi |
| YOLOv8-small | 350+ | 45% | Fast inspection |
| YOLOv8-medium | 200+ | 50% | Balanced |
| YOLOv8-large | 120+ | 53% | High accuracy |
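"Looks once" still produces many overlapping candidate boxes, which are filtered with non-maximum suppression (NMS): keep the highest-confidence box per object, discard heavy overlaps. Ultralytics applies NMS internally, but the core idea fits in a few lines of plain Python:

```python
def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the best-scoring box, drop heavy overlaps, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # → [0, 2]
```

The second box overlaps the first too much (IoU ≈ 0.68) and is suppressed; the third is a separate object and survives.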
Real-World Industrial Applications
Defect Inspection
Detecting cracks, scratches, and deformations on surfaces:
# Example: Metal surface defect classification
def classify_surface_defect(image):
    """Classify metal surface defects"""
    predictions = model.predict(image)
    defect_classes = {
        0: "OK",
        1: "Scratch",
        2: "Rust",
        3: "Crack",
        4: "Peeling",
        5: "Pit",
    }
    return defect_classes.get(predictions.argmax(), "Unknown")
OCR and Barcode Reading
# Reading barcodes and QR codes
from pyzbar import pyzbar

def read_barcodes(image):
    barcodes = pyzbar.decode(image)
    results = []
    for barcode in barcodes:
        data = barcode.data.decode("utf-8")
        barcode_type = barcode.type  # EAN13, QR, Code128
        results.append({"data": data, "type": barcode_type})
    return results
# Reading printed text (OCR) - e.g., expiry dates
import pytesseract

def read_expiry_date(image):
    # Preprocessing: convert to binary
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    text = pytesseract.image_to_string(binary, config="--psm 7")
    return text.strip()
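OCR output is noisy, so the raw string should be validated before it drives a decision. A sketch that checks the decoded text against a DD/MM/YYYY expiry format (the format is an assumption here; adapt the pattern to your labels):

```python
import re
from datetime import datetime

def parse_expiry(text):
    """Extract and validate a DD/MM/YYYY date from raw OCR text.
    Returns a datetime, or None if no valid date is found."""
    match = re.search(r"\b(\d{2})/(\d{2})/(\d{4})\b", text)
    if not match:
        return None
    try:
        return datetime.strptime(match.group(0), "%d/%m/%Y")
    except ValueError:  # e.g. 31/02/2025: digits look fine, date is invalid
        return None

print(parse_expiry("EXP 25/12/2026"))  # valid date
print(parse_expiry("EXP 2S/12/2O26"))  # OCR digit confusion → None
```

Rejecting unparseable reads (and re-triggering a capture) is usually safer than passing a misread date downstream.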
Optics and Lenses
Choosing the right lens is critical — you need to know:
# Calculating the required focal length
sensor_width_mm = 6.4        # Camera sensor width (1/2" format)
object_width_mm = 200        # Width of the area to image (field of view)
working_distance_mm = 500    # Distance from camera to object

focal_length = (sensor_width_mm * working_distance_mm) / object_width_mm
print(f"Required focal length: {focal_length:.1f} mm")  # 16.0 mm

# Calculating resolution (smallest detectable defect)
camera_pixels_h = 2048
pixel_size_mm = object_width_mm / camera_pixels_h
print(f"Pixel resolution: {pixel_size_mm:.3f} mm/pixel")
print(f"Smallest detectable defect: ~{pixel_size_mm * 3:.2f} mm (3 pixels)")
Lens Selection Guidelines:
- A defect of 0.1 mm requires at least 0.03 mm/pixel resolution
- Telecentric lenses prevent perspective distortion — essential for precise measurement
- Aperture (f-stop) controls depth of field — f/8 to f/11 for industrial applications
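The first guideline above can be inverted to size the camera: given the field of view and the smallest defect to catch, the 3-pixel rule yields the minimum horizontal resolution:

```python
def required_pixels(fov_mm, min_defect_mm, pixels_per_defect=3):
    """Minimum horizontal resolution so the smallest defect
    spans at least `pixels_per_defect` pixels."""
    max_pixel_size_mm = min_defect_mm / pixels_per_defect
    return fov_mm / max_pixel_size_mm

# 200 mm field of view, 0.1 mm defects -> ~0.033 mm/pixel -> 6000 px
print(round(required_pixels(200, 0.1)))  # → 6000
```

So a 200 mm field of view with 0.1 mm defects already calls for a sensor beyond 2048 pixels wide, or splitting the view across multiple cameras.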
Practical Tips for Building an Industrial Vision System
- Start with lighting: 80% of vision project success depends on correct illumination
- Collect real data: Do not rely on internet images — capture from the actual production line
- Balance your dataset: If 99% of products are good, collect extra defect samples
- Test under real conditions: Line vibration, ambient light changes, dust on the lens
- Start with a small model: MobileNet or YOLOv8-nano, then scale up as needed
- Mind the response time: On a fast line running at 1 m/s, 100 ms of latency means the product has already moved 10 cm past the camera
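The last point can be made concrete: line speed and part spacing set the time budget per part, and capture + inference + I/O must fit inside it. A quick check, using example values for speed and spacing:

```python
def time_budget_ms(line_speed_m_s, part_spacing_mm):
    """Milliseconds available per part before the next one arrives."""
    return part_spacing_mm / (line_speed_m_s * 1000) * 1000

def travel_mm(line_speed_m_s, latency_ms):
    """How far a part moves while the system is still deciding."""
    return line_speed_m_s * 1000 * latency_ms / 1000

# 1 m/s line with parts every 150 mm -> 150 ms per part
budget = time_budget_ms(1.0, 150)
# 100 ms total latency -> the part has moved 100 mm (10 cm)
moved = travel_mm(1.0, 100)
print(budget, moved)  # → 150.0 100.0
```

If the latency exceeds the budget, the reject mechanism must be placed downstream and the decision matched to the part by position tracking, not by timing alone.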