Computer Vision in Industry: Machine Eyes
Computer Vision: When Machines Learn to "See"
Imagine standing on a bottle production line, inspecting 600 bottles per minute for cracks or air bubbles. Your eyes would fatigue after 20 minutes, but an industrial camera connected to an AI system can inspect every bottle at 99.5% accuracy around the clock without tiring.
Industrial Machine Vision is the technology that gives machines the ability to analyze images and make decisions — from defect detection to barcode reading to robot guidance.
Components of an Industrial Vision System
Any industrial vision system consists of five core elements:
┌──────────┐ ┌──────────┐ ┌──────────────┐ ┌──────────┐ ┌──────────┐
│ Lighting │──>│ Camera │──>│ Preprocessing│──>│ Algorithm│──>│ Decision │
└──────────┘ └──────────┘ └──────────────┘ └──────────┘ └──────────┘
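The flow above can be sketched end to end as a toy pipeline. This is a numpy-only sketch with hypothetical stage functions; a simple dark-pixel count stands in for a real inspection algorithm:

```python
import numpy as np

def lighting_and_camera():
    """Stand-in for acquisition: a synthetic 8-bit grayscale frame
    with one dark 'defect' blob on a bright background."""
    frame = np.full((64, 64), 200, dtype=np.uint8)
    frame[20:24, 30:34] = 30  # simulated defect
    return frame

def preprocess(frame):
    """Normalize pixel values to the 0-1 range."""
    return frame.astype(np.float32) / 255.0

def algorithm(img):
    """Toy rule: count pixels much darker than the background."""
    return int((img < 0.3).sum())

def decision(defect_pixels, max_allowed=0):
    """Final accept/reject call."""
    return "Defect" if defect_pixels > max_allowed else "OK"

frame = lighting_and_camera()
verdict = decision(algorithm(preprocess(frame)))
print(verdict)  # → Defect
```

Real systems replace each stage with hardware or a trained model, but the stage boundaries stay the same.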
Lighting: The Most Important Element Everyone Ignores
Consider trying to photograph a shallow scratch on a shiny metal surface — with standard lighting, nothing will show. But with low-angle side lighting, the scratch will stand out as a clear shadow.
| Lighting Type | Use Case | Principle |
|---|---|---|
| Backlighting | Edge inspection, hole detection | Object blocks light = silhouette |
| Diffuse front | Color inspection, print | Even light without reflections |
| Dark field (side) | Scratch and crack detection | Light reflects off defects only |
| Ring light | General inspection, OCR | Even illumination around lens |
| Structured light | 3D measurement | Laser lines projected on surface |
Industrial Cameras
These are not consumer cameras — they are engineered for harsh environments:
| Feature | Consumer Camera | Industrial Camera |
|---|---|---|
| Resolution | 12-48 MP | 0.3-150 MP depending on application |
| Speed | 30-60 fps | Up to 10,000 fps |
| Interface | USB/HDMI | GigE Vision / CoaXPress |
| Operation | Free-running | Triggered by production line |
| Lifespan | 2-3 years | 10+ years |
| Protection | None | IP65/IP67 |
Sensor Types:
- Area scan: Captures a complete frame at once — for discrete objects (bottles, cans)
- Line scan: Captures one line at a time — for continuous surfaces (fabric, paper, steel)
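The difference matters in software too: a line-scan camera delivers one row per trigger, and the application stacks rows into a 2D image as the material moves. A minimal numpy sketch, where `grab_line` is a hypothetical stand-in for the camera driver:

```python
import numpy as np

def grab_line(i, width=2048):
    """Stand-in for a line-scan camera driver: one 8-bit row per trigger."""
    row = np.full(width, 128, dtype=np.uint8)
    if i == 100:          # simulated streak defect on row 100
        row[500:510] = 0
    return row

# Stack rows while the material moves under the camera
n_lines = 256
image = np.stack([grab_line(i) for i in range(n_lines)])
print(image.shape)  # → (256, 2048)
```

The encoder on the line typically drives the trigger, so one row is captured per fixed increment of material travel.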
Image Preprocessing
Before an image reaches the AI algorithm, it needs improvement:
import cv2
import numpy as np

def preprocess_industrial_image(image_path):
    # 1. Read the image as grayscale
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # 2. Contrast enhancement (CLAHE - better than standard histogram equalization)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(img)

    # 3. Noise reduction while preserving edges
    denoised = cv2.bilateralFilter(enhanced, d=9, sigmaColor=75, sigmaSpace=75)

    # 4. Edge detection (useful for classical rule-based checks)
    edges = cv2.Canny(denoised, threshold1=50, threshold2=150)

    # 5. Resize to a standard size (224x224 for CNN input)
    resized = cv2.resize(denoised, (224, 224))

    # 6. Normalize values to the 0-1 range
    normalized = resized.astype(np.float32) / 255.0

    return normalized, edges
# Typical preprocessing pipeline steps
pipeline_steps = [
"Lens distortion correction",
"Brightness and contrast adjustment",
"Noise reduction (Bilateral / Gaussian filter)",
"Grayscale conversion (if color is not relevant)",
"Value normalization",
"Resize",
"Data augmentation (training only)",
]
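The last step, data augmentation, deserves a sketch of its own: during training, each image can be randomly flipped and brightness-shifted so the model does not overfit to one camera position or light level. A minimal numpy-only version (production pipelines usually use a dedicated augmentation library):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def augment(img):
    """Random horizontal flip plus a brightness shift. Training-time only -
    inference images must stay untouched."""
    if rng.random() < 0.5:
        img = img[:, ::-1]              # horizontal flip
    shift = rng.uniform(-0.1, 0.1)      # +/-10% brightness
    return np.clip(img + shift, 0.0, 1.0)

img = np.full((224, 224), 0.5, dtype=np.float32)
aug = augment(img)
print(aug.shape)  # → (224, 224)
```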
Convolutional Neural Networks (CNN)
Imagine teaching a child to distinguish between good and defective parts. You would not explain mathematical rules — you would show them hundreds of examples until they learn on their own. This is exactly what a CNN does.
How CNN Works:
Input Image (224x224x3)
|
v
+-------------------------+
| Conv2D Layer | <- Detects simple edges and lines
| Filters: 32, Size: 3x3 |
+-------------------------+
| MaxPool Layer | <- Reduces image size, keeps important features
| Size: 2x2 |
+-------------------------+
| Conv2D Layer | <- Detects more complex patterns
| Filters: 64 |
+-------------------------+
| Conv2D Layer | <- Detects complete shapes and structures
| Filters: 128 |
+-------------------------+
| Flatten Layer | <- Converts feature map to vector
+-------------------------+
| Dense Layer | <- Makes the final decision
| Outputs: 2 (OK/Defect) |
+-------------------------+
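The diagram above can be checked numerically. Assuming 'same' padding (so a 3x3 convolution keeps the spatial size) and only the single 2x2 pool shown, a few lines of plain Python give each layer's parameter count:

```python
def conv_params(in_ch, out_ch, k=3):
    """Weights + biases of a k x k convolution layer."""
    return (k * k * in_ch + 1) * out_ch

h = w = 224
# Conv2D, 32 filters, 3x3 -> 'same' padding keeps 224x224
p1 = conv_params(3, 32)          # 896
# MaxPool 2x2 -> halves height and width
h, w = h // 2, w // 2            # 112 x 112
# Conv2D, 64 filters
p2 = conv_params(32, 64)         # 18,496
# Conv2D, 128 filters
p3 = conv_params(64, 128)        # 73,856
# Flatten -> vector, then Dense with 2 outputs (OK/Defect)
flat = h * w * 128               # 1,605,632 values
dense = (flat + 1) * 2           # 3,211,266 parameters
print(p1, p2, p3, dense)
```

Note that almost all parameters end up in the dense layer, which is why real architectures pool more aggressively before flattening.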
Popular CNN Models in Industry:
| Model | Size | ImageNet Accuracy | Industrial Use |
|---|---|---|---|
| MobileNetV2 | 14 MB | 72% | Resource-limited edge devices |
| ResNet-50 | 98 MB | 76% | General quality inspection |
| EfficientNet-B3 | 48 MB | 82% | Balance of accuracy and speed |
| VGG-16 | 528 MB | 71% | Feature extraction |
YOLO: Real-Time Detection
Consider a production line where dozens of products pass per second — you need a system that identifies what is in the image and where exactly, at blazing speed.
YOLO (You Only Look Once) looks at the image once and identifies all objects simultaneously:
from ultralytics import YOLO

# Load a pretrained YOLO model
model = YOLO("yolov8n.pt")  # n=nano, s=small, m=medium, l=large

# Train on custom industrial data
model.train(
    data="factory_defects.yaml",  # Data definition file
    epochs=100,
    imgsz=640,
    batch=16,
    device="cuda",  # GPU training
)

# Detect on a new image
results = model("production_line_frame.jpg")
for box in results[0].boxes:
    cls = results[0].names[int(box.cls)]  # Class name
    conf = float(box.conf)  # Confidence score
    x1, y1, x2, y2 = box.xyxy[0].tolist()  # Bounding box coordinates
    print(f"Detected: {cls} (confidence {conf:.1%}) at [{x1:.0f},{y1:.0f},{x2:.0f},{y2:.0f}]")
| YOLO Version | Speed (FPS on GPU) | mAP (COCO, mAP50-95) | Use Case |
|---|---|---|---|
| YOLOv8-nano | 450+ | 37% | Edge devices, Raspberry Pi |
| YOLOv8-small | 350+ | 45% | Fast inspection |
| YOLOv8-medium | 200+ | 50% | Balanced |
| YOLOv8-large | 120+ | 53% | High accuracy |
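"Looks once" still produces many overlapping candidate boxes, which are filtered with non-maximum suppression (NMS): keep the highest-confidence box per object, discard heavy overlaps. Ultralytics applies NMS internally, but the core idea fits in a few lines of plain Python:

```python
def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the best-scoring box, drop heavy overlaps, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # → [0, 2]
```

The second box overlaps the first too much (IoU ≈ 0.68) and is suppressed; the third is a separate object and survives.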
Real-World Industrial Applications
Defect Inspection
Detecting cracks, scratches, and deformations on surfaces:
# Example: Metal surface defect classification
def classify_surface_defect(image):
    """Classify metal surface defects"""
    predictions = model.predict(image)
    defect_classes = {
        0: "OK",
        1: "Scratch",
        2: "Rust",
        3: "Crack",
        4: "Peeling",
        5: "Pit",
    }
    return defect_classes.get(predictions.argmax(), "Unknown")
OCR and Barcode Reading
# Reading barcodes and QR codes
from pyzbar import pyzbar

def read_barcodes(image):
    barcodes = pyzbar.decode(image)
    results = []
    for barcode in barcodes:
        data = barcode.data.decode("utf-8")
        barcode_type = barcode.type  # EAN13, QR, Code128
        results.append({"data": data, "type": barcode_type})
    return results
# Reading printed text (OCR) - e.g., expiry dates
import pytesseract

def read_expiry_date(image):
    # Preprocessing: convert to binary
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    text = pytesseract.image_to_string(binary, config="--psm 7")
    return text.strip()
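OCR output is noisy, so the raw string should be validated before it drives a decision. A sketch that checks the decoded text against a DD/MM/YYYY expiry format (the format is an assumption here; adapt the pattern to your labels):

```python
import re
from datetime import datetime

def parse_expiry(text):
    """Extract and validate a DD/MM/YYYY date from raw OCR text.
    Returns a datetime, or None if no valid date is found."""
    match = re.search(r"\b(\d{2})/(\d{2})/(\d{4})\b", text)
    if not match:
        return None
    try:
        return datetime.strptime(match.group(0), "%d/%m/%Y")
    except ValueError:  # e.g. 31/02/2025: digits look fine, date is invalid
        return None

print(parse_expiry("EXP 25/12/2026"))  # valid date
print(parse_expiry("EXP 2S/12/2O26"))  # OCR digit confusion → None
```

Rejecting unparseable reads (and re-triggering a capture) is usually safer than passing a misread date downstream.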
Optics and Lenses
Choosing the right lens is critical — you need to know:
# Calculating the required focal length
sensor_width_mm = 6.4        # Camera sensor width (1/2" format)
object_width_mm = 200        # Width of the area to image (field of view)
working_distance_mm = 500    # Distance from camera to object

focal_length = (sensor_width_mm * working_distance_mm) / object_width_mm
print(f"Required focal length: {focal_length:.1f} mm")  # 16.0 mm

# Calculating resolution (smallest detectable defect)
camera_pixels_h = 2048
pixel_size_mm = object_width_mm / camera_pixels_h
print(f"Pixel resolution: {pixel_size_mm:.3f} mm/pixel")
print(f"Smallest detectable defect: ~{pixel_size_mm * 3:.2f} mm (3 pixels)")
Lens Selection Guidelines:
- A defect of 0.1 mm requires at least 0.03 mm/pixel resolution
- Telecentric lenses prevent perspective distortion — essential for precise measurement
- Aperture (f-stop) controls depth of field — f/8 to f/11 for industrial applications
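The first guideline above can be inverted to size the camera: given the field of view and the smallest defect to catch, the 3-pixel rule yields the minimum horizontal resolution:

```python
def required_pixels(fov_mm, min_defect_mm, pixels_per_defect=3):
    """Minimum horizontal resolution so the smallest defect
    spans at least `pixels_per_defect` pixels."""
    max_pixel_size_mm = min_defect_mm / pixels_per_defect
    return fov_mm / max_pixel_size_mm

# 200 mm field of view, 0.1 mm defects -> ~0.033 mm/pixel -> 6000 px
print(round(required_pixels(200, 0.1)))  # → 6000
```

So a 200 mm field of view with 0.1 mm defects already calls for a sensor beyond 2048 pixels wide, or splitting the view across multiple cameras.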
Practical Tips for Building an Industrial Vision System
- Start with lighting: 80% of vision project success depends on correct illumination
- Collect real data: Do not rely on internet images — capture from the actual production line
- Balance your dataset: If 99% of products are good, collect extra defect samples
- Test under real conditions: Line vibration, ambient light changes, dust on the lens
- Start with a small model: MobileNet or YOLOv8-nano, then scale up as needed
- Mind the response time: On a fast line running at 1 m/s, 100 ms of latency means the product has already moved 10 cm past the camera
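The last point can be made concrete: line speed and part spacing set the time budget per part, and capture + inference + I/O must fit inside it. A quick check, using example values for speed and spacing:

```python
def time_budget_ms(line_speed_m_s, part_spacing_mm):
    """Milliseconds available per part before the next one arrives."""
    return part_spacing_mm / (line_speed_m_s * 1000) * 1000

def travel_mm(line_speed_m_s, latency_ms):
    """How far a part moves while the system is still deciding."""
    return line_speed_m_s * 1000 * latency_ms / 1000

# 1 m/s line with parts every 150 mm -> 150 ms per part
budget = time_budget_ms(1.0, 150)
# 100 ms total latency -> the part has moved 100 mm (10 cm)
moved = travel_mm(1.0, 100)
print(budget, moved)  # → 150.0 100.0
```

If the latency exceeds the budget, the reject mechanism must be placed downstream and the decision matched to the part by position tracking, not by timing alone.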