AI Fundamentals

Computer Vision in Industry: Machine Eyes

Computer Vision: When Machines Learn to "See"

Imagine standing on a bottle production line, inspecting 600 bottles per minute for cracks or air bubbles. Your eyes would fatigue after 20 minutes, but an industrial camera connected to an AI system can inspect every bottle at 99.5% accuracy around the clock without tiring.

Industrial Machine Vision is the technology that gives machines the ability to analyze images and make decisions — from defect detection to barcode reading to robot guidance.

Components of an Industrial Vision System

Any industrial vision system consists of five core elements:

┌──────────┐   ┌──────────┐   ┌──────────────┐   ┌──────────┐   ┌──────────┐
│ Lighting │──>│ Camera   │──>│ Preprocessing│──>│ Algorithm│──>│ Decision │
└──────────┘   └──────────┘   └──────────────┘   └──────────┘   └──────────┘

Lighting: The Most Important Element Everyone Ignores

Consider trying to photograph a shallow scratch on a shiny metal surface — with standard lighting, nothing will show. But with low-angle side lighting, the scratch will stand out as a clear shadow.

Lighting Type     | Use Case                        | Principle
------------------|---------------------------------|---------------------------------
Backlighting      | Edge inspection, hole detection | Object blocks light = silhouette
Diffuse front     | Color inspection, print         | Even light without reflections
Dark field (side) | Scratch and crack detection     | Light reflects off defects only
Ring light        | General inspection, OCR         | Even illumination around lens
Structured light  | 3D measurement                  | Laser lines projected on surface

Industrial Cameras

These are not consumer cameras — they are engineered for harsh environments:

Feature    | Consumer Camera | Industrial Camera
-----------|-----------------|------------------------------------
Resolution | 12-48 MP        | 0.3-150 MP depending on application
Speed      | 30-60 fps       | Up to 10,000 fps
Interface  | USB/HDMI        | GigE Vision / CoaXPress
Operation  | Free-running    | Triggered by production line
Lifespan   | 2-3 years       | 10+ years
Protection | None            | IP65/IP67

Sensor Types:

  • Area scan: Captures a complete frame at once — for discrete objects (bottles, cans)
  • Line scan: Captures one line at a time — for continuous surfaces (fabric, paper, steel)
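The difference matters in software too: a line-scan camera delivers one pixel row per trigger, and the 2D frame is rebuilt by stacking rows as the surface moves past. A minimal numpy sketch, with acquisition simulated by a hypothetical `grab_line` function:

```python
import numpy as np

def build_linescan_image(grab_line, n_lines):
    """Stack single rows from a line-scan camera into a 2D frame."""
    rows = [grab_line(i) for i in range(n_lines)]
    return np.vstack(rows)

# Simulated acquisition: each "line" is one row of 2048 pixels
def grab_line(i):
    return np.full((1, 2048), fill_value=i % 256, dtype=np.uint8)

frame = build_linescan_image(grab_line, n_lines=1024)
print(frame.shape)  # (1024, 2048)
```

In a real system, each `grab_line` call would be triggered by an encoder pulse from the conveyor so that row spacing stays constant regardless of line speed.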

Image Preprocessing

Before an image reaches the AI algorithm, it needs improvement:

import cv2
import numpy as np

def preprocess_industrial_image(image_path):
    # 1. Read the image in grayscale
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")

    # 2. Contrast enhancement (CLAHE - better than standard histogram
    #    equalization because it adapts to local tiles)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(img)

    # 3. Noise reduction while preserving edges
    denoised = cv2.bilateralFilter(enhanced, d=9, sigmaColor=75, sigmaSpace=75)

    # 4. Edge map - a secondary output, useful for rule-based checks
    edges = cv2.Canny(denoised, threshold1=50, threshold2=150)

    # 5. Resize to the network's input size (224x224 for many CNNs)
    resized = cv2.resize(denoised, (224, 224))

    # 6. Normalize values to the 0-1 range
    normalized = resized.astype(np.float32) / 255.0

    return normalized, edges

# Typical preprocessing pipeline steps
pipeline_steps = [
    "Lens distortion correction",
    "Brightness and contrast adjustment",
    "Noise reduction (Bilateral / Gaussian filter)",
    "Grayscale conversion (if color is not relevant)",
    "Value normalization",
    "Resize",
    "Data augmentation (training only)",
]
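The last step, augmentation, multiplies scarce defect samples by applying label-preserving transforms at training time. A minimal numpy sketch with a few illustrative transforms (real pipelines add rotations, crops, and noise):

```python
import numpy as np

def augment(img):
    """Generate simple label-preserving variants of a grayscale image."""
    variants = [
        np.fliplr(img),   # horizontal mirror
        np.flipud(img),   # vertical mirror
        np.rot90(img),    # 90-degree rotation (square images)
        # brightness shift, clipped back to the valid 0-255 range
        np.clip(img.astype(np.int16) + 30, 0, 255).astype(np.uint8),
    ]
    return variants

sample = np.random.randint(0, 256, size=(224, 224), dtype=np.uint8)
print(len(augment(sample)))  # 4 variants per original image
```

Only apply transforms that preserve the label: mirroring a scratch is still a scratch, but aggressive brightness changes can erase the very defect you are trying to detect.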

Convolutional Neural Networks (CNN)

Imagine teaching a child to distinguish between good and defective parts. You would not explain mathematical rules — you would show them hundreds of examples until they learn on their own. This is exactly what a CNN does.

How CNN Works:

Input Image (224x224x3)
    |
    v
+--------------------------+
| Conv2D Layer             | <- Detects simple edges and lines
| Filters: 32, Size: 3x3   |
+--------------------------+
| MaxPool Layer            | <- Reduces image size, keeps important features
| Size: 2x2                |
+--------------------------+
| Conv2D Layer             | <- Detects more complex patterns
| Filters: 64              |
+--------------------------+
| Conv2D Layer             | <- Detects complete shapes and structures
| Filters: 128             |
+--------------------------+
| Flatten Layer            | <- Converts feature map to vector
+--------------------------+
| Dense Layer              | <- Makes the final decision
| Outputs: 2 (OK/Defect)   |
+--------------------------+
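What a single Conv2D + MaxPool pair in the diagram actually computes can be shown in a few lines of plain numpy: one 3x3 filter, stride 1, no padding. A real framework runs this over many filters and channels at once, but the arithmetic is the same:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 3x3 convolution (cross-correlation, as CNN layers compute it)."""
    h, w = img.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

def maxpool2x2(img):
    """Keep the strongest response in each 2x2 block."""
    h, w = img.shape
    return img[:h//2*2, :w//2*2].reshape(h//2, 2, w//2, 2).max(axis=(1, 3))

# A vertical-edge filter, like those the first Conv2D layer learns
edge_kernel = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]])

img = np.zeros((8, 8))
img[:, 4:] = 1.0                 # right half bright: a vertical edge
feat = conv2d(img, edge_kernel)  # strong (here negative) response where
                                 # the window straddles the edge
print(maxpool2x2(feat).shape)    # (3, 3)
```

This is the whole trick: during training, the network does not use hand-made kernels like `edge_kernel` above, it learns the filter values that best separate OK parts from defects.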

Popular CNN Models in Industry:

Model           | Size   | ImageNet Accuracy | Industrial Use
----------------|--------|-------------------|------------------------------
MobileNetV2     | 14 MB  | 72%               | Resource-limited edge devices
ResNet-50       | 98 MB  | 76%               | General quality inspection
EfficientNet-B3 | 48 MB  | 82%               | Balance of accuracy and speed
VGG-16          | 528 MB | 71%               | Feature extraction

YOLO: Real-Time Detection

Consider a production line where dozens of products pass per second — you need a system that identifies what is in the image and where exactly, at blazing speed.

YOLO (You Only Look Once) looks at the image once and identifies all objects simultaneously:

from ultralytics import YOLO

# Load a pretrained YOLO model
model = YOLO("yolov8n.pt")  # n=nano, s=small, m=medium, l=large

# Train on custom industrial data
model.train(
    data="factory_defects.yaml",  # Data definition file
    epochs=100,
    imgsz=640,
    batch=16,
    device="cuda",                # GPU training
)

# Detect on a new image
results = model("production_line_frame.jpg")
for box in results[0].boxes:
    cls = results[0].names[int(box.cls)]     # Class name
    conf = float(box.conf)                    # Confidence score
    x1, y1, x2, y2 = box.xyxy[0].tolist()   # Bounding box coordinates
    print(f"Detected: {cls} (confidence {conf:.1%}) at [{x1:.0f},{y1:.0f},{x2:.0f},{y2:.0f}]")

YOLO Version  | Speed (FPS on GPU) | mAP | Use Case
--------------|--------------------|-----|---------------------------
YOLOv8-nano   | 450+               | 37% | Edge devices, Raspberry Pi
YOLOv8-small  | 350+               | 45% | Fast inspection
YOLOv8-medium | 200+               | 50% | Balanced
YOLOv8-large  | 120+               | 53% | High accuracy

Real-World Industrial Applications

Defect Inspection

Detecting cracks, scratches, and deformations on surfaces:

# Example: Metal surface defect classification
# (`model` is a CNN classifier already trained on labeled surface images)
def classify_surface_defect(image):
    """Classify metal surface defects"""
    predictions = model.predict(image)   # class probabilities
    defect_classes = {
        0: "OK",
        1: "Scratch",
        2: "Rust",
        3: "Crack",
        4: "Peeling",
        5: "Pit",
    }
    return defect_classes.get(int(predictions.argmax()), "Unknown")

OCR and Barcode Reading

# Reading barcodes and QR codes
from pyzbar import pyzbar

def read_barcodes(image):
    barcodes = pyzbar.decode(image)
    results = []
    for barcode in barcodes:
        data = barcode.data.decode("utf-8")
        barcode_type = barcode.type    # EAN13, QR, Code128
        results.append({"data": data, "type": barcode_type})
    return results

# Reading printed text (OCR) - e.g., expiry dates
import cv2
import pytesseract

def read_expiry_date(image):
    # Preprocessing: convert to a clean black-and-white image
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    # --psm 7: tell Tesseract to treat the image as a single line of text
    text = pytesseract.image_to_string(binary, config="--psm 7")
    return text.strip()
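The raw OCR string still has to be parsed and validated before the line can act on it. A sketch using a regular expression; the DD/MM/YYYY format is an assumption, adjust the pattern to whatever your printer actually puts on the product:

```python
import re
from datetime import datetime

def parse_expiry(ocr_text):
    """Extract a DD/MM/YYYY date from noisy OCR output, or None."""
    match = re.search(r"(\d{2})[/.-](\d{2})[/.-](\d{4})", ocr_text)
    if not match:
        return None
    day, month, year = map(int, match.groups())
    try:
        return datetime(year, month, day)
    except ValueError:   # an OCR misread produced an impossible date
        return None

print(parse_expiry("EXP 25/12/2026"))   # 2026-12-25 00:00:00
print(parse_expiry("EXP 99/99/9999"))   # None
```

Returning None on an unreadable or impossible date lets the line controller divert the product for manual inspection instead of silently passing it.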

Optics and Lenses

Choosing the right lens is critical — you need to know:

# Calculating the required focal length
sensor_width_mm = 6.4       # Camera sensor width (1/2" format sensor)
object_width_mm = 200       # Width of the area to image
working_distance_mm = 500   # Distance from camera to object

focal_length = (sensor_width_mm * working_distance_mm) / object_width_mm
print(f"Required focal length: {focal_length:.1f} mm")

# Calculating resolution (smallest detectable defect)
camera_pixels_h = 2048
pixel_size_mm = object_width_mm / camera_pixels_h
print(f"Pixel resolution: {pixel_size_mm:.3f} mm/pixel")
print(f"Smallest detectable defect: ~{pixel_size_mm * 3:.2f} mm (3 pixels)")

Lens Selection Guidelines:

  • A defect of 0.1 mm requires at least 0.03 mm/pixel resolution
  • Telecentric lenses prevent perspective distortion — essential for precise measurement
  • Aperture (f-stop) controls depth of field — f/8 to f/11 for industrial applications
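The first guideline can be turned into a quick camera-sizing check: given the smallest defect to catch and the field of view, how many pixels wide must the sensor be? A sketch using the 3-pixel rule from the resolution calculation above:

```python
def required_pixels(fov_mm, min_defect_mm, pixels_per_defect=3):
    """Horizontal camera resolution needed so the smallest defect
    spans at least `pixels_per_defect` pixels."""
    mm_per_pixel = min_defect_mm / pixels_per_defect
    return int(round(fov_mm / mm_per_pixel))

# 200 mm field of view, 0.1 mm defects -> 0.033 mm/pixel
print(required_pixels(fov_mm=200, min_defect_mm=0.1))  # 6000 pixels wide
```

Run backwards, the same formula tells you when one camera is not enough: if the required width exceeds the sensors you can buy, split the field of view across multiple cameras.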

Practical Tips for Building an Industrial Vision System

  1. Start with lighting: 80% of vision project success depends on correct illumination
  2. Collect real data: Do not rely on internet images — capture from the actual production line
  3. Balance your dataset: If 99% of products are good, collect extra defect samples
  4. Test under real conditions: Line vibration, ambient light changes, dust on the lens
  5. Start with a small model: MobileNet or YOLOv8-nano, then scale up as needed
  6. Mind the response time: at a line speed of 1 m/s, 100 ms of processing means the product has already moved 10 cm
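Tip 6 can be quantified directly: the latency budget follows from the line speed and how far the product may travel before the reject mechanism fires. A minimal sketch:

```python
def displacement_mm(line_speed_m_per_s, latency_ms):
    """How far a product travels while the vision system is processing."""
    return line_speed_m_per_s * 1000 * (latency_ms / 1000.0)

# At 1 m/s line speed, 100 ms of latency means 100 mm (10 cm) of travel
print(displacement_mm(1.0, 100))   # 100.0 mm
```

In practice the budget must cover the whole chain: exposure, image transfer, inference, and the actuation delay of the rejector, not just the model's inference time.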
computer-vision CNN YOLO defect-detection OCR inspection convolutional-networks visual-recognition quality-inspection industrial-camera