Database Fundamentals for Industrial Systems
Why Does a Factory Need a Database?
Imagine a factory with 200 sensors, each recording a reading every second. That is 200 x 60 x 60 x 24 = over 17 million readings per day. Where do you store them? How do you search through them? How do you find out that motor number 7 overheated last Thursday at 3 AM?
CSV files will collapse under that volume. You need a database -- a system purpose-built for storing and retrieving data quickly and safely.
The Three Types of Databases
Relational Databases (SQL)
Data is organized in tables linked by relationships -- like Excel spreadsheets but far more powerful. Each table has predefined columns (a schema).
Examples: PostgreSQL, MySQL, SQLite, Microsoft SQL Server
-- Create machines table
CREATE TABLE machines (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
type TEXT NOT NULL,
location TEXT,
max_temp REAL DEFAULT 85.0,
installed DATE
);
-- Create sensor readings table
CREATE TABLE sensor_readings (
id INTEGER PRIMARY KEY,
machine_id INTEGER REFERENCES machines(id),
timestamp TIMESTAMP NOT NULL,
temperature REAL,
vibration REAL,
pressure REAL
);
-- Insert data
INSERT INTO machines (name, type, location, max_temp)
VALUES ('CNC-01', 'Lathe', 'Hall 3', 90.0);
-- Query: machines that exceeded 85°C today
SELECT m.name, s.temperature, s.timestamp
FROM sensor_readings s
JOIN machines m ON s.machine_id = m.id
WHERE s.temperature > 85.0
AND s.timestamp > CURRENT_DATE
ORDER BY s.temperature DESC;
Strengths:
- Clear, enforced structure (schema)
- Strong relationships between tables (foreign keys)
- Standardized SQL language
- Data integrity guarantees (ACID transactions)
Non-Relational Databases (NoSQL)
These handle flexible data that does not follow a fixed structure -- such as event logs where each event may carry different fields.
Subtypes:
| Type | Description | Example |
|---|---|---|
| Document | Flexible JSON documents | MongoDB, CouchDB |
| Key-Value | Simple key and value pairs | Redis, DynamoDB |
| Graph | Nodes and relationships | Neo4j, SurrealDB |
| Wide-Column | Dynamic columns | Cassandra, ScyllaDB |
// Example: MongoDB document for an industrial event
{
"_id": "evt_20240315_001",
"machine_id": "CNC-01",
"timestamp": "2024-03-15T14:30:05Z",
"event_type": "alarm",
"severity": "high",
"data": {
"temperature": 92.3,
"threshold": 85.0,
"action_taken": "speed_reduced",
"operator": "Ahmed"
},
"tags": ["temperature", "exceeded", "cnc_line"]
}
Strengths:
- Flexible structure -- no predefined schema required
- Excellent handling of nested data
- Scales horizontally with ease
- High write throughput
Time-Series Databases
Designed specifically for data that changes over time -- exactly what sensors produce. Optimized for writing massive volumes of timestamped data.
Examples: InfluxDB, TimescaleDB, QuestDB
-- InfluxDB example: write a sensor reading
INSERT sensor_data,machine=CNC-01,location=hall3
temperature=78.5,vibration=2.3,pressure=145.0
1710500000000000000
-- Query: average temperature every 5 minutes for the last hour
SELECT MEAN(temperature)
FROM sensor_data
WHERE machine = 'CNC-01'
AND time > now() - 1h
GROUP BY time(5m)
Strengths:
- Extreme write speed (millions of points per second)
- Excellent data compression (90%+ space savings)
- Built-in time-based aggregations (hourly, daily, monthly)
- Automatic data expiration (retention policies)
Comprehensive Comparison
| Criterion | SQL | NoSQL | Time-Series |
|---|---|---|---|
| Structure | Fixed schema | Flexible | Fixed with timestamp |
| Best for | Structured, relational data | Diverse, flexible data | Sensor and measurement data |
| Write speed | Good | Excellent | Extreme |
| Queries | Full SQL | Varies by type | Time-based queries |
| Industrial example | Machine and order management | Event logs | Sensor readings |
| Scaling | Vertical | Horizontal | Horizontal |
Indexes: Speeding Up Searches
Without an index, a database scans every row to find what you are looking for. An index works like a book index -- it points you directly to the right location.
-- Without index: scans all readings (very slow with millions of rows)
SELECT * FROM sensor_readings
WHERE machine_id = 5 AND timestamp > '2024-03-15';
-- Create a composite index
CREATE INDEX idx_machine_time
ON sensor_readings (machine_id, timestamp);
-- The same query is now hundreds of times faster
When to create an index:
- Columns you filter on frequently (
WHERE) - Columns you sort by (
ORDER BY) - Columns used in joins (
JOIN)
Warning: each index slightly slows down write operations -- do not create too many.
SurrealDB: A Next-Generation Database
SurrealDB is a modern database that combines SQL, NoSQL, and graph capabilities in a single system. It supports tables, documents, graph relations, and real-time queries.
-- SurrealQL: define a table with constraints
DEFINE TABLE machine SCHEMAFULL;
DEFINE FIELD name ON machine TYPE string;
DEFINE FIELD machine_type ON machine TYPE string;
DEFINE FIELD location ON machine TYPE string;
DEFINE FIELD max_temp ON machine TYPE float DEFAULT 85.0;
DEFINE FIELD installed ON machine TYPE datetime;
-- Create a record
CREATE machine SET
name = 'CNC-01',
machine_type = 'Lathe',
location = 'Hall 3',
max_temp = 90.0,
installed = '2023-06-15T00:00:00Z';
-- Graph relations: machine is located in a department
RELATE machine:cnc01 -> located_in -> department:machining;
-- Query with relations
SELECT name, ->located_in->department.name AS dept
FROM machine
WHERE max_temp > 85.0;
SurrealDB advantages for industry:
- Multi-model: relational + document + graph in one system
- Real-time: live subscriptions to data changes (Live Queries)
- Permissions: fine-grained access control at the field level
- Embedded + Server: runs embedded in your application or as a standalone server
Designing a Factory Database: Practical Example
-- Machines
CREATE TABLE machines (
id SERIAL PRIMARY KEY,
name VARCHAR(50) UNIQUE,
type VARCHAR(30),
department VARCHAR(30),
max_temp REAL,
max_vibration REAL,
status VARCHAR(20) DEFAULT 'active'
);
-- Sensor readings (partitioned by month for performance)
CREATE TABLE readings_2024_03 (
machine_id INTEGER REFERENCES machines(id),
ts TIMESTAMP NOT NULL,
temp REAL,
vibration REAL,
pressure REAL
);
CREATE INDEX idx_readings_machine_ts
ON readings_2024_03 (machine_id, ts);
-- Alarm log
CREATE TABLE alarms (
id SERIAL PRIMARY KEY,
machine_id INTEGER REFERENCES machines(id),
triggered_at TIMESTAMP DEFAULT NOW(),
severity VARCHAR(10), -- low, medium, high, critical
parameter VARCHAR(20), -- temperature, vibration, pressure
value REAL,
threshold REAL,
acknowledged BOOLEAN DEFAULT FALSE,
acknowledged_by VARCHAR(50)
);
-- Query: weekly alarm summary per machine
SELECT
m.name,
COUNT(*) AS total_alarms,
COUNT(*) FILTER (WHERE a.severity = 'critical') AS critical,
MAX(a.value) AS worst_reading
FROM alarms a
JOIN machines m ON a.machine_id = m.id
WHERE a.triggered_at > NOW() - INTERVAL '7 days'
GROUP BY m.name
ORDER BY critical DESC, total_alarms DESC;
How to Choose the Right Database
| Question | Answer | Choice |
|---|---|---|
| Is your data structured and relational? | Yes | SQL (PostgreSQL) |
| Is your data flexible and varied? | Yes | NoSQL (MongoDB) |
| Is your data sensor readings with timestamps? | Yes | Time-Series (InfluxDB) |
| Do you need everything in one system? | Yes | SurrealDB |
| Do you need ultra-fast temporary caching? | Yes | Redis (Key-Value) |
In practice, most smart factories use more than one type -- a relational database for managing machines and orders, and a time-series database for sensor data. The key is to choose based on the nature of your data, not on what is popular.