
Distributed AI in manufacturing.

When we work in industrial contexts (manufacturing, mechanical components, process plants), distributed AI is not an abstract aspiration. It is a technical consequence of three real constraints: production data does not leave the site; legacy systems (MES, SCADA, PLM, ERP) carry twenty years of accumulated logic that works; and the margin for error is narrow, because every line stoppage costs money.

This page is not a sales brochure. It is a laboratory essay: what we observe when applying distributed AI to industrial processes, which architectures hold up in production and which collapse, and the practices we have adopted in this context. The vocabulary is that of applied research, not consulting.

01 / Observation

What we see when applying AI to industrial processes.

Three kinds of problem reach our contact form from manufacturing.

The first is poorly conceived predictive maintenance. The company has installed sensors on its machines, has collected telemetry for years, and has tried to develop a predictive maintenance model, only to see it produce 30% false positives in the first weeks of operation. Often the cause is not the model itself. Every machine is in reality an exception (different configuration, different production year, different component suppliers), and a global model does not capture that variance. Federated learning with one model per line or per machine family, trained on local data without centralization, reduces false positives because each model sees only coherent variance.
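The effect of fitting one model per machine family can be illustrated with a deliberately small sketch. It is not federated learning and not a real predictive maintenance model: it fits a simple alarm threshold (mean plus k standard deviations) on invented vibration readings, once on pooled data and once per family, then counts the alarms raised on held-out readings from healthy machines. The family names, the readings, and the value of k are all assumptions made for illustration.

```python
from statistics import mean, stdev

# Hypothetical healthy-operation vibration readings (mm/s); all values are invented.
# Family A dominates the fleet, family B is a newer line with a higher baseline.
healthy_train = {
    "family_A": [2.0, 2.1, 1.9, 2.2, 1.8] * 8,
    "family_B": [5.0, 5.1, 4.9, 5.2, 4.8],
}
# Held-out readings from machines known to be healthy:
# any alarm raised on these is a false positive.
healthy_holdout = {
    "family_A": [2.0, 2.1, 1.9, 2.2],
    "family_B": [5.0, 5.1, 4.9, 5.2],
}


def fit_threshold(readings, k=2.0):
    """Alarm threshold: mean plus k sample standard deviations of healthy readings."""
    return mean(readings) + k * stdev(readings)


def count_false_positives(threshold_for):
    """Count healthy held-out readings above the threshold assigned to their family."""
    return sum(
        reading > threshold_for(family)
        for family, readings in healthy_holdout.items()
        for reading in readings
    )


# Global model: a single threshold fitted on every reading pooled together.
pooled = [x for readings in healthy_train.values() for x in readings]
global_threshold = fit_threshold(pooled)

# Local models: one threshold per family, fitted only on that family's data.
family_thresholds = {f: fit_threshold(r) for f, r in healthy_train.items()}

global_fp = count_false_positives(lambda family: global_threshold)
local_fp = count_false_positives(lambda family: family_thresholds[family])
print(f"global model false positives: {global_fp}")
print(f"per-family false positives:   {local_fp}")
```

In this toy setup the pooled threshold lands between the two baselines, so healthy readings from the minority family trip the alarm, while each per-family threshold stays above its own family's normal range.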

The second is assisted quality control. The control room receives images from the end-of-line camera in order to reject defective parts. The neural network classifier was trained in the lab and deployed to production, and for the first week it works. Then the data distribution shifts (it is winter, the camera light has a different color temperature, the raw material comes from a different supplier) and accuracy drops. The problem is that the model was trained once and never updates. Our approach is a continuous retraining pipeline in which local data from the production site feeds incremental fine-tuning, with explicit governance (who approves the update, under which confidence criteria) and an audit log for every automated decision.
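The governance step described above (who approves, under which criteria, with what record) could be sketched as a small promotion gate. This is a minimal illustration, not a description of any specific pipeline: the class name `UpdateGate`, the two thresholds, and the log fields are hypothetical, and the accuracy figures would come from an evaluation on held-out local data that is not shown here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class UpdateGate:
    """Decides whether a fine-tuned candidate replaces the model in production.

    Both thresholds are assumptions chosen for illustration.
    """
    min_accuracy: float = 0.97   # candidate must reach this on held-out local data
    min_gain: float = 0.01       # and beat the current model by at least this much
    audit_log: list = field(default_factory=list)

    def decide(self, candidate_acc, current_acc, approver=None):
        meets_criteria = (
            candidate_acc >= self.min_accuracy
            and candidate_acc - current_acc >= self.min_gain
        )
        # Promotion needs both the numeric criteria and a named human approver.
        approved = meets_criteria and approver is not None
        # Every decision is recorded, including the ones that change nothing.
        self.audit_log.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "candidate_acc": candidate_acc,
            "current_acc": current_acc,
            "approver": approver,
            "decision": "promote" if approved else "keep_current",
        })
        return approved


gate = UpdateGate()
gate.decide(0.98, 0.91, approver="quality.engineer")  # criteria met, approver named
gate.decide(0.98, 0.91)                               # criteria met, nobody signed off
gate.decide(0.93, 0.91, approver="quality.engineer")  # approver named, criteria missed
for entry in gate.audit_log:
    print(entry["decision"], entry["approver"])
```

Only the first call promotes the candidate; the other two keep the current model, and all three leave an entry in the log.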

The third is integration with legacy enterprise systems. The company has an MES, a PLM, and an ERP, each installed in a different decade, each with its own naming logic, each with its own data schema. The customer's initial idea is "I'll put an LLM on top and have it read everything". It works in the demo and fails in production: the LLM does not understand that the same article code has three different representations, that fields mandatory in the PLM are optional in the MES, that the ERP has twenty years of workarounds documented in record comments. TORA's approach: first an enterprise knowledge graph that captures the relationships between systems (who is authoritative, where mappings live), then an agentic system that uses this graph as structured context. The LLM is not an omniscient oracle but a verified component inside a wider architecture.
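A minimal sketch of the idea, assuming a very small graph: "same article" edges link the three representations of one article code, and a separate table records which system is authoritative for which attribute. The article codes, the attribute names, and the authority assignments below are invented for illustration; a real enterprise knowledge graph would hold far more relation types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    system: str  # "ERP", "MES" or "PLM"
    code: str    # the article code as that system spells it


# Hypothetical mappings: one article, three representations.
SAME_AS = [
    (Record("ERP", "ART-000123"), Record("MES", "123/A")),
    (Record("ERP", "ART-000123"), Record("PLM", "P123.REV-A")),
]
# Which system is authoritative for which attribute (assumed for illustration).
AUTHORITATIVE = {"bill_of_materials": "PLM", "stock_level": "ERP", "cycle_time": "MES"}


def build_graph(edges):
    """Undirected adjacency map built from same-as pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def equivalents(graph, start):
    """All records reachable from start through same-as edges, start included."""
    seen, stack = {start}, [start]
    while stack:
        for neighbour in graph.get(stack.pop(), ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


def resolve(graph, record, attribute):
    """Return the equivalent record in the system that owns the attribute, or None."""
    owner = AUTHORITATIVE[attribute]
    for candidate in equivalents(graph, record):
        if candidate.system == owner:
            return candidate
    return None


graph = build_graph(SAME_AS)
print(resolve(graph, Record("MES", "123/A"), "bill_of_materials"))
```

An agent that starts from the MES spelling of the code is pointed at the PLM record before any text reaches the LLM, so the model receives a resolved identifier instead of having to guess that the three codes are the same article.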

02 / Architectures

What we see hold up in production.

Three architectural principles we have seen survive from prototype to industrial deployment.

Federated learning as default, centralization as exception. The constraint "data does not leave the site" is not a compliance officer's preference: it is a choice that protects industrial margin. A product matrix, a formula, or a process parameter that leaves the corporate perimeter is an asset at risk of becoming public. AI that demands centralized data in order to train better is asking the company to give up something concrete in exchange for a theoretical benefit. Federated learning reverses the relationship: the model travels, the data stays. Complexity shifts from moving data off site (a legal and contractual problem nobody wants to manage) to orchestration (a technical problem that good engineering can solve). We applied this in Jouelry in the retail domain: 3,056 retailers profiled without centralizing transaction data from any of them. The same logic holds for manufacturing.
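"The model travels, the data stays" can be made concrete with a minimal federated averaging sketch. Each site runs gradient descent on its own data and returns only the model parameters and a sample count; the coordinator averages the parameters weighted by that count. The plant names, the data points (generated from y = 2x + 1), the learning rate, and the number of rounds are assumptions for illustration, and the model is a one-variable linear regression chosen only to keep the example short.

```python
def local_update(weights, site_data, lr=0.1, epochs=20):
    """Gradient descent for y = w*x + b on one site's private data.

    Only the updated weights and the sample count leave the site.
    """
    w, b = weights
    n = len(site_data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in site_data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in site_data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return (w, b), n


def federated_average(updates):
    """Average site models weighted by sample count (the aggregation step)."""
    total = sum(n for _, n in updates)
    w = sum(weights[0] * n for weights, n in updates) / total
    b = sum(weights[1] * n for weights, n in updates) / total
    return (w, b)


# Hypothetical private datasets; the coordinator never reads these directly.
sites = {
    "plant_north": [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    "plant_south": [(0.5, 2.0), (1.5, 4.0), (2.5, 6.0), (3.0, 7.0)],
}

global_model = (0.0, 0.0)
for _ in range(30):
    # The current global model travels to each site and is refined locally.
    updates = [local_update(global_model, data) for data in sites.values()]
    # Only parameters and counts come back to be combined.
    global_model = federated_average(updates)

print(f"w = {global_model[0]:.3f}, b = {global_model[1]:.3f}")
```

The orchestration burden mentioned above lives in the loop at the bottom: scheduling rounds, collecting updates, and deciding how to weight them.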

Domain-specialized multi-model, not one large generalist model. Industry speaks a dense and specific technical vocabulary. A single large generalist LLM is inefficient here: it carries the cost of training on the whole Internet and applies it to a domain where only 1% of the corpus is relevant. An agentic architecture with smaller, scope-specialized models (one for PLM technical documentation, one for operator dialogue, one for image defect classification), each with a verifiable task, has three advantages: lower compute cost per call, traceability of who answered what, and the ability to update one model without touching the others. The technical depth is in Reasonance and VIBE Framework, where multi-model orchestration is codified as a discipline.
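Two of the advantages listed above, traceability and independent updates, can be shown with a small routing sketch. It is an illustration under assumptions and does not describe Reasonance or VIBE Framework: the `Router` and `Specialist` classes, the scope names, and the stub handlers (plain functions standing in for model calls) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Specialist:
    name: str                      # model identifier, for example a version tag
    handler: Callable[[str], str]  # stand-in for a real model call


@dataclass
class Router:
    specialists: dict = field(default_factory=dict)  # scope -> Specialist
    trace: list = field(default_factory=list)        # record of who answered what

    def register(self, scope, specialist):
        # Replacing one scope leaves every other specialist untouched.
        self.specialists[scope] = specialist

    def ask(self, scope, request):
        if scope not in self.specialists:
            raise KeyError(f"no specialist registered for scope {scope!r}")
        specialist = self.specialists[scope]
        answer = specialist.handler(request)
        self.trace.append({"scope": scope, "model": specialist.name, "request": request})
        return answer


router = Router()
router.register("plm_docs", Specialist("plm-docs-v1", lambda q: f"[docs] {q}"))
router.register("operator_dialogue", Specialist("operator-v1", lambda q: f"[chat] {q}"))

router.ask("plm_docs", "torque spec for flange F12")
# Update a single specialist; the operator dialogue model is not redeployed.
router.register("plm_docs", Specialist("plm-docs-v2", lambda q: f"[docs v2] {q}"))
router.ask("plm_docs", "torque spec for flange F12")

for entry in router.trace:
    print(entry["scope"], entry["model"])
```

The trace shows the same request answered first by one model version and then by its replacement, which is the kind of record an audit would ask for.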

RAG on industrial technical documentation, not on everything. The plant manual, the maintenance procedure, the list of approved components: these documents have high information density and low ambiguity. A well-built RAG system over them works. The mistake we see is generic RAG: a vector database fed with the entire corporate content management system (sales decks, tender presentations, marketing documents), so that retrieval pulls from a noisy ocean and answer quality collapses. Our approach is aggressive scoping: at ingestion, separate what is normative, technical, or procedural (reliable RAG) from what is discursive or marketing (excluded, or kept in a separate collection with reduced weight).
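Scoping at ingestion can be sketched as a routing rule plus a per-collection weight applied at ranking time. This is a minimal illustration with no vector database behind it: the document type labels, the collection names, the 0.3 weight, and the similarity scores are assumptions, and the scores would normally come from the retrieval engine.

```python
# Document types treated as normative, technical, or procedural (assumed labels).
TRUSTED_TYPES = {"plant_manual", "maintenance_procedure", "approved_components"}
# Document types treated as discursive or marketing (assumed labels).
DISCURSIVE_TYPES = {"sales_deck", "tender_presentation", "marketing"}

# Reduced weight for the discursive collection; the value is illustrative.
COLLECTION_WEIGHT = {"technical": 1.0, "discursive": 0.3}


def route_document(doc):
    """Return the target collection for a document, or None to exclude it."""
    doc_type = doc.get("doc_type")
    if doc_type in TRUSTED_TYPES:
        return "technical"
    if doc_type in DISCURSIVE_TYPES:
        return "discursive"
    # Unknown types are left out of the index instead of being ingested silently.
    return None


def rank(hits):
    """Order hits by similarity scaled by collection weight.

    hits is a list of (collection, similarity, doc_id) tuples.
    """
    ordered = sorted(hits, key=lambda h: h[1] * COLLECTION_WEIGHT[h[0]], reverse=True)
    return [doc_id for _, _, doc_id in ordered]


docs = [
    {"id": "manual-press-7", "doc_type": "plant_manual"},
    {"id": "deck-2023", "doc_type": "sales_deck"},
    {"id": "notes-misc", "doc_type": "meeting_notes"},
]
for doc in docs:
    print(doc["id"], "->", route_document(doc))

# A sales deck with higher raw similarity still ranks below the manual.
print(rank([("discursive", 0.9, "deck-2023"), ("technical", 0.7, "manual-press-7")]))
```

Setting the discursive weight to zero, or skipping that collection at query time, gives the stricter "excluded" variant.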

03 / Practices

How we work inside an industrial context.

Audit before build. When we enter an industrial context that has already had AI attempts in production, we spend the first 2 to 3 weeks reading the code, the pipelines, the SLAs, and the operational logs of the last 6 weeks. No new feature is proposed in this phase. The output is a document, an internal mini-whitepaper, that says what should be preserved, what should be redone, and where the problem is not AI but raw data, systems engineering, or governance. Often this is the most important phase of the project.

On-prem or hybrid, never cloud-only. Cloud AI is scalable, fast to deploy, and a problem in industrial production. Corporate networks with SCADA segmentation cannot afford continuous dependence on an external endpoint. Even where cloud is authorized, it is for specific use cases (training with anonymized data, long-term archival). Inference is on-prem or on-edge by default. When a company signs a project with TORA, one of the first conversations is: where is the inference machine, who maintains it, what is the rollback plan if the model behaves anomalously.

Every model has a named internal owner. This is not a TORA-exclusive practice; it is what we see working. When a model in production generates a costly error, there must be a person inside the company responsible for evaluating, correcting, and redeploying it. The model as "external vendor's black box" is the pattern that fails most often. Part of our work is defining the transfer of ownership from the start: which internal engineer becomes the model's custodian after we leave the project.
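The deployment questions (where inference runs, who maintains it, what the rollback plan is) and the named-owner rule could be captured in a manifest that is checked before anything ships. The sketch below is one possible shape, not an established format: the field names, the allowed locations, and the example values are hypothetical.

```python
from dataclasses import dataclass

# Inference locations accepted by default (assumed labels).
ALLOWED_INFERENCE = {"on_prem", "edge"}


@dataclass(frozen=True)
class DeploymentManifest:
    model_name: str
    version: str
    inference_location: str  # where the inference machine is
    maintainer: str          # who maintains that machine
    internal_owner: str      # named person inside the company
    rollback_version: str    # version restored if the model misbehaves


def validate(manifest):
    """Return a list of problems; an empty list means the manifest is acceptable."""
    problems = []
    if manifest.inference_location not in ALLOWED_INFERENCE:
        problems.append(f"inference location {manifest.inference_location!r} is not on-prem or edge")
    if not manifest.maintainer.strip():
        problems.append("no maintainer for the inference machine")
    if not manifest.internal_owner.strip():
        problems.append("no named internal owner")
    if not manifest.rollback_version.strip():
        problems.append("no rollback version")
    elif manifest.rollback_version == manifest.version:
        problems.append("rollback version is the same as the deployed version")
    return problems


manifest = DeploymentManifest(
    model_name="defect-classifier",
    version="1.4.0",
    inference_location="cloud",
    maintainer="plant IT",
    internal_owner="",
    rollback_version="1.3.2",
)
for problem in validate(manifest):
    print("blocked:", problem)
```

The example manifest is rejected twice, once for the cloud-only inference location and once for the missing owner, before any model reaches the line.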

Internal whitepaper as deliverable, not just code. At project end, beyond the system in production, we deliver a technical document that describes the architectural choices, the alternatives considered, and the errors encountered. It is the system's operating manual for the next five years: when the internal engineer changes, when supply conditions change, when the company has to justify in an audit why the model makes the decisions it makes. Code without its technical explanation is half an asset.

Why this page exists

We work with industry because it is one of the contexts in which the constraints TORA places on its own operation — federated learning, data sovereignty, open source, verifiable systems — are not ideology: they are technical necessities. A manufacturer that integrates AI into processes cannot give up data sovereignty. Cannot depend on a single vendor. Cannot have models that cannot be inspected. The laboratory's structural choices coincide with the sector's requirements.

This does not mean we work only with industry. It means that inside this sector our practices and our constraints are the most direct match. For the operational detail of how we work together — who comes in, in which roles, with what cadence — the page is Applied Lab. For what enters companies concretely — agentic systems, federated learning, reliable RAG, AI engineering on legacy systems — the reference is How we work.
