Multi-Agent Reinforcement Learning Accelerated by Swarm Heuristics for Distributed Control in Smart Manufacturing Lines
Abstract
Smart manufacturing lines integrate heterogeneous machines, conveyors, buffers, and transport systems that must operate under stringent productivity, quality, and energy constraints. Conventional centralized control architectures run into scalability limits as the line grows or its configuration changes frequently. Distributed decision making based on multi-agent reinforcement learning is a natural alternative, but purely data-driven learning often suffers from slow convergence, unstable exploration, and difficulty in exploiting the structural properties of manufacturing processes. This work presents a distributed control framework in which local controllers are modeled as learning agents whose policies are optimized through multi-agent reinforcement learning while their exploration is guided by swarm-based metaheuristics. The swarm layer searches over policy parameters, coordination signals, and shaping rewards, and injects structured perturbations into the learning process to accelerate the discovery of high-performing joint behaviors. The manufacturing line is represented as a network of stations, buffers, and routing elements with local observations and shared global performance indicators. The study outlines a linear state-space abstraction of the line dynamics, formulates the agents' interaction as a cooperative game, and integrates swarm heuristics with policy-gradient and value-based methods. Numerical experiments on synthetic smart-line configurations illustrate how the combined scheme affects throughput, work-in-process, tardiness, and energy usage under varying demand and disturbance patterns. The discussion highlights practical design choices and limitations of embedding swarm heuristics into multi-agent reinforcement learning for distributed control in reconfigurable manufacturing environments.
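
As an illustration of the swarm layer described above, the following minimal Python sketch applies a particle-swarm search to the joint policy parameters of two cooperating agents controlling a toy two-station line with a single buffer. The line surrogate, the softmax policies, the reward definition, and all hyperparameters (n_particles, w, c1, c2, the rollout horizon) are assumptions introduced here for illustration and are not taken from the paper; in the proposed framework the best particle found by the swarm would seed, or periodically perturb, the agents' policy-gradient or value-based learners.

# Minimal sketch (not the authors' implementation): particle-swarm search over
# the joint policy parameters of two learning agents on a toy two-station line.
import numpy as np

rng = np.random.default_rng(0)
N_AGENTS, OBS_DIM, N_ACTIONS = 2, 3, 2        # two stations; actions: {idle, process}

def act(theta, obs):
    # Stochastic softmax policy for one agent; theta has shape (N_ACTIONS, OBS_DIM).
    logits = theta @ obs
    p = np.exp(logits - logits.max()); p /= p.sum()
    return rng.choice(N_ACTIONS, p=p)

def rollout(thetas, horizon=50):
    # Illustrative surrogate of a two-station line with one buffer:
    # reward rewards finished parts and penalizes work-in-process.
    buffer, finished, reward = 0, 0, 0.0
    for _ in range(horizon):
        obs = np.array([1.0, buffer / 5.0, finished / horizon])
        a0, a1 = (act(thetas[i], obs) for i in range(N_AGENTS))
        if a0 == 1 and buffer < 5: buffer += 1                   # station 0 feeds the buffer
        if a1 == 1 and buffer > 0: buffer -= 1; finished += 1    # station 1 drains it
        reward += finished - 0.1 * buffer
    return reward

# Swarm layer: PSO over the concatenated policy parameters of all agents.
DIM = N_AGENTS * N_ACTIONS * OBS_DIM
n_particles, w, c1, c2 = 12, 0.7, 1.5, 1.5
x = rng.normal(size=(n_particles, DIM)); v = np.zeros_like(x)
pbest = x.copy()
pbest_val = np.array([rollout(p.reshape(N_AGENTS, N_ACTIONS, OBS_DIM)) for p in x])
gbest = pbest[pbest_val.argmax()].copy()

for it in range(30):
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # standard PSO velocity update
    x = x + v
    vals = np.array([rollout(p.reshape(N_AGENTS, N_ACTIONS, OBS_DIM)) for p in x])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = x[improved], vals[improved]
    gbest = pbest[pbest_val.argmax()].copy()

print("best joint return found by the swarm:", pbest_val.max())
# In the combined scheme, gbest would initialize or structurally perturb the
# agents' reinforcement learners rather than serve as the final policy.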