Restaurant Raw Material Inventory Management Using Proximal Policy Optimization (การจัดการวัตถุดิบคงคลังของร้านอาหาร โดยใช้ Proximal Policy Optimization)

Authors

ผศ.ดร.วรพล พงษ์เพ็ชร, นายณัฐวัฒน์ เอกธรรมนิตย

Published

Thai Journal Operation Research: TJOR

Abstract

This research applies the Proximal Policy Optimization (PPO) algorithm, a
reinforcement learning method, to build a forecasting model for raw material stock
in restaurants. The quantity of raw material a restaurant orders and the quantity it
uses both fluctuate daily. Unused stock is left as waste, which ferments and
produces methane gas that rises into the atmosphere and damages the ozone layer.
This research investigated a One-Attribute Model and a Multi-Attribute Model. The
dataset is synthetic, generated from a normal distribution. Model performance was
assessed using the F-statistic, R-squared, and RMSE. Each model was trained for
12 million timesteps. The results showed that the Multi-Attribute Model converged
to the optimal value faster than the One-Attribute Model. Both models achieved an
accuracy of about 82 percent. This research demonstrates that the PPO algorithm
can be used to build an effective raw material forecasting tool.
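
The abstract names two ingredients that are easy to illustrate: synthetic usage data drawn from a normal distribution, and the RMSE and R-squared metrics. The following is a minimal sketch only, not the authors' code. The function names, the mean of 50 units, the standard deviation of 10 units, the 365-day horizon, and the constant-order baseline are all hypothetical values chosen for illustration; the abstract does not state them, and the PPO agent itself is not shown.

```python
import math
import random


def make_synthetic_usage(n_days, mean, std, seed=0):
    """Draw daily usage from a normal distribution, clipped at zero.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    return [max(0.0, rng.gauss(mean, std)) for _ in range(n_days)]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical example: one year of usage for a single raw material.
usage = make_synthetic_usage(365, mean=50.0, std=10.0)

# Naive baseline (an assumption for comparison): order the average every day.
baseline = [sum(usage) / len(usage)] * len(usage)

print(f"RMSE = {rmse(usage, baseline):.2f}, R^2 = {r_squared(usage, baseline):.2f}")
```

A constant-mean baseline explains none of the day-to-day variance, so it gives a natural reference point: a trained forecasting policy would need a lower RMSE and a higher R-squared than this baseline to be useful.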

(2021). การจัดการวัตถุดิบคงคลังของร้านอาหาร โดยใช้ Proximal Policy Optimization. Thai Journal Operation Research: TJOR, 9(1), 45-54.