A Preliminary Study on Design Rule Derivation from Graphical Representations Using Multimodal Large Language Models
DOI: 10.35490/EC3.2025.274
Abstract: Automated design compliance checking has often overlooked graphical descriptions in regulatory clauses. This study investigates the use of multimodal large language models (MLLMs) to derive executable rules from both text and graphics. Using 23 clause-graphic pairs from accessibility regulations, each with a Prolog-based ground truth rule, MLLM outputs were evaluated under three input conditions: text-only (F1: 0.74), graphic-only (F1: 0.43), and combined (F1: 0.96). The combined condition outperformed both single-modality conditions, indicating that text and graphics carry complementary information and that MLLMs have potential for accurate rule derivation in design compliance.
Keywords: Automated compliance checking, Design compliance checking, Graphical representation, Multimodal large language model, Rule derivation
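
The F1 scores reported in the abstract combine precision and recall into a single number. The abstract does not state how matches between derived and ground truth rules were counted, so the following is only a minimal sketch under the assumption that each rule is broken into a set of comparable elements and matched exactly. The function name `f1_score` and the example rule elements are hypothetical and are not taken from the study.

```python
def f1_score(derived: set[str], truth: set[str]) -> float:
    """F1 between derived rule elements and ground truth elements.

    Assumes exact, set-based matching of elements; the study's actual
    matching procedure is not described in the abstract.
    """
    true_positives = len(derived & truth)
    if true_positives == 0:
        # Covers empty inputs and the case of no overlap at all.
        return 0.0
    precision = true_positives / len(derived)  # share of derived elements that are correct
    recall = true_positives / len(truth)       # share of ground truth elements recovered
    return 2 * precision * recall / (precision + recall)


# Hypothetical rule elements, for illustration only.
truth = {"ramp_slope =< 1/12", "ramp_width >= 1200", "landing_length >= 1500"}
derived = {"ramp_slope =< 1/12", "ramp_width >= 1200", "handrail_height >= 900"}

score = f1_score(derived, truth)
print(f"F1 = {score:.2f}")
```

In this illustration two of the three derived elements match the ground truth, so precision and recall are both 2/3 and the F1 score equals 2/3 as well.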