Neuro-Symbolic Logic Enhancement for Large Language Models (LLMs)
I'm experimenting with a novel approach that integrates symbolic logic directly into a transformer's attention mechanism. Using a custom spaCy-based logic parser, I generate a "logic mask" that guides the self-attention layers to focus on logical constructs. In preliminary tests with a fine-tuned LLaMA 3 8B model, this method has shown promising improvements on symbolic reasoning tasks (e.g., achieving around 62% on the FOLIO dataset). I'm eager to hear thoughts and suggestions from the community on further refining this approach. Also, please note that I don't have a PhD or a master's in machine learning. Happy to take any criticism, good or bad. :)

https://marqcodes.com/logicEnhanced.html
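To make the general mechanism concrete, here is a minimal toy sketch of one way a "logic mask" could feed into attention. This is not the actual implementation from the linked write-up: the keyword cue list stands in for a real spaCy logic parser, the `logic_mask`, `biased_attention`, and `strength` names are illustrative placeholders, and the mask is assumed to be applied as an additive bias on the attention scores before the softmax.

```python
# Minimal sketch (assumptions, not the author's code): a spaCy pass flags tokens
# that look like logical constructs, and the resulting 0/1 vector is added as a
# bias to the attention scores so every query attends more to those tokens.
import spacy
import torch

nlp = spacy.load("en_core_web_sm")  # requires: python -m spacy download en_core_web_sm

# Hypothetical cue list; a real logic parser would use dependency structure, not keywords.
LOGIC_CUES = {"not", "no", "all", "every", "some", "if", "then", "and", "or", "implies"}

def logic_mask(text: str) -> torch.Tensor:
    """Return a 0/1 vector marking tokens that look like logical constructs."""
    doc = nlp(text)
    flags = [1.0 if (tok.lower_ in LOGIC_CUES or tok.dep_ == "neg") else 0.0 for tok in doc]
    return torch.tensor(flags)

def biased_attention(q, k, v, mask_vec, strength=1.0):
    """Scaled dot-product attention with an additive bonus toward 'logic' key tokens.

    q, k, v: (seq_len, d) tensors; mask_vec: (seq_len,) 0/1 vector from logic_mask().
    """
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5         # (seq_len, seq_len)
    scores = scores + strength * mask_vec.unsqueeze(0)  # per-key bias toward logic tokens
    return torch.softmax(scores, dim=-1) @ v

# Toy usage with random projections in place of a real transformer layer
text = "If all birds fly and Tweety is a bird, then Tweety flies."
m = logic_mask(text)
n, d = m.numel(), 16
q = k = v = torch.randn(n, d)
out = biased_attention(q, k, v, m, strength=2.0)
print(m, out.shape)
```

One reason an additive bias (rather than a hard mask that zeroes out non-logical tokens) seems attractive here is that it keeps the attention distribution differentiable and merely reweights it, so ordinary fine-tuning of the base model can still adjust how strongly the logical structure is emphasized.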