Novel Logic-Enhanced LLM for Improved Symbolic Reasoning
I'm experimenting with a novel approach that integrates symbolic logic directly into a transformer's attention mechanism. Using a custom spaCy-based logic parser, I generate a "logic mask" that guides the self-attention layers to focus on logical constructs. In preliminary tests with a fine-tuned LLaMA 3 8B model, this method has shown promising improvements on symbolic reasoning tasks (e.g., around 62% accuracy on the FOLIO dataset). I'm eager to hear thoughts and suggestions from the community on further refining this approach. Also, please note I don't have a PhD or a master's in machine learning. Happy to take any criticism, good or bad. :)

https://marqcodes.com/logicEnhanced.html
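The post doesn't include code, so here is a minimal sketch of what a "logic mask" biasing self-attention could look like, under my own assumptions rather than the author's actual implementation: logical connectives are detected at the token level with spaCy, and a soft additive bias nudges attention toward those positions. The cue list `LOGIC_CUES`, the bias strength, and the function names are all illustrative, not taken from the linked write-up.

```python
# Hypothetical sketch: a spaCy-derived "logic mask" used as a soft bias on
# self-attention scores. Requires: pip install spacy torch
import math

import spacy
import torch
import torch.nn.functional as F

# Words treated as logical constructs in this toy example (an assumption,
# not the author's actual parser rules).
LOGIC_CUES = {"if", "then", "and", "or", "not", "all", "some", "every", "no", "implies"}

nlp = spacy.blank("en")  # tokenizer only; no model download needed


def logic_mask(text: str) -> tuple[list[str], torch.Tensor]:
    """Return tokens and a 0/1 mask marking logical-connective positions."""
    doc = nlp(text)
    tokens = [t.text for t in doc]
    mask = torch.tensor([1.0 if t.lower_ in LOGIC_CUES else 0.0 for t in doc])
    return tokens, mask


def logic_biased_attention(x: torch.Tensor, mask: torch.Tensor,
                           bias_strength: float = 2.0) -> torch.Tensor:
    """Single-head self-attention whose scores are nudged toward logic tokens.

    x:    (seq_len, d_model) token embeddings
    mask: (seq_len,) with 1.0 at logic-token positions, 0.0 elsewhere
    """
    d = x.shape[-1]
    q, k, v = x, x, x                             # learned projections omitted for brevity
    scores = q @ k.T / math.sqrt(d)               # (seq_len, seq_len)
    scores = scores + bias_strength * mask        # boost key columns at logic positions
    attn = F.softmax(scores, dim=-1)
    return attn @ v


if __name__ == "__main__":
    text = "If it rains then the ground is wet and the match is cancelled"
    tokens, mask = logic_mask(text)
    x = torch.randn(len(tokens), 16)              # stand-in embeddings
    out = logic_biased_attention(x, mask)
    print(tokens)
    print("logic mask:", mask.tolist())
    print("output shape:", out.shape)
```

A soft additive bias like this only steers attention; a hard mask (setting non-logic positions to -inf) would be a stricter variant, and which of the two the original approach uses isn't stated in the post.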