Stream of Search (SoS): Learning to Search in Language
The ability of large language models (LLMs) to reason, plan, and solve complex problems has become increasingly important. The paper “Stream of Search (SoS): Learning to Search in Language” addresses this area, proposing to strengthen the reasoning of LLMs by teaching them to search.
The Challenge
LLMs are often trained on ideal outcomes and rarely see the processes that lead to those outcomes. This limits their ability to learn from mistakes, explore alternative paths, and backtrack when a line of reasoning fails. As a result, LLMs may struggle with tasks that require foresight, and early errors can compound and degrade performance as generation continues.
The Stream of Search (SoS) Framework
The SoS framework addresses these challenges by teaching language models to represent the process of search in a serialized format, essentially a “stream of search” written out as text. The stream captures not only the successful path to a solution but also the exploration and backtracking that happen along the way. By training LMs on a diverse dataset of search trajectories generated by various symbolic search strategies, the authors demonstrate significant improvements in reasoning and problem-solving abilities.
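To make the idea of a serialized search concrete, here is a minimal sketch of a depth-first solver that writes every attempt, dead end, and backtrack into a text trajectory. Everything in it is an assumption made for illustration: the arithmetic puzzle (combine all the numbers with +, -, *, / to hit a target), the function name stream_of_search, and the line format of the log are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def stream_of_search(numbers, target, max_nodes=500):
    """Depth-first search over a toy arithmetic puzzle, logged as text.

    Returns (solved, trajectory). The log format is illustrative only.
    """
    lines = [f"Start: numbers={sorted(numbers)} target={target}"]
    visited = 0

    def dfs(state, depth):
        nonlocal visited
        indent = "  " * depth
        if len(state) == 1:
            # All numbers have been combined; check the result.
            if state[0] == target:
                lines.append(f"{indent}Goal reached: {target}")
                return True
            lines.append(f"{indent}Dead end: {state[0]} != {target}, backtrack")
            return False
        for i, j in combinations(range(len(state)), 2):
            a, b = max(state[i], state[j]), min(state[i], state[j])
            rest = [x for k, x in enumerate(state) if k not in (i, j)]
            candidates = [("+", a + b), ("*", a * b), ("-", a - b)]
            if b != 0 and a % b == 0:
                candidates.append(("/", a // b))
            for op, value in candidates:
                if visited >= max_nodes:
                    lines.append(f"{indent}Search budget exhausted")
                    return False
                visited += 1
                child = sorted(rest + [value])
                lines.append(f"{indent}Try {a} {op} {b} = {value} -> {child}")
                if dfs(child, depth + 1):
                    return True
        return False

    solved = dfs(sorted(numbers), 0)
    lines.append("Solved" if solved else "No solution found")
    return solved, "\n".join(lines)


solved, stream = stream_of_search([2, 3, 4], 10)
print(stream)
```

The printed trajectory includes the failed branches as well as the final solution. That is the property the framework relies on: a model trained on such text sees the mistakes and the recovery from them, not just the answer.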
Key Findings
Improved Accuracy: The SoS pretraining increased search accuracy by 25% compared to models trained solely on optimal search trajectories. This highlights the importance of exposing models to a wide range of exploratory strategies.
Self-Improvement: The SoS framework enables models to self-improve by utilizing policy improvement techniques. The finetuned models solved 36% of previously unsolved problems, showcasing their ability to adapt and refine their search strategies.
Flexibility in Search Strategies: The research indicates that LMs can autonomously explore and implement various search strategies, even discovering new ones during training. This adaptability enhances their performance in complex tasks.
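The self-improvement finding can be pictured as a loop of sampling, verifying, and keeping what works. The sketch below shows only the data-collection step of one such loop, in the spirit of filter-and-finetune methods. It is an assumption about the general shape of the approach, not the authors' procedure: the names verify, collect_improvement_data, and toy_sampler, the "Goal:" line format, and the deterministic stand-in for a model are all hypothetical.

```python
def verify(trajectory, target):
    """Hypothetical checker: the last line must state the target as the goal."""
    return trajectory.strip().splitlines()[-1] == f"Goal: {target}"


def collect_improvement_data(sample_fn, problems, samples_per_problem=4):
    """Keep the first verified trajectory per problem.

    The kept pairs would serve as finetuning data for the next round.
    """
    kept = []
    for problem in problems:
        for _ in range(samples_per_problem):
            trajectory = sample_fn(problem)
            if verify(trajectory, problem["target"]):
                kept.append((problem, trajectory))
                break  # one verified trajectory per problem is enough here
    return kept


def toy_sampler(problem):
    """Stand-in for sampling from a model: only 'solves' even targets."""
    answer = problem["target"] if problem["target"] % 2 == 0 else -1
    return f"Try something\nBacktrack\nGoal: {answer}"


problems = [{"numbers": [2, 3, 4], "target": 10},
            {"numbers": [1, 2, 4], "target": 7}]
data = collect_improvement_data(toy_sampler, problems)
print(len(data), "trajectory kept for finetuning")
```

In a real setup, sample_fn would draw trajectories from the current model and the kept pairs would be used to finetune it before the next round of sampling.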
Conclusion
The Stream of Search (SoS) framework represents a significant step forward in the training of language models. By prioritizing the learning process over mere outcomes, this approach enhances the models’ reasoning capabilities and better prepares them for problems that require exploration and backtracking. As AI continues to evolve, frameworks like SoS may help shape how future systems are trained to reason.
For a deeper dive into the methodology and findings, see the full paper.