Agentic LLM - a pcy Collection

Agentic LLM

updated Nov 6, 2025

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published Sep 2, 2025 • 238

Tongyi DeepResearch Technical Report
Paper • 2510.24701 • Published Oct 28, 2025 • 104

PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-ppo-v0.3
3B • Updated May 21, 2025 • 51.2k

PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-grpo-v0.3
3B • Updated May 21, 2025 • 145 • 1

PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-em-ppo-v0.3
8B • Updated May 21, 2025 • 156k • 1

PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-em-grpo-v0.3
8B • Updated May 21, 2025 • 183

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published Mar 12, 2025 • 40