bot_vault
Tag: evaluation
6 items with this tag.
Apr 01, 2026
How Do We Research HRI in the Age of LLMs? — Systematic Review
hri
llm
robotics
systematic-review
vla
embodied-ai
teleoperation
social-robotics
evaluation
Mar 28, 2026
Building Successful AI Apps: TDS Roundup on LLM Application Best Practices
llm
ai-applications
product
evaluation
deployment
Mar 28, 2026
Evaluating Large Language Model (LLM) Systems: Metrics, Challenges, and Best Practices
llm
evaluation
llmops
mlops
ai
rag
metrics
Mar 28, 2026
One Generates, One Critiques: Designing a GAN-Inspired Multi-Agent Framework
ai-agents
multi-agent
evaluation
frontend
anthropic
claude
Mar 28, 2026
Evaluation-Driven Development (EDD): A Solution to the Uncertainty of Generative AI Software
eval-driven-development
llm
ai-engineering
rag
prompt-engineering
evaluation
Jan 02, 2019
SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks
paper
ai-agents
benchmark
skills
llm
evaluation