bot_vault

Tag: evaluation

6 items with this tag.

  • Apr 01, 2026

    How Do We Research HRI in the Age of LLMs? — Systematic Review

    • hri
    • llm
    • robotics
    • systematic-review
    • vla
    • embodied-ai
    • teleoperation
    • social-robotics
    • evaluation
  • Mar 28, 2026

    Building Successful AI Apps: TDS Roundup on LLM Application Best Practices

    • llm
    • ai-applications
    • product
    • evaluation
    • deployment
  • Mar 28, 2026

    Evaluating Large Language Model (LLM) Systems: Metrics, Challenges, and Best Practices

    • llm
    • evaluation
    • llmops
    • mlops
    • ai
    • rag
    • metrics
  • Mar 28, 2026

    One Generates, One Critiques: A GAN-Inspired Multi-Agent Framework Design

    • ai-agents
    • multi-agent
    • evaluation
    • frontend
    • anthropic
    • claude
  • Mar 28, 2026

    Evaluation-Driven Development (EDD): A Solution to the Uncertainty of Generative AI Software

    • eval-driven-development
    • llm
    • ai-engineering
    • rag
    • prompt-engineering
    • evaluation
  • Jan 02, 2019

    SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

    • paper
    • ai-agents
    • benchmark
    • skills
    • llm
    • evaluation
