The BERDS Benchmark aims to measure retrieval diversity for questions that are opinionated or invite diverse perspectives.
Hung-Ting Chen
timchen0618
AI & ML interests
NLP
Recent Activity
updated a Space 4 days ago: timchen0618/open-wikitable-smoke-viewer
published a Space 5 days ago: timchen0618/open-wikitable-smoke-viewer
updated a Space 6 days ago: timchen0618/open-wikitable-viewer
Organizations
spaces 7
Running
Open-WikiTable Smoke Viewer
📑
Explore Open-WikiTable benchmark results and agent trajectories
Running
Open-WikiTable Viewer
📑
Explore Open-WikiTable dataset and evaluation results
Running
MoNaCo Benchmark Viewer
🧩
Explore MoNaCo benchmark examples and model results
Sleeping
Browsecomp Plus Viz
😻
Render Markdown text into formatted HTML preview
Running
QAMPARI Dev Data Explorer
📚
Explore data and get answers with source citations
Running
Information-Scaffolds Taxonomy v0 Viewer
📐
Explore scaffold taxonomy clusters across datasets
datasets 40
timchen0618/browsecomp-plus-benchmark
Viewer • Updated • 830 • 2.02k
timchen0618/browsecomp-plus-sel-tools-test300-random-seed7-v1
Viewer • Updated • 300 • 69
timchen0618/browsecomp-plus-sel-tools-test300-random-seed6-v1
Viewer • Updated • 300 • 84
timchen0618/browsecomp-plus-sel-tools-test300-random-seed5-v1
Viewer • Updated • 300 • 70
timchen0618/browsecomp-plus-sel-tools-test300-random-seed4-v1
Viewer • Updated • 300 • 67
timchen0618/browsecomp-plus-sel-tools-test300-random-seed3-v1
Viewer • Updated • 300 • 84
timchen0618/browsecomp-plus-sel-tools-test300-random-seed1-v1
Viewer • Updated • 300 • 78
timchen0618/browsecomp-plus-sel-tools-test300-random-seed0-v1
Viewer • Updated • 300 • 94
timchen0618/browsecomp-plus-sel-tools-test300-gemini-3p1-pro-v1
Viewer • Updated • 300 • 155
timchen0618/browsecomp-plus-sel-tools-test300-gemini-2p5-pro-v1
Viewer • Updated • 300 • 76