Pull requests: huggingface/lighteval

Pull requests list

feat: add MathVista benchmark
#1081 opened Nov 22, 2025 by omkar-334 Draft
Feature/tvd mi metric
#1080 opened Nov 22, 2025 by zrobertson466920
[EVAL] MultiChallenge
#1075 opened Nov 21, 2025 by akshathmangudi
[EVAL] Long Horizon Execution
#1074 opened Nov 21, 2025 by akshathmangudi
4 tasks done
feat: Add Kyrgyz LLM Bench multilingual tasks
#1070 opened Nov 19, 2025 by golden-ratio
diskcache for caching
#1068 opened Nov 19, 2025 by f14-bertolotti
batched metric was not aggregated properly
#1067 opened Nov 18, 2025 by f14-bertolotti
add to inspect
#1065 opened Nov 17, 2025 by NathanHB
graceful shutdown of vllm async
#1064 opened Nov 17, 2025 by f14-bertolotti
Adds Profbench
#1041 opened Nov 6, 2025 by NathanHB
Fix PERPLEXITY task
#1037 opened Nov 4, 2025 by ScottHoang
Legal NLP tasks on Swiss data
#1032 opened Oct 31, 2025 by rolshoven
Add support to vllm==0.11.0
#1027 opened Oct 22, 2025 by anmarques
Wrap vllm inputs to compatible with VLLM>=0.10.2
#1003 opened Oct 2, 2025 by JIElite
Fix caching logic
#994 opened Sep 25, 2025 by jxmorris12
Fix deberta overflow error bug
#990 opened Sep 24, 2025 by amstu2
run slow tests against vllm and transformers main
#985 opened Sep 23, 2025 by NathanHB
Add ChartQA new-task
#954 opened Sep 11, 2025 by 0xjunhao