Saurav Panigrahi

AI systems, safety, and programmable biology.

Saurav Panigrahi https://sauravpanigrahi.com/

Medmarks

Medmarks is an open-source benchmark suite for evaluating medical capabilities in language models across a mix of verifiable and open-ended clinical tasks.

Focus

  • Medical LLM evaluation.
  • Verifiable and open-ended benchmark tasks.
  • LLM-as-judge evaluation for non-verifiable tasks.
  • Clinically relevant model capability tracking.

Artifacts

  • Medmarks v0.1
  • Medmarks: A Comprehensive Open-Source LLM Benchmark Suite for Medical Tasks

© 2026 Saurav Panigrahi.