Reading on Saurav Panigrahi

Emergent Misalignment

Fri, 01 May 2026 00:00:00 +0000

Selected references on emergent misalignment and broad behavioral shifts from narrow training signals.

Emergent Misalignment: Narrow Finetuning Can Produce Broadly Misaligned LLMs
Introduces the central phenomenon: finetuning on a narrow harmful behavior can produce broader misaligned behavior outside the training domain.
Narrow Misalignment is Hard, Emergent Misalignment is Easy
Useful for thinking about why a broad misalignment direction may be a more stable and efficient solution to the finetuning objective than a narrow one.

Fri, 01 May 2026 00:00:00 +0000

Long-form references on benchmarks, measurement, and what evaluations actually test.

LAB-Bench
Benchmark for language models doing biology research tasks. Useful because it evaluates research-relevant behavior rather than only static factual recall.
FOMO26
Foundation model challenge for brain MRI. Useful as a clinical-domain evaluation reference.

Open Graph Benchmark
Standardized graph ML benchmark suite with datasets, loaders, and evaluators. Useful as a reference point for what benchmark infrastructure can look like.

RoboTwin
Dual-arm robot benchmark using generative digital twins for scalable task and data generation.

Fri, 01 May 2026 00:00:00 +0000

Long-form references on training, infrastructure, and implementation practice.

Frontier Model Training Methodologies
Survey of open frontier training recipes and implementation choices.
Scaling LLMs with JAX
Book-length treatment of distributed training practice.
Beyond Language Modeling: An Exploration of Multimodal Pretraining
From-scratch multimodal pretraining study with useful details on representation choices and scaling behavior.

How to Train the Best Embedding Model in the World
Detailed engineering writeup on embedding model training, label noise, verification, and dataset scale.

CUDA Writeups by Tushar Gautam
Implementation-forward notes on CUDA kernels and optimization.

Fri, 01 May 2026 00:00:00 +0000

Long-form references on biological foundation models, structure prediction, and sequence modeling.

AlphaFold
Foundational protein structure prediction paper.
AlphaFold 3
Extends structure prediction toward biomolecular complexes and interactions.

Fri, 01 May 2026 00:00:00 +0000

Selected references on research taste, engineering judgment, and doing useful technical work.

You and Your Research
Hamming’s classic essay on choosing important problems and organizing a life around serious work.
An Opinionated Guide to ML Research
Practical advice on developing taste and becoming effective in machine learning research.
Principles of Effective Research
A useful frame for research as a skill that can be deliberately improved.
Fast
Examples of ambitious work completed faster than conventional expectations would suggest.

Fri, 01 May 2026 00:00:00 +0000

Long-form references on tool use, agent environments, and reliability loops.

Harness Engineering
Useful framing of agents as systems shaped by environments, specs, feedback, and reliability loops.
Code Mode
A concrete argument for exposing tools through code interfaces rather than forcing every step through chat-level tool calls.
Context Mode
A useful pattern for keeping agent context manageable when tools produce large or noisy outputs.

Agents Learn Their Runtime
Study of persistent versus reset Python interpreters in CodeAct-style training.
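
For readers unfamiliar with the distinction, here is a minimal sketch of what persistent versus reset interpreter state means in practice. It is an illustration only, not the setup used in the study: `run_cells` is a hypothetical helper, and a plain `exec` namespace stands in for a real agent runtime.

```python
def run_cells(cells, persistent):
    """Run code cells in order and report whether each one succeeded.

    Hypothetical helper for illustration; not taken from the referenced study.
    """
    namespace = {}
    results = []
    for source in cells:
        if not persistent:
            # Reset mode: every cell starts from an empty namespace.
            namespace = {}
        try:
            exec(source, namespace)
            results.append("ok")
        except NameError as err:
            # A later cell cannot see names defined by an earlier one.
            results.append(f"NameError: {err}")
    return results


cells = ["x = 21", "y = x * 2"]
print(run_cells(cells, persistent=True))
print(run_cells(cells, persistent=False))
```

With a persistent namespace the second cell can reuse `x`. With a reset namespace it raises `NameError`, so an agent has to restate its setup in every step.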

AI Gave Birth to the 100x Engineer
Long case study on compounding agent workflows with test harnesses and supporting tools.