Job Market

0002476610

Research Lead - Eval Execution

METR seeks Research Lead - Eval Execution to work in Berkeley, CA. Draft, evaluate & manage methods for evaluating ML systems; develop algorithms for enhancing AI systems; identify weaknesses in evals & propose improvements; run ML experiments on model performance; manage team; publish results. Req’s Master’s in CS & 1 yr of exp. in job offered or similar managerial-level AI eval research position. Req’s working knowledge of LLM agent evals & methods, scaffolding tools for LLM-based agents & knowledge of scoring functions to analyze agent performance, JSON, Docker, Node.js, PostgreSQL, Git, GitHub, Python, YAML, basic frontend development, LLM API development, deep learning, neural network architecture, alignment techniques, item response theory, experimental design, protocol analysis, factor analysis, & thematic analysis. Occasional local telecommuting permitted. $209,181/yr + benefits.

Send resume to: Kris Chari, Operations,

440 N Barranca Ave #3345, Covina, CA 91723
