Zach

About

Maths undergrad working on AI safety.

Most of my time outside coursework goes into AI safety. It’s the problem I most want to work on, and where I think I can actually help. I co-founded the Nottingham AI Safety Initiative last year and run its technical fellowship; on the field-building side, I’m a Kairos Pathfinder Fellow. Other things along the way: a UK AI governance podcast, and some interpretability research.

Zach Levin

The work

Where I’m trying to be useful

Two things at the moment.

The technical side: mechanistic interpretability. Most recently I’ve been working on how refusal directions degrade once published jailbreak frameworks bypass them, using PCA, SVD, and steering vectors. I’ve also covered sparse autoencoders and, on the alignment-training side, RLHF and Constitutional AI.

The organisational side: I co-founded NAISI last year and now run its technical fellowship, a 2-hour weekly seminar where around ten students and a lecturer work through interpretability, evaluations, and alignment training together. The curriculum sits on top of BlueDot Impact’s Technical AI Safety course, tuned for our cohort. I built naisi.uk to run the day-to-day: events, members, the newsletter, internal tasks. On the field-building side, I’m a Kairos Pathfinder Fellow. Before all this I co-founded AI Policy Pulse, a UK AI governance podcast.

The thing I’m most pleased about is harder to put on a CV: that a lecturer chose to come along, that students keep turning up for two hours every week to questions without clean answers, and that there’s a small AI safety community at Nottingham now where there wasn’t one a year ago. Most UK AI safety work still happens in Oxford, Cambridge, or London. Building one more place feels worth doing.

To follow what we’re doing at NAISI, find us on Instagram and elsewhere via linktr.ee/nottsaisi.

Methods

  • Mech interp
  • PCA
  • SVD
  • Steering vectors
  • Sparse autoencoders
  • RLHF
  • Constitutional AI
  • Jailbreak analysis
  • Grokking

Stack

  • Python
  • TypeScript
  • React
  • Firestore

Experience

Recent and notable

  1. Co-founder, technical fellowship lead

    Nottingham AI Safety Initiative

    Co-founded NAISI in 2025 and now run its technical fellowship: a 2-hour weekly seminar where around ten students and a lecturer work through interpretability, evaluations, and alignment training together. The curriculum sits on top of BlueDot Impact's Technical AI Safety course, tuned for our cohort. Built naisi.uk to run the day-to-day: events, members, the newsletter, internal tasks. On the field-building side, I'm a Kairos Pathfinder Fellow.

  2. Course participant

    BlueDot Impact · Technical AI Safety

    30-hour technical curriculum on RLHF, Constitutional AI, data filtration, dangerous-capability evaluations, alignment faking, and mechanistic interpretability via sparse autoencoders. I now use it as the backbone of the Kairos fellowship I run at Nottingham. Submitted a research proposal extending Nanda et al.'s grokking analysis to Fourier-based algorithmic representations in transformers learning modular arithmetic.

  3. Technical AI Safety Researcher

    Impact Research Groups

    Five-person team investigating how refusal directions in open-source LLMs degrade when published jailbreak frameworks bypass them. I owned the long-prompt regime and its signal-extraction problems. Applied PCA, SVD, and direction extraction / steering. Placed third in the cohort.

  4. Podcast Director (co-founder)

    AI Policy Pulse

    Co-founded a UK-focused AI governance podcast aimed at getting Gen Z engaged with catastrophic AI risks. Booked guests via the London Initiative for Safe AI and worked with academics on technically grounded episodes. Handled editing, publication, and marketing myself.

  5. Course participant

    BlueDot Impact · AI Alignment

    Co-authored a project on the effect of online AI-generated misinformation on LLM predictions. The course shaped the editorial direction at AI Policy Pulse afterwards.

The credentialed long form lives on my CV (PDF).

Beyond the work

Other things

  • Global Challenges Project: a programme on biosecurity and AI safety; the iteration I attended ran in Oxford. Met a lot of the people I now know in the UK AI safety community there.
  • Senior Non-Commissioned Officer, Army Cadets, planning and leading training for junior cadets.
  • Volunteered at a local Nottingham charity, preparing and serving meals to students.
  • Worked through Karpathy’s nanoGPT and nanoLLM series on my own; regular practice on Codewars and LeetCode.

Get in touch

Find me here

Easiest way to reach me is to book a slot or email zach@zlevin.uk. If you’re a student thinking about AI safety, an organiser running an AI safety group at another university, or a researcher working on adjacent problems, I’d especially like to hear from you.

Looking for summer 2026 AI safety positions: technical research and field-building (e.g. Generator). Get in touch if you have something or know who I should be talking to.