Zach

About

Maths undergrad working on AI safety.

I spend most of my time outside coursework on AI safety. Currently running the Nottingham AI Safety Initiative as a Kairos Pathfinder Fellow. Other things I’ve worked on along the way: a campus technical fellowship, a UK AI governance podcast, and some interpretability research.

Zach Levin

The work

Where I’m trying to be useful

Two things at the moment.

The technical side: mechanistic interpretability. Most recently I’ve been working on how refusal directions degrade once published jailbreak frameworks bypass them, using PCA, SVD, and steering vectors. I’ve also studied sparse autoencoders on the interpretability side, and RLHF and Constitutional AI on the alignment-training side.

The organisational side: running the Nottingham AI Safety Initiative as a Kairos Pathfinder Fellow. Designed the curriculum (built on BlueDot Impact’s Technical AI Safety course), brought together the cohort, got the university on board. Before that I co-founded AI Policy Pulse, a UK AI governance podcast.

To follow what we’re doing at NAISI, find us on Instagram and elsewhere via linktr.ee/nottsaisi.

Methods

  • Mech interp
  • PCA
  • SVD
  • Steering vectors
  • Sparse autoencoders
  • RLHF
  • Constitutional AI
  • Jailbreak analysis
  • Grokking

Stack

  • Python
  • TypeScript
  • React
  • Firestore

Experience

Recent and notable

  1. Pathfinder Fellow

    Kairos Pathfinder Fellowship

    Co-founded the Nottingham AI Safety Initiative. Designed the curriculum (built on BlueDot Impact's Technical AI Safety course), brought together the cohort, got the university on board. I run a 2-hour weekly technical fellowship for around ten students and a lecturer, covering interpretability, evaluations, and alignment training.

  2. Course participant

    BlueDot Impact · Technical AI Safety

    30-hour technical curriculum on RLHF, Constitutional AI, data filtration, dangerous-capability evaluations, alignment faking, and mechanistic interpretability via sparse autoencoders. I now use it as the backbone of the Kairos fellowship I run at Nottingham. Submitted a research proposal extending Nanda et al.'s grokking analysis to Fourier-based algorithmic representations in transformers learning modular arithmetic.

  3. Technical AI Safety Researcher

    Impact Research Groups

    Five-person team investigating how refusal directions in open-source LLMs degrade when published jailbreak frameworks bypass them. I owned the long-prompt regime and its signal-extraction problems. Applied PCA, SVD, and direction extraction / steering. Placed third in the cohort.

  4. Podcast Director (co-founder)

    AI Policy Pulse

    Co-founded a UK-focused AI governance podcast aimed at getting Gen Z engaged with catastrophic AI risks. Booked guests via the London Initiative for Safe AI and worked with academics on technically grounded episodes. Handled editing, publication, and marketing myself.

  5. Course participant

    BlueDot Impact · AI Alignment

    Co-authored a project on the effect of online AI-generated misinformation on LLM predictions. The course shaped the editorial direction at AI Policy Pulse afterwards.

The credentialed long form lives on the CV (PDF).

Beyond the work

Other things

  • Global Challenges Project: a programme on biosecurity and AI safety; the iteration I attended ran in Oxford. Met a lot of the people I now know in the UK AI safety community there.
  • Senior Non-Commissioned Officer, Army Cadets, planning and leading training for junior cadets.
  • Volunteered at a local Nottingham charity, preparing and serving meals to students.
  • Worked through Karpathy’s nanoGPT and nanoLLM series on my own; regular practice on Codewars and LeetCode.

Get in touch

Find me here

Easiest way to reach me is to book a slot or email zach@zlevin.uk. If you’re a student thinking about AI safety, an organiser running an AI safety group at another university, or a researcher working on adjacent problems, I’d especially like to hear from you.

Looking for summer 2026 AI safety research positions and related programmes. Get in touch if you have something or know who I should be talking to.