
Research Engineer, Machine Learning

Job in New York, New York County, New York, 10261, USA
Listing for: Poseidon Research
Full Time position
Listed on 2025-12-02
Job specializations:
  • IT/Tech
    Data Scientist, AI Engineer, Systems Engineer
Salary Range: $100,000 – $150,000 USD yearly

Location: Remote or New York City, US

Organization: Poseidon Research

Compensation: $100,000–$150,000 annually, or higher depending on experience

Type: One-year contract

This position is funded through a charitable research grant.

Poseidon Research is an independent AI safety laboratory based in New York City. Our mission is to make advanced AI systems transparent, trustworthy, and governable through deep technical research in interpretability, control, and secure monitoring.

We investigate how models think, hide, and reason: from understanding encoded reasoning and steganography in reasoning models to building open-source monitoring tools that preserve human oversight. Our research spans mechanistic interpretability, reinforcement learning, control, information theory, and cryptography, bridging the theoretical and the practical.

You could be a cog in a big lab and gamble with humanity’s future. Or you could own your entire research platform at Poseidon Research, pioneering the infrastructure needed to accelerate AI safety and build a safe, secure, and prosperous future.

The Role

We are hiring a Research Engineer to implement and scale experiments studying encoded reasoning and steganography in modern reasoning models.

This is a hands-on, highly technical position focused on experiment design, model evaluation, and platform engineering.

You will collaborate closely with research scientists to turn conceptual ideas into reproducible systems by building pipelines, datasets, and model organisms that make opaque behaviors measurable and controllable.

Responsibilities

We’re looking for a creative, rigorous engineer who loves building things in order to understand how safety issues play out in practice. You will:

  • Implement and reproduce prior work on encoded reasoning and steganography, extending it to current open-weight reasoning models (e.g., DeepSeek-R1 and V3, GPT-OSS, QwQ).
  • Develop and maintain modular experiment pipelines for evaluating steganography, encoded reasoning, and reward hacking.
  • Build and test fine‑tuning workflows (SFT or RL‑based) to study emergent encoded reasoning and reward hacking behaviors.
  • Collaborate with our research leads to design safety cases and control-agenda monitoring mechanisms suitable for countering various types of unsafe chain-of-thought.
  • Extend interpretability infrastructure, including probing, feature ablation, and sparse autoencoder (SAE) analysis pipelines, using frameworks like TransformerLens.
  • Engineer datasets and evaluation suites for robust paraphrasing, steganography cover tasks, and monitoring robustness metrics.
  • Collaborate with scientists to identify causal directions and larger‑scale mechanisms (via standard interp, DAS, MELBO, targeted LAT, and related methods) underlying encoded reasoning.
  • Ensure reproducibility through clean code, experiment tracking, and open‑source releases.
  • Contribute to research communication by preparing writeups, visualizations, and benchmark results for research vignettes and publications.
Ideal Candidate

Core Technical Skills
  • Strong Python and PyTorch experience.
  • Experience with LLM experimentation using frameworks such as Hugging Face Transformers, TransformerLens, or equivalent.
  • Building reproducible ML pipelines including data preprocessing, logging, visualization, and evaluation.
  • RL fine-tuning or training small-to-mid-scale models with frameworks like TRL, verl, OpenRLHF, or equivalents.
  • Proficiency with experiment tracking tools such as Weights & Biases or MLflow, and Git.
  • Active proficiency and/or intellectual curiosity working with AI‑assisted coding and research tools such as Claude Code, Codex, Cursor, Roo, Cline or equivalents.
Nice to have
  • Familiarity with interpretability methods such as probing, activation patching, or feature attribution.
  • Understanding of encoded reasoning, steganography, or information‑theoretic approaches to model communication; or some background in formal cryptography, information theory, or offensive cybersecurity.
  • Experience with mechanistic interpretability such as feature visualization, direction ablation, SAEs, crosscoders, and circuit tracing.
  • Background in information security, control, or formal…