Senior ML Engineer – Distributed RL & Post-Training Infrastructure (Jamestown area, Town of Poland, New York, USA; IT/Tech)

Location: Town of Poland

Who we are

Affine is pioneering a decentralized, incentive‑driven environment where machine learning models improve through competition, not coordination. Built on Bittensor’s Subnet 64 (Chutes), we reward miners for producing genuinely better models in tasks like program synthesis and code generation. By building a sybil‑proof, decoy‑proof, and overfitting‑resistant system, we’re making reasoning, the pinnacle of intelligence, a scalable commodity.

What you will be doing

As a Senior ML Engineer, you’ll architect the infrastructure that powers our incentivized reinforcement learning competitions. You’ll work on systems that ensure fair evaluation, detect gaming attempts, and drive rapid model iteration. This is a deeply technical role with high autonomy and major influence on the direction of the platform.

Key areas you’ll own

Distributed RL Infrastructure
- Build scalable, fault‑tolerant systems to evaluate models competing across RL environments
- Track Pareto frontiers in real‑time and implement multi‑objective optimization algorithms
- Create sybil‑proofing, decoy detection, and overfitting mitigation mechanisms
Post‑Training Pipelines
- Develop infrastructure for downloading, fine‑tuning, and resubmitting models
- Implement PPO, GRPO, and other RL techniques for coding and reasoning tasks
- Design automated validation and genuine improvement detection frameworks
Validator & Evaluation Systems
- Build high‑throughput model evaluation and ranking infrastructure
- Create real‑time leaderboards and tracking systems for model contributions
- Balance inference loads across a distributed compute network
Anti‑Gaming & Incentives
- Use cryptographic proofs for model ownership and integrity verification
- Implement copy detection, sybil resistance, and dynamic evaluation sets
- Ensure continuous pressure toward genuine progress, not leaderboard exploits
Scale & Performance Engineering
- Optimize for 1000+ model submissions per day
- Develop distributed caching, model diffing, and monitoring systems
- Push the system to scale alongside a growing miner ecosystem

You should have the following

- 5+ years in distributed machine learning systems or competitive ML environments
- Deep knowledge of RL (PPO, GRPO, DPO) and multi‑objective optimization
- Strong backend and infrastructure skills in Python (PyTorch a must)
- Experience with evaluation systems for ML competitions or benchmarks
- Exposure to blockchain, Bittensor, or decentralized protocols is a strong plus
- Bonus: program synthesis, automated reasoning, or participation in ML competitions

Tech stack

Languages: Python
Frameworks: PyTorch, JAX, OpenRLHF, TRL
Infrastructure: Kubernetes, Docker, time‑series DBs, graph DBs
Eval Tools: Custom RL environments, cryptographic detection, model lineage tracking
Distributed Systems: Model serving, inference load balancing, peer‑to‑peer exchanges

What you’ll help us achieve

- Build the infrastructure that transforms competitive machine learning into a global market
- Launch robust anti‑gaming mechanisms and scalable evaluation systems
- Power the next leap in AI advancement by decentralizing intelligence improvement
