Senior ML Engineer – Distributed RL & Post-Training Infrastructure
Listed on 2026-01-07
IT/Tech
AI Engineer, Machine Learning / ML Engineer, Systems Engineer
Who we are
Affine is pioneering a decentralized, incentive‑driven environment where machine learning models improve through competition—not coordination. Built on Bittensor’s Subnet 64 (Chutes), we reward miners for producing genuinely better models in tasks like program synthesis and code generation. By building a sybil‑proof, decoy‑proof, and overfitting‑resistant system, we’re making reasoning—the pinnacle of intelligence—a scalable commodity.
What you will be doing
As a Senior ML Engineer, you’ll architect the infrastructure that powers our incentivized reinforcement learning competitions. You’ll work on systems that ensure fair evaluation, detect gaming attempts, and drive rapid model iteration. This is a deeply technical role with high autonomy and major influence on the direction of the platform.
Key areas you’ll own
- Distributed RL Infrastructure
  - Build scalable, fault‑tolerant systems to evaluate models competing across RL environments
  - Track Pareto frontiers in real time and implement multi‑objective optimization algorithms
  - Create sybil‑proofing, decoy detection, and overfitting mitigation mechanisms
- Post‑Training Pipelines
  - Develop infrastructure for downloading, fine‑tuning, and resubmitting models
  - Implement PPO, GRPO, and other RL techniques for coding and reasoning tasks
  - Design automated frameworks for validating submissions and detecting genuine improvement
- Validator & Evaluation Systems
  - Build high‑throughput model evaluation and ranking infrastructure
  - Create real‑time leaderboards and tracking systems for model contributions
  - Balance inference loads across a distributed compute network
- Anti‑Gaming & Incentives
  - Use cryptographic proofs for model ownership and integrity verification
  - Implement copy detection, sybil resistance, and dynamic evaluation sets
  - Ensure continuous pressure toward genuine progress, not leaderboard exploits
- Scale & Performance Engineering
  - Optimize for 1000+ model submissions per day
  - Develop distributed caching, model diffing, and monitoring systems
  - Push the system to scale alongside a growing miner ecosystem
- 5+ years in distributed machine learning systems or competitive ML environments
- Deep knowledge of RL (PPO, GRPO, DPO) and multi‑objective optimization
- Strong backend and infrastructure skills in Python (PyTorch a must)
- Experience with evaluation systems for ML competitions or benchmarks
- Exposure to blockchain, Bittensor, or decentralized protocols is a strong plus
- Bonus: program synthesis, automated reasoning, or participation in ML competitions
- Languages: Python
- Frameworks: PyTorch, JAX, OpenRLHF, TRL
- Infrastructure: Kubernetes, Docker, time‑series DBs, graph DBs
- Eval Tools: Custom RL environments, cryptographic detection, model lineage tracking
- Distributed Systems: Model serving, inference load balancing, peer‑to‑peer exchanges
- Build the infrastructure that transforms competitive machine learning into a global market
- Launch robust anti‑gaming mechanisms and scalable evaluation systems
- Power the next leap in AI advancement by decentralizing intelligence improvement