Blockchain Data Wizard, Analyst or Scientist
Listed on 2025-12-27
IT/Tech
Blockchain / Web3, Data Analyst, Data Engineer, Data Security
Allium makes blockchain data accurate, simple and fast
Blockchain data is hard, messy, and chaotic
When we started out in late 2021, our thesis was simple - blockchain data, despite being public and free, was difficult to understand, clunky to access and troublesome to maintain. Answering a simple question like “Who are the biggest Ethereum token holders over time?” requires an engineering team to run their own RPC nodes, ingest the full history of the blockchain, clean the data, transform the data and finally summon a wizard to cast a complex SQL query.
Accessing data is hard because blockchains are optimized for Writes and not Reads
Why is it so hard? Blockchains have historically been optimized for Writes (getting data onto the blockchain) and less for Reads (getting data OUT of the blockchain). This is because optimization efforts were focused on increasing transaction throughput and building fault-tolerant and scalable consensus algorithms. This neglect makes it hard to get data out efficiently and reliably at scale.
Parsing and interpreting blockchain data requires both deep domain expertise and data manipulation
To quote Tim Roughgarden, Columbia Professor, “Blockchains are (virtual) computers, not databases.” They are Turing machines that support general computations, and anyone can write and deploy their own smart contract for their own use case. This nearly infinite number of use cases leads to the fragmentation of data schemas for different purposes. Standardizing these schemas requires deep domain expertise to turn esoteric technical outputs into clear information for specific concepts like tokens, NFTs, stablecoins and DEXs.
Allium abstracts the complexity with a simple way to query blockchain data
Allium tames the chaos by ingesting, sanitizing, and standardizing all this data. As of this post, the data we’ve archived across 40+ blockchains is in the petabytes and growing exponentially.
Google and Bloomberg had to organize the world's public financial and webpage data; Allium is on a mission to do the same for blockchain data
This is one of the rare times in history where indexing a giant public dataset is sorely needed by all - similar to what Bloomberg did for financial data and what Google did for public webpage data. With this indexed data, we are fortunate to support trailblazers in this industry and play some role in the industry’s most exciting trends:
About our customers
We serve two groups of customers today with the same data but different platforms: Analysts who need to answer data questions about the blockchain (think BI) and Engineers who need highly reliable data queryable in near real time (think application backends). Our customers include the biggest institutions, such as Visa, Stripe and Grayscale, and also the biggest crypto companies, such as Phantom and Uniswap. Allium is one of the few companies in the industry that bridge the blockchain and non-blockchain worlds.
The Role
We love engineers and wizards who love solving new problems every single day. While wizards are not engineers, they contribute the best product roadmap guidance to ensure the engineers build the right things in the right way.
Data Egress - How does one transport 100s of TBs of data around the world without breaking the piggybank?
Handle high traffic - How can we support the biggest applications in this industry and handle 100,000 QPS at peak traffic without going down?
Botnets - This industry is in its early days. How does one catch botnets based on their behavioral patterns?
Fraud (Sybil) Detection - Is it possible to carry the same fraud detection heuristics over into the blockchain world?
Who is real? - What constitutes meaningful and organic transactions on the blockchain?
Bring Your Own Transformation - How do we let our customers design their own APIs and transform their own realtime data streams?
Data Governance - We pride ourselves on our data quality - How can we ensure our data is consistent across every copy and every region 24/7?
AI and LLMs - How does one design the LLM and AI experience on top of our data to lower the…