Theory Ventures

Blockchain Data Wizard, Analyst or Scientist

Theory Ventures, New York, New York, US, 10261

Allium makes blockchain data accurate, simple and fast

Blockchain data is hard, messy, and chaotic

When we started out in late 2021, our thesis was simple: blockchain data, despite being public and free, was difficult to understand, clunky to access and troublesome to maintain. Answering a simple question like “Who are the biggest Ethereum token holders over time?” requires an engineering team to run their own RPC nodes, ingest the full history of the blockchain, clean the data, transform the data and finally summon a wizard to cast a complex SQL query.

Accessing data is hard because blockchains are optimized for Writes and not Reads

Why is it so hard? Blockchains have historically been optimized for Writes (getting data onto the blockchain) and less for Reads (getting data OUT of the blockchain). This is because optimization efforts were focused on increasing transaction throughput and building fault tolerant and scalable consensus algorithms. This neglect makes it hard to get data out efficiently and reliably at scale.

Parsing and interpreting blockchain data requires both deep domain expertise and data manipulation

To quote Tim Roughgarden, Columbia Professor, “Blockchains are (virtual) computers, not databases.” They are Turing machines that support general computations, and anyone can write and deploy their own smart contract for their own use case. This nearly infinite number of use cases leads to the fragmentation of data schemas for different purposes. Standardizing these schemas requires deep domain expertise to turn esoteric technical outputs into clear information for specific concepts like tokens, NFTs, stablecoins and DEXs.

Allium abstracts the complexity with a simple way to query blockchain data

Allium tames the chaos by ingesting, sanitizing, and standardizing all this data. As of this post, the data we’ve archived across 40+ blockchains is in the petabytes and growing exponentially.

Google and Bloomberg had to organize the world's public webpage and financial data; Allium is on a mission to do the same for blockchain data

This is one of the rare times in history where indexing a giant public dataset is sorely needed by all, similar to what Bloomberg did for financial data and what Google did for public webpage data. With this indexed data, we are fortunate to support trailblazers in this industry and play some role in the industry’s most exciting trends.

About our customers

We serve 2 groups of customers today with the same data but different platforms:

Analysts who need to answer data questions about the blockchain (think BI), and

Engineers who need highly reliable data queryable in near realtime (think application backends).

Our customers include the biggest institutions, such as Visa, Stripe and Grayscale, and also the biggest crypto companies, such as Phantom and Uniswap. Allium is one of the few companies in the industry that bridge the blockchain and non-blockchain worlds.

About the Role

We love engineers and wizards who love solving new problems every single day. While wizards are not engineers, they contribute and give the best product roadmap guidance to ensure the engineers build the right things in the right way.

Data Egress

- How does one transport 100s of TBs of data around the world without breaking the piggybank?

Handle high traffic

- How can we support the biggest applications in this industry and handle 100,000 QPS at peak traffic without going down?

Botnets

- This industry is in its early days. How does one catch botnets based on their behavioral patterns?

Fraud (Sybil) Detection

- Is it possible to transfer the same fraud detection heuristics into this blockchain world?

Who is real?

- What constitutes meaningful and organic transactions on the blockchain?

Bring Your Own Transformation

- How do we let our customers design their own APIs and transform their own realtime data streams?

Data Governance

- We pride ourselves on our data quality. How can we ensure our data is consistent across every copy and every region 24/7?

AI and LLMs

- How does one design the LLM and AI experience on top of our data to lower the barrier of entry to crypto data?

Data Transformation Holy Grail

- How can one unify streaming and batch transformation logic into a single code base?

More specific past work Allium data wizards have done:

Diving deep into the guts of Ordinals data to power research like this.

Sybil Detection

Creating Wallet360

Deploy washtrading filters

Powering Brevan Howard Digital's stablecoin industry reports

Designing the most intuitive DEX schemas for ALL DeFi researchers to use easily

Ensure Grayscale's State of Ethereum Report had the right staking and fees data

Design chain level metrics for all chains

Hunt down and curate wallet entities and labels

Account abstraction

- Ensure we have all the right decoded logs to power

Some qualities

Sherlock & Enola Holmes level of curiosity to find peculiarities in the data and help the industry redefine the narratives

Ability to parse and understand new blockchain schemas fast and well

Proficient understanding of NFTs, DEXs, decoded logs, and smart contracts to quickly transform the data to fit product and customer needs

Allium sizzle reel

Giant infrastructure budget per head

You will make mistakes, costly mistakes, but at Allium's expense. We have an internal leaderboard of the costliest infrastructure mistakes made, and we (try to) learn from them. We don't have fancy Michelin-starred meal budgets, but we have a huge infrastructure budget for you to get better at your craft. Why? We leverage every tool out there (no prereqs) because we meet our enterprise customers where they are:

Every OLAP: Snowflake, Databricks, BigQuery, ClickHouse

Every OLTP: Postgres, Aurora

Every Event bus: Kafka, SNS, PubSub

Every Cloud Provider: AWS, GCP, Azure (one day)

A copy of data in every region: US East, Central, West, Europe, Asia

Every data transformation and orchestration tool: Apache Beam, Materialize, Tinybird, dbt, SQLMesh, Temporal

Data governance tools: Datafold

Don't take our word for it: what our customers say about us

Quotes and reviews can be found on the site. What some people have to say about us:

Mario Gabriele from The Generalist's Future 50 Startup List

Tomasz Tunguz from Theory Ventures

Bucky Moore from Kleiner Perkins

OK, now for some tough love. Here are the values we strive for at Allium:

Pro Athlete Mindset

- Consistency. Day in and day out, in pursuit of excellence. A win yesterday does not guarantee (or even imply!) a win tomorrow.

Figure It Out & Extreme Ownership

- Every day is unexplored territory. If you don’t know it, learn it. If you can't learn it, find someone who can.

High Agency

- Responsiveness and problem solving as a core habit.

Leading from the Front

- Lead by example with MVP work and momentum.

Strong Opinions On the Future

- It is okay to be wrong, but strive to have ideas for a better future.

Sense of (allium) business smell

- Understand how work builds leverage for the business goals.

About the team

We invite people of all backgrounds. We have engineers who learnt coding later in life, on the side, still in school, or from top schools. All are welcome if you have a curious mind and infectious work ethic.

Administrative Benefits

Medical, Dental, Vision, Life and AD&D insurance

- US folks get 100% coverage for Gold plans, 80% for dependents

Note:

The sun never sets on Allium - we hire from any geographical location as long as you are willing to overlap 2 hours on NYC mornings Mon-Thurs from 10am-12pm ET. We have people based in New York, Seattle, Singapore and Australia.

All applicants have to answer this pop quiz: What is an Allium? What is your favorite Allium? Bonus points for the right pronunciation.
