Blockchain Data in Your Cloud

Bitquery provides ready-to-use blockchain data dumps via popular cloud providers such as AWS, Google Cloud, and Snowflake.
You can plug these datasets directly into your existing analytics stack (AWS, BigQuery, Snowflake, etc.) and build custom data pipelines without running your own blockchain infrastructure.

Sample Parquet Data

To quickly explore the structure of the data and test your tooling, you can use our public sample dataset.

In the GitHub repo, each sample file (per data point or topic) includes the exact S3 URL in a comment, so you can:

  • Point test pipelines to the same path
  • Easily request more files from the same bucket/prefix if you need additional data

For example, an Ethereum balance updates dump might look like:

https://bitquery-blockchain-dataset.s3.us-east-1.amazonaws.com/ethereum/balance_updates/24053500_24053549.parquet

bitquery-blockchain-dataset/
└── ethereum/
    └── balance_updates/
        ├── 24053500_24053549.parquet
        ├── 24053550_24053599.parquet
        ├── 24053600_24053649.parquet
        ├── 24053650_24053699.parquet
        ├── 24053700_24053749.parquet
        ├── 24053750_24053799.parquet
        ├── 24053800_24053849.parquet
        ├── 24053850_24053899.parquet
        ├── 24053900_24053949.parquet
        └── 24053950_24053999.parquet
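As a quick sanity check, you can pull that single sample file over HTTPS and load it locally. The sketch below shows one way to do this in Python; the URL is the one shown above, while the requests, pyarrow, and pandas dependencies are assumptions about your local tooling rather than requirements of the dataset.

import io

import pandas as pd
import pyarrow.parquet as pq
import requests

# The sample balance-updates file shown above (public HTTPS URL).
SAMPLE_URL = (
    "https://bitquery-blockchain-dataset.s3.us-east-1.amazonaws.com"
    "/ethereum/balance_updates/24053500_24053549.parquet"
)

# Download the file into memory; each sample appears to cover ~50 blocks
# (judging by the file names), so it is small enough for in-memory inspection.
response = requests.get(SAMPLE_URL, timeout=60)
response.raise_for_status()

# Parse the Parquet bytes with pyarrow and convert to a pandas DataFrame.
table = pq.read_table(io.BytesIO(response.content))
df = table.to_pandas()

print(df.shape)   # (rows, columns)
print(df.head())  # first few balance-update records

If this runs cleanly, you have confirmed both network access to the sample bucket and that your Parquet tooling can read the files.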

Use these samples to:

  • Validate your ETL / analytics pipeline
  • Inspect column names and types before connecting to full buckets (see the schema sketch after this list)
  • Benchmark query performance on realistic data sizes
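The schema check does not require loading the full table into pandas. Below is a minimal sketch using pyarrow's Parquet metadata API; sample.parquet is a hypothetical local copy of one sample file (for example, the bytes downloaded in the previous sketch written to disk).

import pyarrow.parquet as pq

# Hypothetical local copy of one sample file.
parquet_file = pq.ParquetFile("sample.parquet")

# Column names and Arrow types, read straight from the Parquet footer.
print(parquet_file.schema_arrow)

# Row count and row-group layout, useful when sizing benchmark queries.
print(parquet_file.metadata.num_rows)
print(parquet_file.metadata.num_row_groups)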