Introduction¶

This is a quick post to illustrate how I collect and store crypto asset data from Coinbase. Essentially, the scripts below pull minute, hour, and daily data for the specified assets and if there is an existing data record, then the existing record is updated to include the most recent data. If there is not an existing data record, then the complete historical record from coinbase is pulled and stored.

Python Imports¶

In [1]:
# Standard Library
import datetime
import os
import sys
import warnings

from datetime import datetime
from pathlib import Path

# Suppress warnings
warnings.filterwarnings("ignore")

Add Directories To Path¶

In [2]:
# Add the source subdirectory to the system path to allow import config from settings.py
current_directory = Path(os.getcwd())
website_base_directory = current_directory.parent.parent.parent
src_directory = website_base_directory / "src"
sys.path.append(str(src_directory)) if str(src_directory) not in sys.path else None

# Import settings.py
from settings import config

# Add configured directories from config to path
SOURCE_DIR = config("SOURCE_DIR")
sys.path.append(str(Path(SOURCE_DIR))) if str(Path(SOURCE_DIR)) not in sys.path else None

# Add other configured directories
BASE_DIR = config("BASE_DIR")
CONTENT_DIR = config("CONTENT_DIR")
POSTS_DIR = config("POSTS_DIR")
PAGES_DIR = config("PAGES_DIR")
PUBLIC_DIR = config("PUBLIC_DIR")
SOURCE_DIR = config("SOURCE_DIR")
DATA_DIR = config("DATA_DIR")
DATA_MANUAL_DIR = config("DATA_MANUAL_DIR")

Python Functions¶

Here are the functions needed for this project:

  • coinbase_fetch_available_products: Fetch available products from Coinbase Exchange API.
  • coinbase_fetch_full_history: Fetch full historical data for a given product from Coinbase Exchange API.
  • coinbase_fetch_historical_candles: Fetch historical candle data for a given product from Coinbase Exchange API.
  • coinbase_pull_data: Update existing record or pull full historical data for a given product from Coinbase Exchange API.
In [3]:
from coinbase_fetch_available_products import coinbase_fetch_available_products
from coinbase_fetch_full_history import coinbase_fetch_full_history
from coinbase_fetch_historical_candles import coinbase_fetch_historical_candles
from coinbase_pull_data import coinbase_pull_data

Function Usage¶

Coinbase Fetch Available Products¶

This script pulls the list of available assets based on the inputs for base and quote currency. Here's an example:

In [4]:
df = coinbase_fetch_available_products(
    base_currency=None,
    quote_currency="USD",
    status="online",
)

In this example, the quote_currency is provided as "USD". This script checks all available assets that are priced against USD and returns a dataframe listing all available assets:

In [5]:
display(df)
id base_currency quote_currency quote_increment base_increment display_name min_market_funds margin_enabled post_only limit_only cancel_only status status_message trading_disabled fx_stablecoin max_slippage_percentage auction_mode high_bid_limit_percentage
24 00-USD 00 USD 0.0001 0.01 00-USD 1 False False False False online False False 0.03000000 False
342 1INCH-USD 1INCH USD 0.001 0.01 1INCH-USD 1 False False False False online False False 0.03000000 False
534 2Z-USD 2Z USD 0.00001 0.01 2Z/USD 1 False False False False online False False 0.03000000 False
713 A8-USD A8 USD 0.0001 0.01 A8/USD 1 False False False False online False False 0.03000000 False
115 AAVE-USD AAVE USD 0.01 0.001 AAVE-USD 1 False False False False online False False 0.03000000 False
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
442 ZKC-USD ZKC USD 0.0001 0.01 ZKC/USD 1 False False False False online False False 0.03000000 False
352 ZKP-USD ZKP USD 0.00001 0.1 ZKP/USD 1 False False False False online False False 0.03000000 False
88 ZORA-USD ZORA USD 0.00001 1 ZORA/USD 1 False False False False online False False 0.03000000 False
467 ZRO-USD ZRO USD 0.001 0.01 ZRO/USD 1 False False False False online False False 0.03000000 False
610 ZRX-USD ZRX USD 0.000001 0.00001 ZRX-USD 1 False False False False online False False 0.03000000 False

375 rows × 18 columns

Coinbase Fetch Historical Candles¶

This script pulls the historical candles:

In [6]:
df = coinbase_fetch_historical_candles(
    product_id="BTC-USD",
    start=datetime(2025, 1, 1),
    end=datetime(2025, 1, 1),
    granularity=86_400,
)

Specifically, the date/time, open, high, low, close, and volume levels:

In [7]:
display(df)
time low high open close volume
0 2025-01-01 92743.63 94960.91 93347.59 94383.59 6871.738482

Coinbase Fetch Full History¶

This script pulls the full history for a specified asset:

In [8]:
df = coinbase_fetch_full_history(
    product_id="BTC-USD",
    start=datetime(2025, 1, 1),
    end=datetime(2025, 1, 31),
    granularity=86_400,
)

The example above pulls the daily data for 1 month, but can handle data ranges of years because it uses the coinbase_fetch_historical_candles to pull 300 candles at a time to ensure that the API is not overloaded and drops data. Here's the results for the above:

In [9]:
display(df)
time low high open close volume
0 2025-01-01 92743.63 94960.91 93347.59 94383.59 6871.738482
1 2025-01-02 94177.00 97776.99 94383.59 96903.19 10912.473840
2 2025-01-03 96016.63 98969.92 96905.48 98136.51 9021.885382
3 2025-01-04 97516.65 98761.02 98139.85 98209.85 2742.089606
4 2025-01-05 97250.00 98814.00 98209.85 98345.33 2377.921759
5 2025-01-06 97900.00 102500.00 98347.65 102279.41 15173.556068
6 2025-01-07 96105.11 102735.99 102279.41 96941.98 16587.286922
7 2025-01-08 92500.00 97254.35 96941.98 95036.63 14182.297395
8 2025-01-09 91187.00 95363.26 95033.18 92547.44 9712.378532
9 2025-01-10 92209.25 95862.92 92547.44 94701.18 12634.034078
10 2025-01-11 93804.05 94983.65 94701.48 94565.02 2638.699568
11 2025-01-12 93670.30 95383.84 94569.91 94509.62 2025.816130
12 2025-01-13 89028.64 95900.00 94507.24 94506.45 13094.863595
13 2025-01-14 94311.36 97353.29 94507.35 96534.96 11210.742267
14 2025-01-15 96400.00 100716.45 96534.97 100510.23 13610.747294
15 2025-01-16 97277.58 100880.00 100504.27 99981.78 12312.373669
16 2025-01-17 99937.81 105970.00 99981.46 104107.00 20518.309493
17 2025-01-18 102233.45 104933.15 104107.00 104435.00 7835.299918
18 2025-01-19 99518.00 106314.44 104435.01 101211.13 13312.636856
19 2025-01-20 99416.27 109358.01 101217.78 102145.43 32342.183113
20 2025-01-21 100051.00 107291.10 102145.42 106159.26 19411.234890
21 2025-01-22 103100.00 106431.34 106159.27 103667.11 10730.018962
22 2025-01-23 101200.01 106870.87 103659.60 103926.36 25064.864999
23 2025-01-24 102751.92 107200.00 103926.36 104850.27 12921.993614
24 2025-01-25 104104.00 105294.00 104866.13 104733.56 3404.853083
25 2025-01-26 102452.24 105478.80 104729.92 102563.00 4575.366115
26 2025-01-27 97715.03 103228.46 102565.28 102062.42 23647.141119
27 2025-01-28 100213.80 103770.85 102063.92 101290.00 9488.534295
28 2025-01-29 101275.60 104829.64 101290.01 103747.25 11403.202789
29 2025-01-30 103289.74 106484.77 103747.25 104742.64 13061.348812
30 2025-01-31 101506.00 106090.00 104742.63 102411.26 13313.681045

Coinbase Pull Data¶

This script combines the above functions to perform the following:

  1. Attempt to read an existing pickle data file
  2. If a data file exists, then pull updated data
  3. Otherwise, pull all historical data available for that asset on Coinbase
  4. Store pickle and/or excel files of the data in the specified directories

Through the base_directory, source, and asset_class variables the script knows where in the local filesystem to look for an existing pickle file and the store the resulting updated pickle and/or excel files:

df = coinbase_pull_data(
    base_directory=DATA_DIR,
    source="Coinbase",
    asset_class="Cryptocurrencies",
    excel_export=False,
    pickle_export=True,
    output_confirmation=True,
    base_currency="BTC",
    quote_currency="USD",
    granularity=60, # 60=minute, 3600=hourly, 86400=daily
    status='online', # default status is 'online'
    start_date=datetime(current_year, current_month - 1, 1), # default start date
    end_date=datetime.now() - timedelta(days=1), # updates data through 1 day ago due to lag in data availability
)

By passing None as the base_currency and/or the quote_currency, the script will use the coinbase_fetch_available_products function to pull the list of all the available products, and then pulls data for all assets in that list. This functionality is incredibly useful, and makes acquiring data very straightforward, especially for a set of products for a specific base_currency or quote_currency.

The example above pulls the data for BTC-USD, and stores it in the following system directory:

DATA_DIR/Coinbase/Cryptocurrencies/Minute

And here is the filesystem tree output for the Coinbase directory:

$ tree Coinbase/
Coinbase/
└── Cryptocurrencies
    ├── Daily
    │   ├── BTC-USD.pkl
    │   ├── BTC-USD.xlsx
    │   ├── ETH-USD.pkl
    │   ├── ETH-USD.xlsx
    │   ├── SOL-USD.pkl
    │   ├── SOL-USD.xlsx
    │   ├── XRP-USD.pkl
    │   └── XRP-USD.xlsx
    ├── Hourly
    │   ├── BTC-USD.pkl
    │   ├── BTC-USD.xlsx
    │   ├── ETH-USD.pkl
    │   ├── ETH-USD.xlsx
    │   ├── SOL-USD.pkl
    │   ├── SOL-USD.xlsx
    │   ├── XRP-USD.pkl
    │   └── XRP-USD.xlsx
    └── Minute
        ├── BTC-USD.pkl
        ├── ETH-USD.pkl
        ├── SOL-USD.pkl
        └── XRP-USD.pkl

5 directories, 20 files

References¶

  1. https://www.coinbase.com/