Introduction
Hello everyone, and welcome back. This post is about getting stock data in real time using Polygon.io. A few posts ago, I showed how to use the Alpaca API to stream stock data. In this post we will use Polygon.io instead. Choosing Polygon over Alpaca has its own pros and cons:
- Alpaca is a little better than Polygon in terms of its GitHub documentation and its APIs, as can be seen by comparing Alpaca Trade API and Polygon IO Client Python.
- Alpaca can return data as a DataFrame as well as JSON, whereas Polygon returns only JSON. However, we can build a DataFrame from the JSON ourselves.
- Updated candles arrived more slowly from Alpaca than from Polygon. In my tests, Polygon took at least 3 seconds to send corrected bars, whereas Alpaca took more than 30 seconds.
So, to choose between Alpaca and Polygon, decide whether a 30-second delay in corrected data is acceptable. If it is not, Polygon is the better choice; otherwise Alpaca wins the race for me, as it provides some great helpers such as get_clock().
Getting User and Key: Polygon.io
It's easy to get an API key. Just sign up for the free plan to access the dashboard, and you will find the key there.
key=""
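If you would rather not paste the key into the notebook, one option is to read it from an environment variable. This is only a minimal sketch: the variable name POLYGON_API_KEY and the helper load_api_key are my own choices for illustration, not something Polygon requires.

```python
import os


def load_api_key(env_var: str = "POLYGON_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is a hypothetical choice; use whatever name you exported.
    """
    value = os.environ.get(env_var, "")
    if not value:
        # Fail early with a readable message instead of sending an empty key.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return value
```

After exporting the variable, `key = load_api_key()` can replace the hard-coded assignment above.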
Installing Polygon API Client
I am using version 0.2.11 because it was the version that met my project's requirements.
!pip install polygon-api-client==0.2.11
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting polygon-api-client==0.2.11
Downloading polygon_api_client-0.2.11-py3-none-any.whl (22 kB)
Collecting websocket-client>=0.56.0
Downloading websocket_client-1.3.3-py3-none-any.whl (54 kB)
Requirement already satisfied: requests>=2.22.0 in /usr/local/lib/python3.7/dist-packages (from polygon-api-client==0.2.11) (2.23.0)
Collecting websockets>=8.0.2
Downloading websockets-10.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (112 kB)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests>=2.22.0->polygon-api-client==0.2.11) (1.24.3)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests>=2.22.0->polygon-api-client==0.2.11) (2.10)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests>=2.22.0->polygon-api-client==0.2.11) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests>=2.22.0->polygon-api-client==0.2.11) (2022.6.15)
Installing collected packages: websockets, websocket-client, polygon-api-client
Successfully installed polygon-api-client-0.2.11 websocket-client-1.3.3 websockets-10.3
Rest API to Get Aggregate Data
Aggregate data is price data that has been aggregated over time. We can get OHLC (open, high, low, close) candles for every minute or for any other timeframe. The open is always the opening price of the first candle in the timeframe, the close is always the closing price of the last candle in the timeframe, the high is the highest price among all candles in the timeframe, and the low is the lowest price among all candles in the timeframe.
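To make the aggregation rule concrete, here is a small sketch that rolls five 1-minute candles up into one 5-minute candle with pandas. The prices and volumes are made up for illustration, and the long column names are my own; this is not output from Polygon.

```python
import pandas as pd

# Five hypothetical 1-minute candles (values invented for illustration).
minute_bars = pd.DataFrame(
    {
        "open": [100.0, 100.5, 101.0, 100.8, 100.2],
        "high": [100.6, 101.2, 101.1, 100.9, 100.7],
        "low": [99.9, 100.4, 100.6, 100.1, 100.0],
        "close": [100.5, 101.0, 100.8, 100.2, 100.6],
        "volume": [1200, 800, 950, 400, 700],
    },
    index=pd.date_range("2022-06-10 13:30", periods=5, freq="1min"),
)

# Open = first open, high = max high, low = min low, close = last close.
five_min = minute_bars.resample("5min").agg(
    {"open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"}
)
print(five_min)
```

All five candles fall into the same 5-minute window, so the result is a single row whose open comes from the first candle and whose close comes from the last one.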
from polygon import RESTClient
import pandas as pd
client = RESTClient(key)
Polygon uses timestamps to specify the FROM and TO datetimes, so we need to compute them first. Note that the timestamps must be in milliseconds. The timestamps computed below are in seconds, so we append three zeros to both of them when calling the API.
int(pd.to_datetime("2022-06-10 01:22").timestamp()),int(pd.to_datetime("2022-06-21 06:22").timestamp())
(1654824120, 1655792520)
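Instead of appending the zeros by hand, the conversion can be wrapped in a small helper. This is a sketch only; to_millis is a name I made up. Note that pandas treats a date string without a timezone as UTC when computing the timestamp.

```python
import pandas as pd


def to_millis(when: str) -> int:
    """Convert a date-time string to a Unix timestamp in milliseconds.

    A string without timezone information is interpreted as UTC.
    """
    seconds = pd.Timestamp(when).timestamp()
    return int(round(seconds * 1000))


print(to_millis("2022-06-10 01:22"), to_millis("2022-06-21 06:22"))
```

The two printed values are the second-based timestamps shown above with three zeros appended.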
The data we are looking for is 1-minute candles, which are available through stocks_equities_aggregates in this version. More about this function can be found in the documentation here.
res = client.stocks_equities_aggregates(ticker='AAPL', multiplier=1,
                                        timespan="minute", from_="1654824120000",
                                        to="1655792520000", limit=500000)
The result is JSON, but we can use pandas to turn it into a DataFrame.
res.results
df = pd.DataFrame(res.results)
df
 | v | vw | o | c | h | l | t | n |
---|---|---|---|---|---|---|---|---|
0 | 2292.0 | 142.9749 | 143.03 | 142.99 | 143.03 | 142.90 | 1654848000000 | 64 |
1 | 817.0 | 143.0116 | 143.02 | 143.03 | 143.03 | 143.02 | 1654848060000 | 53 |
2 | 513.0 | 143.0704 | 143.10 | 143.10 | 143.10 | 143.10 | 1654848120000 | 34 |
3 | 940.0 | 143.2342 | 143.23 | 143.25 | 143.25 | 143.23 | 1654848240000 | 30 |
4 | 1802.0 | 143.1876 | 143.20 | 143.15 | 143.20 | 143.15 | 1654848300000 | 58 |
... | ... | ... | ... | ... | ... | ... | ... | ... |
5033 | 536.0 | 131.5226 | 131.52 | 131.52 | 131.52 | 131.52 | 1655509860000 | 18 |
5034 | 706.0 | 131.5344 | 131.52 | 131.55 | 131.55 | 131.52 | 1655509920000 | 11 |
5035 | 2014.0 | 131.5570 | 131.55 | 131.57 | 131.57 | 131.55 | 1655510040000 | 27 |
5036 | 647.0 | 131.5287 | 131.53 | 131.52 | 131.53 | 131.52 | 1655510220000 | 21 |
5037 | 902.0 | 131.5651 | 131.55 | 131.56 | 131.56 | 131.55 | 1655510340000 | 32 |
5038 rows × 8 columns
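The columns in the table above use initials: v is volume, vw is the volume-weighted average price, o is open, c is close, h is high, l is low, t is the timestamp of the candle in milliseconds, and n is the number of transactions.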
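For easier reading, the short column names can be renamed and the millisecond timestamps converted to datetimes. The sketch below uses the first two rows of the table above; the long names in column_names are my own choice, not something the API returns.

```python
import pandas as pd

# First two rows copied from the table above.
df = pd.DataFrame(
    {
        "v": [2292.0, 817.0],
        "vw": [142.9749, 143.0116],
        "o": [143.03, 143.02],
        "c": [142.99, 143.03],
        "h": [143.03, 143.03],
        "l": [142.90, 143.02],
        "t": [1654848000000, 1654848060000],
        "n": [64, 53],
    }
)

# Readable names (my own choice) for the one- and two-letter columns.
column_names = {
    "v": "volume", "vw": "vwap", "o": "open", "c": "close",
    "h": "high", "l": "low", "t": "timestamp", "n": "transactions",
}
bars = df.rename(columns=column_names)

# The timestamps are in milliseconds since the Unix epoch (UTC).
bars["timestamp"] = pd.to_datetime(bars["timestamp"], unit="ms", utc=True)
bars = bars.set_index("timestamp")
print(bars)
```

With a datetime index in place, the frame can be sliced by time or resampled into larger timeframes.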
Web Socket to Get Realtime Data
Polygon also provides a WebSocket API, which allows us to get data in near real time.
from polygon import WebSocketClient, STOCKS_CLUSTER,RESTClient
import json
We need to create a handler in order to process each response. In our case, the handler would write the data to our database or any other place we want.
def close_handler(ws):
    ws = json.loads(ws)
    for w in ws:
        if w["ev"] == "AM":
            print(w)
In the function above, we receive the WebSocket response as ws and convert it into a list of dictionaries using json.loads. Polygon bundles several events into a single response if the system is slow or there are too many results to send one by one. So we loop through them, and if an event (ev) is named AM, we print the data. AM is the event type for per-minute aggregates.
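As a sketch of what a handler that stores data instead of printing it might look like, the example below keeps the AM events in a plain list that stands in for a database. The sample message is hypothetical and written by hand; the field names sym, s and e (symbol, start and end of the candle) are my assumption of what an AM event carries, so check them against a real message before relying on them.

```python
import json

collected_bars = []  # stand-in for a database table


def store_minute_bars(message: str) -> None:
    """Parse one WebSocket message and keep only the AM events."""
    for event in json.loads(message):
        if event.get("ev") != "AM":
            continue  # skip status messages and other event types
        collected_bars.append(
            {
                "symbol": event.get("sym"),  # assumed field name
                "open": event.get("o"),
                "high": event.get("h"),
                "low": event.get("l"),
                "close": event.get("c"),
                "volume": event.get("v"),
                "start_ms": event.get("s"),  # assumed field name
                "end_ms": event.get("e"),  # assumed field name
            }
        )


# Hypothetical message containing one status event and one AM event.
sample_message = json.dumps(
    [
        {"ev": "status", "status": "connected"},
        {"ev": "AM", "sym": "AAPL", "v": 2292, "o": 143.03, "h": 143.03,
         "l": 142.90, "c": 142.99, "s": 1654848000000, "e": 1654848060000},
    ]
)
store_minute_bars(sample_message)
print(collected_bars)
```

Using .get instead of square brackets means a message that lacks one of these keys produces a None value rather than raising an exception inside the handler.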
symbols=["AAPL"]
my_client = WebSocketClient(STOCKS_CLUSTER, key, close_handler)
my_client.run_async()
my_client.subscribe(*[f"AM.{s}" for s in symbols])
- We put the symbols in a list.
- Then we create a WebSocketClient object by passing STOCKS_CLUSTER, the key and our handler.
- We run the client asynchronously and finally subscribe to the symbols. The AM prefix is what requests real-time aggregate data per minute.