An application of Bidirectional LSTMs toward Stock Market Prediction utilizing Polygon Stock Market API | by Anirudh Sekar | Jul, 2024


Our project uses the LSTM, chosen on the basis of this study, in which the LSTM performed best across multiple tests of different RNN architectures on both classification and regression problems. The study, which describes the testing process, is available at DOI 10.13140/RG.2.2.33253.41447/1. We will use the Bidirectional version of the LSTM, which reads each input sequence in both directions, with the aim of producing a more accurate prediction. All the files needed for this project are in this GitHub repository: https://github.com/AnirudhSekar/Stock-Market-Prediction-using-Polygon-API

To complete this project, we will work through the following five steps.

  1. File Setup
  2. Set up the Polygon API and obtain an API key
  3. Access the data from the Polygon API
  4. Build the Bidirectional LSTM model using TensorFlow
  5. Configure the front-end using Dash

To set up our files for this project, create a directory named “Stock Trader”. Within this directory, create three files named “config.py”, “ai_prediction.py”, and “app.py”, so that the directory contains exactly these three files.
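If you prefer to create the layout from a script rather than by hand, a minimal sketch follows. The helper name `scaffold` is hypothetical and not part of the original project; it only creates the directory and the three empty files named above.

```python
from pathlib import Path


def scaffold(root: Path) -> list[str]:
    """Create the "Stock Trader" directory with its three empty files.

    `scaffold` is an illustrative helper, not part of the original repository.
    Existing files are left untouched, so it is safe to run more than once.
    """
    project = root / "Stock Trader"
    project.mkdir(parents=True, exist_ok=True)
    for name in ("config.py", "ai_prediction.py", "app.py"):
        (project / name).touch(exist_ok=True)
    # Return the file names so the caller can confirm the structure
    return sorted(p.name for p in project.iterdir())


# Example usage (creates the folder inside the current directory):
# print(scaffold(Path.cwd()))
```

Running it in an empty location should leave a “Stock Trader” folder containing “ai_prediction.py”, “app.py”, and “config.py”.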

Visit polygon.io and click Sign Up. Once signed in, you will be redirected to your dashboard, where you can find your personal API Key. Then add the following code to the “config.py” file.

API_KEY = 'ENTER YOUR API KEY HERE'
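If you would rather not store the key in a source file (for example, because the project lives in a public repository), one option is to read it from an environment variable. This is a sketch under assumptions: the variable name `POLYGON_API_KEY` and the helper `load_api_key` are hypothetical choices, not something Polygon or the original project prescribes.

```python
import os


def load_api_key(env_var: str = "POLYGON_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    Both the helper and the variable name are illustrative assumptions.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Polygon API key"
        )
    return key


# In config.py you could then write:
# API_KEY = load_api_key()
```

The rest of the project keeps reading `config.API_KEY`, so nothing else has to change.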

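First, install the required modules using this command. If, down the line, a module turns out to be missing on your system, install it using pip install module_name. Note that some pip package names differ from the names used in import statements: the polygon module used here comes from polygon-api-client, sklearn from scikit-learn, and talib from TA-Lib (which may also need the underlying TA-Lib C library installed on your system).

pip install polygon-api-client matplotlib scikit-learn tensorflow numpy pandas TA-Lib dash dash-bootstrap-components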
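To see which modules are still missing before running the project, a small check like the one below can help. It is a sketch: the helper `missing_modules` is hypothetical, and the list uses import names (what appears after `import`), which are not always the same as the pip package names.

```python
from importlib.util import find_spec

# Import names used by ai_prediction.py and app.py
REQUIRED = [
    "polygon", "matplotlib", "sklearn", "tensorflow", "numpy",
    "pandas", "talib", "dash", "dash_bootstrap_components", "plotly",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Example usage:
# print(missing_modules(REQUIRED))
```

An empty list means every module was found; anything listed still needs to be installed.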

Then, in the ai_prediction.py file, add the following code

from polygon import RESTClient
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display is needed
from urllib3 import HTTPResponse
import config
import json
import pandas as pd
import numpy as np
from typing import cast
import datetime
import tensorflow as tf
from keras.layers import LSTM, Bidirectional, Dense
from keras.models import Sequential
from sklearn.preprocessing import MinMaxScaler

# Connect to the Polygon REST API
client = RESTClient(config.API_KEY)

def lossgain(stock_val, end):
    # Request daily bars for the ticker as a raw HTTP response
    aggs = cast(
        HTTPResponse,
        client.get_aggs(
            stock_val,
            1,
            'day',
            # Currently set to use 2 years worth of data, but you can change it to whatever you want
            str((datetime.datetime.today()-datetime.timedelta(366*2)).strftime('%Y-%m-%d')),
            end,
            raw=True
        ),
    )
    # Parse the JSON body into a dictionary
    data = json.loads(aggs.data)
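For orientation, the code in this article expects the parsed dictionary to hold the daily bars under a `results` key, where each bar uses single-letter fields: `o` (open), `h` (high), `l` (low), `c` (close), and `t` (timestamp in milliseconds). The sketch below uses a made-up payload with invented prices, and the helper `daily_changes` is hypothetical; it only illustrates how such a payload can be read.

```python
import json
from datetime import datetime, timezone

# Invented sample in the shape the article's code reads; not real market data
sample_payload = json.dumps({
    "ticker": "AAPL",
    "results": [
        {"o": 100.0, "h": 103.0, "l": 99.5, "c": 102.0, "t": 1704171600000},
        {"o": 102.0, "h": 102.5, "l": 100.0, "c": 101.0, "t": 1704258000000},
    ],
})


def daily_changes(raw_json):
    """Return (UTC date, close - open) pairs for every bar in the payload."""
    data = json.loads(raw_json)
    rows = []
    for bar in data.get("results", []):
        # Timestamps are in milliseconds since the Unix epoch
        day = datetime.fromtimestamp(bar["t"] / 1000, tz=timezone.utc).date()
        rows.append((day.isoformat(), bar["c"] - bar["o"]))
    return rows


# Example usage:
# print(daily_changes(sample_payload))
```

Using `data.get("results", [])` also means a response without any bars yields an empty list instead of an error.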

Now that we have the appropriate data from Polygon, let’s use that data to train a Bidirectional LSTM model using TensorFlow. To begin, add the following code to the ai_prediction.py file.

    closeList = []
    openList = []
    timestamp = []
    lossgain = []

    # Pull the close, open, and timestamp fields out of every daily bar
    for item in data:
        if item == 'results':
            rawData = data[item]
            for bar in rawData:
                for category in bar:
                    if category == "c":
                        closeList.append(bar[category])
                    elif category == 'o':
                        openList.append(bar[category])
                    elif category == 't':
                        timestamp.append(bar[category])
    # Daily Loss/Gain is the close minus the open
    for i in range(len(openList)):
        lossgain.append(closeList[i]-openList[i])
    # Convert millisecond timestamps into timezone-aware datetimes
    times = []
    for time in timestamp:
        times.append(pd.Timestamp(time, tz='GMT', unit="ms"))
    print(times[-1])
    # Make LossGain a time series
    lossgain = np.array(lossgain).astype(float)
    # Rescale the values to the range [0, 1]
    scaler = MinMaxScaler(feature_range=(0, 1))
    lossgain = scaler.fit_transform(lossgain.reshape(-1, 1))

This code collects the opening price, closing price, and timestamp of each daily bar into lists and computes the daily Loss/Gain as the close minus the open. It then converts the Loss/Gain values into a NumPy time series and rescales them to the range 0 to 1 with min-max normalization, a form the Bidirectional LSTM handles better than raw price differences. To build the Bidirectional LSTM itself, add the following code to the “ai_prediction.py” file.

    # Function to create the dataset with input features and labels
    def create_dataset(series, look_back=1):
        X, y = [], []
        for i in range(len(series) - look_back):
            X.append(series[i:(i + look_back), 0])
            y.append(series[i + look_back, 0])
        return np.array(X), np.array(y)

    # Change the lookback to whatever you feel has the best results
    lookback = 1
    if stock_val in ("NFLX", "TSLA"):
        lookback = 5
    X, y = create_dataset(lossgain, lookback)

    # Reshape input to be [samples, time steps, features]
    X = np.reshape(X, (X.shape[0], X.shape[1], 1))

    # Split the data chronologically into training and testing sets
    train_size = int(len(X) * 0.8)
    X_train, X_test = X[:train_size], X[train_size:]
    y_train, y_test = y[:train_size], y[train_size:]

    # Create AI model
    print(X_train.shape)
    model = Sequential([
        # The input shape must match the lookback window used above
        Bidirectional(LSTM(50, input_shape=(lookback, 1), return_sequences=True)),
        Bidirectional(LSTM(50, return_sequences=True)),
        LSTM(50),
        Dense(1)
    ])
    # tf.keras.optimizers.Adam replaces the legacy optimizer, which newer TensorFlow releases no longer ship
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001), loss="mae", metrics=["MeanSquaredError", "RootMeanSquaredError"])
    model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=32, verbose=2)

    # Predict on the test windows and undo the min-max scaling
    test_predictions = model.predict(X_test)
    test_predictions = scaler.inverse_transform(test_predictions.reshape(-1, 1))
    return f"Prediction of price increase for {stock_val}: {test_predictions[-1][0]:.2f}"

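Adding this code creates a TensorFlow Sequential model containing two Bidirectional LSTM layers, one standard LSTM layer, and a Dense output layer, which together produce a time-series prediction of the Loss/Gain. Because each bar spans one day, every prediction targets the Loss/Gain one day after its input window. Note that the returned value, test_predictions[-1], is the model's output for the final test window, whose target is the most recent day in the data; forecasting the day after that would require feeding the model the last lookback values of the series.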
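To make the preprocessing concrete, here is a small NumPy-only sketch of the same idea on five invented Loss/Gain values: min-max scaling, sliding windows, and the window you would feed the model to forecast one day past the data. The helper names (`minmax_fit`, `minmax_scale`, `minmax_invert`, `make_windows`) are hypothetical stand-ins for what `MinMaxScaler` and `create_dataset` do in the article, and the sketch assumes the values are not all identical (otherwise the scaling would divide by zero).

```python
import numpy as np


def minmax_fit(values):
    """Return the minimum and maximum used for scaling."""
    return float(values.min()), float(values.max())


def minmax_scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    return (values - lo) / (hi - lo)


def minmax_invert(scaled, lo, hi):
    """Undo minmax_scale, returning values in the original units."""
    return scaled * (hi - lo) + lo


def make_windows(series, look_back):
    """Build [samples, time steps, 1] inputs and next-step targets."""
    X = np.array([series[i:i + look_back] for i in range(len(series) - look_back)])
    y = series[look_back:]
    return X[..., np.newaxis], y


# Invented daily Loss/Gain values, for illustration only
changes = np.array([-2.0, 1.0, 0.0, 2.0, -1.0])
lo, hi = minmax_fit(changes)
scaled = minmax_scale(changes, lo, hi)

look_back = 2
X, y = make_windows(scaled, look_back)

# The most recent look_back values form the input for a forecast beyond the data
next_window = scaled[-look_back:].reshape(1, look_back, 1)
```

With five values and a lookback of 2, `X` has shape (3, 2, 1), and the last target in `y` is the final value of the series. A model output for `next_window` would be passed through `minmax_invert` to get back to dollars.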

Now that we have finished the AI component of this project, we can use Dash, a Python framework built on top of Flask, to create a front end that can also handle complex graphs. We are using Dash so that we can make a clear visual plot of each stock using Polygon’s data. Add the following code to “app.py”.

from dash import Dash, html, dcc
from dash.dependencies import Input, Output
import matplotlib
import dash_bootstrap_components as dbc
matplotlib.use("Agg")
from plotly import graph_objects as go
from polygon import RESTClient
from urllib3 import HTTPResponse
import config
import json
import pandas as pd
import talib
import numpy as np
from ai_prediction import lossgain
from typing import cast
import datetime

# Connect to Polygon REST API
client = RESTClient(config.API_KEY)

# Dash Setup
app = Dash(
    __name__,
    external_stylesheets=[dbc.themes.BOOTSTRAP],
    meta_tags=[
        {"name": "viewport",
         "content": "width=device-width, initial-scale=1"}
    ],
)
app.layout = html.Div([
    html.H1('Stock Graphs', style={'color': 'black', 'font-weight': 'bold', 'font-family': "arial", 'margin': '20px'}),
    # Stock selector
    html.Div([
        html.P("Choose a stock:", style={'font-style': 'italic',
                                         'font-weight': 'bold',
                                         'margin': '20px',
                                         'display': 'flex'}),
        dcc.Dropdown(
            id="stock-val",
            options=[
                {'label': "Apple", 'value': "AAPL"},
                {'label': "Google", 'value': "GOOGL"},
                {'label': "Tesla", 'value': "TSLA"},
                {'label': "Amazon", 'value': "AMZN"},
                {'label': "Netflix", 'value': "NFLX"},
                {'label': "Microsoft", 'value': "MSFT"},
                {'label': "Nvidia", 'value': "NVDA"},
            ],
            className='dropdown-class-1',
            value="AAPL",
            style={'margin': '10px', 'width': '60%'}
        ),
    ], style={'width': '60%', "display": 'flex'}),
    # Start and end date inputs
    html.Div([
        html.Div(children='Start date:', style={
            'font-style': 'italic',
            'font-weight': 'bold',
            'margin': '20px'
        }),
        dbc.Input(
            id="start",
            type="text",
            value=str((datetime.datetime.today()-datetime.timedelta(366)).strftime('%Y-%m-%d')),
            style={'margin': '20px', 'width': '10%'}
        ),
        html.Div(children='End date:', style={
            'font-style': 'italic',
            'font-weight': 'bold',
            'margin': '20px',
            'display': 'flex'
        }),
        dbc.Input(
            id="end",
            type="text",
            value=str(datetime.datetime.today().strftime('%Y-%m-%d')),
            style={'margin': '20px', 'width': '10%'}
        ),
    ], style={'display': 'flex'}),
    # Candlestick chart
    dcc.Graph(id="stock-graph", style={'width': '500', 'height': '400'}),
    # Prediction button and its output text
    html.Div([
        dbc.Button(children="Click on me for tomorrow's price increase prediction for this stock!", id='pred-button'),
        html.P(id="prediction-button", children="")
    ], style={'margin': '20px'})
], id="app")

Dash turns this Python code into a webpage with all the elements we need. To make the webpage interactive and overlay indicators (exponential moving averages and Bollinger Bands) on the stock plot, add the following code.

@app.callback(
    Output('stock-graph', 'figure'),
    [Input("stock-val", component_property='value'),
     Input("start", component_property="value"),
     Input("end", component_property="value")]
)
def graph(stock_val, start, end):
    # Receive the data
    aggs = cast(
        HTTPResponse,
        client.get_aggs(
            stock_val,
            1,
            'day',
            start,
            end,
            raw=True
        ),
    )

    data = json.loads(aggs.data)
    closeList = []
    openList = []
    highList = []
    lowList = []
    timestamp = []
    lossgain = []

    # Pull the OHLC fields and timestamp out of every daily bar
    for item in data:
        if item == 'results':
            rawData = data[item]
            for bar in rawData:
                for category in bar:
                    if category == "c":
                        closeList.append(bar[category])
                    elif category == "h":
                        highList.append(bar[category])
                    elif category == 'l':
                        lowList.append(bar[category])
                    elif category == 'o':
                        openList.append(bar[category])
                    elif category == 't':
                        timestamp.append(bar[category])
    for i in range(len(openList)):
        lossgain.append(closeList[i]-openList[i])

    # TA-Lib expects arrays of floats
    closeList = np.array(closeList, dtype=float)
    ema_8 = talib.EMA(closeList, 8)
    ema_13 = talib.EMA(closeList, 13)
    ema_21 = talib.EMA(closeList, 21)
    ema_55 = talib.EMA(closeList, 55)

    # 20-day Bollinger Bands, two standard deviations wide on each side
    upper, middle, lower = talib.BBANDS(closeList, timeperiod=20, nbdevup=2, nbdevdn=2, matype=0)
    times = []
    for time in timestamp:
        times.append(pd.Timestamp(time, tz='GMT', unit="ms"))

    # Draw the candlesticks and overlay the indicators
    fig = go.Figure()

    fig.add_trace(go.Candlestick(x=times, open=openList, high=highList, low=lowList, close=closeList, name=stock_val))
    fig.add_trace(go.Scatter(x=times, y=upper, name='Bollinger Band Upper'))
    fig.add_trace(go.Scatter(x=times, y=middle, name='Bollinger Band Middle'))
    fig.add_trace(go.Scatter(x=times, y=lower, name='Bollinger Band Lower'))
    fig.add_trace(go.Scatter(x=times, y=ema_8, name='EMA 8'))
    fig.add_trace(go.Scatter(x=times, y=ema_13, name='EMA 13'))
    fig.add_trace(go.Scatter(x=times, y=ema_21, name='EMA 21'))
    fig.add_trace(go.Scatter(x=times, y=ema_55, name='EMA 55'))
    fig.update_layout(transition_duration=500)
    fig.update_layout(xaxis_rangeslider_visible=False)
    return fig

@app.callback(
    Output('prediction-button', 'children'),
    [Input("stock-val", component_property='value'),
     Input("end", component_property="value"),
     Input('pred-button', 'n_clicks')],
    prevent_initial_call=True
)
def run_ai(stock_val, end, _):
    # Train the model and return the prediction text
    return lossgain(stock_val, end)

This makes the page interactive: the graph redraws whenever the stock or the dates change, and the prediction text updates when the button is clicked or when the stock or end date changes. Finally, add the last piece of code to the “app.py” file.

if __name__ == "__main__":
    # On older Dash releases, use app.run_server(debug=True) instead
    app.run(debug=True)
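For readers curious about the overlays drawn in the graph callback, the sketch below shows the textbook arithmetic behind an exponential moving average and Bollinger Bands in plain NumPy. It is an illustration under assumptions, not a drop-in replacement for TA-Lib: the functions `ema` and `bollinger` are hypothetical, the EMA is seeded with the simple average of the first `period` values, and the bands use the population standard deviation, so TA-Lib's output may differ in detail.

```python
import numpy as np


def ema(values, period):
    """Exponential moving average with smoothing factor 2 / (period + 1)."""
    values = np.asarray(values, dtype=float)
    out = np.full(len(values), np.nan)
    if len(values) < period:
        return out  # not enough data for a single value
    alpha = 2.0 / (period + 1)
    # Seed with the simple average of the first `period` values (an assumption)
    out[period - 1] = values[:period].mean()
    for i in range(period, len(values)):
        out[i] = alpha * values[i] + (1 - alpha) * out[i - 1]
    return out


def bollinger(values, period=20, width=2.0):
    """Return (upper, middle, lower) bands around a simple moving average."""
    values = np.asarray(values, dtype=float)
    middle = np.full(len(values), np.nan)
    spread = np.full(len(values), np.nan)
    for i in range(period - 1, len(values)):
        window = values[i - period + 1:i + 1]
        middle[i] = window.mean()
        spread[i] = window.std()  # population standard deviation
    return middle + width * spread, middle, middle - width * spread


# Example usage with invented closing prices:
# upper, middle, lower = bollinger([1, 2, 3, 4, 5], period=3)
```

The leading entries are NaN until enough bars are available, which is why the indicator lines on the chart start later than the candlesticks.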

And that’s it! Now you should have a fully working Dash application that can visualize stock price data and also have an AI component to that application to predict future stock price gains and losses. Thank you so much for reading this article! Let me know if you have any questions or concerns!


