DEV Community

Yavuz

Financial Data Processing with AdaBoost Regression

In this article, I walk through Python code I wrote to analyze Bitcoin prices. The approach is machine-learning based and uses AdaBoost regression. The code can serve as a starting point for anyone who wants to do financial analysis, model prices, or experiment with trading strategies.

First, I imported the libraries the code uses.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

In the first step, I downloaded Bitcoin price data for the symbol "BTC-USD" using the yfinance library. The same call works for any index or stock ticker available on Yahoo Finance. I restricted the data to dates from "2023-01-01" onward, because I wanted to analyze a more recent period.

stock = yf.Ticker("BTC-USD")
data = stock.history(start="2023-01-01")

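I then added a column that represents each date as an integer, because the regression model needs a numeric input rather than a timestamp. With pandas' nanosecond timestamps, this integer is the number of nanoseconds since the Unix epoch.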

data['Date_Int'] = pd.to_datetime(data.index).astype('int64')
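To make the integer representation concrete, the following sketch applies the same conversion to a small, made-up index. The `dates` and `frame` names and the prices are invented for illustration, and the index is built with explicit nanosecond resolution as an assumption about what yfinance returns.

```python
import numpy as np
import pandas as pd

# Hypothetical three-day, UTC-aware index standing in for the yfinance result
raw = np.array(["2023-01-01", "2023-01-02", "2023-01-03"], dtype="datetime64[ns]")
dates = pd.DatetimeIndex(raw).tz_localize("UTC")
frame = pd.DataFrame({"Close": [16600.0, 16700.0, 16650.0]}, index=dates)

# Same conversion as in the article
frame["Date_Int"] = pd.to_datetime(frame.index).astype("int64")

first = int(frame["Date_Int"].iloc[0])
step = int(frame["Date_Int"].iloc[1] - frame["Date_Int"].iloc[0])
print(first)  # nanoseconds from the Unix epoch to 2023-01-01 00:00 UTC
print(step)   # nanoseconds in one day
```

Values this large are fine for a decision tree, which only compares the feature against thresholds, but they would need rescaling for models that are sensitive to feature magnitude.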

Next, I chose the independent and dependent variables. The independent variable is "Date_Int", the integer representation of the date, and the dependent variable is the Bitcoin closing price ("Close").

X = data[['Date_Int']].values
y = data['Close'].values
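The double brackets around 'Date_Int' matter: scikit-learn estimators expect the features as a two-dimensional array of shape (n_samples, n_features), while the target can be one-dimensional. A small sketch with a made-up frame (the values are invented for illustration) shows the difference:

```python
import pandas as pd

# Tiny invented frame with the same column names as the article's data
frame = pd.DataFrame({"Date_Int": [1, 2, 3, 4], "Close": [10.0, 11.0, 12.5, 12.0]})

X = frame[["Date_Int"]].values  # double brackets keep a column axis
y = frame["Close"].values       # single brackets give a flat array

print(X.shape)  # (4, 1)
print(y.shape)  # (4,)
```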

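In this step, I created an AdaBoostRegressor model. AdaBoost builds an ensemble by fitting a sequence of weak learners, here decision trees, with each new tree concentrating on the samples the previous ones predicted poorly. The max_depth=4 argument limits the depth of each tree. Note that n_estimators=1 means the ensemble contains a single tree, so no boosting actually takes place; raise n_estimators (scikit-learn's default is 50) to fit a boosted ensemble.

# DecisionTreeRegressor and AdaBoostRegressor were imported at the top
regr = AdaBoostRegressor(DecisionTreeRegressor(max_depth=4), n_estimators=1, random_state=1)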
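The effect of n_estimators can be checked directly by counting the fitted trees. The sketch below uses synthetic data rather than Bitcoin prices, and the names `single` and `boosted` are illustrative; the value 50 is simply scikit-learn's default, not a tuned setting.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic noisy trend, used only for illustration
rng = np.random.default_rng(0)
X_demo = np.arange(200, dtype=float).reshape(-1, 1)
y_demo = 100.0 + 0.5 * X_demo.ravel() + rng.normal(scale=5.0, size=200)

single = AdaBoostRegressor(DecisionTreeRegressor(max_depth=4), n_estimators=1, random_state=1)
boosted = AdaBoostRegressor(DecisionTreeRegressor(max_depth=4), n_estimators=50, random_state=1)
single.fit(X_demo, y_demo)
boosted.fit(X_demo, y_demo)

# estimators_ holds the trees that were actually fitted
print(len(single.estimators_))
print(len(boosted.estimators_))
```

With one estimator, the ensemble's prediction is just that tree's prediction. AdaBoost can also stop early, so the second count may be lower than 50.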

The model is then fitted to the selected data, so that it learns a mapping from date to closing price from the historical prices.

regr.fit(X, y)
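Fitting on the full history leaves no data for checking how the model behaves on dates it has not seen. A minimal sketch of one way to check this, assuming synthetic data and a hypothetical 250/50 chronological split (neither is part of the original code), holds out the most recent points and compares errors:

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic upward-trending series standing in for prices (invented data)
rng = np.random.default_rng(0)
days = np.arange(300, dtype=float).reshape(-1, 1)
prices = 100.0 + 0.5 * days.ravel() + rng.normal(scale=3.0, size=300)

# Chronological split: train on the first 250 days, hold out the last 50
X_train, X_test = days[:250], days[250:]
y_train, y_test = prices[:250], prices[250:]

model = AdaBoostRegressor(DecisionTreeRegressor(max_depth=4), n_estimators=50, random_state=1)
model.fit(X_train, y_train)

train_mae = np.mean(np.abs(model.predict(X_train) - y_train))
test_pred = model.predict(X_test)
test_mae = np.mean(np.abs(test_pred - y_test))

print(f"in-sample MAE: {train_mae:.2f}")
print(f"held-out MAE: {test_mae:.2f}")
print("distinct held-out predictions:", np.unique(test_pred).size)
```

Every held-out date lies beyond the last split threshold of every tree, so all held-out predictions come out identical. A date-only tree model therefore describes the past rather than extending a trend into the future.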

Using the trained model, I computed predictions for the same dates the model was trained on. These are in-sample fitted values rather than forecasts of future prices: a tree-based model cannot extrapolate a trend, so for dates after the training period it returns a constant value. I plotted the predictions and the actual prices on a logarithmic scale, which is easier to read than a linear one when prices vary widely over long periods. The plot shows how closely the model's output follows the actual data.

y_pred = regr.predict(X)
# Log-transform into new variables so the original y is not overwritten
log_y = np.log(y)
log_y_pred = np.log(y_pred)

plt.figure(figsize=(14, 7))
# Plot against the datetime index so the x-axis shows dates, not raw integers
plt.scatter(data.index, log_y, color='blue', label='Logarithmic Real Prices')
plt.plot(data.index, log_y_pred, color='red', label='Logarithmic Predictions', linewidth=2)
plt.title('Boosted Decision Tree Regression - Bitcoin Prices (2023 - Now) - Logarithmic Scale')
plt.xlabel('Date')
plt.ylabel('Log Price')
plt.legend()
plt.grid(True)
plt.show()
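An alternative, sketched here as an assumption rather than the approach used in this article, is to leave the prices untransformed and switch the y-axis itself to a log scale, so the tick labels stay in price units. The helper name `plot_on_log_axis` and the demo values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def plot_on_log_axis(x, actual, predicted):
    """Plot untransformed prices on a logarithmic y-axis and return the Axes."""
    fig, ax = plt.subplots(figsize=(14, 7))
    ax.scatter(x, actual, color="blue", label="Real Prices")
    ax.plot(x, predicted, color="red", linewidth=2, label="Predictions")
    ax.set_yscale("log")  # the axis is logarithmic; the data stays in price units
    ax.set_xlabel("Date")
    ax.set_ylabel("Price (log scale)")
    ax.legend()
    ax.grid(True)
    return ax


# Invented values for illustration only
x_demo = np.arange(5)
actual_demo = np.array([16600.0, 21000.0, 28000.0, 26500.0, 30200.0])
predicted_demo = np.array([17000.0, 20500.0, 27000.0, 27000.0, 30000.0])
ax = plot_on_log_axis(x_demo, actual_demo, predicted_demo)
```

In the article's code, the call would take data.index, y and y_pred as arguments, followed by plt.show().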

This code can serve as a starting point for developers of financial software who want to build applications that model the prices of stocks and indices. The values it outputs can also feed into financial reports. You can find the full code below. Thank you.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

# Download daily BTC-USD prices from 2023-01-01 onward
stock = yf.Ticker("BTC-USD")
data = stock.history(start="2023-01-01")

# Represent each date as an integer so it can be used as a numeric feature
data['Date_Int'] = pd.to_datetime(data.index).astype('int64')

X = data[['Date_Int']].values  # feature matrix, shape (n_samples, 1)
y = data['Close'].values       # target: closing price

# n_estimators=1 fits a single tree; increase it to fit a boosted ensemble
regr = AdaBoostRegressor(DecisionTreeRegressor(max_depth=4), n_estimators=1, random_state=1)

regr.fit(X, y)

# In-sample predictions, log-transformed into new variables for plotting
y_pred = regr.predict(X)
log_y = np.log(y)
log_y_pred = np.log(y_pred)

plt.figure(figsize=(14, 7))
# Plot against the datetime index so the x-axis shows dates, not raw integers
plt.scatter(data.index, log_y, color='blue', label='Logarithmic Real Prices')
plt.plot(data.index, log_y_pred, color='red', label='Logarithmic Predictions', linewidth=2)
plt.title('Boosted Decision Tree Regression - Bitcoin Prices (2023 - Now) - Logarithmic Scale')
plt.xlabel('Date')
plt.ylabel('Log Price')
plt.legend()
plt.grid(True)
plt.show()
