Hey! I’m Joanne, an intern at lemon.markets, and I’m here to share some invaluable Python libraries & packages to use when you’re working with financial data and automated trading. At lemon.markets, we provide the infrastructure for developers to build their own brokerage experience at the stock market. Whatever your product might look like, there’s usually one or more Python libraries that can do the legwork for you. So, instead of re-inventing the wheel, let’s have a look at which packages can facilitate your automated trading.
The focus here is on Python, but many of the featured libraries either have wrappers that allow them to be used in other languages or have comparable alternatives. If you have additional suggestions, feel free to leave a comment below. 💭
I’ve split the trading process into three general steps: manipulating (raw) data, performing technical analysis and finally assessing your portfolio. There’s probably 100+ steps that can be inserted into this process, but as a starting point, we think this is a solid place to begin. It covers the ‘before’, the ‘during’ and the ‘after’ when it comes to implementing your strategy. If you’re struggling to find more steps, perhaps consider: data collection, data visualisation, paper trading, backtesting, machine learning, portfolio management…must I go on?
Data Manipulation
We make the assumption here that you’re collecting data before writing your trading strategy. Live market data, historical data, trading sentiment: it all falls within this category. And before you can perform any kind of manipulation, you need data to do it on. Naturally, the lemon.markets market data API can be used to retrieve historical market data. We’ll also be providing real-time market data in the near future (stay tuned!). But you’re not restricted to only market data, you can also, for example, scrape headlines from financial news sites to perform sentiment analysis. Regardless of where you obtain your data, you’ll notice that often your source won’t present the data in exactly the format you need: cue data manipulation tools.
🐼 Pandas
import pandas as pd
No list of Python libraries for financial analysis (or really any kind of data-driven work) would be complete without the mention of Pandas. It’s a powerful data manipulation tool that works with data structures called Series (one-dimensional) and DataFrames (two-dimensional). It can be used to intelligently index data, merge and join different data sets and even perform computations. The documentation includes a 10-minute guide to Pandas and DataCamp has a tutorial on using Python for Finance.
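To make the indexing, joining and computing a bit more concrete, here is a minimal sketch. The ticker `AAA`, the prices and the volumes are made up purely for illustration.

```python
import pandas as pd

# Hypothetical closing prices for a made-up ticker (illustration only).
prices = pd.DataFrame(
    {"AAA": [100.0, 102.0, 101.0, 105.0]},
    index=pd.to_datetime(["2021-11-01", "2021-11-02", "2021-11-03", "2021-11-04"]),
)
# Hypothetical volumes with one trading day missing.
volumes = pd.DataFrame(
    {"AAA_volume": [1200, 900, 1500]},
    index=pd.to_datetime(["2021-11-01", "2021-11-02", "2021-11-04"]),
)

# Join on the date index; the missing volume for 2021-11-03 becomes NaN.
combined = prices.join(volumes, how="left")

# Column-wise computation: simple daily returns from the closing prices.
combined["return"] = combined["AAA"].pct_change()

# Label-based indexing with a date slice (both end points are included).
subset = combined.loc["2021-11-02":"2021-11-03"]
print(combined)
```

The same pattern applies to real market data: load it into a DataFrame, align it on a shared index and compute on whole columns at once.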
🔢 NumPy
import numpy as np
NumPy, or the Numerical Python library, is the package when it comes to scientific computing in Python. Specifically, NumPy provides functions for linear algebra, Fourier transforms and random number generation. It’s widely used because it utilises vectorisation: operations are applied to whole arrays at once in compiled code rather than element by element in a Python loop, which means it can turn a computation which might take 1000 cycles into one that takes 250 cycles (illustrative numbers, not a benchmark). As developers, we’re always looking to reduce computational cost wherever possible.
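As a rough illustration of what vectorisation looks like in practice, the sketch below computes simple returns twice: once with a Python loop and once with a single array expression. The price series is made up.

```python
import numpy as np

# Hypothetical price series (illustration only).
prices = np.array([100.0, 102.0, 101.0, 105.0, 104.0])

# Loop version: one Python-level operation per element.
loop_returns = []
for i in range(1, len(prices)):
    loop_returns.append(prices[i] / prices[i - 1] - 1)

# Vectorised version: one array expression, evaluated in compiled code.
vector_returns = prices[1:] / prices[:-1] - 1

# Both approaches give the same numbers.
assert np.allclose(loop_returns, vector_returns)
```

On five prices the difference is invisible, but on long series the vectorised form is typically much faster and also easier to read.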
👩‍🔬 SciPy
import scipy as sp
SciPy is the scientific library that builds on NumPy: it includes modules for statistics, optimisation, integration, linear algebra and more. It’s similar to NumPy, but with more functionality (which can come at a price: some routines are slower than their leaner NumPy counterparts). Check out the documentation to see if it meets your requirements!
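To give a feel for the optimisation module, here is a small sketch that finds the minimum-variance weights for two assets with `scipy.optimize.minimize`. The covariance matrix is made up, and a real portfolio would of course need an estimated one.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical covariance matrix of two assets' returns (illustration only).
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])


def portfolio_variance(weights):
    # Variance of a portfolio with the given weights.
    return weights @ cov @ weights


result = minimize(
    portfolio_variance,
    x0=np.array([0.5, 0.5]),                  # start from equal weights
    bounds=[(0.0, 1.0), (0.0, 1.0)],          # long-only
    constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},  # fully invested
)
weights = result.x
print(weights)
```

For two assets the answer can also be derived by hand, which makes this a handy sanity check before moving on to larger problems.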
📊 Matplotlib
import matplotlib.pyplot as plt
The Matplotlib library can be used to create static, animated and interactive visualisations in Python. There are a million reasons why you might like to visualise data in financial analysis. For example, you might want to measure the performance of a single stock (or basket of stocks) against an index like the S&P500. Or, you might want to construct a simple histogram of daily stock returns to determine (visually) whether they follow a normal distribution. Of course, this would need to be backed up by a statistical test, which can be done with the statsmodels library (coming up soon).
You’ll notice that the above four libraries are often used simultaneously in projects, and likely, in your use-case it’ll be the same situation. They integrate seamlessly. Many additional niche packages are built on top of these four packages, for example: PyNance. There’s lots of resources available regarding these libraries: to get started, here’s an introduction to NumPy and Pandas.
```python
import requests
import pandas as pd
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt


def get_ohlc():
    # request daily OHLC data for one ISIN from the market data API
    # (the URL is split over two string literals so it contains no stray spaces)
    response = requests.get(
        'https://data.lemon.markets/v1/ohlc/d1/?isin=US0378331005'
        '&from=2021-06-25T00:00:00&to=2021-11-14T00:00:00',
        headers={"Authorization": "Bearer YOUR-API-KEY"})
    results = response.json()['results']
    return results


def calculate_returns():
    df = pd.DataFrame(get_ohlc())
    # calculate returns based on closing price
    # (the first return is undefined, so it is back-filled with the next value)
    df['r'] = df['c'].pct_change().bfill()
    return df


def plot_returns():
    returns = calculate_returns()['r']
    # plot returns
    plt.hist(returns, bins=25, density=True, alpha=0.6, color='darkorange')
    # plot normal distribution fitted to returns
    mu, std = norm.fit(returns)
    xmin, xmax = plt.xlim()
    x = np.linspace(xmin, xmax, 100)
    p = norm.pdf(x, mu, std)
    plt.plot(x, p, 'k', linewidth=2)
    plt.xlabel('Daily Apple Returns')  # US0378331005 is Apple's ISIN
    plt.ylabel('Probability Density Function')
    plt.show()


plot_returns()
```
This script will plot a histogram of the daily returns, overlaid with the normal distribution fitted to them.
Obviously, we do not have enough data points to conclude whether the daily returns of this stock (Apple, going by the ISIN US0378331005) follow a normal distribution. We hope that this little example shows you what can be done with these data manipulation packages and our OHLC endpoint.
Technical Analysis
Your strategy may or may not employ technical analysis. If you’re somehow using historical price data to predict future price movement, then this falls under technical analysis. If you’re not, don’t worry, it’s not necessary in order to implement an automated trading strategy (but might be helpful nonetheless, so feel free to dive in).
📈 TA-Lib
import talib as ta
TA-Lib, or Technical Analysis Library, can be used to perform technical analysis on financial data by calculating well-known technical indicators, such as the Weighted Moving Average (WMA) or Relative Strength Index (RSI). It can also recognise candlestick patterns, such as the inverted hammer or homing pigeon, to name a few. These indicators might serve as buy or sell signals for your trading strategy.
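If you’re curious what such an indicator actually computes, here is a minimal pandas/NumPy sketch of a Weighted Moving Average with linearly increasing weights. This is not TA-Lib’s implementation (TA-Lib ships its own, to my knowledge exposed as `talib.WMA`); the function name `weighted_moving_average` and the prices are made up for illustration.

```python
import numpy as np
import pandas as pd


def weighted_moving_average(close: pd.Series, period: int) -> pd.Series:
    # Linear weights: the oldest price in the window gets weight 1,
    # the most recent price gets weight `period`.
    weights = np.arange(1, period + 1)
    return close.rolling(period).apply(
        lambda window: np.dot(window, weights) / weights.sum(), raw=True
    )


# Hypothetical closing prices (illustration only).
close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])
wma = weighted_moving_average(close, period=3)
print(wma)
```

The first `period - 1` values are NaN because there is not yet a full window of prices to average over.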
📉 Statsmodels
import statsmodels.api as sm
Python already includes a built-in statistics module, but the statsmodels package can be used for more in-depth statistical analysis. Say you want to construct an ARIMA model for historical price data in order to predict price movement in the future, then this library would be the tool to use. Or, your use-case might be simpler, such as conducting a Jarque-Bera test for normality of residuals after a regression.
🙋 PyStan
import stan
Bayesian inference is used in financial modelling to assess return predictability and strategy risk (among other things). PyStan is the Python-adapted package to perform Bayesian inference. Note that you need to use a domain specific language based on C++ (called Stan), which makes this package a bit more difficult to use. However, it is very powerful in that it allows you to perform high-level statistical modelling, analysis and prediction.
➗ f.fn()
import ffn
ffn is a library that extends Pandas, NumPy and SciPy and contains functions often used within the quantitative finance framework. For example, you can use it to calculate the risk parity weights given a DataFrame (🐼) of returns. If you’re not familiar with risk parity, it’s an investment management technique that determines how to allocate risk within a portfolio. (Have we mentioned that reading the documentation of financial-related libraries is a great way to get familiarised with new metrics?)
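To get an intuition for risk parity, here is a deliberately naive sketch that weights each asset by the inverse of its volatility, so that calmer assets receive more capital. It ignores correlations and is not how ffn implements its weighting functions; the asset names and simulated returns are made up.

```python
import numpy as np
import pandas as pd

# Simulated daily returns for three made-up assets (illustration only).
rng = np.random.default_rng(seed=1)
returns = pd.DataFrame({
    "low_vol": rng.normal(0, 0.005, 250),
    "mid_vol": rng.normal(0, 0.01, 250),
    "high_vol": rng.normal(0, 0.02, 250),
})

# Inverse-volatility weighting: a simplified stand-in for risk parity.
inverse_vol = 1.0 / returns.std()
weights = inverse_vol / inverse_vol.sum()
print(weights)
```

The weights sum to one, and the least volatile asset ends up with the largest share of the portfolio.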
The above four libraries can be used to determine when, what and how much to buy or sell. Once these decisions are made, the lemon.markets trading API can be used to place your orders on the stock market. An order can be placed as follows:
```python
import requests
import json

# placeholder: your strategy decides whether a trade should be placed
trading_signal = False


def place_order(isin: str, side: str, quantity: int):
    if trading_signal:
        response = requests.post(
            "https://paper-trading.lemon.markets/v1/orders/",
            data=json.dumps({
                "isin": isin,
                "expires_at": "p7d",
                "side": side,
                "quantity": quantity,
                "space_id": "YOUR-SPACE-ID",
                "venue": "XMUN"}),
            headers={"Authorization": "Bearer YOUR-API-KEY"})
        # only report success if the API accepted the request
        if response.ok:
            print("Your trade was placed")
        else:
            print(f"Order request failed with status {response.status_code}")
    else:
        print("Your trade was not placed")


place_order("US0378331005", "buy", 1)
```
The boolean trading_signal indicates whether the trade should be placed or not (this is where the aforementioned libraries come in handy). In this script it is only a placeholder set to False: we have not derived it from a strategy yet.
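One way trading_signal could be filled in is with a simple moving-average crossover: buy when the short-term average sits above the long-term one. This is only a sketch under that assumption, not a recommended strategy; the helper `moving_average_signal`, the window lengths and the prices are made up.

```python
import pandas as pd


def moving_average_signal(close: pd.Series, short: int = 3, long: int = 5) -> bool:
    # Compare the latest short-term average with the latest long-term average.
    short_ma = close.rolling(short).mean().iloc[-1]
    long_ma = close.rolling(long).mean().iloc[-1]
    # With too little data the averages are NaN and the comparison is False.
    return bool(short_ma > long_ma)


# Hypothetical closing prices in an upward trend (illustration only).
close = pd.Series([100.0, 101.0, 102.0, 104.0, 107.0])
trading_signal = moving_average_signal(close)
print(trading_signal)
```

In a real script, the closing prices would come from the market data API and the resulting boolean would decide whether the order request is sent.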
Portfolio Assessment
Once your strategy is finished and implemented, it’s important to measure its performance, not only by returns, but also by calculating e.g. the risk associated with it. Portfolio analysis is not a one-and-done event: a good investor assesses their portfolio (or automates the process) regularly and implements necessary changes, such as a rebalancing or purchasing additional stocks to diversify appropriately. Note: your lemon.markets portfolio can be accessed via the Portfolio endpoint:
```python
import requests

# fetch all positions in your paper-trading portfolio
response = requests.get("https://paper-trading.lemon.markets/v1/portfolio/",
                        headers={"Authorization": "Bearer YOUR-API-KEY"})
print(response.json())
```
Which would return each instrument in your portfolio in the following fashion:
    ISIN: {YOUR-SPACE-ID: {
        'buy_orders_total': '',
        'buy_price_avg': '',
        'buy_price_avg_historical': '',
        'buy_price_max': '',
        'buy_price_min': '',
        'buy_quantity': '',
        'orders_total': '',
        'quantity': '',
        'sell_orders_total': '',
        'sell_price_avg_historical': '',
        'sell_price_max': '',
        'sell_price_min': '',
        'sell_quantity': ''}}
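A nested structure like this is easier to analyse once it is flattened into a DataFrame. The sketch below assumes the response has the shape shown above and that the values arrive as strings; the space ID and all values are made up, and only a few of the fields are included.

```python
import pandas as pd

# Hypothetical response in the shape shown above (values are made up).
portfolio = {
    "US0378331005": {
        "my-space-id": {
            "quantity": "2",
            "buy_price_avg": "150.5",
            "buy_quantity": "2",
            "sell_quantity": "0",
        },
    },
}

# One row per (ISIN, space) pair, with the statistics as columns.
rows = [
    {"isin": isin, "space_id": space_id, **fields}
    for isin, spaces in portfolio.items()
    for space_id, fields in spaces.items()
]
positions = pd.DataFrame(rows)

# Convert the statistics to numbers (assumes they are numeric strings).
numeric_columns = [c for c in positions.columns if c not in ("isin", "space_id")]
positions[numeric_columns] = positions[numeric_columns].apply(pd.to_numeric)
print(positions)
```

From here, the positions can be fed straight into the performance and risk calculations discussed next.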
✋ Empyrical
import empyrical as em
Empyrical can be used to calculate well-known performance and risk statistics, for example the Sharpe ratio, alpha and beta. These metrics might show how the portfolio performs in relation to the market and indicate whether structural changes should be made. It’s an open-source project initiated by the now-defunct Quantopian; however, the GitHub repository remains somewhat active (fingers crossed it stays that way 🙏🏼).
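To show what one of these statistics boils down to, here is a minimal NumPy sketch of an annualised Sharpe ratio. It is not Empyrical’s code (the library provides a ready-made function for this); the helper name `annualised_sharpe`, the 252 trading days and the sample returns are assumptions for illustration.

```python
import numpy as np


def annualised_sharpe(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    # Excess return over the (assumed constant) daily risk-free rate.
    excess = np.asarray(daily_returns) - risk_free_daily
    # Mean excess return per unit of volatility, scaled to a yearly figure.
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)


# Hypothetical daily returns (illustration only).
daily_returns = [0.01, 0.02, 0.03]
print(annualised_sharpe(daily_returns))
```

The higher the ratio, the more return the portfolio has delivered per unit of risk taken.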
💼 PyFolio
import pyfolio as pf
PyFolio is closely related to Empyrical (it uses Empyrical for its calculations), but it goes a step further and creates visualisations that reflect performance and risk analysis. It does this through a so-called tear sheet, which includes metrics such as the stability, maximum drawdown and kurtosis of your portfolio’s returns.
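As an example of a tear sheet metric, the sketch below computes the maximum drawdown of a return series with plain pandas. It is not PyFolio’s implementation; the helper name `max_drawdown` and the returns are made up for illustration.

```python
import pandas as pd


def max_drawdown(returns: pd.Series) -> float:
    # Growth of one unit of capital over time.
    cumulative = (1 + returns).cumprod()
    # Highest value reached so far at each point in time.
    # (Simplification: the starting capital itself is not counted as a peak.)
    running_peak = cumulative.cummax()
    # Relative distance from that peak; the most negative value is the max drawdown.
    drawdown = cumulative / running_peak - 1
    return float(drawdown.min())


# Hypothetical periodic returns (illustration only).
returns = pd.Series([0.10, -0.20, 0.05, -0.10, 0.30])
print(max_drawdown(returns))
```

A maximum drawdown of -0.244 means the portfolio at one point sat 24.4% below its previous peak.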
These ten Python libraries and packages should provide a good starting point for your automated trading journey. Integration with the lemon.markets API is possible at every step: market data can be retrieved for data manipulation, orders can be placed according to technical indicators and the portfolio can be accessed to do risk and performance assessments. We strive to make the API as transparent as possible, to give you, the developer, full control over your brokerage experience.
If you’re not already signed up to lemon.markets, join our waitlist here. We’d love to have you! Connect with us by leaving behind a comment, sending us an email and joining our vibrant Slack community. You’re bound to pick up some additional tools and inspiration along the way.
See you soon,
Joanne 🍋
Top comments (2)
Hey there, thanks for your comment. Naturally, lemon.markets is not the only broker out there, but we think that it might solve a need for many developers out there. Feel free to check it out :)
Hello Joanne,
Isn’t FreqTrade considered a common framework?
What is your opinion on it?
Cheers,
Alexis