topjer

Posted on Jun 22, 2022 • Edited on Jun 26, 2022 • Originally published at topjer.net

Arbitrage - this is not a Game

#beginners #datascience #watercooler #devjournal

This article describes how I created and implemented a trading model for a game I have recently started playing. Even though it is "just a game" and therefore "not real", it is still a great sandbox to play around in, to explore concepts and to improve coding skills.

I will begin by describing the market conditions and drawing conclusions from them. The next step will be to dissect the first model I created, and lastly I will describe its flaws and how to fix them.

It All Started With a Game

Some months ago I started playing Final Fantasy 14, a massively multiplayer online role-playing game (MMO) set in the popular Final Fantasy universe.
As is usually the case for MMOs, this game has an auction house where you can buy and sell in-game items for in-game currency.

Whilst this is probably only a side feature for most people, it can be a great source of virtual income if played right.

The Market

The Facts

Let us first gather the hard facts about the market, as they set the boundaries for the model we need.

The in-game currency is called 'Gil'.
The markets are called 'Market Boards'.
The game has so-called 'Data Centers' which are, well, data centers in their respective regions.

Every data center has multiple worlds and you can travel without much effort between those worlds. (It only takes a couple of seconds to load and you do not have to pay anything.)

The market boards between the worlds of a data center are not connected. That means that prices can evolve differently between the worlds depending on the supply and demand generated by their player base.
A character can only sell items on the market board of their home world, i.e. the world they were created on, but can buy from every world of their data center.
If you want to sell an item you must place a sell order, specifying the price per item and the number of items you want to sell as a stack. Stack size is limited to 99.
Buy orders are not available! If you want to buy an item you have to choose among the existing sell orders.
Partially buying a sell order is not possible. You must buy the whole position.
Selling incurs a fee of 5%. As a matter of fact it could be reduced to 3% but for the sake of simplicity we will work with 5%, as this will not change the general approach.
Buying an item comes with a fee of 5% only if you do so on a world that is not your home world.
It is not possible to directly access market data, but there are websites that have users collect and upload price data. For example Universalis, which handily provides an API we can use to query order book and sales data. (As a side note, there actually seems to be an API for the market data in the game but it is not publicly accessible.)
Trading happens 24/7.
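
The fee rules above translate into a simple per-unit calculation. The following is a minimal sketch, assuming both fees are a flat 5% applied to the listed price; the function name and the example prices are made up for illustration.

```python
SELL_FEE = 0.05  # assumed flat fee when selling on the home world
BUY_FEE = 0.05   # assumed flat fee when buying on a foreign world


def net_profit_per_unit(buy_price, sell_price, bought_on_home_world=False):
    """Profit per item after fees (hypothetical helper, not game code)."""
    # Buying is only taxed on worlds other than the home world.
    buy_fee = 0.0 if bought_on_home_world else BUY_FEE
    cost = buy_price * (1 + buy_fee)
    revenue = sell_price * (1 - SELL_FEE)
    return revenue - cost


# Made-up prices: buy abroad for 1,000 Gil, sell at home for 1,200 Gil.
print(round(net_profit_per_unit(1000, 1200), 2))
```

With these numbers the raw difference of 200 Gil shrinks to about 90 Gil once both fees are paid.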

The Implications

Now it is time to draw a couple of conclusions from the facts above. Again, let's do this in the form of bullet points.

Because the market boards of different worlds are not connected, cross-world arbitrage opportunities can arise.
Ideally you would want to match buy orders on your home world with sell orders from a foreign world and execute immediately. But since you can only create a sell order on your home world, the price might drop rapidly before your order is filled, leaving you at a loss. This means there is a certain risk associated with the chosen approach, so technically it might not be fully correct to talk about 'arbitrage'. Yet I will continue to do so, as it makes it sound way cooler.
Not being able to partially execute an order is also bothersome. On the one hand, you might not be able to afford a lucrative position. On the other hand, it forces us to think about a good stack size when selling ourselves.
Because we cannot access up-to-date data, it can happen that arbitrage opportunities we find are no longer available.

The Model v0.1

Here is the high-level workflow of what my first model was doing.

Query a list of items that were most recently updated on a given world. There is an API endpoint for that, provided by Universalis.
For each item I would determine the minimum price on each world of my data center by again querying the API from Universalis.
Compute the difference between the price on my home world and the prices on the other worlds, and remember all worlds where the price difference is positive, i.e. where I can make money.
Query the list of past transactions (another call to the Universalis API) and filter for the last 14 days. Compute the total number of traded items per day. Afterwards, compute the max, min and median over the available days.
As a measure for how profitable an item could be, I computed the 'median win' which is the computed price difference times the median number of traded items.
In the end I would report those items that have
- a profitable price difference
- trading activity on 85% of days in the past 14 days.
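
The workflow above can be sketched roughly as follows. This is not the original script: it uses only the standard library instead of pandas, works on data that has already been downloaded, and the data layout (`min_prices` as a dict of world to price, `sales` as `(date, quantity)` pairs) and all function names are assumptions for illustration. It also assumes that the 85% is a minimum and that the 'median win' uses the largest price gap, neither of which is spelled out above.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import median


def profitable_worlds(min_prices, home_world):
    """Map each foreign world to its price gap versus home, if the gap is positive."""
    home_price = min_prices[home_world]
    return {
        world: home_price - price
        for world, price in min_prices.items()
        if world != home_world and home_price - price > 0
    }


def daily_volume(sales, today, window_days=14):
    """Total items sold per day within the window; days without sales are absent."""
    cutoff = today - timedelta(days=window_days)
    per_day = defaultdict(int)
    for sale_date, quantity in sales:
        if cutoff < sale_date <= today:
            per_day[sale_date] += quantity
    return dict(per_day)


def evaluate_item(min_prices, sales, home_world, today,
                  window_days=14, min_activity=0.85):
    """Return a report dict for one item, or None if it fails the filters."""
    gaps = profitable_worlds(min_prices, home_world)
    volumes = daily_volume(sales, today, window_days)
    if not gaps or not volumes:
        return None  # no profitable world or no sales at all
    if len(volumes) / window_days < min_activity:
        return None  # traded on too few days
    # Assumption: rate the item by its largest gap.
    best_world, best_gap = max(gaps.items(), key=lambda kv: kv[1])
    med = median(volumes.values())
    return {
        "buy_on": best_world,
        "price_gap": best_gap,
        "min_volume": min(volumes.values()),
        "max_volume": max(volumes.values()),
        "median_volume": med,
        "median_win": best_gap * med,
    }
```

Note that with a 14-day window the 85% threshold means an item needs sales on at least 12 of the 14 days to be reported.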

The Shortcomings

Every journey has to start with a first step. The same is true in this case.
It will probably not surprise you that this model did not work too well. But it allowed me to explore basic concepts and fundamental implementation techniques.

Enough talk! Here are the issues that quickly arose, along with some possible improvements for the future.

The 'median win' was incredibly misleading. It would usually overestimate the potential win, obviously due to its poor design.
- It could be that only a single position would be priced much cheaper than the rest. So the same price difference could not be achieved with the other positions.
- Alternatively, it could happen that the number of favorable positions is less than the median.
- It would be much better to compare the actual listings of other worlds with the minimum price on my home world. That way I could see the profit to be made.
It turns out that picking the items 'at random' is not the best approach. What a shocker! That way it is hard to find potential arbitrage in general, and even if something is found, it might be in a category that has little demand.
- It would be much more promising to focus on categories with higher demand and thus higher trading activity. The margins might be smaller in this segment but being able to quickly close a position reduces risk. The longer a position is open, the higher the chance that the prices fall.
- Going by category requires preparing a mapping of items to categories. This is feasible, as the data is freely available.
I was ignoring the fees entirely.
- Luckily, the impact of this is not too bad. It just slightly overestimates profit margins. On top of that, it is easy to fix. Simply add the fees to the formula for the price differences. So easy that it makes you wonder why it was not implemented in the first place.
My script had poor error handling, which caused it to crash for items that are illiquid.
- Mostly annoying and nothing that couldn't be fixed quickly.
I was working with absolute price differences. Expensive items would produce higher price differences but would also carry a greater risk and require a greater initial investment. A price difference of 5000 sounds like a lot, but when the item costs 7 million Gil, it might not be worth the effort.
- This can be fixed by putting the win in relation to the required investment.
It was hard to decide on stack size for my sell orders. I had an idea of the daily volume but not which stack size I should choose.
- This is a tricky one. I have noticed tendencies for some items, e.g. people would pay more for a smaller stack when they need less of those items.
- One possible approach could be to look at the histogram of past stack sizes but I have the feeling that there is much more to uncover here.
The last one is less of a shortcoming but holds a lot of potential: There seem to be fluctuations in the price of items depending on the weekday, which can be attributed to the fact that more people play on the weekends. Yet the impact of that is not easy to predict. Maybe this means that the demand is increased because everyone needs a certain item or, conversely, the supply increases because more is being produced.
- The big issue here is data availability, as Universalis does not seem to keep a complete sale history. Thus it would be necessary to create a price time series yourself and then analyze the price change with regard to the weekday.
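
Several of the fixes above (comparing actual listings, including fees, and relating the win to the investment) can be combined in one place. The following is a minimal sketch under the assumptions that both fees are a flat 5%, that each listing is a `(price_per_unit, quantity)` pair that must be bought whole, and that the items can be resold at the current minimum price on the home world. All names are hypothetical.

```python
BUY_FEE = 0.05   # assumed flat fee when buying on a foreign world
SELL_FEE = 0.05  # assumed flat fee when selling on the home world


def rate_listings(foreign_listings, home_min_price):
    """Rate each foreign listing against the cheapest home-world price.

    foreign_listings: iterable of (price_per_unit, quantity) tuples.
    Returns the profitable listings sorted by return on investment, best first.
    """
    rated = []
    for price, quantity in foreign_listings:
        # A listing cannot be bought partially, so the whole stack is the investment.
        investment = price * quantity * (1 + BUY_FEE)
        revenue = home_min_price * quantity * (1 - SELL_FEE)
        profit = revenue - investment
        if profit > 0:
            rated.append({
                "price": price,
                "quantity": quantity,
                "investment": investment,
                "profit": profit,
                "roi": profit / investment,  # win relative to the money tied up
            })
    return sorted(rated, key=lambda row: row["roi"], reverse=True)


# Made-up order book: the last listing loses money once fees are included.
listings = [(800, 10), (900, 99), (1100, 5)]
for row in rate_listings(listings, home_min_price=1200):
    print(row["price"], row["quantity"], round(row["profit"]), f"{row['roi']:.1%}")
```

Sorting by return on investment rather than by absolute profit puts the small, cheap stack ahead of the large one, even though the large stack yields more Gil in total. Which of the two is preferable depends on how much capital and risk you are willing to commit.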

Implementation

Finally, I want to briefly touch upon how I implemented this script without going into too much detail.
Everything is implemented in Python and runs under the Windows Subsystem for Linux on my Windows gaming PC. This makes me chuckle a bit every time because I assume that I am the only one with a bash window on his second screen while playing.

The API requests are done with the 'requests' package and the most frequently needed calls are encapsulated in functions.

Data wrangling is done with 'Pandas', which gave me quite a headache because of how unfamiliar I am with the package, and also because I had a clear idea of how to achieve what I wanted as a set of SQL statements.
This was especially annoying in conjunction with dates and datetime operations.
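
As an illustration of such an encapsulated call, here is a minimal sketch. It is not the original code: the URL layout and the `listings` / `pricePerUnit` field names are assumptions that should be checked against the Universalis API documentation, and the function names are hypothetical. It also returns `None` for items without listings instead of crashing, which addresses the error handling problem mentioned earlier.

```python
BASE_URL = "https://universalis.app/api/v2"  # assumed base URL, verify before use


def fetch_json(url, http=None, timeout=10.0):
    """GET a URL and return the decoded JSON body."""
    if http is None:
        import requests  # imported lazily so a stub can be injected for testing
        http = requests
    response = http.get(url, timeout=timeout)
    response.raise_for_status()  # turn HTTP error codes into exceptions
    return response.json()


def min_price(world, item_id, http=None):
    """Cheapest listed unit price of an item on a world, or None if unlisted."""
    data = fetch_json(f"{BASE_URL}/{world}/{item_id}", http=http)
    # Field names below are assumptions about the response format.
    prices = [listing["pricePerUnit"] for listing in data.get("listings", [])]
    # Illiquid items may have no listings at all; report that instead of crashing.
    return min(prices) if prices else None
```

Passing the HTTP client in as a parameter is a design choice of this sketch: it keeps the price logic testable without network access.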

Conclusion

The project so far had its ups and downs.

The implementation was rather frustrating at times, especially because I knew clearly how to solve the problems with other tools. As if that was not enough, the outcome turned out to be less usable in the end than I had hoped.

But as they say: 'It is about the journey and not the destination'. It was really fun to think about ways to find arbitrage, and the first version of the model made it really easy to come up with improvements.

The issues with the implementation were also very educational and now I have a first impression of pandas. I would probably still have to google everything, if I were to repeat the implementation. But maybe I would be a bit quicker in the process.

I definitely want to continue with this project as it holds a ton of potential for further learnings even outside of the things mentioned here and I will definitely take you along on the journey.

Do you have other ideas how to improve my model or did you realize arbitrage possibilities somewhere else? Let me know in the comments.
