Mapping out Gacha Pull Probabilities using Python and Google Colaboratory

sr229 profile image Ayane Satomi ・4 min read

COVID-19 brought us some good stuff because of boredom. Some of you may have picked up applications you abandoned for lack of time, and some of you are probably giving your cats a good, well-earned petting. I spent one day of my quarantine doing what most otakus with computer science degrees do best: using our knowledge of programming and technology to "cheat" Gacha pulls.

First of all, I do not condone cheating. The idea here is that a group of people were working on a statistical model of Gacha probabilities, which tend to come from a very predictable RNG. I got interested, so I decided to hop in on the fun.

Background

Gacha originates from Gacha vending machines: you usually drop in a hundred yen to get a capsule with a cute thing inside, but one out of a thousand capsules in that machine holds a very rare item.

Fast forward to modern gaming: the gacha mechanic is used by a lot of games, especially anime-themed ones like Arknights.

Gacha games usually employ an index of items to check and a Pseudo-Random Number Generator (PRNG), seeded with either your User ID or, in more extreme cases, the server time. On top of that, Gacha RNG has a "drop rate", or the probability that you would get your preferred item within N drops. Testing this properly involves a lot of money, so why not use our data science knowledge to figure out the distribution instead of wasting buckets of money working it out by hand?
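To make the drop rate idea concrete, here is a minimal sketch, assuming every pull is independent and has the same fixed rate (so no pity system). The function names and the 2% rate in the usage lines are my own assumptions for illustration, not values taken from the project.

```python
import math


def chance_of_at_least_one(rate: float, pulls: int) -> float:
    """Probability of at least one hit in `pulls` independent pulls."""
    # The chance of missing every pull is (1 - rate) ** pulls.
    return 1.0 - (1.0 - rate) ** pulls


def pulls_needed(rate: float, target: float) -> int:
    """Smallest number of pulls whose hit chance reaches `target`."""
    # Solve 1 - (1 - rate) ** n >= target for n, then round up.
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - rate))


# Hypothetical 2% drop rate, used only as an example.
print(chance_of_at_least_one(0.02, 50))
print(pulls_needed(0.02, 0.5))
```

With a flat rate, the hit chance creeps up slowly with each pull, which is exactly why testing it with real money gets expensive.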

Starting out

The entire project was started by an acquaintance of mine who goes by KaidenFrizu, with one simple goal: they wanted to figure out how many rolls it would take to get the desired result a given N number of times.

This had actually been going on for a while. I only joined the party because I believed that the RNG algorithm also has an effect on the outcome of a roll (not all RNGs are created equal, after all: some of them use bit shifts and some of them use cryptographically secure algorithms).

After a while my involvement became more hands-on as I started to port Python code written by another member of the discussion who goes by the name "Eyenine". Together with the resident statistician of the group, SurfChu85, we implemented a Jupyter notebook that maps out the probability of rolls over a specific number of iterations.

The implementation

The implementation focused on using the Subtractive PRNG - the same PRNG used by C# for its Random API - and a Numba-optimized loop function that does the following:

  • for a specific drop rate X, N iterations, and a sample size of Y:
    • iterate until N is reached as...
    • ...another iterator iterates until sample size Y is achieved...
    • keep iterating on attempts, comparing each value from the RNG against the rate to find a hit. If a hit is found, increment the rate and start all over until the maximum increment range is reached.
    • After one iteration for N is done, save the output to an iterable and repeat the loop until N is reached.

This algorithm was designed to match the Arknights RNG system, which also includes a "pity" system that increases the rate once the RNG has failed to produce the target drop for a specified number of rolls.

You can view the entire implementation on GitHub:
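As a rough illustration of a seeded roll loop with a pity system, here is a minimal sketch. It is not the notebook's code: it uses Python's built-in `random.Random` (a Mersenne Twister) as a stand-in for the Subtractive PRNG, leaves out Numba, and the names `base_rate`, `pity_start`, and `pity_step`, along with their default values, are assumptions picked for the example.

```python
import random
from collections import Counter


def pulls_until_hit(rng, base_rate, pity_start, pity_step):
    """Count pulls until the first hit, raising the rate after a failure streak."""
    rate = base_rate
    pulls = 0
    while True:
        pulls += 1
        if rng.random() < rate:
            return pulls
        # Pity: once `pity_start` pulls have failed in a row,
        # every further failure raises the rate for the next pull.
        if pulls >= pity_start:
            rate = min(1.0, rate + pity_step)


def simulate(seed, samples, base_rate=0.02, pity_start=50, pity_step=0.02):
    """Tally how many pulls each of `samples` runs needed, for one seed."""
    rng = random.Random(seed)
    return Counter(
        pulls_until_hit(rng, base_rate, pity_start, pity_step)
        for _ in range(samples)
    )


# Hypothetical seed standing in for a user ID.
distribution = simulate(seed=12345, samples=10_000)
print(sorted(distribution.items())[:5])
```

Plotting the resulting counter as a histogram gives the kind of probability curve the notebook maps out. Swapping in a different generator only requires an object with a `random()` method that returns floats in [0, 1).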

sr229 / gacha-prng

A Jupyter Notebook that maps the probability distribution of the Arknights RNG for Gacha rolls.

Running

You can run this on Google Colaboratory!

Copyright

Copyright 2020 © Ayane Satomi, KaidenFrizu, et al. Licensed under MIT.




You can open the notebook in Google Colaboratory. It should also work with Binder if you'd rather not use Google services.

The results

From what we can gather on the output of the Jupyter notebook, we have concluded the following:

Fixed UID as seeds

The curve is much steeper with fixed seeds (that is, seeds that use UIDs).

From what I have analyzed, fixed UIDs will therefore require more pulls to get a desired drop, and only at the 51st pull will you see the probability increase.

Time as seed

One of the most interesting simulations we did tested a theory that a variable seed would yield a much wider graph, which would imply that with an ever-increasing seed the distribution gets higher as time goes on.

Again, due to how Arknights' pity system modifies the Gacha algorithm, the pity system would be overkill, to the point that the probability would increase at lower roll counts and uniformly at higher roll counts as well. This gives an insight into why time-based seeds are not used very often in Gacha games.
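For readers unfamiliar with seeding, here is a small sketch of the difference between the two seed types. It only shows the seeding behaviour, not the project's simulation, and it uses Python's `random.Random` rather than the Subtractive PRNG. The helper `first_rolls` and the UID value are hypothetical.

```python
import random
import time


def first_rolls(seed, count=5):
    """Return the first `count` values produced by a generator with this seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]


uid_seed = 12345678  # hypothetical user ID

# A fixed seed replays the exact same sequence every time.
fixed_a = first_rolls(uid_seed)
fixed_b = first_rolls(uid_seed)

# A time-based seed changes between sessions, so the sequence changes too.
now = time.time_ns()
time_a = first_rolls(now)
time_b = first_rolls(now + 1)  # stands in for a slightly later session

print(fixed_a == fixed_b)
print(time_a == time_b)
```

This is why a UID-seeded generator can be mapped out ahead of time, while a time-seeded one has to be treated as a fresh sequence on every session.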

Conclusion

Based on our discovery, I can only come up with one TL;DR - prepare to simp even harder because you'll need to roll even more.

However, from an experience standpoint, it really shows what we can do with simple statistics and data science to solve a literal "million dollar question". Such methods have been used to solve more practical problems, but I can now rest well knowing we just helped an entire Gacha community plan properly for the money drain we call Gacha rolls.

Credits

Of course, this entire knack of a project wouldn't have been possible without this bunch of people:

We made this project for a greater community and I hope you like what we did in it!💖💖💖
