Analyzing League of Legends Data with R

S🅰️Ⓜ️ 🛋

Sourcing the data

If you want to analyze all gameplay from public matches, Riot has a great API. Personally, I like to look at the pros in competitive play, and an excellent resource for that is Oracle's Elixir. Specifically, I use its match data files as the starting point for all of my research.

The R Part

As I said, we'll be using the match data file provided by Oracle's Elixir, specifically the Summer 2019 file. If you're not familiar with R yet, that's OK! We'll keep it simple today. All you need to get started is RStudio. There are two libraries we'll be using: tidyverse (a suite of several libraries for data manipulation) and openxlsx (you may have noticed the file is an Excel document, and this lets us open it with ease). With these two libraries, we can get started!
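Here is a minimal sketch of the setup, assuming both packages are already installed. The file name is hypothetical, so swap in whatever you named the file you downloaded from Oracle's Elixir; `match_file` and `match_data` are just names I picked for this example.

```r
# install.packages(c("tidyverse", "openxlsx"))  # run once if you don't have them yet
library(tidyverse)
library(openxlsx)

# Hypothetical file name: use the name of the file you actually downloaded
match_file <- "2019-summer-match-data.xlsx"

if (file.exists(match_file)) {
  # The one line that matters: read the first worksheet into a data frame
  match_data <- read.xlsx(match_file)
  glimpse(match_data)  # quick look at the column names and types
} else {
  message("Could not find ", match_file, ". Download it first or fix the path.")
}
```

The `file.exists()` check is only there so the script gives a friendly message instead of an error when the path is wrong.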

With one line of code, we're ready to rock! Once you run that, we'll have the data loaded in, but before we can start an analysis, we need to clean it up a bit. For example, R thinks patchno is a number (11.1, for example), but we really should treat it as a string. The gamelength column should be a double. The date is also stored in an Excel-specific format, so we should turn it into a usable date. So let's take care of these things:
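A sketch of what that cleanup can look like is below. So that it runs on its own, it uses a tiny made-up data frame in place of the real file; the column names patchno, gamelength and date come from the data, but the values, and the guess that gamelength and date arrive as text, are assumptions for illustration.

```r
library(dplyr)

# Toy stand-in for the loaded sheet (values are made up for illustration)
match_data <- data.frame(
  patchno    = c(9.12, 9.13),
  gamelength = c("31.5", "28.75"),
  date       = c("43617", "43624"),
  stringsAsFactors = FALSE
)

match_data <- match_data %>%
  mutate(
    patchno    = as.character(patchno),   # a patch is a label, not a quantity
    gamelength = as.double(gamelength),   # game length as a double
    # Excel serial date: first make it a number, then count days from Excel's origin
    date       = as.Date(as.numeric(date), origin = "1899-12-30")
  )

str(match_data)
```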

You'll notice that the date cleaning looks a little more involved than the others. The reason is that we first need to represent the column as a number, and then convert that number to a date. The trick here is that you need to provide the origin parameter, which for Excel is December 30, 1899 (I don't know why, but thanks Google!).
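To see the two steps in isolation, here is a small base R sketch with a single made-up serial number, assuming the value arrives as text. For the curious, the usual explanation for the odd origin is that Excel counts January 1, 1900 as day 1 and also treats 1900 as a leap year, which pushes the effective "day zero" back to December 30, 1899.

```r
# An Excel serial date, stored as text (a made-up example value)
excel_serial <- "43617"

# Step 1: text -> number
as_number <- as.numeric(excel_serial)

# Step 2: number -> Date, counting days forward from Excel's origin
as_date <- as.Date(as_number, origin = "1899-12-30")

print(as_date)  # 2019-06-01
```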


Finally, let's do a simple analysis using the Tidyverse packages!

Once we have the data thoroughly cleaned and ready to rock, we first filter it by region (the column is called "league") to isolate the LCS, the North American league. Next we keep only the "Team" rows: as opposed to the results for individual players, the "Team" rows represent the aggregate performance of the entire team. Then we group by the actual teams. Finally, we create a summary for each team, where we tell it to create a column for each of the summaries that we want to see.
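Those steps can be sketched roughly like this. The data frame is a made-up miniature so the example runs by itself. I'm assuming the "Team" marker lives in a column called player and that wins are stored as 1/0 in a result column, so check the column names in your file. The three summary columns are just examples, not a fixed recipe.

```r
library(dplyr)

# Made-up miniature of the match data (team names and numbers are placeholders)
match_data <- data.frame(
  league     = c("LCS", "LCS", "LCS", "LCS", "LEC", "LCS"),
  player     = c("Team", "Team", "Team", "Team", "Team", "Player 1"),
  team       = c("Team A", "Team B", "Team A", "Team B", "Team C", "Team A"),
  result     = c(1, 0, 1, 1, 1, 1),
  gamelength = c(30, 30, 35, 40, 25, 30),
  stringsAsFactors = FALSE
)

team_summary <- match_data %>%
  filter(league == "LCS") %>%    # keep only the North American league
  filter(player == "Team") %>%   # keep the aggregate rows, drop individual players
  group_by(team) %>%             # one group per team
  summarise(
    games          = n(),              # number of games played
    win_rate       = mean(result),     # share of games won
    avg_gamelength = mean(gamelength)  # average game length
  )

print(team_summary)
```

Each argument to `summarise()` becomes one column in the output, so adding another statistic is just one more line.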


In the future, I'll be sharing how to do a more in-depth analysis, but hopefully this is a great place for you to get started. Let me know if you have any questions or ideas for types of analysis to perform!

Posted on Jun 28 '19 by:

S🅰️Ⓜ️ 🛋


I am a Developer Advocate focussing on AI and data science. I enjoy exploring the intersection of entertainment and technology and am interested in all things esports.


I have a question: why did you choose to use R instead of Python? What are the benefits of using one vs. the other? And in what cases is one better than the other? I really liked the article. I enjoy some League of Legends stats. 😊