This is a personal project I wanted to undergo after completing my Google Data Analytics Certificate. I am a recent FromSoftware fan and was curious about what percentage of Elden Ring players complete certain bosses. Then, if there was any pattern found between completed bosses and their stats.
Steam API Data from 10/10/2022
Elden Ring is an open-world game by FromSoftware and has achievements tied to specific bosses. In this project, I found the achievement name, the corresponding bosses and then pulled those bosses’ stats. Then, I took that information and looked for patterns.
I used .csv files for most of the exported files so that I can more easily move the files between applications for analysis.
Todd Birchard (copied json_extract from his article)
Since Elden Ring is an open-world game, it can not be determined where a player may have “given up” in the storyline. There are a few achievements tied to required events, but those are not explored in this particular project.
Furthermore, a player may stumble upon a boss too early and never return to a boss after repeated defeats. I would need in-game map information to delve into the possibility of those scenarios, which is not within the scope of this project.
Finally, there are differences in bosses’ move sets and relative difficulty. This would require frame data, movement speeds, and other information I do not have access to.
Which bosses are completed most and least by Elden Ring owners on Steam?
Are there patterns between the most/least completed bosses and their absorption or resistance values?
How could one use this data to inform a specific build or strategy in game?
Margit, Godrick, Red Wolf of Radagon, and Leonine Misbegotten are the most defeated bosses. The Regal Ancestor Spirit, Mohg, Malenia, Lichdragon Fortissax are the least defeated bosses.
There is an increase in the average absorption and resistance values for bosses with <43.10% completion.
If doing a specific run or build it can be important to avoid depending too much on status effects and elemental damage weapons or spells.
Full disclosure, I am not a data scientist and I learned basic Python in high school and only picked it up again recently. Perhaps, in the future, I can come back and improve upon these data collection codes as I continue to learn more.
Steam API allows Steam users to access public user data. json_extract was taken from Todd Birchard. The following code returns the Elden Ring achievement names and the percentage of players who own Elden Ring that have that achievement:
import pandas as pd
import requests
import numpy as np
#Elden Ring Steam ID
game_id = '1245620'
responses = requests.get("http://api.steampowered.com/ISteamUserStats/GetGlobalAchievementPercentagesForApp/v0002/?gameid="+game_id+"format=json")
#json_extract is from Todd Birchard on hackersandslackers.com
def json_extract(obj, key):
"""Recursively fetch values from nested JSON."""
arr = []
def extract(obj, arr, key):
"""Recursively search for values of key in JSON tree."""
if isinstance(obj, dict):
for k, v in obj.items():
if isinstance(v, (dict, list)):
extract(v, arr, key)
elif k == key:
arr.append(v)
elif isinstance(obj, list):
for item in obj:
extract(item, arr, key)
return arr
values = extract(obj, arr, key)
return values
#Combine name with percentages into one dataframe
df_name = json_extract(responses.json(), 'name')
df_percent = json_extract(responses.json(), 'percent')
test = np.column_stack((df_name,df_percent))
df = pd.DataFrame(test, columns=['ID','Percent'])
#comment following line to stop file from being overwritten
#csv = df.to_csv('./data/download/steam_api_data.csv')
df.head()
## ID Percent
## 0 ACH39 79.80000305175781
## 1 ACH25 74.9000015258789
## 2 ACH04 68.9000015258789
## 3 ACH40 63.900001525878906
## 4 ACH26 62.5
I pulled the display names and descriptions for the achievements from SteamDB. I then connected these names and descriptions to the Steam API IDs:
for row in a_names:
#rows in list of achievement names
names.append(row)
for a_row in a_percent:
#rows in list of achievement percents
percents.append(a_row)
#Achievement descriptions
for rows in names:
for divclass in soup.find_all('tr', {'id' : 'achievement-'+rows}):
ad.append(divclass.text)
dfs = pd.DataFrame(ad)
ads = dfs.replace(r'\n',' ',regex=True)
#Split columns and delete blank columns
ads= pd.DataFrame(ads[0].str.split(' ',expand=True),columns=[0, 1, 2])
stack = np.column_stack((ads,percents))
merge = pd.DataFrame(stack, columns=['ID','Boss', 'Description', 'Percent'])
final = merge[['Boss', 'ID', 'Description', 'Percent']]
file_address = 'achievement_descriptions.csv'
#comment following line to stop file from being overwritten
#ads.to_csv(file_address)
final.head()
## Boss ID Description Percent
## 0 Roundtable Hold ACH39 Hidden. 79.800003
## 1 Margit the Fell Omen ACH25 Hidden. 74.900002
## 2 Shardbearer Godrick ACH04 Hidden. 68.900002
## 3 Great Rune ACH40 Hidden. 63.900002
## 4 Red Wolf of Radagon ACH26 Hidden. 62.5
I took the URLs from the Fextralife Elden Ring Bosses page:
import numpy as np
import pandas as pd
import requests
from bs4 import BeautifulSoup
url = 'https://eldenring.wiki.fextralife.com/Bosses'
reqs = requests.get(url)
soup = BeautifulSoup(reqs.text, "html.parser")
#Empty frames for Titles and corresponding URLs
titles = []
urls = []
links = soup.find_all('a', class_='wiki_link')
for link in links:
Title = (link.get('title'))
HRef = (link.get('href'))
titles.append(Title)
urls.append(HRef)
test = np.column_stack((titles,urls))
df = pd.DataFrame(test, columns=['Title','URL'])
#comment following line to stop file from being overwritten
#csv = df.to_csv('boss_urls.csv')
df.head()
## Title URL
## 0 Elden Ring Wiki /Elden+Ring+Wiki
## 1 Elden Ring Runes /Runes
## 2 Elden Ring Weapons /Weapons
## 3 Elden Ring Armor /Armor
## 4 Elden Ring Magic Spells /Magic+Spells
Using a VLOOKUP, I connected the boss name with the website’s labels and returned the URL extension. Using the VLOOKUP file, I pulled specific paragraphs from Fextralife boss pages. Each loop is output into a separate worksheet in an Excel workbook. This raw file needs the most cleaning; there are extra rows, special characters and differing stat terms.
import pandas as pd
import requests
from bs4 import BeautifulSoup
df=pd.read_csv("steam_boss_info-Vlookup.csv")
new_url = (df["URL"])
#Empty frame for URL endings
address = []
#create Excel sheet
writer = pd.ExcelWriter('boss_stats.xlsx', engine='xlsxwriter')
for row in new_url:
#print(row)
address.append(row)
for rows in address:
url = 'https://eldenring.wiki.fextralife.com'+rows
reqs = requests.get(url)
soup = BeautifulSoup(reqs.text, "html.parser")
#Empty frame for boss stat information
stats = []
for divclass in soup.find_all('div', {'class' : 'col-sm-6'}):
for ultag in divclass.find_all('ul'):
for litag in ultag.find_all('li'):
#minimize the long strings not needed from this loop
if len(litag.text) < 43:
stats.append(litag.text.split(":"))
dfs = pd.DataFrame(stats)
unique = dfs.drop_duplicates()
#checks that the sheet name doesn't exceed Excel worksheet maximum
sheet_names = rows[:30]
unique.to_excel(writer, sheet_names.strip("/"))
#comment following line to stop file from being overwritten
#writer.save()
unique.head()
## 0 1
## 0 Phy (Standard) 35
## 1 Phy (Slash) 35
## 2 Phy (Strike) 35
## 3 Phy (Pierce) 10
## 4 Magic 40
There is a range from 27.9% to 74.90% for the % of Elden Ring players that completed certain bosses. From my own personal knowledge of the game, it was unsurprising that bosses that are optional and/or hard to find like the Lichdragon Fortissax, Shardbearer Malenia, and Mohg the Fell Omen were the least completed.
The following are the overviews of the absorption and resistance numbers side by side with the completion percentages to search for initial patterns. Astel, Godrick, Valiant Gargoyles, and the Royal Knight Loretta were omitted from the elemental and status effect visualizations. Those bosses had several immune values that I translated to “9999” in my data, so I removed them to avoid skewing the averages Their data was included in the Physical Damage absorption analysis:
Madness and Sleep were omitted as the value was either immune or blank; This made it an uncommon binary effect. Death was included as it was the most common binary status effect.
A difference could be seen in the amount of high absorption and resistance values on the lower completion % group, but I needed to make it more visually clear to reach a confident conclusion
I looked at the stat averages of the above and below 43.10% completion groups and found compelling differences. I made sure the scales were the same across the absorption categories, to prevent visual distortion of the results.
The biggest difference between the groups is Lightening and Holy, differing by 14 and 15 respectively. The bosses that were completed least had a higher average absorption in all elemental damage types.
The biggest difference between the groups is Poison and Scarlet Rot, differing by 187 and 189 respectively. The bosses that were completed the least had a higher average resistance to all status effects
Physical damage types showed the smallest relative differences. The biggest difference is Strike and Standard, differing by 15 and 6 respectively. The bosses that were completed the least had increased physical damage absorption, but not as drastically as status effects and elemental damage.
There is a negative relationship between the absorption and resistance numbers and the % of Steam Elden Ring players who defeat a boss.
This project stemmed from my interest in data, video games, and how people are known to drop FromSoftware games due to difficulty. So, after looking at all this data, I have some general insights and applications for my conclusions:
Build Choice: Individuals will often pick a certain type of damage and weapon they will stick with for a playthrough of a game. One popular one is a Bleed Build, which increases the infliction of what is labeled Hemorrhage in this project. This can lead people to get frustrated with late-game bosses that are bleed resistant and must be fought based on physical damage. These results give an overview of what bosses may be more difficult if one is doing a certain build, a nice alternative to Googling “what’s a good build” or “is Bleed the best build.” I personally plan to use this data for my next playthrough of Elden Ring.
Run/Ending Planning: Similar to choosing a build, some will know they want to pursue a certain ending or “completion run” where they fight everything. If someone is wanting to pursue a certain character’s ending or story line, they may need to fight Malenia, for example, and a Scarlet Rot focused build may increase the difficulty.
This was a fun project to practice and strengthen my skills, new and old, and answer questions I had about Elden Ring. The results of searching for builds feels less personal and I enjoy seeing a story unfold for myself. I plan on doing a similar project with Dark Souls 3, a more linear FromSoftware game.