A Python walkthrough for effective risk assessment
ESG stands for Environmental, Social, and Governance. These three factors measure a company’s sustainability and societal impact. They help investors evaluate how a company operates beyond just its financial performance, considering its effect on the environment, treatment of stakeholders, and ethical practices in governance.
Why is ESG Important?
ESG is crucial for responsible investing, providing a broader perspective of a company’s risks and opportunities. Companies with strong ESG practices often demonstrate resilience, lower regulatory risks, and higher employee satisfaction, contributing to long-term financial success. Investors are increasingly using ESG metrics to identify sustainable and ethical opportunities that align with their values and to avoid companies that might face challenges from poor practices.
How is it calculated?
ESG scores are calculated by analyzing a company’s performance on environmental, social, and governance metrics using data from reports, filings, and news. Rating agencies assign scores by assessing these metrics, weighting them by industry relevance, and benchmarking against peers to provide a final ESG risk or sustainability rating.
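To make the weighting and benchmarking steps more concrete, here is a minimal sketch. The tickers, pillar scores, and weights are all made up for illustration, and real rating agencies use their own methodologies, so treat this as a toy model rather than how any provider actually scores companies.

```python
import pandas as pd

# Hypothetical pillar scores (0-100) for three made-up peers in one industry
peers = pd.DataFrame(
    {"e": [70.0, 50.0, 60.0], "s": [60.0, 55.0, 80.0], "g": [80.0, 45.0, 70.0]},
    index=["AAA", "BBB", "CCC"],
)

# Hypothetical industry weights for each pillar; they sum to 1
weights = pd.Series({"e": 0.5, "s": 0.3, "g": 0.2})

# Weighted composite score per company
peers["esg"] = peers[["e", "s", "g"]].mul(weights, axis=1).sum(axis=1)

# Benchmark against peers: percentile rank within the group
peers["esg_pct_rank"] = peers["esg"].rank(pct=True)
print(peers)
```

In this toy setup the environmental pillar carries half of the weight, so a company that is strong on E ends up ahead of one that is strong on S.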
In this article, we will use the ESG data package from the EODHD and InvestVerte partnership to explore whether the ESG scores of S&P 500 companies are related to their risk.
Refer to the product page for more info about the API endpoint: EODHD ESG Data Package
Let’s Code: Python, ESG, and APIs
First, we need the S&P 500 constituents in a dataframe. For this, we will use the EODHD API for indices, as below:
import requests
import json
import pandas as pd
import numpy as np
import io
import os
import matplotlib.pyplot as plt
import seaborn as sns
token = '<YOUR API KEY>'
idx = "GSPC.INDX"
url = f'https://eodhd.com/api/mp/unicornbay/spglobal/comp/{idx}'
querystring = {"api_token":token,"fmt":"json"}
sp500_tickers = requests.get(url, params=querystring).json()
df_sp500 = pd.DataFrame.from_dict(sp500_tickers['Components'], orient='index')
df_sp500.set_index('Code', inplace=True)
Using the above, we obtain a dataframe with the tickers of the S&P 500, as well as their sectors, which we will use later on for further analysis.
Next, we will get the historical ESG data for each stock. After analyzing the dataset, we can see that the data available cover a full decade, so we will focus on the years 2014 through 2023.
def get_company_esg(ticker, year, frequency):
    # Fetch the ESG records of one ticker for a given year; return [] on any failure
    try:
        url = f'https://eodhd.com/api/mp/investverte/esg/{ticker}'
        querystring = {"api_token":token,"fmt":"json", 'year':year, 'frequency':frequency}
        response = requests.get(url, params=querystring)
        if response.status_code == 200:
            data = response.json()
        else:
            print(response.status_code, ticker, year, frequency)
            return []
    except Exception as e:
        print(ticker, year, frequency, e)
        return []
    # Tag each record with its ticker so the rows can be grouped later
    for r in data:
        r['Code'] = ticker
    return data

final_list = []
for idx, row in df_sp500.iterrows():
    for year in range(2014, 2024):  # 2014 through 2023
        data = get_company_esg(idx, year, 'FY')
        final_list.extend(data)
df_esg = pd.DataFrame(final_list)
df_esg
Using the above code, we will have a dataframe of around 5000 rows:
You will notice that the information provided contains everything we need to know for each year. Besides the overall ESG score, the API provides the individual scores for e (environmental), s (social), and g (governance).
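Before averaging anything, it can help to check how many companies actually have a score in each year, since zero scores are treated as missing in the next section. The sketch below uses a tiny made-up frame with the same column names the article's code relies on (Code, year, esg); the values are not real data.

```python
import pandas as pd

# Toy stand-in for df_esg (made-up values, same column names as in the article)
toy_esg = pd.DataFrame({
    "Code": ["AAA", "AAA", "BBB", "BBB"],
    "year": [2022, 2023, 2022, 2023],
    "esg": [60.0, 0.0, 55.0, 58.0],
})

# Count the companies with a non-zero ESG score in each year
coverage = toy_esg[toy_esg["esg"] != 0].groupby("year")["Code"].nunique()
print(coverage)
```

If the coverage changes a lot from one year to the next, a move in the yearly average may partly reflect which companies are included rather than a real change in scores.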
Changes in ESG scores over the years
An interesting thing to check before diving into more data is how the scores have moved over the years.
# Exclude the zero scores, assuming a zero means there was no score for that year
df_e = df_esg[df_esg['e'] != 0][['e', 'year', 'Code']]
df_s = df_esg[df_esg['s'] != 0][['s', 'year', 'Code']]
df_g = df_esg[df_esg['g'] != 0][['g', 'year', 'Code']]
df_esg_p = df_esg[df_esg['esg'] != 0][['esg', 'year', 'Code']]
# Create a figure
plt.figure(figsize=(10, 6))
# Plot each group on the same plot with a label
df_e.groupby('year')['e'].mean().plot(label='Environmental', linestyle='-', marker='o')
df_s.groupby('year')['s'].mean().plot(label='Social', linestyle='--', marker='x')
df_g.groupby('year')['g'].mean().plot(label='Governance', linestyle='-.', marker='^')
df_esg_p.groupby('year')['esg'].mean().plot(label='ESG', linestyle=':', marker='s')
# Add title, labels, legend, and grid
plt.title('Average ESG Metrics per Year')
plt.xlabel('Year')
plt.ylabel('Mean Value')
plt.legend(title='Metrics')
plt.grid(True)
# Display the plot
plt.show()
One noticeable feature is a spike in the environmental score in 2022, after which it returned to its usual level. A likely reason is that many large companies, especially those in the S&P 500, announced ambitious net-zero emissions goals around 2021–2022, following pressure from shareholders, activists, and global agreements like the Paris Agreement. The push toward decarbonization increased investments in renewable energy and sustainable practices, boosting their environmental scores. In any case, these scores did not remain consistently high.
Are ethical companies less risky?
In recent years, there’s been a growing movement among investors to include companies with high ESG scores in their portfolios. These companies are seen as better positioned to manage environmental, social, and governance risks, which can contribute to long-term financial performance and sustainability.
If this view holds, there should be a negative correlation between a stock’s risk and its ESG score: the higher the ESG score, the lower the risk, and vice versa. But is this true?
To check, we will first use the EODHD API for historical prices to calculate two commonly accepted risk metrics over the same period: the max drawdown and the volatility. The reasoning is that higher ESG-ranked stocks should attract less negative publicity and therefore have fewer reasons for their price to fluctuate.
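To see what these two metrics measure before pulling real prices, here is a small worked example on a made-up price series (not market data). Max drawdown is the worst drop from a running peak, and volatility is taken here as the standard deviation of period-over-period returns.

```python
import pandas as pd

# Made-up closing prices, for illustration only
prices = pd.Series([100.0, 120.0, 90.0, 108.0, 135.0])

# Max drawdown: worst percentage drop from the running peak
running_peak = prices.cummax()          # 100, 120, 120, 120, 135
drawdown = prices / running_peak - 1.0  # 0, 0, -0.25, -0.10, 0
max_dd = drawdown.min()

# Volatility: standard deviation of period-over-period returns
returns = prices.pct_change().dropna()
vol = returns.std()

print(f"max drawdown: {max_dd:.2%}, volatility: {vol:.4f}")
```

The max drawdown is negative by construction, so a value closer to zero means a milder worst-case loss.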
def get_ohlc(idx, start_date):
    # Download the daily end-of-day prices of one ticker as a dataframe
    url = f'https://eodhd.com/api/eod/{idx}'
    querystring = {"api_token":token,"fmt":"csv","from":start_date}
    response = requests.get(url, params=querystring)
    if response.status_code != 200:
        print(f"Error processing {idx}: {response.status_code}")
        return None
    df = pd.read_csv(io.StringIO(response.text))
    return df

def max_drawdown(prices):
    # Largest peak-to-trough decline of the price series (a negative number)
    roll_max = prices.cummax()
    drawdown = prices / roll_max - 1.0
    return drawdown.min()

def volatility(prices):
    # Standard deviation of the daily returns
    returns = prices.pct_change().dropna()
    return returns.std()

l_full = []
l_per_year = []
for stock in df_sp500.index:
    df = get_ohlc(stock, '2013-01-01')
    if df is None or df.empty:
        continue  # skip tickers whose prices could not be downloaded
    df['Date'] = pd.to_datetime(df['Date'])
    df.set_index('Date', inplace=True)
    df['year'] = df.index.year
    df = df[(df['year'] > 2012) & (df['year'] < 2024)]
    # Add full
    d = {"Code": stock}
    d['max_drawdown'] = max_drawdown(df['Adjusted_close'])
    d['volatility'] = volatility(df['Adjusted_close'])
    l_full.append(d)
    # Add per year
    for year in df['year'].unique():
        df_year = df[df['year'] == year]
        d = {"Code": stock}
        d['year'] = year
        d['max_drawdown'] = max_drawdown(df_year['Adjusted_close'])
        d['volatility'] = volatility(df_year['Adjusted_close'])
        l_per_year.append(d)
df_sp500_metrics = pd.DataFrame(l_full)
df_sp500_metrics_per_year = pd.DataFrame(l_per_year)
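The snippets that follow read from a dataframe called df_ranks, which is never built in the code shown here. One plausible construction, and this is only an assumption, is the mean of each company's non-zero ESG scores joined with the full-period risk metrics and the sector information. The sketch below uses small made-up frames in place of df_esg, df_sp500_metrics, and df_sp500; the Name and Sector columns are assumed to come from the index components response.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for df_esg, df_sp500_metrics and df_sp500 (made-up values)
toy_esg = pd.DataFrame({
    "Code": ["AAA", "AAA", "BBB", "BBB"],
    "year": [2022, 2023, 2022, 2023],
    "e": [60.0, 0.0, 40.0, 50.0],
    "s": [70.0, 80.0, 50.0, 50.0],
    "g": [65.0, 75.0, 45.0, 55.0],
    "esg": [65.0, 75.0, 45.0, 55.0],
})
toy_metrics = pd.DataFrame({
    "Code": ["AAA", "BBB"],
    "max_drawdown": [-0.30, -0.45],
    "volatility": [0.015, 0.022],
})
toy_sp500 = pd.DataFrame(
    {"Name": ["Alpha Corp", "Beta Inc"], "Sector": ["Technology", "Energy"]},
    index=pd.Index(["AAA", "BBB"], name="Code"),
)

score_cols = ["e", "s", "g", "esg"]

# Treat zeros as missing so they do not drag the averages down
scores = toy_esg[score_cols].replace(0, np.nan)
mean_scores = scores.groupby(toy_esg["Code"]).mean().reset_index()

# Hypothetical df_ranks: one row per company with mean scores, risk, and sector
df_ranks = (
    mean_scores
    .merge(toy_metrics, on="Code", how="inner")
    .merge(toy_sp500.reset_index(), on="Code", how="left")
)
print(df_ranks)
```

With the real data, the three toy frames would simply be replaced by df_esg, df_sp500_metrics, and df_sp500.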
First, let’s use the full metrics to see any potential correlation between the ESG scores and their risk metrics:
# Specify columns for X and Y axes
# x_columns = ['e', 's', 'g', 'esg', 'e_rank', 's_rank', 'g_rank', 'esg_rank']
x_columns = ['e', 's', 'g', 'esg']
y_columns = ['max_drawdown', 'volatility']
x_data = df_ranks[x_columns]
y_data = df_ranks[y_columns]
# Compute correlation between each pair of columns (X and Y)
correlation_matrix = pd.DataFrame(index=x_columns, columns=y_columns)
for x_col in x_columns:
    for y_col in y_columns:
        correlation_matrix.loc[x_col, y_col] = x_data[x_col].corr(y_data[y_col])
# Plotting the correlation matrix
plt.figure(figsize=(8, 6))
sns.heatmap(correlation_matrix.astype(float), annot=True, cmap='coolwarm', linewidths=0.5)
plt.title('Correlation Between ESG Scores and Risk Metrics')
plt.xlabel('Risk')
plt.ylabel('ESG')
plt.show()
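As an aside, the same ESG-versus-risk table can be computed without the nested loop by slicing the full correlation matrix. The sketch below also adds a Spearman (rank-based) version, which is not part of the original analysis but is less sensitive to outliers. The numbers are made up and only stand in for df_ranks.

```python
import pandas as pd

# Made-up values standing in for df_ranks
toy = pd.DataFrame({
    "e": [10.0, 20.0, 30.0, 40.0],
    "esg": [40.0, 10.0, 30.0, 20.0],
    "max_drawdown": [-0.5, -0.4, -0.3, -0.2],
    "volatility": [0.04, 0.03, 0.02, 0.01],
})
x_columns = ["e", "esg"]
y_columns = ["max_drawdown", "volatility"]

# Full correlation matrix, then keep only the ESG-vs-risk block
pearson = toy[x_columns + y_columns].corr().loc[x_columns, y_columns]

# Spearman correlates the ranks instead of the raw values
spearman = toy[x_columns + y_columns].corr(method="spearman").loc[x_columns, y_columns]

print(pearson)
print(spearman)
```

Either table can be passed to sns.heatmap in place of correlation_matrix.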
From the analysis, there is no meaningful correlation between the ESG scores and the risk metrics. So let’s dig a bit deeper and check what is going on by sector!
# Creating subplots with 4 box plots for 'e', 's', 'g', and 'esg'
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
fig.suptitle('Distribution of e, s, g, and esg by Sector', fontsize=16)
# Box plot for 'e'
sns.boxplot(ax=axes[0, 0], x='Sector', y='e', data=df_ranks)
axes[0, 0].set_title('e by Sector')
axes[0, 0].tick_params(axis='x', rotation=45)
# Box plot for 's'
sns.boxplot(ax=axes[0, 1], x='Sector', y='s', data=df_ranks)
axes[0, 1].set_title('s by Sector')
axes[0, 1].tick_params(axis='x', rotation=45)
# Box plot for 'g'
sns.boxplot(ax=axes[1, 0], x='Sector', y='g', data=df_ranks)
axes[1, 0].set_title('g by Sector')
axes[1, 0].tick_params(axis='x', rotation=45)
# Box plot for 'esg'
sns.boxplot(ax=axes[1, 1], x='Sector', y='esg', data=df_ranks)
axes[1, 1].set_title('esg by Sector')
axes[1, 1].tick_params(axis='x', rotation=45)
# Adjust layout for better spacing
plt.tight_layout(rect=[0, 0, 1, 0.96])
plt.show()
From the box plots above, we can see no notable differences between the sectors. In sectors like healthcare, the typical score sits below that of other sectors, yet there are outliers at both the top and the bottom of the range.
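The visual reading of the box plots can be cross-checked with a small summary table per sector. This is a minimal sketch on made-up numbers; with the real data, the toy frame would be replaced by the dataframe used for the box plots.

```python
import pandas as pd

# Made-up scores, for illustration only
toy = pd.DataFrame({
    "Sector": ["Healthcare", "Healthcare", "Healthcare", "Energy", "Energy"],
    "esg": [40.0, 50.0, 90.0, 55.0, 65.0],
})

# Count, median, and range of the ESG score per sector
summary = toy.groupby("Sector")["esg"].agg(["count", "median", "min", "max"])
print(summary)
```

A sector with a low median but a wide min-max range is exactly the "low typical score, strong outliers" pattern described above.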
The Top 10
Because of these interesting outliers, let’s find out what is going on with the top 10 companies of the S&P 500 according to their mean ESG score over the last few years.
# Get the top 10 by ESG score and check where their risk metrics stand within their sector
top_ten_esg = df_ranks.nlargest(10, 'esg')
# Calculate the mean max drawdown and the mean volatility for each sector
sector_stats = df_ranks.groupby('Sector').agg(
    mean_max_drawdown=('max_drawdown', 'mean'),
    mean_volatility=('volatility', 'mean')
)
top_ten_with_sector_stats = top_ten_esg.merge(sector_stats, on='Sector', how='left')
# Both differences are oriented so that a positive value means less risky than the sector average
top_ten_with_sector_stats['max_drawdown_diff'] = top_ten_with_sector_stats['max_drawdown'] - top_ten_with_sector_stats['mean_max_drawdown']
top_ten_with_sector_stats['volatility_diff'] = top_ten_with_sector_stats['mean_volatility'] - top_ten_with_sector_stats['volatility']
top_ten_with_sector_stats[['Code', 'Name', 'Sector','max_drawdown', 'mean_max_drawdown','max_drawdown_diff', 'mean_volatility', 'volatility', 'volatility_diff']]
df = top_ten_with_sector_stats.copy()
# Create subplots
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
# Subplot 1: Bar Plot for Max Drawdown Differences
axes[0].bar(df['Code'], df['max_drawdown_diff'], color='skyblue', alpha=0.7)
axes[0].set_title('Max Drawdown Differences')
axes[0].set_xlabel('Stock')
axes[0].set_ylabel('Max Drawdown Difference')
axes[0].tick_params(axis='x', rotation=45)
# Subplot 2: Bar Plot for Volatility Differences
axes[1].bar(df['Code'], df['volatility_diff'], color='salmon', alpha=0.7)
axes[1].set_title('Volatility Differences')
axes[1].set_xlabel('Stock')
axes[1].set_ylabel('Volatility Difference')
axes[1].tick_params(axis='x', rotation=45)
# Layout adjustment
plt.tight_layout()
plt.show()
The above code takes the top 10 companies by ESG score and compares their risk metrics with their sector averages. The result is very interesting: except for one stock for max drawdown and two stocks for volatility, all the top-10 stocks are less risky than their sector peers.
Spoiler alert: the individual E, S, and G scores do not show the same behavior, which suggests that commitment to all three ESG pillars is more beneficial than excelling in only one. You can check this simply by changing the code to the respective metric:
top_ten_esg = df_ranks.nlargest(10, 'esg')
# change 'esg' to 'e', 's' or 'g'
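Instead of editing the metric by hand each time, the comparison can be wrapped in a small helper. The function name compare_to_sector and its parameters are hypothetical, and the data below is made up; the sketch only mirrors the logic of the top-10 code above.

```python
import pandas as pd

def compare_to_sector(df, metric, n=10, largest=True):
    """Pick the n highest (or lowest) companies by `metric` and compare
    their risk metrics with their sector averages."""
    picked = df.nlargest(n, metric) if largest else df.nsmallest(n, metric)
    sector_stats = df.groupby("Sector").agg(
        mean_max_drawdown=("max_drawdown", "mean"),
        mean_volatility=("volatility", "mean"),
    )
    out = picked.merge(sector_stats, left_on="Sector", right_index=True, how="left")
    # Positive values mean less risky than the sector average in both columns
    out["max_drawdown_diff"] = out["max_drawdown"] - out["mean_max_drawdown"]
    out["volatility_diff"] = out["mean_volatility"] - out["volatility"]
    return out[["Code", "Sector", "max_drawdown_diff", "volatility_diff"]]

# Made-up values standing in for df_ranks
toy = pd.DataFrame({
    "Code": ["AAA", "BBB", "CCC", "DDD"],
    "Sector": ["Tech", "Tech", "Energy", "Energy"],
    "e": [80.0, 40.0, 30.0, 70.0],
    "esg": [75.0, 50.0, 45.0, 60.0],
    "max_drawdown": [-0.2, -0.4, -0.5, -0.3],
    "volatility": [0.01, 0.03, 0.04, 0.02],
})

for metric in ["e", "esg"]:
    print(metric)
    print(compare_to_sector(toy, metric, n=2))
```

Passing largest=False selects the lowest scorers instead, which is the same switch from nlargest to nsmallest used for the bottom 10.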
The Bottom 10
But what about the bottom 10 companies? Simply change “nlargest” to “nsmallest” in the above code and you will get the plot below.
There appears to be no significant relationship for the bottom 10 stocks. Investors do not seem to penalize the bottom ESG performers as much as they reward the top 10.
Conclusion
Across the whole S&P 500, ESG scores show no meaningful correlation with volatility or max drawdown, but the companies with the very highest ESG scores tend to have lower risk.
Top 10 ESG performers demonstrate lower risk compared to sector peers, especially when excelling across all three ESG pillars.
Focusing solely on one ESG aspect (environmental, social, or governance) does not significantly reduce risk.
The bottom 10 ESG performers do not face as strong penalties in terms of risk metrics, indicating a skew towards rewarding top performers.
Investors can lower portfolio risk by including high-ESG companies, but a balanced ESG approach across all metrics is key.
Further Research
If you want to dive into ESG score analysis and explore the data:
Gather ESG and financial data: Use APIs like EODHD to collect ESG scores and risk metrics (e.g., volatility, max drawdown) for companies over time.
Identify patterns: Analyze the relationship between ESG scores and financial performance, comparing industries or sectors to find correlations or trends.
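One possible starting point for such an analysis is to correlate ESG scores with risk year by year, using the per-year risk metrics computed earlier. The sketch below runs on made-up frames that mimic the columns of df_esg and df_sp500_metrics_per_year. With the real data, the year column may need to be cast to the same type in both frames before merging.

```python
import pandas as pd

# Made-up values mimicking df_esg and df_sp500_metrics_per_year
toy_esg = pd.DataFrame({
    "Code": ["AAA", "BBB", "CCC"] * 2,
    "year": [2022] * 3 + [2023] * 3,
    "esg": [70.0, 50.0, 60.0, 40.0, 60.0, 50.0],
})
toy_risk = pd.DataFrame({
    "Code": ["AAA", "BBB", "CCC"] * 2,
    "year": [2022] * 3 + [2023] * 3,
    "volatility": [0.01, 0.03, 0.02, 0.01, 0.03, 0.02],
})

# Align each company's ESG score with its risk metric for the same year
merged = toy_esg.merge(toy_risk, on=["Code", "year"], how="inner")

# Correlation between ESG score and volatility within each year
yearly_corr = merged.groupby("year")[["esg", "volatility"]].apply(
    lambda g: g["esg"].corr(g["volatility"])
)
print(yearly_corr)
```

A correlation that changes sign or size from year to year would suggest that a single full-period number hides part of the picture.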
I hope you enjoyed the article! If you found this useful, please clap and share your thoughts below!
Disclaimer: While we explore the exciting world of investing in this article, it’s crucial to note that the information provided is for educational purposes only. I’m not a financial advisor, and the content here doesn’t constitute financial advice. Always do your research and consider consulting with a professional before making any investment decisions.