Nikhil Adithyan

Finding Undervalued & Overvalued Sectors Using Python

Updated: Dec 5

A unique approach to finding hidden gems with simple metrics



Introduction

In the world of investing, determining whether a sector is overvalued or undervalued is essential for making sound investment decisions. One of the most effective ways to gauge this is by comparing the current Price-to-Earnings (PE) ratios of sectors to their historical averages. By analyzing these deviations, we can identify sectors that may present investment opportunities or risks based on their valuation trends.


In this article, we’ll explore a data-driven approach to uncover which sectors are trading significantly above or below their historical norms. Using FinancialModelingPrep’s Sector PE Ratio API, we’ll gather historical sector PE data and apply statistical techniques such as z-scores to assess how far current valuations have deviated from their historical benchmarks.


This guide will cover the process of calculating historical averages, computing valuation deviations, and visualizing sectors that appear either overvalued or undervalued.


Without further ado, let’s dive into the article!


Understanding the Metrics

Before diving into the analysis, it’s essential to understand the metrics that will form the backbone of our sector valuation study. These metrics will help us quantify whether a sector is currently overvalued or undervalued compared to its historical averages.


1. Price-to-Earnings (PE) Ratio

The Price-to-Earnings (PE) ratio is one of the most widely used metrics in valuation. It measures the price of a stock (or sector) relative to its earnings, providing insight into how much investors are willing to pay for each dollar of earnings. For sectors, the PE ratio is the weighted average of the PE ratios of all the companies in that sector.


Formula:

$$\text{PE Ratio} = \frac{\text{Price per Share}}{\text{Earnings per Share (EPS)}}$$

where,


  • Price per Share: The market price of a company’s stock or the collective stock in the sector.

  • Earnings per Share (EPS): The net income of a company (or sector in aggregate) divided by the number of outstanding shares.


A higher PE ratio generally indicates that investors expect future growth, while a lower PE ratio could signal that a sector is undervalued or out of favor.
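As a quick illustration of the formula, here is a minimal sketch that computes a PE ratio from a price and an EPS figure. The function name pe_ratio and the sample numbers are made up for this example, and returning None for zero or negative earnings is just one common convention, not something taken from the FMP API.

```python
def pe_ratio(price_per_share, earnings_per_share):
    """Return price / EPS, or None when EPS is zero or negative (ratio not meaningful)."""
    if earnings_per_share <= 0:
        return None
    return price_per_share / earnings_per_share


# Hypothetical numbers: a $150 stock earning $6 per share
print(pe_ratio(150.0, 6.0))   # 25.0
print(pe_ratio(150.0, -2.0))  # None
```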


2. Historical PE Ratio

This is simply the average of the PE ratios over a historical period. For our analysis, the historical PE ratio serves as a benchmark against which we compare the current sector PE ratios. By comparing the current ratio to its historical average, we can gauge whether a sector is trading at a premium (overvalued) or at a discount (undervalued).


Formula:

$$\text{Historical PE} = \frac{1}{n}\sum_{t=1}^{n} PE_t$$

where PE_t represents the PE ratio at time t, and n is the number of time periods used to compute the average.


3. Standard Deviation

Standard deviation is a measure of the amount of variation or dispersion in a set of data points. In the context of PE ratios, it tells us how much the sector’s PE ratio fluctuates around its historical average. A higher standard deviation indicates more volatility in sector valuations, while a lower standard deviation suggests more consistency.


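Formula:

$$\text{Standard Deviation} = \sqrt{\frac{1}{n-1}\sum_{t=1}^{n}\left(PE_t - \text{Historical PE}\right)^2}$$

This is the sample form (dividing by n − 1), which is what Pandas’ std() computes by default.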

4. Z-Score (Valuation Deviation)

The z-score is a statistical measure that quantifies the distance (in standard deviations) of a data point from the mean of the data set. In this context, the z-score helps us measure how far the current PE ratio of a sector is from its historical average, relative to the historical volatility (standard deviation).


Formula:

$$Z = \frac{\text{Current PE} - \text{Historical PE}}{\text{Standard Deviation}}$$

where,

  • Current PE is the sector’s present-day PE ratio.

  • Historical PE is the average PE ratio over a chosen historical period.

  • Standard Deviation is the amount of fluctuation in the PE ratios over that historical period.


A positive z-score means the current PE is above its historical average, suggesting the sector may be overvalued, while a negative z-score means the current PE is below the average, suggesting the sector may be undervalued. The magnitude matters too: a z-score of 2 is a far stronger signal than a z-score of 0.2.
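To make the three statistics concrete, here is a small worked example on a made-up PE series. The numbers are invented, and treating the last observation as the "current" PE is an assumption made for this sketch.

```python
import pandas as pd

# Hypothetical PE ratios for one sector (illustrative numbers only)
pe_history = pd.Series([18.0, 20.0, 22.0, 20.0, 25.0])

historical_pe = pe_history.mean()   # historical average
std_dev = pe_history.std()          # sample standard deviation (ddof=1)
current_pe = pe_history.iloc[-1]    # treat the last observation as "current"

z_score = (current_pe - historical_pe) / std_dev
print(f'mean={historical_pe:.2f}, std={std_dev:.2f}, z={z_score:.2f}')
# mean=21.00, std=2.65, z=1.51
```

Here the current PE of 25 sits about 1.5 standard deviations above the average of 21, which would read as a fairly stretched valuation under this approach.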


Approach Overview: Sector PE Ratio vs. Historical Benchmarks

As we now have a strong understanding of the metrics involved, let’s dive into the detailed approach for identifying overvalued and undervalued sectors. Our goal is to evaluate sector valuations by comparing their current Price-to-Earnings (PE) ratios to their historical benchmarks.


In this approach, we will follow a structured process:


  1. Data Collection: We will extract historical PE ratio data for various sectors over a specific period. This data will provide a comprehensive view of how sector valuations have shifted over time.

  2. Historical Benchmark Calculation: Once the data is gathered, we will calculate the historical mean and standard deviation of PE ratios for each sector. These benchmarks serve as a reference to evaluate whether the current PE ratio is aligned with or deviates from the historical norm.

  3. Z-Score Calculation: Using the current PE ratio along with its historical mean and standard deviation, we will calculate the z-score for each sector. This will allow us to quantify how far the current valuation deviates from its historical average. A high positive z-score indicates overvaluation, while a negative z-score suggests undervaluation.


By following this approach, we will be able to highlight sectors that are trading at extreme valuations relative to their historical benchmarks.

Now, it’s time to implement each of these steps with Python.


Python Implementation


1. Importing Required Packages

The first step is to import all the required packages into our Python environment. In this article, we’ll be using five packages:


  • Pandas: for formatting, cleaning, and manipulating data

  • Matplotlib and Seaborn: for creating charts and other visualizations

  • Requests: for making API calls to extract data

  • datetime: for date-related functions and operations


The following code imports all the above-mentioned packages into our Python environment:



# IMPORTING PACKAGES

import requests
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime

If you haven’t installed any of the imported packages, make sure to do so using the pip command in your terminal.


2. Extracting Historical PE Ratio Data

In this section, we will extract sector-wise PE ratio data using FinancialModelingPrep’s (FMP) Sector PE Ratio API. Although the API itself does not provide historical data directly, it allows users to retrieve the PE ratios for a specific date.


To obtain historical PE ratios over a period of time, we will take a different approach by creating a sequence of dates, making API calls for each date, and then compiling the data into a single DataFrame for further analysis.


Generating a Sequence of Dates:

To fetch the PE ratios over time, we first need to generate a sequence of dates between the start and end periods. This function, generate_date_sequence, creates a list of dates in the YYYY-MM-DD format, which we will use to query the API.



def generate_date_sequence(start_date, end_date):
    date_list = pd.date_range(start=start_date, end=end_date).tolist()
    return [date.strftime('%Y-%m-%d') for date in date_list]

The function uses Pandas’ pd.date_range() method to generate a range of dates between the specified start_date and end_date. The dates are then converted to a string format (YYYY-MM-DD) to ensure compatibility with the API query structure.
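Since stock markets are closed on weekends, one possible variation is to generate business days only, which cuts the number of API calls by roughly two sevenths. This is a sketch rather than part of the original workflow: the function name generate_business_date_sequence is made up, how the API responds for weekend dates has not been verified here, and the 'B' frequency knows nothing about market holidays.

```python
import pandas as pd


def generate_business_date_sequence(start_date, end_date):
    """Like generate_date_sequence, but keeps Monday-to-Friday dates only."""
    date_list = pd.date_range(start=start_date, end=end_date, freq='B')
    return [date.strftime('%Y-%m-%d') for date in date_list]


# 2023-01-01 is a Sunday, so the first date returned is Monday the 2nd
print(generate_business_date_sequence('2023-01-01', '2023-01-08'))
# ['2023-01-02', '2023-01-03', '2023-01-04', '2023-01-05', '2023-01-06']
```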


Fetching Sector PE Ratios for Each Date:

Next, we define the fetch_sector_pe_ratio function, which retrieves the PE ratio data for a specific date using FMP’s API. The function sends an API request with the date and returns the PE ratios for various sectors on that date.



def fetch_sector_pe_ratio(date):
    api_key = 'YOUR FMP API KEY'
    url = f'https://financialmodelingprep.com/api/v4/sector_price_earning_ratio?date={date}&exchange=NYSE&apikey={api_key}'
    response = requests.get(url)
    if response.status_code == 200:
        return response.json()
    else:
        return None

This function constructs the API URL by inserting the provided date and api_key. It then makes a GET request using the requests library. If the request is successful (status_code == 200), the response is returned in JSON format. If there is an issue, the function returns None, allowing the main process to handle any missing data. Make sure to replace YOUR FMP API KEY with your actual FMP API key which you can obtain by opening a developer account.
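A slightly more defensive variant might look like the sketch below. It assumes the same endpoint and query parameters as the function above. The names fetch_sector_pe_ratio_safe and FMP_API_KEY (an environment variable) are hypothetical choices for this example, and the get argument exists only so the function can be exercised without a network connection.

```python
import os
import requests

BASE_URL = 'https://financialmodelingprep.com/api/v4/sector_price_earning_ratio'


def fetch_sector_pe_ratio_safe(date, api_key=None, exchange='NYSE', get=requests.get):
    """Return the parsed JSON for one date, or None on any request failure."""
    # FMP_API_KEY is a hypothetical environment variable name chosen for this sketch
    api_key = api_key or os.environ.get('FMP_API_KEY')
    params = {'date': date, 'exchange': exchange, 'apikey': api_key}
    try:
        # A timeout keeps one stalled request from hanging a loop over hundreds of dates
        response = get(BASE_URL, params=params, timeout=10)
    except requests.RequestException:
        return None  # network error, timeout, etc.
    if response.status_code != 200:
        return None
    return response.json()
```

Passing the query values through params lets Requests handle URL encoding, and reading the key from the environment keeps it out of the source file.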


Compiling the Historical PE Data:

Finally, we use the create_historical_pe_df function to loop through each date in the sequence, calling the fetch_sector_pe_ratio function for each date. The results are compiled into a DataFrame, with the dates set as the index and sectors as columns.



def create_historical_pe_df(start_date, end_date):
    dates = generate_date_sequence(start_date, end_date)
    rows = []  # one dict of sector PE ratios per date

    for date in dates:
        sector_pe = fetch_sector_pe_ratio(date)
        if sector_pe:  # skip failed requests and empty responses
            pe_dict = {entry['sector']: float(entry['pe']) for entry in sector_pe}
            pe_dict['date'] = date
            rows.append(pe_dict)

    # DataFrame.append was removed in pandas 2.0, so build the frame in one go
    df_pe = pd.DataFrame(rows)
    df_pe.set_index('date', inplace=True)
    return df_pe

The function first generates a sequence of dates using generate_date_sequence. For each date, it calls fetch_sector_pe_ratio to obtain the PE ratios for that specific date. The data is stored in a dictionary (pe_dict) where each sector is a key and its PE ratio is the value, and each dictionary is collected in a list (rows). Once all the dates have been processed, the list is converted into a DataFrame (df_pe) and the date column is set as the index.


This approach allows us to compile a historical dataset of sector PE ratios by making individual API calls for each day in the specified date range.
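To see the shape of the resulting table without spending any API calls, here is a minimal sketch that runs the same compilation logic on made-up payloads. The payload structure (a list of dictionaries with 'sector' and 'pe' keys) is inferred from the fields the function above reads, and every number is invented.

```python
import pandas as pd

# Made-up payloads keyed by date, mimicking the 'sector' and 'pe' fields
fake_payloads = {
    '2023-01-03': [{'sector': 'Energy', 'pe': '8.5'}, {'sector': 'Technology', 'pe': '24.0'}],
    '2023-01-04': [{'sector': 'Energy', 'pe': '8.7'}, {'sector': 'Technology', 'pe': '23.5'}],
    '2023-01-05': None,  # simulates a failed request, which is skipped
}

rows = []
for date, sector_pe in fake_payloads.items():
    if sector_pe:
        row = {entry['sector']: float(entry['pe']) for entry in sector_pe}
        row['date'] = date
        rows.append(row)

# One row per date, one column per sector
df_demo = pd.DataFrame(rows).set_index('date')
print(df_demo)
```

The failed date simply does not appear in the index, which is why the number of rows in the real dataset can be smaller than the number of dates requested.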


Finally, we call the create_historical_pe_df function to generate the historical data for the chosen period:



start_date = '2023-01-01'
end_date = '2023-12-31'

df_historical_pe = create_historical_pe_df(start_date, end_date)

df_historical_pe.tail()

This step collects the historical PE ratio data for all sectors between January 2023 and December 2023, resulting in a DataFrame where each row represents a date and each column represents a sector. The end result looks like this:



3. Calculating Historical Mean and Standard Deviation

In this section, we calculate the historical benchmarks of sector PE ratios by determining the mean and standard deviation for each sector over the chosen time period. These benchmarks will serve as reference points to evaluate whether current sector valuations are significantly above or below their historical norms.



# CALCULATE HISTORICAL MEAN & STANDARD DEVIATION

sector_means = df_historical_pe.mean()
sector_stddev = df_historical_pe.std()

print('Historical PE Ratio Means:')
print(sector_means)
print('Historical PE Ratio Standard Deviations:')
print(sector_stddev)

The code uses the Pandas mean() and std() functions to compute the mean and standard deviation of the PE ratios for each sector over the defined time period.


The historical mean provides an average PE ratio for each sector, while the standard deviation gives us an idea of the variability in sector PE ratios. These statistics will later be used to calculate the z-scores, helping us understand how far the current PE ratios deviate from historical norms.
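For a compact side-by-side view of both benchmarks, one option is Pandas’ agg method. The sketch below uses an invented two-sector table, so the names df_toy and benchmarks are illustrative only.

```python
import pandas as pd

# Toy PE history for two sectors (invented numbers)
df_toy = pd.DataFrame(
    {'Energy': [8.0, 9.0, 10.0], 'Technology': [20.0, 25.0, 30.0]},
    index=['2023-01-03', '2023-01-04', '2023-01-05'],
)

# One row per statistic, one column per sector
benchmarks = df_toy.agg(['mean', 'std'])
print(benchmarks)
```

Both mean() and std() skip missing values by default, so a date with no data for one sector does not break the calculation for that sector.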


This is the output of the above code:



This output shows the historical PE ratio means and standard deviations for each sector, which will be essential for computing z-scores in the next section.


4. Calculating Z-Scores for Each Sector

Now that we have the historical means and standard deviations for each sector, we will calculate the z-scores for the current PE ratios. Z-scores help us measure how far a sector’s current PE ratio deviates from its historical average, allowing us to identify sectors that may be overvalued or undervalued.



# CALCULATE Z-SCORES

def calculate_z_scores(df_pe, sector_means, sector_stddev):
    z_scores = (df_pe - sector_means) / sector_stddev
    return z_scores

z_scores = calculate_z_scores(df_historical_pe, sector_means, sector_stddev)

print('Z-scores for the most recent date:')
print(z_scores.iloc[-1])

In this code, we define a function calculate_z_scores to compute the z-scores for each sector by subtracting the historical mean from the current PE ratio and dividing the result by the standard deviation. The function is applied across all the historical PE data (df_historical_pe) to calculate the z-scores for every date and sector. The z_scores.iloc[-1] command prints the z-scores for the most recent date in the dataset, giving us a snapshot of current sector valuations relative to their historical norms.


This is the output of the above code:



This output shows the z-scores for each sector on the most recent date. Positive z-scores indicate that a sector’s PE ratio is above its historical average (potential overvaluation), while negative z-scores suggest the sector is trading below its historical average (potential undervaluation).
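If a text summary is handier than reading raw numbers, the z-scores can be turned into labels. The sketch below is one possible way to do it: the function name classify_sectors, the cutoff of 1 standard deviation, and the sample z-scores are all assumptions made for illustration, and the cutoff should be tuned to taste.

```python
import pandas as pd


def classify_sectors(latest_z, threshold=1.0):
    """Label each sector by comparing its z-score to +/- threshold."""
    labels = pd.Series('fairly valued', index=latest_z.index)
    labels[latest_z > threshold] = 'overvalued'
    labels[latest_z < -threshold] = 'undervalued'
    return labels


# Invented z-scores for illustration
latest = pd.Series({'Industrials': 1.8, 'Utilities': -1.4, 'Energy': 0.3})
print(classify_sectors(latest))
```

With real data, the same function could be called on z_scores.iloc[-1].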


5. Identifying Overvalued and Undervalued Sectors

In this section, we will visualize the z-scores for the most recent date to identify which sectors are potentially overvalued or undervalued. By plotting the z-scores, we can clearly see how each sector’s current valuation deviates from its historical average.



# FINDING UNDERVALUED/OVERVALUED SECTORS

plt.figure(figsize=(12, 6))
sns.barplot(x=z_scores.columns, y=z_scores.iloc[-1], palette='coolwarm')
plt.axhline(0, color='black', linestyle='--')
plt.title('Sector Valuation Deviation (Z-Scores) - Most Recent Date')
plt.ylabel('Z-Score')
plt.xticks(rotation=45)
plt.show()

In this code, we use Seaborn’s barplot function to create a bar chart showing the z-scores for each sector. The x-axis represents the sectors, while the y-axis represents the z-scores. The coolwarm palette gives each bar its own color. Note that these colors follow the order of the sectors on the x-axis rather than the z-score values, so it is the height and direction of each bar that convey overvaluation or undervaluation. The axhline(0) call draws a horizontal line at a z-score of 0, which represents the historical average, helping us visually differentiate overvalued sectors (positive z-scores) from undervalued ones (negative z-scores).
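If you would rather have the colors themselves carry the signal, one alternative is to sort the sectors and color each bar by the sign of its z-score using plain Matplotlib. This is a sketch, not the chart used in this article: the helper names sign_colors and plot_latest_z_scores, the color choices, and the sample values are all made up.

```python
import pandas as pd
import matplotlib.pyplot as plt


def sign_colors(values, positive='firebrick', negative='steelblue'):
    """Return one color per value: `positive` for z > 0, `negative` otherwise."""
    return [positive if v > 0 else negative for v in values]


def plot_latest_z_scores(latest_z, ax=None):
    """Draw a bar chart of z-scores, sorted ascending and colored by sign."""
    latest_z = latest_z.sort_values()
    if ax is None:
        _, ax = plt.subplots(figsize=(12, 6))
    ax.bar(latest_z.index, latest_z.values, color=sign_colors(latest_z.values))
    ax.axhline(0, color='black', linestyle='--')
    ax.set_ylabel('Z-Score')
    ax.tick_params(axis='x', labelrotation=45)
    return ax


# Invented z-scores for illustration
latest = pd.Series({'Industrials': 1.8, 'Utilities': -1.4, 'Energy': 0.3})
print(sign_colors(latest.sort_values().values))
# ['steelblue', 'firebrick', 'firebrick']
```

Calling plot_latest_z_scores(z_scores.iloc[-1]) followed by plt.show() would render the chart for the real data.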


Here’s the output:



Bar chart of sector z-scores for the most recent date (Image by Author)

The chart provides a clear visual representation of how different sectors are currently valued relative to their historical averages. Sectors like Industrials and Real Estate have z-scores significantly above 1, indicating that these sectors are potentially overvalued. Investors may consider this as a sign to be cautious, as prices in these sectors might be inflated compared to historical norms.


On the other hand, sectors like Consumer Defensive and Utilities have z-scores below 0, with Utilities showing a particularly negative z-score. This suggests these sectors are undervalued relative to their historical PE ratios, which could indicate potential buying opportunities.


Sectors such as Basic Materials and Communication Services have positive but less extreme z-scores, signaling moderate overvaluation.


6. Tracking Valuation Changes Over Time

In this section, we will visualize how sector valuations have evolved over time by tracking the z-scores for selected sectors. This helps to identify trends and patterns, providing insights into whether certain sectors have consistently been overvalued or undervalued over a given period.



# TRACKING VALUATION CHANGES

z_scores['Technology'].plot(label='Technology Sector', color='blue')
z_scores['Energy'].plot(label='Energy Sector', color='green')
plt.axhline(0, color='black', linestyle='--')
plt.title('Sector Valuation Deviation Over Time (Z-Scores)')
plt.ylabel('Z-Score')
plt.legend()
plt.show()

This code visualizes the z-score evolution for two specific sectors — Technology and Energy — over the selected time period. The z-scores are plotted over time to show how their valuation deviation from historical norms changes. The horizontal line at 0 represents the historical average, allowing us to easily see whether the sector was overvalued or undervalued at any point in time.
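One practical detail: the dates were stored as strings, so Pandas treats them as plain labels when plotting. Converting the index to real timestamps lets the x-axis space the observations by calendar time. The sketch below shows the conversion on an invented table. The name z_demo is illustrative, and applying the same line to z_scores is a suggestion rather than part of the code above.

```python
import pandas as pd

# Invented z-scores indexed by date strings
z_demo = pd.DataFrame(
    {'Technology': [0.5, -0.2, 1.1], 'Energy': [-0.8, -0.5, 0.1]},
    index=['2023-01-03', '2023-01-04', '2023-01-09'],
)

# Convert the string index to timestamps so plots space dates by calendar time
z_demo.index = pd.to_datetime(z_demo.index)
print(type(z_demo.index).__name__)
```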


This is the generated line chart:



The line chart illustrates how the valuation deviations of the Technology and Energy sectors have evolved over the selected time period, represented by their z-scores.


  • The Technology sector (blue line) shows high volatility, with frequent shifts above and below the historical average (z-score = 0). This suggests that the sector has experienced multiple periods of overvaluation and undervaluation relative to its historical norms. Around the middle of the time frame, the Technology sector experiences a sharp drop, potentially indicating a market correction or a shift in investor sentiment. However, it recovers quickly and continues to hover above the average, suggesting periods of significant overvaluation.


  • The Energy sector (green line), in contrast, shows more consistent behavior, with the z-score frequently in negative territory, particularly in the earlier part of the period. This suggests that the Energy sector has been mostly undervalued compared to its historical average. However, there is a noticeable upward trend toward the end of the time frame, indicating that the Energy sector’s valuation is catching up, though still below the overvalued levels seen in Technology.


This comparison allows investors to monitor how sector valuations fluctuate over time and make decisions accordingly. For instance, the Technology sector’s tendency to oscillate between overvalued and undervalued states may provide opportunities for tactical entry and exit points. On the other hand, the Energy sector’s undervaluation might suggest potential long-term opportunities if the trend of recovery continues.


Conclusion

Here we are at the end of an insightful exploration! We began by setting up the Python environment and fetching historical PE ratio data for various sectors using FinancialModelingPrep’s Sector PE Ratio API. After calculating historical means and standard deviations, we applied z-scores to identify which sectors are currently trading at valuations above or below their historical norms. Finally, we visualized and analyzed the results to uncover potential opportunities or risks in the market.


There’s certainly room for further exploration. Incorporating other valuation metrics, such as Price-to-Sales or Dividend Yield, could provide even deeper insights into sector valuations. Additionally, comparing sector valuations across different global exchanges or testing for cyclical patterns could refine the analysis and enhance investment decision-making.


With that being said, you’ve reached the end of the article. Hope you learned something new and useful today. Thank you very much for your time.


Disclaimer: While we explore the exciting world of investing in this article, it’s crucial to note that the information provided is for educational purposes only. I’m not a financial advisor, and the content here doesn’t constitute financial advice. Always do your research and consider consulting with a professional before making any investment decisions.
