Predicting the Market Using Economic Indicators with Python

Can GDP, Inflation, and Friends predict the market?

When it comes to the stock market, most of us are left scratching our heads, wondering if the whole thing is run by an army of economists, a fleet of fortune tellers, or perhaps even a mix of both. Every day, numbers bounce around the screen — stocks go up, stocks go down, and somewhere in the background, economic forces nudge them along. Enter the world of macroeconomic indicators: the GDP, unemployment rates, inflation, and more — each one like a loud cheerleader or a silent saboteur influencing market moves in ways that are sometimes predictable and, other times, hilariously unpredictable.

But here’s the big question: do these economic indicators hold the secrets to predicting what the market will do? Over the last 20 years, these indicators have been reporting the nation’s economic pulse, from job growth to inflation jitters. So, in this post, we’re diving deep to see if they can help us decode the stock market’s quirks and swings. We’ll unpack the most-watched indicators, look at their long-term flirtation with the stock market indices, and try to answer the age-old question: are these indicators useful for investors, or are they just more noise in the market’s chaotic concert?

Ready to roll? Let us jump in and see if we can find some actual wisdom amidst the stock market’s dramatic highs and lows. And don’t worry, no economics degree is required — just a sense of humor and a little curiosity!

How this article will flow

First, we will gather the macro indicators of the US economy
Then, we will gather the prices of stock market indices for various sectors
After that, we will calculate the correlations between the indices and the macros
Finally, we will investigate the pairs that seem to move together (or in opposite directions) most strongly and try to explain why this is happening.

Ready for the fun? Let’s start:

Alert: boring Python imports ahead.


import requests
import requests_cache
import json
import pandas as pd
import numpy as np
import io
import os
import matplotlib.pyplot as plt
import seaborn as sns

token = '<Your EODHD API key>'

Macro indicators

EODHD provides a very cool endpoint called the Macro Indicators API where you can get the macro indicators of various countries around the globe. In this article, we will focus on the US economy because, let us face it, this is the mother of all economies.

First, we will create a list of those indicators, which we will later use to gather the data.


list_of_macro_indicators = [
    'real_interest_rate',  # Real interest rate (%)
    'population_total',  # Population, total
    'population_growth_annual',  # Population growth (annual %)
    'inflation_consumer_prices_annual',  # Inflation, consumer prices (annual %)
    'consumer_price_index',  # Consumer Price Index (2010 = 100)
    'gdp_current_usd',  # GDP (current US$)
    'gdp_per_capita_usd',  # GDP per capita (current US$)
    'gdp_growth_annual',  # GDP growth (annual %)
    'debt_percent_gdp',  # Debt in percent of GDP (annual %)
    'net_trades_goods_services',  # Net trades in goods and services (current US$)
    'inflation_gdp_deflator_annual',  # Inflation, GDP deflator (annual %)
    'agriculture_value_added_percent_gdp',  # Agriculture, value added (% of GDP)
    'industry_value_added_percent_gdp',  # Industry, value added (% of GDP)
    'services_value_added_percent_gdp',  # Services, etc., value added (% of GDP)
    'exports_of_goods_services_percent_gdp',  # Exports of goods and services (% of GDP)
    'imports_of_goods_services_percent_gdp',  # Imports of goods and services (% of GDP)
    'gross_capital_formation_percent_gdp',  # Gross capital formation (% of GDP)
    'net_migration',  # Net migration (absolute value)
    'gni_usd',  # GNI, Atlas method (current US$)
    'gni_per_capita_usd',  # GNI per capita, Atlas method (current US$)
    'gni_ppp_usd',  # GNI, PPP (current international $)
    'gni_per_capita_ppp_usd',  # GNI per capita, PPP (current international $)
    'income_share_lowest_twenty',  # Income share held by lowest 20% (in %)
    'life_expectancy',  # Life expectancy at birth, total (years)
    'fertility_rate',  # Fertility rate, total (births per woman)
    'prevalence_hiv_total',  # Prevalence of HIV, total (% of population ages 15-49)
    'co2_emissions_tons_per_capita',  # CO2 emissions (metric tons per capita)
    'revenue_excluding_grants_percent_gdp',  # Revenue, excluding grants (% of GDP)
    'cash_surplus_deficit_percent_gdp',  # Cash surplus/deficit (% of GDP)
    'startup_procedures_register',  # Start-up procedures to register a business (number)
    'market_cap_domestic_companies_percent_gdp',  # Market capitalization of listed domestic companies (% of GDP)
    'mobile_subscriptions_per_hundred',  # Mobile cellular subscriptions (per 100 people)
    'internet_users_per_hundred',  # Internet users (per 100 people)
    'high_technology_exports_percent_total',  # High-technology exports (% of manufactured exports)
    'merchandise_trade_percent_gdp',  # Merchandise trade (% of GDP)
    'unemployment_total_percent'  # Unemployment total (% of labor force)
]

With this list, we’ll loop through each indicator to collect the data.


country = 'USA'

# Collect one Series per indicator, indexed by date
macro_series = {}

# Loop through each indicator and fetch the data
for indicator in list_of_macro_indicators:
    url = f'https://eodhd.com/api/macro-indicator/{country}'
    querystring = {"api_token": token, "indicator": indicator, "fmt": "json"}

    # Fetch data from the API and fail loudly on an HTTP error
    response = requests.get(url, params=querystring)
    response.raise_for_status()

    # Convert the response to a DataFrame
    df = pd.DataFrame(response.json())
    df['Date'] = pd.to_datetime(df['Date'])

    # Index by date so indicators that cover different years stay aligned
    macro_series[indicator] = pd.to_numeric(df.set_index('Date')['Value'], errors='coerce')

# Combine everything: one row per date, one column per indicator
df_macro_indicators = pd.concat(macro_series, axis=1).sort_index()
df_macro_indicators = df_macro_indicators.rename_axis('Date').reset_index()

os.makedirs('data', exist_ok=True)  # make sure the output folder exists
df_macro_indicators.to_csv('data/indicator_values_df.csv', index=False)

This gives us a dataframe with a Date column and one column per macro indicator, holding that indicator’s value for each year.
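Indicators do not necessarily cover the same years, so it is safer to line them up by date than by row position. Here is a minimal sketch of that idea. The payloads are invented and only mimic the `Date` and `Value` fields used above, and `to_series` is a hypothetical helper, not part of any API.

```python
import pandas as pd

# Hypothetical payloads shaped like the API response used above.
# The numbers are invented for illustration only.
payload_a = [{"Date": "2022-12-31", "Value": 1.0},
             {"Date": "2021-12-31", "Value": 2.0},
             {"Date": "2020-12-31", "Value": 3.0}]
payload_b = [{"Date": "2022-12-31", "Value": 10.0},
             {"Date": "2020-12-31", "Value": 30.0}]  # 2021 is missing

def to_series(payload, name):
    """Turn one indicator payload into a Series indexed by date."""
    df = pd.DataFrame(payload)
    df["Date"] = pd.to_datetime(df["Date"])
    return df.set_index("Date")["Value"].rename(name)

# Concatenating along columns aligns the two series on their dates
aligned = pd.concat(
    [to_series(payload_a, "indicator_a"), to_series(payload_b, "indicator_b")],
    axis=1,
).sort_index()
print(aligned)
```

The missing year shows up as NaN in `indicator_b` instead of silently shifting the later values onto the wrong dates.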

Stock Market Indices

The next step is the beating heart of the economy: the stock market and its indices. The EODHD API supports indices from various countries, and you can retrieve them with the same endpoint that lists the tickers of an exchange. Instead of an exchange code, you just request the tickers for “INDX” and there you go:


EXCHANGE_CODE = 'INDX'
url = f'https://eodhd.com/api/exchange-symbol-list/{EXCHANGE_CODE}'
querystring = {"api_token": token, "fmt": "json"}

data = (requests.get(url, params=querystring)).json()
df_US_indices = pd.DataFrame(data)

df_US_indices = df_US_indices[df_US_indices['Country'] == 'USA']

Feel free to browse through these results. For the rest of the article, we will focus on the S&P indices for specific sectors.


list_of_indices = ['GSPC','SP500-15','SP500-151010','SP500-20','SP500-25','SP500-2550', 'SP500-30','SP500-35','SP500-40','SP500-45','SP500-50','SP500-55','SP500-60','DXY']

df_US_indices_close_prices = pd.DataFrame()

# The API has no yearly period, so we fetch daily prices and later keep the last available price of each year
for ind in list_of_indices:
    url = f'https://eodhd.com/api/eod/{ind}.INDX'
    querystring = {"api_token": token, "from": "1959-01-01", "period": "d", "fmt": "json"}
    data = (requests.get(url, params=querystring)).json()
    index_data = pd.DataFrame(data)
    index_data.set_index('date', inplace=True)
    index_data = index_data[['adjusted_close']]
    index_data.columns = [ind]
    df_US_indices_close_prices = pd.merge(df_US_indices_close_prices, index_data, how='outer', left_index=True, right_index=True)

df_US_indices_close_prices.rename(columns={'GSPC':'SP500',"SP500-15":"SECTOR-MATERIALS","SP500-151010":"SECTOR-CHEMICALS", "SP500-20":"SECTOR-INDUSTRIALS","SP500-25":"SECTOR-CONSUMER-DISCRETIONARY","SP500-2550":"SECTOR-CONSUMER-DISTRIBUTION-RETAIL", "SP500-30":"SECTOR-STAPLES","SP500-35":"SECTOR-HEALTHCARE","SP500-40":"SECTOR-FINANCIALS","SP500-45":"SECTOR-INFORMATION-TECHNOLOGY", "SP500-50":"SECTOR-TELECOMMUNICATION-SERVICES","SP500-55":"SECTOR-UTILITIES","SP500-60":"SECTOR-REAL-ESTATE", "DXY":"DOLLAR-INDEX"}, inplace=True)
df_US_indices_close_prices.reset_index(inplace=True)
df_US_indices_close_prices.rename(columns={'date': 'Date'}, inplace=True)
df_US_indices_close_prices['Date'] = pd.to_datetime(df_US_indices_close_prices['Date']) 
df_US_indices_close_prices.set_index('Date', inplace=True) 
df_US_indices_close_prices = df_US_indices_close_prices.resample('YE').last()
df_US_indices_close_prices.reset_index(inplace=True)

This will result in a dataframe with the year-end price of every index in our list.
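To see what the yearly resampling does, here is a small sketch on invented daily prices. The `'YE'` alias is the one used above and needs a recent pandas; the fallback to `'A'` for older versions is my assumption about your environment, not something the original code requires.

```python
import pandas as pd

# Invented daily closes; 2022 ends on Friday, December 30
daily = pd.DataFrame(
    {"SP500": [100.0, 101.0, 102.0, 103.0]},
    index=pd.to_datetime(["2021-12-30", "2021-12-31", "2022-12-29", "2022-12-30"]),
)

try:
    yearly = daily.resample("YE").last()   # alias used in the article (pandas 2.2+)
except ValueError:
    yearly = daily.resample("A").last()    # older pandas spells year-end as "A"

print(yearly)
```

Each row is labeled December 31 even when the last trading day falls earlier, which is convenient if the macro indicators are also stamped with the last day of the year.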

Do they correlate?

To investigate the correlations between these two datasets, we first merge them into one with the simple code below:


df_merged = pd.merge(df_macro_indicators, df_US_indices_close_prices, how='left', on='Date')
df_merged.set_index('Date', inplace=True)
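A merge on Date only pairs rows whose dates match exactly, so a quick sanity check can save some head scratching. The sketch below uses invented frames and pandas’ `indicator=True` option to count the rows that found no partner; the column names `macro_x` and `index_y` are placeholders.

```python
import pandas as pd

# Invented yearly frames; only the dates matter here
macros = pd.DataFrame({"Date": pd.to_datetime(["2020-12-31", "2021-12-31"]),
                       "macro_x": [1.0, 2.0]})
prices = pd.DataFrame({"Date": pd.to_datetime(["2021-12-31", "2022-12-31"]),
                       "index_y": [10.0, 20.0]})

# indicator=True adds a "_merge" column saying where each row came from
check = pd.merge(macros, prices, how="left", on="Date", indicator=True)
unmatched = check[check["_merge"] == "left_only"]
print(f"{len(unmatched)} macro rows have no matching index price")
```

If most rows come back unmatched on your real data, the two Date columns probably use different conventions (for example, the first versus the last day of the year).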

Our aim is to find correlations between the macro indicators and the indices, not within each group. So as not to be overwhelmed by numbers, we’ll put the macro indicators on one axis and the indices on the other, then plot a heatmap.


x_axis_columns = ['real_interest_rate', 'population_total', 'population_growth_annual',
       'inflation_consumer_prices_annual', 'consumer_price_index',
       'gdp_current_usd', 'gdp_per_capita_usd', 'gdp_growth_annual',
       'debt_percent_gdp', 'net_trades_goods_services',
       'inflation_gdp_deflator_annual', 'agriculture_value_added_percent_gdp',
       'industry_value_added_percent_gdp', 'services_value_added_percent_gdp',
       'exports_of_goods_services_percent_gdp',
       'imports_of_goods_services_percent_gdp',
       'gross_capital_formation_percent_gdp', 'net_migration', 'gni_usd',
       'gni_per_capita_usd', 'gni_ppp_usd', 'gni_per_capita_ppp_usd',
       'income_share_lowest_twenty', 'life_expectancy', 'fertility_rate',
       'prevalence_hiv_total', 'co2_emissions_tons_per_capita',
       'revenue_excluding_grants_percent_gdp',
       'cash_surplus_deficit_percent_gdp', 'startup_procedures_register',
       'market_cap_domestic_companies_percent_gdp',
       'mobile_subscriptions_per_hundred', 'internet_users_per_hundred',
       'high_technology_exports_percent_total',
       'merchandise_trade_percent_gdp', 
       'unemployment_total_percent']

y_axis_columns = ['SECTOR-MATERIALS','SECTOR-CHEMICALS', 'SECTOR-INDUSTRIALS','SECTOR-CONSUMER-DISCRETIONARY', 'SECTOR-CONSUMER-DISTRIBUTION-RETAIL','SECTOR-STAPLES', 'SECTOR-HEALTHCARE', 'SECTOR-FINANCIALS', 'SECTOR-INFORMATION-TECHNOLOGY', 'SECTOR-UTILITIES','SECTOR-REAL-ESTATE']
correlations = df_merged.corr()

# Extract the correlation matrix between X and Y columns
corr_matrix = correlations.loc[x_axis_columns, y_axis_columns]

# Plot the heatmap
plt.figure(figsize=(15, 6))
sns.heatmap(corr_matrix, annot=False, cmap='coolwarm', fmt=".2f")
plt.title("Correlation Heatmap between Indices and Macros")
plt.xticks(rotation=45)  # Rotate the x-axis labels 45 degrees
plt.show()

OK, it is still overwhelming. Let us go to the old traditional method and just examine the dataframe of correlations.
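One way to examine the dataframe without reading every cell is to flatten it and sort by absolute correlation. This is a minimal sketch on an invented matrix that has the same shape as `corr_matrix` (macros as rows, sectors as columns); the labels and values are made up.

```python
import pandas as pd

# Invented correlation values, shaped like corr_matrix
toy_corr = pd.DataFrame(
    {"SECTOR-A": [0.9, -0.2], "SECTOR-B": [-0.7, 0.1]},
    index=["macro_x", "macro_y"],
)

# Flatten to a Series indexed by (macro, sector) pairs
pairs = toy_corr.stack()

# Strongest relationships first, whatever their sign
ranked = pairs.sort_values(key=lambda s: s.abs(), ascending=False)
print(ranked.head(3))
```

On the real data, `corr_matrix.stack()` in place of `toy_corr.stack()` should give the same kind of ranking.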

Now we will examine the correlations for each macro indicator in turn. To do that, let’s create two functions:

the first will plot, in a bar chart, the correlations of one macro indicator against all the sectors
the other will normalize and plot two series for easier visual comparison.


def plot_bar_chart(df, column_name):
    plt.figure(figsize=(10, 6))
    df[column_name].plot(kind='barh')  # horizontal bars, one per row of df
    plt.title(f'Correlations of {column_name}')
    plt.xlabel('Correlation')  # bar length is the correlation coefficient
    plt.ylabel('Index')
    plt.show()

def plot_correlated_series(df, line_A, line_B):
    # Keep only the two columns and drop their missing rows,
    # without modifying the dataframe that was passed in
    pair = df[[line_A, line_B]].dropna()
    # Scale each series by its largest absolute value so negative values keep their sign
    df_normalized = pair / pair.abs().max()

    # Plot the normalized data
    plt.figure(figsize=(10, 6))
    plt.plot(df_normalized[line_A], label=f'{line_A}')
    plt.plot(df_normalized[line_B], label=f'{line_B}')
    plt.title(f'{line_A} vs {line_B}')
    plt.xlabel('Date')
    plt.yticks([])  # the normalized scale has no meaningful units
    plt.legend()
    plt.show()

Real Interest Rate

The real interest rate is like the secret sauce of the financial world. It’s the nominal interest rate minus the sneaky inflation rate. Think of it as the true cost of borrowing or the real return on your savings, after the inflation dragon has taken its bite. A positive real interest rate means your money is growing faster than prices, while a negative one means it’s losing value.
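The subtraction described above is the usual shortcut. The exact version, known as the Fisher relation, divides instead of subtracting. The sketch below compares the two on hypothetical inputs; how the data provider actually computes the `real_interest_rate` indicator is not stated here, so treat this purely as an illustration of the concept.

```python
def real_rate_approx(nominal_pct, inflation_pct):
    """Shortcut: nominal rate minus inflation, both in percent."""
    return nominal_pct - inflation_pct

def real_rate_exact(nominal_pct, inflation_pct):
    """Fisher relation: (1 + nominal) / (1 + inflation) - 1, returned in percent."""
    return ((1 + nominal_pct / 100) / (1 + inflation_pct / 100) - 1) * 100

# Hypothetical inputs: a 5% nominal rate with 3% inflation
print(real_rate_approx(5.0, 3.0))
print(round(real_rate_exact(5.0, 3.0), 2))
```

For small rates the two numbers are close, which is why the shortcut is good enough for everyday use.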

Let’s plot the correlations


plot_bar_chart(corr_matrix.T,'real_interest_rate')

As expected! When interest rates are low, borrowing is cheaper. Companies can borrow cheap money, and consumers can borrow to spend. So as the rates go down, the companies’ stocks go up! Also, you will notice that the financial sector has the weakest correlation. Duh! Lower interest rates, lower profits! But don’t feel sorry for them. They are still in negative territory, which means they still make money ;)

Now let us plot just one sector against this macro indicator to check if we find something interesting!


line_A = 'SECTOR-MATERIALS'
line_B = 'real_interest_rate'
plot_correlated_series(df_merged, line_A, line_B)

Do you notice something strange? Right after the crisis of 2008, it seems that for a decade the two were positively correlated.

A possible explanation would be that after the crisis, central banks around the world implemented unconventional monetary policies such as quantitative easing (QE) to lower the interest rates in the long run and boost the economy. It does not look like a coincidence that this happens after a crisis!
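A shift like this is easier to spot with a rolling correlation than by eyeballing two lines. Below is a minimal sketch; `rolling_correlation` is a hypothetical helper, the window length is an arbitrary choice, and the toy series are invented so that the relationship flips halfway through.

```python
import pandas as pd

def rolling_correlation(df, col_a, col_b, window=10):
    """Correlation of two columns over a sliding window of `window` rows."""
    pair = df[[col_a, col_b]].dropna()
    return pair[col_a].rolling(window).corr(pair[col_b])

# Invented yearly series: they move together at first, then in opposite directions
toy = pd.DataFrame(
    {"a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     "b": [1, 2, 3, 4, 5, 5, 4, 3, 2, 1]},
    index=pd.to_datetime([f"{year}-12-31" for year in range(2001, 2011)]),
)

roll = rolling_correlation(toy, "a", "b", window=5)
print(roll)
```

With the real data you could try `rolling_correlation(df_merged, 'SECTOR-MATERIALS', 'real_interest_rate')` and look at where the sign changes.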

Consumer Price Index (CPI)

The Consumer Price Index (CPI) is like a magical shopping basket that tracks how much everyday stuff costs over time. By watching the prices of things like groceries, gas, and rent, the CPI gives us a glimpse into the rising cost of living. It’s like a financial crystal ball, helping us see if our hard-earned money is buying less and less each year.

Let’s plot it


plot_bar_chart(corr_matrix.T,'consumer_price_index')

No wonder. Everything is positive, at around 80% or more. As time passes, companies raise their prices, passing inflation on to the consumer, so the CPI climbs together with their stock prices. Poor banks (the financial sector) are still at the low end!
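Two series that both drift upward over the decades will look highly correlated almost by construction. A common cross-check, sketched here as an assumption about how one might dig deeper rather than as part of the original analysis, is to correlate year-over-year changes as well as levels. The helper name and the toy numbers are invented.

```python
import pandas as pd

def level_and_change_corr(df, col_a, col_b):
    """Return (correlation of levels, correlation of year-over-year % changes)."""
    pair = df[[col_a, col_b]].dropna()
    levels = pair[col_a].corr(pair[col_b])
    changes = pair.pct_change().dropna()
    return levels, changes[col_a].corr(changes[col_b])

# Invented series that both trend upward but zigzag in opposite directions
toy = pd.DataFrame({
    "cpi_like":   [100, 104, 106, 112, 114, 121],
    "index_like": [100, 102, 108, 110, 118, 120],
})

levels_corr, changes_corr = level_and_change_corr(toy, "cpi_like", "index_like")
print(f"levels: {levels_corr:.2f}, changes: {changes_corr:.2f}")
```

In this toy case the levels are strongly positively correlated while the yearly changes are negatively correlated, so it is worth checking both before reading too much into a high number.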

Gross National Income (GNI)

Imagine GNI as a country’s piggy bank. It not only counts the money earned from activities within the country (like GDP) but also the extra cash its citizens bring home from working abroad. So, if a country’s citizens are super popular overseas, working as doctors, engineers, or even pop stars, their earnings add to the GNI, making the piggy bank even fuller. While GDP is like the money earned from a lemonade stand in your backyard, GNI is the total earnings from all the lemonade stands, whether they’re in your backyard or your neighbor’s.


plot_bar_chart(corr_matrix.T,'gni_usd')

Again, as with the consumer price index, everything is on the high positive side, and again the financial sector is the least correlated. So let us see why this is happening…

line_A = 'SECTOR-FINANCIALS'
line_B = 'gni_usd'
plot_correlated_series(df_merged, line_A, line_B)

No surprise. Again the financial crisis of 2008 is skewing the correlation. After that, it is business as usual.

Net Trade in Goods and Services

Imagine your country as a giant store. Net Trade in Goods and Services is like the difference between what your country sells to other countries (exports) and what it buys from them (imports). If you sell more than you buy, it’s like running a profitable business, boosting your country’s economy. But if you buy more than you sell, it’s like spending more than you earn, which can strain your financial health. So, keeping track of this trade balance is crucial for any country’s economic well-being.


plot_bar_chart(corr_matrix.T,'net_trades_goods_services')

This is the first time we see a sector other than the financial one being the least correlated: real estate!


line_A = 'SECTOR-REAL-ESTATE'
line_B = 'net_trades_goods_services'
plot_correlated_series(df_merged, line_A, line_B)

Well, from the graph above you can see that the crisis of 2008 makes a difference. However, the question is why this time real estate, rather than the financial sector, stands apart. The answer is quite simple: real estate is more grounded in local economic conditions. While trade fluctuations can impact industries like manufacturing and tech, real estate is largely driven by local factors like population growth, job markets, and interest rates. For instance, a booming tech sector might increase demand for housing, but it doesn’t necessarily affect the overall trade balance. Simple, right?

The Dollar Index!

You might have noticed that I added the Dollar Index to the data earlier but did not include it in the plots. The reason is that this index is completely different from the sectors, so the analysis should be done separately.

The Dollar Index (DXY) is like a currency popularity contest where the mighty US dollar goes head-to-head with a basket of other currencies. Think of it as the dollar’s own little league, where it competes against the euro, yen, pound, and a few other foreign currencies. When the dollar’s winning, it means it’s getting stronger compared to these other currencies. Conversely, if the dollar’s losing, it’s getting weaker. This index is a big deal for investors and traders because it can influence everything from exchange rates to commodity prices, making it a key player in the global financial game.

Now let’s see how it is correlated to all the macro indicators:


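# Keep only the macro indicator rows, so the sectors and the index itself are left out
plot_bar_chart(correlations.loc[x_axis_columns], 'DOLLAR-INDEX')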

You will notice from the x-axis that there is no strong correlation with any of the macro indicators. None of them passes a decent 50%, so we can’t even discuss any links between the economy and the dollar index. I have to admit that we could have expected more interesting results there.

Let us try to compare the graphs at least with the winner (~minus 50%). The merchandise trade as a percent of GDP shows how much a country’s economy is dependent on their foreign friends to buy their stuff! It’s the value of exports and imports divided by GDP, multiplied by 100 — easy math. A high score implies that the economy has a serious passport stamp collection. Low score? They’re more the stay-at-home type. It’s like asking if a country’s dinner is home-cooked or “imported” from a neighbor!
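The “easy math” fits in one line of Python. The figures below describe a hypothetical economy and are there only to show the arithmetic.

```python
def merchandise_trade_percent_gdp(exports, imports, gdp):
    """(merchandise exports + merchandise imports) / GDP, as a percentage."""
    return (exports + imports) / gdp * 100

# Hypothetical economy: 2 units exported, 3 imported, GDP of 25 (same currency)
share = merchandise_trade_percent_gdp(2.0, 3.0, 25.0)
print(share)
```

Note that exports and imports are added together, so a country can score high whether it mostly sells or mostly buys.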


line_A = 'DOLLAR-INDEX'
line_B = 'merchandise_trade_percent_gdp'
plot_correlated_series(df_merged, line_A, line_B)

What is interesting is that while the two time series look negatively correlated until 2018, from 2018 onward they look like the opposite. Some possible reasons for this shift are COVID-19 and the fact that the US implemented various tariffs, especially on Chinese imports.

So how is it possible to increase your imports and still have a strong currency? In pure economic maths, it is not. So the only reasonable explanation left is geopolitical forces combined with a worldwide pandemic.

Chinese curse: May you live in interesting times

Conclusions

Macroeconomic indicators are like weather forecasts for the stock market — some days they’re spot on, while others, they’re just guessing whether it’ll rain profits or tanking stocks.

Different sectors seem to take these indicators personally — while some dance to the tune of interest rates and inflation, others pretend they didn’t hear a thing.

Economic shifts and market responses may zigzag, but a good mix of indicators gives investors a fighting chance to navigate the market’s mood swings (no crystal ball or fortune teller required!).

This article aims to be a bit educational with a light (and funny) touch, backed by some data analysis, on subjects where only Nobel winners can give spot-on (if possible) answers. We hope that reading it triggers ideas of your own to investigate more interesting correlations (or not).
