
Investigating Xmas Rally in the Market Using Python

Nikhil Adithyan

The truth behind Santa's rally



The Santa Claus Rally is a notable stock market phenomenon typically occurring during the last five trading days of December and January’s first two trading days. This period has historically been marked by a rise in stock prices, with the S&P 500 often showing gains. The term was popularized by Yale Hirsch in 1972 through his “Stock Trader’s Almanac.”


The rally is seen as a positive market trend that provides investors with a cheerful end to the year, often serving as an informal indicator of market sentiment heading into the new year. While its occurrence is not guaranteed, it has been observed frequently enough to attract attention from traders and analysts alike.


Several factors are believed to contribute to the Santa Claus Rally:


Low Trading Volumes: Many institutional investors take vacations during this period, leading to lighter trading volumes. This can make it easier for stock prices to rise as smaller trades have a larger impact on market movements.


Tax-Loss Harvesting: Investors may sell underperforming stocks at the end of the year to offset capital gains taxes, creating opportunities for others to buy at lower prices.


Optimism and Festive Spirit: The holiday season often brings a sense of optimism and goodwill, which can translate into more bullish investment behavior.


Anticipation of the January Effect: Some investors buy stocks in anticipation of the January effect, a separate phenomenon where stock prices are expected to rise at the start of the new year.


Year-End Bonuses: Some investors use year-end bonuses to purchase stocks, which increases buying pressure during this period.


What we will do in this article:

Using the EODHD API, which provides decades of historical prices, we will try to:


  • Test whether the rally shows up in stock market indices

  • Analyze if we can identify specific stocks that follow these patterns

  • Check whether the rally tells us anything about the year to come


Stock Market Indices

First, we will get the list of indices. To do that, we will call EODHD’s exchange-symbol-list endpoint with the INDX exchange code and keep the entries whose country is USA. This gives us 653 US indices!



import pandas as pd
import requests
import io
import tqdm
import matplotlib.pyplot as plt

token = 'YOUR EODHD API KEY'

EXCHANGE_CODE = 'INDX'
url = f'https://eodhd.com/api/exchange-symbol-list/{EXCHANGE_CODE}'
querystring = {"api_token":token,"fmt":"json"}

response = requests.get(url, params=querystring).json()
df_tickers = pd.DataFrame.from_dict(response)
df_tickers = df_tickers[df_tickers['Country'] == 'USA']

df_tickers


Before we do anything else, we need a function that returns a dataframe for a specific symbol. It uses EODHD’s End-of-Day API for historical prices and the Fundamentals API for the name, sector, and market cap. Here we will pass it an index, but we plan to reuse it later for equity symbols. For each year, the returned dataframe includes:


  • whether the symbol has been a good kid that year (that is, the rally took place)

  • how much of a good kid it was (the rally return in percent)

  • sector name and market cap

  • whether the symbol (or index) was in a short uptrend (10MA above 50MA) or a long uptrend (50MA above 200MA)

  • finally, what happened the next year
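
To make the trend flags concrete, here is a minimal sketch on synthetic, steadily rising prices (made-up numbers, not market data). It builds the moving averages the same way as the function that follows, shifted by one row so that each day only sees earlier closes, and then compares them on the last row.

```python
import numpy as np
import pandas as pd

# Synthetic, steadily rising closes: purely illustrative, not market data.
dates = pd.bdate_range("2020-01-01", periods=260)
df = pd.DataFrame({"Date": dates, "Close": np.linspace(100.0, 150.0, len(dates))})

# Rolling means shifted by one row, so each row only uses closes from earlier sessions.
for window in (10, 50, 200):
    df[f"{window}_MA"] = df["Close"].rolling(window=window).mean().shift(1)

last = df.iloc[-1]
short_uptrending = bool(last["10_MA"] > last["50_MA"])   # 10MA above 50MA
long_uptrending = bool(last["50_MA"] > last["200_MA"])   # 50MA above 200MA
print(short_uptrending, long_uptrending)

# Rows without enough history have NaN averages.
print(df["200_MA"].isna().sum())
```

One thing to keep in mind: a comparison against NaN evaluates to False, so a symbol with fewer than 200 prior sessions is flagged as not up-trending rather than as unknown.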



def get_xmas_rally_df(symbol, EXCHANGE_CODE):
    start_date = '1974-05-01'
    url = f'https://eodhd.com/api/eod/{symbol}.{EXCHANGE_CODE}'
    querystring = {"api_token":token,"fmt":"csv","from":start_date}
    response = requests.get(url, params=querystring)

    if response.status_code != 200:
        print(f"Error processing {symbol}: {response.status_code}")
        return pd.DataFrame()

    try:
        df = pd.read_csv(io.StringIO(response.text))
        df['Date'] = pd.to_datetime(df['Date'])
        # print(df)
    except Exception as e:
        print(f"Error parsing data for {symbol}: {e}")
        print(response.text)
        return pd.DataFrame()

    # calculate the below 3 moving averages
    df['10_MA'] = df['Close'].rolling(window=10).mean().shift(1)
    df['50_MA'] = df['Close'].rolling(window=50).mean().shift(1)
    df['200_MA'] = df['Close'].rolling(window=200).mean().shift(1)

    # Get fundamental data
    try:
        url_fundamentals = f'https://eodhd.com/api/fundamentals/{symbol}.{EXCHANGE_CODE}'
        querystring_fundamentals = {"api_token":token,"fmt":"json"}
        response_fundamentals = requests.get(url_fundamentals, params=querystring_fundamentals).json()
        symbol_name = response_fundamentals.get('General', {}).get('Name', None)
        sector = response_fundamentals.get('General', {}).get('Sector', None)
        market_cap = response_fundamentals.get('Highlights', {}).get('MarketCapitalization', None)
    except Exception as e:
        print(f'Cannot find sector for symbol {symbol} with error {e}')
        symbol_name = None
        sector = None
        market_cap = None

    xmas_data = []
    for year in range(1975, 2024):
        xmas_rally = {'symbol':symbol, 'name':symbol_name, 'year':year, 'sector':sector, 'market_cap':market_cap}

        # entry point: the open of the first of the last 5 trading days of the year
        year_data = df[df['Date'].dt.year == year].sort_values(by='Date').tail(5)
        if len(year_data) == 0:
            continue
        first_row = year_data.iloc[0]
        xmas_rally['from_date'] = first_row['Date']
        xmas_rally['from_price'] = first_row['Open']

        xmas_rally['short_uptrending'] = first_row['10_MA'] > first_row['50_MA']
        xmas_rally['long_uptrending'] = first_row['50_MA'] > first_row['200_MA']


        # exit point: the close of the second trading day of the next year
        next_year_data = df[df['Date'].dt.year == year + 1].sort_values(by='Date')
        if len(next_year_data) < 2:
            continue
        second_row = next_year_data.iloc[1]
        xmas_rally['to_date'] = second_row['Date']
        xmas_rally['to_price'] = second_row['Close']

        xmas_rally['rally_pct'] = ((xmas_rally['to_price'] / xmas_rally['from_price']) - 1) * 100
        xmas_rally['been_a_good_kid'] = xmas_rally['rally_pct'] > 0

        xmas_rally['next_year_pct'] = ((next_year_data.iloc[-1]['Close'] / next_year_data.iloc[0]['Close']) - 1) * 100
        xmas_rally['next_year_good_kid'] = xmas_rally['next_year_pct'] > 0

        xmas_data.append(xmas_rally)

    return pd.DataFrame(xmas_data)
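
To see how the rally window is cut out of the price history, here is a small sketch on a synthetic business-day calendar with made-up prices. The helper `rally_window` is a hypothetical name used only for this illustration. Note that `pd.bdate_range` only skips weekends, so it is an assumption standing in for a real exchange calendar, which also skips holidays.

```python
import pandas as pd

# Made-up prices on a Monday-to-Friday calendar (holidays are NOT removed).
dates = pd.bdate_range("2021-12-01", "2022-01-31")
prices = pd.DataFrame({
    "Date": dates,
    "Open": [100.0 + i for i in range(len(dates))],
    "Close": [100.5 + i for i in range(len(dates))],
})

def rally_window(df, year):
    """Hypothetical helper: return (entry_row, exit_row) for the rally starting in `year`."""
    this_year = df[df["Date"].dt.year == year].sort_values("Date")
    next_year = df[df["Date"].dt.year == year + 1].sort_values("Date")
    if len(this_year) < 5 or len(next_year) < 2:
        return None  # not enough sessions to form the window
    # fifth-to-last session of the year, second session of the next year
    return this_year.iloc[-5], next_year.iloc[1]

entry, exit_ = rally_window(prices, 2021)
rally_pct = (exit_["Close"] / entry["Open"] - 1) * 100
print(entry["Date"].date(), exit_["Date"].date(), round(rally_pct, 2))
```

Unlike the function above, this sketch skips a year that has fewer than five sessions instead of using whatever rows are available.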

Now we have the indexes and the function, so let’s get our data:



df_idx_xmas_rally = pd.DataFrame()

for index, row in tqdm.tqdm(df_tickers.iterrows()):
    symbol = row['Code']
    symbols_xmas_rally_df = get_xmas_rally_df(symbol,'INDX')
    df_idx_xmas_rally = pd.concat([df_idx_xmas_rally, symbols_xmas_rally_df], ignore_index=True)

df_idx_xmas_rally

This way, we will get more than 11 thousand rows of data for all the indexes.



Does Santa Claus exist?

Let’s start with a simple question. How many times did the rally occur among all the indices in those years?



percentage_true = df_idx_xmas_rally['been_a_good_kid'].mean() * 100
percentage_false = (1 - df_idx_xmas_rally['been_a_good_kid'].mean()) * 100

print(f"Percentage of True: {percentage_true:.2f}%")
print(f"Percentage of False: {percentage_false:.2f}%")

======================================
Output
Percentage of True: 62.56%
Percentage of False: 37.44%

The rally occurred in almost 63% of the cases. That is better than a coin flip, but not overwhelming. Let’s plot it by year.



grouped = df_idx_xmas_rally.groupby('year')['been_a_good_kid'].mean() * 100
percentages = grouped.reset_index(name='percentage_been_a_good_kid')

# Plotting
plt.figure(figsize=(10, 5))
plt.bar(percentages['year'], percentages['percentage_been_a_good_kid'], color='skyblue')
plt.xlabel('Year')
plt.ylabel('Percentage of Been a Good Kid (%)')
plt.title('Percentage of Been a Good Kid per Year')
plt.xticks(percentages['year'], rotation=90)
plt.ylim(0, 110)
plt.axhline(y=50, color='r', linestyle='--')  # Optional: Line at 50%
plt.grid(axis='y')

plt.show()


That is impressive! When the rally happens, it happens across most of the indices at once, with yearly percentages of about 80%.


The question, though, is, what is happening with the major indices? We will focus on three indices: the S&P 500, the Nasdaq Composite, and the Dow Jones Industrial Average.



df_majors = df_idx_xmas_rally[df_idx_xmas_rally['symbol'].isin(['GSPC','DJI','IXIC'])]
true_percentage = df_majors['been_a_good_kid'].mean() * 100

average_rally_pct = df_majors.groupby('been_a_good_kid')['rally_pct'].mean()

# Print the results
print(f"Percentage of 'been_a_good_kid' being True: {true_percentage:.2f}%")
print("Average 'rally_pct' when 'been_a_good_kid' is:")
print(average_rally_pct)

======================================
Output
Percentage of 'been_a_good_kid' being True: 72.79%
Average 'rally_pct' when 'been_a_good_kid' is:
been_a_good_kid
False   -1.685401
True     2.064658

You see, in this case, we have an even better probability. For these indices, the rally happened in 72.79% of the cases over roughly the last 50 years, with an average gain of 2.06%. When it did not happen, the average loss was smaller, about 1.69%.
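
As a quick back-of-the-envelope check, we can combine the three numbers printed above into one expected return per rally window. This is only a sketch: it treats the historical frequency as a probability and ignores trading costs, both of which are assumptions.

```python
# Figures taken from the output above for the three major indices.
p_rally = 0.7279        # share of cases in which the rally happened
avg_gain = 2.064658     # average rally_pct when it happened
avg_loss = -1.685401    # average rally_pct when it did not

# Probability-weighted average of the two outcomes, in percent.
expected_pct = p_rally * avg_gain + (1 - p_rally) * avg_loss
print(f"Expected return per window: {expected_pct:.2f}%")
```

This works out to roughly 1.04% per window for these indices.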

Can we narrow it down?



# Share of rallies (in %) split by the short-term trend flag
rally_share_short = df_majors.groupby('short_uptrending')['been_a_good_kid'].mean() * 100
print("Average 'been_a_good_kid' by 'short_uptrending' is:")
print(rally_share_short)

# Share of rallies (in %) split by the long-term trend flag
rally_share_long = df_majors.groupby('long_uptrending')['been_a_good_kid'].mean() * 100
print("Average 'been_a_good_kid' when 'long_uptrending' is:")
print(rally_share_long)

======================================
Output
Average 'been_a_good_kid' by 'short_uptrending' is:
False    74.509804
True     71.875000
Average 'been_a_good_kid' when 'long_uptrending' is:
False    72.727273
True     72.815534

Whether the index is up-trending or down-trending, the results are nearly the same (roughly 72% to 75%), so Santa’s rally does not appear to depend on the current trend.


But can the rally predict the future?



average_rally_pct = df_majors.groupby(['been_a_good_kid'])['next_year_good_kid'].mean() * 100
print("Average 'next_year_good_kid' when 'been_a_good_kid':")
print(average_rally_pct)

======================================
Output
Average 'next_year_good_kid' when 'been_a_good_kid':
been_a_good_kid
False    67.500000
True     77.570093

That is interesting! When the rally happens, the next year is a good one in about 77.6% of the cases. However, when the rally does not occur, the next year is still positive 67.5% of the time, so a missed rally does not necessarily mean a bad year ahead. It looks like Santa resets the goodness counter ;)
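
For readers who want a feel for how large that gap is relative to sampling noise, here is a rough two-proportion z-test. The counts are an assumption: they are the ones implied by the percentages above (83 of 107 and 27 of 40). The test also assumes independent observations, which is optimistic because the three indices move together.

```python
import math

# Counts inferred from the printed percentages (assumption):
# 83/107 = 77.57% good next years after a rally, 27/40 = 67.50% after no rally.
rally_yes_good, rally_yes_total = 83, 107
rally_no_good, rally_no_total = 27, 40

p_yes = rally_yes_good / rally_yes_total
p_no = rally_no_good / rally_no_total

# Pooled proportion and standard error under the "no difference" hypothesis.
p_pool = (rally_yes_good + rally_no_good) / (rally_yes_total + rally_no_total)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / rally_yes_total + 1 / rally_no_total))

z = (p_yes - p_no) / se
# Two-sided p-value from the normal approximation.
p_value = math.erfc(abs(z) / math.sqrt(2))
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

Treat the result as a sanity check rather than a verdict, since the sample is small and the rows overlap heavily in time.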


What about individual stocks?

By now, we should agree that the Santa Claus rally is real. But how about the individual stocks? Let’s get all the stocks of NASDAQ and NYSE and check it out!



EXCHANGE_CODE = 'US'
url = f'https://eodhd.com/api/exchange-symbol-list/{EXCHANGE_CODE}'
querystring = {"api_token":token,"fmt":"json"}
response = requests.get(url, params=querystring).json()

df_symbols = pd.DataFrame.from_dict(response)
# df_symbols = df_symbols[(df_symbols['Exchange'] == 'NYSE') & (df_symbols['Type'] == 'Common Stock')]
df_symbols = df_symbols[(df_symbols['Exchange'].isin(['NYSE', 'NASDAQ'])) & (df_symbols['Type'] == 'Common Stock')]

df_symbols

The result is 6278 stocks! This will take some time…



df = pd.DataFrame()

for index, row in tqdm.tqdm(df_symbols.iterrows()):
    symbols_xmas_rally_df = get_xmas_rally_df(row['Code'],'US')
    df = pd.concat([df, symbols_xmas_rally_df], ignore_index=True)

df

With this loop, we managed to get 90K+ observations to analyze.
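
With thousands of symbols, calling `pd.concat` inside the loop copies the growing dataframe on every iteration. A common alternative is to collect the per-symbol frames in a list and concatenate once at the end. The sketch below shows the pattern with a hypothetical stand-in function, `fetch_rally_frame`, and made-up tickers, so it runs without any API call.

```python
import pandas as pd

def fetch_rally_frame(symbol):
    """Hypothetical stand-in for get_xmas_rally_df: returns a tiny frame, no network call."""
    return pd.DataFrame({
        "symbol": [symbol] * 2,
        "year": [2021, 2022],
        "been_a_good_kid": [True, False],
    })

symbols = ["AAA", "BBB", "CCC"]  # made-up tickers

frames = []
for symbol in symbols:
    frame = fetch_rally_frame(symbol)
    if not frame.empty:          # skip symbols that returned nothing
        frames.append(frame)

# One concatenation at the end instead of one per symbol.
df_all = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
print(df_all.shape)
```

The result is the same dataframe as before; only the way it is assembled changes.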




percentage_true = df['been_a_good_kid'].mean() * 100
percentage_false = (1 - df['been_a_good_kid'].mean()) * 100

print(f"Percentage of True: {percentage_true:.2f}%")
print(f"Percentage of False: {percentage_false:.2f}%")

======================================
Output
Percentage of True: 56.68%
Percentage of False: 43.32%

When we go into the details, the probability for the rally to happen on an individual stock drops to 56.68%.



by_sector = df.groupby('sector')['been_a_good_kid'].mean() * 100
by_sector


If we analyze the rally by sector, we will see that sectors like technology or consumer are not even near the top, although those are the sectors that (in theory) should have been affected the most. Who wouldn’t want a brand-new Microsoft Surface or iPhone delivered to them by Amazon at Christmas?


So, let’s see the individual stocks! We will keep only those with more than 15 yearly observations so that the percentages are not based on just a few years.



result = df.groupby(['symbol', 'name','sector']).agg(
    Percentage_of_Been_A_Good_Kid=('been_a_good_kid', 'mean'),
    Count_of_been_a_good_kid=('been_a_good_kid', 'size')
).reset_index()

# Convert proportion to percentage
result['Percentage_of_Been_A_Good_Kid'] *= 100

result = result[result['Count_of_been_a_good_kid'] > 15]

# Rank the results based on the percentage in descending order
result['Rank'] = result['Percentage_of_Been_A_Good_Kid'].rank(ascending=False, method='min')
result


That is interesting! As we can see, the top 10 comes mainly from the financial services sector, especially BlackRock funds.
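
A note on how to read the Rank column: with `method='min'`, symbols that share the same percentage share the same rank, and the next rank skips ahead. A toy example with made-up symbols and percentages shows the behavior:

```python
import pandas as pd

# Made-up symbols and percentages, for illustration only.
toy = pd.DataFrame({
    "symbol": ["AAA", "BBB", "CCC", "DDD"],
    "Percentage_of_Been_A_Good_Kid": [90.0, 75.0, 90.0, 60.0],
})

# Highest percentage gets rank 1; ties receive the lowest rank of their group.
toy["Rank"] = toy["Percentage_of_Been_A_Good_Kid"].rank(ascending=False, method="min")
print(toy.sort_values("Rank"))
```

Here AAA and CCC both get rank 1, BBB gets rank 3, and DDD gets rank 4, so a rank can be shared by many stocks.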



result[result['symbol'].isin(['AAPL','AMZN','MSFT'])]


As for stocks like Apple, Amazon, or Microsoft, they do not rank well among the roughly 2,500 stocks in the filtered dataset: Apple sits in 883rd place, while Microsoft and Amazon are close to the bottom…

This observation brings us to the next question. Bigger companies are often assumed to be more likely to ride Santa’s rally, but is this true?



df['market_cap'] = df['market_cap'].astype(float)

# Define the bins and labels for each capitalization category
bins = [0, 50_000_000, 300_000_000, 2_000_000_000, 10_000_000_000, 200_000_000_000, float('inf')]
labels = ['nano', 'micro', 'small', 'mid', 'large', 'mega']

# Create a new column with the categorized values
df['CapCategory'] = pd.cut(df['market_cap'], bins=bins, labels=labels, right=False)
df.groupby('CapCategory')['been_a_good_kid'].mean() * 100


It looks like this is true! Even though Microsoft and Apple did not run the rally as often as other stocks, mega caps are generally more likely to do so. The differences are slight, but mega and large companies run Santa’s rally more often (59% and 58%) than nano and small ones, which are around 55%.
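
To check where the bucket boundaries fall, here is a small sketch that applies the same bins and labels to a few made-up market caps. With `right=False`, each bucket includes its lower edge and excludes its upper edge, and a missing market cap stays missing.

```python
import pandas as pd

bins = [0, 50_000_000, 300_000_000, 2_000_000_000, 10_000_000_000, 200_000_000_000, float("inf")]
labels = ["nano", "micro", "small", "mid", "large", "mega"]

# Made-up market caps, including two values that sit exactly on a boundary and one missing value.
caps = pd.Series([49_999_999, 50_000_000, 2_000_000_000, 250_000_000_000, None], dtype="float64")

categories = pd.cut(caps, bins=bins, labels=labels, right=False)
print(categories.tolist())
```

A company with exactly $50M lands in micro, not nano, and one with exactly $2B lands in mid. Symbols without a market cap drop out of the groupby because their category is NaN.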


Conclusions

In conclusion, the Santa Claus Rally is a notable stock market phenomenon occurring during the last five trading days of December and the first two of January. This period often sees stock price increases, particularly in major indices like the S&P 500, Nasdaq Composite, and Dow Jones Industrial Average. Our analysis using historical data confirms several key points:


  • Frequency: The Santa Claus Rally occurs in approximately 73% of cases for major indices over the past 50 years, with an average return of 2.06% when it happens.


  • Market Trends: The rally does not appear to be significantly influenced by current market trends, as both up-trending and down-trending indices show similar probabilities of experiencing a rally.


  • Predictive Element: When the rally occurred, the following year was also positive for the major indices in about 77% of the cases, compared with 67.5% when it did not.


  • Individual Stocks: The likelihood of a Santa Claus Rally drops to about 56.68% when examining individual stocks, indicating less consistency compared to indices.


  • Sector Analysis: Financial services stocks are more likely to experience a Santa Claus Rally than technology or consumer sectors, which might traditionally be expected to benefit more from holiday spending.


  • Market Capitalization: Larger companies (mega and large caps) are slightly more likely to experience a rally compared to smaller ones, with probabilities around 59% for mega caps versus 55% for nano and small caps.


Overall, while the Santa Claus Rally is a real and intriguing market event, its occurrence varies significantly between indices and individual stocks. This analysis highlights the importance of considering various factors, such as market capitalization and sector performance, when evaluating the potential impact of this seasonal trend.


Remember, even if your stocks weren’t “good kids” this year, Santa might still bring you a surprise rally gift!


Disclaimer: While we explore the exciting world of investing in this article, it’s crucial to note that the information provided is for educational purposes only. I’m not a financial advisor, and the content here doesn’t constitute financial advice. Always do your research and consider consulting with a professional before making any investment decisions.


© 2023 by InsightBig. Powered and secured by Wix
