An unconventional strategy that actually works!
In today’s fast-paced financial markets, traders are constantly seeking innovative ways to gain an edge. Sentiment analysis has emerged as a powerful tool, allowing traders to gauge market mood and make more informed decisions. By analyzing the sentiment of news articles, it’s possible to develop trading strategies that align with market trends and shifts in sentiment.
In this article, we’ll explore how you can leverage financial news data to build and backtest a sentiment-driven trading strategy using Python. With the help of Intrinio’s Financial News API, which provides real-time sentiment analysis of news articles, we will extract sentiment data and use it to implement a simple yet effective intraday trading model. This strategy aims to capitalize on shifts in sentiment to make informed intraday trades, all based on high-quality news data.
So without further ado, let’s dive right into it!
The Trading Strategy
In this strategy, we combine sentiment analysis with intraday trading to take advantage of market trends driven by news sentiment. The basic idea is to track how news sentiment evolves over time and use that information to enter long or short positions in a stock. By comparing the average sentiment of a given day to a rolling average of previous days, we can determine if the market is trending positively or negatively based on the news flow.
Mechanics of the Strategy
The core idea is to compare each day's average sentiment to a 7-day rolling average of that sentiment.
If the sentiment is positive and higher than the rolling average, it signals a potential upward trend, prompting a long position. Conversely, if the sentiment is negative and below the rolling average, we take a short position, anticipating a decline.
Trading Conditions
Long Position: When the average sentiment is higher than the rolling average and positive, we enter a long position at the open of the next trading day. We hold this position until the close of the day and exit by selling the stock.
Short Position: When the average sentiment is lower than the rolling average and negative, we enter a short position at the open of the next trading day. We exit this short position by buying back the shares at the close of the day.
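The two conditions above can be sketched as a small signal function. This is only an illustration on made-up numbers: the column names average_sentiment and rolling_avg_sentiment match the ones built later in the article, while generate_signals() and the example values are hypothetical.

```python
import numpy as np
import pandas as pd

def generate_signals(df):
    """Label each day 'long', 'short', or 'none' from its sentiment columns."""
    avg = df['average_sentiment']
    roll = df['rolling_avg_sentiment']
    long_cond = (avg > roll) & (avg > 0)    # positive and above the rolling average
    short_cond = (avg < roll) & (avg < 0)   # negative and below the rolling average
    out = df.copy()
    out['signal'] = np.select([long_cond, short_cond], ['long', 'short'], default='none')
    return out

# Hypothetical sentiment values, one row per day
example = pd.DataFrame({
    'average_sentiment':     [0.40, -0.30, 0.10, -0.05],
    'rolling_avg_sentiment': [0.20, -0.10, 0.30, -0.20],
})
signals = generate_signals(example)
print(signals['signal'].tolist())  # ['long', 'short', 'none', 'none']
```

Days where the two conditions disagree (for example, positive sentiment that is still below the rolling average) produce no trade.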
With the strategy outlined, we are now ready to dive into the coding process to implement this approach.
Importing Packages
The first and foremost step is to import all the required packages into our Python environment. In this article, we’ll be using several packages which are:
Pandas: for data formatting, cleaning, manipulating, and other related purposes
Matplotlib: for creating charts and different kinds of visualizations
Requests: for making API calls in order to extract data
Termcolor: to customize the standard output shown in Jupyter notebook
Math: for various mathematical functions and operations
NumPy: for numerical and high-level mathematical functions
Datetime: for date-related functions
The following code imports all the above-mentioned packages into our Python environment:
# IMPORTING PACKAGES
import requests
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from datetime import timedelta
import math
from termcolor import colored as cl
If you haven’t installed any of the imported packages, make sure to do so using the pip command in your terminal.
Extracting Financial News Data of AAPL
To implement our strategy, the first step is to extract the relevant financial news data. In this case, we’re pulling news articles related to Apple Inc. (AAPL) from the Intrinio Financial News API over a specified date range (2024-09-26 to 2024-10-16). This data will include not only the headlines and publication dates but also sentiment scores, which will later be used in our trading strategy.
Here’s the code for extracting the news data:
# EXTRACTING FINANCIAL NEWS & SENTIMENT DATA OF AAPL
def fetch_news_data(api_key, ticker, start_date, end_date):
    url = f'https://api-v2.intrinio.com/companies/{ticker}/news?security={ticker}&specific_source=moody_us_news&start_date={start_date}&end_date={end_date}&api_key={api_key}'
    response = requests.get(url).json()
    pages = [pd.DataFrame(response['news'])]
    # Keep fetching while the API reports another page
    while response.get('next_page') is not None:
        next_page_url = url + f"&next_page={response['next_page']}"
        response = requests.get(next_page_url).json()
        pages.append(pd.DataFrame(response['news']))
    # Combine all pages (pd.concat, since DataFrame.append no longer exists in pandas 2.x)
    df = pd.concat(pages, ignore_index=True)
    return df

api_key = 'YOUR API KEY'
ticker = 'AAPL'
start_date = '2024-09-26'
end_date = '2024-10-16'
news_data = fetch_news_data(api_key, ticker, start_date, end_date)
# Reverse the rows into chronological order and index them by article id
news_data = news_data.iloc[::-1].set_index('id')
news_data.tail()
In this code, we first define a function fetch_news_data() that takes the API key, ticker, start date, and end date as parameters. It constructs the URL for the API request and retrieves the financial news articles for AAPL within the given date range. The articles are stored in a DataFrame for further processing. If there are multiple pages of news data, the code checks for a next_page attribute and continues fetching until all available data is retrieved.
After retrieving the data, we reverse the DataFrame with iloc[::-1] so that the rows are in chronological order, and set each article's id as the index. The resulting DataFrame contains all the relevant financial news for AAPL during the specified time frame, and it is now ready for sentiment analysis and strategy implementation.
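To see the pagination logic in isolation, here is a minimal sketch that runs without an API key. It assumes the response shape used in the article's code (a 'news' list plus a 'next_page' token that is None on the last page); collect_pages(), fetch_page, and the fake pages are hypothetical stand-ins, not part of Intrinio's API.

```python
import pandas as pd

def collect_pages(fetch_page):
    """Gather every page of a paginated 'news' response into one DataFrame.

    fetch_page is a hypothetical callable: it takes a next_page token
    (None for the first request) and returns the decoded JSON dict.
    """
    frames = []
    token = None
    while True:
        payload = fetch_page(token)
        frames.append(pd.DataFrame(payload['news']))
        token = payload.get('next_page')
        if token is None:  # no further pages reported
            break
    return pd.concat(frames, ignore_index=True)

# Stand-in for the real API: two fake pages keyed by their token
fake_pages = {
    None: {'news': [{'id': 'a'}, {'id': 'b'}], 'next_page': 'p2'},
    'p2': {'news': [{'id': 'c'}], 'next_page': None},
}
all_news = collect_pages(lambda token: fake_pages[token])
print(all_news['id'].tolist())  # ['a', 'b', 'c']
```

Passing the fetcher in as an argument keeps the loop testable offline; in real use it would wrap a requests.get() call.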
Here’s what the last few rows of the DataFrame look like:
Aggregating News Sentiment by Date
Once we have gathered the financial news data, the next step is to aggregate the sentiment scores by date. The goal here is to calculate the average sentiment for each day in the selected time range. This step is crucial because it gives us a clearer picture of market sentiment over time, allowing us to base trading decisions on the overall market mood rather than individual articles.
In this section, we will process the financial news data by extracting the publication date, sentiment score, and sentiment confidence for each article. We will then calculate a weighted sentiment score for each day and aggregate the results. The final output will be a DataFrame where each row represents a date along with its corresponding average sentiment score.
Here’s the code for calculating the average sentiment:
def aggregate_sentiment_by_date(news_data):
    sentiment_by_date = {}
    for i in range(len(news_data)):
        row = news_data.iloc[i]
        # Extract the date (first 10 characters, YYYY-MM-DD)
        date = row['publication_date'][:10]
        # Map the sentiment label to a score: positive = 1, negative = -1, anything else = 0
        sentiment_score = 1 if row['article_sentiment'] == 'positive' else (-1 if row['article_sentiment'] == 'negative' else 0)
        # Default confidence to 1 if not provided
        sentiment_confidence = row['article_sentiment_confidence']
        if pd.isna(sentiment_confidence):
            sentiment_confidence = 1
        # Weighted sentiment score
        weighted_sentiment = sentiment_score * sentiment_confidence
        # Aggregate the data by date
        if date not in sentiment_by_date:
            sentiment_by_date[date] = {'total_sentiment': 0, 'count': 0}
        sentiment_by_date[date]['total_sentiment'] += weighted_sentiment
        sentiment_by_date[date]['count'] += 1
    # Calculate the average sentiment for each date
    aggregated_data = []
    for date, data in sentiment_by_date.items():
        avg_sentiment = data['total_sentiment'] / data['count'] if data['count'] > 0 else 0
        aggregated_data.append({'date': date, 'average_sentiment': avg_sentiment})
    return pd.DataFrame(aggregated_data)

sentiment_df = aggregate_sentiment_by_date(news_data)
sentiment_df.tail()
Explanation:
In this code, we define the function aggregate_sentiment_by_date(), which loops through the financial news DataFrame (news_data) and aggregates the sentiment data by date.
Extracting the Date: We extract the publication date from each news article and strip it down to the first 10 characters (format: YYYY-MM-DD), which represents the date of the news.
Sentiment Score and Confidence: We assign a sentiment score based on the sentiment of the article (positive, negative, or neutral). Positive articles are assigned a score of 1, negative articles are assigned -1, and neutral articles are given a score of 0. We multiply this score by the article's sentiment confidence to calculate the weighted sentiment score for each article.
Aggregating Sentiment by Date: For each date, we aggregate the weighted sentiment scores and count the number of articles. This gives us a total sentiment score and the number of articles for each day.
Calculating Average Sentiment: For each day, we calculate the average sentiment by dividing the total sentiment score by the number of articles. This gives us a more balanced view of market sentiment for each date.
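The four steps above can also be sketched in vectorized form with a pandas groupby. This is an alternative illustration rather than the code used in this article: aggregate_sentiment_vectorized() and the sample rows are hypothetical, and filling a missing confidence with 1 is an assumption taken from the comment in the loop version.

```python
import pandas as pd

def aggregate_sentiment_vectorized(news_data):
    """Average confidence-weighted sentiment per publication date."""
    score_map = {'positive': 1, 'negative': -1}  # any other label scores 0
    score = news_data['article_sentiment'].map(score_map).fillna(0)
    # Assumption: treat a missing confidence as 1
    confidence = news_data['article_sentiment_confidence'].fillna(1)
    df = pd.DataFrame({
        'date': news_data['publication_date'].str[:10],  # keep YYYY-MM-DD
        'weighted': score * confidence,
    })
    return (df.groupby('date', as_index=False)['weighted'].mean()
              .rename(columns={'weighted': 'average_sentiment'}))

# Hypothetical articles: two on the first day, one on the second
sample = pd.DataFrame({
    'publication_date': ['2024-10-01T09:00:00Z', '2024-10-01T15:30:00Z', '2024-10-02T11:00:00Z'],
    'article_sentiment': ['positive', 'negative', 'neutral'],
    'article_sentiment_confidence': [0.9, 0.5, 0.8],
})
daily = aggregate_sentiment_vectorized(sample)
print(daily)
```

For the first day the weighted scores are 0.9 and -0.5, so the average is about 0.2; the neutral article on the second day averages to 0.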
The output is a DataFrame where each row represents a date, and the columns contain the date and the average sentiment score for that day. This is what the final few rows of the dataframe might look like:
Now that we have the average sentiment for each day, we can proceed to the next step, which is calculating the rolling average to smooth the sentiment data and identify trends over time.
Calculating Rolling Average
A rolling average (also called a moving average) is a commonly used statistical technique that helps smooth out short-term fluctuations and identify long-term trends by averaging the values of a data series over a specific window of time. In our case, we are using a 7-day rolling average of the sentiment values. This allows us to compare daily sentiment against a smoothed trend, making it easier to spot periods of sustained optimism or pessimism.
We are calculating the rolling average to provide a reference point for our trading strategy. By comparing each day’s sentiment to the rolling average, we can assess whether the sentiment is relatively higher (indicating bullishness) or lower (indicating bearishness) compared to recent trends.
# CALCULATING ROLLING AVERAGE
def calculate_rolling_average(df, window_size=7):
    df['date'] = pd.to_datetime(df['date'])
    df.sort_values(by='date', inplace=True)
    df['rolling_avg_sentiment'] = df['average_sentiment'].rolling(window=window_size).mean()
    return df

sentiment_df = calculate_rolling_average(sentiment_df)
sentiment_df.head(10)
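This function first converts the dates to datetime so they can be sorted and used in time-based operations. After sorting the data by date, it calculates the 7-day rolling average for the sentiment values using the rolling() function. The window counts rows, so it covers the seven most recent dates that have news, including the current one. Because rolling() needs a full window by default, the first six rows have no rolling average (NaN); the backtest drops them later. The resulting rolling_avg_sentiment column is added to the DataFrame, providing a smoothed version of the average sentiment.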
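A toy example makes the behavior of rolling() easier to see. The sketch below uses a 3-row window and made-up sentiment values; the shifted variant at the end is a hypothetical alternative (not what this article uses) that compares each day only against the days before it.

```python
import pandas as pd

# Hypothetical daily sentiment values
daily = pd.Series([0.1, 0.3, 0.5, 0.0], name='average_sentiment')

# Rolling mean over the current row and the two rows before it
rolling_3 = daily.rolling(window=3).mean()
print(rolling_3.isna().tolist())   # [True, True, False, False]

# Comparisons against NaN evaluate to False, so early rows never signal
above = daily > rolling_3
print(above.tolist())              # [False, False, True, False]

# Variant: shift by one row so the average covers previous days only
prior_only = daily.rolling(window=3).mean().shift(1)
print(prior_only.isna().tolist())  # [True, True, True, False]
```

On the third day the rolling mean is about 0.3 and the day's value is 0.5, so that day counts as above its recent trend.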
Here is the output showing the first few rows of the resulting dataframe, including the calculated rolling average:
Backtesting the Strategy
We’ve reached one of the most critical and exciting parts of this article. Now that we have a clear understanding of our trading strategy, it’s time to implement and backtest it using Python. To keep things simple, we’ll use a basic and straightforward backtesting system. The following code demonstrates how we backtest our strategy:
# BACKTESTING THE STRATEGY
# Extract intraday data
def extract_intraday(date):
    start_date = date
    end_date = str(pd.to_datetime(date) + timedelta(1))[:10]
    url = f'https://api-v2.intrinio.com/securities/AAPL/prices/intervals?timezone=UTC&source=realtime&start_date={start_date}&end_date={end_date}&interval_size=1m&api_key={api_key}'
    data = requests.get(url).json()
    df = pd.DataFrame(data['intervals'])
    # Reverse the returned order so the first row is the day's first interval
    df = df.iloc[::-1]
    return df

sentiment_df = sentiment_df.dropna()
investment = 100000
equity = investment
earning = 0
earning_record = []

# Trading conditions
# Stop one row early: each signal is traded on the next date in sentiment_df
for i in range(len(sentiment_df) - 1):
    avg_sentiment = sentiment_df.iloc[i]['average_sentiment']
    rolling_avg = sentiment_df.iloc[i]['rolling_avg_sentiment']
    date = str(sentiment_df.iloc[i+1]['date'])[:10]
    try:
        if avg_sentiment > rolling_avg and avg_sentiment > 0:
            df = extract_intraday(date)
            open_p = df.iloc[0].open
            no_of_shares = math.floor(equity/open_p)
            equity -= (no_of_shares * open_p)
            sell_price = df.iloc[-1].close
            equity += (no_of_shares * sell_price)
            print(cl(f'{date}', attrs = ['bold']), cl('LONG:', attrs = ['bold'], color = 'green'), f'BOUGHT AT {open_p} & SOLD AT {sell_price}')
            investment += earning
            earning = round(equity-investment, 2)
            earning_record.append(earning)
        elif avg_sentiment < rolling_avg and avg_sentiment < 0:
            df = extract_intraday(date)
            short_p = df.iloc[0].open
            no_of_shares = math.floor(equity/short_p)
            equity += (no_of_shares * short_p)
            buy_price = df.iloc[-1].close
            equity -= (no_of_shares * buy_price)
            print(cl(f'{date}', attrs = ['bold']), cl('SHORT:', attrs = ['bold'], color = 'red'), f'SOLD AT {short_p} & BOUGHT AT {buy_price}')
            investment += earning
            earning = round(equity-investment, 2)
            earning_record.append(earning)
    except Exception:
        # Raised, for example, when the API returns no intervals for that date
        print(f'{date} invalid date')

# CALCULATING STRATEGY EARNINGS
strategy_earning = round(equity - 100000, 2)
roi = round(strategy_earning / 100000 * 100, 2)
print('')
print(cl('EARNING:', attrs = ['bold']), f'${strategy_earning} ;', cl('ROI:', attrs = ['bold']), f'{roi}%')
We start by defining a function, extract_intraday(), which fetches intraday price data for AAPL from Intrinio's API. This data is retrieved in 1-minute intervals for the specified date, and we take the open of the first interval and the close of the last interval as the day's opening and closing prices.
Next, we loop through each date in the sentiment data to simulate trades based on our strategy. When the average sentiment is above the rolling average and positive, we simulate buying AAPL shares at the opening price of the next date in the sentiment data and selling them at that day's closing price. When the sentiment is below the rolling average and negative, we simulate a short sale by selling at that day's opening price and buying back at its closing price.
Finally, we calculate the earnings from the strategy by tracking how the equity changes after each trade, as well as the overall return on investment (ROI).
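The equity arithmetic for a single trade can be checked by hand with a small sketch. simulate_day_trade() and the prices below are hypothetical; like the backtest above, it buys or shorts as many whole shares as the equity allows at the open, closes at the end of the day, and ignores commissions, slippage, and borrowing costs.

```python
import math

def simulate_day_trade(equity, open_price, close_price, side):
    """Return equity after one open-to-close trade; side is 'long' or 'short'."""
    shares = math.floor(equity / open_price)  # whole shares only
    if side == 'long':
        pnl = shares * (close_price - open_price)   # gain when price rises
    elif side == 'short':
        pnl = shares * (open_price - close_price)   # gain when price falls
    else:
        raise ValueError("side must be 'long' or 'short'")
    return equity + pnl

# Hypothetical prices: 500 shares at 200, price moves up by 2 during the day
print(simulate_day_trade(100000, 200.0, 202.0, 'long'))   # 101000.0
print(simulate_day_trade(100000, 200.0, 202.0, 'short'))  # 99000.0
```

The same price move that earns 1,000 on the long side costs 1,000 on the short side, which is why the direction of the sentiment signal matters so much.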
The following are the trades executed by our backtesting system:
The strategy produced a total profit of $4270.19, resulting in an ROI of 4.27% over the test period. While these results are promising, the strategy can only be considered truly successful if it outperforms the simple buy-and-hold approach during the same period.
Buy/Hold Returns Comparison
A successful trading strategy not only needs to generate profits but must also consistently outperform the buy-and-hold approach. For those unfamiliar, the buy-and-hold strategy involves purchasing a stock and holding it over a longer period, regardless of market fluctuations.
If our strategy manages to outperform this approach, we can confidently consider it robust and potentially ready for real-world application. On the other hand, if it falls short, significant adjustments will be necessary to improve its performance.
Here’s the code that implements the buy-and-hold strategy and calculates its returns:
url = f'https://api-v2.intrinio.com/securities/AAPL/prices/intervals?timezone=UTC&source=realtime&start_date=2024-09-26&end_date=2024-10-16&interval_size=1m&api_key={api_key}'
response = requests.get(url).json()
pages = [pd.DataFrame(response['intervals'])]
# Keep fetching while the API reports another page
while response.get('next_page') is not None:
    next_page_url = url + f"&next_page={response['next_page']}"
    response = requests.get(next_page_url).json()
    pages.append(pd.DataFrame(response['intervals']))
# Combine all pages (pd.concat, since DataFrame.append no longer exists in pandas 2.x)
bh_df = pd.concat(pages, ignore_index=True)
# Reverse into chronological order
bh_df = bh_df.iloc[::-1]
# Sum the per-interval percentage changes of the closing price
bh_roi = round(list(bh_df['close'].pct_change().cumsum())[-1],4)*100
print(cl(f'BUY/HOLD STRATEGY ROI: {round(bh_roi,2)}%', attrs = ['bold']))
In this code, we retrieve the historical price data for AAPL over the same time frame. We use the closing prices to calculate the percentage change for each interval and sum these changes, which approximates the cumulative return of the buy-and-hold strategy over the period.
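Summing percentage changes is close to, but not identical to, the compounded return from the first close to the last. The sketch below shows the difference on three made-up prices; the variable names are hypothetical and the numbers are not AAPL data.

```python
import pandas as pd

# Hypothetical closing prices: up 10%, then down 10%
closes = pd.Series([100.0, 110.0, 99.0])

# Approach used above: add up the per-interval percentage changes
summed = closes.pct_change().sum()

# Exact buy-and-hold return: last price relative to the first
compounded = closes.iloc[-1] / closes.iloc[0] - 1

# Compounding the per-interval changes gives the same exact figure
chained = (1 + closes.pct_change().dropna()).prod() - 1

print(round(summed, 4), round(compounded, 4), round(chained, 4))
```

Here the summed figure is about 0 while the true return is -1%. With 1-minute intervals the individual changes are tiny, so the gap is usually small, but the exact formula is just as easy to compute.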
This is the result of the buy/hold strategy:
After comparing the results of the buy/hold strategy and our sentiment-driven trading strategy, our strategy outperformed buy/hold by 1.54 percentage points of ROI. What does this mean? It means that, over this test period, we have created a trading strategy that beat buy-and-hold and has the potential to do well in the real-world market.
Conclusion
In this article, we went through an extensive process of gathering financial news data with Intrinio’s Financial News API, developing a sentiment-driven trading strategy, and backtesting it. The strategy outperformed buy-and-hold with an ROI of 4.27% over the test period. The aim of this article was not just to showcase a profitable trading method, but to introduce the concept of using sentiment analysis as a key tool in shaping intraday trading strategies. With the support of high-quality news sentiment data, traders can make more informed decisions aligned with market trends.
To turn this strategy into an even more profitable one, several improvements can be made. First, fine-tuning the strategy’s parameters — such as experimenting with different rolling average windows, sentiment thresholds, and trade intervals — can optimize its performance. Additionally, implementing a solid risk management system, something we haven’t covered in this article, would be crucial for reducing losses and protecting profits when deploying the strategy in real-world markets.
With that being said, you’ve reached the end of the article. Hope you learned something new and useful today. If you have any suggestions for improving the trading strategy, kindly let me know in the comments. Thank you very much for your time.