A complete guide to extracting stock data with the Alpha Vantage Stock API in Python
Extracting stock data manually from websites is tedious, and finding reliable data is harder still. For beginners in particular, this can be overwhelming. The solution is to automate the extraction. How can we do this? The answer is simple: with the help of a Stock API.
What are Stock APIs, and why use them?
What: A Stock API is a web service, typically backed by a cloud-hosted database, that offers real-time stock updates, intraday data, historical data, and much more.
Why: Today, many financial institutions use stock APIs for trading and research because doing so avoids the expense of buying stock data directly from the exchanges (which costs a hefty amount of money). Secondly, with a little programming, it is easy to interact with Stock APIs to obtain the desired information. Finally, we get access to varied types of data with highly customizable features.
A note on Alpha Vantage:
Alpha Vantage provides free stock APIs through which users can access a wide range of data, such as real-time updates and historical data on equities, currencies, and cryptocurrencies.
What are we going to do?
In this article, we are going to use Python to interact with the stock API provided by Alpha Vantage and extract three types of equity data: intraday data, historical data, and real-time updates. Before moving on to the coding part, you must register on Alpha Vantage (https://www.alphavantage.co/support/#api-key) to obtain an API key, since every request needs the key to pull data. The code in this article reads the key from a local text file named api_key.txt.
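As a minimal sketch of how the key could be loaded, the helper below first checks an environment variable and then falls back to the api_key.txt file. The function name `load_api_key` and the variable name `ALPHAVANTAGE_API_KEY` are assumptions made for illustration; they are not part of Alpha Vantage's tooling.

```python
import os
from pathlib import Path


def load_api_key(path="api_key.txt", env_var="ALPHAVANTAGE_API_KEY"):
    """Return the API key from an environment variable, else from a text file.

    Both names are illustrative assumptions, not anything Alpha Vantage mandates.
    """
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    # Fall back to the key file; strip the trailing newline most editors add.
    return Path(path).read_text().strip()
```

Keeping the key outside the script means the code can be shared without revealing it.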
Importing Packages
We are going to use only two packages in this article: Pandas and Requests. The Pandas package handles the data manipulation and processing, and the Requests package provides functions to pull data from an API.
Python Implementation:
import pandas as pd
import requests
Now that we have imported the required packages into our Python environment, let's begin extracting some data!
Extracting Intraday Data
In this part, we are going to pull the intraday data of Tesla stock using the stock API provided by Alpha Vantage. This specific API is well suited to short-term charting and trading strategy development.
Python Implementation:
# INTRADAY DATA
def get_intraday_data(symbol, interval):
    # Read the API key as text; open() alone returns a file object, not the key
    with open('api_key.txt') as f:
        api_key = f.read().strip()
    api_url = f'https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY&symbol={symbol}&interval={interval}&apikey={api_key}'
    raw_df = requests.get(api_url).json()
    # After transposing, rows are timestamps and columns are the price fields
    df = pd.DataFrame(raw_df[f'Time Series ({interval})']).T
    df = df.rename(columns = {'1. open': 'open', '2. high': 'high', '3. low': 'low', '4. close': 'close', '5. volume': 'volume'})
    # The API returns every value as a string, so convert them to floats
    df = df.astype(float)
    df.index = pd.to_datetime(df.index)
    # Reverse the row order so the oldest data point comes first
    df = df.iloc[::-1]
    return df

tsla_intra = get_intraday_data('TSLA', '1min')
tsla_intra
Output:
Code Explanation: Firstly, we define a function named ‘get_intraday_data’ that takes the stock’s symbol (‘symbol’) and the time interval between the data points (‘interval’) as parameters. Inside the function, we first read the secret API key provided by Alpha Vantage (which should not be revealed) from the api_key.txt file into the ‘api_key’ variable. Next, we define a variable ‘api_url’ to store the URL of the API that serves intraday data. With the help of the ‘get’ function provided by the Requests package, we pull the data and store the parsed JSON in the ‘raw_df’ variable. We then rename the columns, convert the values from strings to floats, parse the index as dates, and reverse the row order before returning the intraday data in a clean format. Finally, we call the function to pull the intraday data of Tesla stock with a time interval of 1 minute.
In this article, I’ve used 1 minute as the time interval only as an example; there is no specific reason for that choice. Apart from 1 minute, the other options are 5 minutes (5min), 15 minutes (15min), 30 minutes (30min), and 60 minutes (60min).
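The function above assumes that the response always contains the time-series key. When a request fails (a wrong symbol, or too many calls in a short time), the API answers with a short JSON message instead of data, and the dictionary lookup raises a KeyError. Here is a hedged sketch of a guard that could run before building the dataframe. The helper name `extract_time_series` is hypothetical, and the field names 'Error Message', 'Note', and 'Information' are what such responses have commonly contained, so treat them as assumptions and check them against a real response.

```python
def extract_time_series(payload, interval):
    """Return the time-series dict from an intraday payload, or raise if absent."""
    key = f"Time Series ({interval})"
    if key in payload:
        return payload[key]
    # Assumed message fields; the API may use others.
    for field in ("Error Message", "Note", "Information"):
        if field in payload:
            raise ValueError(f"Alpha Vantage returned no data: {payload[field]}")
    raise ValueError(f"Unexpected response keys: {list(payload)}")


# Hand-written payloads for illustration only (not real API output).
good = {"Time Series (1min)": {"2024-01-02 10:00:00": {"1. open": "1.0"}}}
bad = {"Note": "call frequency limit reached"}

series = extract_time_series(good, "1min")
```

Calling `extract_time_series(bad, "1min")` raises a ValueError that carries the API's own message, which is easier to diagnose than a bare KeyError.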
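Since only these five interval strings are accepted, a small check can catch a typo such as '5m' before any request is sent. This is a sketch; the names `VALID_INTERVALS` and `check_interval` are made up for this example.

```python
# The interval strings listed above.
VALID_INTERVALS = ("1min", "5min", "15min", "30min", "60min")


def check_interval(interval):
    """Return the interval unchanged if it is supported, else raise ValueError."""
    if interval not in VALID_INTERVALS:
        raise ValueError(
            f"Unsupported interval {interval!r}; choose one of {VALID_INTERVALS}"
        )
    return interval


interval = check_interval("15min")
```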
Extracting Historical Data
This part is to specifically extract the historical data of the given stock using the stock API provided by Alpha Vantage.
Python Implementation:
# HISTORICAL DATA
def get_historical_data(symbol, start_date = None):
    # Read the API key as text; open() alone returns a file object, not the key
    with open('api_key.txt') as f:
        api_key = f.read().strip()
    api_url = f'https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol={symbol}&apikey={api_key}&outputsize=full'
    raw_df = requests.get(api_url).json()
    # After transposing, rows are dates and columns are the price fields
    df = pd.DataFrame(raw_df['Time Series (Daily)']).T
    df = df.rename(columns = {'1. open': 'open', '2. high': 'high', '3. low': 'low', '4. close': 'close', '5. adjusted close': 'adj close', '6. volume': 'volume'})
    # Drop the dividend and split columns, then convert the string values to floats
    df = df.drop(['7. dividend amount', '8. split coefficient'], axis = 1).astype(float)
    df.index = pd.to_datetime(df.index)
    # Reverse the row order so the oldest date comes first
    df = df.iloc[::-1]
    if start_date:
        df = df[df.index >= start_date]
    return df

msft_hist = get_historical_data('MSFT', '2020-01-01')
msft_hist
Output:
Code Explanation: The first thing we do is define a function named ‘get_historical_data’ that takes the stock’s symbol (‘symbol’) as a required parameter and the starting date of the historical data (‘start_date’) as an optional parameter. As in the previous function, we define the API key and the URL and store them in their respective variables. Next, we extract the historical data in JSON format using the ‘get’ function and store it in the ‘raw_df’ variable. After cleaning and formatting the raw JSON data, we return it as a clean Pandas dataframe, keeping only the rows on or after ‘start_date’ when one is given. Finally, we call the function to pull the historical data of Microsoft from the start of 2020 and store it in the ‘msft_hist’ variable.
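To see the cleaning and date-filtering steps without calling the API, here is a small sketch that runs the same kind of processing on a hand-written payload. The payload only imitates the shape the function above expects, the numbers are made up, and `tidy_daily` is a hypothetical helper name. Unlike the function above, it strips the numeric prefixes generically, so the adjusted close column is named 'adjusted close' rather than 'adj close'.

```python
import pandas as pd

# Made-up sample in the shape the article's function expects (not real prices).
sample = {
    "Time Series (Daily)": {
        "2020-01-03": {"1. open": "10.0", "2. high": "11.0", "3. low": "9.5",
                       "4. close": "10.5", "5. adjusted close": "10.4",
                       "6. volume": "1200", "7. dividend amount": "0.0",
                       "8. split coefficient": "1.0"},
        "2020-01-02": {"1. open": "9.8", "2. high": "10.2", "3. low": "9.6",
                       "4. close": "10.0", "5. adjusted close": "9.9",
                       "6. volume": "1500", "7. dividend amount": "0.0",
                       "8. split coefficient": "1.0"},
        "2019-12-31": {"1. open": "9.5", "2. high": "9.9", "3. low": "9.4",
                       "4. close": "9.8", "5. adjusted close": "9.7",
                       "6. volume": "900", "7. dividend amount": "0.0",
                       "8. split coefficient": "1.0"},
    }
}


def tidy_daily(payload, start_date=None):
    """Turn a daily payload into a float dataframe sorted oldest-first."""
    df = pd.DataFrame(payload["Time Series (Daily)"]).T
    # Strip the numeric prefixes: '5. adjusted close' -> 'adjusted close'.
    df.columns = [name.split(". ", 1)[1] for name in df.columns]
    df = df.drop(columns=["dividend amount", "split coefficient"]).astype(float)
    df.index = pd.to_datetime(df.index)
    df = df.sort_index()
    if start_date:
        df = df[df.index >= pd.Timestamp(start_date)]
    return df


daily = tidy_daily(sample, "2020-01-01")
```

With the start date set to 2020-01-01, the 2019-12-31 row is filtered out and the two remaining rows appear in chronological order.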
Pulling Latest Updates
In this step, we are going to extract the latest updates and information for a given stock using the stock API provided by Alpha Vantage. This step is useful because each call to the function we are going to create returns the most recent quote available for the stock.
Python Implementation:
# LIVE UPDATES
def get_live_updates(symbol):
    # Read the API key as text; open() alone returns a file object, not the key
    with open('api_key.txt') as f:
        api_key = f.read().strip()
    api_url = f'https://www.alphavantage.co/query?function=GLOBAL_QUOTE&symbol={symbol}&apikey={api_key}'
    raw_df = requests.get(api_url).json()
    attributes = {'attributes':['symbol', 'open', 'high', 'low', 'price', 'volume', 'latest trading day', 'previous close', 'change', 'change percent']}
    attributes_df = pd.DataFrame(attributes)
    # Collect the quote values in the order the API returns them
    values = list(raw_df['Global Quote'].values())
    values_df = pd.DataFrame(values).rename(columns = {0:'values'})
    # Pair each attribute name with its value and index the result by attribute
    frames = [attributes_df, values_df]
    df = pd.concat(frames, axis = 1, join = 'inner').set_index('attributes')
    return df

ibm_updates = get_live_updates('IBM')
ibm_updates
Output:
Code Explanation: The structure of this function is similar to that of the previous functions, but the URL of the API changes. The data processing steps are slightly more involved here because the raw JSON is a flat set of key-value pairs rather than a table: we pair a list of attribute names with the values returned by the API. From the output dataframe, we can read the key quote details of a stock: the latest price, the day's open, high, and low, the volume, the latest trading day, the previous close, and the change.
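The function relies on the hard-coded attribute list matching the order of the values in the response. As a sketch of an alternative, the names can be derived from the response keys themselves by stripping their numeric prefixes. The payload below is hand-written with made-up numbers, and its keys are simply the attribute names from the list above with a numeric prefix added, which is an assumption about the response format; `parse_quote` is a hypothetical helper name.

```python
# Made-up sample imitating a 'Global Quote' response (not real market data).
sample_quote = {
    "Global Quote": {
        "01. symbol": "IBM",
        "02. open": "100.00",
        "03. high": "101.50",
        "04. low": "99.20",
        "05. price": "100.75",
        "06. volume": "3500000",
        "07. latest trading day": "2024-01-02",
        "08. previous close": "100.25",
        "09. change": "0.50",
        "10. change percent": "0.4988%",
    }
}


def parse_quote(payload):
    """Map '05. price' style keys to plain names such as 'price'."""
    quote = payload["Global Quote"]
    return {key.split(". ", 1)[1]: value for key, value in quote.items()}


quote = parse_quote(sample_quote)
latest_price = float(quote["price"])
```

Because each name comes from the same key as its value, the two cannot fall out of step if a field is added or reordered.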
Final Thoughts!
In this article, we learned to pull intraday data, historical data, and the latest information of a stock using the stock APIs provided by Alpha Vantage. We explored only a small part of Alpha Vantage’s large collection of stock APIs, which cover many other tasks. Also, we mostly kept the default API parameters while defining the URL of the API, but the API comes with many flexible and customizable parameters. So, it is highly recommended to experiment with the functions we created using different API parameters. That’s it! We have automated one of the more stressful tasks in finance using free stock APIs. I hope you found something useful in this article.
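For that kind of experimenting, it can help to build the URL from a dictionary of parameters rather than editing a long string by hand. Below is a small sketch using only the standard library; `build_url` is a hypothetical helper, 'demo' stands in for a real key, and the parameters shown are the ones already used in this article.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.alphavantage.co/query"


def build_url(function, symbol, api_key, **extra):
    """Assemble a query URL; extra keyword arguments become API parameters."""
    params = {"function": function, "symbol": symbol, "apikey": api_key, **extra}
    return f"{BASE_URL}?{urlencode(params)}"


# 'demo' is a placeholder key used here for illustration.
url = build_url("TIME_SERIES_INTRADAY", "TSLA", "demo",
                interval="5min", outputsize="full")
```

The resulting string can be passed to `requests.get` exactly as in the functions above.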