
NASDAQ Per Minute Data Using Python

If analysis is the body, data is the soul.

In our project of stock market analysis based on Twitter sentiments, we selected a few sample companies. We decided what we needed but we still had to cast some spells to get to the core data.

As much as we dream about attending Hogwarts, we can’t simply say ‘Accio’ to make the data come to us. Computers don’t understand the spells of the wizarding world yet, so we will just create some spells of our own using Python.

Now that we’ve chosen our language, let’s dive into coding. To get the data, I need an API which can provide me with reliable NASDAQ data. After numerous Google searches, I found one named ‘Alpha Vantage’. It provides a simple, easy-to-use API to retrieve roughly the last 10 days of per-minute data, which is good enough to go ahead with.

Now, with this API, I can either download the JSON file every day by changing the API call by hand, or write some code to get me the data automatically. Downloading it every day is one way, but I chose the latter.

“To run this script with others, I required reliable data storage services as well as good transfer rates. Moreover, I wanted a service which could help me set up my machine with no effort. I had read about CloudSigma in my Cloud Computing classes, so I knew it was reliable, and I decided to run the script on CloudSigma. I could set up the server within seconds and got complete support from the 24/7 Live Chat feature.”

— Akshay Nagpal

Firstly, to fetch the data from the API, I will write a simple function using Python’s ‘urllib’ library. I have removed my API key from the code, but one can be obtained for free from Alpha Vantage’s website.
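The original code is not reproduced here, so below is a minimal sketch of such a fetch function using Python 3’s `urllib.request`. The function name `fetch_json_string`, the query parameters (symbol, interval), and the `YOUR_API_KEY` placeholder are my own illustrative choices, not the author’s exact code:

```python
import urllib.request

# Illustrative Alpha Vantage intraday query; substitute your own free API key.
API_URL = ("https://www.alphavantage.co/query"
           "?function=TIME_SERIES_INTRADAY&symbol=MSFT&interval=1min"
           "&outputsize=full&apikey=YOUR_API_KEY")

def fetch_json_string(url=API_URL):
    """Open the URL, read the raw bytes, and return them decoded as UTF-8."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")
```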

This function fetches the page that serves the JSON content, reads the raw bytes, and decodes them as UTF-8. It returns the entire JSON content of the page as a single string.

Now that I have the string, I need to work with the JSON content it holds. For that, I can use the `json` module’s `loads` function, which parses a JSON string into a Python dictionary.
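As a sketch, here is `json.loads` applied to a trimmed fragment shaped like Alpha Vantage’s documented intraday response (the keys `"Time Series (1min)"` and `"4. close"` follow its published format; the timestamp and price are made up):

```python
import json

# A trimmed example of the intraday payload returned by the API.
raw = '{"Time Series (1min)": {"2021-03-05 16:00:00": {"4. close": "132.05"}}}'

quotes = json.loads(raw)  # str -> nested dict keyed by timestamp
print(quotes["Time Series (1min)"]["2021-03-05 16:00:00"]["4. close"])  # 132.05
```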

Now that I have the JSON in a dictionary, I send the data to another function, partitionSave, which stores my files on the filesystem according to the date of the stock prices.
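The original partitionSave is not shown here, but a minimal version consistent with its description might look like this. The folder layout, the `quotes.jsonl` file name, and the `base_dir` default are my own assumptions:

```python
import json
import os

def partitionSave(json_string, base_dir="nasdaq_data"):
    """Append each per-minute quote to a file inside a folder named
    after its date, e.g. nasdaq_data/2021-03-05/quotes.jsonl."""
    data = json.loads(json_string)
    for timestamp, quote in data.get("Time Series (1min)", {}).items():
        date = timestamp.split(" ")[0]            # "2021-03-05 16:00:00" -> "2021-03-05"
        day_dir = os.path.join(base_dir, date)
        os.makedirs(day_dir, exist_ok=True)       # create missing directories
        with open(os.path.join(day_dir, "quotes.jsonl"), "a") as f:
            f.write(json.dumps({timestamp: quote}) + "\n")
```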

This function parses the data and creates folders according to the date in each quote’s timestamp. Once it ensures that all the directories exist, it moves on to writing the data to the filesystem.

Finally, all my spells are ready. I just write the main method to make them work together.
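This is not the author’s exact script, but an end-to-end sketch of how the pieces fit together in one `main` function; the `url` and `base_dir` parameters, the query string, and the file layout are illustrative assumptions:

```python
import json
import os
import urllib.request

# Illustrative intraday query; substitute your own free Alpha Vantage key.
API_URL = ("https://www.alphavantage.co/query"
           "?function=TIME_SERIES_INTRADAY&symbol=MSFT&interval=1min"
           "&outputsize=full&apikey=YOUR_API_KEY")

def main(url=API_URL, base_dir="nasdaq_data"):
    # Fetch the JSON, decode it, and file every quote under its date folder.
    with urllib.request.urlopen(url) as resp:
        data = json.loads(resp.read().decode("utf-8"))
    for timestamp, quote in data.get("Time Series (1min)", {}).items():
        day_dir = os.path.join(base_dir, timestamp.split(" ")[0])
        os.makedirs(day_dir, exist_ok=True)
        with open(os.path.join(day_dir, "quotes.jsonl"), "a") as f:
            f.write(json.dumps({timestamp: quote}) + "\n")

# Calling main() performs one collection pass; scheduling it (e.g. with cron)
# keeps the per-minute history topped up.
```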

Now, I get per-minute data for the last 10 days in a single call. I have used this setup for several weeks now to collect data from NASDAQ and NSE, and it’s running flawlessly.

If you are also interested in the sentiment analysis, you can find the complete code here: GitHub.