Imagine this scenario: You’re an entrepreneur who has built a successful mobile gaming company offering a high-tech experience at parties and events. Now it’s time to expand. You’ve seen more than one episode of “Shark Tank” and have learned you must bring data to the table when pitching to investors. How do you go about finding and analyzing relevant information? That’s where InterWorks can help.
Approaching the Problem
An open-ended question like this is best solved by asking successively more detailed questions to narrow down the analysis at each step. Tableau can help us go through this process very rapidly, but we first need to gather some data. For that, we turn to the most openly accessible social data out there: the Twitter stream.
When I began this piece, the 2014 League of Legends World Championship had just wrapped up. Tweets mentioning this championship are relevant to the target market and in large supply. Those comfortable writing code can pull tweets and their metadata directly from the Twitter API. The rest of us can use the Twitter Search tool in Alteryx.
Importing Tweets and Analyzing Sentiment in Alteryx
We could start by looking for tweets mentioning the phrase “League of Legends,” but we also want to capture the #LeagueOfLegends hashtag. There may be other relevant search terms, but this should be a good start. After setting the options in each Twitter Search tool, add a formula field that records which search term produced each tweet, then bring the tweets together using the Union tool.
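For readers working outside Alteryx, the tag-and-combine step can be sketched in a few lines of Python. This is only an illustration of the idea, not the workflow from the post: the tweet records, the `search_term` field name and the `tag_and_union` helper are all hypothetical.

```python
# Hypothetical tweet records; real ones would come from a Twitter search.
results_by_term = {
    "League of Legends": [
        {"id": 1, "text": "League of Legends finals were amazing"},
        {"id": 2, "text": "Watching League of Legends tonight"},
    ],
    "#LeagueOfLegends": [
        {"id": 3, "text": "What a match! #LeagueOfLegends"},
    ],
}


def tag_and_union(results_by_term):
    """Label each tweet with the search term that found it, then stack the sets."""
    combined = []
    for term, tweets in results_by_term.items():
        for tweet in tweets:
            tagged = dict(tweet)  # copy so the input records stay untouched
            tagged["search_term"] = term  # plays the role of the formula field
            combined.append(tagged)
    return combined


combined = tag_and_union(results_by_term)
for row in combined:
    print(row["search_term"], "|", row["text"])
```

In this sketch, a tweet that matches both search terms would appear once per term, so counts taken from the combined list would include it twice unless duplicates are removed by tweet id.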
Next, we’ll use R integration in Alteryx to handle the sentiment analysis. Sentiment is normally determined by matching words against a lexicon that scores individual words as carrying a “positive” or “negative” emotion. To get an appropriate score, we first need to clean up the tweet text by converting all words to lowercase and stripping out punctuation marks. That is easily done with a Formula tool and a RegEx tool. Inputs are shown in the screenshot below:
The prepared data will feed into the R tool as shown. Alteryx provides several predefined analysis types with the Predictive Tools installation, but sentiment analysis will require some custom code as shown below:
For this code to execute, you’ll need to install the Rstem and sentiment packages outside of Alteryx. Depending on your installation, you may also need to specify the package location inside the library() command.
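To make the cleaning and lexicon-matching idea concrete, here is a minimal Python sketch. It is not the R code from the workflow and does not reproduce what the Rstem or sentiment packages do internally. The eight-word `LEXICON` and both function names are assumptions made for illustration; a real lexicon would contain thousands of scored words.

```python
import re

# Tiny illustrative lexicon (an assumption, not a published word list).
LEXICON = {
    "love": 1, "great": 1, "amazing": 1, "awesome": 1,
    "hate": -1, "boring": -1, "terrible": -1, "awful": -1,
}


def clean_text(text):
    """Lowercase the tweet, replace punctuation with spaces, collapse whitespace."""
    lowered = text.lower()
    no_punct = re.sub(r"[^\w\s]", " ", lowered)
    return re.sub(r"\s+", " ", no_punct).strip()


def sentiment_score(text):
    """Sum the lexicon scores of the cleaned words; unknown words count as 0."""
    return sum(LEXICON.get(word, 0) for word in clean_text(text).split())


print(clean_text("AMAZING finals!!! #LeagueOfLegends"))
print(sentiment_score("I LOVE this, great game!"))
print(sentiment_score("So boring... terrible match"))
```

Cleaning matters here because "LOVE" and "great," with its trailing comma would otherwise miss their lexicon entries and score zero.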
Using Tableau to Narrow the Focus
The Twitter API gives us location information, which must be split into city and state using the “text to columns” feature in Alteryx or Excel. Once that’s done, the data is ready for analysis in Tableau. By dragging the City field onto the map along with the Number of Records (tweets), I can immediately see places where people are actively discussing the championship.
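The city/state split amounts to breaking each location string on its comma. A minimal Python sketch of that step follows, assuming locations arrive in a "City, ST" form; the `split_location` helper is hypothetical, and anything that does not fit the pattern is passed through with no state.

```python
def split_location(location):
    """Split 'City, ST' into (city, state); return (location, None) otherwise."""
    parts = [part.strip() for part in location.split(",")]
    if len(parts) == 2 and parts[0] and parts[1]:
        return parts[0], parts[1]
    # Anything else (no comma, extra commas, empty pieces) is left unsplit.
    return location.strip(), None


for raw in ["Los Angeles, CA", " Costa Mesa ,CA ", "Earth"]:
    print(split_location(raw))
```

Rows that come back with no state are worth reviewing by hand before mapping, since user-entered locations are often not in a clean city and state form.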
Next, I can color the circles based on the sentiment score by using a calculated field with the following formula:
By dragging that onto the Color shelf and adding annotations, I now have a map that can help me narrow down which cities to analyze further. It’s clear that the existing location for our scenario has an active gaming community with positive sentiments. Los Angeles and San Francisco stand out as well. Interestingly, neighboring cities like Costa Mesa show more positive sentiments than LA itself.
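The map combines two aggregates per city: the number of tweets (circle size) and a sentiment measure (color). The Python sketch below shows one way such a summary could be computed. It is not the calculated field used in the workbook. The sample rows are made up, and the rule of labeling by the sign of the average score is an assumption.

```python
from collections import defaultdict

# Made-up rows for illustration; "score" is a per-tweet sentiment score.
tweets = [
    {"city": "City A", "score": 1},
    {"city": "City A", "score": 1},
    {"city": "City A", "score": -1},
    {"city": "City B", "score": -1},
    {"city": "City B", "score": 0},
]


def summarize_by_city(tweets):
    """Return tweet count, average score and a sentiment label for each city."""
    totals = defaultdict(lambda: {"tweets": 0, "score_sum": 0})
    for tweet in tweets:
        totals[tweet["city"]]["tweets"] += 1
        totals[tweet["city"]]["score_sum"] += tweet["score"]

    summary = {}
    for city, agg in totals.items():
        avg = agg["score_sum"] / agg["tweets"]
        # Assumed rule: label by the sign of the average score.
        if avg > 0:
            label = "positive"
        elif avg < 0:
            label = "negative"
        else:
            label = "neutral"
        summary[city] = {"tweets": agg["tweets"], "avg_score": avg, "sentiment": label}
    return summary


summary = summarize_by_city(tweets)
for city, stats in summary.items():
    print(city, stats)
```

Keeping the tweet count next to the label helps when reading the map, because an average built from only a handful of tweets is a much weaker signal than one built from hundreds.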
Cross-Checking the Results
Savvy investors may not fully trust this single point of analysis. Good science depends on multiple, independent lines of inquiry. What else could we check to verify the conclusions above? A mobile gaming business would cater to events like League of Legends viewing parties, so we’ll start there.
I used Import.io to gather the locations of viewing parties from websites listing event details. The lists had full addresses, which Tableau can’t automatically recognize and geocode. What now? One fast, easy way to create a map from addresses is with Google Fusion Tables, which sends the addresses through Google’s algorithm for finding and plotting lat/long coordinates. With a few simple steps, outlined here, we now have a Google Map.
Seeing the clusters in the same cities identified on the tweet map, we now have independent confirmation that San Francisco and the Greater Los Angeles Area are prime markets to look into. For further confirmation, we could repeat the whole process above with different gaming topics and events.
Final Site Selection
Now that we’ve narrowed it down to two cities, we can start looking for specific sites to use as a base of operations. Here is an outline for a data-driven approach to site selection:
- Gather a list of sites available for lease or purchase
- Conduct a drive time analysis to discover areas within a reasonable distance
- Explore the demographics and consumer spending habits of those areas
- Compare data from your current market to that of the target markets
The details of how to go about that analysis will be covered in a future post.
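As a rough first pass before a true drive time analysis, candidate sites can be pre-filtered by straight-line distance. The Python sketch below uses the haversine great-circle formula, which is a stand-in and not a drive-time calculation, since it ignores roads and traffic. The site names are hypothetical and the coordinates are approximate values chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sites_within(sites, center, max_km):
    """Names of the sites whose straight-line distance from center is at most max_km."""
    return [
        name
        for name, (lat, lon) in sites.items()
        if haversine_km(center[0], center[1], lat, lon) <= max_km
    ]


# Hypothetical candidate sites with approximate coordinates.
center = (34.05, -118.24)  # roughly central Los Angeles
sites = {
    "Site A": (33.64, -117.92),
    "Site B": (34.10, -118.30),
    "Site C": (37.77, -122.42),
}
print(sites_within(sites, center, max_km=25))
```

A filter like this only shrinks the candidate list. Actual drive times can differ a great deal from straight-line distance, so the shortlisted sites still need a routing-based check.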
There is a saying among photographers: “The best camera is the one you have with you.” The same can be said of data analysis software. This analysis incorporates Alteryx, R, Tableau, Import.io and Google Fusion Tables to show how we can use a range of available tools to get a complete picture. In a very short time, we have:
- Imported tweets about a popular gaming event
- Analyzed tweet sentiment
- Narrowed our focus by visualizing tweet locations and sentiments
- Confirmed results by importing and visualizing event listings
With this and further site-specific analysis, we can approach investors with confidence, bringing solid data and sharp visuals to a presentation about market potential. Who knows? Maybe you’ll be the next one on TV getting Mark Cuban and Kevin O’Leary to fight over who gets to invest in your winning ideas.