Import.io is the best tool for scraping structured data off the web. Alteryx is the best tool for self-service data analysis. It doesn’t take long to realize that we should put the two together.
Robert Rouse built a handy Import.io web connector of his own.
Setup
To install the InterWorks macro, unzip the attached folder to the location on your computer where you want the macros to be permanently saved. I suggest: Documents\My Alteryx Macros\InterWorks Macros. Then, run the installer wizard Install.yxwz and choose Install. Once you restart Alteryx, the macros will be available on your toolbar just like any other tool.
You will also need to set up a free account at http://import.io. Once you’ve signed up, the account page will show your API key.
Automatically Extract Web Data
Import.io provides easy access to web data through its Magic API. On the homepage, enter any URL and the Magic API will attempt to extract structured data. As an example, try using the InterWorks People page.
Import.io will return a structured table of the InterWorks employee directory. To do the same thing in Alteryx, use the InterWorks connector, enter your API key, select Magic API in the tool configuration and enter the URL. The tool will then return the JSON data extracted from the webpage right into your Alteryx module.
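If you are curious what turning that JSON into a table involves, here is a minimal Python sketch of the idea. It is not the macro's actual logic: the `results` key, the `json_to_rows` helper and the sample records are all made up for illustration, and a real response may be shaped differently.

```python
import json


def json_to_rows(payload, results_key="results"):
    """Turn a JSON document holding a list of records into uniform table rows.

    `results_key` is an assumed name for the list of records; adjust it to
    match whatever the real response uses.
    """
    records = json.loads(payload).get(results_key, [])

    # Collect every column name in first-seen order so ragged records line up.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)

    # Fill missing fields with None so every row has the same columns.
    rows = [{col: record.get(col) for col in columns} for record in records]
    return columns, rows


# Made-up sample data: the second record is missing the "title" field.
sample = '{"results": [{"name": "Ada", "title": "Analyst"}, {"name": "Bo"}]}'
columns, rows = json_to_rows(sample)
print(columns)
for row in rows:
    print(row)
```

The point of the column pass is that scraped records are often ragged. Padding the gaps keeps each row the same width, which is what a tabular tool like Alteryx expects downstream.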
Using the Connector API in Alteryx
The Magic API works wonders, but sometimes it is unable to find exactly what you are looking for. In this case, you may need to train an extractor or connector. An extractor allows you to build a custom tool to scrape data from similarly structured web pages. A connector is an extractor with a macro attached, which lets you record actions, such as running a search on the page, before extracting data. Building a connector is easy, but for this example, we will use one that has already been created.
Robert Rouse, in another blog post, has provided a connector that allows us to pull data from a formatted Wikipedia table.
To use this connector in Alteryx, enter your API key, the connector ID and the input variable.
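Those three values are everything a web request to a connector needs. The Python sketch below shows one way they could be combined into a query URL. Treat every specific as an assumption: the base URL is a placeholder, and the `_query` path, the `input` and `_apikey` parameter names and the `name:value` input format are illustrative only, not taken from Import.io documentation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_connector_query(base_url, connector_id, api_key, input_name, input_value):
    """Assemble a query URL from the three values the tool asks for.

    The path layout and parameter names are hypothetical; check the service's
    own documentation for the real ones.
    """
    params = urlencode({
        "input": f"{input_name}:{input_value}",  # assumed "name:value" input format
        "_apikey": api_key,                      # assumed parameter name
    })
    return f"{base_url}/{connector_id}/_query?{params}"


url = build_connector_query(
    "https://api.example.com/connector",  # placeholder base URL
    "YOUR-CONNECTOR-ID",
    "YOUR-API-KEY",
    "page",
    "Example table page",
)
print(url)

# Round-trip check: decoding the query string recovers the original values.
decoded = parse_qs(urlsplit(url).query)
print(decoded["input"][0])
```

Using `urlencode` rather than gluing strings together matters here, because input values such as page titles often contain spaces or punctuation that must be escaped.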
Try It Out
We’re excited to bring simple data scraping capabilities together with our favorite data analytics tools. InterWorks is always looking for feedback and ways that we can improve our tools to better help our clients. Create a connector, try out the macro and tell us what you think!