Access Data from EXASOL Using ODBC

Unlike JDBC, ODBC is language-independent. ODBC is an open interface that can be used to communicate with databases. All you should do is create an ODBC connection to EXASOL database in your system. For windows, follow these steps to add ODBC creation. Here is an example of using EXASOL data for predictive analysis in R tool.

Steps to Create ODBC Connection

Click on the Start Menu
Select Control Panel
Select Administrative Tools and double-click the ODBC Data Sources (either 32-bit or 64-bit) icon
Click on the System DSN tab
Click the Add button
Select EXASolution Driver from the driver’s list and click the Finish button

ODBC Data Source Administrator

EXASOL Data Retrieval

How To Import Data Into EXASOL

In this example, I am using two datasets: train and test. Train dataset is used to build a predictive model and test dataset is used to validate or test the model built. Do not panic if you do not have datasets handy in EXASOL. All you should do is use import statements to load data from your local machine. An import statement in EXASOL can get you gigabytes of data in a few minutes.

I used the import statement for the first time and it was so fast. It seemed like the dataset was already there in EXASOL and I didn’t need to do any importing. Remember to create tables in EXASOL as per the dataset before importing the data.

I already have the train.csv and test.csv file in C: drive and I am simply giving the path to import. Since it is a CSV file, column separator is “comma.”

IMPORT INTO retail.TRAIN FROM LOCAL CSV FILE ‘C:train.csv’

ENCODING = ‘UTF-8’

ROW SEPARATOR = ‘CRLF’

COLUMN SEPARATOR = ‘,’

COLUMN DELIMITER = ‘”’

SKIP = 1

REJECT LIMIT 0;



IMPORT INTO retail.TEST FROM LOCAL CSV FILE ‘C:test.csv’

ENCODING = ‘UTF-8’

ROW SEPARATOR = ‘CRLF’

COLUMN SEPARATOR = ‘,’

COLUMN DELIMITER = ‘”’

SKIP = 1

REJECT LIMIT 0;

EXASOL Data To Build Predictive Models In R

First, you need to install EXASOL package in R. The package can be found on GitHub. This example demonstrates the use of the random forest technique to predict image id in test data given the pixel input.

# packages
install.packages("RODBC")

library(RODBC)

install.packages("randomForest")

library(randomForest)

install.packages("devtools")

devtools::install_github("exasol/r-exasol")

Connect to EXASOL using the ODBC connection that was made in the first step and read the data from tables in EXASOL.

# connect

C <- odbcConnect("exa_db")

# read data

train <- exa.readData(C, "SELECT * FROM retail.TRAIN")

test  <- exa.readData(C, "SELECT * FROM retail.TEST")

Remove image id from the test data since we are using this data to predict values of image id using the model. Also, extract the labels from the train data. “Labels” in the train data is simply the target variable.

# remove imageid from test

test     <- test[order(test[1]),]

imageid  <- as.factor(test[,1])

test     <- test[,-1]

# extract labels from train
labels <- as.factor(train[,1])

train  <- train[,-1]

I am using the random forest technique to build 500 trees. Random forests correct overfitting problems created by decision trees, because random forests are an ensemble method of classification that operates by constructing many decision trees.

# random forest algorithm
model  <- randomForest(train, labels, xtest=test, ntree=500, keep.forest=TRUE)

predictions <- cbind(imageid ,levels(labels)[rf$test$predicted])

# print prediction
predictions
# print forest
model

Random Forest Algorithm technique

You can also save the model built locally and reuse it with a different dataset.

# save the model locally
save(model,file="model")

# reload and reuse
load("model")
predict(model,test)

The model turned out to have a misclassification rate of 60%. We can see that the predicted values for image id four to nine are wrong. The model could definitely be trained in conjunction with larger datasets. This is where EXASOL comes in handy since we can load and access huge datasets within mere seconds.

SIGMA RAPID START

Access Data from EXASOL Using ODBC

Access Data from EXASOL Using ODBC

Steps to Create ODBC Connection

How To Import Data Into EXASOL

EXASOL Data To Build Predictive Models In R

Related

How To Leverage AI To Become a World-Class Dev Team

Switching your analytics to Sigma?
Speed up success with a Rapid Start!

Archana Chinnaswamy

Access Data from EXASOL Using ODBC

Access Data from EXASOL Using ODBC

Steps to Create ODBC Connection

How To Import Data Into EXASOL

EXASOL Data To Build Predictive Models In R

Related

How To Leverage AI To Become a World-Class Dev Team

Switching your analytics to Sigma? Speed up success with a Rapid Start!

Archana Chinnaswamy

Company Information

Tax Information

Payment Options

Other Information

Terms

Switching your analytics to Sigma?
Speed up success with a Rapid Start!