Zero to Snowflake: Python and Snowflake

Data

Zero to Snowflake: Python and Snowflake

This series takes you from zero to hero with the latest and greatest cloud data warehousing platform, Snowflake.

As a Snowflake user and a Python enthusiast, I was very happy to learn that Snowflake has created its own Python package to connect to Snowflake and execute commands. As these commands are executed locally and transmitted to Snowflake, the full SnowSQL functionality is available for use.

Prerequisites

Before you can install the Python connector for Snowflake, you must first ensure that a supported version of Python is installed. At time of writing, the connector requires either Python 2.7.9 or Python 3.5.0 at a minimum; more recent versions are also supported.

To confirm the version of Python which is installed on a machine, open a terminal and execute the following line:

Python --version

If you do not have Python installed, you can get started by visiting the website for the Python Software Foundation and downloading the relevant packages for your operating system.

If your version of Python is out of date, open a terminal and execute the following line:

Python -m pip install --upgrade pip

Installing the Python Connector for Snowflake

As the Python connector for Snowflake is part of the Python Package Index (PyPI), installing it is a simple case of opening a terminal and executing the following line:

pip install snowflake-connector-Python

It is also advised to install snowflake-alchemy, a package that will enable the creation and execution of SQL code for Snowflake:

pip install snowflake-sqlalchemy

These steps are also advised by Snowflake via the Downloads area of the user interface:

Connecting to Snowflake via Python

Now that the connector is installed, we can connect to Snowflake in a Python IDE. It does not matter which IDE we use. To begin, open your desired IDE to a new Python script and import the Snowflake package:

import snowflake as sf

Three pieces of information are required to establish a connection to Snowflake:

  1. The Snowflake account to connect to, including the region
  2. An existing user within that account
  3. The corresponding password for that user

As Snowflake is a cloud service, each account is accessed via a URL. For example, InterWorks’ account in the EU is accessed via https://myDemoAccount.my-region.snowflakecomputing.com.

This URL is formed by https://<Account>.<Account_Region>.snowflakecomputing.com.

Therefore, we can see that our account is myDemoAccount.my-region, and I will be connecting with the user MyDemoUser. To simplify matters, we can create these as variables in Python:

sfAccount = 'myDemoAccount.my-region'
sfUser = 'MyDemoUser'

We also take this opportunity to enter in our password. As it is not best practice to hard-code passwords, I will instead make use of the getpass package to enter my password:

sfPswd = "

# Request user password if not provided already
if sfPswd == "
  import getpass
  sfPswd = getpass getpass('Password:')

We now have all we need to establish the connection to Snowflake. The following code will attempt to establish the connection. If the connection is established, the Snowflake version number is returned. Otherwise, a message is printed expressing that the connection failed:

# Test the connection to Snowflake by retrieving the version number
from sqlalchemy import create_engine
engine = create_engine(
  'snowflake://{user}:{password}@{account}/' format(
    user = sfUser,
    password = sfPswd,
    account - sfAccount,
  )
)
try:
  connection = engine.connect()
  results = connection.execute('select current_version()').fetchone()
  print('Snowflake Version:'+ results[0])
  connection.close()
except:
 print('Connection failed, check credentials')
finally:
 engine.dispose()

With our engine tested and working, we can leverage it to establish a connection with Snowflake and begin executing SQL commands:

# Open connection to Snowflake
con = sf.connector.connect(
  user = sfUser,
  password = sfPswd,
  account = sfAccount,
)

sfq = con.cursor()

Our variable sfq is used for executing Snowflake queries. Any SQL command string we pass through sfq.execute() will be executed in our Snowflake environment. What follows are some examples in which we execute basic commands to create a database, a schema and an internal named stage:

sfq.execute('CREATE DATABASE DEMO_DB')
sfq.execute('USE DATABASE DEMO_DB')
sfq.execute('CREATE SCHEMA DEMO_SCHEMA')
sfq.execute('USE DEMO_DB.DEMO_SCHEMA')
sfq.execute('CREATE STAGE DEMO_STAGE')

As you may have noticed, each time we are simply passing a string to our sfq.execute() function. This opens up the potential to replace repeated pieces with variables and create a more automatic script:

sfDatabase = 'DEMO_DB'
sfSchema = 'DEMO_SCHEMA'
sfStage = 'DEMO_STAGE'

sfq.execute('CREATE DATABASE {0}'.format(sfDdatabase))
sfq.execute('USE DATABASE {0}'.format(sfDatabase))
sfq.execute('CREATE SCHEMA {0}'.format(sfSchema))
sfq.execute('USE {0}.{1}'.format(sfDatabase, sfSchema))
sfq.execute('CREATE STAGE {0}'.format(sfStage))

Taking this one step further, we can simplify our process whilst turning it into a function. This  function can then be called to automatically create a database and a schema. Below is a complete function to create a new database and schema:

def CreateSnowflakeDBandSchema(
  sfPswd ="
  sfAccount ='myDemoAccount.my-region'
  sfUser = 'MyDemoUser'
  sfDatabase = 'DEMO_DB'
  sfSchema = 'DEMO_SCHEMA'
):
  import snowflake as sf

# Request user password if not provided already
if sfPswd == "
  import getpass
  sfPswd = getpass getpass('Password:')

# Test the connection to Snowflake
from sqlalchemy import create_engine
engine = create_engine(
  'snowflake://{user}:{password}@{account}/' format(
    user = sfUser,
    password = sfPswd,
    account = sfAccount,
  )
)
try:
  connection = engine.connect()
  result = connection.execute('select current_version()').fetchone()
  # Comment out our printed version as it is no longer needed
  # print('Snowflake Version:' + results[0])
  connection.close()
except:
  print('Connection failed, check credentials')
  return # Exit function if connection failed
finally:
  engine.dispose()

# Open connection to Snowflake
con = sf.connector.connect(
  user = sfUser,
  password = sfPswd,
  account = sfAccount,
)

sfq = con.cursor()

sfq.execute('CREATE DATABASE IF NOT EXISTS {0}'.format(sfDdatabase))
sfq.execute('CREATE SCHEMA IF NOT EXISTS {0}.{1}'format(sfDatabase, sfSchema))

print('Steps complete')

With this function created, we can call it and pass in new values for the variables. This allows us to automate the process. You could go further by passing in your password and looping through a series of desired new databases and/or schemas:

CreateSnowflakeDBandSchema(
  sfAccount = 'myDemoAccount.my-region'
  sfUser = 'MyDemoUser'
  sfDatabase = 'ANOTHER_DEMO_DB'
  sfSchema = 'ANOTHER_DEMO_SCHEMA'
)

We now have all the tools required to leverage Python to automate Snowflake commands. If you fancy a challenge, try building a Python function to automate a set of tasks you regularly execute in Snowflake.

Unlocking New Snowflake Potential with Python

My first challenge was to build a function that automatically reads the metadata from a local CSV file and creates a matching table in Snowflake, proceeding to then upload the file to a new internal named stage and load the data into the table. This only scratches the surface of what you can do with Python and Snowflake, and yet it has saved me a lot of time over the last few months when handling bespoke problems that require data uploads. Another challenge is to read in a list of new users and create each of them with the correct security permissions in Snowflake.

Let me know if you come up with any other inventive ways to leverage Python with Snowflake!

More About the Author

Chris Hastie

Data Engineer
Zero to Snowflake: The Nuances of Cloning One of the fantastic features in Snowflake is the support of zero-copy cloning. This is the capability to duplicate an object while ...
Zero to Snowflake: Multi-Threaded Bulk Loading with Python Earlier in this series we have discussed how to load data into Snowflake using several different methods: Snowflake’s user ...

See more from this author →

Subscribe to our newsletter

  • I understand that InterWorks will use the data provided for the purpose of communication and the administration my request. InterWorks will never disclose or sell any personal data except where required to do so by law. Finally, I understand that future communications related topics and events may be sent from InterWorks, but I can opt-out at any time.
  • This field is for validation purposes and should be left unchanged.

InterWorks uses cookies to allow us to better understand how the site is used. By continuing to use this site, you consent to this policy. Review Policy OK

×

Interworks GmbH
Ratinger Straße 9
40213 Düsseldorf
Germany
Geschäftsführer: Mel Stephenson

Kontaktaufnahme: markus@interworks.eu
Telefon: +49 (0)211 5408 5301

Amtsgericht Düsseldorf HRB 79752
UstldNr: DE 313 353 072