Zero to Snowflake: Python and Snowflake

Data

Zero to Snowflake: Python and Snowflake

This series takes you from zero to hero with the latest and greatest cloud data warehousing platform, Snowflake.

As a Snowflake user and a Python enthusiast, I was very happy to learn that Snowflake has created its own Python package to connect to Snowflake and execute commands. As these commands are executed locally and transmitted to Snowflake, the full SnowSQL functionality is available for use.

Prerequisites

Before you can install the Python connector for Snowflake, you must first ensure that a supported version of Python is installed. At time of writing, the connector requires either Python 2.7.9 or Python 3.5.0 at a minimum; more recent versions are also supported.

To confirm the version of Python which is installed on a machine, open a terminal and execute the following line:

Python --version

If you do not have Python installed, you can get started by visiting the website for the Python Software Foundation and downloading the relevant packages for your operating system.

If your version of Python is out of date, open a terminal and execute the following line:

Python -m pip install --upgrade pip

Installing the Python Connector for Snowflake

As the Python connector for Snowflake is part of the Python Package Index (PyPI), installing it is a simple case of opening a terminal and executing the following line:

pip install snowflake-connector-Python

It is also advised to install snowflake-alchemy, a package that will enable the creation and execution of SQL code for Snowflake:

pip install snowflake-sqlalchemy

These steps are also advised by Snowflake via the Downloads area of the user interface:

Connecting to Snowflake via Python

Now that the connector is installed, we can connect to Snowflake in a Python IDE. It does not matter which IDE we use. To begin, open your desired IDE to a new Python script and import the Snowflake package:

import snowflake.connector as sf

Three pieces of information are required to establish a connection to Snowflake:

  1. The Snowflake account to connect to, including the region
  2. An existing user within that account
  3. The corresponding password for that user

As Snowflake is a cloud service, each account is accessed via a URL. For example, InterWorks’ account in the EU is accessed via https://myAccount.my-region.snowflakecomputing.com.

This URL is formed by https://<Account>.<Account_Region>.snowflakecomputing.com.

Therefore, we can see that our account is myAccount.my-region, and I will be connecting with the user my_user. To simplify matters, we can create these as variables in Python:

sfAccount = 'myAccount.my-region'
sfUser = 'my_user'

We also take this opportunity to enter in our password. As it is not best practice to hard-code passwords, I will instead make use of the getpass package to enter my password:

sfPswd = ''

# Request user password if not provided already    
if sfPswd == '' :
  import getpass
  sfPswd = getpass.getpass('Password:')

We now have all we need to establish the connection to Snowflake. The following code will attempt to establish the connection. If the connection is established, the Snowflake version number is returned. Otherwise, a message is printed expressing that the connection failed:

# Test the connection to Snowflake by retrieving the version number
try:
    sfConnection = sf.connect(
        user=sfUser,
        password=sfPswd,
        account=sfAccount
    )
    sfq = sfConnection.cursor()
    sfq.execute("SELECT current_version()")
    sfResults = sfq.fetchall()
    print('Snowflake Version: ' + sfResults[0][0])
    sfq.close()
    sfConnection.close()
except:
    print('Connection failed. Check credentials')

With our engine tested and working, we can leverage it to establish a connection with Snowflake and begin executing SQL commands:

# Open connection to Snowflake
sfConnection = sf.connect(
    user=sfUser,
    password=sfPswd,
    account=sfAccount
)
sfq = sfConnection.cursor()

Our variable sfq is used for executing Snowflake queries. Any SQL command string we pass through sfq.execute() will be executed in our Snowflake environment. What follows are some examples in which we execute basic commands to create a database, a schema and an internal named stage:

sfq.execute('CREATE DATABASE DEMO_DB')
sfq.execute('USE DATABASE DEMO_DB')
sfq.execute('CREATE SCHEMA DEMO_SCHEMA')
sfq.execute('USE DEMO_DB.DEMO_SCHEMA')
sfq.execute('CREATE STAGE DEMO_STAGE')

As you may have noticed, each time we are simply passing a string to our sfq.execute() function. This opens up the potential to replace repeated pieces with variables and create a more automatic script:

sfDatabase = 'DEMO_DB'
sfSchema = 'DEMO_SCHEMA'
sfStage = 'DEMO_STAGE'

sfq.execute('CREATE DATABASE {0}'.format(sfDdatabase))
sfq.execute('USE DATABASE {0}'.format(sfDatabase))
sfq.execute('CREATE SCHEMA {0}'.format(sfSchema))
sfq.execute('USE {0}.{1}'.format(sfDatabase, sfSchema))
sfq.execute('CREATE STAGE {0}'.format(sfStage))

Taking this one step further, we can simplify our process whilst turning it into a function. This  function can then be called to automatically create a database and a schema. Below is a complete function to create a new database and schema:

def CreateSnowflakeDBandSchema (
    sfPswd = '',
    sfAccount = 'myAccount.my-region',
    sfUser = 'my_user',
    sfDatabase = 'DEMO_DB',
    sfSchema = 'DEMO_SCHEMA'
):
    import snowflake.connector as sf

    # Request user password if not provided already
    if sfPswd == '' :
      import getpass
      sfPswd = getpass.getpass('Password:')

    # Test the connection to Snowflake
    try:
      sfConnection = sf.connect(
          user=sfUser,
          password=sfPswd,
          account=sfAccount
      )
      sfq = sfConnection.cursor()
      # sfq.execute("SELECT current_version()")
      # sfResults = sfq.fetchall()
      # print('Snowflake Version: ' + sfResults[0][0])
      sfq.close()
      sfConnection.close()
    except:
      print('Connection failed. Check credentials')

    # Open connection to Snowflake
    sfConnection = sf.connect(
      user=sfUser,
      password=sfPswd,
      account=sfAccount
    )

    sfq = sfConnection.cursor()

    sfq.execute('CREATE DATABASE IF NOT EXISTS {0}'.format(sfDatabase))
    sfq.execute('CREATE SCHEMA IF NOT EXISTS {0}.{1}'.format(sfDatabase, sfSchema))

    print('Steps complete')

With this function created, we can call it and pass in new values for the variables. This allows us to automate the process. You could go further by passing in your password and looping through a series of desired new databases and/or schemas:

CreateSnowflakeDBandSchema (
    sfAccount = 'myAccount.my-region',
    sfUser = 'my_user',
    sfDatabase = 'ANOTHER_DEMO_DB',
    sfSchema = 'ANOTHER_DEMO_SCHEMA'
)

We now have all the tools required to leverage Python to automate Snowflake commands. If you fancy a challenge, try building a Python function to automate a set of tasks you regularly execute in Snowflake.

Unlocking New Snowflake Potential with Python

My first challenge was to build a function that automatically reads the metadata from a local CSV file and creates a matching table in Snowflake, proceeding to then upload the file to a new internal named stage and load the data into the table. This only scratches the surface of what you can do with Python and Snowflake, and yet it has saved me a lot of time over the last few months when handling bespoke problems that require data uploads. Another challenge is to read in a list of new users and create each of them with the correct security permissions in Snowflake.

Let me know if you come up with any other inventive ways to leverage Python with Snowflake!

KeepWatch by InterWorks

Whether you need support for one platform or many, our technical experts have you covered.

More About the Author

Chris Hastie

Data Lead
Querying Stock Data with an API Query Profile in Matillion ETL Matillion ETL is a GUI-based orchestration and transformation tool for data movement that has many functionalities, including querying ...
Snowflake External Access: Retrieve Data from an API Recently, Snowflake released external access integrations into general availability. This allows Snowflake admins to create secure ...

See more from this author →

InterWorks uses cookies to allow us to better understand how the site is used. By continuing to use this site, you consent to this policy. Review Policy OK

×

Interworks GmbH
Ratinger Straße 9
40213 Düsseldorf
Germany
Geschäftsführer: Mel Stephenson

Kontaktaufnahme: markus@interworks.eu
Telefon: +49 (0)211 5408 5301

Amtsgericht Düsseldorf HRB 79752
UstldNr: DE 313 353 072

×

Love our blog? You should see our emails. Sign up for our newsletter!