As Snowflake becomes more and more feature-rich, we’ve become surrounded by all manner of suffixes to the word “Snow.” In this post, I’ll summarise the available tools and their intended uses. This is only meant to be a brief overview, so I’ve provided links to learn more about each feature.
Snowsight
We’ll start with the tool in which most users will spend the majority of their time. This is Snowflake’s GUI (Graphical User Interface) accessed via Safari, Chrome, Edge or Firefox (latest three major versions supported). Within Snowsight sit a number of helpful features and sections:
- Worksheets and Notebooks, in which you can write, execute and save SQL queries or Python scripts, assisted by autocomplete and syntax highlighting.
- Projects, the menu that holds all worksheets (which you can organize by folder and share with others), Notebooks, Streamlit apps, dashboards and native apps.
- Data, where you can list databases, show schemas, search for objects and view details about an object, such as the table definition, a list of columns and a preview of the data. The Add Data feature is within this menu too.
- Data Products, containing Snowflake’s Marketplace, a list of installed Apps, any Private Shares set up for your account, the Provider Studio, where you can manage your listings on the data marketplace, and Partner Connect, where you can integrate third-party applications.
- AI & ML, containing the Studio, where users can create no-code machine learning models (forecasts, classification and anomaly detection), view existing models and use the Document AI feature, which is part of Snowflake Cortex (discussed further down).
- Monitoring, where you can view query history and query profiles, copy history (to monitor data loading activity), task history, a list of your dynamic tables and the Trust Center, where you can evaluate your account for security risks.
- Admin, home to the crucial cost management section, with a dashboard showing spend in currency and credits, split by your most costly warehouses and databases. Also here are a list of warehouses, the ability to add new warehouses and a list of users and roles.
SnowSQL
For command line fans and those wanting to undertake batch processing, SnowSQL has what you need. It allows you to connect to Snowflake and execute SQL commands, load or download data and manage Snowflake objects. There are platform-specific versions for Linux, macOS and Windows.
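For batch processing, SnowSQL can be driven from a script. Here is a minimal sketch, assuming SnowSQL is installed and on your PATH. The helper names are my own, and the account, user and file values you pass in are placeholders. The `-a`, `-u`, `-f`, `-w` and `-o` flags are SnowSQL's account, user, script file, warehouse and option parameters.

```python
import subprocess


def build_snowsql_command(account, user, sql_file, warehouse=None):
    """Build the argument list for a non-interactive SnowSQL run."""
    cmd = [
        "snowsql",
        "-a", account,               # account identifier
        "-u", user,                  # login name
        "-f", sql_file,              # SQL script to execute
        "-o", "exit_on_error=true",  # stop at the first failing statement
    ]
    if warehouse:
        cmd += ["-w", warehouse]     # warehouse to use for the session
    return cmd


def run_sql_file(account, user, sql_file, warehouse=None):
    """Run the script with SnowSQL and return its exit code."""
    # SnowSQL reads the password from the SNOWSQL_PWD environment variable
    # when it is set, which avoids an interactive prompt in batch jobs.
    result = subprocess.run(build_snowsql_command(account, user, sql_file, warehouse))
    return result.returncode
```

Keeping the command builder separate from the call to `subprocess.run` makes it easy to check what will be executed before a scheduled job runs it.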
Snowflake CLI
Like SnowSQL, Snowflake CLI is a command line tool. However, this version is open source, so developers can improve and enhance the tool alongside the improvements Snowflake has stated it will make. It has all the capabilities of SnowSQL, but can also be used to manage workloads and applications that connect to Snowflake. It can be used to debug and deploy your Snowflake apps from your favorite IDE.
Snowpark
Do you want to build data applications, set up pipelines, or build and deploy ML models using Python, Java or Scala? Snowpark was built to achieve exactly that, whilst removing the need to migrate your data to wherever your application or model sits. Using the Snowpark API libraries, you can build, test and deploy models, applications or pipelines that process data within Snowflake’s ecosystem, taking advantage of the existing security models and scalable compute engines.
We have guides on the InterWorks blog for setting up Snowflake sessions using Snowpark, creating stored procedures using Snowpark, creating Python user-defined functions using Snowpark, building machine learning models using Snowpark and a guide to which Python-related feature to use according to your needs.
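To give a flavour of the Snowpark Python API, here is a minimal sketch, assuming the `snowflake-snowpark-python` package is installed. The `ORDERS` table, its `AMOUNT` and `CUSTOMER_ID` columns and both function names are hypothetical. The DataFrame operations are translated to SQL and run inside Snowflake, so only the final rows come back to the client.

```python
def create_session(connection_parameters):
    """Open a Snowpark session from a dict with keys such as
    "account", "user", "password" and "warehouse"."""
    # Imported here so the module can be loaded without Snowpark installed
    from snowflake.snowpark import Session

    return Session.builder.configs(connection_parameters).create()


def large_orders_per_customer(session, table_name="ORDERS", min_amount=100):
    """Count orders above a threshold per customer, computed in Snowflake."""
    df = (
        session.table(table_name)
        .filter(f"AMOUNT > {int(min_amount)}")  # SQL expression string
        .group_by("CUSTOMER_ID")
        .count()                                # one row per customer
    )
    return df.collect()  # list of Row objects
```

Nothing is sent to Snowflake until `collect()` is called; the earlier steps only build up the query.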
Snowpark Container Services
If you prefer to package your application along with all its dependencies, libraries, configuration files and binaries into a container, Snowpark Container Services allows you to deploy and serve that application directly within the Snowflake platform.
Snowflake Cortex
Snowflake Cortex is the umbrella name for everything generative artificial intelligence within Snowflake. It refers to a suite of AI features that use large language models for multiple purposes, which I’ll briefly outline below.
Snowsight tools built with Cortex available at time of writing include:
- Copilot, an AI-powered SQL assistant.
- Document AI, a tool to extract data from documents such as PDFs.
- Universal Search, to find data and apps across your Snowflake account and the Marketplace.
Also available at time of writing are LLM functions, which can be used as SQL functions or in Python, and fine-tuning, which allows users to customise the LLM for a specific purpose.
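As an illustration of calling an LLM function from Python, the sketch below runs `SNOWFLAKE.CORTEX.COMPLETE` as a SQL function through a database cursor. It assumes a cursor from the Snowflake Connector for Python using its default `pyformat` parameter binding. The model name `mistral-large` is an assumption, since model availability varies by region, and `ask_cortex` is a hypothetical helper name.

```python
def ask_cortex(cursor, prompt, model="mistral-large"):
    """Send a prompt to the Cortex COMPLETE function and return the text reply."""
    # The %s placeholders are bound by the connector, so the prompt is not
    # concatenated into the SQL text.
    cursor.execute("SELECT SNOWFLAKE.CORTEX.COMPLETE(%s, %s)", (model, prompt))
    row = cursor.fetchone()
    return row[0] if row else None
```

With an open cursor `cur`, a call might look like `ask_cortex(cur, "Describe a data warehouse in one sentence")`.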
Snowflake ML
Snowflake ML is a unified environment in which you can prepare data, build and train machine learning (ML) models, deploy those models and monitor model performance.
It is straightforward to build models within Snowsight using the AI & ML Studio to guide you through creating a Forecast, Anomaly Detection or Classification model. The same logic is available in a series of ML Functions including an additional function for Contribution Explorer.
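The ML Functions are called from SQL. As a rough sketch, the Python helper below only assembles the two statements for a forecast: the `CREATE SNOWFLAKE.ML.FORECAST` command to train a model and the model's `!FORECAST` method to predict. The helper name is hypothetical, as are any model, view and column names you pass in.

```python
def forecast_statements(model_name, view_name, timestamp_col, target_col, periods):
    """Return the SQL to train a forecast model and then predict future periods."""
    # Identifiers are interpolated directly, so only pass trusted names.
    train = (
        f"CREATE SNOWFLAKE.ML.FORECAST {model_name}(\n"
        f"  INPUT_DATA => SYSTEM$REFERENCE('VIEW', '{view_name}'),\n"
        f"  TIMESTAMP_COLNAME => '{timestamp_col}',\n"
        f"  TARGET_COLNAME => '{target_col}'\n"
        f")"
    )
    predict = f"CALL {model_name}!FORECAST(FORECASTING_PERIODS => {int(periods)})"
    return [train, predict]
```

The resulting statements can be run in a worksheet, or passed one at a time to a cursor or Snowpark session.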
Snowpark ML
Whilst discussing Machine Learning within Snowflake, we need to mention Snowpark ML, which is a set of Python APIs that can be used for data preprocessing, feature engineering and model training.
Snowflake Connector for Python
An aptly named “does what it says on the tin” product here, the Snowflake Connector for Python is a Python library through which you can connect to Snowflake and execute SQL-based commands.
It’s worth noting here that there is a newer, more “pythonic” library for Snowflake, the Snowflake Python API, which you can also use to connect to Snowflake; however, it contains more comprehensive APIs for interacting with Snowflake resources.
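Returning to the connector itself, here is a minimal sketch of it in use, assuming the `snowflake-connector-python` package is installed. The environment variable names are my own convention rather than something the connector reads automatically, and the function names are hypothetical.

```python
import os


def connection_parameters(env=os.environ):
    """Collect connection settings from environment variables (names are assumptions)."""
    params = {
        "account": env["SNOWFLAKE_ACCOUNT"],
        "user": env["SNOWFLAKE_USER"],
        "password": env["SNOWFLAKE_PASSWORD"],
    }
    # Optional settings are only passed when present
    if env.get("SNOWFLAKE_WAREHOUSE"):
        params["warehouse"] = env["SNOWFLAKE_WAREHOUSE"]
    return params


def current_version(params):
    """Connect, run one query and return the Snowflake version string."""
    # Imported here so the module loads even without the connector installed
    import snowflake.connector

    with snowflake.connector.connect(**params) as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT CURRENT_VERSION()")
            return cur.fetchone()[0]
```

Calling `current_version(connection_parameters())` is a quick way to confirm that your credentials and network access are working before running anything heavier.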
Snowflake Horizon
Consider Snowflake Horizon as a unified set of compliance, security, privacy, interoperability and access capabilities to help govern and discover your data and applications. It includes, but is not limited to: compliance certifications, data quality monitoring, lineage, business continuity, risk monitoring, role-based access control, privacy policies, classification and integration with other data catalog and governance tools.
One feature that falls under the Horizon umbrella is a Snowflake Data Clean Room. This differs from Snowflake’s Secure Data Sharing feature in that, rather than providing access to a whole table, a data clean room allows the data provider to define rules about the types of queries that can be run on the data, and restrict the person running the queries from accessing the underlying data itself.
Polaris Catalog
Snowflake has developed Polaris Catalog as a vendor-neutral, open data catalog for Apache Iceberg. The benefit of Apache Iceberg is that it is an open-source table format designed for large analytic workloads, with an open standard REST protocol. This means you can access or retrieve data using any engine (e.g. Apache Spark, Python, etc.) without being limited to a specific data platform.
Polaris Catalog is also open source so you can host it within containers on whatever infrastructure you prefer, or within Snowflake. If you integrate Polaris Catalog with Snowflake Horizon, any Iceberg table can benefit from Horizon’s governance capabilities whether built by Snowflake or Polaris.
At the start of June, Snowflake said Polaris Catalog would be made open source within the next 90 days, so we’re expecting that around the end of September/October 2024.
This overview covers some of the features Snowflake has to offer, but with many other items in private preview and development, I’m sure there’ll be more to add before the end of the year.