Last month, Matillion launched the Data Productivity Cloud (DPC). In our introductory blog post, we highlighted how this new SaaS product simplifies managing multiple data sources and performing transformations, all from a single pipeline and right within your web browser. This was made possible by Matillion-hosted containers and a Matillion-hosted password manager. But what if you want to take control? Matillion gives you the option to take the reins and manage the infrastructure yourself.
When setting up a project in Matillion DPC, you can choose between using Matillion’s infrastructure or your own. If you choose the latter, our previous article provides a step-by-step guide for you.
Once you have deployed your own compute containers for DPC, we can look into integrating AWS Secrets Manager to store passwords. These passwords can be leveraged in several locations, including when configuring the authentication to Snowflake within an environment and when supplying a password to a component in an orchestration pipeline such as “Database Query.”
Currently, Matillion DPC exclusively supports AWS. Our guide starts by explaining why you should use Service Accounts. This is followed by instructions on creating secrets within AWS Secrets Manager. Finally, we’ll demonstrate how to leverage these secrets within DPC. The power to manage your secrets, your way, will be at your fingertips.
Why Use Service Accounts?
Service Accounts are recommended for any platform-to-platform access. They contribute to enhanced security and control by offering standardized permissions and reducing inconsistencies and vulnerabilities that individual user accounts might present. Additionally, Service Accounts help organizations monitor costs, ensuring that expenses align with budget expectations. By utilizing Service Accounts, you can connect platforms like AWS and DPC more securely and efficiently, aligning with best practices in cloud security. This ensures more reliable interactions between these platforms and minimizes potential risks.
This article assumes you have already created a Service Account in Snowflake and a Service Account in the database you want to query using DPC’s “Database Query” component.
Create Secrets in AWS Secrets Manager
To create a new secret in AWS that DPC can access, follow these steps:
- Log into your AWS account: Make sure to use the same account you will be using with the Matillion agent.
- Go to AWS Secrets Manager: You’ll find this in your AWS services.
- Click Store a new secret.
- Select Other type of secret.
- Enter your secret: Go to the Plaintext tab, enter the value of your secret, and assign it a key in JSON format.
- Note: A secret can hold either a password or a private key.
- Leave the Encryption Key field blank: Matillion advises doing this so that Secrets Manager automatically provisions the KMS key.
- Click Next.
- Name your secret.
- Finalize the process: Click Next and then Next again on the Configure rotation page, review your new secret and click Store.
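The console steps above can also be sketched in code. The following is a minimal sketch, not Matillion's own tooling: it assumes a client object that behaves like `boto3.client("secretsmanager")`, and the helper names `build_secret_string` and `store_secret`, as well as the key name `password` used in the usage note, are hypothetical. The first helper produces the JSON you would otherwise type into the Plaintext tab.

```python
import json


def build_secret_string(key: str, value: str) -> str:
    """Return the JSON text for the Plaintext tab: one key mapped to one value."""
    return json.dumps({key: value})


def store_secret(client, name: str, key: str, value: str) -> str:
    """Create the secret through a Secrets Manager client and return its ARN.

    `client` is assumed to behave like boto3.client("secretsmanager").
    No KmsKeyId is passed, so Secrets Manager falls back to its default key,
    mirroring the blank Encryption Key field in the console steps.
    """
    response = client.create_secret(
        Name=name,
        SecretString=build_secret_string(key, value),
    )
    return response["ARN"]
```

With boto3 installed and AWS credentials configured, a call might look like `store_secret(boto3.client("secretsmanager"), "postgres_service_account", "password", "<your password>")`. Using `json.dumps` rather than hand-written JSON avoids breaking the secret when the password contains quotes or backslashes.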
The next GIF shows how to create an AWS Secret to establish a connection with Snowflake:
You can replicate the same steps to create an AWS Secret for the database you wish to query. In this article, we stored another secret called “postgres_service_account” to query data from a Postgres database.
Leverage Secrets within Matillion DPC
Remember, for DPC to read a secret from AWS Secrets Manager, it must have the correct privileges in AWS. The simplest way is to attach the AWS managed policy “SecretsManagerReadWrite” to your cluster role, which grants all the access you need. For more details, see our previous post on deploying your own compute containers. If you prefer to be more restrictive, refer to this AWS documentation and consult your AWS administrator to set up the privileges correctly.
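For the more restrictive route, a narrower policy might look roughly like the one generated below. This is a sketch under assumptions: the action list (`secretsmanager:GetSecretValue` and `secretsmanager:DescribeSecret`) is our guess at a read-only minimum and has not been confirmed against Matillion's requirements, so DPC may need additional actions, for example to list secrets. The function name, the `Sid`, and the ARN in the example are placeholders.

```python
import json


def read_only_secrets_policy(secret_arns: list[str]) -> dict:
    """Build an IAM policy document allowing read access to the given secrets only."""
    if not secret_arns:
        raise ValueError("at least one secret ARN is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadDpcSecrets",  # hypothetical statement ID
                "Effect": "Allow",
                # Assumed action set; confirm with your AWS administrator.
                "Action": [
                    "secretsmanager:GetSecretValue",
                    "secretsmanager:DescribeSecret",
                ],
                "Resource": list(secret_arns),
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN: the region, account ID, and suffix are made up.
    policy = read_only_secrets_policy(
        ["arn:aws:secretsmanager:us-east-1:111122223333:secret:postgres_service_account-AbCdEf"]
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be reviewed with your AWS administrator before it is attached to the cluster role in place of the broader managed policy.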
Once your cluster is up and running and has the correct privileges attached to its role, you can utilize secrets either within an environment for connecting to Snowflake or when using DPC’s connectors, such as the “Database Query” component.
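Before wiring a secret into DPC, it can help to confirm that the secret is readable and holds the key you expect. The sketch below assumes a client that behaves like `boto3.client("secretsmanager")` and a secret stored as JSON, as described earlier; the helper name `read_secret_value` and the key name `password` in the usage note are hypothetical.

```python
import json


def read_secret_value(client, secret_id: str, key: str) -> str:
    """Fetch a secret and return the value stored under `key`.

    `client` is assumed to behave like boto3.client("secretsmanager").
    """
    response = client.get_secret_value(SecretId=secret_id)
    # The secret was stored as a JSON object, so parse it before the lookup.
    payload = json.loads(response["SecretString"])
    if key not in payload:
        raise KeyError(f"secret {secret_id!r} has no key {key!r}")
    return payload[key]
```

For example, `read_secret_value(boto3.client("secretsmanager"), "postgres_service_account", "password")` should return the stored password when run with credentials that are allowed to read the secret. Avoid printing or logging the returned value.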
Leverage a Secret in an Environment to Connect to Snowflake
When adding a project, follow these steps in Matillion DPC to set up the environment and connect it with Snowflake:
You also have the option to create a new environment directly from within an existing project. This can be particularly useful when you want to deploy your project in a development/testing environment. The following image shows where to do so from an existing project.
Leverage a Secret in the “Database Query” Component
Many data transformation pipelines involve importing data from a database into a data warehouse, such as Snowflake in our specific use case. Employing a “Database Query” component makes the task a piece of cake. However, it requires authentication with the database, using a URI, Username and, most importantly, a Password, which should be stored safely, for example, as an AWS secret.
Before using the secret “postgres_service_account” in the “Database Query” component, we first need to declare that secret within our project. Here are the steps to achieve this:
Once we have taken care of storing the password in AWS Secrets Manager and declaring it in our project, only one simple step remains: select that particular secret from the dropdown menu, as shown in the following images, and voila!
Wrap Up
In this article, we’ve not only demonstrated how to use AWS Secrets Manager within Matillion DPC, we’ve also highlighted the essential practice of using Service Accounts instead of individual user accounts to connect between platforms. This approach ensures greater security and consistency, reducing potential vulnerabilities and ultimately safeguarding your data.