An Introduction to StreamSets Platform for Snowflake Data Integration and Transformation

About this Session

StreamSets Platform provides comprehensive data integration, including database bulk loads and Change Data Capture, with targets that include Snowflake, Google BigQuery, Azure Synapse, Amazon Redshift and myriad other systems. StreamSets Platform can also push large-scale transformation logic to execute within Snowflake and Databricks.

During this advanced, technical session we will explore the capabilities and benefits of using StreamSets Platform:

  • Bulk loading data into Snowflake: StreamSets Platform can bulk load data into Snowflake using a variety of design patterns. We’ll demonstrate several different approaches and discuss their tradeoffs.

  • Keeping Snowflake in sync with transactional systems: StreamSets Platform can perform Change Data Capture from a variety of databases and merge those changes into Snowflake. This lets analytics and machine learning workloads on transactional data move to the cloud. We’ll demonstrate merging changes from Oracle and SQL Server into Snowflake.

  • Push-down transformations on Snowflake: StreamSets Platform uses Snowpark to provide a no-code environment that generates optimized SQL transformations, which execute entirely within Snowflake’s Data Cloud. We’ll show how data ingestion and transformations can be implemented, executed, and monitored using StreamSets Platform’s single pane of glass.

  • Cost-optimized Snowflake data ingestion patterns: We’ll describe Snowflake’s “day 2” problem and how StreamSets Platform can be used to implement cost-optimized Snowflake data ingestion.
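
As a rough illustration of the merge step behind the Change Data Capture topic above, the following Python sketch applies an ordered list of change records to an in-memory table keyed by primary key. It is not StreamSets or Snowflake code: the record layout (`op`, `row`), the `apply_changes` function, and the `id` key column are hypothetical names chosen only to show the upsert-and-delete logic that a merge performs.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]


def apply_changes(target: Dict[Any, Row], changes: Iterable[Row], key: str = "id") -> Dict[Any, Row]:
    """Apply ordered change records to a table held as {primary_key: row}.

    Each change is a dict with a hypothetical layout:
    {"op": "INSERT" | "UPDATE" | "DELETE", "row": {...}}.
    """
    for change in changes:
        op = change["op"]
        row = change["row"]
        pk = row[key]
        if op in ("INSERT", "UPDATE"):
            # Upsert: create the row if missing, otherwise overwrite changed columns.
            target[pk] = {**target.get(pk, {}), **row}
        elif op == "DELETE":
            # Deleting a row that is already absent is treated as a no-op.
            target.pop(pk, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return target


# Illustrative data only.
table = {1: {"id": 1, "name": "Ada"}}
changes = [
    {"op": "INSERT", "row": {"id": 2, "name": "Grace"}},
    {"op": "UPDATE", "row": {"id": 1, "name": "Ada L."}},
    {"op": "DELETE", "row": {"id": 2}},
]
result = apply_changes(table, changes)
print(result)
```

Applying changes in source order matters here: replaying the same records in a different order could leave the target in a different state, which is one reason a merge into a warehouse table typically carries ordering information alongside each change.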