Friday, August 1, 2025

Snowflake Launches Snowpark Connect: Run Apache Spark Workloads Without Clusters

Snowflake has announced the public preview of Snowpark Connect for Apache Spark™, a new capability that allows organizations to run Spark workloads directly within the Snowflake platform—removing the need to provision and maintain traditional Spark clusters.

Introduced in Apache Spark version 3.4, Spark Connect enables a decoupled client-server architecture, allowing user code to run separately from the Spark execution environment. Snowpark Connect takes this architecture a step further by enabling Spark code execution on Snowflake’s vectorized engine, providing the benefits of Spark’s familiar APIs with the performance, scalability, and simplicity of Snowflake’s cloud-native platform.
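
To make the decoupled model concrete, here is a minimal PySpark sketch of a Spark Connect client session. The endpoint URI below is a placeholder, and the way Snowpark Connect actually wires a session to Snowflake may differ from this generic setup.

    from pyspark.sql import SparkSession

    # Spark Connect (Apache Spark 3.4+): the client is a thin library that
    # sends query plans over gRPC to a remote server for execution.
    # The "sc://" host and port are placeholders, not Snowflake's endpoint.
    spark = SparkSession.builder.remote("sc://spark-connect-host:15002").getOrCreate()

    # The DataFrame API is unchanged, but execution happens server-side,
    # not in this client process.
    print(spark.range(10).count())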

With Snowpark Connect, users can now execute Spark SQL, DataFrame operations, and UDFs within Snowflake’s elastic virtual warehouses. This integration removes the typical operational burdens of Spark, such as managing dependencies, ensuring version compatibility, and performing manual scaling or tuning. Snowflake handles it all behind the scenes, allowing developers to focus purely on building and optimizing data applications.
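
As an illustration of those capabilities, the following sketch runs Spark SQL, a DataFrame aggregation, and a Python UDF through a Spark Connect session. The table, column, and endpoint names are hypothetical, used only to show that the standard Spark APIs carry over unchanged.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import StringType

    # Session via Spark Connect, as in the earlier sketch; the endpoint is a placeholder.
    spark = SparkSession.builder.remote("sc://spark-connect-host:15002").getOrCreate()

    # Hypothetical table name, for illustration only.
    orders = spark.table("sales.public.orders")

    # Spark SQL works unchanged.
    spark.sql("SELECT region, COUNT(*) AS n FROM sales.public.orders GROUP BY region").show()

    # Familiar DataFrame transformations.
    shipped = orders.filter(col("status") == "shipped").groupBy("region").sum("amount")

    # A simple Python UDF, applied as in classic Spark.
    shout = udf(lambda s: s.upper() if s else None, StringType())
    shipped.withColumn("region_uc", shout(col("region"))).show()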

In addition to reducing infrastructure complexity and cost, the solution strengthens governance by centralizing data processing within Snowflake's unified security and compliance framework. Organizations can streamline their analytics and machine learning workflows, leveraging Snowflake's native capabilities while continuing to write Spark code.

Snowpark Connect is available in public preview starting today, inviting developers and data teams to modernize their Spark workloads with greater efficiency and ease.
