Easily automate complex, multi-step processes with Tidal and Spark SQL.
Eliminate silos
Automate transfers, deployments, provisioning and more.
Work more efficiently
Use comprehensive system management to mitigate risks.
Integrate securely
Reduce attack surface across your environment.
The Tidal adapter for Databricks Spark SQL integrates Tidal with Databricks SQL and Apache Spark SQL so you can create, schedule and run Databricks Spark jobs through Tidal.
Databricks Spark SQL is a module for structured data processing that also acts as a distributed SQL query engine. It is built on Spark SQL's programming abstraction, the DataFrame, which organizes data into a table of rows and named columns.
The Tidal adapter for Databricks Spark SQL gives you access to these features and capabilities:
The Tidal adapter connects Tidal with Databricks SQL and Apache Spark SQL, enabling you to orchestrate and execute Databricks Spark jobs directly within Tidal. This integration leverages Databricks Spark SQL’s capabilities as a distributed SQL query engine for structured data processing.
At the core of this process are DataFrames, Spark's programming abstraction that organizes data into tabular structures. Each DataFrame has a schema that specifies its column names and data types, including simple types such as StringType and IntegerType and the composite StructType for nested records. Missing or incomplete values are represented as null, so gaps in the data stay explicit during processing.
Spark SQL is Apache Spark's module for structured data processing. It allows users to query data using SQL or DataFrame APIs, thus integrating SQL-like capabilities into the Spark ecosystem.
Spark SQL is the open-source SQL component of Apache Spark, while Databricks SQL is a serverless data warehouse on the Databricks platform. Databricks SQL builds on Spark SQL for interactive SQL workloads in the cloud, adding performance and cost optimizations and a dedicated user interface. In short, Databricks SQL is a refined, cloud-optimized version of Spark SQL.