Concepts

Execution

Overview

The Execution entity represents a single run of a data job within the data platform. It is a core component of the data pipeline, responsible for carrying out data extraction, transformation, and loading tasks. An execution can be triggered by a user action or a scheduled event, and it drives the movement and processing of data from sources, through extracts, into datasets.

Properties

  • ID: A unique identifier for the execution instance.

  • Type: The nature of the execution process (e.g., batch, real-time, streaming).

  • Status: The current state of the execution (e.g., running, completed, failed).

  • Trigger: How the execution was initiated (e.g., a manual trigger or a scheduled event).
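The properties above can be sketched as a small data model. This is an illustrative sketch only, not the platform's actual schema; the class and field names are assumptions.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class ExecutionType(Enum):
    BATCH = "batch"
    REAL_TIME = "real-time"
    STREAMING = "streaming"


class ExecutionStatus(Enum):
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


class Trigger(Enum):
    MANUAL = "manual"        # initiated by a user action
    SCHEDULED = "scheduled"  # initiated by a scheduled event


@dataclass
class Execution:
    """Illustrative model of one execution instance."""
    type: ExecutionType
    trigger: Trigger
    status: ExecutionStatus = ExecutionStatus.RUNNING
    # Each instance gets its own unique ID.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
```

Modeling Type, Status, and Trigger as enums keeps the allowed values explicit and catches typos early; the real platform may expose these as plain strings instead.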

Usage

  • Data Movement: Runs the processes that move data from sources, through extracts, and into datasets.

  • Data Transformation: Runs the transformation scripts (PySpark scripts) defined in the transforms associated with the execution.

  • Scheduling and Automation: Supports scheduling data jobs for regular, automated runs.
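Putting the pieces together, the sketch below shows how an execution might wrap a transform and report a terminal status. It is a simplified stand-in, not the platform's API: the transform is modeled as a plain Python function over rows rather than an actual PySpark script, and the helper names are assumptions.

```python
from typing import Callable

# A "transform" here is any function from input rows to output rows,
# standing in for the PySpark script a real execution would run.
Transform = Callable[[list[dict]], list[dict]]


def run_execution(transform: Transform, rows: list[dict]) -> tuple[str, list[dict]]:
    """Run a transform over input rows and return (status, output).

    Any exception raised by the transform marks the execution
    "failed" instead of crashing the caller.
    """
    try:
        return "completed", transform(rows)
    except Exception:
        return "failed", []


def clean(rows: list[dict]) -> list[dict]:
    # Example transform: keep active records and normalize names.
    return [{**r, "name": r["name"].strip().title()}
            for r in rows if r.get("active")]
```

The same `run_execution` wrapper would be invoked whether the trigger is a manual action or a scheduled event; only the caller differs.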



© Copyright 2024. All rights reserved.
