Build your first Dagster project
Welcome to Dagster! In this guide, you'll use Dagster to create a basic pipeline that:
- Extracts data from a CSV file
- Transforms the data
- Loads the transformed data to a new CSV file
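For orientation, the three steps above can be sketched in plain Python before any Dagster code is involved. This is only an illustration, not the pipeline built later in this guide: the function names, the "name" column, and the upper-casing transformation are all hypothetical, and it uses the standard-library csv module rather than pandas.

```python
import csv


def extract(path):
    """Read every row of a CSV file into a list of dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def transform(rows):
    """Hypothetical transformation: upper-case the assumed "name" column."""
    return [{**row, "name": row["name"].upper()} for row in rows]


def load(rows, path):
    """Write the rows to a new CSV file, using the first row's keys as the header."""
    if not rows:
        # Nothing to write; create an empty file so the step still has an output.
        open(path, "w").close()
        return
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def run_etl(source_path, target_path):
    """Chain the three steps: extract, then transform, then load."""
    load(transform(extract(source_path)), target_path)
```

In Dagster, each of these steps becomes its own asset, so each one can be executed and monitored separately instead of being chained inside a single function.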
What you'll learn
- How to set up a basic Dagster project
- How to create a Dagster asset for each step of the Extract, Transform, and Load (ETL) process
- How to use Dagster's UI to monitor and execute your pipeline
Prerequisites
To follow the steps in this guide, you'll need:
- Basic Python knowledge
- Python 3.9+ installed on your system. Refer to the Installation guide for information.
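If you are unsure which interpreter your terminal picks up, a short check like the one below can confirm the 3.9+ requirement. It is a minimal sketch; the helper name meets_minimum is made up for this example and is not part of Dagster.

```python
import sys

MIN_VERSION = (3, 9)  # minimum version stated in the prerequisites


def meets_minimum(version_info=None, minimum=MIN_VERSION):
    """Return True if a (major, minor, ...) version tuple is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} meets the 3.9+ requirement.")
    else:
        print(f"Python {current} is too old; this guide needs 3.9 or newer.")
```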
Step 1: Set up the Dagster environment
- Open the terminal and create a new directory for your project:
    mkdir dagster-quickstart
    cd dagster-quickstart
- Create and activate a virtual environment:
  macOS:
    python -m venv venv
    source venv/bin/activate
  Windows:
    python -m venv venv
    venv\Scripts\activate
- Install Dagster and the required dependencies:
    pip install dagster dagster-webserver pandas
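To confirm that the install step worked inside the active virtual environment, you can check that the three packages are importable. The sketch below assumes the usual import names (dagster, dagster_webserver, pandas); the helper missing_packages is hypothetical and only wraps the standard-library importlib.util.find_spec.

```python
import importlib.util

# Import names assumed to correspond to the packages installed with pip above.
REQUIRED = ["dagster", "dagster_webserver", "pandas"]


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Not installed in this environment:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If anything is reported as missing, check that the virtual environment is activated and rerun the pip install command.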
Step 2: Create the Dagster project structure
Note: The project structure in this guide is simplified so you can get started quickly. When creating new projects, use the dagster project scaffold command to generate a complete Dagster project.
Next, you'll create a basic Dagster project that looks like this:
dagster-quickstart/
├── quickstart/
│ ├── __init__.py