Dagster & Power BI (Pythonic)
This feature is considered in a beta stage. It is still being tested and may change. For more information, see the API lifecycle stages documentation.
If you are just getting started with the Power BI integration, we recommend using the new Power BI component.
Your Power BI assets, such as semantic models, data sources, reports, and dashboards, can be represented in the Dagster asset graph, allowing you to track lineage and dependencies between Power BI assets and upstream data assets you are already modeling in Dagster. You can also use Dagster to orchestrate Power BI semantic models, allowing you to trigger refreshes of these models on a cadence or based on upstream data changes.
What you'll learn
- How to represent Power BI assets in the Dagster asset graph, including lineage to other Dagster assets.
- How to customize asset definition metadata for these Power BI assets.
- How to materialize Power BI semantic models from Dagster.
- How to customize how Power BI semantic models are materialized.
Prerequisites
- The dagster and dagster-powerbi libraries installed in your environment
- A Power BI workspace
- A service principal configured to access Power BI, or an API access token. For more information, see Embed Power BI content with service principal and an application secret in the Power BI documentation.
Set up your environment
To get started, you'll need to install the dagster and dagster-powerbi Python packages:
- uv: uv add dagster-powerbi
- pip: pip install dagster-powerbi
Represent Power BI assets in the asset graph
To load Power BI assets into the Dagster asset graph, you must first construct a PowerBIWorkspace resource, which allows Dagster to communicate with your Power BI workspace. You'll need to supply your workspace ID and credentials. You may configure a service principal or use an API access token, which can be passed directly or accessed from the environment using EnvVar.
Dagster can automatically load all semantic models, data sources, reports, and dashboards from your Power BI workspace as asset specs. Call the load_powerbi_asset_specs function, which returns a list of AssetSpecs representing your Power BI assets. You can then include these asset specs in your Definitions object:
from dagster_powerbi import (
    PowerBIServicePrincipal,
    PowerBIToken,
    PowerBIWorkspace,
    load_powerbi_asset_specs,
)

import dagster as dg

# Connect using a service principal
power_bi_workspace = PowerBIWorkspace(
    credentials=PowerBIServicePrincipal(
        client_id=dg.EnvVar("POWER_BI_CLIENT_ID"),
        client_secret=dg.EnvVar("POWER_BI_CLIENT_SECRET"),
        tenant_id=dg.EnvVar("POWER_BI_TENANT_ID"),
    ),
    workspace_id=dg.EnvVar("POWER_BI_WORKSPACE_ID"),
)

# Alternatively, connect directly using an API access token
power_bi_workspace = PowerBIWorkspace(
    credentials=PowerBIToken(api_token=dg.EnvVar("POWER_BI_API_TOKEN")),
    workspace_id=dg.EnvVar("POWER_BI_WORKSPACE_ID"),
)

power_bi_specs = load_powerbi_asset_specs(power_bi_workspace)

defs = dg.Definitions(
    assets=[*power_bi_specs], resources={"power_bi": power_bi_workspace}
)
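Because the credentials in the example are read from the environment through EnvVar, a missing variable may only surface once Dagster tries to use the resource. The following is a minimal pre-flight sketch, not part of the integration itself: the helper name missing_power_bi_vars is hypothetical, and the variable names are the ones used in the service principal example above.

```python
import os
from collections.abc import Mapping

# Variable names taken from the service principal example above
REQUIRED_POWER_BI_VARS = (
    "POWER_BI_CLIENT_ID",
    "POWER_BI_CLIENT_SECRET",
    "POWER_BI_TENANT_ID",
    "POWER_BI_WORKSPACE_ID",
)


def missing_power_bi_vars(environ: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_POWER_BI_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_power_bi_vars()
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
    print("All Power BI environment variables are set.")
```

If you connect with an API access token instead, the list would presumably contain POWER_BI_API_TOKEN and POWER_BI_WORKSPACE_ID only.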
By default, Dagster will attempt to snapshot your entire workspace using Power BI's metadata scanner APIs. These APIs can retrieve more detailed information about your Power BI assets, but they require the workspace to be configured to allow this access.
If you encounter issues with the scanner APIs, you may disable them using load_powerbi_asset_specs(power_bi_workspace, use_workspace_scan=False).
Customize asset definition metadata for Power BI assets
By default, Dagster will generate asset specs for each Power BI asset based on its type, and populate default metadata. You can further customize asset properties by passing an instance of a custom DagsterPowerBITranslator subclass to the load_powerbi_asset_specs function. This subclass can implement methods to customize the asset specs for each Power BI asset type.
from dagster_powerbi import (
    DagsterPowerBITranslator,
    PowerBIServicePrincipal,
    PowerBIWorkspace,
    load_powerbi_asset_specs,
)
from dagster_powerbi.translator import PowerBIContentType, PowerBITranslatorData

import dagster as dg

power_bi_workspace = PowerBIWorkspace(
    credentials=PowerBIServicePrincipal(
        client_id=dg.EnvVar("POWER_BI_CLIENT_ID"),
        client_secret=dg.EnvVar("POWER_BI_CLIENT_SECRET"),
        tenant_id=dg.EnvVar("POWER_BI_TENANT_ID"),
    ),
    workspace_id=dg.EnvVar("POWER_BI_WORKSPACE_ID"),
)


# A translator class lets us customize properties of the built
# Power BI assets, such as the owners or asset key
class MyCustomPowerBITranslator(DagsterPowerBITranslator):
    def get_asset_spec(self, data: PowerBITranslatorData) -> dg.AssetSpec:
        # We create the default asset spec using super()
        default_spec = super().get_asset_spec(data)
        # We customize the team owner tag for all assets,
        # and we customize the asset key prefix only for dashboards.
        return default_spec.replace_attributes(
            key=(
                default_spec.key.with_prefix("prefix")
                if data.content_type == PowerBIContentType.DASHBOARD
                else default_spec.key
            ),
            owners=["team:my_team"],
        )


power_bi_specs = load_powerbi_asset_specs(
    power_bi_workspace, dagster_powerbi_translator=MyCustomPowerBITranslator()
)

defs = dg.Definitions(
    assets=[*power_bi_specs], resources={"power_bi": power_bi_workspace}
)
Note that super() is called in the overridden get_asset_spec method to generate the default asset spec. It is best practice to generate the default asset spec before customizing it, so that the default key and metadata are preserved unless you explicitly replace them.