5 Tips for Separating Power BI Datasets and Reports

Post last updated: Dec. 31, 2020

What’s one of the best things you can do to improve your Power BI implementation? Separating datasets from reports — by this, I mean use of separate pbix files. When a dataset is used for multiple reports, this is sometimes called a ‘golden dataset’ or the ‘hub and spoke’ method. The separate reports are sometimes called ‘thin reports.’

Do you prefer a video or reading text? I’ve got you covered with both below.

Advantages to Separating Datasets and Reports

  • Different people can handle data modeling and report creation. Separating the pbix files simplifies collaboration. This is helpful when different people are involved with different aspects of development, and since Power BI Desktop does not support multiple users working at once in the same .pbix file.

  • Report page flexibility. When reports aren't tightly associated with the dataset, you have flexibility to create however many report pages you want. If you have a report that has grown to, say, 30 pages, that's too much to navigate through. Separate report files, with less pages targeted towards specific needs, is usually far more usable.

  • Less dataset maintenance. When we have fewer datasets, there is less work to create and maintain measures and row-level security. This saves a lot of time, hassle, and duplication of effort, for dataset owners.

  • Easier for report creators to find datasets. Fewer datasets means it's that much easier for self-service report creators to determine which dataset to use for their report because there are less choices to scroll through.

  • Reduces the number of dataset refreshes. The number of data refresh operations that need to run is reduced. This takes some processing load off of source systems, potentially reduces the number of people who need direct access to source systems, and uses your Power BI resources more efficiently.

  • Lessen security and compliance concerns. Reducing the number of redundant or repetitive datasets is always helpful for lessening security and compliance concerns, at least to a certain extent.

  • Reduces risk of inconsistencies and inaccuracies. Fewer datasets also reduces the risk of inconsistencies or inaccuracies that can exist between multiple datasets.

Tip 1: Publish Datasets to Separate "Data Workspaces" from "Report Workspaces"

Here I am depicting 2 datasets which are serving 5 reports - the reports use data from their respective dataset via live connection:

Data workspace with 2 datasets and report workspace with 5 reports

The datasets shown above have been published to a workspace in the Power BI Service that I called "Sales Data" whereas the reports reside in a different workspace that I called "Sales Analytics." Advantages to doing this:

  1. Permissions management. The data workspace has very few people who may edit that content, whereas the reporting workspace allows many more users to publish their own reports. Having distinct workspace permissions like this lets us protect datasets from inadvertent changes, especially when they are certified or have a lot of downstream report dependencies.

  2. Clearer ownership and responsibilities. Using separate workspaces reduces ambiguity when data is managed by one team, and reports are owned and managed by another team. This might occur when a centralized IT or BI team produces the datasets, and decentralized analysts produce reports.

  3. Row-level security works. Row-level security will still work for report creators who don't have edit permissions on the data workspace. Report creators only need read and build permissions on the dataset in order to access the data via live connection. Not having edit permissions ensures row-level security is invoked — which is a *huge* advantage.

Tip 2: Use Report Pages to Document the Dataset

Including documentation alongside the dataset is helpful for teammates who might need to work on this dataset at some point, or even for yourself in the future. It also means you won't have an empty report, and you won't need to delete the empty report once the file is published to the service.

Example of Power BI Dataset Documentation

The above is a simple example of using the report pages for:

  • Dataset documentation

  • Dataset change log

  • Data validation reports

  • Row-level security testing

Tip 3: Manage and Audit the Dataset Permissions

Each dataset has its own set of permissions which are managed in the Power BI Service (separate from workspace permissions, and separate from sharing operations):

Power BI Dataset Permissions

For this particular dataset, there are 4 groups and 1 user who have been granted rights. I’ve used Azure Active Directory groups that align with the sales members and contributors are who have been assigned the build permission.

Power BI Dataset Permissions

Note that the build permission is implicitly granted to anyone who has direct workspace permissions above viewer (that is, contributors, members, and admins). The build permission is what allows someone to create their own report from the dataset.

There are multiple ways to grant dataset permissions (like during sharing or app publishing), so it's important to specifically manage and audit dataset permissions right alongside your other permissions auditing processes.

Tip 4: Use Certified & Promoted Endorsements on Important Datasets and Dataflows

Setting certain datasets to be certified or promoted can help your self-service report creators a great deal. When a dataset has been certified, that is a signal that it is trustworthy. It's been validated. Business rules and naming matches organizational expectations. Because it does involve some rigor in order to ensure trustworthiness, very few people should be allowed to certify data.

Power BI Dataset Endorsements

In Power BI Desktop, when we do a “Get Data” to connect to a dataset which has been published to the Power BI Service, certified datasets show up first, followed by promoted datasets. This is what makes endorsed datasets more easily discoverable by report creators.

Tip 5: Teach report creators how to use live connection mode

To take full advantage of reusable shared datasets, it's important that report creators are comfortable with how to work in live connection mode.

When creating a new report in Power BI Desktop: rather than importing data to that .pbix file, instead use “Get Data” to make a live connection to an existing Power BI dataset.

Once connected, all of the normal report creation capabilities are available. Report-level measures may also be generated.

Power BI Live Connection Mode

Starting with the June 2020 release of Power BI Desktop: self-service report creators can view the data model in live connection mode. That’s a very exciting development.

Power BI Live Connection Mode

That wraps up the top tips. If you’re not currently using this technique, I hope this has inspired you to give it a go.

December 2020 update: Composite models have been extended with new capabilities. Be sure to read about DirectQuery for Power BI Datasets. A PBIX file which starts off as a Live Connection, gets converted to DirectQuery, when then allows that PBIX file to act as a composite model which includes other data from additional sources. This is a great way to extend and customize data that’s available in an existing data model, without replicating that data or duplicating the work that was done originally.

Where Can You Find More Information?

Intro to Datasets Across Workspaces

Why Power BI Shared Datasets are Critical to a Self-Service BI Implementation (BlueGranite)

3 Reasons to Use a Single Dataset (Guy In a Cube video)

What the Heck are Build Permissions in Power BI (Guy in a Cube video)

Build Permission for Shared Datasets

Connect to Datasets in the Power BI Service from Power BI Desktop

Like This Content?

Take a look at our Power BI Deployment & Governance training course.