How to create and use keys (A work around for Primary Keys) in Fabric data warehouses – FourMoo | Power BI


When I started looking into using the data warehouses feature in Fabric, I did see that there were limitations on Primary Key columns.

Below is my blog post on how I still use keys in my data warehouse, instead of using GUID’s which to me are long and hard to use.

In my example I am going to create a simple data warehouse which is going to consist of two-dimension tables (Date and Country) and a fact table with the Sales amounts.

I loaded my data into the lakehouse as shown below.

Creating my Warehouse

The first step is I needed to create a warehouse where I could create my tables.

To do this I changed the experience to Data Warehouse.

I then clicked on Warehouse and gave it the name of DW-FourMoo and clicked on Create.

Now in order to access the data I then created my warehouse table by importing the data from my lakehouse into the warehouse (Yes, I know it can be a bit confusing with all these similar names).

I then needed to load my source data into my warehouse table.

To do this I went into my Warehouse, clicked on Get Data and then selected “New Dataflow Gen2”

I then loaded my data via the Dataflow Gen2 into my warehouse table (I am not going to put in how I did it as it is not part of this blog post and I have covered this before in my other blog post “”).

Once loaded I could then my table in my warehouse as shown below.

I could now start creating my dimension and fact tables with the required keys.

Creating the dimension tables

Next, to create my dimension tables I used the following TSQL code below.

The code below is for my Country Dimension

Create table DW_FourMoo.dbo.tb_DimCountry AS
Select distinct country_code, 
-- This is the column that I wanted to create the Key on
ROW_NUMBER() OVER(ORDER BY country_code) as CountrySK
from DW_FourMoo.dbo.staging_sales

I could then see that the query ran successfully.

And I could see the table and the actual Key values in my tb_DimCountry table.

I then used the following TSQL code to create the Date dimension.

Create table DW_FourMoo.dbo.tb_DimDate AS
Select distinct CAST(CAST(time_ref as VARCHAR(10)) as DATE) as Date, 
-- This is the column that I wanted to create the Key on
ROW_NUMBER() OVER(ORDER BY time_ref) as DateSK
from DW_FourMoo.dbo.staging_sales

Once completed I could then see the table and values as shown below.

I now had my Date and Country dimensions.

Create the Fact Table

I am now at the point that I can create my fact table.

To do this I used the following TSQL code below.

Create table DW_FourMoo.dbo.tb_FactSales AS

SELECT   
       Country.CountrySK
      ,SUM([value]) as SalesAmount
      ,[status]
      ,D.Date
  FROM [dbo].[staging_sales] as S
    inner join dbo.tb_DimCountry as Country 
        on S.country_code = Country.country_code
    Inner join dbo.tb_DimDate as D
        on S.time_ref = D.Date

Group by 
    Country.CountrySK
    ,[status]
    ,D.Date

I could then see my fact table as shown below.

The next challenge is what to do when I want to add or update rows in my dimension table. I want to keep my existing keys and only add new keys.

Add rows to my Dimension Table

In the code below I will show you how I added new keys to my dimension table.

Declare @MaxCountrySK as BigInt = (Select max(CountrySK) from DW.dbo.tb_DimCountry)

Insert into DW.dbo.tb_DimCustomer

SELECT distinct
 [country_code]
,ROW_NUMBER() OVER(ORDER BY country_code) + @MaxCountrySK as CountrySK


FROM [DW_FourMoo].[dbo].[staging_sales]
where country_code not in 
    (
        Select country_code
        from DW.dbo.tb_DimCountry
    )

If I wanted to add rows to the fact table, I could do this with traditional data warehousing where I only load the new data based on the date.

I hope that you have found this useful. If you have any questions or comments, please let me know.



Source link

Be the first to comment

Leave a Reply

Your email address will not be published.


*