Power Query to remove unwanted Columns dynamically from Excel Dataset before load to Power BI | Power BI Blog

How to load only required Columns dynamically from Excel Dataset into Power BI using Power Query

Suppose, we have an excel dataset with some required and the unwanted columns like Colum2, Column 4, Prod_Id_2, Country_Id2, Cust_Id – CopyCust_Id – Copy.1, Country_Id3..etc

These kind of unwanted Columns can be removed as per and load only the required Columns as explained below:

1) First Load the Dataset(not Data) and keep only Source Step:

Source = Excel.Workbook(File.Contents(xl_DataSource), null, true)

Note: I have used the Parameters for DataSource and Dataset :

xl_DataSource = E:\Tech_Lab\Power BI\DataSets\ds_Sample_Orders.xlsx

xl_Dataset = tbl_OrderDetails

2) Next get the list of Column Names from the Dataset:

= Table.ColumnNames(Table.PromoteHeaders(Source{[Item=xl_Dataset,Kind=”Sheet”]}[Data], [PromoteAllScalars=true]))

3) Next convert the list of Column Names to a Table and then create a Flag column to identify the Columns to Remove from the Dataset:

ConvertToTable = Table.RenameColumns(Table.FromList(SelectColNames, Splitter.SplitByNothing(), null, null, ExtraValues.Ignore),{{“Column1”, “ColumnNames”}})

FlagCols2Remove = Table.AddColumn(ConvertToTable, “RemoveCols_Flag”, each if ([ColumnNames] is null or [ColumnNames] = “null” or Text.Length(Text.Select([ColumnNames],{“0″..”9″}))>0) then 1 else if Text.Contains([ColumnNames],”- Copy”) then 1 else 0)

Note :

The “RemoveCols_Flag” retruns 1 for the Columns to Remove and 0 to Keep.

4) Next generate a final list of ColumnNames to Remove where RemoveCols_Flag=1

ColsList2Remove= (Table.SelectColumns(Table.SelectRows(FlagCols2Remove, each ([RemoveCols_Flag] = 1)),”ColumnNames”))[ColumnNames]

5) Finally, pass the above list of Columns to Source Query, to remove them and Keep only the required columns to load

SelectReqCols= Table.RemoveColumns(Table.PromoteHeaders(Source{[Item=xl_Dataset,Kind=”Sheet”]}[Data], [PromoteAllScalars=true]),ColsList2Remove)

Result :

The complete Power Query used in this Scenario is as follows:


    Source = Excel.Workbook(File.Contents(xl_DataSource), null, true),

    SelectColNames = Table.ColumnNames(Table.PromoteHeaders(Source{[Item=xl_Dataset,Kind=”Sheet”]}[Data], [PromoteAllScalars=true])),

    ConvertToTable = Table.RenameColumns(Table.FromList(SelectColNames, Splitter.SplitByNothing(), null, null, ExtraValues.Ignore),{{“Column1”, “ColumnNames”}}),

    FlagCols2Remove = Table.AddColumn(ConvertToTable, “RemoveCols_Flag”, each if ([ColumnNames] is null or [ColumnNames] = “null” or Text.Length(Text.Select([ColumnNames],{“0”..”9″}))>0) then 1 else if Text.Contains([ColumnNames],”- Copy”) then 1 else 0),

    ColsList2Remove = (Table.SelectColumns(Table.SelectRows(FlagCols2Remove, each ([RemoveCols_Flag] = 1)),”ColumnNames”))[ColumnNames],

    SelectReqCols = Table.RemoveColumns(Table.PromoteHeaders(Source{[Item=xl_Dataset,Kind=”Sheet”]}[Data], [PromoteAllScalars=true]),ColsList2Remove),

    #”Changed Type” = Table.TransformColumnTypes(SelectReqCols,{{“Order_Id”, Int64.Type}, {“Order_Date”, type datetime}, {“Prod_Id”, Int64.Type}, {“Country_Id”, Int64.Type}, {“Cust_Id”, Int64.Type}, {“Gross_Sales”, Int64.Type}})


    #”Changed Type”


Thanks, TAMATAM ; Business Intelligence & Analytics Professional


Source link

Be the first to comment

Leave a Reply

Your email address will not be published.