Combining multiple Excel files with different shee…


Problem description

Eventhough Power BI can connect to hundreds of data sources, in lot of scenarios people choose Excel instead of a more technical solution.
Excel is usually connected to Power BI in two ways:

  1. One way, is to connect one big file, which changes over time and with every single refresh new data is pulled to Power BI.
  2. Second way, is when people use folders, where multiple files are being stored. Typical scenario would be that each month  there is a new file, which must be combined with other files from previous months.

Problem I am going to explain is related to the second scenario, where you try to combine multiple files, while Excel sheets have different names. For example, one file contains sheet “Values 2020”, and the other sheet “Values 2021”. With the default autogenerared function in Power Query this is not possible. Let’s look at the real life scenario:

Initial Setup

In my demonstration, I connected one of my SharePoint using SharePoint Folder connector. Result of such a connection can be seen below.

Migasuke_4-1672935472887.png

 

My SharePoint Folder contains Excel files named “2020”, “2021”, and “2022”. The data structure of all those excel files is the same. Of course, my goal here is to combine all three files into one single query. I have two sheets and each sheet contains a very simple table with only three columns and several rows:

 

Migasuke_1-1672935439930.png

As we can see my file “2020” contains sheet “Revenue 2020” and “Cost 2020”. Logically my other file “2021” has sheets with “Revenue 2021”, “Cost 2021” etc.

Problem Solving

Now if I jump back to the Power Query editor, where I already connected my SharePoint folder, I will  click on the option of combining files:

Migasuke_8-1672935715849.png

 

Next window will pop up with additional options:

Migasuke_7-1672935676088.png

I will combine sheets with revenue, which will create new Power Query function in my Query list:

Migasuke_9-1672935771619.png

The most important is our function called “Transform File”. This particular query is responsible for our excel sheets combination. Unfortunately, by default it does not work as we usually intended. Let’s look at the original query and see how the data looks like:

Migasuke_10-1672935821701.png

We can see that only rows from file “2022” were imported. The reason why other files were not properly imported is because none of them contains my sheet called “Revenue 2022”. How can we fix this problem? Instead of connecting to a specific sheet name, we can simply use the position of the sheet. For example, if the data with revenue is always in the first sheet, we will connect only the first sheet from each file – different name does not matter anymore.

For this, we need to go to autogenerated function “Transform File” and open Advanced Editor:

Migasuke_11-1672935994795.png

In the Advanced Editor we need to pay extra attention to the step #“Revenue 2022_Sheet“. In this step we need to remove the hardcoded name and replace it with a more dynamic approach.

Migasuke_12-1672936035505.png

 

The solution is that we remove everything in the curly brackets and replace it with a number. In my case we will use 0, since 0 represents first sheet, 1 represents second sheet etc.

 

Migasuke_13-1672936060891.png

Now when we return to our original query, we can see that all data is imported as we wanted:

Migasuke_14-1672936091574.png

 

 

As you can see, the solution is quite easy but it has huge impact, especially if you use Excel as your main data source.

 



Source link

Be the first to comment

Leave a Reply

Your email address will not be published.


*