Scenario:
The user wants to convert the formatted forms such as Image1 into line-by-line data details such as Image2. For a convenient data analysis later.
Image1
Image2
Expected result:
The data needs to be displayed with line-by-line just like as Image2:
Image2
Detailed steps:
Let us start from convert single formatted form first
- Get data from excel workbook into Power BI
- Select the sheet to be imported in the pop-up window, and click the “Transform data” button as shown in below screenshot:
- Delete the “Changed Type” step added by Power Query automatically in order to avoid data type conversion errors : Click the “X” button before “Changed Type” in the “Query Settings” as shown in below screenshot:
- As of the current step, the results of the table are as follows.
Some columns are recognized as null, we don’t need to care about them. We only need to know where the value of each column is. For example, Applicant’s name “Jack” is in the first row of Column2 (Index: 0), so as long as you read “[Column2]{0}” in this table, you can get the Applicant’ Name. Similarly, the corresponding Age is in the third line of Column4, you can get the age by reading “[Column2]{0}” …The corresponding position as shown in below table as summarized the information:
Column Name |
Column Value |
Index |
Applicant’s name |
Column2 |
0 |
Age |
Column4 |
0 |
Gender |
Column6 |
0 |
Graduated school |
Column2 |
1 |
Hobby |
Column4 |
1 |
Telephone |
Column6 |
1 |
Mailbox |
Column2 |
2 |
|
|
|
- Construct a new table with #table function to meet the requirement base on the column position information obtained in the previous step. The related applied codes as follow, the codes with red font is new added/updated:
let
Source = Excel.Workbook(File.Contents("xxx\xx\xx\xx.xlsx"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
Updated_Sheet= #table(
{"Name of applicant","Age","Gender","Graduated school","Hobby","Telephone","Mailbox"},
{{Sheet1_Sheet[Column2]{0},Sheet1_Sheet[Column4]{0},Sheet1_Sheet[Column6]{0},Sheet1_Sheet[Column2]{1},
Sheet1_Sheet[Column4]{1},Sheet1_Sheet[Column6]{1},Sheet1_Sheet[Column2]{2}}}
)
in
Updated_Sheet
If we need to convert multiple formatted forms as show in below screenshot, what should we do?
- Get all formatted forms from excel into Power BI
- Create the following custom function based on the construction table used in the conversion of the single formatted form and put the custom function behind let keyword:
- Switch to “Add Column” tab, click “Custom Column” button. Input “=trans([Data])” in the pop-up dialog box, and click “OK” button.
- Click the data expansion button to the right of the column name of “Custom” column, uncheck the “Use original column name as prefix” option in the pop-up dialog box. Then click “OK” button.
- Remove the unnecessary columns: right click and choose the option “Remove Other Columns”
Finally, the table as show in below screenshot:
Here we can optimize the part about column position information (that is, step 2, 4 and 5 above):
- Add a mapping table, the specific content is as follows:
Column Name |
Column Value |
Index |
Applicant’s name |
Column2 |
0 |
Age |
Column4 |
0 |
Gender |
Column6 |
0 |
Graduated school |
Column2 |
1 |
Hobby |
Column4 |
1 |
Telephone |
Column6 |
1 |
Mailbox |
Column2 |
2 |
- Update the applied codes in step 2, 4 and 5 in Advanced Editor
- Replace {“Name of applicant”, “Age”, “Gender”, “Graduated school”, “Hobby”, “Telephone”, “Mailbox”} in the applied steps named “Expanded Custom” and “Removed Other Columns” with Column Name in Mapping table
- Same with the previous step, replace {“Name of applicant”, “Age”, “Gender”, “Graduated school”, “Hobby”, “Telephone”, “Mailbox”} in custom function with Column Name in Mapping table
- Change the custom function by using List.Transform and Record.Field : column value of construct table get from Mapping table instead of the original fixed values
trans=(t)=>
#table(
Mapping[Column Name],
{List.Transform(
Mapping[Column Name],
each Record.Field(
t{Mapping{[Column Name=_]}[Index]},Mapping{[Column Name=_]}[Column Value]
))})
The full applied codes as follow:
let
trans=(t)=>
#table(
Mapping[Column Name],
{List.Transform(
Mapping[Column Name],
each Record.Field(
t{Mapping{[Column Name=_]}[Index]},Mapping{[Column Name=_]}[Column Value]
)
) }
),
Source = Excel.Workbook(File.Contents("D:\Blogs\2021.1.9\Interviewer registration form.xlsx"), null, true),
#"Added Custom" = Table.AddColumn(Source, "Custom", each trans([Data])),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", Mapping[Column Name], Mapping[Column Name]),
#"Removed Other Columns" = Table.SelectColumns(#"Expanded Custom",Mapping[Column Name])
in
#"Removed Other Columns"
Hope this article helps everyone with similar questions here.
Author: Rena Ruan
Reviewer: Ula Huang, Kerry Wang
Be the first to comment