How to convert multiple formatted forms to line-by…


Scenario:  
The user wants to convert the formatted forms such as Image1 into line-by-line data details such as Image2. For a convenient data analysis later.

Image1

1.jpg

Image2

2.jpg

Expected result: 

The data needs to be displayed with line-by-line just like as Image2:

Image2

3.jpg

Detailed steps:   

Let us start from convert single formatted form first

  1. Get data from excel workbook into Power BI

 

  1. Select the sheet to be imported in the pop-up window, and click the “Transform data” button as shown in below screenshot:

4.jpg

  1. Delete the “Changed Type” step added by Power Query automatically in order to avoid data type conversion errors : Click the “X” button before “Changed Type” in the “Query Settings” as shown in below screenshot:

5.jpg

  1. As of the current step, the results of the table are as follows.

6.jpg

Some columns are recognized as null, we don’t need to care about them. We only need to know where the value of each column is. For example, Applicant’s name “Jack” is in the first row of Column2 (Index: 0), so as long as you read “[Column2]{0}” in this table, you can get the Applicant’ Name. Similarly, the corresponding Age is in the third line of Column4, you can get the age by reading “[Column2]{0}” …The corresponding position as shown in below table as summarized the information:

Column Name

Column Value

Index

Applicant’s name

Column2

0

Age

Column4

0

Gender

Column6

0

Graduated school

Column2

1

Hobby

Column4

1

Telephone

Column6

1

Mailbox

Column2

2

 

 

 

 

  1. Construct a new table with #table function to meet the requirement base on the column position information obtained in the previous step. The related applied codes as follow, the codes with red font is new added/updated:
let
    Source = Excel.Workbook(File.Contents("xxx\xx\xx\xx.xlsx"), null, true),
    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
    Updated_Sheet= #table(
    {"Name of applicant","Age","Gender","Graduated school","Hobby","Telephone","Mailbox"},
    {{Sheet1_Sheet[Column2]{0},Sheet1_Sheet[Column4]{0},Sheet1_Sheet[Column6]{0},Sheet1_Sheet[Column2]{1},
Sheet1_Sheet[Column4]{1},Sheet1_Sheet[Column6]{1},Sheet1_Sheet[Column2]{2}}}
    )
in
    Updated_Sheet

  

7.jpg

 

If we need to convert multiple formatted forms as show in below screenshot, what should we do?

8.jpg

  1. Get all formatted forms from excel into Power BI

9.jpg

  1. Create the following custom function based on the construction table used in the conversion of the single formatted form and put the custom function behind let keyword:

10.jpg

  1. Switch to “Add Column” tab, click “Custom Column” button. Input “=trans([Data])” in the pop-up dialog box, and click “OK” button.11.jpg
  2. Click the data expansion button to the right of the column name of “Custom” column, uncheck the “Use original column name as prefix” option in the pop-up dialog box. Then click “OK” button.

12.jpg

  1. Remove the unnecessary columns: right click and choose the option “Remove Other Columns

13.jpg

Finally, the table as show in below screenshot:

14.jpg

Here we can optimize the part about column position information (that is, step 2, 4 and 5 above):

  1. Add a mapping table, the specific content is as follows:

Column Name

Column Value

Index

Applicant’s name

Column2

0

Age

Column4

0

Gender

Column6

0

Graduated school

Column2

1

Hobby

Column4

1

Telephone

Column6

1

Mailbox

Column2

2

 

  1. Update the applied codes in step 2, 4 and 5 in Advanced Editor
  • Replace {“Name of applicant”, “Age”, “Gender”, “Graduated school”, “Hobby”, “Telephone”, “Mailbox”} in the applied steps named “Expanded Custom” and “Removed Other Columns” with Column Name in Mapping table

 

  • Same with the previous step, replace {“Name of applicant”, “Age”, “Gender”, “Graduated school”, “Hobby”, “Telephone”, “Mailbox”} in custom function with Column Name in Mapping table

 

  • Change the custom function by using List.Transform and Record.Field : column value of construct table get from Mapping table instead of the original fixed values
trans=(t)=>
    #table(
   Mapping[Column Name],
    {List.Transform(
        Mapping[Column Name],
        each Record.Field(
            t{Mapping{[Column Name=_]}[Index]},Mapping{[Column Name=_]}[Column Value]
      ))})

 

15.jpg

 

The full applied codes as follow:

let
    trans=(t)=>
    #table(
   Mapping[Column Name],
    {List.Transform(
        Mapping[Column Name],
        each Record.Field(
            t{Mapping{[Column Name=_]}[Index]},Mapping{[Column Name=_]}[Column Value]
      )
) }
),
    Source = Excel.Workbook(File.Contents("D:\Blogs\2021.1.9\Interviewer registration form.xlsx"), null, true),
    #"Added Custom" = Table.AddColumn(Source, "Custom", each trans([Data])),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", Mapping[Column Name], Mapping[Column Name]),
    #"Removed Other Columns" = Table.SelectColumns(#"Expanded Custom",Mapping[Column Name])
in
    #"Removed Other Columns"

 

Hope this article helps everyone with similar questions here. 

 

Author:  Rena Ruan

Reviewer: Ula Huang, Kerry Wang

How to convert multiple formatted forms to line-by-line data details.pbix



Source link

Be the first to comment

Leave a Reply

Your email address will not be published.


*