How to remove spaces from abbreviations in values …


Often people have problems with spaces in the abbreviations when importing data in Power BI Desktop. The sample data and the expected result are first provided below for the reader’s reference.


Sample Data and expected result:
Sample data:


Expected result:



Method 1:

It was found that many beginners will use the following method to solve this problem. They may replace the abbreviations one by one.



If the amount of data is small, we can do this, then in case we have thousands of abbreviations, the workload is quite voluminous.


Method 2:

Now I will introduce a dynamic way to remove spaces from abbreviations. The operations are generated by adding custom columns.
1. Separate the text of each line with spaces to get the lists.




2. Get the length of each line of text and convert them to lists.




3. Take a list of lists, lists, and returns a list of lists combining items at the same position.




To show this function in detail here, I expand it to a new row to show you.





4. The following is to determine whether the lists without the first character in the Zipped column is equal to {1,1}. Because {1,1} is part of the English abbreviation with spaces. For example, if the {B,1,1} list does not contain the first character is {1,1}, which is equal to {1, 1}, then “” & “B” are returned. If it is {“Power”,5,1}, then return “Power” & ” “.





5. At the end of the above steps, it looks like the result has been obtained. But there are still some differences, for example “Power B I Desktop”, you will get “Power BI Desktop ” with one more space. This is where we have to add another step to remove the trailing space.



6. Then remove the unnecessary columns to get the final result.



The entire M language in the advanced editor as follow:


    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCsgvTy1ScFLwVIrVQeIpuKQWZ5fkF4BFg1KLSxQcFQKgioIUfBSCFYJTk0uLMksqFTLzgDJgA2IB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Value = _t]),
    #"Added Custom" = Table.AddColumn(Source, "TextToList", each Text.Split([Value], " ")),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "TextLengths", each List.Transform([TextToList], each Text.Length(_))),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Zipped", each List.Zip({[TextToList], [TextLengths], List.Skip([TextLengths])})),
    #"Added Custom3" = Table.AddColumn(#"Added Custom2", "Accu", each List.Accumulate([Zipped], "",
(state, current) => 
if List.Skip(current)= {1,1}
then state & List.First(current)
else state & List.First(current) & " ")),
    #"Added Custom4" = Table.AddColumn(#"Added Custom3", "Results", each Text.Trim([Accu])),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom4",{"Value", "TextToList", "TextLengths", "Zipped", "Accu"})
    #"Removed Columns"



Hope this article helps everyone with similar questions here.


Author: Stephen Tao

Reviewer: Ula Huang, Kerry Wang


How to remove spaces from abbreviations in values in Power Query.pbix

Source link

Be the first to comment

Leave a Reply

Your email address will not be published.