Data Wrangler VS. Power Query Editor – Microsoft Fabric


Power Query Editor and Data Wrangler are data transformation and preparation tools in Microsoft Fabric. There are similarities between these two tools. However, there are differences, too. It is essential to know the capabilities of each tool to understand which one should be used for what purpose and scenario. In this article, this is our quest.

Video

Data Wrangler

Data Wrangler is a tool for cleaning and preparing data. Data Scientists mainly use this tool, and it helps them generate Python code for data manipulation easier. This tool comes with a simple-to-use graphical user interface. The data has to be loaded into a Pandas DataFrame. Then, the tool can help the data scientists prepare the data and generate the Python code for later use in a more extensive process or application.

To learn more about the Data Wrangler, read my article here;

Power Query Editor

Power Query is the data transformation engine in Power BI. This engine comes with an editor called Power Query Editor. Power Query Editor first came to Excel in 2013, named Data Explorer. Through many years since then, the graphical interface of Power Query Editor has been improved significantly. Power Query Editor is available as a component in Power BI Desktop, Power BI Service, Power Apps portal, Excel, Microsoft Fabric portal, and others. Power Query Editor provides a rich graphical user experience for citizen data analysts to prepare and transform their data to the shape they want.

To learn more about Power Query Editor, read my article below;

What You Intend To Do?

One of the differences between these two tools is about what they each do. It matters if you want to transform the data or the data preparation to be part of a machine-learning process. Depending on the use case for the data, one of the tools might work better than the other. What you do can also determine the persona of the user you are dealing with the data. Are you a data analyst or data scientist? Tools available for each group can be different, providing different abilities.

Power Query Editor: an ETL tool

Power Query Editor is built for data preparation and data transformation. It is an ETL (Extract, Transform, and Load) tool. Because Power Query is an integral part of Power BI, it supports many data sources (over 200 data sources)

Power Query Editor makes it easy to connect to data sources and apply some rich data transformations with a simple-to-use GUI. There are transformations such as Merge, Append, Combining multiple files, etc., which greatly help in any data transformation scenario. Creating custom functions through the graphical interface is another powerful feature of this editor. It enables the data analyst to bundle repetitive transformation steps into a single function unit and execute it multiple times.

And Power Query Editor (when used in Dataflow Gen2) can load data into destinations. Also, if you use Power Query Editor in Excel, the data output can be loaded into an Excel sheet. Or using it in Power BI Desktop will load the data into a Power BI Dataset.

Power Query Editor is available almost anywhere you have data in the Microsoft suite: Power BI, Excel, Power Apps, Analysis Services, Fabric, etc. If you are dealing with data, perhaps there is a Power Query Editor component in that environment to make the experience more accessible.

Power Query Editor generates M script. M is a language specifically designed for data transformation. Unlike other languages with data-related functions, M is designed solely to work with data.

Data Wrangler: for Python Developers

Data Wrangler does have data preparation and transformation capabilities. However, the transformation experience is never close to Power Query Editor. The data sources are also limited. You have to load the data using some Python functions under the Pandas library, and you cannot apply as many transformations as you can using the Power Query Editor. However, Data Wrangler gives you a Python code. The Python code can be helpful if you want the data preparation to be part of a larger Python script, which may apply some machine-learning algorithms on the cleansed data afterward.

Data Wrangler is a tool for data scientists. Instead of remembering Python functions and writing the code for data preparation, Data Wrangler prepares the code for you.

A Python code can go a long way. Unlike M, a language merely for data transformation, Python is a programming language with many libraries. Writing a piece of code in Python that does many more things than just preparing the data is possible. Data Wrangler is an excellent tool for running the data preparation code as part of a larger Python code.

Sample Scenarios

There is no single tool for everybody. Different requirements, scenarios, and use cases cause different tools to become more efficient. Here are a couple of those scenarios;

  • Power Query Editor gives you a better experience if you want to transform the data.
  • If the data preparation and transformation is part of a larger Python code for further machine-learning processes, then Data Wrangler prepares the Python code for you.

Summary: Difference Chart

Data Wrangler and Power Query Editor can be used for data preparation and transformation. The table below summarizes the differences between Power Query Editor and Data Wrangler. The critical point is that no one tool can answer all your requirements. Each tool has its place in the data analytics project.

Point of difference Power Query Editor Data Wrangler
Available in the environments Microsoft Fabric
Power BI
Power Apps
Analysis Services
Excel
…
Microsoft Fabric
Azure Synapse
Target User Citizen Data Analyst Data Scientist
Primary function ETL Data Wrangling and
generating Python Code
Data sources More than 200 Limited, supported by the Pandas
Data Transformations Many functions including
but not limited to
Merge, Append, create
custom functions, etc.
Limited, supported by the Pandas
Data Destinations Four destinations in Dataflow Gen2
Excel output for Excel Power Query
Power BI Dataset output
Analysis Services output
etc
Limited, supported by the Pandas
Generating code M script Python code
Primary use case Data Transformation
Data Preparation
ETL
Generating Python code for
data preparation steps of a larger
machine-learning process
Reza Rad

Trainer, Consultant, Mentor

Reza Rad is a Microsoft Regional Director, an Author, Trainer, Speaker and Consultant. He has a BSc in Computer engineering; he has more than 20 years’ experience in data analysis, BI, databases, programming, and development mostly on Microsoft technologies. He is a Microsoft Data Platform MVP for 12 continuous years (from 2011 till now) for his dedication in Microsoft BI. Reza is an active blogger and co-founder of RADACAD. Reza is also co-founder and co-organizer of Difinity conference in New Zealand, Power BI Summit, and Data Insight Summit.

Reza is author of more than 14 books on Microsoft Business Intelligence, most of these books are published under Power BI category. Among these are books such as Power BI DAX Simplified, Pro Power BI Architecture, Power BI from Rookie to Rock Star, Power Query books series, Row-Level Security in Power BI and etc.

He is an International Speaker in Microsoft Ignite, Microsoft Business Applications Summit, Data Insight Summit, PASS Summit, SQL Saturday and SQL user groups. And He is a Microsoft Certified Trainer.

Reza’s passion is to help you find the best data solution, he is Data enthusiast.

His articles on different aspects of technologies, especially on MS BI, can be found on his blog: https://radacad.com/blog.



Source link

Be the first to comment

Leave a Reply

Your email address will not be published.


*