Power BI Export Limitations: Part 3 & 4


“It’s time to find out if we can ‘jailbreak’ Power BI”

 

In the first part of this series I discussed how Power BI does not meet my client’s stated requirement of being able to download all of the underlying data, with totals, regardless of size limits. In this final part, I am going to discuss several options for intercepting and reinterpreting the requests.

Before I get into the nitty-gritty, I would like to very briefly mention that some of the techniques here are not very secure, and I would not recommend them for most installations. That said, there are viable workarounds here – so here we go!

 

Intercept and reinterpret

 

The intercept-and-reinterpret models vary in terms of security and scalability; however, they all share two defining traits: 1) they determine the dataset the user is trying to access, and 2) they send information about that dataset from Power BI to an outside service, which accesses the full set of underlying data and compiles it.

 

What Microsoft wants us to do: Embedded solutions

 

If paginated reports are not slick enough for your users, Microsoft offers embedded solutions. Embedded solutions essentially display Power BI in an iFrame, and you can set up your site navigation (or “chrome” in industry parlance) around the iFrame. In order to meet the client’s requirement, we could move all of the filtering and slicing in the reports to the embedded solution’s chrome and then write a web service that takes the filters as arguments.

Once the web service has all of the contextual data passed from the embedded solution, it is free to spool unlimited data to a file and pass the file back to the user. This would typically be done using an asynchronous JavaScript callback to the browser when the file is prepared.
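As a rough sketch of the service side – assuming, purely for illustration, that the chrome passes its filter state as column/values pairs – the web service might translate those filters into a parameterized WHERE clause before spooling the full, unpaginated result set to a file:

```typescript
// Hypothetical shape of the filter state the embedded chrome passes in.
interface ExportFilter {
  column: string;   // e.g. "Region"
  values: string[]; // e.g. ["East", "West"]
}

// Translate the chrome's filter state into a parameterized WHERE clause.
// Returning placeholders plus a parameter array (rather than inlining
// values) keeps the unlimited export safe from SQL injection.
function buildWhereClause(filters: ExportFilter[]): { clause: string; params: string[] } {
  const params: string[] = [];
  const predicates: string[] = [];
  for (const f of filters) {
    if (f.values.length === 0) continue; // an empty filter means "no restriction"
    const placeholders = f.values.map(v => {
      params.push(v);
      return `@p${params.length - 1}`;
    });
    predicates.push(`[${f.column}] IN (${placeholders.join(", ")})`);
  }
  const clause = predicates.length > 0 ? `WHERE ${predicates.join(" AND ")}` : "";
  return { clause, params };
}
```

The service would hand the clause and parameter array to its database driver, run the unrestricted query, and stream the rows to the file it later returns via the callback.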

 

There are not too many downsides to this approach, except that it is expensive and custom. While this may be an option for some large installations, it was not an option for my client.

 

Custom Power BI Visualization

 

Another solution that I came across in my research is to implement (or purchase) a custom Power BI visualization. This visualization displays a “download” icon and works similarly to the embedded solution. The visualization accepts an argument of a web service, which would have to be input by the report author. The visualization then uses the Power BI JavaScript API to get contextual information and sends it to the web service.
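To make this concrete, here is a minimal sketch of one piece such a visual would need: packaging the contextual information it reads through the Power BI JavaScript API into a request for the author-configured web service. The payload shape and the `f.` query-string prefix are illustrative assumptions, not part of any real product:

```typescript
// Hypothetical shape of the context the visual reads through the
// Power BI JavaScript API before calling the export service.
interface VisualContext {
  dataset: string;                   // name of the underlying dataset
  filters: Record<string, string[]>; // column -> selected values
}

// Build the URL the visual's "download" icon would request. The
// endpoint is supplied by the report author; the "f." prefix for
// filter parameters is an illustrative convention, not a standard.
function buildExportUrl(endpoint: string, ctx: VisualContext): string {
  const qs = new URLSearchParams();
  qs.set("dataset", ctx.dataset);
  for (const [column, values] of Object.entries(ctx.filters)) {
    for (const value of values) {
      qs.append(`f.${column}`, value);
    }
  }
  return `${endpoint}?${qs.toString()}`;
}
```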

It seems as though one vendor, MAQ Software, has implemented a solution of this sort; however, my client, who received a demo of the software, complained that it was very slow (20 minutes for a 50-page report). I called MAQ Software for a demo, but I was unable to reach anyone there.

 

This solution was not performant enough, but perhaps, if implemented differently, it could work. Again, however, this is expensive and custom.

 

The last hacky workaround – intercept the SQL

 

I’m not sure if I’m proud of this or not, but it’s my invention… The final workaround I investigated was intercepting Power BI’s queries on the database server itself. I attached SQL Server Profiler to the database server (note that this only works in DirectQuery mode) to see how Power BI makes its calls. As I had expected, Power BI runs “SELECT TOP 1000” queries when getting data from the server.

At least in theory – although I have not tested this extensively – this could provide an avenue to a frictionless approach for getting the data. Essentially, we run a listener service on the database server that monitors these “SELECT TOP” queries and the IPs they come from. Based on timing and some other criteria, we could theoretically figure out which data a user wants when they click a link in the report, and serve them that data without a web service.
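The rewrite step of such a listener is easy to sketch. Assuming the captured statement follows the “SELECT TOP 1000” pattern observed in the trace, the listener could strip the TOP clause to derive the unlimited query it would run on the user’s behalf (the function name and shape are my own invention):

```typescript
// Given a statement captured from the trace, derive the unlimited
// query by stripping the leading TOP clause. Returns null when the
// statement does not look like one of Power BI's paged reads, so the
// listener can ignore unrelated traffic.
function stripTopClause(sql: string): string | null {
  // Matches "SELECT TOP 1000 ..." and the parenthesized "SELECT TOP (1000) ..." form.
  const match = /^\s*SELECT\s+TOP\s+\(?\d+\)?\s+/i.exec(sql);
  if (match === null) return null; // not a "SELECT TOP n" query
  return "SELECT " + sql.slice(match[0].length);
}
```

The hard part, of course, is not the rewrite but the correlation: matching a captured query to the right user and the right click, which is exactly where the timing heuristics above come in.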

 

I think this would be an interesting product, although I have not done the level of research needed to determine whether it is productizable. Certainly, it is not very secure (you are essentially performing a man-in-the-middle attack on yourself), and it seems costly in terms of compute resources.

 

I want to be clear about one thing: Power BI is a great piece of reporting software. It’s easy to use, fast, and generally excellent at what it’s supposed to do. It still has a couple of rough edges, and I spend my time writing about those, not about what makes the product great.

 

The product is great but…

 

This is a big miss. In 2020, the hard data limits I wrote about in part 1 (30k rows to CSV or 150k rows to XLSX) are simply unacceptable. These limits should be an order of magnitude larger to be representative of real-world usage. If you have fewer than a couple hundred thousand rows of data, why would you even use Power BI?

The workarounds are too expensive or too clunky:

 

Paginated reports are at best a suitable workaround for large reports and at worst a clunky bolt-on of an outdated product (Reporting Services, I’m looking at you) that should have been revisited wholesale.

 

Embedded reports are expensive and require a much broader team to deliver and maintain.

 

Don’t even get me started on unraveling data cubes or attaching to the SQL server in order to intercept and send back data.

 

In conclusion

 

I delivered a summary of all this research to my client, who was pleased at the lengths I went to in exploring workarounds. Ultimately, the client decided to go with an embedded solution, but that was not in this year’s budget, so it will have to wait until next year.

 

I did mention to them that Tableau supports this functionality; however, Tableau is much more expensive than Power BI and, in many cases, a worse solution.

 

Microsoft needs to address the community feature request that has been open for four years. It’s time to update the product to meet the needs of modern analysts with regard to export.
