Theme ggplot2 to Power BIs visual style

For a recent project, I’ve had to hit the R scripting and use the R Visuals to plug a gap in Power BI (PBI). Even though PBI is very capable, it does not have the full range of statistical formulas that Excel has. So I’ve had to build Linear Regression formulas in DAX, and also calculate in Power Query some coefficients using R. For the visuals, again I hit some limits with Power BI. I needed to use a measure as an axis, I also needed to show a polynomial trend line, so had to use the R visual and the ggplot2 library to display the data.

I’ve not used R much, I’ve been on a SQL Bits training day about it, and one of my colleagues, was quite good at it (However they have since moved on), so it was a nice move out of my comfort zone and I get to learn something too!

Note: If you want to follow this blog post you’ll need Microsoft R Open and install the ggplot2 library. Also useful is the ggplot2 reference web site

In this example I’ve started with a blank PBI file, and used the ‘Enter Data’ function to create a column with single value in it. You don’t need to do this if you are using you own data, I just need something to drag into the R visual to use as a data set. I’ll actually be using one of the inbuilt R data sets for the visual. You can download the example PBI file from my GitHub

So lets start with the basic set up in the R Visual.

library(ggplot2)

#Base Chart uses the iris dataset
chart <-    ggplot(iris, aes(x = Petal.Width, y = Sepal.Length)) + 
            geom_point() + 
            stat_smooth()      

#display
chart

Which renders the following chart. (Note the rounded corners in the visual, thanks to the Feb PBI Desktop update)

Lets break this down. First it calls ggplot, using the ‘iris’ data set and assigns the Petal.Width and Sepal.Length to the relevant axis.

ggplot(iris, aes(x = Petal.Width, y = Sepal.Length))

Plots the points on the chart, don’t miss this bit out, I did, and could not understand why I wasn’t seeing any data

geom_point()

Add the trend line and the shading.

stat_smooth()

So far so good. But it does not fit the Power BI style, and will look a bit out of place along side the base PBI ones. However we can fix that. First we are going to add two variables called BaseColour and LightGrey and assign them a hex colour value so they can be used without having to recall the hex values. So the base code will look like:

library(ggplot2)

#BaseGrey
BaseColour = "#777777"
LightGrey = "#01B8AA"

#Base Chart
chart <-    ggplot(iris, aes(x = Petal.Width, y = Sepal.Length)) + 
            geom_point() + 
            stat_smooth()      

#display
chart

Nothing should change in the visual. So next I’m going to update the stat_smooth function, to remove the shaded area and change the colour of the line.

stat_smooth(col = LightGrey, se=FALSE)

‘col’ assigns the colour, ‘se’ removes the shaded area.

For the next set of updates to the visual, we are going to update the theme setting, specifically the following:

  • Axis Text
  • Axis Ticks
  • Panel Grid
  • Panel Background

So lets start with the text. if you poke around the PBI formatting settings you’ll come across the default font family, size and colour in the standard PBI visuals. So next we are going to add the following:

First set the ‘axis.text’ elements to the size, font and use the variable ‘BaseColour’

axis.text = element_text(size=11, family=”Segoe UI”, colour=BaseColour)

remove the axis ticks with

axis.ticks = element_blank()

and set the axis text, setting the colour using the BaseColour variable

axis.title = element_text(size=11, family=”Segoe UI”, colour=BaseColour)

but will need to wrap it up in the ‘theme’ stuff, so the code now looks like:

library(ggplot2)

#BaseGrey
BaseColour = "#777777"
LightGrey = "#01B8AA"

chart <-    ggplot(iris, aes(x = Petal.Width, y = Sepal.Length)) + 
            geom_point() + 
            stat_smooth(col = LightGrey, se=FALSE) 

#Build theme
chart <- chart + theme(      axis.text = element_text(size=11, family="Segoe UI", colour=BaseColour)
                         ,   axis.ticks = element_blank()
                         ,   axis.title = element_text(size=11, family="Segoe UI", colour=BaseColour)     
                        )

#display
chart

Which hopefully should be getting close to the PBI look:

So just the grid line and backdrop to sort out, which will be added to the theme set up as:

Set the grid for the ‘y’ axis.

panel.grid.major.y = element_line( size=.1, color=BaseColour )

Set the grid for the ‘x’ axis, basically get rid of it using the element_blank()

panel.grid.major.x = element_blank()

And set the backdrop to blank as well

panel.background = element_blank()

which should mean that the code should now look like:

library(ggplot2)

#BaseGrey
BaseColour = "#777777"
LightGrey = "#01B8AA"

chart <-    ggplot(iris, aes(x = Petal.Width, y = Sepal.Length)) + 
            geom_point() + 
            stat_smooth(col = LightGrey, se=FALSE) 

#Build theme
chart <- chart + theme(      axis.text = element_text(size=11, family="Segoe UI", colour=BaseColour)
                         ,   axis.ticks = element_blank()
                         ,   axis.title = element_text(size=11, family="Segoe UI", colour=BaseColour)     
                         ,   panel.grid.major.y = element_line( size=.1, color=BaseColour )  
                         ,   panel.grid.major.x = element_blank()   
                         ,   panel.background = element_blank()  
                        )

#display
chart

And it should match the look of the default Power BI style.

So with a few bits of code added to the theme, you can change the look of an R visual, to match the default PBI theme, or what every you want it to look like. I’m not a R expert in any measure, and there maybe a better way of doing this, but its got me started, and hopefully you too.

Update: The BBC released a theme for R, along with a cook book on how to do visuals, I may use this as a base for my next project.

Your Brain & Report Design Part 1 – The Attentive vs the Intuitive Process

image11I do a number of Power BI workshops, were over a few days I guide people through the features of Power BI, what it is good at, what its limitations are, and after a training exercise on a basic data set to familiarise the clients with Power BI Desktop, we get started on report building with their own data. In that phase of the workshop, it is always interesting to see what they build, particularly how they lay it out and what colours they use.

After some, lets say, very interesting design choices, I started adding in a section into the workshop slides on report design best practice, mainly based on the guidance in the book ‘Information Dashboard Design by Stephen Few‘ and ‘Storytelling with Data by Cole Knaflic‘. In these books, and a few others, it talks about the ‘attentive process’, and how by incorrectly choosing visuals you make the information presented hidden, obscured or you just can’t see the pattern.Read More »

Power Query & finding the Nth instance of a letter in a string for manipulation

Power BI LogoPower Query and the M language is a bit of a pain, not a lot of people do it (Or they do, and they don’t write about it), and the knowledge base on MSDN is not real world example friendly. So trying to figure out how to do something you know how to do in say SQL or DAX is not that easy some times. Where possible it is best to do as much as you can in the Power Query loading, otherwise you’ll be burning CPU time in the data model. So things like data cleansing should be done in Power Query, not in the DAX after it has been loaded into memory.

So find the Nth instance of a item in string I can do in SQL, but in Power Query it is a little bit different. So lets start with an example that contains the following.

ABC-01-02-03-XYZ

So I had to remove the ‘XYZ’ bit. Easy right?


Text.Start("ABC-01-02-03-XYZ", 12)

No, turns out the data contain items like

AB-012-02-03-XYZ

ABC-01-02-09

AB-012-032-03-XGYZ

So no quick fix by just using basic string manipulation. So what can we use? In this case you can use the Text.PositionOfAny function, and with this you can set it to find the first, last or all occurrences. So in this case, looking for the first instance of ‘-‘, you would get:

 


Text.PositionOfAny("ABC-01-02-03-XYZ", {"-"}, Occurrence.First)

Would return 3. Wait, not 4? No as it starts from ‘0’. It’s like a JSON array or Python dictionary index. This will be important later.


Text.PositionOfAny("ABC-01-02-03-XYZ", {"-"}, Occurrence.Last)

Returns 12.


Text.PositionOfAny("ABC-01-02-03-XYZ", {"-"}, Occurrence.Any)

Would in a custom column, just return ‘List’, or internally {‘3’, ‘6’, ‘9’, ’12’}. You can expand the list out if needed and create extra rows, but you should not in this case.

So to strip out the ‘XYZ’ part of ‘ABC-01-02-03-XYZ’

So you can use Text.Start, so it will read from the start of the string to the end, which can be set from a list.


Text.Start("ABC-01-02-03-XYZ", (Text.PositionOfAny("ABC-01-02-03-XYZ" , {"-"}, Occurrence.All){3}))

So in the above code the {3} is get the value from list in that location. Remember lists start from 0, and {3} would return the value 12 in the list {‘3’, ‘6’, ‘9’, ’12’}

But for the other instances in the data, {3} might not exist. However you can problematically strip stuff out, with for example an ‘if’ clause, with List.Count and drive the extraction of the data, based in the number of items in the list.

So for example in the data I have in the column [SortID] the following

AB-012-02-03-XYZ

ABC-01-02-09

if List.Count(Text.PositionOfAny([SortID] , {"-"}, Occurrence.All)) <= 3
then [SortID]
else Text.Start([SortID], (Text.PositionOfAny([SortID] , {"-"}, Occurrence.All){3}))

So in the 'if' clause List.Count on ABC-01-02-09 will return 3, and then do nothing, as I don't need to trim the end off it. Otherwise it will take AB-012-02-03-XYZ and find the position of the 3rd list value, and take the string length from it found in the Text.PositionOfAny, and trim the end off.

If you want here is some example M code to recreate the above. Enjoy

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcnRy1jUw1DUw0jUw1o2IjFKK1UERtFSKjQUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [SortID = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"SortID", type text}}),
#"Basic String Manipulation" = Table.AddColumn(#"Changed Type", "Basic String Manipulation", each Text.Start([SortID], 12)),
#"First Occurrance" = Table.AddColumn(#"Basic String Manipulation", "First Occurrance", each Text.PositionOfAny([SortID], {"-"}, Occurrence.First)),
#"Last Occurrance" = Table.AddColumn(#"First Occurrance", "Last Occurrance", each Text.PositionOfAny([SortID], {"-"}, Occurrence.Last)),
#"All Occurance" = Table.AddColumn(#"Last Occurrance", "All Occurrance", each Text.PositionOfAny([SortID], {"-"}, Occurrence.All)),
#"If Clause" = Table.AddColumn(#"All Occurance", "If Clause", each if List.Count(Text.PositionOfAny([SortID] , {"-"},
Occurrence.All)) <= 3
then [SortID]
else Text.Start([SortID], (Text.PositionOfAny([SortID] , {"-"}, Occurrence.All){3})))
in
#"If Clause"

 


 

 

 

SUMX vs DISTINCT COUNT

During the preparation for delivering a Power BI training session for a client, I was looking at the Tabular Data Model that was their data source, and I was struggling with a long running query. It was taking 1 minute and 24 seconds to return the values after chewing through 2 million rows of data. So I had a go at optimising it… and got it down to 6 seconds.

Mind Blown

Wow! When the query ran in that short of time, I went ‘No way… that’s not right, what have I done wrong?’, but no, after checking it was right.

Let’s start at the beginning…Read More »

Power BI Embedded, SKU Differences and Cost Breakdowns

There still seems to be a bit of confusion about Power BI licensing since the changes made back in May 2017. I get a number of questions about embedding options, were a client wants to share reports internally via SharePoint and/or externally. Now if a customer is a large enterprise and wants to share internally via SharePoint, the best option is Power BI Premium. In Premium you can allocate Power BI work spaces to capacity, so those who log into Power BI with a ‘Free’ license or use SharePoint can see the reports they need. However for smaller organisations paying £3,766 per month is not going to be viable. There are some other options, which is were the confusion creeps in. I’ll be using a very raw measure of Monthly Cost to Power BI Pro Licenses, to show how to get the best bang for your buck/quid/currency denomination of your choice.Read More »

Power BI Custom Visuals – My top picks part one

Recently I presented a talk at the SQL Server Birmingham User Group, chatting about how you interpret information and some of the best practice methods of presenting your reports. I had few questions asked at the end of the session, but got stuck on one, which was ‘Which Power BI Custom Visuals do you use and recommend?’.  I was stuck for a moment and named a one or two, but I do use a few more, so here are my top picks from the Office Store.

Please note, the following list is in not any order of use, but it does seem to be split between OK Viz visuals and the rest. So out of 12 I use the most, 5 are from them, which is not surprising as they are awesome. In fact in adding the links to this post, they have released a new one, the Colour Helper. As I am super duper colour blind, it is very helpful!Read More »

Power BI & SAP HANA… can they play together nice?


I’ve been doing a bit of work for a global company who are SAP based. They use the BW solution and they are moving to the SAP HANA database solution (In memory database, swish!), however they have recently moved from Lotus Notes (blah!) to Office 365 (nice!) so they have a foot in both SAP and Microsoft worlds, and wanted to compare Power BI and SAP Lumira for their self-service business intelligence needs. So what did we find out? And can SAP HANA & Power BI play nicely together?

Connecting Power BI
The SAP Universe connector that was in the Power BI Desktop Preview is sadly not available any more in the release in July, and only lingers on in the Power Query connections in Excel 2013. Word on the street is that SAP poo-pooed the idea of a connector, as they didn’t want Microsoft having a better BI front end that SAP Lumira and their cloud offering had. Fetch the rolled up newspaper, bad SAP [Slap], naughty! So we are
left with two options ODBC and OData to connect to a Calculated View. For those not familiar with SAP (myself included), its like a OLAP cube, with aggregations, but flatter with dimensions and measures mixed together. (Please correct me in comments if needed). Normally I would extract the data to a source that Power BI can use like SQL Server or Azure, but not in this case, connect to SAP HANA or nothing!

Connecting Power BI 
In trying to connect Power BI I came across a few limitations.
1: You can’t have a live connection for the data in both the Power BI Designer and the Power BI Service. You’ll have to refresh the data in the Power BI Designer then upload it to the service. But what about OData? That can be used as a data source when loading it to the web service, well yes you can, but this is a On-Premise OData service, the web service part can’t resolve the internal URL in the OData connection, you would have to create/expose the SAP HANA OData URL to the internet for it to work, which in this case was a great big NO!.
2: ODBC & OData (On Prem) isn’t supported in the Power BI Personal Gateway, so no scheduled refresh of the data.
3: If you use Parameters and/or Variables in the calculated view, Power BI doesn’t like those and will throw an error message for both ODBC & OData connections when trying to connect so those options in SAP HANA are out.

4: For ODBC connections Calculated Views do not show up in the browser list of objects. You have to use an SQL query to get the data into Power BI.

5: Security and permissions can be an issues for the ODBC connections, as it will list all the objects that you technically have permission to see. As I had elevated permissions, this meant a long list of tables, objects etc, that the Power BI Desktop table selection struggled with and just hung for a minute or so when trying to move down the list of objects. I’ve encountered something similar when connecting to SQL Server connections, it will list the tables/views etc you have permission for, and the system ones in a database too.
For the OData connection, adding filters as per the OData standard to the URL worked fine for example filtering on the dataset for just the Europe region:
http://hanaserver.somevendor.com:8000/Power_BI_PoC/PowerBIODataV1.xsodata/Results?$filter Region eq ‘Europe’One of the other things with OData connections is that you add in the OData connection set up the SAP HANA column names, and it will bring through the column labels fine in the OData XML and then on to Power BI
ODBC connections and the resulting query is a lot faster than OData to get data from SAP HANA. Power BI took a few minutes for the OData, and seconds for the same volume in ODBC. I assume, that as its reading the data, putting it into JSON format, then Power BI transfers it back to a table format, its going to suck up some processing time.
Comment.
IMHO, SAP have made a strategic mistake in not allowing a SAP connector to Power BI, it is old world thinking to limit the SAP application in such a way so they can sell SAP Lumira on top of their SAP DB’s. I’ve experienced first hand and the industry is moving to a place were customers want their technology to talk to each other, Power BI has connectors to Oracle, Teradata, Salesforce and so many more, sure it works great with the Microsoft stack of databases, both on premise and cloud, but its not limited to it. Customers want less tools and applications, training time is expensive both in cost and time for the person to be out of the business and they have already invested in capital for their IT systems. SAP Lumira having seen it, it can’t compare to Power BI in terms of usability and visualisations, and the rate that Power BI is moving with updates and capability, SAP Lumira isn’t going to catch up, so please SAP do yourself a favor, work with Microsoft get a connector sorted, welcome to the new world, it’s glittery and interconnected.