Your Brain & Report Design Part 1 – The Attentive vs the Intuitive Process

I do a number of Power BI workshops, where over a few days I guide people through the features of Power BI, what it is good at and what its limitations are. After a training exercise on a basic data set to familiarise the clients with Power BI Desktop, we get started on report building with their own data. In that phase of the workshop, it is always interesting to see what they build, particularly how they lay it out and what colours they use.

After some, let’s say, very interesting design choices, I started adding a section to the workshop slides on report design best practice, mainly based on the guidance in the books ‘Information Dashboard Design’ by Stephen Few and ‘Storytelling with Data’ by Cole Nussbaumer Knaflic. These books, and a few others, talk about the ‘attentive process’, and how choosing the wrong visuals leaves the information hidden or obscured, so the reader just can’t see the pattern.

Power Query & finding the Nth instance of a letter in a string for manipulation

Power Query and the M language are a bit of a pain. Not a lot of people use them (or they do, and they don’t write about it), and the knowledge base on MSDN is not real-world-example friendly. So trying to figure out how to do something you already know how to do in, say, SQL or DAX is not that easy sometimes. Where possible it is best to do as much as you can in the Power Query loading stage, otherwise you’ll be burning CPU time in the data model. So things like data cleansing should be done in Power Query, not in DAX after the data has been loaded into memory.

Finding the Nth instance of an item in a string is something I can do in SQL, but in Power Query it is a little bit different. So let’s start with an example that contains the following.

ABC-01-02-03-XYZ

So I had to remove the ‘XYZ’ bit. Easy right?


Text.Start("ABC-01-02-03-XYZ", 12)

No, it turns out the data contains items like:

AB-012-02-03-XYZ

ABC-01-02-09

AB-012-032-03-XGYZ

So there is no quick fix using just basic string manipulation. So what can we use? In this case you can use the Text.PositionOfAny function, which you can set to find the first, last or all occurrences. So in this case, looking for the first instance of ‘-’, you would get:

 


Text.PositionOfAny("ABC-01-02-03-XYZ", {"-"}, Occurrence.First)

Would return 3. Wait, not 4? No, as positions start from 0. It’s like a JSON array or Python list index. This will be important later.


Text.PositionOfAny("ABC-01-02-03-XYZ", {"-"}, Occurrence.Last)

Returns 12.


Text.PositionOfAny("ABC-01-02-03-XYZ", {"-"}, Occurrence.All)

Would, in a custom column, just show ‘List’, which internally is {3, 6, 9, 12}. You can expand the list out if needed and create extra rows, but you should not in this case.

So, to strip out the ‘XYZ’ part of ‘ABC-01-02-03-XYZ’, you can use Text.Start. It reads from the start of the string for a given number of characters, and that number can be taken from the list.


Text.Start("ABC-01-02-03-XYZ", (Text.PositionOfAny("ABC-01-02-03-XYZ" , {"-"}, Occurrence.All){3}))

So in the above code the {3} gets the value at that position in the list. Remember lists start from 0, so {3} returns the value 12 from the list {3, 6, 9, 12}.

But for the other instances in the data, {3} might not exist. However, you can programmatically strip stuff out, for example with an ‘if’ clause and List.Count, and drive the extraction of the data based on the number of items in the list.

So for example, say the column [SortID] contains the following:

AB-012-02-03-XYZ

ABC-01-02-09

if List.Count(Text.PositionOfAny([SortID] , {"-"}, Occurrence.All)) <= 3
then [SortID]
else Text.Start([SortID], (Text.PositionOfAny([SortID] , {"-"}, Occurrence.All){3}))

So in the 'if' clause, List.Count on ABC-01-02-09 will return 3, and the value is left alone, as I don't need to trim the end off it. Otherwise, for a value like AB-012-02-03-XYZ, it takes the item at index 3 of the list returned by Text.PositionOfAny (the position of the fourth dash), uses that as the length for Text.Start, and so trims the end off.

If you want, here is some example M code to recreate the above. Enjoy!
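As a rough sketch of the same idea without List.Count: M's optional item access, `{3}?`, returns null instead of raising an error when the index is missing, so the check can be made on the lookup itself. The step names below (SortID, Positions, Cut, Result) are made up for illustration and are not part of the workshop data.

```
let
    // Sample value with only three dashes, so there is no item at index 3
    SortID = "ABC-01-02-09",
    // Zero-based positions of every dash: {3, 6, 9}
    Positions = Text.PositionOfAny(SortID, {"-"}, Occurrence.All),
    // {3}? gives null rather than an error when there is no fourth dash
    Cut = Positions{3}?,
    // Leave the value alone if there was nothing to cut at
    Result = if Cut = null then SortID else Text.Start(SortID, Cut)
in
    Result
```

With "AB-012-02-03-XYZ" as the input instead, Cut would be 12 and Result would be "AB-012-02-03".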

 

let
    // Sample SortID values, embedded as compressed JSON
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcnRy1jUw1DUw0jUw1o2IjFKK1UERtFSKjQUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [SortID = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"SortID", type text}}),
    // Naive fixed-length trim; only right when every value has the same layout
    #"Basic String Manipulation" = Table.AddColumn(#"Changed Type", "Basic String Manipulation", each Text.Start([SortID], 12)),
    // Zero-based position of the first and of the last dash
    #"First Occurrence" = Table.AddColumn(#"Basic String Manipulation", "First Occurrence", each Text.PositionOfAny([SortID], {"-"}, Occurrence.First)),
    #"Last Occurrence" = Table.AddColumn(#"First Occurrence", "Last Occurrence", each Text.PositionOfAny([SortID], {"-"}, Occurrence.Last)),
    // List of every dash position
    #"All Occurrences" = Table.AddColumn(#"Last Occurrence", "All Occurrences", each Text.PositionOfAny([SortID], {"-"}, Occurrence.All)),
    // Three or fewer dashes: keep as-is; otherwise cut at the fourth dash (index 3)
    #"If Clause" = Table.AddColumn(#"All Occurrences", "If Clause", each
        if List.Count(Text.PositionOfAny([SortID], {"-"}, Occurrence.All)) <= 3
        then [SortID]
        else Text.Start([SortID], (Text.PositionOfAny([SortID], {"-"}, Occurrence.All){3})))
in
    #"If Clause"
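If you need this in more than one place, the logic can be wrapped in a small function. This is only a sketch: KeepBeforeNth and its parameter names are my own invention for illustration, and it assumes a single-character delimiter, since Text.PositionOfAny works on a list of characters.

```
let
    // Hypothetical helper: keep everything before the nth occurrence of a
    // single-character delimiter, or return the input unchanged if the
    // delimiter occurs fewer than n times.
    KeepBeforeNth = (input as text, delimiter as text, n as number) as text =>
        let
            // Zero-based positions of every occurrence of the delimiter
            Positions = Text.PositionOfAny(input, {delimiter}, Occurrence.All),
            // n - 1 because lists start from 0; ? yields null when out of range
            Cut = Positions{n - 1}?
        in
            if Cut = null then input else Text.Start(input, Cut),
    Samples = {"AB-012-02-03-XYZ", "ABC-01-02-09", "AB-012-032-03-XGYZ"},
    // Trim each sample at its fourth dash, where there is one
    Result = List.Transform(Samples, each KeepBeforeNth(_, "-", 4))
in
    Result
```

For the three sample values this gives {"AB-012-02-03", "ABC-01-02-09", "AB-012-032-03"}. In a table you would call it from a custom column instead, along the lines of each KeepBeforeNth([SortID], "-", 4).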

 


 

 

 

Power Query – M Calendar Function

I’ve downloaded a few M language based calendars for use in Power BI recently, and have been frustrated as they don’t quite meet my needs. So I’ve created my own.

The main difference from some I’ve seen online is that the time category columns (Month, Quarter, Year, etc.) depend only on the date column and on nothing else, so if you don’t need them you can delete them. The other addition is a key column for sorting, so in Power BI you can use the sort by other column option and stop the month names being sorted alphabetically in charts, slicers and so on. You can define the start date if needed, and it creates the calendar up to the current date.
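To give a flavour of that design, here is a minimal sketch. It is not the actual calendar function; the start date, step names and column names are all assumptions made for illustration. Every added column is calculated from [Date] alone, so any of them can be deleted without breaking the others.

```
let
    // Assumed start date; change as needed
    StartDate = #date(2017, 1, 1),
    // The calendar runs up to today's date
    EndDate = Date.From(DateTime.LocalNow()),
    DayCount = Duration.Days(EndDate - StartDate) + 1,
    // One list item per day, stepping by one day
    Dates = List.Dates(StartDate, DayCount, #duration(1, 0, 0, 0)),
    AsTable = Table.FromList(Dates, Splitter.SplitByNothing(), {"Date"}),
    Typed = Table.TransformColumnTypes(AsTable, {{"Date", type date}}),
    // Each column below depends only on [Date]
    AddYear = Table.AddColumn(Typed, "Year", each Date.Year([Date]), Int64.Type),
    AddQuarter = Table.AddColumn(AddYear, "Quarter", each Date.QuarterOfYear([Date]), Int64.Type),
    AddMonthName = Table.AddColumn(AddQuarter, "Month", each Date.MonthName([Date]), type text),
    // Numeric key used to sort the month names in calendar order
    AddMonthKey = Table.AddColumn(AddMonthName, "MonthKey", each Date.Month([Date]), Int64.Type)
in
    AddMonthKey
```

In Power BI you would then set the Month column to sort by MonthKey.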

Also I now have a GitHub!

The code is here, enjoy.

Or copy and paste from below.

SQL Server 2017 – TRIM, at last!

There is loads of good stuff in SQL Server 2017: running on Linux, graph databases, Python. But sometimes it is the small things in the daily T-SQL grind that really make the difference. The TRIM function is one of these.

When cleaning up dirty data from source systems (well, I say source systems, most likely xls, xlsx, csv and tsv files), you’re going to come across leading or trailing spaces, the ghosts in the string!

So for example, if we had ‘   Some string from some funky csv export   ’ with 3 leading and 3 trailing spaces, we had to do this in SQL Server 2016 and below:

DECLARE @str AS VARCHAR(50)
SET @str = '   Some string from some funky csv export   '

-- Strip leading spaces with LTRIM, then trailing spaces with RTRIM
SELECT RTRIM(LTRIM(@str)) AS TrimExample

You could wrap that in a UDF to get your own TRIM function, but it would look untidy with a schema in front of it, blah!

DECLARE @str AS VARCHAR(50)
SET @str = '   Some string from some funky csv export   '
SELECT udf.TRIM(@str) AS TrimExample

But now you can just do

DECLARE @str AS VARCHAR(50)
SET @str = '   Some string from some funky csv export   '
SELECT TRIM(@str) AS TrimExample

No more having to recall which is your left and right, nice and tidy! Worth updating just for this function IMHO.

Power BI Custom Visuals – My top picks part two

If you missed part one of ‘Power BI Custom Visuals – my top picks’, which was basically a love letter to the OK Viz team, check it out now.

So this is the second part of my top picks, looking at the visuals I tend to use when creating Power BI reports. As mentioned in part one, there are about 12 that I use. I tend to be fairly conservative in using custom visuals, as there are quite a number now, and I side with functional choices over flashy or gimmicky ones, as the latter tend to kick in the attentive process, are not as intuitive to use, and ultimately distract from the information that you wish to show. Anyway, here are the final few.

SUMX vs DISTINCT COUNT

During the preparation for delivering a Power BI training session for a client, I was looking at the Tabular Data Model that was their data source, and I was struggling with a long running query. It was taking 1 minute and 24 seconds to return the values after chewing through 2 million rows of data. So I had a go at optimising it… and got it down to 6 seconds.

Mind Blown

Wow! When the query ran in such a short time, I went ‘No way… that’s not right, what have I done wrong?’, but no, after checking, it was right.

Let’s start at the beginning…

Power BI Embedded, SKU Differences and Cost Breakdowns

There still seems to be a bit of confusion about Power BI licensing since the changes made back in May 2017. I get a number of questions about embedding options, where a client wants to share reports internally via SharePoint and/or externally. Now, if a customer is a large enterprise and wants to share internally via SharePoint, the best option is Power BI Premium. In Premium you can allocate Power BI workspaces to capacity, so those who log into Power BI with a ‘Free’ license or use SharePoint can see the reports they need. However, for smaller organisations, paying £3,766 per month is not going to be viable. There are some other options, which is where the confusion creeps in. I’ll be using a very raw measure of Monthly Cost to Power BI Pro Licenses to show how to get the best bang for your buck/quid/currency denomination of your choice.