PBIDS – Power BI Data Sources Files

The great thing about Power BI is the monthly updates and how many features, fixes, and all sorts of cool stuff Microsoft is churning out. The worst thing about Power BI is trying to keep up with all of it. In this post I’m going to look at Power BI Data Source files. PBIDS was announced back in October 2019; I must have been busy, because I didn’t pick up on it. But the other day I was using Azure Synapse and its integration with Power BI on a project. It downloaded a blank connection file, and I noticed it had a different icon and file extension. So I had to figure out what it was.

Here is the announcement from Oct 2019, for those like me that missed it.

So what is a PBIDS file? Well, it is a basic file that holds a data source connection. At the moment you can only store one connection, not multiple connections. I looked at the posting and the documentation for it, but I didn’t quite get it at first, as I could not see how to define and save the file, just bits of JSON to set a connection. But that bit of JSON is all it is. It just holds the connection settings, so it is not a full fat PBIX Power BI file. It supports the following connection types:
Azure Analysis Services
Folder
OData
SAP BW
JSON
SAP HANA
SharePoint List
SQL Server
Text file
Web
Dataflow

Here is the basic JSON for a connection to a folder:

{
  "version": "0.1",
  "connections": [
    {
      "details": {
        "protocol": "folder",
        "address": {
          "path": "C:\\Users\\jon\\Documents\\Some Folder"
        }
      }
    }
  ]
}

That is all there is to it. Create the JSON in Notepad, save it with the extension ‘.pbids’, and there you have it. The Power BI Tips site has a connections builder that will generate the PBIDS file for you if you want.
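For comparison, here is a sketch of what a SQL Server connection might look like. The server and database names are placeholders I made up. The ‘tds’ protocol and the optional ‘mode’ setting (‘Import’ or ‘DirectQuery’) are how I understand the format from the Microsoft documentation, so check there for the current options before relying on it.

```json
{
  "version": "0.1",
  "connections": [
    {
      "details": {
        "protocol": "tds",
        "address": {
          "server": "my-sql-server.example.com",
          "database": "SalesDb"
        }
      },
      "mode": "DirectQuery"
    }
  ]
}
```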

If you load it, it fires up Power BI, and starts the ‘Get Data’ process for you.

Power BI PBIDS file opening

So you can save a set of predefined data connections and store them for users to quickly get access to data. It will not store usernames and passwords, as that would not be secure and would just give access to anyone who has the file. At the moment it is at version 0.1, so there may be some updates coming in the future, for example adding multiple connections. But it is a nice quick way to save and store connection templates.

Power Query & finding the Nth instance of a letter in a string for manipulation

Power Query and the M language are a bit of a pain. Not a lot of people use them (or they do, and they don’t write about it), and the knowledge base on MSDN is not real-world-example friendly. So trying to figure out how to do something you know how to do in, say, SQL or DAX is not that easy sometimes. Where possible it is best to do as much as you can in the Power Query loading, otherwise you’ll be burning CPU time in the data model. So things like data cleansing should be done in Power Query, not in DAX after the data has been loaded into memory.

Finding the Nth instance of an item in a string is something I can do in SQL, but in Power Query it is a little bit different. So let’s start with an example that contains the following.

ABC-01-02-03-XYZ

So I had to remove the ‘XYZ’ bit. Easy right?


Text.Start("ABC-01-02-03-XYZ", 12)

No, it turns out the data contains items like

AB-012-02-03-XYZ

ABC-01-02-09

AB-012-032-03-XGYZ

So no quick fix by just using basic string manipulation. So what can we use? In this case you can use the Text.PositionOfAny function, and with this you can set it to find the first, last or all occurrences. So in this case, looking for the first instance of ‘-’, you would get:

 


Text.PositionOfAny("ABC-01-02-03-XYZ", {"-"}, Occurrence.First)

Would return 3. Wait, not 4? No, as positions start from 0. It’s like a JSON array or Python list index. This will be important later.


Text.PositionOfAny("ABC-01-02-03-XYZ", {"-"}, Occurrence.Last)

Returns 12.


Text.PositionOfAny("ABC-01-02-03-XYZ", {"-"}, Occurrence.All)

In a custom column this would just show ‘List’, which internally is {3, 6, 9, 12}. You can expand the list out into extra rows if needed, but in this case you should not.

So, to strip out the ‘XYZ’ part of ‘ABC-01-02-03-XYZ’, you can use Text.Start. It reads from the start of the string up to a given number of characters, and that number can be taken from the list of positions.


Text.Start("ABC-01-02-03-XYZ", (Text.PositionOfAny("ABC-01-02-03-XYZ" , {"-"}, Occurrence.All){3}))

So in the above code the {3} gets the value from the list at that location. Remember lists start from 0, so {3} returns the value 12 from the list {3, 6, 9, 12}.

But for the other instances in the data, {3} might not exist. However, you can programmatically strip stuff out, for example with an ‘if’ clause and List.Count, and drive the extraction of the data based on the number of items in the list.

So for example in the data I have in the column [SortID] the following

AB-012-02-03-XYZ

ABC-01-02-09

if List.Count(Text.PositionOfAny([SortID] , {"-"}, Occurrence.All)) <= 3
then [SortID]
else Text.Start([SortID], (Text.PositionOfAny([SortID] , {"-"}, Occurrence.All){3}))

So in the 'if' clause, List.Count on ABC-01-02-09 will return 3, and the value is left as it is, as I don't need to trim the end off it. Otherwise, for AB-012-02-03-XYZ, it takes the fourth value in the list from Text.PositionOfAny (index {3}, which is 12 here), uses it as the number of characters to keep, and trims the end off.
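If you need this in more than one place, the same idea could be wrapped in a small function. This is only a sketch: the name TrimAfterNthDash and its parameters are my own invention, and it uses M’s optional item access ({n}?) to get null when the position does not exist, instead of checking List.Count first.

```powerquery
let
    // Hypothetical helper: keep everything before the nth "-" (n counted from 1).
    // If the text has fewer than n dashes, return it unchanged.
    TrimAfterNthDash = (input as text, n as number) as text =>
        let
            // Zero-based positions of every "-" in the text
            positions = Text.PositionOfAny(input, {"-"}, Occurrence.All),
            // Optional item access: null when index n - 1 is out of range
            cut = positions{n - 1}?
        in
            if cut = null then input else Text.Start(input, cut)
in
    // "AB-012-02-03-XYZ" has dashes at {2, 6, 9, 12}, so this keeps the first 12 characters
    TrimAfterNthDash("AB-012-02-03-XYZ", 4)
```

In a custom column you would call it as TrimAfterNthDash([SortID], 4). A value like ABC-01-02-09 only has three dashes, so it comes back untouched.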

If you want, here is some example M code to recreate the above. Enjoy!

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcnRy1jUw1DUw0jUw1o2IjFKK1UERtFSKjQUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [SortID = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"SortID", type text}}),
#"Basic String Manipulation" = Table.AddColumn(#"Changed Type", "Basic String Manipulation", each Text.Start([SortID], 12)),
#"First Occurrence" = Table.AddColumn(#"Basic String Manipulation", "First Occurrence", each Text.PositionOfAny([SortID], {"-"}, Occurrence.First)),
#"Last Occurrence" = Table.AddColumn(#"First Occurrence", "Last Occurrence", each Text.PositionOfAny([SortID], {"-"}, Occurrence.Last)),
#"All Occurrences" = Table.AddColumn(#"Last Occurrence", "All Occurrences", each Text.PositionOfAny([SortID], {"-"}, Occurrence.All)),
#"If Clause" = Table.AddColumn(#"All Occurrences", "If Clause", each if List.Count(Text.PositionOfAny([SortID], {"-"}, Occurrence.All)) <= 3
then [SortID]
else Text.Start([SortID], (Text.PositionOfAny([SortID], {"-"}, Occurrence.All){3})))
in
#"If Clause"
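
As an aside, there is a different way to get the same result without working out positions at all: split on the ‘-’, keep the first four parts, and join them back together. This is a swapped-in technique rather than the approach above, and the sample table here is typed in by hand just for illustration.

```powerquery
let
    // Hand-typed sample data matching the examples in this post
    Source = #table(type table [SortID = text], {
        {"ABC-01-02-03-XYZ"},
        {"AB-012-02-03-XYZ"},
        {"ABC-01-02-09"},
        {"AB-012-032-03-XGYZ"}
    }),
    // Split on "-", keep at most the first 4 parts, then re-join with "-".
    // List.FirstN returns the whole list when it has fewer than 4 items.
    Trimmed = Table.AddColumn(Source, "Trimmed", each
        Text.Combine(List.FirstN(Text.Split([SortID], "-"), 4), "-"), type text)
in
    Trimmed
```

The Trimmed column should come out as ABC-01-02-03, AB-012-02-03, ABC-01-02-09 and AB-012-032-03. The positions approach is still handy when you need the index itself for something else.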