Power Query Merge Queries

 admin

Power Query gives us the power to combine queries or append one query into another. And don’t forget we can refresh all the data sources with a single click of Refresh All. This means we can put an end to tedious copy and paste routines for combining multiple files together. The result of a combine operation on one or more queries will be only one query. You can find Append or Merge in the Combine Queries section of the Query Editor in Power BI or in Excel. Append means results of two (or more) queries (which are tables themselves) will be combined into one query in this way: Rows will be appended after each other.

  1. Power Query Merge Queries Returns Null
  2. Excel Power Query Merge Multiple Queries
  3. Merge Multiple Queries In Excel

I had to tackle an issue this week that occurred when I refreshed one of the queries contained in a quarterly report. Here’s the scenario..

The source data for the query was simple and had only a few columns:

  • customer
  • some customer attributes
  • index with associated percentages

The Index column was created by numbering rows after grouping by customer (Shout out to this Excel Guru blog post for helping me get to this point.)

A clean version of my source data:

I needed to manipulate this data before landing to the workbook. My first objective was to pivot the Index column and fill with the values from the Percentage column.

This worked with no issue. Then I needed to merge all of the Index columns to create a column containing a comma separated list of all the percentages.

The first step to achieve this was to change the column types to text. Notice how all four of the column names were hard coded in this step.

After I changed the type, I selected the columns to merge by a comma separator. Again, notice how the names of the columns to merge were hard coded.

After some additional clean up, this landed the desired outcome to my workbook.

This was all well and good, until it came to the next quarter..

Every quarter, the source data for my query is refreshed with new customers. The most important change in the data to highlight is that the maximum number in the Index column is now 3. Last quarter, the maximum value was 4.

This caused the following error to occur in my query when I tried to refresh.

The #”Changed Type” step was looking for column 4, which no longer exists! This was easy to fix, as I could simply delete the {“4”, type text} clause from the list of columns to change.

But it errors again!

Once again, column 4 was hard coded in the combine columns function. My entire query is outlined below:

I knew I needed to modify the query to dynamically define the columns to change type and merge. I came up with the following solution.

Dynamic(ish) Solution

My first attempt at fixing the initial query was definitely more dynamic than what I started with. I was happy with it for about six hours until I changed it again. I outline the fully complete query, that I’m happy with (for the time being..), in the third section below. But for full transparency, I wanted to include this intermediate step.

The key step to fixing the query was inserting a step immediately after pivoting the index column, named #”Column List”. This step removed the customer and customer attribute columns, leaving only the index columns.

Power

As I explicitly removed the customer columns, this returned all of the index columns. It will do so regardless of how many there are quarter to quarter.

The key assumption here being that we will forever and always only need to remove the three columns hard coded into the remove columns step

.

.

.

(Spoiler: this won’t be the case)

My next step was to change the type of all of the columns defined in the previous step. I found a neat solution in this blog post from the BI Accountant.

The important points that allowed me to do this are:

  1. Table.ColumnNames() – this function returns all the column names in a table as a list. I called this on the table from my previous step which listed the names of all of the index columns.
  2. List.Transform() – this function has two arguments: a list and a transformation function. It takes each item in the list and transforms it with the function in the second argument.

I nested these two functions together within the #”Changed Type Dynamic” step. The data for this step is actually from the #”Pivoted Column” applied step (one before I defined the #”Column List”).

I called Table.ColumnNames() on #”Column List” to create the list of columns I wanted to transform. This was the first argument in the List.Transform() function.

The second argument in List.Transform() was to change each column to type text.

Very nice!! Now I had to merge all of the index columns. I achieved this by once again using the Table.ColumnNames() function on the
#”Column List” step.

Again, very nice! This worked well and refreshed correctly with the new data.

However.. My query would not work properly if additional customer attributes were added to the source table:

This would result in incorrect data being included in the #”Column List” step. It was particularly problematic with a numeric column like “Number of Orders” being included, as it is hard to distinguish from the percentages that fill the index columns.

Dynamic Solution

To make my solution fully dynamic, I had to change how I was defining my list of index columns to change and merge. To achieve this, my first step was to demote headers after the initial pivot of the index columns.

Then I removed all of the data except for the top rows. This left me with a table with one row: Column1 to ColumnN, containing column names in the first row.

My next step was to transpose the table. This resulted in a table with one column containing a row for every column name.

Then I changed the type of the column from any to whole number. I knew that the index column names will always be a number. I also assumed that none of the other column names would be able to be parsed as a number.

My next step was to remove the errors and transpose the table back.

Now I had a fully dynamic way to define the #”Column List” step, no matter what the other columns in the source data were! My final query is outlined below:

Types of merging of queries in Power Query or in Power BI

This article describes the differences between types of merging of queries. The screenshots are from Power Query, but it works very similar in Power BI.

We are talking about merging of queries by some key (ID). This is not about the appending of queries.

We will use these two tables, that are supposed to be merged. The values from the orange one should be assigned to rows in the blue one, when the ID is key.

Power Query Merge Queries Returns Null

We will create the queries to both tables, and then in merging window select, which columns are keys in the tables. And then we have to select the way of merging - which is what this article describes.

So what is the difference between these options?

Left Outer

In this connection, all rows from first table are taken and related values from second table are related. If there are more related rows in the second table, all of them are assigned (so values from first table are multiplied) (like number '2' here.

RIght Outer

The same, just from the other side.

Full Outer

In the result, there will be all rows from both tables. If they have adequate row on the other side, they will be connected. If not, empty values are created.

Inner

Excel Power Query Merge Multiple Queries

The result contains only rows, that have appropriate value on the other side - other rows are ignored.

Left Anti

The result contains the left table, where values contained in the right table are removed. In other words, the result says, what is in first table and at the same time is not in the second table (nice to comparison).

Right Anti

Merge Multiple Queries In Excel

The same from the other side.