## Calculating Median in PowerPivot using DAX

I just came across this blog post by Bill Anton where he discusses several approaches to calculate the median of a given set in T-SQL, MDX and DAX. At the end of his post, when it comes to the DAX calculation, he references several posts by Marco, Alberto and Javier (post1, post2) that already address that kind of calculation in DAX. But he also claims that none of the solutions is “elegant”. Well, reason enough for me to try it on my own, and here is what I came up with. It's up to you to decide whether this solution is more elegant than the others or not.

In general, the median calculation always varies depending on the number of items and whether this number is even or odd.
For an even population the median is the mean of the values in the middle:
the median of {3, 5, 7, 9} is (5 + 7)/2 = 6
For an odd population, the median is the value in the middle:
the median of {3, 5, 9} is 5

In both cases the values have to be ordered before the calculation. Note that it does not make a difference whether the values are sorted in ascending or descending order.
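To make the two cases concrete, here is a minimal Python sketch of the definition above (Python is used only for illustration; the function name `median` is my own and not part of the DAX solution that follows):

```python
def median(values):
    """Return the median of a non-empty collection of numbers."""
    ordered = sorted(values)  # ascending or descending gives the same result
    n = len(ordered)
    mid = n // 2
    if n % 2 == 0:
        # even population: mean of the two values in the middle
        return (ordered[mid - 1] + ordered[mid]) / 2
    # odd population: the single value in the middle
    return ordered[mid]


print(median([3, 5, 7, 9]))  # 6.0
print(median([3, 5, 9]))     # 5
```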

In this example, our set contains 12 items (=months) so we have to find the 2 items in the middle of the ordered set – December and February – and calculate the mean.

So, how can we address this problem using DAX? Most of the posts I mentioned above use some kind of combination of ranking – RANKX() – and filtering – FILTER(). For my approach I will use none of these but use TOPN instead (yes, I really like that function as you probably know if you followed my blog for some time).

In this special case, TOPN() can do both, ranking and filtering for us. But first of all we need to know how many items exist in our set:

Cnt_Months:=DISTINCTCOUNT('Date'[Month])

This value will be subsequently used in our next calculations.

To find the value(s) in the middle I use TOPN() twice, first to get the first half of the items (similar to TopCount) and then a second time to get the last values that we need for our median calculation (similar to BottomCount):

As the median calculation is different for even and odd sets, this also has to be considered in our calculation. In both measures the MOD() function is used to distinguish the two cases:

Items_TopCount:=IF(MOD([Cnt_Months],2) = 0,
([Cnt_Months] / 2) + 1,
([Cnt_Months] + 1) / 2)

For an even number of items (e.g. 12) we simply divide the count of items by 2 and add 1 which gives us a (12 / 2) + 1 = 7 for our sample.
For an odd number of items (e.g. 5) we first add 1 to our count of items and then divide by 2 which gives us (5 + 1) / 2 = 3

Items_BottomCount:=IF(MOD([Cnt_Months],2) = 0, 2, 1)

For an even number of items we have to consider the last 2 values whereas for an odd number of items we only have to consider the last value.

These calculations are then used in our median calculation:

Median SA Months:=CALCULATE([SumSA],
TOPN(
[Items_BottomCount],
TOPN(
[Items_TopCount],
VALUES('Date'[Month]),
[SumSA]),
[SumSA] * -1))
/
[Items_BottomCount]

As DAX has no built-in BOTTOMN()-function, we need to “abuse” the TOPN() function and multiply the OrderBy-value by “–1” to get the BOTTOMN() functionality. As you can see most of the logic is already handled by our [Items_TopCount] and [Items_BottomCount] measures and this pattern can be reused very easily.
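The double-TOPN pattern can be traced with a small Python sketch. The helper `top_n` and the monthly figures are hypothetical, and unlike DAX's TOPN the helper does not return extra rows for ties:

```python
def top_n(rows, n, order_by):
    """Return the n rows with the largest order_by value (ties are not expanded)."""
    return sorted(rows, key=order_by, reverse=True)[:n]


def median_via_topn(sales_by_month):
    rows = list(sales_by_month.items())
    cnt = len(rows)
    # same logic as [Items_TopCount] and [Items_BottomCount]
    items_top_count = cnt // 2 + 1 if cnt % 2 == 0 else (cnt + 1) // 2
    items_bottom_count = 2 if cnt % 2 == 0 else 1
    # inner TOPN: the upper half of the set, including the middle value(s)
    upper_half = top_n(rows, items_top_count, lambda row: row[1])
    # outer TOPN: ordering by value * -1 picks the smallest rows of that half
    middle = top_n(upper_half, items_bottom_count, lambda row: row[1] * -1)
    return sum(value for _, value in middle) / items_bottom_count


sales = {"Jan": 120, "Feb": 90, "Mar": 150, "Apr": 60}  # hypothetical values
print(median_via_topn(sales))  # 105.0
```

With four months the inner step keeps 150, 120 and 90, the outer step keeps 90 and 120, and their mean is the median.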

Of course all these calculations can also be combined and the use of IF() can be avoided:

Median SA Months v2:=CALCULATE([SumSA],
TOPN(
2 - MOD([Cnt_Months], 2),
TOPN(
([Cnt_Months] + 1) / 2,
VALUES('Date'[Month]),
[SumSA]),
[SumSA] * -1))
/
(2 - MOD([Cnt_Months], 2))

Note: for an even population ([Cnt_Months] + 1) / 2 returns X.5 which is automatically rounded up when it is used in a function that expects a whole number. In our example this is what happens: (12 + 1) / 2 = 6.5 –> 7
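A quick way to convince yourself that the IF-free expressions match the original measures is to compare them for a range of counts. The Python sketch below assumes the round-half-up behaviour described in the note; Python's own `round()` rounds 6.5 to 6, so a small helper is used instead:

```python
import math


def round_half_up(x):
    """Round x.5 upwards, the behaviour assumed in the note above."""
    return math.floor(x + 0.5)


def counts_with_if(cnt):
    # mirrors [Items_TopCount] and [Items_BottomCount]
    top = cnt // 2 + 1 if cnt % 2 == 0 else (cnt + 1) // 2
    bottom = 2 if cnt % 2 == 0 else 1
    return top, bottom


def counts_without_if(cnt):
    # mirrors ([Cnt_Months] + 1) / 2 and 2 - MOD([Cnt_Months], 2)
    return round_half_up((cnt + 1) / 2), 2 - cnt % 2


for cnt in range(1, 25):
    assert counts_with_if(cnt) == counts_without_if(cnt)

print(counts_without_if(12))  # (7, 2)
print(counts_without_if(5))   # (3, 1)
```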

These are the final results:

We could also use AVERAGEX() to calculate our median, but I think it is unnecessary overhead to use AVERAGEX() just to divide by “1” or “2”, depending on the number of items that our TOPN functions return:

Median SA Months AvgX:=AVERAGEX(
TOPN(
2-MOD([Cnt_Months], 2),
TOPN(
([Cnt_Months] +1) / 2,
VALUES('Date'[Month]),
[SumSA]),
[SumSA] * -1),
[SumSA])

As you can see there are various approaches to calculate the median; it's up to you which one you like most. I have not tested any of them in terms of performance over bigger sets – this may be a topic for an upcoming post.


## Another Post about Calculating New and Returning Customers – Part 2

March 4, 2013

In my previous post I showed a new approach on how to calculate new (and returning) customers in PowerPivot/tabular using DAX. We ended up with a solution where we added the customer's first order date as a calculated column to our Customer-table. This column was then linked to our Date-table with an inactive relationship. The final calculation used USERELATIONSHIP() to make use of this relationship as follows:

New Customers:=CALCULATE(
COUNTROWS(Customer),
USERELATIONSHIP(Customer[FirstOrderDate], 'Date'[Date]))

This calculation performs really well as it does not have to touch the fact-table to get the count of new customers. And this is also the issue with the calculation as other filters are not reflected in the calculation:

Take row 2 as an example: we have 8 “Total Customers” of which 12 are “New Customers”, which is obviously an error in the calculation. The PivotTable is filtered to Subcategory=“Road Bikes” and we have 8 customers on the 2nd of February that bought a road bike. The “New Customers” calculation, on the other hand, is not related to the Subcategory and shows 12, as in total there were 12 new customers across all products.

To get our calculation working with other filters as well, we have to somehow relate it to our fact-table. So far we calculated the customer's first order date only in the Customer-table. The customer's first order may be related to several fact-rows, e.g. one row for each product the customer bought. Our “New Customers” calculation should only include customers that are active considering all other filters, too.

To identify a customer's first order in our fact-table we can again use a calculated column and also re-use our previous calculated column in our Customer-table that holds the customer's first order date:

=NOT(
ISBLANK(
LOOKUPVALUE(
Customer[CustomerKey],
Customer[CustomerKey],
[CustomerKey],
Customer[FirstOrderDate],
[Order Date]
)))

This returns True for all fact-rows associated with a customer's first order and False for all other rows.

The final “New Customers v2” calculation is quite simple then – in addition to the active filters we add a further filter to only select rows that are associated with a customer's first order:

New Customers v2:=CALCULATE(
[Total Customers],
'Internet Sales'[IsCustomersFirstOrder] = TRUE())

And these are the results:

As you can see there are still differences between “New Customers OLD” and “New Customers v2”. But is this really a problem with the new calculation? Let's analyze the issue taking customer “Desiree Dominguez”, where we encounter the first difference, as an example:

“Desiree Dominguez” had her first order on the 22nd of June in 2006. So she is actually no “new customer” in 2008. The reason why the old calculation counts her as a “new customer” is that it was the first time that she bought a product of subcategory “Road Bikes”. Whether this is correct or not is up to your business definition of a “new customer”. In my experience it is more likely that “Desiree Dominguez” should not be counted as a new customer in 2008, and so “New Customers v2” actually returns the more accurate results.
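The difference between the three calculations can be reproduced on a tiny, made-up data set. In the Python sketch below all rows and function names are hypothetical; customer C1 plays the role of a customer who first ordered in 2006 and bought her first road bike in 2008:

```python
from datetime import date

# Hypothetical fact rows: (customer, order_date, subcategory)
sales = [
    ("C1", date(2006, 6, 22), "Mountain Bikes"),
    ("C1", date(2008, 2, 2), "Road Bikes"),
    ("C2", date(2008, 2, 2), "Road Bikes"),
    ("C3", date(2008, 2, 2), "Helmets"),
]

# First order date per customer (the calculated column on the Customer-table)
first_order = {}
for customer, order_date, _ in sales:
    first_order[customer] = min(order_date, first_order.get(customer, order_date))


def new_customers_relationship(day):
    """Counts by first order date only; product filters are ignored."""
    return sum(1 for d in first_order.values() if d == day)


def new_customers_old(day, subcategory):
    """Customers with filtered sales up to `day` minus those with filtered sales before it."""
    till_now = {c for c, d, s in sales if s == subcategory and d <= day}
    previous = {c for c, d, s in sales if s == subcategory and d < day}
    return len(till_now) - len(previous)


def new_customers_v2(day, subcategory):
    """Distinct customers on filtered fact rows that belong to the customer's first order."""
    return len({c for c, d, s in sales
                if s == subcategory and d == day and d == first_order[c]})


day = date(2008, 2, 2)
print(new_customers_relationship(day))       # 2
print(new_customers_old(day, "Road Bikes"))  # 2
print(new_customers_v2(day, "Road Bikes"))   # 1
```

The relationship-based count ignores the subcategory filter, the old calculation counts C1 because it is her first road bike, and only the flag-based version counts C2 alone.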

Another option for this calculation is to rank the [Order Date] or [Sales Order Number] for each customer within the fact-table using the calculation below:

=RANKX(
FILTER(
ALL('Internet Sales'),
[CustomerKey] = EARLIER([CustomerKey])),
[Order Date],
[Order Date],
1,
DENSE
)

[Order Date] could be replaced by [Sales Order Number]. This makes sense if a customer can have multiple orders per day and you also want to distinguish further by [Sales Order Number]. The new field would also allow new analysis. For example the increase/decrease in sales from the second order compared to the first order and so on.

The “New Customer” calculation in this case would still be similar. We just have to filter on the new calculated column instead:

New Customers v3:=CALCULATE(
[Total Customers],
'Internet Sales'[CustomersOrderNr] = 1)
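What the ranking column produces can be sketched in Python as a dense, ascending rank of the order date within each customer. The rows and the function name `customers_order_nr` are hypothetical:

```python
from collections import defaultdict
from datetime import date


def customers_order_nr(rows):
    """Dense ascending rank of each row's order date within its customer."""
    dates_per_customer = defaultdict(set)
    for customer, order_date in rows:
        dates_per_customer[customer].add(order_date)
    rank = {
        customer: {d: i for i, d in enumerate(sorted(dates), start=1)}
        for customer, dates in dates_per_customer.items()
    }
    return [rank[customer][order_date] for customer, order_date in rows]


rows = [
    ("C1", date(2006, 6, 22)),
    ("C1", date(2008, 2, 2)),
    ("C1", date(2008, 2, 2)),  # second fact row for the same order date
    ("C2", date(2008, 2, 2)),
]
print(customers_order_nr(rows))  # [1, 2, 2, 1]
```

Filtering on rank 1 then selects exactly the fact rows that belong to each customer's first order date.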

The multidimensional model:

The whole logic of extending the fact-table to identify rows that can be associated with a customer's first order can also be used in a multidimensional model. Once we have prepared the fact-table accordingly, the calculations are quite easy. The biggest issue here does not reside in the multidimensional model itself but in the ETL/relational layer, as this kind of operation can be quite complex – or, better said, time-consuming in terms of ETL time.

At this point I will not focus on the necessary ETL steps but start with an already prepared fact-table and highlight the extensions that have to be made in the multidimensional model. The fact-table has already been extended by a new column called [IsCustomersFirstOrder], similar to the one we created in tabular using a DAX calculated column. It has a value of 1 for rows associated with a customer's first order and 0 for all other rows.

The next thing we have to do is to create a new table in our DSV to base our new dimension on. For simplicity I used this named query:

This table is then joined to the new fact-table:

The new dimension is quite simple as it only contains one attribute:

You may hide the whole dimension in the end as it may only be used to calculate our “new customers” and nowhere else and may only confuse the end-user.

Once we have added the dimension also to our cube we can create a new calculated measure to calculate our “new customers” as follows:

CREATE MEMBER CURRENTCUBE.[Measures].[New Customers] AS (
[Measures].[Customer Count],
[Is Customers First Order].[Is Customers First Order].&[1]
), ASSOCIATED_MEASURE_GROUP = 'Internet Customers'
, FORMAT_STRING = '#,##0';

The calculation is based on the existing [Customer Count]-measure which uses DistinctCount-aggregation. Similar to DAX, we just extend the calculation by further limiting the cube-space to where “Is Customers First Order” = 1.

This approach also allows you to create aggregations if necessary to further improve performance. So this is probably also the best way in terms of query-performance to calculate the count of new customers in a multidimensional model.

## Another Post about Calculating New and Returning Customers

I know, this topic has already been addressed by quite a lot of people. Chris Webb blogged about it here (PowerPivot/DAX) and here (SSAS/MDX), Javier Guillén here, Alberto Ferrari mentions it in his video here and also PowerPivotPro blogged about it here. Still I think that there are some more things to say about it. In this post I will review the whole problem and come up with a new approach on how to solve this issue for both tabular and multidimensional models, with the best possible performance I could think of (hope I am not exaggerating here).

OK, let's face the problem of calculating new customers first and define what a new customer for a given period actually is:

A new customer in Period X is a customer that has sales in Period X but did not have any other sales ever before. If Period X spans several smaller time periods (e.g. Period X=January contains 31 days) then there must not be any sales before the earliest smaller time period (before 1st of January) for this customer to be counted as a new customer.

According to this definition the common approach can be divided into 2 steps:
1) find all customers that have sales till the last day in the selected period
2) subtract the number of customers that have sales till the day before the first day in the selected period

First of all we need to create a measure that calculates our distinct customers.
For tabular it may be a simple calculated measure on your fact-table:

Total Customers:=DISTINCTCOUNT('Internet Sales'[CustomerKey])

For multidimensional models it should be a physical distinct count measure in your fact-table, ideally in a separate measure group.

How to solve 1) in tabular models

This is also straightforward as DAX has built-in functions that can do aggregation from the beginning of time. We use MAX(‘Date’[Date]) to get the last day in the current filter context:

Customers Till Now:=CALCULATE(
[Total Customers],
DATESBETWEEN(
'Date'[Date],
BLANK(),
MAX('Date'[Date])))

How to solve 2) in tabular models

This is actually the same calculation as above; we only use MIN to get the first day in the current filter context and also subtract “1” to get the day before the first day.

Previous Customers:=CALCULATE(
[Total Customers],
DATESBETWEEN(
'Date'[Date],
BLANK(),
MIN('Date'[Date])-1))

To calculate our new customers we can simply subtract those two values:

New Customers OLD:=[Customers Till Now]-[Previous Customers]
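The two-step logic can be followed with a short Python sketch. The rows and the function name are hypothetical; sets of customers stand in for the distinct count:

```python
from datetime import date


def new_customers_old(sales, first_day, last_day):
    """Distinct customers up to last_day minus distinct customers before first_day."""
    customers_till_now = {c for c, d in sales if d <= last_day}
    previous_customers = {c for c, d in sales if d < first_day}
    return len(customers_till_now) - len(previous_customers)


# Hypothetical fact rows: (customer, order_date)
sales = [
    ("C1", date(2007, 12, 30)),
    ("C1", date(2008, 1, 10)),
    ("C2", date(2008, 1, 5)),
    ("C3", date(2008, 1, 20)),
    ("C4", date(2008, 2, 1)),
]
print(new_customers_old(sales, date(2008, 1, 1), date(2008, 1, 31)))  # 2
```

C1 bought again in January but already had sales before, so only C2 and C3 count as new for January 2008.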

How to solve 1) + 2) in multidimensional models

Please refer to Chris Webb’s blog here. The solution is pure MDX and is based on a combination of the range-operator “{null:[Date].[Calendar].currentmember}”, NONEMPTY() and COUNT().

Well, so far nothing new.

So let's describe the solution that I came up with. It is based on a different approach. To make the approach easily understandable, we have to rephrase the answer to our original question “What are new customers?”:

A new customer in Period X is a customer that has his first sales in Period X.

According to this new definition we again have 2 steps:
1) Find the first date with sales for each customer
2) count the customers that had their first sales in the selected period

I will focus on tabular models. For multidimensional models most of the following steps have to be solved during ETL.

How to solve 1) in tabular models

This is pretty easy, we can simply create a calculated column in our Customer-table and get the first date on which the customer had sales:

=CALCULATE(MIN('Internet Sales'[Order Date]))

How to solve 2) in tabular models

The calculated column created above allows us to relate our ‘Date’-table directly to our ‘Customer’-table. As there is already an existing relationship between those tables via ‘Internet Sales’ we have to create an inactive relationship at this point:

Using this new relationship we can very easily calculate customers that had their first sales in the selected period:

New Customers:=CALCULATE(
COUNTROWS(Customer),
USERELATIONSHIP(Customer[FirstOrderDate], 'Date'[Date]))

Pretty neat, isn’t it?
We can use COUNTROWS() as opposed to a distinct count measure as our ‘Customer’-table only contains unique customers – so we can count each row in the current filter context.
Another nice thing is that we do not have to use any Time-Intelligence function like DATESBETWEEN, which are usually resolved using FILTER that would iterate over the whole table. Further, it also works with all columns of our ‘Date’-table, no matter whether it is [Calendar Year], [Fiscal Semester] or [Day Name of Week]. (Have you ever wondered how many new customers you acquired on Tuesdays?) And finally, using USERELATIONSHIP allows us to use the full power of xVelocity as native relationships are resolved there.
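A small Python sketch shows why any date attribute works as a filter once each customer carries a first order date. The rows and names are hypothetical; a predicate on the date stands in for a filter on the ‘Date’-table:

```python
from datetime import date

# Hypothetical fact rows: (customer, order_date)
sales = [
    ("C1", date(2007, 12, 30)),
    ("C1", date(2008, 1, 8)),
    ("C2", date(2008, 1, 8)),
    ("C3", date(2008, 1, 22)),
    ("C4", date(2008, 2, 5)),
]

# Step 1: first order date per customer (the calculated column)
first_order_date = {}
for customer, order_date in sales:
    first_order_date[customer] = min(order_date, first_order_date.get(customer, order_date))


def new_customers(date_filter):
    """Step 2: count customers whose first order date passes the date filter."""
    return sum(1 for d in first_order_date.values() if date_filter(d))


print(new_customers(lambda d: (d.year, d.month) == (2008, 1)))  # 2
print(new_customers(lambda d: d.weekday() == 1))                # 3 (Tuesdays)
```

The fact rows are touched only once to derive the first order date; every later count runs over the customers alone.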

The results are of course the same as for [New Customers OLD]:

Though, there are still some issues with this calculation if there are filters on other tables:

As you can see, our new [New Customers] measure does not work in this situation as it is only related to our ‘Date’-table but not to ‘Product’.

I will address this issue in a follow-up post where I will also show how the final solution can be used for multidimensional models – Stay tuned!

UPDATE: Part2 can be found here

## Fiscal Periods, Tabular Models and Time-Intelligence

I recently had to build a tabular model for a financial application and I would like to share my findings on this topic in this post. Financial applications tend to have “Periods” instead of dates, months, etc. Those Periods are usually tied to months, though – e.g. January = “Period01”, February = “Period02” and so on. In addition to those “monthly periods” there are usually also further periods like “Period13”, “Period14” etc. to store manually booked values that are necessary for closing a fiscal year. To get the year's closing value (for a P&L account) you have to add up all periods (Period01 to Period14). In DAX this is usually done by using TOTALYTD() or any similar Time-Intelligence function.

Here is what we want to achieve in the end. The final model should allow the End-user to create a report like this:

This model allows us to analyze data by Year, by Month and of course also by Period. As you can see also the YTD is calculated correctly using DAX’s built-in Time-Intelligence functions.

However, to make use of Time-Intelligence functions a Date-table is required (more information: Time Intelligence Functions in DAX) but this will be covered later. Let's start off with a basic model without a Date-table.

For testing purposes I created this simple PowerPivot model:

Sample of table ‘Facts’:

| AccountID | PeriodID | Value |
|---|---|---|
| 4 | 201201 | 41,155.59 |
| 2 | 201201 | 374,930.01 |
| 3 | 201211 | 525,545.15 |
| 5 | 201211 | 140,440.40 |
| 1 | 201212 | 16,514.36 |
| 5 | 201212 | 639,998.94 |
| 3 | 201213 | -100,000.00 |
| 4 | 201213 | 20,000.00 |
| 5 | 201214 | 500,000.00 |

The first thing we need to do is to add a Date-table. This table should follow these rules:
- granularity=day –> one row for each date
- no gaps between the dates –> a contiguous range of dates
- do not use the fact-table as your date-table –> always use an independent date-table
- the table must contain a column with the data type “Date” with unique values
- “Mark as Date-table”

A Date-table can be created using several approaches:
- SQL view/table
- Azure Datamarket (e.g. Boyan Penev’s DateStream)
- …

(Creating an appropriate Date-table is not part of this post – for simplicity I used a Linked Table from my Excel workbook).

I further created calculated columns for Year, Month and MonthOfYear.

At this point we cannot link this table to our facts. We first have to create some kind of mapping between Periods and “real dates”. I decided to create a separate table for this purpose that links one Period to one Date. (Note: You may also put the whole logic into a calculated column of your fact-table.) This logic is straightforward for periods 1 to 11, which are simply mapped to the last (or first) date in that period. For Periods 12 and later this is a bit more tricky as we have to ensure that these periods are in the right order to make our Time-Intelligence functions work correctly. So Period12 has to be before Period13, Period13 has to be before Period14, etc.

So I mapped Period16 (my sample has 16 Periods) to the 31st of December – the last date in the year as this is also the last period. Period 15 is mapped to the 30th of December – the second to last date. And so on, ending with Period12 mapped to the 27th of December:

| PeriodID | Date |
|---|---|
| 201101 | 01/31/2011 |
| 201102 | 02/28/2011 |
| 201111 | 11/30/2011 |
| 201112 | 12/27/2011 |
| 201113 | 12/28/2011 |
| 201114 | 12/29/2011 |
| 201115 | 12/30/2011 |
| 201116 | 12/31/2011 |
| 201201 | 01/31/2012 |
| 201202 | 02/29/2012 |

I called the table ‘MapPeriodDate’.
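If you prefer to generate this mapping rather than type it, the rule can be expressed in a few lines. The Python sketch below is only an illustration; the function name and the `periods_per_year` parameter are my own, and it assumes PeriodIDs of the form YYYYPP with 16 periods per year as in my sample:

```python
import calendar
from datetime import date


def map_period_to_date(period_id, periods_per_year=16):
    """Map a PeriodID such as 201113 to the date it is linked to."""
    year, period = divmod(period_id, 100)
    if period <= 11:
        # regular monthly period: last day of that month
        last_day = calendar.monthrange(year, period)[1]
        return date(year, period, last_day)
    # Period 12 and the closing periods share December;
    # the last period lands on the 31st, the one before on the 30th, and so on
    return date(year, 12, 31 - (periods_per_year - period))


print(map_period_to_date(201102))  # 2011-02-28
print(map_period_to_date(201112))  # 2011-12-27
print(map_period_to_date(201116))  # 2011-12-31
print(map_period_to_date(201202))  # 2012-02-29
```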

This table is then added to the model and linked to our already existing Period-table (Note: The table could also be linked to the Facts-table directly using PeriodID). This allows us to create a new calculated column in our Facts-table to get the mapping-date for the current Period:

=RELATED(MapPeriodDate[Date])

The new column can now be used to link our Facts-table to our Date-Table:

Please take care in which direction you create the relationship between ‘Periods’ and ‘MapPeriodDate’ as otherwise the RELATED()-function may not work!

Once the Facts-table and the Date-table are connected you may consider hiding the unnecessary tables ‘Periods’ and ‘MapPeriodDate’ as all queries should now use the Date-table. Also the Date-column should be hidden so the lowest level of our Date-table should be [Period].

To get a [Period]-column in our Date-table we have to create some more calculated columns:

[Period_LookUp]
= LOOKUPVALUE(MapPeriodDate[PeriodID], MapPeriodDate[Date], [Date])

This returns the PeriodID if the current date also exists in the MapPeriodDate-table. Note that we only get a value for the last date in a month.

[Period]
= CALCULATE(MIN([Period_LookUp]), DATESBETWEEN('Date'[Date], [Date], BLANK()))

Our final [Period]-calculation returns the first populated value of [Period_LookUp] on or after the current date. The first populated value for dates in January is the 31st which has a value of 201101 – our PeriodID!
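The behaviour of the two calculated columns can be checked with a Python sketch. It rebuilds the 2011 mapping under the same assumptions as my sample (16 periods, Period12 on the 27th of December); the function names are hypothetical:

```python
import calendar
from datetime import date

# Rebuild MapPeriodDate for 2011: date -> PeriodID
map_period_date = {}
for p in range(1, 12):
    map_period_date[date(2011, p, calendar.monthrange(2011, p)[1])] = 201100 + p
for p in range(12, 17):
    map_period_date[date(2011, 12, 15 + p)] = 201100 + p  # 12 -> 27th ... 16 -> 31st


def period_lookup(d):
    """[Period_LookUp]: the PeriodID mapped to this exact date, else None (BLANK)."""
    return map_period_date.get(d)


def period(d):
    """[Period]: the smallest PeriodID mapped to a date on or after d."""
    candidates = [pid for mapped, pid in map_period_date.items() if mapped >= d]
    return min(candidates) if candidates else None


print(period_lookup(date(2011, 1, 15)))  # None
print(period(date(2011, 1, 15)))         # 201101
print(period(date(2011, 12, 1)))         # 201112
print(period(date(2011, 12, 28)))        # 201113
```

Taking the minimum works because PeriodIDs increase together with their mapped dates.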

The last step is to create our YTD-measures. This is now very easy as we can again use the built-in Time-Intelligence functions with this new Date-table:

ValueYTD:=TOTALYTD(SUM([Value]), 'Date'[Date])
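To see which numbers to expect, the YTD values can be recomputed by hand from the sample ‘Facts’ rows shown earlier. The Python sketch below is an illustration, not a description of how TOTALYTD works internally; it relies on the fact that the mapped dates keep the periods of a year in order, so the YTD value of a period is the sum of all values of the same year up to that PeriodID:

```python
from decimal import Decimal

# Rows of the sample 'Facts' table: (AccountID, PeriodID, Value)
facts = [
    (4, 201201, Decimal("41155.59")),
    (2, 201201, Decimal("374930.01")),
    (3, 201211, Decimal("525545.15")),
    (5, 201211, Decimal("140440.40")),
    (1, 201212, Decimal("16514.36")),
    (5, 201212, Decimal("639998.94")),
    (3, 201213, Decimal("-100000.00")),
    (4, 201213, Decimal("20000.00")),
    (5, 201214, Decimal("500000.00")),
]


def value_ytd(period_id):
    """Sum of all values in the same year up to and including period_id."""
    year = period_id // 100
    return sum(
        (value for _, pid, value in facts if pid // 100 == year and pid <= period_id),
        Decimal("0"),
    )


print(value_ytd(201212))  # 1738584.45
print(value_ytd(201214))  # 2158584.45
```

The manually booked closing periods 13 and 14 are included in the year's closing value.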

And of course also all other Time-Intelligence functions now work out of the box:

All those calculations work with Years, Months and also Periods and offer the same flexibility that you are used to from the original financial application.


## Dynamic ABC Analysis in PowerPivot using DAX

An ABC Analysis is a very common requirement for business users. It classifies e.g. Items, Products or Customers into groups based on their sales and how much impact they had on the cumulated overall sales. This is done in several steps.

1) Order products by their sales in descending order
2) Cumulate the sales beginning with the best selling product till the current product
3) Calculate the percentage of the cumulated sales vs. total sales
4) Assign a Class according to the cumulated percentage

Marco Russo already blogged about this here. He does the classification in a calculated column based on the overall sales of each product. As calculated columns are processed when the data is loaded, this is not dynamic in terms of your filters that you may apply in the final report. If, for example, a customer was within Class A regarding total sales but had no sales last year then a report for last year that uses this classification may give you misleading results.

In this blog I will show how to do this kind of calculation on-the-fly always in the context of the current filters. I am using Adventure Works DW 2008 R2 (download) as my sample data and create a dynamic ABC analysis of the products.

The first thing we notice is that our product table is a slowly changing dimension of type 2 and there are several entries for the same product as every change is traced in the same table.

So we want to do our classification on the ProductAlternateKey (=Business Key) column instead of our ProductKey (=Surrogate Key) column.

First we have to create a ranking of our products:

Rank CurrentProducts:=IF(HASONEVALUE(DimProduct[ProductAlternateKey]),
IF(NOT(ISBLANK([SUM SA])),
RANKX(
CALCULATETABLE(
VALUES(DimProduct[ProductAlternateKey]),
ALL(DimProduct[ProductAlternateKey])),
[SUM SA])))

Check if there is only one product in the current context and that this product also has sales. If this is the case we calculate our rank. We need to do the CALCULATETABLE to do the ranking within the currently applied filters on other columns of the DimProduct-table e.g. if a filter is applied to DimProduct[ProductSubcategoryKey] we want to see our ranking within that selected Subcategory and not against all Products.

I also created a measure [SUM SA] just to simplify the following expressions:

SUM SA:=SUM(FactInternetSales[SalesAmount])

The second step is to create a running total starting with the best-selling product/the product with the lowest rank:

CumSA CurrentProducts:=SUMX(
TOPN(
[Rank CurrentProducts],
CALCULATETABLE(
VALUES(DimProduct[ProductAlternateKey]),
ALL(DimProduct[ProductAlternateKey])),
[SUM SA]),
[SUM SA])

We use a combination of SUMX() and TOPN() here. TOPN() returns the top products ordered by [SUM SA]. Further we use our previously calculated rank to only get the products that have the same or more sales than the current product. For example if the current product has rank 3 we sum up the top 3 products to get our cumulated sum (=sum of the first 3 products) for this product. Again we need to use CALCULATETABLE() to retain other filters on the DimProduct-table.

The third step is pretty easy – we need to calculate percentage of the cumulated sales vs. the total sales:

CumSA% CurrentProducts:=
[CumSA CurrentProducts]
/
CALCULATE([SUM SA], ALL(DimProduct[ProductAlternateKey]))

This calculation is straightforward and should not need any further explanation.

The result of those calculations can be seen here:

To do our final classification we have to extend our model with a new table that holds our classes and their border-values:

| Class | LowerBoundary | UpperBoundary |
|---|---|---|
| A | 0 | 0.7 |
| B | 0.7 | 0.9 |
| C | 0.9 | 1 |

Class A should contain products whose cumulated sales are between 0 and 0.7 – between 0% and 70%.
Class B should contain products whose cumulated sales are between 0.7 and 0.9 – between 70% and 90%.
etc.

(This table can later be extended to support any number of classes and any boundaries between 0% and 100%.)

To get the boundaries of the selected class we create two measures that are later used in our final calculation:

MinLowerBoundary:=MIN([LowerBoundary])

MaxUpperBoundary:=MAX([UpperBoundary])

Our final calculation looks like this:

SA Classified Current:=IF(NOT(ISCROSSFILTERED(Classification[Class])),
[SUM SA],
CALCULATE(
[SUM SA],
FILTER(
VALUES(DimProduct[ProductAlternateKey]),
[MinLowerBoundary] < [CumSA% CurrentProducts]
&& [CumSA% CurrentProducts] <= [MaxUpperBoundary])))

If our Classification-table is not filtered, we just show our [SUM SA]-measure. Otherwise we extend the filter on our DimProduct[ProductAlternateKey] using our classification filtering out all products that do not fall within the borders of the currently selected class.
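The four steps (rank, running total, cumulated percentage, class) can be traced end to end with a Python sketch. The sales figures and the function name are hypothetical; the boundaries follow the classification table, with the lower boundary exclusive and the upper boundary inclusive as in the final measure:

```python
# (Class, LowerBoundary, UpperBoundary) as in the classification table
classes = [("A", 0.0, 0.7), ("B", 0.7, 0.9), ("C", 0.9, 1.0)]


def abc_classify(sales_by_product):
    """Return {product: (rank, cumulated sales, cumulated share, class)}."""
    total = sum(sales_by_product.values())
    # step 1: order products by sales, best-selling first
    ranked = sorted(sales_by_product.items(), key=lambda kv: kv[1], reverse=True)
    result = {}
    running = 0
    for rank, (product, sales) in enumerate(ranked, start=1):
        running += sales             # step 2: running total
        cum_pct = running / total    # step 3: share of total sales
        # step 4: lower boundary exclusive, upper boundary inclusive
        cls = next(name for name, lower, upper in classes if lower < cum_pct <= upper)
        result[product] = (rank, running, cum_pct, cls)
    return result


sales = {"P1": 60, "P2": 15, "P3": 12, "P4": 8, "P5": 5}  # hypothetical
for product, row in abc_classify(sales).items():
    print(product, row)
```

Here P1 lands in Class A, P2 and P3 in Class B, and P4 and P5 in Class C. Running the same function on a filtered subset of the sales gives the "dynamic" behaviour: the classes are recomputed for whatever is currently selected.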

This measure allows us to see the changes of the classification of a specific product e.g. over time:

In 2006 our selected product was in Class C. For 2007 and 2008 it improved and is now in Class A. Still, overall it resides in Class B.

We may also analyze the impact of our promotions on the sales of our classified products:

Our Promotion “Touring-1000 Promotion” only had impact on products in Class C so we may consider to stop that promotion and invest more into the other promotions that affect all classes.

The classification can be used everywhere you need it – in the filter, on rows or on columns, even slicers work. The only drawback is that the on-the-fly calculation can take quite some time. If I find some time in the future I may try to further tune the calculations and update this blog-post.

Note that the sample workbook is already in Office 2013 format and may not open in any previous version of Excel/PowerPivot.
It also includes a second set of calculations that use the same logic as described above but do all the calculations without retaining any filters on the DimProduct-table. This allows you to filter on Class “A” and ProductSubcategory “Bike Racks” and realize that “Bike Racks” are not a Class “A” product, or to see which Subcategories or Categories actually contain Class A, B or C products!

## DAXMD and DefaultMembers

With “Power View for Multidimensional Models – Preview”, Microsoft recently made the first SQL Server version available that allows you to query your multidimensional models using DAX, or, to be exact, using DAXMD. As tabular and multidimensional models are fundamentally different in terms of their underlying data structures, there are also some differences in how to query them. Jason Thomas already blogged about some of those differences here and showed how to query attributes in DAXMD. In this blog I will focus on how DefaultMembers are handled in DAXMD.

As you probably know, there is no concept of DefaultMembers in tabular models but they are essential for multidimensional models. In most cases the DefaultMember is the All-Member of the hierarchy. This is also no issue for tabular models but it gets more tricky if there is a DefaultMember defined in the multidimensional model and you query it using a query language that is designed for tabular models like DAXMD.

For my tests I used the AdventureWorks Multidimensional Model (enterprise) from SQL Server 2008R2 which can be downloaded here.

This model contains a dimension called [Scenario] which is non-aggregatable and also has a DefaultMember defined. To get a list of all available scenarios you could simply write the following MDX:

SELECT
{[Measures].[Amount]} ON 0,
[Scenario].[Scenario].[Scenario].allmembers ON 1
FROM [Adventure Works]

The result looks like this:

To get the similar result in DAX you would usually write a query like this:

EVALUATE
SUMMARIZE(
'Scenario',
'Scenario'[Scenario],
"Amount", 'Financial Reporting'[Amount])

If you run this query you get an error saying that

Column [Scenario] is part of composite key, but not all columns of the composite key are included in the expression or its dependent expression.

This is because our [Scenario]-attribute has a NameColumn that is different from its KeyColumn. DAXMD handles this as if it were a composite key, as the NameColumn may not be unique without the KeyColumn. So you always have to include the Key-Column in your query as well. This is usually done by using the following syntax: ‘MyTable’[MyColumn.Key0] where Key0 refers to the first column of the composite key. So let's change our query and see what happens:

EVALUATE
SUMMARIZE(
'Scenario',
'Scenario'[Scenario.Key0],
"Amount", 'Financial Reporting'[Amount])

Now we get a different error saying that the column [Scenario.Key0] could not be found in the source table. That’s very strange as this syntax works just fine for all other attributes that have different Key- and NameColumns.

(I posted this issue on Connect as for me this is an inconsistent behavior – feel free to vote for it here)

So, to further investigate into this problem we first have to check what columns actually exist in our ‘Scenario’ table. This can be done using this query:

EVALUATE 'Scenario'

We notice several things here:
1) the column is called [Scenario.UniqueName] as opposed to [Scenario.Key0]
2) the resultset only contains 1 row
3) this one row is the DAXMD representation of our DefaultMember

OK, now that we know how to reference the column we can adapt our query accordingly:

EVALUATE
SUMMARIZE(
'Scenario',
'Scenario'[Scenario.UniqueName],
"Amount", 'Financial Reporting'[Amount])

The query works now but still shows only one row – our DefaultMember. This is the special behavior of DAXMD and how it handles DefaultMembers. It internally applies a filter on the columns where DefaultMembers are set. In this case, whenever you query your ‘Scenario’ table, it is internally handled like

FILTER(
'Scenario',
'Scenario'[Scenario.UniqueName] = "[Scenario].[Scenario].&[1]")

This internal filter can be removed like any other filter on a table or a column using the same functions that we would usually use – ALL(), ALLEXCEPT(), etc.

So to get our DAXMD equivalent to our MDX query we have to remove the filter from our ‘Scenario’ table:

EVALUATE
SUMMARIZE(
ALL('Scenario'),
'Scenario'[Scenario.UniqueName],
'Scenario'[Scenario],
"Amount", 'Financial Reporting'[Amount])

Please also note that the last two rows [Budget Variance] and [Budget Variance %] are calculated members that are created in the cube's MDX script and they show up just like regular rows! So calculated members are seamlessly integrated into DAXMD – awesome! Especially if you consider that DAXMD is NOT translated into MDX but is natively integrated into the engine!

In this blog I showed how DefaultMembers defined in multidimensional models are handled in a tabular query language like DAXMD and what pitfalls I encountered. I hope this blog helps you to better understand these internals and not to make the same mistakes that I already did.

## Back to Business – now as SSAS Maestro!

If you followed my blog in the past you have probably realized that there has been no new post since mid-July. Well, there is a good reason for this – I was on vacation for the past 5 months. At this point I want to thank my company pmOne for giving me the opportunity to do this!

I used the time to travel around South East Asia, visiting Indonesia, Singapore, Malaysia, Brunei and Vietnam, and had a really awesome time and a lot of great experiences.
During that time I also got the official confirmation that I passed the SSAS Maestro 1.2 certification that I did back in 2011 and that I am now a member of the small circle of SSAS Maestros – what awesome news during my trip!

Now that I am back again with all my batteries reloaded, I have to catch up on what I missed during that time – which is quite a lot (Office 2013, PowerView, DAX on MOLAP, …) – and pick up my old working life again.
These new technologies and tools also offer a lot of potential for future posts – so stay tuned!

Below are also some pictures of my travels, which are by no means supposed to make anyone jealous who is still sitting in the office reading this post.
