Overview of DAX
Table of contents
- Data Types
Data Analysis Expressions (DAX) is a programming language used to create calculated columns, measures, and custom tables in Microsoft Power BI, Power Pivot, and Analysis Services. It is a collection of functions, operators, and constants that can be used to perform advanced calculations and queries on data in related tables and columns in tabular data models.
In other words, DAX is a tool that allows you to analyze data in a more sophisticated way than you could with basic formulas. It can be used to create new insights from your data, such as calculating trends, ratios, and other complex metrics. You can also use DAX to create custom visualizations and reports.
DAX formulas can be used to:
Create new ways to summarize and analyze your data.
Add new data points to your data model or transform existing data.
Create temporary tables that can be used in calculations or to filter data.
Control which rows of data are visible to each user.
Here are some examples of how DAX formulas can be used:
Measure: Calculate the total sales for a product category in the current year.
Calculated column: Calculate a discount rate based on the customer's order value.
Calculated table: Create a table of the top 10 customers by sales.
Row-level security: Only allow users to see data for their own region.
Measures are dynamic calculation formulas in Power BI, Excel PivotTables, and PivotCharts. They are used to summarize data and calculate metrics, such as total sales, average order value, and profit margin. Measures are created using DAX formulas, which can include standard aggregation functions (such as COUNT, SUM, and AVERAGE) as well as custom formulas.
Measures are dynamic because their results change depending on the context of the report or visualization. For example, a measure that calculates total sales will return a different value for each product category, region, and time period.
Measures are also used in reporting that supports combining and filtering model data by using multiple attributes. This means that you can create reports that show how different metrics vary across different dimensions of your data. For example, you could create a report that shows total sales for each product category by region and time period.
Measures are created by using the DAX formula bar in the model designer. To create a measure, you simply enter a DAX formula in the formula bar. The formula can include any valid DAX expression, including functions, operators, and values.
A formula in a measure can use standard aggregation functions automatically created by using the Autosum feature. For example, you could create a measure that calculates total sales by simply typing
SUM([Sales Amount]) in the formula bar.
You can also define your own formula by using the DAX formula bar. For example, you could create a measure that calculates profit margin by using the following formula:
Profit Margin = SUM([Profit]) / SUM([Sales Amount])
Named measures can be passed as an argument to other measures. This allows you to create complex calculations that combine multiple measures. For example, you could create a measure that calculates total revenue by combining the total sales and total discount measures.
Here are some examples of how measures can be used in reporting:
Calculate total sales for each product category, region, and time period.
Calculate average order value by customer segment.
Calculate profit margin by product category.
Calculate customer lifetime value.
Calculate year-over-year growth for key metrics.
Measures are a powerful tool for summarizing and analyzing data in Power BI, Excel PivotTables, and PivotCharts. By learning how to use measures, you can gain new insights from your data and create more informative reports and visualizations.
In the context of defining a formula for a measure, a Tooltip feature provides a sneak peek of potential results for the total within the current context. However, these results aren't immediately displayed elsewhere. This is because a measure's outcome depends on its context, which can vary.
To calculate a measure accurately, you need a reporting client application that can supply the necessary context, fetch relevant data for each cell, and then assess the expression for each cell. Such a client can be Excel's PivotTable or PivotChart, a Power BI report, or even a table expression in a DAX query within SQL Server Management Studio (SSMS).
Regardless of the specific client used, a separate query is executed for each cell in the results. In other words, different combinations of row and column headers in a PivotTable or various selections of slicers and filters in a Power BI report generate distinct subsets of data over which the measure is computed.
Unlike calculated columns, the syntax for a measure includes the measure's name preceding the formula. In the example just provided, the name Total Sales appears preceding the formula. After you've created a measure, the name and its definition appear in the reporting client application Fields list, and depending on perspectives and roles is available to all users of the model.
A calculated column is an additional column that you introduce into an existing table using the model designer. You then define the values for this column using a DAX formula. When a calculated column contains a valid DAX formula, it computes values for each row as soon as the formula is input. These values are subsequently stored in the in-memory data model.
For instance, in a Date table, if you enter the following formula in the formula bar:
\= [Calendar Year] & " Q" & [Calendar Quarter]
For each row in the table, a value is calculated by taking data from the Calendar Year column within the same Date table, adding a space and the capital letter Q, and then incorporating values from the Calendar Quarter column in the same Date table. The outcome for each row in the calculated column is instantly computed and might appear as, for instance, "2017 Q1."
Importantly, these column values are only recalculated when the table or any related table undergoes a refresh, or when the model is unloaded from memory and then reloaded, such as when closing and reopening a Power BI Desktop file.
A calculated table is a dynamically generated entity that relies on a formula expression and is derived from one or more tables within the same model. Instead of extracting and loading data into the columns of a new table from an external data source, a DAX formula defines the values within this table.
Calculated tables prove valuable in scenarios like role-playing dimensions. For instance, a Date table may serve as OrderDate, ShipDate, or DueDate, depending on foreign key relationships. By explicitly creating a calculated table for ShipDate, you establish a standalone table ready for queries, on par with any other table. Calculated tables also excel in configuring filtered rowsets or subsets/supersets of columns from existing tables. This empowers you to preserve the original table while crafting variations to cater to specific requirements.
Calculated tables seamlessly integrate into relationships with other tables. They possess data types, formatting options, and can be categorized as needed. These tables can be named, surfaced for visibility, or hidden, just like regular tables. Importantly, recalculations occur when the tables they draw data from undergo refreshing or updating processes.
Row Level Security
Row-level security involves the use of DAX formulas that must produce a TRUE/FALSE result, determining which rows can be accessed by members of specific roles in a query. For instance, in the Sales role, a DAX formula like:
\= Customers[Country] = "USA"
ensures that members of the Sales role can only see data for customers in the USA, and aggregate functions like SUM will only consider data related to USA customers. Importantly, row-level security is not available in Power Pivot within Excel.
When you establish row-level security using DAX formulas, you create a set of allowed rows. This doesn't deny access to other rows; instead, those rows are simply excluded from the allowed set. Other roles can grant access to the excluded rows through their own row-level security settings. If a user belongs to another role that permits access to the previously excluded row set, they can view the data for those rows.
Row-level security formulas apply not only to specified rows but also to related rows. In cases where a table has multiple relationships, the security filters apply to the active relationship. Additionally, row-level security formulas can intersect with other formulas defined for related tables, providing comprehensive security control.
DAX queries allow you to create and execute queries in tools such as SQL Server Management Studio (SSMS) and DAX Studio (daxstudio.org). Unlike DAX calculation formulas, which are limited to tabular data models, DAX queries can also be used with Analysis Services Multidimensional models. DAX queries are often favored for their simplicity and efficiency compared to Multidimensional Data Expressions (MDX) queries.
A DAX query is akin to a T-SQL SELECT statement and is composed of various DAX functions. One of the fundamental types of DAX queries is the "evaluate" statement. For example:
( FILTER ( 'DimProduct', [SafetyStockLevel] < 200 ) )
ORDER BY [EnglishProductName] ASC
This query, when executed, returns a table that lists products with a SafetyStockLevel less than 200, sorted in ascending order by EnglishProductName.
In summary, DAX queries provide a versatile and efficient means of querying data in various data models, offering advantages in terms of simplicity and performance over MDX queries.
DAX formulas are essential for creating calculations in calculated columns and measures, and securing your data by using row-level security. To create formulas for calculated columns and measures, use the formula bar along the top of the model designer window or the DAX Editor. To create formulas for row-level security, use the Role Manager or Manage roles dialog box. Information in this section is meant to get you started with understanding the basics of DAX formulas.
DAX formulas vary in complexity, ranging from simple to intricate. Here are examples of straightforward formulas that can be utilized in a calculated column:
= TODAY(): Inserts the current date into every row of the calculated column.
= 3: Inserts the value "3" into every row of the calculated column.
= [Column1] + [Column2]: Adds the values from [Column1] and [Column2] in the same row and places the result in the calculated column of that row.
Regardless of whether your formula is basic or intricate, you can follow these steps when constructing one:
Every formula must start with an equal sign (=).
You can either type the function name or an expression.
Start typing the first few letters of the desired function or name, and AutoComplete will suggest available functions, tables, and columns. Press TAB to add an item from the AutoComplete list to the formula.
Alternatively, you can click the Fx button to see a list of available functions. Choose a function from the dropdown list using the arrow keys and click OK to include it in the formula.
Provide the function's arguments by selecting them from a dropdown list of potential tables and columns or by entering values.
Validate the formula for syntax errors, ensuring that all parentheses are closed and that columns, tables, and values are accurately referenced.
Press ENTER to confirm the formula.
These steps help in creating DAX formulas, whether they are simple or complex, for use in calculated columns.
Using formula Atutocomplete
AutoComplete simplifies the process of creating accurate formula syntax by offering suggestions for each component within the formula.
When working with nested functions or adding to an existing formula, AutoComplete is applicable. It displays relevant options based on the text just before the insertion point. Notably, any text following the insertion point remains unaltered.
However, it's important to note that AutoComplete doesn't automatically insert closing parentheses for functions or match parentheses. Users must ensure the correctness of each function's syntax to save and utilize the formula effectively.
Using multiple functions in a formula
Function nesting allows you to use the outcomes of one function as input for another function. In calculated columns, you can nest functions to a maximum depth of 64 levels. Nevertheless, extensive nesting can complicate the creation and debugging of formulas. Some functions are specifically intended for nesting and generate tables as results, which cannot be directly saved but must be utilized as input for a table function. For instance, functions like SUMX, AVERAGEX, and MINX necessitate a table as their initial argument.
In DAX, a function is a named formula embedded within an expression. Functions typically require both required and optional input arguments, known as parameters. When a function is executed, it yields a value. DAX encompasses various functions designed for diverse purposes, including performing calculations involving dates and times, generating conditional values, manipulating strings, conducting lookups based on relationships, and enabling iterative operations on tables for recursive computations.
DAX functions differ from Excel formulas in several key ways:
DAX functions invariably reference entire columns or tables. To work with specific values within a table or column, you can apply filters within the formula.
DAX provides functions that enable custom calculations on a row-by-row basis. These functions allow you to use the current row's value or related values as parameters, facilitating context-dependent computations.
Many DAX functions return tables instead of single values. These tables are not directly displayed in reporting tools but serve as inputs for other functions. For instance, you can retrieve a table and then perform operations like counting distinct values or dynamic sums on filtered tables or columns.
DAX boasts a range of time intelligence functions, empowering users to define or select date ranges and perform dynamic calculations based on these temporal contexts. This facilitates tasks such as comparing sums across parallel periods.
Aggregation functions calculate a (scalar) value such as count, sum, average, minimum, or maximum for all rows in a column or table as defined by the expression.
Date and time functions
The date and time functions in DAX are similar to date and time functions in Microsoft Excel. However, DAX functions are based on a datetime data type starting March 1, 1900.
The filter functions in DAX return specific data types, look up values in related tales, and filter by related values.The lookup functions work by using tables and relationships, like a database. The filtering functions let you manipulate data context to create dynamic calculations.
The financial functions in DAX are used in formulas that perform financial calculations, such as net present value and rate of return. These functions are similar to financial functions used in Microsoft Excel.
An information function looks at the cell or row that is provided as an argument and tells you whether the value matches the expected type. For example, the ISERROR function returns TRUE if the value that you reference contains an error.
Logical functions act upon an expression to return information about the values in the expression. For example, the TRUE function lets you know whether an expression that you are evaluating returns a TRUE value.
Mathematical and trigonometric Functions
The mathematical functions in DAX are very similar to the Excel mathematical and trigonometric functions. Some minor differences exist in the numeric data types used by DAX functions.
These functions perform unique actions that cannot be defined by any of the categories most other functions belong to
Relationship functions in DAX allow you to return values from another related table, specify a particular relationship to use in an expression, and specify cross filtering direction.
Statistical functions calculate values related to statistical distributions and probability, such as standard deviation and number of permutations.
Text functions in DAX are very similar to their counterparts in Excel. You can return part of a string, search for text within a string, or concatenate string values. DAX also provides functions for controlling the formats for dates, times, and numbers.
Time Intelligence Functions
The time intelligence functions provided in DAX let you create calculations that use built-in knowledge about calendars and dates. By using time and date ranges in combination with aggregations or calculations, you can build meaningful comparisons across comparable time periods for sales, inventory, and so on.
Table Manipulation Functions
These functions return a table or manipulate existing tables. For example, by using ADDCOLUMNS you can add calculated columns to a specified table, or you can return a summary table over a set of groups with the SUMMARIZECOLUMNS function.
You can create variables within a DAX expression using the keyword VAR. Unlike a function, VAR is not a function itself; rather, it's a keyword used to store the result of an expression as a named variable. This named variable can then be utilized as an argument in other measure expressions. For instance:
DAX Copy code VAR TotalQty = SUM ( Sales[Quantity] )
IF ( TotalQty > 1000, TotalQty 0.95, TotalQty 1.25 )
In this example, TotalQty is a named variable that holds the result of the SUM function. This variable can be subsequently used in other expressions. Variables in DAX can be of various scalar data types, including tables. Leveraging variables in your DAX formulas can be a potent technique for enhancing your calculations and making them more understandable and efficient.
You can import data into a model from various sources with different data types. When data is imported into a model, it's initially converted to one of the tabular model data types. However, when this model data is utilized in calculations, it undergoes conversion to a DAX data type for the duration and output of the calculation. The choice of DAX data type is determined automatically based on the terms used in the formula.
DAX supports several data types, including:
Whole Number: A 64-bit integer without decimal places, covering a wide range of positive and negative whole numbers.
Decimal Number: A 64-bit real number allowing decimal places but limited to 17 significant digits.
Boolean: Representing True or False values.
Text: A Unicode character data string that can hold various types of data, including strings, numbers, or dates in text format.
Date: Handling dates and times in an accepted date-time representation, with valid dates starting from March 1, 1900.
Currency: Supporting currency values with four decimal digits of fixed precision within specific ranges.
Blank (N/A): A data type that represents and replaces SQL nulls, useful for handling missing or undefined values.
DAX also includes the Table data type for complex calculations involving tables, allowing dynamic data set definitions.
Understanding these data types is crucial for crafting accurate DAX formulas, as errors or unexpected results can occur when using incompatible operators with specific data types in arguments. For example, "1" + "2" results in an integer 3, while 1 & 2 yields a string "12."
Context is a fundamental concept in DAX formula creation, crucial for achieving dynamic analyses and resolving formula-related issues. It enables formulas to adapt their results based on the current row or cell selection and related data.
DAX formulas within tabular models can be assessed within various contexts, influenced by different design elements:
Filters applied within a PivotTable or report, which affect how formulas interpret data based on user selections.
Filters explicitly defined within a formula, allowing you to apply specific conditions to your calculations.
Relationships established using special functions within a formula, influencing how data from related tables is considered.
Context manifests in different types, including:
Row context, where formulas adapt their outcomes based on the current row's data.
Query context, which involves the context created by filters applied to the entire query.
Filter context, representing the set of filters applied to the data model before a DAX expression evaluation starts.
Understanding and effectively utilizing these contextual aspects are essential for constructing dynamic and high-performing analyses, as well as diagnosing formula-related issues.
Row context can be described as the context of the "current row" within a table. When you formulate a calculation in a calculated column, the row context for that calculation encompasses all the values within the columns of the current row. If the table has a relationship with another table, the row context also encompasses values from the related table that are associated with the current row.
For instance, consider a scenario where you create a calculated column, such as = [Freight] + [Tax], which combines values from two columns, Freight and Tax, within the same table. This formula inherently retrieves values only from the current row within the specified columns.
Row context extends to follow any defined relationships between tables, even those created within calculated columns using DAX formulas. It determines which rows in related tables are linked to the current row.
As an example, the following formula employs the RELATED function to fetch a tax value from a related table based on the shipping region of an order. It retrieves the tax rate for the current region from the Region table and adds it to the value in the Freight column. Notably, in DAX formulas, there's no need to explicitly specify the connecting relationships between tables.
Multiple Row Context
DAX encompasses functions that enable iterative calculations within a table. These functions can work with multiple current rows, each having its unique row context. Essentially, these functions facilitate the creation of formulas that perform recursive operations involving inner and outer loops.
For instance, consider a scenario where your data model consists of a Products table and a Sales table. You may want to analyze the Sales table, which contains various transactions for multiple products, and determine the highest quantity ordered for each product in any single transaction.
Using DAX, you can construct a single formula that delivers the correct result, and it automatically updates as users input data into the tables.
DAX Formula Example:
DAX Copy code = MAXX(FILTER(Sales,[ProdKey] = EARLIER([ProdKey])),Sales[OrderQty])
The EARLIER function plays a crucial role in storing the row context from the preceding operation. It maintains two sets of context in memory: one for the current row within the inner loop of the formula and another for the current row within the outer loop of the formula. DAX efficiently exchanges values between these loops, allowing the creation of intricate aggregate calculations.
Query context refers to the specific data subset fetched implicitly for a formula. When a user incorporates a measure or field into a report, the system scrutinizes elements such as row and column headers, slicers, and report filters to establish the context. It then executes queries against the model data to retrieve the precise data subset, perform computations as defined by the formula, and populate the report with values.
The key point is that the context varies based on where the formula is positioned, leading to potential variations in formula results. For instance, if you devise a formula that sums the Profit column values in the Sales table, like = SUM('Sales'[Profit]), and use it in a calculated column within the Sales table, the formula's query context consistently encompasses the entire Sales table dataset. This results in profit totals for all regions, products, years, and so forth.
However, users usually seek distinct results, such as profit for a specific year, country, product, or a combination of these, alongside a grand total. In a report, context changes through actions like filtering, adding/removing fields, and utilizing slicers. With each alteration, the measure is assessed within a different query context, allowing the same formula used as a measure to produce diverse outcomes for each cell.
Filter context defines the permissible values within columns or those retrieved from related tables. These filters can be applied at the column design level or in the presentation layer, including reports and PivotTables. Filter context can also be explicitly defined using filter expressions within a formula.
To establish filter context, you specify filter constraints for columns or tables through formula arguments. This filter context operates alongside other contexts like row context or query context.
In tabular models, various methods can be employed to create filter context. Users working within client applications like Power BI reports can dynamically create filters by adding slicers or report filters to row and column headings. Additionally, you have the option to define filter expressions directly within formulas to filter related values, tables used as inputs, or dynamically obtain context for calculations. Furthermore, you can clear filters entirely or selectively on specific columns, a valuable technique when crafting formulas for calculating grand totals.
For more information on creating filters within formulas, refer to the FILTER Function (DAX). To see an example of clearing filters for grand totals, consult the ALL Function (DAX).
Determining context in formulas
When you create a DAX formula, it undergoes two tests: first, it's checked for valid syntax, and then it's verified to ensure that the column and table names used in the formula exist within the current context. If any specified column or table can't be found, the formula triggers an error.
The context, during both validation and recalculation operations, is determined based on the available tables in the model, relationships between tables, and any applied filters. For instance, if you've recently imported data into a new table with no relationships or applied filters, the current context encompasses all the columns in that table. If the table is linked to other tables through relationships, the current context extends to include those related tables. When you incorporate a column from this table into a report that employs slicers and report filters, the context for the formula is the subset of data in each cell of the report.
Context is a potent concept that can sometimes make troubleshooting formulas challenging. Starting with simple formulas and understanding relationships is recommended to grasp how context operates. The following section provides examples of how formulas utilize various types of context to dynamically yield results.
In DAX language, there are four distinct categories of calculation operators used in formulas:
Comparison operators: These operators are employed for comparing values and generating logical outcomes, typically either TRUE or FALSE.
Arithmetic operators: This category includes operators utilized for performing mathematical calculations, resulting in numeric values.
Text concatenation operators: Text concatenation operators are employed to combine two or more text strings into a single, unified text string.
Logical operators: These operators are used to combine multiple expressions and produce a single result based on logical conditions.
These operator types serve various purposes within DAX formulas, enabling users to manipulate and analyze data effectively.
Did you find this article valuable?
Support Sweta_Sarangi by becoming a sponsor. Any amount is appreciated!