In my previous blog we have seen few techniques of data modeling. Now let us see few more techniques to do the data modeling and data transformation
Add New column:
We have Match table which has all match related details such as winner team, who is the man of the match, is the match won by wickets or runs etc. If we would like to show the data “Win by wickets” and “Win By Runs” based on the winning type, how do we do that? Just open the table. See there are two columns (Win Type and Win Margin) which can provide such data, we only need to show this data from columns to the rows.
Let us add new column called Win_By_Wickets and add the formula to show the win_margin in case the match is won by wickets.
Let us follow the steps as shown in below snippets
Add the column name before the “=” symbol and let us add if condition to check the win_type and display the Win_Margin, Writing this DAX formula is almost similar to how we add formula in excel.
The complete formula would be
“Win_By_Wickets = IF(Match[Win_Type]=”wickets”,Match[Win_Margin],”0″)”
Similarly for Win by Runs, it would be
Win_By_Runs = IF(Match[Win_Type]=”runs”,Match[Win_Margin],”0″)
Now we have these two new columns added in the table for further visualization purpose.
Aggregate the data with group by
Ball By Ball table contains the huge data per ball in all the matches. Now if we need to find the statistics of Economy Rate – how many runs the Bowler has given per over. We can use SUMMARIZE function which provides grouping of the data based on particular columns. Output of the summarize function is a table. Its very similar to the select query having group by clause and aggregate function in the SQL server.
Please refer below expression which selects data from Ball_By_Ball table, groups the data by Bowler and adds additional columns to the same table with various aggregate functions such as SUM, COUNTROWS etc.
EconomyTable = SUMMARIZE(Ball_By_Ball,Ball_By_Ball[Bowler],”Runsperbowler”,SUM(Ball_By_Ball[Runs_Scored]),”Oversbybowler”,COUNTROWS(Ball_By_Ball)/6,”Economy”,(SUM(Ball_By_Ball[Runs_Scored])/(COUNTROWS(Ball_By_Ball)/6)))
Here we are calculating the economy rate, hence need the number of overs played by the Bowler and the runs scored,
My Economy column contains the formula as RunsperBowler divided by oversbyBowler. Exact formula is as given below (SUM(Ball_By_Ball[Runs_Scored])/(COUNTROWS(Ball_By_Ball)/6)))
Now we can use this table in report for further visualizations. I would recommend everyone to go to the Microsoft Docs (Link is provided below) to refer the details of the different aggregate functions available in Power BI.
We have seen few more techniques of data modeling in Power BI using Power BI desktop in this blog. Let us see how to create different visualizations of this data and create IPL Analytics reports in my next blog. I just loved creating these reports.
Thank you for reading my blog, if you have any questions please mention in the comments and hit the like button if you like my write up.
Keep Learning !! Happy analyzing and be Powerful 🙂