2. Replacing of Missing Values
Missing values can cause faulty reports, so we either fill or remove them.
Click on the column that has missing values.
Transform > Replace Values, then type in a default value (like “Unknown” for missing customer names).
Remove rows with incomplete information (Remove Rows > Remove Blank Rows), if necessary.
For instance, for missing values in the Price column, one can replace them with an average or fixed amount so that calculations would be done right.
3. Splitting and Merging Columns
Sometimes one column contains different pieces of information.
For instance, Customer Name may contain full names like "Alice Johnson" while we want separate columns for First Name and Last Name.
Select the Customer Name column.
Click Split Column > By Delimiter and choose Space as the delimiter.
Rename the two new columns to First Name and Last Name.
In the same way, you can merge other columns together using Merge Columns.
4. Change Data Types
Inaccurate data types could throw calculations off and have an impact during analysis.
Select the Order Date column.
Click Data Type in the toolbar and choose Date.
Make sure Price and Total columns are set to Currency if these are to be analyzed as finances.
Once we fix the data types, it allows Power BI to work properly in doing its calculations.
5.Adding a Custom Column for New Calculations
Let’s say a 10% discount has to be calculated for every order.
Click Add Column > Custom Column.
Type in the formula:
[Total] * 0.10
Rename this column to Discount Amount.
Now every order will have a 10% discount that can further enhance pricing analysis.
Step 6 - Effective Practices in Power BI Data Modeling
• Unique Identifiers - CustomerID, OrderID, ProductID must all be different.
- Improve Performance: Simple joins and do not have too many tables joined together
- Star Schema: Helps in faster and more efficient querying.
- Set Cross-Filtering: Both direction for interactive reporting.
- Logical Hierarchies: Hierarchies should be kept in natural order (Region → Country→ State → City).
6. Grouping Data for Summarizing Sale
Grouping is helpful for summarizing large datasets.
For instance, to calculate total sales per product:
Select Product and Total columns.
Click Transform > Group By.
Select Sum for calculating total revenue for each product.
Presently, instead of the individual transactions, we see a clean picture of the total sales for each product.
7.Unpivoting Data for Analysis
Occasionally, data is structured in a way that makes it difficult to analyze. Assume we have columns for quarterly sales like Q1 Sales and Q2 Sales and Q3 Sales. Then we can use Unpivot Columns to convert them all into a single column.
Select the quarterly sales columns.
Click Transform>Unpivot Columns.
This makes it easier to filter and analyze sales trends across quarters.
8. Conditional Columns for Data Categorization
In this section, we will categorize orders depending on their Total value:
Click Add Column > Conditional Column.
Set up rules as such:
If Total > 1500, then High Value
If Total between 800 and 1500, then Medium Value
Else, Low Value
This categorization helps in identifying high-value customers and products.
9. Removal of Duplicates
Removing duplicates can alter the insight, hence, they should be eliminated.
Select either the Order ID or Customer Name column.
Click Remove Duplicates.
This way we ensure there is one order for each CustomerID, thereby bettering the accuracy of data.
The Data Finalization and Loading into Power BI
Once all transformations are done:
Click Close &Apply to load the cleaned data into Power BI.
Use the Report View for dashboard and visualization purposes.
Well, the data is structured, corrected, and good for reporting.
Conclusion
Power Query just changes the game in converting raw data into useful insights. This practice had us cover real-world techniques, including those we have mentioned: handling of missing values, splitting columns, changing data types, and summarizing data.
Now that you have mastered all those transformations, you will be able to clean and prepare any data set at an instant. Implement these steps in your projects on Power BI until working with dirty data becomes second nature!