Website Visuals | Website Analytics

Visualize Website KPIs with Python Using Matplotlib and Seaborn

Welcome to Phase 8.2 Crafting Compelling Visualizations

You have successfully created an interactive web dashboard using Streamlit. That is a fantastic way to make your web analytics insights accessible. Now it is time to focus on creating high quality static visualizations for reports presentations or even for embedding in your website. In this phase we will learn how to visualize key performance indicators KPIs using Python with Matplotlib and Seaborn. Think of this as creating beautiful and informative charts that tell a clear story about your website's performance. This is crucial for formal reporting and deeper analytical dives.
This step is crucial for detailed data exploration and professional data presentation.


Why Python Visualizations (Matplotlib and Seaborn) are Essential

While Streamlit provides quick interactive dashboards Matplotlib and Seaborn offer unparalleled control and flexibility for static plots. Here is why they are so valuable.

• Publication Quality Create charts that are ready to use in academic research, business reports, or professional presentations.
• Granular Control Customize every detail of your charts—from colors and fonts to labels, scales, and annotations—for full visual control.
• Complex Visualizations Build advanced chart types and multi-layered plots that may not be possible with basic dashboard tools.
• Deep Dive Analysis Use detailed visualizations to uncover patterns, investigate outliers, and explore specific hypotheses.
• Automated Report Generation Generate and save charts automatically, making it easier to include visuals in recurring reports or pipelines.
• Industry Standard Matplotlib and Seaborn are trusted tools widely used by analysts and data scientists. Learning them boosts your data visualization skills.

Python visualization libraries empower you to tell a precise and impactful data story.


 Line chart showing daily users, sessions, and page views

Key Concepts for Python Visualizations

To create effective visualizations with Python we need to understand a few core concepts.

Matplotlib: This is the core plotting library in Python. It offers a wide variety of plotting tools and allows full customization of your charts, including colors, fonts, layout, and more.

Seaborn: Built on top of Matplotlib, Seaborn simplifies the process of creating visually appealing and informative statistical graphics. It handles complex plots with less code and better default styling.

Types of Charts: Different chart types help highlight different aspects of your data:
Line Charts: Best for showing trends over time, such as daily users or sessions.
Bar Charts: Useful for comparing categories like traffic sources or devices.
Pie Charts / Donut Charts: Great for showing proportions of a whole, such as traffic share by medium.
Histograms / KDE Plots: Help visualize the distribution of a single numeric value, like session duration.
Scatter Plots: Show relationships between two numeric variables, such as time on site vs. bounce rate.

Data Preparation: Before plotting, make sure your data is clean and structured in a Pandas DataFrame. This often means filtering, aggregating, or transforming your raw data to fit your visual needs.

Figure and Axes: In Matplotlib, a Figure represents the whole plotting area (like a canvas), and an Axes is a single plot within that space. You can add multiple Axes to one Figure to build complex layouts.

Labels, Titles, and Legends: These elements help make your charts readable. Always include clear labels for axes, a descriptive title, and a legend if you have multiple data series.

Saving Plots: Use Matplotlib’s plt.savefig() function to export your plots as image files (PNG, JPEG, SVG). This is useful for including visuals in reports, presentations, or sharing insights offline.


Python Code for Visualizing Web Analytics KPIs

We will write a Python script that connects to your SQL Server database. It will fetch various aggregated data points. It will then use Matplotlib and Seaborn to create several types of visualizations for your web analytics KPIs.

Practical Python Code Example (visualize_kpis.py)
Here is the basic Python code you will write for visualizing your KPIs. Save this file in your 'dashboard/' folder.

# SCRIPT SETUP 1: Import necessary libraries import pandas as pd import pyodbc import matplotlib.pyplot as plt import seaborn as sns from datetime import datetime # SCRIPT SETUP 2: Database connection details for SQL Server # Replace with your actual server name, database name, and credentials DB_CONFIG = { 'driver': '{ODBC Driver 17 for SQL Server}', 'server': 'YOUR_SQL_SERVER_NAME', # e.g., 'localhost' or 'SERVER_NAME\SQLEXPRESS' 'database': 'YOUR_DATABASE_NAME', 'uid': 'YOUR_USERNAME', 'pwd': 'YOUR_PASSWORD' } # FUNCTION 1: Connect to the database def connect_to_db(): """Establishes a connection to the SQL Server database.""" conn_str = ( f"DRIVER={DB_CONFIG['driver']};" f"SERVER={DB_CONFIG['server']};" f"DATABASE={DB_CONFIG['database']};" f"UID={DB_CONFIG['uid']};" f"PWD={DB_CONFIG['pwd']};" ) try: conn = pyodbc.connect(conn_str) print("Successfully connected to SQL Server.") return conn except pyodbc.Error as ex: sqlstate = ex.args[0] print(f"Database connection error: {sqlstate}") print(ex) return None # FUNCTION 2: Fetch daily traffic trends def fetch_daily_trends(conn): """Fetches daily traffic trends.""" query = """ SELECT CAST(DATEADD(s, event_timestamp / 1000000, '1970-01-01') AS DATE) AS event_date, COUNT(DISTINCT user_pseudo_id) AS daily_users, COUNT(DISTINCT session_id) AS daily_sessions, COUNT(CASE WHEN event_name = 'page_view' THEN 1 END) AS daily_page_views FROM events GROUP BY CAST(DATEADD(s, event_timestamp / 1000000, '1970-01-01') AS DATE) ORDER BY event_date; """ try: df = pd.read_sql(query, conn) df['event_date'] = pd.to_datetime(df['event_date']) return df except Exception as e: print(f"Error fetching daily trends: {e}") return None # FUNCTION 3: Fetch sessions by device category def fetch_sessions_by_device(conn): """Fetches sessions count by device category.""" query = """ SELECT device_category, COUNT(DISTINCT session_id) AS total_sessions FROM events WHERE device_category IS NOT NULL AND device_category != '' GROUP BY device_category ORDER BY total_sessions DESC; """ try: df = pd.read_sql(query, conn) return df except Exception as e: print(f"Error fetching sessions by device: {e}") return None # FUNCTION 4: Fetch sessions by traffic source def fetch_sessions_by_source(conn): """Fetches sessions count by traffic source.""" query = """ SELECT traffic_source, COUNT(DISTINCT session_id) AS total_sessions FROM events WHERE traffic_source IS NOT NULL AND traffic_source != '' GROUP BY traffic_source ORDER BY total_sessions DESC; """ try: df = pd.read_sql(query, conn) return df except Exception as e: print(f"Error fetching sessions by source: {e}") return None # FUNCTION 5: Visualize KPIs def visualize_kpis(conn): """ Fetches data and creates various KPI visualizations using Matplotlib and Seaborn. """ # Set Seaborn style for better aesthetics sns.set_style("whitegrid") plt.rcParams['figure.figsize'] = (10, 6) # Default figure size # --- Plot 1: Daily Traffic Trends (Users, Sessions, Page Views) --- daily_trends_df = fetch_daily_trends(conn) if daily_trends_df is not None and not daily_trends_df.empty: plt.figure(figsize=(12, 7)) plt.plot(daily_trends_df['event_date'], daily_trends_df['daily_users'], label='Daily Users', marker='o', markersize=4) plt.plot(daily_trends_df['event_date'], daily_trends_df['daily_sessions'], label='Daily Sessions', marker='x', markersize=4) plt.plot(daily_trends_df['event_date'], daily_trends_df['daily_page_views'], label='Daily Page Views', marker='s', markersize=4) plt.title('Daily Website Traffic Trends', fontsize=16) plt.xlabel('Date', fontsize=12) plt.ylabel('Count', fontsize=12) plt.legend(fontsize=10) plt.grid(True, linestyle='--', alpha=0.7) plt.tight_layout() plt.savefig('E:/SankalanAnalytics/dashboard/daily_traffic_trends.png') plt.close() # Close plot to free memory print("Generated daily_traffic_trends.png") else: print("Skipping daily traffic trends plot: No data or empty DataFrame.") # --- Plot 2: Sessions by Device Category (Bar Chart) --- device_df = fetch_sessions_by_device(conn) if device_df is not None and not device_df.empty: plt.figure(figsize=(9, 6)) sns.barplot(x='device_category', y='total_sessions', data=device_df, palette='viridis') plt.title('Total Sessions by Device Category', fontsize=16) plt.xlabel('Device Category', fontsize=12) plt.ylabel('Total Sessions', fontsize=12) plt.tight_layout() plt.savefig('E:/SankalanAnalytics/dashboard/sessions_by_device.png') plt.close() print("Generated sessions_by_device.png") else: print("Skipping sessions by device plot: No data or empty DataFrame.") # --- Plot 3: Sessions by Traffic Source (Bar Chart) --- source_df = fetch_sessions_by_source(conn) if source_df is not None and not source_df.empty: plt.figure(figsize=(10, 7)) sns.barplot(x='total_sessions', y='traffic_source', data=source_df, palette='magma') plt.title('Total Sessions by Traffic Source', fontsize=16) plt.xlabel('Total Sessions', fontsize=12) plt.ylabel('Traffic Source', fontsize=12) plt.tight_layout() plt.savefig('E:/SankalanAnalytics/dashboard/sessions_by_source.png') plt.close() print("Generated sessions_by_source.png") else: print("Skipping sessions by traffic source plot: No data or empty DataFrame.") # --- Plot 4: Simple Pie Chart for Device Distribution --- if device_df is not None and not device_df.empty: plt.figure(figsize=(8, 8)) plt.pie(device_df['total_sessions'], labels=device_df['device_category'], autopct='%1.1f%%', startangle=90, colors=sns.color_palette('pastel')) plt.title('Session Distribution by Device Category', fontsize=16) plt.axis('equal') # Equal aspect ratio ensures that pie is drawn as a circle. plt.tight_layout() plt.savefig('E:/SankalanAnalytics/dashboard/device_distribution_pie.png') plt.close() print("Generated device_distribution_pie.png") else: print("Skipping device distribution pie chart: No data or empty DataFrame.") # MAIN EXECUTION 1: This block runs when the script starts if __name__ == "__main__": # MAIN EXECUTION 2: Connect to the database conn = connect_to_db() if conn: # MAIN EXECUTION 3: Visualize KPIs visualize_kpis(conn) # MAIN EXECUTION 4: Close the database connection conn.close() print("Database connection closed.") else: print("Could not establish database connection. Exiting.")


Important Notes on This Code:

This Python script connects to your SQL Server database to fetch key web analytics metrics. It then uses Matplotlib and Seaborn to create a set of static visualizations that give clear insights into your website’s performance.

The script generates several charts, including daily traffic trends (users, sessions, and page views), sessions by device category, sessions by traffic source, and a pie chart showing device distribution.

Each chart is saved automatically as a .png image in your E:\SankalanAnalytics\dashboard\ folder. Before running the script, be sure to update the DB_CONFIG section with your actual SQL Server connection details — including the server name, database name, username, and password.

If you haven’t already installed the required libraries, use the following command:
pip install matplotlib seaborn


Understanding Your Python KPI Visualization Script

This Python script is designed to create professional quality static visualizations of your key web analytics metrics. It is a powerful way to prepare charts for reports presentations or even for embedding in other documents. Let us break down each part of the code.

1. Setting Up Your Tools and Connections (Refer to SCRIPT SETUP 1, 2 in code)

At the very top of the script you see some import lines. These lines bring in the necessary tools for our work.

import pandas as pd (SCRIPT SETUP 1) This brings in Pandas for data manipulation.

import pyodbc (SCRIPT SETUP 1) This lets Python talk to your SQL Server database.

import matplotlib.pyplot as plt (SCRIPT SETUP 1) This is the primary plotting library.

import seaborn as sns (SCRIPT SETUP 1) This library is built on Matplotlib and provides enhanced statistical plots and aesthetics.

from datetime import datetime (SCRIPT SETUP 1) This is used for working with dates and times.


Next you see DB_CONFIG.
• DB_CONFIG (SCRIPT SETUP 2) This holds all the details for connecting to your SQL Server. You need to update YOUR SQL SERVER NAME YOUR DATABASE NAME YOUR USERNAME and YOUR PASSWORD with your actual information.

2. Connecting to Your Database connect_to_db (Refer to FUNCTION 1 in code)

This function establishes the connection to your SQL Server database.

What it does It tries to open a link to your SQL Server using the details from DB_CONFIG

How it works It builds a connection string. This string helps 'pyodbc' find and log in to your database. It then attempts the connection.

Safety check It prints messages to the console if the connection is successful or if there is an error.


3. Data Fetching Functions (Refer to FUNCTIONS 2, 3, 4 in code)

These functions are responsible for retrieving specific datasets needed for our visualizations from the SQL Server database.

fetch_daily_trends(conn) (FUNCTION 2) This function fetches daily users, sessions, and page views. It is best suited for creating time series visualizations that show trends over time.

fetch_sessions_by_device(conn) (FUNCTION 3) This retrieves the total number of sessions grouped by device category such as mobile, desktop, or tablet. It helps you understand which devices users prefer when visiting your website.

fetch_sessions_by_source(conn) (FUNCTION 4) This gathers session counts grouped by traffic source — like organic search, direct visits, or referral traffic — giving insight into where your users are coming from.

How they work Each function runs a SQL query using pd.read_sql to pull the data directly into a Pandas DataFrame. They also include error handling to ensure that even if there’s an issue with the query or connection, the script won’t break unexpectedly.


4. Visualize KPIs visualize_kpis (Refer to FUNCTION 5 and its internal steps)

This is the main function where the plotting happens. It orchestrates the data fetching and visualization process.

What it does It sets up the plotting style, calls the data fetching functions, and then creates and saves four different types of plots.

How it works The lines sns.set_style("whitegrid") and plt.rcParams set a clean aesthetic style for all plots and define a default figure size, making your charts look professional.

For Plot 1, Daily Traffic Trends, it fetches daily trend data using fetch_daily_trends. It then uses plt.plot() to create a line chart showing daily users, sessions, and page views over time. The plot includes a title, labels, legend, and grid for clarity. The plot is saved as daily_traffic_trends.png in your dashboard/ folder using plt.savefig(). Finally, plt.close() is called to free up memory, which is important when generating multiple plots.

Plot 2, Sessions by Device Category, fetches device session data using fetch_sessions_by_device. It uses sns.barplot() to create a bar chart comparing sessions across different device categories. Seaborn simplifies creating aesthetically pleasing bar charts. This plot is saved as sessions_by_device.png.

Plot 3, Sessions by Traffic Source, fetches traffic source data using fetch_sessions_by_source. It uses sns.barplot() again to compare sessions by traffic source and saves the plot as sessions_by_source.png.

Plot 4 is a Simple Pie Chart for Device Distribution. It reuses the device_df data and uses plt.pie() to create a pie chart showing the proportion of sessions from each device category. The autopct parameter formats the percentages directly on the slices. This plot is saved as device_distribution_pie.png.

Output The function prints messages to the console indicating which plots have been generated and saved successfully.


5. Running the Script The Main Execution Block (Refer to MAIN EXECUTION 1 to 4 in code)

This part of the script tells Python to run your visualization process when the file is executed.

What it does The if __name__ == "__main__": construct ensures that the code inside this block only runs when the script is executed directly.

How it works The script first attempts to connect to the database. If successful, it calls the visualize_kpis() function to generate and save all the plots. Finally, it closes the database connection to free up resources.


Overall Value of Python KPI Visualizations

Creating visualizations with Matplotlib and Seaborn is a fundamental skill for any data analyst or data scientist. It allows you to produce highly customized and professional plots that can effectively communicate complex insights. These static plots are perfect for formal reports presentations or even for sharing as standalone images. This demonstrates your ability to not only analyze data but also to present your findings in a clear compelling and visually appealing manner. This is a vital skill for data storytelling and influencing business decisions.


Next Steps

You have successfully created various KPI visualizations using Python. This means you are now proficient in generating professional static charts. The next exciting phase will be to visualize page level metrics and drop heatmaps. This will allow you to create even more detailed and actionable visualizations focusing on individual page performance and user drop off points.

For now make sure you save this Python script in your E drive SankalanAnalytics dashboard folder. Name it: visualize_kpis.py.

To run your visualization script:

1. Open your Command Prompt (CMD).

2. Navigate to your E:\SankalanAnalytics\dashboard\ folder:
cd E:\SankalanAnalytics\dashboard\

3. Run the Python script:
python visualize_kpis.py

4. After execution, you will find the generated .png image files in the same dashboard/ folder.


Previous Topic==> Dashboard with Strealit  Next Topics==> Page Level Metric


SQL All topics Wise Interview Questions    Employee Salary Management SQL FAQ!.   C FAQ   Top 25 PL/SQL Interview Questions

Joins With Group by Having  Equi Join  Joins with Subqueries  Self Join  Outer Join 


Website Analytics Project: Phases and Action Steps

  • Home
  • 🟢 Live App: Web Analytics Simulator
  • Phase 0: Project Setup & Problem Definition
  • 0.1 Define Project Goals & Challenges
  • 0.2 Select Tools, Tech Stack & Data Sources
  • 0.3 Software Requirements & Installation
  • 0.4 Folder Structure & GitHub Repo
  • 0.5 Testing Project Locally
  • Phase 1: Planning for Analytics
  • 1.1 Website Analytics Project Overview
  • 1.2 Define KPIs, Bounce Rate, Engagement
  • 1.3 Identify Target Users & Pain Points
  • Phase 2: Data Collection
  • 2.1 Setup Google Analytics 4 (GA4)
  • 2.2 Export GA4 Data to BigQuery/CSV
  • 2.3 Design SQL Schema for Web Analytics
  • Phase 3: Data Cleaning & Feature Engineering
  • 3.1 Clean Website Data with Python & Pandas
  • 3.2 Create Custom Metrics (Session, Bounce, etc.)
  • Phase 4: Exploratory Data Analysis (EDA)
  • 4.1 Analyze Website Traffic Trends
  • 4.2 Behavior by Device, Source, Location
  • 4.3 Top Pages & High Bounce Pages
  • 4.4 Diagnose Low Traffic & User Drop
  • Phase 5: Business Insights
  • 5.1 Funnel Analysis & Drop-Off Points
  • 5.2 New vs Returning Users
  • 5.3 Time Spent & Scroll Depth
  • Phase 6: SQL for Business
  • 6.1 SQL for Business Insights
  • 6.2 Combine Web Data Using SQL
  • 6.3 Find Problematic Pages Using SQL
  • Phase 7: Machine Learning
  • 7.1 Segment Users with Clustering
  • 7.2 Predict Bounce Rate with ML
  • 7.3 Recommend Pages or Content
  • Phase 8: Dashboards & Visualization
  • 8.1 Dashboard with Streamlit
  • 8.2 Visualize KPIs with Python
  • 8.3 Page-Level Metrics & Drop Heatmaps
  • Phase 9: Final Analytics Story
  • 9.1 Summary Report & Findings
  • Phase 10: Hosting & Portfolio Presentation
  • 10.1 Host Website Project Online
  • 10.2 Add to GitHub with ReadMe
  • 10.3 Pitch Project in Interview
  • Other Topics
  • SQL Interview Questions
  • SQL Case Study: Account Management
  • Python Interview Questions
  • Why C Language

Get in touch

  • tech2dsm@gmail.com

© Sankalan Data Tech. All rights reserved.