Module 1 — Lesson 3 of 8

Dashboards & Metrics

Navigate the Performance dashboards, understand key metrics, and learn how to drill down from fleet-wide views to individual endpoints.
📚 Overview
🔧 Deep Dive
🛠 Hands-On
Check
📈 6 Metric Categories
📊 4 Dashboard Levels
🔍 5 Time Range Options
🎯 Top-Down Drill-Down Approach

Dashboard Drill-Down Hierarchy

Fleet Overview -> Computer Group (Claims, IT) -> Endpoint List -> Endpoint Detail
Start broad (fleet) -> Narrow to group -> Drill into endpoint

The Performance dashboards are organized as a funnel: start broad at the fleet level, narrow to a computer group, then drill into the specific endpoints that need attention. This top-down approach helps you spend your time on the machines that matter most.

Six categories of metrics are tracked: CPU utilization, memory pressure, disk I/O and space, boot/login times, application crashes/hangs, and network latency. Each tells a different story about the endpoint experience.

Simulated Performance Overview Dashboard

Tanium Console -- Performance Overview
Overview
By Group
Endpoints
Alerts
Fleet Health Score: 7.6
Good/Excellent: 72%
Fair: 22%
Poor: 6%

Fleet Averages -- Key Metrics

CPU Usage: 35%
Memory Pressure: 62%
Disk Usage: 48%
Avg Boot Time: 2m 45s

Top Issues Detected

Issue                            Affected Endpoints   Severity
Disk space below 10%             47                   High
Boot time > 5 minutes            38                   High
High memory pressure (>90%)      29                   Medium
Frequent app crashes (>3/day)    15                   Medium
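
The thresholds in the Top Issues table can be read as simple rules applied to each endpoint. The sketch below is illustrative only: the `EndpointSample` fields and function names are hypothetical and are not part of any Tanium API. Only the four thresholds and severities come from the table above.

```python
from dataclasses import dataclass


@dataclass
class EndpointSample:
    """Hypothetical per-endpoint snapshot; field names are assumptions."""
    name: str
    disk_free_pct: float
    boot_minutes: float
    memory_pressure_pct: float
    crashes_per_day: float


# (issue label, severity, rule) using the thresholds from the Top Issues table.
RULES = [
    ("Disk space below 10%", "High", lambda e: e.disk_free_pct < 10),
    ("Boot time > 5 minutes", "High", lambda e: e.boot_minutes > 5),
    ("High memory pressure (>90%)", "Medium", lambda e: e.memory_pressure_pct > 90),
    ("Frequent app crashes (>3/day)", "Medium", lambda e: e.crashes_per_day > 3),
]


def detect_issues(endpoint):
    """Return (label, severity) for every rule the endpoint trips."""
    return [(label, sev) for label, sev, rule in RULES if rule(endpoint)]


def count_affected(endpoints):
    """Count how many endpoints trip each rule, like the Affected Endpoints column."""
    counts = {label: 0 for label, _, _ in RULES}
    for endpoint in endpoints:
        for label, _ in detect_issues(endpoint):
            counts[label] += 1
    return counts


# Hypothetical endpoint: 12% disk free, 6.2 minute boot, 95% memory pressure.
sample = EndpointSample("EP-EXAMPLE", 12, 6.2, 95, 0)
for label, severity in detect_issues(sample):
    print(f"{sample.name}: {label} [{severity}]")
```

One endpoint can trip several rules at once, so the Affected Endpoints counts in the table may overlap rather than add up to a total.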

The Six Metric Categories

Drill-Down Workflow: Fleet to Endpoint

Spot the Trend

On the Overview dashboard, notice Claims Department health dropped from 7.8 to 6.2 this week.

Drill into Group

Click Claims Department -- see that 45 of 200 endpoints are now "Fair" or "Poor."

Sort by Score

Sort the endpoint list ascending -- worst-performing machines appear at the top.

Examine Endpoint

Click the lowest-scoring endpoint. The detail view shows CPU at 92% (driven by SearchIndexer.exe), memory at 95%, and a 6-minute boot time.

Take Action

Initiate remediation -- restart the process, deploy a fix, or flag the endpoint for a technician visit.
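
The workflow above amounts to filter, then sort ascending, then inspect the first result. Here is a minimal sketch of that logic on made-up data. The endpoint names, scores, and the `fair_or_poor_below` cutoff are all assumptions for illustration; the lesson does not state the score boundaries that Performance uses.

```python
def drill_down(endpoints, group, fair_or_poor_below=6.0):
    """Narrow to one computer group and rank its endpoints worst-first.

    fair_or_poor_below is a hypothetical cutoff, not a documented threshold.
    """
    members = [e for e in endpoints if e["group"] == group]
    flagged = [e for e in members if e["score"] < fair_or_poor_below]
    # Ascending sort puts the worst-performing machines at the top.
    worst_first = sorted(members, key=lambda e: e["score"])
    return flagged, worst_first


# Hypothetical sample data.
endpoints = [
    {"name": "EP-001", "group": "Claims Department", "score": 7.9},
    {"name": "EP-002", "group": "Claims Department", "score": 4.2},
    {"name": "EP-003", "group": "IT Department", "score": 8.4},
    {"name": "EP-004", "group": "Claims Department", "score": 5.6},
]

flagged, worst_first = drill_down(endpoints, "Claims Department")
print(f"{len(flagged)} of {len(worst_first)} endpoints need attention")
print("Examine first:", worst_first[0]["name"])
```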

Time Range Filters

Time Range       Best For
Last 1 hour      Real-time troubleshooting during an active incident
Last 24 hours    Day-over-day comparison
Last 7 days      Weekly trend analysis (most common view)
Last 30 days     Monthly reporting and long-term trends
Custom range     Before/after comparisons around specific changes
Tip: Before/After Comparisons

To investigate the impact of a change (for example, "did Tuesday's Windows update affect boot times?"), use the custom time range to compare the week before the change with the week after. The trend graphs in Performance make this visual comparison straightforward.
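
If you have exported the underlying numbers, the same before/after comparison can be done as plain arithmetic. The sketch below assumes a hypothetical list of (date, boot time in seconds) samples. The data, function name, and 7-day window are illustrative and do not reflect a Performance export format.

```python
from datetime import date, timedelta
from statistics import mean


def before_after(samples, change_day, window_days=7):
    """Compare the mean of a metric in the window before and after a change.

    samples: iterable of (date, value). The change day itself counts as "after".
    """
    start = change_day - timedelta(days=window_days)
    end = change_day + timedelta(days=window_days)
    before = [v for d, v in samples if start <= d < change_day]
    after = [v for d, v in samples if change_day <= d < end]
    if not before or not after:
        raise ValueError("need samples on both sides of the change")
    return mean(before), mean(after), mean(after) - mean(before)


# Hypothetical boot times (seconds) around an update on Tuesday 2026-01-13.
samples = [
    (date(2026, 1, 8), 150),
    (date(2026, 1, 12), 160),
    (date(2026, 1, 13), 200),
    (date(2026, 1, 16), 210),
]

avg_before, avg_after, delta = before_after(samples, date(2026, 1, 13))
print(f"before={avg_before}s after={avg_after}s change={delta:+}s")
```

A positive change means boot times got slower after the update. Using equal windows on both sides keeps the two averages comparable.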

Simulated: Endpoint Detail View

Tanium Console -- Endpoint Detail: CAEI778234
Overview
By Group
Endpoints
Detail
Health Score: 4.2
CPU Avg: 92%
Memory Used: 95%
Last Boot: 6m 12s

Resource Breakdown

CPU: 92%
Memory: 95%
Disk I/O Queue: 4.5
Disk Free: 12%

Top Processes

Process                 CPU %   Memory MB
SearchIndexer.exe       48%     312
chrome.exe (12 tabs)    22%     1,840
outlook.exe             8%      420
Teams.exe               6%      380
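
When reading the Top Processes table, rank by each resource separately, because the process using the most CPU is often not the one using the most memory. A short sketch using the values from the table above (the dictionary keys and the `top_consumer` helper are hypothetical names chosen for illustration):

```python
# Values copied from the simulated Top Processes table.
processes = [
    {"name": "SearchIndexer.exe", "cpu_pct": 48, "memory_mb": 312},
    {"name": "chrome.exe", "cpu_pct": 22, "memory_mb": 1840},
    {"name": "outlook.exe", "cpu_pct": 8, "memory_mb": 420},
    {"name": "Teams.exe", "cpu_pct": 6, "memory_mb": 380},
]


def top_consumer(procs, metric):
    """Return the name of the process with the highest value for one metric."""
    return max(procs, key=lambda p: p[metric])["name"]


print("Top CPU consumer:   ", top_consumer(processes, "cpu_pct"))
print("Top memory consumer:", top_consumer(processes, "memory_mb"))
```

On this endpoint the two rankings point at different processes, so the 92% CPU and the 95% memory readings may call for separate fixes.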

🤔 What Would You Do?

It's Monday morning and you open the Performance Overview dashboard. You notice that the fleet-wide health score dropped from 7.5 to 5.8 over the weekend. The Health Distribution chart shows that 30% of endpoints moved from "Good" to "Fair." The trend line shows the drop happened gradually between Saturday 2:00 AM and Saturday 6:00 AM.

What is the most likely cause, and what should you investigate first?

Correct! A gradual decline between 2:00 AM and 6:00 AM on a Saturday is a classic signature of a scheduled maintenance window. Check your WSUS/SCCM/Intune update history and any change management records for that window.
Not quite. The timing (Saturday 2:00-6:00 AM) and the gradual spread across 30% of endpoints strongly suggest a scheduled maintenance activity -- Windows Update, a software deployment, or a Group Policy change.

Match the Metric to Its Description

Drag each metric on the left to its correct description on the right.

CPU Utilization
Memory Pressure
Disk Queue Length
Boot Time
Application Crashes
Duration from power-on to a usable desktop
Percentage of processing capacity in use
Process terminations from unhandled exceptions
Number of pending disk read/write operations
How close the system is to exhausting physical RAM
All matches correct! You have a solid understanding of the key Performance metrics.
Some matches are incorrect. Review the metric descriptions in the Deep Dive tab and try again.

Walkthrough: Drill Down to Root Cause

Follow this step-by-step simulated walkthrough to practice the drill-down workflow.

Tanium Console -- Computer Group View
Overview
By Group
Endpoints
Alerts

Health Scores by Department

Computer Group       Endpoints   Avg Score   Trend (7d)
IT Department        85          8.2         +0.1
Underwriting         320         7.1         0.0
Customer Service     450         6.8         -0.3
Claims Processing    200         5.4         -1.6
Remote Workers       280         7.4         +0.2

Action: Claims Processing stands out with a 1.6-point drop. Click into this group to investigate.
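
Picking the group to investigate is a matter of finding the most negative 7-day trend. A minimal sketch using the rows from the table above (the tuple layout and the `steepest_decline` helper are assumptions made for this example):

```python
# (computer group, endpoints, avg score, 7-day trend) from the table above.
groups = [
    ("IT Department", 85, 8.2, 0.1),
    ("Underwriting", 320, 7.1, 0.0),
    ("Customer Service", 450, 6.8, -0.3),
    ("Claims Processing", 200, 5.4, -1.6),
    ("Remote Workers", 280, 7.4, 0.2),
]


def steepest_decline(rows):
    """Return the row with the most negative trend, or None if nothing declined."""
    declining = [row for row in rows if row[3] < 0]
    return min(declining, key=lambda row: row[3]) if declining else None


worst = steepest_decline(groups)
if worst is not None:
    name, endpoints, score, trend = worst
    print(f"Investigate {name}: score {score}, trend {trend:+.1f} across {endpoints} endpoints")
```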

✍ Knowledge Check

1. In the Performance Overview dashboard, what does the Health Distribution chart show?

Correct! The Health Distribution chart gives you a quick visual breakdown of how many endpoints fall into each health category.
Not quite. The Health Distribution chart shows the percentage of endpoints categorized as Excellent, Good, Fair, or Poor -- giving you a quick sense of overall fleet health distribution.

2. Which metric would help you distinguish between a "slow machine" problem and a "slow network" problem?

Correct! Network latency (round-trip time) helps you determine if the user's slow experience is caused by the local machine or by network conditions.
Not quite. Network Latency is the metric that separates local machine problems from network problems. If CPU, memory, and disk are healthy but network RTT is high, the slowness is network-related.
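
The reasoning in question 2 can be written as a small decision rule. The sketch below is illustrative only: every threshold is a hypothetical placeholder, since the lesson does not define what counts as "healthy" for each metric. Substitute the limits your team actually uses.

```python
def classify_slowness(cpu_pct, memory_pct, disk_queue, rtt_ms,
                      cpu_max=85, memory_max=90, disk_queue_max=2.0, rtt_max_ms=150):
    """Separate local machine problems from network problems.

    All *_max defaults are assumed values for illustration, not documented limits.
    """
    local_unhealthy = (cpu_pct > cpu_max
                       or memory_pct > memory_max
                       or disk_queue > disk_queue_max)
    network_unhealthy = rtt_ms > rtt_max_ms
    if local_unhealthy and network_unhealthy:
        return "both"
    if network_unhealthy:
        return "network"
    if local_unhealthy:
        return "local machine"
    return "no obvious cause"


# Healthy CPU, memory, and disk but high round-trip time: network-related.
print(classify_slowness(cpu_pct=35, memory_pct=62, disk_queue=0.5, rtt_ms=400))
# Saturated CPU and memory with normal round-trip time: local machine.
print(classify_slowness(cpu_pct=92, memory_pct=95, disk_queue=4.5, rtt_ms=20))
```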
Mercury Insurance — Digital Workplace Team
DEX Specialization Training © 2026