Active Directory Health Check: 10 Things to Audit Before They Break

Active Directory is one of those things that works fine until it doesn't, and when it doesn't, everything breaks at once. Logins fail, GPOs stop applying, Exchange can't find mailboxes. Here are the 10 checks you should be running regularly — before your Monday morning starts with a P1.

Why AD health checks matter

Most small IT teams don't audit Active Directory until something visibly breaks. The problem is that AD degrades silently. Replication falls behind, DNS records go stale, SYSVOL stops syncing, and nobody notices because logins still work — until they don't. By the time you see symptoms, the underlying issue has been festering for weeks.

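A regular health check catches these problems early. Run it at least monthly (the schedule near the end suggests daily or weekly for the most critical checks), and a replication failure or a GPO that stopped applying two weeks ago is far less likely to catch you by surprise. Here's what to check and how to check it.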

1. AD replication status

If replication is broken, changes made on one DC don't propagate to others. A password reset on DC1 doesn't reach DC2, so the user can log in on one site but not another. Group membership changes don't replicate, so access is inconsistent. This is the single most important thing to check.

# Check replication status across all DCs
repadmin /replsummary

# Show detailed replication failures
# CSV fields arrive as strings, so cast the failure count before comparing
repadmin /showrepl * /csv | ConvertFrom-Csv |
  Where-Object { [int]$_."Number of Failures" -gt 0 } |
  Format-Table "Source DSA", "Naming Context", "Number of Failures", "Last Failure Time"

# Quick PowerShell check
Get-ADReplicationFailure -Target (Get-ADDomainController -Filter *).HostName

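What to look for: Any DC showing failures is a problem, and so is a "last success" timestamp older than about 1 hour between DCs in the same site. Within a site, replication should complete within minutes on a healthy domain. Between sites it follows the site link schedule (180 minutes by default), so judge the timestamp against that interval. If you see errors referencing RPC or DNS, the issue is usually network or DNS, not AD itself.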
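
The rule of thumb above can be applied mechanically to the repadmin CSV output. The following is a minimal sketch, not a definitive implementation: the function name `Get-StaleReplicationLink` and its `-MaxAgeMinutes` and `-Now` parameters are hypothetical, and it assumes the "Last Success Time" column parses as a date on your system.

```powershell
# Hypothetical helper: flags replication links that report failures or whose
# last success is older than a threshold. Expects rows shaped like the output
# of `repadmin /showrepl * /csv | ConvertFrom-Csv`.
function Get-StaleReplicationLink {
  param(
    [Parameter(Mandatory)] [object[]] $Row,
    [int] $MaxAgeMinutes = 60,          # assumption: tune per site link schedule
    [datetime] $Now = (Get-Date)
  )
  foreach ($r in $Row) {
    $failures = [int]$r.'Number of Failures'
    $lastOk = [datetime]::MinValue
    $parsed = [datetime]::TryParse($r.'Last Success Time', [ref]$lastOk)
    # An unparseable or missing timestamp is treated as "never succeeded"
    $ageMinutes = if ($parsed) { ($Now - $lastOk).TotalMinutes } else { [double]::PositiveInfinity }
    if ($failures -gt 0 -or $ageMinutes -gt $MaxAgeMinutes) {
      [pscustomobject]@{
        Source        = $r.'Source DSA'
        Destination   = $r.'Destination DSA'
        NamingContext = $r.'Naming Context'
        Failures      = $failures
        LastSuccess   = if ($parsed) { $lastOk } else { $null }
      }
    }
  }
}

# Usage against a live domain:
# $Rows = repadmin /showrepl * /csv | ConvertFrom-Csv
# Get-StaleReplicationLink -Row $Rows -MaxAgeMinutes 60 | Format-Table
```

For links that cross sites, raising `-MaxAgeMinutes` to match the site link interval avoids false alarms.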

2. DNS records and SRV records

AD is utterly dependent on DNS. Domain-joined machines find DCs by querying SRV records. If those records are missing or pointing to decommissioned DCs, clients can't find a DC to authenticate against. Logins slow down, GPO processing fails, and Kerberos ticket requests time out.

# Verify SRV records for your domain
nslookup -type=srv _ldap._tcp.dc._msdcs.contoso.com

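# Check all critical SRV record types
$Domain = (Get-ADDomain).DNSRoot
$ForestRoot = (Get-ADForest).RootDomain  # global catalog records are registered under the forest root
$SrvTypes = @(
  "_ldap._tcp.dc._msdcs.$Domain",
  "_kerberos._tcp.dc._msdcs.$Domain",
  "_gc._tcp.$ForestRoot"
)

foreach ($Srv in $SrvTypes) {
  # Keep only SRV answers; Resolve-DnsName also returns the A records for each target
  $Records = @(Resolve-DnsName -Name $Srv -Type SRV -ErrorAction SilentlyContinue |
    Where-Object { $_.Type -eq 'SRV' })
  if ($Records.Count -gt 0) {
    Write-Host "[OK] $Srv - $($Records.Count) records" -ForegroundColor Green
  } else {
    Write-Host "[FAIL] $Srv - No records found" -ForegroundColor Red
  }
}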

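Common issue: Stale DNS records pointing to decommissioned DCs. If a DC was removed without a clean demotion (dcpromo on Windows Server 2008 R2 and earlier, Uninstall-ADDSDomainController or Server Manager on 2012 and later), or the demotion failed, its SRV records linger in DNS. Clean up the DC's leftover metadata in AD, then delete the stale records manually.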
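
Spotting those leftovers comes down to comparing the hosts named in the SRV records against the DCs that actually exist. The following is a rough sketch under that assumption; `Find-StaleSrvTarget` and its parameters are hypothetical names, not part of any module.

```powershell
# Hypothetical helper: returns SRV targets that do not correspond to a current DC.
function Find-StaleSrvTarget {
  param(
    [Parameter(Mandatory)] [AllowEmptyCollection()] [string[]] $SrvTarget,
    [Parameter(Mandatory)] [string[]] $KnownDC
  )
  # Normalize both sides: lower-case and strip the trailing dot some DNS answers carry
  $known = $KnownDC | ForEach-Object { $_.TrimEnd('.').ToLowerInvariant() }
  $SrvTarget |
    ForEach-Object { $_.TrimEnd('.').ToLowerInvariant() } |
    Sort-Object -Unique |
    Where-Object { $_ -notin $known }
}

# Usage against a live domain (needs the ActiveDirectory and DnsClient modules):
# $Domain  = (Get-ADDomain).DNSRoot
# $Targets = (Resolve-DnsName "_ldap._tcp.dc._msdcs.$Domain" -Type SRV |
#   Where-Object { $_.Type -eq 'SRV' }).NameTarget
# Find-StaleSrvTarget -SrvTarget $Targets -KnownDC (Get-ADDomainController -Filter *).HostName
```

Any name it returns is a candidate for cleanup, though it is worth confirming the host is really gone before deleting records.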

3. SYSVOL and NETLOGON share health

SYSVOL holds your GPO files and login scripts. If SYSVOL replication is broken (DFSR or the older FRS), GPOs applied on one DC don't replicate to others. Users get different policies depending on which DC they authenticate against. Login scripts work for some users and not others. This is maddening to troubleshoot if you don't check SYSVOL first.

# Check DFSR replication state
dfsrdiag ReplicationState

# Verify SYSVOL and NETLOGON shares exist on all DCs
$DCs = (Get-ADDomainController -Filter *).HostName
foreach ($DC in $DCs) {
  $Sysvol = Test-Path "\\$DC\SYSVOL"
  $Netlogon = Test-Path "\\$DC\NETLOGON"
  $Status = if ($Sysvol -and $Netlogon) { "OK" } else { "FAIL" }
  Write-Host "[$Status] $DC - SYSVOL:$Sysvol NETLOGON:$Netlogon"
}
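
A share that exists can still sit on a replicated folder that is not in a healthy state. On DFSR-based SYSVOL, each DC exposes a numeric state through the DfsrReplicatedFolderInfo WMI class. The sketch below translates that number into a readable name; `ConvertTo-DfsrStateName` is a hypothetical helper, and the commented query assumes SYSVOL has already been migrated from FRS to DFSR.

```powershell
# Hypothetical helper: maps the State value of DfsrReplicatedFolderInfo to a name.
function ConvertTo-DfsrStateName {
  param([Parameter(Mandatory)] [int] $State)
  $names = @{
    0 = 'Uninitialized'; 1 = 'Initialized'; 2 = 'Initial Sync'
    3 = 'Auto Recovery'; 4 = 'Normal';      5 = 'In Error'
  }
  if ($names.ContainsKey($State)) { $names[$State] } else { "Unknown ($State)" }
}

# Usage against live DCs (assumes DFSR-based SYSVOL and CIM/WinRM access):
# foreach ($DC in (Get-ADDomainController -Filter *).HostName) {
#   $Info = Get-CimInstance -ComputerName $DC -Namespace 'root\MicrosoftDFS' `
#     -ClassName DfsrReplicatedFolderInfo -Filter "ReplicatedFolderName='SYSVOL Share'"
#   Write-Host "$DC - SYSVOL state: $(ConvertTo-DfsrStateName -State $Info.State)"
# }
```

Anything other than Normal on a DC that has been running for a while deserves a closer look.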

4. FSMO role holders

Five Flexible Single Master Operation roles govern specific AD functions: Schema Master, Domain Naming Master, PDC Emulator, RID Master, and Infrastructure Master. If any of these are held by a DC that's offline or decommissioned, specific operations start failing. The PDC Emulator is the most visible — it handles password changes and time sync. If it's down, password resets seem to "not work" for users.

# Show all FSMO role holders
netdom query fsmo

# PowerShell equivalent
Get-ADDomain | Select-Object PDCEmulator, RIDMaster, InfrastructureMaster
Get-ADForest | Select-Object SchemaMaster, DomainNamingMaster

Action item: Verify every FSMO role is held by an active, reachable DC. If a role holder was decommissioned without transferring the role, you'll need to seize it — which is a separate (and more stressful) operation.
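
That verification can be scripted by comparing role holders against the DCs that currently respond. This is a sketch only: `Find-OrphanedFsmoRole` is a hypothetical name, and the commented usage treats a single ICMP ping as "reachable", which is a weak test if your firewalls block ping.

```powershell
# Hypothetical helper: lists FSMO roles whose holder is not among the reachable DCs.
function Find-OrphanedFsmoRole {
  param(
    [Parameter(Mandatory)] [hashtable] $RoleHolder,   # role name -> DC FQDN
    [Parameter(Mandatory)] [AllowEmptyCollection()] [string[]] $ReachableDC
  )
  foreach ($role in ($RoleHolder.Keys | Sort-Object)) {
    # -notin compares strings case-insensitively
    if ($RoleHolder[$role] -notin $ReachableDC) {
      [pscustomobject]@{ Role = $role; Holder = $RoleHolder[$role] }
    }
  }
}

# Usage against a live domain:
# $d = Get-ADDomain; $f = Get-ADForest
# $Roles = @{
#   PDCEmulator = $d.PDCEmulator; RIDMaster = $d.RIDMaster
#   InfrastructureMaster = $d.InfrastructureMaster
#   SchemaMaster = $f.SchemaMaster; DomainNamingMaster = $f.DomainNamingMaster
# }
# $Up = @((Get-ADDomainController -Filter *).HostName |
#   Where-Object { Test-Connection $_ -Count 1 -Quiet })
# Find-OrphanedFsmoRole -RoleHolder $Roles -ReachableDC $Up
```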

5. Stale user and computer accounts

Accounts that haven't logged in for 90+ days are either unused or belong to devices that are off-network. Either way, they're a security risk (dormant credentials) and an audit finding. Most compliance frameworks require periodic access reviews, and "200 accounts that haven't logged in since 2024" is a finding your auditor will flag.

# Find user accounts inactive for 90+ days
# LastLogonDate is derived from lastLogonTimestamp, which can lag real logons by up to 14 days.
# Accounts that have never logged on have no LastLogonDate and are not matched by this filter.
$Threshold = (Get-Date).AddDays(-90)
$StaleUsers = @(Get-ADUser -Filter { LastLogonDate -lt $Threshold -and Enabled -eq $true } `
  -Properties LastLogonDate, Department |
  Select-Object Name, SamAccountName, LastLogonDate, Department |
  Sort-Object LastLogonDate)
$StaleUsers | Export-Csv -Path ".\StaleUsers.csv" -NoTypeInformation

# Find computer accounts inactive for 90+ days
$StaleComputers = @(Get-ADComputer -Filter { LastLogonDate -lt $Threshold } `
  -Properties LastLogonDate, OperatingSystem |
  Select-Object Name, OperatingSystem, LastLogonDate |
  Sort-Object LastLogonDate)
$StaleComputers | Export-Csv -Path ".\StaleComputers.csv" -NoTypeInformation

# Count the in-memory results instead of re-reading the CSV files
Write-Host "Stale users: $($StaleUsers.Count)"
Write-Host "Stale computers: $($StaleComputers.Count)"
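
A date comparison on LastLogonDate does not match accounts that have never logged on, because they have no value to compare. The sketch below sorts accounts into three buckets so those are not missed; `Get-AccountActivityStatus` and its parameters are hypothetical names, and the 90-day default simply mirrors the threshold used above.

```powershell
# Hypothetical helper: labels each account as Active, Stale, or NeverLoggedOn.
function Get-AccountActivityStatus {
  param(
    [Parameter(Mandatory)] [object[]] $Account,
    [int] $StaleDays = 90,
    [datetime] $Now = (Get-Date)
  )
  $threshold = $Now.AddDays(-$StaleDays)
  foreach ($a in $Account) {
    $status = if ($null -eq $a.LastLogonDate) {
      'NeverLoggedOn'
    } elseif ($a.LastLogonDate -lt $threshold) {
      'Stale'
    } else {
      'Active'
    }
    [pscustomobject]@{
      SamAccountName = $a.SamAccountName
      LastLogonDate  = $a.LastLogonDate
      Status         = $status
    }
  }
}

# Usage against a live domain:
# $Users = Get-ADUser -Filter 'Enabled -eq $true' -Properties LastLogonDate
# Get-AccountActivityStatus -Account $Users | Group-Object Status
```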

6. Domain controller event logs

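The Directory Service event log on each DC records replication errors, schema issues, and other AD-specific problems. The key events to watch for: Event ID 1311 (the KCC cannot build a complete replication topology), 1864 (a partition has not replicated from one or more DCs within the expected interval), 1388 and 1988 (lingering objects), and 2042 (too long since the last replication with a partner, beyond the tombstone lifetime). Event ID 4013 (DNS waiting for AD to finish its initial synchronization at startup) is also worth watching, but it is written to the DNS Server log rather than the Directory Service log. Many AD problems announce themselves in event logs days before they become visible to users.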

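# Check for critical AD events on all DCs (last 7 days)
$DCs = (Get-ADDomainController -Filter *).HostName
# Event IDs worth calling out by name (4013 is left out: it lives in the DNS Server log)
$CriticalEvents = @(1311, 1388, 1864, 1925, 1926, 1988, 2042)

foreach ($DC in $DCs) {
  # Note: SilentlyContinue also hides connection errors, so an unreachable DC reports as OK
  $Events = @(Get-WinEvent -ComputerName $DC -FilterHashtable @{
    LogName = "Directory Service"
    Level = 1,2  # Critical and Error
    StartTime = (Get-Date).AddDays(-7)
  } -ErrorAction SilentlyContinue)

  if ($Events.Count -gt 0) {
    # Count how many of the returned events are on the watch list
    $Watched = @($Events | Where-Object { $_.Id -in $CriticalEvents })
    Write-Host "[WARN] $DC - $($Events.Count) critical/error events, $($Watched.Count) on the watch list" -ForegroundColor Yellow
    $Events | Select-Object TimeCreated, Id, Message -First 5 | Format-Table -Wrap
  } else {
    Write-Host "[OK] $DC - No critical events" -ForegroundColor Green
  }
}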

7. GPO version consistency

Every GPO has two version numbers: one in AD (the GPC) and one in SYSVOL (the GPT). When these don't match, the GPO may not apply correctly — or it may apply an outdated version. This is a symptom of SYSVOL replication issues, but it's worth checking directly because GPO problems are hard to trace back to their root cause.

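# Compare AD and SYSVOL GPO versions (both the computer and user halves)
# Get-GPO contacts the PDC Emulator by default; add -Server to test another DC
Get-GPO -All | ForEach-Object {
  $GpoName = $_.DisplayName
  foreach ($Side in 'Computer', 'User') {
    $ADVersion = $_.$Side.DSVersion
    $SysvolVersion = $_.$Side.SysvolVersion

    if ($ADVersion -ne $SysvolVersion) {
      Write-Host "[MISMATCH] $GpoName ($Side) - AD:$ADVersion SYSVOL:$SysvolVersion" -ForegroundColor Red
    }
  }
}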
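
If you read the raw number instead (the Version line in a GPO's GPT.INI file, or the versionNumber attribute in AD), it packs both halves into one value: user version times 65536, plus computer version. The sketch below unpacks it; `ConvertFrom-GpoVersionNumber` is a hypothetical helper name.

```powershell
# Hypothetical helper: splits a combined GPO version number into its two halves.
# Combined value = (user version * 65536) + computer version
function ConvertFrom-GpoVersionNumber {
  param([Parameter(Mandatory)] [long] $Version)
  [pscustomobject]@{
    UserVersion     = [int][math]::Floor($Version / 65536)
    ComputerVersion = [int]($Version % 65536)
  }
}

# Example: 131075 = (2 * 65536) + 3, so user version 2 and computer version 3
# ConvertFrom-GpoVersionNumber -Version 131075
```

Comparing the unpacked halves from AD and SYSVOL tells you which side of the GPO is out of step.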

8. Time synchronization

Kerberos authentication allows a maximum of 5 minutes time skew by default. If a client's clock drifts more than 5 minutes from the DC, Kerberos tickets are rejected and the user can't log in. The PDC Emulator should sync to an external NTP source, and all other DCs and clients sync from it. If this chain breaks, time drift starts accumulating.

# Check time source on each DC
$DCs = (Get-ADDomainController -Filter *).HostName
foreach ($DC in $DCs) {
  $Source = Invoke-Command -ComputerName $DC -ScriptBlock {
    w32tm /query /source
  }
  Write-Host "$DC syncs to: $Source"
}

# Verify the PDC uses an external source (not "Local CMOS Clock")
$PDC = (Get-ADDomain).PDCEmulator
w32tm /monitor /computers:$PDC

Red flag: If the PDC shows "Local CMOS Clock" as its time source, it's not syncing externally. Configure it to use time.windows.com or your organization's NTP server.
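
Knowing the time source is half the picture; the other half is how far each clock has actually drifted. The following is a minimal sketch that compares two UTC timestamps against the 5-minute Kerberos default. `Test-ClockSkew` and its parameters are hypothetical names, and the commented usage ignores network latency, which adds a small error to the measurement.

```powershell
# Hypothetical helper: reports the skew between two clocks and whether it is
# within the allowed limit (300 seconds matches the Kerberos default).
function Test-ClockSkew {
  param(
    [Parameter(Mandatory)] [datetime] $ReferenceUtc,
    [Parameter(Mandatory)] [datetime] $ObservedUtc,
    [int] $MaxSkewSeconds = 300
  )
  $skew = [math]::Abs(($ObservedUtc - $ReferenceUtc).TotalSeconds)
  [pscustomobject]@{
    SkewSeconds = [math]::Round($skew, 1)
    WithinLimit = ($skew -le $MaxSkewSeconds)
  }
}

# Usage against live DCs (remote time is read over PowerShell remoting):
# $PDC = (Get-ADDomain).PDCEmulator
# $Ref = Invoke-Command -ComputerName $PDC -ScriptBlock { (Get-Date).ToUniversalTime() }
# foreach ($DC in (Get-ADDomainController -Filter *).HostName) {
#   $Obs = Invoke-Command -ComputerName $DC -ScriptBlock { (Get-Date).ToUniversalTime() }
#   Test-ClockSkew -ReferenceUtc $Ref -ObservedUtc $Obs
# }
```

In practice you would alert well before the limit, since a DC that is already minutes off is a sign the sync chain is broken.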

9. AD Recycle Bin and tombstone lifetime

The AD Recycle Bin lets you recover deleted objects (users, groups, OUs) with all their attributes intact. But it only works if it's enabled, and it's not enabled by default. If someone accidentally deletes an OU with 200 users in it, you want the Recycle Bin to be there. Check if it's on, and if it's not, enable it. Enabling is a one-way operation that can't be undone, and it requires a forest functional level of Windows Server 2008 R2 or higher, but for most environments there's little reason not to.

# Check if AD Recycle Bin is enabled
(Get-ADOptionalFeature -Filter "Name -eq 'Recycle Bin Feature'").EnabledScopes

# If empty, enable it (irreversible but recommended)
# Enable-ADOptionalFeature 'Recycle Bin Feature' -Scope ForestOrConfigurationSet -Target (Get-ADForest).Name

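# Check tombstone lifetime (180 days on forests built on Windows Server 2003 SP1 or later)
$ConfigNC = (Get-ADRootDSE).configurationNamingContext
$Config = Get-ADObject "CN=Directory Service,CN=Windows NT,CN=Services,$ConfigNC" -Properties tombstoneLifetime
# An unset attribute means the legacy default of 60 days applies
$Days = if ($Config.tombstoneLifetime) { $Config.tombstoneLifetime } else { 60 }
Write-Host "Tombstone lifetime: $Days days"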

10. AD backup freshness

When was AD last backed up? If your last system state backup is older than your tombstone lifetime (default 180 days), it's useless — you can't restore from it. Most backup solutions capture system state, but it's worth verifying that AD is actually in the backup set and that a recent backup exists.

# Check last backup time via repadmin
repadmin /showbackup *

# Each directory partition is listed with a dSASignature line; its Org.Time/Date value
# shows when that partition was last backed up on the DC.
# If a partition has no backup recorded, or the date is older than 30 days, your backup strategy needs attention.

Best practice: Back up at least one DC's system state daily. Test a restore at least once a year. The backup you never tested is the backup that won't work when you need it.
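
The two thresholds in this section (30 days as a warning, tombstone lifetime as the point where a backup can no longer be restored) can be expressed as a small classifier. This is a sketch under those assumptions; `Get-BackupFreshness` and its parameters are hypothetical names, and you would supply the last backup date yourself from repadmin or your backup tool.

```powershell
# Hypothetical helper: classifies a last-backup date against the warning
# threshold and the tombstone lifetime.
function Get-BackupFreshness {
  param(
    [AllowNull()] [Nullable[datetime]] $LastBackup,
    [int] $WarnDays = 30,
    [int] $TombstoneDays = 180,
    [datetime] $Now = (Get-Date)
  )
  if ($null -eq $LastBackup) { return 'Never' }
  $age = ($Now - $LastBackup).TotalDays
  if ($age -gt $TombstoneDays) {
    'Unusable'   # older than tombstone lifetime: cannot be restored
  } elseif ($age -gt $WarnDays) {
    'Stale'
  } else {
    'Fresh'
  }
}

# Usage:
# Get-BackupFreshness -LastBackup (Get-Date '2024-01-15') -TombstoneDays 180
```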

Building a monthly health check routine

Running these 10 checks individually is effective but tedious. The value comes from making it a routine — a scheduled script that runs monthly (or weekly) and emails you a report. If everything is green, you glance at it and move on. If something is red, you catch it before users do.

The pattern is simple: run each check, collect the results into a structured object, and output an HTML report that highlights failures. Schedule it as a Windows Task on any domain-joined server that has the ActiveDirectory and GroupPolicy PowerShell modules (RSAT) installed, and email the report to your team.
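
As a rough illustration of that pattern, the sketch below wraps each result in a small object and renders the set as an HTML table with failing rows highlighted. The names `New-CheckResult` and `ConvertTo-HealthReportHtml`, the OK/WARN/FAIL status values, and the styling are all assumptions made for the example; this is not the downloadable script.

```powershell
# Hypothetical helper: one structured result per check.
function New-CheckResult {
  param(
    [Parameter(Mandatory)] [string] $Check,
    [Parameter(Mandatory)] [ValidateSet('OK', 'WARN', 'FAIL')] [string] $Status,
    [string] $Detail = ''
  )
  [pscustomobject]@{ Check = $Check; Status = $Status; Detail = $Detail }
}

# Hypothetical helper: renders results as an HTML page, highlighting WARN and FAIL rows.
function ConvertTo-HealthReportHtml {
  param([Parameter(Mandatory)] [object[]] $Result)
  $head = '<title>AD Health Check</title>' +
    '<style>table{border-collapse:collapse} td,th{border:1px solid #999;padding:4px 8px} ' +
    '.FAIL{background:#f8d7da} .WARN{background:#fff3cd}</style>'
  $html = $Result | ConvertTo-Html -Property Check, Status, Detail -Head $head
  # Tag each row whose Status cell is WARN or FAIL so the CSS above can color it
  $html = $html -replace '<tr>(<td>[^<]*</td><td>(WARN|FAIL)</td>)', '<tr class="$2">$1'
  $html -join "`n"
}

# Usage: collect results from the individual checks, then write the report
# $Results = @(
#   New-CheckResult -Check 'Replication' -Status 'OK'
#   New-CheckResult -Check 'Backup freshness' -Status 'FAIL' -Detail 'Last backup 45 days ago'
# )
# ConvertTo-HealthReportHtml -Result $Results | Set-Content -Path '.\ADHealthReport.html'
```

Keeping every check's output in the same three-field shape is what makes the report step trivial; the checks themselves can be as simple or as detailed as you like.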

| Check | Frequency | Impact if missed |
| --- | --- | --- |
| Replication | Daily | Critical |
| DNS SRV records | Weekly | Critical |
| SYSVOL health | Weekly | High |
| FSMO roles | Monthly | High |
| Stale accounts | Monthly | Medium |
| DC event logs | Weekly | High |
| GPO versions | Monthly | Medium |
| Time sync | Monthly | High |
| Recycle Bin status | Once (verify enabled) | Medium |
| Backup freshness | Weekly | Critical |

Get the automated health check script — free

Our AD Health Check script runs all 10 of these checks automatically and generates a clean HTML report. It checks replication, DNS, SYSVOL, FSMO roles, stale accounts, and more in a single run. No signup, no paywall — just download and run.

Need more than diagnostics? Our paid scripts handle the remediation — user onboarding, offboarding, and name changes across your hybrid environment.