When we deliver the Solution Optimization Assessment for Microsoft Sentinel, we check for (and often find) duplication in a couple of logging areas:
SecurityEvent
CommonSecurityLog/Syslog
Obviously, duplicated logs can lead to bad outcomes - increased ingestion cost, duplicated alerts, other side-effects - so they’re worth avoiding!
In this post I'll discuss SecurityEvent, some bad {UI or assumptions}, and an approach to fix it. The next post discusses CEF/Syslog.
Duplication in SecurityEvent
When using a modern Windows OS (“modern” here meaning “at least Windows Vista”), you can fairly easily identify duplication of records due to the unique EventRecordId attached to each event within a log.
This gives you a Logs query like:
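SecurityEvent
| where TimeGenerated > ago(1h)
// Skip records with no EventRecordId, which can't be matched reliably
| where not(isempty(EventRecordId))
// One row per distinct event; count_ > 1 means it was ingested more than once
| summarize count() by Computer, EventID, EventRecordId
| where count_ > 1
// List the duplicated Event IDs (up to 500) per computer
| summarize DuplicateEvents = make_set(EventID, 500) by Computer
| top 30 by array_length(DuplicateEvents)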
Which coincidentally was the same query used in the Sentinel SOA... (update 2025-04-15 - better version I think.)
Or, if you want to get clever (yep, I tried), something like this:
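// Total copies ingested per Computer and EventID, counting only records seen more than once
let duplicateEventCount = SecurityEvent
| where TimeGenerated > ago(1h)
| where not(isempty(EventRecordId))
| summarize count() by Computer, EventID, EventRecordId
| where count_ > 1
| summarize sum(count_) by Computer, EventID;
// Set of duplicated Event IDs per computer, as in the simpler query
SecurityEvent
| where TimeGenerated > ago(1h)
| where not(isempty(EventRecordId))
| summarize count() by Computer, EventID, EventRecordId
| where count_ > 1
| summarize DuplicateEvents = make_set(EventID, 500) by Computer
| top 100 by array_length(DuplicateEvents)
// Attach the per-EventID totals, then rank computers by duplicated volume
| join duplicateEventCount on $left.Computer == $right.Computer
| summarize DupEvents = sum(sum_count_) by Computer, tostring(DuplicateEvents)
| top 30 by DupEvents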
(I was dissatisfied with the old version)
If you trace back the cause of whatever Event IDs pop out, you'll usually find that overlapping Data Collection Rules (DCRs) have been applied to the same host.
Each DCR is its own pipeline, and can conceivably do very different things between the input stream and the output stream - filters and transformations - and it can be counter-intuitive to realize that:
If you create a DCR with Minimal security events, and/or
Create another DCR with Common security events and/or
Create another DCR with All security events
... and apply more than one to the same source ...
...you've configured those DCRs for duplication of events. (The Windows Security Events via AMA connector in Sentinel provides the UI for the above settings now.)
As an example, if you configured 2 DCRs to be Windows minimal security events, they'd both collect the (minimal) events and store them twice!
When the same source is listed in more than one DCR (read: pipeline), there's no implication that its events will stay in the same format or end up in the same location, so they're collected twice and processed twice… or N times, where N is the number of DCRs with the same source specified.
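To make the "N DCRs, N copies" point concrete, here's a minimal Python sketch of the behaviour described above. It's a toy model, not how AMA is implemented: the DCR names and the event IDs in each one are made up for illustration (they are not the real Minimal/Common/All definitions), and real DCRs select events with XPath queries rather than plain ID sets.

```python
# Hypothetical DCRs: name -> event IDs each one collects.
# The IDs are illustrative only, not the real Minimal/Common/All definitions.
DCRS = {
    "dcr-minimal-a": {4624, 4625, 4688},
    "dcr-minimal-b": {4624, 4625, 4688},
    "dcr-extra": {4624, 4720},
}


def copies_ingested(event_id, associated_dcrs, dcrs=DCRS):
    """Count the copies stored for one event: one per associated DCR that matches it."""
    return sum(1 for name in associated_dcrs if event_id in dcrs[name])


def simulate(events, associated_dcrs, dcrs=DCRS):
    """Map each (EventID, EventRecordId) pair to the number of copies ingested."""
    return {
        (event_id, record_id): copies_ingested(event_id, associated_dcrs, dcrs)
        for event_id, record_id in events
    }
```

With all three of these hypothetical DCRs associated to one machine, a single 4624 event is stored three times, because every pipeline that matches it emits its own copy. Nothing downstream de-duplicates on EventRecordId.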
I have duplicates, what should I do?
Well, unlink the DCRs causing the problem! Find one machine identified as The Problem, and then work backwards to find the DCRs linked to that machine.
Finding them
This Resource Graph query should do the job of showing you where multiple DCRs are applied to a machine, and from there you can find the DCR, evaluate the contents, and make sure there's no or minimal overlap. (Mostly pilfered from looking at what the UI does):
(Paste into Resource Graph Explorer)
insightsresources
| where type == "microsoft.insights/datacollectionruleassociations"
| extend associationId=tolower(id), resourceIdArr = split(tolower(id), "/providers/microsoft.insights/datacollectionruleassociations")
| extend resourceId = tostring(resourceIdArr[0])
| extend dcrId=properties["dataCollectionRuleId"], dceId=properties["dataCollectionEndpointId"]
| extend SrcName=tostring(split(resourceId,"/")[-1]), SrcRg = tostring(split(resourceId,"/")[4]), SrcSub = tostring(split(resourceId,"/")[2])
| extend DcrName = tostring(split(dcrId,"/")[-1]), DcrRg = tostring(split(dcrId,"/")[4]), DcrSub = tostring(split(dcrId,"/")[2])
| order by ['SrcSub'] asc
| project id, resourceId, SrcName, SrcRg, SrcSub, DcrName, DcrRg, DcrSub
| summarize DCRSet = make_set(DcrName) by resourceId, SrcName, SrcRg, SrcSub
| extend DCRCount = array_length(DCRSet)
And you can compare the machine names (and DCR counts) with the computer names from the earlier SecurityEvent query. (For a flat list, just exclude the last 2 lines (from | summarize to the end))
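If you export both result sets, the comparison can be scripted. Below is a rough Python sketch, assuming you've already pulled the Computer values from the SecurityEvent query and a machine-to-DCR-names mapping from the Resource Graph query. The function names and sample data are hypothetical. It also assumes Computer may be a fully qualified name while the resource name is the short host name, so it matches on the lowercased host part. Check that against your own data.

```python
def short_name(computer):
    """Lowercased host part of a name, so 'DC01.contoso.com' matches 'dc01'."""
    return computer.split(".")[0].lower()


def suspects(duplicate_computers, dcrs_by_machine):
    """Return machines that show duplicates AND have more than one DCR associated."""
    duplicated = {short_name(c) for c in duplicate_computers}
    return {
        name: sorted(dcr_names)
        for name, dcr_names in dcrs_by_machine.items()
        if short_name(name) in duplicated and len(dcr_names) > 1
    }
```

Anything this returns is a good first candidate for review: a machine that is both producing duplicates and carrying more than one DCR.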
Addressing them
This behaviour of DCRs is powerful but can be (quite!) surprising, especially when coming from the old MMA/Log Analytics Agent/OMSAgent era of one-set-of-logs-fits-all.
But you can plan for this (or exploit it) in a couple of ways:
#1: Don't Layer DCRs
Brute-force approach - just apply 1 DCR to a system for a given set of events
#2: Use complementary role-based DCRs
Core Windows Events - something approaching minimal, applies to every Windows system
Members - for non-Identity member systems, any additional logging applicable to every server OS
Identity Systems - any additional logs you want collected from DCs, ADFS, etc
Either as a delta from Members profile or as a delta from Core
Database Servers - etc
Either as a delta from Members profile or from Core
And then the logs will be collected without overlap, because the collection directives don't overlap - but you'll need to layer collection requirements onto systems and have a plan for it.
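One way to sanity-check a layering plan before rolling it out is to compare what each DCR collects, pair by pair. The sketch below is a simplification under stated assumptions: the DCR names and event IDs are hypothetical, and each DCR is reduced to a flat set of event IDs, whereas real DCRs use XPath queries that you'd have to expand into IDs yourself.

```python
from itertools import combinations

# Hypothetical role-based DCRs, each reduced to the event IDs it collects.
# Names and IDs are illustrative only.
ROLE_DCRS = {
    "core-windows": {4624, 4625, 4688},
    "members": {4697, 4698},
    "identity": {4768, 4769, 4776},
    "all-in-one": {4624, 4625, 4688, 4768},
}


def find_overlaps(assigned, dcrs=ROLE_DCRS):
    """Return {(dcr_a, dcr_b): shared_event_ids} for every overlapping pair."""
    overlaps = {}
    for first, second in combinations(sorted(assigned), 2):
        shared = dcrs[first] & dcrs[second]
        if shared:
            overlaps[(first, second)] = shared
    return overlaps
```

A system assigned core-windows, members and identity comes back with no overlaps, since each profile is a pure delta. Adding all-in-one on top of core-windows flags every shared Event ID, which is exactly the set you'd expect to see duplicated in SecurityEvent.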
Next time
Assuming I get back to it sometime this decade, next time we’ll chat Syslog and CommonSecurityLog. But you can see where that’s going, right? :)