Detect threats using Microsoft Graph activity logs - Part 2
In part one I focused mostly on detecting offensive security tools like AzureHound, GraphRunner, and PurpleKnight. In part two I will go into more depth on how you can use the now available information for hunting and how to correlate it with other datasets to gain deeper insights.
Correlate Graph activities with other log sources
While the MicrosoftGraphActivityLogs table alone is a trove of information, correlating it with other logs makes it an even more interesting data source. Here are a few examples of how to get additional information.
Resolve User Id to UPN
With User and Entity Behavior Analytics (UEBA) enabled in Sentinel, the IdentityInfo table gives a great overview of all user identities in Entra ID. It behaves like a hybrid between a watchlist and a regular table, so you should always query the last 14 days and aggregate to the newest available entry per user to make sure you have all information available.
Since the Graph logs only contain a user Id but no user principal name, you might need this lookup to better identify the user responsible for the Graph call.
MicrosoftGraphActivityLogs
| where TimeGenerated > ago(1d)
| where isnotempty( UserId )
| join kind=inner (IdentityInfo
| where TimeGenerated > ago(14d)
| summarize arg_max(TimeGenerated, *) by AccountObjectId
| project UserId=AccountObjectId, AccountUPN)
on UserId
| project-away UserId1
| limit 100
| sort by TimeGenerated
This returns only Graph calls made by a regular user, and only if the user could be resolved by the IdentityInfo table. Change the join kind to leftouter to also include Graph calls from user ids that cannot be resolved.
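To make the difference between the two join kinds concrete, here is a minimal, self-contained sketch of the leftouter variant. The two datatable definitions are hypothetical stand-ins for MicrosoftGraphActivityLogs and IdentityInfo, and the Resolved column is only added for illustration.

```kusto
// Hypothetical sample rows standing in for MicrosoftGraphActivityLogs
let GraphCalls = datatable(TimeGenerated: datetime, UserId: string, RequestUri: string) [
    datetime(2024-05-01 10:00:00), "11111111-1111-1111-1111-111111111111", "https://graph.microsoft.com/v1.0/me",
    datetime(2024-05-01 10:05:00), "22222222-2222-2222-2222-222222222222", "https://graph.microsoft.com/v1.0/users"
];
// Hypothetical sample rows standing in for IdentityInfo; only the first user is known
let Identities = datatable(TimeGenerated: datetime, AccountObjectId: string, AccountUPN: string) [
    datetime(2024-04-28 08:00:00), "11111111-1111-1111-1111-111111111111", "alice@contoso.example",
    datetime(2024-04-30 08:00:00), "11111111-1111-1111-1111-111111111111", "alice@contoso.example"
];
GraphCalls
| join kind=leftouter (
    Identities
    // Keep only the newest entry per user
    | summarize arg_max(TimeGenerated, *) by AccountObjectId
    | project UserId = AccountObjectId, AccountUPN
    ) on UserId
| project-away UserId1
// Illustrative helper column: false for user ids that could not be resolved
| extend Resolved = isnotempty(AccountUPN)
| sort by TimeGenerated desc
```

Both Graph calls should be returned. The second one has an empty AccountUPN, and with kind=inner it would be dropped.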
Map sign-in events to Graph calls
Another cool trick is the ability to map the sign-in information using the field SignInActivityId, which translates to UniqueTokenIdentifier in the sign-in logs. That way you can easily map a particular sign-in event to the events in Microsoft Graph.
Since the object id of the active entity can be in either the field UserId or ServicePrincipalId, depending on the object type, you must consider this when querying the data.
I created two new fields, ObjectId and ObjectType, for this reason.
Now you should join SigninLogs, AADNonInteractiveUserSignInLogs, AADServicePrincipalSignInLogs, and AADManagedIdentitySignInLogs to have the best coverage of all available sign-ins. (ADFS sign-in logs are not covered, because please don’t use ADFS anymore.)
MicrosoftGraphActivityLogs
| where TimeGenerated > ago(8d)
| extend ObjectId = iff(isempty(UserId), ServicePrincipalId, UserId)
| extend ObjectType = iff(isempty(UserId), "ServicePrincipalId", "UserId")
| join kind=inner (union isfuzzy=true
SigninLogs,
AADNonInteractiveUserSignInLogs,
AADServicePrincipalSignInLogs,
AADManagedIdentitySignInLogs
| where TimeGenerated > ago(90d)
| summarize arg_max(TimeGenerated, *) by UniqueTokenIdentifier
)
on $left.SignInActivityId == $right.UniqueTokenIdentifier
| project-reorder TimeGenerated, ObjectType, UserPrincipalName, ObjectId, SignInActivityId, RequestUri, RequestMethod
With this you get a good understanding of when the entity signed in to Entra ID and what it did using the Microsoft Graph API.
Find missing sign-in logs
From a threat detection perspective, let’s change the direction of this query for a second and keep only the Graph calls for which no sign-in information can be found in the logs.
MicrosoftGraphActivityLogs
| where TimeGenerated > ago(8d)
| extend ObjectId = iff(isempty(UserId), ServicePrincipalId, UserId)
| extend ObjectType = iff(isempty(UserId), "ServicePrincipalId", "UserId")
| summarize by ObjectType, ObjectId, SignInActivityId
| join kind=leftanti (union isfuzzy=true
SigninLogs,
AADNonInteractiveUserSignInLogs,
AADServicePrincipalSignInLogs,
AADManagedIdentitySignInLogs
| where TimeGenerated > ago(90d)
| summarize arg_max(TimeGenerated, *) by UniqueTokenIdentifier
)
on $left.SignInActivityId == $right.UniqueTokenIdentifier
| summarize by ObjectType, ObjectId
In my environment this resulted in about 55 unique service principal Ids I could not find any sign-in data for. Either my lab is hopelessly compromised or there is some data missing from the logs.
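If the leftanti join is new to you, this small sketch shows the behavior the query above relies on. All rows and identifiers are made up for illustration.

```kusto
// Hypothetical Graph calls, each carrying the token id it was made with
let GraphCalls = datatable(ObjectId: string, SignInActivityId: string) [
    "sp-a", "token-1",
    "sp-b", "token-2",
    "sp-c", "token-3"
];
// Hypothetical sign-in events; there is no entry for token-2
let SignIns = datatable(UniqueTokenIdentifier: string) [
    "token-1",
    "token-3"
];
GraphCalls
// leftanti keeps only the left rows that have no match on the right side
| join kind=leftanti (SignIns) on $left.SignInActivityId == $right.UniqueTokenIdentifier
```

Only the sp-b row should remain, because it is the only Graph call without a matching sign-in event.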
Let’s map all these object ids to service principals that exist in my tenant, either as enterprise applications, multi-tenant apps, or even managed identities.
Since there is no native IdentityInfo table for such objects, my colleague Thomas Naunheim and I created such an enrichment table based on our Sentinel Enrichment Framework. More on that in due time.
MicrosoftGraphActivityLogs
| where TimeGenerated > ago(8d)
| extend ObjectId = iff(isempty(UserId), ServicePrincipalId, UserId)
| extend ObjectType = iff(isempty(UserId), "ServicePrincipalId", "UserId")
| summarize by ObjectType, ObjectId, SignInActivityId
| join kind=leftanti (union isfuzzy=true
SigninLogs,
AADNonInteractiveUserSignInLogs,
AADServicePrincipalSignInLogs,
AADManagedIdentitySignInLogs
| where TimeGenerated > ago(90d)
| summarize arg_max(TimeGenerated, *) by UniqueTokenIdentifier
)
on $left.SignInActivityId == $right.UniqueTokenIdentifier
| summarize by ObjectType, ObjectId
| join kind=leftouter (_GetWatchlist('WorkloadIdentityInfo')) on $left.ObjectId == $right.SearchKey
| project ObjectType, ObjectId, AppDisplayName, AppId, IsFirstPartyApp
This changes the perception of the data quite a bit. All of the service principals are resolved and, as indicated by the IsFirstPartyApp field, belong to Microsoft. But it’s still curious that they don’t show up in any sign-in log I have access to. With names like Yggdrasil there definitely are some creative minds at work.
Missing object ids
One thing I also found very curious is that there are Graph events without any user or service principal Id.
let ClientAuthMethods = dynamic ({"0": "public client", "1": "client secret", "2": "Certificate"});
MicrosoftGraphActivityLogs
| where TimeGenerated > ago(8d)
| where isempty(UserId) and isempty(ServicePrincipalId)
| extend ClientAuthMethodName = tostring(ClientAuthMethods[tostring(ClientAuthMethod)])
| summarize Count=count() by RequestUri, UserAgent, ClientAuthMethodName
| project-reorder Count, ClientAuthMethodName, UserAgent, RequestUri
Some of them I was able to match to sign-in events of a user using GDAP, but for others I didn’t find any correlating logs. This case is still unsolved.
let ClientAuthMethods = dynamic ({"0": "public client", "1": "client secret", "2": "Certificate"});
MicrosoftGraphActivityLogs
| where TimeGenerated > ago(90d)
| where isempty(UserId) and isempty(ServicePrincipalId)
| extend ClientAuthMethodName = tostring(ClientAuthMethods[tostring(ClientAuthMethod)])
| join kind=inner (union isfuzzy=true
SigninLogs,
AADNonInteractiveUserSignInLogs,
AADServicePrincipalSignInLogs,
AADManagedIdentitySignInLogs
| where TimeGenerated > ago(90d)
| summarize arg_max(TimeGenerated, *) by UniqueTokenIdentifier
)
on $left.SignInActivityId == $right.UniqueTokenIdentifier
| summarize by RequestUri, UserAgent, ClientAuthMethodName, ClientAuthMethod, Identity
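The ClientAuthMethods lookup used in both queries can be tried in isolation. This is a minimal sketch with made-up input values. The value 7 and the "unknown" fallback are assumptions added for illustration and are not part of the queries above.

```kusto
let ClientAuthMethods = dynamic({"0": "public client", "1": "client secret", "2": "Certificate"});
// Hypothetical input values; 7 is not part of the mapping
datatable(ClientAuthMethod: int) [0, 1, 2, 7]
// Property bag keys are strings, so the numeric value is converted before the lookup
| extend ClientAuthMethodName = tostring(ClientAuthMethods[tostring(ClientAuthMethod)])
// Assumed fallback so that unmapped values stay visible in the results
| extend ClientAuthMethodName = iff(isempty(ClientAuthMethodName), "unknown", ClientAuthMethodName)
```

Without the fallback, values that are missing from the mapping end up as an empty string and are easy to overlook in a summarize.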
Correlate the data with itself
The batch endpoint
When using Microsoft Graph you might have encountered it already, and if you take a look at the browser developer tools when using the Entra portal you definitely have seen it:
https://graph.microsoft.com/beta/$batch
This endpoint accepts multiple Graph requests using the POST method and returns all results in one handy response. But how does such a request show up in the MicrosoftGraphActivityLogs?
MicrosoftGraphActivityLogs
| where TimeGenerated > ago(1d)
| where RequestMethod == "POST" and RequestUri == "https://graph.microsoft.com/beta/$batch"
| limit 10
| join kind=inner (MicrosoftGraphActivityLogs
// Exclude only the batch call itself, so that POST sub-requests are kept as well
| where RequestUri != "https://graph.microsoft.com/beta/$batch"
| project-rename BatchRequestUri = RequestUri
)
on OperationId
| project-reorder TimeGenerated, OperationId, RequestUri, BatchRequestUri
Using this Kusto query you can get 10 of those Graph calls and map the actual requests based on the OperationId. So even if the batch endpoint is used, all related Graph calls are logged and can be used in the investigation.
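The following sketch shows the same self-join on a handful of made-up rows, assuming (as described above) that a batch call and its sub-requests share one OperationId. The Logs table, the BatchUri variable, and the SubRequest column names are hypothetical.

```kusto
// Hypothetical log rows: one batch call (op-1) with two sub-requests, plus one unrelated call (op-2)
let Logs = datatable(TimeGenerated: datetime, OperationId: string, RequestMethod: string, RequestUri: string) [
    datetime(2024-05-01 10:00:00), "op-1", "POST", "https://graph.microsoft.com/beta/$batch",
    datetime(2024-05-01 10:00:00), "op-1", "GET",  "https://graph.microsoft.com/beta/me",
    datetime(2024-05-01 10:00:00), "op-1", "GET",  "https://graph.microsoft.com/beta/organization",
    datetime(2024-05-01 10:01:00), "op-2", "GET",  "https://graph.microsoft.com/beta/users"
];
let BatchUri = "https://graph.microsoft.com/beta/$batch";
Logs
| where RequestMethod == "POST" and RequestUri == BatchUri
| join kind=inner (
    Logs
    // Everything except the batch call itself
    | where RequestUri != BatchUri
    | project OperationId, SubRequestMethod = RequestMethod, SubRequestUri = RequestUri
    ) on OperationId
// One row per batch call, listing the requests it contained
| summarize SubRequests = make_list(strcat(SubRequestMethod, " ", SubRequestUri)) by OperationId
```

With this sample data only op-1 should be returned, together with its two GET sub-requests.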
Hunting
All new data sources should help you build either detections or hunting queries to find the needle in the haystack. Here are a few ideas of mine that you can use in your environment.
Unusual user agent
let HistoricalActivity = MicrosoftGraphActivityLogs
| where TimeGenerated between (ago(30d) .. startofday(now()))
| where isnotempty(UserAgent)
| extend ObjectId = iff(isempty(UserId), ServicePrincipalId, UserId)
| extend ObjectType = iff(isempty(UserId), "ServicePrincipalId", "UserId")
| summarize by ObjectId, UserAgent, IPAddress;
MicrosoftGraphActivityLogs
| where TimeGenerated between (startofday(now()) .. now())
| extend ObjectId = iff(isempty(UserId), ServicePrincipalId, UserId)
| where isnotempty(UserAgent)
// Remove known user agents
| join kind=leftanti (HistoricalActivity
| summarize by ObjectId, UserAgent
)
on UserAgent, ObjectId
// Remove known IP addresses to limit false positives
//| join kind=leftanti (HistoricalActivity | summarize by IPAddress) on IPAddress
Building a list of known user agents per entity and comparing it to current data can help to identify if something is off. The false positive rate can be medium to high depending on how much your environment changes. Removing known “good” IP addresses can help mitigate this quite a bit.
New sensitive role used
Using the awesome Microsoft Graph classification information provided by Thomas Naunheim, it’s super easy to get all Graph requests that use an API role assigned to the tier level ControlPlane for the first time.
let SensitiveMsGraphPermissions = externaldata(AppId: guid, AppRoleId: guid, AppRoleDisplayName: string, Category: string, EAMTierLevelName: string, EAMTierLevelTagValue: string)["https://raw.githubusercontent.com/Cloud-Architekt/AzurePrivilegedIAM/main/Classification/Classification_AppRoles.json"] with (format='multijson')
| where EAMTierLevelName == "ControlPlane"
| distinct AppRoleDisplayName;
let HistoricalActivity = MicrosoftGraphActivityLogs
| where TimeGenerated between (ago(30d) .. startofday(now()))
| where Roles has_any (SensitiveMsGraphPermissions)
| extend ObjectId = iff(isempty(UserId), ServicePrincipalId, UserId)
| extend ObjectType = iff(isempty(UserId), "ServicePrincipalId", "UserId")
| summarize by ObjectId;
MicrosoftGraphActivityLogs
| where TimeGenerated between (startofday(now()) .. now())
| extend ObjectId = iff(isempty(UserId), ServicePrincipalId, UserId)
| where Roles has_any (SensitiveMsGraphPermissions)
// Remove known object ids
| where ObjectId !in (HistoricalActivity)
Of course you can also adjust this query to return all the service principals that would be classified as Control Plane assets and use set_intersect to identify which role permissions are the critical ones.
let SensitiveMsGraphPermissions = externaldata(AppId: guid, AppRoleId: guid, AppRoleDisplayName: string, Category: string, EAMTierLevelName: string, EAMTierLevelTagValue: string)["https://raw.githubusercontent.com/Cloud-Architekt/AzurePrivilegedIAM/main/Classification/Classification_AppRoles.json"] with (format='multijson')
| where EAMTierLevelName == "ControlPlane"
| distinct AppRoleDisplayName;
let ScalarRoles = toscalar(SensitiveMsGraphPermissions
| summarize AppRoleDisplayName=make_set(AppRoleDisplayName, 1000));
MicrosoftGraphActivityLogs
| where TimeGenerated > ago(30d)
| where Roles has_any (SensitiveMsGraphPermissions)
| extend Roles = split(Roles, ' ')
| extend ControlPlaneRoles=set_intersect(todynamic(Roles), ScalarRoles)
| extend ObjectId = iff(isempty(UserId), ServicePrincipalId, UserId)
| extend ObjectType = iff(isempty(UserId), "ServicePrincipalId", "UserId")
| summarize by ObjectId, ObjectType, tostring(ControlPlaneRoles)
| join kind=leftouter (_GetWatchlist('WorkloadIdentityInfo')
| project-away ['_DTItemId'], LastUpdatedTimeUTC, SearchKey
| project-rename ObjectId=ServicePrincipalObjectId
| extend ObjectId = tostring(ObjectId))
on ObjectId
| join kind=leftouter (IdentityInfo
| where TimeGenerated > ago(14d)
| summarize arg_max(TimeGenerated, *) by AccountObjectId
| project-rename ObjectId=AccountObjectId)
on ObjectId
| project ObjectType, AppDisplayName, AccountUPN, ControlPlaneRoles
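The core of this query is the combination of split and set_intersect. Here is a minimal sketch of just that step. The ControlPlaneRoles list is a hypothetical stand-in for the classification file, and the sample rows are made up.

```kusto
// Hypothetical stand-in for the roles classified as ControlPlane
let ControlPlaneRoles = dynamic(["RoleManagement.ReadWrite.Directory", "AppRoleAssignment.ReadWrite.All"]);
// Hypothetical rows; Roles is a space separated string as in the query above
datatable(ObjectId: string, Roles: string) [
    "sp-a", "User.Read.All AppRoleAssignment.ReadWrite.All",
    "sp-b", "Mail.Read"
]
| extend RoleList = split(Roles, ' ')
// Keep only the roles that are also part of the sensitive list
| extend Matched = set_intersect(RoleList, ControlPlaneRoles)
| where array_length(Matched) > 0
```

Only sp-a should remain, with AppRoleAssignment.ReadWrite.All as the single matched role.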
Audit data
The last idea is paired with a funny coincidence. I was using the Entra ID audit logs and correlated them with the graph data. This got me thinking: Is there a source for this information already?
A database where you can see which Graph call will result in which audit event and the other way around?
And it seemed that I wasn’t the only one thinking about this at the time. On X/Twitter Andy Robbins (@_wald0) had exactly this question.
And now I can answer this question with a definitive: “Some of it”.
In the EntraIDAuditLogToMicrosoftGraph repository, you will find a nice list, either as CSV or as JSON, which contains this data.
https://github.com/f-bader/EntraIDAuditLogToMicrosoftGraph
The data is based on the Graph logs and the following query is the source of it. If you want to contribute, feel free to run this query in your environment, export the results, and create a pull request with your file added to the source folder.
Hopefully with a collective effort we will get a good coverage of the data.
AuditLogs
| where TimeGenerated > ago(90d)
| join kind=inner (
MicrosoftGraphActivityLogs
// Ignore GET requests
| where RequestMethod != 'GET'
)
on $left.CorrelationId == $right.ClientRequestId
// Remove PII information and normalize the RequestURI
| extend NormalizedRequestUri = replace_regex(RequestUri, @'[0-9a-fA-F]{8}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{12}', @'<UUID>')
| extend NormalizedRequestUri = replace_regex(NormalizedRequestUri, @'[a-zA-Z0-9_-]{41,65}', @'<ID>')
| extend NormalizedRequestUri = replace_regex(NormalizedRequestUri, @'\d+$', @'<ID>')
| extend NormalizedRequestUri = replace_regex(NormalizedRequestUri, @'\/+', @'/')
| extend NormalizedRequestUri = replace_regex(NormalizedRequestUri, @'https:\/', @'https://')
| extend NormalizedRequestUri = replace_regex(NormalizedRequestUri, @'%23EXT%23', @'')
| extend NormalizedRequestUri = replace_regex(NormalizedRequestUri, @'\/[a-zA-Z0-9+_.\-]+@[a-zA-Z0-9.]+\/', @'/<UPN>/')
| extend NormalizedRequestUri = replace_regex(NormalizedRequestUri, @'^\/<UUID>', @'')
| extend NormalizedRequestUri = replace_regex(NormalizedRequestUri, @'\?.*$', @'')
// Remove POST requests to the batch endpoint
| where not ( NormalizedRequestUri matches regex @"https:\/\/graph.microsoft.com\/(v1\.0|beta)/\$batch" )
| summarize by OperationName, NormalizedRequestUri, RequestMethod, OperationVersion
| project-rename
MicrosoftGraphRequestUri = NormalizedRequestUri,
EntraIDOperationName = OperationName,
EntraIDOperationVersion = OperationVersion
| sort by EntraIDOperationName asc
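To see what the normalization does to a single URI, you can run a subset of the replace_regex steps on a few sample values. This is a reduced sketch with made-up URIs. It covers only the UUID, UPN, and query string steps, and the UUID pattern is written without the word boundaries used above.

```kusto
// Hypothetical request URIs
datatable(RequestUri: string) [
    "https://graph.microsoft.com/v1.0/users/0f8fad5b-d9cb-469f-a165-70867728950e/memberOf?$top=10",
    "https://graph.microsoft.com/beta/users/alice@contoso.example/manager"
]
// Replace object ids with a placeholder
| extend Normalized = replace_regex(RequestUri, @'[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}', @'<UUID>')
// Replace user principal names that appear as a path segment
| extend Normalized = replace_regex(Normalized, @'\/[a-zA-Z0-9+_.\-]+@[a-zA-Z0-9.]+\/', @'/<UPN>/')
// Drop the query string
| extend Normalized = replace_regex(Normalized, @'\?.*$', @'')
```

The two rows should come out as https://graph.microsoft.com/v1.0/users/<UUID>/memberOf and https://graph.microsoft.com/beta/users/<UPN>/manager, which no longer contain tenant specific identifiers.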
Missing data
One big caveat of this query is that it can only map what’s there and has the correct correlation id. And in many cases this seems not to be the case: in my environment I cannot map all the audit data to a Graph call. While ClientRequestId is the best anchor I found in the data, it’s not perfect. RequestId and OperationId match only a subset of the results from ClientRequestId, so I don’t use them anymore.
My best guess on the missing data is that the client did not use Microsoft Graph to make the change, but used other administrative APIs.
AuditLogs
| where TimeGenerated > ago(14d)
| join kind=leftanti (MicrosoftGraphActivityLogs) on $left.CorrelationId == $right.ClientRequestId
| count
Community resources
Since I wrote the initial draft for this post, the amazing security community has come up with more and more use cases for this log type. Here are a few of those:
- Have you heard of workload identity access token replay? by Nicola Suter
- Identify large data transfers/exfiltration by Invictus Incident Response
Conclusion
MicrosoftGraphActivityLogs is an excellent source of data that can be used to analyze Graph usage in any environment. There are some caveats to be aware of, mostly that other APIs are not part of this log, so there might be some gaps that you as a defender should know about.
Overall I would recommend that everybody invest the time to identify additional use cases, build upon the provided ones, and share the results with the community.