Security analysts have to deal with genuine internal and external threats, along with a stream of false positives, on a daily basis. A SIEM solution integrated with user and entity behavior analytics (UEBA) capabilities can help them reduce false positives by identifying suspicious behavior. Suspicious behavior is identified by benchmarking the observed behavior of a user or entity against a "learned baseline."
The solution spends the first few weeks studying all user and system activity and developing a baseline of expected activity for each entity. After this initial training period, the solution begins to detect deviant behavior automatically by continuously checking for time, count, and pattern anomalies using machine learning algorithms. With a UEBA-integrated SIEM solution, security analysts can detect threats proactively, including account compromise, data exfiltration, insider threats, and logon anomalies.
Critical to this effort is anomaly modeling. Anomaly modeling is the process of using machine learning models to create baselines of expected behavior. Analysts may be able to customize the anomaly model by choosing the parameters that the solution will study to determine the baseline. The parameters may differ according to the type of anomaly being modeled.
Time, count, and pattern anomalies each have their own parameters that can be used to model anomalies in a UEBA solution. Let us see how these parameters can be customized.
Briefly put, a time anomaly is triggered when an event occurs at an abnormal time. As seen previously, security analysts can choose the time intervals in which user behavior is studied.
Let us say John usually logs on every day between 9 and 10 a.m., and most often between 9:00 and 9:15. A default anomaly model that tracks John's logon time in 15-minute intervals will record a logon after 9:15 as an anomaly. But what if a security analyst creates a model that tracks John's logon behavior in one-hour intervals instead?
As illustrated in Table 1, activity that could be construed as an anomaly when tracked every 15 minutes might not be one when monitored on an hourly basis. Custom anomaly modeling enables analysts to choose the time interval parameter that best suits their requirements.
Default anomaly model (tracks user behavior in 15-minute intervals by default) | | | Custom anomaly model (tracks user behavior in 1-hour intervals as set by the analyst) | | |
Date | John's logon time | Anomaly | Date | John's logon time | Anomaly |
Jan. 23 | 9:10 | No | Jan. 23 | 9:15 | No |
Jan. 24 | 9:13 | No | Jan. 24 | 9:17 | No |
Jan. 25 | 9:30 | Yes | Jan. 25 | 9:30 | No |
Jan. 26 | 9:12 | No | Jan. 26 | 9:45 | No |
Table 1: Default vs. custom anomaly model for tracking time anomalies
Under the default anomaly model, John's 9:30 logon is flagged as an anomaly. Under the custom anomaly model, the analyst sets the time interval to one hour, so the same 9:30 a.m. logon is not identified as an anomaly.
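The effect of the interval parameter can be sketched in a few lines of Python. This is a minimal illustration, not how any particular SIEM product implements it: it assumes the baseline is simply the set of intervals in which logons were seen during training, and the function names and training data are hypothetical.

```python
from datetime import time

def bucket(t: time, interval_minutes: int) -> int:
    """Return the index of the interval of the day that contains t."""
    return (t.hour * 60 + t.minute) // interval_minutes

def learn_baseline(logons, interval_minutes):
    """Collect the intervals in which logons were observed during training."""
    return {bucket(t, interval_minutes) for t in logons}

def is_time_anomaly(t, baseline, interval_minutes):
    """A logon is anomalous if it falls in an interval never seen in training."""
    return bucket(t, interval_minutes) not in baseline

# Hypothetical training data: John's logon times during the learning period.
training = [time(9, 10), time(9, 13), time(9, 12), time(9, 5)]

default_baseline = learn_baseline(training, 15)  # 15-minute intervals
custom_baseline = learn_baseline(training, 60)   # 1-hour intervals

logon = time(9, 30)
print(is_time_anomaly(logon, default_baseline, 15))  # True
print(is_time_anomaly(logon, custom_baseline, 60))   # False
```

With 15-minute intervals, 9:30 lands in an interval that never appeared in training; with one-hour intervals, it shares the 9:00 to 10:00 interval with the training logons.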
Time anomalies can be tracked for discrete events since there is an element of time sequencing. This may not be the case for count or pattern anomalies. Security analysts may be able to create multiple anomaly models with different time interval parameters, and reports for these anomalies can be generated simultaneously as well.
A count anomaly is identified when an unusually high number of activities occurs in a particular period.
For identifying count anomalies, a SIEM solution will aggregate the number of activities in a specific time range, and compare it with the expected number of activities in the same time range. Anomaly modeling will enable the analyst to choose the time range parameter. For example, a default anomaly model may study the number of times a file is accessed in every 30-minute time interval in a day. If the expected count is 100 times for a specific 30-minute period but the actual count is 200 times, it will record a deviation from expected behavior. While designing a custom anomaly model, an analyst may know that it is normal for the file to be accessed 200 times in 30 minutes, so they may choose to track the number of accesses on an hourly basis instead of every 30 minutes.
This is illustrated in Table 2 below.
Default anomaly model (tracks file reads every 30 minutes by default) | | | Custom anomaly model (tracks file reads every hour as set by the analyst) | | |
Time range | No. of file accesses | Anomaly | Time range | No. of file accesses | Anomaly |
00:00 to 00:30 | 100 | No | 00:00 to 01:00 | 250 | No |
00:30 to 01:00 | 150 | Yes | 01:00 to 02:00 | 300 | No |
01:00 to 01:30 | 200 | Yes | 02:00 to 03:00 | 400 | No |
Table 2: Default vs. custom anomaly model for tracking count anomalies
Under this new anomaly model, the SIEM solution will benchmark the number of observed file accesses in each 1-hour interval against the learned baseline. The 250 file accesses that happen between 00:00 and 01:00 may then be deemed normal. Under the default model, however, the 150 file accesses that occur between 00:30 and 01:00 may trigger an anomaly.
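The aggregate-and-compare step can be sketched as follows. This is an illustrative sketch only: the `tolerance` factor, the expected counts, and the synthetic timestamps are assumptions made for the example, since the source does not say how large a deviation must be before it is recorded.

```python
from collections import Counter
from datetime import datetime

def count_per_window(timestamps, window_minutes):
    """Aggregate events into fixed-size windows, indexed from midnight."""
    counts = Counter()
    for ts in timestamps:
        counts[(ts.hour * 60 + ts.minute) // window_minutes] += 1
    return counts

def count_anomalies(observed, expected, tolerance=1.25):
    """Return the windows whose count exceeds expected * tolerance.

    `tolerance` is a hypothetical parameter, not taken from any product.
    """
    return sorted(w for w, n in observed.items()
                  if n > expected.get(w, 0) * tolerance)

def day(hour, minute):
    return datetime(2021, 1, 23, hour, minute)

# Synthetic file-access events, loosely following Table 2.
events = ([day(0, 10)] * 100 + [day(0, 40)] * 150
          + [day(1, 10)] * 200 + [day(1, 40)] * 100)

# Assumed baselines: 100 accesses per 30 minutes; 250 and 300 per hour.
expected_30 = {0: 100, 1: 100, 2: 100, 3: 100}
expected_60 = {0: 250, 1: 300}

print(count_anomalies(count_per_window(events, 30), expected_30))  # [1, 2]
print(count_anomalies(count_per_window(events, 60), expected_60))  # []
```

The same events produce two anomalous 30-minute windows under the default model and none under the hourly model, because both the aggregation window and the baseline it is compared against change.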
The analyst can select the time range parameter that best suits their cybersecurity needs to facilitate the creation of anomaly models.
As the name suggests, pattern anomalies are identified when an unusual sequence of events occurs. While individual events may not seem anomalous, when observed as a sequence they may indicate suspicious behavior. For example, suppose a user changes a firewall rule at an unusual time. The sequence being examined could be Host name > Rule ID > Time, where the Rule ID > Time part of the sequence may be recorded as an anomaly.
A UEBA-integrated SIEM identifies pattern anomalies by studying user behavior for a few weeks and establishing a pattern baseline. One method it uses to establish baselines is Markov chain analysis. Put simply, a Markov chain is a sequence of events in which the probability of each event depends only on the state reached in the previous event. For deeper insights into how UEBA establishes baseline behavior and into Markov chain analysis, you can check out our informative guide.
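To make the Markov chain idea concrete, here is a minimal sketch that estimates transition probabilities from observed event sequences and scores a new sequence against them. The event names and training sequences are hypothetical, and the sketch ignores the probability of the first event; it is not a description of how any specific UEBA product builds its baseline.

```python
from collections import Counter, defaultdict

def transition_probabilities(sequences):
    """Estimate P(next event | current event) from observed sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(followers.values())
                for nxt, n in followers.items()}
        for state, followers in counts.items()
    }

def sequence_probability(seq, probs):
    """Multiply the transition probabilities along a sequence."""
    p = 1.0
    for current, nxt in zip(seq, seq[1:]):
        p *= probs.get(current, {}).get(nxt, 0.0)
    return p

# Hypothetical training sequences observed during the learning period.
training = [
    ["logon", "file_read", "logoff"],
    ["logon", "file_read", "logoff"],
    ["logon", "file_read", "logoff"],
    ["logon", "file_read", "file_copy", "logoff"],
]

probs = transition_probabilities(training)
print(probs["file_read"])                                           # {'logoff': 0.75, 'file_copy': 0.25}
print(sequence_probability(["logon", "file_read", "logoff"], probs))  # 0.75
print(sequence_probability(["logon", "file_copy"], probs))            # 0.0
```

A sequence containing a transition that was never seen in training scores zero, which is the kind of low-probability pattern a baseline built this way would surface for review.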
In default models, these pattern chains are created from the default values present in the data or reports given as input. In custom anomaly modeling, the security analyst is able to select the variables that form this pattern chain.
For example, let's suppose that for a particular input report, the default pattern chain being examined is Username > Domain > Hostname > Time.
The solution will look for anomalies in the association of adjacent variables. Therefore, anomalies will be examined in the pairs Username > Domain, Domain > Hostname, and Hostname > Time.
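The pairwise check can be sketched as below. This is a simplified illustration under stated assumptions: the baseline is taken to be the set of value pairs seen during training for each adjacent pair of variables, and the records, field values, and function names are hypothetical.

```python
def adjacent_pairs(chain):
    """Split a pattern chain into its adjacent variable pairs."""
    return list(zip(chain, chain[1:]))

def learn_pair_baseline(chain, records):
    """For each adjacent pair, remember the value combinations seen in training."""
    baseline = {pair: set() for pair in adjacent_pairs(chain)}
    for rec in records:
        for left, right in adjacent_pairs(chain):
            baseline[(left, right)].add((rec[left], rec[right]))
    return baseline

def anomalous_pairs(chain, record, baseline):
    """Return the adjacent pairs whose value combination was never seen."""
    return [(left, right) for left, right in adjacent_pairs(chain)
            if (record[left], record[right]) not in baseline[(left, right)]]

chain = ["Username", "Domain", "Hostname", "Time"]

# Hypothetical training records; Time is already bucketed into hours.
training = [
    {"Username": "john", "Domain": "corp", "Hostname": "ws-01", "Time": "09:00-10:00"},
    {"Username": "john", "Domain": "corp", "Hostname": "ws-01", "Time": "10:00-11:00"},
]
baseline = learn_pair_baseline(chain, training)

new_record = {"Username": "john", "Domain": "corp",
              "Hostname": "ws-01", "Time": "02:00-03:00"}

print(adjacent_pairs(chain))
# [('Username', 'Domain'), ('Domain', 'Hostname'), ('Hostname', 'Time')]
print(anomalous_pairs(chain, new_record, baseline))
# [('Hostname', 'Time')]
```

Only the Hostname > Time association is flagged here, because the user, domain, and host combinations all match training and only the time of activity on that host is new.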
Sometimes, default patterns might not provide the analysts with the complete picture. They may require additional data or reports. This is where custom modeling comes in. In custom modeling, the analyst creates these pattern chains by choosing the variables and the pattern in which they are arranged.
So for this example, along with choosing the components 'Username', 'Domain', 'Hostname', and 'Time', the analyst can also change the pattern in which they are arranged.
They could alternatively arrange them as:
Username > Domain > Time > Hostname
They could also add up to four additional variables (to give a total of eight) to this chain to create a new anomaly model.
Creating customized anomaly models, as opposed to using only pre-built models, can help security teams combat the threats and risks to which their organizations are most susceptible.
The UEBA module in ManageEngine's Log360 comes equipped with both pre-built and customized anomaly models. To learn more, sign up for a personalized demo with our product experts today!
© 2021 Zoho Corporation Pvt. Ltd. All rights reserved.