Log Monitoring Best Practices

The HYPR platform has robust Event logging across the lifecycle of all registration and authentication activities. By monitoring the logs, customers can create trending reports on usage as well as alerts to identify critical failures at various points in the registration or authentication flow.

Success/Failure Monitoring

HYPR recommends that you watch for anomalies in the volume of successful and failed registration/authentication Events as the cornerstone of any log monitoring process.

🚧

Recommended

We suggest you monitor for failures at the Event level as opposed to the errorCode level. Since Events traverse several different components, a failed Event can result in several error codes because each component will present an error. This is helpful during the debugging process but results in noise when employed within an alerting mechanism.

As an example, here's a monitor built with Datadog's anomaly detection that evaluates two monitors grouped on the same key (eventTags).

There are two checks happening here:

  1. For a given registration or authentication Event type, are we seeing an anomalous increase in the number of failed Events (based on the isSuccessful key described in Event Log Keys below) over a certain time interval (in this case, 15 minutes)? This check uses Datadog's "robust" anomaly detection algorithm, which tests whether the behavior observed in the interval is more than two standard deviations from the behavior during the same time period in previous weeks. For instance, we should expect 8:00 AM - 8:15 AM on a Monday to behave similarly to 8:00 AM - 8:15 AM on the past six Mondays.
  2. For a given registration or authentication Event type, are we seeing an anomalous decrease in the number of successful Events (again based on the isSuccessful key) over the same time interval (in this case, 15 minutes)? The same anomaly detection algorithm is used here.

Both checks are evaluated in a composite monitor, which fires only if both are true. Remember, it's possible for there to be a legitimate drop in success volumes (for example, a holiday or a "soft launch" period), as well as an increase in failures that may simply be the result of an overall volume increase. However, if both of these conditions occur simultaneously, an alert should be triggered.
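
The composite logic can be sketched in Python as follows. This is a minimal illustration, not Datadog's actual "robust" algorithm: it uses a plain mean and standard deviation over counts from the same 15-minute window in previous weeks. The function names, the num_stdevs parameter, and the sample counts are all assumptions made for the example.

```python
from statistics import mean, stdev

def is_anomalous_increase(current, history, num_stdevs=2.0):
    """True if current exceeds the historical mean by more than num_stdevs deviations."""
    return current > mean(history) + num_stdevs * stdev(history)

def is_anomalous_decrease(current, history, num_stdevs=2.0):
    """True if current falls below the historical mean by more than num_stdevs deviations."""
    return current < mean(history) - num_stdevs * stdev(history)

def composite_alert(failed_now, failed_history, success_now, success_history):
    """Fire only when failures spike AND successes drop in the same window."""
    return (is_anomalous_increase(failed_now, failed_history)
            and is_anomalous_decrease(success_now, success_history))

# Hypothetical counts for the same 15-minute window over the past six weeks.
failed_history = [4, 6, 5, 7, 5, 6]
success_history = [200, 210, 195, 205, 198, 202]

# Failures spike while successes collapse: the composite alert fires.
outage = composite_alert(40, failed_history, 50, success_history)
# Failures rise but successes rise too (overall volume increase): no alert.
busy_day = composite_alert(40, failed_history, 400, success_history)
# Successes drop but failures stay normal (for example, a holiday): no alert.
holiday = composite_alert(5, failed_history, 50, success_history)
```

Requiring both conditions is what keeps a holiday or a general volume increase from paging anyone on its own.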

Additional Error Monitoring

In addition to the success/failure Event monitoring described above, there are certain other errors where any instance should trigger an alert.

Error Code/Name: 1203003 FIDO2_SETTINGS_NULL_EC
Description: FIDO2 settings are null.
Recommended Solution: Make sure FIDO2 settings are configured.

Error Code/Name: 1202501 LICENSE_VALIDATION_PROBLEM
Description: License validation has failed.
Recommended Solution: The administrator should check that the correct license has been provided.

Error Code/Name: 1202027 PUSH_NOT_CONFIG_EC
Description: There was an issue with the push provider configuration. Either push is disabled or the URL is null.
Recommended Solution: Check the Control Center logs and contact HYPR Support for assistance.
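
As a rough sketch of an "alert on any instance" check, the Python below scans a batch of log records for the error codes listed above. It assumes each record is a dictionary carrying an errorCode field; the record shape and the function name are hypothetical and will depend on how your log platform exposes HYPR logs.

```python
# Error codes from the list above where a single occurrence warrants an alert.
CRITICAL_ERROR_CODES = {
    "1203003": "FIDO2_SETTINGS_NULL_EC",
    "1202501": "LICENSE_VALIDATION_PROBLEM",
    "1202027": "PUSH_NOT_CONFIG_EC",
}

def find_critical_errors(records):
    """Return (code, name) for every record whose errorCode is on the always-alert list."""
    hits = []
    for record in records:
        # Normalize to str so numeric and string codes both match (assumed record shape).
        code = str(record.get("errorCode", ""))
        if code in CRITICAL_ERROR_CODES:
            hits.append((code, CRITICAL_ERROR_CODES[code]))
    return hits

# Hypothetical sample records.
sample_records = [
    {"errorCode": 1202501},
    {"errorCode": "9999999"},
    {"eventName": "OOB_WEBSITE_AUTH"},
    {"errorCode": "1202027"},
]
critical = find_critical_errors(sample_records)
```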

Pseudocode for Querying Event Logs

You can use the following sample searches as a foundation for writing any number of searches around registration and authentication activity.

📘

Event Descriptions

A full list of all Events and common parameters can be found in the Event Descriptions article.

Mobile Device Registration

Number of users with registration attempts:

select count(distinct machineUserName)
where eventName:OOB_WEBSITE_REG OR eventName:OOB_WORKSTATION_REG

Number of successfully registered users:

select count(distinct machineUserName)
where eventName:OOB_DEVICE_REG_COMPLETE AND isSuccessful:true

Subtracting the second query result from the first will give you the number of users with failed registration attempts.
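
If the Events are exported for offline analysis, the same subtraction can be done with sets, which also tells you which users failed rather than only how many. The sketch below assumes each Event is a dictionary keyed by the field names used in the queries above; the function name and sample data are hypothetical.

```python
def users_with_failed_registration(events):
    """Users who attempted registration but have no successful completion Event."""
    attempted = set()
    registered = set()
    for event in events:
        name = event.get("eventName")
        user = event.get("machineUserName")
        if name in ("OOB_WEBSITE_REG", "OOB_WORKSTATION_REG"):
            attempted.add(user)
        elif name == "OOB_DEVICE_REG_COMPLETE" and event.get("isSuccessful") is True:
            registered.add(user)
    # Set difference: attempted registration, never completed successfully.
    return attempted - registered

# Hypothetical sample Events.
sample_events = [
    {"eventName": "OOB_WEBSITE_REG", "machineUserName": "alice"},
    {"eventName": "OOB_WORKSTATION_REG", "machineUserName": "bob"},
    {"eventName": "OOB_DEVICE_REG_COMPLETE", "machineUserName": "alice", "isSuccessful": True},
    {"eventName": "OOB_DEVICE_REG_COMPLETE", "machineUserName": "bob", "isSuccessful": False},
]
failed_users = users_with_failed_registration(sample_events)
```

The set difference matches the count subtraction only when every successfully registered user also appears among the attempts in the same time range.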

Number of successful registrations by day:

select count(distinct traceId), day(eventTimeInUTC)
where eventName:OOB_DEVICE_REG_COMPLETE AND isSuccessful:true
group by day(eventTimeInUTC)

Remember that one user can successfully register multiple devices/machines.

Registered devices per user:

select a.deviceCount, count(a.machineUserName) as userCount
from
  (select count(distinct deviceId) as deviceCount, machineUserName
   where eventName:OOB_DEVICE_REG_COMPLETE AND isSuccessful:true
   group by machineUserName) a
group by a.deviceCount

This does not account for devices that have been deregistered.
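
For log platforms that do not support nested queries, the same distribution can be computed client-side. The following is a minimal sketch that assumes Events are dictionaries with the field names used in the query above; the function name and sample data are hypothetical. Like the query, it ignores deregistrations.

```python
from collections import Counter, defaultdict

def device_count_distribution(events):
    """Map deviceCount -> number of users who registered that many distinct devices."""
    devices_by_user = defaultdict(set)
    for event in events:
        if (event.get("eventName") == "OOB_DEVICE_REG_COMPLETE"
                and event.get("isSuccessful") is True):
            devices_by_user[event["machineUserName"]].add(event["deviceId"])
    # Inner query: distinct devices per user. Outer query: users per device count.
    return Counter(len(devices) for devices in devices_by_user.values())

# Hypothetical sample Events.
sample_events = [
    {"eventName": "OOB_DEVICE_REG_COMPLETE", "isSuccessful": True,
     "machineUserName": "alice", "deviceId": "d1"},
    {"eventName": "OOB_DEVICE_REG_COMPLETE", "isSuccessful": True,
     "machineUserName": "alice", "deviceId": "d2"},
    {"eventName": "OOB_DEVICE_REG_COMPLETE", "isSuccessful": True,
     "machineUserName": "bob", "deviceId": "d3"},
    {"eventName": "OOB_DEVICE_REG_COMPLETE", "isSuccessful": False,
     "machineUserName": "carol", "deviceId": "d4"},
]
distribution = device_count_distribution(sample_events)
```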

Smart Key Device Registration

Number of unique users and registration Events by status:

select count(distinct machineUserName), count(distinct traceId), isSuccessful
where eventName:SMARTKEY_ENROLL group by isSuccessful

FIDO2 Device Registration

Number of unique users and registration Events by status:

select count(distinct machineUserName), count(distinct traceId), isSuccessful
where eventName:FIDO2_DEVICE_REG_COMPLETE group by isSuccessful

Mobile Device Authentication

Number of unique authentication attempts:

select count(distinct traceId)
where eventName:OOB_WEBSITE_AUTH OR eventName:WORKSTATION_AUTH

Number of successful authentication Events and users:

select count(distinct traceId) as auths, count(distinct machineUserName) as users
where eventName:OOB_WEBSITE_AUTH_COMPLETE AND isSuccessful:true

Subtracting the auths count returned by the second query from the first query result will give you the number of authentication attempts that did not complete successfully.

Number of users attempting to authenticate:

select count(distinct machineUserName)
where eventName:OOB_WEBSITE_AUTH OR eventName:WORKSTATION_AUTH

Number of successful authentications by day:

select count(distinct machineUserName) as users, count(distinct traceId) as auths, day(eventTimeInUTC)
where eventName:OOB_WEBSITE_AUTH_COMPLETE AND isSuccessful:true
group by day(eventTimeInUTC)
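
The by-day grouping above could be reproduced in Python roughly as follows. This sketch assumes eventTimeInUTC is an epoch timestamp in milliseconds, which is not confirmed here; adjust the parsing if your logs store it differently. The function name and sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone

def successful_auths_by_day(events):
    """Per UTC day, count distinct users and distinct traceIds for successful auths."""
    users = defaultdict(set)
    traces = defaultdict(set)
    for event in events:
        if (event.get("eventName") == "OOB_WEBSITE_AUTH_COMPLETE"
                and event.get("isSuccessful") is True):
            # Assumption: eventTimeInUTC is epoch milliseconds.
            seconds = event["eventTimeInUTC"] / 1000
            day = datetime.fromtimestamp(seconds, tz=timezone.utc).date().isoformat()
            users[day].add(event["machineUserName"])
            traces[day].add(event["traceId"])
    return {day: {"users": len(users[day]), "auths": len(traces[day])} for day in traces}

# Hypothetical sample Events (1700000000000 ms is 2023-11-14 22:13:20 UTC).
sample_events = [
    {"eventName": "OOB_WEBSITE_AUTH_COMPLETE", "isSuccessful": True,
     "machineUserName": "alice", "traceId": "t1", "eventTimeInUTC": 1700000000000},
    {"eventName": "OOB_WEBSITE_AUTH_COMPLETE", "isSuccessful": True,
     "machineUserName": "alice", "traceId": "t2", "eventTimeInUTC": 1700000060000},
    {"eventName": "OOB_WEBSITE_AUTH_COMPLETE", "isSuccessful": True,
     "machineUserName": "bob", "traceId": "t3", "eventTimeInUTC": 1700086400000},
    {"eventName": "OOB_WEBSITE_AUTH_COMPLETE", "isSuccessful": False,
     "machineUserName": "carol", "traceId": "t4", "eventTimeInUTC": 1700000000000},
]
by_day = successful_auths_by_day(sample_events)
```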

FIDO2 Device Authentication

Number of unique users and authentication Events:

select count(distinct machineUserName), count(distinct traceId)
where eventName:FIDO2_WEBAUTHN_COMPLETE