Monitoring

Services

The HYPR environment runs the following processes:

  • Nginx for Reverse Proxy and SSL termination
  • JVM for HYPR Servers (embedded Tomcat)
  • Vault for Configuration 
  • Redis for In Memory Caching
  • MySQL for Persistent Storage

HYPR Server health checks

Adding health checks to the HYPR services helps ensure that the services are available. A health checker can poll the HTTP API endpoints at regular intervals. If the service is running normally, it responds to the health check request.

For a conceptual overview of health checks, see health check concepts.
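
The polling described above can be sketched in a few lines. This is a minimal illustration using only Python's standard library, not a HYPR-supplied tool; the function name `is_healthy` and the 5-second timeout are assumptions made for the example.

```python
import http.client
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True only if the endpoint answers the GET with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers error status codes (HTTPError), refused connections,
        # DNS failures, timeouts and malformed responses.
        return False
```

A scheduler such as cron or a monitoring agent could call `is_healthy("http://HOSTNAME:8009/health")` at a fixed interval and raise an alert when it returns False.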

Servers behind a Load Balancer

The load balancer can make the health check requests at regular intervals. Servers which do not respond can be marked as unhealthy and taken out of service. Once the problem has been addressed and the servers are responding normally, they can be marked as healthy and return to service.
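
The mark-unhealthy and return-to-service behavior can be modeled as a small state tracker. The sketch below is illustrative only: real load balancers expose this as configuration, and the thresholds (3 consecutive failures, 2 consecutive successes) and the class name `ServerState` are assumptions, not HYPR or vendor defaults.

```python
from dataclasses import dataclass


@dataclass
class ServerState:
    """Tracks one backend the way a load balancer health checker might."""
    unhealthy_after: int = 3  # consecutive failures before removal (assumed)
    healthy_after: int = 2    # consecutive successes before return (assumed)
    in_service: bool = True
    _streak: int = 0          # current run of results that contradict the state

    def record(self, check_passed: bool) -> bool:
        """Record one health check result and return the in-service flag."""
        if check_passed == self.in_service:
            # The result agrees with the current state, so reset the streak.
            self._streak = 0
            return self.in_service
        self._streak += 1
        limit = self.healthy_after if check_passed else self.unhealthy_after
        if self._streak >= limit:
            # Enough contrary results in a row: flip the state.
            self.in_service = check_passed
            self._streak = 0
        return self.in_service
```

Requiring several consecutive results before changing state keeps a single dropped request from taking a server out of rotation.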

Server | Endpoint | Expected response
Control Center server | http://HOSTNAME:8009/health | HTTP 200 OK
Nginx web server | https://HOSTNAME/health | HTTP 200 OK

Since 3.8.0

The /health endpoints include a check of all the dependencies.
Redis, Vault, and DB connection liveness is checked via a test write.

Response code

When all dependencies are healthy, the API call returns a response code of 200. If any of the dependencies fail, a non-200 error code is returned.

Sample response for /health

{
  "status": "UP",
  "components": {
    "db": {
      "status": "UP",
      "components": {
        "eventDataSource": {
          "status": "UP",
          "details": {
            "database": "MySQL",
            "validationQuery": "isValid()"
          }
        },
        "fido2DataSource": {
          "status": "UP",
          "details": {
            "database": "MySQL",
            "validationQuery": "isValid()"
          }
        },
        "rpDataSource": {
          "status": "UP",
          "details": {
            "database": "MySQL",
            "validationQuery": "isValid()"
          }
        },
        "uafDataSource": {
          "status": "UP",
          "details": {
            "database": "MySQL",
            "validationQuery": "isValid()"
          }
        }
      }
    },
    "discoveryComposite": {
      "description": "Discovery Client not initialized",
      "status": "UNKNOWN",
      "components": {
        "discoveryClient": {
          "description": "Discovery Client not initialized",
          "status": "UNKNOWN"
        }
      }
    },
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 107361579008,
        "free": 19792601088,
        "threshold": 10485760,
        "exists": true
      }
    },
    "ping": {
      "status": "UP"
    },
    "reactiveDiscoveryClients": {
      "description": "Discovery Client not initialized",
      "status": "UNKNOWN",
      "components": {
        "Simple Reactive Discovery Client": {
          "description": "Discovery Client not initialized",
          "status": "UNKNOWN"
        }
      }
    },
    "redis": {
      "status": "UP",
      "details": {
        "version": "4.0.13"
      }
    },
    "refreshScope": {
      "status": "UP"
    },
    "vault": {
      "status": "UP",
      "details": {
        "version": "0.10.3"
      }
    },
    "vaultReactive": {
      "status": "UP",
      "details": {
        "version": "0.10.3"
      }
    }
  }
}
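
When a check fails, the nested "components" structure shows which dependency is responsible. The following sketch walks a parsed /health body and lists the failing components. It assumes that a failed component reports "DOWN" or "OUT_OF_SERVICE" (the sample above only shows "UP" and "UNKNOWN"); the function name and the failing payload are hypothetical.

```python
import json
from typing import List

# Statuses treated as failures (assumed). UNKNOWN is left out because the
# sample response reports UNKNOWN discovery clients on a healthy server.
FAILED = {"DOWN", "OUT_OF_SERVICE"}


def failed_components(health: dict, prefix: str = "") -> List[str]:
    """Return dotted paths of every nested component in a failed state."""
    failed = []
    for name, component in health.get("components", {}).items():
        path = f"{prefix}{name}"
        if component.get("status") in FAILED:
            failed.append(path)
        # Recurse into sub-components such as db.rpDataSource.
        failed.extend(failed_components(component, prefix=f"{path}."))
    return failed


# Hypothetical failing payload, for illustration only.
body = '{"status": "DOWN", "components": {"redis": {"status": "DOWN"}, "ping": {"status": "UP"}}}'
print(failed_components(json.loads(body)))  # prints ['redis']
```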

OS Level Monitoring

Resource | Threshold | Level | Notes
CPU | Sustained usage > 90% | Critical | Significant slowdown in server response time, which might cause timeouts.
Memory | Free memory < 10% | Critical | HYPR services and dependencies have a max memory limit defined. Variance is likely to come from cache utilization.
Disk usage | Free space < 10% | Critical | Logs can cause disks to fill up. Logs are zipped up at regular intervals. The zipped log files can be moved or deleted if needed.
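
As an illustration of the disk threshold in the table, the check below uses Python's standard `shutil.disk_usage`. The function names and the default path "/" are assumptions; CPU and memory checks are omitted because they typically rely on OS tooling or third-party libraries.

```python
import shutil


def free_fraction(total: int, free: int) -> float:
    """Fraction of capacity that is still free, between 0.0 and 1.0."""
    return free / total if total else 0.0


def disk_is_critical(path: str = "/", min_free: float = 0.10) -> bool:
    """True when the filesystem holding `path` has less than 10% free."""
    usage = shutil.disk_usage(path)
    return free_fraction(usage.total, usage.free) < min_free
```

For the diskSpace values in the sample /health response (total 107361579008, free 19792601088), `free_fraction` gives roughly 0.18, which is above the 10% threshold.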

HYPR Service Dependencies

These are systemd services which can be monitored:

Resource | Port check | Level | Notes | Systemd service check
Redis | Port monitoring on 6379 (server) and 26379 (sentinel) | Medium | HYPR setup can detect and survive any 1 Redis node failure. | hypr-redis
Vault | Port monitoring on 8200 | Medium | Non-critical process, only needed during app startup. | hypr-vault
MySQL DB | Port monitoring on 3306 | Critical | Although not generally necessary, you may use a MySQL-specific tool like MyTop to view more details on MySQL performance metrics. | This is typically hosted as an external service and should be monitored accordingly.
Nginx | Port monitoring on 443 | Critical | Nginx failure will stop HTTP requests being proxied to the HYPR process. | hypr-nginx
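
The port and systemd checks in the table can be scripted as shown in this sketch. The ports and unit names come from the table; the helper names, the 3-second timeout, and running the check on the host itself are assumptions. `systemctl is-active --quiet` exits with status 0 only when the unit is active.

```python
import socket
import subprocess


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """True when a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unit_active(unit: str) -> bool:
    """True when systemd reports the unit as active."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", "--quiet", unit], check=False
        )
    except FileNotFoundError:
        # systemctl is not installed on this host.
        return False
    return result.returncode == 0


# Unit names and ports as listed in the table above.
CHECKS = {"hypr-redis": [6379, 26379], "hypr-vault": [8200], "hypr-nginx": [443]}


def run_checks(host: str = "127.0.0.1") -> dict:
    """Return {unit: True/False}; True means the unit and all its ports are up."""
    return {
        unit: unit_active(unit) and all(port_open(host, p) for p in ports)
        for unit, ports in CHECKS.items()
    }
```

MySQL is left out of `CHECKS` because it is typically hosted externally; `port_open` can still be pointed at the database host on port 3306.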

Integration with a Third Party APM

Third-party monitoring agents may be attached to the Java process for further insight. The exact command line depends on the tool being used. Sample configurations are listed below.

DataDog

/opt/hypr/ServerInstaller-6.11.0/jre/jdk-11.0.10+9-jre/bin/java \
-javaagent:/opt/hypr/dd-java-agent.jar \
-Ddd.service.name=cc \
-Ddd.env=PROD1

AppDynamics

/opt/hypr/ServerInstaller-6.11.0/jre/jdk-11.0.10+9-jre/bin/java \
-javaagent:/opt/hypr/javaagent.jar \
-Dappdynamics.agent.tierName=hypr-cc \
-Dappdynamics.agent.nodeName=PROD1

Log monitoring

HYPR and third party logs are detailed here

To feed logs into external systems, use a log aggregator such as Fluentd or Logstash. This decouples the log handling and consumption from the core HYPR services.

You can also use Splunk to monitor and collect these logs.