HVR Alert Perl Job Fails to Complete for One Channel
Issue
In HVR 6, the alert Perl job, hvralert.pl, does not complete for a specific channel. The issue does not occur in HVR 5.
Disabling and then re-enabling the alert resolves the issue only temporarily.
Environment
- HVR 6
- OS: Amazon Linux 2023
Resolution
To troubleshoot this issue, do the following:
- Check whether any active channel trace variables are increasing the size of the channel log. Large logs can affect alert jobs, even when daily log rotation is enabled.
- Temporarily disable the problematic alert using hvralertconfig, then re-enable it. This clears out long-running or orphaned Perl processes.
- If the issue persists, manually stop any running hvralert.pl or hvrstatistics.pl processes for the affected channel.
- Use hvrmaintlogrotate or the HVR GUI to clean up the log and reduce the volume the alert job needs to parse. If hvrmaintlogrotate is unavailable, manually rename the affected hvr.out file and restart the hub or scheduler.
- Schedule alert jobs to avoid overlap with log rotations. If an alert job runs while logs are being rotated, required log files may be unavailable or archived, which can prevent the job from completing.
- After resetting the alert, monitor the job for several days.
- If the issue returns, run advanced debugging, such as strace on the job process, or escalate to R&D.
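Before stopping processes manually, it can help to list which alert or statistics processes have been running for an unusually long time. The following is a minimal sketch, not an HVR-supplied tool. It assumes a Linux host with the procps ps command, and it assumes that the channel name appears in the process command line, which you should verify against your own process list. The function names and the one-hour threshold are hypothetical.

```python
import subprocess

# Script names taken from the steps above.
SCRIPT_NAMES = ("hvralert.pl", "hvrstatistics.pl")


def parse_ps(ps_output):
    """Parse header-less ps output (pid, elapsed seconds, command) into tuples."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue  # skip blank or malformed lines
        rows.append((int(parts[0]), int(parts[1]), parts[2]))
    return rows


def find_stale(rows, channel, min_age_seconds=3600):
    """Return rows for alert/statistics scripts that mention the channel
    (assumption: the channel name is visible in the command line) and have
    been running for at least min_age_seconds."""
    return [
        (pid, age, cmd)
        for pid, age, cmd in rows
        if any(name in cmd for name in SCRIPT_NAMES)
        and channel in cmd
        and age >= min_age_seconds
    ]


def list_processes():
    """Run ps and return parsed rows. Separate -o options suppress the headers."""
    out = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "etimes=", "-o", "args="],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps(out)
```

For example, find_stale(list_processes(), "my_channel") returns the candidate processes, where my_channel is a placeholder for the affected channel name. The sketch only reports candidates. Review the list before stopping anything, because a long-running process is not necessarily orphaned.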
Cause
This issue occurs when the HVR log file becomes large because of extended channel uptime, active tracing, or insufficient log rotation. When hvralert.pl runs, it parses the log file. If the log is very large, the alert job may stall. If log rotation occurs while the alert job is running, the job may be interrupted, leaving orphaned Perl processes.
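The two contributing factors described above, a large log file and an alert run that coincides with log rotation, can be checked with a short script. This is a minimal sketch under stated assumptions: the log path is supplied by you (the location of hvr.out is not specified here), schedule times are expressed as minutes after midnight, and the function names are hypothetical.

```python
import os


def log_size_mb(path):
    """Return the size of a log file in mebibytes."""
    return os.path.getsize(path) / (1024 * 1024)


def windows_overlap(start_a, minutes_a, start_b, minutes_b):
    """Return True if two daily time windows overlap.

    Starts are minutes after midnight and durations are in minutes.
    Windows are treated as half-open intervals on a 24-hour circle,
    so a window that wraps past midnight is handled correctly.
    """
    day = 24 * 60
    return ((start_b - start_a) % day < minutes_a
            or (start_a - start_b) % day < minutes_b)
```

For example, windows_overlap(120, 30, 130, 5) checks an alert job that starts at 02:00 and runs for 30 minutes against a rotation that starts at 02:10 and takes 5 minutes, and returns True. Comparing log_size_mb against a threshold that suits your environment gives an early warning before the alert job starts to stall.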