Notifications for hung jobs



Post by Walt » Sat Aug 06, 2011 1:06 am

Over the past couple of years, I've run into several situations where jobs hang for some reason. The cause could be a pre- or post-job script, NTBackup hanging on a tape drive, etc., and you may not know about it for days.

We have extensive monitoring in place on our servers, but when a job runs and just gets stuck, there is no event in the event log to trigger on. It's easy to trigger on an event log item, but the absence of an event is much harder to trigger on.
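To illustrate the kind of check this requires: because a hung job writes nothing, a monitor has to compare the last known start time against a deadline instead of reacting to an event. Here is a minimal sketch in Python, assuming the start and finish times of a job can be obtained somehow. The function name and parameters are hypothetical and are not part of BA or any monitoring product.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_overdue(started_at: datetime,
               finished_at: Optional[datetime],
               now: datetime,
               max_expected: timedelta) -> bool:
    """Return True if the job started, has not finished since that start,
    and has been running longer than max_expected.

    All arguments are assumptions for illustration; a real monitor would
    have to read these timestamps from wherever the job records them.
    """
    # A finish time at or after the start means this run completed.
    if finished_at is not None and finished_at >= started_at:
        return False
    # No completion recorded: the only signal left is elapsed time.
    return now - started_at > max_expected
```

A check like this has to be polled on a schedule, which is exactly why it is more work than a simple event log trigger.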

Two new features would be VERY helpful. The first should be minor.

1) When a job is scheduled to run at a certain time and can't for whatever reason, it gets "queued," but no event is generated saying that it couldn't run at the specified time. I would like to request that BA generate an event stating this fact, probably with a new event ID and a message something like "Job XYZ scheduled to start at 9:00pm but couldn't due to prior job XYZ not yet completed. Job XYZ has been queued to start." Internally, BA knows it can't immediately start the job; I would just like this exposed in the event log.

2) Allow a maximum run time to be specified for a job, in minutes, with zero meaning infinite. Optionally terminate the job (and any subprocesses it started, such as scripts) if it exceeds the maximum run time. Generate a warning event if the terminate check box is not checked, and the normal error event if the user has chosen to terminate the job when the maximum run time is exceeded.

Number 2 is also very useful for rsync jobs that you may not want running during business hours. It is easy for an rsync job to run much longer than anticipated or desired and exceed its intended time window.
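The behavior requested in number 2 could look roughly like the following Python sketch. It is only an illustration of the semantics (zero means no limit, a warning when the job is allowed to continue, an error when it is terminated), not how BA works or would implement it. The function name, the status strings, and the parameters are all hypothetical. Note that this sketch kills only the directly launched process; reliably terminating subprocesses would need a platform-specific mechanism such as process groups on POSIX or job objects on Windows.

```python
import subprocess
import sys


def run_with_limit(cmd, max_minutes=0.0, terminate=False):
    """Run cmd and return (status, returncode).

    status is "ok" if the job finished within the limit, "warning" if it
    exceeded the limit but was allowed to finish, and "error" if it
    exceeded the limit and was killed. max_minutes == 0 means no limit.
    All names here are hypothetical, for illustration only.
    """
    timeout = max_minutes * 60 if max_minutes > 0 else None
    proc = subprocess.Popen(cmd)
    try:
        return "ok", proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        if terminate:
            # Kills only the direct child, not anything it spawned.
            proc.kill()
            proc.wait()
            return "error", proc.returncode
        # Terminate box unchecked: report the overrun, let the job finish.
        print(f"warning: job exceeded {max_minutes} minute(s)", file=sys.stderr)
        return "warning", proc.wait()
```

For example, `run_with_limit(["rsync", "-a", "src/", "dest/"], max_minutes=480, terminate=True)` would stop an rsync job after eight hours, where the paths shown are placeholders.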

Thanks!
Walt
