Heartbeat Monitors
What are Heartbeat Monitors?
Heartbeat monitors track the execution of recurring tasks, scheduled jobs, cron jobs, and automated processes. Unlike uptime monitors that actively check your services, heartbeat monitors wait to receive a "ping" from your scheduled tasks. If a task fails to report in within the expected timeframe (the run interval plus any grace period), you'll be alerted.
How Heartbeat Monitoring Works
The concept is simple:
- You create a heartbeat monitor and receive a unique webhook URL
- Your scheduled task/cron job calls this URL when it starts or completes
- The system expects to receive this "heartbeat" within your defined schedule
- If the heartbeat is missed, you're alerted that something went wrong
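The check behind these steps can be sketched in a few lines of Python. This is an illustration only, not the service's actual implementation: the function name `is_overdue` and its parameters are hypothetical, and it assumes the deadline is the time of the last ping plus the run interval plus the grace period.

```python
from datetime import datetime, timedelta, timezone


def is_overdue(last_ping: datetime, interval: timedelta,
               grace: timedelta, now: datetime) -> bool:
    """Return True when no ping arrived before the deadline.

    Assumption: deadline = last_ping + interval + grace.
    """
    deadline = last_ping + interval + grace
    return now > deadline


# Example: a daily job with a 3-hour grace period, last pinged at 02:00 UTC.
last = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
daily, grace = timedelta(days=1), timedelta(hours=3)

# Deadline is 05:00 UTC the next day.
print(is_overdue(last, daily, grace, datetime(2024, 1, 2, 4, 0, tzinfo=timezone.utc)))  # False
print(is_overdue(last, daily, grace, datetime(2024, 1, 2, 6, 0, tzinfo=timezone.utc)))  # True
```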
How to Access Heartbeat Monitors
You can access your heartbeat monitors through:
- The main dashboard Heartbeats section
- Sidebar navigation menu → Heartbeats
- Direct URL: /heartbeats
Common Use Cases
Backup Jobs
- Daily database backups
- File system backups
- Cloud synchronization tasks
Data Processing
- Report generation
- Data imports/exports
- ETL (Extract, Transform, Load) processes
- Analytics calculations
Maintenance Tasks
- Log file cleanup
- Cache clearing
- Certificate renewals
- System updates
Monitoring Tasks
- Health checks
- Security scans
- Performance audits
- Compliance checks
Creating a Heartbeat Monitor
Step 1: Navigate to Heartbeat Creation
- Go to your Heartbeats page
- Click the "Create Heartbeat" button
- You'll see the heartbeat creation form
Step 2: Basic Settings
Name and Description
- Name: Choose a descriptive name (e.g., "Daily Database Backup", "Weekly Report Generation")
- Description: Optional details about what this heartbeat monitors
Expected Run Interval
How often should your task run?
- Every minute: For frequently running tasks
- Hourly: Tasks that run every hour
- Daily: Daily scheduled jobs
- Weekly: Weekly maintenance tasks
- Monthly: Monthly reports or cleanup
- Custom: Define your own interval
Grace Period
Additional time to wait before considering the heartbeat as missed:
- Why needed: Accounts for normal execution time variations
- Recommended: 10-20% of your run interval
- Example: For a daily job, use a grace period of roughly 2.5 to 5 hours
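To apply the 10-20% guideline to any interval, a small helper can do the arithmetic. The function name `recommended_grace` is hypothetical and not part of the product. It returns the lower and upper bound of the suggested range.

```python
from datetime import timedelta


def recommended_grace(interval: timedelta) -> tuple[timedelta, timedelta]:
    """Return (low, high) grace periods as 10% and 20% of the run interval."""
    return interval / 10, interval / 5


low, high = recommended_grace(timedelta(days=1))
print(low, high)  # 2:24:00 4:48:00

low, high = recommended_grace(timedelta(hours=1))
print(low, high)  # 0:06:00 0:12:00
```

Treat the result as a starting point. A task whose runtime varies widely may need more than the upper bound.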
Step 3: Advanced Settings
Timeout Settings
- Request Timeout: Maximum time to wait for the heartbeat ping (default: 30 seconds)
- Use case: Adjust if your task takes longer to reach the endpoint
Project Assignment
- Assign to a project for better organization
- Group related heartbeats together
- Useful for client or application-specific monitoring
Step 4: Notification Settings
Configure who gets notified when heartbeats are missed:
- Select from your configured notification handlers
- Choose multiple notification methods for redundancy
- Consider different notifications for different types of failures
Step 5: Email Reports (Optional)
- Enable: Receive regular email reports about heartbeat performance
- Frequency: Daily, weekly, or monthly summaries
- Content: Success rates, missed heartbeats, and trends
Implementing Heartbeat Pings
Getting Your Webhook URL
- After creating a heartbeat, click on its name
- Copy the unique webhook URL provided
- This URL is what your task will call
Integration Examples
Bash/Shell Scripts
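#!/bin/bash
# Your backup script
echo "Starting backup..."
# Ping the heartbeat only if rsync exits successfully
if rsync -av /data/ /backup/; then
    # -f: fail on HTTP errors, -sS: quiet but show errors, -m 30: 30-second limit
    curl -fsS -m 30 "https://yourdomain.com/heartbeat/your-unique-code" \
        && echo "Backup completed and heartbeat sent"
else
    echo "Backup failed; heartbeat not sent" >&2
    exit 1
fi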
Python Scripts
import requests


def send_heartbeat():
    """Ping the heartbeat URL; a failed ping is reported but never raised."""
    try:
        url = "https://yourdomain.com/heartbeat/your-unique-code"
        response = requests.get(url, timeout=30)
        response.raise_for_status()  # treat HTTP 4xx/5xx as a failed ping
        print(f"Heartbeat sent: {response.status_code}")
    except requests.RequestException as e:
        print(f"Failed to send heartbeat: {e}")


# Your task logic here
try:
    # Your code here
    print("Task completed successfully")
    send_heartbeat()  # Send heartbeat on success
except Exception as e:
    print(f"Task failed: {e}")
    # Optionally, you might want to send heartbeat even on failure
    # depending on whether you want to track execution vs success
Cron Job Integration
# Daily backup at 2 AM
0 2 * * * /path/to/backup-script.sh && curl -X GET "https://yourdomain.com/heartbeat/your-unique-code"
# Weekly report on Sundays at 6 AM
0 6 * * 0 /path/to/weekly-report.py && curl -X GET "https://yourdomain.com/heartbeat/your-unique-code"
PowerShell (Windows)
# Your PowerShell task
Write-Host "Starting scheduled task..."
try {
    # Your task logic here
    Write-Host "Task completed successfully"
    # Send heartbeat
    $url = "https://yourdomain.com/heartbeat/your-unique-code"
    Invoke-WebRequest -Uri $url -Method GET -TimeoutSec 30
    Write-Host "Heartbeat sent successfully"
}
catch {
    Write-Host "Error: $_"
}
When to Send Heartbeats
Start of Execution
- Use case: Verify that your task started
- Benefit: Detect if the task never begins
- Example: Send ping when backup script starts
End of Execution
- Use case: Verify that your task completed successfully
- Benefit: Detect if the task crashes or fails
- Example: Send ping only after backup completes
Multiple Checkpoints
- Use case: Track progress through long-running tasks
- Implementation: Create multiple heartbeats for different stages
- Example: Separate heartbeats for "backup started", "backup completed", "verification done"
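One way to wire up start and completion checkpoints is a small context manager. This is a sketch under stated assumptions: `checkpoints` is a hypothetical helper, the two URLs are placeholders for two separately created heartbeats, and `ping` is whatever function you use to call a webhook URL. The example passes a recording function so it runs without network access.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def checkpoints(ping: Callable[[str], None],
                start_url: str, done_url: str) -> Iterator[None]:
    """Ping start_url on entry and done_url only if the body does not raise."""
    ping(start_url)
    yield
    # Reached only when the with-block finished without an exception.
    ping(done_url)


# Demonstration with a recording ping instead of a real HTTP call.
sent: list[str] = []
with checkpoints(sent.append,
                 "https://yourdomain.com/heartbeat/backup-started-code",
                 "https://yourdomain.com/heartbeat/backup-completed-code"):
    pass  # run the backup here

print(sent)
```

If the body raises, only the "started" heartbeat is pinged. The "completed" heartbeat then goes overdue and triggers the alert.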
Understanding Heartbeat Status
Status Indicators
- 🟢 Up: Heartbeat received within expected timeframe
- 🔴 Down: Heartbeat missed (not received when expected)
- ⚪ Paused: Heartbeat monitoring is temporarily disabled
- 🟡 Waiting: Waiting for the first heartbeat or next expected ping
Key Metrics
- Last Run: When the last heartbeat was received
- Expected Next Run: When the next heartbeat is expected
- Success Rate: Percentage of expected heartbeats received
- Total Runs: Number of heartbeats received
- Missed Runs: Number of expected heartbeats that were missed
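As a rough illustration of how these metrics relate, the success rate can be derived from the two run counters. The sketch assumes that the number of expected heartbeats equals received runs plus missed runs; the dashboard may compute it differently. The function name is hypothetical.

```python
def success_rate(total_runs: int, missed_runs: int) -> float:
    """Percentage of expected heartbeats that were received.

    Assumption: expected heartbeats = total_runs (received) + missed_runs.
    """
    expected = total_runs + missed_runs
    if expected == 0:
        return 0.0  # nothing expected yet, e.g. a heartbeat still "Waiting"
    return 100.0 * total_runs / expected


print(success_rate(total_runs=9, missed_runs=1))  # 90.0
```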
Managing Heartbeat Monitors
Viewing Heartbeat Details
- Click on any heartbeat name from your list
- View execution history and timing
- See success/failure patterns
- Access the webhook URL
- Review notification settings
Editing Heartbeats
- Go to the heartbeat details page
- Click the "Edit" button
- Modify settings as needed
- Save your changes
Note: The webhook URL remains the same when editing settings.
Pausing Heartbeats
- When to pause: During maintenance, known downtime, or schedule changes
- How to pause: Use the pause button on the heartbeat details page
- Resuming: Click resume when you're ready to start monitoring again
Testing Heartbeats
- Use the webhook URL in a browser or curl command
- Check that the "Last Run" time updates
- Verify that the status shows as "Up"
- Test notification delivery by waiting for a missed heartbeat
Heartbeat Logs and History
Accessing Logs
- Click on a heartbeat from your list
- Navigate to the "Logs" section
- View execution history and patterns
Log Information
Each log entry shows:
- Timestamp: When the heartbeat was received
- Status: Success or missed
- Response Time: How quickly the ping was processed
- Source: IP address that sent the heartbeat
- User Agent: Information about the calling application
What to Expect
First Setup
- New heartbeats show as "Waiting" until the first ping is received
- Implement the webhook call in your task
- Test manually to ensure everything works
- Wait for your scheduled task to run automatically
Normal Operation
- Heartbeats change from "Waiting" to "Up" when pinged on schedule
- Status remains "Up" as long as heartbeats are received on time
- Missed heartbeats trigger "Down" status and notifications
Notifications
- Missed Heartbeat: Sent when a heartbeat is overdue
- Heartbeat Resumed: Sent when a heartbeat comes back after being missed
- Timing: Alerts sent immediately after grace period expires
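The two notification types correspond to status transitions, which a short sketch can make explicit. The function `notification_for` is hypothetical. The status and notification names are the ones used in this guide, and the mapping is an assumption about how they connect.

```python
from typing import Optional


def notification_for(previous: str, current: str) -> Optional[str]:
    """Map a status transition to the notification it would trigger, if any."""
    if previous != "Down" and current == "Down":
        return "Missed Heartbeat"   # heartbeat became overdue
    if previous == "Down" and current == "Up":
        return "Heartbeat Resumed"  # ping arrived after a miss
    return None                     # no status change worth alerting on


print(notification_for("Up", "Down"))  # Missed Heartbeat
print(notification_for("Down", "Up"))  # Heartbeat Resumed
print(notification_for("Up", "Up"))    # None
```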
Common Issues and Troubleshooting
Heartbeat Not Received
- Check the URL: Ensure you're using the exact webhook URL provided
- Network connectivity: Verify your server can reach the internet
- Firewall issues: Check if outbound HTTPS requests are blocked
- Script errors: Ensure your task is actually executing the curl/HTTP request
False Alerts
- Insufficient grace period: Increase grace time if your task runtime varies
- Schedule changes: Update heartbeat timing if your cron schedule changed
- Server maintenance: Pause heartbeats during planned downtime
Tasks Running But No Heartbeat
- Conditional execution: Heartbeat call might be in an unexecuted code path
- Error handling: Task might fail before reaching the heartbeat call
- Network timeouts: HTTP request might be timing out
- Wrong conditions: Heartbeat only sent on success, but you want to track execution
Multiple False Alerts
- Timezone issues: Check if your server timezone matches expectations
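- Daylight saving time: Jobs scheduled in local time can shift by an hour, be skipped, or run twice when DST changes; schedule in UTC or widen the grace period
- Leap years/months: Monthly tasks scheduled on the 29th to 31st do not run in every month, so the gap between runs can exceed one month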
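The month-length issue is easy to check before choosing a schedule. The helper below is a hypothetical name that uses only the standard library. It lists the months of a given year that contain a particular day of the month, which shows when a monthly job pinned to that day would actually run.

```python
import calendar


def months_with_day(year: int, day: int) -> list[int]:
    """Return the month numbers (1-12) of `year` that contain `day`."""
    return [m for m in range(1, 13) if calendar.monthrange(year, m)[1] >= day]


print(months_with_day(2023, 31))  # [1, 3, 5, 7, 8, 10, 12]
print(months_with_day(2023, 29))  # [1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print(months_with_day(2024, 29))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```

A job scheduled for the 31st skips five months of the year, so a heartbeat with a monthly interval would report it as missed.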
Best Practices
Heartbeat Design
- Descriptive names: Use clear, specific names for each heartbeat
- Appropriate grace periods: Set realistic grace times based on task duration
- Single responsibility: One heartbeat per distinct task or process
- Consider dependencies: If Task B depends on Task A, monitor both
Implementation
- Error handling: Always wrap heartbeat calls in try-catch blocks so that a failed ping cannot crash the task itself
- Timeout settings: Set reasonable timeouts for HTTP requests
- Success criteria: Clearly define when to send heartbeats (start vs. completion)
- Logging: Log heartbeat sending in your application logs
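These implementation points can be combined into one reusable helper. This is a minimal sketch that uses only the Python standard library. The name `send_heartbeat`, the retry count, the backoff, and the injectable `get` parameter (useful for testing) are all illustrative choices, not something the service prescribes.

```python
import logging
import time
import urllib.request
from typing import Callable

logger = logging.getLogger("heartbeat")


def _http_get(url: str, timeout: float) -> int:
    """Perform a GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def send_heartbeat(url: str, retries: int = 3, timeout: float = 10.0,
                   backoff: float = 2.0,
                   get: Callable[[str, float], int] = _http_get) -> bool:
    """Ping `url`, retrying on failure. Never raises; returns True on success."""
    for attempt in range(1, retries + 1):
        try:
            status = get(url, timeout)
            if 200 <= status < 300:
                logger.info("Heartbeat sent (attempt %d, HTTP %d)", attempt, status)
                return True
            logger.warning("Heartbeat got HTTP %d (attempt %d)", status, attempt)
        except Exception as exc:  # a failed ping must not crash the task
            logger.warning("Heartbeat failed (attempt %d): %s", attempt, exc)
        if attempt < retries:
            time.sleep(backoff * attempt)  # wait a little longer each time
    return False
```

Call `send_heartbeat("https://yourdomain.com/heartbeat/your-unique-code")` at the point in your task that matches your success criteria. The boolean return value lets the task log a persistent failure without interrupting its own work.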
Monitoring Strategy
- Critical tasks first: Start with your most important scheduled jobs
- Gradual rollout: Add heartbeats to tasks incrementally
- Regular review: Periodically check that heartbeats are still relevant
- Documentation: Keep track of what each heartbeat monitors
Advanced Tips
Heartbeat Chaining
For complex workflows, consider multiple heartbeats:
- Step 1: "Data extraction started"
- Step 2: "Data processing completed"
- Step 3: "Report generated and sent"
Conditional Heartbeats
Different heartbeat behavior based on conditions:
- Success only: Send heartbeat only when task completes successfully
- Execution tracking: Send heartbeat regardless of success/failure
- Mixed approach: Different heartbeats for different outcomes
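The three approaches differ only in which URL, if any, is called once the task finishes. The sketch below expresses that decision as a pure function. The mode names, the function name `url_to_ping`, and the idea of a separate failure heartbeat URL are illustrative assumptions.

```python
from typing import Optional


def url_to_ping(mode: str, succeeded: bool, success_url: str,
                failure_url: Optional[str] = None) -> Optional[str]:
    """Pick the heartbeat URL to call for a finished task, or None for no ping.

    mode is one of "success_only", "execution", or "mixed" (hypothetical names).
    """
    if mode == "success_only":
        return success_url if succeeded else None
    if mode == "execution":
        return success_url  # ping regardless of the outcome
    if mode == "mixed":
        return success_url if succeeded else failure_url
    raise ValueError(f"unknown mode: {mode}")


ok = "https://yourdomain.com/heartbeat/success-code"
failed = "https://yourdomain.com/heartbeat/failure-code"
print(url_to_ping("success_only", False, ok))   # None
print(url_to_ping("execution", False, ok))      # the success URL
print(url_to_ping("mixed", False, ok, failed))  # the failure URL
```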
Integration with Monitoring
- Combine with uptime monitors: Monitor both service availability and scheduled tasks
- Status page integration: Include critical heartbeats on your status page
- Incident correlation: Link heartbeat failures to service incidents
Security Considerations
- URL protection: Keep webhook URLs secure and don't expose them publicly
- Network security: Ensure heartbeat calls go over HTTPS
- Access control: Limit access to heartbeat configuration
- Log monitoring: Monitor for unusual heartbeat patterns or sources
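For the log monitoring point, one simple check is to compare the source addresses recorded in the heartbeat logs against the servers you expect to send pings. The sketch assumes you have already collected the "Source" values into a list. How you export them is not covered here, and the function name is hypothetical. The addresses come from reserved documentation ranges.

```python
from collections import Counter
from typing import Iterable


def unexpected_sources(log_sources: Iterable[str],
                       allowed: set[str]) -> dict[str, int]:
    """Count pings per source IP that is not in the allowed set."""
    return dict(Counter(ip for ip in log_sources if ip not in allowed))


sources = ["203.0.113.10", "203.0.113.10", "198.51.100.7", "203.0.113.10"]
print(unexpected_sources(sources, allowed={"203.0.113.10"}))  # {'198.51.100.7': 1}
```

A non-empty result suggests that the webhook URL is known to another host and may need to be replaced.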