Rob Ewaschuk wrote an interesting booklet about his observations at Google: My Philosophy on Alerting. It is well worth reading!
A short summary:
- keep alerting simple
- alert on symptoms (‘monitor for your users’)
The booklet formed the basis for the article Monitoring Distributed Systems: https://www.oreilly.com/ideas/monitoring-distributed-systems
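
To make the "alert on symptoms" point from the summary a bit more concrete, here is a minimal sketch in Python. It is not taken from the booklet: the `Window` type, the `should_page` function, and the 1% threshold are all hypothetical names and values chosen for illustration. The idea is to page on what users experience (failed requests) rather than on a possible cause such as high CPU load.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Request counts for one evaluation window (hypothetical shape)."""
    total_requests: int
    failed_requests: int


def should_page(window: Window,
                max_error_ratio: float = 0.01,
                min_requests: int = 100) -> bool:
    """Return True when the user-visible error ratio exceeds the threshold.

    The default threshold and minimum traffic are illustrative assumptions,
    not recommendations from the booklet.
    """
    if window.total_requests < min_requests:
        # Too little traffic to judge; avoids paging on a single failure.
        return False
    return window.failed_requests / window.total_requests > max_error_ratio


if __name__ == "__main__":
    print(should_page(Window(total_requests=1000, failed_requests=50)))
    print(should_page(Window(total_requests=1000, failed_requests=5)))
```

A single rule like this also fits the "keep alerting simple" point: one condition tied to user impact, instead of many rules that each guess at a cause.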