
Not everyone is "pretending that the problem doesn't actually exist." Every organization has limited resources. Often a proper fix costs more than a band-aid, sometimes even in the long run.

Here's a hypothetical, but not ridiculous, scenario. Which is a better way to spend resources? Track down and stamp out an extremely difficult resource leak? Or simply bounce the server? The costs associated with the former may be HUGE. The costs associated with the latter may be small in comparison (cost of potential downtime, minimal labor cost of bouncing the server(s), and the cost of reduced reliability for users).

I wouldn't always chalk it up to whatever it is that your comment implies (laziness? willful ignorance?).



Yeah I'm absolutely implying something by my comment. I understand the time:value relationship and can expect some degree of RCA to go away if rebooting a server solves a problem for X amount of time, where X is less than a few days. But if you're spending the time opening an alarm, calling people to a bridge, agreeing to bounce a server, and then bouncing it, at some point that equation comes out in favor of actually doing some sysadmin/RCA and making it so you're not bouncing a server every 48/72 hours.
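To put rough numbers on that equation, here's a minimal Python sketch. Every figure in it is made up for illustration (the 120-hour fix estimate, four people on the bridge, half an hour per bounce), and `break_even_weeks` is just a hypothetical helper name. It only counts labor and ignores downtime and user impact, so treat it as a lower bound on what the bounces cost:

```python
HOURS_PER_WEEK = 168


def break_even_weeks(fix_hours, bounce_interval_hours, people_per_bounce, hours_per_bounce):
    """Weeks until cumulative bounce labor equals the one-time cost of a real fix.

    All inputs are rough estimates in person-hours or hours; labor only.
    """
    if bounce_interval_hours <= 0:
        raise ValueError("bounce_interval_hours must be positive")
    bounces_per_week = HOURS_PER_WEEK / bounce_interval_hours
    weekly_bounce_hours = bounces_per_week * people_per_bounce * hours_per_bounce
    if weekly_bounce_hours <= 0:
        raise ValueError("bouncing must cost some labor for a break-even to exist")
    return fix_hours / weekly_bounce_hours


# Hypothetical numbers: a 120 person-hour RCA and fix vs. a bounce every 48 hours
# that pulls 4 people onto a bridge for half an hour each.
weeks = break_even_weeks(fix_hours=120, bounce_interval_hours=48,
                         people_per_bounce=4, hours_per_bounce=0.5)
print(f"Fix pays for itself after about {weeks:.0f} weeks")
```

With those made-up inputs the fix pays for itself in roughly four months. Stretch the bounce interval to once a month and the same fix takes years to pay off, which is the other side's point.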

It also looks really JV to see those alerts day in/day out. Like, come on. We can't just fix the problem?


I really doubt anybody would accept that. The solution in many cases is simply to reboot the servers every night.

Sure, we would all love to fix the bugs, but if there are no easy clues and a regular reboot fixes it - who really cares when there are features to build?



