Sepherosa Ziehau has updated the ‘ecc’ device for Intel E3-1200 series systems. What’s it do? It will report on memory errors, and potentially fix them.
You should have ECC memory in your server already. If not, you oughta.
Update: as Sascha Wildner pointed out, ecc(4) already existed, but didn’t support Intel controllers. Also, the Xeon X3400 series is supported now too.