Blade 3 nodes crash when control board crashes

Like several other users, I have the problem that the control board of the Cluster Box sometimes crashes. When I'm logged in via ssh, I get an error message like this:

client_loop: send disconnect: Broken pipe

On the control board, I get a whole bunch of the messages I have written to you about several times:

[  292.012003] miop 0000:03:00.0: DMA timeout, restart DMA controller.

I also always get this message when I'm connected to the Blade 3 nodes and the control board crashes. When I can log into the control board and the cluster nodes again after a couple of minutes, uptime reports only a few minutes, both on the control board and on all four nodes:

mixtile@ClusterBox:~$ uptime
 17:11:05 up 7 min,  load average: 5.91, 7.30, 3.59
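
To compare the boot times more precisely than by reading uptime by eye, something like the following could be used. This is only a sketch: the helper names boot_epoch and same_reboot are made up for illustration, and the 60-second tolerance is an arbitrary assumption, not a value from Mixtile.

```shell
#!/bin/sh
# boot_epoch NOW UPTIME: print the boot time as seconds since the epoch.
# NOW is the output of `date +%s`; UPTIME is the first field of
# /proc/uptime (for example 420.53).
boot_epoch() {
    now=$1
    up=${2%%.*}              # drop the fractional part of the uptime
    echo $(( now - up ))
}

# same_reboot A B [TOLERANCE]: succeed if the two boot times A and B
# differ by at most TOLERANCE seconds (default 60, an assumed value).
same_reboot() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    [ "$diff" -le "${3:-60}" ]
}
```

On each board, boot_epoch "$(date +%s)" "$(cut -d' ' -f1 /proc/uptime)" prints that board's boot time. If same_reboot succeeds for the control board paired with every node, they all came back up at about the same moment, provided their clocks are in sync.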

This suggests that a crashing control board takes the nodes down with it, probably by cutting off their power. Is that true? If so, please tell me what I can do to make sure that such a power cut won't occur. Last time it happened while I was building a Ceph cluster, and it left me with an unfinished installation. Also: is the data exchanged between nodes (which can amount to several terabytes) forwarded to the control board?

Thank you.