Random connectivity issues with OVH Proxmox node
Has anyone been experiencing random connectivity issues with a Proxmox VE 6 node on OVH/SYS (or anywhere, for that matter)?
I have one server on SoYouStart with Proxmox 6, and at seemingly random times I cannot ssh to the machine or open its GUI (no server response). At the same time I can access all the VMs on it without any issue.
Also, if I have an ssh session open to the node when the issue starts, that session keeps working as if nothing had happened, but if I disconnect, I cannot ssh back in.
I don't have a firewall active on the node. All the services keep working normally all the time, logs are not showing any issue (during the issue no connection attempt is being logged anywhere). SSH and 8006 (for the GUI) ports are open and listening all the time.
I have no clue where and what to look for.
Any idea appreciated.
Proxmox itself was installed from the SYS-provided image and is as standard an install as possible: no ZFS, no Ceph, no cluster. I have 4 HDDs in a RAID 10 configuration (software RAID, an option selected during the Proxmox installation). The first thing I did after installation was register an ACME account and obtain a Let's Encrypt certificate for the node's domain. No LXC containers, just KVM. I can't remember now whether the issue started before I created the very first VM.
Comments
One more thing: restarting the server usually brings back the connectivity, but not always... sometimes it gets back 5-20 minutes after the restart.
This actually made me abandon the idea of an external network cause. Or maybe it is a network (router/switch/gateway) issue that gets temporarily fixed when the server boots and starts registering itself on the network. I'm not even sure how to ask their support to check.
@Not_Oles has quite a bit of experience with Proxmox there.
Thanks for your kind words! However, I'm just a Proxmox noob.
Tom. 穆坦然. Not Oles. Happy New York City guy visiting Mexico! How is your 文言文?
The MetalVPS.com website runs very speedily on MicroLXC.net! Thanks to @Neoon!
You are using their default image and this happens? Double-check that the networking is all correct, and consider contacting support (another machine may be competing for the same IP).
One thing you could try is putting the server into Rescue mode and checking to see whether the issues persist.
Running ssh -v when the issue is happening might tell you something about where in the connection process the failure occurs.
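As a rough complement to ssh -v, the same question (how far does a new connection get?) can be asked from any outside machine with a few lines of Python. This is only a sketch: the function name, the stage labels, and the host name in the usage comment are made up for illustration. It relies on the fact that an SSH server sends its "SSH-2.0-..." identification line right after the TCP connection opens, so a timeout during connect points at the network path or packet filtering, while a timeout while waiting for the banner points at sshd or the host itself.

```python
import socket


def check_tcp_service(host, port, timeout=5.0, read_banner=False):
    """Return (stage, detail) describing how far a connection attempt got.

    Stage labels are illustrative, not any standard:
    dns_failed, connect_timeout, connect_refused, connect_error,
    connected, banner_timeout, closed_before_banner, banner_ok.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return ("dns_failed", str(exc))

    family, socktype, proto, _canonname, sockaddr = infos[0]
    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        try:
            sock.connect(sockaddr)
        except socket.timeout:
            # Nothing came back at all: packets are probably being dropped.
            return ("connect_timeout", "no answer to the connection attempt")
        except ConnectionRefusedError:
            # The host answered, but nothing accepts connections on this port.
            return ("connect_refused", "host reachable, port closed")
        except OSError as exc:
            return ("connect_error", str(exc))

        if not read_banner:
            return ("connected", "")

        try:
            banner = sock.recv(256)
        except socket.timeout:
            return ("banner_timeout", "TCP connected, but no greeting arrived")
        except OSError as exc:
            return ("connect_error", str(exc))
        if not banner:
            return ("closed_before_banner", "peer closed the connection")
        return ("banner_ok", banner.decode("ascii", errors="replace").strip())
    finally:
        sock.close()


# Example usage (node.example.net is a placeholder, not the real host):
#   print(check_tcp_service("node.example.net", 22, read_banner=True))
#   print(check_tcp_service("node.example.net", 8006))
```

Port 8006 is checked without read_banner because an HTTPS server waits for the client to speak first. Running both checks during a hang and again when everything works gives two results that can be compared directly.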
Running traceroutes to the node and to the VMs when everything is working and comparing with traceroutes when the issue is active might tell you something.
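If the traceroute outputs are saved to text, the comparison can be done mechanically. The sketch below assumes the common Linux traceroute layout (a hop number at the start of each line, then addresses or "* * *") and only looks at IPv4 addresses; the function names are invented for this example.

```python
import re

# A hop line starts with the hop number; the header line does not.
_HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
_IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def parse_hops(traceroute_output):
    """Map hop number -> first IPv4 address on that line (None if all probes timed out)."""
    hops = {}
    for line in traceroute_output.splitlines():
        match = _HOP_LINE.match(line)
        if not match:
            continue  # header or blank line
        addr = _IPV4.search(match.group(2))
        hops[int(match.group(1))] = addr.group(0) if addr else None
    return hops


def first_divergence(good_output, bad_output):
    """Return (hop, address_when_working, address_when_failing) or None if identical."""
    good = parse_hops(good_output)
    bad = parse_hops(bad_output)
    for hop in sorted(set(good) | set(bad)):
        if good.get(hop) != bad.get(hop):
            return (hop, good.get(hop), bad.get(hop))
    return None
```

A difference is only a hint, since routes can change for harmless reasons such as load balancing. A path to the node that stops at the same hop every time the issue is active, while the path to the VMs stays complete, would be concrete evidence to hand to support.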
Check http://travaux.ovh.net/ for anything relevant.
Also, please do talk with OVH support. The perfect way to ensure unhelpful support is not even to tell them what's happening.
I'm sure other folks will have more and better ideas.
Good luck and best wishes from sunny Sonora!
Thanks @Not_Oles and all!
Funny thing, I never tried either "ssh -v" or traceroute. I'll do that next time this happens. I usually see it whenever I try to ssh or open the GUI after a few hours of inactivity.
I just came home and saw I can ssh instantly, no hangs. But I did something this morning...
I was thinking that this might happen after some time of connection inactivity, like some sort of power-saving mode. So I've put up a Cassy server, serving a single page with the current time on it and nothing else. Since I did that, this is the first time ssh has opened instantly after a few hours of absence. I'll keep observing. If the hangs stop after this, I hope it will be easier to find the cause.
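For anyone who wants to try the same experiment, a single page showing the current time takes very little code. The sketch below uses Python's standard http.server rather than the server mentioned above, and it is not the actual setup from this thread: the class name, the helper, and port 8080 are arbitrary choices for illustration. Why having such a server in place would coincide with the hangs stopping is not established here, so treat it as a way to reproduce the observation, not as a fix.

```python
from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler, HTTPServer


class TimeHandler(BaseHTTPRequestHandler):
    """Answer every GET with the current UTC time as plain text."""

    def do_GET(self):
        body = datetime.now(timezone.utc).isoformat().encode("ascii") + b"\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress the default per-request logging to stderr.
        pass


def make_server(host="0.0.0.0", port=8080):
    """Build the server; port 8080 is an arbitrary example value."""
    return HTTPServer((host, port), TimeHandler)


# Example usage:
#   make_server().serve_forever()
```

Fetching the page from outside on a schedule, and noting when requests start failing, would also show whether inactivity is really the trigger.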
Thanks for the tips for support. I'll definitely get in touch if this keeps happening.
Just wanted to report back that since I've put a web server, the hangs have stopped completely.
Although this is an interesting quick fix, I have decided to reinstall the entire server. I can't put much value on a fix when I don't know or understand what was actually happening.
Thanks again everyone that responded to this thread. I wrote about my problem on several places and this was the only place where people actually gave any feedback. I am really grateful.
It's really nice here!