One of our server providers was running a deal on servers equipped with the newly released AMD Ryzen CPUs. We got a couple to test them for future placements.
Right off the bat we started experiencing issues with them: the shell sessions would randomly hang or freeze. At first we thought this was a network issue and ignored it. It wasn’t until one of the servers crashed and wouldn’t respond to reboot commands that we became suspicious. We had to contact our data center to physically send someone to the machine to investigate why it wasn’t even responding to hard reset requests. They reported back that the server had no display output and was completely unresponsive to keyboard input; they had to power it off manually and then turn it back on.
We suspected right from the start that this might be a kernel issue, as the CPU architecture was practically brand new. It turns out we were right: a number of people had experienced similar issues, and the solution was to install kernel version 4.12 or above.
Note: Make sure you back up everything you have on this server before proceeding with updating the kernel.
Start off by verifying that you are indeed on an old kernel version:
uname -r
The result should be something similar to the following:
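The exact string varies from build to build; what matters is whether it is below 4.12. A minimal sketch of that comparison, assuming GNU coreutils `sort -V` is available (the function name `kernel_at_least` is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: succeeds when kernel version $1 is at least version $2.
# Assumes `sort -V` (version sort) from GNU coreutils.
kernel_at_least() {
    current="${1%%-*}"   # keep only the part before the first hyphen
    required="$2"
    # Version-sort both values; if the required one sorts first (or they
    # are equal), the current kernel is new enough.
    lowest=$(printf '%s\n%s\n' "$current" "$required" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}

if kernel_at_least "$(uname -r)" 4.12; then
    echo "Kernel is already 4.12 or newer"
else
    echo "Kernel is older than 4.12, upgrade needed"
fi
```

If the second message is printed, carry on with the steps below.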
Add the ELRepo repository to your server:
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
Install the latest kernel by issuing the following:
yum --enablerepo=elrepo-kernel install kernel-ml
You now need to edit the GRUB defaults file at /etc/default/grub, which should look similar to this:
GRUB_TIMEOUT=5
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="rd.lvm.lv=centos/root rd.lvm.lv=centos/swap crashkernel=auto rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
Change the GRUB_DEFAULT=saved line to GRUB_DEFAULT=0 so that the first menu entry is booted by default.
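If you would rather not open an editor, the same change can be scripted. This is only a sketch, assuming GNU sed; the function name `set_grub_default_zero` is hypothetical, and it keeps a .bak copy of the file in case something goes wrong:

```shell
#!/bin/sh
# Hypothetical helper: rewrites the GRUB_DEFAULT line to GRUB_DEFAULT=0.
# Defaults to /etc/default/grub; pass another path to try it on a copy first.
set_grub_default_zero() {
    grub_file="${1:-/etc/default/grub}"
    # -i.bak edits in place and saves the original as <file>.bak (GNU sed)
    sed -i.bak 's/^GRUB_DEFAULT=.*/GRUB_DEFAULT=0/' "$grub_file"
}

# Usage (as root):
#   set_grub_default_zero
#   grep '^GRUB_DEFAULT=' /etc/default/grub
```

The grep at the end is just a quick way to confirm the line now reads GRUB_DEFAULT=0.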
Save the file, then regenerate the GRUB configuration with the following command:
grub2-mkconfig -o /boot/grub2/grub.cfg
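Before rebooting, it can be worth confirming that entry 0 in the regenerated file really is the new kernel. A rough sketch, assuming the menu entries in grub.cfg start with `menuentry '<title>'` at the beginning of the line (the function name `list_grub_entries` is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: prints each GRUB menu entry title with its
# zero-based index, which is the number GRUB_DEFAULT refers to.
list_grub_entries() {
    cfg="${1:-/boot/grub2/grub.cfg}"
    # Split on single quotes so that field 2 is the entry title
    awk -F"'" '/^menuentry / { print n++ ": " $2 }' "$cfg"
}

# Usage (as root):
#   list_grub_entries
```

If the entry listed at index 0 is not the kernel you just installed, adjust GRUB_DEFAULT to the matching index and run grub2-mkconfig again.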
Reboot the server
reboot
Check the kernel version after your server returns
uname -r
You should see something like the following: