this post was submitted on 30 Oct 2024


Hello everyone. I have a system with a Ryzen 9 7950X, 32 GB of 6400 MHz DDR5 RAM, a 1 TB primary SSD with Windows 11 and Linux installed, a Gainward RTX 4080 graphics card, and an Asus Prime X670-P WiFi motherboard. I also have a 1 TB SSD and 2 TB HDDs mounted. 2-3 months ago I started getting crashes on both OSes, and over time they became more frequent. I bought a brand new SSD for the OS installation, but after a while the crashes started again. I cannot get any error message on Windows, since the BSOD stays up for only 2-3 seconds before the system restarts. After the restart, I sometimes get a "no Bootable device found" error at the boot stage. When the crash happens on Linux, the dmesg output looks as if the whole SSD disconnected; it shows I/O messages for the root partition as well. I changed the primary SSD 1 month ago and the errors still persist. I sent the motherboard in for service and no issues were found. The BIOS has also been updated and reset. When I run the PC from live Linux media, however, I get no issues. What else can I do? What could cause this issue? Thanks in advance.

[email protected] 2 points 2 weeks ago* (last edited 2 weeks ago)

Rambling Story

Once, I had an El Cheapo and very questionable SATA SSD fail on my system. It had similar symptoms: Windows would hang and crash at random, more frequently over time. While digging through the Windows logs and troubleshooting, I found that the system would crash when trying to access the drive through File Explorer, because the drive would disconnect. The SSD seemed to fail slowly, but I was using it as a faster workspace and saving everything to an HDD, so I never looked into the possibility of a failing drive until the system wouldn't boot. Removing the drive cured everything. I should probably note that the failed SSD wasn't the boot drive; it was used strictly for data, so the OS itself wasn't being unmounted directly. I think the drive itself was shorting out some of the SATA pins, scrambling the whole bus.

Several years later, on the Linux side of things, I found out that fstab can prevent booting if a storage device is missing. On a fresh install, an external drive enclosure had been auto-configured in fstab as a critical component. I'm not sure what the error messages would look like if an internal data drive mounted as critical disconnected on a running system, but I would assume Linux would halt even if no processes are running from the drive.
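
To illustrate the fstab point: on systemd-based distributions, an fstab entry without the `nofail` option is typically treated as required at boot, so a missing device can stall or abort the boot. Here is a minimal Python sketch that flags such entries. The sample fstab text, the UUIDs, and the function names are all made up for illustration; this is a rough heuristic, not a full fstab parser.

```python
from typing import List, NamedTuple


class FstabEntry(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: List[str]


def parse_fstab(text: str) -> List[FstabEntry]:
    """Parse fstab-formatted text, skipping blank lines and comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # malformed line, ignored in this sketch
        # The options field is the 4th column; treat it as "defaults" if absent.
        options = fields[3].split(",") if len(fields) > 3 else ["defaults"]
        entries.append(FstabEntry(fields[0], fields[1], fields[2], options))
    return entries


def boot_blocking_entries(entries: List[FstabEntry]) -> List[FstabEntry]:
    """Return non-root entries mounted at boot without `nofail` or `noauto`."""
    risky = []
    for entry in entries:
        # The root filesystem is always required; swap is not considered here.
        if entry.mount_point == "/" or entry.fs_type == "swap":
            continue
        if "nofail" in entry.options or "noauto" in entry.options:
            continue
        risky.append(entry)
    return risky


# Hypothetical fstab contents (placeholder UUIDs, not from a real system).
SAMPLE = """
# <device>     <mount point> <type> <options>        <dump> <pass>
UUID=aaaa-1111 /             ext4   defaults         0      1
UUID=bbbb-2222 /mnt/data     ext4   defaults         0      2
UUID=cccc-3333 /mnt/external ext4   defaults,nofail  0      2
"""

for entry in boot_blocking_entries(parse_fstab(SAMPLE)):
    print(f"{entry.mount_point} ({entry.device}) has no 'nofail' option")
```

To check a real system, the text could be read from `/etc/fstab` instead of `SAMPLE`. Adding `nofail` to data drives and external enclosures is the usual way to keep a missing device from blocking the boot.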

I'm not sure what the symptoms would have been if my SSD had failed while running Linux. My gut says it would look similar to your Linux dmesg output, like the boot drive disconnecting or becoming inaccessible.
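
If it helps to see which device the kernel is complaining about, here is a small Python sketch that counts I/O error lines per device in saved kernel log text. The sample lines are hand-written for illustration (they are not from the original poster's machine), and the regular expression is an assumption about the common "I/O error ... dev NAME" wording, so it may miss other message formats.

```python
import re
from collections import Counter

# Assumed pattern: the phrase "I/O error" followed later by "dev <name>".
IO_ERROR = re.compile(r"I/O error.*?\bdev (\w+)")


def count_io_errors(log_text: str) -> Counter:
    """Count log lines mentioning an I/O error, grouped by device name."""
    counts = Counter()
    for line in log_text.splitlines():
        match = IO_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hand-written sample lines, for illustration only.
SAMPLE_LOG = """\
[  101.0] I/O error, dev nvme0n1, sector 2048
[  101.1] Buffer I/O error on dev nvme0n1p2, logical block 0
[  102.0] usb 1-1: new high-speed USB device number 2
"""

for device, count in count_io_errors(SAMPLE_LOG).items():
    print(f"{device}: {count} I/O error line(s)")
```

The input could come from `dmesg` output saved to a file before the system goes down. If every error points at the same drive or partition, that narrows down which device is dropping off the bus.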

I've also had a system with an AMD processor fail to boot, but that one wouldn't even POST. Fixed that one by finally reseating the CPU. Turns out that's a common issue with some AMD CPUs using the AM4 socket, found a lot of complaints online for that one after the fact.

Since your system runs fine from a live USB, and you've already replaced the M.2 drive, I would try running the system without any SATA drives installed, and try to force a crash until you feel confident the issue is gone.

If the problems still persist, then I would look at getting a cheap fresh HDD and a new SATA cable, installing a temporary OS, and trying the test again.

If it STILL crashes, I would look at removing all unnecessary hardware from the motherboard and slowly testing each stage as you rebuild.
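
For the "try to force a crash" part of the steps above, one option is to put sustained I/O on the suspect drive at each stage. Here is a minimal Python sketch that writes a temporary file of random data into a directory on the drive under test, then reads it back several times and compares checksums. The function name and default sizes are arbitrary choices for illustration. Note that the operating system may serve the reads from its cache rather than from the drive, so this is a rough load generator and not a proper diagnostic.

```python
import hashlib
import os
import tempfile


def write_and_verify(directory: str, size_mb: int = 64, passes: int = 3) -> bool:
    """Write random data to a temp file in `directory`, then re-read and verify it."""
    chunk = 1024 * 1024  # 1 MiB per write
    expected = hashlib.sha256()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            for _ in range(size_mb):
                data = os.urandom(chunk)
                expected.update(data)
                f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device
        for _ in range(passes):
            actual = hashlib.sha256()
            with open(path, "rb") as f:
                while True:
                    block = f.read(chunk)
                    if not block:
                        break
                    actual.update(block)
            if actual.digest() != expected.digest():
                return False  # data read back differs from what was written
        return True
    finally:
        os.remove(path)  # always clean up the temporary file


if __name__ == "__main__":
    # Small run against the system temp directory; point this at the drive under test.
    print(write_and_verify(tempfile.gettempdir(), size_mb=8, passes=2))
```

If the drive disconnects mid-run, the script will most likely fail with an `OSError`, or the whole system will crash, which is itself the result you are looking for at that stage of the rebuild.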

[email protected] 1 points 2 weeks ago

I unplugged the SATA cables last night and booted from a Windows USB to install it, and the SSD disconnected again mid-install :) The SSD disconnects somehow, and if that happens while running the OS installed on it, it causes a crash. When running from USB, there is no crash. It's not the HDD, not the memory or CPU, and not the SSD (it's brand new). I'm down to the motherboard at this point.