It seems like a bug in a storage driver in that case (if it's actually getting triggered by it)... if a command isn't available, it should be falling back to one that is, right?
Maybe? I don't know if there's command discovery for SCSI that would let the driver know whether things are supposed to be supported. If there is, maybe the drive advertised support and confuses the system when it doesn't actually work.
When you talk to disks via smartctl, the tool reports the specification versions they support: there are "ATA Version" and "SATA Version" fields for SATA disks. I was unable to get version details for a SAS disk, but it was identified as a SAS drive successfully.
These standards presumably define mandatory and optional commands that a disk must support to be certified as compliant with these specs. If the failing command is optional, then it's OK, but if it's mandatory, then there's a bug fix that WD should make.
Thanks for the reply and the utility. I'll take a look into it. Since I'm familiar with smartctl from my server management roles, it came to mind, so I shared it. I never thought about whether it could handle anything beyond what it needs to get SMART and other diagnostic data.
> I don't know if there's command discovery for SCSI that would let them know if things are supposed to be supported.
The OP shows errors that are reported to the OS by the drive when it attempts to use the command. Even if it can't pre-determine support for the command, it can fall back upon receiving an error.
Thanks, I suppose that answers the question of "why not try the opcode instead of doing command discovery". Though what I was really trying to understand was, "if you've already issued the command {for whatever reason}, and it returns invalid opcode, then shouldn't you fall back to an alternative command?" Because at that point, you have enough information to know you can do so safely. It seems to me that that's what the storage driver needs to do, irrespective of any command discovery or lack thereof beforehand.
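The fallback logic being argued for here could be sketched roughly like this. It's a toy model, not real SCSI stack code: FakeDrive, READ_16/READ_10 as the preferred/fallback opcodes, and the error string are all stand-ins for illustration.

```python
# Toy model of "fall back on invalid opcode"; nothing here talks to real
# hardware, and all names are invented for illustration.
ILLEGAL_REQUEST = "ILLEGAL REQUEST: invalid command operation code"

class FakeDrive:
    """Simulates a drive that rejects the newer opcode but handles the older one."""
    def issue(self, opcode):
        if opcode == "READ_16":
            raise OSError(ILLEGAL_REQUEST)
        return b"data"

def read_with_fallback(drive):
    try:
        return drive.issue("READ_16")      # preferred command
    except OSError as e:
        if ILLEGAL_REQUEST in str(e):      # drive says it doesn't know the opcode
            return drive.issue("READ_10")  # fall back to the older command
        raise                              # any other failure is a real error

print(read_with_fallback(FakeDrive()))
```

The point of contention in the rest of the thread is whether the `except` branch is safe, i.e. whether the drive is still in a known state after rejecting the opcode.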
There can be reasons for command failure other than "opcode not supported", even if that's the error code returned. I wouldn't trust cheaper hard drives to handle that properly either.
What would such a reason be? How likely is this to happen? If you have such a mistrust of the response then you can never trust anything, right? How do you know the drive isn't lying about everything else too? At some point you gotta trust something means what it says...
The trust is in what the drive identifies as supported.
The issue is that some command opcodes may be doing double duty in a different drive. Famously, a few CDROM drive vendors reused the "clear buffer" command to instead mean "update firmware". Linux used support for "clear buffer" to detect whether a drive was a CDROM or CDRW drive. As a result, using such a CDROM drive under Linux would quickly and permanently brick it.
You can't trust the response because it's likely that at that point, the damage is already done. Even if you get one, you might not know what it means.
That applies to any command the drive does not advertise support for via the appropriate SAS and SATA mechanisms. In some rare cases you might maintain a manual whitelist of commands supported by drives beyond what they advertise, but you should never try to discover support automatically at runtime.
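The "trust only advertised support" policy amounts to something like the following sketch. All names are hypothetical: the command names, the model string, and the shape of the advertised set are made up for illustration.

```python
# Toy sketch: choose a command only from what the drive advertises
# (plus an optional manual whitelist), never by trial and error.
MANUAL_WHITELIST = {("AcmeDisk 9000", "READ_16")}  # hypothetical quirk entry

def choose_read_command(model, advertised):
    """Pick the best read opcode from advertised support or the whitelist."""
    for cmd in ("READ_16", "READ_10"):             # preferred command first
        if cmd in advertised or (model, cmd) in MANUAL_WHITELIST:
            return cmd
    raise RuntimeError("drive advertises no usable read command")

print(choose_read_command("GenericDisk", {"READ_10"}))   # falls back at selection time
print(choose_read_command("AcmeDisk 9000", set()))       # allowed via whitelist entry
```

The key design choice is that the "fallback" happens at selection time, before any command is sent, so the drive is never handed an opcode it hasn't claimed to support.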
> You can't trust the response because it's likely that at that point, the damage is already done. Even if you get one, you might not know what it means.
I still don't get this. If the damage is already done, then how is issuing the fallback going to change things? Again: I'm not arguing about whether discovery should be done or not. All I'm saying is, if the device says invalid opcode, you should use the fallback, whether or not there was any discovery that led you to use the initial opcode.
You don't know what state the drive is in anymore. The safest option is to reset the device entirely and start it back up again. If it comes back, you can use your fallback.
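That "reset first, then fall back" policy could be sketched as below. Again a toy model with invented names; a real driver would go through the transport layer's reset and reprobe paths.

```python
# Toy model of "reset the device, then use the fallback command".
class ResettableDrive:
    def __init__(self):
        self.state = "unknown"  # state after a command failed unexpectedly
    def reset(self):
        self.state = "ready"
    def ready(self):
        return self.state == "ready"
    def issue(self, opcode):
        assert self.state == "ready", "command sent to drive in unknown state"
        return b"data"

def recover_then_fallback(drive):
    drive.reset()                         # restore a known state first
    if not drive.ready():
        raise RuntimeError("device did not come back after reset")
    return drive.issue("READ_10")         # only now issue the fallback command

print(recover_then_fallback(ResettableDrive()))
```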
But it is much easier to rely on what is known to work instead of issuing potentially non-working commands, to the point that there is no reason to have a fallback other than "rediscover what the drive supports".
I don't get why you would even want to use a fallback command on a drive that is in a potentially unknown or undefined state.
If discovery led to an invalid opcode, the drive is faulty, end of story. The SAS and SATA standards are very clear on what is permitted and what is forbidden, and this falls very far on the side of "not allowed".
Is this just a theoretical thing, or have there been actual drives that lied about invalid opcodes on a read and then proceeded to destroy the drive if you issued a fallback read? I have a hard time believing a hard drive would behave like a C compiler if I'm being honest...
As I mentioned earlier, there was a series of CDROM drives that, upon receiving an unsupported command (and this was before such support could be discovered), would interpret all further data as firmware for an update and brick the device. If you issued a fallback read, the device would become bricked; if you reset the bus and reinitialized the device, everything was fine.
Discovery has of course improved this, so we know what a hard drive can and cannot do. Hard drives that lie about what they support shouldn't carry the SATA or SAS seals and trademarks, as drives must be certified by those bodies.