1

Does smartd need the database, or does smartctl need it?

I saw that the smartmontools GitHub repository keeps updating its drive database:

https://github.com/smartmontools/smartmontools/labels/drivedb

As I understand it, smartd scans all disks itself, so why does it need a database? What purpose does the database serve in smartd and smartctl?

2 Answers

1

The smartmontools drive database can be seen here. Its purpose is to provide additional command-line flags to both smartctl and smartd for drives where the default settings (which are themselves defined in the database) aren’t sufficient, and/or to provide warnings about the drive.

Many drives have specific counters, or counters that need to be interpreted in specific ways; see for example the first non-default entry in the database.

Some drives have important misfeatures which users should be told about; see for example m4 SSDs with their counter bug.

USB drives need an access method to be specified, see the entries starting here.

The database isn’t needed, since all the settings it defines can be specified using command-line parameters; having it saves each user from having to determine those command-line parameters themselves.
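As a rough illustration of that lookup, here is a minimal Python sketch; it is not smartmontools' actual code. The two entries, the model names, and the `find_presets` helper are hypothetical. The field names loosely follow the layout of `drivedb.h` (model regular expression, warning message, preset options), and `-v 9,min2hour` stands in for any preset that a user could equally type on the command line.

```python
import re

# Hypothetical entries, loosely modelled on the fields in drivedb.h.
# The model names, warning text, and presets are made up for illustration.
DRIVE_DB = [
    {
        "family": "Example SSD family",
        "model_regexp": r"EXAMPLE SSD [0-9]+GB",
        "warning": "",
        "presets": ["-v", "9,min2hour"],
    },
    {
        "family": "Example buggy family",
        "model_regexp": r"BUGGY-[A-Z]{2}[0-9]{3}",
        "warning": "This firmware has a known counter bug.",
        "presets": [],
    },
]


def find_presets(model):
    """Return (presets, warning) for the first entry matching the model string."""
    for entry in DRIVE_DB:
        # Matching the whole string is a choice made for this sketch.
        if re.fullmatch(entry["model_regexp"], model):
            return entry["presets"], entry["warning"]
    # No match: fall back to defaults (no extra options, no warning).
    return [], ""
```

When an entry matches, the tool can behave as if the user had passed the presets by hand, for example `smartctl -v 9,min2hour -a /dev/sda`; when nothing matches, only the defaults apply.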

2
  • So the database is a customized spec/protocol that smartd/smartctl use to talk to the disk? Commented Aug 12, 2022 at 6:54
  • 1
    Not quite; the tools contain the implementations of a number of protocols to access SMART information on drives, the database allows the right one to be specified when the default isn’t appropriate (that’s the role of the -d option). Commented Aug 12, 2022 at 6:56
0

As with lspci and lsusb, what these tools receive are short numbers, often hex numbers, that are then mapped to strings. Monitor EDID also sends a very short binary block that contains many such short numbers which are then translated to strings using matching tables of some sort.

https://pci-ids.ucw.cz/v2.2/pci.ids - this is the actual matching list that lspci and other tools use to turn the 4-digit hex vendor ID and the 4-digit hex product ID into string values. When an item is not in that matching table, the tool just shows the raw product and vendor IDs.
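To make the lookup concrete, here is a minimal Python sketch of parsing text in the `pci.ids` layout and falling back to raw IDs when there is no match. `SAMPLE_PCI_IDS`, `parse_pci_ids`, and `describe` are names invented for this example, the sample entries should be treated as illustrative, and lspci's real parser handles more cases (subsystems, device classes).

```python
# Small excerpt in the pci.ids layout: vendor lines start at column 0,
# device lines are indented with one tab. Entries are illustrative.
SAMPLE_PCI_IDS = (
    "# comment lines start with a hash\n"
    "144d  Samsung Electronics Co Ltd\n"
    "\ta808  NVMe SSD Controller SM981/PM981/PM983\n"
    "8086  Intel Corporation\n"
)


def parse_pci_ids(text):
    """Build {vendor_id: (vendor_name, {device_id: device_name})}."""
    table = {}
    current = None
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        if line.startswith("C ") or line.startswith("\t\t"):
            continue  # class and subsystem lines are ignored in this sketch
        if line.startswith("\t"):
            if current is not None:
                dev_id, name = line.strip().split(None, 1)
                table[current][1][dev_id.lower()] = name
        else:
            ven_id, name = line.split(None, 1)
            current = ven_id.lower()
            table[current] = (name, {})
    return table


def describe(table, vendor_id, device_id):
    """Map IDs to names, falling back to the raw IDs when unknown."""
    vendor_id, device_id = vendor_id.lower(), device_id.lower()
    if vendor_id not in table:
        return f"{vendor_id}:{device_id}"
    vendor_name, devices = table[vendor_id]
    device_name = devices.get(device_id, f"Device {device_id}")
    return f"{vendor_name} {device_name}"
```

A call such as `describe(parse_pci_ids(SAMPLE_PCI_IDS), "1234", "5678")` has no table entry to use, so it returns the raw pair, which is the behaviour described above.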

Smartctl is significantly more complicated, but many of its fields need such matching tables to produce the string values, as well as the other types of values and ranges. Without the tables, these are just opaque characters and numbers that the tool reads.

I don't know how big a block the drive can send to smartctl, but I know it maps more data than other *nix disk tools usually get access to. That is almost certainly due, at least in part, to those matching tables, where some binary value is mapped to a specific string value.

What hardware actually tells the system is quite different from what these tools report to you as an end user in human-readable string form. I've made at least one of these tools. I can't remember how many binary bits were used, in very clever ways, to highly compress these unique identifiers (and sometimes also string values), but this is why you need matching tables: to complete the raw binary or hex data the device sends.

Roughly speaking, anyway. I don't know exactly how smartctl does it, but anytime you see one of these tools using a matching table, aka, a db, this is why.

For example, it's quite unlikely the drive sent the string "Samsung Electronics Corporation", but it's quite likely it sent CE00h as a 4-character hex ID, which was then mapped to Samsung's string ID. Of course, smartctl has a lot more features than just that, but that's the rough idea. Other values the drive sends via SMART don't mean anything on their own and have to be mapped to a specific vendor, and sometimes a specific product, before you know what they refer to. Units written is one example: the unit is not a constant and depends on the vendor and drive.

CPUs also send out a tiny block containing hex values for the extended family, extended model, family, model, and stepping ID, which then have to be mapped back to the CPU vendor and other known things to determine what the CPU is.
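As an illustration of how compact that block is, here is a small Python sketch that unpacks the family, model, and stepping fields from a 32-bit x86 CPUID signature (the EAX value of CPUID leaf 1). The function name is made up for this example, and the rule used for combining the extended model is the one Intel documents; the two input values in the usage note are just example signatures.

```python
def decode_cpuid_signature(eax):
    """Split a CPUID leaf 1 EAX value into (family, model, stepping)."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    if base_family == 0xF:
        family = base_family + ext_family
    else:
        family = base_family

    # Intel's rule: the extended model applies for base family 6 or 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model

    return family, model, stepping
```

For instance, `decode_cpuid_signature(0x000906EA)` gives family 6, model 0x9E, stepping 0xA, and `decode_cpuid_signature(0x00870F10)` gives family 0x17, model 0x71, stepping 0. Turning those numbers into a marketing name or a feature list still needs a lookup table.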

It's worth noting that if the matching tables are wrong or incomplete, smartctl will show incorrect values for a specific field; the wrong unit might be used, for example. I've seen that many times. Sometimes that information is only released under NDA, or not at all, so smartctl has to try to figure out what the right matching values are, and it doesn't always succeed. That is, the tool is not showing pure raw data; it is showing interpreted and processed data.
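A small Python sketch of why the unit matters: the same six raw bytes of a SMART attribute give very different "power-on hours" depending on which interpretation is chosen. The helper names and the `COUNTS_PER_HOUR` table are hypothetical; `min2hour` and `sec2hour` are borrowed from smartctl's `-v` format names, while `hours` is just a label for this sketch, and real smartctl output formatting differs.

```python
def raw48(raw_bytes):
    """Combine the six little-endian raw bytes into one integer."""
    return int.from_bytes(raw_bytes, "little")


# Hypothetical conversion table: how many raw counts make up one hour.
COUNTS_PER_HOUR = {"hours": 1, "min2hour": 60, "sec2hour": 3600}


def power_on_hours(raw_bytes, fmt="hours"):
    """Interpret a raw power-on counter according to the chosen unit."""
    return raw48(raw_bytes) // COUNTS_PER_HOUR[fmt]
```

With a raw counter of 120000, the default reading is 120000 hours (over 13 years), while the `min2hour` reading is 2000 hours. Picking the right interpretation per drive model is exactly what a database entry is for.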

4
  • CPUs and hard drives provide text strings identifying themselves, there’s no need for a database to map ids to descriptions to identify them (unlike PCI devices). Commented Aug 12, 2022 at 6:10
  • Text strings in this context have no data value; they can be, and often are, almost useless for determining what the CPU is. CPUID, on the other hand, is the actual CPU family, model, and stepping, which can then be used to determine CPU features. But the actual string returned can be, and often is, totally useless data, beyond maybe determining that it's an AMD or Intel CPU. Vendor-provided strings are generally the worst data source possible, since they can be, and often are, simple copy/pastes done by some junior-level person at the vendor. This varies, but strings in general are just strings. Commented Aug 12, 2022 at 17:53
  • All I’m saying is that there’s no need for a database to map ids to names for hard drives and CPUs. That’s all. Commented Aug 12, 2022 at 21:19
  • Matching table, lookup table, database, not a huge difference except when it comes to the names you prefer to give the function. I'm assuming you haven't seen a huge set of this type of data from varied sources, that's usually all that's required to disabuse anyone of such notions. Identify this string: "AMD Custom APU". That's a real one, which recently came through my systems. Once you see enough of these, you start to realize that strings are nothing. It's a Ryzen, Zen 2 architecture, but you'd never know that from the string. Commented Aug 13, 2022 at 3:06
