Looking for some opinions on naming and structure when matching a data entry to another data entry.
Specifically the use case is matching a file to a host, but this same methodology will be applied to other relationships.
A file on disk contains YAML front matter, then a separator, then the contents of the file, very similar to posts here on dev.to. In the front matter I need to list attributes of a Host that must match and attributes that must not match, as well as define greater-than and less-than tests. It is important that the naming and structure be understandable and approachable by systems administrators.
Here is a simplified example without matching criteria; it is the static context used in all examples:
    path: /etc/hosts
    mode: 0644
    user: root
    group: root
    ---
    127.0.0.1 localhost
    ::1 ip6-localhost
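To make the file layout concrete, here is a minimal Python sketch of splitting such a file at the separator. The function names split_front_matter and parse_flat_keys are hypothetical, and the flat key parser is only a stand-in for a real YAML parser: it keeps every value as a string and ignores nested structures.

```python
def split_front_matter(text):
    """Split a document into (front_matter, body) at the first '---' line."""
    lines = text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.strip() == "---":
            return "".join(lines[:i]), "".join(lines[i + 1:])
    # Assumption: with no separator, the whole text is treated as body.
    return "", text


def parse_flat_keys(front_matter):
    """Parse top-level 'key: value' lines only; values stay strings."""
    result = {}
    for line in front_matter.splitlines():
        # Skip blank lines, nested or list lines, and comments.
        if not line.strip() or line.startswith((" ", "-", "#")):
            continue
        key, sep, value = line.partition(":")
        if sep:
            result[key.strip()] = value.strip()
    return result


DOCUMENT = """\
path: /etc/hosts
mode: 0644
user: root
group: root
---
127.0.0.1 localhost
::1 ip6-localhost
"""

front, body = split_front_matter(DOCUMENT)
meta = parse_flat_keys(front)
```

Note that a real YAML parser may load an unquoted value such as 0644 as a number rather than a string, so the mode is probably best quoted or handled explicitly.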
Now to add criteria to match only Hosts which have an OS.Type of "Linux" and an OS.Version greater than 18.00, but not matching a Platform of "prlvm":
    path: /etc/hosts
    mode: 0644
    user: root
    group: root
    assert:
      - os.type: linux
      - os.version: 18.00+
    refute:
      - platform: prlvm
    ---
    127.0.0.1 localhost
    ::1 ip6-localhost
The assert key contains a list of Host attributes and values that must match. os.version would need special treatment: it has to be compared as a floating-point value even though it is stored and transmitted as a string, and the operators (+ and -) would need to be parsed out and evaluated accordingly. I think this is mostly clear and concise from the user's end, but special treatment of values such as os.version could make this nightmarishly complex down the road.
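A rough Python sketch of what that special treatment might look like, working on front matter that has already been parsed into dictionaries. Everything here is an assumption for illustration: the function names, reading a trailing + as "strictly greater than" and a trailing - as "strictly less than", and comparing plain strings case-insensitively (so that linux matches "Linux").

```python
import re

# Assumed suffix syntax: "18.00+" means > 18.00, "18.00-" means < 18.00.
_RANGE = re.compile(r"^(\d+(?:\.\d+)?)([+-])$")


def value_matches(expected, actual):
    """Compare one expected value from the front matter with a Host value."""
    if actual is None:
        return False
    expected = str(expected)
    m = _RANGE.match(expected)
    if m:
        # The Host stores the version as a string, so convert before comparing.
        try:
            actual_num = float(actual)
        except (TypeError, ValueError):
            return False
        bound = float(m.group(1))
        return actual_num > bound if m.group(2) == "+" else actual_num < bound
    # Assumption: plain values are compared case-insensitively.
    return expected.lower() == str(actual).lower()


def matches_assert_refute(front_matter, host):
    """True when every assert entry holds and no refute entry holds."""
    def holds(condition):
        return all(value_matches(v, host.get(k)) for k, v in condition.items())

    asserts = front_matter.get("assert", [])
    refutes = front_matter.get("refute", [])
    return all(holds(c) for c in asserts) and not any(holds(c) for c in refutes)


front_matter = {
    "assert": [{"os.type": "linux"}, {"os.version": "18.00+"}],
    "refute": [{"platform": "prlvm"}],
}
host = {"os.type": "Linux", "os.version": "18.04", "platform": "kvm"}
result = matches_assert_refute(front_matter, host)
```

The regular expression is where the complexity hides: every value is inspected for an operator suffix, so a literal value that happens to look like "18.00+" can no longer be matched as plain text, and each new operator adds another parsing rule.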
I am also considering alternative terms, such as except in place of refute, because they might be more intuitive to sysadmins.
Another approach is to thin out the abstraction and use a more direct representation of the matches happening in the program. I think this is more accurate at the cost of being more verbose.
    path: /etc/hosts
    mode: 0644
    user: root
    group: root
    match:
      equal:
        - os.type: linux
      greaterthan:
        - os.version: 18.00
      not:
        - platform: prlvm
    ---
    127.0.0.1 localhost
    ::1 ip6-localhost
That would produce the same end result but with simpler logic in the application processing it. We could reasonably assume that values submitted under the greaterthan key can be treated as numbers without special parsing. We could more clearly differentiate exact equality from contains, and even add operators such as startswith. I think this is starting to look like Elasticsearch filters, which I was never a big fan of, but it does make the backend coding safer and easier, and I think it more clearly sets expectations for the user.
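As a sketch of why the backend gets simpler, each key under match can map directly to one small comparison function. This Python example assumes the front matter is already parsed into dictionaries; the OPERATORS table, the matches_operators name, the lessthan operator, and the case-insensitive equality are all hypothetical choices, not a settled design.

```python
# Each operator is a small function of (actual Host value, expected value).
OPERATORS = {
    "equal": lambda actual, expected: str(actual).lower() == str(expected).lower(),
    "greaterthan": lambda actual, expected: float(actual) > float(expected),
    "lessthan": lambda actual, expected: float(actual) < float(expected),
    "contains": lambda actual, expected: str(expected) in str(actual),
    "startswith": lambda actual, expected: str(actual).startswith(str(expected)),
}


def matches_operators(match, host):
    """Return True when the Host satisfies every entry under 'match'."""
    for op_name, conditions in match.items():
        # Assumption: "not" is treated as a negated equality test.
        negate = op_name == "not"
        test = OPERATORS["equal"] if negate else OPERATORS.get(op_name)
        if test is None:
            raise ValueError(f"unknown operator: {op_name}")
        for condition in conditions:
            for attr, expected in condition.items():
                actual = host.get(attr)
                try:
                    ok = actual is not None and test(actual, expected)
                except (TypeError, ValueError):
                    # A non-numeric value under a numeric operator never matches.
                    ok = False
                if ok == negate:
                    return False
    return True


match = {
    "equal": [{"os.type": "linux"}],
    "greaterthan": [{"os.version": 18.00}],
    "not": [{"platform": "prlvm"}],
}
host = {"os.type": "Linux", "os.version": "18.04", "platform": "kvm"}
result = matches_operators(match, host)
```

Adding an operator means adding one entry to the table, and each entry can be unit tested in isolation, which is the maintainability argument for this layout.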
In the last example, match could instead be select or some other term.
Ultimately I am trying to make it the most friendly to the sysadmin end user while still being reasonably safe, testable, and not overly complex for the developer.
I am moving forward with the second option. I think the benefits of clarity are worth the increased verbosity. While the baseline complexity is higher than with the other options, I believe it will not increase significantly with the number and size of records. It also requires more logic in the application, but not more complex logic, which should make the program easier to maintain and test.