DEV Community

MN Mark


Terminology and structure of defining record relationships

Looking for some opinions on naming and structure when matching a data entry to another data entry.
Specifically the use case is matching a file to a host, but this same methodology will be applied to other relationships.

Each file on disk starts with YAML front matter, followed by a separator and then the contents of the file, much like posts here on dev.to. In the front matter I need to list the attributes of a Host that must match and the attributes that must not match, and also define greater-than/less-than tests. It is important that the naming and structure be understandable and approachable for systems administrators.

Base context

A simplified example without matching criteria. This is the static context used in all examples:

path: /etc/hosts
mode: 0644
user: root
group: root
---
127.0.0.1 localhost
::1 ip6-localhost
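As a rough sketch of how such a record could be split before the front matter is handed to a YAML parser, the snippet below cuts the text at the first separator line. The function name `split_front_matter` and the choice to treat the first line consisting only of `---` as the separator are assumptions made for illustration, not part of the actual tool.

```python
from typing import Tuple


def split_front_matter(text: str, separator: str = "---") -> Tuple[str, str]:
    """Split a record into (front_matter, body) at the first separator line.

    Hypothetical helper: assumes the separator sits alone on its own line.
    """
    lines = text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.strip() == separator:
            # Everything before the separator is front matter,
            # everything after it is the file contents.
            return "".join(lines[:index]), "".join(lines[index + 1:])
    raise ValueError("no front matter separator found")


record = """path: /etc/hosts
mode: 0644
user: root
group: root
---
127.0.0.1 localhost
::1 ip6-localhost
"""

front_matter, body = split_front_matter(record)
```

The `front_matter` string would then go to whichever YAML parser the application uses, while `body` is written to the target path unchanged.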

First Option

Now to add criteria to match only Hosts which have an OS.Type of "Linux" and an OS.Version greater than 18.00, but which do not have a Platform of "prlvm":

path: /etc/hosts
mode: 0644
user: root
group: root
assert:
  - os.type: linux
  - os.version: 18.00+
refute:
  - platform: prlvm
---
127.0.0.1 localhost
::1 ip6-localhost

Here the assert key contains a list of Host attributes and values that must match. The os.version would need special treatment to be compared as a floating-point value even though it's stored and transmitted as a string, and the operators + and - would need to be parsed out and evaluated accordingly. I think this is mostly clear and concise from the users' end, but special treatment of values such as os.version could make this nightmarishly complex later down the road.
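To make the special treatment concrete, here is a minimal sketch of what evaluating the First Option might involve. It assumes the front matter has already been parsed into lists of single-key mappings, that Host attributes arrive as strings, that plain values compare case-insensitively, and that a trailing `+` or `-` means strictly greater or strictly less than. The names `matches_value` and `matches_assert_refute` are hypothetical.

```python
import re
from typing import Dict, List

# A number followed by a single "+" or "-" operator, e.g. "18.00+".
_SUFFIX = re.compile(r"^(?P<number>\d+(?:\.\d+)?)(?P<op>[+-])$")


def matches_value(expected: str, actual: str) -> bool:
    """Compare one expected value against one Host attribute value."""
    parsed = _SUFFIX.match(str(expected))
    if parsed is None:
        # No operator suffix: plain case-insensitive string comparison.
        return str(expected).lower() == str(actual).lower()
    threshold = float(parsed.group("number"))
    try:
        value = float(actual)
    except ValueError:
        return False  # the Host value is missing or not numeric
    if parsed.group("op") == "+":
        return value > threshold
    return value < threshold


def matches_assert_refute(
    host: Dict[str, str],
    assert_rules: List[Dict[str, str]],
    refute_rules: List[Dict[str, str]],
) -> bool:
    """True when every assert rule matches and no refute rule matches."""
    asserted = all(
        matches_value(expected, host.get(attribute, ""))
        for rule in assert_rules
        for attribute, expected in rule.items()
    )
    refuted = any(
        matches_value(expected, host.get(attribute, ""))
        for rule in refute_rules
        for attribute, expected in rule.items()
    )
    return asserted and not refuted


host = {"os.type": "Linux", "os.version": "18.04", "platform": "kvm"}
assert_rules = [{"os.type": "linux"}, {"os.version": "18.00+"}]
refute_rules = [{"platform": "prlvm"}]
result = matches_assert_refute(host, assert_rules, refute_rules)
```

Even in this small form, every value has to be inspected to decide whether it is a literal or an expression. A float comparison also reads a version such as 18.10 as 18.1, which is the kind of edge case that could grow over time.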

I am also considering the terms match and except in place of assert and refute, respectively, because they might be more intuitive to sysadmins.

Second Option

Another approach is to thin out the abstraction and use a more direct representation of the matches happening in the program. I think this is more accurate at the cost of being more verbose.

path: /etc/hosts
mode: 0644
user: root
group: root
match:
  equal:
    - os.type: linux
  greaterthan:
    - os.version: 18.00
  not:
    - platform: prlvm
---
127.0.0.1 localhost
::1 ip6-localhost

That would produce the same end result but with simpler logic in the application processing it. We could reasonably assume that values submitted under the greaterthan key can be treated as numbers without special parsing. We could more clearly differentiate between equal and contains, and even add operators such as startswith. I think this is starting to look like Elasticsearch filters, which I was never a big fan of, but it does make the backend coding safer and easier, and I think it sets clearer expectations for the user.
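A sketch of how the Second Option could be evaluated is shown below, with one small function per operator key and a lookup table to dispatch on it. It assumes the match block has already been parsed into a mapping of operator name to a list of single-key mappings, and that Host attributes are strings. The `OPERATORS` table, the `lessthan` key, and the function names are illustrative assumptions, not the actual implementation.

```python
from typing import Any, Callable, Dict, List


def _equal(actual: str, expected: Any) -> bool:
    return actual.lower() == str(expected).lower()


def _not(actual: str, expected: Any) -> bool:
    return not _equal(actual, expected)


def _greater_than(actual: str, expected: Any) -> bool:
    try:
        return float(actual) > float(expected)
    except ValueError:
        return False  # the Host value is missing or not numeric


def _less_than(actual: str, expected: Any) -> bool:
    try:
        return float(actual) < float(expected)
    except ValueError:
        return False


# Each operator key in the front matter maps to exactly one test function.
OPERATORS: Dict[str, Callable[[str, Any], bool]] = {
    "equal": _equal,
    "not": _not,
    "greaterthan": _greater_than,
    "lessthan": _less_than,
    "contains": lambda actual, expected: str(expected).lower() in actual.lower(),
    "startswith": lambda actual, expected: actual.lower().startswith(str(expected).lower()),
}


def host_matches(host: Dict[str, str], match: Dict[str, List[Dict[str, Any]]]) -> bool:
    """True when every rule under every operator key passes for this Host."""
    for operator, rules in match.items():
        test = OPERATORS.get(operator)
        if test is None:
            raise KeyError(f"unknown operator: {operator}")
        for rule in rules:
            for attribute, expected in rule.items():
                if not test(str(host.get(attribute, "")), expected):
                    return False
    return True


match = {
    "equal": [{"os.type": "linux"}],
    "greaterthan": [{"os.version": 18.00}],
    "not": [{"platform": "prlvm"}],
}
host = {"os.type": "Linux", "os.version": "18.04", "platform": "kvm"}
result = host_matches(host, match)
```

Adding an operator here means adding one entry to the table, and each test function can be unit-tested in isolation, which is where the "more logic, but not more complex logic" trade-off shows up.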

In the last example, match could instead be select or another term.

Ultimately I am trying to make it the most friendly to the sysadmin end user while still being reasonably safe, testable, and not overly complex for the developer.

Update

I am moving forward with the Second Option. I think the benefits of clarity are worth the increased verbosity. While the baseline complexity is higher than with the other options, I believe it will not increase significantly with the number and size of records. It also requires more logic in the application, but not more complex logic, which should make the program easier to maintain and test.
