When designing systems that form a chain of communication, it may be described as a trade-off between allowing flexibility and creating standard inputs. Systems can be designed with user requirements in mind to allow certain free-form entry inputs, but the trade-off is a data hygiene problem in the long-term. Fields that may have once been designed to have no dependencies may suddenly be required for use in a downstream system. When those business requirements are being written up, not only does the data need to be sanitized, but the upstream system needs to be re-built to accomplish the goals for its dependents. I have seen a number of instances where this problem has really bitten the original creators and I have a couple best practices I have personally observed.
Firstly, when creating your schema, create a list of acceptable values pertaining to each field. Usually implemented for user input with a dropdown or radio buttons, make sure data is entered with these values consistent. I remember numerous headaches created by incorrect entry on the part of an upstream user or values that needed to be cleaned up. For instance, we couldn’t automate a feature display for a product line easily because users had been entering features like “Guide-Approved” as “Melissa Arnot” or “Mountain-Tested”. Its best if you nip any future problems in the bud and make the inputs standard.
Additionally, do not allow NULL values in inputs, even if it creates some false data. I would much rather have poorly described data to work with, rather than no data at all.
Finally, consistency in field values should be uniform across all systems. Even if the field values in one system are standard, care must be taken to make sure all systems being developed have these same inputs. The biggest offender of this rule is when a business decision is made and there is only sufficient bandwidth to fix the most downstream system. Some of these discrepancies persisted for years! These cases can be unavoidable based on practicality, but tech debt can quickly build and become unmanageable.