Sometimes, when we're working with data, we need to be careful about what we expose when we serialize our objects. On a recent project, I was dealing with various types of Personally Identifiable Information (PII) and needed a consistent, easy way to mark fields as sensitive. I discussed the problem with my team and, after a few rounds of spitballing and bouncing ideas off each other, the CensoredContentAttribute was born.
Note: All the code discussed in this article can be found in this Gist. Eventually, I plan to set it up in a repo with some other useful classes and snippets, but for now, GitHub Gist is its home away from home.
To start, let's preview an example of how this attribute works in practice.
The important bits are on line 9, [Redacted(ShowFirst:3, ShowLast:3)], and on line 13, new CensoredJsonSerializerSettings<RedactedAttribute>().
Adding the RedactedAttribute to the Name property of this LoggingExample class sets it up to be read in when we go to serialize this class later on in the ToSerializedString method. The ShowFirst and ShowLast properties of the RedactedAttribute let us specify how many characters at the start and end of the original value should be shown in the serialized output. This is very handy when serializing things like Social Security numbers, email addresses, phone numbers, or usernames, where some fields may show part of their data and others need to be fully censored.
The CensoredJsonSerializerSettings<T> is a class derived from JsonSerializerSettings with constraints on the type of T, limiting it to children of the CensoredContentAttribute. This class helps provide the necessary ContractResolver to the JsonSerializer so that your censored fields are handled properly.
Next, we move on to the core of the attribute, with the CensoredContentAttribute, CensoredContentContractResolver<T>, and the CensoredContentValueProvider. Here's the code:
Lines 8–13 are the abstract CensoredContentAttribute, which defines a base class for us to attach most of our logic to and, more importantly, gives us a strong type to attach our generic type constraints to. It's a simple class with one read-only property, Censor, and one method, TruncateData(string input). The Censor property defines the value that will be used to censor the content. In our example attribute, RedactedAttribute, we use the value ***REDACTED***. Similarly, the TruncateData method does what it says on the tin: it performs the actual truncation of your data.
Lines 15–28 (technically through 50, as the CensoredContentValueProvider is an internal class, but we'll get to that in a moment) make up the core of the CensoredContentContractResolver<T> class. This class is what the Newtonsoft JSON serializer uses to determine whether a member on the object being serialized should use the default value provider or our custom CensoredContentValueProvider. Overall, it's pretty straightforward: through reflection, it checks whether the member we're getting the value of has a child of the CensoredContentAttribute on it. If it does, we get a reference to that custom attribute and pass it into the constructor of the CensoredContentValueProvider along with a reference to the standard MemberValueProvider.
Lines 30–49 make up the CensoredContentValueProvider class, where the real magic happens. In here, we have a couple of properties for holding references to the values passed in, namely the base IValueProvider and the concrete implementation of our attribute, wrapped in a CensoredContentAttribute shell. In the GetValue method (line 39), we first need to pull out the string value of the member being censored. It's a standard call to the GetValue method of the IValueProvider interface, but we need to toss in a null check and some null coalescing so we don't error out if the member hasn't been set. After that, we throw our targetString into the TruncateData method of our attribute class and return the result.
The SetValue method isn't particularly important in this case, since this is intended as a one-way censoring attribute, but there are plenty of interesting things you could do there, including things like encryption when serializing to/from JSON. I'm interested to see what ideas you have. Let me know in the comments what you come up with.
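Since the Gist itself is not reproduced here, the following is a minimal sketch of how the pieces described above could fit together with Json.NET. It is not the code from the Gist, so the line numbers mentioned above will not match it. The FullyCensoredAttribute stand-in, the private field names, and the choice to override CreateMemberValueProvider are assumptions made for illustration, and the sketch assumes the decorated members are strings.

```csharp
using System;
using System.Reflection;
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

// Base attribute as described in the article: a censor string plus a truncation hook.
[AttributeUsage(AttributeTargets.Property | AttributeTargets.Field)]
public abstract class CensoredContentAttribute : Attribute
{
    public abstract string Censor { get; }
    public abstract string TruncateData(string input);
}

// Hypothetical stand-in attribute (not from the article): always hides the whole value.
public class FullyCensoredAttribute : CensoredContentAttribute
{
    public override string Censor => "***REDACTED***";
    public override string TruncateData(string input) => Censor;
}

public class CensoredContentContractResolver<T> : DefaultContractResolver
    where T : CensoredContentAttribute
{
    // Assumption: hook in where Json.NET creates the value provider for each member.
    protected override IValueProvider CreateMemberValueProvider(MemberInfo member)
    {
        IValueProvider defaultProvider = base.CreateMemberValueProvider(member);
        T attribute = member.GetCustomAttribute<T>(inherit: true);

        // Members without the attribute keep the default behavior.
        return attribute == null
            ? defaultProvider
            : new CensoredContentValueProvider(defaultProvider, attribute);
    }

    private class CensoredContentValueProvider : IValueProvider
    {
        private readonly IValueProvider _inner;
        private readonly CensoredContentAttribute _attribute;

        public CensoredContentValueProvider(IValueProvider inner, CensoredContentAttribute attribute)
        {
            _inner = inner;
            _attribute = attribute;
        }

        public object GetValue(object target)
        {
            // Null check plus null coalescing, so an unset member does not throw.
            string targetString = _inner.GetValue(target)?.ToString() ?? string.Empty;
            return _attribute.TruncateData(targetString);
        }

        // One-way censoring: writes pass straight through to the default provider.
        public void SetValue(object target, object value) => _inner.SetValue(target, value);
    }
}

public class CensoredJsonSerializerSettings<T> : JsonSerializerSettings
    where T : CensoredContentAttribute
{
    public CensoredJsonSerializerSettings()
    {
        ContractResolver = new CensoredContentContractResolver<T>();
    }
}

public class LoggingExample
{
    public int Id { get; set; }

    [FullyCensored]
    public string Name { get; set; }

    public string ToSerializedString() =>
        JsonConvert.SerializeObject(this, new CensoredJsonSerializerSettings<FullyCensoredAttribute>());
}
```

With this sketch, calling ToSerializedString() on a LoggingExample replaces the Name value with the censor string, while a plain JsonConvert.SerializeObject(example) call still writes the real value. Because the value provider returns a string, decorating non-string members would likely need extra handling.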
That covers the groundwork and the core of our censorship engine, so now let's dive into a couple of example attributes. Here's the code:
Starting with the LargeDataAttribute, we have an example use case for the CensoredContentAttribute that doesn't involve direct censorship for privacy reasons, but rather for log size reasons. In our case, we were trying to store a raw copy of the body from an HttpRequest during development for debugging and replay purposes, but for some requests which included file uploads, this was excessively large. Utilizing the LargeDataAttribute, we were able to specify a threshold over which the data on the decorated member would be replaced with a message like "***Large Data Removed*** [Length: 1024]". This greatly eased the burden on our logging tools and provided an easy way to get a general feel for what was included without storing all of it.
The RedactedAttribute is by far the more heavily used of the two attributes, though. With it, we are able to flag members as Redacted and specify a varying amount of data that is okay to display in the logs when serialized. In the TruncateData method, we check that the combined number of first and last characters to show does not exceed the length of the original input. If it does, we spit out the value of the Censor property only (***REDACTED*** in this case). Otherwise, we append the first X characters of the input (where X is the value of ShowFirst) to our output string, append the Censor value, then do the same for the number of trailing characters specified by the ShowLast property. Finally, we run one last sanity check on the output and return it, or the Censor value if our planned output was somehow empty. This ensures that someone looking at the log cannot tell whether the original input value was empty or contained fewer characters than the supplied Censor string (that's also the reason the Censor string is padded with asterisks).
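The truncation rules for both attributes can be sketched roughly as follows. This is an illustration based on the description above, not the Gist's code. Several details are assumptions: ShowFirst and ShowLast are modeled as settable properties, so the usage syntax is ShowFirst = 3 rather than ShowFirst:3; the Threshold property name and its default are hypothetical; and the reported length is taken to be the length of the original data. The sketch also censors fully when the visible portions would cover the entire value, which is slightly stricter than the check described above.

```csharp
using System;

[AttributeUsage(AttributeTargets.Property | AttributeTargets.Field)]
public abstract class CensoredContentAttribute : Attribute
{
    public abstract string Censor { get; }
    public abstract string TruncateData(string input);
}

public class RedactedAttribute : CensoredContentAttribute
{
    // Number of leading and trailing characters that may stay visible.
    public int ShowFirst { get; set; }
    public int ShowLast { get; set; }

    public override string Censor => "***REDACTED***";

    public override string TruncateData(string input)
    {
        input = input ?? string.Empty;
        int first = Math.Max(0, ShowFirst);
        int last = Math.Max(0, ShowLast);

        // Assumption: if the visible parts would cover the whole value, hide everything.
        if (first + last >= input.Length)
        {
            return Censor;
        }

        string output = input.Substring(0, first) + Censor + input.Substring(input.Length - last);

        // Final sanity check, mirroring the article's description.
        return string.IsNullOrEmpty(output) ? Censor : output;
    }
}

public class LargeDataAttribute : CensoredContentAttribute
{
    // Hypothetical property name and default; values longer than this are replaced.
    public int Threshold { get; set; } = 1024;

    public override string Censor => "***Large Data Removed***";

    public override string TruncateData(string input)
    {
        input = input ?? string.Empty;
        return input.Length > Threshold
            ? $"{Censor} [Length: {input.Length}]"
            : input;
    }
}
```

Under these assumptions, a member decorated with [Redacted(ShowFirst = 3, ShowLast = 3)] and holding jane.doe@example.com would be written as jan***REDACTED***com, while a three-character value would come out as ***REDACTED*** alone.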
Packaging it all together, you end up with a pretty straightforward way to mark members of your classes as sensitive and modify the data before putting it through serialization. If you need to serialize the same object without the redaction, such as for writing to a DB, posting to an endpoint, or writing out to a file, you simply run it through the serializer without passing in the CensoredJsonSerializerSettings. Additionally, the <T> value of the CensoredJsonSerializerSettings allows you to use multiple children of the CensoredContentAttribute in a single class and trigger specific censor attributes based on the type you pass in as <T>. For example, we use <RedactedAttribute> for <T> when serializing for logs, and <LargeDataAttribute> when serializing for the DB.
Custom attributes provide a world of options and capabilities that every developer tends to find their own unique way of using. The CensoredContentAttribute is just one of the custom attributes we utilize here at DealerOn, and I plan on covering more of our helpful snippets and libraries in the future. Thanks for checking this out, and I look forward to reading how you are utilizing custom attributes in your own projects.