I am rewriting part of an application written in Python. It queries a JSON API, extracts the relevant information and stores it in an SQL database. It has served me well for a number of years and has changed quite a lot during its lifetime, and so has the notion of what information is relevant. The data in the API is only available for relatively short timeframes but stays valid in my application indefinitely, which could cause problems if other information became relevant later. Handling this is built into the Python version quite easily, but this post is about doing the very same in the new C# version.
When deserializing to statically defined types, you have to pick which properties to save and which to ignore. Preferring simple interfaces and classes, I only ever keep the data relevant to my app. This is no problem in apps that query APIs directly; if we gain interest in other properties, we can just change our static POCOs to contain those properties too. But what if we store the data in, say, an SQL database? A basic solution would be for our POCOs to include a property containing the JSON representation of that particular object. Of course, if our application entities are decoupled from JSON, we could alternatively store this data elsewhere, e.g. in an EF Core shadow property.
This way, we can later reconstruct the data including properties that make sense in future versions of our app; we never discard information. The string holding the JSON (call it FullJsonData) should of course only contain the JSON data needed to construct this single object. This is easy in a language like Python; we can just serialize the deserialized objects. In C#, the process is a little different.
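As a minimal sketch of the idea, assuming Newtonsoft.Json is referenced: the class names StoredMovie and MovieV2, the helper Upgrade and the sample values below are all made up for illustration. It shows an entity carrying its own JSON in FullJsonData and a later version of the app reading a new property from it.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical stored entity: the app only uses Title today,
// but keeps the raw JSON for this one object alongside it.
public class StoredMovie
{
    public string Title { get; set; }
    public string FullJsonData { get; set; }
}

// Hypothetical later version of the entity that also wants Year.
public class MovieV2
{
    public string Title { get; set; }
    public int Year { get; set; }
}

public static class FullJsonDemo
{
    // Rebuild the newer shape from the JSON stored with the old one.
    public static MovieV2 Upgrade(StoredMovie stored) =>
        JsonConvert.DeserializeObject<MovieV2>(stored.FullJsonData);

    public static void Main()
    {
        var stored = new StoredMovie
        {
            Title = "Example Movie",
            // Made-up sample data for this single object only.
            FullJsonData = "{\"Title\":\"Example Movie\",\"Year\":1999}"
        };

        MovieV2 upgraded = Upgrade(stored);
        Console.WriteLine($"{upgraded.Title} ({upgraded.Year})"); // Example Movie (1999)
    }
}
```

Because FullJsonData holds only the subtree for one object, each row can be upgraded on its own without the rest of the original response.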
Suppose, for example, that a response contains movies and their actors, including details such as the year of each movie and when each actor was born, while our application at an early stage only kept movie titles and actor names.
It would then be useful to be able, at a later stage in the application's lifetime, to get at the original JSON source, in case we decided to add the properties Movie.Year and Actor.Born. This is especially useful when we store (or cache, if you will) data in the application and the source data cannot be retrieved again later. Storing the full JSON document is not what we want, though; we want only the part of the document needed to reconstruct each object in isolation.
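To make the problem concrete, here is a small sketch, assuming Newtonsoft.Json. The response text, its values and the exact shape of the Movie and Actor classes are invented for illustration and are not the original listings. It shows that serializing the deserialized objects again, which is all the Python version needs to do, loses Year and Born in C#.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

// Early-stage classes: only titles and names are kept.
public class Actor
{
    public string Name { get; set; }
}

public class Movie
{
    public string Title { get; set; }
    public List<Actor> Actors { get; set; }
}

public static class LossDemo
{
    // Made-up response; Year and Born have no matching C# property.
    public const string Response = @"[
      { ""Title"": ""Example Movie"", ""Year"": 1999,
        ""Actors"": [ { ""Name"": ""Jane Doe"", ""Born"": 1970 } ] }
    ]";

    // Deserialize into the narrow classes, then serialize them again.
    public static string RoundTrip()
    {
        var movies = JsonConvert.DeserializeObject<List<Movie>>(Response);
        return JsonConvert.SerializeObject(movies);
    }

    public static void Main()
    {
        string json = RoundTrip();
        Console.WriteLine(json.Contains("Title")); // True
        Console.WriteLine(json.Contains("Year"));  // False
        Console.WriteLine(json.Contains("Born"));  // False
    }
}
```

By default Json.NET silently ignores JSON members that have no matching property, so the unknown values are gone by the time the objects exist; the source JSON has to be captured during deserialization.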
In a JSON library like Newtonsoft's Json.NET, we could do this with a generic JsonConverter implementation, and the end result would be a map from objects to JSON tokens.
Here JsonMapper is a map from the C# objects resulting from the deserialization to their source JSON representation, that is, the JSON subtree itself, not the entire JSON document. The map can later be used to look up the JSON token for each deserialized object, which can then be stored separately after deserialization.
The jsonSerializerSettings parameter for the builder is the usual settings object of the Newtonsoft JSON library, containing custom converters and other serializer-specific settings. map.Settings is the same settings object with type-specific converters added for the types tracked using the Tracking<T> method. This does not remove any custom converters for that type; the tracking converter sits in between and runs once, grabbing the JSON text of the subtree before handing over to any other converters for that type, which do the real conversion.
In a generic implementation of such a JsonConverter, the interesting methods are CanConvert and ReadJson. We only support reading, so writing throws an exception; this can be changed if needed.
The converter needs to run once per object: it saves the JSON into the map and then lets any other capable converter do the actual conversion. While that conversion runs, the tracking converter must ignore the subtree, so it is disabled until the object has been deserialized, after which it is enabled again. This state is stored in the Enabled flag, which is queried in the CanConvert method. If CanConvert evaluates to true, the ReadJson method is called by the Newtonsoft library. Here we disable the converter, store the JSON token and deserialize as usual, but with the tracking disabled. We then return the deserialized object and make sure type tracking is enabled again by setting the Enabled flag back to true.
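The steps above can be sketched roughly as follows. This is not the repository's code: the names TrackingConverter and TrackingDemo, the private _enabled field standing in for the Enabled flag, the plain dictionary used in place of JsonMapper, and the sample JSON are all assumptions made for illustration. It also assumes .NET 5 or later for ReferenceEqualityComparer, and it is not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public class Actor { public string Name { get; set; } }

public class Movie
{
    public string Title { get; set; }
    public List<Actor> Actors { get; set; }
}

// Hypothetical flat tracking converter for a single type T.
public class TrackingConverter<T> : JsonConverter
{
    private readonly IDictionary<object, JToken> _map;
    private bool _enabled = true; // plays the role of the Enabled flag

    public TrackingConverter(IDictionary<object, JToken> map) => _map = map;

    // Only claim the type while tracking is switched on.
    public override bool CanConvert(Type objectType) =>
        _enabled && objectType == typeof(T);

    public override bool CanWrite => false; // reading only

    public override object ReadJson(JsonReader reader, Type objectType,
        object existingValue, JsonSerializer serializer)
    {
        JToken token = JToken.Load(reader); // the subtree for this object
        if (token.Type == JTokenType.Null) return null;

        _enabled = false; // step aside so the normal conversion runs
        try
        {
            object value = token.ToObject(objectType, serializer);
            _map[value] = token; // remember where the object came from
            return value;
        }
        finally
        {
            _enabled = true; // track the next object of this type
        }
    }

    public override void WriteJson(JsonWriter writer, object value,
        JsonSerializer serializer) => throw new NotSupportedException();
}

public static class TrackingDemo
{
    public static void Main()
    {
        const string json = @"[{ ""Title"": ""Example Movie"", ""Year"": 1999,
            ""Actors"": [ { ""Name"": ""Jane Doe"", ""Born"": 1970 } ] }]";

        // Keys are compared by reference (requires .NET 5 or later).
        var map = new Dictionary<object, JToken>(ReferenceEqualityComparer.Instance);
        var settings = new JsonSerializerSettings();
        settings.Converters.Add(new TrackingConverter<Movie>(map));
        settings.Converters.Add(new TrackingConverter<Actor>(map));

        var movies = JsonConvert.DeserializeObject<List<Movie>>(json, settings);
        Actor actor = movies[0].Actors[0];
        Console.WriteLine(map[actor].ToString(Formatting.None));
        // {"Name":"Jane Doe","Born":1970}
    }
}
```

The try/finally is a design choice in this sketch: if the inner deserialization throws, the converter is still switched back on, so a later call with the same settings is not silently left untracked.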
The JsonMapper class and its builder act as boilerplate for a fluent interface for tracking types and constructing an immutable mapper object. The details can be found in the GitHub repository.
After deserialization, we can do a lookup for any of the objects above to get the full JSON, including the properties that did not get deserialized. This is useful if, later in the application's lifetime, the original JSON source is no longer available; imagine a weekly schedule API which only returns the current week.
For example, with serializedResponse being the example JSON document from above, we can deserialize it using the mapper's settings and then look up any of the resulting objects in the map to get its full JSON representation, which can be written to the console or stored.
In the actual application, the classes used for deserialization are generated using quicktype, and those generated classes are in turn used to construct application-specific entity classes; the mapping is kept between the application entities and their respective JSON data. This gives a degree of separation between JSON specifics and the core classes, as JSON is considered a low-level detail that is not part of the application core.
There are two strategies for object mapping: the flat one described here, and a recursive one, where recursive data structures are mapped separately at each level. For example, a person with another person as parent results in a single mapping using the flat strategy, and in multiple mappings using the recursive strategy. The difference between the two approaches lies in the use of the Enabled flag. Both are found in the git repository.
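For illustration only, here is one possible way to get the recursive behaviour. It is not the repository's approach: instead of an Enabled flag it uses JsonSerializer.Populate, and as a consequence it bypasses any other converter registered for the type. The names Person, RecursiveTrackingConverter and RecursiveDemo and the sample JSON are hypothetical, the tracked type needs a parameterless constructor, and ReferenceEqualityComparer requires .NET 5 or later.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical recursive type: a person whose parent is also a person.
public class Person
{
    public string Name { get; set; }
    public Person Parent { get; set; }
}

// Sketch of a recursive variant: nested T values are tracked as well.
public class RecursiveTrackingConverter<T> : JsonConverter where T : class, new()
{
    private readonly IDictionary<object, JToken> _map;

    public RecursiveTrackingConverter(IDictionary<object, JToken> map) => _map = map;

    public override bool CanConvert(Type objectType) => objectType == typeof(T);

    public override bool CanWrite => false; // reading only

    public override object ReadJson(JsonReader reader, Type objectType,
        object existingValue, JsonSerializer serializer)
    {
        if (reader.TokenType == JsonToken.Null) return null;

        JObject token = JObject.Load(reader);
        var value = new T();
        // Populate fills the existing instance without calling this
        // converter for the root object, but nested properties of type T
        // are resolved through the converter again and tracked too.
        serializer.Populate(token.CreateReader(), value);
        _map[value] = token;
        return value;
    }

    public override void WriteJson(JsonWriter writer, object value,
        JsonSerializer serializer) => throw new NotSupportedException();
}

public static class RecursiveDemo
{
    // Returns the number of tracked objects for a made-up two-level document.
    public static int CountMappings()
    {
        const string json = @"{ ""Name"": ""Child"", ""Born"": 2001,
            ""Parent"": { ""Name"": ""Parent"", ""Born"": 1970 } }";

        var map = new Dictionary<object, JToken>(ReferenceEqualityComparer.Instance);
        var settings = new JsonSerializerSettings();
        settings.Converters.Add(new RecursiveTrackingConverter<Person>(map));

        JsonConvert.DeserializeObject<Person>(json, settings);
        return map.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountMappings()); // 2: the child and its parent
    }
}
```

With a flat strategy the same document would yield a single mapping for the outer person, whose stored subtree still contains the parent's JSON.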