🧩 JSON Change Notation

We have multiple services in our infra, and each service is responsible for a significant chunk of our business. The engineering team is divided the same way: each team manages one service and has three to four members on average. By the nature of the business, these services have interdependencies, which means they interact with each other through some channel. Even though we have common tools and standards to build these services, each team has a decent degree of freedom to build things its own way.

As a company whose primary business is not tech, our focus was on getting things done, as long as the solution was decent. Lately, we started focusing on selling our software as a service to other similar businesses in the market. While reviewing our system through that lens, we found a few areas where we need to improve, and one such area is managing the configurations for each service, or across multiple services.

😈 Problem

As mentioned above, each team has the freedom to manage the configurations required for its service. These configurations, in general, don't belong to the engineering team. The business, marketing and sales teams are the owners of this config. For example, which document agreement vendor is active? The engineering team has no idea which vendor it should be, so the control to select it should be given to non-engineering teams. That is how an engineering team should work in a company whose product is not tech.

The engineering team should be building the tools required for the business to run. Using those tools to run the business should not be the responsibility of the engineering team.

Similar to the above example, each service has many configurations on top of which it runs. Ideally, these configurations live inside a json file that the non-engineering teams modify as and when they want. There are multiple problems we need to solve to make our system fully configurable, which is required to sell it as SaaS. A more complete sample config is shown below, followed by a few of those problems.

{
  "doc_vendor": "BluffSign",
  "round_off_method": "ciel",
  "mandatory_details": ["name", "social_security_number", "dob"],
  "approval_check": {
    "min_score": 80,
    "max_accounts": 3
  },
  "slabs": [{ "min": 0, "max": 10000, "fee": 100 }],
  "flavours": {
    "base": {
      "serviceable": true
    }
  }
}

1. Config inside the code

As a startup, we usually move fast and focus on getting things done, accepting an architecture and design that is okay for now as long as it can be cleaned up later. In the process, we might end up (and we have) keeping all these could-be-configurable things inside the code as constants. Over time, they get buried deep inside the code. They need to be moved to json. Most of the time, they are tangled inside a complex if-else web. Still, it is a comparatively easy problem to solve.
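To make the idea concrete, here is a minimal sketch of moving one buried constant into config. It reuses the slabs field from the sample config; the function names fee_hardcoded and fee_from_config are made up for illustration and are not from our codebase.

```python
import json


def fee_hardcoded(amount: int) -> int:
    """Before: the slab boundaries and the fee are constants inside the logic."""
    if 0 <= amount <= 10000:
        return 100
    raise ValueError("no slab covers this amount")


# After: the same rule is driven by the "slabs" list of the config json.
CONFIG = json.loads('{"slabs": [{"min": 0, "max": 10000, "fee": 100}]}')


def fee_from_config(amount: int, config: dict) -> int:
    """Return the fee of the first slab whose [min, max] range covers amount."""
    for slab in config["slabs"]:
        if slab["min"] <= amount <= slab["max"]:
            return slab["fee"]
    raise ValueError("no slab covers this amount")
```

With the second version, adding a new slab becomes a config change instead of a code change.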

2. JSON inside code

A few teams had already identified the above problem and started moving to config-as-json. Some of the configurations were moved to .json files but kept inside the code repository itself and committed as part of the code. This means that to update the config, someone has to open an IDE, update the value, commit, raise a pull request and re-deploy the service. This has to be resolved somehow.

3. No UI

Needless to say, in the best case we had .json files. The current process of updating config depends a lot on the engineering team. A UI to manage these configuration files would be awesome! We also wanted these changes to go through some sort of approval system.

4. No central config

As mentioned above, these services depend on each other by nature; there is a dependency tree among them. So, we needed the configs to be shared across services so that we don't configure the same thing in different services again and again.

😎 Solution

The first and foremost thing we wanted to do was to bring in standards and tools for managing the configurations. Below is the checklist of what the solution should cover at a minimum:

Can be used by any service
Ability to define the schema for the configuration
The final config should be in the form of json and be stored on s3
Should have a UI where business teams can view and update the values themselves, with zero involvement from the engineering team
Should go through the internal maker and checker system we have
There should be both an API and a python package (python being our primary stack) for the services to fetch the config
The configs should be versioned, with the ability to get a particular version of a config if required

Fast forward a couple of days, and we had a POC. It had a basic UI, people could define the schema of the configuration using pydictable, and updating the config sent the change for approval. It also provided APIs to read the config, to start with. Going a little deeper: when the user made changes to the config in the UI, it sent a request like the one shown below to the config service for the update. The request contained only the changed fields (partial) and their values.

{
  "approval_check": {
    "min_score": 90
  }
}
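A partial payload like this can be applied with a recursive merge. The sketch below is an assumption about how such a merge could look, not the POC's actual code, and merge_partial is a hypothetical name. It also shows where the approach runs out: a list in the payload can only replace the existing list as a whole.

```python
import copy


def merge_partial(config: dict, partial: dict) -> dict:
    """Return a copy of config with the fields of partial merged in.

    Nested objects are merged key by key. Anything else, lists included,
    is replaced as a whole by the value in partial.
    """
    result = copy.deepcopy(config)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_partial(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


config = {
    "approval_check": {"min_score": 80, "max_accounts": 3},
    "slabs": [{"min": 0, "max": 10000, "fee": 100}],
}

# Updating an existing nested value works as expected.
updated = merge_partial(config, {"approval_check": {"min_score": 90}})

# A list cannot be extended this way: the new list replaces the old one.
replaced = merge_partial(
    config, {"slabs": [{"min": 10001, "max": 100000, "fee": 200}]}
)
```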

It was a pretty decent solution, I would say. Using it, we could update any existing value, but it was not possible to add an item to a list or update the item at a particular index of a list. So, we quickly understood that we cannot just take a partial config and update those fields on the existing config json. We needed something different, by which we could denote all the different types of changes possible on the json, i.e. update, add and delete. Before even going deep into defining this notation, we had to take a step back and ask ourselves whether we were heading in the right direction. At a high level, there are two ways to update the config.

Take the updated config and replace
- This seems the simplest option. We don't have to worry much while updating the config: we just take the entire updated config every time and consider it the new config. The downside is that the reviewer does not understand what exactly has changed. Context has to be provided, either online or offline, explaining the change. Another disadvantage is that we never know which change caused this new version of the config. We would always need to run the previous and current configs through a diff checker to identify the change.
Take only the changes and create new config by applying the changes in order
- This method seems pretty controlled in nature. It is similar to how git presents history, as a sequence of changes. We record the changes at the lowest level of granularity possible. We pass these changes to the update system, which takes the latest config and applies the changes one by one, in order; the result is considered the updated config. This seems pretty safe. The only con of this method is the development cost: it takes more time and is heavier on implementation compared to the previous method.
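For the first method, the diff checker step could be as small as the sketch below, which uses difflib from the python standard library on the pretty-printed json. The function name config_diff is hypothetical; this is only an illustration of the extra step the reviewer would depend on.

```python
import difflib
import json


def config_diff(previous: dict, current: dict) -> list:
    """Return unified-diff lines between two configs, compared as sorted json."""
    before = json.dumps(previous, indent=2, sort_keys=True).splitlines()
    after = json.dumps(current, indent=2, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(
            before, after, fromfile="previous", tofile="current", lineterm=""
        )
    )


previous = {"approval_check": {"min_score": 80, "max_accounts": 3}}
current = {"approval_check": {"min_score": 90, "max_accounts": 3}}
diff = config_diff(previous, current)
print("\n".join(diff))
```

The output tells the reviewer which lines differ, but not the intent behind them, which is part of what pushed us towards the second method.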

There is nothing to try in the first method. It is pretty straightforward and we know what it looks like. So, we wanted to try out the second method and see how it works. We spent some time implementing it, and this is what we did:

The UI should record all changes in a particular notation
On submit, it has to send them to the backend (through the approval system, of course), which understands the notation, applies the changes one by one in a loop, and saves the resulting json as the latest version on s3.

🔌 Notation

As discussed, we needed three operations, i.e. set, add and delete, to be possible on every field at any nested level. Separately, we need to specify the prefix (address) of the field as well. At a high level, we decided to follow the {operation}{prefix}: value notation to represent a change. Let us break it down.

Prefix

We followed a convention similar to unix file system paths to define the location of a field inside the json, with . as the separator. For example, .approval_check.min_score represents the min_score field inside approval_check. We need to somehow represent list items as well, and we used the [index] notation for that. For example, .slabs.[0] represents the 0th item in the slabs list.
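A prefix in this form can be turned into a list of keys and indexes with a few lines of python. This is a minimal sketch, assuming that keys never contain a . themselves; parse_prefix and get_at are hypothetical names, not the functions of our config service.

```python
import re

_INDEX = re.compile(r"^\[(\d+)\]$")  # matches a list index token such as [0]


def parse_prefix(prefix: str) -> list:
    """Turn '.slabs.[0].fee' into ['slabs', 0, 'fee']."""
    if not prefix.startswith("."):
        raise ValueError(f"prefix must start with '.': {prefix!r}")
    tokens = []
    for part in prefix[1:].split("."):
        match = _INDEX.match(part)
        # [index] tokens become ints so they can index into lists directly
        tokens.append(int(match.group(1)) if match else part)
    return tokens


def get_at(config, prefix: str):
    """Walk the config along the prefix and return the value found there."""
    node = config
    for token in parse_prefix(prefix):
        node = node[token]
    return node


config = {
    "approval_check": {"min_score": 80, "max_accounts": 3},
    "slabs": [{"min": 0, "max": 10000, "fee": 100}],
}
```

For example, get_at(config, ".approval_check.min_score") walks into approval_check and returns the value of min_score.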

Set

This operation updates a field that already exists in the config. We denote it with an empty string, i.e. nothing is written before the prefix. The change carries the new value for the field.

Add

This operation represents a new item being added to the json, one that does not already exist. For objects, it is a new key-value pair; for lists, it is a new item added to the end of the list. We denote the operation by +, and for lists the prefix always ends with []. It has a value associated with it.

Delete

It is self-explanatory. It can be applied to both lists and nested dicts. We denote it by -. The last item in the prefix tells which item is being deleted; if it is a list item, it is written in the [index] notation.

With this notation in place, we can update any json with a bunch of changes. Let us take an example.

{
  "changes": [
    {
      "prefix": ".doc_vendor",
      "value": "AnotherVendor"
    },
    {
      "prefix": "+.slabs.[]",
      "value": {}
    },
    {
      "prefix": "+.slabs.[1].min",
      "value": 10001
    },
    {
      "prefix": "+.slabs.[1].max",
      "value": 100000
    },
    {
      "prefix": "+.slabs.[1].fee",
      "value": 200
    },
    {
      "prefix": ".flavours.base.serviceable",
      "value": false
    },
    {
      "prefix": "+.flavours.premium",
      "value": {}
    },
    {
      "prefix": "+.flavours.premium.serviceable",
      "value": true
    },
    {
      "prefix": "-.mandatory_details.[1]",
      "value": 1 // just to be consistent
    }
  ]
}

If we apply these changes as explained above, we end up with the following json, which can then be saved to s3 or some other place. The beauty of this method is that we know exactly what the changes are. If we want, we can even simulate the changes and see what the final json would look like before we approve them.

{
  "doc_vendor": "AnotherVendor",
  "round_off_method": "ciel",
  "mandatory_details": ["name", "dob"],
  "approval_check": {
    "min_score": 80,
    "max_accounts": 3
  },
  "slabs": [
    { "min": 0, "max": 10000, "fee": 100 },
    { "min": 10001, "max": 100000, "fee": 200 }
  ],
  "flavours": {
    "base": {
      "serviceable": false
    },
    "premium": {
      "serviceable": true
    }
  }
}
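The apply loop on the backend can be sketched roughly as below. This is an illustration under assumptions, not our production code: the function names are hypothetical, keys are assumed not to contain a ., and the value sent with a delete is simply ignored. Note that the newly appended slab is addressed at index 1, since it sits after the existing one, and that item 1 of mandatory_details is removed with a - change.

```python
import copy
import re

_INDEX = re.compile(r"^\[(\d+)\]$")  # matches a list index token such as [0]


def parse_change(notation):
    """Split '{operation}{prefix}' into (operation, list of path tokens)."""
    operation = {"+": "add", "-": "delete"}.get(notation[:1], "set")
    prefix = notation if operation == "set" else notation[1:]
    if not prefix.startswith("."):
        raise ValueError(f"prefix must start with '.': {notation!r}")
    tokens = []
    for part in prefix[1:].split("."):
        match = _INDEX.match(part)
        tokens.append(int(match.group(1)) if match else part)
    return operation, tokens


def apply_change(config, notation, value=None):
    """Apply a single change to config in place."""
    operation, tokens = parse_change(notation)
    parent = config
    for token in tokens[:-1]:  # walk down to the container of the target
        parent = parent[token]
    last = tokens[-1]
    if operation == "set":
        if isinstance(parent, dict) and last not in parent:
            raise KeyError(f"cannot set a missing field: {notation}")
        parent[last] = value
    elif operation == "add":
        if last == "[]" and isinstance(parent, list):
            parent.append(value)  # lists only grow at the end
        elif (isinstance(parent, dict) and isinstance(last, str)
              and last != "[]" and last not in parent):
            parent[last] = value
        else:
            raise KeyError(f"cannot add at: {notation}")
    else:  # delete; any value sent with the change is ignored
        del parent[last]


def apply_changes(config, changes):
    """Return a new config with the changes applied one by one, in order."""
    result = copy.deepcopy(config)
    for change in changes:
        apply_change(result, change["prefix"], copy.deepcopy(change.get("value")))
    return result


config = {
    "doc_vendor": "BluffSign",
    "round_off_method": "ciel",
    "mandatory_details": ["name", "social_security_number", "dob"],
    "approval_check": {"min_score": 80, "max_accounts": 3},
    "slabs": [{"min": 0, "max": 10000, "fee": 100}],
    "flavours": {"base": {"serviceable": True}},
}
changes = [
    {"prefix": ".doc_vendor", "value": "AnotherVendor"},
    {"prefix": "+.slabs.[]", "value": {}},
    {"prefix": "+.slabs.[1].min", "value": 10001},
    {"prefix": "+.slabs.[1].max", "value": 100000},
    {"prefix": "+.slabs.[1].fee", "value": 200},
    {"prefix": ".flavours.base.serviceable", "value": False},
    {"prefix": "+.flavours.premium", "value": {}},
    {"prefix": "+.flavours.premium.serviceable", "value": True},
    {"prefix": "-.mandatory_details.[1]", "value": 1},
]
updated = apply_changes(config, changes)
```

Because apply_changes works on a copy, the same function can back the simulation in the UI: run it on the latest config, show the result, and save nothing until the change is approved.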

The UI is built in such a way that it supports any level of nesting. The user can see the changes being made: it highlights the fields that are added, updated and deleted in different colours, and even simulates and shows the final json. All thanks to the pydictable schema validation and representation in the backend. Invalid values are caught early: for example, if the UI passes an integer in place of a string, the validation fails and a human-readable message is shown.
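To give a feel for the kind of check involved, here is a plain-python stand-in. It deliberately does not use pydictable's API; SCHEMA and validate are hypothetical names, and the message format is just an assumption about what "human readable" could look like.

```python
# A schema is a nested structure of python types:
# dict -> object with these fields, [type] -> list of that type.
SCHEMA = {
    "doc_vendor": str,
    "approval_check": {"min_score": int, "max_accounts": int},
    "mandatory_details": [str],
}


def validate(value, schema, path=""):
    """Return a list of human-readable errors; an empty list means valid."""
    if isinstance(schema, dict):
        if not isinstance(value, dict):
            return [f"{path or '.'}: expected object, got {type(value).__name__}"]
        errors = []
        for key, sub_schema in schema.items():
            if key not in value:
                errors.append(f"{path}.{key}: missing field")
            else:
                errors.extend(validate(value[key], sub_schema, f"{path}.{key}"))
        return errors
    if isinstance(schema, list):
        if not isinstance(value, list):
            return [f"{path}: expected list, got {type(value).__name__}"]
        errors = []
        for index, item in enumerate(value):
            errors.extend(validate(item, schema[0], f"{path}.[{index}]"))
        return errors
    # bool is a subclass of int in python, so reject it unless bool is expected
    wrong_type = not isinstance(value, schema) or (
        isinstance(value, bool) and schema is not bool
    )
    if wrong_type:
        return [f"{path}: expected {schema.__name__}, got {type(value).__name__}"]
    return []


bad_config = {
    "doc_vendor": 42,
    "approval_check": {"min_score": "high", "max_accounts": 3},
    "mandatory_details": ["name", 7],
}
errors = validate(bad_config, SCHEMA)
```

Each error names the field using the same prefix convention as the change notation, so the UI can point at the exact input that failed.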

The config system itself is built in a generic way, so any team can add its own config with a pre-defined schema and get all the functionality discussed above. It even comes with a python package where we can just call get_config('my_config'), which gives us back the json/dict. This gives us the power to centralise config management, and we can even share configs across multiple services. We are looking forward to moving all the configs into it in future :)
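As a rough picture of how versioned reads could look from a service's point of view, here is an in-memory sketch. Only the get_config('my_config') call comes from our package; the ConfigStore class, the publish method and the version parameter are assumptions made for illustration, and the real system keeps the json on s3 rather than in memory.

```python
import copy


class ConfigStore:
    """Keeps every published version of every config, in memory."""

    def __init__(self):
        self._versions = {}  # config name -> list of configs, version 1 first

    def publish(self, name, config):
        """Store config as the newest version and return its version number."""
        history = self._versions.setdefault(name, [])
        history.append(copy.deepcopy(config))
        return len(history)

    def get_config(self, name, version=None):
        """Return the latest config, or a particular version if one is given."""
        history = self._versions[name]
        if version is None:
            version = len(history)
        if not 1 <= version <= len(history):
            raise KeyError(f"{name} has no version {version}")
        return copy.deepcopy(history[version - 1])


store = ConfigStore()
store.publish("my_config", {"doc_vendor": "BluffSign"})
store.publish("my_config", {"doc_vendor": "AnotherVendor"})

latest = store.get_config("my_config")
first = store.get_config("my_config", version=1)
```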

Hope it was helpful and you enjoyed the story. Connect with Mansha and Ranjani to know more about it. They are the people who actually built it. Cheers :)