Using Git for Configuration Data revision control

Omar Mussa · March 1, 2018

We have a large LabVIEW project that is structured in multiple Git repos using Git submodules.

Within each Project we have our source organized into folders like this:

Project Folder
	Configuration Data Folder
	Source Code Folder

We typically do development on our development machines where the Configuration Data Folder contains configuration files that are in simulation mode and using atypical configurations.

We also deploy our development system onto tools during development and testing by cloning the repo onto the hardware supported platforms. On these machines, the configuration data is modified to remove the simulation flag and is further configured for the specific project being developed.

When we do a pull onto the deployed systems, we definitely want all the Source Code Folder changes but we generally do not want the Configuration Data. However, we want the Configuration Data to be tracked, so we do want it to be in the/a repo. Does anyone know what the best method to do this is?

What we don't want to happen is for the deployed system configuration data (which might include some machine-specific constants, etc.) to get overwritten by some developer setting that was used for testing, but we still want to pull/merge changes onto the tool. Note that we have the tool on a separate branch from the development branch, but whenever we merge we can potentially get into trouble if we pull configuration data as part of the sync. Since the configuration data is text based, Git typically auto-merges these changes, so it can be difficult to tell that the merge affected the configuration data. I am curious if anyone has had this issue already and come up with a good solution/strategy.
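One low-tech way to make a silent auto-merge visible is to list the files a merge touched and flag anything under the configuration folder. The sketch below is only an illustration: it assumes the changed paths come from something like `git diff --name-only ORIG_HEAD HEAD` run right after the merge, and the function name `config_changes` and the folder name "Configuration Data" are placeholders, not anything from the actual project.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def config_changes(changed_paths: Iterable[str], config_dir: str) -> List[str]:
    """Return the changed paths that live somewhere under config_dir.

    changed_paths is expected to hold repo-relative paths with forward
    slashes, one per entry, as printed by `git diff --name-only`.
    """
    root = PurePosixPath(config_dir)
    hits = []
    for raw in changed_paths:
        path = PurePosixPath(raw.strip())
        # A path is "under" the config folder if that folder is one of its parents.
        if root in path.parents:
            hits.append(str(path))
    return hits


if __name__ == "__main__":
    # Hypothetical output of `git diff --name-only ORIG_HEAD HEAD`.
    merged = [
        "Configuration Data/tool.ini",
        "Source Code/main.vi",
    ]
    for path in config_changes(merged, "Configuration Data"):
        print(f"WARNING: merge changed config file: {path}")
```

A check like this does not prevent the overwrite; it only tells whoever ran the merge that the configuration needs a second look.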

smithd · March 2, 2018

Well, I'd suggest keeping the simulation-related cfg in a separate location such that there is no way to run with simulation cfg even if everything goes bad, but...it looks like this does what you want:

https://medium.com/@porteneuve/how-to-make-git-preserve-specific-files-while-merging-18c92343826b
(windows instruction in first comment)

Edited March 2, 2018 by smithd

JKSH · March 2, 2018

2 hours ago, smithd said:

Well, I'd suggest keeping the simulation-related cfg in a separate location such that there is no way to run with simulation cfg even if everything goes bad

I agree completely: Keep the Git-tracked config files separate from the deployment config files. Trying to make Git track the files and ignore the files at the same time is messy and unintuitive; if any errors occur in the process, they might be hard to detect and to fix.

Some other possibilities to consider (these ideas aren't mutually exclusive; you can implement more than 1):

- Have your application search for config files in a "deployment" folder first. If those aren't found, then fall back to the simulation config files.
	- This way, both deployment and development machines can run the same code yet read from different folders.
	- This way, the "deployment" folders are untracked by Git and there's no risk of overwriting their contents.
- Make it visually obvious when your application is running in simulation mode (e.g. change the background colour and show a label).
- Deploy by building and distributing executables instead of pulling source code.
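
The "deployment folder first, simulation folder as fallback" idea can be sketched in a few lines. This is written in Python rather than LabVIEW purely for illustration; the names `resolve_config` and `is_simulation` and the folder layout are assumptions, not part of the project discussed here.

```python
from pathlib import Path
from typing import Sequence


def resolve_config(name: str, search_dirs: Sequence[Path]) -> Path:
    """Return the first existing copy of `name`, checking folders in priority order."""
    for folder in search_dirs:
        candidate = folder / name
        if candidate.is_file():
            return candidate
    searched = ", ".join(str(d) for d in search_dirs)
    raise FileNotFoundError(f"{name} not found in: {searched}")


def is_simulation(config_path: Path, simulation_dir: Path) -> bool:
    """True when the resolved file sits directly in the Git-tracked simulation folder."""
    return config_path.resolve().parent == simulation_dir.resolve()


if __name__ == "__main__":
    # Hypothetical layout: "deployment" is untracked, "simulation" is in the repo.
    deployment = Path("deployment")
    simulation = Path("simulation")
    try:
        cfg = resolve_config("tool.ini", [deployment, simulation])
        mode = "SIMULATION" if is_simulation(cfg, simulation) else "deployed"
        print(f"Using {cfg} ({mode})")
    except FileNotFoundError as err:
        print(err)
```

Because the lookup reports which folder won, the same result can drive the "make simulation mode visually obvious" suggestion.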

4 hours ago, Omar Mussa said:

We typically do development on our development machines where the Configuration Data Folder contains configuration files that are in simulation mode and using atypical configurations.

We also deploy our development system onto tools during development and testing by cloning the repo onto the hardware supported platforms. On these machines, the configuration data is modified to remove the simulation flag and is further configured for the specific project being developed.

When we do a pull onto the deployed systems, we definitely want all the Source Code Folder changes but we generally do not want the Configuration Data. However, we want the Configuration Data to be tracked, so we do want it to be in the/a repo.

It sounds like your deployment machines run the LabVIEW source code directly. How do you manage the risk of the code getting accidentally (or maliciously) modified by an operator?

Omar Mussa · March 2, 2018

3 hours ago, smithd said:

Well, I'd suggest keeping the simulation-related cfg in a separate location such that there is no way to run with simulation cfg even if everything goes bad, but...it looks like this does what you want:

https://medium.com/@porteneuve/how-to-make-git-preserve-specific-files-while-merging-18c92343826b
(windows instruction in first comment)

I agree about keeping the sim-related config separate - I have taken this approach and found it useful, but it can also be tricky to ensure that configuration data that is added during development to the simulation files is merged into the deployment folder. I've personally found BeyondCompare to help solve this problem, but it is a manual step that has been hard to enforce with process. The article you suggested looks really promising - I think it was exactly what I was looking for but failed to find by my own searching. I think this is going to help me avoid going down a fairly messy rabbit hole!

1 hour ago, JKSH said:

I agree completely: Keep the Git-tracked config files separate from the deployment config files. Trying to make Git track the files and ignore the files at the same time is messy and unintuitive; if any errors occur in the process, they might be hard to detect and to fix.

Some other possibilities to consider (these ideas aren't mutually exclusive; you can implement more than 1):

- Have your application search for config files in a "deployment" folder first. If those aren't found, then fall back to the simulation config files.
	- This way, both deployment and development machines can run the same code yet read from different folders.
	- This way, the "deployment" folders are untracked by Git and there's no risk of overwriting their contents.
- Make it visually obvious when your application is running in simulation mode (e.g. change the background colour and show a label).
- Deploy by building and distributing executables instead of pulling source code.

It sounds like your deployment machines run the LabVIEW source code directly. How do you manage the risk of the code getting accidentally (or maliciously) modified by an operator?

We're definitely not running source directly in our final deployment. We run from built exe's and leave no source code on the deployment machines. But the other points are definitely valid. I agree this is a suboptimal way to manage the configuration data. Ultimately what I think I want is for the auto-merging of the text to fail so that the developers can make intelligent decisions about what to merge - basically the way that binary files are treated (probably the only time I will ever think this way) - I think the article @smithd linked to will do that for me.

Very appreciative of the help and advice!

Stagg54 · April 2, 2018

There is a way to get git to treat specific extensions (and possibly specific files) as binary and not try to merge them... Google the .gitattributes file. It's similar to the .gitignore file and I think it will do what you want.
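
As a rough sketch of what such a file could contain: Git's documentation describes unsetting the `merge` attribute (written `-merge`) as taking the current branch's version as the tentative result and declaring a conflict, which is the "treat it like a binary file" behaviour. The helper below just renders those lines; the function name `render_gitattributes` and the `config/*.ini` patterns are made up for illustration, and patterns containing spaces are rejected because they need special quoting that this sketch does not attempt.

```python
from typing import Iterable


def render_gitattributes(patterns: Iterable[str], attribute: str = "-merge") -> str:
    """Build .gitattributes text that applies `attribute` to each path pattern."""
    lines = []
    for pattern in patterns:
        # Whitespace separates pattern from attributes, so a bare space
        # inside a pattern would be misread. Refuse rather than guess.
        if any(ch.isspace() for ch in pattern):
            raise ValueError(f"pattern needs quoting, not handled here: {pattern!r}")
        lines.append(f"{pattern} {attribute}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical patterns; adjust to wherever the config files really live.
    print(render_gitattributes(["config/*.ini", "config/*.xml"]), end="")
```

The documentation says the attribute applies when a file-level merge is necessary, so a config file changed on only one branch would presumably still come across without a conflict. It is worth trying on a scratch repo before relying on it.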
