Avoid storing configuration data in your revision control system

By Confusion on Tuesday 20 October 2009 22:02 - Comments (3)
Categories: Java, Software engineering, XML, Views: 4.792

After a discussion with a colleague this afternoon, I thought I'd share the following: you should avoid storing configuration data in your revision control system (RCS). Authentication credentials in particular should not be in there. Here's why:
  1. When securing servers and networks, things like the server hosting an RCS don't get the same priority as, say, your web-facing production server. Mistakes are easy to make, and you can simply use Google to find 'accidentally' web-facing RCSs that expose passwords.
  2. There will be plenty of copies 'out there', outside of your control. How many developers have that data stored on their machine? How careful are they with their laptops and your production passwords?
  3. With the configuration kept outside the RCS, access to it can be limited to those who should be able to change it: no accidental changes by a junior developer performing a careless check-in.
  4. If you can use the exact same build for your development, test/staging and production environments, then you can cleanly separate code problems from configuration problems. If you need to rebuild a distributable archive so that the build process can include environment-specific configuration, there will always be the doubt that some other difference may have sneaked in.
  5. It's much easier to change the configuration if you don't have to make a new build to deploy the change.
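Points 4 and 5 come down to the application reading its configuration at startup from a location outside the archive. A minimal sketch of that idea, using only the standard library: the class name `ExternalConfig` and the `app.config` system property are assumptions made up for this illustration, not something any framework prescribes.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper: the name and the "app.config" property are illustrative only.
final class ExternalConfig {

    /** Reads a .properties file that lives outside the deployed archive. */
    static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        try (Reader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            props.load(reader);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // The operator supplies the location at startup, for example:
        //   java -Dapp.config=/path/to/app.properties ...
        String location = System.getProperty("app.config");
        if (location == null) {
            System.err.println("Start with -Dapp.config=<path to properties file>");
            return;
        }
        Properties props = load(Paths.get(location));
        // Print only the keys: the values may be credentials.
        System.out.println("Loaded keys: " + props.stringPropertyNames());
    }
}
```

With something like this, the same archive can run in every environment, and only the file that the system property points to differs per machine.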
Now I specifically say 'avoid' and not 'do not ever', because many frameworks do not make this separation particularly easy. In the Java world, standard tools and frameworks like Maven, Spring and Hibernate all put obstacles in the way of keeping sensitive configuration data out of RCSs.

Maven is a build tool that offers all kinds of build-time placeholder substitution capabilities, which is diametrically opposed to this advice. Spring does dependency injection, and the configuration that wires your application together strongly tempts you to put other types of configuration data in the same place. And if you are paranoid enough to give production databases different names, so you can never accidentally run a test against a production database: how do you get that name into your Hibernate O/R mappings at startup time?
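One possible answer to the database name question is to resolve it at startup and hand the result to the framework programmatically. The sketch below shows only the resolving part in plain Java. The class `StartupSettings`, the keys `db.host` and `db.name`, and the PostgreSQL-style URL are assumptions for illustration; how the resulting URL reaches Hibernate or Spring depends on the setup and is not shown.

```java
import java.util.Properties;

// Hypothetical helper: class name and property keys are illustrative only.
final class StartupSettings {

    /** Overlays externally supplied values on the defaults shipped with the code. */
    static Properties merge(Properties shippedDefaults, Properties external) {
        Properties merged = new Properties();
        merged.putAll(shippedDefaults);
        merged.putAll(external); // external values win over shipped defaults
        return merged;
    }

    /** Builds a PostgreSQL-style JDBC URL; fails fast if the database name is missing. */
    static String jdbcUrl(Properties settings) {
        String name = settings.getProperty("db.name");
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalStateException("db.name must be set in the external configuration");
        }
        String host = settings.getProperty("db.host", "localhost");
        return "jdbc:postgresql://" + host + "/" + name;
    }

    public static void main(String[] args) {
        // Non-sensitive defaults that can safely live next to the code.
        Properties defaults = new Properties();
        defaults.setProperty("db.host", "localhost");

        // Stand-in for values read from a file outside the RCS.
        Properties external = new Properties();
        external.setProperty("db.host", "db.internal.example");
        external.setProperty("db.name", "orders_prod");

        System.out.println(jdbcUrl(merge(defaults, external)));
        // prints: jdbc:postgresql://db.internal.example/orders_prod
    }
}
```

Failing fast when the name is missing is a deliberate choice here: an application that refuses to start is easier to diagnose than one that silently connects to a default database.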

It takes careful thought and thorough understanding of the build and startup processes, but in my opinion it is well worth it. Every time I deploy a new version of the one application in which configuration and code are completely separated, where I just have to drop a new .jar and restart, I dance with joy.


Comments


By Tweakers user Gomez12, Tuesday 20 October 2009 23:59

What do you mean by a revision control system?

We basically use a scm with control-rights.
Basically the configs are just for test-environments. They are readable by almost anyone, changeable only by 1 group ( admin, not developers ) and denied for the build creator...

It could be a problem with scm's like git etc. which allow everyone full control over every branch they make themselves.

But I think the rights sufficiently cover it, it is a real pain if you leave the configs out of the scm because then you have 20 different configs etc. ( and why did you start using an scm )

If only one person can change the configs there is only one person to blame if there is a f*ckup, and configs don't change every hour so one person should be able to manage them...

By Tim O'Brien, Wednesday 21 October 2009 02:14

Hmmm... I have a suspicion that you haven't yet learned how to use a Maven build profile.

By Tweakers user Confusion, Wednesday 21 October 2009 07:14

@Gomez:
I mean an scm or vcs; in practice, those are all synonyms, aren't they?

Proper access control mitigates points 2 and 3, but not the others.

@Tim
Every project has several Maven build profiles. The problem is that a build profile does things at build time, while the argument is that some parts of a configuration shouldn't be present in the RCS and shouldn't be present on the developers' machines, in which case a build profile cannot use them.
