How to update large software projects

ralf · May-11-2018, 09:11 AM

Hi,

In my spare time I have been working on a huge project for years.
I started to make it open source at the beginning of this year.

Now I'm facing some problems I already expected: How to provide updates.
(Luckily nobody uses this software right now, so I can experiment with possible solutions :) )

While I was the only user of this software, it was easy to make changes by hand.
When I made the software open source, I wrote an installation script (bash) that does all the steps necessary to make the software work.
But when it comes to updates, it is not that easy to make changes without breaking anything.

I need to do the following things:

Updating a database: Adding/Removing columns
Updating an INI configuration: Adding/Removing keys and sections
Updating a JavaScript file: Adding/Removing lines
Updating user data directory structures: Adding/Removing/Moving files and directories (I do not mean the software directory, that can be simply updated from the sources via rsync)

An example of the database-adding-column-problem I already solved:
To add a new column to a table in the database that the software manages, the installation runs the following script lines.

# Add the column only if table_info does not already list it.
if ! sqlite3 "$DATAPATH/database.db" 'PRAGMA table_info("tablename");' | grep -q "columnname"; then
    sqlite3 "$DATAPATH/database.db" "ALTER TABLE tablename ADD COLUMN columnname TEXT DEFAULT '';"
fi

This is ugly as hell.
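
One way to make the check above less repetitive might be to wrap it in a helper function. The following is only a sketch: the function name add_column_if_missing is hypothetical, it assumes the sqlite3 command line tool with SQLite 3.16 or newer (for the table-valued pragma_table_info), and it assumes table and column names come from the script itself, not from user input. Unlike the grep check, it compares the column name exactly, so a column called columnname2 would not be mistaken for columnname.

```shell
# Hypothetical helper: add a column only when the table does not have it yet.
# Arguments: database file, table name, column name, column definition.
add_column_if_missing() {
    db=$1 table=$2 column=$3 definition=$4
    # pragma_table_info() is the table-valued form of PRAGMA table_info.
    exists=$(sqlite3 "$db" "SELECT COUNT(*) FROM pragma_table_info('$table') WHERE name = '$column';")
    if [ "$exists" -eq 0 ]; then
        sqlite3 "$db" "ALTER TABLE $table ADD COLUMN $column $definition;"
    fi
}
```

A call would then look like: add_column_if_missing "$DATAPATH/database.db" tablename columnname "TEXT DEFAULT ''". Running it twice is harmless, because the second call finds the column and does nothing.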

So my question is:
How do I manage such updates?

I have been working on this software for over six years, and one of its strengths is that I always remove features I do not use a lot (including removing configuration sections and database columns).
And I add a new feature when I need it, the way I want to have it. When I need to refactor things, I do it, which leads to adding sections to the configuration and the database.
Clean code has the highest priority. When I have to restructure the database or configuration, I want to do it.
I do not want to lose this flexibility.

Gribouillis · May-11-2018, 11:07 AM

If I had to do this, I would try to do it with the doit Python module. It is a build tool and an automation tool that lets you define flexible collections of tasks to handle a complex workflow. I have used it successfully on such problems, where there is a mix of command lines to run, static data to update, etc. It is easy to add Python code to scan the existing configuration and automatically decide what to build and what not to build.

ralf · May-11-2018, 01:51 PM

doit seems to be a cool piece of software.
But I'd like to solve the problem on a conceptual level.
Forcing a user to install a tool only to install my software is not what I want.

I'm more looking for a strategy to handle updates.

ljmetzger · (This post was last modified: May-11-2018, 03:55 PM by ljmetzger.)

I've always used a mechanical methodology to handle problems like this. My experience with version control systems has not been good, so I tend to do things the old-fashioned way.

First there are two jobs that need to be performed (can be the same person):
a. Project Manager - approves all additions, deletions, modifications
b. Librarian - implements additions, deletions, modifications approved by the Project Manager

I use an enhancement/anomaly reporting method. Each file change (addition, deletions, modification) is associated with a numbered enhancement/anomaly report. More than one file can be associated with one report.

Start with baseline software (start with nothing for first version).

Each software version has:
a. List of new files
b. List of deleted files
c. List of changed files
d. Documentation file containing 'software build instructions'. The 'build procedure' can be automated or can require human intervention.
e. Software 'Acceptance Test Suite' (to verify software was built correctly and operates correctly).
f. Software audit instructions to verify build pieces are not corrupt, and to verify that production software is not corrupt. Audit includes verification of file names, update times, sizes, checksum/crc (cyclic redundancy check) values.
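
The checksum part of the audit in item f could be sketched as follows. This is only an illustration under assumptions: it uses sha256sum from GNU coreutils, runs in a throwaway directory, and the file names install.sh, default.ini and MANIFEST.sha256 are hypothetical. It covers checksums only, not file names, update times or sizes.

```shell
set -eu
# Throwaway directory with two stand-in files for a release (hypothetical names).
BUILD_DIR=$(mktemp -d)
cd "$BUILD_DIR"
printf 'echo hello\n' > install.sh
printf '[server]\nport=8080\n' > default.ini

# Release time: record a checksum for every shipped file.
sha256sum install.sh default.ini > MANIFEST.sha256

# Audit time: sha256sum -c re-reads each listed file and compares checksums.
# It exits non-zero if any file is missing or differs from the manifest.
if sha256sum -c MANIFEST.sha256 >/dev/null 2>&1; then
    AUDIT_RESULT=ok
else
    AUDIT_RESULT=corrupt
fi
echo "audit: $AUDIT_RESULT"
```

The same manifest can be shipped with each version, so the check works both on the build pieces and on an installed copy.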

Workflow:
a. Software change is proposed.
b. Anomaly/Enhancement report indicates broad brush changes. If appropriate a 'diff' file is generated (and attached to the Anomaly/Enhancement report) to document the changes.
c. Project Manager approves the changes after changes are tested.
d. Librarian implements changes into the 'Strawman' new version, after verifying installation procedure and successful passing of 'Acceptance Test Suite'. Verification includes comparing additions, changes, deletions to those requested in the Enhancement/Anomaly reports, running 'diff' on all files (including files allegedly not changed).
e. Creation of a new 'Official Version' as required.
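
The lists of new, deleted and changed files, and the 'diff' over all files in step d, might be produced with a recursive diff between two version trees. The sketch below assumes GNU diff and builds two tiny hypothetical trees, v1 and v2, in a temporary directory purely for demonstration.

```shell
set -eu
# Two hypothetical version trees in a throwaway directory.
WORK=$(mktemp -d)
mkdir "$WORK/v1" "$WORK/v2"
echo "a" > "$WORK/v1/kept.txt";    echo "a" > "$WORK/v2/kept.txt"
echo "b" > "$WORK/v1/changed.txt"; echo "B" > "$WORK/v2/changed.txt"
echo "c" > "$WORK/v1/deleted.txt"
echo "d" > "$WORK/v2/new.txt"

# diff -r walks both trees; -q reports only which files differ, not how.
# diff exits with status 1 when differences exist, hence the "|| true".
# LC_ALL=C keeps the messages in English so they can be filtered with grep.
REPORT=$(cd "$WORK" && LC_ALL=C diff -rq v1 v2 || true)

echo "Deleted files:"; echo "$REPORT" | grep '^Only in v1' || true
echo "New files:";     echo "$REPORT" | grep '^Only in v2' || true
echo "Changed files:"; echo "$REPORT" | grep '^Files ' || true
```

Files that are identical in both trees, such as kept.txt here, do not appear in the report, which is what verifies the "allegedly not changed" files.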

The above procedure is a lot of work, but I used it successfully for many years. It does require a lot of human overhead.

Lewis

ralf · May-11-2018, 04:28 PM

I think you're talking about the developer's side of providing the update.
It's just a git repository the user pulls.
Then the user executes a 600 line installation script :)

Nevertheless, this is roughly the strategy I use right now for the script:
checking the difference between what is installed and what should be available in the new version.

This process is very hard to automate due to dependencies of some files (also depending on what time during the update process an action takes place). So it is hardcoded right now, and becomes very complex considering all possible versions installed.

To give some numbers:
Involved in the updating process are

3 SQLite databases
11 configuration files in different formats (ini, xml, json, csv, sh)

Between some configuration files are dependencies like shared passwords between server and client.

The order of your described workflow may be better than mine:
Back up old files, make a clean install, and try to use the old data to initialize the new installation.
This just has a very high code overhead, since I need to understand the old setup.
Giving each configuration file a version number would simplify it a bit, but I still need to carry the whole history of keys that ever existed in the files. Same with the database schemas.
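
One common way to use such version numbers is a chain of small migration steps, each upgrading from one version to the next, so the script does not have to handle every combination of installed versions. The following is a minimal sketch, not the project's actual script: the version file schema_version, the migrate_to_N functions and the example steps are all hypothetical, and DATAPATH is pointed at a throwaway directory here.

```shell
set -eu
# Throwaway data directory for the demo; a real script would use its own DATAPATH.
DATAPATH=$(mktemp -d)
VERSION_FILE="$DATAPATH/schema_version"   # hypothetical version stamp
TARGET_VERSION=3

# Each function upgrades exactly one step: from version N-1 to version N.
migrate_to_1() { mkdir -p "$DATAPATH/config"; }
migrate_to_2() { printf '[server]\nport=8080\n' > "$DATAPATH/config/app.ini"; }
migrate_to_3() { mv "$DATAPATH/config/app.ini" "$DATAPATH/config/server.ini"; }

run_migrations() {
    current=0
    if [ -f "$VERSION_FILE" ]; then
        current=$(cat "$VERSION_FILE")
    fi
    # Apply the missing steps in order, stamping the version after each one.
    while [ "$current" -lt "$TARGET_VERSION" ]; do
        current=$((current + 1))
        "migrate_to_$current"
        echo "$current" > "$VERSION_FILE"
    done
}

run_migrations
echo "data directory is now at version $(cat "$VERSION_FILE")"
```

Because the stamp is written after every step, an interrupted update resumes at the first step that has not run, and running the script again on an up-to-date installation does nothing. The history of removed keys still has to live somewhere, but here it lives in small, ordered steps instead of one check per key.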

ljmetzger · May-12-2018, 07:06 PM

Quote: This process is very hard to automate due to dependencies of some files (also depending on what time during the update process an action takes place). So it is hardcoded right now, and becomes very complex considering all possible versions installed.

Most things can be automated with UNIX-like tools (or other software tools). Whether automated or not, you probably need a mechanical, repeatable process. Creating such a process and its documentation is hard work and not fun. The more versions exist, and the more they diverge, the harder the job becomes, especially if you want to support older forks.

Lewis
