On 08/26/2015 01:18 PM, Erik Huelsmann wrote:
I reviewed some of this, and the Sqitch PostgreSQL tutorial... it sounds to me like we're not at a level of complexity that would pose this kind of problem. The main things to keep in mind are to always declare dependencies, and that we can use a "union merge" if we don't want to resolve merge conflicts in the plan file manually.
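For the record, that "union merge" can be wired up with a git merge driver so that parallel additions to the plan file combine automatically instead of conflicting -- a sketch, assuming the plan file is named sqitch.plan:

```
# .gitattributes -- tell git to union-merge concurrent additions
# to the Sqitch plan file rather than flagging a conflict
sqitch.plan merge=union
```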
As usual, I'll comment on how it's done elsewhere (Drupal) and my experience with other upgrade systems...
Drupal ends up with a dual approach to schema upgrades: each module provides a base schema that is up to date with the current version, plus an arbitrary series of update scripts, each tagged with a "schema level". The update scripts are skipped for new installs. The "system" table tracks the schema level for each module; on upgrade, Drupal runs every script with a schema level greater than the one currently installed, updating the recorded level as each script completes successfully.
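The mechanism is simple enough to sketch in a few lines (names and level numbers invented; this is not Drupal's actual code, just the shape of it):

```python
# Hypothetical sketch of Drupal-style schema-level tracking: every update
# callback numbered above the module's recorded level runs, in order, and
# the recorded level advances as each one succeeds.

def run_updates(installed_level, updates):
    """Run all update callbacks above installed_level; return the new level."""
    for level in sorted(updates):
        if level > installed_level:
            updates[level]()          # apply this schema change
            installed_level = level   # record progress after each success
    return installed_level

applied = []
updates = {
    7001: lambda: applied.append("add column"),
    7002: lambda: applied.append("backfill data"),
}

# A site already at level 7001 only runs the 7002 update:
new_level = run_updates(7001, updates)
```

Note that if an update fails partway through, the recorded level stays at the last success, which is why there's no real rollback story here.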
There is no rollback in this system, and while you can declare some module updates as dependent on particular schema updates in other modules, doing so is cumbersome and not widely used. As a result, there are often race-condition-type issues on upgrades, where one update can't complete because it depends on an update in another module that has not yet run.
I know there was talk of switching to a hash-based system rather than a simple integer sequence, but I have not yet dug into Drupal 8 to see whether there are any improvements here.
I did consult briefly on a Microsoft project running Drupal on SQL Server in Azure, and they had some interesting tools for managing schema updates that seem comparable to what Sqitch is attempting to do.
I'm not quite clear on all of this. Are you suggesting two different Sqitch steps -- one for the data schema/model, and a different one for the stored functions? Is this a recipe for building a clean database, which would destroy all data on reversion? And thus something you would only run to get a clean environment -- but then you'd have a different set of Sqitch scripts for all the functions that change/evolve within a major release, including non-destructive schema changes? If that's what you mean, then that sounds good -- otherwise I'm lost as to the purpose of this.
I would generally be in favor of splitting up modules into independent versions, but agree that's a lot of work that probably isn't as much a priority now -- but thinking about it now is certainly appropriate. I wonder what lessons we could take from the Budgeting module to define extension points so that adding other optional modules is not so painful...
-1 on using 1.4.999 as the "next 1.4 version". If you're on 1.4.15, developing toward 1.4.16, that interim development at 1.4.999 ends up greater than the release! I would prefer more established semantics, mostly what we've been doing: 1.4.15 -> 1.4.16-dev -> 1.4.16-alpha1 -> 1.4.16-beta1 -> 1.4.16-rc1 -> 1.4.16.
Drupal has taken to adding "+5" to indicate 5 commits past whatever the most recent release is -- e.g. 1.4.16-beta1+5. Don't use 1.4.999, unless that's a dev release of 1.5.0!
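The 1.4.999 problem in a couple of lines (a quick sketch using naive numeric comparison of version parts):

```python
# Sketch: comparing versions as tuples of integers shows why 1.4.999
# outranks any realistic 1.4.x release.
def vtuple(version):
    return tuple(int(part) for part in version.split("."))

assert vtuple("1.4.999") > vtuple("1.4.16")  # the "interim" build beats the release
assert vtuple("1.4.15") < vtuple("1.4.16")
```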
My reading of the tutorial is that each individual step may declare a dependency on some earlier step. The discussion mentioned above is more about handling merges of multiple branches.
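For reference, here's roughly what a per-step dependency looks like -- the `--requires` flag at add time, and the resulting bracketed entry in sqitch.plan (change names, planner, and dates all made up):

```
$ sqitch add users --requires appschema -n 'Add users table.'

# Resulting sqitch.plan entries:
appschema 2015-08-26T18:00:00Z John Doe <john@example.com> # Add application schema.
users [appschema] 2015-08-26T18:05:00Z John Doe <john@example.com> # Add users table.
```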
I do see the value of this -- but it looks like an awful lot of grunt work creating these schema change scripts, if you also have to carefully create matching revert and verify scripts for every change!
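To make the cost concrete, each change means three small scripts along these lines (a sketch modeled on the tutorial's example; table and change names invented):

```sql
-- deploy/users.sql
CREATE TABLE users (
    nickname TEXT PRIMARY KEY,
    fullname TEXT NOT NULL
);

-- revert/users.sql
DROP TABLE users;

-- verify/users.sql: errors out if the table or columns are missing
SELECT nickname, fullname FROM users WHERE FALSE;
```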
Seems like a lot of discipline to pull this off -- certainly something that's worth doing if we aim to deliver a quality package.
I am curious about the tools that can supposedly detect the differences between two database schemas and generate a script to convert one to the other. I think I ran across a couple of modules that claimed to do this for SQL Server. I have no experience with them, or knowledge of their reliability. But I'm lazy ;-)
If we don't have something like that, Sqitch seems like a great tool for the arsenal: quite approachable and understandable, and something we should be able to hook up in Travis easily.
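Hooking it into Travis could be as simple as something like this (a hypothetical .travis.yml excerpt; database name, credentials, and Postgres version are all made up):

```yaml
addons:
  postgresql: "9.4"        # Travis-provided Postgres service
install:
  - cpanm --quiet --notest App::Sqitch DBD::Pg
before_script:
  - psql -U postgres -c 'CREATE DATABASE lsmb_test;'
script:
  - sqitch deploy db:pg://postgres@localhost/lsmb_test
  - sqitch verify db:pg://postgres@localhost/lsmb_test
```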
... and now do we need our own dependency manager? :-)
I'm growing quite fond of Composer in the PHP world, and ran across Carton for managing Perl dependencies when putting together the Docker image. Carton definitely looks helpful for managing the installation of specific versions of CPAN modules to get to a consistent state. I wonder if that could be extended to cover Postgres extensions...
Not sure if you've looked at the Docker builds I put together a couple months ago. They're at https://hub.docker.com/r/ledgersmb/ledgersmb/ .
At least for that installation path (which I think is becoming more popular by the minute), the Sqitch approach is far easier to support, if I understand these options correctly... I can easily install Sqitch inside the container and script whatever we need to update the schemas in the database. However, I explicitly did not build Postgres into those images, assuming you would connect the LSMB container to an external Postgres container (or server). That makes installing an extension in Postgres a much more challenging proposition -- I'm assuming you need access to the Postgres server environment to install an extension? If you can install a Postgres extension over a TCP connection, I suppose we can script that -- but at least from my standpoint, Sqitch sounds much easier to deal with.
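If we go the Sqitch route, the upgrade step from the app container could be as simple as this (a sketch; container name, credentials, and database name are all hypothetical):

```
# Run Sqitch from inside the LedgerSMB container against the linked
# "postgres" container over TCP (all names/credentials made up).
docker exec ledgersmb \
  sqitch deploy db:pg://lsmb:secret@postgres:5432/lsmb
```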
Another potential challenge -- can you install extensions in a cloud-provisioned Postgres server, such as Amazon RDS?
Unless I completely misunderstand how you install extensions, I think they would cause some significant challenges to those who want to install in a cloud or container environment.
It does strike me that there's still a question of which schema changes destroy data. From the Sqitch tutorial, I'm not seeing why we need a build step as described earlier -- couldn't our setup simply run the Sqitch deployment under the hood? Unless that's slow enough that flattening it into a single database dump file we can load quickly is a big enough win... That still leaves the question of how we safely provide tools to roll database schema changes backward and forward on production databases without losing data.
How does Sqitch track what has been deployed and what has not? (Just thinking with my Docker hat on -- is this important metadata that needs to be preserved when LSMB containers get destroyed and recreated?)
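(Partly answering my own question: from the tutorial, Sqitch appears to record deployed changes in registry tables inside the target database itself -- a "sqitch" schema by default on Postgres -- so if that's right, the deploy state travels with the data volume rather than the app container. A status check would then look something like this, with made-up connection details:)

```
# Ask Sqitch what has been deployed to this database; the registry
# lives in the target database, not in the container's filesystem.
sqitch status db:pg://lsmb@postgres/lsmb
```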
Overall I like the looks of Sqitch and would favor that approach...
_______________________________________________ Ledger-smb-devel mailing list ..hidden.. https://lists.sourceforge.net/lists/listinfo/ledger-smb-devel