
Re: Beyond 1.5

Quoted text below from John Locke's mail.

[ .. snip .. ] we've been dancing around old code for a long, long time -- are we ever going to clean it up incrementally?

While I've been hesitant to answer that question in the past (but always believed in it), I can answer with a firm "Yes!" these days. The plan is as follows, as far as I'm concerned:

 * Isolate old code so it's completely clear to everybody which code we want to remain untouched
 * Create Dojo widgets to gradually replace functionality in old code with a well defined web API and a client UI
 * Create web services to support the Dojo widgets (and other remote access, of course)

As proof that this approach can phase out old code: 1.5 already contains functionality which follows exactly this pattern and has allowed us to deprecate a significant chunk of old-code complexity: the parts selector in the invoice, order, quotation, customer price matrix and parts (BOM listing) screens has been replaced by a single Dojo widget.

As a note, however: this does not touch the really hard/interdependent parts.

We're completely on the same page here. I didn't mean to suggest we did. However, by moving more of the UI behaviour to the client, we can reduce the "old code footprint", eating away at the size of the problem that remains: it matters quite a bit whether we have to implement full invoice handling from scratch, including current functionalities, or whether we have already abstracted out selection of customers/vendors, shipping address selection, part selection, document numbering, etc.
Also I think it would be a serious mistake to build an API around assumptions we intend to change, because otherwise we haven't decoupled anything. 

Agreed. We have to be very careful not to rush forward and just "do something"; I'd like to work from requirements through designs to implementations -- as I did with the document state diagram. It's my conviction that regardless of whether we implement from scratch or improve incrementally, this is documentation we need anyway.
I'm thinking we're better off at this point with a rewrite -- or at least for the time being, a solid enough plan in place so that the new stuff we're creating will port straight into a 2.0 when the time comes.

Well, given that we have found a way forward now to get rid of the old code, I'm thinking we should proceed on that route for a while -- it might just be the key we needed.

Right, and in the areas where we can, I think we should stick with that approach for a number of reasons (easier maintenance, less old code, and more).
Maybe it will succeed everywhere and I will be proven wrong.  But having worked with areas where I think we will run into real problems, I don't think it's the only approach we should take.

I can imagine that it's hard to get some parts of it "just right" the first time. I think we should aim for "first time right" here, including taking time to discuss requirements, designs, frameworks, workflows and more. We should also take the time to implement all kinds of tests to make sure it works and keeps working. However, we might occasionally need two iterations to get it right.
It's also the main driver behind my efforts to start talking about the document state diagram and the requirements for a web service framework: it's *the* driver to be able to continue on this route and a major enabler of a lot of other efforts.

That's extremely valuable work regardless of how things go, by the way. I think we need to have all document states discussed and documented in all cases.

Ok. It's my intention to keep going along those lines "to collect the facts" before moving into action. I think we should not hurry to make improvements, yet we should make sure not to get stalled on formalities.
If the core issue keeping us from tackling the financial rewrite and heading for 2.0 is the database schema, I think we should drill into that and make that solid, as soon as we possibly can. [ .. snip .. ]
Agreed. My definition of "as soon as we possibly can" would be: once we have phased out the old code. Seriously, there's a lot of dependency in old code on the current schema, without the sorely needed separation of concerns. Rewriting the database schema now would require editing a lot of old code -- something we really don't want to do. As soon as we have removed our dependency on old code, it'll become a lot easier to rewrite the stored procedures. It'll also become possible to rewrite without breaking it *all*, because we'll have many, many more tests in place to validate consistency and correct operation.

But then we either have an API that is built on assumptions in the current db (in which case we still have a lot of the same coupling issues *and* we have to maintain that)...

When we talked earlier today, you mentioned the need to evaluate new APIs in "a 2.0 context" -- something I'd have called "blue sky context": breaking loose from the limitations we have in our current database model and elsewhere in the software to come up with the ideal solution.
Then, with that ideal design in hand, we can see how much of it can be retrofitted onto the existing code base. Sometimes that will work very nicely; sometimes it won't.
I can definitely see that it may take us years to get there, and we may have several more 1.x releases -- but clearly there's a need here, and the longer we wait to start on it, the longer it will take to get it working...

While in general I would agree, I'm also seeing a major uptake in development activity over the past 12 months. We have put in place a number of enablers to help people join the project in the same period. To me, it seems the strategy is working and while in the past development may have been slow and painful, it's now "just" painful, but no longer slow.

Here's my recommendation: we do both at once. Nip out low-hanging areas in the old code. Start a branch for 2.0.

With our team still as small as it is, I'm hesitant to start a new branch which may not be merged for a long time: it will very likely spread development effort - already thinly divided - even more thinly. Using a notional 2.0 context to evaluate ideas, to create the ideal database model, etc., however, will most probably help improve the software we develop even in the short term.
There are a couple major reasons I think we should do this:

1.  I think we need to accept that 2.0 will be a db *import* rather than a db *upgrade* process.  Given the depth of the changes I think we will need to make, customizations are not going to just work no matter how we do it.  We might as well plan on copy-and-import rather than migrating in place.

Ok. Looking at our history though: we had "import type" upgrades for 1.3 and 1.4, and we might well have one in 1.6, 1.8 or 1.10. Not saying that we will, but not excluding it either.
There's an inherent risk in doing the copy-and-import dance though: if the migration script isn't well tested, you run the risk of losing data during the import. If it's done in place, the data won't be lost; the migration will simply fail, leading to early detection of the problem.
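To make that risk/benefit trade-off concrete, here's a minimal sketch of a copy-and-import step that validates itself before committing. This is purely illustrative: the table and column names are invented and are not the actual LedgerSMB schema, and SQLite stands in for PostgreSQL.

```python
import sqlite3

# Hypothetical sketch: a copy-and-import migration that validates row
# counts and summed amounts before committing, so a faulty migration
# fails loudly instead of silently losing data. Names are invented.
src = sqlite3.connect(":memory:")
src.executescript("""
    CREATE TABLE acc_trans (trans_id INTEGER, amount NUMERIC);
    INSERT INTO acc_trans VALUES (1, 100.00), (1, -100.00), (2, 42.50);
""")

dst = sqlite3.connect(":memory:")
dst.execute("CREATE TABLE acc_trans (trans_id INTEGER, amount NUMERIC)")

rows = src.execute("SELECT trans_id, amount FROM acc_trans").fetchall()
with dst:  # one transaction: either everything imports, or nothing does
    dst.executemany("INSERT INTO acc_trans VALUES (?, ?)", rows)

    # Validation: compare row counts and totals between source and
    # destination; any mismatch raises, rolling the import back.
    for query in ("SELECT COUNT(*) FROM acc_trans",
                  "SELECT COALESCE(SUM(amount), 0) FROM acc_trans"):
        if src.execute(query).fetchone() != dst.execute(query).fetchone():
            raise RuntimeError("import validation failed: " + query)

print(dst.execute("SELECT COUNT(*) FROM acc_trans").fetchone()[0])  # 3
```

The point of the sketch is the transaction boundary: a badly tested script still fails early and loudly, rather than committing a partial copy.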
2.  I think we need to understand that this *will* break customizations in big ways and I don't think we want to guarantee any backwards compatibility.

Most likely, yes. At the moment, do we have any API guarantees in place other than "it should keep working within the same minor (1.x) series"? Either way, we probably need to be explicit about it.
3.  We *really* do not want to be stuck trying to maintain 1.x compatibility in an API.

Nope. I think that's all about managing expectations, though it has two sides: we also shouldn't break compatibility just for the fun of it.
When we have more testing in place (BDD or otherwise), development won't even be painful anymore.
 So... what's wrong with the financial schema?

A *very* relevant question which I think should be answered in the short term so we can make sure that all changes move in the right direction. I, however, don't have the answer.

@Chris, do you have mails, notes, <whatever document> which list all or part of the problem(s)?

(Ok, in all fairness, I know about:
- Journals should not run across dates
- payments should be journals, not lines on an AP item
- journals should be referenced from 'transactions' or 'gl', not from ar/ap)
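To make the "payments should be journals" point concrete, here's a hypothetical sketch of such a data model. All table and column names are invented for illustration; this is not the actual LedgerSMB schema, and SQLite stands in for PostgreSQL.

```python
import sqlite3

# Hypothetical sketch: a payment as a first-class, balanced journal
# entry linked to the invoice it settles, rather than an extra line on
# the AP item itself. All names invented; not the real schema.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE journal (id INTEGER PRIMARY KEY, type TEXT, post_date DATE);
    CREATE TABLE journal_line (
        journal_id INTEGER REFERENCES journal(id),
        account TEXT, amount NUMERIC);
    -- link table: which payment journal settles which invoice journal
    CREATE TABLE payment_link (
        payment_id INTEGER REFERENCES journal(id),
        invoice_id INTEGER REFERENCES journal(id),
        amount NUMERIC);

    -- an AP invoice: expense vs. accounts payable
    INSERT INTO journal VALUES (1, 'ap-invoice', '2016-01-10');
    INSERT INTO journal_line VALUES (1, 'expense', 250.00), (1, 'ap', -250.00);

    -- the payment is a *separate* journal: accounts payable vs. cash
    INSERT INTO journal VALUES (2, 'payment', '2016-01-31');
    INSERT INTO journal_line VALUES (2, 'ap', 250.00), (2, 'cash', -250.00);
    INSERT INTO payment_link VALUES (2, 1, 250.00);
""")

# every journal balances to zero on its own
for jid, total in db.execute(
        "SELECT journal_id, SUM(amount) FROM journal_line GROUP BY journal_id"):
    assert total == 0
```

With this shape, payment management, banking reconciliation and an invoice API all work against one uniform journal concept instead of special-cased AP lines.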

Well, no specific documentation or notes. However, the problems have been spread across a large number of emails over the years. Some we have been able to fix, but a lot we can't.

Ok. Some of the problems we can't fix are also unfixable because old code actually abuses the fact that the referential integrity constraints aren't in place. Phasing out more old code automatically increases the likelihood of being able to add data constraints in the database model.
Here are the top issues in order of the damage they cause.

1.  Payments are not first class.  This causes problems in payment management, invoice management, banking reconciliation, and much more.  It also makes it very hard to come up with a solid way of managing invoices over an API.

True, but not entirely. We discussed this about a year ago, I think: there's a lot of code in place which works around the way "old code" processes payments. However, if we make it a prerequisite for a migration from SL that all reconciliations have been completed... can't we then safely assume that LedgerSMB itself has been around long enough that all data on outstanding reconciliations has been generated by *a* LedgerSMB version (or a migration)? In other words, if we were to remove those workarounds now, those who have been on LedgerSMB for a while should not suffer, nor should those who have recently started new administrations.
2.  The AR/AP/GL table split causes unnecessary duplication and a lot of headaches in reporting.  It also means that when we fix this, we have to rewrite *all* the report queries.

True, although the actual rewrite probably won't be all that complex: the solution is to stop running UNION ALL queries and instead run queries on GL and/or (LEFT) JOIN queries against AR/AP.
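As an illustration of that rewrite, here's a small sketch of the before and after query shapes. The mini-tables are invented for illustration, not the actual LedgerSMB schema, and SQLite stands in for PostgreSQL.

```python
import sqlite3

# Sketch of the report-query rewrite: today, reports UNION ALL the
# parallel ar/ap/gl tables; with a single journal table, the same
# report is one plain query. Invented schema, for illustration only.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE ar (id INTEGER, amount NUMERIC);
    CREATE TABLE ap (id INTEGER, amount NUMERIC);
    CREATE TABLE gl (id INTEGER, amount NUMERIC);
    INSERT INTO ar VALUES (1, 100);
    INSERT INTO ap VALUES (2, -40);
    INSERT INTO gl VALUES (3, 15);
""")

# current style: every report stitches the three tables together
union_total = db.execute("""
    SELECT SUM(amount) FROM (
        SELECT amount FROM ar
        UNION ALL SELECT amount FROM ap
        UNION ALL SELECT amount FROM gl)
""").fetchone()[0]

# target style: one journal table carries all transactions; ar/ap
# become detail tables LEFT JOINed in only when their columns are needed
db.executescript("""
    CREATE TABLE journal (id INTEGER, origin TEXT, amount NUMERIC);
    INSERT INTO journal
        SELECT id, 'ar', amount FROM ar
        UNION ALL SELECT id, 'ap', amount FROM ap
        UNION ALL SELECT id, 'gl', amount FROM gl;
""")
joined_total = db.execute("SELECT SUM(amount) FROM journal").fetchone()[0]

assert union_total == joined_total == 75
```

Mechanical as it is, the rewrite still has to touch every report query, which is Chris's point about the cost.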
3.  The way things are split up gives us no real hard enforcement of referential integrity.  We have partial enforcement but not what I think we need.

Can you be more specific on this one? It's definitely one of the things I'd like to add to the document collecting notes about the existing database schema.
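For reference, "hard enforcement" is the kind of thing the database can do for us once the split is gone. A minimal sketch, again with invented table names and SQLite standing in for PostgreSQL:

```python
import sqlite3

# Minimal sketch of hard referential integrity: with a foreign key in
# place, the database itself rejects a journal line that points at a
# non-existent journal entry. Table names invented for illustration.
db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
db.executescript("""
    CREATE TABLE journal (id INTEGER PRIMARY KEY);
    CREATE TABLE journal_line (
        journal_id INTEGER NOT NULL REFERENCES journal(id),
        amount NUMERIC);
    INSERT INTO journal VALUES (1);
""")

db.execute("INSERT INTO journal_line VALUES (1, 10.0)")       # fine
try:
    db.execute("INSERT INTO journal_line VALUES (99, 10.0)")  # no journal 99
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

Partial enforcement means exactly the second insert *not* failing: the bad row goes in, and application code has to notice later, if it ever does.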

BTW, I created https://github.com/ledgersmb/LedgerSMB/wiki/Database-schema to record everything we need to remember in a single place. Please, when you remember or find something that you think should be different, list it there, in as much detail as you can.




http://efficito.com -- Hosted accounting and ERP.
Robust and Flexible. No vendor lock-in.