On 6/21/07, Stroller <..hidden..> wrote:
On 20 Jun 2007, at 22:57, Ed W wrote:
>
> I think we are all agreed that snapshotting the invoice is a
> requirement. But:
>
> - let's assume that most invoices contain the same address time and
> time again,
> - hence we start thinking about normalising that address record
> rather than duplicating it endlessly
This makes perfect sense when you're a database programmer, but not
if you're an accountant.
Agreed, and the critical issue is: "Is this to be a Quality system?" See
http://home.iprimus.com.au/davidtangye/it/index.html
- the Quality link for the cornerstones of Quality. Central to this is the customer. "Who is the customer here?" If this is an accounting system, it's essentially the accountant. If this is an ERP system, it's the accountant plus the operations folk. In neither case is the DBA the customer; he's the servant.
I see what you're trying to do, and if you're a database programmer
or administrator you might think I'm crazy for suggesting this but
IMO it's better to have duplicate data.
It's not crazy at all. It's correct analysis. And it's not necessarily a case of duplicate data. It's a snapshot of an event whose information is often represented by the same data values each time. That is not duplicate data.
When you normalise you make
things more complicated and introduce the risk of a record becoming
changed (linked to a different entity) after it's been posted.
Yes.
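To make the difference concrete, here is a minimal sketch in Python. It is not Ledger's actual schema; the names Address, Customer, PostedInvoice and post_invoice are made up for illustration. It assumes the posted invoice carries its own copy of the address values rather than a link to the customer's current address record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Plain address values (hypothetical fields)."""
    street: str
    city: str


@dataclass
class Customer:
    """Mutable master record: the address may change over time."""
    customer_id: int
    address: Address


@dataclass(frozen=True)
class PostedInvoice:
    """Record of a posted invoice; frozen so it cannot be edited afterwards."""
    invoice_no: int
    customer_id: int
    billing_address: Address  # snapshot of the values at posting time
    total_cents: int


def post_invoice(invoice_no: int, customer: Customer, total_cents: int) -> PostedInvoice:
    """Capture the customer's address as it is right now."""
    return PostedInvoice(
        invoice_no=invoice_no,
        customer_id=customer.customer_id,
        billing_address=customer.address,
        total_cents=total_cents,
    )


if __name__ == "__main__":
    customer = Customer(1, Address("1 Old Road", "Oldtown"))
    invoice = post_invoice(100, customer, 2500)

    # The customer moves after the invoice was posted.
    customer.address = Address("2 New Street", "Newtown")

    # The invoice still shows the address it was posted with.
    print(invoice.billing_address)
```

Because Address is immutable here, a later change to the customer replaces the customer's address object and leaves the invoice's copy alone. With a normalised link to a mutable address row, the same change would silently rewrite history.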
I appreciate that a good programmer will endeavour to ensure that this
never happens,
Programmers should be coding to specifications from systems analysts/modellers, who have already built those specifications on a model that reflects the real world and its needs with respect to this system. If this is all done by one person, he must ensure he is fully competent in all these disciplines, plus systems architecture, and be a toolsmith (to use terminology largely from the Rational Unified Process). The worst case is an [x] programmer skilled in [y] RDBMS programming who sees the entire process (logical modelling, physical design, programming and so on) through just that [x] and [y] perspective. That is the 'hammer mentality'.
but nevertheless from a certain perspective duplicate
data in posted invoices _is_ "correct" - the address belongs
_separately_ to separate records.
Yes
Let's consider a "normalised" paper trail. We print out an invoice to
send to the customer - it has their address on it, the details of
items & services sold and a total amount owing. The logical thing to
do is not to keep a copy of the whole invoice but to simply record
customer number, item numbers & amount in a ledger. But the tax man
does not allow us to do this - he requires us to keep a whole copy of
the invoice, wasting more paper and consuming much more space in our
filing cabinets. Whether we produce it with carbon paper, a
photocopier or by printing it out twice on our laser printer, our
copy of the invoice has to be exactly the same as the one we sent out
(not just the relational data required to produce an exact copy).
Exactly.
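As a rough illustration of "exactly the same as the one we sent out", here is a small Python sketch that files the rendered invoice text itself alongside a SHA-256 digest, so a later check can show the filed copy is unchanged. The function names and the in-memory dict standing in for the filing cabinet are assumptions for the example, not anything Ledger does today.

```python
import hashlib


def archive_invoice(archive: dict, invoice_no: int, rendered_text: str) -> str:
    """File the exact text that was sent out, plus a digest of it."""
    if invoice_no in archive:
        # A posted invoice is filed once and never overwritten.
        raise ValueError(f"invoice {invoice_no} is already archived")
    digest = hashlib.sha256(rendered_text.encode("utf-8")).hexdigest()
    archive[invoice_no] = (rendered_text, digest)
    return digest


def verify_invoice(archive: dict, invoice_no: int) -> bool:
    """Return True if the filed copy still matches its recorded digest."""
    rendered_text, digest = archive[invoice_no]
    return hashlib.sha256(rendered_text.encode("utf-8")).hexdigest() == digest


if __name__ == "__main__":
    cabinet = {}
    archive_invoice(cabinet, 100, "Invoice 100\n1 Old Road, Oldtown\nTotal: 25.00\n")
    print(verify_invoice(cabinet, 100))
```

The point is that what gets kept is the document as issued, not a recipe for regenerating it from whatever the related records happen to say later.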
Likewise Ledger should keep a whole copy of the invoice, and not
take short cuts (however efficient they may seem).
Yes. It needs a system of the kind that got termed a 'data warehouse' a while back, where historical records are kept as is and specialist mechanisms/software are employed to access and crunch them. The historical records might or might not be in relational structures.