Re: Proposal for file attachment API
- Subject: Re: Proposal for file attachment API
- From: Luke <..hidden..>
- Date: Mon, 4 Jul 2011 02:28:55 -0400 (EDT)
On Sat, 2 Jul 2011, Chris Travers wrote:
On Sat, Jul 2, 2011 at 1:16 PM, Luke <..hidden..> wrote:
Probably though, as I think about it, this would require globally unique
filenames, and a name comparison with new uploads, possibly followed by a
content comparison if names match.
I'm not sure globally unique filenames are such a bad idea anyway.
There's a fairly nasty case here that you can run into. If globally
unique file names are required, then how do you know in advance what
sort of names are used? Do we want to expect the users of the system
to all come up with naming conventions that avoid collisions?
I was expecting that, yes. However, I shouldn't. My recent experience is
with reasonably disciplined corporate users, who either get files from
sources whose names are likely to be unique (some form of the vendor name
and vendor's ID), create files for customers/vendors with names of the same
type, or are good at storing files with rather long, descriptive, and
accidentally unique names.
However, if we combine our two ways of looking at this, I think we have
If you store files by ID, and internally reference them by ID at all
times, they can all be called "foobar.pdf" and it doesn't matter.
When a new file is uploaded, compare its CRC/checksum to the index of
stored files. If there's no match, it's a new file, gets a new ID, and we
save it. If the checksum matches, compare the contents against the N stored
files that share that checksum. If one of them is identical, create a link
to the existing copy by inserting a referential entry in whatever table is
tracking where files are attached. If none of them is identical, it's a new
file.
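
A rough sketch of that upload path, in Python, just to make the steps
concrete. Everything here is made up for illustration (the FileStore class,
its attach() method, the in-memory dicts standing in for tables); it is not
a proposal for the actual schema or API. It uses zlib.crc32 as the checksum
and falls back to a byte-for-byte comparison when checksums collide.

```python
import zlib
from itertools import count


class FileStore:
    """Hypothetical in-memory stand-in for the file and link tables."""

    def __init__(self):
        self._next_id = count(1)
        self.contents = {}        # file ID -> stored bytes
        self.names = {}           # file ID -> name given at first upload
        self.by_checksum = {}     # checksum -> list of file IDs
        self.attachments = set()  # (document ID, file ID) link entries

    def attach(self, document_id, name, data):
        """Attach data to a document, reusing a stored copy if one exists."""
        checksum = zlib.crc32(data)
        # A checksum match is only a hint, so compare the actual contents.
        for file_id in self.by_checksum.get(checksum, []):
            if self.contents[file_id] == data:
                break  # identical file already stored: link to it
        else:
            # No candidate, or none identical: store as a new file.
            file_id = next(self._next_id)
            self.contents[file_id] = data
            self.names[file_id] = name
            self.by_checksum.setdefault(checksum, []).append(file_id)
        self.attachments.add((document_id, file_id))
        return file_id


store = FileStore()
first = store.attach("invoice-1", "foobar.pdf", b"hello")
second = store.attach("invoice-2", "foobar.pdf", b"world")
third = store.attach("order-7", "copy.pdf", b"hello")
```

Here the two files named "foobar.pdf" get different IDs because their
contents differ, while "copy.pdf" ends up linked to the first stored file
because the contents are identical. The name never takes part in the
comparison.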
I'm pushing this because I think it's more extensible, and it also leads
directly to what Erik wanted.
If you divorce the storage of files, and the way they are tracked, from
the documents to which they are attached, you get a true virtual
Any document can point to any file(s), and any file can be pointed to by
Associations can be re-mapped after file storage (this assumes a file
management UI at some point), which is necessary for Erik's suggestion.
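
To illustrate the separation, here is a minimal sketch using Python's
sqlite3 module. The table and column names (file, file_link, doc_type,
doc_id) are assumptions for the example, not anything that exists in the
codebase. Files live in one table, associations in another, so a link can
be re-pointed without touching the stored file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: files are stored once, by ID.
    CREATE TABLE file (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        content BLOB NOT NULL
    );
    -- Associations are tracked separately from storage.
    CREATE TABLE file_link (
        doc_type TEXT    NOT NULL,
        doc_id   INTEGER NOT NULL,
        file_id  INTEGER NOT NULL REFERENCES file(id),
        PRIMARY KEY (doc_type, doc_id, file_id)
    );
""")

# One stored file, attached to two different documents.
conn.execute(
    "INSERT INTO file (id, name, content) VALUES (1, 'foobar.pdf', ?)",
    (b"data",),
)
conn.executemany(
    "INSERT INTO file_link (doc_type, doc_id, file_id) VALUES (?, ?, ?)",
    [("invoice", 10, 1), ("order", 20, 1)],
)

# Re-map one association after storage: move the link from order 20
# to order 21. The file row itself is not modified.
conn.execute(
    "UPDATE file_link SET doc_id = 21 "
    "WHERE doc_type = 'order' AND doc_id = 20 AND file_id = 1"
)

links = conn.execute(
    "SELECT doc_type, doc_id FROM file_link "
    "WHERE file_id = 1 ORDER BY doc_type, doc_id"
).fetchall()
file_count = conn.execute("SELECT COUNT(*) FROM file").fetchone()[0]
```

After the update, the same file is linked to invoice 10 and order 21, and
there is still only one row in the file table.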