Re: Web Services API: URL naming proposal

Subject: Re: Web Services API: URL naming proposal
From: John Locke <..hidden..>
Date: Mon, 21 Nov 2011 08:42:45 -0800

Hi,

On 11/21/2011 03:03 AM, Chris Travers wrote:
> First some general notes.
>
> One of the real headaches in web app programming is state handling.
> With a thick client, what you do is you open a connection, perform
> your operations, commit your changes, etc. and then close the
> connection when you log off.  With a web application, a single atomic
> unit of work may span several network connections. As a result a lot
> of things may have to be tracked in this way, and in some of our
> workloads, a large majority of database processing time is actually
> spent managing the application state. A lot of this time could be cut
> out if we didn't have to make sure that this information persisted
> when the database connection was closed and a new one opened.

Hmm. I'm not thinking we need a huge amount of state here -- mostly we
are assembling resources that will be posted into the system, or doing
queries.

> I will give you an example.  One of my customers pays a large number
> of invoices per week.  I would say probably as many as 5000 invoices
> may be paid in a single batch payment workflow.  I have good
> information on the profiling of the web app as to where time is spent.
>  The basic selection takes only a couple of seconds (2-3), but once we
> add in the necessity to track that someone has selected these, we end
> up with about 45-50 sec of database time in the actual selection for
> payment.  The XHTML document generation in some cases takes another
> couple of minutes.  We can't use cursors to page through results
> because the cursors can't survive the connection teardown process.
> Consequently there is very limited performance tuning possible at
> present.

Ah, yes, I wasn't thinking through the transaction handling over
multiple requests.

If we skip doing transactions for more than single request actions (like
using the shipping field to partially ship an order and generate an
invoice), then I don't think we get into that much trouble.

Drupal does provide a batch API, which can be used to iterate through
large result sets while performing some action. For the user interface,
it puts up a Javascript widget that makes repeated calls, each of which
processes a configurable number of items (generally 50 or so) and then
updates its progress bar. For calls from script, you typically process
the first batch and then leave the rest of the processing to a cron job
that processes a batch at a time. One approach to bulk operations...
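
A rough sketch of that batching pattern, in Python for brevity. This is not Drupal's batch API, just an illustration of the idea: split a large result set into fixed-size chunks, process one chunk per call, and report progress after each. The names `batches` and `process_in_batches` are made up for this example, and the default of 50 only mirrors the typical figure mentioned above.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(items: Iterable[T], size: int = 50) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Final, possibly short, batch.
        yield batch


def process_in_batches(
    items: Iterable[T], handle: Callable[[T], None], size: int = 50
) -> List[int]:
    """Apply `handle` to every item, one batch at a time.

    Returns the running count of processed items after each batch,
    which is what a progress bar or a cron job would record.
    """
    progress: List[int] = []
    done = 0
    for batch in batches(items, size):
        for item in batch:
            handle(item)
        done += len(batch)
        progress.append(done)
    return progress
```

In a real deployment each batch would be its own HTTP request or cron run, with the position stored between runs; here the loop stands in for those repeated calls.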
>
> Now, if this wasn't over HTTP, we could make the application far more
> responsive, which would lead to better productivity, but as it is, we are
> fairly limited.  This is one reason why I see the database-level API
> as such a big deal.  It allows one to tune performance by removing the
> limitations of HTTP in this regard.
>
> If we can do it, I would much prefer to confine these problems to the
> web application than I would to export them to every other application
> that might interface to LedgerSMB.   I would therefore suggest that
> running LedgerSMB over HTTP is a bit kludgy, and that long-term I
> would like to see the web app be secondary to a desktop app directly
> connecting to the database in most environments.

Have you looked at node.js?
<snip>
>
>
> One thing to keep in mind is that the db-level interface has taken a
> lot of inspiration from things I think SOAP and friends do well
> (discoverability, etc).  Additionally you can do things which are
> (IMNSHO) insane to try to do over HTTP, like control over
> database transactions.

Discoverability is the one aspect of SOAP I like...
>> That is what I hate most about SOAP -- having to do multiple calls and
>> manage state. But to a certain extent, it seems unavoidable.
> Ever looked at SOAP over XMPP?  For that matter although it would no
> longer be RESTful, I don't see any reason why RESTful approaches
> couldn't get encapsulated in XML stanzas and sent over XMPP.  XMPP
> could then handle state and you'd no longer have to worry about it.

Not specifically SOAP over XMPP, but I was looking pretty hard at XMPP
as a transport for web applications a while back. And then node.js
appeared...

I would say Node is to XMPP what REST is to SOAP -- a much simpler,
friendlier way to handle long-running connections. It is basically a
Javascript server that leverages Javascript's event-driven,
callback-oriented patterns to handle large numbers of simultaneous
long-running connections.

> Also, if you are willing to pass HTTP through your firewall to your
> accounting server, I'm not sure XMPP would be out.
>> It's probably not a big deal to make a remote application pass the
>> company in the URL (instead of in a login session), but slightly easier
>> I would think to omit (in client implementation), to simply pass in with
>> authentication.
>>
>> In many ways, the web application front end is a model for other
>> applications that might call the web service -- ideally everything in
>> the web application should be reflected in the web service.
> Given the headaches dealing with application state we already have to
> worry about (including providing an interface for clearing
> discretionary locks when someone gets called away when running a
> selection for payment), I would highly recommend utilizing some
> discretion in what we push out to web services.  I would instead focus
> on making more performance-friendly ways of closing that gap, even
> when that means ruling out using HTTP as a transfer protocol.
>
> Fortunately the areas where this is most likely to be an issue are
> also the areas where the database API is likely to be available.
>
> That doesn't mean, however, that there can't be a common framework
> that could be run stateless over HTTP and statefully over something
> like XMPP where there are questionable cases.

+1. I think this is a good approach -- start with RESTful services for
resource access, creation/deletion, etc.

And for batch operations, we could add a long-running connection like
Node or XMPP, or access to the database API -- given what I currently
see, Node is the hot one right now, and what I would be most interested
in working with.


>> In my experience, most web services still do make use of session
>> handling, it's not at all an uncommon approach.
> I guess I am questioning the need for it.  We don't really have forms
> to submit, and requiring server-side tokens is going to mean more API
> calls rather than fewer (i.e. you'd have to get a token for each
> resource you intend to post).  This means additional latency.

I don't think we need per-request tokens -- at least not unpredictable
ones. I think a counter or something to handle replays might be
worthwhile (mainly for unreliable connections/resent traffic).
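
To make the counter idea concrete, here is a minimal sketch in Python. It assumes each client sends a strictly increasing integer with every request and the server remembers the highest value seen per client; the `ReplayGuard` class and the `client_id` parameter are invented for this example and are not part of any existing API.

```python
from typing import Dict


class ReplayGuard:
    """Reject requests whose counter is not higher than the last one seen."""

    def __init__(self) -> None:
        # Highest counter accepted so far, per client (hypothetical client ids).
        self._last_seen: Dict[str, int] = {}

    def accept(self, client_id: str, counter: int) -> bool:
        """Return True and record the counter if the request is new."""
        last = self._last_seen.get(client_id, -1)
        if counter <= last:
            # Same or older counter: a resend or a replay.
            return False
        self._last_seen[client_id] = counter
        return True
```

One trade-off worth noting: a strict "must be higher" rule also rejects requests that merely arrive out of order, so a client issuing parallel requests would need a window of recently seen values instead.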

I suppose we could try skipping sessions altogether -- it's just that in
my experience, something has always come up that necessitated using
sessions in the API and it's been trivial to support since the
underlying web application has it already.
>
>>> 3)  Does the added complexity make sense with general use cases?  I am
>>> assuming we are primarily interested in a web services API for
>>> server->server integration since a piece of software used primarily by
>>> the end user would be more likely to just call the db-level API (which
>>> would provide greater control over db transactions, and the like than
>>> one would get from a web services interface)?
>> Well, server-to-server is certainly the first step. And easiest to adapt
>> to just about any interface we develop. But today we're doing most web
>> services for iOS or Android apps. Think about the POS or an inventory
>> module being available as an app for an Android phone.
> I think we'd need more details to see what the relevant costs and
> benefits of using a web service in such an app would be.
>
> The questions in my mind become:
>
> 1)  Is this an environment where the db-level API is appropriate and
> likely to be available?
>
> 2)  If not, is this an environment where the document/resource
> metaphor of HTTP makes sense and where the systems can be loosely
> coupled? If so, web services are a good choice.
>
> 3)  If not, then are there other approaches to encapsulating one of
> the above API's in another protocol that does make sense?

Part of my thinking is ease of implementation. I'd rather see something
workable very soon, than something perfect but not for years. Providing
a relatively simple wrapper for existing functionality seems like the
shortest path to getting something in place.

I do think HTTP has the most widespread support. Is there a Postgres
driver for iOS? And for me, there's a comfort issue here -- I am not
that comfortable allowing the Internet direct access to Postgres --
perhaps it's secure enough, but I'm not as experienced at securing it
as I am with Apache, PHP, Perl, or Javascript.
>
>> The recent thread by a Google engineer praising Amazon for making
>> everything an API applies here. If you haven't read it:
>> https://plus.google.com/112678702228711889851/posts/eVeouesvaVX
>>
> I agree that everything should be an API.  I am just less convinced
> that everything in LedgerSMB should be an API over HTTP.

I would think the web application ideally should get ported to use the
web services as its API, basically as the first client. If we abstract
all the data processing out of the web client and into a web service,
then any other application can do everything the web client can do.

Nothing prevents us from adding more web services over other transports
with additional functionality.
>>>>>> 2. Since companies are separate databases, where do we put the name of
>>>>>> the company in the URL? <prefix>/store.pl/<company>/<etc...>?
>>>>>>
>>>>> What do you think of the above proposal?
>>>> I suggest we include the company in the body of the login, and then it's
>>>> represented in the session cookie. If an external application needs to
>>>> work with multiple companies, it can manage multiple session ids.
>>> This is contrary to REST ideals, correct?  Not that departures from
>>> that are necessarily out, but I would prefer to see some justification
>>> for the added complexity of requiring state handling and cookies.
>> Well, yes, it is contrary to REST ideals -- but there's definitely room
>> in REST for actions as well as resources. And I was thinking while
>> writing this up about what might be an effective way of supporting
>> transactions -- complete with begin transaction, commit, and rollback posts.
> I don't see any sane way of handling database transaction controls
> over HTTP.  I think any attempt to do so would significantly reduce
> the robustness of the controls on the server for anyone accessing the
> database.
>
> However if the same API can be encapsulated over XMPP, then the
> problem goes away entirely and now you can be sure of the state enough
> to expose transaction controls safely.

Yes, I'd just suggest looking at Node.js/Socket.io as an alternative to
XMPP...
>
>> I'm not entirely opposed to putting the company in the URL -- it's
>> certainly a viable approach. However, given the complex structure of
>> entity/eca/customer objects alone, having the ability to wrap that in a
>> transaction might be desirable...
> I don't see how you can possibly have a database transaction safely
> exist across multiple HTTP requests, hence my suggestion to explore
> XMPP for these areas.

Agreed, wasn't thinking this through.

<snip>
> > Yes, I suggest doing both -- returning the whole object as rewritten by
> > the server, as well as adding a header to the final URL.
>
> I like the idea of returning the whole object.
>
> I am thinking through the security, and I think we'll have to have
> some way of authenticating clients as well as users.  I don't see
> another way around xsrf issues that doesn't break a web services
> model.  Amazon Web Services does this with a preshared key approach.
> I would personally prefer client-side certificates with a configurable
> CN root, and ban use of any of these same certs in the web app.

That's essentially what OAuth is about. It does work at a client
application level, not a client instance level. I don't think it uses
certificates/PKI -- I think it's still a pre-shared key approach -- but
I think that's better in this case, forces the admin to specifically add
approved apps.

By leveraging OAuth, client apps can use existing libraries and
practices and not have to learn something specific to us, or deploy a PKI...
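
For illustration only, here is what a pre-shared-key check can look like in Python. This is a generic HMAC sketch, not OAuth's actual signature scheme, and a real client should use an existing OAuth library instead. The function names and the choice of what goes into the signed message (method, path, body) are assumptions made for this example.

```python
import hashlib
import hmac


def sign_request(secret: bytes, method: str, path: str, body: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over method, path and body."""
    message = b"\n".join([method.upper().encode(), path.encode(), body])
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(
    secret: bytes, method: str, path: str, body: bytes, signature: str
) -> bool:
    """Recompute the signature server-side and compare in constant time."""
    expected = sign_request(secret, method, path, body)
    return hmac.compare_digest(expected, signature)
```

The admin registers each approved application with its own secret, so revoking one app does not affect the others.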

Cheers,
John Locke
http://www.freelock.com