New directions in web architecture. Again.
In 2005, Jesse James Garrett at Adaptive Path published the seminal blog post "Ajax: A New Approach to Web Applications" and ushered in a new age of web architecture. Ajax meant using the possibilities latent in JavaScript (specifically, the XMLHttpRequest object) so that a web page could contact the server asynchronously and request new data.

This was revolutionary; within months, we were seeing pages that were more dynamic and interactive. Ajax short-circuited the submit/response loop that dominated web applications up to that time. Instead of making an HTTP request, receiving an entire web page, and rendering that page as a replacement for the current page, the browser requested a chunk of data. It used that chunk of data to interact with the DOM and rewrite the page it was displaying on the fly.

Around the same time, the RESTful paradigm started taking hold. REST represented a much simpler, web-oriented way for servers to interact with their clients. As Roy Fielding pointed out in his dissertation, the basic operations of the HTTP protocol were capable of providing general access to data; stateful applications could be built upon stateless protocols; hypermedia could be used to maintain application state. Although Fielding's dissertation dates back to 2000, it took a few years of bad experience with SOAP and its heirs for developers to realize its importance. With REST, it becomes easier for a website to see itself as a source of data for machines to process, rather than as a source of content for humans to read. Websites become data servers.

Important as Ajax and REST have been to the history of the web, each only represents half of a larger revolution. And in the past few months, we've seen some new sites that have taken the revolution to its logical conclusion. Specifically: take a look at the new Twitter. It's a nice web application, sure -- but look at the HTML. There's not much there. The HTML page you get from Twitter is largely a bunch of empty divs, with a big wad of JavaScript. What's happening? The JavaScript is the entire application; the divs exist only to provide tags so the JavaScript can rewrite the DOM at will. In turn, the JavaScript is constantly (and asynchronously) making requests from the Twitter site, which is just returning data from its API. In fact, the Twitter site is returning the same data for its web page that it would return for its mobile app, for TweetDeck, or for any of the apps in the Twitter ecosystem.

This design isn't particularly new; we've seen it ever since developers started reverse-engineering Gmail and Google Maps to get ideas for their own projects. Those big Google apps may have been the first examples of this architectural trend. They were certainly among the first to use JavaScript as a full-fledged client programming language. But we're seeing many more sites built along these lines. Why now, what does this shift mean, and why is it important?

"Why now" is perhaps the easiest question to answer. A few short years ago, web developers had only one platform to support, and that was the "browser." Granted, there were a dozen or so browsers of significance, and the browser world was riddled with incompatibilities. We're in a different world now. Browser incompatibilities have been ironed out, to some extent (though conscientious developers still support "legacy" browsers, all the way back to IE6 or even IE5). But it's no news that the most important new apps these days run on devices ranging from phones (iPhone, Android, BlackBerry, Windows Phone) to tablets (iPad, Android/ChromeOS devices), and potentially to ebook readers and other new devices. With these new devices on the table, browser incompatibilities pale in significance. It's another sign of the times that I can't conceive of an interesting application that doesn't access data across the network. A static application that never accesses remote data -- that's so 1990s.

So, while it's tempting to say that the new age is characterized by the browser as platform, and that applications running in the browser can do anything that native code can do, that's looking in the wrong place. HTML5 certainly ups the ante as far as browser capabilities go -- and is supported to some extent by all of the other devices we're concerned with. But the real meaning and importance of this architectural shift is on the server side, driven both by the need to support many heterogeneous device and application types, and by the primacy of live data in modern applications.

In the browser-dominated world, static content and data were inevitably mixed. Yes, we had templating systems that let developers separate static content and design elements from data. But once the application server did its magic, what the browser received was HTML pages mixing data with other content. Browsers were similar enough that, with some browser detection hacks on both the server and client side, it was relatively easy (though a pain) to generate pages that would run anywhere. That doesn't work anymore. It's naive to think that you can wrap some HTML around data and be done with the job; the chances are that you're leaving a huge chunk of your human audience behind, and making things more difficult for another audience -- machines that just want to consume your data.

To build a modern application, developers must focus on the data: they must see themselves as data providers, and they must develop documented and stable public APIs for accessing their data. Over the past few years, we've realized the importance of data. What's the value of Google without the data behind it? Or Facebook? Or, going back 15 or so years, GNN? It took a long time for us to understand the importance of data, as opposed to "content." But when you've gotten that lesson, your design goals change: designing and publishing a stable API to a data service becomes the highest priority.

That's the driving force behind this architectural shift. Front ends, user interfaces, clients, apps, whatever you decide to call them, don't disappear. But we have learned how important it is to keep the data interface separate from the user interface. Your next project will probably have multiple front ends, some delivered through HTML5, and some delivered through native code. Building them on a common data API is going to be much cleaner and simpler. In addition, third parties can build their own apps on top of your API. An important component of Twitter's success has been the ecosystem of applications that have built on their data: TweetDeck, TweetGrid, Tweetie, etc.; Twitdom, the Twitter applications directory, lists more than 1,800 apps. Until the "new Twitter" went live, the Twitter website was significantly less capable than most of the third-party apps.
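
To make the separation concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the `_POSTS` store, `get_timeline`, `as_json`, `as_html`); it is not how Twitter or any other site is built, only an illustration of one data function feeding two independent front ends.

```python
import html
import json

# Hypothetical in-memory store standing in for a real database.
_POSTS = [
    {"id": 1, "user": "alice", "text": "Hello, world"},
    {"id": 2, "user": "bob", "text": "Data > content"},
]


def get_timeline(user=None):
    """Data API: return plain data with no presentation attached."""
    return [p for p in _POSTS if user is None or p["user"] == user]


def as_json(posts):
    """Machine-facing front end: hand the data over as JSON."""
    return json.dumps(posts)


def as_html(posts):
    """Browser-facing front end: wrap the same data in markup."""
    items = "".join(
        f"<li>{html.escape(p['user'])}: {html.escape(p['text'])}</li>"
        for p in posts
    )
    return f"<ul>{items}</ul>"
```

A native client or a third-party app would call `as_json(get_timeline("alice"))`, while the website would call `as_html` on the same result. Adding another front end means adding another renderer; the data function does not change.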

Although it has been a long time coming, we're finally finishing the revolution that started with Ajax. Get data that users care about, make it available via an API, provide a data presence that's distinct from your HTML-based web presence, and build multiple front ends to serve your customers, on whatever platforms they care about. Your value is in the data.

Data  Programming  Web_2.0  ajax  html5  javascript  shared  from google
november 2010 by cloudseer
Async Messaging and the Barbarian Hordes
At PDC 1996, Pat Helland did a six-minute bit where he compared personal computing
to the sacking of Rome and Microsoft Transaction Server to the Renaissance. It was
called “Transaction Processing and the Barbarian Hordes” and in my opinion it should
be required viewing for everyone in the tech industry.

Of course, the tech industry has changed significantly since PDC96. In particular,
personal computing has become the new “Classical Rome” and web developers are the
new barbarians. Just as Microsoft rediscovered transaction processing in the 90’s,
it seems that RESTifarians are on the verge of rediscovering asynchronous messaging.



“The internet has been dead and boring for a while now. It has reached a point of
stability where flashes of technological creativity are rare, but every now and then
some new technology can put a spark back in the ole gal (no sexism intended).

If you haven’t heard of WebHooks or PubSubHubBub its about time you did. Both are
designed to simplify and optimize the web.”

Mark Cuban, The Internet is about to change

Not to put too fine a point on it, but these “flashes of technological creativity”
that Mark’s going gaga over aren’t new at all. Both Web Hooks and PubSubHubbub are
essentially async messaging, the oldest form of messaging in the history of networking.
But just as personal computing ignored the importance of transaction processing for
a long time, REST has long ignored the importance of async messaging. Instead, web
development has been focused exclusively on request/response – something I’ve
struggled with for quite some time. But the rise of Twitter has driven many people
to realize something I’ve known since 2003: “In order to truly evolve syndication…we
need to break free of the synchronous polling model.” [1]


I love the slogan from this Web Hooks presentation: “so simple you’ll think it’s
stupid”. Web Hooks aren’t stupid – far from it – but they certainly are simple. They’re
basically callbacks – which Web Hooks creator Jeff Lindsay readily acknowledges –
invoked across the network using standard REST technology like HTTP and XML or JSON.
The canonical webhook examples are PayPal Instant Payment Notification and GitHub
Post-Receive Hooks. In both cases, you register a custom notification URL with
the system in question. Then, when something specific happens in the system, a message
gets POSTed to the registered URL. In some scenarios, it’s a simple notification.
For example, when GitHub receives a commit push, it POSTs a JSON message about the
commit to the registered URL. In other scenarios, the initial message is the start
of an async conversation – the system expects you to POST a message back to them sometime
in the future. For example, when a customer makes a payment, PayPal POSTs a message
to the URL you registered. You then confirm the payment by posting a message back
to a well-known PayPal URL.
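
The register-then-POST pattern is small enough to sketch in a few lines of Python.
This is an illustration only, not GitHub’s or PayPal’s actual implementation: the
`WebhookRegistry` class, the event names and the injectable `send` parameter are all
my own assumptions, and the default sender simply uses the standard library’s
`urllib.request`.

```python
import json
import urllib.request


class WebhookRegistry:
    """Hypothetical sender side of a webhook system."""

    def __init__(self, send=None):
        self._urls = {}  # event name -> list of registered callback URLs
        # `send` is injectable so the sketch can run without a network.
        self._send = send or self._http_post

    @staticmethod
    def _http_post(url, body):
        """POST a JSON body to a callback URL using only the stdlib."""
        req = urllib.request.Request(
            url,
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status

    def register(self, event, url):
        """A client registers the URL it wants notified for an event."""
        self._urls.setdefault(event, []).append(url)

    def fire(self, event, payload):
        """When the event happens, POST the payload to every registered URL."""
        body = json.dumps(payload).encode("utf-8")
        urls = self._urls.get(event, [])
        for url in urls:
            self._send(url, body)
        return len(urls)  # number of notifications sent
```

With a fake sender such as `WebhookRegistry(send=lambda url, body: print(url, body))`,
calling `register("push", ...)` followed by `fire("push", {...})` shows the whole
flow. A real system would also need retries and some way to authenticate the caller,
which this sketch leaves out.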


Note, by the way, that both of these canonical examples depend on async messaging.
GitHub isn’t going to do anything with a response anyway, so there’s no point in sending
them a response. PayPal, on the other hand, is expecting a response. Yet, they use
async messaging instead of an arguably simpler HTTP request/response operation. They
do this for the same reason WS-Transaction is the Anti-Availability Protocol – the
last thing you want to do is lock up precious resources in your system waiting for
some nimrod on the other side of the Internet to respond to a request you sent.
Instead, you do what PayPal does – send an async message, listen on a separate
channel for a response, correlate the messages explicitly via some kind of conversation
identifier and release your precious resources to do other work while you wait for
the response.
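
Here is a rough Python sketch of that correlation step. The `Conversations` class,
the `conversation_id` field and the message shapes are hypothetical and are not
PayPal’s wire format; the point is only that the sender stores a small record,
returns immediately, and matches the reply up later.

```python
import uuid


class Conversations:
    """Hypothetical tracker for in-flight async conversations."""

    def __init__(self):
        self._pending = {}  # conversation id -> context saved at send time

    def start(self, context, send):
        """Send a message and return at once; nothing blocks on a reply."""
        conv_id = uuid.uuid4().hex
        self._pending[conv_id] = context
        message = dict(context)
        message["conversation_id"] = conv_id
        send(message)
        return conv_id

    def on_reply(self, message):
        """Called by whatever listens on the separate reply channel."""
        context = self._pending.pop(message.get("conversation_id"), None)
        if context is None:
            return None  # unknown, expired or duplicate reply
        return {"request": context, "reply": message}
```

Between `start` and `on_reply` the only resource held is one dictionary entry, which
is the whole argument for doing it this way. A production version would also expire
entries that never get a reply.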


As for PubSubHubbub, it’s focused on real time delivery of new information. Dave Winer’s
recent RSS Cloud efforts focus on real-time notification as well. In both cases,
instead of subscribers polling a given RSS feed for changes every X amount of time,
they register for notification when the feed is updated. This is very similar to the
way GitHub uses async messages for commit push notification as described above.




Both PubSubHubbub and RSS Cloud include an intermediary that’s responsible for managing
the list of current subscribers and relaying the notification when the publisher makes
a change. Honestly, I’m not a fan of the Hub/Cloud intermediary – it feels a
little too ESB-like to me. However, since it’s only relaying notifications it receives
without transformation, I can live with it. Besides, there’s no reason why a publisher
can’t act as its own hub. The vast majority of blogs and Twitter users have so few
subscribers that the extra layer of abstraction is probably not worth it. On the other
hand, if you’re going to run a notification hub for the largest users, you might as
well use it for smaller ones as well.
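
For what it’s worth, the intermediary’s job fits in a few lines. The Python sketch
below is not the PubSubHubbub or RSS Cloud protocol (both involve more, such as
verifying subscriptions); `Hub`, its method names and the `deliver` callable are
assumptions made for illustration. It only shows the two responsibilities described
above: keep the subscriber list, and relay each notification unchanged.

```python
class Hub:
    """Hypothetical notification hub: tracks subscribers and relays updates."""

    def __init__(self, deliver):
        self._subscribers = {}  # topic (feed URL) -> set of callback URLs
        self._deliver = deliver  # callable(callback_url, entry) that does the POST

    def subscribe(self, topic, callback):
        """Register a callback URL instead of polling the feed."""
        self._subscribers.setdefault(topic, set()).add(callback)

    def unsubscribe(self, topic, callback):
        self._subscribers.get(topic, set()).discard(callback)

    def publish(self, topic, entry):
        """Relay a new entry to every subscriber, without transforming it."""
        callbacks = sorted(self._subscribers.get(topic, ()))
        for callback in callbacks:
            self._deliver(callback, entry)
        return len(callbacks)
```

A publisher acting as its own hub would simply hold a `Hub` instance and call
`publish` whenever it updates its feed.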


While I think Mark’s laid the “new technology” hype on pretty thick, I do think he
hits the nail on the head regarding the major new business opportunities that can
come from adopting the heretofore ignored async messaging model on the web:



“This could be an open door for the content business…Using The Associated Press as
an example, AP could post their stories to a HUB. In realtime, the HUB can update
member websites so that they will always have information first, before any aggregator.
It may not take long for aggregators to recognize the new data on the member sites,
but they won’t have it first.


The New York Times could do the same thing. Subscribers could get everything first,
in realtime. Then after some delay which might be 1 minute, it might be 30 minutes
depending on what the paper thinks is the value related to timeliness, it could post
on the website and on twitter and facebook as updates. Would NY Times online readers
pay $1 a month to be guaranteed that they get their news first, before anyone else
? I dont know.


In the sports world, text based play by play websites could be updated in realtime
rather than pulling every 30 seconds or requiring the user to hit refresh every few
seconds.”



Arguably, this opportunity is easier to realize precisely because async messaging
isn’t new technology. Getting people to adopt a new technology is incredibly
hard. It’s much easier to get people to adopt a new pattern for using an existing
technology. And async messaging has been possible as long as the web has been in existence.


Web Hooks and PubSubHubbub are long overdue but very welcome steps forward in the
evolution of the Internet. I wonder what the barbarians will rediscover next?



[1] Of course, writing a prediction like this is a far sight from actually implementing
it. If I had actually put some engineering effort behind this in 2003, maybe I’d be
a household name in the tech community by now. On the other hand, I said some things
in that same post that have turned out to be spectacularly incorrect (“Indigo is going
to make Longhorn a great platform for SOA”) so it probably wouldn’t have made much
of a difference.
Async_Messaging  Web_2.0  Web_Services  shared  from google
august 2009 by cloudseer
