cloudseer + programming   119

HOWTO Use UTF-8 Throughout Your Web Stack
Unicode can be an issue that's hard to resolve, so it's good to follow the guidance in this document and ALWAYS use Unicode where possible.
mysql  programming  unicode  publish 
august 2011 by cloudseer
WordPress Coding Standards
If you are looking for guidance on how to format your code, you could do worse than to base it on this coding standards style guide.
wordpress  programming  standards  publish 
august 2011 by cloudseer
Crash into Python
Crash into Python is a set of documents/slides that are meant to be used as a teaching aid for bringing programmers from other languages up to speed with Python. It assumes that you have enough familiarity with programming to know what function and class mean, and will recognize that print probably doesn't put ink on paper. More importantly, it assumes that you either have an instructor who is well-versed in Python, or are resourceful enough to find answers for yourself. A number of these slides are designed to trigger questions and discussion, so if it seems like you're missing something, that's a good sign you could be digging deeper.
programming  howto  python 
august 2011 by cloudseer
Get Directory Path of an executing Batch file
Add the following to the start of a batch file to make it work on UNC paths:
pushd %~dp0
batch  programming  windows 
june 2011 by cloudseer
The secrets of Node's success
Sections

Browser wars and JavaScript performance
The rehabilitation of JavaScript
Node.js solves a real problem
Evented I/O
Sharing code between the browser and server
Critical mass for Node.js
Node is not the "next" anything

In the short time since its initial release in late 2009, Node.js has captured the interest of thousands of experienced developers, grown a package manager and a corpus of interesting modules and applications, and even spawned a number of startups.

What is it about this technology that makes it interesting to developers? And why has it succeeded while other server-side JavaScript implementations linger in obscurity or fail altogether?

The key factors are performance, timing, and focusing on a real problem that wasn't easily solved with other server-side dynamic languages.

Browser wars and JavaScript performance

In the early 2000s, AJAX web development was coming into its own and placing increasing demands on browsers' JavaScript engines. New JavaScript libraries such as YUI, Dojo and jQuery were allowing developers to do much more with the web user interface (UI), creating a user experience for web applications that mimicked the behavior of desktop applications.


As JavaScript libraries and websites became more complex and users started to notice poor performance in their browsers, browser developers started to focus seriously on their JavaScript engines.

The race for faster JavaScript engines heated up in September 2008 when Google released Chrome and the Chromium source code. The engine behind it was V8 and it outperformed all others. This helped spur the developers of Firefox, Safari, Opera and Internet Explorer to improve JavaScript performance in their browsers and it opened a new front in the browser wars.

Technically speaking, V8 takes a slightly novel approach to improving performance. Certain JavaScript objects are dynamically compiled directly into native machine code before execution based on a predictive analysis of the code.


This, along with a new approach to property access and a more efficient garbage collection system, enabled Chrome to initially post significantly faster benchmarks than other browsers.

The other browsers responded with improved or completely rewritten JavaScript engines that matched or exceeded V8's benchmarks. These optimizations are still going on, and Google's V8 is benefiting from the healthy, often technically brilliant, competition. Compared to the interpreters for server-side dynamic languages like Ruby, Python, PHP and Perl, JavaScript now has several efficient and incredibly fast runtimes.

Ryan Dahl, creator of Node.js, chose the V8 engine for Node. This has an additional benefit for a server-side implementation.


The predictive optimization of JavaScript works fairly well in the Chrome browser, but it is much more effective for server applications where the same chunks of code tend to be run multiple times. V8 is able to refine its optimizations and soon ends up with very efficient cached machine code.

Node has an additional performance advantage (a big one) that is not directly tied to V8, but we'll get to that in a bit.

The rehabilitation of JavaScript

JavaScript was once widely regarded as an awful hack of a language. Many programmers still feel this way, but the prejudice is starting to fade, mostly because there is a growing body of good code that shows off the language.


One person who has done much to pinpoint JavaScript's technical weak points is Douglas Crockford. Fortunately, instead of stopping there, he has also created JSLint and written "JavaScript: The Good Parts" to help developers write better code while avoiding most of the "bad parts" of the language. In his presentations and posts, one of his core assertions is that:

... despite JavaScript's astonishing shortcomings, deep down, in its
core, it got something very right. When you peel away the cruft, there
is an expressive and powerful programming language there. That
language is being used well in many Ajax libraries to manage and
augment the DOM, producing an application platform for interactive
applications delivered as web pages. Ajax has become popular because
JavaScript works. It works surprisingly well.


Without getting into the details of which parts are good or bad, we have seen in the past few years that professional developers have come to realize that JavaScript is not going away. Many developers have gotten on with the task of building complex, well-designed applications and libraries. There are still problems with JavaScript and with its specification, but programmers are now much less likely to dismiss it out of hand.

Previous server-side JavaScript frameworks had a much harder time overcoming the negative mindset about the language. By the time Node arrived, JavaScript had overcome most of its image problem.

Node.js solves a real problem

Wikipedia has a fairly comprehensive "Comparison of server-side JavaScript solutions". Node is in there, but most of the others listed are not nearly so well known. The use of the term "solutions" is interesting, as most of these projects are solutions to problems that have already been solved by other languages.

Python, Java, Ruby, PHP, Perl and others are all still extremely good choices for most types of dynamic web applications. They talk to databases, crunch numbers, validate data, and parse templates. They are high-level languages, and there are several MVC frameworks for each of them for quick web app creation. Node is sometimes touted as the next Ruby-on-Rails, but this is a bad comparison and misses the point of what Node is for.

Node is not trying to solve the same problems as Rails, and it's not competing head-on with any of the other languages or frameworks in the areas where they do well. It was made for, and is most successful at, solving a special set of problems with modern web applications. What can it do that these other languages cannot?

It turns out that what JavaScript can do is the flip side of something it can't do: blocking I/O.

Evented I/O

JavaScript itself can't actually read or write to the filesystem. This ability was omitted from the language because it wasn't necessary for its job in the browser, so Node was able to start from scratch with an I/O system based on event loops.

Node is all about "evented I/O," but what does that actually mean?

To those of us who are either not programmers or are not familiar with event loops, an analogy might help.

You're in a grocery store with a list of items to buy. You wheel your cart around the store, pick up one item at a time, put it in your cart, then take the cart through the checkout. You can optimize this slightly by fetching the items in a sane order, but you can't go get the milk while you're waiting at the deli counter.

If you're in a hurry, you might start thinking of crazy ways to speed up the process. You could enlist a number of other shoppers with shopping carts and send each out to buy a single item. This would create bottlenecks in narrow aisles and a huge traffic jam at the checkout.

This is clearly an insane way to solve the issue because it throws more shopping carts and cash registers at the problem than needed.

Programming languages that block on I/O often try to solve similar problems by spawning additional threads or processes (cf. Apache, sendmail). This can be expensive in terms of memory usage, and an analysis of Python's Global Interpreter Lock shows just how expensive the traffic jam can be in terms of CPU utilization.

JavaScript and Node use event loops and callbacks to approach the problem differently.

Returning to the shopping example: If you had a group of kids along with you on your shopping trip, you could send each off to get a single item and return them to the cart. When they've all returned, you can proceed through the checkout. The time taken in fetching items would be the maximum time for retrieving a single item (the kid who had to wait at the deli counter), rather than the sum (picking up the items in sequence). Using runners for the small, simple task of fetching items is a more efficient way of parallelizing the problem than sending out full-fledged shoppers and carts.

It's not a perfect analogy by any means, but more succinct and accurate descriptions involve code or pseudo-code. Ryan Dahl's initial presentation at JSConf 2009 used the following example:

var result = db.query("select..");
// use result

Here the database query blocks the program from doing anything else until the query is returned, whereas in an event loop:

db.query("select..", function (result) {
  // use result
});

... the program can continue doing things while waiting for the function to call its callback.
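
A minimal runnable sketch of the same idea follows. It assumes a hypothetical `fakeQuery` helper that stands in for a database driver (it is not a real library call) and uses `setTimeout` to simulate the time a query takes:

```javascript
// Hypothetical stand-in for a database driver: it "answers" after a short
// delay instead of making the caller wait. Not a real library API.
function fakeQuery(sql, callback) {
  setTimeout(function () {
    callback({ sql: sql, rows: ["row 1", "row 2"] });
  }, 10);
}

var log = [];

fakeQuery("select..", function (result) {
  // Runs later, once the simulated query has completed.
  log.push("got " + result.rows.length + " rows");
});

// Runs immediately: the program is not blocked while the query is pending.
log.push("doing other work");
```

Once the timer has fired, `log` holds "doing other work" followed by "got 2 rows", which shows that the code after the call ran before the callback did.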

Node provides non-blocking libraries for database, file and network access. Since I/O is not a fundamental part of JavaScript, nothing had to be taken away to add them. Python's Twisted and Ruby's EventMachine have to work around some basic language components in order to get similar evented behavior.

So, in addition to the performance wins Node gets "for free" by using the V8 JavaScript engine, the event loop model itself allows Node servers to handle massive concurrencies in network connections very efficiently. It often approaches the benchmarks achieved by high-performance reverse proxies like Nginx (which is also based on an event loop).
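
The shopping analogy above can also be sketched in code. This is only an illustration: `fetchItem` and `shop` are hypothetical names, and `setTimeout` simulates the wait for each item. The point is that all the waits overlap, so the trip is bounded by the slowest item rather than by the sum of the waits.

```javascript
// Hypothetical "runner": fetches one item after a simulated wait (in ms).
function fetchItem(name, waitMs, callback) {
  setTimeout(function () { callback(name); }, waitMs);
}

// Send one runner per item; call done() once every runner has returned.
function shop(list, done) {
  var cart = [];
  var inFlight = 0;
  var maxInFlight = 0;
  list.forEach(function (item) {
    inFlight += 1;
    maxInFlight = Math.max(maxInFlight, inFlight);
    fetchItem(item.name, item.waitMs, function (name) {
      inFlight -= 1;
      cart.push(name);
      if (cart.length === list.length) {
        done({ cart: cart, maxInFlight: maxInFlight });
      }
    });
  });
}

var list = [
  { name: "deli meat", waitMs: 50 }, // the slow deli counter
  { name: "milk", waitMs: 10 },
  { name: "bread", waitMs: 20 }
];

shop(list, function (trip) {
  // The waits overlapped, so items arrive in order of wait time.
  console.log(trip.cart.join(", ")); // milk, bread, deli meat
});
```

All three requests are outstanding at once (`maxInFlight` is 3), yet only a single thread of JavaScript is running.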

Sh[…]
Programming  javascript  nodejs  server  shared  from google
june 2011 by cloudseer
Four short links: 10 March 2011
Everybody is Spamming Everybody Else on MTurk -- one researcher found >40% of HITs are spammy, but this author posted a Mechanical Turk HIT to supply recommendations for visitors to a non-existent French city and got responses from people expecting that every response would be paid regardless of quality.
Javascript Garden -- a growing collection of documentation about the most quirky parts of the JavaScript programming language. It gives advice to avoid common mistakes, subtle bugs, as well as performance issues and bad practices that non-expert JavaScript programmers may encounter on their endeavours into the depths of the language.
A 5 Minute Framework for Fostering Better Conversations in Comments Sections (Poynter) -- Whether online or offline, people act out the most when they don’t see anyone in charge. Next time you see dreck being slung in the bowels of a news story comment thread, see if you can detect whether anyone from the news organization is jumping in and setting the tone. As West put it, news organizations typically create a disconnect between the people who provide content and the people who discuss that content. This inhibits quality conversation.
Full Text RSS Feed -- builds full-text feeds for sites that only offer extracts in their RSS feeds. (via Jason Ryan)
javascript  mturk  programming  rss  socialsoftware  spam  shared  from google
march 2011 by cloudseer
Four short links: 3 January 2011
RSS is Dying and You Should Be Worried -- If RSS dies, we lose the ability to read in private.
What Could Have Been Entering The Public Domain on January 1, 2011? -- a list of the works that won't be entering the public domain in the US because the copyright term was extended in 1976. Think of the movies from 1954 that would have become available this year. You could have showed clips from them. You could have showed all of them. You could have spliced and remixed and made documentaries about them. (You could have been a contender!) Instead, here are a few of the movies that we won’t see in the public domain for another 39 years .... This list will be viewed two different ways by different groups, reinforcing instead of changing their views: copyright minimalists will say "what a tragedy" but copyright maximalists will say "look at these great works we protected, they're still earning money for their creators therefore they're still valuable and thus worth protecting". (via Bill Bennett on Twitter)
ProxClone -- cloner for proximity cards, cost of parts around $30. (via Hacker News)
2011 Is The Year of Server-Side Javascript -- explanation of why the author will be doing back-end coding in Javascript this year. Good to see an honest assessment that it's still early days for server-side Javascript: Most of the libraries out there are young, buggy and incomplete. I got Node.js to segfault a few times. There’s no killer framework on the same caliber as Rails, nor anything that comes close to ActiveSupport and a decent standard runtime library (hmm … that gives me an idea). But then, it’s not much different than what Ruby was five years ago, or Java back in the late 90s. We’ve all got to start somewhere.
copyright  hacks  hardware  javascript  programming  rfid  rss  security  shared  from google
january 2011 by cloudseer
Four short links: 23 December 2010
There Is No Such Thing as the Government -- an absolutely spot-on "there is no spoon" moment for government. And that matters. It matters because once you recognise that fact, you can start to do things differently. People do, of course, recognise it at the level of caricature I have described here and nobody will admit to believing that they can get things done simply by pulling the levers of power. But inactions speak louder than words and the myth of the lever is harder to eradicate than any of us like to admit.
The Blast Shack (Webstock) -- Bruce Sterling on Wikileaks. No hacker story is more common than this. The ingenuity poured into the machinery is meaningless. The personal connections are treacherous. Welcome to the real world. No army can permit this kind of behavior and remain a functional army; so Manning is in solitary confinement and he is going to be court-martialled. With more political awareness, he might have made himself a public martyr to his conscience; but he lacks political awareness. He only has his black-hat hacker awareness, which is all about committing awesome voyeuristic acts of computer intrusion and imagining you can get away with that when it really matters to people.
Word Lens -- finally, useful AR: it replaces foreign language text with translations.
Staging Servers, Source Control, Deploy Workflows, and Other Stuff Nobody Teaches You -- this guy has a point: when you emerge from programming school, you're unlikely to have touched this kind of real-world programming.
augmentedreality  brucesterling  gov20  mobileapps  programming  wikileaks  shared  from google
december 2010 by cloudseer
Four short links: 3 December 2010
Data is Snake Oil (Pete Warden) -- data is powerful but fickle. A lot of theoretically promising approaches don't work because there's so many barriers between spotting a possible relationship and turning it into something useful and actionable. This is the pin of reality which deflates the bubble of inflated expectations. Apologies for the camel's nose of rhetoric poking under the metaphoric tent.
XML vs the Web (James Clark) -- resignation and understanding from one of the markup legends. I think the Web community has spoken, and it's clear that what it wants is HTML5, JavaScript and JSON. XML isn't going away but I see it being less and less a Web technology; it won't be something that you send over the wire on the public Web, but just one of many technologies that are used on the server to manage and generate what you do send over the wire. (via Simon Willison)
Understanding Pac Man Ghost Behaviour -- The ghosts’ AI is very simple and short-sighted, which makes the complex behavior of the ghosts even more impressive. Ghosts only ever plan one step into the future as they move about the maze. Whenever a ghost enters a new tile, it looks ahead to the next tile that it will reach, and makes a decision about which direction it will turn when it gets there. Really detailed analysis of just one component of this very successful game. (via Hacker News)
The Full Stack (Facebook) -- we like to think that programming is easy. Programming is easy, but it is difficult to solve problems elegantly with programming. I like to think that a CS education teaches you this kind of "full stack" approach to looking at systems, but I suspect it's a side-effect and not a deliberate output. This is the core skill of great devops: to know what's happening up and down the stack so you're not solving a problem at level 5 that causes problems at level 3.
bigdata  data  datamining  devops  games  json  programming  web  xml  shared  from google
december 2010 by cloudseer
New directions in web architecture. Again.
In 2005, Jesse James Garrett at Adaptive Path published the seminal blog post "Ajax: A New Approach to Web Applications" and ushered in a new age of web architecture. Ajax meant using the possibilities latent in JavaScript (specifically, the XMLHttpRequest object) so that a web page could contact the server asynchronously and request new data.

This was revolutionary; within months, we were seeing pages that were more dynamic and interactive. Ajax short-circuited the submit/response loop that dominated web applications up to that time. Instead of making an HTTP request, receiving an entire web page, and rendering that page as a replacement for the current page, the browser requested a chunk of data. It used that chunk of data to interact with the DOM and rewrite the page it was displaying on the fly.
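
The request-a-chunk-and-rewrite pattern can be sketched roughly as follows. `XMLHttpRequest` is the browser object named above; the `loadInto` function, the element id, and the `/api/headlines` URL are hypothetical, and the constructor is passed in as a parameter only so that the sketch can be exercised outside a browser.

```javascript
// Fetch a chunk of data and rewrite part of the page, without a full reload.
// XhrCtor is the browser's XMLHttpRequest in real use.
function loadInto(element, url, XhrCtor) {
  var xhr = new XhrCtor();
  xhr.open("GET", url, true); // true = asynchronous
  xhr.onreadystatechange = function () {
    // readyState 4 means the response has arrived in full.
    if (xhr.readyState === 4 && xhr.status === 200) {
      var items = JSON.parse(xhr.responseText);
      // Rewrite only this element; the rest of the page is left alone.
      // (A real page should escape the text before inserting it.)
      element.innerHTML = items
        .map(function (text) { return "<li>" + text + "</li>"; })
        .join("");
    }
  };
  xhr.send();
}

// In a browser (hypothetical element id and URL):
// loadInto(document.getElementById("headlines"), "/api/headlines", XMLHttpRequest);
```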

Around the same time, the RESTful paradigm started taking hold. REST represented a much simpler, web-oriented way for servers to interact with their clients. As Roy Fielding pointed out in his dissertation, the basic operations of the HTTP protocol were capable of providing general access to data; stateful applications could be built upon stateless protocols; hypermedia could be used to maintain application state. Although Fielding's dissertation dates back to 2000, it took a few years of bad experience with SOAP and its heirs to realize its importance. With REST, it becomes easier for a website to see itself as a source of data for machines to process, rather than as a source of content for humans to read. Websites become data servers.

Important as Ajax and REST have been to the history of the web, each only represents half of a larger revolution. And in the past few months, we've seen some new sites that have taken the revolution to its logical conclusion. Specifically: take a look at the new Twitter. It's a nice web application, sure -- but look at the HTML. There's not much there. The HTML page you get from Twitter is largely a bunch of empty divs, with a big wad of JavaScript. What's happening? The JavaScript is the entire application; the divs exist only to provide tags so the JavaScript can rewrite the DOM at will. In turn, the JavaScript is constantly (and asynchronously) making requests from the Twitter site, which is just returning data from its API. In fact, the Twitter site is returning the same data for its web page that it would return for its mobile app, for TweetDeck, or for any of the apps in the Twitter ecosystem.

This design isn't particularly new; we've seen it ever since developers started reverse-engineering GMail and Google Maps to get ideas for their own projects. Those big Google apps may have been the first examples of this architectural trend. They were certainly among the first to use JavaScript as a full-fledged client programming language. But we're seeing many more sites built along these lines. Why now, what does this shift mean, and why is it important?

"Why now" is perhaps the easiest question to answer. A few short years ago, web developers only had one platform to support, and that was the "browser." Granted, there were a dozen or so browsers of significance, and the browser world was riddled with incompatibilities. We're in a different world now. Browser incompatibilities have been ironed out, to some extent (though conscientious developers still support "legacy" browsers, all the way back to IE6 or even IE5). But it's no news that the most important new apps these days run on devices ranging from phones (iPhone, Android, BlackBerry, Windows Phone) to tablets (iPad, Android/ChromeOS devices), and potentially ebook readers and other new devices. With these new devices on the table, browser incompatibilities pale in significance. It's another sign of the times that I can't conceive of an interesting application that doesn't access data across the network. A static application that never accesses remote data -- that's so 1990s.

So, while it's tempting to say that the new age is characterized by the browser as platform, and that applications running in the browser can do anything that native code can do, that's looking in the wrong place. HTML5 certainly ups the ante, as far as browser capabilities -- and is supported to some extent by all of the other devices we're concerned with. But the real meaning and importance of this architectural shift is on the server side, driven both by the need to support many heterogeneous device and application types, and by the primacy of live data in modern applications.

In the browser-dominated world, static content and data were inevitably mixed. Yes, we had templating systems that let developers separate static content and design elements from data. But once the application server did its magic, what was delivered to the browser was HTML pages mixing data with other content. Browsers were similar enough that, with some browser detection hacks on both the server and client side, it was relatively easy (though a pain) to generate pages that would run anywhere. That doesn't work any more. It's naive to think that you can wrap some HTML around data and be done with the job; the chances are that you're leaving a huge chunk of your human audience behind, and making things more difficult for another audience -- machines that just want to consume your data. To build a modern application, developers must focus on the data: they must see themselves as data providers, they must develop documented and stable public APIs for accessing their data. Over the past few years, we've realized the importance of data. What's the value of Google without the data behind it? Or Facebook? Or, going back 15 or so years, GNN? It took a long time for us to understand the importance of data, as opposed to "content." But when you've gotten that lesson, your design goals change: designing and publishing a stable API to a data service becomes the highest priority.

That's the driving force behind this architectural shift. Front ends, user interfaces, clients, apps, whatever you decide to call them, don't disappear. But we have learned how important it is to keep the data interface separate from the user interface. Your next project will probably have multiple front ends, some delivered through HTML5, and some delivered through native code. Building them on a common data API is going to be much cleaner and simpler. In addition, third parties can build their own apps on top of your API. An important component of Twitter's success has been the ecosystem of applications that have built on their data: TweetDeck, TweetGrid, Tweetie, etc.; Twitdom, the Twitter applications directory, lists more than 1,800 apps. Until the "new Twitter" went live, the Twitter website was significantly less capable than most of the third-party apps.
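
As a rough sketch of keeping the data interface separate from the user interface, the snippet below serves one data source to several front ends. Every name in it is hypothetical, and `getTimeline` merely stands in for a call to a shared data API.

```javascript
// One data source, several front ends. All names here are hypothetical.
function getTimeline() {
  // Stands in for a call to a shared data API.
  return [
    { user: "alice", text: "Hello" },
    { user: "bob", text: "Hi there" }
  ];
}

// Front end 1: HTML for a browser.
function renderHtml(posts) {
  return "<ul>" + posts.map(function (p) {
    return "<li><b>" + p.user + "</b>: " + p.text + "</li>";
  }).join("") + "</ul>";
}

// Front end 2: plain text for a terminal or a simple client.
function renderText(posts) {
  return posts.map(function (p) {
    return "@" + p.user + ": " + p.text;
  }).join("\n");
}

// Third parties and machines can consume the same data directly as JSON.
var asJson = JSON.stringify(getTimeline());
```

Adding another front end means writing another renderer; the data function does not change.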

Although it has been a long time coming, we're finally finishing the revolution that started with Ajax. Get data that users care about, make it available via an API, provide a data presence that's distinct from your HTML-based web presence, and build multiple front ends to serve your customers, on whatever platforms they care about. Your value is in the data.

Data  Programming  Web_2.0  ajax  html5  javascript  shared  from google
november 2010 by cloudseer
The 100-year leap
In December 1837, the British mathematician Charles Babbage published a paper describing a mechanical computer that is now known as the Analytical Engine. Anyone intimate with the details of electronic computers will instantly recognize the components of Babbage's machine. Although Babbage was designing with brass and iron, his Engine has a central processing unit (which he called the mill) and a large amount of expandable memory (which he called the store). The operation of the Engine is controlled by a program stored on punched cards, and punched cards can also be used to input data.


Punched cards created for Babbage's Analytical Engine. From Flickr user lorentey.

Inside the mill, individual operations are controlled by the equivalent of a microprogram. The microprogram is stored on cylinders covered in studs (much like in a music box) that Babbage refers to as the barrels. Data is transferred from the store to the mill for processing and returned to the store for later use. In his plans Babbage described an Engine with 100 storage locations holding 40 decimal digits each (which is roughly equivalent to 1.7KB). He even anticipated the need for ever more memory, describing an Engine with 1,000 storage locations (17KB) and external storage (he would have used punched cards where we use disks).
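
The storage figures can be checked with a little arithmetic. The sketch below assumes that each decimal digit is worth log2(10) bits (about 3.32), that 1 KB means 1,000 bytes, and that signs are ignored; the function name is made up for illustration.

```javascript
// Bytes needed to hold `locations` numbers of `digits` decimal digits each.
// Each decimal digit carries log2(10) (about 3.32) bits of information.
function storeBytes(locations, digits) {
  var bitsPerNumber = digits * Math.log2(10);
  return locations * bitsPerNumber / 8;
}

var small = storeBytes(100, 40) / 1000;  // in KB (1 KB = 1000 bytes here)
var large = storeBytes(1000, 40) / 1000;

console.log(small.toFixed(1) + " KB"); // 1.7 KB
console.log(large.toFixed(0) + " KB"); // 17 KB
```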

For output, the Analytical Engine plans call for both a printer and a plotter. The entire Engine would likely have been powered by steam and would have been the size of a small steam locomotive. Its programming language -- if it can be called that -- included loops and conditionals. The only surprising thing about the architecture of the Analytical Engine is when it was invented.

It wasn't until 100 years later that computers came into existence, with Babbage's work lying mostly ignored. In the late 1930s and 1940s, starting with Alan Turing's 1936 paper "On Computable Numbers, with an Application to the Entscheidungsproblem," teams in the US and UK began to build workable computers by, essentially, rediscovering what Babbage had seen a century before. Babbage had anticipated the impact of his Engine when he wrote in his memoirs: "As soon as an Analytical Engine exists, it will necessarily guide the future course of science."

During his lifetime Babbage only constructed parts of the Analytical Engine (which can be seen in the Science Museum in London). His son, H. P. Babbage, working from his father's designs, built a demonstration version of the mill after his father's death. The elder Babbage left behind extensive documentation and plans for the Engine, all of which are safely stored in London and have been examined by historians.

The mill of the Analytical Engine. From Flickr user Gastev.

Babbage came up with the idea of the Analytical Engine while working on a machine to automatically produce mathematical tables (such as tables of logarithms). Mathematical tables were extensively used at the time -- and well into the 20th century -- and they were calculated by hand by people referred to as "computers." Babbage hoped to eliminate errors made by these computers by replacing them with a machine capable of performing the relevant calculations automatically.

The machines he invented are called the Difference Engines (because they use the mathematical technique of differences to perform their calculations). These machines were not completed during Babbage's lifetime partly because of his difficult personality and partly because of the withdrawal of government support for the project. The conception and construction of Babbage's Engines was an enormous undertaking in the 1800s. Despite repeated setbacks, Babbage continued essentially alone, working on plans and designs up until his death and spending his own fortune on the work. Twentieth-century computer pioneer Maurice Wilkes describes being "haunted by the thought of the loneliness of [Babbage's] intellectual life" while working on the Analytical Engine.
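
The technique of differences mentioned above lets a machine tabulate a polynomial using nothing but addition. The following is a minimal sketch of the general method, not a model of Babbage's mechanism; the function name and the example polynomial are chosen only for illustration.

```javascript
// Tabulate a polynomial using only addition, given its starting value and
// initial differences: [f(0), first difference, second difference, ...].
function tabulate(initial, count) {
  var state = initial.slice();
  var table = [];
  for (var n = 0; n < count; n++) {
    table.push(state[0]);
    // Add each difference into the entry above it; no multiplication needed.
    for (var i = 0; i < state.length - 1; i++) {
      state[i] += state[i + 1];
    }
  }
  return table;
}

// f(x) = x*x + x + 41: f(0) = 41, first difference 2, second difference 2.
console.log(tabulate([41, 2, 2], 5)); // [ 41, 43, 47, 53, 61 ]
```

Because the second difference of a quadratic is constant, each new table entry costs just two additions.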

The British government had initially supported Babbage and covered some of the costs of construction of the first Difference Engine. But as costs rose and years wore on, the government was advised that the machines would be of little use, were unlikely to pay for themselves, and the money expended would have been better invested and the dividends used to hire additional human "computers" to do the work.

Soldiering on alone with the conviction that his machines would be of great benefit to mankind by taking what had been mental effort and making it mechanical, Babbage wrote that "Another age must be the judge" of his inventions.

Simply put, we live in that age. In the late 1980s the Science Museum in London undertook a project to demonstrate that Babbage's Engines could have been built during his lifetime. The museum constructed his Difference Engine No. 2 and the associated printer using historically accurate materials and to within historically accurate tolerances. In 1991, the working machine was unveiled, and it can still be seen on display in the museum (a copy of the machine is also on display at the Computer History Museum in Mountain View, CA).

Difference Engine No. 2. From Flickr user psd.

The Science Museum's Difference Engine No. 2 project put to rest any doubt about the limits of Victorian engineering. Babbage's Engines were achievable in Victorian Britain and Babbage's 100-year leap in inventing the computer could have been realized.

It's time to build the Analytical Engine

I hope to finish Babbage's dream and build an Analytical Engine for public display. I've launched a project called Plan 28 to raise the money and bring together people to work on the Engine. Babbage left behind extensive documentation of the Analytical Engine, the most complete of which can be seen in his Plan 28 (and 28a), which are preserved in a mahogany case that Babbage had constructed especially for the purpose.

There are three important steps to achieve this goal:

A decision must be made on what constitutes an Analytical Engine
The Engine should be simulated on a computer to help debug the physical machine
The machine must be built

The first step is necessary because Babbage continually refined his designs -- he was constantly aiming at simplification and faster computational speed -- and left behind a mixed collection of plans and notebooks. Sorting through this material will require the help of historians and specialists in Victorian engineering.

Simulating the machine using 3D modeling software and a physics engine would enable us to bring the machine to life without making any metal parts. Given the size and complexity of the machine, this step is vital. And since the final machine would wear out if constantly used, it would provide a way of demonstrating the Engine.

It might seem a folly to want to build a gigantic, relatively puny computer at great expense 170 years after its invention. But the message of a completed Analytical Engine is very clear: it's possible to be 100 years ahead of your own time. With support, this type of "blue skies" thinking can result in fantastic changes to the lives of everyone. Just think of the impact of the computer and ask yourself how different the Victorian world would have been with Babbage Engines at its disposal.

What seemed like costly research that was unlikely to have any short-term value turned out to be the seed of one of the greatest revolutions mankind has seen. I hope that future generations of scientists will stand before the completed Analytical Engine, think of Babbage, and be inspired to work on their own 100-year leaps.


Related:

One of the great inventions that never was - until now?
How Alan Turing Finally Got a Posthumous Apology
Programming  babbage  computer  history  shared  from google
october 2010 by cloudseer
The principle of indirection
Programmers learn, early on, that there's a difference between values stored in memory and pointers to (or references to, or addresses of) values stored in memory. The key distinction emerges when these things move around within programs, and it is captured by a pair of phrases: pass by value and pass by reference.

Suppose the value stored in some memory location represents the number 6. In a pass-by-value regime, 6 is copied from one part of the program to another part of the program. If the value stored in the original memory location then becomes 8, those parts of the program that got copies of 6 still represent 6. The 6 was "passed by value."

In a pass-by-reference regime, though, what's copied from one part of the program to another isn't the value, 6, but rather a reference, or pointer, to the memory location where 6 is stored. In this case, if the value stored in that memory location becomes 8, the parts of the program that received references to 6's memory location now represent 8 too. The 6 was "passed by reference" and, by way of that reference, has become 8.
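The 6-becomes-8 scenario can be sketched in a few lines of Python. This is only an illustration: Python does not expose raw pointers, so a one-element list (the name `cell` is an assumption made for this example) stands in for the memory location.

```python
# A one-slot "memory location" that initially holds the value 6.
cell = [6]

by_value = cell[0]      # copies the number itself
by_reference = cell     # copies only a reference to the location

cell[0] = 8             # the value stored at the original location changes

print(by_value)         # the copy still represents 6
print(by_reference[0])  # the reference now represents 8
```

The part of the program holding `by_value` never learns about the change, while the part holding `by_reference` sees it the next time it looks.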

It used to be that nobody except programmers had to appreciate this subtle distinction. But along came the web, and now everybody does. Why? Another name for a pointer, or a reference, or an address, is a hyperlink. We use hyperlinks every day. But most people don't use them as well as they could, because most people don't see "pass by value" and "pass by reference" at work in our everyday online discourse.

Here's a quiz that looks easy, and should be, but turns out to be quite hard for most people. Somebody asks you: "What information do you have about topic X?" It's a multiple-choice quiz. There are two ways to answer:

1. Make a list of things, and send a copy of the list.

2. Make a list of things, and send a reference to the list.

Most people choose 1 -- that is, pass by value. The value they send, in this case, is a list of things we can describe using words, phrases, sentences, paragraphs, URLs. The list can be printed on paper and delivered by hand. Or it can be typed and sent as an email or text message. Either way, what's sent is a copy of the list. The original list remains in situ. When it changes, over time, those changes don't propagate through the network of copies.

The minority who choose 2 -- that is, to pass by reference -- achieve the same two goals as do the pass-by-value majority. One goal is to send a social signal: "Here is information I want to give you." The other is to convey the actual information. You get these same two effects no matter whether you pass by value or pass by reference.

When I send you a link to the list, though, instead of a copy of the list, I connect you to a live list that provides four extra benefits:

1. I am the authoritative source for the list. It lives at a location in memory (that is, at a URL in the cloud) that's under my control, and is bound to my identity.

2. The list is always up-to-date. When I add items, you (and everyone else) will see a freshly-updated list when you follow the link I sent you.

3. The list is social. If other people cite my link, I can find their citations and connect with them.

4. The list is collaborative. Suppose you want to extend my list. In a pass-by-value world, the best you can do is add to the copy I sent you. I won't see what you've added, and neither will anybody else. In a pass-by-reference world, though, we can both keep our own lists, publish references to them, and then produce a merged list by combining the referents.
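As a rough sketch of the collaborative benefit, the snippet below models the web as a dictionary from URLs to live lists and builds a merged list by following references at read time. The URLs, the `web` dictionary, and the `merged` helper are all hypothetical names invented for this illustration.

```python
# Hypothetical stand-in for the web: URL -> the live list published there.
web = {
    "https://example.org/alice/list": ["item A", "item B"],
    "https://example.org/bob/list": ["item B", "item C"],
}


def merged(urls):
    """Dereference each URL at call time and combine the lists, skipping duplicates."""
    seen, result = set(), []
    for url in urls:
        for item in web[url]:
            if item not in seen:
                seen.add(item)
                result.append(item)
    return result


refs = list(web)          # what gets passed around is references, not copies
before = merged(refs)

# Alice updates her own list at its original location.
web["https://example.org/alice/list"].append("item D")
after = merged(refs)

print(before)
print(after)
```

Because `merged` follows the references each time it runs, Alice's later addition appears in the merged view without anyone resending anything.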

(Of course there's no free lunch. If you depend on the link and it fails, we're out of luck. This week's companion piece at answers.oreilly.com explores one way to handle transient failure.)

The fourth benefit, the collaborative one, is rather abstract. So let's nail it down to a common real-world scenario. Suppose you're running a newspaper, or a hyperlocal website, or some other nexus for community information. And suppose I am a source for that information. Almost always, as things stand today, you'll ask me to pass information to you by value. If I'm promoting a council meeting, or a church supper, or a riverside cleanup, or an open mic night, you'll expect me to inform you about my event's date, time, and description by sending you an email, or by visiting your website and typing the data into a form. Either way, it boils down to: "Give me a copy of your information."

Before 1994 there was no alternative. My original, whether it was a piece of paper in my drawer or a file on my computer's hard drive, wasn't immediately available to you. It could only be passed by value. Since 1994 we've had an exciting new option, albeit one the world mostly hasn't yet caught up to. Now the original can reside on the web, at a permanent and well-known address within its vast memory. And it can be passed by reference.

So providers of information about community events -- the city government, the church, the environmental group, the musicians -- can post references to information about their events. Those references can appear wherever the providers choose to establish their online identities: on conventional websites, on blogs, on Twitter, on Facebook. Purveyors of that information -- newspapers, hyperlocal websites, other nexuses -- can use those references to create views that join many sources, from many perspectives, for many purposes.

That's still a notch too abstract so let's make it even more concrete. City governments provide calendars of council and committee meetings. Local newspapers purvey those calendars. Citizens use them. In the prevailing pass-by-value model, the city gives copies of its event information to the newspaper, which in turn makes copies to give to citizens, who in turn may need to make more copies to pass around. Where's the original? In a document on a computer at city hall.

In a pass-by-reference world, the original resides in the cloud at a unique URL. That URL refers to a list of events. And each item on the list -- each event on the calendar -- has its own URL. The city publishes its calendar on its own website, in HTML, so citizens can read it there. But instead of giving the newspaper copies of event information, it gives the newspaper a link to the calendar's feed. The newspaper, by subscribing to the link, ensures that the information it receives from the city is as timely, accurate, and complete as the city cares to make it. Of course the newspaper still has to make copies for its print version. But online, along with the subset of facts about each event that it chooses to relay, it provides the event's URL. Citizens can click through the event URL to see the whole description, and to check for updates. Citizens can also subscribe directly to the city's calendar URL, and thus merge its stream of civic event data with their own streams of personal event data.
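The calendar arrangement might look roughly like the following sketch. Everything named here (the `Event` class, the example URL, the `city_feed` dictionary, and both helper functions) is an assumption made for illustration, not a description of any real feed format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Event:
    url: str          # each event has its own address
    when: date
    title: str
    description: str


# Hypothetical city calendar feed, keyed by event URL.
COUNCIL = "https://city.example.org/events/council"
city_feed = {
    COUNCIL: Event(COUNCIL, date(2010, 10, 5), "Council meeting", "Agenda to follow."),
}


def newspaper_listing(feed):
    """Relay a subset of facts plus the event URL, not a full copy."""
    return [(e.when, e.title, e.url) for e in feed.values()]


def citizen_stream(feed, personal):
    """Merge subscribed civic events with personal ones, ordered by date."""
    civic = [(e.when, e.title) for e in feed.values()]
    return sorted(civic + personal)


listing = newspaper_listing(city_feed)

# The city later edits the original; the newspaper's link still leads to it.
city_feed[COUNCIL].description = "Agenda posted."
_, _, url = listing[0]
current = city_feed[url].description

stream = citizen_stream(city_feed, [(date(2010, 10, 3), "Dentist")])
print(current)
print(stream)
```

The newspaper's listing was built before the city's edit, yet a reader who clicks through the stored URL gets the updated description, which is the point of relaying the reference alongside the summary.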

I've yet to convince a local newspaper to adopt this model. It could be that they fear disintermediation. After all, if citizens can subscribe directly to calendar feeds, why will they need the newspaper to tell them about what's going on? But I don't think that's the real problem. There will always be community attention hubs. Newspapers, or whatever they evolve into, will continue to occupy that niche. In their role as purveyors of community information, though, pass-by-value makes them less effective than pass-by-reference could.

The real problem, I think, is that if you're a newspaper editor, or a city official, or a citizen, pass-by-reference just isn't part of your mental toolkit. We teach the principle of indirection to programmers. But until recently there was no obvious need to teach it to everybody else, so we don't.

I've noticed that educators do, nowadays, talk a lot about systems thinking and digital literacy and 21st-century skills. Good! Now let's codify what we mean. Networks of people and data are governed by principles as basic as the commutative law of addition and multiplication. Indirection is one of those principles. Others include pub/sub syndication, universal naming, and data structure. First we need to write them down. Then we need to figure out how to teach them.

Related:

The laws of information chemistry
Personal data stores and pub/sub networks
See all Radar elmcity stories
See all Answers elmcity stories
Programming  education  elmcity  shared  from google
september 2010 by cloudseer
Four short links: 30 September 2010
Learn Python The Hard Way -- Zed Shaw's book on programming Python, written as 52 exercises: Each exercise is one or two pages and follows the exact same format. You type each one in (no copy-paste!), make it run, do the extra credit, and then move on. If you get stuck, at least type it in and skip the extra credit for later. This is brilliant—you learn by doing, and this book is all doing.
When The Revolution Comes They Won't Recognize it (Anil Dash) -- nails the importance of Makers. Dale Dougherty and the dozens of others who have led Maker Faire, and the culture of "making", are in front of a movement of millions who are proactive about challenging the constrictions that law and corporations are trying to place on how they communicate, create and live. The lesson that simply making things is a radical political act has enormous precedence in political history.
Truthy -- project tracking suspicious memes on Twitter.
UK Open Government License -- standard license for open government information in the UK.
book  gov20  license  make  memes  opendata  programming  python  research  twitter  shared  from google
september 2010 by cloudseer
Four short links: 7 April 2010
SproutCore -- open-source HTML5 application framework (i.e., lots of Javascript goodness) that'll work with any backend. To code for this, you put most of the logic in the front-end and leave the back-end much simpler.
RDF for Intrepid Unix Hackers -- an interesting series, showing how to use common Unix tools to manipulate RDF data from the commandline. (via Edd Dumbill)
How to Thrive Among Pirates (Kevin Kelly) -- a look at how indigenous movie-makers make money in countries like China, India, and Nigeria where piracy is rampant. In short, they make cheap movies, sell near the price of inferior-quality knockoffs, and take advantage of unique experiences that movie theaters offer (e.g., air-conditioning).
On Complaints (PublicStrategist) -- a very good analysis of complaints departments and expectations of people who complain. But there is also a vital question of what the organisation thinks the purpose of a complaints process is. If it is a safety valve, a means of finding and correcting the most egregious failures or a means of channelling immediate anger and dissatisfaction into a swamp of unresponsiveness, then it can’t provide any broader value. That’s where the Patient Opinion model starts to look really attractive. It is deliberately and carefully constructed to elicit feedback, not just complaints. More than half the stories it gets told are positive, even some of the most harrowing, and it therefore creates a picture which is as clear about what is valued as it is about what is seen as in need of improvement.
business  copyright  gov20  html5  javascript  opensource  piracy  programming  rdf  unix  web20  shared  from google
april 2010 by cloudseer
Four short links: 11 February 2010
Mimo Monitors -- USB-powered external monitors for your laptop or desktop, and you can daisy-chain them for multiple external monitors. Opens the possibility of task-specific monitors (one for chat, one for email, one for shell, one for code, ...). Monitors are 7" (800x480) and there's even a touchscreen option. (via James Duncan)
The Secrets of Malcolm Gladwell -- how to give a talk like Malcolm Gladwell. A short read and interesting. (via thestrategist)
Plupload -- a nice widget to handle file uploads (drag'n'drop, resizing, etc.). Has backends for Flash, Gears, HTML5, Silverlight, and Yahoo's BrowserPlus, selects the best that's available. (via Simon Willison)
The Coming Data Flood (Sunlight Labs) -- Three and a half years after their launch of data.dc.gov, they're looking at incredible exponential growth. Last year they saw more than a doubling of new datasets being released. It isn't crazy to suspect we'll see the same exponential curve of data growth coming out of the federal government and other municipalities as they follow suit.
gov20  hardware  opendata  presentations  programming  web20  shared  from google
february 2010 by cloudseer
Four short links: 8 February 2010
Kindle Development Kit APIs -- Amazon will release a Kindle SDK. These are the API docs. (via obra on Twitter)
rePublish -- all-Javascript ebook reader. (via kellan on Twitter)
Peer Review: What's it Good For? (Cameron Neylon) -- harsh and honest review of peer review with some important questions for the future of science. But there is perhaps an even more important procedural issue around peer review. Whatever value it might have we largely throw away. Few journals make referee’s reports available, virtually none track the changes made in response to referee’s comments enabling a reader to make their own judgement as to whether a paper was improved or made worse. Referees get no public credit for good work, and no public opprobrium for poor or even malicious work. And in most cases a paper rejected from one journal starts completely afresh when submitted to a new journal, the work of the previous referees simply thrown out of the window. Some lessons in here for social software, too.
Analog IMDB -- The transition is moving slowly, but it’s moving. It’s a fascinating thing to watch. The technology is the dull part: what’s interesting is the shift in perception. You know how sometimes you turn off a certain section of your brain and force yourself to see a word not as a piece of language with meaning, but as a sequence of black shapes and white spaces? It’s like you’re seeing that image for the very first time and suddenly “bird” seems like a very odd thing. I’ve been buying all of my in-print books electronically for a couple of years. Physical books aren’t weird to me yet. But damn, that old copy of the Maltin guide was a freaky and bizarre object. It’s the first time I looked at a book and didn’t see a container for information. I saw dead wood.
amazonkindle  ebooks  javascript  opensource  programming  science  socialsoftware  shared  from google
february 2010 by cloudseer
Four short links: 11 January 2010
mytop -- a MySQL top implementation to show you why your server is so damn slow right now.
What Could Kill Elegant High-Value Participatory Project? -- The problem was not that the system was buggy or hard to use, but that it disrupted staff expectations and behavior. It introduced new challenges for staff [...]. Rather than adapt to these challenges, they removed the system. [...] No librarian would get rid of all the Harry Potter books because they are "too popular." No museum would stop offering an educational program that was "too successful." These are familiar challenges that come with the job and are seen to have benefit. But if tagging creates a line or people spend too much time giving you feedback? Staff at Haarlem Oost likely felt comfortable removing the tagging shelves because they didn't see the tagging as a patron requirement, nor the maintenance of the shelves as part of their job.
Gremlin -- a Turing-complete, graph-based programming language developed in Java 1.6+ for key/value-pair multi-relational graphs known as property graphs. Graph structures underlie a lot of interesting data (citations, social networks, maps) and this is a sign that we're inching towards better systems for working with those graphs. (via Hacker News)
Anic -- programming language based on streams and latches. I still can't figure out whether it's an elaborate April Fool's Day joke that was released too soon, because the claim of "easier than *sh" is a bold one given the double-backslash and double-square-bracket-heavy syntax of the language. Important because it's built to be parallelised, and we're in transition pain right now between well-understood predictable languages for single CPUs (with hacks like pthreads for scaling) and experimental languages for multiple CPUs.
language  multicore  mysql  opensource  programming  projectmanagement  socialsoftware  shared  from google
january 2010 by cloudseer
Four short links: 22 September 2009
The City is a Battlesuit for Surviving the Future (IO9) -- a great essay by Matt Jones, based on his talk at Webstock this year. Urban design is how we created alternate realities before we had iPhones, and the new technology lets us choose which science fiction future we want to inhabit. We are now a predominantly urban species, with over 50% of humanity living in a city. The overwhelming majority of these are not old post-industrial world cities such as London or New York, but large chaotic sprawls of the industrialising world such as the "maximum cities" of Mumbai or Guangzhou. Here the infrastructures are layered, ad-hoc, adaptive and personal - people there really are walking architecture, as Archigram said. Hacking post-industrial cities is becoming a necessity also. [...]
How and Why Machines Work (MIT Open Course Ware) -- Subject studies how and why machines work, how they are conceived, how they are developed (drawn), and how they are utilized. Students learn from the hands-on experiences of taking things apart mentally and physically, drawing (sketching, 3D CAD) what they envision and observe, taking occasional field trips, and completing an individual term project (concept, creation, and presentation). Emphasis on understanding the physics and history of machines. (via Hacker News)
Google Style Guide -- how Google codes. Useful if you're working on their code, starting a job there, or want to mock them for not specifying K&R braces/four space tabs/<insert One True Way here>. (via Hacker News)
EC2 Usage Guessed From Sequential IDs -- The Superseries ID changes so rarely that originally I had assumed it was some kind of checksum. This would have been odd as it limits the total available IDs to 2^24 = 16.8 million. Up to very recently, the Superseries ID for all resource types - instances, images, volumes, snapshots, etc. - was 69 in the us-east-1 region (for eu-west-1 the Superseries ID is 74). These days, new instances use the Superseries ID 68. This subtle change, unnoticed by the industry, may hint at an astonishing achievement: 8.4 million instances launched since EC2’s debut! (Instance IDs are even so 8.4M = 16.8M / 2.) (via mattb on delicious)
alternatereality  architecture  cities  diy  ec2  google  maker  programming  shared  from google
september 2009 by cloudseer