rahuldave + redis   5

How We Built Our Real-Time, Location-Based Urban Geofencing Game
This guest post comes from Amber Case, co-founder of Geoloqi, a private, real-time platform for location sharing. She also speaks frequently on Cyborg Anthropology, the study of humans and computers.

In this post I’ll describe how we planned, built and tested a truly real-time location-based game with Socket.io, Redis, and Node.js, and what we learned along the way. Over the past few months, we’ve spent the majority of our free time building a real-time game as a test for our location platform, Geoloqi. We call the game MapAttack! due to its map-based nature. Two teams compete to capture the most points on the gameboard. The gameboard, in this case, is the city streets of the neighborhood the players are in.

Each geofence has a point value, which a player earns by entering the fence. The idea was that a virtual map would be set up on top of the real world, and players on red and blue teams would try to capture all of the geofence points in the game before the other team. To capture a point, the phone would have to detect when the player entered the fence, determine the point value of the fence, notify the player that they received the point, turn the geofence the color of the team, and then add the point to the player score and the overall team score.
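
The capture steps above can be sketched roughly as follows. This is a minimal illustration, not the MapAttack code: it assumes circular fences, a haversine distance check, and that a fence can be captured only once. All class and function names (`Geofence`, `Player`, `Game`, `update_location`) are hypothetical, and the sketch ignores whether the check runs on the phone or the server.

```python
import math
from dataclasses import dataclass, field
from typing import Optional

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


@dataclass
class Geofence:
    lat: float
    lon: float
    radius_m: float              # assumption: fences are circles
    points: int
    owner: Optional[str] = None  # team colour once captured


@dataclass
class Player:
    name: str
    team: str
    score: int = 0


@dataclass
class Game:
    fences: list
    team_scores: dict = field(default_factory=lambda: {"red": 0, "blue": 0})

    def update_location(self, player, lat, lon):
        """Capture every unclaimed fence the player is inside; return them."""
        captured = []
        for fence in self.fences:
            if fence.owner is not None:
                continue  # assumption: a captured fence stays captured
            if haversine_m(lat, lon, fence.lat, fence.lon) <= fence.radius_m:
                fence.owner = player.team          # turn the fence the team colour
                player.score += fence.points       # player score
                self.team_scores[player.team] += fence.points  # team score
                captured.append(fence)
        return captured
```

Calling `game.update_location(player, lat, lon)` for each incoming location fix returns the fences captured by that fix, which a caller could then use to notify the player.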

Why Build a Real-Time Geofencing Game?
We wanted to create a game that allowed people to physically interact with the real world instead of a computer console like a first person shooter or a real-time strategy game. We were inspired by playing a real-life version of Pac-Man called Pacmanhattan, invented by graduate students at the Interactive Telecommunications Program at NYU in 2004. We played it at Portland’s WhereCamp conference in 2008, and we wanted to see if we could make a GPS version of the game, as Pacmanhattan relied entirely on phone calls and physical maps. We also needed a good demo of Geoloqi’s streaming API.

Technical Challenges
Here is an overview of the problems we had to focus on in order to build the game.

Handling the detection of users entering and leaving 200+ geofences concurrently.
Handling the volume of location-updates from all the phones in a given game (20 or more users per game).
Allowing each phone and web browser watching the game to be able to see the movements of players and the geofences changing color in real time. Every phone in the game sends its location to the server, which broadcasts that data to every other phone and browser watching the game.
Handling errors and differences in GPS technology on different smart phone models in order to ensure a fair gameplay experience.

Differences in GPS Hardware
GPS signals are known for reflecting off of tall buildings in urban settings. This causes inaccuracy and inconsistency in location data. The effect is less pronounced on newer phones, but very noticeable on older ones.

Before our Streaming API
Before we finished the Geoloqi streaming API and before we started using Node.js and Socket.io, everything was based on polling for new updates. Phones reported their location at 5 second intervals and the browsers would update the game board in 5 second intervals.

Using Socket.io, Node.js, Redis, and Sinatra Synchrony

Socket.io

Socket.io is a cross-browser WebSocket implementation that lets us push real-time data updates to the browser while still supporting older browsers. We can use the latest technology without requiring all of our users to update to the newest browsers, thanks to Socket.io falling back to older technologies in older browsers. This allows us to do instant updates across the browsers and phones in the game.

Node.js

Node.js is evented I/O for V8, Google’s JavaScript engine for Chrome. It is built around the reactor pattern, which lets it handle large amounts of asynchronous data traffic.

We use a Node.js server to stream the location data from the phones to a Redis pub/sub channel: it publishes to Redis, and another Node server subscribes to that Redis channel. Our Node.js server receives updates from the phones using a custom protocol similar to Google’s Protocol Buffers, which is essentially a very compact binary JSON.

When a browser wants to start streaming data, it connects to the Socket.io server, and that server then subscribes to the Redis pub/sub channel. The Socket.io server sends that data to the browser via WebSockets, falling back to Flash or long-polling if WebSockets are not available.

In essence, Socket.io allows us to use Websockets, which are completely new, but also allows this to work on older browsers thanks to the fallback tricks.

Redis

Redis is an open source, advanced key-value store that has support for message queues using something called publish/subscribe, or pub/sub (not to be confused with PubSubHubbub).

At a high level, this takes care of the hard part of sending data to all of the phones and browsers in the game in real time. Every phone in the game sends its location to the server, which broadcasts that data to every other phone and browser watching the game.

One of the interesting things about the publish/subscribe system is what it saves you from. With a traditional system you have to maintain the connections yourself and iterate through each one in order to pass data through them. If you had 10,000 users, you’d have to iterate through an array of 10,000 connections, which would be very slow and prone to locking up on socket problems.

Using Redis pub/sub is like starting a radio station. Once it is turned on, people (in this case, browsers) can just listen in. This allows us to do real-time data updates to clients (browsers and phones) at a massive scale.
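
The radio-station idea can be illustrated with a tiny in-memory broker. This is only a sketch of the publish/subscribe pattern, not Redis itself and not the MapAttack code: the `Broker` class and the `game:1` channel name are made up for illustration. The publisher makes one call and never touches the individual listeners.

```python
from collections import defaultdict


class Broker:
    """Tiny in-memory stand-in for a pub/sub server such as Redis."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # channel name -> callbacks

    def subscribe(self, channel, callback):
        """Tune a listener in to a channel."""
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        """Deliver message to every subscriber; return how many received it."""
        receivers = self._subscribers.get(channel, [])
        for callback in receivers:
            callback(message)
        return len(receivers)


broker = Broker()
browser_a, browser_b = [], []  # each list stands in for one connected browser
broker.subscribe("game:1", browser_a.append)
broker.subscribe("game:1", browser_b.append)

# One publish call from the phone-facing server reaches both listeners.
delivered = broker.publish(
    "game:1", {"player": "amber", "lat": 45.5231, "lon": -122.6765}
)
```

The fan-out loop still exists, but it lives inside the broker rather than in the application code, which is the point the radio analogy makes.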

Sinatra Synchrony

Sinatra::Synchrony is a small extension for Sinatra that dramatically improves the concurrency of Sinatra web applications. Powered by EventMachine and EM-Synchrony, it increases the number of clients your application can serve per process when you have a lot of traffic and slow IO calls (like HTTP calls to external APIs). Because it uses Fibers internally to handle blocking IO, no callback gymnastics are required! This means we can just develop as if we were writing a normal Sinatra web application.

Sinatra::Synchrony allows us to do asynchronous programming (à la Node.js), except that it wraps the callbacks in Fibers (which are basically co-routines in Ruby). This lets you write synchronous-style code while still taking advantage of asynchronous I/O. Aside from being easier to program this way, it also allows us to switch to a different concurrency/parallelism strategy if we need to. Kyle Drake developed Sinatra Synchrony specifically for MapAttack. Drake’s work became popular after he made a presentation on Sinatra::Synchrony at PDX Ruby.

The MapAttack Game Server
Finally, there is the MapAttack Game Server. In this case the Game Server is a simple database that takes care of storing the player point data that is displayed on the map and on the phones as players grab points in real-time.

Source Code
We made the source code for MapAttack available for download. You can download or fork the source code for the MapAttack website, iPhone application and Android application. If you build anything interesting with it, please let us know.

Upcoming Games
We’ll be bringing MapAttack! to WhereCamp Portland on October 7-9, 2011. We’ll give an overview of the technology there as well. If you plan to be in the area, please join us.

Infrastructure  Mapping  Mobile  Tools  geofencing  geoloqi  gps  location-based_gaming  mapattack  real-time  Redis  socket.io  streaming  urban_gaming  from google
september 2011 by rahuldave
The Technology Behind Convore
We launched Convore last week, and the first question developers tend to ask
when they find Convore is "what technology powers this site?" It is asked so
often, in fact, that we have started to copy and paste the same short response
again and again. That response was good enough to satisfy people who simply
wanted to know if we were Rails or Django, or whether we were using node.js for
the real-time stuff, but this article will expand upon that—not only giving
more details for the curious, but also giving us a link to point people at when
they ask the question in the future. I always wish other people were totally
open about their architectures, so that I can learn from their good choices and
their bad, so I'd like to be as open as possible about ours. Let's dive in!

The basics
All of our application code is powered by Python. Our front-end html page
generation is done by Django, which we use in a surprisingly traditional way
given the real-time nature of Convore as a product. Everything is assembled
at once: all messages, the sidebar, and the header are all rendered on the
server instead of being pulled in after-the-fact with JavaScript. All of the
important data is canonically stored in PostgreSQL, including messages, topics,
groups, unread counts, and user profiles. Search functionality is provided by
Solr, which is interfaced into our application by way of the handy Haystack
Django application.

The message lifecycle
When a new message comes into the system, first it's parsed by a series of
regular expressions designed to pull out interesting bits of information from
the message. Right now all we're looking for is username references and
links (and further, whether those links point at images which should be
rendered in-line.) At the end of this parsing stage, we have a structured
message parse list, which is converted into JSON.

So, for example if someone posted the message:

@ericflo @simonw Here's how we connect/disconnect from Redis in production: http://dpaste.com/406797/

The resulting JSON parse list would look like this:

[
{
"type": "username",
"user_id": 1,
"username": "ericflo",
"markup": "<a href=\"/users/ericflo/\">@ericflo</a>"
},
{
"type": "username",
"user_id": 56,
"username": "simonw",
"markup": " <a href=\"/users/simonw/\">@simonw</a>"
},
{
"type": "text",
"markup": " Here&#39;s how we connect/disconnect from Redis in production: "
},
{
"type": "url",
"url": "http://dpaste.com/406797/",
"markup": "<a href=\"http://dpaste.com/406797/\" target=\"_blank\">http://dpaste.com/406797/</a>"
}
]
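
A rough sketch of how such a parse list could be produced is shown below. It is not Convore's parser: the regular expression, the `parse_message` name, and the markup templates are assumptions for illustration. It also skips the `user_id` lookup (which needs the database), emits whitespace between tokens as its own text node, and relies on Python's `html.escape`, so its output will not match the example above byte for byte.

```python
import html
import json
import re

# Assumption: usernames are "@" plus word characters, and links start with http(s)://
TOKEN_RE = re.compile(r"(?P<username>@\w+)|(?P<url>https?://\S+)")


def parse_message(text):
    """Split a raw message into username, url, and text nodes."""
    nodes = []
    pos = 0
    for match in TOKEN_RE.finditer(text):
        if match.start() > pos:
            # Plain text between two tokens, escaped for safe rendering.
            nodes.append({"type": "text", "markup": html.escape(text[pos:match.start()])})
        token = match.group(0)
        if match.lastgroup == "username":
            name = token[1:]
            nodes.append({
                "type": "username",
                "username": name,
                "markup": '<a href="/users/%s/">@%s</a>' % (name, name),
            })
        else:
            safe = html.escape(token)
            nodes.append({
                "type": "url",
                "url": token,
                "markup": '<a href="%s" target="_blank">%s</a>' % (safe, safe),
            })
        pos = match.end()
    if pos < len(text):
        # Trailing text after the last token.
        nodes.append({"type": "text", "markup": html.escape(text[pos:])})
    return nodes


# The node list serialises directly to JSON for storage alongside the raw message.
parse_list_json = json.dumps(parse_message("@ericflo see http://dpaste.com/406797/"))
```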

After this is constructed, we log all our available information about this
message, and then save to the database—both the raw message as it was received,
and the JSON-encoded parsed node list.

Now a task is sent to Celery (by way of Redis) notifying it that this new
message has been received. This Celery task now increments the unread count
for everyone who has access to the topic that the message was posted in, and
then it publishes to a Redis pub/sub for the group that the message was posted
to. Finally, the task scans through the message, looking for any users that
were mentioned in the message, and writes entries to the database for every
mention.
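
The three steps of that task can be sketched without Celery, Redis, or a database, using plain Python containers as stand-ins. Everything here is hypothetical: the function name, the message fields, and the `group:<id>` channel naming are assumptions made for the sketch, not Convore's actual schema.

```python
from collections import defaultdict

unread_counts = defaultdict(int)  # (user_id, topic_id) -> unread messages
published = []                    # (channel, message) pairs; stand-in for Redis PUBLISH
mentions = []                     # (user_id, message_id) rows; stand-in for a DB table


def handle_new_message(message, topic_members):
    """Run the post-save steps for one message."""
    # 1. Increment the unread count for everyone with access to the topic.
    for user_id in topic_members:
        unread_counts[(user_id, message["topic_id"])] += 1

    # 2. Publish to the pub/sub channel of the group the message was posted to.
    published.append(("group:%d" % message["group_id"], message))

    # 3. Record a mention for every username node in the parse list.
    for node in message["parse_list"]:
        if node["type"] == "username":
            mentions.append((node["user_id"], message["id"]))
```

In a real deployment each stand-in would be replaced by the corresponding database write or Redis call, and the function would be registered as a background task.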

On the other end of that pub/sub are the many open http requests that our users
have initiated, which are waiting for any new messages or information. Those
all simultaneously return the new message information, at which point they
reconnect again, waiting for the next message to arrive.

The real-time endpoint
Our live updates endpoint is actually a very simple and lightweight pure-WSGI
Python application, hosted using Eventlet. It spawns off a coroutine for each
request, and in that coroutine, it looks up all the groups that a user is a
member of, and then opens a connection to Redis subscribing to all of those
channels. Each of these Eventlet-hosted Python applications has the ability to
host hundreds-to-thousands of open connections, and we run several instances
on each of our front-end machines. It has a few more responsibilities, like
marking a topic as read before it returns a response, but the most important
thing is to be a bridge between the user and Redis pub/sub.
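
The waiting-request pattern can be sketched with the standard library alone. This is not the Eventlet application described above: it uses threads and queues where the real endpoint uses coroutines and Redis subscriptions, and the `Channel` class is invented for illustration. Each call to `wait_for_message` plays the role of one open HTTP request.

```python
import queue
import threading


class Channel:
    """One pub/sub channel whose subscribers are blocked long-poll requests."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiters = []  # one single-slot inbox per waiting request

    def wait_for_message(self, timeout):
        """Block like a long-poll request; return the message, or None on timeout."""
        inbox = queue.Queue(maxsize=1)
        with self._lock:
            self._waiters.append(inbox)
        try:
            return inbox.get(timeout=timeout)
        except queue.Empty:
            # Timed out: stop waiting so publish() no longer counts this request.
            with self._lock:
                if inbox in self._waiters:
                    self._waiters.remove(inbox)
            return None

    def publish(self, message):
        """Wake every waiting request with the message; return how many were woken."""
        with self._lock:
            waiters, self._waiters = self._waiters, []
        for inbox in waiters:
            inbox.put(message)
        return len(waiters)
```

After a request returns, the client is expected to call `wait_for_message` again, which mirrors the reconnect step of long polling.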

Future improvements
There are so many places where our architecture can be improved. This is our
first version, and now that real users are using the system, already some of
our initial assumptions are being challenged. For instance, we thought that
pub/sub to a channel per group would be enough, but what that means is that
everyone in a group sees the exact same events as everyone else in that group.

This means we don't have the ability to customize each user's experience based
on their preferences--no way to put a user on ignore, filter certain messages,
etc. It also means that we aren't able to sync up a user's experience across
tabs or browsers, since we don't really want to broadcast to everyone in the
group that one user has visited a topic, thereby removing any unread messages
in that topic. So going forward we're going to have to break up that per-group
pub/sub into per-user pub/sub.
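
One way to picture the per-user approach is a fan-out step that decides, user by user, whether a group message should be delivered. This is a hypothetical sketch, not Convore's design: the `fan_out` function, the `user:<name>` channel naming, and the shape of the ignore lists are all assumptions.

```python
def fan_out(message, group_members, ignore_lists):
    """Map per-user channel names to the message, skipping ignored authors."""
    deliveries = {}
    for user in group_members:
        if message["author"] in ignore_lists.get(user, set()):
            continue  # this user has the author on ignore
        deliveries["user:%s" % user] = message
    return deliveries


message = {"author": "ericflo", "body": "hello"}
deliveries = fan_out(
    message,
    group_members=["simonw", "amber"],
    ignore_lists={"amber": {"ericflo"}},
)
```

With a channel per user, events such as "this user read a topic" can also be published to that one user's channel without broadcasting them to the whole group.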

Another area that could be improved is our unread counts. Right now they're
stored as rows in our PostgreSQL database, which makes it extremely easy to
batch update them and do aggregate queries on them, but the number of these
rows is increasing rapidly, and without some kind of sharding scheme, it will
at some point become more difficult to work with such a large number of rows.
My feeling is that this will eventually need to be moved into a non-relational
data store, and we'll need to write a service layer in front of it to deal with
pre-aggregating and distributing updates, but nothing is set in stone just yet.

Finally, Python may not be the best language for this real-time endpoint.
Eventlet is a fantastic Python library and it allowed us to build something
extremely fast that has scaled to several thousand concurrent connections
without breaking a sweat on launch day, but it has its limits. There is a
large body of work out there on handling a large number of open connections,
using Java's NIO framework, Erlang's mochiweb, or node.js.

That's all folks
We're pretty proud of what we've built in a very short time, and we're glad
it has held up as well as it has on our launch day and afterwards. We're
excited about the problems we're now being faced with, both scaling the
technology, and scaling the product. I hope this article has quenched any
curiosity out there about how Convore works. If there are any questions,
feel free to join Convore and ask away!

(Or discuss it on Hacker News)
Convore  Django  Eventlet  Haystack  PostgreSQL  Python  Realtime  Redis  Solr  from google
february 2011 by rahuldave
Comprehensive notes from my three hour Redis tutorial
Last week I presented two talks at the inaugural NoSQL Europe conference in London. The first was presented with Matthew Wall and covered the ways in which we have been exploring NoSQL at the Guardian. The second was a three hour workshop on Redis, my favourite piece of software to have the NoSQL label applied to it.

I've written about Redis here before, and it has since earned a place next to MySQL/PostgreSQL and memcached as part of my default web application stack. Redis makes write-heavy features such as real-time statistics feasible for small applications, while effortlessly scaling up to handle larger projects as well. If you haven't tried it out yet, you're sorely missing out.

For the workshop, I tried to give an overview of each individual Redis feature along with detailed examples of real-world problems that the feature can help solve. I spent the past day annotating each slide with detailed notes, and I think the result makes a pretty good stand-alone tutorial. Here's the end result:

Redis tutorial slides and notes

In unrelated news, Nat and I both completed the first ever Brighton Marathon last weekend, in my case taking 4 hours, 55 minutes and 17 seconds. Sincere thanks to everyone who came out to support us - until the race I had never appreciated how important the support of the spectators is to keep going to the end. We raised £757 for the Have a Heart children's charity. Thanks in particular to Clearleft who kindly offered to match every donation.
brightonmarathon  guardian  marathon  nosql  redis  running  from google
april 2010 by rahuldave
Cache Machine: Automatic caching for your Django models
Cache Machine: Automatic caching for your Django models. This is the third new ORM caching layer for Django I’ve seen in the past month! Cache Machine was developed for zamboni, the port of addons.mozilla.org to Django. Caching is enabled using a model mixin class (to hook up some post_delete hooks) and a custom caching manager. Invalidation works by maintaining a “flush list” of dependent cache entries for each object—this is currently stored in memcached and hence has potential race conditions, but a comment in the source code suggests that this could be solved by moving to redis.
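
The flush-list idea can be sketched in a few lines. This is an illustration of the technique as described above, not Cache Machine's code: plain dicts stand in for the cache backend, and the function and key names are invented.

```python
from collections import defaultdict

cache = {}                      # cache key -> cached value
flush_lists = defaultdict(set)  # object key -> cache keys that depend on it


def cache_set(key, value, depends_on):
    """Store a value and register it on the flush list of each object it uses."""
    cache[key] = value
    for obj_key in depends_on:
        flush_lists[obj_key].add(key)


def invalidate(obj_key):
    """Drop every cache entry that depends on obj_key; return the dropped keys."""
    dropped = flush_lists.pop(obj_key, set())
    for key in dropped:
        cache.pop(key, None)
    return dropped
```

In memcached, adding to a flush list is a read-modify-write of a stored value, so two concurrent writers can overwrite each other. The appeal of Redis here is presumably that set operations such as SADD are atomic on the server.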
cachemachine  caching  django  memcached  mozilla  orm  ormcaching  python  redis  from google
march 2010 by rahuldave
