mcroydon + scalability 165
High Scalability - High Scalability - Tumblr Architecture - 15 Billion Page Views a Month and Harder to Scale than Twitter
february 2012 by mcroydon
A peek at Tumblr's stack and some of its interesting properties.
architecture
development
scalability
scaling
tumblr
february 2012 by mcroydon
Richard Jones | A Million-user Comet Application with Mochiweb, Part 3
january 2012 by mcroydon
C1M using Erlang in 2008.
c10k
erlang
libevent
scalability
scaling
c1m
january 2012 by mcroydon
MemoryImage
september 2011 by mcroydon
Something that Martin Fowler said.
data
memory
database
performance
scalability
september 2011 by mcroydon
Ehcache | Performance at Any Scale
september 2011 by mcroydon
Make go faster.
cache
java
performance
scalability
september 2011 by mcroydon
Mu Dynamics Blog » Speed Limit of PaaS – 64K TCP Ports
august 2011 by mcroydon
Multi-homing is the way to get around connection limits.
node
ops
paas
scalability
august 2011 by mcroydon
Hello Singapore (CDN Experiment) - GitHub
march 2011 by mcroydon
Cloudfront makes things fast (usually).
aws
cdn
github
performance
scalability
march 2011 by mcroydon
Pressflow makes Drupal scale | Four Kitchens: the Drupal experts
december 2010 by mcroydon
"Pressflow is a distribution of Drupal with integrated performance, scalability, availability, and testing enhancements."
drupal
performance
scalability
optimization
cms
december 2010 by mcroydon
High Scalability - High Scalability - Facebook's New Real-time Messaging System: HBase to Store 135+ Billion Messages a Month
november 2010 by mcroydon
architecture article bigdata cassandra database mail infrastructure hbase hadoop facebook messages messaging news nosql performance samples storage servers scalability
november 2010 by mcroydon
HBase and Hadoop at Facebook | Jeremiah Peschka
november 2010 by mcroydon
It's awesome to see HBase get some love.
analysis
article
cap
cassandra
comparison
database
db
programming
nosql
imported
hdfs
hbase
hadoop
facebook
replication
scalability
scale
scaling
november 2010 by mcroydon
OpenTSDB - A Distributed, Scalable Monitoring System
november 2010 by mcroydon
OpenTSDB is a distributed, scalable Time Series Database (TSDB) written on top of HBase. OpenTSDB was written to address a common need: store, index and serve metrics collected from computer systems (network gear, operating systems, applications) at a large scale, and make this data easily accessible and graphable.
analysis
architecture
bigdata
cloud
data
database
db
java
lgpl
hbase
hadoop
development
graph
distributed
monitoring
nosql
opensource
operations
scalability
scale
time
sysadmin
software
storage
series
opentsdb
rrd
stumbleupon
time-series
timeseries
november 2010 by mcroydon
s4: distributed stream computing platform
november 2010 by mcroydon
"S4 is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous unbounded streams of data."
apache
bigdata
cloud
cloudcomputing
cluster
computing
mapreduce
map
java
hadoop
framework
distributed
data
opensource
processing
platform
programming
real-time
streaming
stream
software
scalability
reduce
realtime
streamprocessing
yahoo
tool
s4
streams
november 2010 by mcroydon
Overclocking mod_ssl | Paul's Journal
november 2010 by mcroydon
Like how Google does it, but for mere mortals.
apache
cryptography
http
https
memcache
optimisation
ops
ssl
server
security
scaling
scalability
ping.fm
performance
web
webdev
mod_ssl
openssl
november 2010 by mcroydon
ImperialViolet - Overclocking SSL
november 2010 by mcroydon
How Google does it.
article
authentication
cost
crypto
network
cryptography
encryption
optimization
latency
internet
howto
google
http
performance
protocol
scalability
security
ssl
speed
server
web
toread
tcp
sysadmin
certificate
certificates
https
overclocking
tls
november 2010 by mcroydon
SHARD Triple-Store
october 2010 by mcroydon
"SHARD is a proof-of-concept use of high-performance, low-cost distributed computing technology to develop a highly scalable triple-store. SHARD is released as an open-source project on the BSD license."
database
db
cloud
distributed
hadoop
lubm
mapreduce
rdf
store
sparql
storage
shard
semweb
semanticweb
scalability
triple-store
october 2010 by mcroydon
HAProxy - The Reliable, High Performance TCP/HTTP Load Balancer
october 2010 by mcroydon
It's easy to forget about HAProxy, but it's really good at quietly dishing up a ton of requests.
apache
architecture
balancer
cluster
clustering
ha
loadbalancing
load-balancing
load
linux
http
highavailability
network
networking
opensource
performance
proxy
web
tools
tcp
sysadmin
software
server
scaling
scalability
webserver
october 2010 by mcroydon
BigPipe: Pipelining web pages for high performance | Facebook
september 2010 by mcroydon
It's been on my mind lately as well.
architecture
development
facebook
fb
frontend
html
http
optimization
optimisation
latency
js
javascript
interesting
performance
programming
scalability
school
speed
scaling
webdev
webdevelopment
webdesign
web
bigpipe
mustuse
pipelining
september 2010 by mcroydon
Membase.org
september 2010 by mcroydon
Another NoSQL k/v store but this one pushes some impressive numbers.
apache
architecture
bigtable
cache
caching
cluster
database
development
db
datastructures
databases
distributed
facebook
key-value
keyvalue
memcache
kvs
scalability
programming
opensource
memcached
nosql
scaling
software
storage
sql
webdesign
webdev
zynga
september 2010 by mcroydon
Rails Development – Interface Design – iPhone/iPad/Android Apps
september 2010 by mcroydon
Web firm that communicates what they do really well.
agency
amazon
agile
business
aws
cloud
company
ec2
development
developer
design
dc
css
cool
consulting
inspiration
rails
ruby
ror
rubyonrails
scalability
social
webdev
webdesign
web2.0
web
vc
intridea
scalr
september 2010 by mcroydon
Binary Stream Parsing in Node.js :: The Universe of Discord
september 2010 by mcroydon
Cleaning up binary stream parsing with node.
binary
cs
javascript
js
lib
messaging
node.js
parse
nodejs
stream
scalability
programming
read
parsing
bin
september 2010 by mcroydon
Google search index splits with MapReduce • The Register
september 2010 by mcroydon
Teach the world to Zig then Zag.
algorithms
architecture
bigdata
article
bigtable
gfs
distributed
database
computing
computers
caffeine
google
grid
hadoop
index
indexing
mapreduce
technology
search
scalability
research
programming
colossus
gfs2
news
september 2010 by mcroydon
OpenStack Open Source Cloud Computing Software
july 2010 by mcroydon
OpenStack main site.
admin
cloud-computing
cloud
cloudcomputing
cluster
ec2
distributed
development
computing
code
clustering
infrastructure
open
nasa
open-source
opensource
python
source
software
service
server
scalability
rackspace
standards
storage
technology
virtualization
webdev
july 2010 by mcroydon
appengine-mapreduce - Project Hosting on Google Code
july 2010 by mcroydon
Pretty slick that it's built atop vanilla appengine.
appengine
cloud
cloud-computing
google
gae
hadoop
saas
python
optimization
nosql
mapreduce
scalability
webdev
develop
july 2010 by mcroydon
Parsing file uploads at 500 mb/s with node.js » Debuggable Ltd
june 2010 by mcroydon
This looks like a promising way of dealing with lots and lots of uploads.
algorithm
algorithms
asynchronous
development
data
file
nodejs
node.js
node
library
javascript
js
forms
html5
form
parse
parser
parsing
performance
programming
scalability
webdev
web
upload
ssjs
fileupload
formidable
multipart
uploads
june 2010 by mcroydon
MongoSF: MongoDB conference San Francisco
may 2010 by mcroydon
A ton of mongo presentations.
conference
presentations
video
watch
database
programming
mongo
read
mongodb
redis
node.js
scalability
talks
mongosf
may 2010 by mcroydon
Cassandra : inverted index | Scalable web architectures
may 2010 by mcroydon
A good explanation of inverted indexes and Cassandra.
cassandra
data
database
indexing
index
model
nosql
storage
scalability
programming
performance
optimization
may 2010 by mcroydon
NoSQL at Twitter (NoSQL EU 2010)
april 2010 by mcroydon
A pretty thorough look behind the curtain at Twitter.
analytics
architecture
cassandra
cloud
databases
database
db
grid
hbase
nosql
hadoop
presentation
pig
programming
read
twitter
slideshare
slides
scribe
scaling
scalability
flockdb
yam
april 2010 by mcroydon
How Raytheon Researchers are Using Hadoop to Build a Scalable, Distributed Triple Store « Cloudera » Apache Hadoop for the Enterprise
march 2010 by mcroydon
Triplestores and Hadoop. Together.
hadoop
triplestore
article
articles
cloud
cloudcomputing
cloudera
database
hdfs
graphs
graph
distributed
development
mapreduce
nosql
programming
repository
rdf
scalability
toread
sparql
semanticweb
shard
semantic_web
semantic
triple
web
march 2010 by mcroydon
Canned Platypus » Blog Archive » Availability and Partition Tolerance
march 2010 by mcroydon
Another good explanation of CAP, with visuals.
algorithms
architecture
article
availability
concurrency
cap
consistency
partition
distributed
nosql
development
databases
database
data
consistent
performance
scalability
theorem
march 2010 by mcroydon
Notes from a production MongoDB deployment « Boxed Ice Blog
march 2010 by mcroydon
This seems like an awful lot of work for a relatively small document set.
2read
architecture
couchdb
database
databases
datacenter
linux
experience
distributed
development
deployment
db
mongo
mongodb
mysql
notes
nosql
performance
production
software
sharding
server
scalability
replication
programming
sysadmin
to_read
tips
web
march 2010 by mcroydon
HBase vs. Cassandra: NoSQL Battle! | Road to Failure
march 2010 by mcroydon
Someone who chose HBase over Cassandra (and why).
amazon
architecture
article
bigdata
bigtable
blog
cap
computing
comparison
compare
clustering
cassandra
database
development
dht
distributed
foss
grid
hadoop
scalability
research
nosql
mysql
hbase
post
vs
march 2010 by mcroydon
HBase vs Cassandra: why we moved « Bits and Bytes.
march 2010 by mcroydon
A look at someone who started with HBase and moved to Cassandra (and why).
architecture
benchmark
article
blog
cap
cassandra
cloud
development
database
db
computer
foss
comparison
compare
distributed
hadoop
hbase
mysql
nosql
performance
toread
scalability
programming
consistency
march 2010 by mcroydon
cloudkick | blog: 4 Months with Cassandra, a love story
march 2010 by mcroydon
A very interesting look at Cassandra with an eye toward gotchas. Cloudkick are doing some interesting stuff with aggregation over time periods.
admin
via:jacobian
administration
architecture
article
cassandra
database
databases
opensource
nosql
mysql
monitoring
django
distributed
db
datawarehouse
python
scalability
scaling
storage
toread
webdev
programming
cloudkick
neat
march 2010 by mcroydon
Celery - The Distributed Task Queue
february 2010 by mcroydon
The new website is both beautiful and informative.
programming
python
development
django
hardware
code
app
framework
distributed
scalability
opensource
library
cloud
documentation
messaging
queue
concurrency
job
rabbitmq
memcache
client
amqp
asynchronous
django-apps
message
messagequeue
task
tasks
celery
cron
february 2010 by mcroydon
High Scalability - High Scalability - How FarmVille Scales to Harvest 75 Million Players a Month
february 2010 by mcroydon
A look at scaling a monster.
opensource
scalability
games
database
web
performance
cool
architecture
scaling
coding
databases
db
rest
memcached
php
review
facebook
puppet
munin
memcache
webdevelopment
gamedev
farmville
zynga
february 2010 by mcroydon
The Basho Blog: Why Vector Clocks are Easy
january 2010 by mcroydon
Straightforward but very powerful message/value versioning conflict avoidance. This reminds me of git in a way, since to avoid conflicts each message must contain all predecessors in its vector mask.
programming
toread
tutorial
scalability
distributed
algorithm
concurrency
event
nosql
vector
versioning
via:chl
clock
dist
riak
basho
vectorclocks
distributed_systems
vector-clocks
clocks
vectorclock
january 2010 by mcroydon
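A rough sketch of the idea in the vector clocks note above, not Riak's actual implementation: each version carries a per-node counter map, and one version supersedes another only if it has seen everything the other has. The function names (`increment`, `descends`, `in_conflict`, `merge`) and the node names are made up for illustration.

```python
from typing import Dict

# A vector clock maps a node/actor name to the number of updates seen from it.
VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of the clock with this node's counter bumped by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if clock a has seen every update recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def in_conflict(a: VectorClock, b: VectorClock) -> bool:
    """True if neither clock descends from the other (concurrent edits)."""
    return not descends(a, b) and not descends(b, a)


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Combine two clocks by taking the per-node maximum."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in set(a) | set(b)}


# Hypothetical actors: alice writes first, then bob and carol each
# update alice's version independently of one another.
base = increment({}, "alice")
bob_version = increment(base, "bob")
carol_version = increment(base, "carol")
resolved = merge(bob_version, carol_version)
```

Here `bob_version` and `carol_version` both descend from `base` but not from each other, so they count as siblings that need resolving, much like two git branches with a shared ancestor. The merged clock descends from both.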
Gojko Adzic » Designing applications for cloud deployment
january 2010 by mcroydon
Somewhat obvious but good thoughts on deploying in the cloud.
programming
aws
design
hosting
cloud
ec2
development
deployment
cloudcomputing
webdev
amazon
deploy
qa
server
scalability
performance
architecture
tech
january 2010 by mcroydon
Practical Cloud Computing | SimpleDB Performance : 5 Steps to Achieving High Write Throughput
january 2010 by mcroydon
For when I need to load massive amounts of data into SimpleDB.
howto
scalability
performance
aws
startup
databases
cloud
db
cloudcomputing
migration
nosql
simpledb
load
scalable
january 2010 by mcroydon
assertTrue( ): NoSQL Required Reading
december 2009 by mcroydon
A good collection of reading materials.
programming
database
reference
development
toread
webdev
scalability
blog
architecture
distributed
databases
scaling
todo
cloud
db
resources
paper
sql
couchdb
reading
to-read
papers
nosql
read
bigtable
scale
links
collection
to_read
readlater
december 2009 by mcroydon
round-robin django setup with nginx - small py
december 2009 by mcroydon
Interesting tactic where multiple backends can be tried before returning an error if you set the timeout low enough.
python
web
django
server
scalability
performance
sysadmin
scaling
coding
http
deployment
infrastructure
traffic
webserver
nginx
config
load_balancing
roundrobin
december 2009 by mcroydon
Building a Data Intensive Web Application with Cloudera, Hadoop, Hive, Pig, and EC2 | Cloudera
november 2009 by mcroydon
A nice look at end-to-end data analysis of big datasets using things like Pig and Hive.
programming
web
data
database
tools
webdev
business
toread
howto
tutorial
amazon
dev
scalability
hadoop
architecture
aws
computing
cluster
ec2
pig
trends
hive
cloudera
cloudcomputing
analytics
datamining
mapreduce
cloud
application
november 2009 by mcroydon
Extreme Agility at Facebook | blog@CACM | Communications of the ACM
november 2009 by mcroydon
"Another aspect of the Facebook engineering team is how large the ratio of active user to developer is. Currently it stands at 1.1 million users per developer."
programming
development
howto
scalability
architecture
scaling
management
engineering
memcached
facebook
process
agile
acm
hive
scribe
oopsla
niho
cacm
november 2009 by mcroydon
Journal of Eivind Uggedal: NoSQL East 2009 - Summary of Day 1
november 2009 by mcroydon
Some interesting bits and more of the same but I really like the dark-launch approach that Scribe allows.
data
database
toread
blog
scalability
internet
distributed
article
hadoop
scaling
db
cloud
couchdb
conference
papers
keyvalue
nosql
links
cassandra
2009
mongodb
dynomite
riak
november 2009 by mcroydon
Traffic Server is finally here | Ogre.com
october 2009 by mcroydon
Yahoo! open sourced tech originally acquired from Inktomi. It's a caching proxy with some pretty advanced features.
web
software
opensource
server
scalability
internet
performance
sysadmin
todo
http
apache
cache
yahoo
nginx
proxy
trafficserver
squid
october 2009 by mcroydon
LucidDB Home Page
october 2009 by mcroydon
"LucidDB is the first and only open-source RDBMS purpose-built entirely for data warehousing and business intelligence. It is based on architectural cornerstones such as column-store, bitmap indexing, hash join/aggregation, and page-level multiversioning."
programming
software
development
database
data
business
opensource
java
scalability
storage
distributed
cluster
databases
sql
db
datamining
olap
columndb
bi
datawarehouse
dbms
reporting
rdbms
luciddb
column
warehousing
column-store
data_warehouse
column-oriented
dwh
october 2009 by mcroydon
Lehigh University Benchmark (LUBM)
october 2009 by mcroydon
Benchmarking really big triplestores.
web
data
database
scalability
performance
testing
project
rdf
test
benchmark
semantic
semanticweb
ontology
dataset
benchmarks
semweb
semantic_web
university
owl
triplestore
semantic-web
w3c
sparql
repository
lehigh
evaluation
lubm
validation
october 2009 by mcroydon
LargeTripleStores - ESW Wiki
october 2009 by mcroydon
A collection of some of the more scalable triplestores.
programming
web
software
data
reference
database
tools
scalability
storage
wiki
performance
db
databases
list
store
benchmark
rdf
metadata
semantic
semanticweb
ontology
semweb
sparql
triplestore
owl
large
sesame
triple
october 2009 by mcroydon
Geeking with Greg: Advice from Google on large distributed systems
october 2009 by mcroydon
With links to slides from LADIS '09. This includes a refresher and update on how GFS, MapReduce, etc. are working in Google's fault-filled environment.
programming
google
blog
scalability
storage
performance
architecture
distributed
advice
scaling
infrastructure
systems
datacenters
october 2009 by mcroydon
How We Made GitHub Fast - GitHub
october 2009 by mcroydon
A fantastic peek behind the curtain.
programming
design
web
development
webdev
ruby
toread
erlang
server
rails
scalability
sysadmin
performance
architecture
scaling
article
hosting
http
redis
deploy
github
rubyonrails
infrastructure
ssh
inspiration
git
optimization
deployment
bert
unicorn
october 2009 by mcroydon
Why are Facebook, Digg, and Twitter so hard to scale?
october 2009 by mcroydon
Scaling massively social sites can be tricky. Here's why.
programming
design
web
blog
internet
scalability
architecture
distributed
scaling
web2.0
article
social
memcached
twitter
grid
developer
facebook
infrastructure
digg
reddit
nosql
socialmedia
issues
social_networking
october 2009 by mcroydon
Riak - A Decentralized Database
october 2009 by mcroydon
"Riak combines a decentralized key-value store, a flexible map/reduce engine, and a friendly HTTP/JSON query interface to provide a database ideally suited for Web applications." Erlang under the hood.
programming
web
development
key-value
database
webdev
opensource
erlang
storage
scalability
distributed
rest
databases
http
mapreduce
json
db
couchdb
store
kvstore
datastore
keyvalue
nosql
document
cloudcomputing
riak
decentralized
basho
documentoriented
key-value-store
october 2009 by mcroydon
Digg the Blog » Blog Archive » Looking to the future with Cassandra
september 2009 by mcroydon
NOSQL is a hammer, but a shiny one.
programming
development
data
database
toread
blog
storage
scalability
architecture
performance
distributed
scaling
article
databases
mysql
cloud
sql
keystore
nosql
cassandra
acm
keyvalue
key-value
bigtable
production
advice
digg
to-read
deployment
september 2009 by mcroydon
Tornado Web Server
september 2009 by mcroydon
Tornado proper.
python
programming
web
software
development
webserver
open-source
django
tools
webdev
opensource
dev
server
network
scalability
performance
framework
async
comet
tornado
friendfeed
asynchronous
realtime
aggregator
frameworks
facebook
http
cms
scaling
epoll
september 2009 by mcroydon
The technology behind Tornado, FriendFeed's web server - Bret Taylor's blog
september 2009 by mcroydon
Some notes on Tornado.
python
programming
web
development
django
code
webdev
opensource
technology
server
scalability
performance
framework
article
apache
http
open
loadbalancing
communications
comet
tornado
friendfeed
realtime
feed
webserver
frameworks
facebook
collaboration
async
longpolling
september 2009 by mcroydon
4store - Scalable RDF storage
august 2009 by mcroydon
"At times holding and running queries over databases of 15GT, supporting a Web application used by thousands of people."
programming
web
software
open-source
development
data
database
opensource
scalability
storage
databases
gpl
rdf
store
db
c
semantic
semanticweb
repository
ontology
semweb
semantic-web
triplestore
sparql
garlik
4store
triple-store
rdfstore
triple
websemantique
august 2009 by mcroydon
HadoopDB Project
july 2009 by mcroydon
Interesting approach, we'll see if it has legs.
programming
software
development
database
java
opensource
research
scalability
distributed
performance
scaling
hadoop
cluster
postgresql
databases
mysql
hadoopdb
map-reduce
hive
dbms
rdbms
2009
analytics
postgres
db
sql
mapreduce
yale
vldb
july 2009 by mcroydon
DBMS Musings: Announcing release of HadoopDB (longer version)
july 2009 by mcroydon
/me increments the "databases atop hadoop" counter (and takes a sip).
database
opensource
scalability
research
distributed
performance
hadoop
cluster
postgresql
mapreduce
project
datawarehouse
hadoopdb
distributedcomputing
july 2009 by mcroydon
up and running with cassandra
july 2009 by mcroydon
More detail than usual for an intro article.
development
data
database
code
toread
blog
tutorial
api
ruby
dev
java
scalability
distributed
rails
storage
hadoop
cluster
application
cloud
twitter
db
facebook
web
example
key-value
bigtable
keyvalue
cassandra
nosql
bigdata
july 2009 by mcroydon
Project Voldemort Blog : Building a terabyte-scale data cycle at LinkedIn with Hadoop and Project Voldemort
july 2009 by mcroydon
More on what makes Voldemort tick.
design
development
data
database
toread
erlang
java
scalability
storage
architecture
distributed
performance
scaling
hadoop
cluster
grid
cloud
mapreduce
db
caching
analytics
arch
key-value
dht
keyvalue
scale
voldemort
batch
linkedin
datastore
july 2009 by mcroydon
Should you go Beyond Relational Databases? | Think Vitamin
july 2009 by mcroydon
Includes links to a bunch of graph databases too.
programming
development
web
data
reference
database
toread
technology
opensource
dev
scalability
storage
work
hadoop
article
graph
databases
mysql
nosql
keyvalue
bigtable
rdbms
resource
reading
couchdb
comparison
db
mapreduce
relational
alternative
july 2009 by mcroydon
braindump: NOSQL debrief
july 2009 by mcroydon
Slidedump.
software
nosql
kvstore
mongodb
dynomite
video
database
scalability
storage
architecture
distributed
scaling
hadoop
presentation
mysql
databases
db
sql
slides
couchdb
conference
videos
bigtable
hbase
trends
keyvalue
key-value
cassandra
voldemort
hypertable
july 2009 by mcroydon
Coding Horror: Scaling Up vs. Scaling Out: Hidden Costs
june 2009 by mcroydon
Food for thought with the caveat that scaling out is a lot easier if you don't have any per-server software costs. Big iron costs less to operate though.
programming
hardware
business
server
scalability
coding
networking
architecture
performance
distributed
scaling
web-development
cluster
hadoop
hosting
clustering
comparison
servers
distribution
it
codinghorror
2009
stackoverflow
june 2009 by mcroydon
Code: Flickr Developer Blog » Building Fast Client-side Searches
june 2009 by mcroydon
Fast == good.
programming
web
development
code
data
javascript
webdev
api
dev
search
xml
ajax
scalability
performance
cache
flickr
optimization
json
caching
parsing
regex
js
fast
yui
bestpractices
speed
autocomplete
clientside
optimisation
eval
june 2009 by mcroydon
Engineering @ Facebook's Notes | Facebook
june 2009 by mcroydon
Big big big data warehousing / data mining.
design
data
database
blog
java
map
scalability
storage
distributed
computing
scaling
article
hadoop
sql
mapreduce
db
reading
facebook
analytics
rdbms
arch
comment
hive
datawarehouse
warehouse
data-warehousing
hdfs
dw
june 2009 by mcroydon
Geeking with Greg: How much can you do with one server?
june 2009 by mcroydon
One server can go far if you let it.
web
hardware
business
toread
technology
server
architecture
scalability
performance
computing
optimization
virtualization
computerscience
cloudcomputing
arch
mailinator
talkinator
june 2009 by mcroydon
Neo4j - a Graph Database that Kicks Buttox | High Scalability
june 2009 by mcroydon
The most common complaint about existing graph databases is performance. Hopefully a stable of good, performant graph databases will change that.
data
database
toread
visualization
java
opensource
network
scalability
cool
architecture
performance
graph
hadoop
databases
db
graphs
2009
arch
socialnetworking
socialmedia
dataviz
neo4j
graph_database
graph-database
relationship
june 2009 by mcroydon
katta - distributed lucene
june 2009 by mcroydon
"Katta serves large, replicated, Lucene indexes as shards to serve high loads and very large data sets."
software
development
java
search
scalability
performance
scaling
distributed
apache
hadoop
clustering
cloud
grid
lucene
tool
indexing
project
ir
searchengine
information-retrieval
dist
package
hdfs
katta
june 2009 by mcroydon
Getting Started With Sqoop | Cloudera
june 2009 by mcroydon
Um, kinda awesome!
scalability
hadoop
rdbms
sql
cloudera
june 2009 by mcroydon
Greenplum: the petabyte-scale database for data warehousing and business intelligence.
may 2009 by mcroydon
Petabyte data mining and data warehousing.
programming
software
development
data
database
business
technology
opensource
storage
scalability
performance
cluster
postgresql
startup
open
postgres
sql
mapreduce
datamining
db
analytics
oss
reporting
rdbms
intelligence
bi
businessintelligence
datawarehouse
greenplum
bizgres
may 2009 by mcroydon