A rule of thumb for choosing column order in indexes - MySQL Performance Blog
june 2011 by amy
In my quest for MySQL performance improvements, the ideas from this article worked beautifully.
database
mysql
optimization
design
Perspectives - Challenges and Trade-offs in Building a Web-scale Real-time Analytics System
february 2011 by amy
Was @ this talk by @b6n, this is a good summary: Challenges/Trade-offs in Building a Web-scale Realtime Analytics System
analytics
cassandra
database
distributed
from twitter_favs
twitter/gizzard - GitHub
january 2011 by amy
A framework for creating distributed datastores.
database
distributed
nosql
The Apache Cassandra Project
march 2010 by amy
The Apache Cassandra Project develops a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model.
Cassandra was open sourced by Facebook in 2008, where it was designed by Avinash Lakshman (one of the authors of Amazon's Dynamo) and Prashant Malik. In a lot of ways you can think of Cassandra as Dynamo 2.0.
database
distributed
opensource
Introducing DataFabric
april 2009 by amy
"Specifically we needed two features to scale our mysql database: application-level sharding and master/slave replication. Sharding is the process of splitting a dataset across many independent databases. This often happens based on geographical region (e.g. craigslist) or user account (e.g. flickr). Replication provides a near-real-time copy of a database which can be used for fault tolerance and to reduce load on the master node. Combined, you get a scalable database solution which does not require huge hardware to scale to huge volumes. DataFabric extends ActiveRecord’s standard connection handling to provide these two features."
rails
plugins
database
Data.Gov
april 2009 by amy
"this site will make a broad array of US Government data available in downloadable formats."
usa
government
database
D.C. Data Catalog
march 2009 by amy
For years the District of Columbia has provided public access to city operational data through the Internet. Now the District provides citizens with access to 274 datasets from multiple agencies, a catalyst ensuring agencies operate as more responsive, better-performing organizations. Use the data catalog below to subscribe to a live data feed in Atom format and access data in XML, Text/CSV, KML or ESRI Shapefile formats. Please note that by accessing the data catalog and feeds, you agree to our Terms of Use. Please read before accessing the data. All data visualizations on maps should be considered approximate. The visualizations provided by this application include only records that can be mapped.
society
politics
information_access
database
HIBERNATE - Relational Persistence for Idiomatic Java
january 2009 by amy
Hibernate tutorial and docs
java
database
Magic/Replace - Data Cleanup for Everyone from Dabble DB
december 2008 by amy
spreadsheet cleanup
cool
database
Scalaris
november 2008 by amy
We present Scalaris, an Erlang implementation of a distributed key/value store. It uses, on top of a structured overlay network, replication for data availability and majority based distributed transactions for data consistency. In combination, this implements the ACID properties on a scalable structured overlay.
database
storage
erlang
the-cassandra-project - Google Code
july 2008 by amy
Cassandra is a distributed storage system for managing structured data while providing reliability at a massive scale.
database
p2p
distributed
grid
scalability
storage
Some Thoughts on CouchDB and Relational Databases
september 2007 by amy
CouchDB is a distributed document-oriented database which means it is designed to be a massively scalable way to store, query and manage documents.
database
information_management
Progressive privacy leak
august 2007 by amy
discover people's car make online
database
privacy
search
ack
VentureBeat » Greenplum, and the Web 2.0 server
february 2007 by amy
Greenplum is significant because it says it provides a database for speedy data warehousing at a tenth of the cost of leading incumbent, Teradata. It does so by working with the new server built by Sun co-founder Andy Bechtolsheim, billed internally as th
datamining
database
technology
Pentagon Will Review Database on U.S. Citizens
december 2005 by amy
The Pentagon is spying on peace groups and protesters
politics
iraq
usa
law
privacy
database
MAFIA: Maximal Frequent Itemsets
november 2005 by amy
algorithm for mining maximal frequent itemsets from a transactional database
research
database
datamining