arsyed + scaling   56

Exploring Complexity: We Need to Talk About Scaling (Melanie Mitchell)
"In my next several blog posts I want to talk about scaling, especially about the very recent controversies surrounding claims of power-law scaling of particular phenomena [...] All this is going to require some forays into the wild and unruly land of statistics and data analysis. My goal in the next series of posts is to make sense of the following quite important papers in complex systems, which, taken together, form a kind of mini-course on scaling. Understanding ideas from these papers is essential in one’s education as a complex-systems scientist or informed “consumer” of this field."
complexity  scaling  power-law  via:cshalizi 
december 2011 by arsyed
GraphLab: A New Parallel Framework for Machine Learning
"Existing high-level parallel abstractions like MapReduce are often insufficiently expressive while low-level tools like MPI and Pthreads leave ML experts repeatedly solving the same design challenges. By targeting common patterns in ML, we developed GraphLab, which improves upon abstractions like MapReduce by compactly expressing asynchronous iterative algorithms with sparse computational dependencies while ensuring data consistency and achieving a high degree of parallel performance."
machine-learning  parallel  scaling 
june 2011 by arsyed
A Nice Introduction to Logistic Regression (Yi Wang)
"A C++ implementation of large-scale logistic regression (together with a tech-report) can be found at:
http://stat.rutgers.edu/~madigan/BBR
A Mahout slides show that they have received a proposal to implement logistic regression in Hadoop from Google Summer school of Code, but I have not seen the result yet.
Two papers on large-scale logistic regression was published in 2009:
1. Parallel Large-scale Feature Selection for Logistic Regression, and
2. Large-scale Sparse Logistic Regression"
statistics  statcomp  scaling  logistic-regression 
april 2011 by arsyed
Anatomy of a Crushing (Pinboard Blog)
"And a final, special shout-out goes to my favorite company in the world, Yahoo. I can't wait to see what you guys think of next!"
pinboard  delicious  architecture  scaling  postmorten  via:jacobian 
march 2011 by arsyed
Does StackOverflow use caching and if so, how? - Meta Stack Overflow
"In our (admittedly limited) experience, Redis is so fast that the slowest part of a cache lookup is the time spent reading and writing bytes to the network. This is not surprising, really, if you think about it."
stackOverflow  architecture  caching  redis  scaling 
january 2011 by arsyed
Let the microblogs bloom (Russell Beattie)
"Here's how a microblog system has to work to scale: All the messages created by users have to go into a Queue when they're created, and an external process then has to go through one by one and figure out which messages go into which subscriber's message list. As the system grows and more messages are created, the messages may arrive in your "inbox" slower, but they will still arrive. This type of system can be easily broken up into dedicated servers and multiple processes can handle different parts of the read/write process, and the individual user message lists can be more easily cached - as once a page is created that contains messages, it doesn't change."
architecture  microblogging  twitter  scaling 
december 2010 by arsyed
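A minimal sketch of the queue-based delivery Beattie describes in the entry above, assuming an in-memory queue and a single delivery worker. The `Microblog` class and all of its method names are hypothetical, chosen for illustration; none of them come from the linked post. Posting only appends to a queue, and a separate step copies each message into every subscriber's list, so a backlog delays delivery without blocking writes.

```python
from collections import defaultdict, deque


class Microblog:
    """Illustrative fan-out-on-write model (hypothetical names throughout)."""

    def __init__(self):
        self.followers = defaultdict(set)  # author -> set of subscribers
        self.pending = deque()             # messages awaiting fan-out
        self.inboxes = defaultdict(list)   # subscriber -> delivered messages

    def follow(self, subscriber, author):
        self.followers[author].add(subscriber)

    def post(self, author, text):
        # Write path: enqueue only, so posting stays cheap under load.
        self.pending.append((author, text))

    def deliver_one(self):
        # Worker step: take the oldest message and copy it into the
        # message list of every subscriber of its author.
        if not self.pending:
            return False
        author, text = self.pending.popleft()
        for subscriber in self.followers.get(author, ()):
            self.inboxes[subscriber].append((author, text))
        return True

    def drain(self):
        # Process the whole backlog; returns the number of messages handled.
        handled = 0
        while self.deliver_one():
            handled += 1
        return handled


if __name__ == "__main__":
    mb = Microblog()
    mb.follow("bob", "alice")
    mb.post("alice", "hello")
    mb.drain()
    print(mb.inboxes["bob"])
```

In a real deployment the queue and the inboxes would live in separate services and several workers would drain the queue in parallel; this sketch only shows the split between the write path and the delivery path.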
s4: distributed stream computing platform
"S4 is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous unbounded streams of data."
software  yahoo  stream-processing  scaling 
november 2010 by arsyed
The problems with ACID, and how to fix them without going NoSQL (Daniel Abadi, Alexander Thomson)
"In our opinion, the NoSQL decision to give up on ACID is the lazy solution to these scalability and replication issues. Responsibility for atomicity, consistency and isolation is simply being pushed onto the developer. ... the problem with ACID is not that its guarantees are too strong (and that therefore scaling these guarantees in a shared-nothing cluster of machines is too hard), but rather that its guarantees are too weak, and that this weakness is hindering scalability."
database  scaling  distributed  transactions  acid  isolation  deterministic  papers 
september 2010 by arsyed
A Retrospective on SEDA (Matt Welsh)
"If I were to design SEDA today, I would decouple stages (i.e., code modules) from queues and thread pools (i.e., concurrency boundaries)." ... "The most important contribution of SEDA, I think, was the fact that we made load and resource bottlenecks explicit in the application programming model."
server  swarch  scaling  seda  event-driven  concurrency 
july 2010 by arsyed
All Velocity conference 2010 Slides/Notes (Royans Tharakan)
"Here are all the slides/PDFs which I’ve come across from the first 2 days at velocity"
talks  videos  velocity  conference  scaling 
june 2010 by arsyed
Problems with CAP, and Yahoo’s little known NoSQL system (Daniel Abadi)
"In thinking about CAP the past few weeks, I feel that it has become overrated as a tool for explaining the design of modern scalable, distributed systems. Not only is the asymmetry of the contributions of C, A, and P confusing, but the lack of latency considerations in CAP significantly reduces its utility. To me, CAP should really be PACELC --- if there is a partition (P) how does the system tradeoff between availability and consistency (A and C); else (E) when the system is running as normal in the absence of partitions, how does the system tradeoff between latency (L) and consistency (C)?"
database  distcomp  cap  latency  scaling 
may 2010 by arsyed
Why Events Are A Bad Idea (for High-concurrency Servers) (Rob von Behren, Jeremy Condit, and Eric Brewer)
"Event-based programming has been highly touted in recent years as the best way to write highly concurrent applications. Having worked on several of these systems, we now believe this approach to be a mistake. Specifically, we believe that threads can achieve all of the strengths of events, including support for high concurrency, low overhead, and a simple concurrency model. Moreover, we argue that threads allow a simpler and more natural programming style."
papers  concurrency  threading  scaling  events  via:shivak 
march 2010 by arsyed
Server Design (Jeff Darcy)
"The rest of this article is going to be centered around what I’ll call the Four Horsemen of Poor Performance: 1. Data copies 2. Context switches 3. Memory allocation 4. Lock contention"
programming  swarch  scaling  performance  bottlenecks 
june 2009 by arsyed
Queue everything and delight everyone (l.m. orchard)
"The idea here is that the social structure can help you scale, while still delighting people."
architecture  queueing  scaling  twitter  microblogging 
july 2008 by arsyed
