mcroydon + cluster   73

STARDEV: Cluster - Welcome to StarCluster!
A toolkit for launching EC2 instances. Aimed at scientific computing.
amazon  cluster  ec2  python 
january 2011 by mcroydon
s4: distributed stream computing platform
"S4 is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous unbounded streams of data."
apache  bigdata  cloud  cloudcomputing  cluster  computing  mapreduce  map  java  hadoop  framework  distributed  data  opensource  processing  platform  programming  real-time  streaming  stream  software  scalability  reduce  realtime  streamprocessing  yahoo  tool  s4  streams 
november 2010 by mcroydon
Features — execnet v1.0.5 documentation
"execnet provides carefully tested means to easily interact with Python interpreters across version, platform and network barriers." Data structure interop between CPython, Jython, and PyPy.
programming  python  software  development  code  library  opensource  network  distributed  framework  cluster  deployment  module  parallel  foss  pycon2010  cross  interpreter  ipc 
february 2010 by mcroydon
psvm - Project Hosting on Google Code
It would be interesting to see this ported to Hadoop, though I think there's an SVM implementation as part of Mahout.
programming  software  google  cloudcomputing  opensource  code  research  algorithms  network  statistics  learning  cluster  algorithm  ec2  machinelearning  machine-learning  c++  kernel-machines  open_source  svm  foss  classification  nlp  machine  parallel  parallel-computation 
december 2009 by mcroydon
LucidDB Home Page
"LucidDB is the first and only open-source RDBMS purpose-built entirely for data warehousing and business intelligence. It is based on architectural cornerstones such as column-store, bitmap indexing, hash join/aggregation, and page-level multiversioning."
programming  software  development  database  data  business  opensource  java  scalability  storage  distributed  cluster  databases  sql  db  datamining  olap  columndb  bi  datawarehouse  dbms  reporting  rdbms  luciddb  column  warehousing  column-store  data_warehouse  column-oriented  dwh 
october 2009 by mcroydon
Coding Horror: Scaling Up vs. Scaling Out: Hidden Costs
Food for thought, with the caveat that scaling out is a lot easier if you don't have any per-server software costs. Big iron costs less to operate, though.
programming  hardware  business  server  scalability  coding  networking  architecture  performance  distributed  scaling  web-development  cluster  hadoop  hosting  clustering  comparison  servers  distribution  it  codinghorror  2009  stackoverflow 
june 2009 by mcroydon
Sequoia
"Sequoia is a transparent middleware solution offering clustering, load balancing and failover services for any database."
database  java  performance  cluster  postgresql  mysql  clustering  sql  tool  db  replication  ha  jdbc  highavailability 
may 2009 by mcroydon
Apache Mahout - Taste Documentation
Collaborative filtering as part of the Mahout project. Also includes a web services interface for non-Java clients.
software  java  search  algorithm  cluster  apache  mapreduce  machinelearning  engine  filtering  recommendation  webservice  recommendations  mahout 
april 2009 by mcroydon
peafowl -
Lightweight queues in Python -- essentially a Python port of starling. Looks hawt. Via Travis Cline.
python  code  server  sysadmin  cluster  memcached  messaging  googlecode  queue  rpc  peafowl  memcache  license:mit  mq  starling 
january 2009 by mcroydon
Condor Project Homepage
"Condor is a specialized workload management system for compute-intensive jobs. Like other full-featured batch systems, Condor provides a job queueing mechanism, scheduling policy, priority scheme, resource monitoring, and resource management. Users submit their serial or parallel jobs to Condor, Condor places them into a queue, chooses when and where to run the jobs based upon a policy, carefully monitors their progress, and ultimately informs the user upon completion."
programming  linux  software  development  research  tools  network  sysadmin  work  scalability  unix  windows  architecture  cluster  computing  framework  distributed  batch  hpc  htc  clusters  cloud  parallel  concurrency  clustering  grid  performance  opensource  condor  gridcomputing 
november 2008 by mcroydon
happy - Google Code
I still think I like pure python + Hadoop streaming but I'll definitely keep this in mind.
python  programming  software  development  google  code  howto  library  java  dev  scalability  architecture  apache  cluster  framework  distributed  opensource  grid  hadoop  mapreduce  concurrency  nlp  via:pskomoroch  jython  happy  metaweb  freebase  map-reduce 
october 2008 by mcroydon
KirinDave's fuzed at master — GitHub
"This is a release of Powerset's internal clustering software which has been adapted for use with Rails, but see the generic_json_responder to see exactly how it is used internally."
cluster  clustering  code  development  erlang  hosting  open  rails  ruby  scalability  server  software  source  technology  web  webdev  webserver  webapp 
june 2008 by mcroydon
Eucalyptus
Roll your own in-house EC2. Nice! It's good to see ROCKS still in use.
amazon  aws  cloud  cluster  clustering  clusters  computing  ec2  firewall  grid  hosting  linux  management  open  open-source  programming  scalability  software  sysadmin  to-read  trend  xen  virtualization 
june 2008 by mcroydon
Main Page - RunBlast
I wonder how far off we are from software sequencers where the hardware is "dumb" and just sends info off to an EC2 cluster. Hey, it worked for 56k modems...
bioinformatics  cluster  aws  amazon  ec2  blast  future 
june 2008 by mcroydon
cacherl - Google Code
"Cacherl is an Erlang port of famous Memcached trying to take advantage of both Erlang and Memcached. It follows the simple interface of memcached with additional data persistence."
erlang  cache  caching  cluster  memcached  mnesia  programming  scaling  webdev 
february 2008 by mcroydon
python-cluster
Convenience methods for clustering in Python. LGPL.
algorithm  cluster  clustering  development  library  module  open-source  programming  python 
december 2007 by mcroydon
K-means algorithm - Wikipedia, the free encyclopedia
"The k-means algorithm is an algorithm to cluster objects based on attributes into k partitions."
algorithm  algorithms  cluster  clustering  language  math  statistics  wikipedia  book:PCI 
december 2007 by mcroydon
GlusterFS - GlusterDocumentation
Clustered distributed filesystem with FUSE goodness.
amazon  aws  computing  cluster  distributed  ec2  file  filesystem  linux  storage  s3  raid  system 
november 2007 by mcroydon
SourceForge.net: openMosix Project End of Life Announcement
This is sad. OpenMosix was one of the things I was excited about a few years back when I researched open source clustering software.
cluster  mosix  news  open-source 
july 2007 by mcroydon
Ceph - Petascale Distributed Storage
"Ceph is a distributed network file system designed to provide excellent performance, reliability, and scalability."
cluster  clustering  computing  cool  distributed-computing  data  filesystem  fuse  linux  network  networking  open-source  software  storage  todo  toread 
april 2007 by mcroydon
zumastor - Zumastor Linux Storage Project
"Zumastor is a community project started by Google members to bring enterprise storage features to Linux."
backup  cluster  enterprise  google  linux  nfs  replication  storage 
february 2007 by mcroydon
Geeking with Greg: Hadoop on Amazon EC2
Map/Reduce on scalable hardware. Sweet concept.
amazon  aws  cluster  clustering  grid  hardware  s3  ec2 
november 2006 by mcroydon
How to Build a Beowulf - The MIT Press
Awesome title. I'm requesting it from my University library right now...
cluster  clustering  beowulf  high-availability 
april 2005 by mcroydon
NewsForge | openMosix 2.4.26-1 Released
Simply the easiest way to cluster your Linux systems.
cluster  mosix  openmosix 
december 2004 by mcroydon

