jefframnani + hadoop 23
Bigtop
february 2012 by jefframnani
The primary goal of Bigtop is to build a community around the packaging and interoperability testing of Hadoop-related projects.
apache
hadoop
packaging
Apache Mesos: Dynamic Resource Sharing for Clusters
february 2012 by jefframnani
Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks.
cluster
hadoop
Spark Cluster Computing Framework
february 2012 by jefframnani
Spark is an open source cluster computing system that aims to make data analytics fast — both fast to run and fast to write.
data
hadoop
analytics
cluster
Get started with Hadoop: From evaluation to your first production cluster - O'Reilly Radar
february 2012 by jefframnani
A comprehensive rundown of Hadoop and its ecosystem.
hadoop
Azkaban
november 2011 by jefframnani
Azkaban is a simple batch scheduler for constructing and running Hadoop jobs or other offline processes.
cron
hadoop
scheduler
GridComputing
Introducing Cascalog: a Clojure-based query language for Hadoop - thoughts from the red planet
october 2011 by jefframnani
Cascalog is a Clojure-based query language for Hadoop inspired by Datalog.
hadoop
clojure
mapreduce
OpenTSDB - A Distributed, Scalable Monitoring System
january 2011 by jefframnani
OpenTSDB is a distributed, scalable Time Series Database (TSDB) written on top of HBase. OpenTSDB was written to address a common need: store, index and serve metrics collected from computer systems (network gear, operating systems, applications) at a large scale, and make this data easily accessible and graphable.
database
monitoring
distributed
hadoop
Chicago area Hadoop User Group (CHUG) (Chicago, IL) - Meetup
january 2011 by jefframnani
This is a group dedicated to the advancement and education of people interested in working in the Hadoop (HBase, Hive, Pig) cloud environment.
meetup
hadoop
mapreduce
chicago
Hive - Hadoop Wiki
october 2010 by jefframnani
Hive is a data warehouse infrastructure built on top of Hadoop. It provides tools for easy data ETL, a mechanism for imposing structure on the data, and the ability to query and analyze large data sets stored in Hadoop files. Hive defines a simple SQL-like query language, called QL, that enables users familiar with SQL to query the data.
database
hadoop
mapreduce
hive
cloudera's flume at master - GitHub
july 2010 by jefframnani
Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.
hadoop
logging
sysadmin
monitoring
flume
programming
Welcome to Pig!
may 2008 by jefframnani
Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs.
hadoop
mapreduce
distributed
cluster
apache
pig
yahoo