Bloom Won’t Micromanage Data So Apps Can Scale
april 2010 by rahuldave
Building webscale or cloud applications is hampered by the difficulty of spreading tasks across thousands of computers without slowing things down or requiring too many people to keep things running. Virtualization and faster storage help, as do new databases (GigaOM Pro sub req’d) and caching techniques, but right now folks are trying to adapt how they program computers to reflect that one has now become many.
Bloom, a programming language created at the University of California, Berkeley by a group led by Joseph Hellerstein, is one such effort. Bloom was profiled this week as one of the top 10 emerging technologies by MIT’s Technology Review, because it could help cloud computing continue to scale. Here’s how, according to Technology Review:
The challenge is that these languages process data in static batches. They can’t process data that is constantly changing, such as readings from a network of sensors. The solution, Hellerstein explains, is to build into the language the notion that data can be dynamic, changing as it’s being processed. This sense of time enables a program to make provisions for data that might be arriving later — or never.
Hellerstein also gave an extensive interview to HPC in the Cloud this week about what Bloom is and the problem it’s trying to solve. From that interview:
To put it simply, what our work is trying to do is start with the data itself and get people to talk about what should happen to the data step-by-step through a program without ever having them specify at all how many machines are involved. So, when you ask a query of a database you describe what data you want, not how to get it.
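To make that idea concrete, here is a minimal sketch in plain Python. It is not Bloom syntax, and the `Reading` type, the `hot_sensors` rule and the `run_everywhere` helper are all hypothetical names invented for illustration. The rule says what data is wanted; only the stand-in runtime knows how the data is split up.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Reading:
    # Hypothetical sensor reading, used only for this illustration.
    sensor: str
    celsius: float


def hot_sensors(readings: Iterable[Reading]) -> set:
    """Declarative rule: the sensors that reported more than 30 degrees."""
    return {r.sensor for r in readings if r.celsius > 30.0}


def run_everywhere(rule: Callable, partitions: List[List[Reading]]) -> set:
    """Hypothetical runtime: apply the rule to each partition, union the results.

    The rule itself never mentions how many partitions (machines) exist.
    """
    result = set()
    for part in partitions:
        result |= rule(part)
    return result


data = [
    Reading("a", 31.5),
    Reading("b", 22.0),
    Reading("c", 35.0),
    Reading("a", 18.0),
]

# Same rule, same answer, whether the data sits on one "machine" or three.
one_machine = run_everywhere(hot_sensors, [data])
three_machines = run_everywhere(hot_sensors, [data[:1], data[1:3], data[3:]])
```

This only works so neatly because a filter can be applied to each partition independently and the results merged; rules that need to see all the data at once would take more coordination than this toy runtime provides.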
The interview lays out how this programming effort came about (building network protocols) and who might care most about using Bloom (Amazon, Google or anyone with big data needs), but for me the best part of it was how Hellerstein underscored that the ability to harness a heck of a lot of servers and treat them as a single computer is the next big shift in information technology.
We can call it cloud computing, webscale applications or merely bigger data centers, but the key element here is that the hardware has gone social in ways that require many-to-many ways of communication and delivering instructions to the processors — inside the servers, between the servers, and soon, between data centers. The exciting aspect of this shift is that while larger companies like Google, Yahoo and Amazon are innovating, there is plenty of room for startups with a new appliance, server, networking technology or chunk of code to make waves — and hopefully, money.
For more on the effort, please check out the FAQs Hellerstein has posted on his blog.
Image courtesy of Flickr user tibchris
@NYT
CNN_Big_Tech
Cloud_Computing
Infrastructure
SYN_Straight_News
Stacey's_Posts
innovation
Bloom
webscale
Open vs. Closed: Ubuntu Walks the Line
april 2010 by rahuldave
Any debate over open vs. closed systems has to touch on open-source software and the ways in which companies are attempting to build code as a community effort, while still profiting from it in some way. So I chatted with Mark Shuttleworth, CEO of Canonical, the company that supports Ubuntu, about how it walks the line between spending to support open-source software and finding a business model that works.
Canonical’s 330 employees are responsible for maintaining, supporting and selling service for Ubuntu, an open-source version of the Linux operating system for servers, desktops and computer manufacturers. Some 120-150 of the Canonical employees contribute directly to the new releases of the software that come out every six months, and most of the company’s revenue comes from supporting enterprise server customers and makers of computers that want to put Ubuntu on desktops. Consumers also download the software, but few pay Canonical for support. The company is not yet profitable.
Shuttleworth believes that in order to develop a strong business model around an open approach, one has to create an open option early, ideally through a strong standardization process, and one also needs a lot of different open-source projects fighting it out. For example, the operating system world had no strong history of open alternatives, which meant that Ubuntu had to out-open its proprietary competition, and that carries high costs.
In that way it has pushed Canonical perhaps further out toward open on the spectrum. Shuttleworth calculates the direct costs of being so open as bringing people together in ways that empower them and make them feel like members of a community, as well as reaching out and putting in place the infrastructure to create a company. However, there are indirect costs as well.
“There is a myth that being open is necessarily more efficient and cheaper, but there are no hordes of people showing up to do the hard stuff,” Shuttleworth says. “Occasionally wonderful, magical things happen — really incredible things do happen, like people show up unexpectedly with brilliant ideas — but it’s still hard and expensive and you still have to be willing to do all the hard and expensive things and do it in an open fashion. And you’re still likely to be accused of being open only when it’s convenient.”
He points to the cloud computing market as one that tends to give a lot of lip service toward openness but where a lack of a big standardization effort and robust open source competition could lead to a relatively closed ecosystem.
“The basic story there is pretty bad at the moment,” Shuttleworth says. He notes that proprietary infrastructure, hypervisors and even the APIs and ways data is stored can lock folks into one cloud for life. “We need real open alternatives early in the process, making it possible for people to build their own cloud infrastructure that responds to the same APIs that Amazon’s does.”
He’s accepted that Amazon Web Services’ APIs, while not created through an open standards group, have become a de facto standard, and says that it’s more efficient to build open-source code around Amazon’s APIs than to try to develop new ones for accessing the cloud. Canonical has a partnership agreement with Eucalyptus, which offers open-source software to create an AWS-compatible cloud, so people can use Ubuntu and Eucalyptus to create their own cloud computing platform. But Shuttleworth would like to see more open-source options other than Eucalyptus for building out a cloud computing service of your own.
At the platform-as-a-service level, the openness issue will be moving data from cloud to cloud easily. There’s room there for an open standard or open databases, he said. But at every level, when considering building a business around open-source software, he believes that “you want a common and clear standard with competing open source versions using that standard.”
That keeps proprietary vendors at bay, and gives the companies building a business around the open-source software a chance to decide where they want to be on the open-to-closed spectrum. But it also introduces the prospect of fragmentation, which we’ll leave for a later post.
CNN_Big_Tech
Cloud_Computing
Infrastructure
NYT_Enterprise
SYN_Feature_Enterprise
Stacey's_Posts
Web
Canonical
Mark_Shuttleworth
Ubuntu
SpringSource Buys Startup to Scale Messaging in the Cloud
april 2010 by rahuldave
SpringSource, a division of VMware, has purchased the open-source cloud messaging company behind the RabbitMQ software. The value of the deal was undisclosed, but the purchase of Rabbit Technologies Ltd. is yet another effort by VMware to become the operating system for enterprise clouds (GigaOM Pro, sub req’d) and add value to its commoditized hypervisor. It’s also the latest example of a company selling proprietary software buying up an open-source software company aimed at the cloud.
Cloud providers use RabbitMQ to create a messaging server that lets them quickly manage the flow of messages between applications. It can also be used to notify users of a web service when content on the site has changed, such as when someone posts a Facebook photo and the service sends out an email notifying all of a user’s friends.
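For readers who haven’t worked with a message broker, the pattern looks roughly like the sketch below. This is a toy in-memory model written for illustration, not RabbitMQ’s API or the AMQP wire protocol; the `ToyBroker` class and the exchange and queue names are hypothetical. It shows one published message being copied to every queue bound to an exchange, so that independent applications can each act on it.

```python
from collections import defaultdict, deque


class ToyBroker:
    """In-memory stand-in for a message broker (not RabbitMQ's API)."""

    def __init__(self):
        self.queues = {}                   # queue name -> pending messages
        self.bindings = defaultdict(list)  # exchange name -> bound queue names

    def bind(self, exchange, queue):
        # Create the queue if needed and attach it to the exchange.
        self.queues.setdefault(queue, deque())
        self.bindings[exchange].append(queue)

    def publish(self, exchange, message):
        # Fanout: every queue bound to the exchange receives the message.
        for queue in self.bindings[exchange]:
            self.queues[queue].append(message)

    def consume(self, queue):
        # Return the oldest pending message, or None if the queue is empty.
        pending = self.queues[queue]
        return pending.popleft() if pending else None


broker = ToyBroker()
broker.bind("photo.posted", "email-notifier")
broker.bind("photo.posted", "feed-updater")

# The publishing application does not know who is listening.
broker.publish("photo.posted", {"user": "alice", "photo_id": 42})

email_job = broker.consume("email-notifier")
feed_job = broker.consume("feed-updater")
```

The point of the design is decoupling: the application that posts the photo hands the message to the broker and moves on, while the email and feed services pick it up at their own pace.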
The RabbitMQ code was created by Cohesive FT and LShift based on the relatively young AMQP standards effort backed by major banks, Cisco and a handful of smaller companies. As hardware is virtualized, translating some network equipment, such as load balancers, into software allows services running on the virtualized hardware to scale better. Hopefully we’ll learn more about SpringSource, RabbitMQ and VMware’s plans for becoming the cloud OS when VMware CEO Paul Maritz speaks at our Structure conference in June.
Image courtesy of Flickr user Joshua Davis
CNN_Big_Tech
Cloud_Computing
Infrastructure
NYT_Company_News
SYN_Straight_News
Stacey's_Posts
Startups
RabbitMQ
SpringSource
VMWare
You’ve Got Mail! Amazon Creates Cloud Notification Service
april 2010 by rahuldave
Amazon Web Services has launched its Simple Notification Service (Amazon SNS), which allows developers to create a push notification system for applications. The service allows companies to deliver messages to customers of their applications or even to other applications in a couple of different formats, among them HTTP and email. Amazon SNS could be used by system administrators in an IT department (notifying clients if they’re hitting a certain limit on storage capacity or that latency on their service is too high), or it could be used to build out notifications for mobile applications, such as letting consumers know when friends check into a location or when they have new email.
Developers pay for what they use, as with all Amazon cloud products. The price includes per-request, notification-delivery and data-transfer fees, but developers can get started with Amazon SNS for free. Each month, Amazon SNS customers get the first 100,000 Amazon SNS requests, the first 100,000 notifications over HTTP and the first 1,000 notifications over email free. After that, prices range from 6 cents to $2 per 100,000 messages sent for delivery and 8-15 cents per gigabyte of data transferred.
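To get a feel for what that adds up to, here is a rough back-of-the-envelope calculator. The free allowances come from the figures above; everything else is an assumption made for illustration: that the 6-cent rate applies to requests and HTTP notifications, that the $2 rate applies to email, that data transfer is billed at the top of the 8-15 cent range, and that usage past the free tier is pro-rated rather than billed in whole blocks.

```python
# Free monthly allowances quoted in the article.
FREE_REQUESTS = 100_000
FREE_HTTP = 100_000
FREE_EMAIL = 1_000


def charge(count, free, rate_per_100k):
    """Dollar cost for `count` items once `free` are deducted.

    Assumption: usage beyond the free tier is pro-rated per 100,000 items.
    """
    billable = max(0, count - free)
    return billable / 100_000 * rate_per_100k


def monthly_cost(requests, http, email, gb_out,
                 request_rate=0.06,  # assumed: low end of the quoted range
                 http_rate=0.06,     # assumed: low end of the quoted range
                 email_rate=2.00,    # assumed: high end of the quoted range
                 gb_rate=0.15):      # assumed: top of the 8-15 cent range
    """Estimated monthly bill in dollars under the assumptions above."""
    return (charge(requests, FREE_REQUESTS, request_rate)
            + charge(http, FREE_HTTP, http_rate)
            + charge(email, FREE_EMAIL, email_rate)
            + gb_out * gb_rate)


# Example month: 600,000 requests, 1.1 million HTTP notifications,
# 101,000 emails and 10 GB transferred.
estimate = monthly_cost(600_000, 1_100_000, 101_000, 10)
```

Under those assumptions the example month comes to roughly $4.40: 30 cents for requests, 60 cents for HTTP notifications, $2 for email and $1.50 for data transfer.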
Image courtesy of Flickr user Ed Siacoso (aka SC fiasco)
CNN_Big_Tech
Cloud_Computing
Infrastructure
NYT_Company_News
SYN_Straight_News
Stacey's_Posts
Amazon
The iPad’s Not So Revolutionary Inside
april 2010 by rahuldave
The inner workings of the iPad reveal that Apple has learned much from its iPhone development, using many of the same components and cramming those chips onto a pretty small board behind the device’s 9.7-inch screen. Today, I managed to snag a few minutes with David Carey, VP of technical intelligence at UBM TechInsights, to talk about his experience tearing down the iPad.
He said that so far, the only big surprise was the new processor inside, but he couldn’t yet tell me if it was a new CPU using engineering that Apple acquired via its PA Semi acquisition or a souped-up ARM Cortex-A8 processor. But he did point out some interesting design choices that Apple has made with its machining, and showed off all the insides. Enjoy.
Broadband
Hardware
SYN_Feature_Enterprise
Stacey's_Posts
AAPL
Apple
iPad
ubm_techinsights
EMC’s Crazy Plan to Create a Worldwide Data Cloud
march 2010 by rahuldave
Pat Gelsinger, who moved to EMC late last year after 30 years at Intel, is stirring things up at the storage giant with a plan to virtualize and federate storage so data and compute can truly be linked together (hat tip The Register). The implication of this vision is that organizations will have the ability to keep constantly changing information up to date around the world in real time despite the challenges of moving huge amounts of data over networks that measure data in gigabytes rather than petabytes.
In a presentation on Thursday, Gelsinger pointed out that compute and storage are rapidly getting better at dealing with more information, while networks are trying to catch up. “Compute is doubling every two years. Storage doubles every 15 months, and networking is much much much slower, like every four years, so how do you deal with latency, bandwidth and consistency?” Gelsinger said.
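Those doubling rates compound quickly. As a rough illustration, the sketch below applies the standard doubling formula, growth = 2^(elapsed / doubling period), to Gelsinger’s three figures over a four-year span. The four-year horizon and the variable names are my own choices for the example, not something from his presentation.

```python
def growth_factor(months_elapsed, doubling_months):
    """How many times larger a quantity is after doubling every `doubling_months`."""
    return 2 ** (months_elapsed / doubling_months)


# Doubling periods quoted by Gelsinger, converted to months.
DOUBLING_MONTHS = {"compute": 24, "storage": 15, "network": 48}

HORIZON_MONTHS = 48  # assumed horizon for the example: four years

factors = {
    name: growth_factor(HORIZON_MONTHS, period)
    for name, period in DOUBLING_MONTHS.items()
}

# How far storage growth outruns network growth over the same span.
storage_vs_network = factors["storage"] / factors["network"]
```

Over four years, compute grows 4x and networking 2x, while storage grows a little over 9x, so the amount of data stored pulls ahead of the network’s ability to move it by a factor of about 4.6.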
Gelsinger’s answer is caching. Imagine a two-way content delivery network built on EMC appliances that tracks and replicates changes made to data at one node and then pushes them out to all the other nodes as quickly as possible. Gelsinger calls this freeing the information from physical storage, but it sounds more like making sure your information is in a bunch of different physical storage containers. He mentions EMC’s acquisition of intellectual property from Yotta Yotta as offering the breakthrough required to build this technology.
But at the end of the day, this is all a big if, not an actual product yet. If EMC can link storage and virtualized machines together, the data center that “follows the sun” (basically moving compute loads around the world to wherever it’s cheapest to run them) and automatic failover for cloud services both become possible. However, the technology would be controlled by a proprietary hardware vendor, which certainly clouds its prospects a bit.
CNN_Big_Tech
Cloud_Computing
Infrastructure
NYT_Company_News
SYN_Straight_News
Stacey's_Posts
innovation
emc
INTC
Intel
VMWare
VMWR