GE creates a ‘data lake’ for new industrial ecosystem

New technology products come at us constantly. There is the latest smartphone, the shiny new app, the hot social network, even the smarter thermostat.

As great (or not) as all these may be, each is a small part of a much bigger process that is rarely admired. They all belong inside a world-changing ecosystem of digital hardware and software, spreading into every area of our lives.

Thinking about what is going on behind the scenes is easier if we consider the automobile, also known as “the machine that changed the world.” Cars succeeded through the widespread construction of highways and gas stations. Those things created a global supply chain of steel plants and refineries. Seemingly unrelated things, including suburbs, fast food and drive-time talk radio, arose as a result.

The current dominant industrial ecosystem is relentlessly acquiring and processing digital information. It demands newer and better ways of collecting, shipping, and processing data, much the way cars needed better road building. And it is spinning out its own unseen businesses.

A few recent developments illustrate the new ecosystem. General Electric plans to announce today that it has created a “data lake” method of analyzing sensor information from industrial machinery in places such as railroads, airlines, hospitals and utilities. GE has been putting sensors on everything it can for a couple of years, and now it is out to read all that information quickly.

The company, working with an outfit called Pivotal, said that in the past three months, it has looked at information from 3.4 million miles of flights by 24 airlines using GE jet engines. GE said it identified things such as possible defects 2,000 times faster than it could before.

The company has to, since it is getting so much more data.

“In 10 years, 17 billion pieces of equipment will have sensors,” said William Ruh, vice president of GE software. “We’re only one-tenth of the way there.”

It hardly matters if Mr. Ruh is off by 5 billion or so. Billions of humans already are augmenting that number with their own packages of sensors, called smartphones, fitness bands and wearable computers. Almost all of that will get uploaded someplace, too.

Shipping that data creates challenges. In June, researchers at the University of California, San Diego, announced a method of engineering fiber optic cable that could make digital networks run 10 times faster. The idea is to get more parts of the system working closer to the speed of light, without involving the “slow” processing of electronic semiconductors.

“We’re going from millions of personal computers and billions of smartphones to tens of billions of devices, with and without people, and that is the early phase of all this,” said Larry Smarr, director of the California Institute for Telecommunications and Information Technology, located inside UCSD. “A gigabit a second was fast in commercial networks, now we’re at 100 gigabits a second. A terabit a second will come and go. A petabit a second will come and go.”

In other words, Mr. Smarr thinks commercial networks will eventually be 10,000 times faster than the best current systems. “It will have to grow, if we’re going to continue what has become our primary basis of wealth creation,” he said.

Add computation to collection and transport. Last month, UC Berkeley’s AMP Lab, created two years ago for research into new kinds of large-scale computing, spun out a company called Databricks, which uses new kinds of software for fast data analysis on a rental basis. Databricks plugs into the 1 million-plus computer servers inside the global system of Amazon Web Services and soon will work inside similar-sized megacomputing systems from Google and Microsoft.

It was the second company out of the AMP Lab this year. The first, called Mesosphere, enables a kind of pooling of computing services, improving the efficiency of even million-computer systems.

“What is driving all this is the ability to collect, store and process data at a speed and granularity never seen before, over wide areas,” said Michael Franklin, director of the AMP Lab. “When you do this, you can see patterns you never saw before.”

Of course, it is impossible to know if these are the big changes. We are only one-tenth (or is it one ten-thousandth?) of the way there.
