Big Data: There’s More to It Than Size
by Al Dean | July 20, 2015 | CAD Software Blog | PTC
One term that has had me stumped for a while has been Big Data. I always looked at the term and looked on, still slightly baffled. Not anymore. I’ve recently managed to get a handle on things and finally found a good analogy to help explain it and some examples to show where it fits into the design, engineering, and manufacturing world.
But let’s not get ahead of ourselves. Let’s start at the world’s favorite source of all vaguely accurate information, Wikipedia, and look at what it has to say on the subject of Big Data. This is the introductory paragraph:
Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, and information privacy.
Let’s look at how it applies to our little corner of the technology industry.
In the engineering world, we’re used to large datasets. When you consider a full-scale CAD assembly these days, the benchmark is 100,000s of parts upwards. When it comes to simulation, we’re looking at terabytes of data, if not petabytes in some rare instances. That’s certainly much larger than most industries have created and managed on a daily basis. But thinking of Big Data in terms of storage is a mistake. That’s the first thing that I learned: Big Data is not about size, but quantity.
Big Data essentially refers to having millions of data points that need to be stored (this isn’t a massive issue these days). More importantly, it’s about having appropriate tools for doing something interesting with that wealth of data. It’s about turning that data into information.
Consider an engineering and product-related example. Today, we’re on the verge of being able to develop and deliver intelligent, connected devices (that’s the whole Internet of Things industry – which we’ll get onto later in the year).
Now, consider having 10,000 products in the field. Each is fitted with 10 sensors that monitor its operating conditions. Then consider that each of those sensors is sending data back to base, once every second or so.
Santa Cruz mountain bike outfitted with sensors
Software uses data from sensors on the physical bike to display deeper analysis, in this case, reaction forces, while the bike is in motion.
That’s something in the order of 6 million data points sent back every minute. Each might be very small, essentially a numerical value, but when you’ve got data stacking up at a rate of more than 8 billion entries per day, you’ve got a couple of issues – and this is the essence of Big Data.
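For anyone who wants to check the arithmetic, here is a minimal back-of-the-envelope sketch in Python. The fleet size, sensor count, and one-reading-per-second rate come from the example above; the 8 bytes per reading is purely my own assumption, included only to show that raw storage is the easy part.

```python
# Back-of-the-envelope data volume for the example fleet.
PRODUCTS_IN_FIELD = 10_000
SENSORS_PER_PRODUCT = 10
READINGS_PER_SENSOR_PER_SECOND = 1   # "once every second or so"
BYTES_PER_READING = 8                # assumption: one 64-bit number per reading

per_second = PRODUCTS_IN_FIELD * SENSORS_PER_PRODUCT * READINGS_PER_SENSOR_PER_SECOND
per_minute = per_second * 60
per_day = per_minute * 60 * 24

# Raw payload only; ignores timestamps, IDs, and protocol overhead.
gigabytes_per_day = per_day * BYTES_PER_READING / 1e9

print(f"{per_second:,} readings per second")
print(f"{per_minute:,} readings per minute")
print(f"{per_day:,} readings per day")
print(f"~{gigabytes_per_day:.1f} GB of raw values per day")
```

Under those assumptions the raw values come to tens of gigabytes a day, which is manageable to store. The hard part is the number of individual entries you have to make sense of.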
Given a huge (and I mean HUGE) amount of simple data points, you need a dramatically different set of tools to be able to extract information from that dataset. This is where the term analytics comes into play and this is really key for getting a firm grasp of Big Data.
Big Data isn’t only about the size or quantity of data, but rather about having a set of tools that allow you to look into that wealth of data, find trends, find patterns, and analyze that data to turn it into information.
Taking our example further, if you have those 10,000 products streaming their data to you, you have a rich source of real-world and real-time information about how your product is being used and more often than not, abused.
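To make "turning data into information" a little more concrete, here is a toy sketch in Python. Everything in it is hypothetical: the product IDs, the sensor name, the readings, and the design limit are made-up values for illustration, and a real analytics platform would do this at a vastly larger scale with far more sophisticated methods.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical readings: (product_id, sensor_name, value). Made-up numbers.
readings = [
    ("bike-001", "fork_load_n", 410.0),
    ("bike-001", "fork_load_n", 455.0),
    ("bike-001", "fork_load_n", 1290.0),
    ("bike-002", "fork_load_n", 390.0),
    ("bike-002", "fork_load_n", 402.0),
]

DESIGN_LIMIT_N = 1000.0  # assumed design load, for illustration only


def summarize(readings, limit):
    """Group readings per (product, sensor) and reduce each group to a few numbers."""
    groups = defaultdict(list)
    for product_id, sensor, value in readings:
        groups[(product_id, sensor)].append(value)
    return {
        key: {
            "mean": mean(values),
            "peak": max(values),
            "over_limit": sum(v > limit for v in values),  # count of readings above the limit
        }
        for key, values in groups.items()
    }


summary = summarize(readings, DESIGN_LIMIT_N)
for (product_id, sensor), stats in summary.items():
    print(product_id, sensor, stats)
```

The point is the shape of the exercise rather than the code: a pile of individual numbers goes in, and a handful of facts come out, such as which product is being pushed past what it was designed for.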
So what can you do with that information?
This is where the benefits for the engineering and design world come in. Design and engineering is often a heady mix of science, mathematics, and physics combined with a fair bit of intelligent guesswork. Assumptions and abstractions are the basis for many engineering calculations.
When developing a new product, we’re often using guesswork combined with previous experience to predict the usage of that product – whether that’s human behavior, mechanical performance, or operating conditions. That is probably never going to change.
With the rise of Big Data-driven analytics, we can capture more real-world information about how our products are used in the field. What we need is a new set of tools that allow us to take that information and turn it into knowledge, which we can then use in the next product iteration or development project.
Intrigued? Me too.
[Ed. For many of these same reasons, PTC recently acquired ColdLight, a big data machine learning and predictive analytics company.]