IoT Spotlight - EP 141 - How to seamlessly integrate real time IoT data streams - Brian Gilmore, Director of IoT & Emerging Technology, InfluxData

Ep. 141

How to seamlessly integrate real time IoT data streams

Brian Gilmore, Director of IoT & Emerging Technology, InfluxData

Monday, August 15, 2022

In this episode, we interview Brian Gilmore, director of IoT and Emerging Technology at InfluxData. InfluxData is the creator of InfluxDB, a pioneering time series platform that allows developers to build real-time IoT, analytics, and cloud applications with time-stamped data. They handle massive volumes of data produced by sensors, applications, and systems that change over time.

Today, we discuss how next-generation databases create new opportunities by enabling organizations to seamlessly integrate real-time IoT data streams with cloud databases. We also dive deep into the relationships between database technology and adjacent innovations in AI, AR, and blockchain.

Key Questions:

What is the right way to think about “real time” from the perspective of a user?
What are the unique uses of time series data, and what challenges does it present?
How are AI, AR and blockchain being integrated into IoT systems?
What recent database developments are improving management of complex IoT systems?

Transcript.

Welcome to the Industrial IoT Spotlight, your number one spot for insight from industrial IoT thought leaders who are transforming businesses today, with your host Erik Walenza.

Erik: Brian, thanks for joining us on the podcast today.

Brian: Hey, Erik. Thanks. Thanks for having us. Appreciate it.

Erik: Great. The topic that we're going to dive into today is, broadly speaking, the topic of data and how we think about it, and how we decide what we can do with it. But before we jump into that, I'm interested to understand how you got in touch with the IoT topic originally. Obviously, you were with Splunk for about seven to eight years before joining InfluxData. That's a great company to really deep dive into IoT. But I imagine that you first touched on the topic even before then. Can you just give us a quick background on where you first touched the IoT topic, and how you ended up now moving into the more cutting edge of the domain?

Brian: Yes, sure. Of course. I think people have been doing work with computers and physical devices for far longer, of course, than it's been called the Internet of Things. I just found myself sort of working with it. It started when I spent some time in music school. It was right at the advent of the digitization of the recording studio, just getting to work with that equipment and those applications. It was really interesting for me. I, eventually, in a very roundabout way, found my way to the marine biology field. I worked in public aquariums for several years and saw there the digitization, the connecting of computers to a lot of the industrial equipment that ran the fish tanks. These were very large tanks. They had PLCs and all the same stuff. Each time, I really thought about, okay, well, if we have these physical devices connected up to these computers, what can we learn about the operation of these machines even when we're not watching? So, I got really passionate about just collecting data and storing it. Early on, I started with just Excel and pivot tables, really trying to figure out how I could report on the regular routine behavior of these machines and then maybe identify anomalies or outliers. I carried that through. I worked at a mechanical contractor. It was actually the mechanical contractor who'd built the last aquarium I worked for. I built a similar application there as part of their smart buildings practice. Again, this predates the term IoT.

It was right about that time that I integrated our last generation of that product, at that contractor, with Splunk that I really started to hear that term IoT come about. It was mostly from the team at Splunk, I think. It was a term that was much more popular in Silicon Valley than it really was anywhere else in the world. I was lucky to get to join them for, like you said, eight years. I worked in all areas of the business, from under 1,000 employees to over 7,000 employees, and helped them build that IoT business there. Then I got a call from a friend, Rick Balada, who had mentioned that InfluxData was looking for somebody to do a similar project. I jumped on it because I had been following InfluxData for the last several years that I was at Splunk. Awesome technology, awesome people. I feel like a real opportunity to completely disrupt a few key spaces in the IoT and beyond. So, I'm excited to be here.

Erik: That's a great backstory. It's always interesting to hear where people first touched this topic. We've had folks on here that were working at the military rockets back in the 80s. They'd go, we're kind of dealing with this IoT, but that is extremely proprietary and very controlled environments. Obviously, every industry has its own point when it starts to make sense of data. I guess the recording studios and, certainly, the aquariums—just like anything else—they have to optimize equipment. They have to figure out new ways of improving how people work. It's interesting always to understand that touch point.

The topic we want to get at — we could approach from a couple of different angles. Maybe we can start just with how you think about datasets. So, we always talk about data and then we throw around words like real-time data, time-series data, et cetera. I guess we can categorize it in different ways based on how we can, to an extent, how we can use it, how it's formatted, the latency or the time period that we have access to it. How do you think about datasets? For example, when we say time-series data, how do you define that? Then what would be the other categories that you would define as separate from this?

Brian: Sure. I think a good place to start would be just really thinking about real-time data. Real-time data, I think everybody assumes, is an indicator of the condition or the position or whatever of really any value at a specific point in time. People assume that that specific point in time is exactly right now. It's never the case. I mean, there's latency in this. Really, the only place that data is absolutely real-time, especially in the IoT or industrial IoT space, is literally at the interface of the sensor. Because there's latency that's induced at every stage along the way. You've got your analog to digital conversion. You've got your protocol translation that happens. You've got all sorts of different steps along the way.

I think the important thing to do is to figure out where you can intercept that data earliest, so that you can remove the greatest amount of latency. That sometimes requires bringing compute down to the edge as close to that sensor interface as possible. Sometimes it requires just suspending disbelief in terms of what you would really think real-time should be and getting into what I think a lot of vendors like us and others are calling 'near real-time,' which is it just happened. In terms of a human's ability to react or to address the situation, the latency there in that real-time data is not the biggest issue in terms of being able to react right away. Now, you have all of that now figured out. You've got real-time data. You're successfully moving it from the point of generation to the point of consumption. You're doing so with a level of latency that you and everybody else is comfortable with. Now, once you have that, the question becomes, how quickly do you want to historize all of that real-time information?

That's when I start to think about time-series data, or we start to think about time-series data. If you have this wall of data coming at you, you need to start coming up with a strategy in terms of how you're going to sample that data, and put the timestamp on your sample. For example, if you're grabbing slices of the data every second, or every millisecond, or even every nanosecond, you have to be able to immediately timestamp that information in a way that you can use later to organize that data, put it in series, or run other time-based queries against it. So, you can do that one of two ways. You can do it on regular samples, like I said, every second, or you can do it on condition, which we would call event-based. When it changes a certain amount or when it exceeds a certain threshold, take a reading, put the timestamp of that reading with the data, and then put it in a database that's going to allow you to do all of that storage. If you go back 20, 30 years, people were doing that. They were putting them in text files. Actually, it's probably more than that now, 40 years. Then you had those industrial process historians come in in the early 80s and through the 90s. They did a great job dealing with the volumes of the data that was coming off of those industrial processes, whether it was in manufacturing or in oil and gas, energy-type things.
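The two sampling strategies described here, regular interval and event-based, can be sketched in a few lines of Python. This is only an illustration under assumed names (`Sample`, `sample_on_interval`, `sample_on_change`, and the readings themselves are hypothetical); it is not InfluxDB code.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Sample:
    timestamp_ns: int  # time of the reading, in nanoseconds
    value: float


def sample_on_interval(readings: Iterable[Tuple[int, float]], interval_ns: int) -> List[Sample]:
    """Regular sampling: keep a reading once at least interval_ns has passed since the last kept one."""
    kept: List[Sample] = []
    next_due = None
    for ts, value in readings:
        if next_due is None or ts >= next_due:
            kept.append(Sample(ts, value))
            next_due = ts + interval_ns
    return kept


def sample_on_change(readings: Iterable[Tuple[int, float]], deadband: float) -> List[Sample]:
    """Event-based sampling: keep a reading only when it moved more than deadband from the last kept value."""
    kept: List[Sample] = []
    for ts, value in readings:
        if not kept or abs(value - kept[-1].value) > deadband:
            kept.append(Sample(ts, value))
    return kept


SECOND = 1_000_000_000
# Hypothetical raw (timestamp_ns, temperature) readings from one sensor.
raw = [(0, 20.0), (SECOND // 2, 20.1), (SECOND, 20.1), (2 * SECOND, 23.5), (3 * SECOND, 23.6)]

regular = sample_on_interval(raw, SECOND)    # one sample per second
events = sample_on_change(raw, deadband=1.0)  # only the significant changes
```

With these made-up readings, interval sampling keeps four of the five points, while event-based sampling keeps only the first reading and the jump to 23.5. In both cases the timestamp travels with the value, which is what makes later time-based queries possible.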

But eventually, now, because everything is digitized and everything is centralized, you have just a greater volume of information. It's literally like high-definition television. You have much higher resolution in terms of the number of data points you're collecting. Then you have much higher sample rates. So, you can be sampling on a nanosecond basis. It takes a very specialized database to be able to not only handle that data, but then to also be able to organize it well, because sometimes data comes in out of order, and then make it actually accessible and usable to the people who actually need to consume that data to do their job. Translating that massive volume of real-time data, like the tidal wave of it, into something that can be consumed later, for the purposes of troubleshooting or whatever, is really challenging. I mean, honestly, it is really challenging.
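To picture the out-of-order problem mentioned here, the following minimal Python sketch keeps a series sorted by timestamp as late points arrive. It uses the standard library's `bisect.insort`; the function name and the sample points are hypothetical, and a real time-series database handles this very differently and at far larger scale.

```python
import bisect
from typing import List, Tuple

Point = Tuple[int, float]  # (timestamp_ns, value)


def insert_in_time_order(series: List[Point], point: Point) -> None:
    """Insert a point so that the list stays sorted by timestamp."""
    bisect.insort(series, point)


series: List[Point] = []
# The point stamped 2_000 arrives after the point stamped 3_000.
for point in [(1_000, 1.0), (3_000, 3.0), (2_000, 2.0)]:
    insert_in_time_order(series, point)
```

After the loop, `series` is ordered by timestamp even though the points did not arrive that way.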

Even if you look at what we've been doing at InfluxData over the years, we've evolved our approach there as well. We started with a stack everybody loved, called TICK. As the world grew, we had to start thinking about optimizing the back-end for the modern data volumes. We did so with our 2.x. Then we saw more and more data generated in the cloud, so we built a cloud-native Software-as-a-Service version of our database that was re-architected in the background in a way that I think a lot of people didn't know. It was very impressive stuff. Now we've got even more coming. We've recently launched some really cool technology around running time-series databases at the edge, and then also in the cloud.

That's the real-time data versus time-series data thing. I think there's a whole mess of other data sources or other data storage mechanisms, I would say, that we use for what we call enrichment. You have relational databases, you have document databases, you have graph databases. These, by their nature, are not time-series. Sometimes people try to turn them into time-series databases. It has not traditionally worked very well. But these are great stores of more static information that can be used to enrich time-series data, to add another layer of depth and detail to the story that you're telling through your time-series data or that you're reading, I guess, through your time-series data.

Erik: That would be things like the formula for this chemical is this, or when a temperature reaches this threshold, then this reaction happens. These types of facts that might factor into, is it that?

Brian: Yeah, sure. If you think about like — because of the volume of time series data, you want to optimize your time-series storage in a way that makes it super efficient to write. Because you could be talking about tens or hundreds of millions of time-series events per second. So, you really want to make it efficient to write. There's a lot of information that you wouldn't want to put in with every time-series record.

For example, say you're capturing data from a manufacturing machine. That machine has a name. It's built by a particular vendor. It has a serial number. It has the date it was installed. It has all of these other features that would be good to use when you're working with your time-series data later, either through visualizations or artificial intelligence, or whatever you're doing with it. You don't want to store all of that detailed information with the time-series data. What you might do is put in a short, unique ID for the machine. Then in a relational database somewhere else, you might have that machine ID. Then you might have a whole bunch of other columns in your database table related to all of those fields and features. Then when you write your time-series query, and you limit your tens or hundreds of millions of events down to a more reasonable number, you can then refer the machine ID in that limited number of time-series events out to a relational database, to pull that information back in and decorate your data in a way. If you wanted to map relationships, you can do the same thing with a graph database. If you wanted to tie together logs and other event-based data, you can do that by attaching a second time-series database. It's totally up to the user, but the technology now is powerful to the point that you really can do just about anything you would want to with it.
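The enrichment pattern described here (store only a short machine ID with each event, filter the events first, then look up the static details in a relational table) can be sketched as follows. This is a minimal illustration using Python's built-in `sqlite3`; the table layout, machine names, vendor, serial numbers, and temperatures are all hypothetical, and the in-memory list merely stands in for a time-series query result.

```python
import sqlite3

# Hypothetical time-series events: (timestamp_ns, machine_id, temperature_c).
# Only the short machine ID is stored with each event.
events = [
    (1_000, "m1", 71.2),
    (2_000, "m2", 64.8),
    (3_000, "m1", 93.5),
    (4_000, "m2", 65.1),
]

# Relational side: static details about each machine, kept out of the time series.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE machines ("
    "machine_id TEXT PRIMARY KEY, name TEXT, vendor TEXT, serial TEXT, installed TEXT)"
)
db.executemany(
    "INSERT INTO machines VALUES (?, ?, ?, ?, ?)",
    [
        ("m1", "Press 1", "Acme", "SN-001", "2019-04-01"),
        ("m2", "Press 2", "Acme", "SN-002", "2020-09-15"),
    ],
)


def enrich_hot_events(events, threshold_c):
    """Filter the events down first, then decorate the survivors with machine metadata."""
    hot = [e for e in events if e[2] > threshold_c]
    enriched = []
    for ts, machine_id, temp in hot:
        row = db.execute(
            "SELECT name, vendor, serial FROM machines WHERE machine_id = ?",
            (machine_id,),
        ).fetchone()
        name, vendor, serial = row if row else (None, None, None)
        enriched.append(
            {
                "timestamp_ns": ts,
                "machine_id": machine_id,
                "temperature_c": temp,
                "name": name,
                "vendor": vendor,
                "serial": serial,
            }
        )
    return enriched


result = enrich_hot_events(events, threshold_c=90.0)
```

The point of the ordering is that the metadata lookup runs only against the reduced set of events, not against every record written.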

Erik: Let's talk a bit about what is challenging today. Let's say, a lot of the value that is built into the IoT use cases that are emerging now or starting to get wider adoption is being unlocked, because we have greater access to different datasets and the ability to merge them in ways that weren't possible before. We have real-time data coming off the edge. We have large datasets, historical datasets, or data that's integrated from different sources that's maybe stored on the cloud. We're able to merge these and create use cases out of them. I guess there are certain things here that are, more or less, commodities, meaning that they're solved problems that most companies know how to do well. There are other things that are challenging, technically. If we want to do machine learning on the edge, and we want to have access to maybe still updated data from a cloud database or something, we have huge datasets and limited compute power and so forth, we have these challenging edge cases. Maybe not the cutting edge, but what are the problems that are challenging to solve, that only a small set of companies have solved, or that are really the next frontier of innovation around IoT data management, or maybe data management generally? Where do you see that boundary today?

Brian: I think from a technology perspective, I think you nailed it. I think that really efficiently, I would say, orchestrating data, understanding where to sample it, where to keep it, where to move it. I think data has both a point of origin, but then it also has a force of gravity. The places that you store your data in large volumes typically pull in the applications. Because of the expense of moving data from the edge to the cloud or from the cloud to the edge, if you have a ton of data in the cloud, you're going to build your apps in the cloud. If you have a ton of data at the edge, you're going to want to build your apps at the edge. That's a force of gravity.

Being able to effectively take that data, sample it at the rate you need to sample it, where it's created, store it there. Then make those smart decisions about what data you move and when, for the different applications. You will need to move some of your data from the edge to the cloud. Because there are, like you said, certain technologies—especially when you start getting into the ML and AI realm—you just don't have the horsepower. You don't have the massive, parallelized GPUs to crunch the data the way you would in the cloud.

What we and other vendors are working with right now is, how do you have this duality between the edge and cloud, where you have a tight coupling between your applications at the edge and your applications in the cloud, and then those applications are making that decision for you? We've started down that route with a feature we launched recently, called Edge Data Replication, which is not autonomous in any way, shape, or form. But it does allow you to give us a set of rules that say, okay, here's the data I want to leave at the edge. Maybe it's summarized to the millisecond or the second. But then, I want to aggregate that data up to the minute, or enrich it, or decorate it, or whatever, and move a subset of that to the cloud, so that I can train those machine learning models up there. Maybe if you have an architecture where you might have hundreds, or thousands, or tens of thousands of edge databases, you don't want to centralize all of that. You want to duplicate it across all of those edge devices. So, you would take it, summarize it up to the cloud, aggregate it together, use that to train your master machine learning models. Then you could push those models back down to the edge for application.
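The idea of keeping fine-grained data at the edge and moving only a per-minute summary to the cloud can be illustrated with a short Python sketch. This is not the Edge Data Replication feature or any InfluxDB API; the function name `aggregate_to_minutes` and the synthetic readings are assumptions made purely for illustration.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

NS_PER_SECOND = 1_000_000_000
NS_PER_MINUTE = 60 * NS_PER_SECOND


def aggregate_to_minutes(points: List[Tuple[int, float]]) -> List[Tuple[int, float]]:
    """Group (timestamp_ns, value) points by minute and return (minute_start_ns, mean) rows."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in points:
        minute_start = ts - ts % NS_PER_MINUTE  # truncate the timestamp to the minute
        buckets[minute_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical edge data: one reading per second for two minutes.
edge_points = [(s * NS_PER_SECOND, float(s % 60)) for s in range(120)]

# Only these summarized rows would be sent on to the cloud.
cloud_rows = aggregate_to_minutes(edge_points)
```

Here 120 per-second points at the edge become two per-minute rows for the cloud, which is the kind of reduction that keeps data transfer volume, and the resulting bill, under control.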

The technology is there. It's possible. I think it's just about people really understanding how it works and how to do it effectively, both from an operations perspective but also from a cost perspective. Because if you make mistakes, especially when you're working with some of the larger cloud vendors, you can very quickly run up massive bills because you accidentally moved 50 terabytes of data from your million edge devices to the cloud. Nobody wants to be in that situation.

Aside from the technical stuff, I like to think of — this isn't my sort of theory, but it's something that I've picked up along the way. I truly believe that there is a journey for any new emerging technology. I think we're hitting the point in IoT now which is very positive, in that most of the obstacles are related to the people or the processes and not the technology. I think there are probably still a bunch of companies out there with a budget for IoT. Whenever I talk to companies, I say, "If you have an IoT budget, you're barely started."

IoT is a means to an end alongside a full set of other emerging technologies like cloud and AI, ML, augmented reality, blockchain. You name it. It's using these technologies and investing in these technologies to solve problems. I think what companies need to do to close that last gap, where IoT just becomes ubiquitous and it drops its fame, and it just becomes part of our networks, becomes part of the internet again, is for people to really start asking what are the big problems in the business. Which of these can technology solve? Of those, where does IoT make an impact? Then work with the stakeholders who are kept up at night by those problems, and fix them with technology like IoT. If it doesn't need IoT, that's fine. But if it does need IoT, the technology is there and waiting for you. Otherwise, you're just constantly throwing sensors and platforms and things at every little problem. To a hammer, every problem is a nail. I think IoT has the same issue. People get stuck in this pilot purgatory thing, where they're doing a whole bunch of pilot projects. But because they have no pained stakeholder behind them and no research done in terms of the actual impact to the business, nothing expected in terms of outcomes, et cetera, they're just doomed to fail. I think that's the last hurdle for this to become really, really mainstream. People need to start thinking of it as like any other technology, like you don't use it unless you need it.

Erik: That's a great perspective. I mean, I've almost stopped thinking about IoT as a technology in itself. If you think about it a little bit like the internet, nobody says I'm buying the internet. The internet is this platform where a bunch of different technologies interact. The IoT, the way I view it, is just an extension of that outside of desktop computer to the rest of the world, with different interfaces and so forth.

Brian: Exactly.

Erik: I think that's a very insightful point around where we are in the development, in terms of the challenges coming more from the people, processes. One of the things that really differentiates IoT from the internet is that there's such a long tail of use cases. What that means is that Facebook is like a couple of your friends prove it out. You say, okay, good enough. I'll give that a try. All of a sudden, you have a billion people using the same platform. But with IoT, it's like — I'm based here in Shanghai, in China. You have these industry clusters, where 70% of the world's nail clippers are made in some city, in southern China. It's like one factory will start adopting a technology, and then they'll have some success. Some of their engineers will get hired by other factories. They say, "Hey, that really worked for us, that use case." They invest in it. All of a sudden, you have this cluster of businesses that are all adopting this use case because somebody proved it out, and the talent traveled around and brought it. But that happens cluster by cluster, use case by use case to these different ecosystems. It just takes time. It travels at the speed of human experience. Whereas consumer, it's much quicker to say, "Okay, I'm going to set up a Facebook account, and then I'll get my mom on it next week."

Brian: I think you make a really interesting point there. For each other technology, there has always been that killer app. There has been that one application that elevated a particular technology to the mainstream. It does happen more in consumer than it does in the B2B world, but I think there are some good examples of that. It's not to the level that it has in the consumer space. If you look at the adoption of mobile technology in the industrial world 10 years ago, it was like, "Oh, God. No, don't bring that on there. It's going to come in with viruses. You can't connect it to my network, whatever." Now operators are walking around with their iPhones. They're checking in through cloud connected data pipelines to the machine that they're standing right in front of. That's really cool. I think there's a number of other examples where that will be true.

I think for the software vendors, of course, everybody wants their platform to go viral, right? They want to have that experience, like you said, of one person using it in their factory and going to talk at the world's largest conference and getting the world's largest audience, and everybody's standing and cheering at the end. I think from a business-to-business perspective, it just doesn't happen like that because of the inertia of business and the difficulties in making changes for a whole bunch of really good reasons like safety and things that you've committed to your existing customers, all of that. It just tends to be a little bit slower. I think a lot of companies, InfluxData included, are starting to take a product-led growth perspective, where we have, like I said earlier, a cloud service. We have zero barriers to entry to start there. We have a completely free cloud-hosted version that anybody can sign up for and prototype on at a smaller scale. We use that to get feedback on the product. So, we have a very large number. I think there's 50,000 accounts on that system now. Then we also have an open-source version for those who can't run on the cloud, which is very similar. We've got 600,000 of those out there. Having those databases out there, and then monitoring and working with the communities that are interacting with those free versions of our software helps us build those capabilities that will level it up in terms of its applicability to the big industrial companies or the big consumer IoT companies of tomorrow.

PLG is a real thing. It's a good way to go. I think you're always going to have a salesforce. You're always going to have a product management team who is going to be out there talking with your enterprise customers. But you can get a lot of value by having a product out there, that tens or hundreds of thousands of people can use, and then just looking for their feedback. Sometimes, in the weirdest ways, like a tweet about it. I started monitoring Google Scholar for InfluxDB. There's like 3,000 or 4,000 references over the last 10 years of our databases and all kinds of academic papers. It can be master's theses. We've discovered that we are literally all over the Large Hadron Collider at CERN, because they published these papers that just referenced InfluxDB and how they used it. It's exciting to see.

Erik: Well, I'm just going to ask you, what does the user profile look like? Because when I'm looking at your customer list, you have everything from tech startups to large corporates with tens of billions of revenue. Are you being used by product development teams? Are you being used by operators, like in the IT department of a factory? Is it often more corporate wide infrastructures, or is it more targeted applications? Is it a bit of all of the above? I guess, there's probably not one typical profile, but what are the different types of profiles?

Brian: I think the answer to that is yes. It is very diverse—our audience and our customer base. Any company, you have to focus on one area and then just be glad that other areas come along for the ride. I think what we see and what other companies see is that there's a shift going on, even in the more traditional businesses of finance, banking, and the heavy industrials and things like that, whether it's from hiring a workforce that's younger and more digitally native or whether it's just from the long tail of technology in people, like those fast followers having made it through. Now we've got those laggards coming through in adopting technology that's been experimented with by others. But you start to see these financial services companies. You start to see these oil and gas companies really, aggressively, investing in new technology and new people.

I think that that's really going to bring in, and I would hesitate to call them developers in terms of the real, I'm a software engineer type developer, the folks who are really good at systems integrations, really good at just connecting technology up, really good at writing scripts or writing simple apps and things like that. These types of users are in way more types of organizations now. We target that. It's not 100% polished. It's not turnkey. You can't buy InfluxDB for a specific analytic in oil and gas. Some of our competition has gone that route; I'm not sure it's worked well for them. We leave it open in that we are API first. We have a really powerful set of well-documented APIs. Everything is scriptable. Everything is composable. We have a UI so that somebody who can't write any code at all can click around. Our analytics language, we just call it Flux. I would say it's more like JavaScript than it is like SQL. We're aiming for those highly technical users. We're not necessarily telling them you have to be a C# superhero, but somebody who can write a little bit of code, and somebody who can build some stuff, integrate some stuff, and customize the platform that we provide to whatever their particular use case or pain points are.

That's why you see that diversity. Because we've got big customers in blockchain. We've got big customers in oil and gas. We've got big customers in, basically, every industry you could possibly imagine. It's because of those digital natives. A good example of that is our relationship with Tesla. I mean, every vendor on the planet would die to talk about how they could work with Tesla. We're lucky to be able to say we don't even do it. They talk about it. We do talk about it, of course. Who wouldn't? But I just read a really interesting article on the virtual power plant program that Tesla is putting together. They've realized now that they have all of these connected Powerwalls and solar panels out there in their customers' homes. All of that data is stored and analyzed in InfluxDB. Even on the customer side, when you look at your Tesla mobile app and you see a line chart, the data that you're seeing there is being delivered to that application by InfluxDB. They're finally saying, well, we could link all of this together. We could cluster customers and actually look at their overages in terms of production as like a virtual power plant. We could sell a cluster of tens, or hundreds, or thousands of homes back to the power grid as an actual power plant. This is an idea that somebody who understands technology came up with, and they knew that they had the tools to actually make it happen. So, they did it. They've been very public in talking about it. You can just Google Tesla's Virtual Power Plant. There's been tons of articles about it over the last couple of months. As long as any industry has smart folks like that who know how to use technology to their advantage, we will see adoption in those industries. That's why we focus on a person who will very likely be in all of those industries as compared to something very specifically focused on midstream oil and gas and a particular problem they have there, if that makes sense.

Erik: Yeah, I know. That's interesting. I think databases are the hidden champion of the tech stack, right? They don't get very much credit, but they can make or break a solution. I was going to ask you. I think you've already poked at the answer already. Maybe if I could ask again, what makes this different? What is it about your solution that enables capabilities that somebody else wouldn't be able to find the solution? I always lean into my ignorance. I think me and a lot of other people, we hear database. We think okay, you need a database. Check the box, and you'll find one. It will work. Obviously, that's not the case. There's a lot of intricacy there. So, what is it specifically that allows you to provide that level of capability?

Brian: Our founder, Paul Dix, is just a brilliant developer. He and his team actually built a technology that is very, very hard to replicate in terms of the way the data is stored and organized, and replicated, and all of that. The technology itself is really good. Anybody who would want to use that technology is free to, because it's all open sourced. That's thing number one. Thing number two is that our database itself is really easy to consume and easy to use. Nobody wants to say, "Oh, my God. I have this urgent project. Let me go out and call IBM or whoever, and ask them to talk to their sales team so I can learn more about their databases." People want immediacy today, overall, just in their daily lives, but also in business. They want to just be able to go out, get a piece of technology, set it up quickly, use it, and get some value out of it right away. So, we have done our best to eliminate all of the obstacles to adoption so that, literally, you can be up and running in minutes or even hours with your first streams of data coming in. You can start to see gauges and charts and things like that moving. We call that 'Time to Awesome.' It's a play on words in terms of time to value. But I think time to value is a much longer process where you start considering like, is my investment in this platform actually paying off from a financial or other perspective?

Time to Awesome is just like that aha moment, where you have a problem, you know you have some sensors sitting out somewhere, you go to the internet, you download either our open source or you set up a free account. You're like, "How do I get my data from here to there? Oh, wait. They have another free open-source piece of technology called 'Telegraf,' which is an agent. I can install that right on the machine that's generating the data. I can just change these few lines in a configuration file, or I can copy and paste a stanza from the thousands of configuration files that are posted all over the internet everywhere: Stack Overflow, Slack, GitHub, everywhere. I have data now streaming, from something that I need to know about to a platform that actually lets me know about it." Because you can do that so quickly and with so few obstacles, people very quickly get attached to it in a way. Then they realize how extensible it is and how easy it is to integrate with all of those external applications because of our APIs and our client libraries and all of that. They get to that point of being like, why would I even keep looking? This does what I need. You start with a small open-source single node or a small free cloud account. We're very open with you on all fronts on how you can quickly scale that out to something massive in terms of millions or billions of events per minute. It's not like a rip and replace to make that shift. You just expand and grow as you go, as we say. You don't really need to look elsewhere.

I think that's really been a key to our success. Because we, also, at the same time, are listening to the customers who are using us. We're constantly delivering new capabilities and new features for them. So, it does exactly what they want it to do. It does it in a way that's easier than they expected. It does it in a way that's more affordable from a resources perspective, whether it's time or money, than they expected. My experience at Splunk and my experience in the systems integration space tells me that that's a winning strategy in terms of growing your platform, in terms of that product growth, or going viral like we were talking about earlier. Put those things together, and you're going to do pretty well.

Erik: Got it. The topic of proprietary protocols, this has always been a pain point for innovators in the industrial space. Is that something that you deal with at all, or is that a level below where you'd be operating?

Brian: Yeah, I mean, we deal with it a little bit. We've got connectors for OPC UA and MQTT. But open source is open source. So, that doesn't keep anybody from creating a Telegraf input for any of these other industrial protocols or applications, as long as they expose the data via API, or file, or whatever it might be. Most of the time, especially when it comes to production level stuff, we will work with a partner, or the customer will work with a partner we don't even know about, somebody like Kepware or HighByte, who does that protocol translation in a way that also has good Time to Awesome, a great Time to Awesome. Then just stacking the platforms is a total no-brainer, because you're just going to get to your solution more quickly. Ultimately, it'll scale very well. We work with a lot of middleware companies.

I think the other thing we're seeing, too, which is really interesting, is that because we have that open-source nature, our product is integrated with a lot of the industrial hardware and software vendors' own platforms. Bosch, Siemens, and Rockwell all take InfluxDB and package it as part of their container management system for industrial IoT. With Siemens, when you get the WinCC OA external database connector in that platform, it's going to connect to InfluxDB. With ThingWorx, when you buy ThingWorx cloud and you configure your time-series database, you're configuring InfluxDB behind the scenes. If you're buying ThingWorx for your own on-premise use case and you want to do so in a very scalable way, even ThingWorx's documentation recommends that you use InfluxDB as your time-series database.

Those are organic partnerships, which I really love. Because it just shows that the technology works and that it's attractive to those vendors as well. It also gives us the opportunity to work with more community members. We don't call all of those people customers. Sometimes they end up making a purchase from us because they need to get one of our enterprise capabilities. But we get to work with them as part of the community and learn more about their use cases, and maybe build new features specifically for them or whatever. It's just another really interesting route and channel to market that I don't think every vendor gets. Just to say hey, another multinational, large industrial, technology company has just picked our software up and has packaged it as part of their own platform. That's like a really good design to problem. It's just a really good thing to have.

Erik: That's great that you've managed to play well with proprietary. There are a couple of other things that are interesting from a database perspective. There are topics like ML at the edge and a lot of innovation around designing chipsets, algorithms, and so forth, making that more feasible. Obviously, data management is a critical topic. There's also, obviously, the blockchain and a lot of innovation. I guess you could argue around how much that's resulted in real world impact so far. But at least, there's a lot of effort in innovation around the topic. What is interesting to you in terms of these other parallel technologies?

Brian: I think they all are — I think blockchain, for example, I think it's really interesting. Because I feel like its trajectory has been a bit of an analog of the trajectory of IoT, just in like a 'hold my beer' manner. I think the technology itself is immature and it's young. I think the hype just blew it completely out of proportion. It got affiliated very early with the use case, decentralized finance, which, of course, everybody wants to make money. They want to do it by making money off of money, because it seems like the easiest way to do that.

I think there are a lot of other really compelling use cases in terms of supply chain provenance and working with data integrity that are really interesting on the blockchain side of things. For example, you could do this in our platform. We did this at Splunk, with event data integrity, as well. You could take a database bucket, or an S3 bucket in AWS, or whatever. You can hash that bucket. Then you could take that hash, and you could put it on the blockchain. At that point, because that blockchain is public and it's immutable, ideally, you could always verify that the data in that bucket had never changed. It just adds a really interesting layer of data integrity to your strategy. For some industries, I think, especially the regulated industries that store data where it's extremely important that they retain it and that nobody makes any changes to it, to do cover-ups or whatever, that will be a use case that will eventually emerge. People will go, "Oh, that's blockchain. Cool. Why didn't we do that 10 years ago?" That's one thing on the blockchain front.
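
The hash-and-verify idea Brian describes can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how InfluxDB or Splunk implement it: the bucket is modeled as an in-memory mapping of object names to bytes, the function names `bucket_digest` and `is_unchanged` are hypothetical, and the step of publishing the digest to a blockchain is left out entirely.

```python
import hashlib
from typing import Mapping


def bucket_digest(objects: Mapping[str, bytes]) -> str:
    """Return one SHA-256 hex digest covering every object in the bucket.

    Hypothetical helper: `objects` stands in for the contents of a
    database bucket or an object-store bucket.
    """
    outer = hashlib.sha256()
    # Sort by name so the digest does not depend on listing order.
    for name in sorted(objects):
        # Hash the name and the content separately so that every entry
        # contributes two fixed-length values to the outer hash.
        outer.update(hashlib.sha256(name.encode("utf-8")).digest())
        outer.update(hashlib.sha256(objects[name]).digest())
    return outer.hexdigest()


def is_unchanged(objects: Mapping[str, bytes], anchored_digest: str) -> bool:
    """Recompute the digest and compare it with the one recorded earlier."""
    return bucket_digest(objects) == anchored_digest


# Example data (made up for illustration).
bucket = {
    "line1/2022-08-15.csv": b"ts,temp\n1,21.5\n2,21.7\n",
    "line2/2022-08-15.csv": b"ts,temp\n1,19.0\n2,19.2\n",
}

# This digest is the value that would be written to a public ledger.
anchored = bucket_digest(bucket)

# Later: any change to the stored data produces a different digest.
tampered = dict(bucket)
tampered["line1/2022-08-15.csv"] = b"ts,temp\n1,21.5\n2,25.0\n"

print(is_unchanged(bucket, anchored))
print(is_unchanged(tampered, anchored))
```

The integrity guarantee comes from where the digest is stored, not from the hashing itself: the comparison is only meaningful if the anchored value sits somewhere the data owner cannot quietly rewrite, which is the role Brian assigns to a public, immutable chain.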

I think augmented reality is pretty interesting as well, I got to admit. I just caved and bought the family an Oculus, or the Meta Rift, or whatever it's called, Meta Quest 2. I had no idea. I'd seen it in trade shows. I used it as more of like, "Oh, it's a fancy gaming monitor." But now that I've really started to explore some of the more immersive experiences there, the training stuff that you could do with that, the safety, the simulation you could do with that, I feel like that is really going to start to play, especially in the industrial IoT space. Because it is immersive, and it does feel much realer than I had ever expected.

Machine learning and AI is always going to be there. It has always been there. I think we're probably in our 30th year of practical machine learning. I think over the last 5 to 10 years, because money is money, it's got a lot of press. People have become hyper aware of it. Part of that is because the technology has improved as well. But to your point, I think it's ironic. Because, really, as we make machine learning more and more efficient, we're actually taking things out of the software domain and moving them back into the hardware domain. I think you start to see, people for the past 10 years have been like, "Oh, an IoT gateway is just... It's a small, embedded, industrialized, general purpose computer." I think that's going to change. I think what will happen is that the edge gateways or even the edge devices themselves will become much more highly specialized and hardware based, whether it's through DSP or some other processor model, where the hardware itself is designed to do specifically what it's supposed to do in terms of work. Whether it's processing data, aggregating data, or moving that data over the network, it can all now be put on a chip. It can be functional and reliable in a way that software just can't yet. So, I think that's pretty cool, too. I would imagine that in the near future, you will be able to buy a gateway that will have open slots for chips that you can pop in, that will give you anomaly detection or some other industry use case specific value, that is solved just by plugging hardware into the board as compared to having to install and configure an application.

Erik: That's an interesting concept. I know a couple of companies that are working on, at least, machine vision specialized chips. I mean, that's obviously a very computationally intensive use case. Great. I think we've covered a good bit of territory here, Brian. Anything else that you wanted to touch on?

Brian: No, I mean, I would encourage your audience to just take a look at our open source. You can find us on GitHub. We got like 23,000 stars. So, if you just type in 'influx' or whatever, it's very likely to be at the top of your search box. Try it out. But I think the thing, too, when you have such a wide audience, especially in the IoT side of things, is participate as well. Use the open-source software. Do whatever you want with it, extend it, customize it. But then also share that back with the community. Make your submissions. Create your connectors, whatever. If you share it back as open source, everybody's going to benefit from that.

Feel free also to reach out to me. You can find me on LinkedIn. Just check out our website influxdata.com. We're really hoping to be able to continue to deeply engage with the IoT and industrial IoT audience. I think, as things go, that will happen more and more, and more deeply and more deeply. Because, like I said, we're seeing the early indicators of good product market fit in those areas where I think a lot of vendors have struggled, and the customers have struggled because of that. I'm not saying we're going to save the world, but I think help is on the way for sure. So, we're looking forward to working with everybody.

Erik: Great. Well, I love your mentality as a company, Brian. I certainly encourage everybody to reach out and just engage and learn more. I'll put the links in the show notes. Brian, thanks for joining us.

Brian: Yeah, thanks, Erik. That was fun. Let's do it again.
