IoT Spotlight - EP 178 - Replicate human decision making in industrial environments with Autonomous AI - Bryan DeBois, Director of Industrial AI, RoviSys

Monday, May 29, 2023

This week, we interviewed Bryan DeBois, director of Industrial AI at RoviSys. RoviSys is a global system integrator that applies the most effective methods for building, analyzing, improving, and maintaining complex autonomous systems.

In this episode, we talk about the concept of autonomous AI, which integrates deep reinforcement learning based on machine data with machine teaching based on domain expertise and improvement through simulation. We also explore the industrial landscape broadly, including the cost structure, ROI targets, use cases, cloud strategy, and emerging technologies like GPT-4.

Key Questions:

● What is autonomous AI?

● What is the status of industrial AI in the S-curve of artificial intelligence?

● What are the use cases that are fit for autonomous AI?

● What is the cost structure for autonomous AI?

Transcript.

Erik: Bryan, thanks for joining us on the podcast today.

Bryan: Yeah, thanks for having me, Erik.

Erik: I'm really looking forward to this one. Because we're running a couple projects at IoT ONE related to industrial AI, so I get to kind of pick your brain around a couple of the things that we have in motion as well.

Bryan: Yeah, let's do it.

Erik: Cool. But before we get into this topic of industrial AI, I would love to understand a little bit more on your background. You've been working with RoviSys now for 23 years. So you are a very loyal team member. And so I'm curious. Obviously, you've been in software development in a bunch of different roles. How did you end up where you are today as director of Industrial AI?

Bryan: Well, to kind of get there, we got to talk a little bit about who RoviSys is. It was started in 1989 as a control system integrator. And so, for some of your listeners who may not be as familiar, control systems, controllers, these are typically the smallest unit of work on the plant floor. They're mini computers, and they tell all of the other equipment what to do. So if a reaction needs to heat up, or cool down, or you need to move this product through this pipe, all of that is done by these controllers. As a control system integrator, our history is we would go in and we'd program these controllers to do interesting things.

Well, over the evolution of RoviSys as a company over the last 34 years, we've added more and more capabilities. Then we started to be able to build the HMI screen. These are the human machine interfaces. These are the screens that operators use when they're running a plant. Then we kept layering on capabilities. One of those capabilities, and the one that I have been involved with my whole career at RoviSys since I started in 2000, is what we call information systems. And so these are all of the systems that live above the plant floor but live below the business systems. So in-between space. They collect all the data, correlate the data. They visualize the data. This is where all the analytics tend to live.

This is where I get excited because this is where all, in my opinion, all the interesting things happen. Because the plant floor, it can generate just absolute volumes of data. In fact, we like to say we've been doing big data even before really it was a term in the IT world. We were doing big data because of all that data coming off the plant floor. I've been in that information system space the whole time. When I was a programmer, I was programming in that space. Then I moved into MES in that space. Then I moved into managing projects in that space. Then OT came along, or cloud came along with OT data and OT data warehousing, and pulling all of that OT data together and collecting and correlating it into these giant data warehouses in these Historians that we use, which are time-series databases — specialized time-series databases we use on the plant floor. So I was doing all of that.

Then to answer your question, so then in 2019, we hit our 30 year anniversary at RoviSys. We kind of sat back and we said, "What's next? What should we be going after next?" And so, at that point, we launched three horizontals. One was our Industrial IT team. They actually go in. They do cybersecurity audits. They lock down the plant floor. That really was as a response to some of the cybersecurity things or stories that we've heard about major manufacturers, ransomware and things like that that have hit, that was a response to that. The second one was our MES division. That's manufacturing execution systems. These are the systems, the big software systems similar to ERPs. But they sit on the plant floor, and they tell everything what to do down on the plant floor. Organize everything, communicate directly with the ERP systems, handle inventory, goods produced, things like that. Then the third division we created, which is the one that I'm the director of, was Industrial AI.

I have this strong software background. I've got a computer science degree from the University of Akron. And so with that, it was a natural role for me to take. But I will say this. I was a little reluctant about taking the role. Not because I didn't think that there was a future for industrial AI. I knew that there was. I just didn't know in 2019 if we were maybe moving a little too quickly or if we were a little too early in pushing that story. I remember distinctly one of my first meetings with a customer, to talk about industrial AI. Here, they bring out their two data scientists that they had on staff that were already training neural networks on data from the plant floor. That was an eye opener for me. What I found was that we were not late to the party, but we certainly weren't early. And so we really felt like we were hitting at the right time. Because we had these customers that had these small little teams and things like that. They were doing AI already. And so it was a great time for us to enter that market. And so I've been doing it now for almost four years since.

Erik: Interesting. It's always interesting thinking about technology, what is the S curve, and where are we? Are we under the lower tier, the middle? With a technology like industrial AI, it's very difficult to know. Are we already reaching the mid, the higher range on the S curve? Maybe the results are going to start to peter out, and we're going to have just a continuous moderate improvement on what we have today. Or are we at the bottom, and we're just getting ready to take off? Do you have any kind of sense? I guess you can look at AI — obviously, you have different generations of technology — the generation that we're working with today, which is a lot based on machine learning and then now deep reinforcement learning and so forth. Do you have a sense today where you feel like we are on that S curve?

Bryan: Well, I think that we are past any sort of hype, I would say. Because AI, if we define that like you did in terms of machine learning, being used on the plant floor to make transformative impact on the operations of a manufacturer, we are there and have been there for many years now. In fact, there are certain off-the-shelf products where it's pretty turnkey, and you can do things like predictive maintenance. I'll mention it. They are a partner of ours — AspenTech Mtell. It's a pretty point and click turnkey solution for predictive maintenance. It's going to do your anomaly detection for you, and it's going to do the predictive side of the ML equation. So it's been around for a while. It's established. There's lots of customers who have seen results.

There's no question that Industrial AI exists. It's past the hype. Where I get into most of the conversations with my customers, because I'm still doing a lot of education with my customers about what is the state of the art, is this: a lot of what they hear about AI (we'll probably talk a little bit about generative AI like ChatGPT and things like that) may not be applicable yet in the world that we live in, in manufacturing. Because we do tend to be five to seven years behind the rest of the world. That's because, well, for a lot of reasons, manufacturers tend to be risk averse and things like that. Where we're at, we are just now seeing the adoption of technologies like DRL, deep reinforcement learning. Hopefully, we can talk more about this in a minute here. We have packaged together DRL into a broader offering called autonomous AI. We really see that as the next generation, as the next step for industrial AI.

Erik: Great. I would love to deep dive that. Let's just cover a little bit more detail on RoviSys first. When you set this team up or when you became director of Industrial AI in 2019, did you already have a handful of people doing different things? Did you say, "Let's package these people"? Or was it like, "Okay, we got to get on LinkedIn and try to hire the right team"? What did that look like at that time?

Bryan: So you're asking me the same questions that the president asked me when he appointed me director. He said, "Okay. Here you go. What are you going to do? How are you going to go after this? What does the market look like? What kind of skills and talent do we need to hire in?" And so it took a while to put that all together. One of the things that I found, though, is that for the 20 plus years that we had been doing projects in that information space, every time we went after a new type of project (building MESs or building Historians, and we did some linear optimization problems, which were really cool, for a hydroelectric dam), every single time, we hired generalists. We hired computer scientists and computer engineers. They understood the manufacturing and industrial space because we got them into plants very quickly, and we had them talking to customers very quickly. So they understood that really well. Then they were able to adapt to the different technical problems that we threw at them. So I estimated that we probably could do the same thing. And so we took the same software and computer engineers that we use on all our other projects, and we taught them the data science.

My hypothesis, which seems to have been proven out, is that it's easier to teach an OT expert the data science than it is to teach a data scientist the OT. For some of your listeners (because I use this term all the time, I sometimes forget to define it), when I say OT, I mean operational technology. So this was a term coined by Gartner, I think, back in 2006 because they needed something that wasn't IT. Everyone knew what IT was and understood what that was. OT is really about the plant floor, and really is more defined as equipment and making things move, or happen, or producing things typically on a plant floor. But it can apply to things like power generation and pipelines where you're not making anything. You're just moving product. So that's OT.

And so teaching those OT experts the data science seemed to be the way to go. Interestingly enough, talking to customers (and I don't want to take anything away from data scientists, because many of them are brilliant), what we're hearing from them is that a lot of times these data science projects stall for that exact reason. Because they've hired in from the outside data scientists who really understand data. I mean, they can build predictive models, and they can prove they're predictive. But they've got no idea how to actually implement that on the plant floor. And so we call that operationalizing AI. We think that that's a key piece that's typically missing from those meetings. Because everyone sitting around the table is looking at everyone else and saying, "Well, you're going to deploy it." "Well, I don't know how to deploy. You deploy." And so we're the ones that can actually sit at the table and say, "Yes, we understand how to deploy an ML model to a plant that has to run 24/7. We understand the organizational change management that comes with trying to get operators and supervisors to start to trust this model instead of their own eyes, ears, nose that they've developed over a long time. We understand all those factors, and then taking into account things like model retraining. Because these models drift over time. So we understand that too." And so we feel like we can bridge those two worlds really well. We find that as a differentiator for RoviSys.

Erik: It's funny. We had a very similar conversation a few years ago with Siemens, which is a big client of ours. Actually, we were on the same timeframe, where we were helping them to define that AI big data strategy for one of their businesses. We got to the point of saying, "Okay. Now we feel like you have a sense of what you want to do. Now how are you going to do it, and who's going to do it?" We had the same conversation. This was in China. "Are we going to go try to take talent from Tencent and Alibaba and all the tech startups? But then, are those guys going to stick with it? Are they going to be here for two years and then jump? Or is it going to be more effective just to say, let's take some of the guys that have been on your team for seven years. You know them. They know the business. You know that they buy into this industry, and they're kind of married to the industry. Let's upskill them and give them a new opportunity to redefine their career."

That was the path they took, and it seems like it was really the right path. Because there's so many tools on the market now to help somebody learn that data science. It's not like it was 20 years ago where you need a PhD. Now you need to learn some access and toolkits, and you know how to program already. So that really seems like the right approach in a lot of cases. Last question about RoviSys, Bryan. Your customers — I see you're sort of process, discrete, and building. Can you maybe give some examples? What does a typical customer look like for you in those different verticals?

Bryan: Yeah, each of our verticals, we've got about 13 I think verticals now — everything from life science, to chemicals, to oil and gas, to like you said, discrete manufacturing, to building automation. Each one of those, it's always kind of interesting. Because each one of those vertical industries has very specific needs. That's why we're oriented by verticals. Actually, our divisions are divided up by verticals literally within the organization. It's all by those vertical industries. That's because, each one of those verticals, they speak a different language. They have very specific pain points and use cases and things like that. Then we have dedicated engineers that only work in that one industry. And so it allows us to walk into a room and very quickly build credibility with that customer, because they know that we've done it. We're using the same terminology that they're used to. We're talking about the same types of equipment and same type of problems. Because, again, when you focus on an industry like that, the problems are pretty portable. You start to see the same repeatability between different customers on, "Oh, I bet you're having these types of issues." Oh, yeah, we are. So that immediately lends us a certain amount of credibility.

Then we've got these three. I'll talk about it. We call them horizontals. Industrial AI is a horizontal. In my team, there's about 22 managers and about 25 engineers. We go around and we help service all of the other vertical industries. What's interesting about that is that me and my team end up having to be jacks of all trades. Because I've got to be able to speak to pain points across all 13 different industries that we service. But the other thing, to address your question about the typical kind of customers that we see: as RoviSys has grown, it's been great to watch that. When I started, it was about 150 people. Now we're getting to 1,200-1,300 people globally. Watching that evolution, the projects have gotten bigger. The customers have gotten bigger. These are Fortune 500, Fortune 100 customers that we serve. The stakes are higher. The ROIs are higher. Everything is bigger. The teams are bigger. The projects run longer. But it's all pretty exciting. It's fun. It's great.

Erik: Cool. Well, let's then get into what makes this exciting. I think you're in a very unique position probably at the company, pushing the company into new domains. The topic of industrial AI, I guess you've already given us a high level in the past 20 years and where are we today. So why don't we start with today and the cutting edge, and maybe this concept of autonomous AI? What does that mean? Why is it important, and where are we today in terms of actually being able to deploy this on the shop floor or in the building?

Bryan: Yeah, so to talk about autonomous AI, it's important to understand a little bit about deep reinforcement learning. And so, for your listeners, all of the industrial AI that we've talked about up to this point can pretty much be divided into two different categories. The first is anomaly detection. That's the idea where you hook it up to a line, and it will monitor that line. It will learn over time, without any a priori knowledge, what normal looks like. Then it will start to be able to tell you what abnormal looks like. That's anomaly detection. It's been around for a long, long time.
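As a rough illustration of the "learn what normal looks like, then flag abnormal" idea, here is a minimal Python sketch. It assumes a single sensor and a simple three-sigma rule. The function names, the readings, and the threshold are hypothetical and are not taken from any product mentioned in the conversation.

```python
from statistics import mean, stdev


def learn_normal(readings):
    """Summarize historical readings as a (mean, standard deviation) baseline."""
    return mean(readings), stdev(readings)


def is_abnormal(value, baseline, threshold=3.0):
    """Flag a reading that sits more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    return abs(value - mu) > threshold * sigma


# Hypothetical temperature readings taken while the line ran normally
normal_history = [70.1, 69.8, 70.3, 70.0, 69.9, 70.2]
baseline = learn_normal(normal_history)

print(is_abnormal(70.1, baseline))  # a reading close to the learned baseline
print(is_abnormal(75.0, baseline))  # a reading far outside it
```

Real anomaly detection on a plant floor typically watches many correlated signals at once; the single-signal rule here only shows the shape of the idea.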

The second type is predictive — everything from predictive maintenance to predictive quality. The idea behind that is that it can predict one value. Here's the number of days until that piece of equipment is going to fail. Here's what the temperature is going to be tomorrow. Here's what the likely quality of that batch is going to be when it's done. It can make a single prediction. What you do with that prediction is up to you. You're the one that has to take action and make decisions based on that. That's where the value comes from. Those two categories of ML, machine learning, that's where we've been.
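To make the "it predicts one value" point concrete, here is a small Python sketch that fits a straight line to a rising vibration trend and extrapolates the number of days until it crosses a failure limit. The linear trend, the data, and the limit are assumptions chosen for illustration; production predictive-maintenance models are usually far richer.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def days_until_limit(days, vibration, limit):
    """Return the single predicted value: days remaining until the trend reaches `limit`.

    Returns None when the trend is flat or falling, since no crossing is predicted.
    """
    slope, intercept = fit_line(days, vibration)
    if slope <= 0:
        return None
    crossing_day = (limit - intercept) / slope
    return max(0.0, crossing_day - days[-1])


# Hypothetical daily vibration measurements (mm/s) and a hypothetical failure limit
remaining = days_until_limit([0, 1, 2, 3, 4], [1.0, 1.5, 2.0, 2.5, 3.0], limit=5.0)
print(remaining)
```

The model only outputs the number. Deciding whether to schedule maintenance is still left to a person, which is the limitation described above.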

This third category is deep reinforcement learning. It's a brand-new category of machine learning. It came onto the scene. Its big coming out party was around 2016. It came out of DeepMind, which was a Google spin-off. When it came out, it came out in the form of a program called AlphaGo. It went up against Lee Sedol, who was one of our human grandmasters in Go. It beat him, four to one. Then they built AlphaZero, which went on to beat some of our best chess grandmasters. Then it beat our best chess programs. Then they created AlphaStar, which beat our best StarCraft players. And so they were on to something. It raised a lot of eyebrows in technical circles because this was something new. And so around that time, there were a few companies that were starting to apply this new DRL technology, not to playing games but to solving industrial problems. And so it's grown out of that.

But why DRL is different than any of the other types of ML is it does frankly what people think AI can do. It says, "What should I do next? Given the current state of the system, what is the next best move that I can make?" What's interesting is: now, if you think about that, if we take the chess analogy, so if it comes up on a pawn, well, if it was just kind of a greedy AI, it would just grab that pawn. Well, maybe that's the next best move. But maybe not. If your goal is to win the game, and if you think about its ability to play against grandmasters and win, it can't just grab the next piece that's available to it. It has to build long-term strategy, and it can. That's one of the things that distinguishes it from any other ML that precedes it. I've never seen any other ML that can build long-term strategy like this. And so DRL then is at the heart of this autonomous AI revolution.
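The difference between grabbing the pawn and playing for the win can be shown on a toy problem. The sketch below is not deep reinforcement learning: it uses plain tabular value iteration on a three-step, made-up decision chain, purely to contrast a greedy choice (best immediate reward) with a choice that accounts for discounted future reward. The states, rewards, and discount factor are all hypothetical.

```python
GAMMA = 0.9  # discount factor applied to future reward
TERMINAL = "end"

# TRANSITIONS[state][action] = (immediate_reward, next_state)
TRANSITIONS = {
    0: {"grab_pawn": (1.0, TERMINAL), "develop": (0.0, 1)},
    1: {"grab_pawn": (1.0, TERMINAL), "develop": (0.0, 2)},
    2: {"grab_pawn": (1.0, TERMINAL), "develop": (10.0, TERMINAL)},
}


def greedy_action(state):
    """Pick the action with the largest immediate reward, ignoring the future."""
    actions = TRANSITIONS[state]
    return max(actions, key=lambda a: actions[a][0])


def value_iteration(sweeps=50):
    """Estimate the long-term value of each state by repeated Bellman backups."""
    values = {state: 0.0 for state in TRANSITIONS}
    values[TERMINAL] = 0.0
    for _ in range(sweeps):
        for state, actions in TRANSITIONS.items():
            values[state] = max(
                reward + GAMMA * values[nxt] for reward, nxt in actions.values()
            )
    return values


def planned_action(state, values):
    """Pick the action with the largest immediate-plus-discounted-future reward."""
    actions = TRANSITIONS[state]
    return max(actions, key=lambda a: actions[a][0] + GAMMA * values[actions[a][1]])


values = value_iteration()
print(greedy_action(0))           # the short-sighted choice
print(planned_action(0, values))  # the choice that plays for the larger payoff
```

In this chain the greedy rule takes the small reward immediately, while the planner passes it up twice to reach the larger one. DRL learns that kind of long-horizon value estimate with neural networks instead of a lookup table.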

There's two other key aspects to it, though. One of them is called machine teaching. The idea behind machine teaching is that you actually codify some of your — we talked to the best SMEs, the best operators, the best subject matter experts on how to run a line. We talked to them about what are the skills and strategies and concepts that they use. They talked about it. How do they talk about operating the line? If it's in this state, a lot of times when you talk to them, they'll say, "Well, if it's in this state, then we run it this way." I was just talking to a vinyl manufacturer. They've got two primary suppliers of their resin. And so they said, "Well, if we get resin from such and such supplier, we tend to run the line this way. If we get resin from this supplier, we tend to run it this way. We run a little hotter. We run a little cooler. We run a little faster, a little slower." Those strategies then become part of our machine teaching. So we bake into it, those skills.
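One way to picture codified operator strategies is as per-condition operating envelopes that limit what the learning agent is allowed to try. The Python sketch below is an assumption about how that could look. The supplier names, setpoint names, and numeric ranges are invented for illustration and do not describe RoviSys's actual tooling.

```python
# Hypothetical expert strategies: for each resin supplier, the range of
# setpoints experienced operators say they run within.
STRATEGIES = {
    "supplier_a": {"temp_c": (180, 190), "line_speed": (1.0, 1.2)},
    "supplier_b": {"temp_c": (170, 180), "line_speed": (0.8, 1.0)},
}


def constrain(supplier, proposed):
    """Clamp the agent's proposed setpoints into the expert envelope for this supplier."""
    bounds = STRATEGIES[supplier]
    return {
        name: min(max(proposed[name], low), high)
        for name, (low, high) in bounds.items()
    }


# The agent proposes running too hot and too slow for supplier A's resin
print(constrain("supplier_a", {"temp_c": 200, "line_speed": 0.5}))
```

The agent still has room to explore inside each envelope, but it does not spend training time rediscovering what the operators already know.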

Now why would you do that? It's tempting to say, well, just let the DRL learn all of this stuff. Start from a blank slate, and let it learn all that stuff. Well, the problem with that is that it might take 1,000 years for it to learn all of those skills. The other thing is that that's not even how you would teach a human operator. If you were to teach a human operator, you're going to come in. You're going to say, "Okay. This is what you do. If it's running hot, you do this. If it's running cool, you do that." You're going to teach them those same strategies. So we give it that same running start. Then it's still got plenty of state space to explore and figure out. The way it does that is that third pillar, which is simulation.

The key to autonomous AI is it learns by doing, not by data. So you need a simulated environment. We can talk more about that. But that's been an interesting challenge. Because simulation is not that prevalent, at least not yet, in the manufacturing world. So that's been part of the challenge that we've had to overcome.

Erik: Got you. Okay. So autonomous AI is deep reinforcement learning, plus machine teaching where you're basically giving it a foundation of a set of organizational learnings that already exist, plus learning through simulation.

Bryan: Simulation, yep.

Erik: What are the use cases that autonomous AI is good at, and maybe what are the ones where it's not so good?

Bryan: That's why I get really excited about this technology. Because RoviSys is a system integrator, or SI. So what do we do? We integrate systems together. That's what we do. RoviSys has none of our own IP. We're not a product company or anything like that. We look for technologies that can solve a lot of problems, that we can address a lot of use cases with. As a comparison, when Historians really started to become prevalent in the market, in the late 90s and early 2000s, we jumped on that bandwagon. I remember, early in my career, I was explaining to customers what a Historian was and what value it could bring. Nowadays, every customer has multiple Historians. But what we loved about that is that you could apply it in all kinds of different industries to solve all kinds of problems. Well, autonomous AI is in its early days, but it is looking like that type of a Swiss army knife, which we love.

So to answer your question, we're using it to solve problems as disparate as production scheduling, which is a relatively old problem. Everything from that to a refining customer that's using it to do direct control on drawing out diesel in the distillation column for the refining process. So that's what we really like about it. It's anything that has a human being in the loop, anything where you need human-like decision making. That's a perfect use case for applying autonomous AI. Where it's not so great are problems that already have good solutions for them. So you wouldn't use it for just standard control. Because PID loops do that. They've been doing it great for a long, long time.

Even things like MPC, model predictive control. We don't necessarily even see it as a competitor to MPC. In fact, we've married it to MPC in some of our solutions, where autonomous AI sits on top of the MPC. MPC, I always compare it to an autopilot, a very capable autopilot in a plane. In a plane, if you tell the autopilot where to go and you lock it in, it's going to take you from point A to point B. It's going to navigate around certain things that it can handle. When conditions change and things like that, it can handle all of that. But all it can ever do is get you from point A to point B. It can't tell you that it's supposed to go to point B. You need a pilot to do that.

And so that's where autonomous AI comes in. It can actually make those types of decisions. I could go to point B, or I could go to point C. Which one's the more optimal place for me to go? It can do that by learning through sim. It will run for hours or sometimes days. But in that time, it's the equivalent of running for like 100 years. It's the equivalent of having been an operator for 100 years. So it gets out of that training session as if it's a seasoned operator that's seen everything. We do these things called lessons where we throw crazy curveballs at it. Take production scheduling, in this case. What happens if this equipment goes down? Now build an optimal schedule. What happens if you've got a hot order that comes in from your best customer? Now build an optimal schedule. So we throw these curveballs.

The other thing that's interesting. I was presenting this at a trade show. One of the questions I got was, well, okay, but how good does the simulation have to be? It seems like it would have to be really, really accurate. I said, well, no. In fact, if you give me a really, really accurate simulator, what we actually do is we introduce noise. Because the real world has noise. So this brain, they're called brains. When it's trained, it's got a brain. These brains have to be able to handle the real-world noise. We'll introduce bad signals so that it has to deal with that and learn to work around that. Anyway, that's how it learns over time.
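A simple way to picture noise injection is a wrapper that sits between the simulator and the learning agent, perturbing each reading and occasionally dropping it. The Python sketch below is a hypothetical illustration. The class name, the Gaussian noise model, and the dropout probability are assumptions, not a description of any specific training platform.

```python
import random


class NoisySensor:
    """Wrap a clean simulated signal with Gaussian noise and occasional bad readings."""

    def __init__(self, read_true_value, noise_std=0.5, dropout_prob=0.05, seed=0):
        self._read_true_value = read_true_value
        self.noise_std = noise_std
        self.dropout_prob = dropout_prob
        self._rng = random.Random(seed)  # seeded so training runs are repeatable

    def read(self):
        """Return a noisy reading, or None to represent a bad or missing signal."""
        if self._rng.random() < self.dropout_prob:
            return None
        return self._read_true_value() + self._rng.gauss(0.0, self.noise_std)


# A perfectly accurate simulator that always reports 100.0, seen through the noisy wrapper
sensor = NoisySensor(lambda: 100.0, noise_std=0.5, dropout_prob=0.2, seed=42)
readings = [sensor.read() for _ in range(10)]
print(readings)
```

An agent trained against readings like these has to learn behavior that tolerates jitter and gaps, which is the point being made about real-world signals.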

Erik: Yeah, interesting. This topic of simulation, I think, is really critical here. Because if you don't have good data, that's been one of the bottlenecks of a lot of machine learning in the factory floor. It's that you just don't have that many faults, right?

Bryan: Right.

Erik: I mean, the goal is not to have faults. And so you say, "Well, give me a data set with a bunch of faults so I can train this." They say, well, we've got 10 of them over the past 10 years. But we don't want to have another one, because each one costs us $100 million. So where are we today in terms of being able to generate simulated data for training? Again, also probably using AI for this. I was just talking to somebody a couple of weeks ago about also using GPT-4, that there are some applications there for generating high-quality simulated data. So how are you doing that today? What's your approach?

Bryan: We basically break simulation into three different categories. The first one is what's called first-principles sims. That's what everyone thinks of when they think of simulation. So you're using advanced calculations that capture the best that we know about physics or fluid dynamics. You're simulating the real world in these calculations. Those are great. Some of our customers already have some of those things. What's actually happened there: we had one customer in particular, a glass manufacturer. They had a first-principles fluid dynamics model for the process. The problem was that the process we were trying to optimize happened in two seconds, and the simulator took over 24 hours to run one of those loops. So we couldn't actually use the first-principles sim. So then we have to look to one of the other types. The second type is what's called discrete event simulation. That's used primarily in things like production scheduling, or we've done some throughput type of projects with manufacturers. But everything is a discrete step. So it's only usable for certain types of problems.
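For a sense of what "everything is a discrete step" means, here is a minimal Python sketch of a discrete-event style simulation of jobs queuing for machines. Time jumps from one job start or finish to the next instead of advancing continuously. The job data and the function name are hypothetical, and real scheduling simulators model far more detail.

```python
import heapq


def simulate_line(jobs, machines=1):
    """Simulate jobs queuing for identical machines and return their finish times.

    jobs: list of (arrival_time, duration) tuples.
    Finish times are returned in arrival order.
    """
    free_at = [0] * machines  # min-heap of times at which each machine becomes free
    heapq.heapify(free_at)
    finish_times = []
    for arrival, duration in sorted(jobs):
        # The job starts when it has arrived AND the earliest machine is free
        start = max(arrival, heapq.heappop(free_at))
        finish = start + duration
        heapq.heappush(free_at, finish)
        finish_times.append(finish)
    return finish_times


jobs = [(0, 5), (1, 3), (2, 4)]
print(simulate_line(jobs, machines=2))  # normal operation
print(simulate_line(jobs, machines=1))  # curveball: one machine is down
```

A "lesson" such as equipment going down can be expressed by rerunning the same jobs with fewer machines, as in the last line.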

The third type, which we actually use a lot, is what's called a data-driven model. And so that's where we can take that data from the historical record. We can actually mix it with some traditional machine learning mechanisms to create a data-driven sim. It's going to have limitations. I'll tell you that right now. A first-principles sim has all the capability of everything we know about physics baked into it. Well, this won't. If it's never happened in the real world, it won't be in the historical record. And so, oftentimes, it gives us a great start. But then, we have to expand the state space. We've got to try things and do things.
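The limitation described here, that a data-driven sim only knows what the historical record contains, shows up clearly in a toy version. The Python sketch below builds a lookup-style simulator from hypothetical (state, action, next state) records. Averaging observed outcomes stands in for the machine learning model, which is a simplifying assumption.

```python
def build_data_driven_sim(history):
    """Build a step function from historical (state, action, next_state) records."""
    observed = {}
    for state, action, next_state in history:
        observed.setdefault((state, action), []).append(next_state)

    def step(state, action):
        """Predict the next state as the average of what was seen historically."""
        outcomes = observed.get((state, action))
        if outcomes is None:
            # Never happened in the real world, so the sim has nothing to say
            raise KeyError(f"no historical data for state={state}, action={action}")
        return sum(outcomes) / len(outcomes)

    return step


# Hypothetical Historian records: temperature, operator action, resulting temperature
history = [(100, "heat", 104), (100, "heat", 106), (100, "cool", 97)]
step = build_data_driven_sim(history)

print(step(100, "heat"))
try:
    step(150, "heat")
except KeyError as missing:
    print("outside the historical record:", missing)
```

Running deliberate experiments and recording the results adds new entries to the record, which is how the covered state space grows.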

And so with that glass manufacturer, we've got a great data-driven model to start with. But now we're doing what's called design of experiments. They have a research line, so we're able to do this. But in between production runs, we're having them do stuff that you would never actually do under normal circumstances. They're a bottle manufacturer. They make really weird-shaped bottles and things like that, and do things to expand that state space. We're recording all this in the Historian as it's going on, so we can add that data to the data-driven model and make an even more robust sim. But again, back to that guy's question at that trade show. We don't have to cover every single state. We can make some assumptions and bake in some of those assumptions, and the DRL can still learn very effectively from that.

Erik: Let me throw a situation by you. I'm interested in hearing your thoughts on whether this could or could not be addressed using this. So the situation is a turbine manufacturer making turbines for large hydro or other power plants, they have an issue occasionally where they'll deploy a turbine, and there'll be some fault. And it will break. Then it takes them literally three months and just hundreds of hours of work to figure out what the issue was. Because it's discrete manufacturing, they don't have 1,000 sensors through the plant. They're trying to figure out, is it because we had a different copper manufacturer and somehow there was a slight change in the chemistry? Or is it because we made a fault on a production line, or we were running hot on the furnace? What was this?

So basically, it's this combination of then looking through the historical machine data — they, of course, do have some machine data — but then also looking at the processes, who was working there. It's having a bunch of conversations. Then you come out of this with a big report. A bunch of PDF files are produced. And so we're looking at this and saying, well, a lot of companies have some kind of variation of this where you have this big, messy problem that happens occasionally. The machine data is imperfect. But then, you have these historical reports where you have a bunch of PDFs and a bunch of information that's either locked in a document, or it's locked in people's brains.

And so the question is, would it be possible to have the algorithm trained on the machine data but also reading through the PDFs of all the previous fault reports, and maybe the specifications, documentation, all of this unstructured text documentation? And maybe even interviewing the engineers and saying, "Hey, how do you think about this? What problems do you think it could be?" Basically, creating this kind of Copilot or knowledge database, which might then be able to say, "Next time a problem arises, I think there could be three potential root causes. You should check X, Y, and Z." This is a messier kind of root cause analysis, not a process manufacturing environment where you have great data at every stage in the process. Where are we today in terms of being able to bring insight into that situation?

Bryan: Yeah, so would it be a problem for autonomous AI to solve? The challenge with that is now you're getting into more knowledge problems. And so you mentioned about ChatGPT and generative AI. I believe it's early days for that. But it does seem like that's more of the right tool to solve those types of knowledge and knowledge graph type of problems. Autonomous AI, in particular, needs simulation. Everything you're describing, I'm trying to figure out what you would actually simulate.

Could it do all that? Yeah, but it's just not the right tool for the job, I don't think. You could simulate the turbine, for sure, and you could simulate failure models and things like that. But I don't know how you would simulate interviewing folks and things. Again, that seems more like the ChatGPT type of world, the language and knowledge type of problems, whereas this is more of an operational technology. So you're going to use it to solve, again, more operational types of problems.

Erik: Okay. Clear. Do you see value in integrating different APIs or algorithms? So let's say you have an autonomous AI system that is good for solving operational problems. But then, you also have some unstructured information, and maybe you want to have a more conversational user interface and so forth. So GPT might be more useful as the user interface. Basically, maybe you could feed the output of the autonomous AI into GPT and say, hey, here's the output of the system. And here's some additional information. What do you suggest we do based on this?

Bryan: Now you're on to it. That's exactly right. One of the aspects of machine teaching that I guess I forgot to mention is that, while we're laying out all these concepts (you actually lay them out in a workflow), we can leverage DRL at certain points, but we can also leverage other AI technologies.

For instance, we've got a customer that's using a vision system, and they're using a very traditional ML type of model to classify what it's seeing with that vision system. Then it just sends to the DRL a classification. It's this pattern. It's this pattern. It's this pattern. It's this pattern. Here's the five different patterns it could be. So now you're combining. Machine teaching allows you to combine those traditional AI technologies with the DRL to really get some superpowers. And so, absolutely, that is I think exactly what you said. I think that's really the future of autonomous AI. It's to start to incorporate some of those knowledge systems like ChatGPT into it and to make some of those initial decisions, and then just feed the decision to the DRL to learn from.

Erik: Okay. Got it. It makes sense. Yeah, this is something we're very excited about. Because here in China — I imagine it's similar in the US. You have this. Well, obviously, it's similar in the US right now — everybody's having trouble hiring skilled engineers, and so forth. And so the question is, how do you enable one person to do the job of two, or how do you enable somebody who's 27 years old to do the job of somebody that was 45 and knew the factory like the back of their hand? Obviously, we need new tools to do that.

Bryan: And to build on that real quick, just because it happened last week: I was at a customer's site, and I had a frank conversation with the plant manager. He's like, "We need autonomous AI on this line." He's like, "I'll tell you why. My two best operators, seasoned operators who have each been running this line for over a decade, are probably about five years from retirement." He's like, "I don't have anyone on the bench. I've got one- to two-year operators. I've got high turnover in that role." He said, "I don't know what else to do." He's like, "I have to have autonomous AI here." We've got basically five years to get it working (it won't take that long) and to get it refined to where he can put a one- to two-year person on it. Once we've baked in all of that expertise from those seasoned operators, he can put a one- or two-year operator on it who might quit in a couple of weeks, and still be able to produce at the volumes that he produces now.

So just to reinforce that point, absolutely, that's a huge problem. And so, like you said, there are two aspects right now in this AI space. There's the operational problem, which we think autonomous AI addresses really well. Then there's the knowledge problem, which, again, the ChatGPT world, the large language models and things, seems to be addressing.

Erik: Maybe we can quickly touch on some of the business topics around this. Maybe the first one would be the cost structure. What does the cost structure look like if you're training? Now, what percentage of that is the technology platform, the training process, the infrastructure, the system integration?

Bryan: Basically, it looks like this right now. This autonomous AI is still very much in its infancy. So a lot of aspects of it are still open source. But we've got vendors like SAS that we're working with (I don't know if you're familiar with SAS), and we're able to do some of the stuff on top of their platform. Then you've got Microsoft Azure ML. You've got AWS SageMaker. So we can incorporate some of these different technologies to solve the same problem. It's a methodology. It's an approach. It's not a product. And so we're taking these methodologies that were developed in the open source world and applying them using these products.

So what does it look like? There's a services piece to it. That's the RoviSys side of it. So we're going to spend hours sitting down with your experts, and we're going to learn every aspect that we can about how they run the line. Or in the case of production scheduling, we'll sit down with your expert schedulers and say, "How do you handle when line three is down all day? How do you still build a schedule?" So we're going to learn all that. We're going to bake it in with machine teaching. Then we're going to train it typically with the cloud.

The simulation, I guess I should address that, too. There's a whole aspect of either leveraging existing simulators or building one. The production scheduling customer was a paint manufacturer. They had actually already built a discrete event simulator. They were using it for making CapEx types of decisions, like, should we add more equipment? Would that increase our throughput in the way we think it would? So they'd already built this discrete event simulator, and then we leveraged it to train the brain on for production scheduling. We've got to make sure that there's simulation. We may have to build a DDM, a data-driven model, which we are doing for a lot of customers right now. So we've got to do all that upfront work. Then we do the machine teaching. Then when we get to the training part, we typically leverage the cloud. Because it is going to scale out. I mean, it may run up to 100 of these simulations at the same time, and it's running very, very fast. So we do typically need cloud types of scale for that.

In terms of cost, up to this point, you've paid us for the services. But then, you've paid some cloud cost. I mean, not huge cloud costs for compute, up to this point. Then once you've got the trained brain — that's what we call it; it's a brain — that gets exported. That brain now is actually just a trained neural network. It's no more exotic than that. It runs on commodity hardware. It can run on Intel chips, whatever. And so once it's exported from the cloud, it can run anywhere. It can run on-prem. It does not need to be connected to the internet anymore. All the training is done. All the knowledge is now baked into it. And so it gets deployed then. So then, the last place where you get into cost is the deployment story.

We've got customers that want to do what's called decision support systems, which is typically where we start and we recommend they start. So that's where we're going to build a web interface on this brain. It's going to sit on the plant floor next to the operator's control screen. It's going to monitor the state of the control system. It's going to give suggestions on, okay, I would recommend turning this dial a little bit more. I'd recommend turning this dial down and this slider up. It's going to make those recommendations. Now, whether or not the operator does it or not, it can monitor that. But then if the state changes, it'll say, okay, you didn't listen to me. But that's okay. Now, here's what I would do. Here's the next optimal thing I would do. But then, we've got some of our customers, the refining customer is looking to move this way, where we move from decision support to direct control. Now we've actually wired it into the control system, and it's sending commands directly to the control system.

Now, that sounds scary. But one thing to keep in mind is it can't do anything that the human operator couldn't do. Every control system we've ever put in has certain interlocks and certain parameters and boundaries on what a human operator can even do. You can only move this slider so far, and then they'll block you. We can actually restrict the parameters even more if we wanted to on one of these brains. The same types of controls and safeties that we put in for human operators, all of those are still in place for the brain. So you got some costs there in terms of deployment. But they're not cheap projects. I don't want to give that impression, because there's a lot of consulting. There's a lot of experimentation. There's some dead ends. They typically run, I would say, anywhere between 8 and 12 months. It's typically what we shoot for for one of these projects. But like I said, they're not necessarily inexpensive projects. But also, we're only targeting really big ROIs. We're looking for the big, big ROIs.
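Editor's note: the point about interlocks can be illustrated with a minimal sketch. It assumes each setpoint has a simple numeric range; the names `Limits`, `tighten`, and `safe_command`, and the numbers in the example, are hypothetical and do not come from RoviSys or any particular control system. The idea shown is only that the brain's allowed range is intersected with the operator's range, so a recommended value can never land outside what a human could do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Allowed range for one setpoint (hypothetical units)."""
    low: float
    high: float

    def clamp(self, value: float) -> float:
        # Keep the value inside [low, high].
        return max(self.low, min(self.high, value))


def tighten(operator: Limits, brain: Limits) -> Limits:
    """Intersect the brain's range with the operator's range.

    The result is never wider than what a human operator is allowed.
    """
    low = max(operator.low, brain.low)
    high = min(operator.high, brain.high)
    if low > high:
        raise ValueError("brain limits do not overlap operator limits")
    return Limits(low, high)


def safe_command(recommended: float, operator: Limits, brain: Limits) -> float:
    """Return the setpoint that would actually be sent to the control system."""
    return tighten(operator, brain).clamp(recommended)


if __name__ == "__main__":
    operator_limits = Limits(low=0.0, high=100.0)  # what the operator screen allows
    brain_limits = Limits(low=20.0, high=80.0)     # stricter range chosen for the brain
    print(safe_command(95.0, operator_limits, brain_limits))  # 80.0
    print(safe_command(50.0, operator_limits, brain_limits))  # 50.0
```

In a real deployment the interlocks live in the control system itself; a check like this would only be an extra layer in front of it.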

I was talking to a customer, and he estimated that they create about a million dollars in waste every year. To me, that actually sounds like a line that runs pretty well. I've talked to a lot of customers whose waste numbers are in the multimillions or tens of millions. But we're looking to reduce that waste by 50%. If we can do that, that's big savings every year. So we're going after those big, in some cases, million or multimillion-dollar problems. We're trying to move the needle 1% to 3% in a lot of processes. That's a lot of money.

Erik: Well, exactly. I wanted to poke at this topic of ROI next. I guess you probably have two different high-level scenarios. One is where it's a pure financial ROI and saying, okay, we can save 500,000, or 2 million, or whatever that is per year if we make these changes. Then you calculate the payback period and make a decision based on that. Then the second is where a company says, "I want to reshore because I need to have X percent of production in the US in order to be able to access a certain customer base or whatever that might be, or to de-risk my supply chain. I just can't find the people. And so I need to be able to do this. Because otherwise, I'm not going to be able to actually produce the output that the market wants."

Right now, what do you see driving it? Is it more of a pure financial decision? Is it more of a strategic decision by companies, saying we need this in order to be able to have the position in the market that we want to have?

Bryan: Yeah, I would definitely say most of my customers right now are making the decision based on financial reasons. But we've got customers, like I said, the one I just talked to you last week, where we talked through the ROI. Financial was there, but it was kind of borderline. But it didn't matter. He was like, "I don't care. I can't run this line without these expert operators." Like you said, there's not the skills. There's not the labor. There's not the talent coming in. So I've got to do this if I want to have a factory at all, if I want to have this, if I want to be able to produce anything at all. So for him, it was almost existential. He had to do it.

I think that there's probably going to be more of that. But yes, right now, still, a lot of our customers make the decision based on dollars and cents. One thing I always try to emphasize with them is: this is new, unexplored territory. Well, DRL has been around. Again, manufacturing is always five to seven years behind. For us, this is pretty bleeding edge. And so it's expensive right now. Eventually, it'll become a commodity, and it'll probably be baked into the software packages. You can just flip a few switches and get it for free. But right now, it's not. There's a lot of customization. There's a lot of experimentation and a lot of problems that no one has ever even tried to solve before. Because you'd always just have a human do it. Those are the types of problems we're going after. So right now we are looking for a bigger ROI, until it becomes more of a commodity type of thing. And there'll be more products and things like that that'll come out to help support us over time, too. I'm confident in that.

Erik: What do you find in the market today? What do you find to be a reasonable payback period that makes sense for your customers?

Bryan: Well, in my 20-plus years of experience in manufacturing, it feels like the payback periods have gotten tighter and tighter. I mean, it seems like every single project now, they want a 12-month payback period. At least, one month break even. So that's pretty much it. So it's the same with this. They want to be able to see results within a year, for better or for worse. It drives us to be better from that standpoint.
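Editor's note: the payback arithmetic discussed here can be written out as a small sketch. The waste figure (about $1 million a year) and the 50% reduction target come from the conversation above; the project cost of $400,000 is a purely hypothetical number chosen for illustration, and the function names are ours.

```python
def annual_savings(annual_waste: float, reduction: float) -> float:
    """Savings per year if waste is cut by the given fraction (0.5 = 50%)."""
    return annual_waste * reduction


def payback_months(project_cost: float, savings_per_year: float) -> float:
    """Months until cumulative savings equal the project cost."""
    if savings_per_year <= 0:
        raise ValueError("savings_per_year must be positive")
    return 12.0 * project_cost / savings_per_year


def max_cost_for_payback(savings_per_year: float, target_months: float = 12.0) -> float:
    """Largest project cost that still meets the target payback period."""
    return savings_per_year * target_months / 12.0


if __name__ == "__main__":
    savings = annual_savings(annual_waste=1_000_000, reduction=0.5)
    print(f"Annual savings: ${savings:,.0f}")
    # Hypothetical project cost, not a figure from the interview.
    print(f"Payback: {payback_months(400_000, savings):.1f} months")
    print(f"Budget for a 12-month payback: ${max_cost_for_payback(savings):,.0f}")
```

Under these assumptions, a 12-month payback requirement caps the total project budget at the first year's savings, which is why the projects target large waste numbers.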

Erik: Got it. Okay. A few other topics I wanted to touch on real quickly just to see what's interesting here. So one of those, we already touched on cloud. I understand the training needs to be done in the cloud. What is the receptivity today of your customers to actually run the operational AI processes on the cloud? Does everybody still want on-premise? Are they willing to accept private cloud or even public cloud?

Bryan: Yeah, so it's a question I get a lot. If we look at the total operational picture for a typical manufacturer and what their risk level is for cloud, obviously, data collection, kind of offline data collection and offline data analysis, everyone is very comfortable with putting out in the cloud. With just the volume, they typically want cloud storage because it's cheaper than putting it on prem. So that's very doable. If you go a level down from that, to these manufacturing execution systems that I talked about, a plant can typically go without communication to an MES for a couple of hours, if everything's designed well, and still produce. So we've got a lot of customers that are very seriously considering putting their MES in the cloud. A level lower than that are Historians, these time series databases. I've got a lot of customers that are willing to put that in the cloud because they can buffer the data if they were to lose connectivity. But for anything that would be system critical and would actually affect the operation, where they couldn't produce any more, I don't have any customers that are willing to put those types of systems in the cloud.
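Editor's note: buffering historian data through a connectivity loss is often called store-and-forward. The sketch below is a minimal illustration under stated assumptions, not how any specific Historian product works: the `StoreAndForward` class and the `send` callback (which returns `True` when the cloud accepted a sample) are hypothetical names, and the buffer is held in memory only.

```python
from collections import deque
from typing import Callable, Deque, Tuple

Sample = Tuple[float, str, float]  # (timestamp, tag name, value)


class StoreAndForward:
    """Buffer samples locally and forward them in order when the link is up."""

    def __init__(self, send: Callable[[Sample], bool], max_buffered: int = 10_000):
        self._send = send
        # When the buffer is full, deque(maxlen=...) silently drops the oldest sample.
        self._buffer: Deque[Sample] = deque(maxlen=max_buffered)

    def record(self, sample: Sample) -> None:
        """Queue a new sample, then try to drain the queue."""
        self._buffer.append(sample)
        self.flush()

    def flush(self) -> int:
        """Send buffered samples oldest-first; stop at the first failure."""
        sent = 0
        while self._buffer:
            if not self._send(self._buffer[0]):
                break  # link is down; keep the sample for the next attempt
            self._buffer.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._buffer)


if __name__ == "__main__":
    received = []
    link = {"up": False}  # simulated connectivity flag

    def fake_send(sample: Sample) -> bool:
        if link["up"]:
            received.append(sample)
        return link["up"]

    buf = StoreAndForward(fake_send)
    buf.record((1.0, "furnace_temp", 812.5))  # link down: buffered
    buf.record((2.0, "furnace_temp", 813.1))  # link down: buffered
    link["up"] = True
    buf.record((3.0, "furnace_temp", 812.9))  # link up: all three forwarded
    print(len(received), buf.pending)
```

A production system would normally persist the buffer to disk so that data also survives a restart during the outage.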

So to answer your question about AI and operational AI in the cloud, it depends. I've got customers that want to do all the number crunching and make some decisions maybe about better set points and things like that, and then be able to feed those set points back down to the plant floor. That's all a pretty offline, asynchronous type of process. No big deal with that in the cloud. I've got other customers. I've got a drywall manufacturer that has trained a model to detect the quality of the drywall hours before it would typically go off to the quality lab, which is really cool. Basically, in real time, detect the quality of the drywall. That needs to be done in line. And so they do all the training of the ML model in the cloud. But then, they deploy it on the edge. So the answer is it depends. Always, we advise our customers in any given scenario what's appropriate for them based on risk and things like that.

Erik: Got you. Then one other tech topic, just to poke at here, is this question of 5G. I get a lot of people asking me, how do we use 5G in industrial? So the question is, are there any use cases in industrial AI where you'd say, yes, 5G would actually enable the use case where otherwise it's going to be difficult to deploy? Is there anything like this? Or is it always, no, but you can use Wi-Fi 6 or whatever?

Bryan: Right. So I'm not a 5G expert. We've got a whole industrial networking division, like I said. They talk all day about 5G. My understanding, though: before I was in this role, I was solving problems. I did a lot of work that, I don't know why, just happened to be in our metals and mining business. And so, obviously, one of the big problems if you're a steel producer or whatever is that everything's metal. Wi-Fi is almost a non-starter in a lot of these places. My understanding, from some of the early experimentation and early proofs of concept that have been done, is that 5G can penetrate in some of those difficult environments where we weren't able to use Wi-Fi in the past. So I'm excited about that.

Honestly, it's still going to be the same types of risk analysis type of questions. Like, okay, what happens if you lose the 5G, if you lose connectivity? You still have to be able to make decisions. For us, we're always talking about what's the critical path, what's critical to operations. If this ML model, if these AI decisions need to be made in order for you to produce product, and as AI becomes more and more ubiquitous, it becomes more and more a part of the day-to-day critical workflow, critical path, then it can't be reliant on any kind of connectivity technology.

Erik: Got you. Okay. Great. Bryan, I really enjoyed the conversation. Maybe just the last question then is, what's exciting to you about the future, either directly at RoviSys or maybe more indirectly just what in the market? What's exciting for you right now?

Bryan: AI has had a lot of renaissances, and this feels like another one. So I'm really excited about what I'm working on right now with this autonomous AI stuff. If any of your listeners want to learn more about it, there's a book by Kence Anderson. He kind of wrote the book on it. He consults with RoviSys. Designing Autonomous AI, you can find it at O'Reilly. I think that's really exciting. Then I think that the LLM stuff, the ChatGPT stuff, the Copilot stuff that Microsoft is about to come out with in Microsoft Office, I think it's really exciting. That's going to blow a lot of people's minds: that ability to leverage ChatGPT type of technologies to write a Word document, or write an Outlook email, or give them a starting point for a PowerPoint presentation. That's really going to be exciting.

Erik: Are you already deploying GPT for any customers, or when do you expect to actually find use cases in the manufacturing environment for that?

Bryan: That second part of the question is exactly the problem right now. So it may be our own lack of imagination. But we are struggling to figure out where ChatGPT can apply in our world, in this manufacturing world. The problems on the plant floor typically — now, you mentioned a really interesting problem around trying to figure out a failure in a turbine. But typically, the problems that we face on the plant floor are not necessarily knowledge graph type of problems. It's more operation type of problems. And so we're looking. We'll continue to look and we'll continue to monitor. I'm always watching the development of these technologies to see what can be applicable in our world. And so, hopefully, there'll be some things we can do there.

I had an interesting conversation at a trade show recently with a gentleman. We talked about the application of GPT. He said, "Well, ChatGPT will basically act as, or take on, a doctor role. So what if it acted like an expert maintenance person? You could ask it, what do you think is the problem right here with this particular piece of equipment?" Similar to what you had said. And it could come back and say, well, these are the five different potential things. Yeah, maybe. I mean, that's an interesting application. But it needs a lot of domain knowledge that probably wasn't in the data that it was trained on. So you're going to have to augment that data source.

Erik: Yeah, cool. Well, it'll be exciting to see what comes out over the next couple of years here, I guess, for both of us.

Bryan: Definitely.

Erik: Bryan, thanks. I really appreciate your time today.

Bryan: Yeah, thank you, Erik. This is great.
