Ep. 108
Optimize deep learning for the edge
Yonatan Geifman, Co-Founder & CEO, Deci
Tuesday, December 07, 2021

In this episode, we discuss new technologies that enable AI developers to build, optimize, and deploy faster and more accurate models for any environment, from the cloud to the edge. We also explore the development processes that commonly lead to long delays in deployment and unexpected costs when migrating from a development environment to an operating environment.

Our guest today is Yonatan Geifman, Co-Founder and CEO of Deci. Deci provides a proprietary optimization technology for deep learning practitioners that enables you to accelerate deep neural network inference on any hardware while preserving accuracy.

IoT ONE is an IoT focused research and advisory firm. We provide research to enable you to grow in the digital age. Our services include market research, competitor information, customer research, market entry, partner scouting, and innovation programs. For more information, please visit iotone.com

Transcript.

Erik: Welcome to the Industrial IoT Spotlight, your number one spot for insight from industrial IoT thought leaders who are transforming businesses today with your host, Erik Walenza.

Welcome back to the Industrial IoT Spotlight podcast. I'm your host, Erik Walenza, CEO of IoT ONE, the consultancy that specializes in supporting digital transformation of operations and businesses. Our guest today is Yonatan Geifman, cofounder and CEO of Deci. Deci provides a proprietary optimization technology for deep learning practitioners that enables you to accelerate deep neural network inference on any hardware while preserving accuracy.

In this talk, we discuss new technologies that enable AI developers to build, optimize, and deploy faster and more accurate models for any environment, from the cloud to the edge. We also explore the development processes that commonly lead to long delays in deployment and unexpected costs when migrating from a development environment to an operating environment.

If you find these conversations valuable, please leave us a comment and a five-star review. And if you'd like to share your company's story or recommend a speaker, please email us at team@IoTone.com. Finally, if you have an IoT research, strategy, or training initiative that you would like to discuss, you can email me directly at erik.walenza@IoTone.com. Thank you.

Yonatan, thank you for taking the time to speak with me today.

Yonatan: Sure. Thank you for having me. Nice to meet you.

Erik: So I'm looking forward to the conversation here. I think we're probably going to go into some topics where I'm maybe struggling to keep up with you. But first, maybe you can give us a bit of background on yourself and how you ended up founding Deci. I know you have a Ph.D. Did this idea originate in your studies there? What's the origin story behind the company?

Yonatan: Two years ago, I completed my Ph.D. in computer science, focused on deep learning technology. During my Ph.D., I studied and researched ways of making deep learning technology more applicable for real-life applications, and specifically worked on the problem of uncertainty estimation, which is: how can you estimate the confidence or the uncertainty of a machine learning or deep learning model in its predictions?

Based on that research, we explored the barriers to taking AI to production in industry, or to using AI in various types of applications. One of them is what I focused on in my Ph.D. Another area that we found very interesting is the computational complexity of those algorithms, and how we can make them production ready in the performance aspects, so they can be easily deployed in any environment in a scalable fashion.

So after completing my Ph.D., together with my Ph.D. advisor, Professor [inaudible 03:29], and another cofounder more focused on the business side, Jonathan Elial, we started Deci on a mission to break the AI barrier and make deep learning technology accessible and deployable in any environment, anywhere, at any scale. And we're doing it by automating the process of finding the structure of the model, or developing the model in an autonomous fashion, where performance is one of the key differentiators that we believe has to be taken into account in the early days of development. But maybe first, let's understand a little bit more about how the deep learning development lifecycle looks today.

Today, it starts with a data scientist or a deep learning developer who is working in a lab and has to solve some business problem or product problem. For example, detecting some objects. They explore some academic papers and some GitHub repositories and find a model that is good enough to solve the problem. And after he or she solves the problem, they find out that it's not so easy to take that model from the lab to production, for many, many reasons. One of them is that about half of the algorithms are running at the edge. And at the edge there is limited computational power, and the compute constraints have to be taken into account in the design stage of the algorithm.

And what Deci is, is a development platform that enables data scientists to build algorithms that already take into account all the production constraints and characteristics during the development cycle, and by that shorten the endless iterations of making the algorithm production ready to be deployed anywhere.

Erik: So maybe a more traditional approach might be that you develop the algorithm on the cloud using more or less unlimited resources, you find a model that basically works in that environment, and then you try to deploy it in the real world, you hit these constraints, and then you have to rethink how the algorithm is built, and that drives a lot of time and effort. Is that kind of the problem that you're solving, avoiding that rework?

Yonatan: Yeah. Actually, this comes with endless iterations between the development environment in the cloud and the production environment at the edge, trying to build something in the cloud that will fit the edge device that you want to deploy the algorithm on. And it simply takes a lot of time, because each iteration requires training a new model, which takes a lot of time and a lot of resources.

Erik: And so here, we're talking about solving problems that are deployed at the edge. So I guess we're probably not talking about FinTech or ecommerce problems. We're talking about what, manufacturing? Is this energy, utilities? What would be the key industries where your customers are coming from?

Yonatan: Yes, absolutely: manufacturing, smart city applications, autonomous vehicles. Most of the AI algorithms that are deployed at the edge are related to deep learning. We also have healthcare-related use cases that run computer vision at the edge. And many, many other industries and use cases: mobile applications, video analytics solutions, and many, many more.

Erik: So you mentioned already a couple of use cases. You mentioned video and machine vision, which are, I guess, highly related, so that's obviously a huge class of use cases, anything related to vision. To what extent does it make sense to build a general purpose platform versus maybe a platform specialized around a particular use case? From a technical standpoint, is there a benefit in saying this platform is optimized for machine vision, for example? Or are the underlying quantitative challenges similar enough between different types of problems that it doesn't actually make sense to focus on a particular type of use case?

Yonatan: I think that there is a range, and each and every solution needs to define its sweet spot between how easily the platform can be customized for various use cases, and on the other hand, how easy it is to use. If you try to build kind of the dream of the AutoML and [inaudible 08:31] platform, you will probably have to start with something very narrow in terms of the use cases that can be supported on that platform.

But if you're looking at something that is code-based, that is an SDK, it can be easily extended into more use cases, and even let your developers or your users build upon your tools in order to build new use cases and expand from your initial target use cases. And this is exactly what we're doing at Deci. We're providing developer tools that our users and customers can use to build any type of neural network, any type of use case on top of. What we do focus on is deep learning technology. So we are working on neural networks, whether for computer vision or NLP. Maybe in the future, we'll have some audio processing use cases. But this is roughly all the areas where deep learning dominates today.

Erik: So if we can consider that you're taking this, call it an SDK approach, where people are building custom solutions on top of your platform, the users then, they would be data scientists? Or what type of technical background would a user need to have in order to use it? You could look at this first for building the algorithm, and then second for maybe optimizing the algorithm once it's in deployment, but what would be the typical profiles for users?

Yonatan: Yeah, so we're providing tools for the data scientist. At the moment, we are not trying to reach new audiences, like making any software engineer able to build deep learning solutions. We're targeting the data scientists and helping them build better solutions than what they can achieve with the standard tools that they're using today. In terms of the personas, or the people that are using our platform, it starts with the data scientist. After they have a data set, and they want to build a model, they use our tools to build models based on one of the models that we provide. On our platform, we call those models DeciNets. DeciNets are models that are known for their good performance, both in terms of accuracy and latency. For each hardware and task, we have a different DeciNet.

And the second persona is the machine learning engineer, or in some use cases even the DevOps, who has to take these algorithms into production. Sometimes the machine learning engineer approaches us and asks, okay, how can I make this algorithm run on this specific device at this or that latency? And sometimes the data scientists understand that their work is not only to provide the best accuracy, but also to provide a model that is production ready, in the sense that it is feasible to run on the device that the company is trying to deploy it on.

Erik: There's another problem that actually we're bumping our head against right now. It's the problem of scalability. So let's say you develop an algorithm for a particular motor on a production line to determine when it might break down, and that algorithm works; it's production ready for that motor. But then, of course, you've put in a lot of effort to build that algorithm, you want to now scale this to 20 other production lines, and then you also want to scale it to other factories, and maybe the other production lines have sensors from different manufacturers.

So the question then is, to what extent is that algorithm that you built scalable to similar situations that are maybe not 100% identical in terms of the data inputs or the environmental conditions? Because if you can't cut and paste that algorithm, the R&D cost is too expensive to replicate. First, is this one of the value propositions that you're bringing? And then do you have thoughts on how to approach this problem of scalability of an algorithm across multiple assets in maybe similar but not identical environments?

Yonatan: I totally agree with you about the problem of scaling those algorithms to various types of use cases with different data sets. Some of our customers, for example in the retail space, have to train a different model for each and every customer they deploy on. And based on that, one of the properties that is very important for them is, let's call it, the robustness of the model and the training process to different types of data and data sets.

And this is something we pay very close attention to: that the algorithms that automatically design the model do not converge to some very, very specific point where the model suits only one use case, and instead produce general purpose models that are good for many types of data sets and many types of problems.

And I think that it's a very important property of a model to be able to generalize well to different types of data sets and tasks. Because if that doesn't happen, it affects the lifetime of the algorithm, because it will break with the first distributional shift or data set change that occurs in the production environment. And we need the ability to retrain models and adapt them, as an ongoing process, to their production environment and the changes that we see in the use case and in the product.

Erik: So then you could optimize the perfect algorithm for a particular situation, but it's better to have an algorithm that's maybe not perfect for that situation, but that's more adaptable to similar situations? I guess a good example here is Zillow, which is certainly cloud-based. But are you familiar with the company Zillow, the real estate business?

Yonatan: Yeah, absolutely.

Erik: So they just killed their business. And it seems that their algorithm was optimized for market conditions that have changed very much this year. And it couldn't adapt to the new market conditions. Do you have any thoughts on why all of a sudden they closed down this business and why they were unable to adapt to the market conditions?

Yonatan: I don't want to try to predict what happened in that particular use case at Zillow. But generally speaking, I believe that problems like this come from the world of distributional shift, where algorithms are trained at some point and deployed to production. And we need to have the tools and the methods to track their performance in production, and to understand when something has changed and the algorithm needs to be retrained or rebuilt from scratch, or something like that.

And this is a phenomenon where the lifetime of algorithms that are running in production is sometimes limited, and they have to be monitored all the time in order to catch those changes in the market or the environment that cause the algorithms to provide poor performance; then they need to be retrained on more up-to-date data sets, or maybe replaced.
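A common way to operationalize the monitoring Yonatan describes is to compare a recent window of production data, or of the model's confidence scores, against a reference window captured around deployment time. The sketch below is a minimal, generic illustration of that idea using a two-sample Kolmogorov-Smirnov test; the window sizes, threshold, and scores are invented for the example and are not part of Deci's tooling.

```python
# Minimal drift-detection sketch (illustrative only; not Deci's tooling).
# It compares a recent window of confidence scores against a reference
# window using a two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp

def drift_detected(reference: np.ndarray, recent: np.ndarray,
                   p_threshold: float = 0.01) -> bool:
    """Return True if the recent distribution looks different from the
    reference distribution (low KS-test p-value)."""
    statistic, p_value = ks_2samp(reference, recent)
    return p_value < p_threshold

# Example with synthetic numbers: "healthy" confidences at deployment time
# versus a degraded recent window.
rng = np.random.default_rng(0)
reference_scores = rng.normal(loc=0.85, scale=0.05, size=5_000)
recent_scores = rng.normal(loc=0.65, scale=0.10, size=1_000)

if drift_detected(reference_scores, recent_scores):
    print("Possible distributional shift: consider retraining or replacing the model.")
```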

Erik: But before we get more into the technical details, on the business side, I'm actually very curious: what are the KPIs that you are presenting to your customers when you're making the case that your solution is the best solution? Is it reducing the time to deployment? Is it reducing the number of data science hours? What's the mindset of the buyer here in terms of the aspects of production that they're trying to optimize around from a business standpoint?

Yonatan: So we're selling performance. When I'm saying performance, I'm referring to the accuracy of models. On our platform, data scientists can build models that are more accurate; the other side of performance is models that run faster, so they are more scalable or can run in real time. Those are kind of the KPIs that we usually engage with customers on.

Obviously, getting to those performance levels is something that would take an experienced data science team much more time to develop, and sometimes they will not be able to reach that level of performance at all. So yes, one business value proposition is the time to production. But it comes from a performance standpoint: there is a gap between what the data scientist or the data science team has been able to do by the point where we start the engagement, and what is needed in order to get to production.

Usually, it comes with a very tight deadline for going to production in the next version. And we show them how they can build on our platform similar algorithms, or optimize their existing algorithms, and get better accuracy or better latency in order to make the models production ready.

Erik: Data scientists are, I would say, right now, a very constrained resource. I don't know of many companies aside from the Internet giants and maybe some well-bankrolled startups who are rich in data scientists. Everybody seems to have a small team, a couple of people that have way more workload than they can handle, and then you deploy something. And then of course, you also have to maintain it. So then you're both building new products and maintaining existing products.

You can look at this regionally. You can look at this globally. Maybe if we just look at it globally, do you see us moving towards a world in the next few years where the supply and demand of data scientists is more equalized? Or do you think that we're going to be in a situation where demand greatly outpaces the supply of data scientists for the foreseeable future? Because for me, it seems our clients all have a huge constraint there, with a lot more ideas than they have talent to execute on them.

Yonatan: So I think that this problem relates to two sub-problems. One of them is the total number of data scientists that we have in the world. And the second is the productivity of the current data scientists. I think that we'll have many more data scientists, but they will definitely be less experienced than those we have today, or than those already doing data science will be in two years. The average experience of the data science community is getting lower and lower because new data scientists are entering the field in masses every year.

On the other hand, we see a lot of tools and platforms that try to help data scientists reach their goal in a shorter time. So I think that data scientists will be much more productive, and also the proficiency needed from data scientists to solve, let's say, 80% of the problems will be reduced significantly with those tools. So I see two vectors that are working together to get us to a better standpoint in terms of the need for data scientists that we see in industry today.

So yes, I believe that in two years, we'll have better tools for data scientists, and they will be more productive in that sense. Also, the proficiency gap will be reduced a little bit because of the tools that will automate and standardize some of the work of the data scientist. And the second half of it is that we'll see more and more data scientists, as a lot of people are now going to universities and courses to study data science in order to enter the field, because they see it as a very promising field that is exploding at the moment.

Erik: Let's dive in now to the platform, the Deci platform. Maybe we can start just from a higher level. What is the work to be done? So I like how you've broken this down here into build, optimize, and deploy. And I think some of our listeners are quite familiar with the process of building algorithms. Some are probably not familiar at all. So it might be a useful starting point just to talk through the process of building an algorithm and what are the challenges along that process?

Yonatan: Sure. So as you mentioned, the three components of the platform are build, optimize, and deploy. Let's start with build. Build is basically an SDK for data scientists to build and train deep learning models on premise, based on well-known neural architectures and model templates, or based on the starting points for algorithms that we call DeciNets. It enables data scientists to build faster and streamlines the whole process of training the model and selecting the hyperparameters and the neural architecture. And it can be used as soon as they have an initial data set to start working on the problem.

So a data scientist who has a data set and wants to start experimenting, to try and find a model that gets to the accuracy level or the performance level that is needed, should definitely start exploring with the build component of our platform. Once we get to a model that reaches levels of accuracy that we can think about taking to production, we move to the optimize stage, which is the preproduction of the machine learning model. In this stage, we apply techniques such as compilation and quantization in order to squeeze the algorithm and package it to be ready to go to production.
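As a concrete, generic illustration of the kind of step described here (the conversation does not detail Deci's actual optimization pipeline), post-training dynamic quantization plus an export for a downstream runtime or compiler might look roughly like this in stock PyTorch:

```python
# Illustrative post-training quantization using only stock PyTorch.
# This stands in for the "compilation and quantization" step described above;
# it is not Deci's proprietary pipeline.
import torch
import torch.nn as nn

model = nn.Sequential(          # stand-in for a trained model
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)
model.eval()

# Dynamic quantization stores Linear weights as int8 and quantizes
# activations on the fly, typically shrinking the model and speeding up
# CPU inference at a small accuracy cost.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
_ = quantized(torch.randn(1, 512))   # sanity-check forward pass

# Export the float model to ONNX so a runtime or compiler on the target
# device can consume and further optimize it.
torch.onnx.export(model, torch.randn(1, 512), "model.onnx", opset_version=13)
```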

Which leads us to the last stage, which is deploy, where we have two tools: one of them is SDK based, and the second one is containerized, to take the model and do the serving in the production environment. These SDKs are usually used by the machine learning engineer in order to take the models that the data scientists have built and run them in the production environment, either connecting them to the application or deploying them as a microservice in the cloud.
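The containerized serving option can be pictured with a minimal sketch like the one below; FastAPI, the endpoint name, and the toy model are illustrative assumptions, not the actual Deci deployment SDK.

```python
# Minimal model-serving microservice sketch (illustrative; FastAPI is an
# arbitrary choice and this is not the Deci deployment SDK).
from typing import List

import torch
import torch.nn as nn
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = nn.Sequential(nn.Linear(4, 2))   # stand-in for a trained, optimized model
model.eval()

class PredictRequest(BaseModel):
    features: List[float]

@app.post("/predict")
def predict(req: PredictRequest):
    with torch.no_grad():
        logits = model(torch.tensor([req.features]))
    return {"scores": logits.squeeze(0).tolist()}

# Run with: uvicorn serve:app --host 0.0.0.0 --port 8080  (assuming this file
# is serve.py), then wrap it in a standard container image for production.
```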

Erik: I guess this depends on the algorithm. But what's the typical lower limit for processing power, or maybe memory on an edge computer in order to be able to run one of your algorithms?

Yonatan: So it really depends on the use case. If you want to run computer vision on video, it's not similar to running some NLP applications. So it really depends on the use case and the model that is needed in order to solve it. The interesting fact is that we are providing a set of algorithms that can be used on our platform, ranging from very lightweight algorithms that can run on almost any hardware.

It could be an i5 CPU running in real time, up to more complex algorithms that reach very good accuracy and run on more advanced AI-dedicated hardware. So we can consider it as something that we call an efficient frontier, a tradeoff between accuracy and latency, which is kind of a curve on which you can choose the sweet spot defined by the hardware that you're trying to use in production and the level of accuracy that you're interested in getting.
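One way to picture that efficient frontier is as a list of candidate models, each with a measured accuracy and latency, from which you pick the most accurate one that still fits your latency budget. The candidates and numbers below are invented purely for illustration.

```python
# Picking a point on an accuracy/latency frontier (numbers are made up).
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Candidate:
    name: str
    accuracy: float      # e.g. top-1 accuracy on a validation set
    latency_ms: float    # measured on the target hardware

candidates = [
    Candidate("tiny",   accuracy=0.78, latency_ms=4.0),
    Candidate("small",  accuracy=0.84, latency_ms=9.0),
    Candidate("medium", accuracy=0.88, latency_ms=21.0),
    Candidate("large",  accuracy=0.90, latency_ms=55.0),
]

def pick(models: List[Candidate], latency_budget_ms: float) -> Optional[Candidate]:
    """Most accurate candidate whose measured latency fits the budget."""
    feasible = [c for c in models if c.latency_ms <= latency_budget_ms]
    return max(feasible, key=lambda c: c.accuracy) if feasible else None

print(pick(candidates, latency_budget_ms=25.0))   # -> the "medium" candidate
```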

To take a specific example, we can think about the virtual background feature in videoconferencing. I believe that everyone is familiar with that feature, which we can see in Zoom and Microsoft Teams and all those applications that blur the background or replace it with an image. In that use case, a computer vision algorithm called semantic segmentation has to run in real time on the device, on the laptop of the user. And in order to do that, you need the algorithm to be very fast. One of the ways to make it run faster is to reduce the image resolution and work on small images, but then the details of the segmentation won't be good and the results will not be so good.

And what Deci is doing with our customer in that use case is accelerating the model so that it is able to run at the highest resolution with the highest accuracy, at a real-time performance of 30 frames per second, in order to give the real-time experience of that virtual background feature while also gaining the best accuracy. As I mentioned, there is a tradeoff between running fast and running accurately. And Deci helps explore that tradeoff and choose the best sweet spot available, one that is significantly better than any off-the-shelf algorithm that can be taken from an academic paper or a GitHub repository.
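Real time at 30 frames per second leaves roughly 1000 / 30, about 33 ms, per frame for the whole pipeline. A rough way to check whether a given model fits that budget on a given machine is a simple timing loop like the one below; the placeholder network stands in for whatever segmentation model is actually used.

```python
# Rough latency check against a ~33 ms/frame (30 FPS) budget.
# The tiny network below is a placeholder, not the customer's actual model.
import time
import torch
import torch.nn as nn

model = nn.Sequential(                       # placeholder "segmentation" network
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 2, 3, padding=1),
)
model.eval()

frame = torch.randn(1, 3, 720, 1280)         # one 720p frame
budget_ms = 1000.0 / 30                      # ~33.3 ms per frame for 30 FPS

with torch.no_grad():
    for _ in range(5):                       # warm-up iterations
        model(frame)
    runs = 50
    start = time.perf_counter()
    for _ in range(runs):
        model(frame)
    avg_ms = (time.perf_counter() - start) / runs * 1000

status = "meets" if avg_ms <= budget_ms else "misses"
print(f"avg {avg_ms:.1f} ms/frame vs budget {budget_ms:.1f} ms ({status} 30 FPS)")
```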

Erik: And then your core technology, I guess, you're probably using a lot of open source solutions, and then you have this Automated Neural Architecture Construction, AutoNAC, technology. Can you talk a little bit about how you make this happen? What does the underlying tech stack look like for Deci?

Yonatan: Yeah, absolutely. AutoNAC is an algorithm from a family called neural architecture search. It's a family of algorithms, started at Google in 2017, that automatically design or search for the best structure of the neural network for a given use case. And when I'm saying use case, I'm talking about the data set on one hand, and the hardware that we're using to run the algorithm in production on the other hand.

And what AutoNAC does is solve an optimization problem: finding the fastest model, or the best performing model, for a given use case and a given data set, that reaches a specific level of accuracy. It solves a constrained optimization over a search space of hundreds of thousands of candidate neural architectures, finding the structure of the neural network that hits the sweet spot of performance between latency and accuracy for a given data set and use case.
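Written out, the constrained optimization Yonatan describes has roughly the following shape; this is a generic neural architecture search formulation, not Deci's published objective.

```latex
% Generic NAS-style formulation (illustrative)
\min_{a \in \mathcal{A}} \ \mathrm{Latency}(a;\ \text{target hardware})
\quad \text{subject to} \quad
\mathrm{Accuracy}(a;\ \mathcal{D}) \ge \alpha
```

Here, A is the search space of candidate architectures, D is the data set, and alpha is the required accuracy level; latency is measured on the specific target hardware.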

Erik: Can I think about this basically as an engine that is testing hundreds of thousands of different structures, and then testing them against each other and then determining what structures fit the use case best? Is that the output of the engine?

Yonatan: Yes, absolutely. That is what AutoNAC is doing, with one caveat: searching exhaustively in that search space is impossible. To test a candidate model out of that search space, the model needs to be trained with the data set in order to know what its accuracy will be, and that could take days of GPU time. So we cannot explore the entire search space, take each and every model, train it, and do, let's call it, the classical model selection. We have to use more clever techniques in order to estimate the resulting accuracy of a given model on a given data set without training it.

And this is what differentiates AutoNAC from other neural architecture search algorithms that were invented at Google or at companies that have huge amounts of compute. AutoNAC converges orders of magnitude faster than any existing neural architecture search algorithm, making it the only solution that is commercially viable for companies to use in order to optimize their specific use cases and problems.
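The key trick here, avoiding a full training run per candidate, is usually done with some cheap proxy that estimates how accurate a candidate would be if trained. The sketch below shows only the shape of such a search loop; estimate_accuracy and measure_latency_ms are hypothetical stand-ins, since Deci's actual estimator is proprietary and not described in the conversation.

```python
# Shape of a proxy-guided architecture search (illustrative only).
# estimate_accuracy and measure_latency_ms are hypothetical stand-ins.
import random

def sample_architecture():
    """Draw one candidate from a toy search space."""
    return {"depth": random.choice([8, 16, 32]),
            "width": random.choice([32, 64, 128]),
            "kernel": random.choice([3, 5])}

def measure_latency_ms(arch):
    """Placeholder: in practice, compile and time the candidate on the target hardware."""
    return 0.05 * arch["depth"] * arch["width"] / 32

def estimate_accuracy(arch):
    """Hypothetical cheap accuracy proxy; a real system would use a learned
    predictor or another shortcut instead of fully training the candidate."""
    return 0.70 + 0.002 * arch["depth"] + 0.0005 * arch["width"]

def search(latency_budget_ms, n_samples=1000):
    best = None
    for _ in range(n_samples):
        arch = sample_architecture()
        if measure_latency_ms(arch) > latency_budget_ms:
            continue                         # violates the hardware constraint
        score = estimate_accuracy(arch)
        if best is None or score > best[0]:
            best = (score, arch)
    return best                              # only the winner gets fully trained

print(search(latency_budget_ms=1.5))
```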

Erik: You plug in the data set, begin training, are we talking about minutes, are we talking about hours, are we talking about days to optimize or identify the most efficient path?

Yonatan: So it's a matter of days. Actually, it's a function of the training time of the model. Usually, it takes about three times the training time of the model to optimize it with AutoNAC, which usually ends up as a few days of convergence for the search algorithm. That is relatively very good, because doing it manually, by trial-and-error iterations by a data scientist, takes much, much more time and is not guaranteed to converge to a result the way AutoNAC can by searching cleverly in such a huge neural architecture search space.

Erik: And then there's another significant pain point upfront, which is making sure that the data that you're putting in is actually good data. So that means pruning the data; it also means tagging and so forth. Would this generally have to be done on a different platform, with the high quality data then uploaded into your platform? Or do you also cover that within the scope of Deci?

Yonatan: So we usually engage with customers that already have a large enough data set that is labeled and ready for training the model. We're focusing on the model stage in the development lifecycle. And data labeling, or tagging is not something that we're dealing with at the moment, but it's a future direction that we might approach later on.

Erik: Yeah, there are certainly tools for this. I think here in China there are some companies I know that just have 10,000 people sitting in some city somewhere in the center of the country tagging data for clients, right?

Yonatan: Yeah.

Erik: The pricing of the solution, it's basically sold as a SaaS, is that right? And would that be around the number of users, or around runtime? How would this type of engine actually be priced?

Yonatan: So it's a subscription to a SaaS platform that is priced by the number of models, actually. We want the whole organization, all the data scientists, to be able to work with our tools, so we don't limit the number of users or anything like that. But the subscription is based on the number of models that are being developed on the platform. Usually, small companies have something between 1-5, maybe 10 models, and larger companies have somewhere around 20-50. And this is usually the currency that we use in our subscription for the platform.

Erik: I imagine models require different amounts of compute power. So from your standpoint, is there a difference in cost if somebody is running a small model versus a large one that impacts your economics, or is that more or less inconsequential?

Yonatan: So actually, the training and the optimization run on the customer's cloud or data center. So it affects neither our costs nor our pricing. So the answer is no.

Erik: Well, maybe you can walk us through one or two cases end-to-end, just looking at where the company is coming from, what problem they are solving, and then walking us through the deployment path?

Yonatan: So one of the use cases we've seen is from a company that is developing a smart city application that detects trash in cities, based on small cameras that are deployed with computer vision compute attached to them. And what we've seen is a company struggling to make the algorithms run fast enough to get good enough performance and accuracy from the end-to-end application.

And what we have done is let them develop an algorithm on our platform that runs faster and is more dedicated to the hardware they're using in their production environment. After they trained a model based on their data on our platform, they used the quantization and compilation techniques that are integrated in our platform and deployed it on the edge device using the deployment SDK that we provide.

And the overall result they have seen is going from a model that was running at one frame per second on that hardware device to a model that runs at about 12 frames per second. So it was a 12x boost in throughput that enabled them to run this algorithm on the specific hardware that they chose, or the candidate hardware that they considered for that application, with good enough performance that the algorithm can run the application and collect the data that is needed for the application itself.

Another example is a mobile app for image processing that is deployed on mobile devices. On some [inaudible 36:33] devices, especially Android-based devices, the computer vision model in that application was not running fast enough to give a good experience: it was draining the battery, taking too much compute, and incurring latency in the user experience of the application.

And that company was using cloud compute to do that specific model computation. So they had an image on the mobile device, they sent it to the cloud, did the image processing for that specific use case with that specific model in the cloud, and then sent the results back to the device and presented them to the user.

This was problematic from three aspects. One of them is the user experience: the user has to wait a few seconds to get a response from the cloud. The second is a security issue, where the personal data of the customer is sent to the cloud, processed there, and exposed in transit. And the third is the cloud compute cost associated with an application that has millions of users every day.

So by using the Deci platform, that company was able to rebuild the algorithm behind that model, make it edge production ready, and push it to the edge to work on a wide range of devices, making the algorithm run in real time on the device and eliminating all the complexities of sending images to the cloud, running deep learning computation in the cloud, and sending things back. And that was a huge gain for that company: now their mobile application runs significantly better, and the COGS associated with that model in the application are reduced.

Erik: For the first use case, you said smart city and around the trash, do you know what hardware? I mean, were these on garbage collection endpoints? Were these on trucks? Do you happen to know the operating environment that that one was in?

Yonatan: This deal was led with one of our hardware OEM partners. We recently announced a partnership with HP. And this specific use case was on an HP edge server running on CPU. So the main task here was to make a model running on CPU feasible to run in real time, on a truck that is moving all the time. And the high frames per second were needed because of the movement of the garbage collection trucks.

Erik: Now it looks like there's a great new generation of hardware coming out to make machine learning work a bit better. I see a lot of entrepreneurs working on this as well as the big players like HPE.

Yonatan: There's something to understand here: it's a question of supply and demand. There's a supply of new chips that can process better and run faster. But the demand is also growing exponentially, even faster than the progress in the chip industry, as we hear all the time about new models that are more complex and larger, like GPT-3, and also new models after GPT-3 that are even larger.

But there is always a gap between what the data scientists and the people building the application want in terms of compute, and what the hardware can give them at a specific point in time. And we believe that this gap will exist for at least the next 10 years, because the progress in hardware is kind of limited by Moore's law, but the progress in algorithms is growing exponentially all the time. And we can see it in the huge language models, in the use of transformers for computer vision, and in many other vectors that we see in academia that make models just bigger and bigger and more and more hungry for compute.

Erik: This is a really interesting problem space because it's very much an ecosystem approach, where you have improvement in the underlying algorithms, in the processors, in the modeling approaches, in the physical hardware. You also have improvements in connectivity and bandwidth [inaudible 41:33], improvements in a lot of different areas that all feed on each other. And then there's, of course, some areas that are moving significantly faster than others, others slower.

What does the partnership strategy look like for Deci? Who are you working with? I mean, obviously, HPE is one of these partners. But who else are you working with in order to make sure that you're part of a winning ecosystem? Because I think that's probably quite important in this business.

Yonatan: Absolutely. So we, at the moment, have three types of partnerships. One of them is the hardware manufacturers. For example, we announced a partnership with Intel. This is, first, a technological partnership, to develop models that are dedicated to their types of hardware and to do research together on technological aspects with what is called Intel Labs.

And the second part is the go-to-market partnership with Intel, where we are working with their sales organization to identify opportunities among Intel customers that are trying to run deep learning models on Intel hardware, and to help them make Intel hardware more scalable for running deep learning algorithms.

The second type is the OEMs. OEMs are usually those who interact with customers that are trying to build data centers for deep learning or to build edge servers for deep learning, and those customers are facing the computational limitations of their solutions relative to the applications they are trying to run. So by partnering with HP, we're getting opportunities from their sales organization for use cases that are running computer vision at the edge or in the data center, in order to accelerate those models and make the algorithms ready to run on the hardware and solutions that HP is selling.

The third part is the cloud area, where we partnered with AWS in order to make our solution available in the AWS marketplace and enable AWS users and customers to build more scalable solutions with better economics and a lower cost to serve in the cloud. We have some other partnerships coming up, and other directions for partnerships that we're exploring at the moment, but it's still a little bit early to announce them. These are the main three partnerships that we have already formed and that are already bringing customers to Deci.

Erik: Yonatan, I think we've covered a fair bit of territory here. Anything important that we haven't touched on yet?

Yonatan: I can talk about a very interesting research result that we've achieved recently at Deci. Actually, at Deci, we recently broke the state-of-the-art performance for deep learning with a family of neural networks that we call DeciNets. This was announced on our website and also in the press: algorithms that push the known tradeoff between accuracy and latency to a new frontier, for example classification models that run with a roughly 2x reduction in latency compared to the existing open source algorithms.

DeciNets are now offered on our platform and can be used by data scientists in order to build more accurate models that run faster in production. And this is very interesting, actually: being able to use the AutoNAC technology to generate models that are, in some sense, general purpose and can be used by customers in a self-serve fashion to make their applications run better or faster.

Erik: Was this done independently by yourself, or you're doing this with a partner or university lab?

Yonatan: So actually, we have a research team here. Deci has three professors on board and a few more PhDs. And this result, the DeciNets, was a direct outcome of applying the AutoNAC technology to some well-known benchmark data sets and models and getting results that are competitive with any existing known model that is usually used in industry. So this result was internal, but now we are enabling users and customers to leverage those DeciNets for their applications. And it's very interesting to see what use cases people are using those DeciNets for in these early days.

Erik: What do you see on the horizon for Deci? Like, if we look towards 2022, what are the big focus areas for you?

Yonatan: So I think that the focus is on making our tools more accessible, even for less experienced data scientists or software engineers. So something we're thinking about is how we can make our platform more accessible for everyone. One direction is to provide open source in this space, to let you use some of the components of our platform for free. Another direction is making the API easier to use and apply to the standard problems in computer vision and NLP, making everything more accessible for less experienced data scientists.

Erik: I think this is probably one of the big areas of innovation that at least a lot of our clients need, because it's hard for industry to hire really experienced data scientists. It's hard to compete with Google and Alibaba and the big players out there for top talent. So you end up hiring younger people, or you end up training your existing team, and they need tools that they can use effectively. What is the best way for people to either reach out to you or, more generally, to learn more about Deci?

Yonatan: We have a SaaS platform that is open. Users can just sign up and start trying some of what we provide on the SaaS platform for free. That will give a good sense of some parts of our offering. The second option is just to book a demo on our website, and someone will get back to them and do a demonstration of the entire capabilities of our platform. So this is kind of the best way to see what Deci can deliver for any use case.

Erik: And for folks that are listening, Deci is at deci.ai, so there you can sign up and you can book a demo. Great. Yonatan, thank you for taking the time today.

Yonatan: Thank you very much. It’s a pleasure.

Erik: Thanks for tuning into another edition of the Industrial IoT Spotlight podcast. If you find these conversations valuable, please leave us a comment and a five-star review. And if you'd like to share your company's story or recommend a speaker, please email us at team@IoTone.com. Finally, if you have an IoT research, strategy, or training initiative that you would like to discuss, you can email me directly at erik.walenza@IoTone.com. Thank you.
