What is our primary use case?
NVIDIA AI Enterprise is essentially a GPU enhancement software that takes advantage of NVIDIA's full stack, built on what is called NeMo and on NIMs, which are NVIDIA's inference microservices. It is meant for servers with GPUs installed, oftentimes up to eight NVIDIA GPUs in a single server.
Putting GPUs in a server gives you additional performance for whatever workloads you are running, but the hardware alone does not give you all the right tools, such as monitoring, management, and orchestration. NVIDIA AI Enterprise gives you the full software stack so you can really maximize the value. The primary benefit is that the hardware and the software work together.
Orchestration allows you to schedule jobs to run at certain times. NVIDIA AI Enterprise also has regular updates, with new releases pushed out every couple of weeks, so you can become more proficient, get a better hands-on experience, and make the most effective GPU investment possible.
What is most valuable?
The best features of NVIDIA AI Enterprise are the GPU orchestration and the visualization that can happen. When you run the Enterprise software, you will have access to other features including AI Workbench.
Many of these features you can actually try for free on build.nvidia.com, and I want to stress that because people think this is so complicated and so expensive that they will never have access to it. That is not true. NVIDIA has a number of resources and training modules that give you access to this information without needing to purchase everything. Companies that have AI Enterprise usually purchase it on a per-GPU basis.
So it is one license per GPU, and that number does add up quickly. However, it is easy to use because it gives you visualization of everything. Think of it as a dashboard you can run. You don't need to know how to program or read code. It is very visual, it gives you metrics on how effectively everything is running, and you can toggle between different environments. If you want to dive deeper, you can run all kinds of simulations with it.
You start with the high-level visuals and then get more specific over time, depending on what your job is.
The impact of NVIDIA AI Enterprise on uptime for my AI applications is pretty remarkable because you don't have as much downtime with this type of software. Downtime has been cut about as far as it can go. There is usually about six nines of availability. Six nines means it is up 99.9999% of the time. That comes out to about 31.5 seconds of downtime per year, which is basically just them switching over to new servers.
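As a rough check on the availability math, here is a minimal Python sketch that converts a number of nines into expected downtime per year. The function name and the 365-day year are my own assumptions for illustration, not anything from NVIDIA's tooling.

```python
# Assumes a 365-day (non-leap) year: 31,536,000 seconds.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_seconds_per_year(nines: int) -> float:
    """Expected yearly downtime, in seconds, for an availability of `nines` nines.

    Hypothetical helper: three nines means 99.9% uptime, six nines 99.9999%.
    """
    unavailability = 10.0 ** -nines  # fraction of the year the service is down
    return SECONDS_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(f"{n} nines: {downtime_seconds_per_year(n):,.1f} seconds per year")
```

For six nines this works out to roughly 31.5 seconds per year, and each additional nine cuts the allowed downtime by a factor of ten.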
What is interesting is that many of the GPUs are what are called hot-swappable, meaning you can change them out while your data center rack is still powered on and running. They have it built in so you can pull the unit with a special orange tab. It is not going to shock you or anything. If there is an issue and you need to change out a GPU or a license, you or the facilities team can do it without any noticeable interruption. It is still a fairly new product, so there are not as many examples, but from what I have seen, interruption is not really an issue. NVIDIA AI Enterprise definitely keeps the GPUs maintained, and that is why orchestration is so important. It is like an electric car: it charges when it is not being driven. If you just drive a car all day and don't maintain it, the car is going to fall apart. It is the ultimate maintenance package.
What needs improvement?
My thoughts on the security protocols and data protection are that this is one area that has needed improvement. NVIDIA has done work here, but I was recently working with the federal government, and many times they require what is called FIPS compliance. FIPS is a federal standard for validating cryptographic modules, and in practice it applies to the encryption on the hard drives in the servers that have the GPUs.
NVIDIA has made some investment in that type of security. There is Zero Trust Architecture that you can use, and that is a theme all NVIDIA software runs on, meaning the software is built to encrypt traffic between the front end and the back end. However, I feel the software could be made even more secure if NVIDIA continues to invest in meeting the requirements of federal government agencies and similar customers.
This will help give them the highest level of security and resiliency needed to protect everybody from malicious actors. There are so many scams going on with AI and chatbots, and phishing attacks are growing, because the more the technology expands, the more attacks are possible. NVIDIA is obviously the leader in AI GPUs, so they have a very large attack surface to protect.
In my opinion, the other area with room for improvement is awareness: not a lot of people know that NVIDIA has this offering. The people who know are the ones in the tech sales world who actually talk to customers. However, people who are trying to learn on their own and don't have the millions of dollars that corporations do should still have the resources to learn this type of information. NVIDIA should continue to invest in marketing to tell them this offering is available. I have been trying to get them to do this. They should go to universities and reach students who are interested in this space and may not work at a large tech company.
A lot of my learning has been self-taught. I have some experience, but I went on the website and did a lot of digging.
There are so many resources out there that it can be overwhelming to figure out which is the right one to start with.
Also, going back to the security piece, the solution is secure, but from my understanding it doesn't meet Department of Defense requirements, and that is a whole other level NVIDIA would need to achieve. It usually takes a couple of years of auditing and strict compliance before you can get what are called FIPS 140-2 and FIPS 140-3 certifications. I would hope that NVIDIA continues to invest in that area. They have started, but they haven't yet done enough to reach the level of security needed for the highest level of classified information. Those are the improvements that are possible with the platform.
For how long have I used the solution?
I have been using NVIDIA AI Enterprise for about two years.
What do I think about the stability of the solution?
Regarding stability, NVIDIA AI Enterprise is probably a 10 because they are the best, they are the most profitable company in the world, so I don't see how you get more stable than that.
How are customer service and support?
In terms of technical support, I would rate it probably a nine or so.
How was the initial setup?
The deployment of NVIDIA AI Enterprise is very easy because they handle all of it for you basically. You are just getting the software to install on the GPUs.
There is a pretty useful manual you get, and you get support. Pretty much everybody who has this doesn't do it alone. They have NVIDIA services or professional services that they would purchase as well, and it is all bundled.
NVIDIA team members can handle the deployment and installation on-site, or you can buy remote support or training credits. There is always another resource available to help out. You figure it out as you go, but it is pretty easy. My approach is to try to learn as much as I can, beyond what is strictly handed to me, because there is never an end to that.
What's my experience with pricing, setup cost, and licensing?
Regarding pricing, I find it pretty expensive, but as I said, if you bundle everything, you can manage it. It is a long-term investment: the upfront cost is high, but you save over time because of how the product works, and you get the best bang for your buck. NVIDIA AI Enterprise does pay for itself; it just requires some strategy and some education.
Which other solutions did I evaluate?
When comparing NVIDIA with other solutions or vendors such as AWS, Google, and Cerebras, I find that they are pretty much the leader in everything possible because they have the best hardware. They are not really a software company actually, because they prefer to work with their channel resellers and channel partners.
NVIDIA is very profitable because they don't really have a huge sales team. They work with just about everybody. AMD is obviously a competitor, but they still partner with every possible supplier out there to push the envelope on innovation as best they can. I would say they are vastly outperforming everybody.
If you just look at the stock market, it has been that way for so long and now that they are as big as they are, people are excited to see where it goes, but they are wondering how much more it can grow. I would say they just need to help teach people as much as they can about how this information and technology works. They have a pretty good training and certification program, but many people don't know that it exists unless you already work there or work with someone who works there.
People who are trying to get in the door have to think of it as a numbers game. I hope that NVIDIA would make these resources more publicly available and say, "You want to learn how to use AI Enterprise? Here are the resources." They have these classes, but unless you know somebody, you are not going to really know where to study. They have exams that are about $100, but sometimes people's companies can reimburse them for that type of thing. That would be something I would encourage NVIDIA to continue to invest in. Their solution is going to perform better than anybody out there by far. However, they are going to also be pretty expensive, so you have to compare and contrast that price with the performance.
What other advice do I have?
My advice to others looking into NVIDIA AI Enterprise is to learn as much as you can, ask the right questions, and stay open-minded. This is a pretty new product, but it has a lot of upside potential and is going to create value for just about anybody.
For my project, the integration of NVIDIA AI Enterprise with AI frameworks is the most important piece because it takes the AI possibilities and actually brings them to reality. It gives you capabilities you would not have access to without this software, because the GPUs alone are just there to run parallel processing workloads. They are basically resources for handling very intense streams of information running on the server. By themselves, without software on top, they are not fully optimized or customized. You have to run NVIDIA software to get the full benefit of the APIs, which are the application programming interfaces, and all the specific use cases you want, such as modeling a digital twin, which is what Omniverse does. Omniverse is also a bundle.
The thing about AI Enterprise is that it is an overarching term. NVIDIA has several other AI software products that it runs promotions with, so when you buy AI Enterprise, you also get access to those. The promotions change on a somewhat regular basis. They are additional packages, such as SDKs (software development kits), that let you run more things. You might have heard of NVIDIA Omniverse or Run:ai; those are two of the most common ones. They have had deals where, depending on the GPU model you are getting, the AI Enterprise software license also comes with additional software that is all tied together with AI Enterprise.
I have NVIDIA AI Enterprise deployed in a hybrid model because when I was working with the federal government, security was a top priority. If I did do cloud, it would be a hybrid cloud because it would require some on-site presence with a little bit of remote or cloud orchestration. Hybrid is definitely the approach, along with SaaS, because that gives the customer a consumption model where they pay as they go without large upfront costs. Cloud can be expensive: if you don't keep track of your cloud resources, you can spend a lot more money than you anticipate. Still, people are moving to the cloud.
My direct team using NVIDIA AI Enterprise is between 10 and 15 people, but we were supporting an entire sales organization of 10,000 or more people. Pretty much everybody on that team of 10 to 15 is using it as best they can.
I would rate this product a 9 out of 10.
Which deployment model are you using for this solution?
hybrid
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
SaaS