Information Systems Computer System Controller at an insurance company with 11-50 employees
Provides great support for the business tools and IT service
Pros and Cons
- "The solution has a very good business event manager tool."
- "The solution is overly complex."
What is our primary use case?
Our company is currently consolidating the different programs we use. We regularly use Patrol and TrueSight. Both are BMC products that provide the same functionality, although they are completely different solutions. We are evaluating which is the right product for us, and we're taking everything into consideration because the economy is not great and we have budget issues. Our business requires several complex configurations of systems: web servers, databases and processing environments. All of them must work together and perform properly, and this is where we need a complex product like TrueSight.
What is most valuable?
The business event manager tool that consolidates detailed information from a single instance of equipment is the most valuable thing for me. It provides support for the business tools and the IT services which come from several systems. Some are replicated and service tools provide the same functionality for some things. The end user service is made up of a lot of systems. That is what I'm interested in, and it's how I discovered that BMC TrueSight is good for us. I don't use the event management or monitoring capabilities; I work with the user management capabilities.
What needs improvement?
I think the solution is overly complex and requires a lot of resources.
For how long have I used the solution?
I've been using this solution for around 18 months.
Buyer's Guide
BMC TrueSight Operations Management
June 2025

Learn what your peers think about BMC TrueSight Operations Management. Get advice and tips from experienced pros sharing their opinions. Updated: June 2025.
860,592 professionals have used our research since 2012.
What do I think about the stability of the solution?
I haven't noticed any issues with stability. We sometimes have to call for failures or questions but on the whole, it's fine.
What do I think about the scalability of the solution?
There are no concerns about scalability, it's performing well. We have deployed it where we have the most critical applications. We have changed our approach to new architecture and mobile. Instead of big servers, we have now deployed formal servers for web services. We're working on increasing the number of servers available. Our only concern is that it requires some investment at the beginning of the project and we have budget concerns.
How are customer service and support?
We don't use BMC technical support directly, we go through a partner.
How was the initial setup?
I am involved in the planning and the development of the solution, so from my perspective the initial setup is a little complex, not in itself but because managing the user services requires access to a CMDB. Getting the best from this kind of product requires other processes and tools to be aligned with it. These tools provide very good functionality, but you only get the benefits once those other processes and tools are in place. Our deployment is still in progress. We've been working on it for six months using a consultant from a third party, a BMC partner.
What's my experience with pricing, setup cost, and licensing?
We haven't yet established what the final cost would be for licensing this solution, we're still working on that.
What other advice do I have?
These kinds of products provide benefits if you have other processes that require alignment with other IT solutions, like in sales and deployment and CMDB. Without that, you don't get the full benefits. At the end of every phase we stop and check the software products before starting the next phase of the project.
I would rate this solution an eight out of 10.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.

Sr. Director Operations at a comms service provider with 10,001+ employees
Enables us to triangulate, using multiple sets of data - including log, app, OS, network, and more - and find issues
Pros and Cons
- "The solution's event management capabilities are fantastic. We do a best of breed. If, on the network side, they use a different tool, we pull all that data in so that we have a single console. It's kind of like the monitor of monitors. We're able to aggregate all the different types of data sets, whether it's log data, app data, OS data, infrastructure data, or network data. We're able to aggregate all those events and then correlate and be able to say we're having an event."
- "Specifically around application performance monitoring, BMC is definitely not the market leader. The Dynatraces, the New Relics and the like are more of the market leaders in that space. I would like to see them grow that space a little bit more aggressively. It has not really been their bread and butter."
What is our primary use case?
We use it primarily for monitoring. My organization is an application support organization and part of what we need to do is to make sure that our infrastructure is running tip-top so that those applications can, consequently, run the same way. We use the tool to do both application monitoring and infrastructure monitoring, all the way down to storage services and things like that on the OS layer. We have a full breadth and are able to triangulate what types of issues we're experiencing before our end-users experience those issues.
It monitors our entire platform. Everything in production, every single app, is monitored through the tool. As new applications come into our ecosystem, we have a process. The project team sits down with us. We talk about what the product's capabilities are. Most of the PMs already know that because they've been here for a long time. We set it up, and we move on to the next app. We're expanding it as new tools or new functionality or new applications come into the ecosystem.
How has it helped my organization?
Because we've used it for so long, we've been measuring results for eons. The standard metric that we use, given to us by our CIO, is that 70 percent or more of our outages need to be alert-driven, not customer-driven. So, if a customer calls in and says, "Hey, I'm having an issue logging in to PeopleSoft," which is one of our applications, we should have already known that there was an issue and handled the alert prior to the customer calling in.
A decade ago, we were using Microsoft's and HP's product sets to monitor, but it was disparate. The alerts weren't aggregated and we never knew who they would go to. Therefore, we missed a lot of opportunities to be proactive in our organization. That is the reason we moved to the product which, at that time, was called ProactiveNet - and then it became BPPM and TrueSight, as it is today. We were able to flip that situation and we have been able to meet that metric for five years running. We had one blip in the year prior to that, and in the years before that, we were knocking it out of the park. So our metric is whether we get the alert before someone has to call in, and we're successful in meeting that some 80 to 90 percent of the time.
In addition to that, when we look out across the industry, most organizations have anywhere from five to 15 people who are dedicated to monitoring. We have two. We're able to run the entire stack, along with its complementary adjacency tools, with two people. That was one of the many reasons that we made the migration from other products to ProactiveNet/BPPM/TSOM. At that time, we were a one-man band and really needed to be able to move quickly but also be able to maintain a product and not require tons of manpower to make the product work. The improvements that BMC has made over the last two to three years have really been about revamping and consolidating the console so that it is truly a single console that you can run with a single individual, should you need to.
We have 342 apps in our ecosystem and my team manages around 280 of those from a support-platform standpoint. And because we have two individuals who are dedicated to the monitoring, they partner with the rest of our admin organization to drive exactly how things need to be alerted. We review them quarterly. That is a testament to a really solid product - that it only takes one or two people to really run the thing and administrate it, versus having an entire staff and that's all they do.
The solution provides a single pane of glass where we can ingest data and events from many technologies. I am one of the few, at least according to BMC, who has screens up in my hallways and I show our top 20 applications from a criticality standpoint - what's most important to our organization, things that I have to run. Everyone sees what's up on those boards every day. I go to it two or three times a day. Because we have that single pane of glass, we see where we're having issues organizationally and we're able to rally resources - whether it's engineering, operations, or our development group - and solve the problem and get those things from red/yellow back to green/blue. The single pane of glass was a key piece of what we needed to have to be successful as a monitoring organization.
In terms of the availability of our infrastructure, ours is not a hybrid environment, per se. We don't really measure and/or monitor - because of legalities with most of these SaaS providers - how well their systems perform. But what we do is measure any of the interfaces that touch or route to those applications, and we have an uptime measurement of about 99 percent for most of our apps. We have a dashboard for that which is managed out of the ITSM group. They partner with us and they pull all of our monitoring data to figure out two key metrics: total uptime and uptime excluding maintenance. Those are the two keys which enable us not only to showcase to our customer base how well the systems are performing but how often they really are available.
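As a rough illustration of how those two uptime metrics differ, here is a minimal sketch in Python. The function name, its parameters, and the sample numbers are assumptions made for this example; the review does not describe how the ITSM dashboard actually computes its figures.

```python
def uptime_percentages(period_minutes, unplanned_down_minutes, maintenance_minutes):
    """Return (total_uptime_pct, uptime_excluding_maintenance_pct).

    Hypothetical helper: 'total uptime' counts every minute of downtime,
    while 'uptime excluding maintenance' removes planned maintenance
    windows from the measured period before computing availability.
    """
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    scheduled_minutes = period_minutes - maintenance_minutes
    if scheduled_minutes <= 0:
        raise ValueError("maintenance cannot cover the whole period")

    total_down = unplanned_down_minutes + maintenance_minutes
    total_uptime = 100.0 * (period_minutes - total_down) / period_minutes
    excluding_maintenance = (
        100.0 * (scheduled_minutes - unplanned_down_minutes) / scheduled_minutes
    )
    return total_uptime, excluding_maintenance


# Illustrative numbers only: a 1,000-minute period with 10 minutes of
# unplanned downtime and 100 minutes of planned maintenance.
total, excl = uptime_percentages(1000, 10, 100)
```

The second figure is always at least as high as the first, which is why reporting both shows customers how the systems performed and how often they were really available.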
BMC has helped to reveal underlying infrastructure issues that affect app performance. Four years ago, PeopleSoft was running slow in regard to our payroll run. We run payrolls weekly. If you know anything about payroll, you've got to hit a certain deadline and be able to send the check file to the bank for those direct deposits to show up in people's bank accounts. It's a really sensitive issue when people don't get their checks. With the monitoring tools, we were able to triangulate that it was not an application issue but that it was actually a storage issue. Our solid-state storage was having a firmware issue which was causing slow turnover for the IO, and therefore it was slowing down the entire process of payroll. We were able to triangulate that that was the issue, decide what we needed to do - which was move the storage so that the application could continue to perform. We met the need and were able to get the payroll cut just in time so everyone could get their checks. It was a big win.
As for reducing IT ops costs, year over year, my operational expenses grow by three percent, which is mostly salary increase. I've gone from 12 resources to roughly 55 resources organizationally, while growing from 80 apps to 280 apps over the last eight years. Our operational costs have only gone up because of the use of licenses, not because of human capital. The tool has helped us work smart, not hard, and leverage the technology. We haven't necessarily needed to grow our operational expenses to accommodate the new functionality or the new applications which come into our ecosystem. We just set up the monitoring and it does its thing.
What is most valuable?
The solution's event management capabilities are fantastic. We do a best-of-breed. If, on the network side, they use a different tool, we pull all that data in so that we have a single console. It's kind of like the monitor of monitors. We're able to aggregate all the different types of data sets, whether it's log data, app data, OS data, infrastructure data, or network data. We're able to aggregate all those events and then correlate and be able to say we're having an event. Just because we have one or two alerts doesn't necessarily mean that we're having an event. It's when we get several of those that "trip the wire" that we're able to say, "Okay, we are having an event." And the tool allows us to aggregate all of that so that we're managing event-driven versus alert-driven.
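The "trip the wire" idea can be sketched in a few lines of Python. This is only an illustration of the general technique of counting alerts per service within a time window; the function name, the five-minute window, and the threshold of three are assumptions for the example and are not taken from TrueSight's correlation engine.

```python
from collections import defaultdict, deque


def correlate(alerts, window_seconds=300, threshold=3):
    """Declare an event when enough alerts for one service arrive close together.

    alerts: iterable of (timestamp_seconds, service, source) tuples,
            sorted by timestamp. 'source' could be log, app, OS,
            infrastructure, or network data.
    Returns a list of (timestamp, service) pairs, one per declared event.
    """
    recent = defaultdict(deque)   # service -> timestamps still inside the window
    in_event = set()              # services that already have an open event
    events = []

    for ts, service, _source in alerts:
        window = recent[service]
        window.append(ts)
        # Drop alerts that have aged out of the window.
        while window and ts - window[0] > window_seconds:
            window.popleft()

        if len(window) >= threshold:
            if service not in in_event:
                in_event.add(service)
                events.append((ts, service))
        else:
            in_event.discard(service)

    return events
```

With this shape, one or two isolated alerts produce nothing, while a burst of alerts from different sources about the same service produces a single event rather than one notification per alert.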
The breadth of the solution's monitoring capabilities is also fantastic. A lot of IT organizations that I talk with use a conglomerate of tools to manage their monitoring and it ends up being pocketed. We don't have that problem because we are using it as the monitor of monitors and therefore we are able to take advantage of all of its bells and whistles. As well, we can feed in additional alert data, crunch that, and react appropriately and accordingly, proactively versus reactively. We'll get several low-level alerts saying, "Hey, this may be an issue," and we're able to proactively look at that before it becomes a critical outage. We use almost every aspect of the tool, with the exception of some of the automation because we haven't gotten there and found the need for it. But we're rapidly starting to take advantage of those pieces as well.
A use-case example would be if we have a drive filling up on a particular server for a particular application. If that's a known issue, we can actually orchestrate through the automation component of TSOM to be able to say, "Hey, when we see this type of alert, go try one of these three things and if that fixes the problem, go away. And if it doesn't, go ahead and escalate that as a ticket and we'll have a human go touch that server and remediate the issue." So we're right on the cusp of beginning that journey.
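The try-then-escalate flow described above could look something like the following Python sketch. It is not TSOM automation code: the function and the callables passed into it (the remediation actions, the resolution check, and the ticket opener) are hypothetical placeholders for whatever the automation component and ticketing system provide.

```python
def handle_known_alert(alert, remediations, is_resolved, open_ticket):
    """Try each remediation in order; escalate to a ticket if none works.

    alert:        any object describing the alert (for example, a dict).
    remediations: ordered list of callables, each taking the alert.
    is_resolved:  callable taking the alert, returning True once fixed.
    open_ticket:  callable taking the alert, returning a ticket id.
    """
    for attempt, action in enumerate(remediations, start=1):
        action(alert)
        if is_resolved(alert):
            # The fix worked, so no human needs to be involved.
            return {"status": "resolved", "attempts": attempt}

    # None of the known fixes worked: hand it to a human.
    ticket_id = open_ticket(alert)
    return {
        "status": "escalated",
        "attempts": len(remediations),
        "ticket": ticket_id,
    }
```

For the disk-filling case, the remediation list might hold three cleanup steps, with the resolution check re-reading disk usage after each one.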
In addition, the entire root-cause analysis functionality within the tool is quite useful. It really comes down to how admins want to leverage it. There are what I call "old-school admins" who want to get on the box and solve it themselves. Then you have the "new-school admins" who go straight to the monitoring tools. It clearly shows you root cause analysis: This is the probable cause, and then they're able to go remediate it more quickly. We use that extensively within the operations team and the products team, which is the team that I own. I don't think the engineering team is quite there yet, but they're beginning to see the value of wanting to see that data and start using the tool themselves.
Regarding mean time to remediation, when I took over this organization, I and the rest of the group were working about 100 hours a week, just trying to keep our major systems running. It wasn't until eight months later, when we actually implemented a more mature monitoring system, that we turned the corner and people were working 60 hours. And now it's somewhere between 40 and 50 hours a week, which is much more maintainable and realistic in the industry. We were doing everything we could to keep those systems running, and we had no idea what would be in the next box of chocolates that we would open up, back when we first started this. There's a direct correlation with TSOM and the BMC product sets that have helped us be successful in working smart and not hard, like we did back in the day.
What needs improvement?
Specifically around application performance monitoring, BMC is definitely not the market leader. The Dynatraces, the New Relics and the like are more of the market leaders in that space. I would like to see them grow that space a little bit more aggressively. It has not really been their bread and butter.
They've been highly focused on cloud initiative. I don't know anyone in the industry who has solved how to monitor cloud, SaaS-based systems, because all of those systems are usually linked through other systems. That would be another area where it would be nice to see if they could find innovative ways to be able to do that.
The third piece would be around out-of-the-box automation. We all have particular types of alerts and events where all we really need to do is be able to turn the functionality on versus creating the functionality. BMC is already addressing that in many cases.
For how long have I used the solution?
We've used it in probably three incarnations of what it is today, so it's been about ten years.
What do I think about the stability of the solution?
We don't have any issues. We're in an HA format so if we do have any issues, things fail over quickly and we don't miss a beat. It's the heartbeat of our products, the fact that we provide monitoring services to our businesses, so monitoring can't be down. It can't have a bad day. TrueSight Operations is a highly stable product. It is a beast. It runs really well. There isn't a lot of care or feeding that we have to do to make sure that it stays healthy.
What do I think about the scalability of the solution?
It's highly scalable. We continue to add more servers and more applications within the ecosystem easily and quickly. We continue to review all of those quarterly to make sure that the way that we've tuned the monitoring is still accurate and that it's meeting the needs of both the admins and the business.
How are customer service and technical support?
We have a great relationship with BMC. We're probably different than the average bear. We've got a great account team. When we call customer support, we get answers pretty quickly. We don't have to call them very often, which is a good thing for any vendor. You don't want to have to call support a lot. But when we do, it's usually because we can't figure it out and we're able to get the answers pretty quickly through their organization.
Which solution did I use previously and why did I switch?
We used HP and then we used Microsoft System Center Operations Manager (SCOM).
How was the initial setup?
Back in the day, the initial setup was very complex. As it stands today, upgrades are really very easy. It's basically just a matter of refreshing old hardware, turning the system on, and making sure that it picks up all of the agents. Setting up today is infinitely more simple than it was even three or five years ago.
BMC is innovating even further and working towards containerization so that we won't have to do upgrades anymore. We'll just overlay. They've really taken into account how to consolidate consoles so that there aren't so many bits and pieces. That has made it easier for them to do upgrades. Installing the system or deploying the system only takes a couple of weeks in an organization of our size, where it used to, when we originally did it, take four months.
The latest one that we did, we had all the technical bits and pieces done within four weeks. Then we slowly rolled it out as we sunsetted particular agent groups. The total roundtrip was six months to have it fully deployed and embedded and working in the system.
At this point, we do an upgrade every three years, and every five to six years we're upgrading our hardware. This year we actually went fully virtual. Our engineering organization still takes a good bit of time to build servers. We were able to get virtual machines within weeks of the initial setup of the product, and we were able to roll to virtual machines, versus physical machines, relatively simply. It was basically a point-and-shoot install. We pulled over all of our policies and procedures that were already canned - and that was another thing that was more of a challenge in years past because we would have to redo them. This time, all that got pulled in and we were up and running within weeks.
What about the implementation team?
We partnered with BMC this time. Typically, we use a third party, but in talking with BMC about where we were at - as we use them primarily for consultative services - we said, "Hey, what's the best way to go ahead and do the upgrade and the migration?" They gave us the cut plan and then we actually did the physical work ourselves, which saved us some $200,000 in project fees.
With two guys running the system day-to-day, and consultative services from BMC to tell us, "Okay, this is how you do it," we were able to execute both the upgrading project, as well as administrating the product, while still running on the old system. It says a lot about the product's ease of use and capabilities.
Now, my guys are really smart and I'll give them all the credit. They're smarter than the average bears. But the reality is that it's rare to find a product where the people who are running it can be doing a major upgrade at the same time.
What was our ROI?
The very fact that we've been on it for ten years is a testament. We continue to make the investment. We continue to pay the renewal because the return has been fantastic. I don't have any specific data points other than the fact that we've been on the product for ten years. There's a reason for that.
What's my experience with pricing, setup cost, and licensing?
There are no costs in addition to the standard licensing fees. It's a straightforward contract.
Which other solutions did I evaluate?
Every three years, we reevaluate the space. That's just part of the culture that we've established. No one tool stays forever at the top, but BMC's monitoring capabilities and their discovery asset tools are top-of-stack, typically, in any of the research that we do. We continue to use them and we continue to have a great relationship with BMC.
What other advice do I have?
Keep it simple. Make sure that you understand, architecturally, how your applications and your data center are set up. It makes your life easier to know exactly what you're going to need to monitor.
The biggest lesson I have learned from using this solution is to really take full advantage. I joke with the BMC guys that TSOM is like AutoCAD, the engineering tool that people use to design and draw. We only scratch the surface of its full capabilities. The thing that I've learned is that it's a good idea to take advantage of all the bells and whistles as quickly as you can because it really pays dividends to do so.
We are using a little bit of the solution's machine-learning and analytics. That's an adjacency tool called IT Data Analytics and we feed that into our overall, single pane of glass monitoring. I don't know that we've taken full advantage of that quite yet. It is on the roadmap. We'll probably get to that, realistically, next year and in '21, where, as we're seeing those analytics, we will actually link automation to it. So when we see something we'll actually do something. We're a fairly small shop and therefore scale is not an absolutely necessary thing, but it is something that we are striving to move towards. It has affected our application performance in bits and pieces. It's not something that I'd wave the banner on quite yet. We have pocketed instances where ITDA has come back and told us that there was an issue, and we were able to remediate proactively versus reactively. I don't know that we're leveraging the tool's full capabilities where I can say that I have a use case where this was a big win for us.
I don't think that the monitoring tool, TSOM itself, has created or helped to support any business innovation.
As for users of the solution, I have the two admins and then I have, say, half of my organization that consumes it as a tool, so there are about 12 to 15 users. Each of those people is an application admin. Their primary responsibility is the applications that they support. The monitoring is a tool for them to use to ensure that those systems are healthy and top-notch.
I have a senior manager who manages the space. He also manages our asset-discovery tools along with all of our web and third-party space. He is a busy guy but it's all managed under one leader. There are the two folks who administrate it. It's really a very small human-capital resource footprint, in comparison to what it does technologically.
I give TrueSight Operations a nine out of ten. There are always bits and features from other products that we wish we would see in it. Usually, we see them pretty quickly.
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Service Delivery Manager at a financial services firm with 1,001-5,000 employees
Knowledge Modules are what make the implementation possible across our varied infrastructure, but RBAC controls need some work
Pros and Cons
- "From an administrative standpoint, what stands out in TrueSight is the ability to implement quickly. When they have a requirement to monitor something, we're able to turn that on quickly in their environment. We're able to set up new apps within a day."
- "We were somewhat limited in TrueSight due to some of the RBAC controls not quite being what we wanted as far as delegating out administrative privileges for implementation. But because we were able to turn requests around pretty well, that burden wasn't too heavy."
What is our primary use case?
We use it for business service and infrastructure monitoring. We use the full gamut of utilities from them and monitoring in the platform.
How has it helped my organization?
We don't use APM. We used to. We line-item nixed that for various reasons a few years ago. We also don't use ITDA, their next-gen log monitoring tool. So we're truly just within the TSOM interface, as well as doing synthetics. That being said, the Knowledge Modules that BMC brings to the market are what make the implementation possible across our varied infrastructure and applications. It's critical to have those Knowledge Modules. If we had to write things ourselves, or to use a more generic monitoring environment, and then build additional scripts on top of that to monitor the Kubernetes of the world, or the WebLogics of the world, or the Oracles and SQLs of the world - if we had to write scripts ourselves to bring back particular monitoring components and performance metrics and so on - that would be a heavy burden that would keep us from implementing. We don't often run into something that we haven't been able to monitor. It's just a matter of getting people to the table to tell us what they need.
When it comes to incident management, we get most of our data from TrueSight, apart from log data, because we don't use the ITDA interface. It would be an effective interface, but for logging we go to our SIEMs, since we're already pumping data to another system there. But TrueSight definitely gives us a view into the health of our business services, which is our primary goal for implementing monitoring.
We try very hard not to use event management. What I mean by that is that we do not have a typical NOC. We don't have ten people staring at screens and then escalating as necessary. Along those same lines, we don't spam our incident management environment with events from TrueSight. With a lot of customers I've met over the years, that's essentially the old school way of doing things. Instead, we create events that are truly actionable. If we don't have an actionable event, we don't create it. We use their baseline technology to ensure that we're only sending items that are either about to have a problem or have passed the threshold of having a problem. If you're talking about typical event management, where you create an event and it gets forwarded to some other system, there's a notification about it somewhere else - the whole ITSM cycle - we don't use it for that. We use it for creating smart events that create alerts directly to the teams responsible. As I described before, we have many distributed teams rather than a centralized NOC.
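To make the "only actionable events" approach concrete, here is a small Python sketch of a baseline check. It uses a simple mean and standard deviation band as a stand-in for a baseline; this is an assumption for illustration and is not BMC's baselining algorithm. The function name, the `k` multiplier, and the sample values are likewise hypothetical.

```python
from statistics import mean, pstdev


def classify(value, history, hard_limit, k=3.0):
    """Decide whether a metric sample deserves an actionable event.

    Returns "breach" when the value has passed the hard threshold,
    "deviation" when it sits well outside the normal range seen in
    'history' (a possible problem in the making), or None when the
    sample looks normal and no event should be created.
    """
    if value >= hard_limit:
        return "breach"

    if len(history) >= 2:
        baseline = mean(history)
        spread = pstdev(history)
        if spread > 0 and abs(value - baseline) > k * spread:
            return "deviation"

    return None


# Illustrative CPU-utilisation history for one host, in percent.
history = [50, 52, 48, 51, 49]
```

Only samples classified as "breach" or "deviation" would be routed to the responsible team; everything else is dropped rather than forwarded to incident management.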
In terms of TrueSight helping to maintain the availability of our infrastructure, it's an interesting question because of our distributed systems. We have 8,000 hosts across about 40 different teams, and we have 600 different applications that we run. For those critical tier-one apps, teams are highly involved in their day-to-day operations and watching them very closely. Having those two things - the actionable alerts and the ability to see what the health of their system is at any given time, and to be able to check it against what normal looks like for those applications - gives the teams that use it in such a manner the information they need to be confident that their availability is as it needs to be, or better. As far as a hybrid environment goes, we have our own hosting environment because we are the cloud to our clients. So we're not necessarily in that situation. We don't use assets other than what's in our hosting environment.
In the past, one of our biggest problems was just plain old infrastructure incidents: basic availability incidents where a server or an application, an interface or an endpoint, may not have been available and no one noticed it until some downstream business end-result brought it to our attention. We've essentially eliminated 90 percent or more of those. It has been at least three years since we've done any numbers. But at the time, we might have had ten to 15 Sev-One incidents a month. When we last measured it, we were down to one. That was within a couple of years of implementing an enterprise monitoring strategy.
As for root cause, when a team is engaged in monitoring to its full extent, we're usually able to get to root cause pretty darn quick. For example, if a team has many servers that could potentially be impacting an application or a business service, tracking something down across those multiple servers and multiple owners could be really tedious and time-consuming. It would be on the order of hours, or at least many minutes, depending on the scope of the issue. With well-implemented monitoring, for our Sev-One apps, they're able to get to the solution almost immediately. If we have monitoring set up properly, the actionable event will tell them precisely where a critical component has failed and they can resolve it. Where it's a different type of incident that we might not have a particular monitor for, they're able to use the performance data, availability data, and other related alerts to get to their issue much faster than they used to. Having a good monitoring implementation has made a world of difference to our operations teams. It's so much so, that if you think back five years, which is an eternity in the IT world, when there was a Sev-One incident back then, someone would walk around tapping people on the shoulder all over the floor. That was very time-consuming. But now they're able to collaborate quickly and say, "It looks like this is the problem right here," in a well-monitored environment, and get right to the root cause.
It's helped our mean time to remediation, and I'm being conservative here, by about 70 to 80 percent. That's an absolutely huge impact.
What is most valuable?
We have many operational teams, and for any given team their requirements are different. One team is more reliant on infrastructure monitoring, because they are processing-heavy. Another team might be more reliant on endpoint monitoring where we're ensuring that the third-party endpoints they rely on are up and available. Another team may have fairly immature applications, so that they would rely heavily on log monitoring to catch all the errors that may come up. From a consumer-function standpoint, there isn't any feature that stands out. They're all important because all of our consumers are important.
From an administrative standpoint, what stands out in TrueSight is the ability to implement quickly. When they have a requirement to monitor something, we're able to turn that on quickly in their environment. We're able to set up new apps within a day. Most of the work in monitoring is working with the teams, evangelizing, educating, and making sure that they're bringing their smart requests to the table so that they get visibility into their business service. If the implementation wasn't as easy as it is, it would hinder and probably decrease the adoption of monitoring. But because we can turn requests around pretty quickly and adjust things as teams need adjustment for their different release schedules, administratively, we're able to respond and keep pace with the business and the technology that they're implementing. That is a critical function for us.
For how long have I used the solution?
We've been using TrueSight Operations Management for almost six years.
What do I think about the stability of the solution?
Stability is one of the areas where we've identified challenges with TrueSight, but those are details that I'm not entitled to share at this point.
What do I think about the scalability of the solution?
We've been able to implement all the hosts that we care to implement on a couple of servers, with minimal maintenance. We don't use their high-availability solution. We don't really require it because the underlying infrastructure is relatively robust. We haven't had any problems with the scalability. Had we been a couple of times larger, there would've been more to implement server-wise.
The other thing about our implementation is that we send a lot more performance data to our implementation of TrueSight than the typical BMC environment might. We send everything server-side for analysis rather than keeping everything agent-side or emphasizing agent-side, as I've seen a lot of other clients do. I think the tide is turning. I think more people are doing what we're doing where we just push all the data for potential analysis. But we've been able to accomplish what we need without too much infrastructure.
How are customer service and technical support?
They had an advisory board. We, as a group, and even I specifically, had been asked by them what they needed to continue doing. One of those was continuing to build out Knowledge Modules in various technologies. Some of the ones BMC has made available, we've implemented, and some of the ones BMC has made available don't impact us and we haven't implemented. But I've been in discussions where they say, "What do we need to do," and Knowledge Modules is one of those areas where they've made a commitment to continue adding to them, and we appreciate that.
Which solution did I use previously and why did I switch?
When we first started, we did not have a monitoring program at anything resembling an enterprise-type level. We were at about 4,000 hosts and we were really not monitoring anything except for a few services. At that, it was bare-bones monitoring. We monitored, maybe, half of our environment at bare-bones.
We went on this journey six-plus years ago to have an enterprise monitoring solution that focuses on business services. One of the reasons we did that is because of the number of incidents that we had that really should never have happened. Now that we're a number of years in, and we've implemented monitoring and brought teams around in the direction of business service rather than just an executable's use of a CPU, we have far fewer incidents.
As a general trend, we're much more capable of seeing what's out there and monitoring what our issues are and taking care of it before the business incident occurs. I don't have any particularly recent examples where our monitoring was able to resolve an incident after it happened. Of course, I don't get notified when people say, "Oh, look, I resolved this," because it's part of their daily operations to find an issue and resolve it. So it's not necessarily a newsflash anymore for us.
It doesn't happen quite as frequently as it used to, but they continue to build Knowledge Modules every time there are new products on the market. They need to create Knowledge Modules for the implementation to be enhanced. That's one of the key features of Operations Management. That's definitely something that helps us take advantage of everything BMC has. They're not resting on their laurels. They're building things out.
How was the initial setup?
The complexity of our environment demanded the complexity of the implementation. More than half of the effort that we had in implementing monitoring was based on the way we did our program. We were basically starting at zero and bringing teams up to speed, evangelizing, educating, getting people onboard.
The implementation of TrueSight itself was just a software implementation. It had its bumps and bruises. None of us were versed in BMC software. There were some learning curves as would typically be expected for any application of this scope, magnitude, and impact.
We had an overall strategy of doing proofs of concept for various, widespread technologies. We took that success and did a wide-to-narrow type of advertisement. We told everybody what was going on and then we brought more specific people into the room and said, "These are good targets for you to implement." During and after that evangelizing and advertising, we started implementing tier-one applications as an onboarding effort. We did that in a deep-dive fashion where we would sit down and interview these teams and really come to understand what makes their business service tick. A lot of our evangelization effort was actually in changing the focus of operations teams to think from a business service perspective. That paid off in dividends later when people were more interested in monitoring the actual functions of their applications rather than just the infrastructure of their application. We've been able to change mindsets over the course of a number of years. The first two or three years we were doing implementations. That was when we did most of that work.
From there, we worked as much as possible to allow folks to implement their own where possible, rather than centralizing it, so that people could keep up with their own demands. We were somewhat limited in TrueSight due to some of the RBAC controls not quite being what we wanted as far as delegating out administrative privileges for implementation. But because we were able to turn requests around pretty well, that burden wasn't too heavy.
From tier-one apps, we kept going and kept educating, bringing people to the table. When new applications come to our company, we still reach out and educate new teams, bring them to the table and use the onboarding process we built and solidified over the course of the first couple of years.
During the first three years, we had two-and-a-half FTEs for implementation. That was for the full program, not just the TrueSight component. It included all those interviewees, all those educational components, all the training, etc. The full program. The actual pressing of the buttons was about half of that. Once you stand it up and start connecting things, it's a matter of administratively using the tool to execute.
What about the implementation team?
Typically, our company builds knowledge for implementing infrastructure/operations activities like this from the ground up. We did not use a third-party. BMC was instrumental in our success in that they made resources available to us, implementation-wise as well as development- and support-wise.
What was our ROI?
The solution hasn't helped reduce costs in a measurable fashion. That's a measure that we wouldn't undertake. There might be soft-cost benefits, such as:
- impact on the quality of life for operations folks
- our ability to show our clients that the services we provide to them are healthy
- giving the business teams, our relationship teams, the ability to speak intelligently, rather than just colloquially, about how our systems are running.
Life at our company as an operations person is nicer now because you have confidence that what you're doing makes a difference, that the business service that you're working on is healthy. The business is happier when we're able to talk to them intelligently and say, "I can actually show you that we've been up and successful."
It has helped in our ability to work on smarter things rather than silly incidents. If we eliminate incidents, then we're doing better work. We're able to do the good work of business rather than the sad work of recovery. That's not only quality of life but it's also the ability to get things done. So I know that, at some level, we're doing more with less because of our monitoring. But we don't have any hard numbers from a monitoring perspective.
What's my experience with pricing, setup cost, and licensing?
We're end-of-lifeing it now. Overall, the licensing costs of BMC are a challenge for us in that they're hard costs, whereas open-source monitoring has soft costs, where it's harder to line-item. It's harder to see the cost of implementation for other things. So that change of direction is taking place. It doesn't mean the cost isn't there; it's just soft dollars rather than hard dollars.
Which other solutions did I evaluate?
We looked at Microsoft SCCM. And, because we had a partnership with CA, we looked at their tools. There were a couple of other minor players we looked at which just didn't have the scope of what we needed to do, because of the breadth of technologies that we use. In the bakeoff, we came down to BMC and Microsoft.
It was a long time ago, so I don't know that it's fair to judge at this point, but from a monitoring perspective, the whole Microsoft suite really wasn't there. There was a lot of scripting. It was easy to identify that the administrative burden was going to be high in that implementation. Conversely, with the BMC stuff, out-of-the-box, administratively, you click and implement. That is one of our components of success, our ability to implement quickly.
On the soft side, BMC as a partner was much more interested in our success than the Microsoft folks were at the time. It's very hard to quantify unless you're there sitting in front of them at the table and working with them, consuming their knowledge. It really is a great partnership.
What other advice do I have?
BMC is at a critical point in redefining TSOM, how it's built. Anybody looking at BMC now needs to jump on the new version of TSOM and skip the current versions. I would wait until their new environment is ready. It will be containerized. Anyone implementing BMC can get used to the environment in a PoC but they shouldn't implement until their new stuff is out. I expect it to be that much different.
Make sure that you have stakeholder buy-in and that they are able to provide the resources with the correct knowledge to implement in a smart fashion. Everybody's definition of "smart" is going to be slightly different. We really home in on the business service side to make sure that our business functions are healthy and that we're able to understand what's normal and what is out of normal. We work with the teams, even from the point that they're in development of projects, to make sure we're ahead of what's going on rather than reactive. But that means the buy-in of multiple teams: development, operations, support. That amount of effort requires stakeholders with decision-making capabilities to say that it's a priority for them.
We knew up front - and we've been able to validate our assumption - that monitoring doesn't do any good unless you are analyzing your business service for what are the critical components to observe. That's an educational effort and an implementation project. It's that upfront effort that will make your monitoring successful. Where we've been able to engage teams and teams have remained engaged, we've been the most successful in that. We took that to heart upfront, we made that part of our route to success, and we put the effort in. Our monitoring's been successful because of that. If we didn't do that, and we didn't constantly engage teams to make sure that they were aware of capabilities including the ability to give us feedback, and that we can implement quickly, we wouldn't be here. We wouldn't have advanced as far as we have. Most of that advancement was in the first two or three years, and we've just been riding that wave of success since then.
Keep in mind that most companies don't go from nothing to an enterprise monitoring solution; they go from one monitoring solution to another. But if there's anyone in the boat that we were in, where they are the size we were with no monitoring solution, they'll be in the pain that we were in. Implementing a good monitoring program, not just the tool, but a program around it, can make a world of difference to the operations teams, and subsequently to the business as well.
For those teams that are utilizing TrueSight, they don't rely on other monitoring environments. Some of those teams rely on those actionable alerts almost exclusively, and don't really use TrueSight's single pane of glass. We do have some teams that consume TrueSight and use it on a daily basis to ensure that they don't have any events, whether or not they've risen to the level of action. They'll also proactively look at some components, either business function components or infrastructure components, to ensure that they're working as designed and within the parameters of normal.
I don't think the functionality of Operations Management helps to support our business innovation. Our business runs forward and headlong into innovation, regardless of whether or not IT can keep up. We were never an impediment, other than cost. The way we run our overall IT environment is very open and flexible. Monitoring is a way for us to give business the confidence that what we're implementing is healthy, but it doesn't impact their interest in being able to implement what's new. They've always been able to do that and continue to be able to do that.
In terms of machine-learning, I mentioned above the baselining which, depending on how it's implemented, might be called machine-learning, but in TrueSight they just have a straight calculation-type of activity. We have other monitoring solutions that we're implementing as well, and that topic may be more applicable to them, but not in the TrueSight world. The TrueSight world is a straight application implementation. It's nothing exciting on that end.
I have to give our BMC partners a lot of credit for where they're planning to take TrueSight based on their roadmap, although it is speculative. I don't think the areas for improvement from us would be any different than anything they've already heard.
If someone were to implement the full suite of BMC products, you'd have to give it a nine out of ten. TSOM by itself, I have to give it a seven out of ten.
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
IT Manager at a manufacturing company with 1,001-5,000 employees
Single pane of glass has resulted in dramatic improvements; it is bringing people together
Pros and Cons
- "We're using native monitoring capabilities for all our server hardware, for visibility for applications, for URLs, for webpage response and accuracy, and for monitoring network throughput in a lot of particular instances. We're using lightweight protocols for pinging, for DNS, for LDAP."
- "The one piece that I would love to see is a general-purpose, configurable agent which would be a framework that you can deploy on anything, whether it be Java or anything else. It would allow you to easily deploy it on a platform that they support."
What is our primary use case?
We stood up an event management group and our responsibility is to monitor the entire company, globally: systems, applications, and infrastructure. We're modeling those out as services. We've got about 800 services that we're modeling out from the CMDB right now and monitoring pretty much everything.
We are big users of the service models. We use CA's SDM system, which we're evaluating. But in the meantime, we wrote the interface between TrueSight and CA to cut tickets and also to, in reverse, give ticket statuses in TrueSight. We're also going through a process of onboarding our services for event management where we go through a checklist of about eight different items and bring them on as a service with SLAs. Some individuals on our Service Desk - and eventually all will be - are dedicated to doing 24/7, 365 monitoring of the services, the events, and the applications.
One of the primary things we're doing is using this as a vehicle, within our "One-IT" initiative - which includes event management - to truly bring people together from a cultural and technological perspective. The goal is that everybody will have the same place to see what's going on. No longer will they have to worry about their application. Is it the databases? Is it the network? And how long do they have to spend trying to figure it out? Culturally, the Service Desk is coordinating some of those impacts when they happen, so that the right people are on the call, based on what the service model says. All in all, it's a very flexible tool, which means it's complex but very powerful.
We're using Operations Management, Capacity Optimization, some App Visibility with some of the Synthetic scripting and we're just starting to deploy some Java agents on some app servers.
How has it helped my organization?
With the service modeling, once we managed to build our import stuff to get our CMDB impact models and services into TrueSight, that was a big win. Because once we integrate it with SolarWinds, they will actually be able to see when there's a problem with the plant, and they will know if it is a network problem or a server problem. With the service models, they can actually get right down to the impact of any issue. We're working on some other things to make that easier, like event correlation. So if a network goes out at the plant, they don't need to know that there are problems connecting to 60 servers; rather, they need to know that they've got a problem with the router.
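The event correlation idea described here can be illustrated with a minimal sketch. This is not TrueSight's correlation engine or MRL; it is plain Python under stated assumptions. The event fields (`site`, `kind`, `name`, `status`) and the rule "a router down at a site explains every unreachable server at that site" are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def correlate(events):
    """Collapse 'server unreachable' events into the router outage at the same site.

    Each event is a dict with hypothetical keys: site, kind, name, status.
    """
    by_site = defaultdict(list)
    for event in events:
        by_site[event["site"]].append(event)

    actionable = []
    for site_events in by_site.values():
        routers_down = [e for e in site_events
                        if e["kind"] == "router" and e["status"] == "down"]
        symptoms = [e for e in site_events
                    if e["kind"] == "server" and e["status"] == "unreachable"]
        if routers_down:
            # Report the probable cause once, noting how many symptoms it hides.
            for router in routers_down:
                actionable.append({**router, "suppressed": len(symptoms)})
            # Anything that is neither the cause nor a symptom still gets through.
            actionable.extend(e for e in site_events
                              if e not in routers_down and e not in symptoms)
        else:
            # No router outage at this site, so every event stays actionable.
            actionable.extend(site_events)
    return actionable


# Made-up sample: one router down plus 60 unreachable servers at the same plant,
# and one unrelated unreachable server at another plant.
events = [{"site": "plant-a", "kind": "router", "name": "rtr-1", "status": "down"}]
events += [{"site": "plant-a", "kind": "server", "name": f"srv-{i}", "status": "unreachable"}
           for i in range(60)]
events.append({"site": "plant-b", "kind": "server", "name": "srv-x", "status": "unreachable"})

for event in correlate(events):
    print(event["site"], event["name"], event.get("suppressed", ""))
```

With this sample input, the 60 server events at the first plant collapse into a single router event, while the unrelated server event at the second plant is passed through untouched.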
We're currently looking at either consolidating the other monitoring tools that we have around the organization or connecting them for the single-pane-of-glass goodness. We're bringing in data from SolarWinds, we're bringing in data from Oracle's OEM, and we're integrated with an application that monitors desktops. It generates an event and a ticket is cut out to the regional support people. They will go to the desktop and say, "Your disk is in danger of imminent failure. We need to go ahead and clone that guy and replace it before you're down." So we're definitely going with a single pane of glass. In terms of our IT ops management, that means it's getting better. We're trying to be more proactive instead of reactive. We've only been heavily into this for nine or ten months so the actual, long-term impacts aren't measurable yet. We're still baselining where we are at.
The single pane of glass is a big improvement.
There is also the ability to do predictive and corrective, especially for some services which we're monitoring out in the field which are critical to various plant components. It used to be that they would go down and the plant would call. Now we're detecting that they're down, we're restarting them, and we're letting somebody know there's an issue. That's also a big improvement in our manufacturing capabilities. Culturally, it is bringing people together with one place to look and giving them something to talk about when there's an issue. It's bringing IT together. The collaborative and predictive stuff is actually starting to improve.
We're not doing a tremendous amount of preventative stuff yet - unless you count when your disk is three percent from being full and you need to do something before it fills up. We're not using some of the more advanced features of the predictive analytics yet. We are starting to look at some data analytics though. We have a data analytics group which we stood up, a couple of people who are starting to use data analytics to do some things.
It's improving the overall operation, but the impact is going to be measured a little bit later. We've seen some cost deferrals and some cost savings with some support renewals we haven't had to do on some other tools. But we haven't seen the major cost impacts yet. We have spent a lot, but on cost-avoidance for various support tools we have saved close to $1,000,000. In the nine months we've been operational, we've deferred cost on at least two tools. One was about $750,000 and the other was $250,000 for maintenance.
It also helps to maintain the availability of our infrastructure across a hybrid, complex environment. I used to work at FedEx and we're not as environmentally complex as FedEx because we consolidate a lot of stuff on the ERP. But if you throw manufacturing in there, we have pretty much every flavor of platform. As with most deployments, we've got three-tier and four-tier applications. You throw the network and some load-balancers in there and it's fairly complex. If you can use a service model to see exactly what's working and what's not, it really gives you the ability to look at some things.
The solution has also helped to reveal underlying infrastructure issues that affect app performance. Let's say there is a system that is occasionally slow but you don't know why. Then you find out that it was supposed to be configured to use a large number of LDAP servers for authentication but somebody had configured it to one. When you compare the times at which the systems people were having trouble logging on and you look at the CPU and memory usage on your LDAP server, you begin to put things together, without actually analyzing configuration files. You can figure out that the system is configured improperly. When they dig in, they find that it's only talking to one LDAP server. It gives us that kind of diagnostic capability, by looking at everything, and the ability to pin things down.
In terms of root cause analysis, we're still working that through. But mean time to repair is going down because it's becoming much more obvious. Between the events that people are looking at which are prioritized, and the service models which show the actual impacts to the relationships, it's becoming much easier. Depending on the event, it's gone from about four to five hours down to 20 minutes. When it works, it's significant. A lot of it is cultural. When you go from everybody monitoring their own stuff and not talking to anybody else, to everybody looking at the same single pane of glass, and you throw a Service Desk on top of that, which is performing incident management and coordinating some things - between the technology and the culture and the process changes, you're going to see some pretty dramatic improvements.
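As a quick check on the scale of that improvement, the arithmetic can be worked through in a few lines of Python. The function name is hypothetical; the inputs are the four-to-five-hour and 20-minute figures quoted above, which the reviewer says depend on the event.

```python
def percent_reduction(before_minutes, after_minutes):
    """Return how much shorter the 'after' time is, as a percentage of 'before'."""
    return 100.0 * (before_minutes - after_minutes) / before_minutes


# Four and five hours are the quoted "before" figures; 20 minutes is the "after".
for hours in (4, 5):
    before = hours * 60
    print(f"{hours} h -> 20 min: {percent_reduction(before, 20):.1f}% shorter")
```

For those figures, the repair time drops by roughly 92 to 93 percent in the cases where it works.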
BMC just did a custom KM for us. Typically, on a given server, we want to know when a drive is three percent from being full. But we've got some mixes of drives, servers which have anywhere from a 100-gig drive to a terabyte drive, and the percentages that we are worried about are not the same. This request came from our SQL group. BMC was able to adjust the alert parameters based upon the size of the logical drives. That was definitely a business innovation. I think that was good for BMC too. Although that's a custom KM which we just deployed, I suspect they will make that part of their standard tool kit.
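To illustrate what a size-dependent disk threshold could look like, here is a minimal sketch. It is not the custom KM that BMC wrote, and the review does not say how that KM works internally. The tier boundaries and percentages below are hypothetical values picked for the example, as are the function names.

```python
# Hypothetical tiers: (largest drive size in GB, minimum free percent before alerting).
# Bigger drives tolerate a smaller free percentage because the absolute space is larger.
TIERS = [(100, 3.0), (500, 2.0), (float("inf"), 1.0)]


def free_threshold_percent(size_gb):
    """Return the minimum free-space percentage allowed for a drive of this size."""
    for max_size_gb, min_free_percent in TIERS:
        if size_gb <= max_size_gb:
            return min_free_percent


def should_alert(size_gb, free_gb):
    """Return True when free space has dropped below the tier's threshold."""
    free_percent = 100.0 * free_gb / size_gb
    return free_percent < free_threshold_percent(size_gb)


# A 100 GB drive with 2.5 GB free is below its 3 percent threshold.
print(should_alert(100, 2.5))
# A 1,000 GB drive with 25 GB free is also at 2.5 percent, but its threshold is 1 percent.
print(should_alert(1000, 25))
```

The design point is that the same free percentage means very different amounts of headroom on a 100-gig drive and a terabyte drive, so one flat percentage either alerts too late on small drives or too early on large ones.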
What is most valuable?
From a TrueSight perspective, we love the Capacity Optimization. We manage to collect almost all our capacity information through agents, without having to deploy a capacity agent. We've already saved some money. We're now provisioning more for obsolescence than we are for expansion because we now know exactly what we've got. One of the nice things about it is that we've now put Capacity Optimization in all our plants and mills, where the money's actually made.
The flexibility of the MRL is great. The various abilities to use native KMs to connect to a lot of things that we're doing with the hardware monitoring into the consolidated stuff, like SharePoint, is great. We're using native monitoring capabilities for all our server hardware, for visibility for applications, for URLs, for webpage response and accuracy, and for monitoring network throughput in a lot of particular instances. We're using lightweight protocols for pinging, for DNS, for LDAP. We use the scripting KMs for a lot of stuff that we have to script ourselves. We're also doing a lot of SNMP polling for devices. We've got some places where we really couldn't use a traditional agent and we deployed a Java agent that we wrote. For example, we might be monitoring UPS's out in the field using a Raspberry Pi and pushing that data back up. The problem with UPS's out in the field, when you have thousands of them, is that you don't know that the battery's bad until the power goes out. This gives us the ability to enable them to report back via SNMP.
What needs improvement?
I can only speak from my perspective because I don't know if some of the issues that we've had are industry-wide or not. For instance, we've got a lot of Microsoft stuff here, and the SCOM interface is very difficult to use. They don't have support for SCCM and some other things so you have to go directly.
The one piece that I would love to see is a general-purpose, configurable agent which would be a framework that you can deploy on anything, whether it be Java or anything else. It would allow you to easily deploy it on a platform that they support.
The KMs and some of the user interface are a little bit quirky. That's the stuff that they will eventually get to. TrueSight is a fairly new platform revision for BMC. I'm seeing a lot of those simple platform things, where you have to go here and do this and you have to go there to do that. They're working very hard to integrate everything into the same simple console. I think that a lot of the issues that we have are going to slowly, or maybe rapidly, disappear.
For how long have I used the solution?
We installed it a couple of years ago. We started ramping up and have been using it since then. We really went hot and heavy about nine months ago. We moved from Windows to Linux in January so that's when we really started to invest in event management work with it.
What do I think about the stability of the solution?
On Windows we went to application HA and, quite honestly, it was terrible. They'll tell you it's terrible - or they should. We are very religious about patching, so when you go to multi-node HA stuff and you've got the Windows guys patching your stuff every Saturday night, you become very unstable. What we did was we moved to Linux so that the patching wasn't necessary as often. And we went to operating-system and hardware-level failover with Oracle Solaris virtual machines, and we've been incredibly stable since then.
What do I think about the scalability of the solution?
Regarding scalability, so far, so good. We've got about 22,000 devices that we're working with, of which about 8,000 are directly monitored. The rest are coming in from SolarWinds, the network, and some other things. We're running three TSIMs and one parent, so four infrastructure managers. We've got integration servers all over North and South America and Europe. It's very scalable.
In terms of users, it's mostly IT right now and a few business people. We've also got 300 to 400 service providers who log on and look at things occasionally. A lot of them just use the ticketing system. They don't actually get into BMC. They just work their tickets and close their tickets.
As for increasing the usage of it, the foremost thing in our pipeline is to continue to bring on applications. As part of the service onboarding that I talked about, we're bringing in major applications and sitting down with the service owners. We're going through everything they could possibly want monitored and showing them what we can do for them. We're putting those thresholds in place, training their teams, and bringing their teams on as users. Slowly, over the next year to year-and-a-half, we will bring in all of IT.
How are customer service and technical support?
Tech support varies; it depends on who you get. The first-tier is pretty good. If you get the right guy, it's outstanding. They've actually brought on a lot of new people, but they seem to work together as a team. I won't say they're bad, but I don't like tech support for most companies. Overall, they're on par.
Which solution did I use previously and why did I switch?
Prior to BMC, from a monitoring perspective, we were using 65 other solutions. One of my missions is to either integrate them or consume them. Bringing on TrueSight was the vision of a guy who's no longer here. He fully understood the need for a single pane of glass. He understood, fully, the need to bring light to the monitoring situation. We did some evaluations and proofs of concept and decided on TrueSight.
Quite honestly, if you're a large corporation, you can go look at the studies and you can justify it that way, but if you stop and think about how much better your organization can run, and the things that you need to do from an operations management perspective - and you think about the automation that you can put in place - it's a no-brainer. It's just a matter of choosing which tool.
How was the initial setup?
The initial setup was complex, no doubt, by the time you bring in Professional Services, if you opt to. We didn't follow the standard model because we didn't want them to come, drop in a configured system and say, "Here's the book on how it works," and then walk away. We wanted them to participate in every aspect of it. We brought a lot of it on ourselves, where they told us what to do and we did it. We worked with the Pro Services to do it, so it took longer than it probably should have, but we knew more about it than we would have as a result. It's a very flexible product, which means it's a very complex product. We had enough servers and monitors that we had to bring up a multi-tiered, large number of TSIMs. It was because of our service models that we introduced a lot of the complexity ourselves.
Because we're pushing full sets of service models out of our CMDB and into TrueSight to use as a service model, we have to put them at a top level of a TSIM so that all the other TSIMs that feed into them can show up as impact models. We went to a three-tiered architecture with presentation on top, a service management infrastructure manager in the middle, and the integration managers below. So a lot of the complexity in our particular configuration was due to the fact that we didn't want to have to figure out where those services belong, or which piece belonged on which TSIM. We wanted to punch them out to the top and then let TrueSight worry about it. So in the long run, it was complex to install but it is much easier to maintain.
The deployment took about three months. There was one person from BMC and about five people altogether. We had DBAs involved and we had the hardware guys involved and the network guys involved. It was probably three people full-time, but off and on. Every department that would touch this thing was involved at some point.
There is a team of five employees and myself who are not only maintaining it but doing all the monitoring configuration - working with users to collect monitoring requirements, setting thresholds and writing custom MRL and PSL.
At the cultural level, it used to be when we first started it up, people would say, "I have my own monitoring tool and I don't need you people. I'll do my thing." Now, they're saying, "You're doing things for these other people, can you help me out?" It's really grown organically, and we've had to put a team together so quickly that there has not been what should have been in place, which is a major deployment plan, where all of the pieces would fall together. We're starting to work on that now.
What about the implementation team?
We worked directly with BMC. We didn't use any third-party.
What's my experience with pricing, setup cost, and licensing?
The only possible additional cost that I can mention, that you might not be aware of, is that it uses Oracle partitioning, if you use Oracle. There are Oracle partitioning fees that go with that.
Which other solutions did I evaluate?
We looked at some other options. BMC has been around a long time. If you look at the industry ratings, it's way up there, top-right quadrant, along with a couple of other solutions. Its flexibility and its capabilities dovetailed with what we wanted to do and we liked their people. They have a good attitude.
What other advice do I have?
My advice is that it's not going to be as easy as you think, but it's going to be worth more than you think when you get it done. It depends on your situation. It depends on how far advanced you are in operations management. For us, this was a complete cultural, technological, and process overhaul. It wasn't just replacing one tool with another. It wasn't just putting a tool in place. It was an entire IT renewal and it's still going on.
It's been a long, hard road, both from a cultural perspective and from a technology perspective, just getting people to realize the value. But once they do, they're willing to bend over backward for you.
We had some false alerts. In my job the red light means it's bad and the green light means it's good. There should be no light you think is green but it's bad. We had some of that at the beginning, more our fault than anybody else's. But once we got to the point where the signals were good and people could appreciate what they are getting, we became a very different organization.
The biggest lesson I've learned from it is that you can talk about it, you can visualize it, you can proselytize about it, but until you have a single pane of glass which is actually up and running with a lot of stuff connected to it, you just can't really appreciate the value of it.
The functionality of the solution is not helping, so much, in terms of business innovation. We're not doing business process monitoring at this point. While it might be that the business is not complaining as much, I don't measure that. But from an innovation perspective, it has had people look at things and say, "Well, if you can do this, can you do that?" We get a lot of requests for strange things, some we can do, some we can't. But it's getting people to think about things that hadn't really come up before.
It's a really good tool and most of the issues we've got, they've either fixed or they're fixing to fix. So a nine out of ten is right.
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Sr Manager at a tech services company with 1,001-5,000 employees
It covers so many different technologies which can roll up into a single console
Pros and Cons
- "It is breadth. It covers so many different technologies which can roll up into a single console."
- "The noise reduction for ticketing works much better than we have seen in a lot of other companies."
- "I definitely would like to see more improvement in the self-diagnostics. I need to know when anything is not working or collecting, long before our customer finds it."
What is our primary use case?
My company is a data center service provider. We host and manage IT for all types of different companies, using TrueSight to manage and monitor the health performance availability of all our customers' environments: networks, servers, databases, websites, and all their back-end IT.
Right now, the focus is pushing DevOps and AIOps in our more traditional data center management. We are not using it in the cloud space today. Therefore, the focus is the traditional data center space, but for us, that is a very large space.
How has it helped my organization?
One case that we like to use a lot: We have a customer who uses F5 load balancers, and they were managing them with CA products. Those load balancers were generating around 11,000 tickets a month. Just moving them from CA to TrueSight, and replicating the same rules, they went from 11,000 tickets a month to 400 tickets a month. TrueSight did a much better job of doing the same thing. Then from there, we were able to tune it. We got it down to about 40 tickets a month. While this is an extreme example (I don't usually see this type of improvement), it shows the power that is there.
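As a quick check on the scale of that example, the arithmetic can be sketched as below. It uses only the ticket counts quoted in this review; the function name is illustrative.

```python
def reduction_pct(before: int, after: int) -> float:
    """Percentage drop in monthly ticket volume."""
    return (before - after) / before * 100


# Figures quoted in the review: 11,000 -> 400 -> 40 tickets a month.
migration = reduction_pct(11_000, 400)  # same rules, moved from CA to TrueSight
tuned = reduction_pct(11_000, 40)       # after further tuning

print(f"{migration:.1f}% after migration, {tuned:.1f}% after tuning")
```

That is roughly a 96 percent drop from the migration alone and over 99 percent after tuning, which is why the reviewer flags it as an extreme case.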
We are able to more quickly identify problems and get an engineer on it to restart services, etc. It is not fixing the customer's bugs. They've got buggy apps, and it goes down all the time. It is just that we can get them back online faster.
What is most valuable?
- It is breadth. It covers so many different technologies which can roll up into a single console.
- The noise reduction for ticketing works much better than we have seen in a lot of other companies.
- We're starting to get into the machine learning pieces to further enhance the intelligence of events.
What needs improvement?
Continue to improve the maturity of the product overall.
I definitely would like to see more improvement in the self-diagnostics. I need to know when anything is not working or collecting, long before our customer finds it.
I would like to see continued improvement in integration with some of their partners. We use a lot of Intuity software. While the connections are good, they could be better. We use App Visibility as part of the TrueSight suite. Previously, we were a big BMC TMRT customer. They gave up a lot of TMRT features to get App Visibility in, features that our customers used. The customers still complain about this weekly: "When are we going to get this report or view back?"
When we took this issue back to BMC, they said, "It wasn't an upgrade from TMRT. It's a brand new product. It just happens to be serving the same market." From my user standpoint, we went from BMC TMRT to BMC App Visibility, giving up all these features. For us, it was an upgrade that we lost features on. I need that stuff back, at the end of the day, as a service provider. The customers need to feel comfortable that the data is there. They need to have accurate SLA type reports. The SLA reports that we get on TrueSight today are unfortunately worthless. They go to the whole integer. So, they all show 100 percent, when we've got contracts which are 99.996 percent and are now rounding to 100. Well, if we were at .9995, that's an SLA miss. Things like this are a problem. We have to do all this manually on the side. We can't roll this back, as the versions that we used to use are long out of support.
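The rounding problem described above can be illustrated with a small sketch. It assumes the report rounds half-up to a whole percent (the review only says the values "go to the whole integer"), and the function names are hypothetical, not part of any TrueSight report API.

```python
from decimal import Decimal, ROUND_HALF_UP

SLA_TARGET = Decimal("99.996")  # contractual availability quoted in the review


def whole_percent(measured: Decimal) -> Decimal:
    """Value a report would show if it rounds to a whole percent (assumed half-up)."""
    return measured.quantize(Decimal("1"), rounding=ROUND_HALF_UP)


def meets_sla(measured: Decimal, target: Decimal = SLA_TARGET) -> bool:
    """True only when the unrounded availability reaches the contract target."""
    return measured >= target


measured = Decimal("99.95")  # the ".9995" case the reviewer calls an SLA miss
print(whole_percent(measured), meets_sla(measured))
```

The rounded figure reads as 100 even though the unrounded value misses the target, so the comparison has to be made on the unrounded number, which is the manual side work the reviewer describes.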
The biggest issue is probably the gaps in the reporting that I need for my end customers. Telling a customer, "I can't give you the report that you need," is very public and embarrassing. Also, the reliability of the ISNs needs improving. Having a customer find a machine that stopped collecting before we do is not what you want when you're a service provider.
For how long have I used the solution?
We have been a BMC client since 2001. We've been through many generations of the product.
What do I think about the stability of the solution?
The stability has a bit more maturing to do. There is still room for improvement. Overall, it's pretty good, depending on which layer you're looking at. At the highest level, which is the presentation server, we find that we have to restart that every two months or so, just because it stops responding. I would like it to be a bit better. We don't have any real understanding of what's causing that. The next layer down is the infrastructure manager level. That's probably about the same: every couple of months it stops responding. Farther down is the data collection layer, the ISN level. Those aren't as stable as they need to be. They will go for six months fine, then fail three times in a row in two weeks. It doesn't give us a good alarm, and unfortunately, we've missed an event. Then, the customers notice something, and that didn't pass its events. So, a little more maturity is needed here.
What do I think about the scalability of the solution?
It's scaling fairly nicely, but not as large as we would like. We are not seeing the type of scalability that BMC claims. For example, they say that you can run 900 agents against an ISN. We find the ISN stability goes down when you hit 500 or 600. So, you're only at two-thirds of the capacity. I forget how many millions of things that the TSIM was supposed to be able to handle. We are nowhere near that capacity. We're spinning up more TSIMs because it's just not scaling as advertised.
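To show what that gap means for sizing, here is a rough sketch. The 900 and 600 figures come from this review; the estate size of 9,000 agents is a hypothetical number chosen only for illustration.

```python
import math

ADVERTISED_AGENTS_PER_ISN = 900  # vendor figure as quoted by the reviewer
OBSERVED_STABLE_AGENTS = 600     # upper end of the reviewer's 500-600 range


def isns_needed(total_agents: int, agents_per_isn: int) -> int:
    """Number of ISNs required, rounding up to whole nodes."""
    return math.ceil(total_agents / agents_per_isn)


total_agents = 9_000  # hypothetical estate size, not from the review
planned = isns_needed(total_agents, ADVERTISED_AGENTS_PER_ISN)
actual = isns_needed(total_agents, OBSERVED_STABLE_AGENTS)
print(f"planned {planned} ISNs, needed {actual} ISNs")
```

At two-thirds of the advertised capacity, the same estate needs half again as many collection nodes, which matches the reviewer's experience of spinning up extra infrastructure.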
How are customer service and technical support?
Technical support is a mixed bag. Some tickets go in and are handled very quickly and well. However, we have had tickets which go in and have been out there for months, and some of them were fairly complex. They will go up to Tier 2 or Tier 3, then park. I'm assuming that we're running into a software bug, or something, but those tickets that stall out are frustrating.
How was the initial setup?
It was complex. I wish we had put Professional Services into the deal. Being a service provider, we are attached to companies all over the world with very strict auditing and security requirements. Therefore, designing the architecture to work in that environment was fairly complex. I was just talking to a product owner about the problems that we still have.
Once we got the architecture, the deployment went fairly smoothly. The policy creation and management were much more complex than in their previous products. It is probably more powerful, but not as easy to administer.
They have rolled things, which were multiple products separately in the past, into a single product. They've had to do some consolidation, or adjustments, to be able to merge them quickly to get their product to ship. This left some things missing. Some features that used to be there are gone. Features that we used to use. So, there are pain points, as we figure out how to work around the new gaps.
What about the implementation team?
We did it ourselves.
Globally, I've got six engineers and 12 operators who worked on the deployment. This is a sizable group. However, I'm currently supporting global operations of a couple hundred clients, and they're major clients.
What was our ROI?
TrueSight has helped reduce IT operations costs. From a software standpoint, I have been able to eliminate a lot of other tools, saving approximately half a million dollars a year in other maintenance costs. That is easy savings. The more important one is the labor savings: more reliable, simplified tickets.
The time savings are recognized by the operations teams, not my team. Therefore, it's hard to know the time savings, but if an operations person takes at least 15 minutes to analyze a ticket and their ticket volume is reduced by 10,000 a month, then TrueSight does save time.
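The estimate in that last sentence can be worked through as a short sketch. Both inputs are the reviewer's own figures; since the 15 minutes is stated as a minimum, the result is a lower bound.

```python
def hours_saved(tickets_avoided: int, minutes_per_ticket: float) -> float:
    """Operator hours saved per month for a given drop in ticket volume."""
    return tickets_avoided * minutes_per_ticket / 60


# Reviewer's figures: at least 15 minutes per ticket, 10,000 fewer tickets a month.
monthly_hours = hours_saved(10_000, 15)
print(f"at least {monthly_hours:,.0f} operator hours a month")
```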
We've been reducing ticket noise by five to ten percent every year, and it has been cumulative. This means fewer tickets, less noise, and less operator intervention.
What's my experience with pricing, setup cost, and licensing?
It is a large, complex product. So, there is a commitment of manpower to deploy it, as it is not a cheap product.
We license per named endpoint for most of the products: servers, network devices, databases, etc. You pay for the initial license and maintenance. The way that my company looks at it is we figure out our monthly costs over five years, and right now, we are between five and six dollars. We need to get that down to about four dollars. That's included in the maintenance.
There is a big upfront cost when you buy the license, then there is annual maintenance. We look at, if I bought a license and paid for maintenance for five years, then average it out, what would be my monthly cost. We have had some of the competing tools come in around four dollars. This is coming in as a premium, which is why I don't have it deployed as I would like it. Therefore, we're in negotiations right now. If I can get it down to the four dollar range, I will triple my deployment in a year and a half. If they could get me to the right price point, there are 10,000 to 15,000 servers that I would install it on.
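The averaging method described above can be sketched as follows. The license and maintenance amounts are hypothetical placeholders, picked only so the result lands in the five-to-six-dollar range the reviewer mentions; they are not real BMC prices.

```python
def monthly_cost(license_fee: float, annual_maintenance: float, years: int = 5) -> float:
    """Upfront license plus maintenance, averaged per month over the term."""
    total = license_fee + annual_maintenance * years
    return total / (years * 12)


# Hypothetical per-endpoint inputs, for illustration only.
cost = monthly_cost(license_fee=180.0, annual_maintenance=30.0)
target = 4.0  # the reviewer's target monthly price point

print(f"${cost:.2f} per endpoint per month, ${cost - target:.2f} above target")
```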
Which other solutions did I evaluate?
As we've acquired other companies, we've picked up pretty much every other tool set out there: CA, IBM, SolarWinds, etc. We have played with pretty much everything. The BMC TrueSight platform wins probably 80 percent of the time if you look feature by feature. It's a good, strong platform. Its ability to run on all the OSs that I've got is a huge thing. We do a lot with IBM iSeries, and a lot of vendors don't cover that. So, this is a big positive on the platform.
Being able to roll everything up to a single database and single feed out for reporting are all very big positives. The same type of consolidation rules under CA, if you write them in BMC, they just work when they didn't work in CA. Things like that make BMC great.
What other advice do I have?
You really want to plan out your policy and architecture in great detail before you start any deployments. It is a complex product. You don't want to have to go redo it. Pick a small environment, test out your plan, test it out a second time, beat it up, and once you're happy with it, then go nuts by deploying it everywhere. It's great once it's there, you just have to get past that design hurdle, because there are things that aren't necessarily intuitive.
I have a mixed bag impression of the usability. The end user experience is mostly good, as it's a very clean interface. There are some quibbles with it. You have to drill into a lot of layers to get into the data that you want. However, when you hit "Back", it takes you all the way back out of the tree. Then, you have to redrill into all those layers. That is a bit of an annoyance for end users. From an administration side, it is still sort of heavy, and policies are very complex. Therefore, it takes a fairly senior level engineer to build it and get it to work well. But, once it's working well, I can monitor tens of thousands of things.
Definitely get multiple references from each of the clients, since all salesmen lie. They all promise the best possible scenario, and I have found that, depending on the client, you get very different experiences. So, the claims that the BMC sales guys have made are all achievable in a perfect environment. No one has a perfect environment.
Claims from CA, I have found to be outright fabrications, such as, "We can do this." Then, we buy the product. "Oh well, you actually need Professional Services, and you're going to need like three years of custom coding." Millions of dollars down the drain with them.
Other vendors have different levels. They all come in very rosy, and sometimes too much. So, talk to people who have really done it. Take their advice. Don't assume that they didn't know what they were doing. There are a lot of good engineers out there. If the company is struggling, assume you will also struggle.
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Director Product Management at Park Place Technologies
Enables us to monitor a hugely diverse set of hardware products from multiple manufacturers
Pros and Cons
- "The ability of this platform to monitor the very diverse assets that we maintain around the world is its most valuable feature... We support a vast array of manufacturers' equipment, like HP, IBM, Cisco, Dell, EMC, Hitachi... We can do it all with [this] one [solution]."
- "We have a unique use case because BMC typically sells this solution into enterprises that are deploying it within their IT, versus to a managed services provider like us where we're supporting thousands of customers. Multi-tenancy and the scalability have been challenges along the way, as we've grown... If anything could have gone better as we were ramping this up and adding a lot of volume to it, I would say it's the scalability. That would be one thing that could be improved."
What is our primary use case?
We're actually hosting the software and providing services to our customers based on all the capabilities that are within TrueSight. We are a very large, global, hardware maintenance provider for data centers. We mostly service the high-end data storage and networking equipment that you would find in data centers and in cloud environments.
A couple of years ago we started on a journey to really improve our ability to maintain and service our customers. This was all about connectivity, getting connected to those servers and storage platforms. We wanted to get connected to everything that we were maintaining around the world so that we could really implement a "diagnosis before dispatch" approach.
With this solution, we gather all the data from a server that has failed, and we do all the troubleshooting, the problem and root-cause determination - we call that triage - before we ever send a field engineer or anyone to the site. So when we do send a part or do send a field engineer, we know exactly what the root cause of the problem is and what they need to do to fix it.
How has it helped my organization?
We are using this solution to scale our business and to drive greater efficiencies. The other side of it is that it's much better for our end customers because they no longer have to monitor their own environments for hardware failures. We do that for them. They don't have to recognize that a server has failed. They don't have to pick up the phone or send us an email to open a ticket and send us files to help us troubleshoot the problem. We're really reducing a lot of the effort required on the customer's side to manage their IT environment using this tool because we can detect the failure, we can troubleshoot it remotely. And, when we do implement the corrective action, we're pretty certain of the root cause, based on the technology and the capabilities of TrueSight.
It has improved our time to repair. From the time we get the incident logged to the time we get the customer back up and running, it has improved that by 33 percent or greater. It has also improved our ability to fix it right on the first call. It gives us the root cause of the problem, and it automates that whole triage, it gives us the part number of what's failed. We're now at somewhere around a 97 percent first-time fix rate. And that's only going to get better as we get more experienced with the product. And that's important to our customers. When we come out, we're going to fix it right on the first call and not have to come again and again and again. That's really important to the uptime of their IT.
We have a graphical representation of this very thing. It shows the old way of service delivery, in which the customer first had to recognize they had a problem. Once they recognized they had a problem, they had to call in or email and open a ticket. Once they opened a ticket, the whole troubleshooting process would begin. We were often calling them as many as eight times per ticket, just to get information about the failure. That was taking a lot of time from the customer. After that, we would have to dispatch someone with the right part or the right solution, and oftentimes we either brought the wrong part, or we had to bring a handful of parts, which was costly for us and would drive up the cost of the service for the customer. And often there would be a repeat call, because we might not have brought the right part or have sent the right level of skill out on that call. That was the old way of doing it.
The new way of doing it for the end-customer is that we call them to let them know we have spotted a problem with their server, for instance, and that we're working on it. We don't have to bother them for log files or diagnostic logs or any of that information anymore because it all comes packaged with the alert from TrueSight. The customer really only hears from us two times now: once, when we open the ticket to let them know we've seen a problem and again after we've resolved it.
Another example is that many of our customers have equipment in co-location centers and offsite data centers, where they don't even have anyone to see that there's a problem. Now, we are driving a lot of efficiency for them. They don't have to send people out to check on problems anymore or pay somebody who is running the co-lo to go out and check on something. We're able to see it all remotely through the monitoring tool. That's another huge benefit that we've heard about from our customers.
The solution provides us with a single pane of glass where we can ingest data and events from many technologies. In terms of our IT ops management, we have a unique deployment. We actually have it running in our own shop. Everything that we deploy to our customers we deploy internally first. But we've really licensed and implemented TrueSight to drive our services business. We're supporting all of our customers' data centers with the product. We're not connected to all of those yet. We just officially launched the solution in January of 2018. We've got about a year-and-a-half in production with the product and we're getting good adoption. The real answer to its effect on our IT ops management is not so much our internal deployment. It's more about the deployment that we're leveraging for all of our 16,000-plus customers globally.
We've had a number of cases where, through the analytics in TrueSight, we've actually been able to predict failures. For instance, we've already had a couple of cases where, if we see a hard drive on a storage array is going to fail, we'll actually send the part out ahead of the failure. That allows us to replace that drive before it fails - and on the customer's planned downtime. In the old model, it fails, it's down. The customer waits for us to come out, swap it out, and bring everything back up. In the predictive model, we know it's going to fail, we send the part out ahead of the failure, and we replace that drive on the customer's scheduled downtime. The more of that we can do - and as we expand beyond hardware into operating system, application, and the other layers of infrastructure - we'll be able to exploit the machine learning and the AIOps to a greater degree than what we're doing today on the hardware side.
The way we talk to our customers about the functionality of the solution across IT ops management to support business innovation is that because we've significantly reduced the amount of time they have to spend managing service tickets, they have more time to focus on their digital strategies. We say, "Hey, we're giving you some time back. You don't have to spend all this time interacting with your service provider. You're just going to hear from us when you have a problem and after we've fixed it. We won't bother you for log files and all those things." We're actually giving them time to allow them to do more value-added work, like working on their strategic initiatives and their digital transformation initiative. I think we'll be able to expand on that as we go forward.
What is most valuable?
The ability of this platform to monitor the very diverse assets that we maintain around the world is its most valuable feature. We service over 350,000 data center assets. These assets come in the form of servers, storage arrays, networking devices, etc. We've calculated that we service and support over 36,000 data centers around the world.
We're not really tied in with the manufacturers, but we support a vast array of manufacturers' equipment, like HP, IBM, Cisco, Dell, EMC, Hitachi; and I could go down the line. We have a very diverse install base under contract and TrueSight can connect to all of those and monitor all those different platforms. Many of our customers have as many as 20 tools in their IT environments to try to monitor all this stuff. We can do it all with one, and we're hosting it for them. So it really gives us the ability to take some of that burden off the end customer.
The other really important thing to us, and the reason we chose TrueSight, is not only to monitor and to capture failures and alerts when things fail out there, but to do what we call "automated triage." No matter who manufactured the equipment, when we get the message that tells us something has failed, it always looks the same. Whether it's EMC or Dell or IBM, whatever the equipment might be, TrueSight always returns the event in a standard format which gives us the manufacturer, the model, the serial number. It even gives us a list of what has failed, whether it's a hard drive or power supply, for example. It even gives us the part number of that specific device in that specific machine. That really helps automate the troubleshooting and the triage process. That's a big feature for us.
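The idea of a standard event format can be pictured with a small sketch. The field list follows what the reviewer names (manufacturer, model, serial number, failed component, part number); the class, the mapping, and the sample payload are all hypothetical and do not represent TrueSight's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareEvent:
    """Common shape for a failure event, whatever the vendor (illustrative only)."""
    manufacturer: str
    model: str
    serial_number: str
    failed_component: str
    part_number: str


def normalize(raw: dict, field_map: dict) -> HardwareEvent:
    """Build a HardwareEvent by mapping common field names to vendor-specific keys."""
    return HardwareEvent(**{field: raw[key] for field, key in field_map.items()})


# Hypothetical vendor payload and key mapping, invented for this example.
VENDOR_A_MAP = {
    "manufacturer": "vendor",
    "model": "product",
    "serial_number": "sn",
    "failed_component": "fru_type",
    "part_number": "fru_pn",
}
raw_a = {
    "vendor": "ExampleCo",
    "product": "X100",
    "sn": "ABC123",
    "fru_type": "hard drive",
    "fru_pn": "PN-0001",
}

event = normalize(raw_a, VENDOR_A_MAP)
print(event)
```

With one mapping per vendor, every alert reaches the triage team in the same shape, which is the property the reviewer credits for automating troubleshooting.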
The solution's event management capabilities are proven. We always like to say it performs as advertised. We evaluated over a dozen products before we chose TrueSight, and we found it to be very good at monitoring at the hardware level, which is core to our business. The ability for it to capture those failures, to capture all the events from that very diverse set of equipment which we maintain out there, means we are very impressed with the performance.
In terms of the breadth of the solution's monitoring capabilities, I've already addressed the different types of products, the different manufacturers. The diversity of what we service out there is amazing, and it can really monitor just about everything that we maintain out in the field. But the other aspect of the breadth is the fact that not only does it do hardware really well, but it's really going to help us start to add to our portfolio of services. We're going to be able to use this to monitor operating systems and applications and software and networks, and even all the way to end-user experience. Ultimately, we're going to be able to move into other areas of service, based on the breadth of what it can do in the total IT infrastructure.
For how long have I used the solution?
In production, we have been using it for about a year-and-a-half.
What do I think about the stability of the solution?
We're in a very stable environment now but it took a little time for us to get there. That's because of the multi-tenancy, the scalability, and the volume of traffic that we're driving through their platform. They're very different than what they're used to. It's potentially hundreds, potentially thousands of customers, with a lot of equipment in their data centers flowing through. We are now in a very stable place in production. We feel very comfortable going forward, scaling it out, and adding thousands of customers to it. It took us a little bit of time to get there and we needed a lot of support from BMC, but we feel good about it right now.
What do I think about the scalability of the solution?
We have a unique use case because BMC typically sells this solution into enterprises that are deploying it within their IT, versus to a managed services provider like us where we're supporting thousands of customers. Multi-tenancy and the scalability have been challenges along the way, as we've grown. But BMC has really been a great partner helping us address those things.
Building that kind of scale and multi-tenancy into the product would serve companies, the way we're deploying it. It's a little different than what BMC is used to, but that would be one thing I would put out there. If anything could have gone better as we were ramping this up and adding a lot of volume to it, I would say it's the scalability. That would be one thing that could be improved.
How are customer service and technical support?
BMC's technical support has been great. They've been by our side. They've been working with us. They could have just said, "Look, our product wasn't built to do that. Good luck." But they didn't. They stuck with us and they're still with us today helping us optimize and do things better. They've been a great technology partner for us.
Which solution did I use previously and why did I switch?
Most of the storage products have a native "call home" feature. It's like email alerting, so when a hard drive fails on the storage array, it will send an email. A lot of the manufacturers did that for the warranties. It would send them an email and they could take care of the warranty claims. What we did was redirect those emails to us, because most of what we do is after the warranties have ended on a product. We were getting all these emails from potentially thousands of things that we were maintaining out there, and every email looked different. Emails from HP looked different than those from EMC which looked different than the ones from IBM or Hitachi. Everything was in a different format. It took a long time to sift through these emails to figure out what was actually wrong, and it was very inefficient. That's how we were doing monitoring.
We also had a little black box that we built internally that was using SNMP and some other technologies. But a lot of customers don't want some rogue hardware in their data center. It's a security concern. So that was very limited in its deployment. Overall, by and large, we really weren't monitoring. We were very crude in our methods and there was a very limited number of things that we were monitoring at the time I came in.
That's when we started thinking, "You know, if we either build or buy a world-class monitoring platform and get it connected to everything, we could really differentiate ourselves in the market." That's what led us to start evaluating some commercial, off-the-shelf things like BMC.
How was the initial setup?
We got it up and running pretty quickly. We had it up within three months because we had to buy hardware and build the whole infrastructure, so it was a little more than just installing the software.
Then we did what I call a controlled deployment. We had about ten to 15 customers in a pilot program. We ran that over about a six-month period before we went live in production.
What about the implementation team?
We had a consulting firm that worked with us, a firm which BMC had brought to the table named Column Technologies. That experience was not good. BMC had said these guys were one of the best partners they had, and they probably are. It could have been Column Technologies, it could have been anybody that they brought in.
Our implementation was so unique and different compared to what they were used to. They were used to going into an end-user and helping them get this solution deployed within their own IT environment, to manage their own back-office IT. But that's not how we were doing it. We were putting it in as a service platform to manage thousands of customers and hundreds of thousands of devices, potentially. So the implementation was very different.
BMC had to work with us pretty extensively on how we were configuring and putting this in to make it work the way we needed it to work. I'm not going to pick on the consultant that much or criticize them too heavily because this installation was very different than what they were used to doing.
We got a lot of support from BMC because it required it. We needed the guys who built the product to help us get this thing implemented in such a way that it would support our business model. Ultimately, we solved those problems and we're in good shape now. But there were some startup issues, that's for sure.
What was our ROI?
I don't know that I have a number available. When we embarked on this journey we had some business-case assumptions about what our internal savings would be. We've got a little more work to do to come up with those numbers. We need to get more volume deployed before we can say we have a reliable percentage of OpEx reduction.
What's my experience with pricing, setup cost, and licensing?
Pricing is all volume-driven. I think we were paying between $80 and $85 per license. That's per unit, for a perpetual license. You pay it one time and then, every year, you pay 20 percent of that for annual maintenance and support.
But now that we've grown, we've purchased tens of thousands of licenses and the cost per license has gone down to something like less than $30.
I wouldn't call it an agent cost because the way they price it is based on the number of things you have connected. You can connect hundreds of things to a single agent but you're paying by the number of things. That's how you use the licenses. So it's really priced by endpoint, not by agent.
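As a rough illustration of the pricing model described above (a one-time perpetual fee per connected endpoint, plus 20 percent of that fee each year for maintenance and support), the total can be sketched as follows. The function name, the 1,000-endpoint count, and the three-year horizon are hypothetical values chosen for the example; only the $80 unit price and the 20 percent rate come from the figures the reviewer recalls.

```python
def total_cost(endpoints, unit_price, years, maintenance_pct=20):
    """One-time perpetual license fee plus yearly maintenance on that fee.

    Pricing is per connected endpoint, not per agent, so `endpoints`
    counts monitored things rather than installed agents.
    """
    license_fee = endpoints * unit_price
    maintenance = license_fee * maintenance_pct * years / 100
    return license_fee + maintenance


# Hypothetical example: 1,000 endpoints at $80 each, kept for three years.
cost = total_cost(endpoints=1_000, unit_price=80, years=3)
print(f"${cost:,.2f}")
```

At the lower volume price of under $30 per license, the same formula applies with a smaller `unit_price`.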
Which other solutions did I evaluate?
When we were just starting the journey, we looked at ScienceLogic, Centerity Monitor, and we looked at CA. We also looked at the Microsoft product. Those represent a handful of the products we evaluated.
What other advice do I have?
If we had to do it all over again, we would have spent a lot more time in the early going on planning the architecture, on how we were going to build this out. That could have saved us some pain, once we got it up and running and started adding customers and expanding it. If we had spent a little more time with BMC, planning architecturally how we were going to design this to support the scale we needed, it would have helped. That was a lesson learned. And that would be some advice I would give. Depending on how you're planning to use the tool, make sure you spend some time looking at the architecture in the systems and the architectural design of how you're going to implement it to make sure it's going to meet your needs. Make sure it's going to scale appropriately and do what you need it to do.
Our goal is to get this solution connected to every single customer that we're maintaining equipment for, because of the efficiencies and the improvement in the end-user experience. When I say we support over 350,000 assets in 36,000 data centers around the world, that is our maintenance business. We're working to connect TrueSight to all of that. We have sold - not quite yet deployed, but we have sold - about 33,000 licenses, which means assets. We've deployed just under 10,000 of those so far. So we're making good headway and we're very pleased with how it's performing so far.
One lesson that we've learned is that we're now in a great position to expand our portfolio of services which we offer to our customers, well beyond hardware. Without this technology, we could never get there. Prior to us putting this in, it was all done manually. Phone calls, emails, people driving to the site to try and diagnose problems. It was very manual and inefficient and not scalable the way we were doing business. And we were growing so fast. There's no way we could have scaled to where we're at today or scale to where we want to go, even in our core business.
The other lesson we're learning now is that our customers are asking us to do more, and this technology is going to help us do more for them and expand our business. It will enable us to expand our portfolio of services. That's our biggest lesson. When we started out it was really all about driving operational efficiency in our hardware maintenance business. And now we've learned we're in a very good position to move into other services, based on what the capabilities of this platform bring to us, beyond hardware - into application monitoring and operating system and network and all the other pieces of the infrastructure. We can start to support them going forward.
It has completely changed our way of thinking about our strategy going forward. It's amazing.
At this point in time, I'd rate it a ten out of ten. We've got something really unique here. We built some integrations, some things of our own around it. And we're feeling really good about it.
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Vice President of Managed Services at Park Place Technologies
Enables us to proactively service our customers and even warn them about problems before they occur
Pros and Cons
- "The fact that they have a very integrated relationship with Sentry Software, the Knowledge Module, is valuable... The richest feature for us is the number of Knowledge Modules that we can load into the product to add breadth of service to the customer. It enables us to move up the operational stack from hardware, to operating system, to application, and to cloud... That enables us to provide one pane of glass over all those layers - hardware, OS, app, and cloud."
- "Reporting would be an area for improvement in TrueSight... We have almost 800 customers today on TrueSight and just under 10,000 assets. We need to be able to give a customer some information. If the customer's product fails, they'll ask us, "Did it have a problem beforehand?" We have all those events and we know all the problems it had beforehand. We have to be able to give them access to that kind of reporting. That's an enhancement that we need."
What is our primary use case?
Park Place Technologies brought in TrueSight for three reasons. The first reason was the Presentation Server - the architecture.
The second reason was the fact it has the AIOps piece.
The third reason was their partner, called Sentry Software, out of France. We are a hardware maintenance company. We're probably one of the largest providers, worldwide, for replacing drives and storage equipment. We brought TrueSight in as a means of seeing if we could reduce the number of physical touches on a service ticket from eight to two. We've been accomplishing that with TrueSight and the Sentry software.
We provide post-warranty support for storage equipment and data center equipment. For example, if it's a VNX piece of storage gear that goes off warranty, we come in and we maintain it at a high level off of what the customer paid the OEM. We do the parts and the service in 35,000 data centers worldwide. TrueSight is enabling us to get that done in an automated fashion.
Sentry is the Knowledge Module we use in TrueSight. It has all the information about the storage equipment that we maintain. It tells us the part, the chassis, serial number, and all the detail that we spend a lot of time on phone calls with the customer trying to ascertain. We're doing that automatically now.
How has it helped my organization?
We brought the product in to handle the following: We're in 35,000 data centers today. We have 16,000 customers and we support about 400,000 assets. Those are big numbers. The pieces of storage equipment we provide have something native from the equipment manufacturers, the OEM, called "phone home." What happens is, when these devices start having a problem they send out an email that says, "I'm having this problem." To put that into perspective, we were trending towards 2,000,000 emails at the end of 2017, and growing. We would have to read 2,000,000 emails to find out what was going on. Something lower than seven percent actually had a problem we really had to read, and something well below one percent of those were actually a service event.
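To put those percentages in absolute terms, here is a small back-of-the-envelope sketch. It treats "lower than seven percent" and "well below one percent" as upper bounds, and it reads "of those" as one percent of the emails that needed reading; both readings are assumptions, not figures the reviewer stated.

```python
TOTAL_EMAILS = 2_000_000  # "phone home" emails, trending figure for end of 2017

# Upper bound: fewer than 7% of all emails described a problem worth reading.
actionable_max = TOTAL_EMAILS * 7 // 100

# Upper bound: well below 1% of those were actual service events.
service_events_max = actionable_max * 1 // 100

print(f"Emails worth reading: fewer than {actionable_max:,}")
print(f"Actual service events: fewer than {service_events_max:,}")
```

Under these assumptions, at most about 1,400 of the 2,000,000 emails corresponded to a real service event.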
Before we brought in TrueSight, there were 8.2 touches via email or phone call after the ticket had come in, including exchanging log files with the customer through to our resolving it. And on the customer side, they had somebody having to look at the equipment to make sure it was actually working. From those 8.2 physical connections with them, we're down to two with TrueSight.
And here's the big difference. Instead of these things sending all of that information out in those emails, it's captured in the Knowledge Module, the policy and the agent, on the customer side of the firewall. What TrueSight does is that when it installs it takes a week to come up with what's called a dynamic baseline. It says, "For this piece of equipment in your environment, these are the key performance indicators that we're going to watch for." We can see events live when they happen. There are predictive and proactive warnings of failures or potential problems. But all that we ever get, the only thing that's communicated to us, is when there's a failure. So we can see all the chatter and we can look at that by customer, but we don't really need to. And if it's a predictive event, it will send us a notice saying, "We think this part's going to fail in two weeks," and we can help that customer.
But ultimately, what we get is a service ticket: "Failed part at this location. Here's the part number, the serial number, and the recommended remediation." That comes into our support center.
Eventually, when we have it all set up the way we envision it, the info will come into the support center and a ticket will be created and it will automatically connect to the tech and the tech will reach out to the customer. We haven't turned that on yet.
Right now, it comes in and we read it. We call the customer and say, "You have a failure." In most cases, the customer didn't know they had it yet, because it's that fast. We call them up and say, "You have a problem. We have the part, and when would you like Larry to come on site?" Because it's storage, they have to schedule downtime. Then we go out on site, we fix it, and we're done. So it's two physical touches now: We call them and they say, "Yes, it's completed."
So 2,000,000 emails have gone away, pretty much, and it all gets done at the customer site. What we see now, instead, is a couple of hundred or 1,000 service events, versus millions of emails. And we have the right part, the right chassis, the right location.
In our industry, there is about a 75 to 78 percent first-time fix rate, meaning repair personnel do not have to go back to a given site within a week. As a company, we were at about an 86 percent first-time fix rate. With TrueSight, we've never gone below 98 percent.
It's all done with software. I read all of the service emails from our customers. Customers are used to finding a log file and talking to our expert - and if a customer has five different pieces of equipment, there are five different experts involved. Now, they send a note in and they'll say, "This is resolved. I just want to make sure this process is working the way it's supposed to. I didn't call anybody. You called me to tell me I had a problem that I wasn't quite aware of. Now, I have a part, it's fixed, and we're good. Is that how it's supposed to work?" It's funny, because they were used to eight different interactions with us, as opposed to two. It's really cool.
It's taking an extremely manual process and, with the AI piece, literally helping us make better decisions. It's what AI is all about. It's really amazing. I'm excited about it because now, instead of our support center people trying to find the right part, they're calling the customer and saying, "By the way, you have a problem. We have a solution for you, and we notice in the same cluster you may have a failure in a week. Would you like us to look at that while we're there?" It's predictive, proactive maintenance. That is what it enables us to do, versus reactive.
Today, when we are proactive, it's for a fan or it's heat or it's a battery. We get notice they are about to fail and they fail pretty quickly thereafter. But when we start getting to operating systems, there are days, as you know, when you have gone on to your computer and it's been slow. On those days of the month, you can probably look in your network and find that there was a big push to get something done. With TrueSight, we'll be able to start proactively predicting these events before they happen, and rerouting the customer so they don't notice a slowdown. Our tagline is all about uptime. TrueSight helps us deliver that. It helps us deliver upfront.
What is most valuable?
The fact that they have a very integrated relationship with Sentry Software, the Knowledge Module, is valuable. We have one Knowledge Module that we're using today, which is the Sentry KM. We're bringing on the operating system Knowledge Module. The richest feature for us is the number of Knowledge Modules that we can load into the product to add breadth of service to the customer. It enables us to move up the operational stack from hardware, to operating system, to application, and to cloud. It's one presentation layer, one path with these Knowledge Modules, which we can add to it to get greater breadth.
That enables Park Place to provide one pane of glass over all those layers - hardware, OS, app, and cloud - which gives us a really good opportunity with the AIOps piece to get root cause analysis. And that's what our customers want: one pane of glass and a detailed root cause. If you've ever been in a data center when something goes wrong, the first thing they ask is, "What happened? What went wrong? Why did it break?"
It's the Knowledge Module which is the biggest feature that benefits us.
What needs improvement?
Reporting would be an area for improvement in TrueSight. In its purest form, TrueSight is an enterprise product, meaning one company would run it in its internal data centers and internal IT organization. But our company is more of a managed-service provider. We have almost 800 customers today on TrueSight and just under 10,000 assets. We need to be able to give a customer some information. If the customer's product fails, they'll ask us, "Did it have a problem beforehand?" We have all those events and we know all the problems it had beforehand. We have to be able to give them access to that kind of reporting. That's an enhancement that we need.
For how long have I used the solution?
We white label TrueSight, but it's TrueSight at its core and we've had it installed here for just under three years. Version 10.7 is our production instance and we're using version 11.3 in Azure. We're moving to a cloud platform and we're doing that with 11.3. I was hired about 15 months ago.
What do I think about the stability of the solution?
The stability of TrueSight, in its natural form, is very good. We had stability issues with it because we were doing things that were outside of that normal boundary. We were bringing in way too much information. We didn't know how to filter it. Once we got the filtering in place, it became very stable.
In our six- to nine-month process of doing the proofs of concept, when we got to that ninth month we were bringing on as many customers as we could and we were getting everything we could possibly get from all of them. It took us about three months to tune that down, with BMC's help. The product was always stable before that. The product itself didn't fail. We just overwhelmed it. If you talk about data lakes, we flooded the lake every day. And it didn't stop. We just kept bringing more stuff in. Once we added the filters, we tuned those valves, it stayed up and has been running really well.
What do I think about the scalability of the solution?
Our focus is to get 16,000 customers in TrueSight. We're walking up that scale every day. Once we figured the filtering out, we started getting the scalability. Prior to that, we were going the wrong way on scalability. But the elasticity seems to be there, the ability to scale.
How are customer service and technical support?
On a scale from one to five, with five being the best, I would give BMC technical support four-and-a-half. It's not a five because the reporting piece is still missing. We need the reporting piece. They can't give us all the help, because that help is just not fully there.
Which solution did I use previously and why did I switch?
We had our own homegrown system - an email box - that the stuff came into. That's all we had. We did not use a competing product.
We went down an RFP path over three years ago. Our company has grown pretty dramatically. Between 2015 and two weeks ago, we made 14 acquisitions. There was no way we could grow the business mechanically with an 8.2-touch model in place. The support centers would be the biggest expense in the company.
It was a two-pronged approach to looking at resolving the opportunity that our growth created. The first approach was the customer, to give them quality of service: not having to get log files, not having to figure out what's going on at their end, and not having to call us.
The second approach, for the Park Place support center, was to give them better tools to provide better service to the customer. We wanted our support center to go from trying to figure out what the right part was, to letting the customer know they were about to have a problem. That's a big difference. Both of these approaches have happened.
If you put that around the world with our growth, we now have a global approach with regional focus and local delivery. Because the systems are reporting the information, we don't worry about time zones or language. All that stuff goes away. The machines speak MIB, and the MIB communicates through TrueSight, and we get the information. We don't have to speak the local language until we go out and fix the problem, because the customer is not calling the support center anymore. We have a global footprint with a regional focus. In APAC, they're looking at problems that could be happening overnight in the US and vice-versa, or in EMEA. The problem is resolved, the customer is communicated with, and the person providing that communication speaks the local language.
The machines are literally running this thing, and all we are is the delivery model. TrueSight crosses all those barriers. It crosses time zones, it crosses language; it has all the pieces we need to know about repair, including the part and the location. It knows everything we need to know about the equipment, all the software, the LPARs, etc. It gives that to us in the support center, we contact the customer, and then we speak the local language and we bring the part locally.
How was the initial setup?
It's very straightforward from a setup perspective. We were able to install it and get it running relatively quickly. That's not the hard part.
The complexity comes in because, instead of it being what I would call an off-the-shelf product, TrueSight is a series of products with an encyclopedia of tools and they all add benefit. But getting those tools to work, that's where the complexity is; knowing exactly which piece to pull and to connect. An example would be putting filters in place. That took us a while.
If you look at an average installation, it takes three to six months to get up and running. We got up way faster than that, but it has taken us about a year to get the engine to run at the capacity it's capable of. It's like gas mileage, where you have to drive it properly to get the right gas mileage. That has taken us some time to do. But once we got there, we have certainly been getting everything that's promised.
Park Place was up and running within a month to two months. Our production product was probably nine months out. That's when we started figuring out the filtering. We brought everything in and opened all of the spigots up, and we had all this volume coming in. With BMC's help - they were very helpful in this capacity - we were able to turn the valves to the proper flow, so we weren't flooding the thing every day.
Our implementation strategy was to put it up in a proof of concept first in a DevOps environment because our goal was to bring it out to customers. Once we got it into production, we started bringing customers on as PoCs. We did about six months' worth of bringing on the customers, making sure we could bring it out and get its sea legs. Then we started deploying customers as fast as we could. And that's when we went from 10.5 to 10.7, and now we're moving to the new platform with 11.3.
What about the implementation team?
We installed it ourselves, but with BMC's help. We did it ourselves with them looking over our shoulder.
To get to 10.7 and 11.3, their services team, the Premier Support, created a "cookbook" for us to do that migration. That was extremely helpful.
And from a consulting perspective, as part of the Premier Support, we were able to get the right consultants in to help us fine-tune that motor. They would come in and look at it and say, "By the way, you can filter this stuff out because you're not actually using it." I liken it to our cell phones when our data plans are out of whack with our use. We pay way more than we need for our minutes and a consultant comes in and says, "You can do this, this, and this, and be more efficient." They've been very helpful there.
What was our ROI?
As an example I looked at recently, we had a customer that was doing 27,000 emails a month. That would mean that if we spent 30 seconds reading each email, it would total 2,700 hours per year just reading emails. And that's not solving the issues. That works out to 1.3 FTEs just reading the emails for that customer. Suppose all-in, in the US, we're paying FTEs $72,000 to $75,000 just to read emails. Our license fees are certainly less than that for that one customer.
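The arithmetic behind that estimate can be checked with a short sketch. The 2,080-hour working year (40 hours a week for 52 weeks) is an assumption introduced here to convert hours into FTEs; the reviewer does not say which figure was used, but it reproduces the quoted 1.3 FTEs.

```python
EMAILS_PER_MONTH = 27_000
SECONDS_PER_EMAIL = 30
HOURS_PER_FTE_YEAR = 2_080  # assumption: 40 hours x 52 weeks


def reading_hours_per_year(emails_per_month, seconds_per_email):
    """Hours spent per year just reading emails, not resolving them."""
    return emails_per_month * 12 * seconds_per_email / 3600


hours = reading_hours_per_year(EMAILS_PER_MONTH, SECONDS_PER_EMAIL)
ftes = hours / HOURS_PER_FTE_YEAR
print(f"{hours:.0f} hours/year, about {ftes:.1f} FTEs")
```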
In terms of ROI, we haven't fully gotten there yet. We've reduced those 2,000,000 emails by 42 percent, so far. We haven't gotten them all done yet. But who do you think is on our list to get moved over to this solution? That customer with 27,000 emails, we're going to move them over as fast as we can.
Our ROI is to get people off the old, manual system with 8.2 touches and down to two touches. Once we start hitting critical mass, the product will certainly pay for itself in a very reasonable period of time.
What's my experience with pricing, setup cost, and licensing?
We pay license fees of between $150 and $200 per asset.
In terms of the product's pricing, we don't pay per item and it's not crazy. It's cost-effective enough for us to offer it for free on storage, and we've got some 4,000 storage assets using the product every day.
We bought a large block of licenses. Interestingly enough, we provide TrueSight for free for our storage customers. We thought it was that important, to give them the licenses for the Knowledge Module and the policy. We do charge for network and we do charge for servers.
There is an enterprise software license fee, and then you pay a percentage for your maintenance, and then Premier Support. For example, if you buy a two-year license for the product, then the maintenance fee is added to that for two years at X percent a year. Then there's a small fee on top of that for Premier Support, which I would highly recommend to a company. Standard support gives you normal support processes, while Premier Support is 24/7. It's at a much higher level of support. For a production environment, I would strongly recommend it. In comparison to the extra cost, the value of Premier Support is very worthwhile.
Which other solutions did I evaluate?
We did an RFP and looked at seven products. Although I wasn't here when the company did that, I know they looked at Nagios.
One of the two key reasons for choosing TrueSight was the AI piece, the artificial intelligence; that's the promise of the future for us. We get some of it today. We have the predictive and proactive parts today, but we're going to grow that as we grow up that stack to go to OS, and application, and cloud, to get more AI value.
And the other one was the knowledge module relationship they have with Sentry Software. We're in storage hardware. That's the number one product out in the market. Sentry is a partner with BMC and that has been the lifeblood of our whole "global, regional, local" approach.
What other advice do I have?
The advice I would give is not to make a mistake and think it's an off-the-shelf product like Office 365. Understand that it's a very robust set of tools and procedures. You really should define what you want to do with it before you bring it in the door. If you had asked us before we brought it in, we had an idea, but we didn't know exactly how we wanted to utilize it and that was because we didn't know the capabilities of it. We thought we could do X, and we found out what we really needed to do was Y. It was that gap that we had to fill, and that took us time. So the better you can define your requirements, the quicker you'll gain the true value with your outcome.
Believe me, we're seeing true value. But if we had had a better definition of what we needed up front... We thought we had all the information in the RFP but we probably didn't. I'm not sure you ever can do that, but do a good job of architecting the scope or the spec of what you're trying to do and then get their input. They can give you that information and that's when you get your true value. When those two things meet, you get the value prop.
Working with BMC has been interesting. It's been very helpful. They're part of our team, which is great. They bring their partners to the table. Their partners don't have an agenda. Everything that we get done is literally for us as the end-user and for our customers. I've not had that often before with software companies.
They invest in customer satisfaction to the point that we've asked them to implement some things that are a little bit beyond the normal scope of TrueSight. We're using it for 800 customers in an instance of TrueSight, where it really should be one TrueSight for one customer. And they've helped us make all that work, arm-in-arm.
With Sentry it has been a team effort. Sometimes we don't know who on the call is not on our team. We're all having the same conversation, and it's not a situation where "BMC said," or "Sentry said," or "we said." It's one common unit. We had a call yesterday about architecture and making that whole piece work. I said to their architect, "Gee, you know I really like that document you put together." He said, "John, you can use any piece of that you possibly want. Go ahead and take it and do anything you need to do with it, make it work your way." That doesn't happen very often, where someone is building their own thing and they come back to you and say, "Yeah, you can use it any way you want. Just make sure it works for you."
We have 11 people who are installing agents and policies at our customers' sites. Their job is the implementation with our customers.
In terms of people actually running TrueSight within the company and our IT infrastructure, we have parts of a couple of people. It's a part of their job. It's almost like shift work. We have a part of a full-time person on a daily basis engaged with TrueSight care and feeding. Running the product requires less than two people, all-in.
We will be hiring a new person to be a TrueSight architect, because we're bringing on more of those KMs and we need somebody who can help us do the rules management. They're not going to be here running the product, they're going to be adding new features.
Overall, I would give the product a very solid nine. If I had the reporting piece, I would give it a ten. It has provided more value than we expected and it does what it says it's going to do. You can't ask for more from a product than that.
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
IT Operations Monitoring Specialist at a tech services company with 51-200 employees
Robust, and responsive technical support, but setup could be simplified
Pros and Cons
- "BMC TrueSight Operations Management is easily scalable."
- "The graphs are extremely limited. We don't have a lot of dashboard options. To make reports and dashboards more useful, we usually need to integrate some dashboard solutions."
What is our primary use case?
BMC TrueSight Operations Management is used to monitor the infrastructure, applications, and databases.
What is most valuable?
It's very good. I like it. It's a great product, but there are some things that could be improved, such as the dashboards.
What needs improvement?
The dashboards could be better. The graphs are extremely limited. We don't have a lot of dashboard options. To make reports and dashboards more useful, we usually need to integrate some dashboard solutions.
The initial setup could be simplified.
For how long have I used the solution?
I have been working with BMC TrueSight Operations Management for approximately 12 years.
We are working with version 11.304.
What do I think about the stability of the solution?
After you configure everything, it's stable.
What do I think about the scalability of the solution?
BMC TrueSight Operations Management is easily scalable.
In our company, we have four people who use this solution.
How are customer service and support?
Technical support used to be better a few years ago. The level is now slightly lower than expected. At present it's not great; occasionally it is good, but that depends on the consultant who answers the phone.
They usually respond quickly, but the answer is not always the solution we require and it's not always effective. Technical training would help.
Which solution did I use previously and why did I switch?
I used Entuity. I also have basic knowledge of PRTG and Nagios. From those three, I have more working knowledge of Entuity.
I started working with Entuity, nine or ten years ago. We stopped using it two years ago. We are not familiar with the current versions.
I am currently working with Helix Operations Management and the ServiceNow ITOM.
How was the initial setup?
In general, it is not easy to install. It's complex. There are too many components, and you must set them up and work with the infrastructure team on permissions and file reports. Because there are so many components, this becomes more complicated and difficult, particularly in terms of infrastructure management. It is not easy to install.
What about the implementation team?
We have a monitoring team. We work alongside them to manage and support the solution.
What's my experience with pricing, setup cost, and licensing?
I'm not familiar with it. They have changed the licensing fees.
What other advice do I have?
You will face some difficulties unless you have someone with advanced knowledge of the solution.
I would rate BMC TrueSight Operations Management a seven out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer.
