PubSub+ Platform is designed for real-time message publishing and outstanding interoperability. With features like intuitive administration and topic filtering, it offers both stability and high performance for scalable deployments across diverse scenarios.

| Product | Mindshare (%) |
|---|---|
| PubSub+ Platform | 2.6% |
| SCOM | 9.9% |
| OpenText AI Operations Management | 7.9% |
| Other | 79.6% |

| Type | Title | Date |
|---|---|---|
| Category | Event Monitoring | Jun 23, 2026 |
| Product | Reviews, tips, and advice from real users | Jun 23, 2026 |
| Comparison | PubSub+ Platform vs ServiceNow IT Operations Management | Jun 23, 2026 |
| Comparison | PubSub+ Platform vs SCOM | Jun 23, 2026 |
| Comparison | PubSub+ Platform vs BMC Helix Operations Management with AIOps | Jun 23, 2026 |

| Title | Rating | Mindshare | Recommending | Interviews |
|---|---|---|---|---|
| Databricks | 4.1 | N/A | 96% | 94 |
| MuleSoft Anypoint Platform | 4.0 | N/A | 92% | 62 |

| Company Size | Count |
|---|---|
| Small Business | 4 |
| Midsize Enterprise | 1 |
| Large Enterprise | 12 |

| Company Size | Count |
|---|---|
| Small Business | 96 |
| Midsize Enterprise | 54 |
| Large Enterprise | 178 |

PubSub+ Platform enhances data integration with its event mesh and seamless protocol compatibility, providing a comprehensive solution for organizations tracking shipments, generating reports, and managing transactions. Its granular topic filtering and WAN optimization ensure high utility in event-driven applications and cloud deployments. Users highlight the platform's intuitive administration and ease of management, though some seek improved integration with third-party tools and enhanced observability. Concerns include scalability for large payloads and training resource availability. Despite its interface complexity, PubSub+ remains valuable for trading and market data distribution.
PubSub+ Platform is widely implemented for asynchronous messaging in industries like finance for trading and market data, logistics for shipment tracking, and tech operations management. It enables companies to modernize applications while ensuring data accuracy and efficiency across global systems.
PubSub+ Platform was previously known as PubSub+ Event Broker and PubSub+ Event Portal.
FxPro, TP ICAP, Barclays, Airtel, American Express, Cobalt, Legal & General, LSE Group, Akuna Capital, Azure Information Technology, Brand.net, Canadian Securities Exchange, Core Transport Technologies, Crédit Agricole, Fluent Trade Technologies, Harris Corporation, Korea Exchange, Live E!, Mercuria Energy, Myspace, NYSE Technologies, Pico, RBC Capital Markets, Standard Chartered Bank, Unibet
| Author info | Rating | Review Summary |
|---|---|---|
| Integration Architect at a tech services company with 11-50 employees | 4.5 | I find PubSub+ Platform excellent for robust, scalable microservice messaging, offering unique graphical design and strong analytics. Its stability and support are good, though advanced authentication and micro-integration features could improve. Cost is a minor concern. |
| Developer at a computer software company with 51-200 employees | 4.0 | I found PubSub+ Platform a highly scalable, reliable messaging queue for IoT data within GCP, with straightforward setup. Event replay aided debugging. While new users could use more in-line info, its overall performance is very good. |
| Freelancer at a financial services firm with 1,001-5,000 employees | 4.5 | I find PubSub+ excellent for event-driven communication, praising its decoupling, folder structure, performance, and stability. While I've heard it's expensive compared to Kafka, I recommend it highly, rating it 9/10. |
| Principal Engineer at Citi | 4.0 | I use PubSub+ for long-processing messages and audit trails, finding it enhances asynchronous processing and prevents message loss. Its monitoring dashboard is valuable, but I'd like improved message pooling and a more comprehensive dashboard for metrics. |
| Managing Director at a financial services firm with 5,001-10,000 employees | 4.5 | Solace has been critical for our global capital markets for over a decade. I value its exceptional stability, scalability (120 billion messages daily), and ease of application onboarding. It delivered significant cost savings, and I consider their support the best among all our vendors. |
| Senior Associate Platform Level 2 at Publicis Sapient | 4.5 | I found Solace PubSub+ excellent for event-driven microservices, offering seamless protocol exchange without coding and simplifying cloud operations compared to Kafka. I recommend increasing the limited 5MB payload size to accommodate larger events. |
| Software Engineering Manager at Wells Fargo | 4.0 | I use PubSub+ Event Broker in banking, valuing its speed and stability for high-throughput applications. Integrations need improvement, and it's expensive, but I recommend it for business-critical financial transactions due to its strong performance. |
| Cloud Architect at a transportation company with 10,001+ employees | 3.5 | I found PubSub+ effective for event-driven applications, offering good features and a dashboard. However, integrating it with existing applications is complex and costly, especially compared to alternatives like Kafka, which are easier and cheaper to maintain. |
| Enterprise Automation Architect at CIBC | 5.0 | I am implementing Solace for event aggregation from diverse monitoring tools and bridging on-prem messaging to Azure. Its protocol-agnostic transport and stability are key, despite internal challenges explaining event mesh benefits and slow adoption. |
| Technology Lead at a pharma/biotech company with 10,001+ employees | 4.0 | I find Solace’s event mesh and event lifecycle management unique differentiators, boosting productivity by 50% and simplifying digital transformation. It is stable and scalable, but needs better SSO integration and more pre-built connectors for industry applications. |

PubSub+ Platform is primarily used for guaranteed delivery of messages across systems, for microservice-based development, and for high-speed data consumption.
The unique functions I appreciate about PubSub+ Platform are that it allows me to design my solution in a graphical manner, which is not available in many other products, and the design can also be pushed to the actual infrastructure layer, making it quite advantageous.
Mesh technology is useful in scenarios where different geographies have to be connected, although such situations are not commonly found. It is beneficial but not a super-used feature of PubSub+ Platform.
The event replay function is quite mature in PubSub+ Platform, allowing me to replay messages that are days in the past, which is a good feature.
The main benefits PubSub+ Platform provides for the end-user include building a robust and scalable system with very low network latency, which improves the customer experience, whether using mobile phones or applications. This type of messaging framework is extremely important, and Solace is a very good product in that space. Nowadays, most applications are built using microservices technology, with small microservices interchanging messages via PubSub+ Platform. Without it, realizing a scalable system would not be possible; for example, one cannot have Netflix or similar services, which require quick data transit, a good user experience, and assurance that data is not lost in transit.
The analytics part of PubSub+ Platform is quite useful as it can connect with many analytical software tools, mainly for analysis of system logs, such as Splunk, DataDog, or Prometheus. It has the flexibility to connect with any of these and supports OpenTelemetry, which is not available in many other products, making traceability very easy. I can see how a message travels from a source system to the target system, end-to-end, along with what happens to that message along the path, making the analytics quite good.
Potential areas for improvement in PubSub+ Platform are its authentication mechanisms, which could be slightly better. While simple authentication using basic methods is easy, moving to more robust mechanisms like certificates or OAuth requires a bit deeper technical expertise.
I feel there is a lack of functionality in PubSub+ Platform; Solace has introduced microservices and micro-integrations recently with some capabilities, but they can improve on these. They also have some AI agent capabilities that are quite unknown, so if they can disclose this information properly, perhaps through blogs or detailed descriptions, it could be better.
I have been working with PubSub+ Platform for about five years.
I would rate the stability of PubSub+ Platform as a nine.
In terms of scalability, I would rate PubSub+ Platform as quite scalable, around 9.99.
I rate Solace's technical support as good; they can come on calls and assist quite well, so I could give them an eight.
The initial setup for PubSub+ Platform is not that difficult; it depends on the environment where I want to use it. If I use it on the cloud, it is quite straightforward, but in a Kubernetes environment, it needs a little bit of support.
Pricing-wise for PubSub+ Platform, I find it a little expensive, so I would rate it at six.
The main competitors in the market for PubSub+ Platform are IBM MQ and Kafka, with Kafka being the primary competitor.
If I compare the three solutions, I believe PubSub+ Platform provides a clear advantage for me because I can design my solution easily, which helps in building on an interface level.
Regarding security and compliance features, I have not explored that area much; it is not a compliance-oriented product, so I do not see any relevance of compliance concerning GDPR or similar regulations. I would rate this review an 8.5 overall.
My main use case for PubSub+ Platform was as a message queue pipeline for managing the data stream events in our IoT platform, while I was working at an IoT-based company.
The best features of PubSub+ Platform include being highly scalable, allowing us to handle billions and billions of events. Configuring and managing PubSub+ Platform is straightforward and simple while being highly reliable.
The use of PubSub+ Platform has positively impacted my organization because we are using GCP for our systems, so we did not have to go outside of GCP's ecosystem.
Additional in-line information about certain things on PubSub+ Platform could be more beneficial for new users who are just starting to use this technology.
The analytics tools integrated within PubSub+ Platform are good. It is already integrated with cloud logging and its features, which makes the analytics pretty good as they stand.
I think the stability of PubSub+ Platform is pretty good, and I would rate it at eight.
For my use case, I would rate the scalability of PubSub+ Platform 10 out of 10, but since I have not worked on a huge scale, I cannot comment on that.
I have looked into PubSub+ Platform's support forums to read and understand a few things that I did not understand. If you are talking about Google's team support, I have less idea about it.
In my opinion, the setup of PubSub+ Platform is straightforward.
I do not know about the pricing of PubSub+ Platform because I did not manage the instance.
I have utilized PubSub+ Platform's event replay feature when we were debugging issues, and it definitely helps me understand how or where things broke, quickly enabling us to upgrade the codebase and add additional fixes.
I think the documentation for PubSub+ Platform is pretty good and neat, so I would rate it an eight. I would rate this review an overall eight.
My typical use case for the PubSub+ Platform is as an event-driven solution for communication between two components.
The best features of the tool include multiple impressive capabilities, such as multi-level folder structure creation. Users can create a queue or topic in a hierarchical structure, much like folders under C: such as Program Files and Software. The performance and scalability of the PubSub+ Platform are as impressive as Kafka's.
The solution's ability to decouple message producers and consumers is exactly what we use these messaging solutions for, allowing us to have high cohesion and low coupling, making it an excellent solution for that purpose.
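To make the decoupling point concrete, here is a minimal sketch in plain Python. It assumes a hypothetical in-process `InMemoryBroker` class invented for illustration; it is not the Solace API, and the topic name `orders/created` is made up. The only point it shows is that the publisher names a topic and never references its consumers.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class InMemoryBroker:
    """Hypothetical in-process stand-in for a message broker (not the Solace API)."""

    def __init__(self) -> None:
        # Maps a topic string to the handlers subscribed to it.
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        """Register a consumer callback for an exact topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> int:
        """Deliver the payload to every subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


broker = InMemoryBroker()
audit_log: List[str] = []
billing_log: List[str] = []

# Two independent consumers attach to the same topic.
broker.subscribe("orders/created", audit_log.append)
broker.subscribe("orders/created", billing_log.append)

# The producer only knows the topic name, not who is listening.
delivered = broker.publish("orders/created", "order-1001")
```

Adding a third consumer would need one more `subscribe` call and no change to the publishing code, which is the low-coupling property the reviewer describes.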
Regarding improving the PubSub+ Platform, I'm not sure about the pricing aspect, but I heard that it is quite expensive compared to Kafka. That's the only concern I can mention; otherwise, it was as impressive as Kafka, and even better than Kafka based on my experience working on the Solace and Kafka white paper.
I have more than 4 years of experience working with the PubSub+ Platform solution.
We don't know which deployment model we are using.
The PubSub+ Platform is a stable solution.
The scalability of the platform can be rated 8 out of 10.
We are not administrators, so we never reached out to support for the PubSub+ Platform.
The initial setup of the solution is an administrator setup, and we just use it. For us, using it is straightforward.
I have experience working with Kafka, PubSub+ Platform, and IBM MQ, all three of them.
We are customers, meaning my company uses Solace. We use it and customize it based on our needs.
Based on my experience, I would recommend other people use PubSub+ Platform.
Interested parties can contact me if they have any questions or comments about my review.
I rate the PubSub+ Platform a 9 out of 10.
We use PubSub+ primarily for long processing messages. For instance, we generate customer forms, which take a longer time. We place these requests in queues, process them, and return them to the queues. This ensures that messages are not lost due to the time-consuming nature of such tasks. We also use PubSub+ for audit trails where we intercept requests, apply them, and store them in MongoDB, placing them in the queue before final processing.
PubSub+ has enhanced our asynchronous mode processing, ensuring that messages are not lost and requests are kept until fully processed. This means that messages do not get lost in the network or time out, thus improving reliability and efficiency.
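The pattern described here (park a slow request on a queue, process it, and only then consider it done) can be sketched with Python's standard library. This is an illustration under assumptions: `queue.Queue` is an in-memory stand-in, so unlike a guaranteed-messaging broker it does not survive a process restart, and `generate_form` and the customer IDs are hypothetical.

```python
import queue
import threading
import time
from typing import Dict

requests: queue.Queue = queue.Queue()
results: Dict[str, str] = {}
_STOP = object()  # sentinel telling the worker to exit


def generate_form(customer_id: str) -> str:
    """Hypothetical slow task standing in for customer form generation."""
    time.sleep(0.01)
    return f"form-for-{customer_id}"


def worker() -> None:
    while True:
        item = requests.get()
        try:
            if item is _STOP:
                return
            results[item] = generate_form(item)
        finally:
            # Mark the item finished only after processing has completed.
            requests.task_done()


thread = threading.Thread(target=worker)
thread.start()

for customer_id in ["c-1", "c-2", "c-3"]:
    requests.put(customer_id)  # the producer returns immediately
requests.put(_STOP)

requests.join()  # blocks until every queued item has been marked done
thread.join()
```

The producer is never blocked by the slow work, and `requests.join()` returns only after each request has been fully processed, which mirrors the "kept until fully processed" behavior the reviewer values.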
Some valuable features include reconnecting topics, placing queues, and direct connections to MongoDB. The platform provides a dashboard to monitor the status of messages, such as how many have been processed or delivered, which is helpful for tracking performance.
The solution could be improved by enhancing the message pooling size for persistent messages to handle both small and large messages effectively. Additionally, providing a comprehensive dashboard to display metrics such as message throughput, connection rates, latency, and automated alerts would be highly beneficial.
We have been working with Solace for about three to four years.
Stability is generally good, but occasionally we encounter delivery failures. These issues are usually traced back to specific nodes not working correctly.
While I haven't directly explored horizontal or vertical scaling techniques, PubSub+ can process transactions efficiently and scale to handle multiple topics and queues. It supports enterprise-level features and multi-cloud messaging. Our use of internal cloud models assists in scalability, enabling multiple consumers and publishers to use the system simultaneously.
I do not directly interact with customer support, but from what I hear from the admin team, they are responsive and resolve issues quickly.
We have also tried RabbitMQ. However, the user community for Kafka is larger, offering more discussions and solutions in public forums.
I recommend considering Solace due to its enterprise-level features, high throughput, cloud integration, and security. It is a top competitor among similar products and offers significant benefits in reliability and scalability.
We do a lot of pricing data through here, market data from the street that we feed onto the event bus and distribute out using permissioning and controls. Some of that external data has to have controls on top of it so we can give access to it. We also have internal pricing information that we generate ourselves and distribute out. So we have both server-based clients connecting and end-user clients from PCs. We have about 25,000 to 30,000 connections to the different appliances globally, from either servers or end-users, including desktop applications or a back-end trading service. These two use cases are direct messaging; fire-and-forget types of scenarios.
We also have what we call post-trade information, which is the guaranteed messaging piece for us. Once we book a trade, for example, that data, obviously, cannot be lost. It's a regulatory obligation to record that information, send it back out to the street, report it to regulators, etc. Those messages are all guaranteed.
We also have app-to-app messaging where, within an application team, they want to be able to send messages from the application servers, sharing data within their application stack.
Those are the four big use cases that make up a large majority of the data.
But we have about 400 application teams using it. There are varied use cases and, from an API perspective, we're using Java, .NET, C, and we're using WebSockets and their JavaScript. We have quite a variety of connections to the different appliances, using it for slightly different use cases.
It's all on-prem across physical appliances. We have some that live in our DMZ, so external clients can connect to those. But the majority, 95 percent of the stuff, is on-prem and for internal clients. It's deployed across Sydney, Singapore, Hong Kong, Tokyo, London, New York, and Toronto, all connected together. We are currently working on the cloud setup to connect on-prem with cloud-based virtual message routers, all connected together.
With the old platforms we were coming from, if we wanted to make changes, some of those changes were intrusive to make. For example, to add a new application into the environment, we would have to make a change that might cause some disruptions to the environment. We only have very limited downtime for our environment on a Saturday after midnight and before midnight again on Sunday. That is our only change-window for the week, if we have to do something intrusive. That limited us to when we could truly make changes. On a lot of other vendors' platforms, to add things, you've got to restart components and cause disruption.
The benefit of Solace is that we can add an application in the middle of the day, with no disruption to anyone. It's purely based on our access-control list and permissioning. We can add an application in with zero disruption. We can onboard applications during the middle of our business day. It's still under change control, but there's zero impact by doing it. For us, that is super-powerful. Whether we're adding users or adding applications, we can do it, without causing any disruption. For a lot of other products, that's not the case. That's been a huge win for us.
In terms of application design, I've seen applications go live in less than a week, from coding the first line of code to putting something into production. It depends on how complex the application is. We have a central team where we support the wrappers on top of the vendor's API and we have some example code bases where we show a simple application built using our wrapper on top of Solace's API. A developer who joins our company knowing nothing about Solace, can walk through our documentation, have a look at our wrappers, take some of our example code, and get up and running and off to the races pretty quickly. Getting up to speed is definitely not difficult.
We might get a new user in our bank who is familiar with other messaging systems and who has preconceived ideas on how they want to do things. They might ask us, "How do I get access to this messaging system that I used to use with my old organization? That's what I'm familiar with." Sometimes we have to do sessions with those people and say, "Okay, we're familiar with the systems you're talking about. We supported them in the past. Talk us through what your use case is, what it is you are trying to achieve." Once they explain their use cases, we can say, "Okay, great. We actually have this and here's some example code and this is how to do it." Within a day, that person has gone from knowing nothing about it to saying, "Okay, you're absolutely meeting my application needs and now I'm educated on how this works." They're off and running very quickly.
We take all kinds of data onto the environment to share. Because the event bus is the place that every application always needs to start, they're no longer building an application now within the capital markets organization without putting their data onto our bus in some way. It's definitely a way of lowering the barrier to sharing data and getting things up and running quickly. Similarly, they can take data from other teams, once they find out what's available. Someone might say, "I need all the FX prices in the bank. Oh, I can just subscribe them from here. I don't even need to talk to the FX team." Teams can get up and running very quickly without having to spend a lot of time working with other groups to make that happen.
By having all of that data together in one place, Event Broker has definitely reduced the amount of time it takes to get a new application onboarded. We came from a place with six or seven different systems, where we might have bridged some of those together in some way, but it wasn't one common environment. Now, we've got application A that comes online and starts putting that data out for application B to get up to speed and to start looking at that data. That is very quick and easy for us to do. All the messaging that we do is self-describing. They can look at the payload of a message and understand it without even needing to talk to the upstream application. We can have applications starting to look at data where they didn't even have to speak to the upstream application. We've gone from 8 x 1 Gig, 10 years ago, to 8 x 10 Gigs today, and the reason for that is because we keep putting more and more data and applications on here. That continues to grow exponentially. If it wasn't easy to do, the data wouldn't be going up and we wouldn't have all these applications now on here. It's hard for me to say it has definitely increased the productivity, because I don't own the application development piece but, anecdotally, I would say it has.
Another area of benefit is that we're in the process of containerizing all of our applications at the moment, whether they'll be run on-prem or in the public cloud. The underlying piece is that these containers, wherever they run, are going to need to share data between the different applications and then back to the users. The Solace event mesh or event brokers are the underlying lifeblood among all of these containers. They need to have some way of communicating with each other and we see Solace as being that connection among all of them. All the different cloud environments have their own messaging and we don't want to build applications that are specific to any one cloud; we want to be cloud-agnostic. To do that, we need to have a messaging system that is equally agnostic. Given that we already have a huge investment on-premise for all of our Solace stuff, we see that the future of containerizing our applications goes hand-in-hand with our messaging strategy around Solace, so we can be totally cloud-agnostic.
Technology, in the last 10 years, has probably become a lot more stable generally, but I can say that with the amount of data we put through these appliances, and route globally every day, if our environment was down capital markets wouldn't be operating for the bank. That's how critical it is. We can't afford to have any issues. At the same time, literally no application can run in our front office without this. If I look back 10 years ago, we might have had six or seven different distributed systems, all with their own problems. Now that we've consolidated all that, there's a huge efficiency by sharing all our data between the different groups. It means we can get up to speed very quickly, but also, what we're enabling from a business perspective, by sharing 120 billion messages a day, is hugely valuable to our front office.
I've been running messaging systems for most of my career, for over 17 years. The most valuable feature is the ability of the appliances to cope in a way that I haven't seen other vendors do. You always get into types of message-loss states that can't be explained with some other products that are out there. You raise tickets with the vendors and they'll give you an explanation. But in the 10 years that we've been in production with Solace, we've never had something that cannot be explained. I've got tickets open with the likes of IBM that have never been resolved, for years. The Solace product's stability is absolutely essential.
There is also the ability to have so many things layered in, where we're doing guaranteed messaging and direct messaging layered into the same appliance.
There is also the interoperability. We've built a lot of products into it and it's been quite easy to feed market data onto the systems and put entitlements and controls around that. That was a big win for us when we were consolidating our platforms down. Trying to have one event bus, one messaging bus, for the whole globe, and consolidate everything over time, has been key for us. We've been able to do that through one API, even if it's across the different languages. We support a wrapper on top of the vendor's API and we enforce certain specifications for connecting to our messaging environment. That way, we've been able to have that common way of sending and sharing data across all the groups. That has been very important for us.
In terms of ease of management, from a configuration perspective you can have all your appliances within one central console. You can see your whole estate from there. And you can configure the appliances through API calls so you can be centrally polling and managing and monitoring them, and configure them as you need to. There are certain things where that's a little more tricky to do, but at a general level we have abstracted things like user permissioning into other systems. So we just have a front-end where we change the permissioning and push it to the appliance in whatever region and it updates the permissioning. From a central management and configuration point of view, it's been extremely easy to interact, operate, and support.
When it comes to granularity, you can literally do anything regarding how the filtering works. It has a caching product that sits on top of that, so depending on the region that you're trying to filter, caching level can make it a bit more difficult than the real-time streaming. But from a real-time stream, you can pretty much filter at any level or component and it's extremely flexible in that regard.
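As an illustration of filtering "at any level or component" of a hierarchical topic, the sketch below implements a simplified wildcard matcher. The rules are assumptions modeled loosely on slash-delimited topic wildcards: a trailing `*` in a level is a prefix match within that level, and `>` as the final level matches one or more remaining levels. The function name `topic_matches` and the `fx/...` topics are hypothetical, and the product documentation should be consulted for the exact semantics.

```python
def topic_matches(subscription: str, topic: str) -> bool:
    """Return True if a topic matches a subscription under simplified wildcard rules."""
    sub_levels = subscription.split("/")
    topic_levels = topic.split("/")

    for i, sub in enumerate(sub_levels):
        # '>' in the last position matches one or more remaining topic levels.
        if sub == ">" and i == len(sub_levels) - 1:
            return len(topic_levels) > i
        # The topic ran out of levels before the subscription did.
        if i >= len(topic_levels):
            return False
        if sub.endswith("*"):
            # Prefix match within a single level; a bare '*' matches any level value.
            if not topic_levels[i].startswith(sub[:-1]):
                return False
        elif sub != topic_levels[i]:
            return False

    # Without a trailing '>', both must have the same number of levels.
    return len(sub_levels) == len(topic_levels)


# Hypothetical topics for a stream of FX prices.
all_spot = topic_matches("fx/*/spot", "fx/EURUSD/spot")
euro_only = topic_matches("fx/EUR*/spot", "fx/GBPUSD/spot")
everything_fx = topic_matches("fx/>", "fx/EURUSD/forward/1M")
```

A subscriber could use a narrow pattern such as `fx/EUR*/spot` to receive only the prices it needs, while a monitoring tool could subscribe to `fx/>` to see the whole subtree.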
Make it open source :-) with limited features, so that licensing is not a barrier to using the technology in places that will only adopt open source technology, even though they then pay for support (and paid features) on whatever open source technology they use. In my view, it's a misnomer how many large companies claim to use only open source but then use paid versions of products with additional features included. Solace should give further thought to how they can address this.
We've been using Solace products for over a decade.
The uptime on the appliances is huge. We just don't have problems with these appliances in production. For example, we have just gone through the whole COVID-19 situation and then the Russia/Ukraine war, and the markets went crazy during that. Our previous maximum of data through the appliances in any one day was about 67 billion messages. During the COVID-19 period of February and March 2020, we hit 95 billion messages a day, and then during the Russia/Ukraine war we hit 120 billion messages a day. That is an increase of roughly 80 percent on the data rates in less than 2 years, and the environment coped just fine. We didn't have any problems. There was zero business disruption. I don't know of any other system where, if I threw an extra 50 billion messages at it, without adding anything, without having to change anything, it would just cope. If it wasn't able to cope with that, the amount of money that might have been lost to the organization would have been exponential. It's definitely paying for itself.
Going back 10 years ago — I want to be real clear, not recently — there were some issues with disks that were in the devices. It was just a faulty batch of disks from their supplier. We had to change the disks. But everything is resilient. So when we had these failures — they were more common than you would expect — we might have a HA failover but not an outage, per se. But that was a very long time ago.
The only other thing that causes issues, and I use that term loosely, is that these are the biggest things on our network within the bank. An 80-Gig appliance is the biggest thing that talks on the network, and it's sending an awful lot of traffic. What you tend to then get into are problems with your own network not being able to cope. You may not have built your network to cope with the volume of traffic you want to try putting over it. As a company we have definitely experienced that over the last few years. It's not a Solace issue, but more a pure core-networking issue. That's a common issue that I know Solace's clients deal with. I meet other Solace clients through various events and they're all having challenges with their network team actually providing a good network to be able to cope. You've got a very strong messaging product that sits on top of the network. It's the biggest thing on the network. Is your network then able to cope with it? So we've had Solace's engineers on calls with our network team, walking them through. That's probably the biggest pain point we have, but it's not a Solace fault.
The scalability of these over time has been very good. When we started on them 10 years ago, we were 8 x 1 Gig appliances, so we had 8 Gig of capacity. We're now doing 8 x 10 Gigs. In 10 years we've grown our footprint by 10 times in terms of volume. And the number of servers, the appliances in our data centers, hasn't really increased. They've obviously continued to grow the capacity of the appliances over that 10 years, without us needing to buy another 20 or 30 appliances to continue to build out. They have the ability to scale.
In terms of users of the solution, there are about 9,000 people in capital markets, of which I'd say about 6,000 or 7,000 of them are using it across the different geographies. Each of those users might be running multiple applications and making multiple connections to the appliances for different applications. A user might have four different applications on their desktop, and they would be making four connections. That works out to about 20,000 to 30,000 actual connections to the appliances. And we have about 5,000 servers in our data centers. A good 80 percent of those are making connections to Solace.
The amount of messaging that we put through it grows every year. We're constantly looking at the volume of data that goes through there and deciding if we need to stripe out the number of appliances to support that. Or, if Solace produces a bigger appliance, do we need to be buying it from a pure networking or volume-of-traffic point of view?
We are in the final steps of cloud implementation. It's going to be a mixture of some of their messaging-as-a-service piece and some of us running our own Docker engines of the software version. There's going to be a bit of a mix as we bridge data between the public cloud, as we stand that up, and our existing on-prem appliances. We don't see the on-prem appliances going away anytime soon. There's no key to getting rid of those. We're putting so much traffic through them, it's massive. But, as some of our workload moves to the cloud, so will some of that traffic and we will need to be able to support that.
But every year the messaging rates only ever go up, as does the number of applications that come on. Every week it's continually growing. It's like the blood that pumps around the body, to be honest.
Solace is truly the best company that we have to deal with when it comes to tech support. In the role that I have I deal with about 100 different vendors, everything from market data exchanges to software vendors, through the likes of IBM and Microsoft, etc. Ten years ago, when we first started dealing with them, Solace was obviously a much smaller company. They've grown. They were only some 50 or 60 people at the time and I think there are a couple of hundred now. All of the original support team are still there, they've added more over time, and they are excellent. They know everything about their APIs and their systems.
If I reach out to IBM, for example, I'm going to get passed to six help desks before anyone I reach even knows what product I'm talking about. I support Cloudera for our company, as well. Cloudera has sold its support to IBM and when I raise a ticket with IBM, I wait a week to get a response. I have had some pretty shocking support experiences.
We always felt that Solace's support wasn't going to survive as they grew as a company. It was so good. That was one issue I kept raising because it was so good I couldn't see how it would scale. Surely it couldn't. But I can tell you, 10 years later, Solace is still the only company where I have zero outstanding issues, or unknown items, or support tickets that they haven't resolved. If you have a problem, they jump on a WebEx with you and, within minutes, we know what it is. Whereas I can't even get IBM to respond to a support ticket.
I deal with a lot of different people in my role and I can genuinely put my hand on my heart and say they're the best support company that we deal with.
We had TIBCO EMS, TIBCO RV, IBM MQ, and Informatica's LBM. The latter used to be a company called 29West and Informatica bought them. We also had Thomson Reuters RMDS platform, which is now called TREP, sending messages around the planet.
We were using Thomson Reuters RMDS (Reuters Market Data System) as a generic messaging bus at the time. Even though its purpose is to carry their data, you can also use it to send your own data around the world. That was a big platform for us at the time and it was coming from two of the underlying systems. You could publish any message onto that bus and send it around. I worked at another bank before the one I'm at now, and we did exactly the same thing there. We were putting a lot of our own internal data onto their messaging bus. It was a good message bus and it still is.
But Thomson Reuters, at the time, now Refinitiv (now LSEG), decided to license it differently. They said that if you put your own data on their platform, they wanted to be paid by every message you sent. We thought, "Okay, well that's crazy. If we buy something from you and pay you a million dollars for it, and then send a hundred messages or a million messages with it, that's nothing to do with you and we're not going to pay you for it." They tried across the entire street to change their pricing model and they really shot themselves in the foot. A lot of people walked away from them over it.
We knew at that point we needed to do something else. We had TIBCO RV, TIBCO EMS; we had so many different systems that we were trying to bridge and connect together, but the RMDS platform along with TIBCO RV dwarfed all the others. Those two together made up 90 percent of all the traffic. That really pushed us to go out.
We spent about two to three months designing out our topic hierarchy when we started this 10 years ago. In the last 10 years we've made very few changes to our topic hierarchy and schema. But we sat with Solace and designed it out. We created a 90-page manual for how we wanted to stand up our event mesh at the time. Bear in mind that our first implementation was not guaranteed messaging, but direct messaging. It was between Sydney, Singapore, Hong Kong, Tokyo, London, New York, and Toronto. We had primary and secondary data centers in every region. I would never characterize it as simple because of the overall scale of what we were putting in place. The actual configuration, and working with Solace to implement that originally, wasn't the difficult piece of it. Actually standing it up, once we had the appliances in our data centers and all on the network, hooking them up and making them work together, wasn't complex either.
What was more complex was the fact that we were meshing up six regions at the same time, and turning on a brand new environment. We didn't stay in one region. We didn't just turn London on. We went big from day one, so it was complex from a geographies perspective, but not complex from a Solace-configuration perspective.
We paid for their heads of engineering to come and sit onsite with us and work through that document. I've actually recommended to Solace that they shouldn't sell their product to anyone without doing that design work upfront because I think it's extremely valuable.
This is true of any system. If you take a good system and don't architect it well, then you can make a good system really bad. Two years down the road you've got people saying, "Okay, I want to go somewhere else," because we've done a bad job of this. Anecdotally, I was talking to the CEO of Confluent a while back and he told me that a large, well-known company has redone its Kafka implementation three times in two years, because they hadn't architected it properly. You can take any technology and make it bad.
Our deployment took about six months, start to finish, from initial discussions and purely white-boarding through to being live in six regions. The first five years after it was implemented, we weren't allowed to build any net-new application that didn't go onto the bus. Every application has a three-year life cycle within the bank. In that five years, a good 80 percent of our applications had been completely rewritten, at which point we only had 20 percent left on our old environment to force over and bridge between old and new environments. After a couple of years of doing that, we didn't have to run any of the old environments anymore and just had one major platform that everyone connects to. That has been the state for the last five or six years.
I speak to other Solace clients occasionally, new ones who are looking at starting up, and they say, "Well, can we be done in a year?" And I say, "Well, your Solace can be done. That's not the issue. It's your life cycle of applications. If anyone tells you you're going to switch all your applications in one year, it's nonsense." Yes, it depends on the scale. If you're a small company, sure. But if you're a company of our size, you've got hundreds of applications and you're not going to rewrite them all overnight. But we did a migration of JMS users from TIBCO EMS a few years ago and that was actually very simple. It was two or three lines of code for each of the 200 applications that were connected. Within about three months we'd moved 200 applications. So it is easy to do pure JMS conversions, for example. But if you've actually got to rewrite the application completely, because you're changing how it operates, that's very different.
In that three months of discussion that I mentioned, we were working on our topic hierarchy and making sure that we didn't have any pitfalls. The rest was that it takes a long time to get things set up at data centers, racked and networked and dealing with the firewalls. But the actual configuration of the appliances between all the regions was only about two weeks' worth total, for 12 different data centers. That was not the lion's share of the work. The planning for doing it across multiple regions was the lion's share of that.
The topic hierarchy is hugely flexible, but you do have to put time in to plan your hierarchy and try to think through all the eventualities of how you're going to use it. Otherwise, it can become a bit of a free-for-all if you don't govern and control it in some way. You need a good onboarding process for how you want to use things. If you leave it totally open to your teams to choose, you're going to end up with a bit of a mess.
For naming, we start everything with a region and go from there:
There are six or seven layers of our topic schema that we have published. After that, the application teams can be specific on how they want to name the seventh or eighth level. But the first several levels are defined by us and we say, "Okay, if you're this, you're going to be choosing New York, you're going to be choosing fixed income, you're going to be choosing that this is market-data price, and then you're going to be choosing that your application name is this, and the datatype is real-time. And the message instrument itself is X and the data it contains is Y." So we've already mapped out our schema for all those levels, and then they can put their payload in at that level.
This way, it becomes really easy if you're trying to wildcard things at a higher level. You can say, "I just want to see all the market data prices." I can wildcard three levels and be able to pick those up without having to know anything else. I can look at pretty much any topic name that someone has. And you've got 255 characters to choose from. I've seen people who try to map everything, but then it becomes unreadable. Unless you've got a guide to figure out what topic schema look like, it becomes very difficult for a human to interpret. It has to be readable to them. Six to eight levels works, without needing some sort of decoder to work out what things mean.
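The wildcarding idea above can be sketched in a few lines. This is a simplified model of Solace-style subscription wildcards (`*` for a single level or a level prefix, a trailing `>` for one or more remaining levels), not the broker's actual matching code. The level names and the example topic are hypothetical, chosen only to mirror the region / asset class / message type / application / datatype / instrument / data layering described in the review.

```python
def topic_matches(subscription: str, topic: str) -> bool:
    """Return True if `topic` matches `subscription`.

    Simplified Solace-style rules (an illustrative model, not broker code):
      '*'  matches exactly one level; 'abc*' matches a level starting with 'abc'
      '>'  as the final level matches one or more remaining levels
    """
    sub_levels = subscription.split("/")
    topic_levels = topic.split("/")

    for i, sub in enumerate(sub_levels):
        if sub == ">" and i == len(sub_levels) - 1:
            # '>' needs at least one topic level left to consume.
            return len(topic_levels) > i
        if i >= len(topic_levels):
            return False
        if sub.endswith("*"):
            # Whole-level or prefix wildcard within this single level.
            if not topic_levels[i].startswith(sub[:-1]):
                return False
        elif sub != topic_levels[i]:
            return False
    # No trailing '>': the level counts must agree exactly.
    return len(sub_levels) == len(topic_levels)


# Hypothetical topic following the layering described above.
price_topic = "ny/fixed-income/mktdata-price/pricer/realtime/US912828/bid"
order_topic = "ny/fixed-income/order/oms/realtime/US912828/new"

# "All market data prices", without knowing region or asset class.
all_prices = "*/*/mktdata-price/>"
print(topic_matches(all_prices, price_topic))
print(topic_matches(all_prices, order_topic))
```

Because the first levels are fixed by the central team, a subscriber only needs to know the position of the level it cares about, which is what keeps a subscription like `*/*/mktdata-price/>` readable.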
In terms of staff involved in the deployment at the time, we had about 16 people, globally, across the different regions. But this wasn't the only thing they were doing. We also support 20 or 30 different systems because we look after the market data system for the bank as well. Solace isn't our only job. In addition to those 16 people for the initial implementation we had 30-something in compliance across Prod, QA, and Dev, etc.
Today, the number of people we have doing maintenance on it is in the high 20s. We haven't exponentially grown our staff around what we're charging back to the business for the true staffing of this. The only thing we have grown out a little bit, over time, is our development team that supports the applications, as we've had 400 applications come on. They have general, day-to-day questions. We only have three people in that Dev team, but they're acting like a first-responder before we raise a question to Solace's support team around API issues. A lot of the questions people ask are common questions that we've answered two times already. We have a lot of Confluence pages with basic how-to and FAQs. But sometimes people just want to jump on a call, go WebEx, and walk through what they were thinking of doing. We only had one developer doing that originally and we've got three now.
We're currently running version 9.6.0.38 in production.
Although we didn't do so on day one, we now work with three companies in this ecosystem. There is a company called BCCG, a Germany-based company. We originally wrote some feed-handlers with Solace to bring market data from companies like Refinitiv (LSEG) and Bloomberg onto the platform. We didn't want to own those, long-term. We felt it was something that could be out on the street. So we partnered up with this company, BCCG, who Bloomberg recommended to us. They're a small startup company and they now own the feed-handlers and the permissioning agents and are selling those as a product on the street. They have a partnership with Solace.
We also partnered with a company called MDX Technology and that was really for an Excel plugin. We have a lot of users who use Excel sheets and we want to be able to send and receive data from and to Excel. So MDXT wrote a plugin for Solace. They have plugins to a lot of other messaging environments. They just created one for Solace and, again, they're selling it out on the street. They built it based on us and now they have sold it to plenty of other Solace clients.
We also partnered with ITRS, which is a monitoring company, to build plugins on top of Solace's environment. ITRS is our monitoring system. Every major bank uses them. They have plugins into all the different systems that you might have. We worked with ITRS and Solace to create monitoring for Solace. Again, ITRS has then sold that to a whole bunch of Solace's customers.
The only other one is a company called CJC, which is more of a consultancy and support company. During Asia-PAC hours, they look after first-line support of the whole platform, including the market data as well as the Solace platform. They're doing level-one and level-two during the day in Hong Kong. That's not in any way expensive. They're the company that actually supports Refinitiv's platform so they already have people and staff there.
Capital markets couldn't operate today if Solace were down. Our turnover on a daily basis is significant. To put a dollar value on it would be very difficult. But by not having 500 servers across the globe and having about 60 appliances at the moment instead, we've got a 10-to-one footprint, so in pure infrastructure costs we have hard-dollar savings. By having the appliances in, we've enabled the business to make millions on a daily basis.
We did an RFP and pulled all the vendors in, including Thomson Reuters, TIBCO, and a whole bunch of others such as Informatica, and we did a proper vendor evaluation. It came down to Informatica and Solace, head-to-head, in the final decision.
The choice to go with the Solace appliances has actually paid off massively in savings from an infrastructure point of view. The reason is that, in our old platforms, for example our RMDS Thomson Reuters platform, we had about 500 servers around the globe sending all the data to each other, meshed up in a huge administrative nightmare. The Informatica solution was going to be very similar, as in commodity hardware that you would mesh up to send all the data. We looked at that and said, "Well, a server in our data center is going to cost us $20,000 a year to run," so if we still had 500 of those, you can do the math. If we were to buy the Solace appliances, working out to about $100,000 each, we would then only have to pay support and maintenance on them for the next two or three years, at about $20,000 a year. We only needed 30 of them, compared to the 500 servers. This has been a huge cost saving for us. The 500 servers that we used to have are all gone, and we have replaced them with 30 to 40 appliances. The cost of running things in the data center has, therefore, shrunk significantly.
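The "you can do the math" comparison above works out roughly as follows. The figures are the approximate ones quoted in the review. Treating the $20,000 support fee as per appliance per year is an assumption, and data-center running costs for the appliances themselves are not included because the review does not give them.

```python
# Approximate figures quoted in the review (USD).
SERVER_COUNT = 500
SERVER_RUN_COST_PER_YEAR = 20_000   # per commodity server

APPLIANCE_COUNT = 30
APPLIANCE_PRICE = 100_000           # one-off purchase, per appliance
APPLIANCE_SUPPORT_PER_YEAR = 20_000 # ASSUMPTION: per appliance, per year


def server_cost(years: int) -> int:
    """Total running cost of the old server estate over `years`."""
    return SERVER_COUNT * SERVER_RUN_COST_PER_YEAR * years


def appliance_cost(years: int) -> int:
    """Purchase price plus support for the appliance estate over `years`."""
    per_appliance = APPLIANCE_PRICE + APPLIANCE_SUPPORT_PER_YEAR * years
    return APPLIANCE_COUNT * per_appliance


for years in (1, 3):
    print(f"{years} year(s): servers ${server_cost(years):,} "
          f"vs appliances ${appliance_cost(years):,}")
```

Under these assumptions the server estate costs about $10 million a year to run, while 30 appliances cost about $3 million to buy plus $600,000 a year in support, so the appliances come out ahead within the first year.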
Although people do view Solace as being this premium product you pay a lot of money for, if you're going to put a lot of data through these things, the amount of servers you need to do that with is also extremely costly. We have saved millions a year by having the appliances, and that was something we picked up right at the beginning. We said, "If we go down this path and these appliances can truly do what they say, then the footprint in our data center is going to shrink 10-to-one, and the cost of running this in our data center is going to be significantly less."
We also support multiple instances of Kafka. There's an enterprise version within our bank, which is the biggest one, and we have some small pockets of it within capital markets. The configuration and support around Kafka, and the quantity of components needed to keep it going, are a configuration nightmare. We use the software broker for development and now in the cloud. In our non-production environments we have a non-appliance based version running in things like Docker. But the ability to have one component that does everything, as opposed to having to layer in multiple components to be able to build the ecosystem for messaging or storage, is extremely powerful from a support perspective. The time spent on keeping Kafka running, compared to Solace, is not in the same league.
We have a lot of problems with Kafka, generally, that we do not have on Solace. The enterprise runs the majority of the Kafka; the stuff that we support is for our regular Cloudera stack. To try to give an idea of scale, the enterprise bank is doing, maybe, less than a billion messages a day on its Kafka environment, which is still a big environment for them. But we're doing 120 billion messages, so we're not even in the same swim lanes. We know they have a lot of problems on that and had to build a second instance after issues with the first instance. And in our own Cloudera Kafka, we have had problems with Cloudera over time. Our current setup is stable, but we have not upgraded to the latest versions. Whereas our Solace stuff is bulletproof.
Kafka has its place. There's absolutely no question about that. There is some stuff that it does really well, like some of the elastically expanding storage concepts that people have where they want to keep storing everything forever. They can keep elastically expanding their Kafka brokers to do that. Whereas, with a Solace appliance, you are going to have a SAN storage connected to it and you're limited by the size of the SAN you can put on there, or you're going to need to buy another appliance and buy another SAN. With their software broker you could elastically expand that, but you still have the storage issues.
The one real positive with Kafka is that you have a big community of people, and this is something I've spoken to Solace about too. There is this groundswell of community around it, where there are a lot of adapters that are off-the-shelf to a lot of other things. It's a double-edged sword. Sometimes we have new users join the bank who say, "Yeah, but Kafka has a SQL adapter off-the-shelf." We say, "Okay, but we already have written a SQL adapter for Solace. Here you go. It was 10 minutes' work." At the same time, it is nice to have a catalog of 200 adapters that you can use on Kafka. That is definitely a benefit of Kafka, with the community around it. But at the same time, when you scratch the surface of it, the amount of work to do a plugin isn't actually much more, and with the Kafka stuff you need six or seven different components to run it.
In my last design overview with the Confluent group they said, "And then we're going to add this component, and if you want global..." and I said, "Well, actually, all our stuff is global. We don't do anything that's just one region." They said, "Well we haven't gotten our global solution built yet so you could run two versions and start copying data." I said, "Well, I don't really want to do that. We want you to be able to replicate data between regions, under the covers." They're now doing that. They're getting up to speed on some of those things. It all depends on what your use case is.
We even have some stuff where, at the edge of our environment, we might bridge data between Solace and Kafka and there is a bridge component to do that. It would be when there's a very specific use case around what someone wanted to do. For example, if a third-party vendor is only supporting Kafka, we'll plug in Kafka there, but we don't want people then connecting to Kafka because there's no need for it. So we'll then bridge from Kafka to Solace so the data is all on Solace. There are definitely use cases for Kafka. It's just that the scale of Kafka, depending on what the use case is, is a little bit different. I feel people use Kafka because they're just trying to lazily store everything as a long-term retention process.
The implementation of Kafka compared to Solace is very different. As I mentioned, there are multiple components to build up Kafka. Additionally, having reviewed Confluent, we found it is not cheap, and we would be more interested in using Apache Kafka supported by the cloud providers if we need that in the future. But our future is very much on the Solace environment. We're far more comfortable supporting the Solace environment than our Kafka environment.
If I was coming into this cold, and knowing what I know today, the one thing we would do differently is we'd have the network team involved throughout the whole process of bringing it into the bank. Bring your network team on that journey with you, because if it's going to become like it has with us — the biggest thing on the network — then you want to have the network team at the table from day one. That way, networking knows things are coming. We're putting these huge things into the data centers and they're going to send huge amounts of data around. That team needs to be ready, so they need to be at the table.
In terms of the onboarding and governance processes, fortunately we did think ahead and plan that stuff. But I speak to other customers that didn't and they're struggling with having the right onboarding processes and the right governance around things. At the end of the day, if you've got 120 billion messages going around, if you don't have a good onboarding and governance process, you could just have a 120-billion message mess. We don't have that because we had a good governance and a good architecture to begin with.
As I mentioned, I've suggested to Solace that they shouldn't sell their products without enforcing a bit of the architectural piece to begin with. The problem is that everyone has their own budgets and thinks, "Oh, I don't need you to help me, and I don't want to pay for it," figuring that Solace is trying to push its Professional Services a bit. But that small investment in Professional Services, when you first stand it up, could contribute hugely to the success of your platform. The Solace Professional Services that we've experienced, and the general value out of that, is worth the dollars you pay for it.
From a maintenance point of view, every time Solace releases a new version of the API, we review what has changed in that and whether it affects us in any way. Sometimes a release is for something specific that another client has asked for and that doesn't have any value to us. We don't force applications to upgrade every time a version changes. We tend to do a yearly request of the application teams to upgrade their API to the latest one that we vetted. It's like a yearly maintenance to update the API. And to do that work, to integrate the new API version, it's generally not more than half an afternoon's work to put it in. It might take longer than that to QA, test, and validate your application to put it into production, but the actual coding piece takes an hour or two at most. It's not a huge overhead to be able to do that.
In terms of the event mesh feature, we're a bit of a "halfway house." They have multiple things. One is called dynamic message routing (DMR) and another is multi-node routing (MNR). We use the multi-node routing piece. We are working on the DMR piece of it, which is their newest function for public cloud use. We're in the final stages of setup for expanding Solace into both Azure and AWS.
Internally, we're using their MNR so it's all an event mesh and everything is automatic. If you publish a message in Sydney and you want to subscribe to it in New York, we have to do nothing to get that message from A to B. You subscribe and it gets there. Depending on which terminology you're using around event mesh, we consider ourselves to be on event mesh, but we have not deployed that for guaranteed messaging for our general population. We're still using their multi-node routing, which means direct messages fly on demand, and we have to bridge guaranteed messaging.
The clustering feature is really designed around trying to make things easier for clients on configuration, so that you don't have to look at things as an HA pair and a DR device, by representing that as a cluster node. This is all work related to trying to make things easier from a support perspective. Today, if you make a change on an HA pair, you can then force-sync that to DR. It automatically happens to the HA box so you only make a change on the primary; it syncs to the backup. You can then choose whether you want to sync that to the DR device or not by putting it into a cluster node. They're just making it simpler for people. It's definitely a positive. We've actually been involved in helping them design that because we were one of their first and one of their bigger customers. We sit in with their engineering at least every six months and they walk through things they've got coming down the road and we talk about how they go about implementing stuff.
As for the free version of Solace, at the time, 10 years ago, the free version — that's the software version — didn't exist. With the software version there are limits to the number of messages, something like 10,000 messages a second. We're doing 1,000,000 messages a second. We could run lots of 10,000 messages-a-second instances, but then we would need a lot of commodity servers to run them on. If you are a small company that has some messaging requirements and you are looking for a good way to do that, the free version is absolutely an option. It doesn't come with any support either, obviously. You can pay for support on top of that version, but it's only going to do you 10,000 messages a second. At the scale we have, that wouldn't work. For non-production, giving that to a developer to run on their machine, to play around with, absolutely. So we don't really pay for any of the Dev stuff that we have. We're only paying for the physical production appliances and the reason we need those is just the scale of messaging that we do.
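The scale argument above can be checked with quick arithmetic. The numbers are the rough ones quoted in the review (about 10,000 messages a second for the free software version, about 1,000,000 a second for this deployment, and 120 billion messages mentioned earlier in the same review). Reading the 120 billion figure as a daily total is an assumption based on the per-day comparison it appears in.

```python
FREE_TIER_MSGS_PER_SEC = 10_000          # approximate limit quoted in the review
DEPLOYMENT_MSGS_PER_SEC = 1_000_000      # rate quoted in the review
MESSAGES_PER_DAY = 120_000_000_000       # ASSUMPTION: the 120 billion is per day
SECONDS_PER_DAY = 24 * 60 * 60


def instances_needed(target_rate: int, per_instance_rate: int) -> int:
    """Number of instances needed to carry `target_rate` (ceiling division)."""
    return -(-target_rate // per_instance_rate)


free_instances = instances_needed(DEPLOYMENT_MSGS_PER_SEC, FREE_TIER_MSGS_PER_SEC)
average_rate = MESSAGES_PER_DAY / SECONDS_PER_DAY

print(f"Free-tier instances needed for the quoted rate: {free_instances}")
print(f"Average rate implied by 120 billion a day: {average_rate:,.0f} msgs/s")
```

Covering the quoted rate would take on the order of 100 free-tier instances, each needing a server to run on, which is the footprint the appliances were bought to avoid.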
Publishing events to the PubSub+ broker via one protocol, and having the broker seamlessly make them available to multiple consumers on different protocols without writing code, is one of our primary use cases.
As an architect, I wanted to implement an event-driven architecture, mainly focusing on a cloud solution with a PubSub pattern.
Out of curiosity, our CoE team explored Solace PubSub+, and after several months of testing with our requirements, this was the kind of solution we needed.
Previously, we used on-prem Kafka for event distribution. The Kafka cluster was self-managed and hence required a dedicated team and infrastructure to manage. We wanted a cloud solution, and Kafka was already available as a SaaS offering, yet we wanted a better and more compatible solution with respect to protocols and security. Hence, we opted for PubSub+ with its support for AMQP, JMS, MQTT, and HTTPS. Without writing code, we were successful in creating event-driven microservices; at the same time, administering the cluster was no longer our responsibility.
We like the seamless, flexible exchange between protocols that it offers without writing any code. This is probably the most useful feature of Solace apart from the structured topic segments.
Solace PubSub+ has understood and implemented the best features of multiple PubSub brokers. I have seen segmenting of topics in IBM MQ PubSub and have used that feature a lot. With such versatility, Solace has implemented one of the most in-demand PubSub brokers, hence PubSub+. It is a simple portal for event-driven architecture driving the entire microservices framework.
Feature-wise, it has almost everything that I can ask from an architect's point of view.
For improvements, I would suggest increasing the max payload size to a limit of 100MB or more. The current max payload size is limited to 5MB.
"Events" are supposed to be small in size, yet, as the name suggests for "broker", a broker should be capable of handling large payloads as well. Even Confluent Kafka provides bigger payload size support for enterprise usage licenses.
I've used the solution for two years.
We have used IBM MQ PubSub and Kafka. We were moving away from the IBM stack and implementing microservices. Kafka would have been a good choice, yet, keeping future scope in mind, we opted for Solace.
We implemented the solution with the help of our own in-house team.

We are using PubSub+ Event Broker for specific application processing. We have used the solution in banking systems.
The speed of the throughput has been helpful for our organization.
The valuable feature of PubSub+ Event Broker is the speed of processing, publishing, and consumption.
The integrations could improve in PubSub+ Event Broker.
I have been using PubSub+ Event Broker for approximately five years.
The stability of PubSub+ Event Broker is good.
I rate the stability of PubSub+ Event Broker a nine out of ten.
There are other platforms that are more scalable than this solution.
There are not very many people using this solution and we do not plan to increase our usage.
I rate the scalability of PubSub+ Event Broker a seven out of ten.
The support from PubSub+ Event Broker is great.
I rate the support of PubSub+ Event Broker an eight out of ten.
I have used other solutions similar to PubSub+ Event Broker.
The deployment of the solution took approximately three months.
We did the deployment and it took approximately 10 people to complete.
The solution is worth the money for the proper use case, such as high throughput applications.
The price of PubSub+ Event Broker is reasonable for the capability it offers. However, when compared to other solutions on the market, it is expensive.
We use approximately four people for the maintenance of the solution.
My advice to others is this solution has high throughput and is used for many stock exchanges. For business critical use cases, such as processing financial transactions at a quick speed, I would recommend this solution.
I rate PubSub+ Event Broker an eight out of ten.

There was a challenge in the market that needed addressing. While some tools could serve as event managers, they were not proper event brokers. For instance, Kafka is referred to as an event broker but not a proper one. If you want to use it as an event broker, you would have to implement your event manager, which could be quite complex. The same can be said for Kafka, a text-based broker that stores events in a text file on disk and then consumes them. In other words, an event is just a message stored in a text file that is not reactive. If you introduce events into the system, you cannot react to them in any way. It creates a problem that needs to be addressed by implementing tools that can react to events or pre-defined topics. The way topics are created is also an issue. For instance, if you want to consume a specific topic, you have to create a new one, and you cannot filter events using the mechanism provided. If you wish to query events, there is no provision.
This is where PubSub+ comes in. It provides the option to query events and route messages from one topic to another, as well as to clients, to facilitate this process. We primarily use PubSub+ for event-driven applications, where we react to events and run our processing based on them. For instance, when we receive an order, we react to that event and create multiple other events based on it. This reaction is based on events, so we use PubSub+ Event Broker.
At first we mostly had a few teams building prototypes, because we were planning for large-scale usage. Now some domains have around fifty to sixty developers using PubSub+ Event Broker.
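The order example above, where one incoming event triggers several follow-on events, can be illustrated with a toy in-memory publish/subscribe loop. This is only a sketch of the pattern: `ToyBroker`, the topic names, and the handler are hypothetical, and it does not use or imitate the PubSub+ client API. A real application would connect to the broker through one of its supported client libraries or protocols.

```python
from collections import defaultdict


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker, for illustration only."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self.log = []  # every (topic, payload) published, in order

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, payload):
        self.log.append((topic, payload))
        # Deliver to each handler; handlers may publish further events.
        for handler in list(self._handlers[topic]):
            handler(self, payload)


def on_order_received(broker, order):
    """React to one order event by emitting several follow-on events."""
    broker.publish("inventory/reserve", {"order_id": order["id"], "sku": order["sku"]})
    broker.publish("billing/invoice", {"order_id": order["id"], "amount": order["amount"]})


broker = ToyBroker()
broker.subscribe("order/received", on_order_received)
broker.publish("order/received", {"id": 1, "sku": "ABC", "amount": 250})

for topic, payload in broker.log:
    print(topic, payload)
```

The publisher of `order/received` knows nothing about inventory or billing; those reactions are attached by subscription, which is the decoupling the review is describing.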
PubSub+ Event Broker has a nice dashboard to see the domains and has good asynchronous API documentation. If you are a small or medium-sized company and not very rigorous on the compliance side, and you don't go for production issues on your own cloud, it's a very good tool to reuse.
One of the main challenges of deploying PubSub+ is integrating it with existing applications. The naming convention of PubSub+ can be confusing, and the way it handles producers and messages should be more like a channel where messages can be posted, and watchers can react to them.
PubSub+ Event Broker is a nice tool, especially if you want to deploy into on-premise data centers along with the cloud. It has a complicated architecture, but somehow it works. However, I don't think they have ever worked with very high-scale applications. There have been many improvements, but it still lags behind when compared to large-scale applications. It does not work out of the box and there may be confusion due to its complicated solutions. Nonetheless, it is still a good product to use.
I have worked with PubSub+ for the last two years. I am still working on it currently. We stopped using PubSub+ about two months ago.
PubSub+ Event Broker is stable.
There were several issues with the scalability of PubSub+ Event Broker, such as deploying it to their own environment and integrating it with their applications. They found that the naming convention of publishers and producers was confusing, and the process of creating a topic, making a queue, and making it work was complicated. PubSub+ Event Broker lacks the feature of a channel where someone posted a message and someone from the team reacted to it, which is present in AWS EventBridge. The dashboard to monitor events is missing in PubSub+ Event Broker, making it difficult to map domains and monitor events. The implementation is entirely in Docker, and to scale it, it must be deployed into a cluster queue or an ECS machine, which is costly and operationally intensive. The architecture of this distribution needs improvement.
The technical support team responded on time.
Positive
We stopped using AWS EventBridge because they had some issues with the API specifications regarding how to design the governance and other features. Even though it is a very good tool with all the necessary mechanisms, it does not offer an easy replay function. The dashboard to monitor events is missing in PubSub+ Event Broker. The solution is also complicated and not easy to use, making it difficult to map domains and monitor events. The implementation is entirely in Docker, and to scale it, it must be deployed into a cluster queue or an ECS machine, which is costly and operationally intensive. The architecture of this distribution needs improvement.
First, we experimented with an on-premises model and then we tried the cloud model for production. However, we encountered some issues. The cloud model is successful and does not have these kinds of issues because it's in its own ecosystem. But when you move to your own cloud, you want to manage the security and everything yourself.
The problem is that you have to deploy and recreate a VPC, and they will manage the VPC. You will not manage the VPC and the security of the system. This is a compliance risk that no management team would allow, where a third party manages your own account. There are solutions like creating a separate account and then keeping it there, but still, it's deployed in our own cloud, and we can manage the network or security of the whole account in which it was deployed. This is a tradeoff. The main challenge of using PubSub+ Event Broker was deploying it to their own environment and integrating it with their applications.
The first deployment challenge is deploying it into Kubernetes. It is the only model that they offer. In our teams and our company, we are not fond of using Kubernetes unnecessarily because it adds overhead to the Kubernetes cluster. Then you have to hire people who can manage this complexity, and there are many complexities inside. It's a huge thing to deal with. And PubSub+ is also one of the complicated solutions. So you must manage two complex solutions in one place, working together. It was too much operations work for the team.
It is one thing. And then after that, keeping this ECS content thing, so it's like one ECS, small ECS content cannot run it. You need to have a very heavy ECS cluster that can connect to the broker. Otherwise, if you have a very large-scale application, this federal will die, and then after that, you will lose the connection to the PubSub+ Event Broker. So you must keep one extra layer of a very heavy ECS cluster that can communicate with the broker. And then, after that, it can forward the requests to the APIs or the targets. And this ECS cluster also costs extra money for this.
We had a devOps team who was responsible for maintenance. But it was not a go-ahead and deployed on a large-scale solution. So we had a few problems with it. Specifically, we were still figuring out whether we should go for it. We had piles of issues that we liked to pile up in a matter of time. Every second week, we had something new coming up. Then it became when we solved all these problems and had a conversation with the team; we found out that they have a small team managing things and were really slow in managing things.
The ROI was good initially, but we eventually found that it would cost too much in recurring operations cost to maintain. So while I don't remember the figures, it wasn't worth it. And if you compare it with other solutions like Kafka, they are much cheaper and easier to maintain than PubSub+.
We paid around thirty-five or thirty-eight thousand euros for it. It was for a medium-sized deployment with two nodes.
My advice is mainly technical. If you're a big corporation with thousands of applications running, then PubSub+ is a good choice. But if you're a very small or mid-range start-up, it may not be necessary if your application doesn't require high computing or compute-intensive operations. However, it's important to consider the investment required, including licensing and operations costs. For example, even just for basic connections, the server cost for 100 or 250 connections can be around 100,000 euros.
Overall, I would rate the solution a seven out of ten.
The first use case is technology operations tools. We are a best of breed monitoring shop. We have all kinds of tools that monitor things, like storage, network, servers, applications, and all types of stovepipes that do domain specific monitoring. Each one of those tools was sold to us with what they called a single pane of glass for their stovepipe. However, none of the tools are actually publishing or sharing any of the events that they have detected. So, we have been doing a poor job of correlating events to try and figure out what's going on in our operations.
Our use case was to leverage that existing investment. For about a year, we have been proving that we can build publishing adapters from these legacy monitoring tools which are each valid in their own right, like storage monitoring tools, network monitoring tools, and application monitoring tools (like Dynatrace), and more modern than other ones. We have been building publishing adapters from those things so we can transport those events to an event aggregation and event correlation service. We're still trying to run through our list of candidates for what our event correlation will be, but the popular players are Splunk, Datadog, and Moogsoft, then ServiceNow has its own event management module.
From an IT systems management perspective, our use case is to have a common event transport fabric that spans multiclouds and is WAN optimized. What is important for me is topic wildcarding and prioritization/QoS. We want to be able to set some priorities on IT events versus real business events.
The second use case is more of an application focus. I'm only a contributor on the app side. I'm more of an infrastructure cloud architect and don't really lead any of the application modernization programs, but I'm a participant in almost all of them. E.g., we have application A and application B side by side sitting in our on-prem data center, and they happen to use IBM MQ Hub to share our data as an integration. Application A wants to move to Azure. They are willing to make their investment to modernize the app, not a forklift, but some type of transformation event. Their very first question to us is, "I need to bring IBM MQ with me because I need to talk to app B who has no funding and is not going to do anything." Therefore, our opening position is, "Let's not do that. Let's use cloud-native technology where possible when you're replatforming your application. Use whatever capability you have for asynchronous messaging that Azure offers you. Let's get that message onto the Azure Event Hub. Don't worry about it arriving where it needs to arrive because we'll have Solace do some protocol transformation with HybridEdge, essentially building a bridge between the Azure Event Hub and MQ Hub that we have in our data center."
The idea is to build bridges between our asynchronous messaging hubs, and there's only a small handful of them, where Azure Event Hub is the most modern. We have an MQ Hub that runs on a mainframe and IBM DataPower appliances that serve as our enterprise service bus (ESB). Therefore, if we build bridges between those systems, then our app modernization strategy is facilitated by a seamless migration to Azure.
The most recent version is what we installed about three weeks ago.
The solution is deployed on Azure for now. We will be standing up some nodes in our on-prem data centers during next phase, probably in the next six months.
The plan is to use event mesh. We're not using it as an event mesh yet, as we are only deployed with Azure. We want to position a Solace event mesh for enterprise, but we're just now stretching into Azure. We're a little slow on the cloud adoption thing. We've got 1200 applications at CIBC with about four of them hosted in clouds: one at AWS and three at Azure. So, we're tiptoeing into Azure right now. We're probably going to focus our energy on moving stuff into Azure. However, for now, because the volume is so low on stuff that's outside of our data center, the concept of a mesh has been socialized. There's not a ton of enthusiasm for it, even though I might be shouting from the rooftops saying, "It's a foundational capability in a multicloud world." It looks like we're putting that funding on the back burner for using it as an event mesh.
This solution has increased our application design productivity compared to other solutions. There is a ton of momentum in our application development space for leveraging Dynatrace with Solace's monitoring tool. We have made the investment in getting Dynatrace to publish events that it detects, mostly application performance related events. The app development teams have taken a liking to implementing the application monitoring tool early in their development cycles, maybe not in development, but in their performance testing cycles. We can practice what a code drop stack shift would look like if they're shifting from stack A to stack B or if they're doing rolling reboots on some of their app servers as they're doing upgrades. We get to exercise that and see what the monitoring patterns look like correlated with servers going up and down along with web services coming up and down. That's been helpful to the development community to see that automation occur in mid-environments.
There have been quite a few incidents where a test infrastructure has become unavailable because of some change going on and the app developers aren't on the nut for fixing the problem in the UAT environment because they know the outage was caused by the fact that they did a code drop 20 minutes earlier, which is a legitimate server outage. We are seeing some benefit, but it's more of an optimization of incident management resources. E.g., we have somewhere between five and 10 Internet-facing applications, and when something goes bump on the firewall that's behind them, we have DMZs (or different zones) where we put our web tier and app tier. Therefore, when something goes bump in our network tier, we've got 10 application teams that are all fired up and saying, "What's going on?" Then, they all spin up their own tech bridges. Meanwhile, the firewall guys are working on a problem that we just don't know about. So, we have wasted a lot of time and energy trying to figure out things that aren't our problem. The bad scenario hasn't happened to us in production, but in our test environment, it has happened once where a couple of app dev teams have been able to stand down because we were correlating events correctly.
We struggle with mean time to resolution on things. We do have a lot of change control rigor, but the solution hasn't changed our organization yet. The idea is when we're getting the events for our service provider of operating systems, servers, and storage network correlated intelligently together with our application changes, application performance monitors, and application availability monitoring tools, then we'll make more intelligent decisions about root cause, where problems lie, and be able to react more intelligently. This will reduce mean time to resolution, but we're not there yet.
The division who has been using Solace for years has a mature costing estimator model for internal projects. That model certainly will be leverageable for the technology operations guys. We haven't crossed that bridge yet because we're still in PoC mode. It's very likely that once we hit prod, we'll have ease of solution design when we have a protocol-agnostic message transport in place, and that our solutions will be easier to craft and give cost estimates.
It is easy for architects and developers to extend their design and development investment to new applications using this solution. In our architecture practices, we are always documenting compositions. We care a lot about the data exchanges between applications or the integrations. We have a lot of contractors and other integrations that we care about. Having transmission facilitators definitely makes the architect's life a lot easier when we just put a message on the queue and it's going to get transported by the facilitators to wherever it needs to go. It is definitely easier when we have Solace and an event mesh up and running. Today, when we have integrations that don't leverage those transmission facilitators, like an MQ Hub or Solace event mesh, those integrations are much harder to get approved because we have to dive into the security, access controls, encryption, and all that other stuff.
The most useful features have been the WAN optimization and probably the HybridEdge, which requires some third-party adapters or plugins. The idea that we can position Solace as a protocol-agnostic message transport fabric is key to our company having all manners of asynchronous messaging protocols from MQ, Kafka, JMS, etc. I really like the WAN optimization: Send once over a WAN, then distribute locally as many times as there are subscribers.
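To make the "send once over a WAN, then distribute locally" point concrete, here is a small back-of-the-envelope sketch. The function name and the example numbers are assumptions for illustration, not measurements from our environment.

```python
def wan_transfers(messages: int, remote_subscribers: int, send_once: bool) -> int:
    """Count how many message copies cross the WAN link.

    send_once=True models one copy per message crossing the WAN, with the
    remote broker fanning out locally. send_once=False models each remote
    subscriber pulling its own copy across the WAN.
    """
    if send_once:
        return messages
    return messages * remote_subscribers


# Hypothetical workload: 1,000 messages and 5 subscribers at the remote site.
print(wan_transfers(1_000, 5, send_once=True))
print(wan_transfers(1_000, 5, send_once=False))
```

With these example numbers, WAN traffic grows with the number of remote subscribers only in the second case, which is why the feature matters more as subscribers are added.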
I don't think we have yet unleashed the full potential of topic wildcarding. That is a silver bullet that we haven't yet maximized the value on because we don't have a ton of subscribers yet. Coming up with a topic naming convention in our large company has been difficult. However, once we start forking data over to some of our data lakes, enterprise data hub, and security event depositories, it will become a useful feature in the future.
The storytelling about the benefits needs improvement. We have four major lines of business in our company. Our retail, capital markets, and internal corporate center lines of business along with technology operations, which is more of a cost center. Technology operations are not innovators, but more a keep the lights on arm of the business. One of the areas of improvement would be if we could tell the story a bit better about what an event mesh does or why an event mesh is foundational to a large enterprise that has a wide diversity of applications that are homegrown and a small number off the shelf. I wish we were better able to tell the story in a cohesive way to multiple lines of business, but that's more of a statement of our own internal structure and how we absorb or adopt new technology than it is about Solace or the product itself.
It has been a bit of a tough slog to try and get everybody to see event meshes are foundational in a multi-data center, multicloud landscape, when we're not there yet. Our company has most of our applications in two data centers that are close to each other. There is no real geo-redundancy, but everything we've ever done has been on-prem with only a small handful of Azure adoptions. Therefore, having folks see the benefit of an event mesh has been tough. I wish we could improve our storytelling a little bit.
We have struggled in a sort of perpetual PoC mode internally. This is no fault of Solace's. It's just that the only executive looking to benefit here is our technology operations team, and they have no money for investments. They're a cost center internally, so they have to be able to make the case that we're going to improve efficiency by leveraging this tech. Thus, the adoption has been slow.
We have three different lines of business in our company. One of them has been using Event Broker for about six or seven years.
Personally, I have been engaged in a proof of concept for about 18 months.
Solace has been incident free in HA deployment for seven years. I did an analysis before we started our PoC for the technology operations team, looking for a lot of incidents. One of the pieces of work I did internally was to figure out our app stabilization, and I couldn't find anything Solace related in terms of the bumpiness. It had a clean track record, unlike our DataPower appliances which have gotten us in the newspapers a couple of times in the last three years.
When I did my analysis, I found a lot of dependencies on our file transmission hub and the product that we use. I found a lot of victims of our DataPower appliances. I found no victims nor incidents related to our Solace hardware appliances under the coverage. There was not a single incident in six years. I went back to the well to try and see if I can find more, but I can speak to the hardware appliances and how stable they have been. They were only deployed within a single line of business, so it didn't have the complexity of an enterprise shared service in multi-LOB mode. However, the stability has been really good with a good track record.
If we deploy this the right way, we get a presence on each cloud at each data center and the full mesh effect. Plugging them into each other or making them part of the same ecosystem so they are aware of each other is not complicated for the guy whom we have working on this. He's not deploying it that way yet for our technology operations use case.
As we start to generate a little more momentum for our event correlation engine, we're probably going to uplift ourselves to a Tier 1 capability that has more of these nodes deployed throughout our various geographies around the globe. But, for now, it's only in one region of Azure Canada Central.
The group who has been using the solution for six or seven years has the physical appliances. Within the last two years, they refreshed on physical appliances again. We're probably not going to do it all. The physical appliances have been in the control of a single line of business in our company who have been able to self-manage. There wasn't really an enterprise-wide adoption that required a lot of coordination in our change process. We've done a lot of change management rigor in our company, so when a service is wholly contained within a particular line of business, then the ease of getting stuff done is a lot higher.
We have a small set of publishers, probably eight or 10 publishers, with maybe two subscribers. We haven't had the need to get into a whole bunch of granularity. The scope of our program: All publishers are sending to the two subscribers. There is really not a need to get very granular about who sends to where.
Today, in IT operations, the usage number is still zero because we are not live. The benefit will be probably 2000 operations staff across our own company and our service provider DXC. It's a 50/50 split. DXC has hundreds of guys doing incident management and operations for servers and below. We have retained services in the application space who are application operators and security operators. Those are retained people who will be working more efficiently as well.
I have not personally dealt with their technical support. They are always responsive. I know I like to talk with them on emails that go back and forth, but it's really about sales, e.g., trying to get statuses on our proof of concept and how it's going. We've not had any reason to reach out to them for tech support issues.
Occasionally, we have needed help for HybridEdge when we were trying to build a new protocol transformation adapter, then we will reach out to them. However, this is not in incident mode. It's always in a sort of a how-to mode for a PoC. We have never had to reach out to them for urgent requests.
We have protocol-specific message transport hubs, like an SFTP hub or IBM MQ Hub, but we never had a tech that has been protocol-agnostic. Therefore, the solution is kind of new.
Our IBM DataPower appliances have had the capability to do protocol transformation, but we've never done it. We've always just used it for REST and XML type stuff.
Our enterprise data hub has been essentially a big data lake for business data, customer information, etc. They are in year three of the enterprise data hub program. For the first three years, they had been receiving data only by file transfer, which was yesterday's data at best. Only because I'm a participant in different projects, I happen to know that two months ago they enabled real-time event streaming by Cloudera Kafka from our customer information repository. When a customer update happens and changes their street address, for example, we publish through Kafka to get that information into our enterprise data hub in near real-time, as opposed to waiting for tomorrow's file transfer. My understanding of that tech is that it requires a queue to be defined between the source and destination, which may not scale. It kind of reminds me of the early days of MQ when we had point-to-point MQ happening all over the place. We got about 150 queues in and realized, "Oh my God! Having a hub would be nice." Then, we implemented IBM MQ Hub and waited for the next best opportunity to get folks to talk to the hub.
I'm thinking the same thing will probably happen with Kafka as it emerges through our enterprise data hub service: individually setting up queues to get events into the enterprise data hub, message by message, for 600 applications will become onerous for the operations and support teams. I suspect that before we get to that number, an event mesh will garner more attention.
The initial setup was straightforward. We were a bit lucky because we have a guy on our technology operations team who did the initial setup of the physical appliances. When it came time to get the software and run it on servers, like Azure, it was relatively easy. Because we outsourced our infrastructure operations and monitoring tools to a service provider, the most complicated part was getting the firewall rules figured out for the publishers from the legacy systems. The complexity of setting up their product had nothing to do with Solace.
We are not live yet, but we're deploying using Azure with the intent to build our first bridge to the Azure Event Hub. The applications are hosted with Azure so we're recommending that they leverage cloud-native messaging technology, or Azure native messaging tech. We'll listen in on the messages that traverse the Azure Event Hub and fork them over to a Splunk (probably). The strategy is sort of non-disruptive and not mission-critical. In technology operations, we are just looking to see what events occur at Azure and trying to correlate them with events that are happening on-prem, since our customer information and account information are all stored in mainframes, NonStop environments, and platforms which are not moving to Azure. The implementation strategy is to insert Solace as means of transporting events into common spots so we can have a view of what's happening.
In a company that does rigorous change management, the initial setup took one of our guys probably three or four weeks. He was already supporting the physical appliances, so he had a bit of a running start. However, every time we cut a change record in our company, we need two weeks lead time: Two weeks to get our server infrastructure provisioned, then two weeks to get our firewall rules implemented. After four weeks, we were done.
A quarter of the same person's time who is also supporting the physical appliances is what is needed for maintenance.
I have two techie guys who work on installing it. I am more of the enterprise architect, PowerPoint guy.
On use case number one, we struggle with our mean time to resolution in technology operations. We've outsourced a lot of our data center operations and server storage network operations to a third party (DXC), formerly HPE Enterprise Services. They manage our data centers, OSs, and servers. CIBC applications are mostly homegrown, so we support and maintain our applications. We do code drops, code changes, DevOps toolchains, etc. So, when something goes bump, there is a lot of finger-pointing.
We have DXC publishing their events now. Going forward, we need to figure out which tools we correlate those events to and start recognizing some of the benefits.
We have not seen ROI.
The operational efficiencies that we intend to gain should result in a reduced internal chargeback of tech resources. That's really the ROI that we're going after: operational efficiency and better mean time to resolution for our incidents.
We have been really happy with the product licensing rates. It has been free for us, up to 100,000 transactions per second, and all we have to do is pay for support. Making their product available and accessible to us has not been a problem at all.
Having a free version is critical for our technology operations use case. This is primarily because our technology operations team is a cost center in our company. They are not profit drivers, and having a free version for installation will probably meet our needs. Even for production, it'll support up to 100,000 messages per second. I don't think in technology operations that we have that many events and alerts from our detection tools. Even if I have 20 or 30 event detection products out there, they're only going to publish the things which are critical or warnings. I don't think we'll ever reach 100,000 messages per second.
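As a quick sanity check on that headroom, the sketch below works out how fast each tool would have to publish, on average, before 20 or 30 tools together reached the 100,000 messages-per-second figure. The function name is made up for illustration, and an even spread across tools is an assumption.

```python
LICENSE_LIMIT_MSGS_PER_SEC = 100_000  # free-version figure quoted above


def per_tool_rate_to_hit_limit(tool_count: int) -> float:
    """Average messages/second each tool must publish to reach the limit."""
    return LICENSE_LIMIT_MSGS_PER_SEC / tool_count


for tools in (20, 30):
    print(tools, round(per_tool_rate_to_hit_limit(tools)))
# prints:
# 20 5000
# 30 3333
```

Monitoring tools that only publish critical events and warnings would have to sustain thousands of alerts per second each to get there.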
We have been dealing with the free version for a better part of 18 months now. There have been no allergic reactions. You should expect maintenance costs, but we've not really needed that because we're not live yet in production for our first use case. For our physical appliances, capital markets folks were happy to get a big discount on the last version of the physical appliances. I've heard no complaints about what they're being charged for the Solace product that they've had in use for seven years. However, they haven't modernized any of their applications into Azure yet.
When we were searching for protocol-agnostic event meshes, I wasn't the one doing the research. It was our integration domain architect. He had experience with Solace already. When he was doing market research for protocol-agnostic event meshes, his input to me was there was only one player, a Canadian company based out of Ottawa. Therefore, we didn't do a bake-off with anything else.
Other lines of business in our company have been using things like MQ Hub and IBM DataPower appliances. Our technology operations division has a program that I'm working on right now for trying to start getting our tools to interact together using Solace Event Broker.
Our company is pretty passionate about making sure that we have vendor support. When we do use open source products, we go out and get third-party support. When compared to some other messaging hubs that we do have, I have to admit that our IBM MQ Hub has been also incident free for many years while running on a mainframe, but our IBM DataPower experience has not been good. I would say that Solace fits right up there with the best that we have for message transport in our company.
Topic wildcarding implies that if we had a set hierarchy for our topic naming convention that we could deliver it to subscribers based on wild cards, which is something that differentiates from Kafka. We're not leveraging topic wildcarding, but my understanding of the tech is it would allow our security tools (for example) to be able to poke their nose into topics of interest to them using authorizations that Solace would control.
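To show what hierarchy-based wildcard delivery could look like, here is a simplified matcher. It is a sketch of the general idea, using `*` for a single level and `>` for "everything below this point", and it is not the broker's actual matching code; the topic names are hypothetical.

```python
def topic_matches(subscription: str, topic: str) -> bool:
    """Return True if a '/'-delimited topic matches a subscription.

    Simplified rules (an assumption, not the broker's full logic):
      '*' as a whole level matches exactly one level;
      'abc*' matches one level that starts with 'abc';
      '>' as the final level matches one or more remaining levels.
    """
    sub_levels = subscription.split("/")
    topic_levels = topic.split("/")
    for i, sub in enumerate(sub_levels):
        if sub == ">" and i == len(sub_levels) - 1:
            # Needs at least one topic level at or beyond this position.
            return len(topic_levels) > i
        if i >= len(topic_levels):
            return False
        level = topic_levels[i]
        if sub.endswith("*"):
            if not level.startswith(sub[:-1]):
                return False
        elif sub != level:
            return False
    return len(sub_levels) == len(topic_levels)


# A security tool interested in every critical event, whatever the domain.
print(topic_matches("ops/*/critical", "ops/network/critical"))
# A data lake that wants everything under the 'ops' hierarchy.
print(topic_matches("ops/>", "ops/storage/warning/dc1"))
```

The point of a set naming hierarchy is that subscribers like these can select slices of the event stream without publishers creating a new topic for each consumer.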
Kafka is really the only other competitor. We have IBM DataPower, but that's not really a fair comparison. We aren't intending to do format or data transformations with this tech. We're only looking at protocol transformations and message transport. Kafka has gotten a lot of momentum whenever our app developers Google that stuff, they get a lot of support and hits. Trying to find some momentum for Solace has been a bit difficult, but the idea of having Solace be our protocol-agnostic message transport system is the plan. However, when we have only had a small number of applications hosted in the cloud right now, the point-to-point message delivery is not unmanageable. Building a Kafka interface to something with Azure is tolerable and manageable when we have less than five subscribers.
When we realize that a message would be best consumed by something that talks a different language, we'll start recognizing that Solace is important: instead of publishing a message twice in two different protocols, we'll be able to publish once to the Azure Event Hub without worrying about what language our subscribers talk. We've been juggling between Kafka and Solace. Right now, the momentum for Solace is not yet there because the volume of applications modernizing is so low. But that tide is changing, and we're gaining some speed.
In technology operations, we have no use cases that are Kafka-centric. That's mostly because our enterprise tooling doesn't exchange data with anything. There are just these stovepipes of monitoring data.
Get folks in various stovepipes to recognize that their data is valuable to aggregate for the entire enterprise. The biggest lesson learnt for me in use case number one has been to get various support organizations to realize that publishing your data is not about pointing fingers and finding culprits. It's about efficiency of restoring service.
The solution got us to look internally at how we operate and we behave as a split-brain support organization, where we have some of it on the inside and some of it outsourced. That has been a benefit to us.
I would rate this solution as a 10 (out of 10).
We have a hybrid model because we have a lot of systems on-premise as well as a lot on the cloud. We have one instance of Solace in AWS Europe, and the other one is an on-premise setup in our data center, also in Europe.
Given the levels that we have designed into our topic taxonomy and the hierarchies, Solace gives us decent levels that we can get down to, in terms of granularity. It supports two to three character sets of their entire, end-to-end topic structure, so I can actually get down to level six or seven, or even more than that.
The last couple of releases have brought about event life cycle management. That changes the way a designer or an architect will design a topic and quickly discover what is available, and whether something has to be built out. That's pretty easy. With the life cycle of the event portal and the event cataloging that is available, it makes life easier for them. With all these new features in place it increases our productivity by something like 50 percent. Now, because we have a nice, curated view of the contents of the event in the event portal, it is easy to discover and to publish new topics. What used to take one day can be done in half a day, leveraging all the best-practices and the features that come with this product. Of course, you need to pay more if you use the event portal or catalog, but assuming all those tools are in place, it is beneficial for the productivity side.
There has also been an increase in productivity around solution management because of the ease of the key features that they offer. You don't need to spend time moving around multiple screens to manage something on the monitor, implement fixes, find hotspots, or even to publish something new. Because it is easier to navigate around, following the life cycle of an event, it definitely increases the productivity, whether it is from a solution management point of view or an operations point of view. From whichever angle you look at it, it makes life easier for that particular person.
We are implementing the event mesh feature right now. In my previous organization, we used the event mesh. Solace DMR, which is its dynamic message routing, and their event mesh capability are among their unique selling points. The event mesh is a stand-out, distinctive capability and, honestly speaking, one of the biggest differentiators they bring to the table compared to many of the message broker or event broker platforms that I have used in the past.
In my assessment of Solace against other products — as I was responsible for evaluating various products and bringing the right tool into companies in the past — I worked with multiple platforms like RabbitMQ, Confluent, Kafka, and various other tools in the market. But I found the event mesh capability to be a very interesting, as well as fulfilling capability, towards what we want to achieve from a digital-integration-strategy point of view. It's distributed, yet it is intelligently connected. It can also span and I can plug and play any number of brokers into the event mesh, so it's a great deal. That's a differentiator.
It is completely self-sufficient when it comes to connecting the brokers together because it uses a proprietary protocol over the TCP layer. It is a Solace messaging protocol and it is not very difficult to configure it and use it. It is easy to use, easy to configure brokers and to connect them all together.
From an administration point of view, Solace gives us a visual view of all the brokers in there. The capability of spinning up a broker and connecting it visually is still in progress in their roadmap. But, technically speaking, if somebody knows the administration of Solace very well, they can actually spin up a broker easily, either on a cloud or on-premises, on Kubernetes or on Docker, and can quickly connect them all together, and it starts showing up in their portal. It is pretty straightforward and pretty easy to implement. Here, we have been able to quickly set up the basic mesh architecture for the sandbox environment. It's straightforward and pretty cool as well.
Another feature and selling point of Solace is that it promotes and uses open standard protocols like SOAP or REST. We use AMQP in some scenarios and there are multiple other ways that we could connect as well, including JMS and TCP. There are five or six different ways that we could integrate with other inter-operating, distributed applications within our enterprise. Since Solace supports all of these open, standards-based protocols, it is pretty easy to connect.
It is also pretty simple to manage. The two major standout points are a very simple architecture and that it's a lightweight middleware platform. You just spin up somewhere and connect. On the top layer there is a single pane of glass to monitor and to keep the checks and balances in place, and also to administer from a cloud platform. That's a pretty simple, straightforward setup, like any cloud-based or middleware platform. The model that I have for MuleSoft in my company is the same thing for Solace as well. I would rate it as simple and straightforward.
I would rate Solace's ease of management better than competitive or open-source solutions, because they have brought thought leadership to the table for looking at event management and building a complete life cycle view of an event. Right from the time an event starts in the company, until the time that the event has to be retired, it goes through a life cycle. That includes discovering an event, designing the event, adding certain rules to it, configuring it, and deploying it. Finally, you'll want to monitor and operate it. The whole life cycle is completely manageable using Solace's UI. That is a great deal. None of the competition has brought that view to the table yet. This is another distinctive differentiator that Solace has.
In terms of the solution's topic hierarchy there are two ways to look at it. One is that there are particular topics that we set up and that are very static in nature because we know about their data already. For any other areas that are fixed, it is pretty straightforward because the topic taxonomy is already agreed on. It is already aligned with the stakeholders and it is easily implementable in Solace.
The other side is that if a publisher chooses to dynamically post a topic — a new topic — if they know what the topic taxonomy model looks like for our company, then it is also possible to dynamically put the topic in place and publish it, as it is.
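As a rough illustration of what "knowing the topic taxonomy model" could mean for a publisher, here is a minimal Python sketch that composes a topic string from an agreed list of levels and rejects anything that doesn't fit. The level names (`domain`, `region`, `entity`, `action`, `version`) and the example values are hypothetical and not taken from our actual taxonomy. Rejecting `*` and `>` in level values is a design choice of this sketch, made so that published topics never look like wildcard subscriptions.

```python
# Hypothetical taxonomy: these level names are illustrative only.
TAXONOMY_LEVELS = ("domain", "region", "entity", "action", "version")


def build_topic(**levels: str) -> str:
    """Compose a '/'-delimited topic string in the agreed level order."""
    missing = [name for name in TAXONOMY_LEVELS if name not in levels]
    if missing:
        raise ValueError(f"missing taxonomy levels: {missing}")

    unknown = [name for name in levels if name not in TAXONOMY_LEVELS]
    if unknown:
        raise ValueError(f"unknown taxonomy levels: {unknown}")

    parts = [levels[name] for name in TAXONOMY_LEVELS]
    for part in parts:
        # Empty values, the level delimiter and wildcard characters are refused.
        if not part or any(ch in part for ch in "/*>"):
            raise ValueError(f"invalid level value: {part!r}")
    return "/".join(parts)


topic = build_topic(
    domain="logistics",
    region="eu",
    entity="shipment",
    action="created",
    version="v1",
)
print(topic)
```

A publisher that posts a new topic dynamically would call something like `build_topic` first, so that whatever it publishes still lines up with the hierarchy the stakeholders agreed on.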
It also gives you wildcard-based routing rules. Based on the topic taxonomy and hierarchy, I am able to route a message or use the wildcards that are placed in the higher topic hierarchy to even put in security. If a particular group shouldn't see a particular message coming in on a topic, I can control that as well using the right topic taxonomy or the topic hierarchy. In Solace, that is also pretty straightforward because their topic taxonomy definition and the way that they promote it and the way that we have understood it from them is pretty easy.
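To make the wildcard idea concrete, here is a small Python sketch of topic matching as I understand Solace's subscription wildcards: `*` at the end of a level matches the rest of that one level, and `>` as the last level matches one or more remaining levels. This is a simplified model written for illustration, not Solace's implementation, and the `DENIED` table with its group and topic names is entirely hypothetical. On a real broker, this kind of restriction would be configured on the broker itself rather than in application code.

```python
def topic_matches(subscription: str, topic: str) -> bool:
    """Return True if a '/'-delimited topic matches a wildcard subscription."""
    sub_levels = subscription.split("/")
    topic_levels = topic.split("/")

    for i, sub in enumerate(sub_levels):
        if sub == ">" and i == len(sub_levels) - 1:
            # '>' must cover at least one remaining topic level.
            return len(topic_levels) > i
        if i >= len(topic_levels):
            return False
        if sub.endswith("*"):
            # 'abc*' matches any level that starts with 'abc'; '*' matches any level.
            if not topic_levels[i].startswith(sub[:-1]):
                return False
        elif sub != topic_levels[i]:
            return False
    return len(sub_levels) == len(topic_levels)


# Hypothetical deny list: group name -> subscriptions that group must not see.
DENIED = {"contractors": ["finance/>"]}


def is_visible(group: str, topic: str) -> bool:
    """Return True unless the topic falls under a denied pattern for the group."""
    return not any(topic_matches(p, topic) for p in DENIED.get(group, []))
```

With this model, a subscription placed high in the hierarchy, such as `finance/>`, covers every topic below it, which is why a well-designed taxonomy makes both routing and access rules short.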
Kafka has a different way of doing that. RabbitMQ is very similar to the JMS-type of message platforms. Solace is very similar and it supports both dynamic and static. The solutions are even, from that perspective.
Another product that I use very much in my current portfolio is MuleSoft. It's an API management platform and also an iPaaS, and it is a Salesforce company now. Both these products have to work together to give an assured-delivery type of middleware platform. We felt that having a connectivity layer, connector, or adapter pre-built in Solace for platforms like MuleSoft and Dell Boomi (middleware especially) would be pretty interesting. It would make it a more authentic and credible connector as well.
Today, we have to rely on JMS or a REST-based protocol but we have raised this request with Solace. While connectivity is definitely easier, at the same time, Solace needs to work on some of the connectors for industry-leading applications like Salesforce, Workday — multiple typical distributed applications that we might have. It is pretty good at this point but they can do better on that.
Also, a challenge we currently have is Solace's ability to integrate with single sign-on in our Active Directory and other single sign-on tools and platforms that any company would have. It's important for the platforms to work. Typically, they support only LDAP-based connectivity to our SQL Servers.
We have one critical step, from an IT security point of view. If there are any SaaS applications or cloud applications which are hosted out of our cloud platform, then the only way that we can do SSO is through a SAML-based or another specific protocol. Solace doesn't support them at this point in time and we have raised this as a platform request. I think it is on their roadmap. But currently, it supports only LDAP. That is an improvement area for them.
This is going to be my third year using PubSub+ Event Broker. I was with another company before I joined my current one. It was on the fast-moving consumer goods side and I started using Solace there. In my current company, this is a very new platform and I'm setting it up. Overall, my experience with Solace is two to three years.
Stability is definitely one of the key factors for us. My experience is that it's one of the robust platforms, because of the way that it's engineered and designed to work. It's absolutely a stable solution. We've never had any problems, given the way that we have implemented it.
It's a completely scalable solution. Our architects have been looking at using Solace for multiple different use cases, whether it is to do with event architecture or assured-delivery types of projects or even for a simple publish/subscribe type of messaging or an async-API type of model. It seems that our architects find this to be a tool that can extend across these lines of capabilities. Solace brings that to the table.
From the developer's point of view, it provides ease of use and ease of configuration. After somebody has worked on and is really proficient in IBM MQ or TIBCO EMS, which are heavyweight platforms that come with certain benefits, those architects and developers find Solace pretty easy to handle and to extend it to other application areas or use cases, including IoT, async APIs, pub/sub, and event-driven messaging. We also are using it for assured delivery, leveraging their queues and persistent layer. It does help our architects and our developers to extend their applications to all of those areas.
Their technical support is pretty quick. We are bound by an SLA and we have the highest tier of support from them. The turnaround time is pretty good and they are strong technically. I would rate their technical support as good.
The fact that there is a free version of Solace was something that we looked at from multiple angles. For example, when we need sandboxes, the question we had in mind was whether we should go for the paid version or use the free version. The free version doesn't come with support but it offers a lot of capabilities which a developer can play around with.
But when we had to choose between the free version and the licensed version for anything on our test stage, pre-prod, and prod, which are the other instances that we have, it was a no-brainer that we wanted to go with the paid version, because that brings in a whole lot of enterprise-class support and multiple other things along with it. We take advantage of the free version for sandbox, for a little bit of training, and PoCs. But predominantly, we use the enterprise-class version for the other instances we have.
The initial setup was straightforward in terms of:
We haven't found anything significantly complex.
We haven't seen return on our investment with Solace yet because it's pretty new in our environment. But we do see there is a value it brings to the table from a digital-transformation point of view. Both the companies that I was part of, where I was fortunate to lead the digital transformation projects, identified Solace as the platform to make that change: from a heavyweight, old or legacy model of middleware, or MQ platform, to a very lightweight, modern, completely distributed model. It's quick and nimble and agile in all types of setups. That is a huge shift in the way that we do things and make things notably faster. Qualitatively, this has definitely been a great tool.
Quantitatively, I would not be able to disclose any numbers, but we sense that there is going to be a huge return on investment because we might shut down some of those old, heavyweight, on-premise-only platforms. Because this is also a pay-as-you-use model, we can effectively make use of the license, as and when we require it. There are definitely going to be good cost savings as well.
They have good pricing in place. Their licensing model is a simple model.
There are different tiers where you can choose what would work for you. As a customer, you need to know roughly how many messages a month you will use.
If you know that it is going to be between 50,000 and 100,000, while there is a large gap between those two figures, you can start small and scale it over a period of time until you reach 100,000. You might start with 50,000. Since it might take six months to reach 100,000, what I would suggest is starting with the lower tier, because you don't need to pay for something that is higher. Then, as the demand grows, the tier can be revisited. That's based on the license agreement that you should have as part of the contract. You should agree with Solace that you will start small but that your intentions are to grow, depending on the demand that's coming in. Provide a roadmap of how long it will take to reach the next tier.
Solace appreciates that view of your roadmap, and they will also come along with you in that journey. They will tell you, "Okay, start with a giga tier, don't go for a tera," or even start with a kilo tier. Then, as you see demand going up, you can review your usage. That could be once every two or three months, or once in six months if you don't want that many interactions. See how many messages you have done. If it has not gone beyond 75,000, you can continue to operate under the current tier. But if you think it's going beyond 75,000, you can move to the 100,000 tier. It's a staged and calculated approach.
You also have to choose which of their product models would work for you. They have an appliance, they have a software as a service model, and they have an on-premise model, using a Kubernetes based setup. You need to look at your architecture and where your real needs are for event-driven brokers to be sitting. The licensing model also changes accordingly.
You have to have the right contract in place so that you can reassess that contract every few months to see whether you have breached your threshold. It's not that it's going to stop working, but you need to have that as part of your agreement, that even before it reaches the 70 or 80 percent of the threshold you will have a call to see whether you want to upgrade or not. That's all part of the contractual terms and conditions and negotiations.
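The review rule described here is simple arithmetic, and a minimal Python sketch makes it explicit. The function name, the default `review_fraction` of 0.75, and the sample figures are assumptions chosen to mirror the 75,000-of-100,000 example above. They are not Solace's actual tier definitions or contract terms.

```python
def needs_tier_review(messages_this_month: int,
                      tier_limit: int,
                      review_fraction: float = 0.75) -> bool:
    """Return True once usage has gone beyond the agreed share of the tier limit."""
    if tier_limit <= 0:
        raise ValueError("tier_limit must be positive")
    if not 0 < review_fraction <= 1:
        raise ValueError("review_fraction must be in (0, 1]")
    return messages_this_month > tier_limit * review_fraction


# Illustrative figures only: a 100,000-message tier reviewed at 75 percent.
for usage in (50_000, 75_000, 80_000):
    print(usage, needs_tier_review(usage, tier_limit=100_000))
```

Running a check like this at each agreed review point (every two, three, or six months) turns the contractual threshold into a concrete trigger for the upgrade conversation.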
There are three important things to keep in mind when considering this tool. The first is to know what kind of problem you're trying to solve. If it is just about having a pub/sub, there are a number of tools in the market, Solace included, that offer a simple, straightforward solution. But if you are looking at completely digitally transforming your company and bringing in event-driven architecture as a key factor in your integration strategy, then Solace is definitely a go-to tool. Knowing the end-goal that you're going toward, the objective that you're trying to meet, is very important. That is the first step one needs to be aware of and clear about.
The second thing is the engagement model with Solace, whether it is the terms of the licensing model or the way you will work with their Professional Services team or their support team. All that has to be discussed and agreed with a clear customer-success plan in place.
Thirdly, you want to clearly identify what architecture you want to implement because the mesh can span across anything. But you don't want to start a big-bang approach. Start small and then grow. So you need to know how your architecture is evolving. Start putting that simple MVP in place and from there you can grow it into multiple phases. That's what we are doing.
Have the right people in place. Somebody who has a good background and experience in implementing Solace can turn things around quickly.
We have four or five architects who use Solace, and we have two administrators of the platform, or platform architects. And we have about five developers now using it, but that will probably go up a little bit once we extend the mesh further. We also have two or three in support.
I would rate the solution at eight out of 10. I don't want to give them full marks because there is a lot that they could improve on. There is the SSO front, and there is also the community front: they are changing their architecture depending on best practices of communities, the way the community works, and so on. There's a lot of work for them to do to re-invent their on-premise model for a Kafka container-based solution. I would give those additional two points, out of 10, if I had seen all of that in action. There is definitely thought leadership within Solace, so I'm assuming that it will come through at some point.