100%. VMware needs log information to troubleshoot; it's not easy finding problems.
Downloading and uploading logs have become an issue.
Splunk is our central location for cybersecurity and protection.
Once we onboarded everything we needed, it created a lot of visibility for us.
It is quite extensible. It is a platform on which we can build out each use case, rather than each use case being limited or restricted by a fixed capability. This is probably the best feature.
I would like to see future development in terms of ML (Machine Learning).
It is a stable product.
It can be scaled quite easily in comparison to other products on the market.
The tech support response time could be a bit better. Sometimes I need to wait more than 24 hours for a response to my tickets.
I was not involved with the initial setup.
The price could be improved.
In the beginning, we just wanted to collect the logs from the different devices, like the nano storage, Linux, Windows, and VMware. We wanted a uniform solution to collect and analyze all of the system logs.
Our company needs this solution to highlight logged events. With Splunk covering the different devices and systems, we can clearly explain the day-to-day logging of events across our different IT environments.
In the past, we used different applications to collect logs. We used SurfWatch and VMware to do so, but we found that Splunk can do more in less time. It indexes all of the events faster, which is a huge asset.
It works with all kinds of devices and systems, whether they run Windows or Linux, and it easily collects the logs. In addition, the index helps us collect and analyze all kinds of logs and find the outstanding issues.
Splunk is not very user-friendly. It has a complex architecture in comparison to other solutions on the market.
It is stable.
Scalability could be improved.
We used SurfWatch and VMware in the past.
I was not involved with the initial setup.
I am not personally involved with the pricing of the solution.
We also looked at Selopene SIEM. It is a premier logging solution.
We use it for logging and troubleshooting.
Every team immediately created their own Splunk dashboard, and all the product owners were ecstatic about this. We were able to create a catalog of dashboards and have a holistic view at all levels. We could understand our business much better. Real-time errors, which were previously buried in emails, now surfaced on dashboards. Even our executives could understand them, and it changed the way teams thought about alerting and reporting. It allowed us to send out real-time notifications integrated with Opsgenie, and it changed the way IT works.
The dashboards are the most valuable feature. We like the ability to drill in and see what queries are under the dashboard, build new visualizations, edit the querying, and see the reports. The dashboards are very intuitive and similar to SQL. They are easy to set up and get running.
The query language is pretty slick and easy, but it is not consistent in parts. Some of it feels a little esoteric. Personally, some of my engineers are coming from SQL or other languages. Some things in Splunk are a little bit surprising and a little bit inconsistent in their querying, but once you get used to it, and once you get used to the field names and function names, you can get the hang of it. However, if it were a bit more standardized, it might be quicker to get up and running.
I would like additional features in different programming models with the support for writing queries in SQL or other languages, such as C#, Java, or some other type of query definitions. I would also like a better UI tool for enhancements of advanced visual query editors.
It is pretty stable, though it has gone down under our usage. We do need to keep an eye on our query volumes. Right now, it is too easy for a user to write a query, run it, make it available in polling mode (real-time mode), and bring down the server. Some more safety alerting would be beneficial.
We do have to educate developers on how not to blow it up. It is a little too easy to write an expensive query and overly stress the system. This could be improved. Overall, once you have people who know what they are doing, it is very stable.
Our environment is on-premise, and it is big. We have a couple hundred users. However, it was slow and unavailable at times before we trained all the engineers on how not to write a long, constantly polling query.
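The "safety alerting" idea above can be sketched as a pre-flight check that flags queries likely to overload the server before they run. This is a hypothetical guard, not a Splunk feature; the heuristics (real-time mode, missing time bounds, leading wildcards) are my assumptions, not Splunk's actual cost model.

```python
import re

def expensive_query_warnings(query: str, realtime: bool = False) -> list:
    """Return a list of reasons a search might overly stress the system."""
    warnings = []
    if realtime:
        # Constantly polling searches were the failure mode described above.
        warnings.append("real-time (polling) search: prefer a scheduled search")
    if "earliest=" not in query and "latest=" not in query:
        warnings.append("no time bounds: query will scan the full index")
    if re.search(r"(^|\s)\*\S", query):
        warnings.append("leading wildcard: forces a scan of all terms")
    return warnings

print(expensive_query_warnings("error host=web01", realtime=True))
```

A check like this could run in a review step before a dashboard query is saved, so an expensive query is caught before it reaches the server.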
Our internal tools team did work with the Splunk support team extensively. I was not directly involved, but from my point of view, they were able to fix and resolve issues within a day or less, so they have been okay.
It is early days right now to evaluate the integration and configuration of Splunk in our AWS environment. We are just starting to integrate it with regular stuff. While I think it is okay so far, I really do not have enough information.
Most of our return on investments have been through faster error resolutions. Our meantime to recovery has dropped for issues. We can often fix things before the customer notices them. Whereas, when logging was done custom by each team in non-standard ways, it would take days to resolve issues that are now resolved in sometimes minutes.
We knew we were going to go with Splunk. It was the leader and the one we liked. We didn't consider any others since Splunk met our needs.
We chose Splunk because of the ease of the UI, querying, and creating dashboards. It has a standardized query language, which a lot of the IT staff were already familiar with. It was the market leader from our perspective for our needs.
Go with Splunk. A lot of people know how to use it because they have experience with it. It works well. While it has some pain points, it provides reports and data visibility.
It integrates great with Opsgenie, PagerDuty, and Slack. We love the Slack integration, as it works great with the Slack alerts.
We use the on-premise version in our data centers and we use the AWS version. We are just starting to migrate to the AWS hosted version, and I have not seen a difference.
We use it for log aggregation.
If you have a large number of devices, you need to aggregate log data to make more sense of it for parsing, troubleshooting, and metrics. This is all we use it for.
If I need to track logs for a certain application, I will push all of those logs to Splunk so I can run reports on them. It is more about what you are trying to do with it and what you need from it.
We use it primarily for troubleshooting. We had an issue with SaltStack recently and were able to look for the same log entry on a thousand servers simultaneously, making the process easy.
The ability to create dashboards.
You can run reports against multiple devices at the same time. You are able to troubleshoot a single application on a thousand servers. You can do this with a single query, since it is very easy to do.
When you get into large amounts of data, Splunk can get pretty slow. This is the same on-premise or AWS, it doesn't matter. The way that they handle large data sets could be improved.
I would like to see an updated dashboard. The dashboard is a little out-of-date. It could be made prettier.
It's been very stable for us. Most of our stress is not from Splunk, but from disk I/O, like input and output for the disk that you are writing logs to. We have had more issues with our own hardware than with Splunk.
You have to make sure if you're writing an enormous amount of data that you have your I/O sorted out beforehand.
It scales fine. We haven't had any issues scaling it. Our current environment is about 30,000 devices.
The integration of this product in our AWS environment was very simple. We just forwarded our logs to it, and that was about it.
It has agent-based log forwarding, so it is very simple, not complicated at all. This process is the same for on-premise and AWS.
If you have a large number of servers, even a few hundred, then you need to track specific data and log information from a lot of them. You can either go to each server individually or set up jobs to ship those logs somewhere with rsync or Syslog. The other option is to use Splunk: push them all to Splunk, then create alerts and run reports against all that data in one place with a single query, rather than doing all that work repeatedly. It saves us a lot of man-hours by letting us look at hundreds or thousands of servers simultaneously.
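The payoff described above is a toy illustration away: once every host's logs land in one store, a single pass answers questions that would otherwise need a login (or an rsync/syslog job) per server. The host names and log lines below are made up for the example.

```python
from collections import defaultdict

# Pretend these events were forwarded from many hosts into one store.
logs = [
    ("web01", "2024-01-05 ERROR disk full on /var"),
    ("web02", "2024-01-05 INFO request served"),
    ("db01",  "2024-01-05 ERROR disk full on /var"),
]

def search(events, term):
    """One query across all hosts: return matching lines grouped by host."""
    hits = defaultdict(list)
    for host, line in events:
        if term in line:
            hits[host].append(line)
    return dict(hits)

print(search(logs, "ERROR"))
```

The same single call scales to thousands of hosts; the alternative is repeating the search once per server.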
Splunk has no real competition. It is just Splunk, and that is it.
Build your environment a lot bigger than you think you will need it, because you fill it up quickly. We log somewhere in the neighborhood of two to four terabytes a day per data center.
We use both AWS and SaaS versions. With the SaaS version, you don't have as much control, but it functions the same, so there is no real difference. Though, the AWS version is probably easier to scale, because it is AWS.
We use it for searching logs in a production environment.
It has reduced the time to resolution, time to investigate, and time to troubleshoot for debugging issues.
Being able to search across all the different production environments at the same time, then being able to do search queries to scope out specific environments, specific components, or specific logs from different languages, such as Java or C++. Thus, being able to have really fine grain control on log searching is really good.
Out-of-the-box, it seems very powerful.
The search query seems slow, but I am not sure if that is just because it is searching millions upon millions of lines of text. Also, I just started using it, so I might have no idea what I am doing. I could probably speed up the queries by improving my search skills.
My company could benefit from doing more Splunk training, with Splunk consultants teaching us how to use it. It is possible that we have already done this and I haven't participated, but this type of training would be helpful.
It is always up when I need to search. I am probably not using it that much; I will maybe search a couple of times a day for something specific. I know plenty of people who do a lot more debugging and use it all day.
It seems like it scales well. We have hundreds of production and development environments, and we are searching on all of them. Therefore, it seems like the scale is good.
We have hundreds of production environments, and each production environment has ten to 20 host machines. Each production environment can manage tens of thousands of customers.
Maybe going to AWS and scaling it better would be more cost-effective for our company. However, I am not involved in those decisions.
I have not used technical support.
We have other log searching tools, but we have standardized on Splunk.
It is a great product. We have a lot of different tools to do this type of debugging. Yet, it is one of the first ones that I will reach for, and I think that is a good sign.
It works well and is the industry standard for log searching. It probably has other features too. Therefore, if you use it, I would recommend the training, so you know what you are doing.
I am using the on-premise version.
It is mostly centralized logging, a whole bunch of BI metrics, and an aggregation point, which we have adapted for some PCI data.
It does meet our use case for the most part.
We like the dashboard creation and the ease with which we can harness the APIs to create custom BI dashboards on the fly. This adds the most value for us. Some of the microservices that I run on the cloud are mixed workloads, wherein the flow of data can change over time. In order to adjust for this, and cater to the needs of some of our internal customers, BI dashboards need to be created, tweaked, and modified, and doing this by hand is next to impossible. Therefore, we have strung all of this through a programmatic pipeline, which is something we like, because it is easier for us to harness it utilizing the API.
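The "dashboards on the fly" idea can be sketched as generating a minimal dashboard definition from a list of (title, query) pairs. The element names loosely follow Splunk's Simple XML dashboard format, but treat this as illustrative; a real pipeline would POST the result through the Splunk REST API, and the label and query below are invented.

```python
import xml.etree.ElementTree as ET

def build_dashboard(label, panels):
    """Build a minimal Simple-XML-style dashboard from (title, query) pairs."""
    root = ET.Element("dashboard")
    ET.SubElement(root, "label").text = label
    for title, query in panels:
        panel = ET.SubElement(ET.SubElement(root, "row"), "panel")
        ET.SubElement(panel, "title").text = title
        search = ET.SubElement(ET.SubElement(panel, "chart"), "search")
        ET.SubElement(search, "query").text = query
    return ET.tostring(root, encoding="unicode")

xml_doc = build_dashboard(
    "Launch metrics",
    [("Errors by region", "index=app error | stats count by region")],
)
print(xml_doc)
```

Because the definition is generated, a small piece of glue code can create or tweak a dashboard for a business unit without anyone editing it by hand.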
For on-premise, it's more about optimization. With the heavy scale of data that we are operating on, a search over disparate data sometimes takes about a minute. This is understandable considering the amount of data that we are pumping into it. The only optimization I recommend is better sharding, so that data retrieval can be faster.
With the AWS hosted version, we have not hit this bottleneck yet, simply because we are not yet at the multiple-terabyte scale. We have hit it with the on-premise enterprise version. This is a problem that we run into every so often, not day in and day out. Only from August through October do we contend with this issue, and there is a fair bit of lag, though we have our ways to work around it. During those few months, we are pumping in 8 to 10 terabytes of data easily, so it is at a massive scale. There are also limitations from the hardware perspective, which is why it is an optimization problem.
On the cloud, we are pushing through less than half a petabyte of data. So far, it has been fairly stable because it runs on the underlying AWS infrastructure. Therefore, we have had no issues at all. In terms of availability or outages, there haven't been any. We've been fairly happy with the overall landscape of how it works on AWS.
On cloud, we absolutely like it. Splunk AMIs make it easy for us to spin up a Splunk cluster or add a new node to it. For our rapid development and scale of deployments in terms of microservices and the number of microservices that we run, we have had no problems here.
On-premise requires a lot of planning, which happens on a yearly basis. We have Splunk dedicated staff onsite for on-premise to help us through this.
We have 450 people making use of Splunk in our organization, and there was a bit of knowledge transfer needed on how to write a Splunk query. So, there is a bit of a learning curve. Once you get over it, it is fairly simple to use. We also have ready-made Splunk queries to help people get started.
We do deal with technical support on an ongoing basis. They can definitely do better from a technical point of view. Their only purpose working onsite is to make sure that our massive set of Splunk clusters are online, and the clusters are tuned well enough to work well.
We would expect the technical support people onsite to be subject-matter experts of Splunk. We have seen in a few areas where we have been left wanting more, wherein some of our engineers happen to know more than them in terms of some of the query optimizations, etc. This is where we think there is a fair amount of improvement that can be done.
We wrote the automation to bootstrap everything onto AWS, which was fairly easy, as long as we had all the hooks going into AWS and we had the SDK. So, we did not have too much trouble getting the bootstrap up and running.
Some of the insights that we have obtained as a part of using Splunk have greatly helped us in increasing our revenue in terms of selling our products.
We have seen a decent ROI. For the month of October 2018, when we had a product launch, we were able to query and generate BI dashboards on the fly. This was huge, and not possible two and a half to three years back because it was more of a manual process. Now, with APIs being available, it is very simple to tweak or write a small piece of glue code to go ahead and create a new dashboard for a business unit to make near real-time decisions to focus more on other geographies when launching the product.
I wasn't there when the evaluation was done. When I came on board, this product was handed down to me, and we have not evaluated any other solutions or products since then.
Make sure it fits your use case. Be clear about what you want to achieve and get out of the product, and how you want to integrate it. Once you tie the solution into your systems, it is not trivial or easy to walk away from. Therefore, due diligence needs to be done to understand your requirements before choosing a product. Some companies may not even want to host it themselves, and prefer to go the managed services route.
We have it integrated with every product that I can think of.
We use both the AWS and on-premise versions. The AWS hosted version typically caters to all the microservices that we run on AWS, so there is a clear segregation between on-premise and cloud. In terms of usability and experience, both of them have been similar. We have seen a few bottlenecks on the cloud, but that can probably be attributed more to the user side of the house, in terms of the way we write our applications and the type of payloads that we sent this month. This is an ongoing optimization on our end. Other than that, we have been fairly happy with Splunk and what we get out of it.
We use it for application log monitoring.
It is a logging product. Our application generates log files, then we upload them to Splunk. We run their agent on our EC2 instances in AWS, then we view the logs through their product, and it is all stored on their infrastructure.
We have used the alerts for a lot of things. They gave us the ability to make an alert simply. We did one for SQL injection. We also had some problematic services that would fail, but we figured out what log line we could look for, so it was easy to make an alert for that.
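The SQL-injection alert described above amounts to matching incoming log lines against a few signatures. This is a hedged sketch of that idea in plain Python; the patterns are illustrative only, and in practice the rule would live in Splunk as a saved search that fires on a match.

```python
import re

# Crude, illustrative SQL-injection signatures; a real rule set would be
# broader and tuned against false positives.
SQLI_PATTERNS = [
    re.compile(r"union\s+select", re.IGNORECASE),
    re.compile(r"'\s*or\s+'1'\s*=\s*'1", re.IGNORECASE),
    re.compile(r";\s*drop\s+table", re.IGNORECASE),
]

def should_alert(line: str) -> bool:
    """Return True when a log line matches any injection signature."""
    return any(p.search(line) for p in SQLI_PATTERNS)

print(should_alert("GET /items?id=1 UNION SELECT password FROM users"))
```

The same shape works for the failing-service case: once you know the telltale log line, the alert is just a pattern plus a notification.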
Its usability is the best part. It is easy for our developers to use if they want to search their logs, etc.
A problem that we had recently was that they license it based on how much data you upload to them every day. Something changed in one of our applications, and it started generating three to four times as many logs. So now, we are trying to assemble something with parts of the Splunk API to warn ourselves, then turn it off and throttle it back. However, it would be better if they had something built into the product that shuts things down when you're getting close to your license. This sort of thing would help out a lot. It would help them out too, because then they wouldn't be hollering at us for going over our license.
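The guard the reviewer wants can be sketched as tracking bytes indexed today against the daily license and acting before the limit is hit. The thresholds and actions below are assumptions for illustration; the real usage figure would come from Splunk's license-usage reporting rather than being passed in by hand.

```python
DAILY_QUOTA_BYTES = 100 * 1024**3   # assumed 100 GB/day license
WARN_AT = 0.80                       # warn at 80% of quota (assumption)
THROTTLE_AT = 0.95                   # throttle at 95% of quota (assumption)

def license_action(bytes_indexed_today: int) -> str:
    """Decide what to do given today's ingest volume."""
    used = bytes_indexed_today / DAILY_QUOTA_BYTES
    if used >= THROTTLE_AT:
        return "throttle"   # stop forwarding low-priority logs
    if used >= WARN_AT:
        return "warn"       # page the team before the license is blown
    return "ok"

print(license_action(85 * 1024**3))
```

Polling this once an hour would catch a misbehaving application that suddenly logs three to four times its normal volume, before the daily license is exceeded.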
Stability has been great. I don't think we have ever had an outage from it.
We don't do a lot of searching. If there were problems anywhere, they would probably come with a heavy search load, and we don't have that. We don't have many developers searching every day. It is mostly when there is a problem that we use it for diagnostics. So, we don't put a large search load on it. However, the reliability has been great. It hasn't been down for us at any point.
It seems to have worked out great. We haven't had any problems yet.
I haven't used the technical support.
Before Splunk, we used Kibana and Elasticsearch. Sometimes, with them, logs wouldn't even be there. We have received an infinite time reduction there. We couldn't use what we had before, so Splunk being there and working does a lot.
The integration and configuration with the AWS environment was easy. They had the documentation. All we had to do was get their agent running on our EC2 instance, and their documentation was good for that. It worked, which was great.
The product is also integrated with PagerDuty, Slack, and AWS. Those integrations are good and seamless.
It has made life easier for us through use, then by troubleshooting problems. It reduces the cost of the intangibles.
The pricing seems good relative to the other vendors that we have had here. However, they need to find ways to be more flexible with the licensing and be able to deal with situations where we start generating more logs. Maybe having some controls in the Splunk interface to turn it off, so we don't have to change anything in our application.
We have an existing contract with Splunk, so it makes sense to stay with them for now. Our license is for 100 GB of logs a day.
There are a lot of vendors in the space at the conference this year. Therefore, we probably talked to six or seven different ones, and the market seems to be consolidating. The market's metrics and log monitoring all seem to be rolling up into a single provider. It looks like that is what will be happening in the next few years.
Right now, there are a ton of different smaller providers doing little pieces of this and that. All the big players, like Splunk, New Relic, and Datadog, seem to be rolling them all up into one offering.
Implement something and watch how much data you are sending to it, then have some way to shut it off without redeploying your app in case things get hairy.
We use the cloud version of the product.