The APM Transaction monitoring is the most valuable feature. Being able to define key transactions and collect traces has been essential to providing actionable data for fixes and improvements.
Early in our app lifecycle we would receive random reports of slow response times from users. Of course, they were never reproducible in our QA environments, nor did our OS-specific monitoring tools show any problems. Implementing the APM agent on our app servers gave us visibility into what our Java code and JVMs were doing at the time users had problems. This allowed us to zero in on infrastructure and code issues as well as implement monitoring cases specific to our app.
Last year, there were several New Relic outages where alerts were either fired in error or not fired at all. These have since been remedied, but they damaged our trust in New Relic as our sole source of analysis and alerting.
As for suggested improvements, the Synthetics module would be much more useful if it did not require learning yet another analytics query language.
I have used New Relic in production since mid-2013.
Since we use a 1.x version of Play Framework, there were some initial challenges in implementing the Java APM agent. Later versions of the agent have improved drastically, and deployments are considerably less cumbersome.
The aforementioned outages and issues were vexing but, fortunately, are well in the past.
New Relic was an add-on to our existing operations analytics systems. We selected New Relic solely on the basis of its application monitoring feature, which our existing systems did not provide.
Once we overcame the challenges of implementing the early Java agent, the remainder of the implementation was effortless. We had 90% functionality within the first 12 hours of implementation.
I performed the implementation personally.
At our usage level, the cost has been trivial compared to our overall monthly operations costs. The product has expedited our ability to discover actionable data that led directly to improvements in our app, improvements that would have taken considerably longer if we'd had to build similar functionality ourselves.
Whilst it may be tempting to instrument all of your production and non-production environments, this is a tool that is best used where appropriate, rather than as a blanket deployment.
We evaluated building similar functionality ourselves using open source JVM monitoring and log analysis tools. We also evaluated a few semi-competitors. The home-brewed solution would have required additional engineering staff and a much longer build time. The also-ran services were astronomically more expensive.
It's a great tool for monitoring infrastructure and application performance. The only drawbacks have been cost and a few issues with outages and monitoring/alerting failures.