My primary use cases are statistical analysis and data governance.
BI Expert at a university with 501-1,000 employees
Predictive modeling and auto-model are valuable but IBM does not handle our support queries very well
For how long have you used this product?
- Three years
Which features of this product are most valuable to you?
- Predictive modeling. Its best feature is auto-model, which suggests the models best suited to your data, such as CHAID, C5, or logistic regression, and displays their accuracy. Another good feature is that it can boost and reduce data for replication in power analysis (otherwise you would have to rely on Sample Power).
Can you give an example of how this product has improved the way your organization functions?
- Yes, we predicted enrollment of our students more accurately
What areas of this product have room for improvement?
- Pricing, maybe.
Did you encounter any issues with deployment, stability or scalability?
- It's definitely scalable, since it integrates with SQL Server or a flat file such as Excel.
Did you previously use a different solution and if so, why did you switch?
- We haven't used any other solution, since predictive analytics software is very expensive.
Before choosing this product, did you evaluate other options? If so, which ones?
- We chose this product because it is compatible with SPSS data sets, which most people with psychology and statistics backgrounds use. Its user interface is very usable, and it relies less on syntax than other software available.
How would you rate the level of customer service and technical support?
- Customer service and technical support are provided by STRAND Asia. However, I would say IBM is not that good at handling our queries.
Was the initial setup straightforward or complex? In what ways?
- It became complex after IBM bought SPSS four years ago.
Did you implement through a vendor team or an in-house one? If through a vendor team, how would you rate their level of expertise?
- Through a vendor team
What is your ROI on this product?
- No ROI involved
What advice would you give to others looking into implementing this product?
- IBM SPSS Modeler integrates well with IBM Cognos, but make sure your data warehouse is set up properly. There is also R and Python integration: you can download R Essentials (32-bit and 64-bit) to work with R nuggets. That is definitely a big plus (imagine open source and the power of IBM combined in powerful yet flexible software).
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Other non-IT at a non-profit with 501-1,000 employees
Review of IBM SPSS Text Analytics for Surveys
Coding qualitative data is a time-consuming and often costly aspect of the survey process. In this post, I will provide feedback on IBM SPSS Text Analytics for Surveys (STAS), which is software designed specifically for coding survey data. The benefits of using this software include improved efficiency and consistency when coding text.
STAS allows you to set up a customizable coding algorithm to code your data. You can create your own coding scheme, or you can use one that is already built in. The built-in coding mechanism can identify responses that mention a place, a name, a positive or negative opinion, an opinion of how affordable an item is, and a number of other types of responses. One benefit of STAS is that it recognizes synonyms and alternative versions of words, making the task of setting up a coding algorithm more efficient.
I recently used STAS for two very different projects and found that it worked great for some purposes and not so great for others. If you are considering STAS, you may want to answer the following questions to inform your decision.
1. How much data do I have?
STAS is optimized for coding 1,000 responses or fewer, each about 40 characters long. I used it to code about 5,000 open-ended survey responses at a time and it still worked reasonably well.
On the other hand, I also used STAS to code over 100,000 Tweets for research into whether Twitter data can supplement survey data. I set up the coding system (STAS calls this a “Text Analysis Package”) with a subset of Tweets, then I used it to code 25,000 Tweets at a time. Initially, I tried coding all 100,000+ Tweets simultaneously, but STAS could not handle it, so I resigned myself to coding 25,000 at a time and waiting an hour or two while the process was running.
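The batch-at-a-time workaround described above can be prepared outside STAS before import. What follows is only a minimal Python sketch, assuming the Tweets are already loaded into a list. The function name `batches` is hypothetical, and the default batch size simply mirrors the 25,000 figure mentioned above.

```python
from typing import Iterator, List


def batches(rows: List[str], size: int = 25_000) -> Iterator[List[str]]:
    """Yield consecutive slices of `rows`, each holding at most `size` items."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# Illustrative stand-in data: seven short "Tweets" split into batches of three.
demo_rows = [f"tweet {i}" for i in range(7)]
demo_batches = list(batches(demo_rows, size=3))
```

Each batch could then be saved to its own file and coded separately with the same Text Analysis Package.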
2. Will I be coding more data?
Depending on how complicated your dataset is, setting up the coding system can take a lot of time. You’ll want to consider whether it is worth the initial time investment to set up the coding system, because for a low volume of complicated data, manual coding may be more efficient. For longitudinal studies especially, it might be worth that initial investment of time because once the system is set up you will be able to code subsequent waves of data with minimal effort.
3. How much will the responses vary?
I found that STAS worked best for coding demographic data because the range of responses was limited. It worked reasonably well for coding types of injuries (also somewhat limited), but it did not work as well for coding relationships between people. These responses ranged from simple responses such as “mother,” “friend,” or “teacher,” to complicated responses that were typically unrepeated in the dataset. The complicated responses were along the lines of “my best friend’s boyfriend’s step-mom’s boss.” I’m sure STAS can be set up to accurately code these sorts of responses, but I had a hard time figuring it out (see #5 below) and quite frankly, it’s probably more efficient to manually code these responses.
4. Would I be relying on STAS for sentiment analysis?
STAS is capable of coding sentiment, but I would test it carefully to see how it works with your data. As part of our Twitter research, we manually coded a random sample of 500 Tweets in our dataset and found that STAS sentiment coding was in agreement with manual coding only 44% of the time. STAS would likely perform better with survey data than Tweets, which often use unconventional language, but I would still recommend proceeding with caution if you plan to use STAS for sentiment analysis.
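The agreement check described above is a simple percent-agreement calculation between two sets of labels. Here is a minimal Python sketch, assuming the manual codes and the STAS codes have been exported as two equal-length lists. The function name and the sample labels are hypothetical and are not the study's data.

```python
from typing import Sequence


def percent_agreement(manual: Sequence[str], automated: Sequence[str]) -> float:
    """Return the percentage of items on which both codings assign the same label."""
    if len(manual) != len(automated):
        raise ValueError("both codings must cover the same items")
    if not manual:
        raise ValueError("at least one coded item is required")
    matches = sum(m == a for m, a in zip(manual, automated))
    return 100.0 * matches / len(manual)


# Illustrative labels only: the two codings agree on 2 of 4 items.
manual_codes = ["positive", "negative", "neutral", "positive"]
automated_codes = ["positive", "positive", "neutral", "negative"]
agreement = percent_agreement(manual_codes, automated_codes)
```

Percent agreement does not correct for agreement that would occur by chance, so a chance-corrected statistic may be worth reporting alongside it.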
5. Am I able to take a course on STAS?
It took several days of working with test data and poring over the user’s manual (which I was not impressed with) for me to really figure out what to do with STAS. I know enough about STAS to get by, but I have also come to realize just how much I don’t know about STAS. I encourage you to take a STAS course if you are able. Learning new software is usually easy for me, but without any training, I really struggled with STAS.
If you decide to proceed with STAS for coding your data, here’s one tip as you get started. Run your data through spell check (e.g. in Word or Excel) before importing into STAS. STAS catches many spelling errors, but not all. Anything you can correct will speed up the coding process.
Do you have additional STAS tips to share? Has your experience with STAS been similar to or different than mine? Would you recommend something besides STAS for coding?
Please comment below!
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Buyer's Guide
IBM SPSS Statistics
April 2025

Learn what your peers think about IBM SPSS Statistics. Get advice and tips from experienced pros sharing their opinions. Updated: April 2025.
860,592 professionals have used our research since 2012.
Other non-IT at an aerospace/defense firm with 51-200 employees
IBM SPSS Text Analytics - Any Good? Yes.
I recently discovered that IBM bought SPSS a few years ago and is now providing a Text Analytics package called IBM SPSS Text Analytics for Surveys (producing the acronym STAS, which is either a stats package or an STD). I thought I'd take it out for a test drive, so I downloaded a 14-day trial version. Before using it, I reviewed these excellent tutorials: Analytics Blog and RTI's SurveyPost blog.
For data I could have used their sample data, but I decided to download my Twitter archive. Unfortunately, this caused me some pre-processing hassle. You see, STAS is technically designed for survey data: it expects unstructured language to be in the form of comment responses to questions, and it expects those comments to be stored in single cells within a column in a spreadsheet. (I see no reason in principle why it couldn't be used for analyzing any unstructured data. You just have to package the data in a format SPSS will accept, namely a spreadsheet with the unstructured data all in one column.)
Also, STAS does not directly ingest CSV files or Open Office ODC files. Apparently it only accepts inputs of four types: its own file type, Excel, ODBC, and what they call “Data Collection”, which I haven't investigated.
Once you open a file, you are asked to drag-and-drop the column name containing your language data into an "Open Ended Text" box (refer to Analytics Blog for screen shots). While I appreciate the simplicity of the drag and drop functionality, my Twitter data had tokens separated into separate columns (which I thought was weird. Let me do my own tokenization, please!). STAS' functional choice means I needed to pre-process my data files. I had to merge the many token columns containing language tokens into a single column. Document pre-processing is common in language analysis, but STAS is supposed to be a platform easy to use for non-engineers. These file ingest and pre-processing steps are tedious and uninteresting and exactly why most people get frustrated. These things can be automated and it is a platform like STAS that ought to be doing this for me.
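The column merge described above is easy to script. This is a minimal Python sketch, assuming each row has already been read into a dictionary (for example with `csv.DictReader`). The column names `tok1` to `tok3`, the output column `text`, and the function name are all hypothetical, not names from the Twitter archive or from STAS.

```python
from typing import Dict, List, Optional


def merge_token_columns(
    rows: List[Dict[str, Optional[str]]],
    token_columns: List[str],
    target: str = "text",
) -> List[Dict[str, str]]:
    """Join the token columns of each row into one space-separated text column."""
    merged = []
    for row in rows:
        # Treat missing or None cells as empty so they do not add stray spaces.
        tokens = [(row.get(col) or "").strip() for col in token_columns]
        text = " ".join(tok for tok in tokens if tok)
        # Keep every non-token column as it was, then add the merged text.
        kept = {key: (value or "") for key, value in row.items()
                if key not in token_columns}
        kept[target] = text
        merged.append(kept)
    return merged


# Illustrative rows only.
sample_rows = [
    {"id": "1", "tok1": "loving", "tok2": "this", "tok3": "weather"},
    {"id": "2", "tok1": "hello", "tok2": "", "tok3": None},
]
merged_rows = merge_token_columns(sample_rows, ["tok1", "tok2", "tok3"])
```

The merged rows would still need to be written out in one of the input formats STAS accepts before import.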
Also, it seems to only ingest a single file at a time. My Twitter data came to me separated by month so I have 38 files. I can manually merge them, but more work for me. Really no reason STAS can't let me select multiple files all formatted identically, then merge if necessary.
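Merging identically formatted files can also be scripted outside STAS. Below is a minimal Python sketch, assuming the monthly files are CSV exports sharing one header row. The function name is hypothetical, and the in-memory `io.StringIO` objects stand in for real files.

```python
import csv
import io
from typing import Iterable, TextIO


def merge_csv_streams(sources: Iterable[TextIO], dest: TextIO) -> int:
    """Append all data rows from `sources` to `dest`, writing the header once.

    Returns the number of data rows written.
    """
    writer = csv.writer(dest, lineterminator="\n")
    header = None
    count = 0
    for src in sources:
        reader = csv.reader(src)
        first = next(reader, None)
        if first is None:
            continue  # skip completely empty files
        if header is None:
            header = first
            writer.writerow(header)
        elif first != header:
            raise ValueError("all files must share the same header row")
        for row in reader:
            writer.writerow(row)
            count += 1
    return count


# Illustrative stand-ins for two monthly files.
january = io.StringIO("id,text\n1,hello\n")
february = io.StringIO("id,text\n2,world\n")
combined = io.StringIO()
rows_written = merge_csv_streams([january, february], combined)
```

With real files, each source would be opened with `open(path, newline="")`, and the combined output would then need converting to a format STAS can read, such as Excel.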
I was surprised and impressed that the software immediately offered me an opportunity to translate non-English comments with a single click. Simple and easy. Quality is what it is with MT. Don't blame STAS if it's a crappy translation. No matter how you slice it, it's a great function. Kudos.
I was super impressed that it will crawl the data and suggest code categories like key concepts. This is essentially topic modeling (though not as sophisticated as something like LDA; the User Guide has a whole chapter devoted to describing the details, but I haven't had time to dig in yet). Color-coded clusters of concepts are a very nice function. Colors seem to refer to entity types (Person, Org, etc.). You can collapse all concepts into just the key exemplars of each cluster. There are also several nice filtering options to help you understand what your data is centered around.
I can see key concept frequencies and filter by that. That's nice. Next steps: Can I see simple word frequencies? Ngrams?
Sentiment analysis can be done with respect to specific categories (food + positive). Pretty easy, but SPSS should mitigate lay people's over-indulgence in sentiment analysis, which is tricky and not as easy as this makes it look. This is where making something easy backfires. How can STAS encourage double-checking the data? Gold standards, sampling, etc.
No doubt, this is easy to use. An academic has the luxury of ignoring people who don't want to learn command line tools or programming languages, but the businessman does not. There's a ton of language data out there owned by thousands of companies and those companies are never going to get their regular employees to learn R just to analyze it. For them, STAS is a legitimate tool that will actually allow the average employee to dig into unstructured data. That's a win.
*In the interest of full disclosure: I do not work for IBM and this is not a sponsored blog in any way. These thoughts are entirely my own. I once worked for IBM briefly over 5 years ago and I still get the occasional IBM recruiter contacting me about opportunities, but this is my personal blog and all content is my own and reflects my honest, personal opinions.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Tech Support Staff at a tech company with 51-200 employees
Data and variable views: Easy to use and effective results
I would like to add and support a few points about this program used to analyze statistics.
I use this software, SPSS, in our company on an annual basis to conduct and analyze surveys. It has two views that I make use of from time to time: data and variable views. With the variable view, we have access to the metadata dictionary, while the data view enables us to store our data in the form of columns and rows. The program is generally very easy to use and gives effective results. I learned to use this program on my own without any training; it is that simple.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
I would like to say that SPSS identifies market segments using visual classification trees, isolates unusual or problem data, and prepares and transforms data using spreadsheet tools and procedures. I don’t like that the default graphics are poor and not easily customizable, or that SPSS is expensive.
Associate Consultant, Science Specialist at a consultancy with 11-50 employees
Very flexible with a short learning curve
Pros and Cons
- "The most valuable features are the small learning curve and its ability to hold a lot of data."
- "I would like SPSS to improve its integration with other data-filing IBM tools. I also think its duration with data, utilization, and graphics could be better."
What is our primary use case?
What is most valuable?
The most valuable features are the small learning curve and its ability to hold a lot of data.
What needs improvement?
I would like SPSS to be integrated with Cloud Pak for Data.
For how long have I used the solution?
I've been using this solution for forty years.
What do I think about the stability of the solution?
This product is stable.
What do I think about the scalability of the solution?
This solution's scalability is good.
How was the initial setup?
The initial setup was very simple and took only a day or two.
What's my experience with pricing, setup cost, and licensing?
Licenses are available on a yearly basis. There are no additional costs, everything is standard.
What other advice do I have?
This is a very flexible product that can be used for numerous purposes. It's also very easy to teach others to use it. I would rate this solution as ten out of ten.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Founder and General Manager at a tech services company with 1-10 employees
Easy to use, overall useful features, and straightforward installation
Pros and Cons
- "The most valuable features are that the solution is easy to use, training new users is not difficult, and our usage is comprehensive because the whole service is beneficial."
- "The solution could improve by providing a visual network for predictions and a self-organizing map for clustering."
What is our primary use case?
We are using this solution for statistical analysis in our organization for many areas, such as regression, the next-best-offer, and some models for marketing.
What is most valuable?
The most valuable features are that the solution is easy to use, training new users is not difficult, and our usage is comprehensive because the whole service is beneficial.
What needs improvement?
The solution could improve by providing a visual network for predictions and a self-organizing map for clustering.
For how long have I used the solution?
I have been using the solution for approximately 10 years.
What do I think about the stability of the solution?
IBM SPSS Statistics has been stable in our usage.
How are customer service and technical support?
We used the support a long while ago, but because we have used the solution for a long time, we have not needed to contact them since.
How was the initial setup?
The solution is easy to install. We use other tools to deploy the models: we export the models to other languages, then implement them through the database solutions that our clients have. Most of our clients use either Oracle or SQL, but none of them use an IBM solution. The most time-consuming part of the process is preparing the data for modeling.
What other advice do I have?
My advice to others is to be prepared to deal with your data. You will most likely need assistance to develop and deploy the models.
I rate IBM SPSS Statistics an eight out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.

Course on STAS - I was totally new to this program back in 2008. I took a course that lasted approximately three months. Although this SPSS program seemed somewhat difficult at first, it was not in the long run. Those without knowledge of this software can take a course. However, I believe it is a program that anyone who is determined can learn to use on their own without spending a dime on training materials. Thank you, Ashley, for sharing these extensive tips on STAS.