
Running files locally in SPSS

 2 years ago
source link: https://andrewpwheeler.com/2022/02/21/running-files-locally-in-spss/

Say I made a little Python script for a friend to scrape data from a website whenever they wanted updates. I write my Python script, say scrape.py, and a run_scrape.bat file for my friend on their Windows machine (or run_scrape.sh on Mac/Unix). Inside the bat file is the command:

python scrape.py

I tell my friend to save those two files in whatever folder they want. They just need to double click the bat file and it will save the scraped data into the same folder. Here the bat file is run locally: the current directory is set to wherever the bat file is located.
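To make the "runs locally" point concrete, here is a minimal Python sketch of the saving step. The function name `save_results` and the file name `scraped.csv` are made up for illustration and are not from the original script. Anything written relative to the current working directory lands next to the bat file when it is double clicked:

```python
from pathlib import Path
import csv


def save_results(rows, filename="scraped.csv"):
    """Write rows as CSV into the current working directory.

    When launched from a double-clicked bat file, the working
    directory is the folder that holds the bat file.
    """
    out_path = Path.cwd() / filename
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    return out_path
```

Calling `save_results([["id", "value"], [1, "a"]])` at the end of scrape.py would then drop the file into whatever folder the two files were saved in.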

This is how the majority of code is packaged: you can clone from GitHub or email a zip file, and it will just work no matter where the local user saves those scripts. My friend does need to have their Python environment set up correctly, but for most of the stuff I do I can say download Anaconda and click yes to setting Python on the path, and they are golden.

SPSS makes things more painful. Say I added SPSS to the path environment variable on my Windows machine, and I run an SPSS production job from the command prompt:

cd "C:\Users\Andrew"
spss "print_dir.spj" -production silent

And say all the spj file does is call a syntax file, show.sps, which has as its only command SHOW DIR. This still prints out wherever SPSS is installed as the current working directory inside of the SPSS session. On my machine that is currently C:\Program Files\IBM\SPSS Statistics. So SPSS takes over the location of the current directory. We can also open up the spj file (it is just a plain text xml file). Here is what a current spj file looks like for me (note it is all on one line as well!):

[Image: screenshot of the spj file's XML contents]

And that file also has several hard coded file locations. So to get the same behavior as python scrape.py earlier, we need to dynamically set the paths in the production job as well, not just alter the command line scripts. This can be done with a little command line magic in Windows, dynamically replacing the right text in the spj file. So in a bat file, you can do something like:

@echo on
set "base=%cd%"
:: code to define SPJ (SPSS production file)
echo ^<?xml version=^"1.0^" encoding=^"UTF-8^" standalone=^"no^"?^>^<job xmlns=^"http://www.ibm.com/software/analytics/spss/xml/production^" codepageSyntaxFiles=^"false^" print=^"false^" syntaxErrorHandling=^"continue^" syntaxFormat=^"interactive^" unicode=^"true^" xmlns:xsi=^"http://www.w3.org/2001/XMLSchema-instance^" xsi:schemaLocation=^"http://www.ibm.com/software/analytics/spss/xml/production http://www.ibm.com/software/analytics/spss/xml/production/production-1.4.xsd^"^>^<locale charset=^"UTF-8^" country=^"US^" language=^"en^"/^>^<output imageFormat=^"jpg^" imageSize=^"100^" outputFormat=^"text-codepage^" outputPath=^"%base%\job_output.txt^" tableColumnAutofit=^"true^" tableColumnBorder=^"^|^" tableColumnSeparator=^"space^" tableRowBorder=^"-^"/^>^<syntax syntaxPath=^"%base%\show.sps^"/^>^<symbol name=^"setdir^" quote=^"true^"/^>^</job^> > transfer_job.spj
"C:\Program Files\IBM\SPSS Statistics\stats.exe" "%base%\transfer_job.spj" -production silent -symbol @setdir "%base%"

It would be easier to use sed to find/replace the text for the spj file instead of the superlong one-liner on echo, but I don't know if Windows always has sed installed. Also note the escape characters (it is crazy how Windows parses this single long line; the documented cmd.exe limit for one command line is 8191 characters, so this still fits).
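If Python is available on the machine anyway, the find/replace could be done there instead of with echo or sed. The following is only a sketch under that assumption: the function `write_spj` and the placeholder tokens `__OUTPUT__` and `__SYNTAX__` are invented here, while the XML itself is copied from the echo line above.

```python
from pathlib import Path
from xml.sax.saxutils import quoteattr

# Same job XML as the echo line, split across lines for readability.
# __OUTPUT__ and __SYNTAX__ are placeholder tokens invented for this sketch.
SPJ_TEMPLATE = (
    '<?xml version="1.0" encoding="UTF-8" standalone="no"?>'
    '<job xmlns="http://www.ibm.com/software/analytics/spss/xml/production" '
    'codepageSyntaxFiles="false" print="false" syntaxErrorHandling="continue" '
    'syntaxFormat="interactive" unicode="true" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="http://www.ibm.com/software/analytics/spss/xml/production '
    'http://www.ibm.com/software/analytics/spss/xml/production/production-1.4.xsd">'
    '<locale charset="UTF-8" country="US" language="en"/>'
    '<output imageFormat="jpg" imageSize="100" outputFormat="text-codepage" '
    'outputPath=__OUTPUT__ tableColumnAutofit="true" tableColumnBorder="|" '
    'tableColumnSeparator="space" tableRowBorder="-"/>'
    '<syntax syntaxPath=__SYNTAX__/>'
    '<symbol name="setdir" quote="true"/>'
    '</job>'
)


def write_spj(base, spj_name="transfer_job.spj"):
    """Write the production job file into `base` with paths filled in."""
    base = Path(base)
    # quoteattr adds the surrounding quotes and escapes &, <, > in the path
    xml = (
        SPJ_TEMPLATE
        .replace("__OUTPUT__", quoteattr(str(base / "job_output.txt")))
        .replace("__SYNTAX__", quoteattr(str(base / "show.sps")))
    )
    spj_path = base / spj_name
    spj_path.write_text(xml, encoding="utf-8")
    return spj_path
```

Calling `write_spj(Path.cwd())` from a script started by the bat file would produce the same transfer_job.spj as the echo line, without the caret escaping.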

You can see in the call to the production job, I pass a parameter, @setdir, and expand it out in the shell using %base%. In show.sps, I now have this line:

CD @setdir.

And now SPSS has set the current directory to wherever you have the .bat file and .sps syntax file saved. So now everything is dynamic, and runs wherever you have all the files saved. The only thing that is not dynamic in this setup is the location of the SPSS executable, stats.exe. So if you are sharing SPSS code like this, you will need to either tell your friend to add C:\Program Files\IBM\SPSS Statistics to their environment path, or edit the .bat file to the correct path, but otherwise this is dynamically run in the local folder with all the materials.
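One way to soften the hard coded executable location, sketched in Python, is to look for stats.exe on the path first and only then fall back to the default install folder. This is an assumption-laden illustration, not something tested against SPSS: the helper names `find_stats` and `build_command` are hypothetical, and the arguments simply mirror the bat file call above.

```python
import shutil
from pathlib import Path

# Install location from this post; other machines or versions may differ
DEFAULT_STATS = r"C:\Program Files\IBM\SPSS Statistics\stats.exe"


def find_stats(default=DEFAULT_STATS):
    """Return stats.exe found on the path, else the default location."""
    return shutil.which("stats") or default


def build_command(base, spj_name="transfer_job.spj", stats_exe=None):
    """Build the argument list mirroring the bat file's production call."""
    base = Path(base)
    return [
        stats_exe or find_stats(),
        str(base / spj_name),
        "-production", "silent",
        "-symbol", "@setdir", str(base),
    ]
```

The resulting list could be handed to `subprocess.run(build_command(Path.cwd()), check=True)`, which takes care of quoting paths that contain spaces.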

