SPSS syntax introduction: programming step by step | Doctor Analytix | Skillshare

SPSS syntax introduction: programming step by step

Doctor Analytix, SPSS fanatic

Lessons in This Class

10 Lessons (52m)
    • 1. Welcome to the course

    • 2. Learn when the Syntax is preferable to the method point-and-click

    • 3. Three Really good reasons to use the syntax as well

    • 4. Checklist: follow these guidelines to set a syntax file properly

    • 5. Syntax Diagram

    • 6. DO IF .. END IF

    • 7. LOOP - END LOOP

    • 8. AGGREGATE COMMAND

    • 9. TEMPORARY COMMAND

    • 10. Useful commands of SPSS Syntax


About This Class

Hi, I am Doctor AnalytiX. I have spent many years testing what really works in SPSS to get results in a very short time, and now I want to share with you how to prepare data and automate tasks in a few simple steps.

Over these years I have trained hundreds of people and explained how to use SPSS syntax programming in several live training sessions, and now you have the chance to follow me.

This course will teach you how to use this 'behind the scenes' language and other features for data management and manipulation, and for overall control of SPSS execution.

There will be in-depth details about the syntax language: useful and relevant examples and exercises that show how syntax can make your analysis more efficient, more transparent, and easier.

This awesome program includes:

  • useful handouts.
  • handy checklists.
  • interesting projects.

Let us code together!


Please use the dataset included in the 'tutorials.zip' file.


Meet Your Teacher

Doctor Analytix

SPSS fanatic


I am a technical and highly accomplished statistician with demonstrated experience in delivering high-level technical projects.

I am a passionate data mining, predictive analytics, and Big Data trainer and consultant with 15+ years of experience.

I love to do:

o Statistics, Data Mining, Business Intelligence and Big Data.

My favorite analytics platforms and software are:

o IBM SPSS Statistics.

o IBM SPSS Modeler.

o Rapid Miner.

o R, Python, and MATLAB for programming.

o SQL server for Business Intelligence.

o Hadoop technologies.


1. Welcome to the course: Hello, everybody. I'm very glad to welcome you to the course. In this step-by-step course, I'm going to teach you SPSS syntax essentials. But before we begin the course, I just want to explain a few very important points. Please notice that this course is specifically aimed at complete beginners. This means that this course is going to teach you from scratch, as if you know nothing at all. I will work with you step by step to understand SPSS syntax. So you must contact me for help through the Udemy discussion board for any questions or feedback you may have, so that you can get the most from this course. And you know, I don't know each of you personally, so I can't create a course that fits everyone's needs. So the only way that I can fit the course to everyone's specific needs is that you contact me.

I strongly suggest you go through all the lessons in order and not just pick the lessons you think are right for you. Also remember to complete each exercise at the end of each section, and please write down the tips that I provide in each lesson. They will become a valuable resource for you in the future. Whenever I have an important update to share with you, I will send you an email through the Udemy emailing system, so you will always be up to date. And finally, please always keep in mind that you will not master SPSS syntax overnight or within a few days; it will take hard work and patience. And again, please feel free to contact me for any support. If you follow these guidelines, you will get the most out of this course, and it makes for a better learning experience for everyone. So I think we are ready now. Let's get started.

2. Learn when the Syntax is preferable to the method point-and-click: Let's see when the syntax is preferable to the point-and-click method. There may be many situations in which the use of the syntax is preferable or even essential.
Here are some examples. If you have to create weekly or monthly reports, instead of recreating the commands through the menus each time, you can save them in a syntax file for later reuse. If you have to use commands that are not available through the dialogs: some subroutines for reading complex data structures, transformation structures like DO IF and LOOP, and macros are available only through the syntax. If you have to create multiple tables or graphics, the syntax allows you to create them in a single step, whereas by using the menus you can only generate them one by one. If every month you run the same SPSS commands but the name of the database changes every time you run the analysis, you can change the file name using the syntax editor and reuse the rest of the commands, rather than redefining the file from the beginning through the menus every time.

3. Three Really good reasons to use the syntax as well: I will give you some really good reasons to use the syntax as well. But first of all, I'd like to point out that the syntax is the SPSS automation language and consists of commands for analysis and data management. All SPSS procedures are built upon a powerful programming language that has been extended since SPSS was first developed. This course will teach you how to use this language and other features for file and data manipulation and for overall control of SPSS execution. The SPSS language, called syntax, is generated by the program every time a user clicks on the OK button. So a syntax file is simply a text file that contains commands. Given the obvious advantages of using the graphical system, maybe you wonder why anyone would want to program using the syntax. There are three really good reasons to use the syntax as well.

First reason: efficiency. You can run complex analyses over and over again on updated data sets.
In the menus, instead, you have to start all over and remember exactly which options you chose last time. The syntax command language also allows you to save your jobs in a syntax file so that you can repeat your analysis at a later time or run it in an automated way with a production job. So syntax can be corrected and recycled and, mainly, syntax gets things done fast.

Second reason: log of analyses. You will be able to keep track of your analyses step by step and save them in a syntax file, because, of course, you will not always remember every step you did.

Third reason: project documentation. Syntax is the only way for you to show what you did and how you did it, like which options you used in your analyses or exactly how you transformed a variable, and so on. You can only communicate these clearly by showing the syntax, and you can share your syntax code with colleagues and students, so they will know how to reach the same results and goals.

4. Checklist: follow these guidelines to set a syntax file properly: In order to set a syntax file properly, you'll need to follow these guidelines. First of all, develop simple syntax using the dialog box. Be aware of pending changes. Check the correct syntax step by step. And finally, erase the results in the Viewer. Now let's see the guidelines in detail.

Item one: develop simple syntax using the dialog box. Making the syntax with the help of the dialogs is definitely beneficial to all users. This technique minimizes errors due to typing commands and increases accuracy in documenting syntax.

Item two: be aware of pending changes. If mistakes are made in data management operations, or you do not complete the command, SPSS can remain in a state of pending changes. SPSS does not accept new instructions until you stop or close the command sequence or insert a procedure.

Item three: check the correct syntax step by step.
It's a really complex challenge to analyze, in one go, all the mistakes and all the correctly performed steps of an analysis process. For this reason, it's important to systematically verify the results obtained step by step. The Data Editor is very useful for this purpose. It's equally useful to use reporting procedures based on the data to verify the transformations, such as a frequencies table or a list of cases produced with the LIST and PRINT commands.

Item four: erase the results in the Viewer. If mistakes are made, once the error messages have been read and understood in the Viewer, it's recommended to delete them from the Viewer before running the procedure again. This is because the new error messages are appended to the previous ones and it's hard to tell them apart. And that's all. Please take care of this important checklist each time you do your analysis.

5. Syntax Diagram: And now we talk about syntax diagrams, and we will see how to use the syntax diagram as a reference for any command. Each command includes a syntax diagram that shows all of the subcommands, keywords, and specifications allowed for that command. By recognizing the symbols and the different type fonts, you can use the syntax diagram as a quick reference for any command. Let's see an example. Open a new syntax editor window and type FREQUENCIES. Select the text FREQUENCIES and press F1 on the keyboard, or right-click and select Help in the contextual menu. After that, a new tab will open with the web page of the Command Syntax Reference. This is a concise version of the manual; for all the details you should use the Command Syntax Reference in PDF format.

Now I'll show you how to read and use the syntax diagram of FREQUENCIES. Elements shown in upper case are keywords that identify commands, subcommands, functions, operators, and other specifications. Take a look at this diagram: FREQUENCIES is the command, FORMAT is a subcommand, varname stands for variable name, and varlist stands for variable list.
Variable names are separated by spaces. Elements in lower case describe specifications that you supply. For example, varlist indicates that you need to supply a list of variables. A very important thing: parentheses, apostrophes, and quotation marks are required where indicated. Please also note that elements in bold are defaults. But please pay attention, because there are two types of defaults. First type: when the default is followed by two asterisks (for example, a value**), it is in effect even if the subcommand is not specified. Second type: if a default is not followed by two asterisks, it is in effect when the subcommand or keyword is specified by itself.

Elements enclosed in square brackets are optional. For some commands, square brackets are part of the required syntax; the command description explains which specifications are required and which are optional. The specification FREQUENCIES VARIABLES=varlist [varlist...] means that you can specify multiple variable lists, and the additional ones are optional. Braces indicate a choice between elements: you can specify any one of the elements enclosed within the aligned braces. Ellipses indicate that you can repeat an element in the specification. For example, as before, in FREQUENCIES VARIABLES=varlist [varlist...] the ellipsis indicates that you can add other variable lists and so calculate frequency tables for other lists of variables.

Now I would like to show you how to use FREQUENCIES to sort the categories of the frequency table in ascending order of frequency. By default, FREQUENCIES displays the frequency table and sorts categories in ascending order of values for numeric variables and in alphabetical order for string variables. In the syntax diagram, we see that we can specify the FORMAT subcommand with AFREQ as keyword to reach the goal.
In fact, the FORMAT subcommand controls various features of the output, including the order of categories and the suppression of tables. We need to use the FORMAT subcommand, and so I can choose one of the elements enclosed within the aligned braces. The final syntax code is this one: FREQUENCIES VARIABLES=marital /FORMAT=AFREQ. Execute the syntax code and see the results in the output Viewer. And that's all. In this lesson, you have learned how to read a syntax diagram and how to correctly define any kind of SPSS syntax command.

6. DO IF .. END IF: Now we will talk about the DO IF command and learn why to use DO IF. The DO IF-END IF construct is useful to control conditional transformations. The DO IF-END IF structure is often used to execute multiple transformations, such as with the COMPUTE, RECODE, or COUNT commands. You can have as many nested DO IF-END IF structures as you like, as long as your computer has the memory to support it. The most common transformation commands are COMPUTE, RECODE, IF, SELECT IF, and so on.

Basically, control, or the ability to change the value of a variable, is passed from one command to the next based on whether the present condition evaluates to true or false. In the description that follows, you'll see how control flows from one command to another depending on the outcome of the condition. If the logical expression is true, the commands following DO IF are executed up to the next ELSE IF, ELSE, or END IF command; then control passes to the first command following END IF. If the expression is false, control passes to the following ELSE IF command, and this continues until one of the logical expressions is true. If none of the expressions on the DO IF or on any of the ELSE IF commands is true, the commands following ELSE are executed. This is the syntax diagram. The syntax diagram contains the following parts.
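The diagram shown on screen is not reproduced in this transcript. As a rough sketch of the structure just described, here is a small DO IF-END IF block. The variables score and band are hypothetical and exist only for illustration; they are not taken from the course files.

```spss
* Sketch only: score (numeric) and band (string) are hypothetical variables.
STRING band (A6).
* First condition: checked on every case.
DO IF (score >= 80).
COMPUTE band = 'high'.
* Reached only when the DO IF condition is false.
ELSE IF (score >= 50).
COMPUTE band = 'middle'.
* Reached only when none of the conditions above is true.
ELSE.
COMPUTE band = 'low'.
END IF.
EXECUTE.
```

For each case, only the first branch whose condition is true assigns a value to band; control then jumps to the command after END IF.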
The DO IF command: here we state the condition to check, and that condition will be checked on each observation of the data set. The COMPUTE command: here we write the logic to run when the condition is satisfied. The ELSE IF command: this command is optional, and it can be repeated as many times as needed. The END IF command: this command is mandatory; it ends the DO IF structure. And please take note that only transformation commands can be used within DO IF.

And now, let's see when to use DO IF. The DO IF condition is applied only to observations, that is, to cases or rows of data. Let's see an example. We want to use the file student.sav. The student.sav file contains three variables. We want to calculate the grade, in a variable named grade, based on the following marks criteria. From the first line of the marks criteria table, we can see that if the average of the student's marks is over 65 and the actual exam mark is below 40, the grade of this student is C+. We have six conditions to define in the next script to populate the variable grade for all the records. And here is the solution.

And now let's see the code explanation. Step one: this command declares a string variable named grade with three characters. Step two: here we start the DO IF command with the first condition, checking whether the value of avg is greater than 65 or not. Step three: we use the RECODE command to assign a value to the variable grade based on the three criteria mentioned in parentheses, with the expression on the left-hand side of the equals sign being the criterion and the one on the right-hand side being the value to assign when the criterion is met. After executing this statement, control goes back to step two for the next observation. Step four: if control reaches the ELSE command, which in this case is only possible when the value of avg is not greater than 65, then the command following ELSE gets executed.
Step five: the same as in step three, but with different values. Step six: this step marks the end of the DO IF command. Step seven: this command ends the program.

7. LOOP - END LOOP: And now we talk about the LOOP command and learn why to use LOOP. The SPSS LOOP command is used when a transformation needs to be performed a repeated number of times. LOOP is very useful for dealing with vectors: the loop counter can be used as the vector index, and thus we can run a snippet of code on the whole vector using the counter variable. We can have a loop inside a loop, called nested loops, and there is no restriction on the number of times we can nest loops. But as you increase the level of nesting, the complexity increases, and it becomes difficult to backtrack through the code when troubleshooting.

And now let's see when to use LOOP. The LOOP command reduces the length of the code and thus makes the code easier to understand. For example, if you want to run the same piece of code 50 times on each observation, you don't have to write that code 50 times. Using the LOOP command, we just have to write our code once and tell SPSS how many times to execute it. This is the syntax diagram. Inside the loop, we can put any commands, any number of them, to execute. The most important thing is to set up a scratch variable (see the next example) and define the initial count, the final count, and what this loop is going to do.

Let's see an example. We use the file GSS98.sav. The file contains survey data. We want to compute the difference between confidence in banks and financial institutions (confinan) and the other variables about confidence in other institutions. For this problem, we use the VECTOR and LOOP commands and a scratch variable, as in the following code. And now let's see the code explanation. Before going through the steps, let's define some commands better.
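The code shown on screen is not captured in this transcript. A minimal sketch consistent with the step-by-step explanation might look like the block below. It assumes that the 13 confidence variables from confinan to conarmy are adjacent in the file, and the vector names vargroup and dif are spelled as they are spoken in the lesson, which is an assumption.

```spss
* Sketch only: assumes confinan TO conarmy are 13 adjacent numeric variables.
* Step 1: vector over the existing confidence variables.
VECTOR vargroup = confinan TO conarmy.
* Step 2: new empty vector, which creates dif1 TO dif13.
VECTOR dif(13).
* Step 3: the scratch variable #i counts from 2 to 13, giving 12 iterations.
LOOP #i = 2 TO 13.
* Step 4: difference between the first variable and variable number #i.
COMPUTE dif(#i - 1) = vargroup(1) - vargroup(#i).
* Step 5: end of the loop.
END LOOP.
* Step 6: run the pending transformations.
EXECUTE.
```

With a vector of size 13 and 12 iterations, the last element (dif13) is never assigned and stays system-missing in this sketch.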
For example, in steps one and two, we see the VECTOR command. A vector is a collection of variables. Each variable in a vector holds a position, which is called the vector index, so using the vector index we can access any variable. A vector can hold either string or numeric variables. Vectors are mostly used with loops, that is, when we want to run code on a collection of variables. In that case, the efficient way is to create a vector of all those variables and then run the loop on the vector. The VECTOR command creates a vector named vargroup whose elements are the variables from confinan to conarmy. Please note that the variables we want to include in a vector must be adjacent in Variable View.

Step two: creating a vector named dif of size 13 using the VECTOR command. Here we are just creating a new, empty vector; that is, until this point, the elements of the vector don't contain any values, which we will compute and assign later in step four.

Step three: the variable #i is called a scratch variable. In our script, we have to create a variable to maintain the loop count, but we don't want that variable to appear in the active data set, so we create the variable i as a scratch variable. A scratch variable also retains its value from one loop iteration to the next. A scratch variable always follows the same syntax, that is, the hash symbol followed by the variable name. This LOOP command runs the loop 12 times, with the value of #i starting at 2 and going up to 13, incrementing by one after each iteration.

Step four: here we are computing the difference between the first variable in vector vargroup and each of the following ones, and assigning it to vector dif. The part dif(#i - 1) tells SPSS for which element of vector dif we are computing the value. The part vargroup(1) - vargroup(#i) subtracts element #i of vector vargroup from the first element, and the result is assigned to the dif vector element. For example, during the first iteration of the loop, the value of the scratch variable #i will be 2.
So the COMPUTE command resolves to the following code: COMPUTE dif(2 - 1) = vargroup(1) - vargroup(2), which further resolves to COMPUTE dif(1) = vargroup(1) - vargroup(2). As a result, the value of the second element of vargroup gets subtracted from the first element, and the result is assigned to the first element of the dif vector. Step five: this command marks the end of the loop, and now the following commands, if any, will be executed sequentially. Step six: this command ends the program.

And now let's see the control flow in the LOOP command. After encountering the keyword LOOP, SPSS knows that the following commands, up to END LOOP, need to be executed repeatedly, and the counter variable, which in our example is #i, decides how many times to repeat them. So SPSS will execute step four 12 times before continuing.

8. AGGREGATE COMMAND: And now we talk about the AGGREGATE command and learn why to use it. AGGREGATE is used when you need to summarize groups of cases within a data set into single cases, using aggregating functions like average, percentage, etcetera. In other words, it is effective for combining cases into groups and creating a data set with one case for each group. The variables which define the groups are called break variables. Using this procedure, we can get and manipulate the statistical properties of macro objects within our data set instead of the properties of single cases.

And now let's see when to use AGGREGATE. The most common uses of this command are the following. Analyzing data on the macro instead of the micro level (correlations, cluster analysis, regression analysis, etcetera), with the use of variables which describe geographical regions, organizations, and so on, instead of the particular people in these structures. Comparing different time periods, for instance aggregating regional income data and the factors which influence it over a yearly period. Effectively storing and operating on large data sets.
If you need to regularly analyze the data at the level of region, firm, department, etcetera, it may be rather inefficient to store the data in case-level format and calculate macro-level statistics each time. Aggregated data sets are usually at least tens of times smaller than the original data sets. Analyzing the influence of structural factors on macro objects, for instance exploring how regional economy parameters influence the income of a particular business organization.

Let's see now the basic parameters. These parameters are mandatory for each use of the command; it will not work without them.

/BREAK: a variable or several variables which define the case grouping. For instance, if we want to analyze our company's revenues in each region, we use a region variable as the break variable. As a result, we get one case per region.

/aggvar 'label' = function(arguments): the aggregated data variable, its label, and the desired aggregation function. For instance, if you want to get the percentage of highly educated people in each region, you write /HIGHEDU 'percentage of highly educated' = PIN(education, 1, 1), where 1 is the value for high education in the case-level variable education. Normally, the names of the aggregated variables are written in capital letters to distinguish them from case-level variables.

And now let's see the functions which can be used for data aggregation. SUM: the sum of the variable values. MEAN: the mean of the variable values. MAX: the maximum of the variable values. MIN: the minimum of the variable values. PIN(variable, value1, value2): the percentage of cases between value1 and value2, inclusive. Taking the previous example, if we have 1 for primary, 2 for secondary, and 3 for college, then PIN(education, 1, 3) will produce the percentage of people in all three categories. Please note that by default, calculations are done for the unweighted data set. You have to specify WEIGHT BY with a weight variable before performing the aggregation.
This is needed if you want weighted percentages, fractions, and counts.

And now let's see some additional parameters. These parameters modify how the data are handled before aggregation and how the resulting data set is stored: /PRESORTED and /OUTFILE.

/PRESORTED: before aggregation, the original data must be sorted by the break variables in ascending order. SPSS will do it automatically each time aggregation is performed. If your data set is already sorted, you can save calculation time by using this subcommand. It must be placed before BREAK.

/OUTFILE=: the specification for saving the aggregation results. Possible usages are: OUTFILE='file path', which saves the aggregated data set to a specified directory with a specified .sav file name; and OUTFILE=* MODE=ADDVARIABLES, which adds the aggregated variables to the original data set. If we omit the OUTFILE subcommand, OUTFILE=* MODE=ADDVARIABLES is automatically applied to the original data set. And now let's see the syntax structure.

And now let's see an example. You have a retail chain and have to optimize it. To do this, you have collected data from each shop for 100 days. For each day and shop, you have the sum of revenue and the quantity of sold items. Define which shop is the most effective in terms of revenue and stability of sales. In this example, we will calculate two aggregated variables: the mean of revenues for each shop and the mean of items sold. The data set we use is shops for aggregate.sav.

And now let's see the code explanation. Step one: the AGGREGATE command starts, saving the results of the aggregation to a file on disk C. Step two: shop ID is the grouping variable; a separate case for each shop will be produced. Step three: calculating the aggregated variable for each shop's revenue as the mean of daily revenues. Step four: calculating the aggregated variable for items sold in each shop as the mean of the daily sum of sold items, and ending the AGGREGATE command with a full stop.
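The syntax shown on screen is not part of this transcript. A sketch that matches the four steps above could look like the block below. The variable names shop_id, revenue, and items, the target names MEAN_REV and MEAN_ITEMS, and the output path are all assumptions made for illustration.

```spss
* Sketch only: shop_id, revenue and items are assumed variable names.
* The output path is a placeholder, not the path used in the lesson.
* Step 1 is OUTFILE, step 2 is BREAK, steps 3 and 4 are the two MEAN lines.
AGGREGATE
  /OUTFILE='C:\temp\aggregated.sav'
  /BREAK=shop_id
  /MEAN_REV 'Mean daily revenue' = MEAN(revenue)
  /MEAN_ITEMS 'Mean daily items sold' = MEAN(items).
```

If the file were already sorted by shop_id, adding /PRESORTED before /BREAK would skip the internal sort.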
Now, if we look at the data in the aggregated.sav file, we can conclude that shops three and six are the least effective and unstable in terms of daily revenues and items sold.

9. TEMPORARY COMMAND: And now we talk about the TEMPORARY command and learn why to use it. The TEMPORARY command allows you to temporarily transform the data without making permanent changes. The transformation works for the very first procedure; then the data revert to the original values.

Now let's see when to use TEMPORARY. In order to use TEMPORARY effectively, you must know which commands do or do not read the data. With TEMPORARY, you can perform separate analyses for subgroups in the data and then repeat the analysis for the file as a whole. You can also use TEMPORARY to transform data for one analysis but not for other subsequent analyses. By using TEMPORARY, we don't need to undo these modifications after creating the table.

Let's see the basic parameters. The only specification is the keyword TEMPORARY; there are no additional specifications. TEMPORARY can be applied to the following commands: transformation commands (COMPUTE, RECODE, IF, COUNT) and the DO REPEAT utility; format commands (PRINT FORMATS, WRITE FORMATS, and FORMATS); data selection commands (SELECT IF, SAMPLE, FILTER, and WEIGHT); variable declarations (NUMERIC, STRING, and VECTOR); labeling commands (VARIABLE LABELS and VALUE LABELS) and the MISSING VALUES command; and SPLIT FILE. The XSAVE command leaves temporary transformations in effect. SAVE, however, reads the data and turns temporary transformations off after the file is written.

TEMPORARY can't be applied to the following commands: SORT CASES, MATCH FILES, ADD FILES, or COMPUTE with a LAG function. If any of these commands follows TEMPORARY in the command sequence, there must be an intervening procedure or command that reads the data, so that the TEMPORARY command is executed first. TEMPORARY cannot be used within the DO IF-END IF or LOOP-END LOOP structures.
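As a small illustration of running one analysis on a subgroup and then repeating it for the whole file, here is a sketch with SELECT IF. The variable names gender and income, and the coding gender = 1 for the subgroup, are assumptions made for this example.

```spss
* Sketch only: gender, income and the code 1 are assumed for illustration.
TEMPORARY.
SELECT IF (gender = 1).
* First procedure after TEMPORARY: runs on the selected subgroup only.
DESCRIPTIVES VARIABLES=income /STATISTICS=MEAN.
* The selection is no longer in effect: this one runs on all cases.
DESCRIPTIVES VARIABLES=income /STATISTICS=MEAN.
```

Without TEMPORARY, the SELECT IF would permanently drop the unselected cases from the active data set.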
Let's see an example. We want to sort and split the file loan.sav and calculate the average income for each gender. To reach the goal, we use the following commands. SORT CASES arranges the cases in the file according to the values of the variable gender. SPLIT FILE splits the file according to the values of the variable gender, and DESCRIPTIVES generates separate mean income tables for men and women. By default, the two groups, men and women, are compared in the same frequency and statistics tables. Women are 2,077 customers in the file, and their average income is around 25,370, while men are 2,040, and their average income is around $25,794.

And now we want to apply a TEMPORARY command before splitting the file and compute the average income by gender and for the whole data set. Using these commands, we reach the requested result. Please notice that the DESCRIPTIVES command is repeated twice, and it's not an error: because of the TEMPORARY command, SPLIT FILE applies to the first procedure only. Thus, the first DESCRIPTIVES procedure generates separate tables for men and women, and the second DESCRIPTIVES procedure generates tables that include both genders. So here are the statistics when the TEMPORARY command is active and when it is not active.

10. Useful commands of SPSS Syntax :