Sorting is an essential operation in database management. This tutorial explains how to use the SAS© SORT procedure to organize your records, whether for preparing a report or for cleaning data before analysis.
Sorting is an essential operation in database management. This tutorial explains how to use the SAS© SORT procedure to organize your records, whether for preparing a report or for cleaning data before analysis.
To make the exercise more concrete, we will work with a movie database. Let's imagine a table containing the following information:
Title: The name of the movie.
Director: The name of the director.
Studio: The production company.
Budget_M: The film's budget in millions of dollars.
Year: The year of release.
Here is the code to generate this example dataset:
| 1 | DATA movies; |
| 2 | INFILE DATALINES dsd; |
| 3 | INPUT Title : $20. Director : $15. Studio : $15. Budget_M Year; |
| 4 | DATALINES; |
| 5 | Inception,Nolan,Warner Bros,160,2010 |
| 6 | Tenet,Nolan,Warner Bros,200,2020 |
| 7 | Dune,Villeneuve,Warner Bros,165,2021 |
| 8 | Jaws,Spielberg,Universal,9,1975 |
| 9 | E.T.,Spielberg,Universal,10,1982 |
| 10 | Blade Runner 2049,Villeneuve,Warner Bros,150,2017 |
| 11 | Avatar,Cameron,Fox,237,2009 |
| 12 | Titanic,Cameron,Paramount,200,1997 |
| 13 | The Room,Wiseau,,6,2003 |
| 14 | .,Unknown,Indie,1,2020 |
| 15 | Metropolis,Lang,UFA,.,1927 |
| 16 | ; |
| 17 | RUN; |
To reverse the order (for example, to see the most recent films first), we use the DESCENDING keyword. Be careful, this keyword is placed before the variable it modifies.
Example: Sort by Director (Z to A) then by Budget (from most expensive to least expensive).
| 1 | |
| 2 | PROC SORT |
| 3 | DATA=movies out=sorted_movies; |
| 4 | |
| 5 | BY descending Director descending Budget_M; |
| 6 | |
| 7 | RUN; |
| 8 |
One of the most powerful uses of sorting is to prepare the data for the DATA step, in order to keep only "the best" or "the first" of a group.
Let's imagine we want to keep only the most expensive film for each director.
We sort by Director.
We sort by Budget in descending order (so that the largest budget is at the top of each group).
| 1 | PROC SORT DATA=movies out=sorted_movies; |
| 2 | BY Director descending Budget_M; |
| 3 | RUN; |
| 4 | |
| 5 | DATA big_budget_only; |
| 6 | SET sorted_movies; |
| 7 | BY Director; |
| 8 | IF first.Director THEN OUTPUT; |
| 9 | RUN; |
Developers accustomed to SQL (Structured Query Language) must be careful with the syntax, as it is the reverse of SAS©.
In SAS© (PROC SORT): DESCENDING is placed before the variable.
In SQL (ORDER BY): DESC is placed after the variable.
Here is the SQL equivalent:
| 1 | PROC SQL; |
| 2 | SELECT * FROM movies |
| 3 | ORDER BY Director DESC, Budget_M DESC; |
| 4 | QUIT; |
Sorting is an operation that consumes memory and processor time. If your movie data already comes from an SQL or Oracle server, it is often better to let the database server handle the sorting via its own indexing rather than importing the data in disorder to sort it later with SAS©.

Les codes et exemples fournis sur WeAreCAS.eu sont à but pédagogique. Il est impératif de ne pas les copier-coller aveuglément sur vos environnements de production. La meilleure approche consiste à comprendre la logique avant de l'appliquer. Nous vous recommandons vivement de tester ces scripts dans un environnement de test (Sandbox/Dev). WeAreCAS décline toute responsabilité quant aux éventuels impacts ou pertes de données sur vos systèmes.