How to extract a "substring" from a numeric variable?
Simon 19 vistas
Nivel de dificultad
Débutant
Publicado el :
Consejo del experto
Stéphanie
While mathematical hacks like FLOOR(date/10000) are computationally efficient for simple extraction, the most robust approach is Semantic Conversion. If a number represents a date, convert it to a true SAS date value using INPUT(PUT(var,8.),yymmdd8.). This unlocks the full power of SAS time functions (YEAR, INTNX) and makes your code self-documenting.
Why? Because SUBSTR expects a character string, not a number. Furthermore, trying to compare the result (a string) with a number (2019) creates a type mismatch.
Here are three methods to solve this problem, from the most "mathematical" to the "best practice".
If your column is numeric, the most performant solution is often to stay in the numeric domain. Instead of trying to cut the text, use mathematics.
Imagine a variable date_id containing 20190901. To get the first 4 digits (the year), you just need to divide by 10,000 and keep only the integer part.
proc sql;
select count(receipt_number)
from ma_table
where floor(date_id / 10000) = 2019;
quit;
1
PROC SQL;
2
select count(receipt_number)
3
from ma_table
4
where floor(date_id / 10000) = 2019;
5
QUIT;
Here, 20190901 / 10000 results in 2019.0901. The FLOOR function rounds down to the nearest integer, which is 2019.
If you needed to extract a part from the middle of the number, the logic would become more complex (a mix of subtractions and divisions), making this method less readable for complex extractions.
2. The Conversion Method (Explicit Typing)
If you absolutely want to treat the variable as text, you must explicitly convert it. It's not enough to use PUT; you often have to manage the spaces that can be inserted during the conversion.
However, this method is often more verbose and prone to comparison errors (comparing a string '2019' with the number 2019).
3. The "Best Practice": Convert to a Real SAS Date
proc sql;
/* Création d'une date temporaire pour le filtre */
select count(receipt_number)
from ma_table
where year(input(put(date_id, 8.), yymmdd8.)) = 2019;
quit;
1
PROC SQL;
2
/* Création d'une date temporaire pour le filtre */
3
select count(receipt_number)
4
from ma_table
5
where year(INPUT(put(date_id, 8.), yymmdd8.)) = 2019;
6
QUIT;
Advantages:
Readability: It's immediately clear that we are filtering on the year.
Flexibility: You can easily change to filter by month (MONTH() = 9) or day, without redoing complex mathematical calculations.
Aviso importante
Los códigos y ejemplos proporcionados en WeAreCAS.eu son con fines educativos. Es imperativo no copiarlos y pegarlos ciegamente en sus entornos de producción. El mejor enfoque es comprender la lógica antes de aplicarla. Recomendamos encarecidamente probar estos scripts en un entorno de prueba (Sandbox/Dev). WeAreCAS no acepta ninguna responsabilidad por cualquier impacto o pérdida de datos en sus sistemas.
SAS y todos los demás nombres de productos o servicios de SAS Institute Inc. son marcas registradas o marcas comerciales de SAS Institute Inc. en los EE. UU. y otros países. ® indica registro en los EE. UU. WeAreCAS es un sitio comunitario independiente y no está afiliado a SAS Institute Inc.
Este sitio utiliza cookies técnicas y analíticas para mejorar su experiencia.
Saber más.