image augmentImages

Standard Case: Augmenting Images of Machine Parts for Defect Detection Model Training

Test Scenario & Use Case

Business Context

A manufacturing company wants to build a deep learning model to automatically detect scratches and dents on metal parts. The initial dataset is small. They need to expand it by creating slightly modified versions of existing images (both good parts and defective parts) to make the model more robust to variations in lighting and orientation.
About the Set: image

Image processing, manipulation, and analysis.

Data Preparation

Create a simulated dataset of machine parts, including an ID, a label ('defect' or 'ok'), and a placeholder for the image data. In a real scenario, this table would be populated by the `loadImages` action.

/* Simulated parts table: ID, label, and a text placeholder for the image data */
DATA casuser.manufacturing_parts;
    LENGTH _image_ $200. image_id $10. label $6.;
    /* Fields are pipe-delimited in the inline data below */
    INFILE DATALINES dsd dlm='|';
    INPUT image_id $ label $ _image_ $;
    DATALINES;
PART-001|defect|...binary_data_for_image_1...
PART-002|ok|...binary_data_for_image_2...
PART-003|ok|...binary_data_for_image_3...
PART-004|defect|...binary_data_for_image_4...
;
RUN;
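
The `loadImages` action mentioned above is not shown in this scenario. A minimal sketch of what such a call might look like follows, assuming a hypothetical directory `/data/parts_images` that the CAS server can read. The path and the output table name `parts_from_disk` are illustrative assumptions, not part of the original scenario.

```sas
PROC CAS;
    /* Hypothetical path and table name; adjust to your environment. */
    image.loadImages /
        path='/data/parts_images',   /* assumed server-side image directory */
        recurse=TRUE,                /* also read images in subfolders */
        casOut={name='parts_from_disk', caslib='casuser', replace=TRUE};
QUIT;
```

A table loaded this way carries the columns that `loadImages` generates (for example `_image_` and `_path_`) rather than `image_id` and `label`, so the `copyVars` lists in the later steps would need to be adapted.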

Implementation Steps

1. Load the prepared data into the CAS server.

PROC CASUTIL;
    LOAD DATA=casuser.manufacturing_parts casout='manufacturing_parts' replace;
QUIT;

2. Execute a first augmentation pass to create horizontally flipped versions of the whole images. This tests a basic, deterministic mutation and the `useWholeImage` option. The `copyVars` option carries the original ID and label over to the output table.

PROC CAS;
    image.augmentImages /
        TABLE={name='manufacturing_parts', caslib='casuser'},
        copyVars={'image_id', 'label'},
        augmentations={{useWholeImage=TRUE, mutations={horizontalFlip=TRUE}}},
        casOut={name='parts_flipped', caslib='casuser', replace=TRUE};
QUIT;

3. Execute a second, more complex augmentation pass on the original data. This applies a combination of random rotations and random darkening to simulate real-world conditions. The `seed` ensures reproducibility, and `addColumns` tracks the exact transformations applied.

PROC CAS;
    image.augmentImages /
        TABLE={name='manufacturing_parts', caslib='casuser'},
        seed=54321,
        addColumns='augmentAttributes',
        copyVars={'image_id', 'label'},
        augmentations={{
            useWholeImage=TRUE,
            mutations={
                rotateLeft={type='RANGE', value={0, 25}},
                darken={type='RANGE', value={0.1, 0.3}}
            }
        }},
        casOut={name='parts_randomized', caslib='casuser', replace=TRUE};
QUIT;

Expected Result


Two output tables are created. The 'parts_flipped' table contains the flipped images with their original IDs and labels. The 'parts_randomized' table contains new images with random transformations applied. This second table should also include the original ID and label, plus new columns (e.g., `_rotation_angle_`, `_darken_value_`) detailing the exact random augmentation applied to each image, allowing for full traceability.
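
One way to confirm this outcome is to inspect the two output tables. The sketch below uses the standard `table.recordCount`, `table.columnInfo`, and `table.fetch` actions. It assumes both augmentation steps have already run in the same CAS session, and it makes no assumption about the exact names of the added tracking columns.

```sas
PROC CAS;
    /* Row counts for each output table */
    table.recordCount / table={name='parts_flipped', caslib='casuser'};
    table.recordCount / table={name='parts_randomized', caslib='casuser'};

    /* List the columns to see which tracking columns were added */
    table.columnInfo / table={name='parts_randomized', caslib='casuser'};

    /* Preview the copied ID and label without printing the image data */
    table.fetch /
        table={name='parts_randomized', caslib='casuser'},
        fetchVars={{name='image_id'}, {name='label'}},
        to=4;
QUIT;
```

Restricting `fetchVars` to the ID and label keeps the preview readable, since the `_image_` column holds binary content.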