image compareImages

Large Scale Copyright Infringement Detection (Many-to-Many)

Test Scenario & Use Case

Business Context

A stock photography agency needs to scan user-uploaded images against a massive database of licensed assets to prevent copyright infringement. Since filenames will never match (users rename files), the system must compare EVERY uploaded image against EVERY licensed image to find visual matches.
About the Set: image

Image processing, manipulation, and analysis.

Data Preparation

Creating a 'USER_UPLOADS' table and a 'LICENSED_ASSETS' table with dummy IDs to track ownership.

/* Assumes an active CAS session and a CAS libref named casuser. */
DATA casuser.user_uploads;
   LENGTH _path_ $255 user_id $50;
   _image_ = '00FF00'x;   /* placeholder bytes (hex literal), not a decodable image */
   DO i = 1 TO 5;
      user_id = cat('User_', i);
      _path_ = cat('upload_', i, '.png');
      OUTPUT;
   END;
RUN;

DATA casuser.licensed_assets;
   LENGTH _path_ $255 asset_id $50;
   _image_ = '00FF00'x;   /* same placeholder bytes as the uploads */
   DO j = 1 TO 10;
      asset_id = cat('Asset_', j);
      _path_ = cat('licensed_', j, '.png');
      OUTPUT;
   END;
RUN;
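The dummy tables above only carry placeholder bytes. In a real deployment the image columns would typically come from actual files. A minimal sketch of that step, assuming the image.loadImages action is available in your release and that the directory path and output table name (both hypothetical here) are replaced with your own:

```sas
/* Sketch only: '/data/uploads' and 'user_uploads_real' are hypothetical names. */
PROC CAS;
   image.loadImages /
      path='/data/uploads'      /* hypothetical directory visible to the CAS server */
      recurse=true              /* also read images from subdirectories */
      casOut={name='user_uploads_real', caslib='casuser', replace=true};
RUN;
```

A table loaded this way would not contain user_id or asset_id, so ownership columns would still have to be added (for example by joining on _path_) before they can be passed through copyVars.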

Implementation Steps

1. Perform a comprehensive Many-to-Many comparison using 'pairAll' to check every combination.
PROC CAS;
   image.compareImages /
      sourceImages={TABLE={name='user_uploads', caslib='casuser'}}
      referenceImages={TABLE={name='licensed_assets', caslib='casuser'}}
      casOut={name='potential_matches', caslib='casuser', replace=true}
      pairAll=true
      copyVars={'user_id', 'asset_id'};
RUN;
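Before filtering, it can be useful to confirm which columns the comparison actually wrote, since the filter in the next step depends on the score column's name. A small sketch using the table.columnInfo action against the output table from step 1:

```sas
/* List the columns of the comparison output to confirm the score column name. */
PROC CAS;
   table.columnInfo /
      table={name='potential_matches', caslib='casuser'};
RUN;
```

If the score column turns out to be named differently from _ssim_ in your environment, adjust the filter expression accordingly.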
2. Filter the results to show only high-similarity pairs by applying a WHERE expression to the output table.
PROC CAS;
   TABLE.fetch /
      /* the WHERE expression is a sub-parameter of TABLE, not of fetch itself */
      TABLE={name='potential_matches', caslib='casuser', where='_ssim_ > 0.9'};
RUN;

Expected Result


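The 'potential_matches' table contains one row per upload/asset pair, i.e. the full Cartesian product of 5 uploads and 10 licensed assets (50 rows). The fetch step returns only the pairs whose _ssim_ score exceeds 0.9, which flags uploads that are visually near-identical to a licensed asset.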
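The expected row count can be checked directly. A minimal sketch, assuming the simple action set is loaded in the session and that the result variable name r is free to use:

```sas
/* Count the rows of the comparison output; 5 uploads x 10 assets should give 50. */
PROC CAS;
   simple.numRows result=r /
      table={name='potential_matches', caslib='casuser'};
   print r;
RUN;
```

A count other than 50 would suggest that not every upload was paired with every licensed asset.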