Use a pre-trained ResNet-50 or Vision Transformer (ViT) to extract the embedding vector for 148_1000.jpg.

Edge cases or "noisy" samples (like 148_1000.jpg) can disproportionately affect model convergence or bias.

Where does 148_1000.jpg come from (e.g., ImageNet, a local project, or a specific website)?

1. Introduction