- Find the Leak, Fix the Split: Cluster-Based Method to Prevent Leakage in Video-Derived Datasets
  We propose a cluster-based frame selection strategy to mitigate information leakage in datasets of video-derived frames. By grouping visually similar frames before splitting the data into training, validation, and test sets, the method produces more representative, balanced, and reliable dataset partitions.
  4 authors · Nov 17, 2025
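
The abstract's core idea, grouping similar frames first and only then splitting, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say which features or clustering algorithm the authors use, so the sketch swaps in a simple greedy "leader" clustering over cosine distance, and the names `leader_cluster`, `split_by_cluster`, and the `threshold` and `fractions` parameters are hypothetical.

```python
import math
import random


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def leader_cluster(features, threshold=0.1):
    """Greedy leader clustering (an illustrative stand-in, not the paper's method).

    Each frame joins the first cluster whose leader lies within `threshold`;
    otherwise it becomes the leader of a new cluster.
    """
    leaders, labels = [], []
    for vec in features:
        for cid, leader in enumerate(leaders):
            if cosine_distance(vec, leader) <= threshold:
                labels.append(cid)
                break
        else:
            leaders.append(vec)
            labels.append(len(leaders) - 1)
    return labels


def split_by_cluster(labels, fractions=(0.7, 0.15, 0.15), seed=0):
    """Assign whole clusters to train/val/test so no cluster spans two splits."""
    clusters = {}
    for idx, cid in enumerate(labels):
        clusters.setdefault(cid, []).append(idx)

    order = sorted(clusters)
    random.Random(seed).shuffle(order)

    names = ("train", "val", "test")
    targets = dict(zip(names, (f * len(labels) for f in fractions)))
    splits = {name: [] for name in names}
    for cid in order:
        # Send the cluster to the split that is furthest below its target size.
        name = max(names, key=lambda n: targets[n] - len(splits[n]))
        splits[name].extend(clusters[cid])
    return splits


if __name__ == "__main__":
    # Toy per-frame feature vectors; real features would come from an image encoder.
    features = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0], [0.01, 0.99]]
    labels = leader_cluster(features, threshold=0.05)
    print(split_by_cluster(labels, fractions=(0.5, 0.25, 0.25)))
```

Because assignment happens per cluster rather than per frame, visually similar frames cannot end up on both sides of a train/test boundary, which is the leakage the abstract describes. The price is that split sizes only approximate the requested fractions when clusters are large.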