Toward Prototypical Vision Clustering