Therefore, in this work, we propose to build class prototypes from text descriptions rather than from a limited number of visual instances, leveraging CLIP, a well-known pretrained VLM. Concretely, we generate ...