Abstract: Vision Transformers process all image patches uniformly, which often makes them inefficient because they learn unnecessary background information. To address this, this work proposes a preprocessing ...