Abstract: Transformer architectures have recently been introduced into visual question answering (VQA) because of their strong capabilities for information extraction and fusion. However, ...