HATNet: Hierarchical Attention Transformer With RS-CLIP Patch Tokens for Remote Sensing Image Captioning
Remote sensing image captioning (RSIC) aims to generate natural language descriptions of critical visual content in overhead-view remote sensing images. However, existing methods often produce descriptions with significant omissions and misidentifications of scene types or object counts, particularl...
| Main Authors: | , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11214231/ |