Collaborative Context-Aware Visual Question Answering
Title:
Collaborative Context-Aware Visual Question Answering
Author:
Toor, Andeep Singh, author.
ISBN:
9780438109759
Physical Description:
1 electronic resource (176 pages)
General Note:
Source: Dissertation Abstracts International, Volume: 79-11(E), Section: B.
Advisors: Harry Wechsler. Committee members: Mihai Boicu; Jim Chen; Gheorghe Tecuci; Harry Wechsler.
Abstract:
This dissertation presents advances we have made toward the Visual Turing Test (VTT) in general, and toward image understanding using Visual Question Answering (VQA) in particular. The visual world poses challenges such as uncertainty, incompleteness, and complexity. Additionally, the multi-modal queries submitted to VQA may or may not contain relevant information, which is yet another real-world challenge. The novelty of our dissertation is a collaborative, context-aware approach to VQA in which the content of queries is parsed to assess their relevance, if any, and iteratively refined until they are resolved. The proposed Collaborative Context-Aware Visual Question Answering (C2VQA) methodology combines convolutional neural networks and deep learning, joint visual-text embedding, recurrence and sequencing, and memory models to interpret the queries and best answer them.
The feasibility and utility of C2VQA are shown across a number of diverse applications involving single images, sets of images, and videos. These applications include data fusion of biometrics and forensics, content-based image retrieval, novel security protocols for access and authentication, biometric surveillance, query relevance and editing, ranking, and triage.
The dissertation makes new datasets available to the scientific community for the performance evaluation of VQA. Rigorous experimental design, including suitable metrics, shows that C2VQA outperforms state-of-the-art approaches that were adapted to run on the novel tasks we introduced to expand the reach and scope of VQA. We conclude by outlining directions and avenues for future research, including dialog-based relevance approaches and expanded multi-modal models.
Notes:
School code: 0883
Available:*
Call Number | Item Number | Shelf Location | Location / Status / Due Date |
---|---|---|---|
XX(691081.1) | 691081-1001 | ProQuest E-Thesis Collection | |