Title:
Scalable Nonlinear Spectral Dimensionality Reduction Methods for Streaming Data
Author:
Mahapatra, Suchismit, author. (orcid)0000-0002-1851-4195
ISBN:
9780438050020
Physical Description:
1 electronic resource (125 pages)
General Note:
Source: Dissertation Abstracts International, Volume: 79-10(E), Section: B.
Advisor: Varun Chandola. Committee members: Nils Napp; Jaroslaw Zola.
Abstract:
High-dimensional data is inherently difficult to explore and analyze owing to the "curse of dimensionality," which renders many statistical and Machine Learning (ML) techniques (e.g. clustering, classification, model fitting) inadequate. In this context, nonlinear spectral dimensionality reduction (NLSDR) methods have proved to be an indispensable tool. However, standard NLSDR methods, e.g. Isomap or Locally Linear Embedding (LLE), were designed for off-line or batch processing. Consequently, they are computationally too expensive or impractical when dimensionality reduction must be applied to a data stream. Processing data streams efficiently using standard approaches is also challenging in general, given that streams require real-time processing and cannot be stored permanently. Any form of analysis, including NLSDR and/or detecting concept drift, requires adequate summarization that can deal with these inherent constraints and approximate the characteristics of the stream well. In spite of advances in hardware and the development of novel processing frameworks, the scalability of ML algorithms remains an issue. The scalability of an algorithm is measured by how its performance is affected as the problem size increases. Scalable algorithms should be able to work with any amount of data without consuming ever-growing amounts of storage, memory, and computation. The challenge is often to find a trade-off between quality and processing time, i.e. getting "good enough" solutions as "fast" or "efficiently" as possible.
In this thesis, I propose a generalized framework for streaming NLSDR that can work with different manifold learning approaches, e.g. Isomap and LLE, and can deal effectively with data streams whose underlying distributions may be multimodal and non-uniformly sampled. In particular, I developed streaming Isomap, or S-Isomap, an algorithm that, via a clever approximation, discovers the low-dimensional embedding at a fraction of the computational cost without significantly affecting its quality.
However, S-Isomap was limited in scope, i.e. it could only deal with unimodal, uniformly sampled distributions. Hence arose the need for S-Isomap++, which addresses the flaws of its predecessor by handling multimodal and/or unevenly sampled distributions. However, S-Isomap++ can only detect manifolds that it encounters in its batch learning phase, not those it might encounter in the streaming phase. Thus, S-Isomap++ ceases to "learn" and evolve in ways that would limit the embedding error for points in the data stream. This motivated GP-Isomap, which, via a novel positive-definite geodesic-distance-based kernel and the use of Gaussian Processes to measure variance, is able to detect concept drift, i.e. distinguish among different manifolds, and embed streaming samples effectively. Subsequently, I developed the streaming LLE algorithm for processing streams using LLE, and I discuss a generalized Out-of-Sample Extension methodology for streaming NLSDR that is applicable to different manifold learning algorithms. Lastly, I provide theoretical bounds for S-Isomap and GP-Isomap as part of this work.
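As a rough illustration of the batch-then-stream idea the abstract describes, the Python sketch below computes graph geodesic distances once on a batch, then approximates the geodesic distances of a newly arrived point by routing through its nearest batch neighbors instead of recomputing all shortest paths. This is a generic out-of-sample approximation and is not the thesis's S-Isomap algorithm. The function names (`knn_graph`, `geodesics_from`, `stream_geodesics`), the half-circle toy data, and the choice `k = 2` are all assumptions made for this example.

```python
import heapq
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbor graph with Euclidean edge weights."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesics_from(graph, source):
    """Dijkstra shortest-path distances from `source` to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def stream_geodesics(points, batch_geo, new_point, k):
    """Approximate geodesics from a streamed point to every batch point.

    The new point is linked to its k nearest batch points; the batch
    geodesics are reused unchanged (assumption: the batch graph is kept fixed).
    """
    nearest = sorted((math.dist(new_point, p), i) for i, p in enumerate(points))[:k]
    return [
        min(d + batch_geo[i].get(j, math.inf) for d, i in nearest)
        for j in range(len(points))
    ]


# Toy batch (assumed data): 21 points on a unit half circle, a 1-D manifold in 2-D.
n = 21
points = [(math.cos(math.pi * i / (n - 1)), math.sin(math.pi * i / (n - 1)))
          for i in range(n)]
k = 2

# Batch phase: all-pairs geodesics, computed once.
graph = knn_graph(points, k)
batch_geo = [geodesics_from(graph, i) for i in range(n)]

# Streaming phase: one new point near the start of the arc.
new_point = (math.cos(math.pi / 40), math.sin(math.pi / 40))
approx = stream_geodesics(points, batch_geo, new_point, k)

print("approximate geodesic to far end:", approx[-1])
print("straight-line distance to far end:", math.dist(new_point, points[-1]))
```

In this toy setting the streamed point's approximate geodesic distance to the far end of the arc follows the curve rather than cutting across it, which is the property a manifold learner such as Isomap relies on. The per-point cost depends only on the batch size and `k`, not on how many stream points have already been seen.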
Notes:
School code: 0656
Availability:
Call Number | Item Number | Shelf Location | Location / Status / Due Date |
---|---|---|---|
XX(682047.1) | 682047-1001 | ProQuest E-Thesis Collection | |