Algorithms and Algorithmic Obstacles for Probabilistic Combinatorial Structures

Select an Action

Place Hold(s)
Add to My Lists
Email
Print

Title:

Algorithms and Algorithmic Obstacles for Probabilistic Combinatorial Structures

Author:

Li, Quan, author.

Personal Author:

Li, Quan, author.

General Note:

Source: Dissertation Abstracts International, Volume: 79-10(E), Section: B.

Advisors: David Gamarnik.

Abstract:

We study efficient average-case (approximation) algorithms for combinatorial optimization problems, as well as explore the algorithmic obstacles for a variety of discrete optimization problems arising in the theory of random graphs, statistics and machine learning. In particular, we consider the average-case optimization for three NP-hard combinatorial optimization problems: Large Submatrix Selection, Maximum Cut (Max-Cut) of a graph and Matrix Completion.

The Large Submatrix Selection problem is to find a k x k submatrix of an n x n matrix with i.i.d. standard Gaussian entries, which has the largest average entry. It was shown in [13] using non-constructive methods that the largest average value of a k x k submatrix is 2(1 + o(1)) logn/k with high probability (w.h.p.) when k = O (log n/ log log n). We show that a natural greedy algorithm called Largest Average Submatrix LAS produces a submatrix with average value (1+o(1)) 2logn/k w.h.p. when k is constant and n grows, namely approximately 2 smaller. Then by drawing an analogy with the problem of finding cliques in random graphs, we propose a simple greedy algorithm which produces a k x k matrix with asymptotically the same average value (1+o(1)) 2logn/k w.h.p., for k = o(log n ). Since the maximum clique problem is a special case of the largest submatrix problem and the greedy algorithm is the best known algorithm for finding cliques in random graphs, it is tempting to believe that beating the factor 2 performance gap suffered by both algorithms might be very challenging. Surprisingly, we show the existence of a very simple algorithm which produces a k x k matrix with average value (1 + ok(1) + o(1))(4/3) 2logn/k for k = o((log n) 1.5), that is, with asymptotic factor 4/3 when k grows. To get an insight into the algorithmic hardness of this problem, and motivated by methods originating in the theory of spin glasses, we conduct the so-called expected overlap analysis of matrices with average value asymptotically (1 + o(1))alpha 2logn/k , for a fixed value alpha epsilon [1, 2 ]. The overlap corresponds to the number of common rows and common columns for pairs of matrices achieving this value. We discover numerically an intriguing phase transition at alpha* &trie; = 5 2 (3 3 ) ≈ 1.3608.. epsilon [4/3, 2 ]: when alpha < alpha* the space of overlaps is a continuous subset of [0,1]2, whereas alpha = alpha* marks the onset of discontinuity, and as a result the model exhibits the Overlap Gap Property (OGP) when alpha > alpha*, appropriately defined. We conjecture that OGP observed for alpha > alpha* also marks the onset of the algorithmic hardness - no polynomial time algorithm exists for finding matrices with average value at least (1+o(1))alpha 2logn/k , when alpha > alpha* and k is a growing function of n.

Finding a maximum cut of a graph is a well-known canonical NP-hard problem. We consider the problem of estimating the size of a maximum cut in a random Erdo&huml;s-Renyi graph on n nodes and [cn ] edges. We establish that the size of the maximum cut normalized by the number of nodes belongs to the interval [c/2 + 0.47523, c c /2 + 0.559093 c ] w.h.p. as n increases, for all sufficiently large c. We observe that every maximum size cut satisfies a certain local optimality property, and we compute the expected number of cuts with a given value satisfying this local optimality property. Estimating this expectation amounts to solving a rather involved multi-dimensional large deviations problem. We solve this underlying large deviation problem asymptotically as c increases and use it to obtain an improved upper bound on the Max-Cut value. The lower bound is obtained by application of the second moment method, coupled with the same local optimality constraint, and is shown to work up to the stated lower bound value c/2 + 0.47523 c . We also obtain an improved lower bound of 1.36000n on the Max-Cut for the random cubic graph or any cubic graph with large girth, improving the previous best bound of 1.33773n.

Matrix Completion is the problem of reconstructing a rank-k n x n matrix M from a sampling of its entries. We propose a new matrix completion algorithm using a novel sampling scheme based on a union of independent sparse random regular bipartite graphs. We show that under a certain incoherence assumption on Al and for the case when both the rank and the condition number of M are bounded, w.h.p. our algorithm recovers an epsilon-approximation of M in terms of the Frobenius norm using O(nlog 2(1/epsilon)) samples and in linear time O( nlog2(1/epsilon)). This provides the best known bounds both on the sample complexity and computational cost for reconstructing (approximately) an unknown low-rank matrix. The novelty of our algorithm is two new steps of thresholding singular values and resealing singular vectors in the application of the "vanilla" alternating minimization algorithm. The structure of sparse random regular graphs is used heavily for controlling the impact of these regularization steps. (Copies available exclusively from MIT Libraries, libraries.mit.edu/docs - docs mit.edu).

Local Note:

School code: 0753

Subject Term:

Computer science.

Statistics.

Applied mathematics.

Added Corporate Author:

Massachusetts Institute of Technology.

Electronic Access:

http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:10902456

Available:*

Shelf Number	Item Barcode	Shelf Location	Status
XX(687415.1)	687415-1001	Proquest E-Thesis Collection	Searching...

On Order

Select a list

Make this your default list.

The following items were successfully added.

There was an error while adding the following items. Please try again.