Large deviation behavior of the largest eigenvalue \(\lambda _1\) of Wigner matrices, including those arising from an Erdős-Rényi random graph \({\mathcal {G}}_{n,p}\) with i.i.d. random conductances on the edges, has been a topic of considerable interest. However, despite several recent advances, not much is known when the underlying graph is sparse, i.e., \(p\rightarrow 0\), except for the recent works (Bhattacharya et al., Ann Probab 49(4):1847–1885, 2021 and Bhattacharya and Ganguly, SIAM J Discret Math, 2020), which consider the simpler case of the graph without additional edge weights. Under sufficiently general conditions on the conductance distribution, one expects the ‘dense’ behavior as long as the average degree \(np\) is at least logarithmic in \(n\). In this article we focus on the case of constant average degree, i.e., \(p=\frac{d}{n}\) for some fixed \(d>0\), with standard Gaussian weights. Results in Bandeira and Van Handel (Ann Probab 44(4):2479–2506, 2016) about general non-homogeneous Gaussian matrices imply that in this regime \(\lambda _1\) scales like \(\sqrt{\log n}\). We prove the following results towards a precise understanding of the large deviation behavior in this setting.

1. (Upper tail probabilities and structure theorem): For \(\delta >0\), we pin down the exact exponent \(\psi (\delta )\) such that
$$\begin{aligned} {\mathbb {P}}(\lambda _1\ge \sqrt{2(1+\delta )\log n})=n^{-\psi (\delta )+o(1)}. \end{aligned}$$
Further, we show that conditioned on the upper tail event, with high probability, a unique maximal clique emerges with a very precise \(\delta \)-dependent size (taking one of at most two possible values), and the Gaussian weights are uniformly high in absolute value on the edges in the clique. Finally, we also prove an optimal localization result for the leading eigenvector, showing that it places most of its mass on the aforementioned clique, spread uniformly across the clique's vertices.

2. (Lower tail probabilities): The exact stretched exponential behavior
$$\begin{aligned} {\mathbb {P}}(\lambda _1\le \sqrt{2(1-\delta )\log n})=\exp \left( -n^{\ell (\delta )+o(1)}\right) \end{aligned}$$
is also established.

As an immediate corollary, one obtains that \(\lambda _1\) is typically \((1+o(1))\sqrt{2\log n}\), a result that, surprisingly, appears to be new. A key ingredient in our proofs is an extremal spectral theory for weighted graphs, obtained by an \(\ell _1\)-reduction of the standard \(\ell _2\)-variational formulation of the largest eigenvalue via the classical Motzkin-Straus theorem [37], which could be of independent interest.
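
As a purely illustrative numerical sketch (not part of the results or proofs above), one can sample the Gaussian-weighted adjacency matrix of \({\mathcal {G}}_{n,d/n}\) and compare \(\lambda _1\) with \(\sqrt{2\log n}\). The function names, the seed, and the values \(n=400\), \(d=3\) are assumptions chosen only for illustration. The only rigorous fact checked here is the elementary bound that \(\lambda _1\) is at least the largest absolute edge weight, which follows from Cauchy interlacing applied to a \(2\times 2\) principal submatrix; the asymptotic value \(\sqrt{2\log n}\) need not be closely matched at such small \(n\).

```python
import numpy as np


def sparse_gaussian_matrix(n, d, rng):
    """Symmetric weighted adjacency matrix of G(n, d/n) with N(0, 1) edge weights.

    Hypothetical helper for illustration; the diagonal is zero (no self-loops).
    """
    # Independent edge indicators and Gaussian weights, strictly above the diagonal.
    edges = np.triu(rng.random((n, n)) < d / n, k=1)
    weights = rng.standard_normal((n, n))
    upper = np.where(edges, weights, 0.0)
    # Symmetrize so that entry (i, j) equals entry (j, i).
    return upper + upper.T


def largest_eigenvalue(a):
    """Largest eigenvalue of a real symmetric matrix (eigvalsh sorts ascending)."""
    return float(np.linalg.eigvalsh(a)[-1])


rng = np.random.default_rng(0)  # fixed seed, assumed only for reproducibility
n, d = 400, 3.0                 # assumed illustrative parameters
a = sparse_gaussian_matrix(n, d, rng)

lam1 = largest_eigenvalue(a)
max_weight = float(np.abs(a).max())   # trivial lower bound on lambda_1
reference = float(np.sqrt(2 * np.log(n)))

print(f"lambda_1        = {lam1:.4f}")
print(f"max |weight|    = {max_weight:.4f}")
print(f"sqrt(2 log n)   = {reference:.4f}")
```

Repeating the experiment over several seeds and increasing values of `n` gives a rough empirical picture of how \(\lambda _1\) relates to \(\sqrt{2\log n}\); dense eigenvalue routines limit this sketch to moderate `n`.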