Three algorithms are presented and compared for the solution of the steady Euler equations on unstructured triangular grids. All are variations on Newton's method (one quasi-Newton and two full-Newton schemes) and employ the BILU(n)-preconditioned generalized minimum residual (GMRES) algorithm to solve the Jacobian matrix problem that arises at each iteration. The quasi-Newton scheme uses a first-order approximation to the Jacobian matrix with the standard GMRES implementation, in which matrix-vector products are formed in the usual explicit manner. The full-Newton schemes are distinguished by the implementation of GMRES: one employs the standard GMRES algorithm, and the other is matrix free using Fréchet derivatives. The matrix-free, full-Newton algorithm is shown to be the fastest of the three algorithms. Optimal preconditioning, reordering, and storage strategies for the matrix-free, full-Newton algorithm are presented. Register and cache performance issues are briefly discussed.

[...] the tradeoff between speed and memory. Many authors have used the block ILU factorization with no additional fill, BILU(0), using the natural four-by-four block size for the two-dimensional Euler equations. Other ILU variants include ILU(n) and BILU(n), using scalar- and block-based storage schemes, respectively, which allow fill entries based on a level-of-fill parameter n, and ILUT(p, t), which employs a threshold drop tolerance t to introduce a maximum of p fill entries in each factor [19]. Possible reordering algorithms include reverse Cuthill-McKee, one-way dissection, nested dissection, and quotient minimum degree [20].

The goal of the present paper is to examine the tradeoffs inherent in the various Newton-GMRES implementations and to develop a fast, robust algorithm. In particular, we compare two full-Newton algorithms, using matrix-free and standard implementations of GMRES, and a quasi-Newton algorithm. The primary criterion is CPU usage. Results from an extensive parameter study of each solver are presented.
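The matrix-free idea mentioned above can be illustrated with a minimal sketch. GMRES needs only products of the Jacobian with a vector, and a Fréchet (directional) derivative approximates such a product by a finite difference of the residual, J(q) v ≈ [R(q + εv) − R(q)] / ε, so the Jacobian is never stored. Everything in the sketch is an assumption made for illustration: the three-equation `residual` is a hypothetical stand-in for a discretized flow residual, the step size ε is one common heuristic, and the unpreconditioned, unrestarted `gmres` is a textbook version rather than the BILU(n)-preconditioned solver of the paper.

```python
import numpy as np

def residual(q):
    """Hypothetical nonlinear residual R(q) with a root at (1, 2, 3)."""
    return np.array([
        q[0] ** 2 + q[1] - 3.0,
        q[0] + q[1] ** 2 - 5.0,
        q[2] ** 3 + q[0] * q[1] - 29.0,
    ])

def gmres(matvec, b, m=20, tol=1e-10):
    """Unrestarted GMRES from a zero initial guess, using only matvec(v)."""
    beta = np.linalg.norm(b)
    if beta == 0.0:
        return np.zeros_like(b)
    n = b.size
    m = min(m, n)
    V = np.zeros((n, m + 1))   # orthonormal Krylov basis
    H = np.zeros((m + 1, m))   # upper Hessenberg matrix
    V[:, 0] = b / beta
    for j in range(m):
        w = matvec(V[:, j])
        for i in range(j + 1):             # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        # Small least-squares problem: min || beta*e1 - H y ||
        e1 = np.zeros(j + 2)
        e1[0] = beta
        Hj = H[: j + 2, : j + 1]
        y, *_ = np.linalg.lstsq(Hj, e1, rcond=None)
        res = np.linalg.norm(e1 - Hj @ y)
        if res <= tol * beta or H[j + 1, j] < 1e-14:
            break
        V[:, j + 1] = w / H[j + 1, j]
    return V[:, : j + 1] @ y

def newton_gmres(residual, q0, newton_tol=1e-10, max_newton=20):
    """Newton iteration whose linear solves never form the Jacobian."""
    q = np.asarray(q0, dtype=float).copy()
    for _ in range(max_newton):
        r = residual(q)
        if np.linalg.norm(r) < newton_tol:
            break

        def jac_vec(v, q=q, r=r):
            # Frechet derivative: J(q) v ~ (R(q + eps*v) - R(q)) / eps
            nv = np.linalg.norm(v)
            if nv == 0.0:
                return np.zeros_like(v)
            # Step-size heuristic (an assumption, not taken from the paper)
            eps = np.sqrt(np.finfo(float).eps) * (1.0 + np.linalg.norm(q)) / nv
            return (residual(q + eps * v) - r) / eps

        q = q + gmres(jac_vec, -r)   # solve J dq = -R, then update
    return q

q_star = newton_gmres(residual, [1.5, 1.5, 2.5])
```

Each Krylov vector costs one extra residual evaluation in place of an explicit matrix-vector product, which is the source of the memory saving; a preconditioner, if used, still has to be built from some stored approximation to the Jacobian.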
Optimal strategies for the choice of GMRES Krylov subspace dimension and exit tolerance, BILU(n) fill level, reordering technique, and storage model are presented.
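To make the fill-level parameter concrete, the following sketch implements the zero-fill case in its simplest form: scalar ILU(0), which performs Gaussian elimination but discards every entry that would fall outside the nonzero pattern of the original matrix. It is an illustration under stated assumptions only. It uses scalar entries and dense storage for readability, whereas the paper works with four-by-four blocks, BILU(n), and sparse storage; the helper names `ilu0` and `laplacian_2d` are hypothetical, and the model matrix is a five-point Laplacian standing in for a flow Jacobian.

```python
import numpy as np

def ilu0(A):
    """Scalar ILU(0): elimination restricted to the nonzero pattern of A."""
    n = A.shape[0]
    LU = A.astype(float).copy()
    pattern = A != 0
    for i in range(1, n):
        for k in range(i):
            if not pattern[i, k]:
                continue                    # no entry to eliminate here
            LU[i, k] /= LU[k, k]            # multiplier, stored in L
            for j in range(k + 1, n):
                if pattern[i, j]:           # update only existing entries
                    LU[i, j] -= LU[i, k] * LU[k, j]
    L = np.tril(LU, -1) + np.eye(n)         # unit lower-triangular factor
    U = np.triu(LU)                         # upper-triangular factor
    return L, U

def laplacian_2d(nx):
    """Five-point Laplacian on an nx-by-nx grid (model problem)."""
    T = 2.0 * np.eye(nx) - np.eye(nx, k=1) - np.eye(nx, k=-1)
    I = np.eye(nx)
    return np.kron(I, T) + np.kron(T, I)

A = laplacian_2d(4)
L, U = ilu0(A)
pattern = A != 0
error = L @ U - A

# Inside the pattern the incomplete factors reproduce A;
# outside it, the discarded fill shows up as a nonzero error.
on_pattern_error = np.abs(error[pattern]).max()
off_pattern_error = np.abs(error[~pattern]).max()
```

Raising the fill level keeps more of the discarded entries, which moves the product of the factors closer to the matrix at the price of extra storage and factorization work. That is the speed-versus-memory tradeoff the parameter study addresses.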