Graph Partitioning Research Articles

Abstract. We discuss the various performance aspects of parallelizing our transient global-scale groundwater model at 30′′ resolution (30 arcsec; ∼ 1 km at the Equator) on large distributed memory parallel clusters. This model, referred to as GLOBGM, is the successor of our 5′ (5 arcmin; ∼ 10 km at the Equator) PCR-GLOBWB 2 (PCRaster Global Water Balance model) groundwater model, based on MODFLOW having two model layers. The current version of GLOBGM (v1.0) used in this study also has two model layers, is uncalibrated, and uses available 30′′ PCR-GLOBWB data. Increasing the model resolution from 5′ to 30′′ creates challenges, including increased runtime, memory usage, and data storage that exceed the capacity of a single computer. We show that our parallelization tackles these problems with relatively low parallel hardware requirements to meet the needs of users or modelers who do not have exclusive access to hundreds or thousands of nodes within a supercomputer. For our simulation, we use unstructured grids and a prototype version of MODFLOW 6 that we have parallelized using the message-passing interface. We construct independent unstructured grids with a total of 278 million active cells to cancel all redundant sea and land cells, while satisfying all necessary boundary conditions, and distribute them over three continental-scale groundwater models (168 million – Afro–Eurasia; 77 million – the Americas; 16 million – Australia) and one remaining model for the smaller islands (17 million). Each of the four groundwater models is partitioned into multiple non-overlapping submodels that are tightly coupled within the MODFLOW linear solver, where each submodel is uniquely assigned to one processor core, and associated submodel data are written in parallel during the pre-processing, using data tiles. 
For balancing the parallel workload in advance, we apply the widely used METIS graph partitioner in two ways: it is straightforwardly applied to all (lateral) model grid cells, and it is applied in an area-based manner to HydroBASINS catchments that are assigned to submodels for pre-sorting to a future coupling with surface water. We consider an experiment for simulating the years 1958–2015 with daily time steps and monthly input, including a 20-year spin-up, on the Dutch national supercomputer Snellius. Given that the serial simulation would require ∼ 4.5 months of runtime, we set a hypothetical target of a maximum of 16 h of simulation runtime. We show that 12 nodes (32 cores per node; 384 cores in total) are sufficient to achieve this target, resulting in a speedup of 138 for the largest Afro–Eurasia model when using 7 nodes (224 cores) in parallel. A limited evaluation of the model output using the United States Geological Survey (USGS) National Water Information System (NWIS) head observations for the contiguous United States was conducted. This showed that increasing the resolution from 5′ to 30′′ results in a significant improvement with GLOBGM for the steady-state simulation when compared to the 5′ PCR-GLOBWB groundwater model. However, results for the transient simulation are quite similar, and there is much room for improvement. Monthly and multi-year total terrestrial water storage anomalies derived from the GLOBGM and PCR-GLOBWB models, however, compared favorably with observations from the GRACE satellite. For the next versions of GLOBGM, further improvements require a more detailed (hydro)geological schematization and better information on the locations, depths, and pumping rates of abstraction wells.
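The scaling figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and the cells-per-core value assumes a perfectly even distribution of cells over cores, which the abstract does not state.

```python
# Active cells per model, in millions, as quoted in the abstract.
cells_millions = {
    "Afro-Eurasia": 168,
    "Americas": 77,
    "Australia": 16,
    "Islands": 17,
}
total_millions = sum(cells_millions.values())

# Afro-Eurasia run: 7 nodes with 32 cores each, reported speedup of 138.
nodes, cores_per_node, speedup = 7, 32, 138
cores = nodes * cores_per_node

# Parallel efficiency is the speedup divided by the number of cores.
efficiency = speedup / cores

# Assumption: cells are spread evenly over the cores (idealized balance).
cells_per_core = cells_millions["Afro-Eurasia"] * 1_000_000 / cores

print(f"total active cells: {total_millions} million")
print(f"cores: {cores}, parallel efficiency: {efficiency:.3f}")
print(f"idealized cells per core: {cells_per_core:,.0f}")
```

Under these assumptions the four models sum to the stated 278 million cells, and the reported speedup corresponds to a parallel efficiency of roughly 62% on 224 cores.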

Summary. Clustering ensemble is a popular approach for identifying data clusters that combines the clustering results from multiple base clustering algorithms to produce more accurate and robust data clusters. However, the performance of clustering ensemble algorithms is highly dependent on the quality of clustering members. To address this problem, this paper proposes a member enhancement‐based clustering ensemble (MECE) algorithm that selects the ensemble members by considering their distribution consistency. MECE has two main components, called heterocluster splitting and homocluster merging. The first component estimates two probability density functions (p.d.f.s) on the sample points of a heterocluster and represents them using a Gaussian distribution and a Gaussian mixture model. If the random numbers generated by these two p.d.f.s have different probability distributions, the heterocluster is split into smaller clusters. The second component merges the clusters that have high neighborhood densities into a homocluster, where the neighborhood density is measured using a novel evaluation criterion. In addition, a co‐association matrix is presented, which serves as a summary for the ensemble of diverse clusters. A series of experiments were conducted to evaluate the feasibility and effectiveness of the proposed ensemble member generation algorithm. Results show that the proposed MECE algorithm can select high‐quality ensemble members and, as a result, yields better clusterings than six state‐of‐the‐art ensemble clustering algorithms: cluster‐based similarity partitioning algorithm (CSPA), meta‐clustering algorithm (MCLA), hybrid bipartite graph formulation (HBGF), evidence accumulation clustering (EAC), locally weighted evidence accumulation (LWEA), and locally weighted graph partition (LWGP).
Specifically, the MECE algorithm achieves nearly 23% higher average NMI, 27% higher average ARI, 15% higher average FMI, and 10% higher average purity than the CSPA, MCLA, HBGF, EAC, LWEA, and LWGP algorithms. The experimental results demonstrate that the MECE algorithm is a valid approach to clustering ensemble problems.
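To make the idea of a co‐association matrix concrete, here is a minimal sketch of the standard evidence‐accumulation construction, in which entry (i, j) is the fraction of base clusterings that place samples i and j in the same cluster. This is an assumption about the general technique, not the specific matrix defined in the MECE paper; the function name and the toy labelings are hypothetical.

```python
def co_association(labelings):
    """Build an n x n co-association matrix from base clusterings.

    labelings: list of label sequences, one per base clustering,
    each assigning a cluster label to the same n samples.
    """
    n = len(labelings[0])
    if any(len(labels) != n for labels in labelings):
        raise ValueError("all labelings must cover the same samples")

    # Count how often each pair of samples shares a cluster label.
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1

    # Normalize by the number of base clusterings.
    m = len(labelings)
    return [[c / m for c in row] for row in counts]


# Three toy base clusterings of four samples (illustrative data only).
base = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
matrix = co_association(base)
for row in matrix:
    print(["%.2f" % v for v in row])
```

Label values only matter within a single clustering, so the third labeling agrees with the first even though its labels are swapped. A consensus step would then cluster the samples using this matrix as a similarity measure.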

Articles published on Graph Partitioning

Improved Selective Deep-Learning-Based Clustering Ensemble

GLOBGM v1.0: a parallel implementation of a 30 arcsec PCR-GLOBWB-MODFLOW global-scale groundwater model

Multiway Spectral Graph Partitioning: Cut Functions, Cheeger Inequalities, and a Simple Algorithm

A novel member enhancement‐based clustering ensemble algorithm

FPT approximation and subexponential algorithms for covering few or many edges

On degree conditions of semi-balanced k-partite Hamiltonian graphs

Deep Graph Reinforcement Learning for Solving Multicut Problem

GpDB: A Graph-partition Based Storage Strategy for DAG-Blockchain in Edge-cloud IIoT

The truncated variational model for image labeling and graph partitioning

Multi-Robot Stochastic Patrolling via Graph Partitioning

Exact solutions to the Erdős-Rothschild problem

Equitable cluster partition of graphs with small maximum average degree

Closure results for arbitrarily partitionable graphs

Topology-Based Clustering Techniques for Graph Partitioning Applied to the Italian Transmission Network

DeepMulticut: Deep Learning of Multicut Problem for Neuron Segmentation from Electron Microscopy Volume

Mobility-Aware MEC Planning With a GNN-Based Graph Partitioning Framework

An effective algorithm for genealogical graph partitioning

A Sharding Scheme Based on Graph Partitioning Algorithm for Public Blockchain

Graph Reconfigurable Pooling for Graph Representation Learning

ProvGRP: A Context-Aware Provenance Graph Reduction and Partition Approach for Facilitating Attack Investigation
