Given a large data graph, trimming techniques can reduce the search space by removing vertices without outgoing edges. One application is to speed up the parallel decomposition of graphs into strongly connected components (SCC decomposition), which is a fundamental step for analyzing graphs. We observe that graph trimming is essentially a kind of arc-consistency problem, and AC-3, AC-4, and AC-6 are the most relevant arc-consistency algorithms for application to graph trimming. The existing parallel graph trimming methods require worst-case \(\mathcal O(nm)\) time and worst-case \(\mathcal O(n)\) space for graphs with n vertices and m edges. We call these parallel AC-3-based as they are much like the AC-3 algorithm. In this work, we propose AC-4-based and AC-6-based trimming methods. That is, AC-4-based trimming has an improved worst-case time of \(\mathcal O(n+m)\) but requires worst-case space of \(\mathcal O(n+m)\); compared with AC-4-based trimming, AC-6-based has the same worst-case time of \(\mathcal O(n+m)\) but an improved worst-case space of \(\mathcal O(n)\). We parallelize the AC-4-based and AC-6-based algorithms to be suitable for shared-memory multi-core machines. The algorithms are designed to minimize synchronization overhead. For these algorithms, we also prove the correctness and analyze time complexities with the work-depth model. In experiments, we compare these three parallel trimming algorithms over a variety of real and synthetic graphs on a multi-core machine, where each core corresponds to a worker. Specifically, for the maximum number of traversed edges per worker by using 16 workers, AC-3-based traverses up to 58.3 and 36.5 times more edges than AC-6-based trimming and AC-4-based trimming, respectively. That is, AC-6-based trimming traverses much fewer edges than other methods, which is meaningful especially for implicit graphs. In particular, for the practical running time, AC-6-based trimming achieves high speedups over graphs with a large portion of trimmable vertices.