Abstract

Principal component analysis (PCA) and other multivariate analysis methods have been used increasingly to analyse and understand depth-profiles in XPS, AES and SIMS. For large images or three-dimensional (3D) imaging depth-profiles, PCA has been difficult to apply until now simply because of the size of the data matrices involved. In a recent paper, we described two algorithms, random vector 1 (RV1) and random vector 2 (RV2), that, respectively, improve the speed of PCA and allow datasets of unlimited size to be processed. In this paper, we apply the RV2 algorithm to perform PCA on full 3D time-of-flight SIMS data for the first time without subsampling. The dataset we process in this way is a 128 × 128 pixel depth-profile of 120 layers, each voxel having a mass spectrum of 70 439 values associated with it. This forms over a terabyte of data when uncompressed and took 27 h to process with the RV2 algorithm on a conventional Windows desktop personal computer (PC). While full PCA (e.g. using RV2) is to be preferred for final reports or publications, a much more rapid method is needed during analysis sessions to inform decisions on the next analytical step. We have therefore implemented the RV1 algorithm on a PC having a graphical processor unit (GPU) card containing 2880 individual processor cores. This increases the speed of calculation by a factor of around 4.1 compared with what is possible using a fast commercially available desktop PC having central processing units alone, and full PCA is performed in less than 7 s. The size of the dataset that can be processed in this way is limited by the size of the memory on the GPU card. This is typically sufficient for two-dimensional images but not for 3D depth-profiles unless they are subsampled. We have therefore examined efficient sampling schemes that allow a good approximate solution to the PCA problem for large 3D datasets. We find that sampling with low-discrepancy series, such as Sobol series, gives more rapid convergence than random sampling, and we recommend such methods for routine use. Using the GPU and low-discrepancy series together, we anticipate that any time-of-flight SIMS dataset, of whatever size, can be efficiently and accurately processed into PCA components in a maximum of around 10 s using a commercial PC with a widely available GPU card, although the longer RV2 approach is still to be preferred for the presentation of final results, such as in published papers.

Copyright © 2016 The Authors. Surface and Interface Analysis published by John Wiley & Sons Ltd.
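
The subsampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes SciPy's `scipy.stats.qmc.Sobol` generator as the low-discrepancy source, mean-centring as the only preprocessing, and a small synthetic cube with made-up spectra in place of real ToF-SIMS data. The helper names `sobol_voxel_sample` and `leading_loading` are hypothetical.

```python
import numpy as np
from scipy.stats import qmc


def sobol_voxel_sample(shape, m, seed=0):
    """Pick up to 2**m distinct voxel coordinates using a scrambled Sobol sequence."""
    dims = np.asarray(shape)
    sampler = qmc.Sobol(d=len(shape), scramble=True, seed=seed)
    points = sampler.random_base2(m)            # shape (2**m, len(shape)), values in [0, 1)
    coords = np.floor(points * dims).astype(int)
    coords = np.minimum(coords, dims - 1)       # guard against rounding at the upper edge
    return np.unique(coords, axis=0)            # drop voxels that were hit twice


def leading_loading(X):
    """First principal-component loading of X (rows = voxels, columns = channels)."""
    Xc = X - X.mean(axis=0)                     # ASSUMPTION: mean-centring as preprocessing
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[0]


# Small synthetic depth-profile (hypothetical sizes, far below the 128 x 128 x 120 case)
rng = np.random.default_rng(0)
shape, n_channels = (16, 16, 8), 50
spectra = rng.random((2, n_channels))           # two made-up "pure" spectra
depth = np.indices(shape)[2] / shape[2]         # composition changes with depth
cube = (depth[..., None] * 10.0 * spectra[0]
        + rng.normal(0.0, 0.3, shape)[..., None] * spectra[1]
        + rng.normal(0.0, 0.01, shape + (n_channels,)))

coords = sobol_voxel_sample(shape, m=8)
sample = cube[coords[:, 0], coords[:, 1], coords[:, 2]]   # (n_sampled, n_channels)
full = cube.reshape(-1, n_channels)

# |cosine| between the loading from the sample and the loading from every voxel
agreement = abs(leading_loading(sample) @ leading_loading(full))
print(f"{len(coords)} of {full.shape[0]} voxels sampled, agreement = {agreement:.4f}")
```

An agreement close to 1 means the subsample recovers nearly the same leading component as the full cube. Replacing the Sobol generator with uniform random draws in `sobol_voxel_sample` gives a simple way to compare the two sampling schemes on the same data.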

Highlights

  • Principal component analysis[1] (PCA) is a powerful tool for surface analysis data and has many applications

  • At the core of PCA software is singular value decomposition (SVD), a matrix algebra method for decomposing spectra into orthogonal components.[12,13] These methods have been difficult to apply to very large datasets, such as spectra associated with two-dimensional (2D) images or three-dimensional (3D) depth-profiles, because the dataset is too large to hold in the memory of commonly available personal computers (PCs)

  • We have demonstrated PCA on a large 3D ToF-SIMS dataset and shown how graphical processor unit (GPU) and low-discrepancy series (LDS) techniques may be used to obtain a rapid and accurate approximation to the PCA results from the full dataset
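
To make the memory problem and the role of SVD concrete, the sketch below first estimates the size of the full data matrix from the dimensions given in the abstract, then runs a textbook PCA through a thin SVD on a tiny made-up matrix. It is not the RV1 or RV2 algorithm. The 8 bytes per value and the mean-centring step are assumptions (the source states neither), and `pca_svd` is a hypothetical helper name. With 8-byte values the estimate comes to roughly 1.1 TB, consistent with the "over a terabyte" figure.

```python
import numpy as np

# Size of the full data matrix quoted in the abstract
n_voxels = 128 * 128 * 120            # pixels x pixels x layers
n_channels = 70_439                   # values in each voxel's mass spectrum
bytes_per_value = 8                   # ASSUMPTION: 64-bit storage, not stated in the source
size_tb = n_voxels * n_channels * bytes_per_value / 1e12
print(f"{n_voxels:,} voxels x {n_channels:,} channels = {size_tb:.2f} TB")


def pca_svd(X, n_components):
    """Textbook PCA of X (rows = voxels, columns = mass channels) via a thin SVD."""
    Xc = X - X.mean(axis=0)                        # ASSUMPTION: mean-centring only
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components]                   # orthonormal rows
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, loadings, explained


# Tiny made-up matrix that fits in memory easily
rng = np.random.default_rng(1)
X = rng.random((200, 30))
scores, loadings, explained = pca_svd(X, n_components=3)
print("fraction of variance captured:", explained.round(3))
```

A direct SVD like this needs the whole matrix in memory at once, which is exactly what fails at the terabyte scale and motivates the out-of-core and subsampling approaches described above.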

Summary

Published online in Wiley Online Library

Rapid multivariate analysis of 3D ToF-SIMS data: graphical processor units (GPUs) and low-discrepancy subsampling for large-scale principal component analysis.

Introduction
Requirements of surface analysis
The need for sampling
Findings
Conclusions