Chemical constituents of green coffee beans, including lipid and protein, are important indicators of the final quality of coffee products, but they are usually determined by time-consuming and destructive chemical methods. This study therefore explored near-infrared (NIR) spectroscopy combined with partial least squares (PLS) regression as a fast and reliable method for determining lipid and protein in green coffee beans from different origins. Orthogonal signal correction (OSC) and several traditional spectral pretreatment methods were compared during PLS model building. Important variables were then selected based on the regression coefficients (β). The results showed that first- and second-derivative pretreatment reduced model quality, whereas OSC, multiplicative scatter correction (MSC), and standard normal variate (SNV) pretreatment enhanced it. Model quality improved significantly after important-variable selection. In particular, the OSC-PLS models were the most robust for protein and lipid prediction, with the best performance indicators (R2p > 0.982, RPD > 7.55, RMSEcv < 0.101, RMSEP < 0.106). This excellent performance indicates that NIR spectroscopy together with PLS regression could serve as an alternative method for determining the protein and lipid content of green coffee beans.
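To make the SNV pretreatment and the reported prediction indicators concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common definitions: SNV centres and scales each spectrum by its own mean and standard deviation, RMSEP is the root mean square of the prediction residuals, and RPD is the standard deviation of the reference values divided by RMSEP. The function names `snv` and `prediction_metrics`, the use of the sample standard deviation (`ddof=1`), and the numbers in the usage lines are illustrative assumptions and are not taken from the study.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) by its own mean and SD."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    # Assumption: sample standard deviation (ddof=1) across wavelengths.
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def prediction_metrics(y_ref, y_pred):
    """Return R2p, RMSEP and RPD for a prediction set (assumed common definitions)."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_ref - y_pred
    rmsep = np.sqrt(np.mean(residuals ** 2))
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_ref - y_ref.mean()) ** 2)
    r2p = 1.0 - ss_res / ss_tot
    # Assumption: RPD = SD of reference values / RMSEP.
    rpd = y_ref.std(ddof=1) / rmsep
    return {"R2p": r2p, "RMSEP": rmsep, "RPD": rpd}


if __name__ == "__main__":
    # Made-up spectra: the second row is the first multiplied by 2,
    # so both rows become identical after SNV.
    print(snv([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]))
    # Made-up reference and predicted contents, for illustration only.
    print(prediction_metrics([10, 12, 14, 16], [10.5, 11.5, 14.5, 15.5]))
```

Under these definitions, a larger RPD means the prediction error is small relative to the spread of the reference values, which is why it is reported alongside RMSEP.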