Despite recent breakthroughs in implicit generative modeling, evaluating these models remains challenging. No single metric captures overall performance, and existing metrics offer only partial information. This issue is further compounded for unintuitive data types such as time series, where manual inspection is infeasible. This deficiency hinders the confident application of modern implicit generative models to time series data. To alleviate this problem, we propose two new metrics, the InceptionTime Score (ITS) and the Fréchet InceptionTime Distance (FITD), to assess the quality of class-conditional generative models on time series data. We conduct extensive experiments on 80 different datasets to study the discriminative capabilities of the proposed metrics alongside two existing evaluation metrics: Train on Synthetic Test on Real (TSTR) and Train on Real Test on Synthetic (TRTS). Our evaluations reveal that the proposed metrics, ITS and FITD, in combination with TSTR can accurately assess class-conditional generative model performance and detect common issues in implicit generative models. Our findings suggest that the proposed evaluation framework can be a valuable tool for confidently applying modern implicit generative models in time series analysis.
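The abstract does not define ITS and FITD formally, but their names suggest they mirror the standard Inception Score and Fréchet Inception Distance, with a pretrained InceptionTime classifier replacing Inception-v3 as the feature extractor. The following is a minimal sketch under that assumption; the function names and the use of softmax outputs and penultimate-layer features are illustrative choices, not the authors' confirmed implementation.

```python
import numpy as np
from scipy.linalg import sqrtm

def inception_time_score(probs, eps=1e-12):
    """IS-style score from class probabilities p(y|x) produced by a
    pretrained InceptionTime classifier on generated series.

    probs: (N, C) array of softmax outputs, one row per generated sample.
    Returns exp(E_x[KL(p(y|x) || p(y))]); higher is better.
    """
    marginal = probs.mean(axis=0, keepdims=True)  # p(y), averaged over samples
    kl = (probs * (np.log(probs + eps) - np.log(marginal + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))

def frechet_inception_time_distance(feats_real, feats_gen):
    """FID-style distance between real and generated feature distributions.

    feats_real, feats_gen: (N, D) arrays of penultimate-layer InceptionTime
    activations. Returns ||mu_r - mu_g||^2 + Tr(S_r + S_g - 2(S_r S_g)^{1/2});
    lower is better.
    """
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)
    covmean = sqrtm(cov_r @ cov_g)        # matrix square root of the product
    if np.iscomplexobj(covmean):          # discard small numerical imaginary parts
        covmean = covmean.real
    return float(np.sum((mu_r - mu_g) ** 2)
                 + np.trace(cov_r + cov_g - 2.0 * covmean))
```

Both quantities are estimated from finite samples, so in practice they are typically averaged over several evaluation batches to reduce variance.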