Genome sequencing has enabled the generation of genomic and high-throughput post-genomic data. The availability of large amounts of such data has, in turn, led to the development of methods for inferring protein roles. Some of these methods can combine heterogeneous data that vary in quality and in how informative they are. However, only limited research has been devoted to identifying which data are relevant for inferring protein roles. In this study, we identified relevant subsets of data within the framework of a kernel method, kernel canonical correlation analysis (KCCA), used to predict the role of a bacterial protein. We carried out a sensitivity analysis based on a fractional factorial design to study how the prediction of a protein role is influenced by the different microarray experiments and by the bacterial orders (groups of families) used to construct the phylogenetic profiles. The results of this analysis proved useful for interpreting biological predictions, because they highlight specific data that should be investigated. The approach is not restricted to KCCA, to the organism, or to the data used here.