Observers effortlessly extract global patterns of spatiotemporal features from multiple objects in dynamic scenes (e.g., the movement of a crowd). Studies have shown that a variety of spatial features are efficiently integrated into ensemble representations (e.g., the mean), even when displayed in temporally complex contexts. However, it is not yet clear whether ensemble coding also applies to purely temporal features such as frequency. Here, we investigated whether the visual system can extract the mean frequency from a group of flickering objects, and how many objects are integrated into this ensemble representation of temporal frequency. In Experiment 1, the display contained four disks flickering in parallel at independent frequencies for 4 seconds. Participants reported whether the mean frequency of the display was higher or lower than the frequency of a subsequent single test flicker. Temporal frequencies for the display disks were drawn randomly from one of three distributions (positively skewed, symmetric, or negatively skewed) within a range of 0.5 to 12 Hz. Points of subjective equality (PSEs) for the estimated means shifted significantly in accordance with the means of the three distributions. In Experiment 2, we displayed either a full group of eight disks or a subset of them (1, 2, or 4 disks), and participants compared the mean frequency of the displayed disks with the frequency of the subsequent test flicker. We plotted the data as a function of the relative distance between the mean of all eight frequencies and the test flicker frequency, and observed a trend for participants' sensitivity to improve as the number of displayed disks increased, at least up to four disks. However, this effect was not very large, suggesting that participants could integrate multiple temporal frequencies and extract their mean, but that the efficiency of integration was not as high as for other low-level visual features such as size and orientation.

Meeting abstract presented at VSS 2017