Lagrangian Motion Fields for Long-Term Motion Generation
Long-term motion generation is a challenging task that requires producing coherent and realistic sequences over extended durations. Current methods primarily rely on framewise motion representations, which capture only static spatial details and overlook temporal dynamics. This approach leads to significant redundancy across the temporal dimension, complicating the generation of effective long-term motion. To overcome these limitations, we introduce the novel concept of Lagrangian Motion Fields, specifically designed for long-term motion generation. By treating each joint as a Lagrangian particle with uniform velocity over short intervals, our approach condenses motion representations into a series of "supermotions" (analogous to superpixels). This method seamlessly integrates static spatial information with interpretable temporal dynamics, transcending the limitations of existing network architectures and motion sequence content types. Our solution is versatile and lightweight, eliminating the need for neural network preprocessing. Our approach excels in tasks such as long-term music-to-dance generation and text-to-motion generation, offering enhanced efficiency, superior generation quality, and greater diversity compared to existing methods. Additionally, the adaptability of Lagrangian Motion Fields extends to applications like infinite motion looping and fine-grained controlled motion generation, highlighting its broad utility.
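The condensation idea in this abstract (each joint treated as a Lagrangian particle with uniform velocity over a short interval, yielding "supermotions") can be sketched as follows. The function names, the window length, and the simple endpoint-difference velocity fit are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def to_supermotions(positions, window=8, fps=30):
    """Condense framewise joint positions (T, J, 3) into 'supermotions':
    per-window (anchor pose, constant velocity, duration) triples, treating
    each joint as a particle with uniform velocity over the window."""
    T = positions.shape[0]
    supermotions = []
    for s in range(0, T - 1, window):
        e = min(s + window, T - 1)
        dt = (e - s) / fps
        anchor = positions[s]                           # (J, 3) start pose
        velocity = (positions[e] - positions[s]) / dt   # (J, 3) uniform velocity
        supermotions.append((anchor, velocity, dt))
    return supermotions

def reconstruct(supermotions, fps=30):
    """Linearly re-expand supermotions back into framewise positions."""
    frames = []
    for anchor, velocity, dt in supermotions:
        n = int(round(dt * fps))
        for i in range(n):
            frames.append(anchor + velocity * (i / fps))
    return np.stack(frames)
```

For motion that really is piecewise-linear over each window, the round trip is lossless; for real motion the window length trades compression against temporal detail.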
- Research Article
- 44
- 10.1016/j.patcog.2020.107293
- Feb 21, 2020
- Pattern Recognition
Learning shape and motion representations for view invariant skeleton-based action recognition
- Conference Article
- 5
- 10.1109/humanoids.2015.7363549
- Nov 1, 2015
This paper introduces an approach to generate ground-collision-free gait motion by learning a statistical model of walking motion, and applies an assist-as-needed (AAN) training scheme within the learned statistical model, which is efficient for robotic gait rehabilitation. The method utilizes a nonlinear dimensionality reduction technique, based on Gaussian processes, to construct the model from gait motion data obtained from several dozen healthy subjects. The model is a common, low-dimensional representation of walking motion, averaged in a statistical sense. Using the model, it is possible to generate a ground-collision-free gait trajectory at an arbitrary walking speed for a subject on the gait rehabilitation robot, and to apply the AAN training paradigm around the generated motion. We simulate the framework of motion learning and generation with gait data from 50 healthy subjects, who walked on a motorized treadmill at 3 different speeds.
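The abstract's pipeline — average many gait cycles into a statistical model, then generate a trajectory at an arbitrary walking speed — can be sketched minimally. The paper uses a Gaussian-process-based nonlinear dimensionality reduction; the plain mean-cycle model and time-resampling below are simplified stand-ins, and all names are illustrative:

```python
import numpy as np

def mean_gait_model(gait_cycles):
    """Build a simple statistical walking model by averaging many
    time-normalized gait cycles (subjects x samples x joints) into one
    mean cycle. A linear stand-in for the paper's GP-based model."""
    return gait_cycles.mean(axis=0)   # (samples, joints) mean cycle

def generate_at_speed(mean_cycle, speed_scale):
    """Generate a gait trajectory at an arbitrary walking speed by
    resampling the mean cycle in time (faster walking -> shorter cycle)."""
    n = mean_cycle.shape[0]
    m = max(2, int(round(n / speed_scale)))   # new cycle length
    t_src = np.linspace(0.0, 1.0, n)
    t_dst = np.linspace(0.0, 1.0, m)
    return np.stack([np.interp(t_dst, t_src, mean_cycle[:, j])
                     for j in range(mean_cycle.shape[1])], axis=1)
```

The AAN scheme described in the abstract would then apply corrective forces around the generated trajectory rather than toward a fixed reference.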
- Preprint Article
- 10.5194/egusphere-egu25-4538
- Mar 18, 2025
The polarimetric phase optimization has been effectively incorporated into multi-temporal synthetic aperture radar interferometry (MT-InSAR) to improve phase estimation quality and extend deformation monitoring coverage. This technique, commonly called multi-temporal polarimetric InSAR (MT-PolInSAR), has shown great potential in enhancing interferometric measurements for various geophysical applications, including deformation monitoring and disaster assessment. However, most existing MT-PolInSAR methods optimize phase independently along the temporal and polarimetric dimensions, which neglects the potential synergies between these two aspects. As a result, the polarimetric and temporal information is not fully utilized for phase optimization, leading to suboptimal results and reducing the effectiveness of deformation analysis in complex scenarios such as landslides, subsidence, and fault movement. To address these limitations, this study proposes a novel multi-polarization optimization method that achieves one-step phase optimization by jointly considering the temporal and polarimetric dimensions. The proposed method is based on a joint probability density function of the multi-polarization covariance matrix and a maximum likelihood estimation method, which together enable a more comprehensive optimization of phase information by leveraging the inherent relationships between the temporal and polarimetric dimensions. Unlike traditional methods that treat these dimensions independently, the proposed approach effectively combines the strengths of both dimensions to achieve superior phase quality. Additionally, a no-threshold regularization technique is employed in this method to enhance the stability of the multi-polarization covariance matrix. This regularization eliminates the need for manual thresholding based on an analytical solution, avoiding reliance on empirical threshold values.
This approach significantly enhances the reliability and consistency of the optimization process, especially in scenarios with high noise levels or challenging scattering conditions. The effectiveness of the proposed approach has been validated using both synthetic and real quad-polarization datasets. Synthetic data experiments were conducted to evaluate the method's ability to handle varying noise levels and scattering mechanisms. For real data validation, two datasets were utilized: ALOS-2/PALSAR-2 data from the Fengjie landslide region in China and Radarsat-2 data from the Barcelona airport in Spain. These datasets cover diverse scenarios with different levels of complexity and provide an excellent testbed for assessing the performance of the proposed method. The experimental results demonstrate that the proposed approach significantly reduces phase noise compared to traditional MT-PolInSAR methods, leading to a more accurate representation of deformation signals. Furthermore, the method achieves a notable increase in the density of measurement points, which is crucial for applications requiring high spatial resolution and coverage. In the case of the Barcelona airport, the proposed approach successfully identified subtle deformation patterns that were otherwise obscured by noise in traditional methods. Similarly, in the Fengjie landslide dataset, the method provided a clearer and more detailed phase distribution, which could enhance the monitoring of landslides.
- Research Article
- 82
- 10.1016/j.rse.2020.111942
- Jun 18, 2020
- Remote Sensing of Environment
Satellite-observed night-time light in urban areas has been widely used as an indicator for socioeconomic development and light pollution. To date, the diurnal dynamics of city light during the night, which are important to understand the nature of human activity and the underlying variables explaining night-time brightness, have hardly been investigated by remote sensing techniques due to limitations in the revisit time and spatial resolution of available satellites. In this study, we employed a consumer-grade unmanned aerial vehicle (UAV) to monitor city light in a study area located in Wuhan City, China, from 8:08 PM, April 15, 2019 to 5:08 AM, April 16, 2019, with an hourly temporal resolution. By using three ground-based Sky Quality Meters (SQMs), we found that the UAV-recorded light brightness was consistent with the ground luminous intensity measured by the SQMs in both the spatial (R2 = 0.72) and temporal dimensions (R2 > 0.94), and that the average city light brightness was consistent with the sky brightness in the temporal dimension (R2 = 0.98), indicating that UAV images can reliably monitor the city's night-time brightness. The temporal analysis showed that different locations had different patterns of temporal changes in their night-time brightness, implying that inter-calibration of two kinds of satellite images with different overpass times would be a challenge. Combining an urban function map of 18 classes and the hourly UAV images, we found that urban functions differed in their temporal light dynamics. For example, the outdoor sports field lost 97.28% of its measured brightness between 8:08 PM and 4:05 AM, while an administrative building lost only 4.56%, and the entire study area lost 61.86% of its total brightness. Within our study area, the period between 9:06 PM and 10:05 PM saw the largest amount of light loss.
The spectral analysis we conducted showed that city light colors differed across some urban functions, with the major road being the reddest region at 8:08 PM and becoming even redder at 4:05 AM. This preliminary study indicates that UAVs are a good tool to investigate city light at night, and that city light is very complex in both the temporal and spatial dimensions, requiring comprehensive investigation using more advanced UAV techniques, and emphasizing the need for geostationary platforms for night-time light sensors.
- Research Article
- 5
- 10.1088/0031-8949/55/3/005
- Mar 1, 1997
- Physica Scripta
Using the recent formulation of Gell-Mann and Hartle for approximating quantum dynamical phenomena by means of classical equations, we simulate electron motions in ground state H2+, in ground state H2, and in the first excited state of H2. The approach develops approximate initial data first by mathematical bisection. The dynamical calculations are then carried out over short time intervals only, which is consistent with the Gell-Mann and Hartle theory and which is applicable because the phenomena to be studied are periodic. An energy conserving numerical scheme is used so that the energy of a given system will be a numerical invariant. Graphical representations of the electron motions readily indicate electron distributions, or clouds, over various time intervals.
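The abstract does not name its energy-conserving scheme, so as an illustration only, here is velocity Verlet, a standard symplectic integrator whose energy error stays bounded over periodic orbits rather than drifting; the harmonic-oscillator usage below is likewise just a stand-in for the molecular dynamics described:

```python
import numpy as np

def velocity_verlet(x0, v0, accel, dt, steps):
    """Symplectic velocity-Verlet integration: positions and velocities are
    advanced so that the discrete energy stays near-invariant over many
    periods (no secular drift, unlike forward Euler)."""
    x, v = x0, v0
    xs = [x]
    a = accel(x)
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update
        a_new = accel(x)                     # force at new position
        v = v + 0.5 * (a + a_new) * dt       # averaged velocity update
        a = a_new
        xs.append(x)
    return np.array(xs), v
```

For a unit-mass harmonic oscillator (accel = -x, energy E = v^2/2 + x^2/2), ten periods of integration leave E within a small, bounded distance of its initial value.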
- Research Article
- 8
- 10.1016/j.knosys.2024.111852
- May 9, 2024
- Knowledge-Based Systems
CANet: Comprehensive Attention Network for video-based action recognition
- Research Article
- 206
- 10.1109/tmm.2017.2666540
- Feb 10, 2017
- IEEE Transactions on Multimedia
Learning the spatial-temporal representation of motion information is crucial to human action recognition. Nevertheless, most of the existing features or descriptors cannot capture motion information effectively, especially for long-term motion. To address this problem, this paper proposes a long-term motion descriptor called sequential Deep Trajectory Descriptor (sDTD). Specifically, we project dense trajectories into two-dimensional planes, and subsequently a CNN-RNN network is employed to learn an effective representation for long-term motion. Unlike the popular two-stream ConvNets, the sDTD stream is introduced into a three-stream framework so as to identify actions from a video sequence. Consequently, this three-stream framework can simultaneously capture static spatial features, short-term motion and long-term motion in the video. Extensive experiments were conducted on three challenging datasets: KTH, HMDB51 and UCF101. Experimental results show that our method achieves state-of-the-art performance on the KTH and UCF101 datasets, and is comparable to the state-of-the-art methods on the HMDB51 dataset.
- Research Article
- 97
- 10.1152/jn.1991.65.3.511
- Mar 1, 1991
- Journal of Neurophysiology
1. Intracortical microstimulation (ICMS) and surface stimulation studies of primate face motor cortex have shown an extensive representation within face motor cortex devoted to movements of the tongue and face; only a very small representation for jaw-closing movements has ever been demonstrated. These data suggest that face motor cortex plays a critical role in the generation of tongue and facial movements but is less important in the generation of jaw-closing movements. Our aim was to determine whether disruption of primate face motor cortical function would indeed interfere with the generation of tongue movements but would not interfere with the generation of jaw-closing movements. 2. The face motor cortex was reversibly inactivated with the use of cooling in two monkeys that were trained to perform both a tongue-protrusion task and a biting task. Recording of single neuronal activity in the cortex beneath the thermode confirmed the reversible inactivation of the cortex. Each task involved a series of trials in which the monkey was required to produce a preset force level for a 0.5-s force holding period; the monkey received a fruit-juice reward if it successfully completed a task trial. Cooling of the ICMS-defined face motor cortex was achieved bilaterally or, in one experiment, unilaterally by circulating coolant through thermodes placed either on intact dura overlying face motor cortex in both monkeys or directly on the exposed pia in one of the monkeys; thermode temperature was lowered to 3-5 degrees C during cooling. Electromyographic (EMG) recordings were also made from masseter, genioglossus, and digastric muscles. 3. During bilateral cooling of the thermodes on the dura overlying the face motor cortex, there was a significant reduction in the success rates for the performance of the tongue-protrusion task in comparison with control series of trials (i.e., precool and postcool) in which the thermodes were kept at 37 degrees C.
Quantitative analyses of force and EMG activity showed that the principal deficit was an inability of each monkey to exert sufficient force with its tongue for a sufficient length of time onto the tongue-protrusion task transducer; this deficit was paralleled by a reduction in the level of genioglossus and digastric EMG activity. At 4 min after commencement of rewarming, task performance had returned to control, precool levels. (ABSTRACT TRUNCATED AT 400 WORDS)
- Research Article
- 10.1109/access.2021.3101823
- Jan 1, 2021
- IEEE Access
Videos are full of dynamic changes along both the spatial and temporal dimensions. Large, jerky short-term motions make it difficult to extract significant changes from videos such as subtle color changes and long-term motions occurring in time-lapse sequences. In this paper, we introduce two singular value decomposition (SVD)-based video decomposition schemes to clearly reveal such changes. The first scheme involves enhancing the visual characteristics of small subtle color changes in the presence of a wide variety of motion patterns by magnifying their pixel intensities. The second scheme removes short-term motions that visually distract attention from the underlying content of video sequences such as time-lapse videos, snowing scene, and maritime surveillance. Both schemes involve the decomposition of videos into spatiotemporal slices in which each slice is further decomposed into several singular components. The low-rank components that primarily represent background and color intensity information are then temporally processed to magnify the magnitude of the signal at the subtle color change target frequency. At the same time, an approach similar to that used in denoising time-lapse sequences is applied to temporally filter the singular components representing sparse information, thereby removing jittery short-term motions while preserving long-term motions, which are represented by both low-rank and unfiltered sparse components. We demonstrate promising color magnification and motion denoising results that can be obtained much faster than results estimated using state-of-the-art techniques.
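The core decomposition step described above — splitting a spatiotemporal slice into low-rank components (background, slow color changes) and the remaining sparse components (short-term motion) — can be sketched with a truncated SVD. This is only the decomposition idea; the paper's full pipeline adds temporal filtering and magnification on top, and the function name and rank choice here are illustrative:

```python
import numpy as np

def svd_split(slice2d, rank=1):
    """Split a spatiotemporal slice (pixels x frames) into a low-rank part
    (background / slow intensity changes) and a residual part (short-term
    motion and noise) via truncated SVD. The two parts sum exactly back
    to the input slice."""
    U, s, Vt = np.linalg.svd(slice2d, full_matrices=False)
    low_rank = U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]
    residual = slice2d - low_rank
    return low_rank, residual
```

In the magnification scheme, one would temporally band-pass and amplify the low-rank part; in the denoising scheme, one would temporally filter the residual to suppress jittery short-term motion before recombining.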
- Research Article
- 10.1145/2964797.2964818
- Jun 27, 2016
- ACM SIGIR Forum
Since time is an omnipresent feature of our existence, many elements of time are embedded in information itself, and related behaviours such as creation, seeking and utilisation. In IR, time can distinguish the interpretation of information, and influence the intentions and expectations of users' information seeking activity. Many time-based patterns and trends - namely temporal dynamics - are evident in streams of information behaviour by individuals and crowds. A temporal dynamic refers to a periodic regularity, or, a one-off or irregular past, present or future of a particular element (e.g., word, topic or query popularity) - driven by predictable and unpredictable time-based events and phenomena. Several challenges and opportunities related to temporal dynamics emerge in IR. This thesis explores temporal dynamics from the perspective of (i) query popularity and meaning, and (ii) word use and relationships over time. In particular, I consider how real-time temporal dynamics in information seeking should be supported for consistent user satisfaction over time, and moreover, how previously observed temporal dynamics offer a complementary dimension which can be exploited to inform more effective IR systems. Uncertainty about user expectations is a perennial problem for IR systems, further confounded by changes over time. Addressing this, IR systems can either assist the user to submit an effective query (e.g., error-free and descriptive), or better anticipate what the user is most likely to want in relevance ranking. I first explore methods to always help users formulate queries with time-aware query auto-completion capable of suggesting both recent and always popular queries. I propose and evaluate several novel approaches, and demonstrate state-of-the-art performance of up to +9.2% improvement above existing baselines for diverse search scenarios in different languages. 
Furthermore, I explore the impact of temporal dynamics on the motives behind users' information seeking, and thus how relevance itself is subject to temporal dynamics. I find the most likely meaning of ambiguous queries is affected over short and long-term periods (e.g., hours to months) by several periodic and one-off event-driven temporal dynamics. Finally, I find that for many event-driven multi-faceted queries, relevance can often be inferred by modelling the temporal dynamics of changes in related information. IR approaches are typically based on methods which characterize the nature of information through the statistical distributions of words and phrases. I model and exploit the temporal dimension of the collection, captured by temporal dynamics, in these established IR approaches. I explore how the temporal dynamic similarity of word and phrase use in a collection can be exploited to infer temporal semantic relationships between the terms. I propose an approach to uncover a query topic's "chronotype" terms -- that is, its most distinctive and temporally interdependent terms, based on a mix of temporal and non-temporal evidence. Experiments demonstrate that exploiting chronotype terms in temporal query expansion leads to significantly improved retrieval performance in several time-based collections. Temporal dynamics provide both a challenge and an opportunity for IR systems. Overall, this thesis demonstrates that temporal dynamics can be used to derive tacit structure and meaning of information and information behaviour, which is valuable for improving time-aware IR system effectiveness.
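The temporal-dynamic-similarity idea above — ranking candidate expansion terms by how closely their usage-over-time profile tracks the query's — can be sketched with plain Pearson correlation. The thesis's "chronotype" scoring mixes temporal and non-temporal evidence, so this single-signal version is a deliberate simplification, and the names are illustrative:

```python
import numpy as np

def temporal_similarity(query_series, term_series):
    """Rank candidate expansion terms by the Pearson correlation between
    each term's count-over-time series and the query's.
    term_series: dict mapping term -> 1-D count series aligned in time
    with query_series. Returns (term, score) pairs, best first."""
    scores = {}
    for term, series in term_series.items():
        scores[term] = float(np.corrcoef(query_series, series)[0, 1])
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Terms whose popularity spikes co-occur with the query's spikes rank highest, which is the intuition behind using temporally interdependent terms for query expansion.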
- Research Article
- 4
- 10.1111/ejn.15501
- Oct 30, 2021
- The European Journal of Neuroscience
There is ample evidence that the contralateral sensorimotor areas play an important role in movement generation, with the primary motor cortex and the primary somatosensory cortex showing a detailed spatial organization of the representation of contralateral body parts. Interestingly, there are also indications for a role of the motor cortex in controlling the ipsilateral side of the body. However, the precise function of ipsilateral sensorimotor cortex in unilateral movement control is still unclear. Here, we show hand movement representation in the ipsilateral sensorimotor hand area, in which hand gestures can be distinguished from each other and from contralateral hand gestures. High‐field functional magnetic resonance imaging (fMRI) data acquired during the execution of six left‐ and six right‐hand gestures by healthy volunteers showed ipsilateral activation mainly in the anterior section of precentral gyrus and the posterior section of the postcentral gyrus. Despite the lower activation in ipsilateral areas closer to the central sulcus, activity patterns for the 12 hand gestures could be mutually distinguished in these areas. The existence of a unique representation of ipsilateral hand movements in the human sensorimotor cortex favours the notion of transcallosal integrative processes that support optimal coordination of hand movements.
- Book Chapter
- 29
- 10.1007/978-3-319-25739-6_2
- Nov 25, 2015
This manuscript describes an approach, based on Laban Movement Analysis, to generate compact and informative representations of movement to facilitate affective movement recognition and generation for robots and other artificial embodiments. We hypothesize that Laban Movement Analysis, which is a comprehensive and systematic approach for describing movement, is an excellent candidate for deriving a low-dimensional representation of movement which facilitates affective motion modeling. First, we review the dimensions of Laban Movement Analysis most relevant for capturing movement expressivity and propose an approach to compute an estimate of the Shape and Effort components of Laban Movement Analysis using data obtained from motion capture. Within a motion capture environment, a professional actor reproduced prescribed motions, imbuing them with different emotions. The proposed approach was compared with a Laban coding by a certified movement analyst (CMA). The results show a strong correlation between results from the automatic Laban quantification and the CMA-generated Laban quantification of the movements. Based on these results, we describe an approach for the automatic generation of affective movements, by adapting pre-defined motion paths to overlay affective content. The proposed framework is validated through cross-validation and perceptual user studies. The proposed approach has great potential for application in fields including robotics, interactive art, animation and dance/acting training.
- Conference Article
- 10.1117/12.2621413
- May 17, 2022
Epifluorescence microscopy imaging is a technique used by neuroscientists to observe hundreds of neurons at the same time, with single-cell resolution and low cost, in living tissue. Recording, identifying and tracking neurons and their activity in those observations is a crucial step for research. However, manual identification of neurons is a laborious task as well as prone to errors. For this reason, automated applications to process the recordings and identify functional neurons are required. Several proposals have emerged; they can be classified into four kinds of approaches: 1) matrix factorization, 2) clustering, 3) dictionary learning and 4) deep learning. Unfortunately, they have proven inadequate to solve this problem. In fact, it remains an open problem; two major reasons are: 1) a lack of properly labeled datasets and 2) existing approaches that do not consider the temporal dimension, or consider only a tiny fraction of it; integrating all the frames into a single image is very common but inefficient because temporal dynamics are disregarded. We propose an application for automatic segmentation of neurons with a deep learning approach, considering the temporal dimension through recurrent neural networks and using a dataset labeled by neuroscientists. Additional aspects considered in our proposal include motion correction and validation to ensure that segmentations correspond to truly functional neurons. Furthermore, we compare this application with a previous proposal that uses sophisticated digital image processing techniques on the same dataset.
- Research Article
- 10
- 10.1109/access.2021.3052795
- Jan 1, 2021
- IEEE Access
Research on clustering spatio-temporal data to extract mobility patterns requires further development, as most existing studies do not simultaneously integrate data along both the spatial and temporal dimensions but instead focus on only one dimension or separate the dimensions in analyses and applications, which could lead to discoveries that are not representative of the overall data or are difficult to interpret. To simultaneously reveal the spatial and temporal patterns of urban mobility datasets, we propose an analytical framework that is based on co-clustering and enables mobility behaviors to be distinguished in the spatial and temporal dimensions. We use one month of taxi GPS data from the Manhattan area to explore spatio-temporal co-occurrence patterns. The spatial and temporal dimensions of taxi trip data were co-clustered by using the Bregman Block Average co-clustering algorithm with I-divergence (BBAC_I). We performed this process on weekdays and holidays and compared the mobility differences between these two periods. The experimental results demonstrated the effectiveness of this analytical framework, with which we can reveal the spatial patterns and their temporal dynamics as well as temporal patterns and their spatial dynamics in mobility data.
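The co-clustering step described above groups rows (spatial units) and columns (time slots) of a trip-count matrix simultaneously, so each block captures a spatio-temporal co-occurrence pattern. BBAC_I minimizes I-divergence; the alternating squared-error sketch below is only a simplified stand-in for that idea, with all names and parameters illustrative:

```python
import numpy as np

def co_cluster(M, n_row, n_col, iters=20, seed=0):
    """Minimal block-average co-clustering of a (spatial x temporal) count
    matrix by alternating reassignment of row and column labels toward the
    current block means. A squared-error simplification of BBAC_I."""
    rng = np.random.default_rng(seed)
    r = rng.integers(n_row, size=M.shape[0])   # row (spatial) labels
    c = rng.integers(n_col, size=M.shape[1])   # column (temporal) labels
    for _ in range(iters):
        # block means under the current labels
        B = np.zeros((n_row, n_col))
        for i in range(n_row):
            for j in range(n_col):
                cell = M[np.ix_(r == i, c == j)]
                B[i, j] = cell.mean() if cell.size else 0.0
        # reassign each row to the row-cluster whose block profile fits best
        r = np.array([np.argmin([((M[k, :] - B[i, c]) ** 2).sum()
                                 for i in range(n_row)]) for k in range(M.shape[0])])
        # reassign each column likewise
        c = np.array([np.argmin([((M[:, k] - B[r, j]) ** 2).sum()
                                 for j in range(n_col)]) for k in range(M.shape[1])])
    return r, c
```

On a taxi dataset, rows sharing a label form a spatial pattern and columns sharing a label form its temporal signature, matching the weekday-versus-holiday comparison in the abstract.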
- Research Article
- 20
- 10.1109/access.2021.3055554
- Jan 1, 2021
- IEEE Access
Satellite image time series (SITS) collected by modern Earth Observation (EO) systems represent a valuable source of information that supports several tasks related to monitoring Earth surface dynamics over large areas. A main challenge is then to design methods able to leverage the complementarity between the temporal dynamics and the spatial patterns that characterize these data structures. Focusing on land cover classification (or mapping) tasks, the majority of approaches dealing with SITS data consider only the temporal dimension, while the integration of the spatial context is frequently neglected. In this work, we propose an attentive spatial temporal graph convolutional neural network that exploits both the spatial and temporal dimensions in SITS. Although this neural network model is well suited to dealing with spatio-temporal information, this is the first work that considers it for the analysis of SITS data. Experiments are conducted on two study areas characterized by different land cover landscapes and real-world operational constraints (i.e., limited labeled data due to acquisition costs). The results show that our model consistently outperforms all the competing methods, obtaining a performance gain, in terms of F-Measure, of at least 5 points with respect to the best competing approaches on both benchmarks.