E2SIFT: Neuromorphic SIFT via Direct Feature Pyramid Recovery from Events
In recent years, event cameras have attracted significant attention due to their advantages over conventional cameras: high dynamic range, no motion blur, and high temporal resolution. In contrast to traditional cameras, which produce intensity frames, event cameras output a stream of asynchronous events triggered by brightness changes. There is extensive ongoing research on performing computer vision tasks such as object detection and classification with event cameras. However, because of this unconventional output format, it is difficult to perform such tasks directly on the event stream; most works first reconstruct an intensity image from the events and then operate on it. A crucial task is feature detection and description. The scale-invariant feature transform (SIFT) is a widely used keypoint detector and descriptor that is robust to changes in scale, rotation, illumination, and noise. In this work, given an event voxel grid, we directly generate the Laplacian-of-Gaussian (LoG) pyramid for SIFT keypoint detection. We fit a third-degree polynomial to the response along the scale axis and compute its critical points (the roots of its derivative) to obtain the scale-space extrema response for SIFT keypoint detection. Since the extrema computation is performed only after LoG thresholding, the solution is computationally inexpensive. Experimental results validate the effectiveness of our system.
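To make the extrema step concrete, here is a minimal NumPy sketch of cubic-fit extrema detection on a thresholded LoG pyramid, assuming a precomputed pyramid of shape (S, H, W) with at least four scales; the threshold value and all function names are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch, assuming a precomputed LoG pyramid of shape (S, H, W)
# with S >= 4 scales; the threshold and names are illustrative.
import numpy as np

def scale_space_extrema(log_pyramid, scales, thresh=0.03):
    """Return (x, y, scale, response) tuples for cubic-fit extrema."""
    keypoints = []
    # Threshold first: the polynomial fit runs only on surviving pixels,
    # which is what keeps the extrema computation cheap.
    ys, xs = np.where(np.abs(log_pyramid).max(axis=0) > thresh)
    for y, x in zip(ys, xs):
        resp = log_pyramid[:, y, x]
        coeffs = np.polyfit(scales, resp, deg=3)   # 3rd-degree fit over scale
        droots = np.roots(np.polyder(coeffs))      # roots of the derivative
        for r in droots[np.isreal(droots)].real:
            if scales[0] <= r <= scales[-1]:       # keep in-range extrema only
                keypoints.append((x, y, r, np.polyval(coeffs, r)))
    return keypoints
```

Thresholding before the fit is what keeps the cost low: the per-pixel polynomial work is done only on the small set of high-response pixels.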
- Peer Review Report
- 10.1039/d5lc00816f/v1/decision1
- Sep 30, 2025
Decision letter for "Direct access and recovery feature of solid precipitates embedded in microfluidic device"
- Peer Review Report
- 10.1039/d5lc00816f/v2/decision1
- Nov 25, 2025
Decision letter for "Direct access and recovery feature of solid precipitates embedded in microfluidic device"
- Peer Review Report
- 10.1039/d5lc00816f/v1/review2
- Sep 25, 2025
Review for "Direct access and recovery feature of solid precipitates embedded in microfluidic device"
- Peer Review Report
- 10.1039/d5lc00816f/v3/response1
- Dec 1, 2025
Author response for "Direct access and recovery feature of solid precipitates embedded in microfluidic device"
- Peer Review Report
- 10.1039/d5lc00816f/v3/decision1
- Dec 12, 2025
Decision letter for "Direct access and recovery feature of solid precipitates embedded in microfluidic device"
- Research Article
- 10.1039/d5lc00816f
- Jan 1, 2026
- Lab on a Chip
Droplet microfluidics, which generates and manipulates water-in-oil microdroplets within a continuous phase, has emerged as a compelling platform in modern science. The core advantage of this technology is that each picoliter-to-nanoliter droplet functions as an independent microreactor with no cross-contamination, enabling ultra-high-throughput experiments while dramatically reducing the consumption of expensive reagents and rare samples. However, the efficient extraction of solid precipitates (such as crystals and particles) formed within droplets remains a fundamental challenge for subsequent analysis and utilization. This study proposes a novel microfluidic device and operational method that address two challenges: (1) the difficulty of extracting solids that cannot be recovered through simple fluid flow and (2) sample loss during long-distance transport. The key innovation combines (1) a passive trap structure for in situ solid formation within droplets and (2) a physically accessible harvesting chamber positioned nearby. This design eliminates the need for long-distance sample transport: droplets containing precipitated solids are gently transferred to an adjacent, open-top extraction chamber, allowing physical recovery of the solids. We demonstrated the system's functionality using fluorescent microbeads as model particles, followed by the successful generation and recovery of protein (lysozyme) crystals as a practical application.
- Peer Review Report
- 10.1039/d5lc00816f/v2/review1
- Nov 23, 2025
Review for "Direct access and recovery feature of solid precipitates embedded in microfluidic device"
- Peer Review Report
- 10.1039/d5lc00816f/v3/review1
- Dec 11, 2025
Review for "Direct access and recovery feature of solid precipitates embedded in microfluidic device"
- Peer Review Report
- 10.1039/d5lc00816f/v2/response1
- Nov 15, 2025
Author response for "Direct access and recovery feature of solid precipitates embedded in microfluidic device"
- Peer Review Report
- 10.1039/d5lc00816f/v1/review1
- Sep 11, 2025
Review for "Direct access and recovery feature of solid precipitates embedded in microfluidic device"
- Research Article
- 10.11834/jig.211217
- Jan 1, 2023
- Journal of Image and Graphics
Objective: In human behavior recognition, multi-modal methods that fuse depth data with skeleton data can effectively improve action recognition rates; such recognition supports applications including human-computer interaction, motion analysis, intelligent monitoring, and virtual reality. Conventional approaches use RGB image sequences with descriptors such as the histogram of oriented gradients (HOG), the histogram of optical flow (HOF), and three-dimensional feature pyramids, while depth maps add information that is insensitive to ambient light. However, depth image data are voluminous and highly redundant, inflating both the time complexity of feature extraction and the space complexity of feature storage. To address these problems, we propose a centroid motion path relaxation algorithm that retains only the key temporal frames needed to express a behavior, and, based on the characteristics of the different modalities, a new spatio-temporal feature representation. Method: The centroid motion path relaxation algorithm uses the centroid's motion distance between adjacent frames to compute a similarity coefficient over the active regions obtained by image differencing, then removes highly similar frames, keeping the key temporal information. The temporal feature vector extracted from the de-redundant depth map sequence is concatenated with the spatial structure feature vector extracted from the skeleton map to form the spatio-temporal feature input; spatial features are extracted from a three-channel spatial feature map built from the original skeleton point coordinates; finally, the fused probabilities of the spatio-temporal and spatial features are used for classification and recognition. For the skeleton data, a global motion-direction feature reflects the integrity and coordination of limb movements, and the representation also exploits the variation of the dynamic image regions, the synergy of body parts in motion, and local saliency. Result: The method is verified on the MSR-Action3D dataset. Under experimental setting 1, it is 0.8260% higher than the depth motion map-local binary pattern (DMM-LBP) algorithm, 1.0152% higher than DMM-CRC (collaborative representation classifier), 3.4501% higher than DMM-GLAC (gradient local auto-correlation), 0.6058% higher than the EigenJoint algorithm, and 10.6245% higher than the space-time auto-correlation of gradients (STACOG) algorithm; after removing redundancy, the setting-1 result improves by a further 0.1261%. Under experimental setting 2, the average cross-validated recognition rate over the three subsets is 95.7432%, which is 2.4432% higher than the Multi-fused method, 4.7632% higher than CovP3DJ, 0.3432% higher than D3D-LSTM (densely connected 3D CNN and long short-term memory), and 0.2132% higher than the Joint Subset Selection method. On the complete dataset, the cross-validated recognition rate is 93.0403%, which is 2.0303% higher than the low-latency method, 0.2403% higher than the combination-of-deep-models method, and 2.3403% higher than the complex-network-coding method, showing good robustness. Conclusion: The proposed de-redundancy algorithm improves recognition after reducing redundancy, and the extracted features have low mutual correlation and good complementarity in combined recognition, effectively improving classification accuracy.
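As a rough illustration of the frame-pruning step described above, the following sketch drops a depth frame when the active region obtained by frame differencing barely moves, simplifying the paper's similarity coefficient to a pure centroid-displacement test; all thresholds and function names are illustrative assumptions, not the paper's implementation.

```python
# A minimal sketch of centroid-path frame pruning: keep a frame only
# when the centroid of the active (differenced) region has moved enough.
# Thresholds are illustrative, not the paper's values.
import numpy as np

def centroid(mask):
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return np.array([xs.mean(), ys.mean()])

def prune_frames(depth_seq, diff_thresh=10, move_thresh=2.0):
    """depth_seq: (T, H, W) depth maps; returns indices of retained frames."""
    keep = [0]
    prev_c = None
    for t in range(1, len(depth_seq)):
        active = np.abs(depth_seq[t].astype(float) -
                        depth_seq[keep[-1]].astype(float)) > diff_thresh
        c = centroid(active)
        if c is None:
            continue                      # no motion: redundant frame
        if prev_c is None or np.linalg.norm(c - prev_c) > move_thresh:
            keep.append(t)                # centroid moved enough: keep frame
            prev_c = c
    return keep
```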
- Research Article
- 10.1145/3540201
- Feb 6, 2023
- ACM Transactions on Multimedia Computing, Communications, and Applications
Most matting research resorts to advanced semantics to achieve high-quality alpha mattes, and a direct low-level features combination is usually explored to complement alpha details. However, we argue that appearance-agnostic integration can only provide biased foreground (FG) details and that alpha mattes require different-level feature aggregation for better pixel-wise opacity perception. In this article, we propose an end-to-end hierarchical and progressive attention matting network (HAttMatting++), which can better predict the opacity of the FG from single RGB images without additional input. Specifically, we utilize channel-wise attention (CA) to distill pyramidal features and employ spatial attention (SA) at different levels to filter appearance cues. This progressive attention mechanism can estimate alpha mattes from adaptive semantics and semantics-indicated boundaries. We also introduce a hybrid loss function fusing structural similarity, mean square error, adversarial loss, and sentry supervision to guide the network to further improve the overall FG structure. In addition, we construct a large-scale and challenging image matting dataset comprising 59,000 training images and 1,000 test images (a total of 646 distinct FG alpha mattes), which can further improve the robustness of our hierarchical and progressive aggregation model. Extensive experiments demonstrate that the proposed HAttMatting++ can capture sophisticated FG structures and achieve state-of-the-art performance with single RGB images as input.
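As a rough illustration of the channel-wise and spatial attention pattern the abstract describes, the following PyTorch sketch reweights channels via squeeze-and-excite and locations via a convolution over pooled maps; the layer sizes and the exact wiring inside HAttMatting++ are assumptions, not the authors' code.

```python
# Illustrative CA/SA modules; sizes and wiring are assumptions.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                      # x: (N, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))        # squeeze to (N, C), then excite
        return x * w[:, :, None, None]         # reweight channels

class SpatialAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        s = torch.cat([x.mean(1, keepdim=True),
                       x.amax(1, keepdim=True)], dim=1)  # (N, 2, H, W)
        return x * torch.sigmoid(self.conv(s))           # reweight locations
```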
- Conference Article
- 10.1109/ictai50040.2020.00173
- Nov 1, 2020
This paper proposes a novel single-shot network for object detection. The proposed network, termed IDNet, explores feature fusion strategies to alleviate the scale-variation problem in object detection. IDNet mainly consists of two feature fusion modules: an indirect feature fusion module (IF) and a direct feature fusion module (DF). The IF shares long-range dependencies across pyramidal layers, and based on this information IDNet learns to emphasize informative regions and suppress less useful ones on each layer. The DF is a feature fusion strategy based on a modified lateral connection inspired by feature pyramid networks (FPN); it uses an averaging operation to keep the order of magnitude of the feature maps consistent during fusion, further improving performance on small instances, as sketched below. Comprehensive experiments indicate the effectiveness of IDNet, which reaches 80.3 mAP on the PASCAL VOC 2007 benchmark.
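A minimal PyTorch sketch of the averaging lateral connection attributed to DF, assuming an FPN-style top-down pathway; channel counts and module names are illustrative assumptions.

```python
# Sketch of an averaging lateral connection: average the upsampled
# top-down map with the projected bottom-up map instead of summing,
# so the fused map keeps the same order of magnitude.
import torch.nn as nn
import torch.nn.functional as F

class AvgLateralFusion(nn.Module):
    def __init__(self, c_bottom, c_out=256):
        super().__init__()
        self.lateral = nn.Conv2d(c_bottom, c_out, kernel_size=1)

    def forward(self, bottom, top):            # top: coarser pyramid level
        top_up = F.interpolate(top, size=bottom.shape[-2:], mode='nearest')
        return 0.5 * (self.lateral(bottom) + top_up)   # average, not sum
```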
- Research Article
- 10.1007/s40747-024-01580-3
- Aug 14, 2024
- Complex & Intelligent Systems
Prohibited item detection is crucial for the safety of public places. Deep learning, one of the mainstream approaches to prohibited item detection, has shown performance far beyond traditional methods. However, most deep network architectures still lack sufficient local feature representation ability for overlapping and small targets, and ignore the semantic conflicts caused by direct feature fusion. In this paper, we propose MFPIDet, a novel prohibited item detection architecture based on an improved YOLOV7, to achieve reliable detection in complex environments. Specifically, a multi-scale attention module (MAM) backbone is proposed to filter redundant information in target regions and enhance the local feature representation of overlapping objects. Here, a squeeze-excitation (SE) block filters the background to reduce redundant information in target regions. Then, a multi-scale feature extraction module (MFEM) is designed for local feature representation, enhancing the feature expression of overlapping objects. In addition, to obtain richer context information, we design an adaptive fusion feature pyramid network (AF-FPN) that combines an adaptive context information fusion module (ACIFM) with a feature fusion module (FFM) to improve the neck of YOLOV7. The proposed method is validated on the PIDray dataset; it obtains the highest mAP (68.7%), 3.5% higher than YOLOV7. Our approach provides a new design pattern for prohibited item detection in complex environments and shows the potential of deep learning in related fields.
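As one hedged reading of the adaptive-fusion idea (avoiding the semantic conflicts the abstract attributes to direct fusion), this PyTorch sketch fuses multi-scale dilated-convolution branches with learned softmax weights; it is an assumption-laden illustration, not the ACIFM implementation.

```python
# Sketch: fuse multi-scale context branches with learned softmax weights
# instead of direct concatenation. Branch rates/sizes are assumptions.
import torch
import torch.nn as nn

class AdaptiveContextFusion(nn.Module):
    def __init__(self, channels, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r)
            for r in rates)                           # multi-scale context
        self.weights = nn.Parameter(torch.zeros(len(rates)))

    def forward(self, x):
        w = torch.softmax(self.weights, dim=0)        # adaptive branch weights
        return sum(wi * b(x) for wi, b in zip(w, self.branches))
```

Learning the branch weights lets the network decide, per model, how much of each context scale to keep, rather than stacking all scales and leaving the conflict to later layers.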