Bio-Inspired 3D Affordance Understanding from Single Image with Neural Radiance Field for Enhanced Embodied Intelligence.

Abstract

Affordance understanding means identifying the operable parts of objects, which is crucial for accurate robotic manipulation. Although homogeneous objects for grasping come in various shapes, they always share a similar affordance distribution. Based on this fact, and inspired by human cognitive processes, we propose AFF-NeRF to address affordance generation for homogeneous objects. Our method employs deep residual networks to extract shape and appearance features, enabling it to adapt to different homogeneous objects. These features are then integrated into our extended neural radiance field, named AFF-NeRF, to generate 3D affordance models for unseen objects from a single image. Our experimental results demonstrate that our approach outperforms baseline methods in affordance generation for unseen views of novel objects without additional training. Additionally, more stable grasps can be obtained by using the 3D affordance models generated by our method in a grasp generation algorithm.
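
The abstract does not say how AFF-NeRF turns its field outputs into per-pixel affordance. As a minimal sketch, assuming the field returns an extra affordance score per sample and that this score is composited with the standard neural radiance field volume-rendering weights, one ray could be rendered as below. The function name `composite_along_ray`, the affordance channel, and the sample values are all hypothetical and are not taken from the paper.

```python
import math


def composite_along_ray(densities, affordances, deltas):
    """Alpha-composite per-sample affordance scores along one camera ray.

    densities   : volume density sigma_i at each sample (>= 0)
    affordances : affordance score a_i at each sample (assumed extra field output)
    deltas      : spacing between consecutive samples along the ray
    """
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    rendered = 0.0
    weights = []
    for sigma, a, delta in zip(densities, affordances, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        w = transmittance * alpha               # contribution weight of this sample
        weights.append(w)
        rendered += w * a
        transmittance *= 1.0 - alpha
    return rendered, weights


# Illustrative ray: two empty samples, then a dense surface sample with a
# high affordance score, then a sample hidden behind that surface.
densities = [0.0, 0.0, 50.0, 50.0]
affordances = [0.0, 0.0, 0.9, 0.1]
deltas = [0.1, 0.1, 0.1, 0.1]
pixel_affordance, weights = composite_along_ray(densities, affordances, deltas)
print(round(pixel_affordance, 3))
```

In this sketch the first dense sample dominates the pixel value, so the occluded sample behind it contributes almost nothing, which is the same behaviour that colour rendering has in a radiance field.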

Similar Papers
  • Research Article
  • 10.31861/sisiot2023.2.02005
Expert System for Supporting the Construction of Three-Dimensional Models of Objects by the Photogrammetry Method
  • Dec 30, 2023
  • Security of Infocommunication Systems and Internet of Things
  • Serhiy Balovsyak + 2 more

The task of building high-quality three-dimensional (3D) models of objects is relevant, since such 3D models are widely used in various fields of science, technology and medicine. In this work, the construction of 3D models is performed by the photogrammetry method, which consists in the construction of a 3D model of an object based on a series of its photographs. The advantages of the photogrammetry method are low hardware requirements and relatively high accuracy. To build 3D models of objects by photogrammetry, the 3DF Zephyr program was used, which contains a set of tools for pre-processing images, reconstructing 3D models, editing and measuring the dimensions of 3D models, and exporting the obtained models. The principles of building three-dimensional models of objects by the method of photogrammetry based on initial images are considered. The main stages of building 3D models are described: calculation of sparse point cloud, key points, dense point cloud, polygon grid, texture grid. Model parameters are also edited and analyzed. An expert system was developed in the CLIPS environment to select the correct modes for building a 3D model. The knowledge base of the expert system contains production rules that allow you to establish the correct modes of building a 3D model based on the initial facts. 30 facts-conditions have been developed that describe the conditions for building a three-dimensional model. 20 facts-consequences and 15 facts-recommendations for building a 3D model have been developed. Using the developed rules, 36 production rules were built. Experimental verification of the developed system was carried out. Three-dimensional models of objects were built using the 3DF Zephyr program. After entering the facts that describe the process of obtaining the model into the expert system, a number of recommendations were obtained, in particular, to increase the area of textured surfaces and use uniform lighting of objects. 
After following these recommendations, the model was built with satisfactory accuracy.

  • Research Article
  • Cite Count Icon 1
  • 10.26583/sv.17.1.06
Application of the Computer Vision Tools PyTorch3D and NERF to Build a Point Cloud of a Three-Dimensional Model and Determine the Camera Position of Photographs in Space
  • Apr 1, 2025
  • Scientific Visualization
  • V.V Konkov + 1 more

Computer graphics has recently come to play a key role in solving computer vision problems. Converting 2D images into 3D models remains a pressing problem, as it requires precise determination of the camera position and construction of accurate 3D models of objects. Traditional methods are often limited in application and do not offer a comprehensive solution. This study examines the use of the PyTorch3D and NERF libraries to determine the camera position in 3D space and create a 3D model of an object from a single 2D image. For data preparation, a hardware and software system was used, including a stepper motor control device that provides manual and sequential positioning of the camera and its return to the initial position, a shooting control system to generate a comprehensive set of photos at each camera position, and a mechanism for sending data to a remote computer for further processing. The PyTorch3D library was selected during the study to explore the possibilities of converting 2D images into 3D models or determining the position of an object in the photos. Processing included several steps: building a point cloud to generate a 3D volumetric model of the object, determining the camera position in 3D space from a single 2D image using inverse problem algorithms, and constructing a 3D object using differentiable rendering, creating 3D voxels and 3D meshes. The results of this study showed successful determination of camera position in 3D space and construction of a 3D object model from a single 2D image, demonstrating the advantages of using the PyTorch3D library over other existing models. These findings can be applied in the development of software and hardware systems for creating 3D images from 2D photographs. The study confirmed the relevance and effectiveness of using the PyTorch3D library to solve the problems of converting 2D images into 3D models. 
Further work will be aimed at expanding the functionality of the system and its use in various areas of computer vision.

  • Research Article
  • Cite Count Icon 13
  • 10.1080/19475683.2012.727865
An SDOG-based intrinsic method for three-dimensional modelling of large-scale spatial objects
  • Oct 15, 2012
  • Annals of GIS
  • Jieqing Yu + 3 more

Three-dimensional (3D) modelling is a powerful tool for spatial representation and data analysis, and large scale is the common feature for spatial objects in Global Spatial Information System/Science (GSIS), especially in Earth System Science (ESS). It is important to develop a new 3D modelling method for large-scale spatial objects to meet the demands of global change and ESS researches. The projection-based methods, which have been applied for hundreds of years, are inadequate to perform large-scale spatial modelling, while the embedding methods are unnatural to represent the gravitational features of geo-objects, making the spatial modelling complex and the global data analysis hard. Although the current intrinsic methods are capable of dealing with large-scale spatial modelling, they have some defects such as shrinking, overlapping, non-latitude–longitude consistent, triangular prism-shaped or non-uniformly subdivided and lack of a unified representation model on geometric, topologic and attributive information integrally. Spheroid Degenerated-Octree Grid (SDOG), which takes the advantages of non-shrinking, quasi-uniform, non-overlapping, latitude–longitude consistent, hexahedron-shaped, uniformly subdivided, multi-resolution, is a preferable grid for developing an intrinsic method for the 3D modelling of large-scale spatial objects. This article employed SDOG to develop a new intrinsic method for large-scale 3D modelling. A triple representation model, T(OID, S, A), was proposed to conduct a unified representation on geometric, topologic and attributive information integrally. An algorithm of triples construction, as well as a two-table data structure, was developed to make the intrinsic method operable. A large-scale 3D modelling case, with SDOG-based intrinsic method, on the lithosphere of planet Earth in intrinsic space was illustrated. 
It shows that the SDOG-based intrinsic method is feasible to perform the 3D modelling of large-scale spatial objects, so as to support global visualization and ESS studies.

  • Research Article
  • 10.26787/nydha-2686-6846-2023-25-1-11-17
Use of 3D Models of Natural Objects as a Constituent Segment of the Digital Educational Process
  • Jan 30, 2023
  • Educational Bulletin “Consciousness”
  • Lazareva N.V

The article discusses the essence and characteristics of 3D modeling and the features of using 3D models of natural objects in the educational process of higher educational institutions in terms of improving the quality, accessibility and visibility of the material being studied. Educational goals that can be achieved through the use of 3D models of natural objects are identified. The forms of creating 3D models based on the materials of photographing natural objects are systematized. The advantages of visualization and the use of three-dimensional models in 3D space from the point of view of the formation of students' spatial thinking are given. The purpose of the study is to substantiate the introduction of 3D models of natural objects in the development of digital environmental education in the training of specialists in various industries. One of the important directions related to the solution of environmental problems at the present stage is environmental education and upbringing of the younger generation. Environmental education both in the world and in Russia is today considered a priority area for teaching and educating students in higher educational institutions. Particular attention is drawn to the use of the digital segment in the educational process. The introduction of the study of 3D technologies into the main educational process in order to educate a talented young generation of Russian specialists in the field of ecology and other related specialties is a timely and very necessary process. The use of 3D models of natural objects is a current trend in modern digital education. In this case, the use of 3D models of natural objects in the educational process makes biological objects easier for students to perceive through visualization and increased visibility, and makes the process of mastering the material more accessible and efficient. 
Thus, the creation of electronic resources with the inclusion of 3D models of natural objects qualitatively transforms the activities of educational institutions, which will subsequently have a positive impact on the formation of students' professional competencies.

  • Research Article
  • Cite Count Icon 8
  • 10.2478/geocart-2013-0007
3D modeling of architectural objects from video data obtained with the fixed focal length lens geometry
  • Dec 1, 2013
  • Geodesy and Cartography
  • Paulina Deliś + 3 more

The article describes the process of creating 3D models of architectural objects on the basis of video images acquired by a Sony NEX-VG10E fixed focal length video camera. It was assumed that 3D models of architectural objects can be developed from video and Terrestrial Laser Scanning (TLS) data. The acquisition of video data was preceded by the calibration of the video camera. The process of creating 3D models from video data involves the following steps: selection of video frames for the orientation process, orientation of the video frames using points with known coordinates from TLS, and generation of a TIN model using automatic matching methods. The objects were measured with an impulse laser scanner, the Leica ScanStation 2. The created 3D models of architectural objects were compared with 3D models of the same objects for which the self-calibration bundle adjustment process was performed. PhotoModeler software was used for this purpose. To assess the accuracy of the developed 3D models of architectural objects, points with known coordinates from TLS were used, and the shortest distance method was applied. Analysis of the accuracy showed that 3D models generated from video images differ by about 0.06 to 0.13 m compared to TLS data.
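
The abstract names a shortest distance method for the accuracy assessment without giving its details. A minimal sketch, assuming it means the distance from each model point to its nearest reference (TLS) point, might look like this. The function name `shortest_distances` and all coordinates are made up for illustration and are not measurements from the article.

```python
import math


def shortest_distances(model_points, reference_points):
    """For each model point, return the distance to the nearest reference point."""
    return [
        min(math.dist(p, q) for q in reference_points)
        for p in model_points
    ]


# Hypothetical reference points (for example from a laser scan), in metres.
reference_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
# Hypothetical points sampled from the model built from video frames.
model_points = [(0.0, 0.0, 0.1), (1.0, 0.2, 0.0)]

distances = shortest_distances(model_points, reference_points)
mean_error = sum(distances) / len(distances)
print(distances, mean_error)
```

A brute-force search like this is quadratic in the number of points; for dense scans a spatial index would be the usual choice, but the quantity being reported stays the same.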

  • Research Article
  • Cite Count Icon 2
  • 10.1115/1.4033230
Freehand Gesture and Tactile Interaction for Shape Design
  • Nov 7, 2016
  • Journal of Computing and Information Science in Engineering
  • Monica Bordegoni + 3 more

This paper presents a novel system that allows product designers to design, experience, and modify new shapes of objects, starting from existing ones. The system allows designers to acquire and reconstruct the 3D model of a real object and to visualize and physically interact with this model. In addition, the system allows designers to modify the shape through physical manipulation of the 3D model and eventually to print it using a 3D printing technology. The system is developed by integrating state-of-the-art technologies in the sectors of reverse engineering, virtual reality, and haptic technology. The 3D model of an object is reconstructed by scanning its shape by means of a 3D scanning device. Then, the 3D model is imported into the virtual reality environment, which is used to render the 3D model of the object through an immersive head mounted display (HMD). The user can physically interact with the 3D model by using the desktop haptic strip for shape design (DHSSD), a 6 degrees of freedom servo-actuated developable metallic strip, which reproduces cross-sectional curves of 3D virtual objects. The DHSSD device is controlled by means of hand gestures recognized by a leap motion sensor.

  • Research Article
  • Cite Count Icon 22
  • 10.1016/s0031-3203(03)00041-4
A 2D/3D model-based object tracking framework
  • Mar 27, 2003
  • Pattern Recognition
  • Ediz Polat + 2 more


  • Research Article
  • Cite Count Icon 6
  • 10.1134/s0001433820120427
Constructing 3D Models of Rigid Objects from Satellite Images with High Spatial Resolution Using Convolutional Neural Networks
  • Dec 1, 2020
  • Izvestiya, Atmospheric and Oceanic Physics
  • O G Gvozdev + 4 more

A way of constructing 3D models of rigid objects from one satellite image is described. It is based on the use of two convolutional neural networks which sequentially process high-resolution satellite images. The first neural network performs integral image analysis for segmentation and identification of objects of specified physical classes. The second neural network performs local image analysis and works with images segmented by the first neural network in areas of the image that presumably contain objects of specified classes. An algorithm for reconstructing a 3D model of an object from raster domains of a segmented image obtained from local analysis is described. It is based on regression analysis, the assessing of equivalent figures, and the linearization and polarization of contours. Results from the algorithm’s operation are given using the example of railway infrastructure facilities. The results from constructing 3D models of three objects of the railway infrastructure, identified via the operation of neural networks for four informative classes of areas, are presented, e.g., roofs, walls, railroad tracks, and contact lines (poles). Standard dimensions (e.g., the railway gauge (1.52 m) and the height of railway support poles (11.35 m)) are used to estimate scaling coefficients that allow determination of base dimensions and object heights. The possibility of constructing 3D models of objects with areas from 210 to 4200 m² is shown.
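
The scaling step mentioned in the abstract can be illustrated with simple arithmetic: a feature of known physical size gives a metres-per-pixel coefficient, which then converts other pixel measurements. The sketch below assumes a single uniform ground scale; only the 1.52 m gauge comes from the abstract, while the function names and every pixel measurement are hypothetical.

```python
RAILWAY_GAUGE_M = 1.52  # standard dimension quoted in the abstract


def scale_from_reference(reference_length_m, reference_length_px):
    """Return a metres-per-pixel coefficient from one feature of known size."""
    if reference_length_px <= 0:
        raise ValueError("reference length in pixels must be positive")
    return reference_length_m / reference_length_px


def to_metres(length_px, metres_per_pixel):
    """Convert a length measured in pixels to metres."""
    return length_px * metres_per_pixel


# Hypothetical measurement: the gauge spans 3.8 pixels in the image.
metres_per_pixel = scale_from_reference(RAILWAY_GAUGE_M, 3.8)

# Hypothetical roof outline of 50 x 30 pixels.
width_m = to_metres(50, metres_per_pixel)
depth_m = to_metres(30, metres_per_pixel)
footprint_area_m2 = width_m * depth_m
print(round(metres_per_pixel, 3), round(footprint_area_m2, 1))
```

Object heights would need a separate coefficient, which the abstract ties to the known pole height of 11.35 m; that step depends on imaging geometry the abstract does not describe, so it is left out here.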

  • Research Article
  • Cite Count Icon 29
  • 10.1016/j.autcon.2010.06.003
Rapid 3D object detection and modeling using range data from 3D range imaging camera for heavy equipment operation
  • Jun 29, 2010
  • Automation in Construction
  • Hyojoo Son + 2 more


  • Conference Article
  • Cite Count Icon 3
  • 10.1109/iciip.2015.7414827
A new next best view method for 3D modeling of unknown objects
  • Dec 1, 2015
  • Mahesh Kr Singh + 2 more

In this paper, we present a new method to determine the “next best view” (NBV) solution for accurate 3D reconstruction of an object with minimum prior information about the object's geometry. The proposed method determines the best visible surface of unknown objects using an adaptive mean shift algorithm which avoids the inaccessible position. The proposed method automatically generates the 3D model of objects in real time with a minimum number of best visible surface patches while the objects are moving on a turntable. By generating a set of potential next views, the proposed method ensures proper avoidance of unreachable positions. The number of views required to reconstruct a 3D model of objects depends upon their complexity. The proposed method is applicable to all kinds of range sensors and experimental results validate the proposed method for 3D modeling of real objects and prove its robustness.

  • Conference Article
  • Cite Count Icon 1
  • 10.1117/12.473080
3D modeling applications for cultural heritage
  • Jan 20, 2003
  • Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE
  • Raffaella Bologna + 3 more

The growing interest in 3D modeling of both single objects and whole environments is closely related to the availability of increasingly powerful computing and surveying devices. A new set of issues has to be addressed in the 3D modeling of real objects. A large amount of data about the object surface or volume is needed, and these data then have to be aggregated, regardless of the data format and the acquisition device used, in order to get the final model. In practice, data registration requires an approximate estimate of the alignment between acquired data. This approach is often time-consuming, increases the final cost of the 3D model and represents the major limit to the widespread use of real object models. Taking this drawback into account, a fully automatic range data registration system has been developed. This system is able to execute all the steps needed for 3D modeling of real objects automatically, or at least with as little human intervention as possible, using no information other than the range data. In this paper an overview of the whole registration system is presented, focusing on the integration between the two main blocks. In the first one, overlapping areas between range image pairs are detected by means of spin-images and an initial approximate alignment between image pairs is computed. Then, in the second block, this estimate is refined by a cascade of two registration algorithms: the Frequency Domain and the ICP. Some interesting applications of the proposed strategy for 3D modeling of cultural heritage objects will also be reported.

  • Book Chapter
  • Cite Count Icon 168
  • 10.1007/978-3-642-15555-0_48
Depth-Encoded Hough Voting for Joint Object Detection and Shape Recovery
  • Jan 1, 2010
  • Min Sun + 3 more

Detecting objects, estimating their pose and recovering 3D shape information are critical problems in many vision and robotics applications. This paper addresses the above needs by proposing a new method called DEHV - Depth-Encoded Hough Voting detection scheme. Inspired by the Hough voting scheme introduced in [13], DEHV incorporates depth information into the process of learning distributions of image features (patches) representing an object category. DEHV takes advantage of the interplay between the scale of each object patch in the image and its distance (depth) from the corresponding physical patch attached to the 3D object. DEHV jointly detects objects, infers their categories, estimates their pose, and infers/decodes objects depth maps from either a single image (when no depth maps are available in testing) or a single image augmented with depth map (when this is available in testing). Extensive quantitative and qualitative experimental analysis on existing datasets [6,9,22] and a newly proposed 3D table-top object category dataset shows that our DEHV scheme obtains competitive detection and pose estimation results as well as convincing 3D shape reconstruction from just one single uncalibrated image. Finally, we demonstrate that our technique can be successfully employed as a key building block in two application scenarios (highly accurate 6 degrees of freedom (6 DOF) pose estimation and 3D object modeling).

  • Conference Article
  • Cite Count Icon 42
  • 10.1109/icra.2014.6906958
Stability of soft-finger grasp under gravity
  • May 1, 2014
  • Kensuke Harada + 5 more

We discuss grasp stability under gravity where each finger makes soft-finger contact with an object. By clustering polygon models of a finger and an object, the contact area between a finger and an object is obtained as the common area between an object cluster and a finger cluster. Then, by assuming the Winkler elastic foundation, the pressure distribution within the contact area is obtained. By using this pressure distribution, we show that we can judge grasp stability under soft-finger contact. We further consider defining a quality measure of a soft-finger grasp by assuming that although the gravitational force is applied to an object, the direction of gravity is unknown. To demonstrate the effectiveness of the proposed approach, we show several numerical examples.

I. INTRODUCTION Grasp stability is widely used as an index for evaluating the grasping posture of multifingered hands. If grasp stability is satisfied, we can guarantee that the grasped object can resist an external disturbance from any direction. In this study, we focus on grasp stability for general multifingered hands, wherein a finger makes soft-finger contact (SFC) with an object. We consider formulating grasp stability under SFC by explicitly using the pressure distribution acting in the contact area between finger and object. In addition to this problem, we obtain the quality measure by taking the effect of gravity into consideration. Many industrial manipulators are equipped with two-fingered parallel grippers at the tip. For such manipulators, it is well known that, if a finger contacts an object under the point contact with friction (PCwF), three fingers are needed to ensure grasp stability. However, many two-fingered grippers have a flexible sheet attached to the finger surface. When a finger contacts an object, the flexible sheet deforms, thereby generating torsional friction at the contact area. 
Because of this torsional friction, grasp stability can be satisfied even if a hand has only two fingers. This contact style is called SFC. Here, because the amount of torsional friction depends on the area of contact, grasp stability also depends on the area of contact. Although the area of contact depends on the object shape and the flexibility of the sheet, there has been no research on grasp stability under SFC in which the effect of object shape and flexibility of the sheet has been considered. On the other hand, we determine grasp stability under SFC by explicitly considering this effect. We
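
The dependence of torsional friction on contact area can be illustrated numerically. The sketch below is not the authors' formulation: it assumes a Winkler-type foundation in which pressure is proportional to local penetration depth, a contact patch discretised into equal cells, and a torsional limit given by the friction coefficient times the sum of cell force times distance from the pressure centre. The function name, the foundation stiffness, and all numbers are hypothetical.

```python
import math


def winkler_contact(cells, stiffness, cell_area, friction_coeff):
    """Return (normal force, torsional friction limit) for a discretised contact patch.

    cells          : list of (x, y, depth) per cell, in metres
    stiffness      : assumed foundation modulus, pressure per unit depth (Pa/m)
    cell_area      : area of each cell (m^2)
    friction_coeff : Coulomb friction coefficient
    """
    # Winkler assumption: each cell acts as an independent spring.
    forces = [stiffness * max(depth, 0.0) * cell_area for _, _, depth in cells]
    normal_force = sum(forces)
    if normal_force == 0.0:
        return 0.0, 0.0
    # Centre of pressure of the patch.
    cx = sum(f * x for f, (x, _, _) in zip(forces, cells)) / normal_force
    cy = sum(f * y for f, (_, y, _) in zip(forces, cells)) / normal_force
    # Largest moment that friction can resist about the centre of pressure.
    torsion_limit = friction_coeff * sum(
        f * math.hypot(x - cx, y - cy) for f, (x, y, _) in zip(forces, cells)
    )
    return normal_force, torsion_limit


# Four cells placed 1 mm from the patch centre, each compressed by 0.1 mm.
patch = [(0.001, 0.0, 1e-4), (-0.001, 0.0, 1e-4), (0.0, 0.001, 1e-4), (0.0, -0.001, 1e-4)]
patch_force, patch_torsion = winkler_contact(patch, 1e9, 1e-6, 0.5)

# A single cell stands in for a point contact carrying the same total load.
point = [(0.0, 0.0, 4e-4)]
point_force, point_torsion = winkler_contact(point, 1e9, 1e-6, 0.5)
print(patch_force, patch_torsion, point_force, point_torsion)
```

Both contacts carry the same normal force in this example, but only the spread-out patch can resist a twisting moment, which matches the abstract's remark that a point contact with friction needs more fingers than a soft-finger contact.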

  • Research Article
  • Cite Count Icon 4
  • 10.1007/s12541-017-0035-2
Automatic 3D model acquisition for unknown objects based on hybrid vision technology
  • Mar 1, 2017
  • International Journal of Precision Engineering and Manufacturing
  • Wei Fang + 3 more

Three-dimensional (3D) model acquisition is the process of building a 3D model of an object. But due to the limited field of view of the scanner, this task is mainly performed by taking several scans with human intervention. In order to make the 3D modeling process efficient, a novel automatic 3D modeling method for unknown objects based on hybrid vision technology in a binocular structured light system (BSLS) is proposed. Firstly, the limit visual vacuums of the BSLS are established, and they will be used to predict the unknown area with an acquired 2.5D range image. With the 2D intensity image acquired synchronously, the coarse boundary size is recovered from Shape from Shading, and it leads the prediction of the unknown area to be more precise. Based on the combination of the predicted contours, the next best viewpoint is determined with more unknown areas visible. The proposed method can be used to obtain the 3D models of unknown objects automatically, and the experimental results illustrate the validity and efficiency of our approach.

  • Research Article
  • Cite Count Icon 11
  • 10.1109/tip.2014.2378032
Estimation of sunlight direction using 3D object models.
  • Dec 22, 2014
  • IEEE Transactions on Image Processing
  • Yang Liu + 2 more

The direction of sunlight is an important informative cue in a number of applications in image processing, such as augmented reality and object recognition. In general, existing methods to estimate the direction of the sunlight rely on different image features (e.g., sky, texture, shadows, and shading). These features can be considered weak informative cues, as no single feature can reliably estimate the sunlight direction. Moreover, existing methods may require that the camera parameters are known, limiting their applicability. In this paper, we present a new method to estimate the sunlight direction from a single (outdoor) image by inferring cast shadows through object modeling and recognition. First, objects (e.g., cars or persons) are (automatically) recognized in images by exemplar-SVMs. Instead of training the Support Vector Machines (SVMs) using natural images (limited variation in viewpoints), we propose to train on 2D object samples generated from 3D object models. Then, the recognized objects are used as sundial cues (probes) to estimate the sunlight direction by inferring the corresponding shadows generated by 3D object models considering different illumination directions. We demonstrate the effectiveness of our approach on synthetic and real images. Experiments show that our method estimates the azimuth angle accurately within a quadrant (smaller than 45°) and computes the zenith angle with a mean angular error of 23°.
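
The reported azimuth and zenith errors are angular quantities, and a short sketch shows one plausible way such errors could be computed. The abstract does not state the evaluation formulas, so the conventions below (azimuth measured in the horizontal plane, zenith measured from the vertical axis, azimuth error wrapped to at most 180°) and the function names are assumptions.

```python
import math


def sun_direction(azimuth_deg, zenith_deg):
    """Unit vector for a direction given by azimuth and zenith angles in degrees."""
    az = math.radians(azimuth_deg)
    ze = math.radians(zenith_deg)
    return (math.sin(ze) * math.cos(az), math.sin(ze) * math.sin(az), math.cos(ze))


def angular_error_deg(d1, d2):
    """Angle in degrees between two unit vectors."""
    dot = sum(a * b for a, b in zip(d1, d2))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))


def azimuth_error_deg(estimated_deg, true_deg):
    """Smallest absolute difference between two azimuths, with wrap-around at 360."""
    diff = abs(estimated_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)


# Hypothetical estimate and ground truth.
wrapped = azimuth_error_deg(350.0, 10.0)
full_error = angular_error_deg(sun_direction(0.0, 0.0), sun_direction(0.0, 90.0))
print(wrapped, round(full_error, 1))
```

The wrap-around matters because azimuths of 350° and 10° are only 20° apart; a plain subtraction would report 340° and misrepresent a good estimate as a poor one.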
