- Research Article
- 10.24132/jwscg.2025-7
- Jan 1, 2025
- Journal of WSCG
- Felix Raith + 2 more
- Research Article
- 10.24132/jwscg.2024.6
- Jan 1, 2024
- Journal of WSCG
- Lintao Fang + 4 more
Fish motion is a very important indicator of various health conditions of fish swarms in the fish farming industry. Many researchers have successfully analyzed fish motion with the help of special sensors or computer vision, but their results were either limited to a few robotic fish (for ground-truth reasons) or restricted to 2D space. Therefore, there is still a lack of methods that can accurately estimate the motion of a real fish swarm in 3D space. Here we present our Fish Motion Estimation (FME) algorithm, which uses multi-object tracking, monocular depth estimation, and a novel post-processing approach to estimate fish motion in the world coordinate system. Our results show that the estimated fish motion approximates the ground truth very well and that the achieved accuracy of 81.0% is sufficient for the use case of fish monitoring in fish farms.
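The abstract does not spell out the lifting step such a pipeline needs, but a core ingredient is back-projecting a tracked 2D position plus an estimated depth into 3D so that world-space motion can be measured. A minimal sketch of that step, assuming a standard pinhole camera model (the intrinsics and function names below are illustrative, not from the paper):

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift a pixel (u, v) with metric depth into 3D camera coordinates
    via the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

def track_speed(track_2d, depths, intrinsics, dt):
    """Estimate per-frame 3D speed of one tracked object.

    track_2d : (N, 2) pixel positions from a multi-object tracker
    depths   : (N,)  per-frame depth estimates (e.g. a monocular network)
    dt       : time step between frames in seconds
    """
    pts = np.array([backproject(u, v, d, *intrinsics)
                    for (u, v), d in zip(track_2d, depths)])
    steps = np.linalg.norm(np.diff(pts, axis=0), axis=1)  # metres per frame
    return steps / dt                                      # metres per second

# Toy example: an object moving 0.1 m per frame along the optical axis.
intr = (500.0, 500.0, 320.0, 240.0)    # fx, fy, cx, cy (assumed values)
track = np.array([(320.0, 240.0)] * 5)  # stays at the principal point
depth = np.array([1.0 + 0.1 * i for i in range(5)])
speeds = track_speed(track, depth, intr, dt=0.04)  # 25 fps
```

At the principal point the lateral terms vanish, so the motion reduces to the depth change per frame divided by the frame time.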
- Research Article
- 10.24132/jwscg.2024.1
- Jan 1, 2024
- Journal of WSCG
- Boris Bordeaux + 1 more
Lacunar fractal structures reduce material quantity and weight while improving some physical properties, such as heat transfer, and preserving good mechanical properties. Nowadays, it is possible to construct such shapes thanks to additive manufacturing. This paper focuses on automatically generating subdivision rules for fractal lacunar structures with local topology control. The first main difficulty is guaranteeing topological consistency while assembling different cells to build a complicated multi-lacuna structure. The second is the adaptation of such shapes to geometric constraints, such as imposed boundaries. We address these questions through the formalism of the Boundary Controlled Iterated Function System. Then, we analyze the lacunarity and complexity of these structures using various geometric, topological, and fractal measures.
- Research Article
- 10.24132/jwscg.2024.2
- Jan 1, 2024
- Journal of WSCG
- Kevin Cardenas + 2 more
We developed Beta Caller, an end-to-end system supporting the sport of rock climbing for climbers with visual impairment. Beta Caller provides real-time, audible instructions containing a prediction for the climber’s next move while they are actively climbing a rock wall. This system leverages computer vision techniques to collect key information about the climber’s environment, enabling Beta Caller to make move predictions on climbing walls it has never encountered before. Neural networks are used to predict where the climber should move next, based on information provided by the computer vision models. The predicted move is translated into a verbal message guiding the climber to the next hold and then transmitted via wireless headphones using a text-to-speech model. This novel idea makes one of the fastest growing sports in the world even more appealing and approachable to climbers with visual impairment; however, the tool can be used by all climbers to improve their climbing skills. Beta Caller achieved 80.08% accuracy in predicting which limb the climber should move next and, when predicting the location of the next hold, a bounding box error of only 6.79%. These results lay a strong foundation for future rock climbing prediction tools for visually impaired climbers.
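The pipeline shape (detect holds, predict a move, speak it) can be illustrated with a toy geometric stand-in. To be clear, this is not Beta Caller's neural predictor; it is a hypothetical nearest-reachable-hold baseline showing how a predicted move becomes a verbal instruction:

```python
import math

def nearest_reachable_hold(limb_pos, holds, max_reach):
    """Toy baseline (NOT the paper's neural network): pick the closest
    hold that is above the limb and within the climber's reach."""
    candidates = [h for h in holds
                  if h[1] > limb_pos[1] and math.dist(limb_pos, h) <= max_reach]
    if not candidates:
        return None
    return min(candidates, key=lambda h: math.dist(limb_pos, h))

def to_instruction(limb, hold):
    """Turn a predicted move into the verbal message a TTS engine would speak."""
    return f"Move your {limb} to the hold at x={hold[0]:.1f}, y={hold[1]:.1f}"

holds = [(1.2, 1.5), (3.0, 4.0), (0.5, 0.2)]   # detected hold centres (metres)
hold = nearest_reachable_hold((1.0, 1.0), holds, max_reach=1.0)
msg = to_instruction("right hand", hold)
```

The real system replaces the geometric heuristic with learned move prediction, but the surrounding detect-predict-speak loop is the same.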
- Research Article
- 10.24132/jwscg.2024.8
- Jan 1, 2024
- Journal of WSCG
- Roman Sandu + 1 more
Modern graphics APIs expose control over the infamously non-coherent GPU caches to application programmers through the mechanisms of pipeline barriers and render passes. Developers are then asked to group their GPU computations based on memory access patterns so that cache flushes and invalidations are minimized; render graph systems enable automation of this process. In this paper, we study the problem of finding an optimal execution order for a frame graph that minimizes the number of render pass breaks, which in turn minimizes cache control operations. We formulate and analyze a novel NP-complete problem, MLGP, and use it to propose an approach to render pass merging that results in 30% fewer render pass breaks compared to previous works.
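The objective can be made concrete with a small sketch: count a "break" whenever two consecutive passes write different attachment sets, and reorder passes so identical-target passes run back to back. This is a toy grouping heuristic under assumed data structures, not the paper's MLGP-based algorithm, and it ignores the dependency constraints a real frame graph must respect:

```python
def count_pass_breaks(order, targets):
    """A render pass break occurs whenever two consecutive passes in the
    execution order render to different attachment sets.
    targets[p] is the frozen set of attachments pass p writes."""
    return sum(1 for a, b in zip(order, order[1:]) if targets[a] != targets[b])

def group_by_targets(order, targets):
    """Toy heuristic: sort passes by attachment set so passes with
    identical targets become adjacent and can share one render pass."""
    return sorted(order, key=lambda p: sorted(targets[p]))

targets = {"shadow": frozenset({"shadow_map"}),
           "gbuf":   frozenset({"albedo", "normal"}),
           "decals": frozenset({"albedo", "normal"}),
           "light":  frozenset({"hdr"})}
naive  = ["shadow", "gbuf", "light", "decals"]   # breaks at every boundary
merged = group_by_targets(naive, targets)        # gbuf and decals now adjacent
```

In this example the naive order incurs three breaks while the grouped order incurs two; the paper's contribution is finding such orders optimally while respecting inter-pass dependencies.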
- Research Article
- 10.24132/jwscg.2024.7
- Jan 1, 2024
- Journal of WSCG
- Katarina Gojkovic + 2 more
In this paper, we aim to improve the rendering of reflections using environment maps on moving reflective objects. Such scenarios require multiple reflection probes to be positioned at various locations in a scene. During rendering, the closest reflection probe is typically chosen as the environment map of a specific object, resulting in sharp transitions between the rendered reflections when the object moves around the scene. To solve this problem, we developed two convolutional neural networks that dynamically synthesize the best possible environment map at a given point in the scene. The first network generates an environment map from the coordinates of a given point through a decoder architecture. In the second approach, we triangulated the scene and captured environment maps at the triangle vertices; these represent reflection probes. The second network receives as input three environment maps captured at the vertices of the triangle containing the query point, along with the distances between the query point and the vertices. Through an encoder-decoder architecture, the second network performs smart interpolation of the three environment maps. Both approaches are based on the phenomenon of overfitting, which made it necessary to train each network individually for specific scenes. Both networks are successful at predicting environment maps at arbitrary locations in the scene, even if these locations were not part of the training set. The accuracy of the predictions strongly depends on the complexity of the scene itself.
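The naive alternative the learned interpolation improves upon can be sketched directly: blend the three probe maps of the containing triangle with barycentric weights of the query point. This linear blend (an assumed baseline, not the paper's network) ignores parallax, which is exactly what motivates the learned approach:

```python
import numpy as np

def barycentric(p, a, b, c):
    """Barycentric coordinates of point p in triangle (a, b, c)."""
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    v = (d11 * d20 - d01 * d21) / denom
    w = (d00 * d21 - d01 * d20) / denom
    return np.array([1.0 - v - w, v, w])

def blend_env_maps(p, verts, env_maps):
    """Weight the three probe environment maps by the barycentric
    coordinates of p.  env_maps: (3, H, W, C) probe images."""
    weights = barycentric(p, *verts)
    return np.tensordot(weights, env_maps, axes=1)

# Toy probes: constant-colour maps at the vertices of a unit triangle.
verts = (np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0]))
env_maps = np.stack([np.full((1, 1, 3), v) for v in (0.0, 3.0, 6.0)])
centroid = np.array([1 / 3, 1 / 3])
blended = blend_env_maps(centroid, verts, env_maps)
```

At the centroid all three weights equal one third, so the blend is the plain average of the probe maps.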
- Research Article
- 10.24132/jwscg.2024.5
- Jan 1, 2024
- Journal of WSCG
- Diana Sungatullina + 1 more
We present an approach to backpropagating through minimal problem solvers in end-to-end neural network training. Traditional methods relying on manually constructed formulas, finite differences, and autograd are laborious, approximate, and unstable for complex minimal problem solvers. We show that using the implicit function theorem to calculate derivatives for backpropagation through the solution of a minimal problem solver is simple, fast, and stable. We compare our approach to (i) using standard autograd on minimal problem solvers, relating it to existing backpropagation formulas through SVD-based and Eig-based solvers, and (ii) implementing the backpropagation with the existing PyTorch Deep Declarative Networks (DDN) framework. We demonstrate our technique on a toy example of training outlier-rejection weights for 3D point registration and on a real application of training an outlier-rejection and RANSAC sampling network in image matching. Our method provides 100% stability and is 10 times faster than autograd, which is unstable and slow, and than DDN, which is stable but also slow.
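The core idea is easy to demonstrate on a scalar stand-in for a minimal problem. If the solver returns the root x of g(x, θ) = 0, the implicit function theorem gives dx/dθ = −(∂g/∂θ)/(∂g/∂x), with no need to differentiate through the solver's iterations. A minimal sketch with an assumed toy problem g(x, θ) = x³ + θx − 1, checked against central finite differences:

```python
import numpy as np

def solve(theta):
    """Stand-in 'minimal problem solver': Newton iteration for the
    root of g(x, theta) = x**3 + theta*x - 1."""
    x = 1.0
    for _ in range(50):
        x -= (x**3 + theta * x - 1) / (3 * x**2 + theta)
    return x

def ift_grad(x, theta):
    """Implicit function theorem: from g(x(theta), theta) = 0,
    dx/dtheta = -(dg/dtheta)/(dg/dx) = -x / (3*x**2 + theta)."""
    return -x / (3 * x**2 + theta)

theta = 2.0
x = solve(theta)
analytic = ift_grad(x, theta)                 # one closed-form evaluation
eps = 1e-6                                    # finite-difference check
numeric = (solve(theta + eps) - solve(theta - eps)) / (2 * eps)
```

The analytic gradient costs a single expression evaluation at the converged solution, whereas autograd would unroll all fifty Newton steps; this is the speed and stability argument in miniature.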
- Research Article
- 10.24132/jwscg.2024.9
- Jan 1, 2024
- Journal of WSCG
- Charles Lepaire + 3 more
Strokes affected more than 795,000 individuals annually in the United States as of 2021. Detecting a thrombus (blood clot) is crucial for aiding surgeons in diagnosis, a process heavily reliant on 3D models reconstructed from medical imaging. Because these models are very dense with information (many vertices, edges, and faces in the mesh, plus noise), extracting the critical data is essential to produce an accurate analysis that supports the work of practitioners. Our research, conducted in collaboration with a consortium of surgeons, leverages generalized maps (g-maps) to compute quality criteria on the cerebral vascular tree. According to medical professionals, artifacts due to noise and thin topological changes are significant parameters among these criteria. These parameters can be determined via the Reeb graph, a topological descriptor commonly used in topological data analysis (TDA). In this article, we introduce a novel classification of saddle points and a Reeb graph variant called the Local to Global Reeb graph (LGRG). We present parallel computation methods for critical points and the LGRG, relying only on local information thanks to the homogeneity of the g-map formalism. We show that the LGRG preserves the most subtle topological changes while simplifying the input into a graph formalism that respects the global structure of the mesh, allowing its use in future analyses.
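The critical point detection underlying any Reeb graph construction can be sketched with the standard one-ring test on a triangulated scalar field: count sign changes of f(neighbour) − f(vertex) around the ordered one-ring. This is the classical piecewise-linear criterion, shown here for illustration; it is not the paper's g-map-based parallel formulation, and ties (equal values) would need symbolic perturbation in practice:

```python
def classify_vertex(f_v, f_ring):
    """Classify a vertex of a piecewise-linear scalar field from the
    signs of f(neighbour) - f(vertex) around its ordered one-ring.
    0 sign changes: extremum; 2: regular; 4 or more: saddle."""
    signs = [1 if f > f_v else -1 for f in f_ring]       # ties treated as lower
    changes = sum(1 for a, b in zip(signs, signs[1:] + signs[:1]) if a != b)
    if changes == 0:
        return "maximum" if signs[0] < 0 else "minimum"
    if changes == 2:
        return "regular"
    return "saddle"
```

A vertex whose entire ring lies below it is a maximum; an alternating ring (above, below, above, below) yields four sign changes and marks a saddle, the case whose refined classification the paper introduces.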
- Research Article
- 10.24132/jwscg.2024.3
- Jan 1, 2024
- Journal of WSCG
- Martin Čavarga
Fairing methods, frequently used for smoothing noisy features of surfaces, evolve a surface towards a simpler shape. The process of shaping a simple surface into a more complex object requires using a scalar field defined in the ambient space to drive the surface toward a target shape. Practical implementation of such evolution, referred to as Lagrangian Shrink-Wrapping, on discrete mesh surfaces presents a variety of challenges. Our key innovation lies in the integration of adaptive remeshing and curvature-based feature detection, ensuring mesh quality and proximity to target data while maintaining the stability of the solution in time. We introduce the Equilateral Triangle Jacobian Condition Number metric for assessing triangle quality and employ trilinear interpolation for enhanced surface detailing to improve upon existing implementations. Our approach is tested with point cloud meshing, isosurface extraction, and the elimination of internal mesh data, providing significant improvements in efficiency and accuracy. Moreover, we extend the evolution to surfaces with higher genus to shrink-wrap even more complex data.
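A Jacobian-condition-number triangle quality can be sketched from first principles: take the linear map from a unit equilateral reference triangle to the measured triangle and use its Frobenius condition number, normalized so an equilateral triangle scores 1. The exact definition in the paper may differ; this is an assumed formulation of the general technique:

```python
import numpy as np

# Edge matrix of the unit equilateral reference triangle, inverted once.
REF_INV = np.linalg.inv(np.array([[1.0, 0.5],
                                  [0.0, np.sqrt(3.0) / 2.0]]))

def etj_quality(v0, v1, v2):
    """Triangle quality from the Jacobian of the map taking the unit
    equilateral triangle to (v0, v1, v2).  Returns 2/kappa_F in (0, 1]:
    1 for an equilateral triangle, approaching 0 as it degenerates."""
    A = np.column_stack([v1 - v0, v2 - v0])   # measured edge matrix
    J = A @ REF_INV                            # Jacobian w.r.t. the reference
    kappa = np.linalg.norm(J) * np.linalg.norm(np.linalg.inv(J))  # Frobenius
    return 2.0 / kappa                         # 2 is the minimum of kappa_F

equilateral = etj_quality(np.array([0.0, 0.0]), np.array([1.0, 0.0]),
                          np.array([0.5, np.sqrt(3.0) / 2.0]))
sliver = etj_quality(np.array([0.0, 0.0]), np.array([1.0, 0.0]),
                     np.array([0.5, 0.01]))
```

The normalization by 2 uses the fact that the Frobenius condition number of a 2x2 matrix is at least 2, attained exactly for similarity transforms of the reference.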
- Research Article
- 10.24132/jwscg.2024.11
- Jan 1, 2024
- Journal of WSCG
- Thomas Pöllabauer + 4 more
Deep Neural Networks (DNNs) require large amounts of annotated training data for good performance. Often this data is generated using manual labeling (error-prone and time-consuming) or rendering (requiring geometry and material information). Both approaches make it difficult or uneconomic to apply them to many small-scale applications. A fast and straightforward approach to acquiring the necessary training data would allow the adoption of deep learning in even the smallest of applications. Chroma keying is the process of replacing a color (usually blue or green) with another background. Instead of chroma keying, we propose luminance keying for fast and straightforward training image acquisition. We deploy a black screen with high light absorption (99.99%) to record roughly 1-minute-long videos of our target objects, circumventing typical problems of chroma keying, such as color bleeding or color overlap between background color and object color. Next, we automatically mask our objects using simple brightness thresholding, saving us from manual annotation. Finally, we automatically place the objects on random backgrounds and train a 2D object detector. We extensively evaluate performance on the widely used YCB-V object set and compare favourably to conventional techniques such as rendering, without needing 3D meshes, materials, or any other information about our target objects, and in a fraction of the time needed for other approaches. Our work demonstrates highly accurate training data acquisition that allows training of state-of-the-art networks to start within minutes.
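The masking and compositing steps described above are simple enough to sketch directly: threshold brightness to separate the object from the near-perfectly absorbing background, then paste the masked object onto a random background. The threshold value and array shapes below are illustrative assumptions:

```python
import numpy as np

def luminance_key(frame, threshold=0.1):
    """Mask an object filmed against a light-absorbing black screen:
    any pixel brighter than `threshold` is foreground."""
    gray = frame.mean(axis=-1)           # simple per-pixel brightness
    return gray > threshold

def composite(frame, mask, background):
    """Place the masked object onto a (random) training background."""
    out = background.copy()
    out[mask] = frame[mask]
    return out

rng = np.random.default_rng(0)
frame = np.zeros((4, 4, 3))              # black screen
frame[1:3, 1:3] = 0.8                    # bright 2x2 object
background = rng.random((4, 4, 3))       # random training background
mask = luminance_key(frame)
train_img = composite(frame, mask, background)
```

Because the background absorbs almost all light, a single global threshold suffices, which is exactly why luminance keying avoids the color-bleeding and color-overlap failure modes of chroma keying.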