Multi-Chart Generative Surface Modeling
1. Multi-chart Generative Surface Modeling
Heli Ben-Hamu
Joint work with Haggai Maron, Itay Kezurer, Gal Avineri, Yaron Lipman
Weizmann Institute of Science
SIGGRAPH Asia, December 2018
3. Sinha et al. 2017
Groueix et al. 2018
Litany et al. 2018
Achlioptas et al. 2017
Wu et al. 2016
Previous Work
• Volumetric representation
• Point cloud representation
• Mesh representation
• Parametrizations
4. Litany et al. 2018
Groueix et al. 2018
Wu et al. 2016
Inefficient representation, information-wise
Partial or no connectivity
Has connectivity
Uses dense correspondence or same connectivity between models
Uses only sparse correspondence between models
Sinha et al. 2017
Achlioptas et al. 2017
Previous Work
• Volumetric representations
• Point cloud representations
• Mesh representation
• Parametrizations
Our Representation
64³
< 64³
More efficient representation, information-wise
19. Conclusions
• We devised a framework for 3D shape generation
• High quality of generated examples
• Found a constructive sufficient condition for scale-translation rigidity
Limitations and future work
• Scope: only genus zero surfaces
• Automatic choice of triplets (charts)
• Use multi-chart structure for other learning applications
20. Thank You
Acknowledgments
• ERC Consolidator Grant, "Lift-Match"
• Israel Science Foundation
• Authors of Groueix et al. 2018 and Litany et al. 2018 for sharing their results for comparison
Speaker notes
Hi everyone.
In the following talk I will present our work which introduces a 3D generative model that is based on a new tensor representation for 3D shapes with sphere topology.
With deep learning becoming a major tool for solving many long-standing problems in image applications, the desire to transfer this powerful tool to shape learning is only natural.
However, unlike images, where the data lies on a regular domain, in shape learning we face the challenge of finding a good data representation for learning. In our work, we devise a new shape representation that allows designing a generative model based on a GAN architecture.
In recent years, several works have introduced various approaches to learning generative models of shapes. The first uses the simplest analogy to images: a volumetric representation with occupancy functions on a 3D grid. The second works on the point-cloud representation of shapes, dealing with the challenge of invariance to point order. The next works directly on meshes, treating them as graphs and defining a new kind of convolution on them, and the last methods use parametrizations of the surface to a 2D domain. In our work we also use parametrizations of the surface to a 2D domain, as we believe this is the natural way to represent surfaces, since intrinsically they are 2-dimensional.
Although many works have tried to find a good representation for learning, each introduces its own disadvantages, creating a balance of tradeoffs between the different methods, and I would like to highlight where our method lies in this space. The methods of Sinha et al. and Litany et al. use either dense correspondence or the same connectivity between models, whereas our method only uses a set of sparse correspondences. The methods of Achlioptas et al. and Groueix et al., though more general in the sense that they do not need a correspondence between the models, return an output that has partial or no connectivity, while our method returns a mesh. And last, the volumetric representation suffers from low resolution, whereas we introduce a more efficient representation, information-wise.
We call our representation the “Multi-chart structure” which is a collection of parameterizations, a set of images, that collectively represent the shape. We require our representation to satisfy 2 properties: covering and scale-translation rigidity. In the following slides we will construct it and explain the need for these properties.
The building blocks of the multi-chart structure are conformal parametrizations of the sphere to a toric 4-cover, used by Maron et al. for the task of surface segmentation. Each parametrization is defined uniquely by choosing an ordered triplet of points on the surface. Cutting the sphere along the path that goes through the points, mapping it to the plane and gluing 4 copies together gives us a mapping to the plane which is homeomorphic to the flat torus. This mapping allows us to define a translation-invariant operator on the surface: convolution. And so, we have reduced our shape learning problem to an image learning problem.
For the task of shape generation we use those parametrizations to map the mesh to the flat torus: we push the coordinate functions of the mesh and sample an image from the mapping. However, when trying to recover the information stored in the image, we see that areas of the surface have disappeared due to the high scale distortion of the parametrization. We can see how the head area is shrunk so that it is enclosed within 4 pixels of the image, and consequently it cannot be recovered from this choice of parametrization. Like Maron et al., we will have to use more than a single chart to cover the whole shape. However, unlike Maron et al., where the task was to learn functions on a given, known shape, and random choices of parametrizations would eventually yield a good covering of the shape, we need a consistent choice of parametrizations that will guarantee recovery of the shape from the chosen representation.
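The sampling step described above can be sketched as follows. This is a minimal numpy illustration, with nearest-vertex lookup standing in for the proper barycentric interpolation a real implementation would use; the function name and grid convention are assumptions, not the paper's code:

```python
import numpy as np

def sample_chart(uv, xyz, res=16):
    """Rasterize the 3 coordinate functions of a mesh onto a regular grid
    over the flat torus [0,1)^2, giving a res x res x 3 "geometry image".
    uv:  (n, 2) chart coordinates of the mesh vertices on the torus
    xyz: (n, 3) 3D positions of the same vertices
    """
    grid = (np.stack(np.meshgrid(np.arange(res), np.arange(res)), -1) + 0.5) / res
    pix = grid.reshape(-1, 2)                      # (res*res, 2) pixel centers
    # Periodic (toroidal) distance between pixel centers and vertices.
    d = np.abs(pix[:, None, :] - uv[None, :, :])
    d = np.minimum(d, 1.0 - d)
    nearest = (d ** 2).sum(-1).argmin(1)           # index of closest vertex
    return xyz[nearest].reshape(res, res, 3)
```

Areas that the chart shrinks below pixel size (like the head in the example) simply never win the nearest-vertex lookup, which is exactly why a single chart cannot be inverted.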
To do so, we choose a set of triplets of points on the surface that together define a set of charts. On the model on the left, the coloring represents the maximum scale over the set of chosen charts. We choose the triplets manually, in a way that each choice of a triplet covers another area of the surface. Our final set of charts provides a good covering of the surface and is defined only by a sparse set of corresponding points.
The dimensionality of our representation is the chosen spatial resolution, which is a parameter of our choice, times 3, times the number of charts, where 3 stands for the 3 coordinate functions. It seems like we are ready to train our neural network. Unfortunately, the training results in meshes which are not smooth in small parts of the model, like the face and the palms. This is due to the fact that charts representing small parts of the model lie in a different dynamic range than, for example, charts that represent the torso, and we can see the differences in the range of colors in the charts. The solution to this problem is to learn on normalized charts, placing all the images in the same range of values. In a way, this normalization is similar to instance normalization.
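The per-chart normalization can be sketched as follows: one scalar scale and one translation per chart, matching the scale-translation setup. This is a minimal numpy sketch; the function name and the exact statistics used are assumptions, not the paper's code:

```python
import numpy as np

def normalize_charts(charts, eps=1e-8):
    """Place every chart image in a common range of values, analogous to
    instance normalization.
    charts: (k, res, res, 3) tensor: k charts, 3 coordinate functions.
    Returns the normalized charts plus the (scale, translation) removed,
    which must later be recovered to reassemble a single mesh."""
    t = charts.mean(axis=(1, 2), keepdims=True)           # per-chart translation
    s = charts.std(axis=(1, 2, 3), keepdims=True) + eps   # per-chart scalar scale
    return (charts - t) / s, (s, t)
```

Because each chart is normalized independently, the network never sees the differing dynamic ranges, but the natural scales and translations are lost; recovering them is exactly the rigidity question addressed next.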
So, after normalizing our charts, we tried once again to train our network. We can see how each generated chart reconstruction yields a very high quality smooth mesh. However, the normalization introduces a problem. Can we recover the natural scale and translation of the charts to construct a single mesh?
To answer this question we introduce the notion of scale-translation rigidity. Think of our model with the set of chosen points on it. Recall that each chart is defined by a triplet of points. Therefore, we can think of our set of charts as an abstract triangulation, where each face represents a chart, and two faces share a vertex if that point helps define both charts. The rules of the game are as follows: we are allowed to scale and translate each face, and the question we ask is:
Given that our triangulation has been "broken" under the rules of the game, can we, and how do we, recover the scale and translation of the faces up to a global scale and translation?
Looking at the simple example below of a triangulation with just 2 triangles, we see that we can easily formulate a linear system to find the original scale and translation of the faces, up to a global scale and translation. But when does this system have a unique solution?
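The two-triangle system can be written down explicitly. Here is a toy numpy sketch, in 2D to keep it small (the talk's setting is R^3), and with all numbers made up for illustration; each shared vertex contributes one linear equation per coordinate in the unknown scale and translation:

```python
import numpy as np

# Two triangles sharing an edge (B, C). Face 0 is kept fixed, which pins
# the global scale and translation; we recover the unknown scale s and
# translation t of face 1 from the shared vertices alone.
p = np.array([[1.0, 0.0],   # B as stored by face 0
              [0.0, 1.0]])  # C as stored by face 0
true_s, true_t = 2.0, np.array([0.5, -1.0])
q = (p - true_t) / true_s   # B, C as stored by the "broken" face 1

# Each shared vertex gives the equations  s*q + t = p, linear in (s, t).
A = np.zeros((p.size, 3))                 # unknowns: [s, t_x, t_y]
A[0::2, 0], A[1::2, 0] = q[:, 0], q[:, 1]
A[0::2, 1] = 1.0                          # t_x appears in x-equations
A[1::2, 2] = 1.0                          # t_y appears in y-equations
b = p.ravel()
s, tx, ty = np.linalg.lstsq(A, b, rcond=None)[0]
```

Whether such a system has a unique solution for a general triangulation is precisely the rigidity question: it depends only on the triangulation, as discussed next.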
In fact, the number of solutions to the system depends solely on the triangulation and not on a specific embedding of it. Here we see two examples of triangulations which are not rigid, meaning that the linear system has an infinite number of solutions. In the example on the left we can take one of the faces and scale it as we like, giving an infinite number of possible embeddings; note that this triangulation is not 2-connected. In the example on the right, we have a chordless cycle of length 5, that is, a cycle that cannot be shortened, and in R^3 this is also a non-rigid triangulation. We devised and proved a theorem that gives a sufficient condition for the rigidity of triangulations in terms of the length of their maximal chordless cycle and their connectivity.
The theorem states that a 2-connected triangulation whose chordless cycles have length at most 4 is scale-translation rigid in R^3.
So, we have finished constructing our multi-chart structure with the covering and rigidity properties.
The architecture we use is based on off-the-shelf architectures of GANs for images, with the change that we use a periodic convolution to have continuity on the surface, and we enforce the symmetry of the 4 copies in each chart. Furthermore, we add a layer of landmark consistency, which enforces the coincidence of points of the known sparse correspondence.
Given the output of the network, we have to reconstruct our model. First, we recover the scale and translation of the charts by solving the linear system, and then we use a template mesh of a human model, which we chose as an average of all meshes in rest pose, to return a mesh. The weighting of each vertex is according to the scale that it experiences in each parametrization, meaning that charts that represent areas of the surface well will contribute more to the values of the vertices in that area.
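The scale-weighted blending step can be sketched like this. A minimal numpy illustration; the function name and array shapes are assumptions, not the paper's code:

```python
import numpy as np

def blend_charts(chart_values, chart_scales):
    """Combine per-chart estimates of each vertex into one position.
    chart_values: (k, n, 3) position of each of n template vertices as
                  recovered from each of k charts
    chart_scales: (k, n)    scale each vertex experiences under each
                  parametrization; larger means better represented
    Weights are the normalized scales, so charts that represent an area
    well dominate the vertex positions there."""
    w = chart_scales / chart_scales.sum(0, keepdims=True)   # (k, n)
    return (w[..., None] * chart_values).sum(0)             # (n, 3)
```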
We have evaluated our results with a nearest-neighbor evaluation. The metric for finding neighbors was L2 in the charts domain. In each set of 3 models we see, on the left, the generated example, and to its right the ground-truth nearest neighbor, once as recovered from our representation and once as the original mesh. Note the differences in the faces of neighbors and the change in fit.
We have also compared our results to a set of 3 baselines: a volumetric baseline based on the work of Wu et al., AtlasNet, and the work of Litany et al.
The superiority of the quality and smoothness of our results compared to the other methods is clearly shown.
Other than human models, we have also applied our method to anatomical shapes: teeth. In this image we see a set of 25 teeth randomly generated by our model.
One last, and cool, illustration: here we see an interpolation in the latent space learned by our model. Note how smoothly it transitions between models while always keeping a human-like shape.
It is worth saying that our main limitations are that our method fits only sphere-topology (genus zero) meshes and that the choice of triplets is done manually. In the future, we would like to use the data representation we devised for other learning applications.
So, to conclude: we devised a framework for 3D shape generation which was able to generate very high-quality results. In the process we also devised and proved a sufficient condition for scale-translation rigidity.
Thank you for listening!
Mapping to the flat torus allows us to define a translation invariant operator on the surface with periodic boundary conditions.
We have thus transferred our shape learning problem to an image learning problem. This allows us to use off-the-shelf architectures designed for image learning, with the minor change that our convolution is periodic over the image boundaries, and we add a layer to enforce the symmetry of the 4 identical copies of the sphere in the torus.
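The periodic convolution can be sketched in numpy as a convolution over a wrap-padded image. In practice this would be a circular-padding convolution layer in a deep-learning framework; this minimal version is for illustration only:

```python
import numpy as np

def periodic_conv2d(img, kernel):
    """2D convolution with periodic (toroidal) boundary conditions.
    The image is a chart on the flat torus, so wrapping the borders
    (instead of zero padding) keeps the operator continuous across the
    cut of the parametrization.
    img: (H, W), kernel: (kh, kw) with odd kh, kw."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="wrap")  # circular pad
    out = np.zeros_like(img, dtype=float)
    for i in range(kh):
        for j in range(kw):
            # Accumulate each kernel tap over the shifted padded image.
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out
```

Note how a feature near one border of the chart influences outputs near the opposite border, exactly as neighboring points on the surface should.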