Replies: 13 comments 6 replies
-
The most obvious optimization is what libfive and others do: have a small virtual machine that reads in the SDF and renders it. I understand now what @doug-moen told me, that Curv is more powerful than these tools because there is no limitation on the SDF you can create. I think we should stick to this strength.
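For concreteness, here is a minimal sketch of that virtual-machine strategy: the SDF is compiled to a flat list of instructions, and a tiny interpreter evaluates it at a point. The instruction set and names here are illustrative, not libfive's actual design.

```python
import math

# A tiny SDF "virtual machine": the shape is a flat instruction list, and
# each instruction appends one value to a register file. This is a sketch
# of the libfive-style strategy; opcodes and encoding are made up.
def eval_sdf(program, x, y, z):
    regs = []
    for op, *args in program:
        if op == "x":
            regs.append(x)
        elif op == "y":
            regs.append(y)
        elif op == "z":
            regs.append(z)
        elif op == "const":
            regs.append(args[0])
        elif op == "add":
            regs.append(regs[args[0]] + regs[args[1]])
        elif op == "sub":
            regs.append(regs[args[0]] - regs[args[1]])
        elif op == "mul":
            regs.append(regs[args[0]] * regs[args[1]])
        elif op == "sqrt":
            regs.append(math.sqrt(regs[args[0]]))
        elif op == "min":  # union of two sub-shapes
            regs.append(min(regs[args[0]], regs[args[1]]))
    return regs[-1]

# Distance to a unit sphere at the origin: sqrt(x^2 + y^2 + z^2) - 1
sphere = [
    ("x",), ("y",), ("z",),
    ("mul", 0, 0), ("mul", 1, 1), ("mul", 2, 2),
    ("add", 3, 4), ("add", 6, 5),
    ("sqrt", 7),
    ("const", 1.0),
    ("sub", 8, 9),
]
print(eval_sdf(sphere, 2.0, 0.0, 0.0))  # 1.0: one unit outside the surface
```

Because the "program" is just data, any SDF expressible in the instruction set can be rendered by the same fixed interpreter, which is what makes the approach GPU-friendly.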
-
First, let's define the "operation tree" as the tree of operations used to specify a shape in Curv. We need a hybrid geometry engine that supports multiple strategies for representing and rendering shapes. Different shape representations have different tradeoffs with respect to rendering speed, for different operation trees.

In traditional low-level computer programming ("procedural programming"), what you are trying to accomplish and how you accomplish it are entwined in the same code. This creates complexity and inflexibility, and leads to a lot of work if you want to change the "how" without changing the "what", because a lot of code needs to change. But it is initially the simplest way to design a programming language. Using the procedural approach, we will provide multiple different ways to represent and render the same shape, and you will explicitly choose a strategy when designing a shape.

In declarative programming, which is high-level, it is possible to separate the "what" and the "how" into two separate pieces of code that are not intertwined. In some cases, the developer is responsible for writing the "how" code. (An example is proof assistant systems, where you write a proof in pure first-order logic or the like, clean code that just expresses the "what", and then separately you define "tactics", which are the implementation strategy for proving the proof.)

Even better and more advanced is when the system examines the operation tree of a shape and automatically chooses an optimal implementation strategy for representing and rendering that shape. One example is a query optimizer in a relational database. Writing an optimizer is traditionally difficult, but there is a relatively new technique, equality saturation over e-graphs, that automates the complex coding for the "how" part of optimization: you specify only the "what", using declarative code, as a set of transformation/equivalence rules over expressions.
Not only is the coding much simpler, but it is no longer necessary to hard-code all of the transformations in the optimizer. Transformation rules over specific operation tree patterns could also be specified using Curv code. But this is for later.

What's the relevance to making unions faster? I claim that, depending on the operation tree for a unioned shape, there are different shape representations and rendering algorithms appropriate for that operation tree. Next, what are these different representations and rendering algorithms that speed up different special cases of unions? There is some exploration of this question in the directory ideas/v-rep, which contains my initial starting points for speeding up large unions.
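To illustrate the declarative flavor, here is a sketch in which transformation rules are plain data and a generic driver applies them. This is a greedy rewriter, not equality saturation: a real engine (e.g. the egg library) would keep all equivalent forms in an e-graph and extract the cheapest. All names are made up.

```python
# Expressions are nested tuples: ("move", d, shape), ("union", a, b), ...
# Rules are functions from an expression to a replacement (or None).

def rewrite(expr, rules):
    """Apply the rules bottom-up until no rule matches (illustrative)."""
    if not isinstance(expr, tuple):
        return expr
    expr = tuple(rewrite(e, rules) for e in expr)
    for rule in rules:
        out = rule(expr)
        if out is not None:
            return rewrite(out, rules)
    return expr

def move_over_union(expr):
    """One declarative rule: move d (union [a, b]) => union of moved parts,
    which exposes the top-level union to the renderer."""
    if expr[0] == "move" and isinstance(expr[2], tuple) and expr[2][0] == "union":
        d, (_, a, b) = expr[1], expr[2]
        return ("union", ("move", d, a), ("move", d, b))
    return None

tree = ("move", (1, 0), ("union", "cube", "sphere"))
result = rewrite(tree, [move_over_union])
print(result)
# → ('union', ('move', (1, 0), 'cube'), ('move', (1, 0), 'sphere'))
```

The point is the separation of concerns: `move_over_union` states only the "what" (an algebraic identity); the driver owns the "how" of applying it.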
-
I remembered some more stuff. Different shape representations for speeding up union rendering:
A common theme of §2 and §3 is two-phase rendering, where we "compile" the shape into a data structure during evaluation, then we render using that data structure in the GPU viewer. The tradeoff may be that it's slow to update the data structure when a parameter changes by dragging a slider.
Two-phase rendering is also my idea for splines.
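A minimal sketch of the two-phase idea, assuming a union-of-spheres shape and illustrative names: phase 1 "compiles" the operation tree into a flat array (the kind of data structure a GPU viewer could read from a buffer), and phase 2 evaluates the SDF from that array without walking the tree.

```python
import math

def compile_tree(tree, out=None):
    """Phase 1: flatten a nested union tree into a list of (center, radius)."""
    if out is None:
        out = []
    if tree[0] == "sphere":
        out.append((tree[1], tree[2]))   # (center, radius)
    elif tree[0] == "union":
        for child in tree[1:]:
            compile_tree(child, out)
    return out

def render_distance(spheres, p):
    """Phase 2: evaluate the SDF at a point from the compiled array."""
    return min(math.dist(p, c) - r for c, r in spheres)

tree = ("union",
        ("sphere", (0.0, 0.0, 0.0), 1.0),
        ("union",
         ("sphere", (3.0, 0.0, 0.0), 1.0),
         ("sphere", (0.0, 3.0, 0.0), 0.5)))
spheres = compile_tree(tree)
print(len(spheres))                               # 3 primitives in the flat list
print(render_distance(spheres, (2.0, 0.0, 0.0)))  # 0.0: on the second sphere's surface
```

This also shows the tradeoff mentioned above: dragging a slider that changes the tree forces phase 1 to run again before the viewer can redraw.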
-
In the Raph Levien essay, I'm trying to understand Raph's advanced ideas for fast and flexible GPU rendering of 2D graphics. He's using a multi-phase rendering approach where, for performance reasons, each phase is implemented on the GPU, using a compute shader for each stage. This is why I was talking about the need to switch Curv over to using compute shaders for everything back in 2021. The mTec system, which efficiently renders 3D signed distance fields (unlike Raph's 2D work), also uses multistage rendering with a chain of compute shaders.
-
If you put this all together, it points towards a complete redesign of Curv (Curv 2), in which:
The idea of "multiple shape representations", a hybrid operation tree, needs more elaboration.
-
Very nice ideas 🙂 I have one question: How do the expressed ideas affect using curv as a REPL?
This sounds the most appealing to me out of everything. The simplest approach is the approach I'd really like to invest my time in. At the end of the day, I'd like anyone to be able to re-implement Curv with some confidence... Optimizing code output and linear_union sounds like it would get Curv to the level of performance I'm looking for, which is about ~500 unions in less than a few seconds. Right now it has a hard time doing ~20.
-
Speaking of hybrid shape representations, what about triangle meshes? It should be possible to read an STL file and use it with Curv shape operators. For this to work, we need some way to create an SDF for the mesh. The mTec project has a tool that converts a mesh into a voxel grid. You do this offline, then import the voxel grid. There is another project (a research paper I read years ago) that creates a BVH for the mesh, then interprets the BVH to produce an SDF. It's GPU friendly, and I recall they reported good performance.
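A sketch of how an imported voxel grid could be turned back into an SDF at render time, by trilinear interpolation of the eight surrounding samples. The grid layout and names are assumptions for illustration, not mTec's actual format.

```python
def sample_voxel_sdf(grid, p):
    """grid[i][j][k] holds the signed distance at integer lattice point
    (i, j, k); p is a point in grid coordinates (inside the grid)."""
    x, y, z = p
    i, j, k = int(x), int(y), int(z)
    fx, fy, fz = x - i, y - j, z - k

    def g(di, dj, dk):
        return grid[i + di][j + dj][k + dk]

    # Interpolate along x, then y, then z.
    c00 = g(0, 0, 0) * (1 - fx) + g(1, 0, 0) * fx
    c10 = g(0, 1, 0) * (1 - fx) + g(1, 1, 0) * fx
    c01 = g(0, 0, 1) * (1 - fx) + g(1, 0, 1) * fx
    c11 = g(0, 1, 1) * (1 - fx) + g(1, 1, 1) * fx
    c0 = c00 * (1 - fy) + c10 * fy
    c1 = c01 * (1 - fy) + c11 * fy
    return c0 * (1 - fz) + c1 * fz

# A 2x2x2 grid sampling the field d(x, y, z) = x:
grid = [[[0.0, 0.0], [0.0, 0.0]],
        [[1.0, 1.0], [1.0, 1.0]]]
print(sample_voxel_sdf(grid, (0.25, 0.5, 0.5)))  # 0.25
```

Note the resulting field is only an approximation of the mesh's true distance function; accuracy depends on the voxel resolution chosen offline.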
-
The fastest way I know to render meshes within an SDF scene only works in a special case, where the mesh is a member of a top-level union: the root of the operation tree is a union, and the mesh is one of the union's arguments. A shape optimizer could convert a more general operation tree into this special case.
-
The mTec project experimented with a number of ways to speed up SDF rendering on a GPU. For each optimization, they give the % speedup. The optimization with the greatest benefit is what is sometimes called "tile based rendering" or "beam optimization", and what they call "culling". You partition the display into small tiles (often 16x16 pixels, but they used 8x8). A compute shader creates a display list for each tile: the display list contains only the shapes in the top-level union that are visible in that tile. A separate compute shader is dispatched to render the display list for each tile. This speeds up a large top-level union because you don't have to evaluate every SDF in the top-level union for every pixel in the display. Since top-level unions are important for optimization, a shape optimizer could use algebraic identities to push a non-union top-level operation farther down into the tree. For example, `move d (union [s1, s2])` could be transformed to `union [move d s1, move d s2]`.
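A CPU sketch of that per-tile culling pass (mTec runs it in a compute shader). The bounding-box representation and all names are illustrative.

```python
TILE = 8  # tile edge in pixels; mTec used 8x8

def build_display_lists(width, height, shapes):
    """shapes: list of (shape_id, (xmin, ymin, xmax, ymax)) screen-space
    bounding boxes in pixels. Returns {(tx, ty): [shape_id, ...]} mapping
    each tile to the shapes that can possibly affect it (empty tiles omitted)."""
    lists = {}
    for shape_id, (x0, y0, x1, y1) in shapes:
        # Range of tiles the bounding box overlaps, clamped to the screen.
        tx0, ty0 = max(0, int(x0) // TILE), max(0, int(y0) // TILE)
        tx1 = min((width - 1) // TILE, int(x1) // TILE)
        ty1 = min((height - 1) // TILE, int(y1) // TILE)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                lists.setdefault((tx, ty), []).append(shape_id)
    return lists

# 32x16 screen: shape "a" covers the left edge, "b" a small box on the right.
lists = build_display_lists(32, 16, [
    ("a", (0, 0, 7, 15)),
    ("b", (20, 4, 27, 11)),
])
print(lists[(0, 0)])  # ['a'] — only "a" must be evaluated in this tile
```

The rendering pass then evaluates, per pixel, only the SDFs on its tile's list, which is where the speedup for large top-level unions comes from.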
-
There are many ideas here; they illustrate the scope of what's possible. For a "Curv 2" project I would focus on a small, high-value subset, comprising 1. a new language, 2. a new SubCurv compiler, 3. a new graphics engine.
-
Curv has an AST; that's easy. It has a tree-walking AST interpreter; that's also easy.
-
I discussed this large-union problem with Doug years ago, and I just went through that discussion thread. I can share some of what we previously discovered, especially if you want to accelerate things without a complete compiler rewrite or the like. The main problem is the length of the compiled code. Here are two little tricks:
If you only want to export the shape to STL, without needing to render it, then you can take the union of a very large number of shapes, even more than 100,000 (I don't know the exact limit). For exporting, you can accelerate further by manually separating the model into regions ahead of time, importing the region information into the Curv file, and then writing your make_shape function region by region, going through only the elements involved in each region. I eventually got this pretty fast, and large-union exporting is no longer a concern for me: for a large number of shapes (several thousand to hundreds of thousands; I don't remember the exact number) it usually takes only several minutes to export. The largest case I have run took maybe several hours. BTW, I used multi-threading when exporting, and the -O jit flag.
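One way to picture the "separate regions ahead of time" trick: bucket shapes into a coarse grid by bounding box, so the distance function for a query point only loops over nearby shapes instead of the whole union. This sketch uses 2D circles and made-up names, not the actual Curv code.

```python
import math

REGION = 10.0  # region edge length, chosen to suit the model

def bucket_shapes(circles):
    """circles: list of (center, radius). Returns {region_key: [circle...]};
    a circle is added to every region its bounding box touches."""
    regions = {}
    for c, r in circles:
        lo = [math.floor((ci - r) / REGION) for ci in c]
        hi = [math.floor((ci + r) / REGION) for ci in c]
        for i in range(lo[0], hi[0] + 1):
            for j in range(lo[1], hi[1] + 1):
                regions.setdefault((i, j), []).append((c, r))
    return regions

def region_distance(regions, p):
    """Distance at p, looking only at shapes bucketed into p's region."""
    key = (math.floor(p[0] / REGION), math.floor(p[1] / REGION))
    local = regions.get(key, [])
    if not local:
        return float("inf")  # nothing nearby
    return min(math.dist(p, c) - r for c, r in local)

circles = [((1.0, 1.0), 1.0), ((25.0, 25.0), 1.0)]
regions = bucket_shapes(circles)
print(region_distance(regions, (2.0, 1.0)))  # 0.0: on the first circle
```

The per-region distance function is only a correct SDF near that region, which is fine for meshing/export where each sample point knows which region it lies in.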
-
I was reading about SDFs and came across this discussion. From a compiler perspective, floating-point operations are hard to optimize because the operations are generally not associative. For floating-point operations, it is sometimes possible to convert one expression into another that is faster and has bounded error: https://herbie.uwplse.org/
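A two-line demonstration of the non-associativity: regrouping a floating-point sum changes the result, which is why a compiler cannot freely re-bracket SDF arithmetic.

```python
# IEEE 754 doubles: 1e16 has a spacing (ulp) of 2, so adding 1.0 to it
# is absorbed, but adding 2.0 is not. Regrouping changes the answer.
a, b, c = 1e16, 1.0, 1.0
left = (a + b) + c   # each 1.0 is absorbed separately -> 1e16
right = a + (b + c)  # the 2.0 survives -> 1.0000000000000002e16
print(left == right)  # False
```

Tools like Herbie search for regroupings and algebraic rewrites whose rounding error is provably acceptable, which is exactly what a naive optimizer cannot assume.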
-
The biggest blocker for SDFs, for me, is the union of shapes. This causes massive amounts of computation per pixel in the fragment shader. We need to come up with a way to still have unions, but without the massive computation.
At its core, we need
`min(a, b)`
to be minimized in some way. Maybe `min` is actually bad overall for shader size? Maybe we should be using `if`? Etc. I'm open to all ideas. Remember that it has to work for both rendering and meshing!
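One direction for reducing the per-sample cost of `min` over a large union is to skip children that provably cannot win. A CPU sketch using bounding-sphere lower bounds; the names are illustrative, and this is not Curv's implementation:

```python
import math

def naive_union(spheres, p):
    """Evaluates every child at every sample: cost grows linearly."""
    return min(math.dist(p, c) - r for c, r in spheres)

def pruned_union(spheres, p):
    """Skips children whose cheap lower bound cannot beat the best so far."""
    best = float("inf")
    evaluated = 0
    for c, r in spheres:
        # Cheap conservative bound: exact for spheres; for a complicated
        # child this would be the distance to its bounding sphere.
        bound = math.dist(p, c) - r
        if bound >= best:
            continue  # cannot improve the minimum: skip the full SDF
        evaluated += 1
        best = min(best, bound)  # full child SDF evaluation would go here
    return best, evaluated

spheres = [((i * 3.0, 0.0), 1.0) for i in range(100)]
best, evaluated = pruned_union(spheres, (0.0, 0.0))
print(best, evaluated)  # -1.0 1 — only one of 100 children fully evaluated
```

This works for both rendering and meshing, since it returns the same distance as the naive `min`; it only avoids evaluating children that cannot affect the result.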