17 October 2020

ShipBasher Development Log 14: The GPU Bullet Collision Saga, Episode 4: Conclusion...?

Having solved The Big Problem®, I was afforded no respite before having another problem to address. I noticed along the way that the testing code I had added to draw debug lines for all of the bullets that had entered bounding spheres was displaying a different set of bullets than the bullet rendering script was displaying. When I sucked up the performance impact and had the rendering script draw debug lines for every bullet in existence, I found something worrisome: all of the visible bullets were in fact behaving as they should and appearing in their correct locations, but a fair chunk of the bullets that had been fired were invisible!

I temporarily disabled stretching for the bullet sprites and made them really big so it was obvious which were being displayed properly and which only via debug lines (tiny blue dots).

At first I figured that some conditional statement or other was hiding bullets, so one by one I tried disabling all such checks. I even made the system draw red debug lines for inactive bullets. I had no luck.
Then I figured maybe something was up with the collision system. Fortunately that already has a simple Off switch, but it turned out that wasn't it either.
It did turn out, though I didn't bother counting them myself and don't expect anyone else to do so, that exactly 2/3 of the bullets were invisible.
One may notice that 3 is the exact number of vertices in a triangle! HALF LIFE 3 CONFIRMED

Just kidding. The real reason this is relevant is that I was calling the sparsely documented function Graphics.DrawProcedural(). Because my geometry shader outputs triangles, I figured that when it asks for the MeshTopology argument, I should say to use triangles. Nope! Somehow that was causing the shader to be informed that two out of every three vertices were part of a triangle belonging to the first vertex and should be skipped by the geometry shader. Odd. Changing the MeshTopology argument to specify points (individual vertices) fixed it.

So yeah the moral of the story is that if you're drawing a point cloud using Graphics.DrawProcedural, use MeshTopology.Points. Hooray! Now my system draws three times as many bullets with one simple change in code and no significant change in performance! Here's how the system looked after all the recent improvements:

The captions in the image should be fairly self-explanatory. I can stretch the bullets based on their absolute world-space velocities or their velocities relative to the camera or some other object, I can spawn particle effects at their precise points of impact, and when the bullets ricochet they finally do so based on proper reflection vectors and I can specify how much of their original velocity to maintain and how much to randomly scatter. There is a tiny amount of imperfection in the impact positions still, but I feel that I've refined it as much as I need for the time being.

Here's another comparison, this time showing all of the iterations thus far:

Observe how, as the camera moves in order to maintain its position relative to the target ship, the purple bullets stretched according to their absolute world-space velocities appear to stretch in the wrong direction, whereas the cyan bullets appear much more correct. It's not shown above, but I can also have the cyan bullets maintain some portion of their forward velocities and penetrate into the target rather than bounce. This could be useful on some kind of powerful railgun that can punch through armor plates to damage modules inside a ship.
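To make the difference between the two stretching modes concrete, here is a minimal sketch in Python (for illustration only, not the game's actual shader code). The function name `stretch_vector` and the `stretch_time` parameter are hypothetical; the only idea taken from the text above is that the stretch follows either the bullet's world-space velocity or its velocity relative to some reference such as the camera.

```python
def stretch_vector(bullet_velocity, reference_velocity=(0.0, 0.0, 0.0), stretch_time=1.0):
    """Return the offset along which a bullet sprite would be stretched.

    With the default reference velocity of zero this is the absolute
    world-space velocity; passing the camera's velocity gives the velocity
    as the camera perceives it. stretch_time is an assumed scale factor.
    """
    return tuple((b - r) * stretch_time
                 for b, r in zip(bullet_velocity, reference_velocity))


# A bullet moving forward more slowly than the camera that is tracking the ship.
bullet_v = (0.0, 0.0, 50.0)
camera_v = (0.0, 0.0, 80.0)

absolute = stretch_vector(bullet_v)            # points forward in world space
relative = stretch_vector(bullet_v, camera_v)  # points backward as seen on screen
```

In this example the world-space stretch points forward while the bullet drifts backward on screen, which is the mismatch visible on the purple bullets.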

Also note that I finally fully implemented the option to have bullets become inactive upon striking a module, as I expect will be the case for most bullets in the final game. And look! I even got bullet damage working:

With all these kinks worked out, my next task was optimization. When I originally cooked it up, even with all its flaws, the GPU Bullet System could handle over 100,000 bullets on screen at once before any noticeable performance drop. At this point? Not so much.

The bulk of the problem is that the GPU can easily draw tons of identical sprites, but communication between the GPU and CPU is relatively slow, even on an integrated graphics chip. While the actual GPU and CPU are intimately connected, literally sharing a casing, in this architecture (if I'm not mistaken) the GPU stores its data in the same place the CPU does - the RAM. Thus, every time a buffer needs to travel between one memory region and the other, I lose several cycles of processing waiting for the data to be accessed from the RAM. Since each different type of bullet needs its own set of compute buffers, and at least one of these has to travel one way or the other up to four times per frame, having many different kinds of bullets in play at a time (as I do currently) leads to a significantly lower framerate.

Currently I'm investigating a few remedies to this.
By doing some optimization work in the code to reduce unnecessary operations and eliminate garbage generation wherever possible, I've sped it up by a noteworthy factor, but I still have several milliseconds' delay every frame while the CPU waits on data from the GPU. I've experimented with making this subsystem asynchronous, but that allows the data to arrive in an entirely different frame from the one during which I requested it. I then have to deal with discrepancies between where the bullets are in the buffer I've received and where they are in the main buffer, which in turn leads to grossly inaccurate results from physics queries and, once again, to my frustration, bullets floating through the target ship without touching it.
Next I think I'll investigate the possibility of only having one bullet manager script for the whole game and having each bullet within the buffer carry a bunch of extra data about what sort of bullet it is. Depending on how far I go with this, it could get very, very complicated.
There's also still the option of keeping it as-is. I'm open to suggestions on this.

Finally I leave you with a picture of what happens when I dispense with concerns about framerate entirely and make the system display ONE MILLION BULLETS!

There's no starfield background here - every single little white dot in the image is a bullet that the system is processing and rendering. The framerate is less than stellar in this situation but still surprisingly high, and I surmise that a more powerful GPU than mine would handle it easily.

13 October 2020

ShipBasher Development Log 13: The GPU Bullet Collision Saga, Episode 3: for(int i = r; i < r % z + r; i += r - z > i ? 1 : z % (r + z)){ z -= r; r += i; r = r % z + i > 0 ? i : r - z % r + z; z -= r; }

That "code" in the title isn't what's actually in the game (in fact I doubt it'd even compile, let alone do anything useful), but it is a caricature of how my code was starting to look to me at one point during the long debugging process I mentioned undertaking at the end of the last post.

See, I had also upgraded the compute shader one more time with a fourth compute buffer - this one representing bullets that the physics engine had confirmed actually did hit a ship. As of now I'm still just having them bounce off, but pretty soon I'm going to want them to stop existing, or in more precise terms as far as my code is concerned, become inactive so they stop being displayed and hitting things. Maybe sometimes I'll want bullets to survive and have secondary impacts (or penetrate through modules) but the option to deactivate bullets that have impacted is important. This buffer gets filled up on the CPU end of things, and then at the end of each frame, after the collisions have all been addressed, it gets sent to the GPU containing copies of all these bullets, notably including their indices in the original buffer so that the compute shader knows which of the original bullets have now become inactive.

Here, I'll recap with a flowchart that might help or might just make this all even more confusing:


Rounded rectangles represent the game systems related to GPU bullets; ellipses are tasks the systems can do, and the clouds surround the tasks belonging to a given system. The parallelograms represent the four compute buffers, and the arrows represent how each system affects each buffer or other system.

As shown here, the persistent data for the bullets is stored in the GPU's memory, rather than the CPU's as is the case for most of the game's data. Note how in order for the game to function, communication must occur back and forth between the CPU and GPU.

Because GPU code doesn't deal in pointers the way CPU code does, any time a data structure has to travel between one and the other, the data itself gets sent as a copy. Thus when, for example, the compute shader adds a bullet that has entered a bounding sphere, the bullet in the main buffer remains where it is, unchanged, while the new buffer entry is a copy of that bullet's data. In order to keep track of which data belongs to which bullet, I simply include an ID in that data. The first bullet that ever gets fired is given ID number 0, then the next 1, and so on up to the maximum number of bullets I allow in the game configuration, e.g. if I allow 10000 bullets then the last one is number 9999. After that, if I fire a "new" bullet, what actually happens is I overwrite bullet 0, which is bound to have either hit something or drifted off into deep space by this point. This is a common programming concept called a ring buffer or circular buffer, which is a type of object pool.
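The ID scheme described above can be sketched in a few lines of Python. This is an illustration of the ring buffer idea only, not the game's code; the class name `BulletPool` and the field names are made up for the example.

```python
class BulletPool:
    """Fixed-size ring buffer: IDs run from 0 to capacity - 1 and then wrap."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity  # stands in for the main bullet buffer
        self.next_id = 0

    def fire(self, position, velocity):
        """Store a new bullet, overwriting the oldest slot once the pool is full."""
        bullet_id = self.next_id
        self.slots[bullet_id] = {
            "id": bullet_id,
            "position": position,
            "velocity": velocity,
            "active": True,
        }
        # Wrap back to slot 0 after the last slot has been used.
        self.next_id = (self.next_id + 1) % self.capacity
        return bullet_id


pool = BulletPool(capacity=3)
ids = [pool.fire((0.0, 0.0, 0.0), (0.0, 0.0, 1.0)) for _ in range(4)]
# The fourth bullet reuses slot 0.
```

Because each bullet's ID is also its slot index, any copy of the bullet that travels elsewhere can always be traced back to its entry in the main buffer.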

When a bullet enters a bounding sphere and a copy is added to the corresponding buffer and in turn sent to the CPU and the physics engine, unlike the bullets in the main buffer, that copy only exists for that frame. It's a trivial task for the GPU to regenerate it as it iterates through every bullet every frame anyway, so this has insignificant performance impact. In order for anything to happen, the physics engine must give a positive result when the bullet is evaluated that frame; otherwise it goes away when the buffer gets reset and thus nothing happens to the main buffer.

If a collision is detected, then that bullet's data is copied once again, this time into the buffer representing bullets that have hit something and must either be redirected or deactivated. Each type of bullet is handled by a corresponding instance of the bullet manager script and its own associated set of compute buffers, and for each type of bullet I can configure whether to allow ricochets or delete bullets after impact; based on which option applies, the data copied into the "impactor" buffer can contain a modified velocity or a flag indicating that it represents a bullet that should be deactivated. All bullets that have struck something are added to this buffer and then the buffer is read by the compute shader.

Because each bullet's complete data is copied every time, not only are its important position and velocity preserved, but also its original ID from when it was fired. Thus when the compute shader receives all the impacted bullets, it can match their IDs with the corresponding IDs in the main buffer and change those bullets accordingly, altering their velocities or deactivating them. In the end, bullets and the things that have happened to them persist frame after frame as long as they are needed.
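As a rough illustration of that last step, here is a Python sketch of applying the "impactor" records back to the main buffer by ID. The real work happens in a compute shader; the function name `apply_impacts` and the record fields (`id`, `deactivate`, `velocity`) are assumptions made for this example.

```python
def apply_impacts(main_buffer, impact_buffer):
    """Apply each impact record to the main-buffer bullet with the same ID.

    main_buffer is a list indexed by bullet ID; impact_buffer holds copies
    carrying either a deactivate flag or a replacement velocity.
    """
    for impact in impact_buffer:
        bullet = main_buffer[impact["id"]]
        if impact["deactivate"]:
            bullet["active"] = False
        else:
            bullet["velocity"] = impact["velocity"]


main_buffer = [
    {"id": 0, "velocity": (0.0, 0.0, 1.0), "active": True},
    {"id": 1, "velocity": (0.0, 0.0, 1.0), "active": True},
    {"id": 2, "velocity": (0.0, 0.0, 1.0), "active": True},
]
impact_buffer = [
    {"id": 2, "deactivate": True, "velocity": None},               # bullet removed
    {"id": 0, "deactivate": False, "velocity": (0.0, 0.0, -0.5)},  # ricochet
]
apply_impacts(main_buffer, impact_buffer)
```

Bullet 1 is untouched because no impact record carries its ID, which is why a correct ID-to-index mapping matters so much here.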

I'm almost to the point where I explain the big problem I faced. There's just one more thing to explain. To prevent this becoming too much of a textbook page, here's another picture:

Another screenshot of my debugging process. The yellow specks are the debug lines for bullets that have bounced off the target ship and now entered the test ship's bounding sphere, which I temporarily made larger. You may notice something suspicious going on here that I'll address in the next entry.

Back on topic, I mentioned above that I assigned an ID to each bullet as it was fired. Unfortunately it's not quite as simple as slapping on a number during the firing function. If I were to individually update members of the compute buffer as bullets were fired, it would cause lots of little data packets to have to be sent to the GPU - possibly lots every frame if the overall firing rate is high, as is the case with a rapid-fire gun or a large number of guns firing at once. Due to the way computers are designed, this would cause soul-crushing lag. Rather, the optimal way to go about this would be to gather up all the bullets that have been fired in a given frame in a nice little array and then at the end of the frame send one packet containing that array to the GPU - so that's what I did. Thus the firing function didn't count up numbers in the buffer itself, but rather the number of bullets that had been fired that frame. I called this number R.

What this means of course is that when I send the array of new bullets, I also need to tell the compute shader where to start changing bullets in its compute buffer, so I added a second counter value, which I called Z. Every time a bullet was fired, I would add one to R and to Z. At the end of every frame, once all the new bullets were ready to submit, R would reset to zero, but Z would remain as-is, tracking how many bullets had been fired ever, or at least since the last time I had made a full loop of the ring buffer. By doing math (see title) with R and Z, I could discern where in the compute buffer to start making changes and how many entries to update, and I could even check whether I had run past the end of the compute buffer and needed to go back to the beginning. Soon after I had implemented this, I had bullets happily whizzing out of the gun barrels frame after frame with no invalid array index exceptions or whatever, and I washed my hands of this and shifted gears to things like the collision detection that's been the focus of the last few devlogs.

But then I noticed something very, very strange. At high rates of fire, everything seemed perfect, but if I happened on a whim to switch to a low rate of fire, only a few bullets per second... bullets would float through ships unimpeded for a short time and then bounce off empty space as if it were the ship!

What's going on here?!? There aren't a lot of bullets visible so I drew some arrows to make it clearer - the bullets (blue crosses) are traveling up from bottom right, passing through the target ship, and then bouncing near the top and going off to bottom left even though there isn't anything at the top! No, there were no invisible or inactive colliders or anything simple like that.

When I first noticed bullets occasionally ignoring collisions, I figured it was the physics engine being unreliable and played around with how I did collision detection queries. No luck.

It seemed like the bounce was occurring not only at the exact rate of fire, but at the exact time firing occurred, so I investigated my weapon and module classes in extreme detail. No luck.

I even investigated my compute shader, geometry shader, and bullet rendering script. Swallowing the huge performance impact, I had Unity draw little blue debug lines on every single bullet all the time - those are the blue crosses visible above. No luck!

I was starting to go nuts and getting increasingly tempted to give up and resign myself to releasing a buggy game (okay I'm sure it'll be buggy anyway, but I should at least try to address the ones I do catch, right?)... but eventually I did figure it out. Notice that pink line? That's the debug line I have Unity draw to represent the new velocity a bullet is given when it impacts a ship - a line that only appears during the frame during which that impact occurs. The bullet bouncing off empty space is occurring exactly when, and only when, the next bullet actually strikes the ship!

It turns out that the second counter, Z, which tracks cumulative bullets, incremented after each bullet was fired and then was used as the start index for the compute buffer. Thus Z would begin at 0, I'd fire bullet 0, then Z would become 1. I'd fire bullet 10 and then Z would become 11. Thus when it was time to update the compute buffer, I'd say to start at index 11 rather than index 10 - every bullet would get assigned to the index just after its ID, and thus, as these IDs traveled back and forth during collision detection and eventually the compute shader compared the bullet IDs to the buffer indices, every time a bullet hit something the compute shader would go and edit the index matching that bullet's ID, i.e. the bullet just before - in other words, each bullet would bounce if the bullet just after it hit a ship. Yes that probably sounds a little confusing and it threw me for a loop (pun not planned but welcome) too.

Once I had that figured out, I went and re-did my counting system to be much more sensible. Now, R increments with every bullet, but Z does not - rather, Z stays put while the new bullets array is built and then increments by R afterward. Problem solved, though my head hurt.
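For the curious, the corrected counting scheme looks roughly like the following Python sketch. It is not the actual manager script: the class name `FiringQueue` is invented, a plain list stands in for the compute buffer, and the length of `pending` plays the role of R.

```python
class FiringQueue:
    """Batch the bullets fired during a frame and upload them in one go."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.z = 0         # start index for the next upload; fixed during a frame
        self.pending = []  # bullets fired this frame; len(pending) is R

    def fire(self, bullet):
        # The ID is the slot the bullet will occupy: Z plus its place in the batch.
        bullet_id = (self.z + len(self.pending)) % self.capacity
        self.pending.append(dict(bullet, id=bullet_id))
        return bullet_id

    def flush(self, gpu_buffer):
        """Write the batch starting at Z, then advance Z by R and reset R."""
        start, count = self.z, len(self.pending)
        for offset, bullet in enumerate(self.pending):
            gpu_buffer[(start + offset) % self.capacity] = bullet
        self.z = (self.z + count) % self.capacity
        self.pending = []
        return start, count


gpu_buffer = [None] * 4
queue = FiringQueue(capacity=4)
first_ids = [queue.fire({"speed": 1.0}) for _ in range(3)]
first_flush = queue.flush(gpu_buffer)   # writes slots 0, 1, 2
second_ids = [queue.fire({"speed": 1.0}) for _ in range(2)]
second_flush = queue.flush(gpu_buffer)  # writes slot 3, then wraps to slot 0
```

The important property is that every bullet lands in the slot whose index equals its ID. The bug described above broke exactly that property by shifting every slot one past its ID.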

Of course, after every bug there's another bug...

11 October 2020

ShipBasher Development Log 12: The GPU Bullet Collision Saga, Episode 2: There and Back Again, a Bullet's Tale

When I left off, I had just finished bragging about how I got information about bullets and ships to the compute shader so they could interact, and then I confessed that even with all the extra additions the bullets still couldn't have any effect on the ships in the actual game. Information about the bullets and ships was getting to the GPU and the compute shader, but no information was getting back from there to the CPU and the rest of the game's code; of most immediate importance was getting information to the physics engine.

I expanded the compute shader yet again to use a third compute buffer, this one representing bullets that had intersected a ship's collision sphere. This starts off empty and every time a bullet intersects such a sphere, it gets copied into the new buffer. The bullet manager is able to read this buffer every frame and thereby discover which bullets are inside a ship's radius and thus need proper collision checks. To start off I had Unity draw debug lines to represent each of these:

There are a bunch of lines in this image, including the purple lines I have the script draw to vaguely indicate the bounding sphere of the target ship (the cylindrical object at center), but the focus for the moment is on the yellow lines. The blue crosses (which ironically I added later) are drawn by the "Point Cloud Renderer" script in charge of displaying the bullets. Every bullet that's inside the bounding sphere of the target ship is drawn with a yellow line running from its current position to its expected position in the next frame based on its velocity. Note that these are drawn, and collisions are handled, before the blue crosses are drawn by the rendering system, so the blue crosses generally line up with the forward ends of the yellow lines but sometimes are in different places as a result of a collision during this frame.

Now yes, for a very simple ship such as this target ship that only consists of one blocky module, it would be simpler to just send the collider to the compute shader and do all of the collision detection there, but in the finished game I expect ships to be composed of many modules and have odd shapes. Since the developers of PhysX (Unity's built-in physics engine) have already sunk a lot of time and expertise into doing precise and efficient collision detection, I intend to take advantage of the existing system here. Each yellow line not only depicts the bullet's projected trajectory, but traces the raycast used to query the physics engine for precise collision results. These not only include yes/no answers to whether the bullet struck something, but details about what it struck and how including surface normal vectors, the precise location of the impact, etc. In short, round-trip communication was now established and it had become possible to do proper game things like spawn explosion particles and apply damage to modules.

I also proceeded to take the "bouncing" placeholder code out of the compute shader and add a somewhat better bouncing function in the manager script. Bullets thus ricocheted off the ships' hulls and did so based on the surface normal vectors:

From left to right in the lower image: laser turret for reference; "old" turret without target leading (note how it misses the moving target); improved turret with instantiated prefab bullets; two different variations using the GPU Bullet System but without collision detection (orange and yellow - note how the bullets pass through the target with no effect); turret using GPU Bullet System and rough collisions based on bounding spheres (purple - note how the bullets bounce off in a scattered cloud); turret using latest collision detection features. The new system at this point caused bullets to reflect off the surface normals, but do so in a very "perfect" fashion so that they all came back in a nearly perfect single-file line.
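The "perfect" reflection mentioned here is the standard mirror formula v' = v - 2(v·n)n, where n is the unit surface normal. Below is a small Python sketch of it, with two optional knobs for reducing speed and adding random variation. The function name `ricochet` and the `retain` and `scatter` parameters are hypothetical, and the scatter model (uniform noise per axis, scaled by speed) is just one simple choice, not necessarily what the game does.

```python
import random


def ricochet(velocity, normal, retain=1.0, scatter=0.0, rng=random):
    """Reflect velocity about a unit surface normal.

    retain scales the reflected velocity; scatter adds uniform random noise
    per axis, proportional to the reflected speed. Both are assumed knobs.
    """
    dot = sum(v * n for v, n in zip(velocity, normal))
    reflected = [v - 2.0 * dot * n for v, n in zip(velocity, normal)]
    speed = sum(c * c for c in reflected) ** 0.5
    return tuple(retain * c + scatter * speed * rng.uniform(-1.0, 1.0)
                 for c in reflected)


# A bullet hitting a floor (normal pointing up) at 45 degrees.
perfect = ricochet((1.0, -1.0, 0.0), (0.0, 1.0, 0.0))
slowed = ricochet((1.0, -1.0, 0.0), (0.0, 1.0, 0.0), retain=0.5)
```

With `scatter` left at zero every bullet hitting the same surface at the same angle leaves along the same line, which matches the single-file behavior described above.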

After a few minor adjustments, such as allowing bullets to ricochet at reduced speeds (purple) and with some random variation in direction (cyan):

Things were looking pretty good, except that I kept occasionally noticing that a few bullets would still somehow manage to float effortlessly through the target ship as if it weren't there. I kept wanting to dismiss the problem as just a minor quirk, but looking objectively at the situation, when there were a lot of bullets flying around, the small fraction that didn't collide still made up a lot of bullets, too many to ignore.

I kept poking at this problem until it started to drive me crazy. I carefully examined debug lines such as those above, stepping through frame-by-frame in hopes of gleaning what made the non-colliding bullets special, and oooh, did I find quite a big problem lurking at the bottom of it all. Look forward to the depths of my frustration in the next entry.

09 October 2020

ShipBasher Development Log 11: The GPU Bullet Collision Saga, Episode 1: Spheres

I concluded the last log with a paragraph about how I planned to continue integrating my GPU Bullet System into ShipBasher by establishing round-trip communication between CPU-based physics code and a compute shader running on my GPU for bulk processing of bullets. As of now I've finally achieved this as well as uncovered and fixed some shocking bugs. The path here was quite a saga, so I shall recount it in parts.

Several years ago I started playing EVE Online and have gone back and forth between active play and long breaks ever since. I love many facets of the game and it's probably little surprise that it's one of the inspirations guiding my development of ShipBasher, both aesthetically and mechanically.

Because EVE Online runs on a single massive server cluster that has to handle tens of thousands of concurrent active players at times, there's no budget for careful, precise physics calculations when big swarms of ships start yeeting clouds of bullets at each other. As far as my research has led me to understand, the server instead abstracts away all the bullet motion and simply treats every ship as a sphere - a shape that can be fully defined with only four numbers, those being its position in each of three dimensions and its radius. With this knowledge, the server can get a fairly decent approximation of the ability of one ship in one place to damage a ship of a given size in some other place. Whenever a weapon fires, it crunches these numbers and a few others and determines what happens - all further detail is just visual effects.

All those little red, orange, and blue squares are indicators of players' ships. One can see the need for performance optimization.

I don't intend to approximate this roughly in ShipBasher, but I saw great potential in the idea of treating each ship as a simple sphere for coarse collision detection. I realized I could have every ship compute how big a sphere would fully enclose it, tell the GPU Bullet System what that value is, and thereby enable that system to easily differentiate between bullets near enough to a ship to be likely to touch it and bullets adrift in space unlikely to be touching anything. Presumably, most of the time the bullets about to hit things will be a minority of all the bullets that exist (since there's a lot more space not inside a ship than there is inside a ship in most circumstances), so if I can narrow those down and only do physics calculations for those, I can support much larger quantities of bullets without a much larger performance impact. The first step to doing this, of course, is to get those bounding spheres, which, like many things in programming, proved more complicated than it sounds.

Modules consist, in the game engine, of combinations of physical objects and visual objects, which don't necessarily (and usually don't) exactly match in shape or size. Most visual objects are "meshes," collections of 3D vertices connected with triangles, and most physical objects are "colliders," which are sets of equations that define geometric shapes and are invisible, but important for running simulations of solid objects. Calculating the exact radius of a collider is usually fairly simple, but requires a different strategy for every given type of collider that might exist in the finished game, and calculating the radius of a mesh is conceptually simple but very tedious. Fortunately, one thing that colliders and meshes have in common is that the engine uses axis-aligned bounding boxes as representations of their rough sizes. I could have used these instead of spheres, but it would have added slightly more work for the compute shader, and every bit of optimization counts in a system that might have to handle hundreds of thousands of bullets at once.

I investigated a few strategies for converting bounding boxes into approximate spheres and eventually settled on iterating through every collider and renderer (a visual object that has a bounding box - usually a mesh) attached to a given ship and, based on its relative position and the radius of its bounding box, incrementally calculating an approximate radius for the whole ship. This technique should be mathematically guaranteed to never give a result smaller than the "true" radius of the ship, but typically does overestimate slightly. Fortunately for my purposes, the smaller each individual module is relative to the ship, the more precise the overall calculation ends up being.
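One way to do this kind of conservative estimate is sketched below in Python. It may differ in detail from what the game does: for each bounding box it takes the distance from the ship's center to the box's center plus the box's half-diagonal, and keeps the largest such value. The function name `bounding_radius` and the `(center, extents)` box format are assumptions for the example, with extents meaning half-sizes along each axis.

```python
import math


def bounding_radius(ship_center, boxes):
    """Conservative radius of a sphere around ship_center enclosing all boxes.

    boxes is an iterable of (center, extents) pairs, where extents are the
    half-sizes of an axis-aligned bounding box along x, y, and z.
    """
    radius = 0.0
    for box_center, box_extents in boxes:
        offset = math.dist(ship_center, box_center)
        # The half-diagonal is the farthest any corner can be from the box center.
        half_diagonal = math.sqrt(sum(e * e for e in box_extents))
        radius = max(radius, offset + half_diagonal)
    return radius


boxes = [
    ((3.0, 4.0, 0.0), (0.0, 0.0, 0.0)),  # a point-sized part 5 units away
    ((0.0, 0.0, 0.0), (2.0, 3.0, 6.0)),  # a box at the center, half-diagonal 7
]
radius = bounding_radius((0.0, 0.0, 0.0), boxes)
```

By the triangle inequality no corner of any box can lie farther from the ship's center than this, so the result never underestimates. It overestimates whenever a box's far corner doesn't point directly away from the center, and that error shrinks as the boxes get small relative to the ship.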

I didn't actually need to include renderers at this point, but later on I expect to reuse the radius value in a few other parts of the game, and I want players to see a radius that is consistent with how big the ship actually appears to be.

With that part out of the way for now, I changed tack and started getting the GPU Bullet System ready to deal with spheres. I reprogrammed the compute shader to use a new buffer of spheres in addition to its existing buffer of bullets, and as a temporary debugging feature I rigged the bullet management script to generate some random spheres to feed into this new buffer alongside some random bullets (since the existing turrets are only able to target things like ships and modules, not imaginary spheres):

Not to be an overachiever, I didn't bother building a fancy visualization for the spheres, since they were temporary after all. I just had the engine draw some random debug lines based on the centers and radii of the spheres to give a vague sense of where their boundaries were. Also visible here are the debug lines and bullets from the old turrets, which I didn't bother disabling, but more importantly there are the white bullets spewing out in random directions. Note how most are radiating out from the origin, but a few are traveling other directions - these have struck a sphere and "bounced" (I put it in quotes because I didn't bother with actual reflection vector math and just made a crude approximation) off. Collision detection, hooray!

Next all I had to do was feed the compute shader with the real bounding spheres from the ships' actual positions and radii:

I had the wherewithal to turn off the starfield background at this point so the important things could be seen clearly. At left is the GPU Bullet System as it was before, flinging yellow dots into space to look pretty but do nothing else. At right is the new version. The target ship in the distance (as well as the testing ship in the foreground) has calculated its bounding sphere and added it to the sphere buffer as a 4D vector, in which the first three values are the position and the last value is the radius. Due to the way GPUs are built to deal with matrices and four-component colors (red, green, blue, and an "alpha" value typically used for opacity), this format is easy to implement and process.

The compute shader at this point had two tasks for each bullet: move it forward a bit based on its velocity if it is active, and go through all the bounding spheres to see if the bullet is inside one of them. This does mean that every compute thread is going to run through the full collection of all bounding spheres, as I haven't implemented any optimizations such as spatial partitioning, but considering that I don't expect there to be a huge number of ships active at once, I decided that a little inefficiency at this particular stage was a lesser evil than the extra complexity of some algorithm for picking and choosing which spheres to check. In fact, unless there actually are a huge number of ships, I suspect that my choice was the optimal one here.
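Written as a plain loop in Python rather than as a compute shader, those two tasks look something like the sketch below. On the GPU each bullet would be handled by its own thread; the function name `update_bullets` and the dictionary layout are assumptions, while the sphere format (x, y, z, radius) follows the 4D vector described earlier.

```python
def update_bullets(bullets, spheres, dt):
    """Advance active bullets and return the indices of those inside any sphere.

    Each sphere is a 4-tuple (x, y, z, radius). Every bullet is tested against
    every sphere, with no spatial partitioning.
    """
    inside = []
    for index, bullet in enumerate(bullets):
        if not bullet["active"]:
            continue
        bullet["position"] = tuple(
            p + v * dt for p, v in zip(bullet["position"], bullet["velocity"]))
        px, py, pz = bullet["position"]
        for cx, cy, cz, radius in spheres:
            dx, dy, dz = px - cx, py - cy, pz - cz
            # Comparing squared distances avoids a square root per test.
            if dx * dx + dy * dy + dz * dz <= radius * radius:
                inside.append(index)
                break
    return inside


bullets = [
    {"position": (0.0, 0.0, 0.0), "velocity": (1.0, 0.0, 0.0), "active": True},
    {"position": (10.0, 0.0, 0.0), "velocity": (0.0, 0.0, 0.0), "active": True},
    {"position": (2.0, 0.0, 0.0), "velocity": (0.0, 0.0, 0.0), "active": False},
]
spheres = [(2.0, 0.0, 0.0, 1.5)]
hits = update_bullets(bullets, spheres, dt=1.0)
```

The cost is proportional to bullets times spheres, which stays small as long as the number of ships does.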

Once the compute shader had run, all of the active bullets had advanced forward and any bullets that touched a ship's bounding sphere had been detected. For the moment I simply had the GPU change their velocities to point directly away from the bounding sphere (that "bounce" I described above) so I could see that it was working, but all of the bullets were still confined to the realm of the GPU. They made it onto the screen, but no information about what happened to them made it back into the rest of my codebase, meaning that ships didn't know they'd been hit (nor did anything else, even the bullet manager script) and thus couldn't be damaged or otherwise affected. My next task (and the subject of the next entry to come) was thus to establish a line of communication from the compute shader back to the CPU and the scripts it was running.

Sorry this isn't a real post :C

I've entered graduate school and, as I expected, suddenly become far busier than I had been before, leaving no time to maintain my blog,...