-NaN fluid solver bug part 2

Overall goal: Fix this bug and get a fluid simulation going, so on Monday we can run it by the other students in my lab to make sure it's accurate. Then, try to figure out what the hell I'm doing for my actual thesis, and potentially code up MPM. Potentially -- but I must make a decision ASAP. 

OBJECTIVE #1: Iterate through the pressure vector and manually replace all -NaN values with some other number. See if the program runs and completes properly, and if the elimination of pressure -NaNs also eliminates POSITION -NaNs. If it does, then we’ll have to work backwards from there.

If it doesn't, I am probably misunderstanding this bug...

RESULT #1: I replaced all -NaN values in the pressure vector with 1 right after the pressure solve. Indeed, it seemed to eliminate the -NaNs from showing up in my position data files. 
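
For reference, the patch amounts to something like the sketch below. It assumes the pressure values have been copied into a plain `std::vector<double>` (the real code uses an Eigen vector), and `patchNonFinite` is a made-up name for illustration. It also returns how many entries it touched, which is handy for seeing whether the count changes from frame to frame.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Replaces every non-finite entry (NaN or +/-inf) of `values` with `fallback`
// and returns how many entries were patched. Diagnostic only, not a fix.
std::size_t patchNonFinite(std::vector<double>& values, double fallback) {
    std::size_t patched = 0;
    for (double& v : values) {
        if (!std::isfinite(v)) {
            v = fallback;
            ++patched;
        }
    }
    return patched;
}
```

Calling `patchNonFinite(pressure, 1.0)` right after the solve reproduces the experiment; a nonzero return value means the solve is still producing bad numbers.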

So that's good. 

Basically, it's either something in the cg.solve() (not likely) or something funky getting passed into B that screws up the result of cg.solve(), which is saved into the pressure vector (more likely). 

Then there's the sneaking part of me that wonders "wait maybe we are supposed to have -NaNs"? But no, a quick drop into Houdini shows that eliminating the -NaNs also eliminates the vanishing particles bug! So we can start working backwards; thank goodness. Of course, it isn't the real fix, and we need to figure out what that actually is -- pronto. 


Kinda peaceful, ain't it? Like staring up into the night sky, thinking about the great voids of dark nothing that exist between the galaxies. 

OBJECTIVE #2: 

The following snippet of code is how the P (pressure) vector gets computed. This is the vector that gets all those nasty-af -NaNs. 


K cool. This solve function is from the Eigen library in C++ -- it's a sparse matrix solver -- so it sure as hell better not be screwy. I'm not gonna debug someone else's code. It is open-source, but almost certainly, the issue is what I'm passing into it. 

It is preceded by this, btw:

So the issue is either with the A matrix or the B vector. 

Grand. 

From previous debugging efforts, I've found that the B vector has a bunch of default values that all read as -6.27744e+66, which is enormous in magnitude, not tiny. There is nothing physically meaningful about this number -- it's just C++ garbage. The entries of B that haven't been initialized simply hold whatever bytes were already sitting in that memory, and those bytes happen to decode to this value. 
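
As a side check on where that exact number comes from: a double whose eight bytes are all 0xCD reads as roughly -6.27744e+66, and 0xCD is the pattern the Visual Studio debug heap writes into freshly allocated memory. So, assuming this was a debug build under Visual Studio (a guess on my part, the post doesn't say), the value is a fingerprint of "allocated but never written" rather than truly random. The function name below is made up for the sketch.

```cpp
#include <cstring>

// Builds a double whose eight bytes are all 0xCD, the fill pattern the MSVC
// debug heap uses for freshly allocated, uninitialized memory.
double debugFillAsDouble() {
    static_assert(sizeof(double) == 8, "sketch assumes 64-bit IEEE-754 doubles");
    double value;
    std::memset(&value, 0xCD, sizeof value);
    return value;
}
```

Streaming `debugFillAsDouble()` to `std::cout` with default formatting prints `-6.27744e+66`, matching what shows up in B.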

I'm not sure if anything is wrong with the A matrix. So objective #2 is just to print that out. It might be difficult because it's a matrix, not a vector, but we will do it. I mean if it breaks we can use the debugger but sometimes that's annoying. 

RESULT #2:


So I mean, there is a bunch of uninitialized garbage, but I'm not sure that matters. It's supposed to be a sparse matrix, so that shouldn't matter, but maybe it does. We will keep searching. 
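
Instead of eyeballing a printout, a small scan can answer two narrower questions: does A store any non-finite values, and does any row have no nonzero entries at all? The sketch below is only an illustration. `Entry`, `MatrixReport`, and `inspect` are hypothetical names, and it assumes the stored entries of A have been copied out into a list of (row, col, value) records. With Eigen, the same walk could be done over the matrix's stored entries via `SparseMatrix::InnerIterator`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one stored entry of the sparse matrix A.
struct Entry {
    std::size_t row;
    std::size_t col;
    double value;
};

struct MatrixReport {
    std::size_t nonFiniteEntries = 0;    // stored NaN or +/-inf values
    std::vector<std::size_t> emptyRows;  // rows with no finite nonzero entry
};

MatrixReport inspect(const std::vector<Entry>& entries, std::size_t numRows) {
    MatrixReport report;
    std::vector<bool> rowHasValue(numRows, false);
    for (const Entry& e : entries) {
        assert(e.row < numRows);
        if (!std::isfinite(e.value)) {
            ++report.nonFiniteEntries;
        } else if (e.value != 0.0) {
            rowHasValue[e.row] = true;
        }
    }
    for (std::size_t r = 0; r < numRows; ++r) {
        if (!rowHasValue[r]) report.emptyRows.push_back(r);
    }
    return report;
}
```

A nonzero `nonFiniteEntries` would point straight at how A is assembled. Empty rows are expected for non-fluid cells if A is sized by the total number of cells, but they only stay harmless when the matching entries of B are exactly zero.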

Also, turns out you can pause a cout terminal by just clicking on it, lol. Who knew. I certainly didn't until this lab, and I've been doing C++ since C S 142 (though admittedly not very consistently).

OBJECTIVE #3:

I want to look at those garbage B-vector values. If we change those, even just forcing them to a value of 1, will the P-vector still have -NaNs showing up? If it stops the -NaNs, then we computed B incorrectly. Otherwise, maybe we computed A incorrectly, or maybe we screwed up both A and B. 

Suffice it to say, I really hope I just screwed up B. I don't think A is wrong because I ran the algorithm past Samuel, one of the students in the lab, and it looked effectively the same as his...So it might be B. 

The problem is, though, that B is likely supposed to be a sparse vector. The reason for this is that only fluid cells get B-values...everything else isn't set. So if everything goes according to plan, then maybe there needs to be actual default B-values? 

Thus, my next objective is to check my books and papers to see if there's anything. 

RESULT #3:


The text keeps talking about each cell that contains fluid, and since A only works with fluid cells, B should too. But is B being set in the right location? (It's the same as A right now.) 

Each FLUID cell is Ci. Then they use Bi a bit later...


So it seems like we should only set B for fluid cells. 

See, squeezing information out of these papers is a major pain. This is why I'm kind of burned out and just want to be done with school so I can study on my own. I don't mind wrestling with them, but I do mind when I need to meet harsh deadlines. 

We want to set B only for fluid cells, so why is it set in Sean's code to be this? 


It's the size of _numCells_, not the size of the number of FLUID cells, and I'm not sure why. Maybe just in case every single cell is fluid at the worst case. 
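
One way to keep B at the size of the full grid and still only "set" it for fluid cells is to start every slot at exactly zero and overwrite just the fluid ones. The sketch below uses a plain `std::vector<double>` and invented names (`buildRhs`, `isFluid`, `fluidRhs`); it is not Sean's code. The reason it matters with Eigen: a dynamic vector constructed with only a size, like `Eigen::VectorXd b(numCells)`, leaves its coefficients uninitialized, while `Eigen::VectorXd::Zero(numCells)` starts them at zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds the right-hand side B with one slot per grid cell. Every slot starts
// at exactly 0.0; only fluid cells are overwritten with their value.
std::vector<double> buildRhs(const std::vector<bool>& isFluid,
                             const std::vector<double>& fluidRhs) {
    assert(isFluid.size() == fluidRhs.size());
    std::vector<double> b(isFluid.size(), 0.0);
    for (std::size_t i = 0; i < isFluid.size(); ++i) {
        if (isFluid[i]) b[i] = fluidRhs[i];
    }
    return b;
}
```

With this layout, whatever sits in `fluidRhs` for a non-fluid cell never reaches B, so no default value has to be invented for those cells beyond zero.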

But the sparse matrix solver should be able to detect garbage values, right? Trouble is, I'm not sure if we're supposed to be seeing default values or not. They seem to be blank in A, which leads me to think that replacing them with normal values like 1 won't actually solve the issue. Or, what if we used a sparse vector object? 

At this point, I'm starting to realize that I don't fully understand the requirements here. Getting this done is vital to finishing school, moving out of winter-land, and finally having time to get my own work done and have fun / play video games / etc. I NEED to finish this. I am TIRED, man. 

OBJECTIVE #4: Try a SparseVector

RESULT #4: This pisses off Visual Studio:


OBJECTIVE #5: Remove all the -6.27744e+66 values from B, and see if this causes no -NaNs to show up in our result files. 

RESULT #5: We comment out the P patch and try a B patch: 


Interestingly enough, this doesn't work at all: 


I hate seeing this. I hate seeing this so much. 

Okay. Try again. 

We cannot conclude that those huge values are screwing up. However...what if we replaced everything in B with a value of 1?

OBJECTIVE #6: replace everything in vector B with a value of 1, a very safe number. If we see no more -NaNs, then we're good. 

RESULT #6: We still see a bunch of -NaNs. I am going to burn this entire planet and everyone on it to the ground, most certainly including myself. 
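
One possible reading of this result, offered as a hypothesis and not a diagnosis: if A has all-zero rows for non-fluid cells, then any nonzero value of B in those rows (garbage, or a "safe" 1) asks the solver for something impossible, because a row of zeros times x can never equal 1. The toy below is a textbook, unpreconditioned conjugate gradient on a tiny dense matrix. It is not Eigen's implementation, which applies a preconditioner by default and may behave differently. All names are made up for the experiment.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // small dense matrix, just for the experiment

double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

Vec multiply(const Mat& A, const Vec& x) {
    Vec y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i) y[i] = dot(A[i], x);
    return y;
}

// Textbook, unpreconditioned conjugate gradient starting from x = 0.
Vec conjugateGradient(const Mat& A, const Vec& b, int maxIterations = 10,
                      double tolerance = 1e-12) {
    assert(A.size() == b.size());
    Vec x(b.size(), 0.0);
    Vec r = b;  // residual b - A*x, with x = 0
    Vec p = r;  // search direction
    double rr = dot(r, r);
    if (std::sqrt(rr) < tolerance) return x;
    for (int k = 0; k < maxIterations; ++k) {
        const Vec Ap = multiply(A, p);
        const double alpha = rr / dot(p, Ap);  // a zero denominator gives inf
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        const double rrNew = dot(r, r);
        if (std::sqrt(rrNew) < tolerance) break;  // never true once rrNew is NaN
        const double beta = rrNew / rr;
        for (std::size_t i = 0; i < p.size(); ++i) p[i] = r[i] + beta * p[i];
        rr = rrNew;
    }
    return x;
}
```

With `A = {{2, 0}, {0, 0}}` (the second row standing in for a non-fluid cell), `b = {1, 0}` solves cleanly to `{0.5, 0}`, while `b = {1, 1}` hits a zero denominator on the second iteration and fills x with NaNs. If the real A has rows like that, forcing every entry of B to 1 would be expected to break the solve in the same way the garbage values do.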


OBJECTIVE #7: I'll send an email to Sam explaining the bug, and asking if he has ever seen a bunch of -NaNs. 

RESULT #7: I feel bad for emailing him on a Saturday. 

It's almost naptime. I am feeling sleepy, and I want to do art and writing and music too...and my back hurts. 

And I want to play Pokemon Violet. Ugh. Debugging sinks its infuriatingly sharp little teeth into everything. I also need to clean...But I'm still debugging. 

If this were a personal project, instead of for this degree, I could stop to rest...

I am so tired lol. 

OBJECTIVE #8: Make sure those values were actually set by printing out the B-vector at the end...

RESULT #8: Yes we are actually setting them. 

OBJECTIVE #9: Print out cg and see if there's -NaN in there

RESULT #9: can't


OBJECTIVE #10: set a variable as an answer and print that? 

RESULT #10: can't


I am officially out of steam, and will be taking a break. The next objective will be to understand what B is supposed to be doing mathematically and computationally. But I need to TIMEBOX. I'll keep this in the back of my mind for a while. 

Until then. I'll make a new post for it. 









