Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: Add implicit destructors for named variables to the IR #15506

Merged
merged 34 commits into from
Feb 27, 2024

Conversation

rdmarsh2
Copy link
Contributor

@rdmarsh2 rdmarsh2 commented Feb 1, 2024

No description provided.

@github-actions github-actions bot added the C++ label Feb 1, 2024
Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments. So far I've only looked at some of the .expected files. I think I'll delay looking at too many of the .qll changes until the conflicts with main has been resolved.

I think some of the consistency issues I've noted here would also be revealed by running the cpp/ql/test/library-tests/ir/ tests

cpp/ql/test/library-tests/ir/ir/raw_ir.expected Outdated Show resolved Hide resolved
cpp/ql/test/library-tests/ir/ir/raw_ir.expected Outdated Show resolved Hide resolved
Comment on lines 7192 to 7265
# 1244| r1244_2(glval<String>) = VariableAddress[c] :
# 1244| r1244_3(glval<unknown>) = FunctionAddress[~String] :
# 1244| v1244_4(void) = Call[~String] : func:r1244_3, this:r1244_2
# 1244| mu1244_5(unknown) = ^CallSideEffect : ~m?
# 1244| v1244_6(void) = ^IndirectReadSideEffect[-1] : &:r1244_2, ~m?
# 1244| mu1244_7(String) = ^IndirectMayWriteSideEffect[-1] : &:r1244_2
# 1244| r1244_8(glval<String>) = VariableAddress[b] :
# 1244| r1244_9(glval<unknown>) = FunctionAddress[~String] :
# 1244| v1244_10(void) = Call[~String] : func:r1244_9, this:r1244_8
# 1244| mu1244_11(unknown) = ^CallSideEffect : ~m?
# 1244| v1244_12(void) = ^IndirectReadSideEffect[-1] : &:r1244_8, ~m?
# 1244| mu1244_13(String) = ^IndirectMayWriteSideEffect[-1] : &:r1244_8
# 1244| r1244_14(glval<String>) = VariableAddress[a] :
# 1244| r1244_15(glval<unknown>) = FunctionAddress[~String] :
# 1244| v1244_16(void) = Call[~String] : func:r1244_15, this:r1244_14
# 1244| mu1244_17(unknown) = ^CallSideEffect : ~m?
# 1244| v1244_18(void) = ^IndirectReadSideEffect[-1] : &:r1244_14, ~m?
# 1244| mu1244_19(String) = ^IndirectMayWriteSideEffect[-1] : &:r1244_14
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct. a, b and c are all static locals. So their destructors should not be called at the end of the scope. Ideally, we should have some kind of new IRFunction that calls these destructors at the end of main, but let's leave that aside for now.

As a start, I think we should simply not emit these destructors for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good point. I missed that those were static when I was looking over the results.

cpp/ql/test/library-tests/ir/ir/raw_ir.expected Outdated Show resolved Hide resolved
cpp/ql/test/library-tests/ir/ir/raw_ir.expected Outdated Show resolved Hide resolved
cpp/ql/test/library-tests/ir/ir/raw_ir.expected Outdated Show resolved Hide resolved
@rdmarsh2 rdmarsh2 force-pushed the rdmarsh2/cpp/ir-synthetic-destructors branch from a341acf to 2d010f6 Compare February 2, 2024 17:40
result = TranslatedVariableInitialization.super.getChildInternal(id)
}

final override Instruction getChildSuccessorInternal(TranslatedElement elem, EdgeKind kind) {

Check warning

Code scanning / CodeQL

Redundant override Warning

Redundant override of
this predicate
.
result = TranslatedVariableInitialization.super.getChildInternal(id)
}

final override Instruction getChildSuccessorInternal(TranslatedElement elem, EdgeKind kind) {

Check warning

Code scanning / CodeQL

Redundant override Warning

Redundant override of
this predicate
.
@@ -11,6 +11,7 @@
private import TranslatedCondition
private import TranslatedElement
private import TranslatedExpr
private import TranslatedCall

Check warning

Code scanning / CodeQL

Redundant import Warning

Redundant import, the module is already imported inside
TranslatedExpr
.
@rdmarsh2
Copy link
Contributor Author

rdmarsh2 commented Feb 2, 2024

Force-pushed to fix a merge conflict with #15461 and remove some commits from #15318

@MathiasVP
Copy link
Contributor

MathiasVP commented Feb 8, 2024

This looks great, Robert! The unaliased SSA failures are still a mystery to me, but other than excluding destructor of static locals this is looking really good 🎉

The DCA run you started also looks promising. I took the liberty of starting a run that also run the IR consistency queries to make sure we're not adding too many new inconsistencies. Hopefully, the new inconsistencies are simply because we're generating more IR 🤞

@rdmarsh2
Copy link
Contributor Author

rdmarsh2 commented Feb 9, 2024

Assuming you're referring to the points_to test, I think what's happening there is that the destructor calls are leading to all the variables in that test escaping, which then means their MemoryLocations become part of AllAliasedMemory, so the results for them get filtered out.

@MathiasVP
Copy link
Contributor

Assuming you're referring to the points_to test, I think what's happening there is that the destructor calls are leading to all the variables in that test escaping, which then means their MemoryLocations become part of AllAliasedMemory, so the results for them get filtered out.

That makes sense. It sounds like we now need a side-effect/alias model for the destructor of a smart pointer, then. How about we add something like the following: https://gist.github.com/MathiasVP/a681b702be4ab46d45a9adaf034352c9 ?

cpp/ql/test/library-tests/ir/points_to/points_to.expected Outdated Show resolved Hide resolved
cpp/ql/test/library-tests/ir/ir/raw_ir.expected Outdated Show resolved Hide resolved
@@ -1009,6 +1067,8 @@ class TranslatedConstructorBareInit extends TranslatedElement, TTranslatedConstr
result = this.getParent().getChildSuccessor(this, kind)
}

override Instruction getALastInstructionInternal() { none() } // FIXME: does this need to be filled in?
Copy link
Contributor

@MathiasVP MathiasVP Feb 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a FIXME that we haven't resolved yet. What's the status of this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That class only exists to handle an (old?) extractor bug. I don't think it can ever have a destructor attached, and neither can the init list it's part of, so I think it's fine for getALastChild to have no results.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Should we delete the FIXME comment then? And maybe convert it to an internal issue?

@MathiasVP
Copy link
Contributor

Also a question: Do we not need to do anything in this PR with the destruction of static locals now that we know that they aren't handled properly in the extractor?

Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're almost good to go! I'll run a final DCA on this since there's been quite a lot of changes since the last run. After that, I think we should eyeball the new results to make sure they're destructor-related, but then I think we can merge this 🎉

Comment on lines 184 to 185
// TODO: Check if this is join ordered correctly.
result.getDeclaringType() = declaringType.getBaseType()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you check this? Because I was kinda hoping that you would 😂 I think we should just start a fresh DCA run on this PR, and if nothing stands out here we should just delete the TODO comment.

@@ -1009,6 +1067,8 @@ class TranslatedConstructorBareInit extends TranslatedElement, TTranslatedConstr
result = this.getParent().getChildSuccessor(this, kind)
}

override Instruction getALastInstructionInternal() { none() } // FIXME: does this need to be filled in?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Should we delete the FIXME comment then? And maybe convert it to an internal issue?

@MathiasVP
Copy link
Contributor

I think we're almost good to go! I'll run a final DCA on this since there's been quite a lot of changes since the last run. After that, I think we should eyeball the new results to make sure they're destructor-related, but then I think we can merge this 🎉

DCA came back:

  • No performance problems 🎉 So I think we should just delete the TODO that crept into f791b0e
  • I think we should look at a small subset of the query results and confirm that these are related to the introduction of destructors and not some new bug in IR construction

@geoffw0
Copy link
Contributor

geoffw0 commented Feb 14, 2024

I think we should look at a small subset of the query results and confirm that these are related to the introduction of destructors and not some new bug in IR construction

I've just had a look at a couple of the Samate result changes in the above DCA run:

  • new result for cpp/path-injection on dsp-testing__samate-c-cpp: C/testcases/CWE23_Relative_Path_Traversal/s01/CWE23_Relative_Path_Traversal__char_connect_socket_ifstream_83_bad.cpp:124:32:124:36
    • this is a result in a "bad" case and appears to be a correct result.
    • this involves a class with a constructor that calls recv (the source) and writes the result into a member variable data; then the destructor of the same classes accesses data (at the sink).
    • the test creates an object of this class in a local variable. Presumably we were missing the fact that the destructor was called at all prior to this PR. 🎉
  • new result for cpp/tainted-format-string on dsp-testing__samate-c-cpp: C/testcases/CWE789_Uncontrolled_Mem_Alloc/s01/CWE789_Uncontrolled_Mem_Alloc__new_char_connect_socket_83_bad.cpp:118:24:118:38
    • this is a result in a "bad" case and appears to be a correct result.
    • very similar to the previous case; the sink is different. 👍

So these both look good, and I'm pretty confident all of the other Samate test changes are explained by the same pattern (they appear to be all for cases labelled '83', which means they'll follow the same constructor-destructor pattern).

There are a handful of results on other projects besides Samate - someone should review a few of these as well.

@rdmarsh2
Copy link
Contributor Author

Looked into a couple of those today (had a horrible time getting them to evaluate without hitting some sort of cache...) and looks like they're false positives related to an issue with successors of return statements. Shoudl have a fix tomorrow.

Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new changes makes sense, and it looks like this removed a bunch of seemingly DCA results. I did verify that the use-after-free results that were removed were instances of the successor issue that you fixed in 2494b7d.

I think I spotted one bug in your fix commit. I tried to run this example after this fix:

struct String
{
  String();
  ~String();
};

void VoidFunc();

void IfReturnDestructors(bool b) {
  String s;
  return VoidFunc();
}

but now I'm getting a bunch of duplication and an unexplained loop: https://gist.github.com/MathiasVP/7afa7176a7a4bf4750c8579a04855ec4

@jketema
Copy link
Contributor

jketema commented Feb 22, 2024

@rdmarsh2 it seems your cli is somewhat out-of-date, as your latest changes to the PrintAST output revert some stuff that was fixed.

I would also be sensible to re-run DCA on this, given that things were fixed.

Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small nit, but otherwise I'm happy with this as soon as we have a happy DCA run! (Although, the .expected files obviously need to be modified as per Jeroen's comment)

Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're good to go! 2 remaining things:

  • We need to add a change note. I suggest making a library change note of category minorAnalysis.
  • We should probably start a DCA run to test the impact of the last IR fix (Edit: I took the liberty of starting this one myself. I hope that's okay!)

@rdmarsh2 rdmarsh2 force-pushed the rdmarsh2/cpp/ir-synthetic-destructors branch from 2454252 to a513598 Compare February 26, 2024 19:42
@rdmarsh2
Copy link
Contributor Author

There's one new alert in the DCA that looks like a false positive... It's a use-after-free involving a delete in a loop, but there's an unconditional break following the free. I'm not sure if it's a pre-existing CFG issue that's now being exposed or a new CFG issue.

@MathiasVP
Copy link
Contributor

There's one new alert in the DCA that looks like a false positive... It's a use-after-free involving a delete in a loop, but there's an unconditional break following the free. I'm not sure if it's a pre-existing CFG issue that's now being exposed or a new CFG issue.

I'm not too concerned about one new FP result amidst all the new TPs we're seeing on Samate (which Geoffrey looked at earlier). So if you investigated this one and concluded that it's a FP then I think this is good enough, and we should create an issue for why this happens as a follow-up.

Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@MathiasVP MathiasVP merged commit f7b2de8 into github:main Feb 27, 2024
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants