@edmondop

@edmondop edmondop commented Sep 20, 2025

This PR addresses #706: when the declared return type is Self, the return-type check now also verifies that the class is defined as final.

@meta-cla

meta-cla bot commented Sep 20, 2025

Hi @edmondop!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@meta-cla

meta-cla bot commented Sep 20, 2025

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@meta-cla meta-cla bot added the cla signed label Sep 20, 2025
Contributor

@stroxler stroxler left a comment

Hi @edmondop, thanks for the PR!

This looks plausible to me, but there are several failing tests and it would be nice to see the behavior changes in the diff view in case this is interacting badly with other parts of Pyrefly (in particular, I see some incompatible-override errors popping up and it would be nice to make sure those make sense).

cc @rchen152 and @samwgoldman who are probably our experts on the Self type (and for that matter function subtyping as well!)

@edmondop
Author

edmondop commented Sep 21, 2025 via email

@edmondop
Author

I confirm the change has broken some existing functionality: some tests that should pass now fail. I haven't investigated the other tests, but this one is a good example:

testcase!(
    test_call_instance_method_from_classmethod,
    r#"
class A:
    def f(self):
        pass

class B(A):
    @classmethod
    def g(cls):
        super().f(cls())
    "#,
);

I need to dive deeper into subset.rs and understand whether I have enough context there. I think the difference is that in the good case Self is in the return type, while here it is an argument to f. According to the Liskov substitution principle, this test case should pass for any subclass of B, so the fact that B is not final is not a problem.
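To make the subclass argument concrete, here is a runnable variant of the test program. SubB and the return value of f are additions for illustration and are not part of the test suite.

```python
class A:
    def f(self):
        # Returns the runtime class name so the behaviour can be observed.
        return type(self).__name__


class B(A):
    @classmethod
    def g(cls):
        # In a classmethod, super().f is the plain function A.f, so the
        # freshly constructed instance is passed explicitly as `self`.
        return super().f(cls())


class SubB(B):
    pass
```

Both B.g() and SubB.g() run without error, because cls() is always an instance of some subclass of A, which is all that f requires of its receiver.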

@edmondop
Author

I haven't been able to find a solution that touches only subset.rs without breaking existing tests, because within the solver it's unclear whether got/want are in parameter or return-type position. Instead, I extended the return-type handling code in solve.rs.

@edmondop edmondop requested a review from stroxler September 21, 2025 20:03
self.expr(expr, hint.as_ref().map(|t| (t, tcc)), errors)
} else {
if let Some(Type::SelfType(want_class)) = hint.as_ref() {
if !self.type_order().is_final(want_class.class_object()) {
Contributor

Author

Thanks, I thought I had run cargo clippy. Fixed now

@stroxler
Contributor

Yeah this is tricky and tied into a few very difficult questions:

  • what the Self type actually means (PEP 673 actually specified it as a method level type parameter, but I don't think most type checkers treat it that way anymore)
  • whether it is allowed to appear in parameter position, except by being bound as the receiver object (this is unsound, but I think most type checkers allow it anyway because best-effort unsound analysis is sometimes helpful for stating intent and catching a subset of bugs)

If we actually solved Self as a bound type variable in the call, as PEP 673 specifies, I'm pretty certain that the error would go away. At the moment, though, our handling of Self is entirely ad hoc, and I don't think we ever actually solve it as a Quantified.

The ad hoc handling has benefits because it allows us to do things like handle an attribute specified as Self (which is again unsound if the attribute is mutable, but we allow this), so I'm unsure how easy it would be to change the behavior and match PEP 673 better. We might be able to detect the direct use of an unbound method and convert it to a ForAll type where Self behaves like a normal function-level type variable.

Let's see if @rchen152 and @samwgoldman have thoughts

@samwgoldman
Member

Thanks for contributing! I don't think the fix is in the right place. This diff adds some logic to the code handling returns with explicit annotations, but assignability happens in more places, like x: Self = C() within a method in C. It would also apply to the cases in returns that are not handled by this change, async and generator functions.

The right place for this fix is to remove the invalid rules in subset.rs, here: https://github.com/facebook/pyrefly/blob/main/pyrefly/lib/solver/subset.rs#L1005-L1013

These rules say it is OK to assign C to Self@C, which is exactly false, so the right thing to do is remove them.

However, there is a snag -- we rely on this rule in a few places. Usually this happens when we fail to preserve "Self"-ness. I recently fixed a couple examples of this, in 968a99c and in fc7592c.

I know of at least one other place we get this wrong -- calling a constructor on the call target type[Self] [Sandbox].

@samwgoldman samwgoldman self-requested a review September 22, 2025 16:53
Member

@samwgoldman samwgoldman left a comment

I think the original issue is mislabeled as "good first issue" -- but you are welcome to keep going if you'd like. Try removing the invalid rules first and see what tests fail. Hopefully you will see an example like the ctor call on type[Self], but maybe some others. Those issues would need to be fixed first, preferably in separate PRs.

@edmondop
Author

Thanks, I originally tried to make that change and it caused several tests to fail. I was trying to restrict my condition in subset.rs based on the location of the type (return type), but I wasn't able to find a way to access this information through the type_order. Was I looking in the wrong place?

@samwgoldman
Member

I see -- in subset.rs it would not be possible to restrict to just return types, but I don't think such a restriction is worthwhile, since the bug exists everywhere. The fact that other tests start to fail is likely because of other latent issues that need to be fixed first, per my comment above. For example, the test_call_instance_method_from_classmethod fails because calling cls() should return Self@C but currently incorrectly returns C -- constructing type[Self] should return Self.

@edmondop
Author

edmondop commented Oct 2, 2025

I think I found the root cause of why I wasn't able to solve the problem in subset.rs. Doing the check on final would fix the problems indicated in the issue #706 but would introduce test failures somewhere else.

The problem is in call.rs:

Type::Type(box Type::ClassType(cls)) | Type::Type(box Type::SelfType(cls)) => {
    Some(CallTarget::Class(cls))
}

This handling means that later, in call_infer, the type[Self] becomes a type[ConcreteClass].

In practice, this means the following example is processed by a different branch of subset.rs than the one we expect:

from typing import Self

class B:
    @classmethod
    def factory(cls) -> Self:
        return cls()

The solution seems to be to introduce a new CallTarget enum variant for type[Self] and to produce a Type::SelfType later for subset.rs. This ensures that the pattern matching in subset.rs receives the right type. (I haven't figured out how to handle type parameters yet.)
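The idea can be sketched with a toy model in Python. Everything here is hypothetical and greatly simplified: ClassType, SelfType, TypeOf and the three functions are stand-ins invented for this illustration, not Pyrefly's types or API.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class ClassType:
    name: str


@dataclass(frozen=True)
class SelfType:
    name: str


@dataclass(frozen=True)
class TypeOf:  # models type[...]
    inner: Union[ClassType, SelfType]


def construct_collapsing(target: TypeOf) -> ClassType:
    # Models the old behaviour: type[Self] and type[C] share one call
    # target, so the result is always the concrete class.
    return ClassType(target.inner.name)


def construct_preserving(target: TypeOf) -> Union[ClassType, SelfType]:
    # Models the proposed behaviour: calling type[Self] yields Self.
    return target.inner


def assignable_to_self(got, want: SelfType, is_final: bool) -> bool:
    # Self@C accepts Self@C; a bare C is accepted only if C is final.
    if isinstance(got, SelfType):
        return got.name == want.name
    return is_final and got.name == want.name
```

With the collapsing call target, cls() in the factory example is a bare B and is rejected once the finality check is in place. With the preserving one, it stays Self@B and is accepted.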

self.construct_class(cls, args, keywords, range, errors, context, hint)
}
CallTarget::SelfClass(cls) => {
if cls.has_qname("typing", "Any") {
Author

Maybe this should be factored out so we can share it between CallTarget::Class and CallTarget::SelfClass, since this code is identical.

Member

I think we can skip all of these checks for CallTarget::SelfClass, actually.

  1. The only place we could have type[Self] where Self is Any is the implementation of Any, which doesn't actually exist.

  2. For type[P] where P is a protocol, we actually should allow calling a constructor. This also applies to type[Self] where Self is a protocol.

  3. Similar to (1), the only place this check comes up is in the implementation of bool itself, since type[Self] where Self is typing.bool can only exist inside that class.
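To illustrate point 2, here is a minimal sketch of calling a constructor through type[P]. The Greeter protocol and the English class are invented for the example.

```python
from typing import Protocol


class Greeter(Protocol):
    def greet(self) -> str: ...


class English:
    def greet(self) -> str:
        return "hello"


def make(factory: type[Greeter]) -> Greeter:
    # Calling through type[Greeter] is fine: the value passed in is
    # a concrete class that implements the protocol.
    return factory()
```

Instantiating Greeter directly is still an error, but a parameter typed type[Greeter] only ever holds concrete implementations such as English, so factory() is safe to call.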

Member

@samwgoldman samwgoldman left a comment

Sorry for the long delay on this review. I wanted to figure out more of the details around other type checkers' behavior when constructing type[Self], especially when there is a metaclass __call__ or __new__ implementation with explicit return types.

I found a counter-example to the implementation here, but both mypy and pyright also fail to find the error in my counter-example. Curious to hear your thoughts, but I'd like to take a bit more time to understand existing type checkers' behavior, and maybe even get some additional clarity written into the typing specification about constructing type[Self].

Comment on lines +667 to +676
self.construct_class(cls.clone(), args, keywords, range, errors, context, hint);
// Handle custom __new__
match &result {
Type::ClassType(result_cls)
if result_cls.class_object() == cls.class_object() =>
{
Type::SelfType(result_cls.clone())
}
_ => result,
}
Member

This part seems a bit off to me. If we have type[Self] in class C, then we call construct_class using C, check whether that call returned C, and if it did, assume that the constructor actually returned Self.

In order to return Self, we need to know that for C and any of its subclasses, the constructor call will return the class itself. Here we are only checking that calling it for C returns C, so the implication feels incomplete.

Consider this program, which fails at runtime. Because the __new__ implementation always returns a C, constructing cls(0) from a subclass actually gives us a C, not a D.

from typing import Self

class C:
    def __new__(cls, x: int) -> 'C': # Note C instead of Self here
        # object.__new__ avoids re-entering C.__new__ (C(x) would recurse forever)
        return object.__new__(C)

    @classmethod
    def m(cls: type[Self]) -> Self:
        return cls(0) # should fail

class D(C):
    d_only: int = 0

D.m().d_only # runtime error

My feeling is that the solution here needs a bit more work. We would need to push the "self"-ness into construct_class, and be a bit careful to check that metaclass __call__ and __new__ preserve "self"-ness as well, so we can infer that cls(0) on cls: type[Self] actually returns C and not Self.

That said, looking at the behavior of mypy and Pyright, both seem to accept the program above. I couldn't find any explanation for that behavior in the typing spec, which talks a lot about metaclass __call__ and __new__, but not type[Self].

I'm hesitant to request changes on this bit, because I think those changes might be pretty intricate. I'd really like to understand why mypy/Pyright seem to accept the program above.

// 1. The classes are the same
// 2. Both have no type arguments (i.e., they're the exact same bare type)
// This ensures NonFinalClass is not assignable to Self@NonFinalClass,
// but allows partial[int, str] to be assignable to Self@partial
Member

Do you have a test case for this? I'm not sure I understand what this is referring to.

.has_superclass(got.class_object(), want.class_object());
// Only apply the finality check when:
// 1. The classes are the same
// 2. Both have no type arguments (i.e., they're the exact same bare type)
Member

A test case with an expected error in this case, which also explains why it's not safe, would be great.
