[pyspark] AttributeError: ‘DataFrame’ object has no attribute ...
cumsum.wordpress.com · Oct 10, 2020

Unfortunately this throws a big error: AttributeError: ‘DataFrame’ object has no attribute ‘_get_object_id’. The reason is that isin expects actual local values or collections, but df2.select('id') returns a data frame.

Solution: use a JOIN, in this case an inner join:

    df.join(
        df2.select('id').drop_duplicates(),  # df2 with id column
        on=['id'],                           # join on id
        how='inner'                          # inner join to keep only common ids
    ).show()

+---+---+---+
| ...
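For comparison, here is a minimal sketch of three ways to keep only the rows of df whose id also appears in df2. The first is the inner join from the post. The other two are alternatives not mentioned there: a left semi join, and collecting the ids to a local list so that isin receives real values. The function names, the sample rows, and the column name val are made up for illustration.

```python
def filter_with_inner_join(df, df2):
    """The post's approach: inner join against the de-duplicated ids of df2."""
    return df.join(df2.select("id").drop_duplicates(), on=["id"], how="inner")


def filter_with_semi_join(df, df2):
    """Alternative (not from the post): a left semi join keeps the rows of df
    that have a match in df2 and never adds columns from df2."""
    return df.join(df2.select("id"), on=["id"], how="left_semi")


def filter_with_local_ids(df, df2):
    """Alternative (not from the post): pull the ids to the driver first, so
    isin gets a plain Python list. Only sensible when df2 is small."""
    ids = [row["id"] for row in df2.select("id").distinct().collect()]
    return df.filter(df["id"].isin(ids))


def demo():
    """Run all three variants on hypothetical sample data (needs pyspark)."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").appName("isin-demo").getOrCreate()
    df = spark.createDataFrame([(1, "a"), (2, "b"), (3, "c")], ["id", "val"])
    df2 = spark.createDataFrame([(2,), (3,), (3,), (4,)], ["id"])

    for variant in (filter_with_inner_join, filter_with_semi_join, filter_with_local_ids):
        variant(df, df2).show()

    spark.stop()
```

Calling demo() with PySpark installed prints the result of each variant. The semi join avoids the explicit drop_duplicates step, since a semi join does not multiply rows of df when an id occurs several times in df2.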