Explode in PySpark - Intellipaat Community
intellipaat.com › community › 16638 · Jul 25, 2019 · The explode function takes an array or a map column as input and outputs each element of the array (or each key-value pair of the map) as a separate row. Both explode and split are Spark SQL functions, and both operate on a SQL Column. To separate data on arbitrary whitespace, split the column on a regular expression such as \s+ and then explode the resulting array.
python - Explode in PySpark - Stack Overflow
stackoverflow.com › questions › 38210507 · For a slightly more complete solution that generalizes to cases where more than one column must be kept in the output, use withColumn instead of a plain select, i.e. df.withColumn('word', explode('word')).show(). This keeps all the other columns of the DataFrame in the output after explode is applied.