Unable to read text file with '|' separator using spark.read in pyspark

I am trying to read a pipe-delimited text file into a PySpark DataFrame with separate columns, but this does not work when I specify the format as 'text'. It works fine when I specify the format as 'csv'.

I believe the following code is correct, since the input is a text file, but all the fields end up in a single column.

df = spark.read.format('text').options(header=True).options(sep='|').load("path\\test.txt")

df.show()

+--------------------+
|               value|
+--------------------+
|Name|Color|Size|O...|
|Rabbit|Brown|7|Wa...|
| Horse|Green|28|Dock|
|  Pig|Orange|17|Port|
|Cow|Blue|23|Wareh...|
|  Bird|Yellow|2|Dock|
|   Dog|Brown|10|Port|
|Carrot Man|Orange...|
+--------------------+

The following code works correctly and splits the data into separate columns, but I have to specify the format as csv even though the file is actually a .txt file.

df = spark.read.format('csv').options(header=True).options(sep='|').load("path\\test.txt")

df.show()

+----------+------+----+---------+
|      Name| Color|Size|   Origin|
+----------+------+----+---------+
|    Rabbit| Brown|   7|Warehouse|
|     Horse| Green|  28|     Dock|
|       Pig|Orange|  17|     Port|
|       Cow|  Blue|  23|Warehouse|
|      Bird|Yellow|   2|     Dock|
|       Dog| Brown|  10|     Port|
|Carrot Man|Orange|  22|Warehouse|
+----------+------+----+---------+

edited Apr 13, 2023 at 5:28

samkart


asked Apr 12, 2023 at 14:55

OhMoh24



Because sep is not a valid option for the text format. See spark.apache.org/docs/latest/sql-data-sources-text.html

– Emma

Commented Apr 12, 2023 at 15:21
