How to Change Column Type (cast) in PySpark?


In this post you will learn how to change the column type (cast) in PySpark.

Change Column Type in PySpark

Sometimes you may want to change a column from one type to another. In PySpark you do this by casting the column to the new type with the cast() method.

Let’s read a dataset to work with. We will use the clothing store sales data.

from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()

df = spark.read.format('csv').option('header','true').load('../data/clothing_store_sales.csv')
df.show(5)

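Let’s say that you want to convert the Net Sales column to an integer type. Because the file was read without the inferSchema option, every column, including Net Sales, was loaded as a string. Casting to "long" gives a 64-bit integer column; use "int" instead if you want a 32-bit integer.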

df.withColumn("Net Sales Int", df['Net Sales'].cast("long")).show(5)
