In this post you will learn how to change the column type (cast) in PySpark.
Change Column Type in PySpark
Sometimes you may want to change a column from one type to another. In PySpark this is done by casting the column to the target type.
Let’s read a dataset to work with. We will use the clothing store sales data.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.read.format('csv').option('header', 'true').load('../data/clothing_store_sales.csv')
df.show(5)
Let’s say you want to convert the Net Sales column to an integer type. Note that because the CSV was read without schema inference, every column starts out as a string; casting with "long" produces a 64-bit integer (bigint) column.
df.withColumn("Net Sales Int", df['Net Sales'].cast("long")).show(5)