How to add current DateTime to existing PySpark data frame in a Fabric Notebook
In this blog post, I am going to describe how to add the current date and time as a new column in your existing Spark data frame.
This is really useful when I am inserting data into a Fabric Lakehouse table, and I want to know when the data got inserted.
Here is my PySpark data frame with some data loaded.
I then added the following to my notebook to create the additional column called “CurrentDateTime” as shown in line 15 below.
This is then what it looks like when I run the cell, with the new column highlighted below.
NOTE: You will also see that the column data type is a date-time type (Spark reports it as timestamp), so it will also be a date-time column when I query it with the SQL analytics endpoint.
If you would like to get a copy of the code you can find it here: Fabric/Adding Date Time Column to Pyspark data frame.ipynb at main · GilbertQue/Fabric (github.com)
In this blog post I have demonstrated how to add the Current Date Time to your existing Spark data frame. I hope that you found this useful and can use this snippet in your notebooks.
Thanks for reading.