pyspark.sql.functions.randn

pyspark.sql.functions.randn(seed=None)

Generates a random column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.

New in version 1.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters

seed (int, default: None): Seed value for the random generator.

Returns

Column: A column of random values.

Notes

The function is non-deterministic in the general case.

Examples

Example 1: Generate a random column without a seed

>>> from pyspark.sql import functions as sf
>>> spark.range(0, 2, 1, 1).withColumn('randn', sf.randn()).show()  # output varies between runs
+---+--------------------+
| id|               randn|
+---+--------------------+
|  0|-0.45011372342934214|
|  1|  0.6567304165329736|
+---+--------------------+

Example 2: Generate a random column with a specific seed

>>> spark.range(0, 2, 1, 1).withColumn('randn', sf.randn(seed=42)).show()
+---+------------------+
| id|             randn|
+---+------------------+
|  0| 2.384479054241165|
|  1|0.1920934041293524|
+---+------------------+