pyspark.sql.DataFrame.observe
DataFrame.observe(observation: Observation, *exprs: pyspark.sql.column.Column) → DataFrame
Observe (named) metrics through an Observation instance. A user can retrieve the metrics by accessing Observation.get.
New in version 3.3.0.
- Parameters
  - observation : Observation
    an Observation instance to obtain the metric.
  - exprs : list of Column
    column expressions (Column).
- Returns
  DataFrame
  the observed DataFrame.
Notes
This method does not support streaming datasets.
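One way to act on this restriction is to check for a streaming source before attaching metrics. The sketch below is illustrative only: `observe_batch_only` is a hypothetical helper, not part of PySpark, and it assumes the standard `DataFrame.isStreaming` property is an adequate check for the case described above.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported for type hints only; the helper itself relies on duck typing.
    from pyspark.sql import Column, DataFrame, Observation


def observe_batch_only(
    df: "DataFrame", observation: "Observation", *exprs: "Column"
) -> "DataFrame":
    """Hypothetical helper: attach metrics only to batch DataFrames."""
    # DataFrame.isStreaming is True when the DataFrame reads from a streaming source.
    if df.isStreaming:
        raise ValueError(
            "observe() with an Observation does not support streaming datasets"
        )
    # Delegate to the documented method for batch DataFrames.
    return df.observe(observation, *exprs)
```

Failing early like this surfaces the problem where the metrics are attached rather than later, when the query runs.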
Examples
>>> from pyspark.sql.functions import col, count, lit, max
>>> from pyspark.sql import Observation
>>> observation = Observation("my metrics")
>>> observed_df = df.observe(observation, count(lit(1)).alias("count"), max(col("age")))
>>> observed_df.count()
2
>>> observation.get
{'count': 2, 'max(age)': 5}