df.with_columns()
df.with_columns([..]) allows you to create new columns in parallel. Unlike df.select([..]), which returns only the selected expressions, it keeps all columns of the original dataframe and appends the newly created ones.
Setup
import numpy as np
import pandas as pd
import polars as pl

np.random.seed(42)
data = {"nrs": [1, 2, 3, 4, 5], "random": np.random.rand(5)}
# Build the Polars dataframe used in the example below.
df_pl = pl.DataFrame(data)
Example
The behavior of df.with_columns([..]) is comparable to df.assign(..) in Pandas.
out_pl = df_pl.with_columns(
    pl.sum("nrs").alias("nrs_sum"),
    pl.col("random").count().alias("count"),
)
print(out_pl)
shape: (5, 4)
┌─────┬──────────┬─────────┬───────┐
│ nrs ┆ random   ┆ nrs_sum ┆ count │
│ --- ┆ ---      ┆ ---     ┆ ---   │
│ i64 ┆ f64      ┆ i64     ┆ u32   │
╞═════╪══════════╪═════════╪═══════╡
│ 1   ┆ 0.37454  ┆ 15      ┆ 5     │
│ 2   ┆ 0.950714 ┆ 15      ┆ 5     │
│ 3   ┆ 0.731994 ┆ 15      ┆ 5     │
│ 4   ┆ 0.598658 ┆ 15      ┆ 5     │
│ 5   ┆ 0.156019 ┆ 15      ┆ 5     │
└─────┴──────────┴─────────┴───────┘
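For comparison, the Pandas side of the analogy might look like the following sketch. The names df_pd and out_pd are assumptions introduced here; the data matches the setup above.

```python
import numpy as np
import pandas as pd

np.random.seed(42)
data = {"nrs": [1, 2, 3, 4, 5], "random": np.random.rand(5)}

# df_pd is a hypothetical name for the Pandas counterpart of df_pl.
df_pd = pd.DataFrame(data)

# assign returns a new dataframe with the extra columns appended;
# scalar values are broadcast to every row.
out_pd = df_pd.assign(
    nrs_sum=df_pd["nrs"].sum(),
    count=df_pd["random"].count(),
)
print(out_pd)
```

As with df.with_columns([..]), the original columns are kept and the aggregated values are repeated on every row.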
Reference
The examples in this section have been adapted from the Polars user guide.