case_when

case_when(case_list, otherwise=None)

Simplifies conditional logic in Polars by chaining multiple when-then-otherwise expressions.

Inspired by pd.Series.case_when(), this function offers a more ergonomic way to express chained conditional logic with Polars expressions.

Keyword shortcut is not supported

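The keyword shortcut of pl.when(), where a keyword argument such as x=123 stands for an equality condition, is not supported by this function. Write the condition as an explicit expression instead, for example pl.col("x") == 123.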

Parameters

case_list : Sequence[tuple[pl.Expr | tuple[pl.Expr], pl.Expr]]: A sequence of tuples, each representing one when/then branch. Three input forms are accepted (see the examples below). Tuples are evaluated in order, from first to last. Within each tuple, the last element is the then expression and everything before it is treated as when conditions, which are combined with &. The first tuple whose combined condition evaluates to True supplies the result, and later tuples are not considered for that row. If no tuple matches, the otherwise expression is used as the fallback.
otherwise : pl.Expr | None = None: Fallback expression used when no conditions match.

Returns

pl.Expr: A single Polars expression suitable for use in transformations.

Examples

DataFrame Context

The example below demonstrates all three supported input forms.

expr1 uses the simplest form, where each tuple contains a single when condition followed by its corresponding then expression.

expr2 shows tuples with multiple when conditions listed before the final then expression. These conditions are implicitly combined with &.

expr3 wraps the when conditions in an inner tuple placed as the first element of each outer tuple. These conditions are likewise combined with & before evaluation.

import polars as pl
import turtle_island as ti

df = pl.DataFrame({"x": [1, 2, 3, 4], "y": [5, 6, 7, 8]})

expr1 = ti.case_when(
    case_list=[
        (pl.col("x") < 2, pl.lit("small")),
        (pl.col("x") < 4, pl.lit("medium")),
    ],
    otherwise=pl.lit("large"),
).alias("size1")

expr2 = ti.case_when(
    case_list=[
        (pl.col("x") < 3, pl.col("y") < 6, pl.lit("small")),
        (pl.col("x") < 4, pl.col("y") < 8, pl.lit("medium")),
    ],
    otherwise=pl.lit("large"),
).alias("size2")

expr3 = ti.case_when(
    case_list=[
        ((pl.col("x") < 3, pl.col("y") < 6), pl.lit("small")),
        ((pl.col("x") < 4, pl.col("y") < 8), pl.lit("medium")),
    ],
    otherwise=pl.lit("large"),
).alias("size3")

df.with_columns(expr1, expr2, expr3)

shape: (4, 5)

x	y	size1	size2	size3
i64	i64	str	str	str
1	5	"small"	"small"	"small"
2	6	"medium"	"medium"	"medium"
3	7	"medium"	"medium"	"medium"
4	8	"large"	"large"	"large"

List Namespace Context

Working with Lists as Series

In the list namespace, it may be easier to think of each row as an element in a list. Conceptually, you’re working with a pl.Series, where each row corresponds to one item in the list.

Check whether each string in the list starts with the letter “a” or “A”:

df2 = pl.DataFrame(
    {
        "col1": [
            ["orange", "Lemon", "Kiwi"],
            ["Acerola", "Cherry", "Papaya"],
        ],
        "col2": [
            ["Grape", "Avocado", "apricot"],
            ["Banana", "apple", "Mango"],
        ],
    }
)

case_list = [
    (pl.element().str.to_lowercase().str.starts_with("a"), pl.lit("Y"))
]
otherwise = pl.lit("N")

df2.with_columns(pl.all().list.eval(ti.case_when(case_list, otherwise)))

shape: (2, 2)

col1	col2
list[str]	list[str]
["N", "N", "N"]	["N", "Y", "Y"]
["Y", "N", "N"]	["N", "Y", "N"]