is_every_nth_row

is_every_nth_row(n, offset=0, *, name='bool_nth_row')

Returns a Polars expression that is True for every n-th row, counting from offset: a row is marked when its 0-based index is at least offset and (index - offset) modulo n equals 0.

is_every_nth_row() can be seen as the predicate counterpart of pl.Expr.gather_every().

While pl.Expr.gather_every() is typically used in a select() context and may return a DataFrame with fewer rows, is_every_nth_row() produces a predicate expression. Use it with select() or with_columns() to preserve the original row structure for further processing, or with filter() to get the same rows as pl.Expr.gather_every().

Ensure offset= does not exceed the total number of rows

Since expressions are only evaluated at runtime, their validity cannot be checked until execution. Row indices are 0-based, so if offset= is equal to or greater than the number of rows in the DataFrame, the result will be a column filled with False.
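The indexing rule described above can be sketched in plain Python. The helper every_nth_mask() below is hypothetical and is not part of turtle_island; it only mirrors the documented behaviour (0-based index, marking starts at offset) and assumes n is positive and offset is non-negative.

```python
def every_nth_mask(length: int, n: int, offset: int = 0) -> list[bool]:
    """Hypothetical helper: mark rows offset, offset + n, offset + 2n, ...

    Assumes n > 0 and offset >= 0; this is an illustration of the rule,
    not the library's implementation.
    """
    # A row is marked when it sits at or after `offset` and its distance
    # from `offset` is a multiple of `n`.
    return [i >= offset and (i - offset) % n == 0 for i in range(length)]


print(every_nth_mask(5, 2))     # [True, False, True, False, True]
print(every_nth_mask(5, 3, 1))  # [False, True, False, False, True]
print(every_nth_mask(5, 2, 5))  # [False, False, False, False, False]
```

The last call shows why an offset at or beyond the row count yields an all-False mask: no index ever reaches the starting position.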

Parameters

n : int: The interval to use for row selection. Should be positive.
offset : int = 0: Start the index at this offset. Cannot be negative.
name : str = 'bool_nth_row': The name of the resulting column.

Returns

pl.Expr: A boolean Polars expression.

Examples

DataFrame Context

Mark every second row:

import polars as pl
import turtle_island as ti

pl.Config.set_fmt_table_cell_list_len(10)
df = pl.DataFrame({"x": [1, 2, 3, 4, 5]})
df.with_columns(ti.is_every_nth_row(2))

shape: (5, 2)

x	bool_nth_row
i64	bool
1	true
2	false
3	true
4	false
5	true

To invert the result, use either the ~ operator or pl.Expr.not_():

df.with_columns(
    ~ti.is_every_nth_row(2).alias("~2"),
    ti.is_every_nth_row(2).not_().alias("not_2"),
)

shape: (5, 3)

x	~2	not_2
i64	bool	bool
1	false	false
2	true	true
3	false	false
4	true	true
5	false	false

Use offset= to shift the starting index:

df.with_columns(ti.is_every_nth_row(3, 1))

shape: (5, 2)

x	bool_nth_row
i64	bool
1	false
2	true
3	false
4	false
5	true

For reference, here’s the output using pl.Expr.gather_every():

df.select(pl.col("x").gather_every(3, 1))

shape: (2, 1)

x
i64
2
5

You can also combine multiple is_every_nth_row() expressions to construct more complex row selections. For example, to mark rows that fall on every second or every third position:

df.select(
    ti.is_every_nth_row(2).alias("2"),
    ti.is_every_nth_row(3).alias("3"),
    ti.is_every_nth_row(2).or_(ti.is_every_nth_row(3)).alias("2_or_3")
)

shape: (5, 3)

2	3	2_or_3
bool	bool	bool
true	true	true
false	false	false
true	false	true
false	true	true
true	false	true

List Namespace Context

Working with Lists as Series

In the list namespace, it may be easier to think of each row as an element in a list. Conceptually, you’re working with a pl.Series, where each row corresponds to one item in the list.

Mark every second element:

df2 = pl.DataFrame(
    {
        "x": [[1, 2, 3, 4], [5, 6, 7, 8]],
        "y": [[9, 10, 11, 12], [13, 14, 15, 16]],
    }
)
df2.with_columns(pl.all().list.eval(ti.is_every_nth_row(2)))

shape: (2, 2)

x	y
list[bool]	list[bool]
[true, false, true, false]	[true, false, true, false]
[true, false, true, false]	[true, false, true, false]