import polars as pl
import turtle_island as ti
pl.Config.set_fmt_table_cell_list_len(10)
df = pl.DataFrame({"x": [1, 2, 3, 4, 5]})
df.with_columns(ti.is_every_nth_row(2))

| x | bool_nth_row |
|---|---|
| i64 | bool |
| 1 | true |
| 2 | false |
| 3 | true |
| 4 | false |
| 5 | true |
Returns a boolean Polars expression that is True for every n-th row, that is, wherever the zero-based row index modulo n equals 0.
is_every_nth_row() can be seen as the complement of pl.Expr.gather_every().
pl.Expr.gather_every() is typically used in a select() context and may return fewer rows than the input. is_every_nth_row() instead produces a predicate expression. Use it with select() or with_columns() to preserve the original row structure for further processing, or with filter() to get the same result as pl.Expr.gather_every().
Note: make sure offset= does not exceed the total number of rows.
Since expressions are only evaluated at runtime, their validity cannot be checked until execution. If offset= is greater than the number of rows in the DataFrame, the result will be a column filled with False.
n : int
    The interval to use for row selection. Should be positive.
offset : int = 0
    Start the index at this offset. Cannot be negative.
name : str = 'bool_nth_row'
    The name of the resulting column.
Returns: pl.Expr
    A boolean Polars expression.
Mark every second row:
| x | bool_nth_row |
|---|---|
| i64 | bool |
| 1 | true |
| 2 | false |
| 3 | true |
| 4 | false |
| 5 | true |
To invert the result, use either the ~ operator or pl.Expr.not_():
| x | ~2 | not_2 |
|---|---|---|
| i64 | bool | bool |
| 1 | false | false |
| 2 | true | true |
| 3 | false | false |
| 4 | true | true |
| 5 | false | false |
Use offset= to shift the starting index:
| x | bool_nth_row |
|---|---|
| i64 | bool |
| 1 | false |
| 2 | true |
| 3 | false |
| 4 | false |
| 5 | true |
For reference, here’s the output using pl.Expr.gather_every():
You can also combine multiple is_every_nth_row() expressions to construct more complex row selections. For example, to select rows that are part of every second or every third row:
In the list namespace, it may be easier to think of each row as an element in a list. Conceptually, you’re working with a pl.Series, where each row corresponds to one item in the list.
Mark every second element:
| x | y |
|---|---|
| list[bool] | list[bool] |
| [true, false, true, false] | [true, false, true, false] |
| [true, false, true, false] | [true, false, true, false] |