Functions
Setup
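import numpy as np
import pandas as pd
import polars as pl

np.random.seed(42)  # fix the seed so the "random" column is reproducible
data = {
    "nrs": [1, 2, 3, 4, 5],
    "names": ["foo", "ham", "spam", "egg", "baz"],
    "random": np.random.rand(5),
    "groups": ["A", "A", "B", "C", "B"],
}
# Build and print the Polars DataFrame (the variable name df is assumed here)
df = pl.DataFrame(data)
print(df)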
shape: (5, 4)
┌─────┬───────┬──────────┬────────┐
│ nrs ┆ names ┆ random   ┆ groups │
│ --- ┆ ---   ┆ ---      ┆ ---    │
│ i64 ┆ str   ┆ f64      ┆ str    │
╞═════╪═══════╪══════════╪════════╡
│ 1   ┆ foo   ┆ 0.37454  ┆ A      │
│ 2   ┆ ham   ┆ 0.950714 ┆ A      │
│ 3   ┆ spam  ┆ 0.731994 ┆ B      │
│ 4   ┆ egg   ┆ 0.598658 ┆ C      │
│ 5   ┆ baz   ┆ 0.156019 ┆ B      │
└─────┴───────┴──────────┴────────┘
Column naming
Count unique values
Pandas does not appear to have a built-in method for approximating the count of unique values; its nunique method returns an exact count.
Conditionals
Reference
The examples in this section have been adapted from the Polars user guide.