pirrtools package#
Subpackages#
Submodules#
pirrtools.list_chunks module#
Utilities for splitting iterables into chunks.
This module provides functionality for dividing iterables into smaller, more manageable chunks of specified sizes. It includes options for equalizing chunk contents based on element properties.
The primary function chunk() distributes elements across a calculated number of sublists, with optional sorting to balance element characteristics.
Example
>>> chunk([1, 2, 3, 4, 5, 6], 2)
[[1, 4], [2, 5], [3, 6]]
>>> chunk(['a', 'bb', 'ccc', 'dddd'], 2, equalize=True)  # sorted by length first
[['a', 'ccc'], ['bb', 'dddd']]
- pirrtools.list_chunks.chunk(iterable: Iterable[int | str | float], chunk_size: int, equalize: bool = False) → list[list[int | str | float | None]]
Split an iterable into chunks distributed across sublists.
This function divides an iterable into a calculated number of sublists, distributing elements evenly by taking every nth element for each sublist. When equalize is True, elements are first sorted by string length to balance the characteristics of elements within each chunk.
- Parameters:
iterable (Iterable[Union[int, str, float]]) – The input iterable to chunk.
chunk_size (int) – The target number of elements per sublist. Together with the length of the input, this determines the number of sublists created.
equalize (bool, optional) – Whether to sort elements by string length before chunking to balance element characteristics. Defaults to False.
- Returns:
A list of sublists where elements are distributed evenly. The number of sublists is calculated as ceil(len(iterable) / chunk_size).
- Return type:
list[list[int | str | float | None]]
Examples
- Basic chunking (elements distributed, not grouped sequentially):
>>> chunk([1, 2, 3, 4, 5, 6], 2)
[[1, 4], [2, 5], [3, 6]]
- With equalization (sorted by string length first):
>>> chunk(['a', 'bb', 'ccc', 'dddd'], 2, equalize=True)
[['a', 'ccc'], ['bb', 'dddd']]
- Handling uneven division (ceil(5 / 2) = 3 sublists):
>>> chunk([1, 2, 3, 4, 5], 2)
[[1, 4], [2, 5], [3]]
Note
The function ensures chunk_size is at least 1 to avoid division by zero. The distribution pattern takes every nth element where n is calculated as ceil(len(iterable) / chunk_size).
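The distribution rule in the note above can be sketched in a few lines. This is a minimal illustration of the documented behaviour, not the package's source: the name `chunk_sketch` is hypothetical, and treating "string length" as `len(str(x))` for non-string elements is an assumption.

```python
from math import ceil
from typing import Iterable, List, Union

Element = Union[int, str, float]


def chunk_sketch(iterable: Iterable[Element], chunk_size: int,
                 equalize: bool = False) -> List[List[Element]]:
    """Distribute elements across ceil(len / chunk_size) sublists."""
    items = list(iterable)
    chunk_size = max(1, chunk_size)  # guard against division by zero
    if equalize:
        # Assumption: "string length" means len(str(x)) for every element.
        items = sorted(items, key=lambda x: len(str(x)))
    n = ceil(len(items) / chunk_size)  # number of sublists
    # Sublist i takes every nth element, starting at offset i.
    return [items[i::n] for i in range(n)]


print(chunk_sketch([1, 2, 3, 4, 5, 6], 2))
# [[1, 4], [2, 5], [3, 6]]
print(chunk_sketch(['a', 'bb', 'ccc', 'dddd'], 2, equalize=True))
# [['a', 'ccc'], ['bb', 'dddd']]
```

Slicing with a stride of `n` is what produces the interleaved (rather than sequential) grouping shown in the examples.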
pirrtools.load module#
Module for loading utilities in pirrtools.
Provides functions for loading modules, classes, and other entities with optional IPython integration.
pirrtools.pandas module#
Pandas utilities for caching and rich display formatting.
This module provides comprehensive utilities for pandas DataFrame and Series objects, including efficient caching using feather format and advanced rich table display with styling support.
- Key Features:
Efficient caching system for non-conforming datasets using feather format
Rich table display with CSS styling support and background colors
Pandas accessor (.pirr) for convenient method access
Support for MultiIndex and complex pandas objects
Dynamic column width optimization for styled tables
- Main Classes:
UtilsAccessor: Pandas accessor providing caching and rich display methods
- Public Functions:
load_cache: Load cached DataFrame or Series from directory
cache_and_load: Cache and immediately reload a pandas object
Example
>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
>>>
>>> # Cache the DataFrame
>>> df.pirr.to_cache('my_cache')
>>>
>>> # Display with rich formatting
>>> df.pirr.to_rich(bg='gradient')
>>>
>>> # Load from cache
>>> cached_df = load_cache('my_cache')
- pirrtools.pandas.load_cache(path: str | Path) → DataFrame | Series
Load cached pandas DataFrame or Series from directory.
Reconstructs a pandas object from the cache directory created by _save_cache(), properly restoring indexes, columns, and metadata.
- Parameters:
path (Union[str, Path]) – Path to the cache directory.
- Returns:
The reconstructed DataFrame or Series with all original structure and metadata preserved.
- Return type:
PandasObject
- Raises:
FileNotFoundError – If the cache directory does not exist.
Example
>>> df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
>>> _save_cache(df, 'my_cache')
>>> loaded_df = load_cache('my_cache')
- pirrtools.pandas.cache_and_load(obj, path, overwrite=False)
Cache a pandas object and immediately reload it.
This is a convenience function that saves an object to cache and then loads it back, useful for testing cache integrity or for ensuring feather-compatible data types.
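- Parameters:
obj (PandasObject) – The DataFrame or Series to cache.
path (Union[str, Path]) – Directory path for the cache.
overwrite (bool, optional) – Whether to overwrite an existing cache. Defaults to False.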
- Returns:
The reloaded DataFrame or Series.
- Return type:
PandasObject
Example
>>> df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
>>> reloaded_df = cache_and_load(df, 'test_cache')
- class pirrtools.pandas.UtilsAccessor(pandas_obj: DataFrame | Series)
Bases:
object
Pandas accessor providing caching and rich display utilities.
This accessor is automatically registered as ‘.pirr’ on pandas DataFrame and Series objects, providing convenient access to caching functionality and rich table display features.
- _obj#
The underlying pandas DataFrame or Series.
- Type:
PandasObject
Example
>>> df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
>>> df.pirr.to_cache('my_cache')  # Cache the DataFrame
>>> df.pirr.to_rich()  # Display as rich table
- __init__(pandas_obj: DataFrame | Series)
Initialize the UtilsAccessor.
- Parameters:
pandas_obj (PandasObject) – The pandas DataFrame or Series object that this accessor is attached to.
- Raises:
AttributeError – If pandas_obj is not a DataFrame or Series.
- to_cache(*args, **kwargs)
Save the DataFrame or Series to a cache directory.
This method provides convenient access to the caching functionality through the pandas accessor interface.
- Parameters:
*args – Positional arguments passed to _save_cache().
**kwargs – Keyword arguments passed to _save_cache().
- Common Parameters:
path (Union[str, Path]): Directory path for the cache.
overwrite (bool, optional): Whether to overwrite existing cache. Defaults to False.
Example
>>> df.pirr.to_cache('my_cache', overwrite=True)
- to_rich(styler=None, console=None, minimize_gaps=False, show_index=True, index_style='dim', index_header_style='bold dim', index_justify='left', index_width=None, bg=None, bg_kwargs=None, tg=None, tg_kwargs=None, column_header_style=None, index_bg=None, index_bg_kwargs=None, alternating_rows=False, alternating_row_colors=('', 'on grey11'), table_style=None, auto_optimize=True, box=None, padding=None, collapse_padding=None, show_edge=None, pad_edge=None, expand=None, format=None, na_rep=None, **table_kwargs)
Create a Rich table from pandas DataFrame or Series with advanced styling.
This method converts pandas objects into beautifully formatted Rich tables with support for CSS styling from pandas Styler objects, built-in gradients, and extensive customization options.
- Parameters:
styler (pandas.io.formats.style.Styler, optional) – Pandas Styler object with applied styles. If None, uses basic formatting.
console (rich.console.Console, optional) – Rich Console object to use. If None, creates a new one.
minimize_gaps (bool, optional) – Force minimal padding/borders for better background color display. Defaults to False.
show_index (bool, optional) – Whether to show the index as a separate column. Defaults to True.
index_style (str, optional) – Rich style string for index values (e.g., “dim”, “bold blue”). Defaults to “dim”.
index_header_style (str, optional) – Rich style string for index column header. Defaults to “bold dim”.
index_justify (str, optional) – Text justification for index column (“left”, “center”, “right”). Defaults to “left”.
index_width (int, optional) – Fixed width for index column. If None, auto-sizes based on content.
bg (str, optional) – Background gradient style. Use “gradient” for default colormap or specify colormap name (e.g., “viridis”).
bg_kwargs (dict, optional) – Additional arguments for background_gradient().
tg (str, optional) – Text gradient style. Use “gradient” for default colormap or specify colormap name.
tg_kwargs (dict, optional) – Additional arguments for text_gradient().
column_header_style (str, optional) – Rich style string for column headers.
index_bg (str, optional) – Background gradient for index. Use “gradient” or specify colormap name.
index_bg_kwargs (dict, optional) – Additional arguments for index background_gradient().
alternating_rows (bool, optional) – Whether to apply alternating row colors. Defaults to False.
alternating_row_colors (tuple, optional) – Tuple of (even_style, odd_style) for alternating rows. Defaults to (“”, “on grey11”).
table_style (str, optional) – Rich style string applied to entire table.
auto_optimize (bool, optional) – Whether to automatically optimize table settings when background colors are detected. Defaults to True.
box (Box, optional) – Rich Box style for table borders. Overrides auto_optimize.
padding (tuple, optional) – Padding around cell content (vertical, horizontal). Overrides auto_optimize.
collapse_padding (bool, optional) – Whether to collapse adjacent cell padding. Overrides auto_optimize.
show_edge (bool, optional) – Whether to show table outer border. Overrides auto_optimize.
pad_edge (bool, optional) – Whether to add padding around table edges. Overrides auto_optimize.
expand (bool, optional) – Whether table should expand to fill console width. Overrides auto_optimize.
format (dict or str, optional) – Format specifiers for columns. Can be a dictionary mapping column names to format strings, or a single format string applied to all columns. Uses pandas Styler.format() internally.
na_rep (str, optional) – String representation of NaN values. Defaults to “”.
**table_kwargs – Additional keyword arguments passed to Rich Table constructor.
- Returns:
A Rich Table object ready for display or printing.
- Return type:
rich.table.Table
Examples
- Basic usage:
>>> from rich.console import Console
>>> console = Console()
>>> table = df.pirr.to_rich()
>>> console.print(table)
- Background gradients:
>>> table = df.pirr.to_rich(bg="gradient")
>>> table = df.pirr.to_rich(bg="viridis", bg_kwargs={"axis": 0})
- Text gradients:
>>> table = df.pirr.to_rich(tg="gradient")
- Header styling:
>>> table = df.pirr.to_rich(column_header_style="bold blue on white")
- Alternating rows:
>>> table = df.pirr.to_rich(alternating_rows=True)
>>> table = df.pirr.to_rich(alternating_rows=True,
...                         alternating_row_colors=("", "on blue"))
- Manual table optimization control:
>>> from rich import box
>>> table = df.pirr.to_rich(
...     auto_optimize=False, box=box.ROUNDED,
...     padding=(1, 2), show_edge=True
... )
- String formatting:
>>> table = df.pirr.to_rich(
...     format={"Sales": "${:.0f}", "Growth": "{:.1%}"}
... )
>>> table = df.pirr.to_rich(
...     format="{:.2f}", na_rep="N/A"
... )
- Combined styling:
>>> table = df.pirr.to_rich(
...     bg="viridis", tg="plasma", alternating_rows=True,
...     table_style="bold", title="My Data"
... )
>>> console.print(table)
Note
The method automatically optimizes table settings when background colors are detected, minimizing gaps for better visual appearance.
- pirrtools.pandas.cls#
alias of
I_N
pirrtools.sequences module#
This module provides utility functions for mathematical operations and computations, including Fibonacci number calculation, prime number generation, prime factorization, divisor calculation, and least common multiple (LCM) calculation.
- Classes:
FibCalculator: A class for calculating Fibonacci numbers using memoization and the fast doubling algorithm.
- Functions:
get_prime_generator: Generate an infinite sequence of prime numbers.
get_prime_factorization_generator: Generate the prime factorization of a number.
count_prime_factors: Count the prime factors of a number and return a pandas Series.
get_divisors: Get the divisors of a number.
lcm: Calculate the Least Common Multiple (LCM) of a set of numbers.
Examples
>>> fib = FibCalculator()
>>> fib(10)
55
>>> primes = get_prime_generator()
>>> next(primes)
2
>>> next(primes)
3
>>> list(get_prime_factorization_generator(28))
[2, 2, 7]
>>> count_prime_factors(28)
2    2
7    1
dtype: int64
>>> get_divisors(28)
[1, 2, 4, 7, 14]
>>> lcm(12, 15)
60
Note
This module relies on the pandas and numpy libraries.
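As an illustration of the LCM calculation listed above, the following is a minimal standard-library sketch. The name `lcm_sketch` is hypothetical and the package's own `lcm` may be implemented differently (the note above says the module relies on pandas and numpy); the sketch only folds the pairwise identity lcm(a, b) = |a·b| / gcd(a, b) over its arguments.

```python
from functools import reduce
from math import gcd


def lcm_sketch(*numbers: int) -> int:
    """Least common multiple of any number of integers."""
    def lcm_pair(a: int, b: int) -> int:
        if a == 0 or b == 0:
            return 0  # by convention, the LCM involving zero is zero
        return abs(a * b) // gcd(a, b)

    # Starting from 1 makes the empty call return the neutral element.
    return reduce(lcm_pair, numbers, 1)


print(lcm_sketch(12, 15))  # 60
```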
- class pirrtools.sequences.FibCalculator
Bases:
object
A class for calculating Fibonacci numbers using memoization and the fast doubling algorithm.
The FibCalculator class provides a method to compute the nth Fibonacci number efficiently by caching previously computed values and using the fast doubling algorithm. This method is particularly efficient for large Fibonacci numbers.
- Methods:
__init__: Initialize the FibCalculator with a base cache.
__call__: Calculate the nth Fibonacci number.
Example
>>> fib = FibCalculator()
>>> fib(10)
55
- Reference:
Fast Doubling Algorithm for Fibonacci Numbers: https://www.nayuki.io/page/fast-fibonacci-algorithms
- __call__(n: int) → int
Calculate the nth Fibonacci number using memoization and the fast doubling algorithm.
- Parameters:
n (int) – The index of the Fibonacci number to calculate. Must be non-negative.
- Returns:
The nth Fibonacci number.
- Return type:
int
- Raises:
ValueError – If n is a negative integer.
- Reference:
Fast Doubling Algorithm for Fibonacci Numbers: https://www.nayuki.io/page/fast-fibonacci-algorithms
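To make the combination of memoization and fast doubling concrete, here is a minimal sketch. The class name `FibSketch` and the contents of its base cache are assumptions for illustration, not the package's implementation; the identities used are the standard fast-doubling ones, F(2k) = F(k)·(2·F(k+1) − F(k)) and F(2k+1) = F(k)² + F(k+1)².

```python
from typing import Dict


class FibSketch:
    """Memoized fast-doubling Fibonacci (hypothetical stand-in)."""

    def __init__(self) -> None:
        # F(2) is cached too, so the recursion below always moves to
        # strictly smaller indices and terminates.
        self._cache: Dict[int, int] = {0: 0, 1: 1, 2: 1}

    def __call__(self, n: int) -> int:
        if n < 0:
            raise ValueError("n must be non-negative")
        if n in self._cache:
            return self._cache[n]
        k = n // 2
        a, b = self(k), self(k + 1)  # F(k), F(k+1)
        if n % 2 == 0:
            result = a * (2 * b - a)  # F(2k)
        else:
            result = a * a + b * b    # F(2k+1)
        self._cache[n] = result
        return result


fib = FibSketch()
print(fib(10))  # 55
```

Each call touches only O(log n) distinct indices, which is why the approach stays practical for large n.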
- pirrtools.sequences.get_prime_generator() → Generator[int, None, None]
Generate prime numbers.
- Yields:
The next prime number.
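One way an endless prime generator can work is trial division against the primes found so far. The sketch below is only an illustration of that idea; `prime_generator_sketch` is a hypothetical name, and the package may use a different algorithm.

```python
from itertools import count, islice
from typing import Generator, List


def prime_generator_sketch() -> Generator[int, None, None]:
    """Yield 2, 3, 5, ... indefinitely."""
    found: List[int] = []
    for candidate in count(2):
        is_prime = True
        for p in found:
            if p * p > candidate:
                break  # no prime divisor up to sqrt(candidate)
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            found.append(candidate)
            yield candidate


print(list(islice(prime_generator_sketch(), 10)))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Because the generator never terminates, callers bound it themselves, for example with `itertools.islice` or repeated `next()` calls.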
- pirrtools.sequences.get_prime_factorization_generator(n: int) → Generator[int, None, None]
Generate the prime factorization of a number.
- Parameters:
n – The number to factorize.
- Yields:
The next prime factor.
- pirrtools.sequences.count_prime_factors(n: int) → Series
Count the prime factors of a number.
- Parameters:
n – The number to factorize.
- Returns:
A pandas Series of counts of the prime factors of n.
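The relationship between the factorization generator and the factor counts can be sketched with the standard library alone. Both function names below are hypothetical, and `collections.Counter` is used as a stand-in for the pandas Series the real function returns.

```python
from collections import Counter
from typing import Generator


def prime_factorization_sketch(n: int) -> Generator[int, None, None]:
    """Yield the prime factors of n in ascending order, with multiplicity."""
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            yield divisor
            n //= divisor
        divisor += 1
    if n > 1:
        yield n  # whatever remains is itself prime


def count_prime_factors_sketch(n: int) -> Counter:
    """Tally how often each prime factor occurs (Counter instead of Series)."""
    return Counter(prime_factorization_sketch(n))


print(list(prime_factorization_sketch(28)))  # [2, 2, 7]
print(count_prime_factors_sketch(28))        # Counter({2: 2, 7: 1})
```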
Module contents#
Main entry point for the pirrtools package.
This module provides core utility functions for path management, module reloading, and environment setup. It automatically loads configuration from .pirc files and sets up matplotlib inline mode for IPython environments.
The module exposes key functionality from submodules and provides utilities for:
- System path manipulation
- Module and class reloading
- Configuration file loading
- IPython environment setup
Example
>>> from pirrtools import addpath, reload_entity
>>> addpath('/my/custom/path')
>>> reloaded_module = reload_entity(my_module)
- pirrtools.addpath(path, position=0, verbose=False)
Add a path to the system path at the specified position.
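- Parameters:
path – The path to add.
position (int, optional) – Index in the system path at which to insert the path. Defaults to 0.
verbose (bool, optional) – Whether to print status messages. Defaults to False.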
Note
The path is expanded and converted to absolute form before adding. Duplicate paths are not added.
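The behaviour described in the note (expand, make absolute, skip duplicates, insert at a position) can be sketched as follows. `addpath_sketch` is a hypothetical name, and the exact messages printed in verbose mode are assumptions.

```python
import os
import sys


def addpath_sketch(path: str, position: int = 0, verbose: bool = False) -> None:
    """Insert an expanded, absolute path into sys.path unless already present."""
    full = os.path.abspath(os.path.expanduser(path))
    if full in sys.path:
        if verbose:
            print(f"{full} is already on sys.path")
        return
    sys.path.insert(position, full)
    if verbose:
        print(f"added {full} at position {position}")


addpath_sketch("~/my/custom/path")
addpath_sketch("~/my/custom/path")  # second call is a no-op
```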
- pirrtools.reload_entity(entity)
Reload a module or class.
If a class is provided, its module is reloaded and the class is re-imported from the reloaded module.
- Parameters:
entity (module or class) – The module or class to reload.
- Returns:
The reloaded module or class.
- Return type:
module or class
Example
>>> import my_module
>>> reloaded_module = reload_entity(my_module)
>>> reloaded_class = reload_entity(MyClass)
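A rough sketch of the module-or-class dispatch described above, built on `importlib.reload`. The name `reload_entity_sketch` is hypothetical, raising `TypeError` for other inputs is an assumption, and the sketch only handles top-level classes. The standard-library `textwrap` module serves as a harmless reload target.

```python
import importlib
import inspect
import sys
import textwrap
from types import ModuleType
from typing import Union


def reload_entity_sketch(entity: Union[ModuleType, type]) -> Union[ModuleType, type]:
    """Reload a module, or reload a class's module and fetch the class again."""
    if inspect.ismodule(entity):
        return importlib.reload(entity)
    if inspect.isclass(entity):
        module = importlib.reload(sys.modules[entity.__module__])
        # Look the class up again so the caller gets the fresh definition.
        return getattr(module, entity.__name__)
    raise TypeError("entity must be a module or a class")  # assumption


reloaded_module = reload_entity_sketch(textwrap)
reloaded_class = reload_entity_sketch(textwrap.TextWrapper)
print(reloaded_class.__name__)  # TextWrapper
```

Instances created before the reload keep pointing at the old class object, which is the usual caveat with `importlib.reload`.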
- pirrtools.load_pirc_file(verbose=False)
Load the .pirc module from the home directory and add specified paths.
This function loads the .pirc.py file from the home directory and automatically adds any paths specified in the mypaths variable to the system path.
- Parameters:
verbose (bool, optional) – Whether to print status messages. Defaults to False.
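The loading step described above might look roughly like the following sketch. Everything beyond the documented file name (.pirc.py) and variable name (mypaths) is an assumption: the function name `load_pirc_sketch`, the `home` parameter (added so the sketch can be tried against a scratch directory), the returned list, and the choice to insert at the front of `sys.path`.

```python
import importlib.util
import sys
from pathlib import Path
from typing import List, Optional


def load_pirc_sketch(home: Optional[Path] = None, verbose: bool = False) -> List[str]:
    """Execute <home>/.pirc.py and add its `mypaths` entries to sys.path."""
    home = Path.home() if home is None else Path(home)
    pirc = home / ".pirc.py"
    if not pirc.exists():
        if verbose:
            print(f"no {pirc} found")
        return []
    # Load the file as a throwaway module named "pirc".
    spec = importlib.util.spec_from_file_location("pirc", pirc)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    added: List[str] = []
    for entry in getattr(module, "mypaths", []):
        full = str(Path(entry).expanduser().resolve())
        if full not in sys.path:
            sys.path.insert(0, full)
            added.append(full)
            if verbose:
                print(f"added {full}")
    return added
```

Calling `load_pirc_sketch()` with no arguments would read the real home directory, so the block above only defines the function.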
- pirrtools.load_matplotlib_inline(verbose=False)
Load the ‘%matplotlib inline’ magic command in IPython if available.
- Parameters:
verbose (bool, optional) – Whether to print status messages. Defaults to False.
- pirrtools.get_base_package(module)
Get the base package name of a module.
- Parameters:
module (module) – The module to get the base package of.
- Returns:
The name of the base package (first component of module.__name__).
- Return type:
str
Example
>>> import numpy.linalg
>>> get_base_package(numpy.linalg)
'numpy'
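Since the return value is described as the first component of `module.__name__`, the lookup can be sketched in one line. `get_base_package_sketch` is a hypothetical name, and a standard-library package is used here so the example runs without numpy.

```python
import xml.etree.ElementTree
from types import ModuleType


def get_base_package_sketch(module: ModuleType) -> str:
    """Return the top-level package name of a module."""
    return module.__name__.split(".")[0]


print(get_base_package_sketch(xml.etree.ElementTree))  # xml
```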
- pirrtools.find_instances(cls, module, tracker_type=<class 'pirrtools.structures.attrdict.AttrDict'>, filter_func=None)
Find all instances of a class in a module and its submodules.
- Parameters:
cls (type) – The class type to search for instances of.
module (module) – The module to search in.
tracker_type (type, optional) – The container type to use for results. Defaults to AttrDict.
filter_func (callable, optional) – A function to filter results. Should accept (name, obj) and return bool. If None, no filtering is applied.
- Returns:
A nested structure containing found instances, organized by module hierarchy.
- Return type:
tracker_type
Example
>>> instances = find_instances(MyClass, my_module)
>>> print(instances.submodule.instance_name)
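The recursive search can be pictured with the sketch below. It is a simplified illustration under several assumptions: a plain `dict` replaces the AttrDict tracker, only submodules that are already imported and bound as attributes of their parent are visited, and the names `find_instances_sketch`, `Marker`, `pkg`, and `pkg.sub` are hypothetical.

```python
import inspect
from types import ModuleType
from typing import Any, Callable, Dict, Optional, Set


def find_instances_sketch(
    cls: type,
    module: ModuleType,
    filter_func: Optional[Callable[[str, Any], bool]] = None,
    _seen: Optional[Set[str]] = None,
) -> Dict[str, Any]:
    """Collect instances of cls in module, nesting results per submodule."""
    seen = set() if _seen is None else _seen
    seen.add(module.__name__)
    found: Dict[str, Any] = {}
    for name, obj in list(vars(module).items()):
        if isinstance(obj, cls):
            if filter_func is None or filter_func(name, obj):
                found[name] = obj
        elif (inspect.ismodule(obj)
              and obj.__name__.startswith(module.__name__ + ".")
              and obj.__name__ not in seen):
            nested = find_instances_sketch(cls, obj, filter_func, seen)
            if nested:  # keep only submodules that contain matches
                found[name] = nested
    return found


class Marker:
    """Hypothetical class whose instances we search for."""


# Build a tiny in-memory package so the example is self-contained.
pkg = ModuleType("pkg")
sub = ModuleType("pkg.sub")
pkg.a, pkg.sub, sub.b = Marker(), sub, Marker()

result = find_instances_sketch(Marker, pkg)
print(sorted(result))         # ['a', 'sub']
print(sorted(result["sub"]))  # ['b']
```

With an attribute-style container in place of `dict`, the nested result could be read as `result.sub.b`, matching the access pattern in the example above.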