Pirrtools Interactive Tutorial

This notebook demonstrates the key features of the pirrtools library with interactive examples.

[ ]:
import pandas as pd
import pirrtools
from pirrtools.structures import AttrDict, AttrPath

Pandas Caching with Pirrtools

The core feature of pirrtools is caching pandas DataFrames on disk in the Feather format.

[ ]:
# Create a sample DataFrame
df = pd.DataFrame({
    'product': ['A', 'B', 'C', 'D', 'E'],
    'sales': [100, 150, 200, 120, 180],
    'region': ['North', 'South', 'East', 'West', 'Central']
})

print("Sample DataFrame:")
df

[ ]:
# Use the pirr accessor for caching
# This will save the DataFrame to cache and allow quick reloading
cache_path = '/tmp/sample_data.feather'
df.pirr.to_cache(cache_path, overwrite=True)
print(f"DataFrame cached to: {cache_path}")

AttrPath - Attribute-based File Navigation

Navigate the file system using dot notation with intelligent file viewing.

[ ]:
# Navigate to workspace directory
workspace = AttrPath('/workspace')
print(f"Workspace path: {workspace}")
print(f"Is directory: {workspace.is_dir()}")

# List some contents
if hasattr(workspace, 'D'):
    print(f"\nDirectories available: {list(workspace.D.__dict__.keys())[:5]}")
if hasattr(workspace, 'F'):
    print(f"Files available: {list(workspace.F.__dict__.keys())[:5]}")
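
To show how dot-notation navigation can work in principle, here is a small sketch built on `pathlib` and `__getattr__`. The `DotPath` class is a hypothetical stand-in written for this tutorial; it does not reproduce how `AttrPath` is implemented and omits features such as file viewing.

```python
import tempfile
from pathlib import Path


class DotPath:
    """Hypothetical stand-in for attribute-based navigation (not pirrtools' AttrPath)."""

    def __init__(self, path):
        self._path = Path(path)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so `_path` and the
        # methods defined on the class never pass through here.
        if name.startswith("_"):
            raise AttributeError(name)
        child = self._path / name
        if child.exists():
            return DotPath(child)
        raise AttributeError(f"no entry named {name!r} in {self._path}")

    def __dir__(self):
        # Add child names that are valid identifiers, for tab completion.
        names = set(super().__dir__())
        if self._path.is_dir():
            names.update(p.name for p in self._path.iterdir() if p.name.isidentifier())
        return sorted(names)

    def __repr__(self):
        return f"DotPath({str(self._path)!r})"


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "data" / "raw").mkdir(parents=True)
    root = DotPath(tmp)
    raw = root.data.raw                      # same as Path(tmp) / "data" / "raw"
    raw_name = raw._path.name
    raw_is_dir = raw._path.is_dir()
    listed = "data" in dir(root)             # child shows up for completion
    has_missing = hasattr(root, "missing")   # unknown names raise AttributeError
```

A limitation of this simple approach is that names which are not valid Python identifiers, such as `report.csv`, cannot be reached through attribute access and still need an explicit path.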

AttrDict - Dictionary with Attribute Access

Access dictionary values using dot notation.

[ ]:
# Create an AttrDict
config = AttrDict({
    'database': {
        'host': 'localhost',
        'port': 5432,
        'name': 'mydb'
    },
    'debug': True,
    'features': ['caching', 'logging', 'monitoring']
})

print(f"Database host: {config.database.host}")
print(f"Database port: {config.database.port}")
print(f"Debug mode: {config.debug}")
print(f"Available features: {config.features}")
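
For readers curious about the mechanism, attribute access on a dictionary can be sketched in a few lines. `DotDict` below is a hypothetical minimal class written for illustration; it is not the pirrtools `AttrDict` and may differ from it in behavior.

```python
class DotDict(dict):
    """Hypothetical minimal attribute-access dict (not pirrtools' AttrDict)."""

    def __init__(self, *args, **kwargs):
        super().__init__()
        # Route every initial item through __setitem__ so nested dicts get wrapped.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def __setitem__(self, key, value):
        # Wrap nested plain dicts so chained dot access works.
        if isinstance(value, dict) and not isinstance(value, DotDict):
            value = DotDict(value)
        super().__setitem__(key, value)

    def __getattr__(self, name):
        # Reached only when normal attribute lookup fails, so dict methods
        # such as `keys` and `items` keep working as usual.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


settings = DotDict({"database": {"host": "localhost", "port": 5432}, "debug": True})
settings.database.port = 5433        # attribute assignment writes the key
host = settings.database.host
```

Because `__getattr__` runs only as a fallback, a key that collides with a dict method name (for example `"keys"`) remains reachable only through bracket access in this sketch.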