Data Analysis Skills

Skills for analyzing and visualizing data with Python

/api/skills/data-analysis

7 Skills

analyze_data

Perform statistical analysis on datasets

/api/skills/data-analysis/analyze_data
Skill python
Content Type
Static (Markdown)
Language
python
Widgets
n/a

Trigger Keywords

analyze, statistics, mean, median, correlation, regression, distribution, outliers

Required Packages

pandas, numpy, scipy, statsmodels

Instructions Preview

# Data Analysis Skill

When performing statistical analysis, use:

1. **Quick analysis**: The helper script at `/home/daytona/skills/analyze_data/analyze.py` via `execute_shell`
2. **Custom analysis**: Your own Python code, typically written into an artifact script and run via `execute_shell`

## Using the Helper Script

```bash
# Run all analyses
python /home/daytona/skills/analyze_data/analyze.py /home/daytona/files/data.csv --all

# Run specific analyses
python /home/daytona/skills/analyze_data/analyze.py data.csv --describe --correlate

# Save results to file
python /home/daytona/skills/analyze_data/analyze.py data.csv --all --output /home/daytona/out/results.json
```

## Custom Analysis Guidelines

When performing statistical analysis, follow these guidelines:

## Analysis Workflow

1. **Data Inspection**: Always start by examining the data structure
2. **Cleaning**: Handle missing values, duplicates, and outliers
3. **Descriptive Stats**: Calculate basic statistics first
4. **Deep...
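The custom-analysis path in the workflow above can be sketched in a few lines, assuming `pandas` is available in the sandbox; the inline DataFrame stands in for a real `/home/daytona/files/data.csv`:

```python
import pandas as pd

# Stand-in for pd.read_csv("/home/daytona/files/data.csv")
df = pd.DataFrame({
    "revenue": [120.0, 135.5, 128.0, 900.0, 131.2, 127.8],
    "units":   [10, 12, 11, 13, 12, 11],
})

# Steps 1-3: inspect structure and compute descriptive stats first
summary = df.describe()
corr = df.corr(numeric_only=True)

# Flag outliers with the 1.5 * IQR rule before any deeper analysis
q1, q3 = df["revenue"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["revenue"] < q1 - 1.5 * iqr) | (df["revenue"] > q3 + 1.5 * iqr)]
```

In a real run this script would be written as an artifact and executed via `execute_shell`, with results printed or saved under `/home/daytona/out/`.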

analyze_spreadsheet

Explore and analyze Excel/CSV spreadsheets with pandas in a Daytona sandbox.

/api/skills/data-analysis/analyze_spreadsheet
Skill python
Content Type
Static (Markdown)
Language
python
Widgets
n/a

Trigger Keywords

spreadsheet, excel, xlsx, ods, csv, pandas, data analysis

Required Packages

pandas, openpyxl

Instructions Preview

# Spreadsheet Analysis Skill

Use this skill to analyze spreadsheet documents (Excel `.xlsx`, `.xls`, `.ods`, `.csv`) stored in Vertesia using Python (`pandas`, `openpyxl`) inside a Daytona sandbox.

The goal is to:

- **Discover the dataset shape first** (sheets, columns, types, ranges, missingness).
- Then perform **targeted analysis** (aggregations, trends, segments) using that schema.
- Use **artifacts** (`write_artifact`, `read_artifact`, `grep_artifact`) to pass data and reuse scripts/results across steps.

Important:

- You **must** work with Vertesia document IDs, not local file system paths like `/Users/...`.
- Whenever you need to analyze spreadsheets, you must first identify the **exact documents** to use (one or several):
  - If the user provides one or more **document IDs**, use them directly.
  - If the user provides one or more **names/labels** (e.g. `MODE_2025_Data_JanToAug`), call `search_documents` with a name-focused query (and type filter if known) to resolve each to ...
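The "discover the dataset shape first" step can be sketched as a small profiling pass, assuming `pandas` is available; the inline DataFrame below stands in for one sheet of an already-downloaded `/home/daytona/documents/<DOC_ID>.xlsx`:

```python
import pandas as pd

# Stand-in for one sheet loaded via pd.read_excel(..., sheet_name=...)
df = pd.DataFrame({
    "region": ["EU", "US", "EU", None],
    "sales":  [100.0, 250.0, None, 175.0],
})

def profile(df: pd.DataFrame) -> dict:
    """Summarize columns, dtypes, missingness, and cardinality before targeted analysis."""
    return {
        col: {
            "dtype": str(df[col].dtype),
            "missing": int(df[col].isna().sum()),
            "unique": int(df[col].nunique(dropna=True)),
        }
        for col in df.columns
    }

shape = profile(df)
```

The resulting dict can be written out with `write_artifact` so later steps reuse the schema instead of re-scanning the file.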

create_spreadsheet

Generate Excel/CSV spreadsheets from data using Python in the Daytona sandbox, with artifacts and execute_shell.

/api/skills/data-analysis/create_spreadsheet
Skill python
Content Type
Static (Markdown)
Language
python
Widgets
n/a

Trigger Keywords

spreadsheet, excel, xlsx, csv, export, report

Required Packages

pandas, openpyxl

Instructions Preview

# Create Spreadsheet Skill

Use this skill to create spreadsheets (Excel `.xlsx` or `.csv`) from data available in Vertesia or artifacts.

The recommended pattern is:

1. **Stage inputs as artifacts** (CSV/JSON/text) or pull them from Vertesia documents.
2. **Write a Python script as an artifact** under `scripts/`.
3. **Execute the script via `execute_shell`** in the Daytona sandbox.
4. **Write the spreadsheet into `/home/daytona/out/`** so it becomes an `out/*` artifact that can be reused or downloaded.

For long-lived knowledge base objects, a human or a follow-up workflow can ingest the resulting `.xlsx` file (e.g. via `create_content_object` when it is available on a reachable URL).

---

## 1. Prepare input data as artifacts or documents

### 1.1 From existing documents

If data already lives in Vertesia:

1. Use `search_documents` to find the right documents and get their `id`s.
2. Use `fetch_document` when you need text content for context, or:
3. Use `execute_shell` with `docum...
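Step 4 of the pattern above (writing into the output directory) can be sketched with the standard library alone; a temporary directory stands in for `/home/daytona/out/`, and producing `.xlsx` instead of `.csv` would additionally require `openpyxl`:

```python
import csv
import tempfile
from pathlib import Path

rows = [
    {"month": "Jan", "revenue": 1200},
    {"month": "Feb", "revenue": 1350},
]

# Stand-in for /home/daytona/out/; the real sandbox path is assumed, not verified here
out_dir = Path(tempfile.mkdtemp())
out_path = out_dir / "report.csv"

with out_path.open("w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["month", "revenue"])
    writer.writeheader()
    writer.writerows(rows)

content = out_path.read_text()
```

In the sandbox, anything written under `/home/daytona/out/` becomes an `out/*` artifact that later steps can read or the user can download.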

discover_schema

Infer schema, column types, and basic statistics for tabular data (CSV or Excel) so the agent can understand and reuse the data shape.

/api/skills/data-analysis/discover_schema
Skill python
Content Type
Static (Markdown)
Language
python
Widgets
n/a

Trigger Keywords

schema, profile, columns, types, structure, overview

Required Packages

pandas, numpy, openpyxl

Instructions Preview

# Discover Dataset Schema

This skill inspects a tabular dataset (CSV or Excel) and returns a structured schema description that other tools and the agent can reuse.

It is designed to work **without any predefined schema**:

- Infers column data types (numeric, categorical, datetime, boolean, text).
- Computes basic stats (missingness, cardinality, ranges, sample values).
- Suggests likely keys and time indexes.

## Usage

Place your data file in the sandbox files directory (typically via `write_artifact`):

- CSV: `/home/daytona/files/data.csv`
- Excel: `/home/daytona/files/data.xlsx` (first sheet by default)

### Example (CSV)

```bash
python /home/daytona/skills/discover_schema/discover_schema.py \
  /home/daytona/files/data.csv
```

### Example (Excel, specific sheet)

```bash
python /home/daytona/skills/discover_schema/discover_schema.py \
  /home/daytona/files/data.xlsx \
  --sheet "Sheet1"
```

## Output

The script prints a JSON object to stdout with a stable structure, for ex...
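A rough sketch of the kind of type inference such a script performs (the real `discover_schema.py` is not reproduced here; this is an assumption about its approach, using `pandas`):

```python
import json

import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3],
    "signup": ["2024-01-05", "2024-02-11", "2024-03-20"],
    "tier": ["free", "pro", "free"],
})

def infer_kind(s: pd.Series) -> str:
    """Classify a column as numeric, datetime, categorical, or text."""
    if pd.api.types.is_numeric_dtype(s):
        return "numeric"
    parsed = pd.to_datetime(s, errors="coerce")
    if parsed.notna().all():
        return "datetime"
    # Low cardinality relative to row count suggests a categorical column
    return "categorical" if s.nunique() <= max(10, len(s) // 2) else "text"

schema = {
    "columns": [
        {"name": c, "kind": infer_kind(df[c]), "missing": int(df[c].isna().sum())}
        for c in df.columns
    ]
}
print(json.dumps(schema, indent=2))
```

Printing JSON to stdout matches the skill's contract: downstream steps parse the output rather than re-deriving the schema.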

etl_pipeline

Extract, transform, and load tabular data (CSV/Excel/Parquet) in the Daytona sandbox using Python, pandas, and artifacts.

/api/skills/data-analysis/etl_pipeline
Skill python
Content Type
Static (Markdown)
Language
python
Widgets
n/a

Trigger Keywords

etl, extract, transform, load, pipeline, pandas, pyarrow, csv, excel, parquet

Required Packages

pandas, pyarrow

Instructions Preview

# ETL Pipeline Skill (Python)

Use this skill when you need to build **extract–transform–load (ETL)** workflows inside the Daytona sandbox.

The standard pattern is:

1. **Extract** data from Vertesia documents and artifacts into `/home/daytona/files/` or `/home/daytona/documents/`.
2. **Transform** with Python (`pandas`, `pyarrow`) in scripts under `/home/daytona/scripts/`.
3. **Load** final tables into `/home/daytona/out/` as CSV or Parquet so they become reusable `out/*` artifacts.

All code runs via `execute_shell` and scripts stored as artifacts (no inline `execute_code` tool).

## 1. Extract data into the sandbox

You can pull data from:

- **Vertesia documents** (CSV, Excel, JSON, etc.).
- **Artifacts** you create on the fly (CSV/JSON/text).

### 1.1 From Vertesia documents

1. Use `search_documents` to find the right sources and get their `id` values.
2. Use `execute_shell` with the `documents` parameter to download them:
   - `documents=[{"id": "<DOC_ID>", "mode": "file"}]`
   - `c...
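The three-step pattern can be sketched end to end, assuming `pandas`; an in-memory CSV stands in for a downloaded Vertesia document, a temporary directory for `/home/daytona/out/`, and writing Parquet instead of CSV would additionally need `pyarrow`:

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

# Extract: stand-in for a file under /home/daytona/documents/
raw = io.StringIO("region,amount\nEU,100\nUS,250\nEU,50\n")
df = pd.read_csv(raw)

# Transform: aggregate per region
totals = df.groupby("region", as_index=False)["amount"].sum()

# Load: write the final table where it would become a reusable out/* artifact
out_path = Path(tempfile.mkdtemp()) / "totals.csv"
totals.to_csv(out_path, index=False)
```

Keeping extract, transform, and load as separate script steps lets a failed stage be rerun from its inputs without repeating the whole pipeline.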

make_chart

Learn how to create interactive charts in responses using chart JSON specs rendered by the UI.

/api/skills/data-analysis/make_chart
Skill
Content Type
Static (Markdown)
Widgets
n/a

Trigger Keywords

chart, charts, graph, visualization, bar, line, area, pie, scatter, radar, funnel, treemap

Instructions Preview

# Chart Creation Skill

Use this skill when you need to create charts or graphs in your responses. The UI can render interactive charts from special markdown code blocks whose language is `chart`. You provide a JSON specification; the UI does the rest.

---

## 1. Chart Type Decision Tree

```
What are you visualizing?
├── Comparing categories? → bar
├── Trend over time?
│   ├── Emphasize volume/area? → area
│   └── Just the trend line? → line
├── Mix of bars and lines? → composed
├── Part of a whole (%)? → pie (use innerRadius for donut)
├── Correlation between 2 variables? → scatter
├── Multi-dimensional comparison? → radar
├── Progress toward goals? → radialBar
├── Conversion/funnel stages? → funnel
└── Hierarchical proportions? → treemap
```

---

## 2. Chart Types Reference

| Type | Use Case | Data Shape |
|------|----------|------------|
| `bar` | Category comparisons | `[{x, y1, y2...}]` |
| `line` | Trends over time | `[{x, y1, y2...}]` |
| `area` | Trends with volume emphasi...
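Building the spec programmatically and wrapping it in a `chart` fence can be sketched with the standard library; note the `"type"` and `"data"` keys here are an assumption inferred from the data shapes above, not a confirmed schema for the UI's renderer:

```python
import json

# Hypothetical bar-chart spec; key names assumed from the reference table, not verified
spec = {
    "type": "bar",
    "data": [
        {"x": "Q1", "revenue": 1200},
        {"x": "Q2", "revenue": 1500},
    ],
}

# Emit the markdown block whose language is `chart`, which the UI would render
chart_block = "```chart\n" + json.dumps(spec, indent=2) + "\n```"
print(chart_block)
```

Because the UI parses the fenced JSON, the payload must be valid JSON (double quotes, no trailing commas), which is one reason to generate it with `json.dumps` rather than by hand.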

update_spreadsheet

Modify existing Excel/CSV spreadsheets using Python in the Daytona sandbox, with artifacts and execute_shell.

/api/skills/data-analysis/update_spreadsheet
Skill python
Content Type
Static (Markdown)
Language
python
Widgets
n/a

Trigger Keywords

spreadsheet, excel, xlsx, csv, update, modify, report

Required Packages

pandas, openpyxl

Instructions Preview

# Update Spreadsheet Skill

Use this skill to **update existing spreadsheet documents** stored in Vertesia:

- Adjust or clean data.
- Add or modify sheets.
- Append new rows or computed metrics.
- Regenerate summary tabs, while preserving raw data where appropriate.

All updates are done via code running in the Daytona sandbox, with **scripts and data managed as artifacts**.

---

## 1. Locate and download the spreadsheet

1. Identify the document:
   - If the user knows the document ID, use it directly.
   - Otherwise, use `search_documents` (by name, type, or full_text) to find the correct spreadsheet document and get its `id`.
2. Download the file into the sandbox using `execute_shell` and the `documents` parameter:
   - `documents=[{"id": "<DOC_ID>", "mode": "file"}]`
   - `command="ls -l /home/daytona/documents"`

The sandbox will contain:

- `/home/daytona/documents/<DOC_ID>.<ext>` (e.g. `.xlsx` or `.csv`).

> Treat `/home/daytona/documents` as the source of truth for the original sp...
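A minimal sketch of the update loop after the download step, assuming `pandas`; the temp file stands in for a downloaded `/home/daytona/documents/<DOC_ID>.csv`, and updating `.xlsx` in place would need `openpyxl`:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in for the downloaded original under /home/daytona/documents/
src = Path(tempfile.mkdtemp()) / "doc.csv"
src.write_text("item,price,qty\nwidget,2.5,4\ngadget,10.0,2\n")

df = pd.read_csv(src)

# Append a computed metric while leaving the raw columns untouched
df["total"] = df["price"] * df["qty"]

# Write the updated file to a new path, keeping the original as the source of truth
updated = src.with_name("doc_updated.csv")
df.to_csv(updated, index=False)
```

Writing to a separate path mirrors the guidance above: the downloaded original stays intact, and the updated file can then be re-uploaded or staged as an artifact.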