IdeaWeaver

Python 3.12 · License: MIT · Documentation · CLI Ready

A comprehensive CLI tool for AI model training, evaluation, and deployment with advanced RAG capabilities and MCP (Model Context Protocol) integration. Train, fine-tune, and deploy language models with enterprise-grade features. Visit: https://github.com/ideaweaver-ai-code/ideaweaver

Figure: IdeaWeaver architecture overview.

✨ Key Features

One-Click Setup - Automated Python 3.12 environment with all dependencies
Advanced RAG - Traditional + Agentic RAG with multiple vector stores
Flexible Training - LoRA, QLoRA, and full fine-tuning support
Comprehensive Evaluation - Built-in benchmarks + custom metrics
MCP Integration - GitHub, Terraform, and AWS connectors
Multi-Agent Workflows - CrewAI pipeline support
Configuration Validation - YAML validation and schema checking

NOTE:

Python Version Compatibility

We currently use Python 3.12 for the IdeaWeaver project because some of the tools and dependencies we rely on do not yet support Python 3.13. Once they officially do, we plan to upgrade the project accordingly.

🚀 Quick Start

Installation

# One-line installation
curl -LsSf https://raw.githubusercontent.com/ideaweaver-ai-code/ideaweaver/main/setup_environments.sh | sh

# Or traditional installation
git clone https://github.com/ideaweaver-ai-code/ideaweaver.git
cd ideaweaver
chmod +x setup_environments.sh
./setup_environments.sh
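
If you prefer to review the installer before running it, download it first (same script as above):

# Download, inspect, then run the setup script
curl -LsSf https://raw.githubusercontent.com/ideaweaver-ai-code/ideaweaver/main/setup_environments.sh -o setup_environments.sh
less setup_environments.sh
sh setup_environments.sh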

Environment Setup

⚠️ Important: IdeaWeaver requires Python 3.12. Make sure you have Python 3.12 installed before proceeding.

  1. Check Python Version

    python --version
    Python 3.12.10
    

  2. Activate the Environment

    # On Unix/macOS
    source ideaweaver-env/bin/activate
    

  3. Verify Installation

    ideaweaver --help
    

Expected Output:

Usage: ideaweaver [OPTIONS] COMMAND [ARGS]...

  IdeaWeaver Model Training CLI - A comprehensive tool for AI model training,
  evaluation, and deployment.

  Features include LoRA/QLoRA fine-tuning, RAG systems, MCP integration, and
  enterprise-grade model management. For detailed documentation and examples,
  visit: https://github.com/ideaweaver-ai-code/ideaweaver

Options:
  --help  Show this message and exit.

Commands:
  agent       Intelligent agent workflows for creative and analytical tasks.
  download    Download a model from Hugging Face Hub
  evaluate    Evaluate a model using lm-evaluation-harness with...
  finetune    Supervised fine-tuning commands with LoRA, QLoRA, and full...
  list-tasks  List all available evaluation tasks from lm-evaluation-harness
  mcp         Model Context Protocol (MCP) integration commands
  rag         RAG (Retrieval-Augmented Generation) commands
  train       Train a model with AutoTrain Advanced.
  validate    Validate a configuration file

If you see the help menu, your environment is set up correctly!

Note: If you need to install Python 3.12:

  • macOS: brew install python@3.12
  • Ubuntu/Debian: sudo apt install python3.12
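
If you would rather create the virtual environment by hand than run setup_environments.sh, a minimal sketch (assuming python3.12 is on your PATH, and reusing the environment name from the activation step above):

# Create and activate a Python 3.12 virtual environment manually
python3.12 -m venv ideaweaver-env
source ideaweaver-env/bin/activate
python --version   # should report Python 3.12.x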


Core Usage: Step-by-Step

1. Download a Model

Download models from Hugging Face Hub for local use or further training.

ideaweaver download microsoft/DialoGPT-medium

Tip: Set your Hugging Face token for private/gated models.
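
For example, using the standard Hugging Face Hub tooling (huggingface-cli and the HF_TOKEN variable come from the huggingface_hub package, not from IdeaWeaver itself):

# Option 1: interactive login; stores the token locally
huggingface-cli login

# Option 2: export the token for the current shell session
export HF_TOKEN=hf_xxxxxxxxxxxxxxxx   # placeholder value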


2. Validate Configuration

Check your YAML config or dataset before training to catch errors early.

ideaweaver validate --config configs/config.yml
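
For illustration only, a minimal config might look like the sketch below; every field name is an assumption mirroring the train command's flags, not the documented IdeaWeaver schema:

# Write an illustrative config (hypothetical field names), then validate it
mkdir -p configs
cat > configs/config.yml <<'EOF'
model: google/bert_uncased_L-2_H-128_A-2
dataset: ./data/train.csv
task: text_classification
epochs: 3
batch_size: 8
EOF
ideaweaver validate --config configs/config.yml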

3. Model Training

Train a model using a config file or command-line options. Supports LoRA, QLoRA, and full fine-tuning.

# Config-based training
ideaweaver train --config configs/config.yml

# Or command-line training
ideaweaver train --model google/bert_uncased_L-2_H-128_A-2 \
    --dataset ./data/train.csv \
    --task text_classification \
    --epochs 3 --batch-size 8

Figure: Training architecture diagram.


4. Model Evaluation

Evaluate your trained or downloaded models on standard or custom benchmarks.

ideaweaver evaluate --model ./fine-tuned-model --tasks hellaswag,arc_easy
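
Task names must match those known to lm-evaluation-harness; you can enumerate them first with the list-tasks command shown in the help output above:

# List the evaluation tasks accepted by `ideaweaver evaluate --tasks ...`
ideaweaver list-tasks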

Figure: TensorBoard visualization of evaluation metrics.


5. Agent Workflows

Use intelligent agents for creative writing, research, social media, and more.

ideaweaver agent generate_storybook --theme "brave little mouse" --target-age "3-5"

Figure: IdeaWeaver agent architecture.


6. Finetuning

Fine-tune models with LoRA, QLoRA, or full parameter updates for your data.

ideaweaver finetune lora --model microsoft/DialoGPT-medium --dataset ./data.json --output-dir ./fine-tuned-model
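
The on-disk format expected for ./data.json is not shown in this guide; a common instruction-tuning layout looks like the sketch below, where the instruction/response field names are hypothetical rather than the IdeaWeaver spec:

# Create a tiny illustrative dataset -- the field names are hypothetical
cat > data.json <<'EOF'
[
  {"instruction": "Summarize: The quick brown fox jumps over the lazy dog.",
   "response": "A fox jumps over a dog."}
]
EOF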

7. RAG (Retrieval-Augmented Generation)

Build and query knowledge bases with advanced retrieval and hybrid search.

ideaweaver rag create-kb --name mykb
ideaweaver rag ingest --kb mykb --source ./docs/
ideaweaver rag query --kb mykb --question "What is RAG?"
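
As a quick smoke test, the three commands chain together over a throwaway document (the sample file is illustrative):

# Build a tiny knowledge base from one file and query it end to end
mkdir -p docs
echo "RAG combines document retrieval with text generation." > docs/rag.md
ideaweaver rag create-kb --name mykb
ideaweaver rag ingest --kb mykb --source ./docs/
ideaweaver rag query --kb mykb --question "What is RAG?"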

8. MCP (Model Context Protocol) Integration

Connect to external tools and services (GitHub, AWS, etc.) for advanced workflows.

ideaweaver mcp list-servers
ideaweaver mcp enable github

🏗️ Architecture

System Overview

Figure: System overview of the training architecture.

Agent Architecture

Figure: Relationship between IdeaWeaver, CrewAI, and specialized generators.

RAG Capabilities Comparison

| Feature           | Traditional RAG    | Agentic RAG              | IdeaWeaver Advantage        |
|-------------------|--------------------|--------------------------|-----------------------------|
| Query Processing  | Direct retrieval   | Multi-step reasoning     | Both approaches supported   |
| Context Awareness | Single-turn        | Multi-turn conversations | Persistent context tracking |
| Tool Integration  | Limited            | Extensive tool use       | MCP protocol integration    |
| Self-Correction   | None               | Built-in reflection      | Advanced error handling     |
| Scalability       | Vector search only | Dynamic planning         | Hybrid approach             |

🔧 Core Components

Model Training & Fine-tuning

  • LoRA/QLoRA: Memory-efficient fine-tuning for large models
  • Full Parameter: Complete model retraining for maximum customization
  • Multi-GPU: Distributed training support with automatic scaling
  • Experiment Tracking: MLflow, Weights & Biases, Comet integration

Fine-tuning Examples

# Full fine-tuning with DialoGPT
ideaweaver finetune full \
  --model microsoft/DialoGPT-small \
  --dataset datasets/instruction_following_sample.json \
  --output-dir ./test_full_basic \
  --epochs 5 \
  --batch-size 2 \
  --gradient-accumulation-steps 2 \
  --learning-rate 5e-5 \
  --max-seq-length 256 \
  --gradient-checkpointing \
  --verbose

Example output:

🚀 Supervised Fine-Tuner initialized
   Model: microsoft/DialoGPT-small
   Method: full
   Task: instruction_following
🔧 Setting up model and tokenizer...
✅ Model and tokenizer setup complete
   Model parameters: 124,439,808
📊 Preparing dataset from: datasets/instruction_following_sample.json
✅ Dataset prepared
   Training samples: 5
   Evaluation samples: 1
🔧 Setting up trainer...
✅ Trainer setup complete
🚀 Starting fine-tuning...
{'loss': 26.2043, 'grad_norm': nan, 'learning_rate': 1.5076844803522922e-06, 'epoch': 5.0}
{'train_runtime': 13.3988, 'train_samples_per_second': 1.866, 'train_steps_per_second': 0.746, 'train_loss': 26.204254150390625, 'epoch': 5.0}
✅ Fine-tuning completed!
📁 Model saved to: ./test_full_basic
📊 Evaluating model...
{'eval_loss': nan, 'eval_runtime': 0.1933, 'eval_samples_per_second': 5.175, 'eval_steps_per_second': 5.175, 'epoch': 5.0}
✅ Evaluation complete
   eval_loss: nan
   eval_runtime: 0.1933
   eval_samples_per_second: 5.1750
   eval_steps_per_second: 5.1750
   epoch: 5.0000
✅ Full fine-tuning completed!
📁 Model saved to: ./test_full_basic

Listing Fine-tuned Models

ideaweaver finetune list-models

Example output:

📁 Fine-tuned models found:
==================================================
📂 .

📂 cli-final-test
📂 cli-final-test/checkpoint-5
📂 comet-clean
📂 comet-clean/checkpoint-5
📂 comet-experiment
📂 comet-experiment/checkpoint-5

RAG Systems

  • Vector Stores: Chroma, Qdrant, FAISS with automatic optimization
  • Embeddings: Sentence Transformers, OpenAI, custom models
  • Agentic RAG: Multi-step reasoning with tool integration
  • Document Processing: PDF, DOCX, TXT, Markdown with smart chunking

MCP Integration

  • GitHub: Repository analysis, issue tracking, code reviews
  • AWS: S3, Lambda, SageMaker integration for cloud deployment

Evaluation & Benchmarking

  • Standard Tasks: HellaSwag, ARC, WinoGrande, MMLU, and more
  • Custom Metrics: Domain-specific evaluation frameworks
  • Model Comparison: Side-by-side performance analysis
  • RAG Evaluation: RAGAS framework for retrieval assessment

📊 Example Workflows

Complete Model Development Pipeline

# 1. Setup and Data Preparation
ideaweaver rag init --vector-store chroma
ideaweaver rag add-documents --path ./training-docs/

# 2. Generate Training Data
ideaweaver rag query "Generate training examples for customer service" --save-context

# 3. Model Fine-tuning  
ideaweaver finetune lora \
    --model microsoft/DialoGPT-medium \
    --dataset ./generated-training-data \
    --output-dir ./fine-tuned-model

# 4. Evaluation & Comparison
ideaweaver evaluate \
    --model ./fine-tuned-model \
    --tasks hellaswag,arc_easy,custom-eval \
    --output evaluation-results.json

Multi-Agent Content Generation

# Setup CrewAI workflow
ideaweaver agent generate-storybook \
    --topic "AI model training best practices" \
    --agents researcher,writer,reviewer \
    --output training-guide.md

CLI Usage Examples

Model Training

# Config-based training
ideaweaver train --config configs/config.yml --verbose

# Command line training
ideaweaver train --model google/bert_uncased_L-2_H-128_A-2 \
    --dataset ./data/train.csv \
    --task text_classification \
    --epochs 3 --batch-size 8

# AWS Bedrock deployment training (coming soon)
ideaweaver train --model meta-llama/Llama-2-7b-hf \
    --dataset ./data/train.csv \
    --push-to-bedrock \
    --bedrock-model-name my-llama-model \
    --bedrock-s3-bucket my-bedrock-bucket \
    --bedrock-role-arn arn:aws:iam::123456789012:role/BedrockImportRole

TensorBoard UI Example

Figure: TensorBoard visualization of evaluation metrics.
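
To open this UI yourself, point TensorBoard at wherever your run writes its event files; the ./logs path below is an assumption:

# Launch the TensorBoard UI (serves on port 6006 by default)
pip install tensorboard       # if not already installed
tensorboard --logdir ./logs   # replace ./logs with your run's log directory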

Weights & Biases (wandb) Online Example

WANDB_MODE=online WANDB_API_KEY=XXXXXX ideaweaver evaluate ./my-qwen2-model \
  --tasks hellaswag,arc_easy \
  --batch-size 2 \
  --output-path ./evaluation_results_wandb \
  --generate-report \
  --report-to wandb \
  --wandb-project my-eval-project \
  --limit 2 \
  --verbose

Example output:

🚀 Starting LLM evaluation for model: ./my-qwen2-model
📊 Tracking with: wandb
✅ Local model found: ./my-qwen2-model
wandb: Currently logged in as: laprashant (laprashant-startup) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.19.11
wandb: Run data is saved locally in /Users/plakhera/Documents/model-registry/wandb/run-20250605_201404-irki0njw
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run autumn-shadow-1
wandb: ⭐️ View project at https://wandb.ai/laprashant-startup/my-eval-project
wandb: 🚀 View run at https://wandb.ai/laprashant-startup/my-eval-project/runs/irki0njw
✅ Wandb initialized: https://wandb.ai/laprashant-startup/my-eval-project/runs/irki0njw
📊 Starting evaluation on tasks: hellaswag, arc_easy
📁 Output path: ./evaluation_results_wandb
🔧 Running command: lm_eval --model hf --model_args pretrained=./my-qwen2-model --tasks hellaswag,arc_easy --device auto --batch_size 2 --output_path ./evaluation_results_wandb --log_samples --limit 2 --wandb_args project=my-eval-project
✅ Evaluation completed in 117.12 seconds
📋 Evaluation output:
hf (pretrained=./my-qwen2-model), gen_kwargs: (None), limit: 2.0, num_fewshot: None, batch_size: 2
|  Tasks  |Version|Filter|n-shot| Metric |   |Value|   |Stderr|
|---------|------:|------|-----:|--------|---|----:|---|-----:|
|arc_easy |      1|none  |     0|acc     |↑  |    0|±  |     0|
|         |       |none  |     0|acc_norm|↑  |    1|±  |     0|
|hellaswag|      1|none  |     0|acc     |↑  |    0|±  |     0|
|         |       |none  |     0|acc_norm|↑  |    0|±  |     0|

⚠️  No results.json found, creating basic results structure
✅ Results logged to Weights & Biases
✅ Evaluation completed successfully!
📄 Evaluation report saved to: evaluation_report_._my-qwen2-model.md
📄 Report saved to: evaluation_report_._my-qwen2-model.md

📊 Evaluation Summary:
   dummy_task - dummy_metric: 1.0000
wandb: 
wandb: Run history:
wandb: dummy_task_dummy_metric ▁
wandb:     evaluation_duration ▁
wandb:               num_tasks ▁
wandb: 
wandb: Run summary:
wandb: dummy_task_dummy_metric 1
wandb:     evaluation_duration 117.1197
wandb:               num_tasks 2
wandb: 
wandb: 🚀 View run autumn-shadow-1 at: https://wandb.ai/laprashant-startup/my-eval-project/runs/irki0njw
wandb: ⭐️ View project at: https://wandb.ai/laprashant-startup/my-eval-project
wandb: Synced 5 W&B file(s), 1 media file(s), 2 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20250605_201404-irki0njw/logs
✅ Wandb run finished
/opt/homebrew/Cellar/python@3.12/3.12.10_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12/tempfile.py:940: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/var/folders/zh/6p21kgbs2p192mpqm1csfd3r0000gn/T/tmpr6z8m8xf'>
  _warnings.warn(warn_message, ResourceWarning)

wandb UI Example

Figure: wandb visualization of evaluation metrics.

The evaluation command provides comprehensive model assessment with:

  • TensorBoard integration for visualization
  • Weights & Biases (wandb) tracking
  • Detailed performance metrics for each task
  • Automatic report generation
  • Offline and online logging options

For more details on evaluation options and available tasks, see the Evaluation Guide.

Agent Commands

# Check LLM status
ideaweaver agent check-llm

# Storybook generation
ideaweaver agent generate_storybook --theme "brave little mouse" --target-age "3-5"

# Research and writing
ideaweaver agent research_write --topic "AI in healthcare"

# LinkedIn post creation
ideaweaver agent linkedin_post --topic "AI trends in 2025"

# Travel planning
ideaweaver agent travel_plan --destination "Tokyo" --duration "7 days" --budget "$2000-3000"

# Stock analysis
ideaweaver agent stock_analysis --symbol AAPL

Example output for check-llm:

🔍 Checking for Ollama availability...
✅ Ollama is available! Using model: phi3:mini
📋 Available models: deepseek-r1:1.5b, phi3:mini
✅ Created CrewAI LLM wrapper successfully

✅ LLM Status:
   Provider: ollama
   Model: phi3:mini
   Status: Connected and ready

🌟 Enterprise Features

📈 Performance Benchmarks

Training Speed Improvements

  • LoRA Fine-tuning: 3-5x faster than full parameter training
  • QLoRA: 50% memory reduction with minimal quality loss

RAG System Performance

  • Query Speed: <200ms average response time
  • Accuracy: 15-25% improvement over baseline RAG
  • Scalability: Handles 10M+ documents efficiently

🚀 Getting Started

  1. Installation Guide - Setup in under 5 minutes
  2. Quick Start - Your first model in 15 minutes
  3. Configuration - Customize for your needs
  4. CLI Reference - Complete command documentation

📚 Documentation

  • Getting Started
  • Tutorials
  • Reference

🤝 Community

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.