Engineering Blueprint

Conduct Autonomous AI Research From Hypothesis to Paper

Orchestra-Research, May 2026

A comprehensive library of 98 AI research engineering skills across 23 domains, covering the full research lifecycle from literature survey and ideation through model architecture, training, optimization, evaluation, and paper writing.

Autoresearch, Model Architecture, Tokenization, Fine-Tuning, Mechanistic Interpretability, Data Processing, Post-Training, Safety & Alignment, Distributed Training, Infrastructure, Optimization, Evaluation, Inference & Serving, MLOps, Agents, RAG, Prompt Engineering, Observability, Multimodal, Emerging Techniques, ML Paper Writing, Ideation, Agent-Native Research Artifact

Install Command

Run this command to deploy the blueprint to your environment.

What problem does this solve?

Autonomous AI research agents need access to specialized expertise across dozens of research domains to conduct end-to-end experiments and discover novel findings. The autoresearch orchestration skill routes agents to domain-specific skills for execution, enabling research teams to move faster without mastering infrastructure, training frameworks, and evaluation tools separately.

How does it work?

Once set up, the autoresearch skill manages a two-loop research cycle. The inner loop runs rapid experiments on a measurable metric (validation loss, accuracy, or mechanistic properties). The outer loop steps back, synthesizes patterns across experiments, and decides the next research direction (deepen, broaden, pivot, or conclude). The agent reads structured research state (research-state.yaml, findings.md) at each step, routes to domain-specific skills when it needs training, evaluation, or infrastructure, and maintains a git-backed research log with timestamped discoveries. Output: end-to-end research projects with experiment logs, synthesis documents, and publication-ready papers.
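
The two-loop cycle described above can be sketched roughly as follows. This is a minimal illustration, not the autoresearch skill's actual implementation: `ResearchState`, `outer_loop`, `research_cycle`, the stall-counting rule, and the `min_gain` threshold are all hypothetical names and choices, and the in-memory state stands in for research-state.yaml and findings.md. It assumes a lower-is-better metric such as validation loss.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchState:
    """Hypothetical in-memory stand-in for research-state.yaml / findings.md."""
    last_decision: str = "deepen"
    best_metric: float = float("inf")  # lower is better, e.g. validation loss
    stalls: int = 0                    # consecutive rounds without improvement
    history: list = field(default_factory=list)
    findings: list = field(default_factory=list)


def outer_loop(state, recent_losses, min_gain=0.01):
    """Synthesize the latest round and pick the next direction.

    Illustrative rule only: keep deepening while the metric improves,
    then broaden, pivot, and finally conclude as stalls accumulate.
    """
    recent_best = min(recent_losses)
    if state.best_metric - recent_best > min_gain:
        state.best_metric = recent_best
        state.stalls = 0
        decision = "deepen"
    else:
        state.stalls += 1
        decision = {1: "broaden", 2: "pivot"}.get(state.stalls, "conclude")
    state.findings.append(f"best={state.best_metric:.3f} -> {decision}")
    return decision


def research_cycle(run_experiment, max_rounds=10, trials_per_round=3):
    """Alternate inner-loop experiments with outer-loop synthesis."""
    state = ResearchState()
    for _ in range(max_rounds):
        # Inner loop: a few rapid experiments on one measurable metric.
        losses = [run_experiment(state.last_decision) for _ in range(trials_per_round)]
        state.history.extend(losses)
        # Outer loop: step back and decide what to do next.
        state.last_decision = outer_loop(state, losses)
        if state.last_decision == "conclude":
            break
    return state
```

Here `run_experiment` is any callable that takes the current direction and returns a metric value; in a real project it would route to a domain skill for training or evaluation, and the decision rule would come from the agent's own synthesis rather than a fixed threshold.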

What's the biggest win?

AI agents can conduct full research projects autonomously, from hypothesis through experimental validation to a publication-ready paper, and keep operating across multiple sessions via wall-clock loops (scheduled ticks such as Claude Code /loop or OpenClaw cron jobs).

What should I know technically?

Requires Node 16+ for the npm installer (@orchestra-research/ai-research-skills). Individual skills depend on their respective frameworks (PyTorch, JAX, HuggingFace, NVIDIA stack, etc.). For Claude Code, /loop with 20-minute intervals is recommended for typical research projects; lengthen or shorten the interval to match experiment duration. For OpenClaw, use cron.add with sessionTarget: 'current' to bind the research loop to the current chat session and maintain context across ticks. Before entering the loops, the agent must initialize research-state.yaml, findings.md, and the directory structure from the provided templates.
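
As a rough illustration of the bootstrap step, the sketch below creates the two state files and a working directory if they do not exist yet. The file bodies, the YAML keys (`question`, `metric`, `phase`), and the `experiments` directory name are placeholders invented for this example; they are not the blueprint's provided templates, which should be used in practice.

```python
import json
from pathlib import Path

# Placeholder body, NOT the blueprint's real template.
FINDINGS_TEMPLATE = "# Findings\n\n(no findings yet)\n"


def render_state(question, metric):
    """Render a hypothetical research-state.yaml body.

    json.dumps is used for quoting because JSON strings are valid
    YAML double-quoted scalars.
    """
    return (
        "# Hypothetical schema, for illustration only\n"
        f"question: {json.dumps(question)}\n"
        f"metric: {json.dumps(metric)}\n"
        "phase: bootstrap\n"
    )


def bootstrap(root, question, metric, subdirs=("experiments",)):
    """Create state files and directories; return the files newly written."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    for name in subdirs:
        (root / name).mkdir(exist_ok=True)

    files = {
        "research-state.yaml": render_state(question, metric),
        "findings.md": FINDINGS_TEMPLATE,
    }
    created = []
    for name, body in files.items():
        path = root / name
        if not path.exists():  # never clobber state from an earlier session
            path.write_text(body, encoding="utf-8")
            created.append(name)
    return created
```

Skipping files that already exist makes the step safe to repeat, which matters when a loop tick or a new session re-runs the bootstrap against an ongoing project.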

What should I watch out for?

Research agents require clear research questions with measurable metrics to use the inner loop effectively. Purely exploratory questions without metrics may need heavier human steering. The autoresearch skill expects autonomous operation with minimal human input after bootstrap. Large-scale experiments with long training times will still be running when loop ticks fire; design experiments to be resumable or lengthen the loop interval accordingly.
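
One way to make an experiment resumable is to checkpoint after every step and have each invocation continue from the last saved step. The sketch below assumes a JSON checkpoint file and a per-call step budget; `train`, `step_fn`, `budget`, and the checkpoint layout are hypothetical, and a real training job would also save model and optimizer state through its framework's own checkpointing.

```python
import json
import os
from pathlib import Path


def load_checkpoint(path):
    """Return the saved progress, or a fresh record if none exists."""
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"step": 0, "loss": None}


def save_checkpoint(path, ckpt):
    """Write to a temp file, then rename, so an interrupted write
    cannot leave a half-written checkpoint behind."""
    path = Path(path)
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(ckpt), encoding="utf-8")
    os.replace(tmp, path)


def train(path, total_steps, budget, step_fn):
    """Run at most `budget` steps, resuming from the checkpoint at `path`."""
    ckpt = load_checkpoint(path)
    done = 0
    while ckpt["step"] < total_steps and done < budget:
        ckpt["loss"] = step_fn(ckpt["step"])  # one unit of training work
        ckpt["step"] += 1
        done += 1
        save_checkpoint(path, ckpt)
    return ckpt
```

Calling `train` repeatedly with the same path advances the run a bounded amount each time, so an interrupted or restarted process picks up where the last checkpoint left off instead of starting over.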

Key Features

Two-Loop Orchestration

Inner loop runs rapid experiment iterations with measurable outcomes; outer loop synthesizes patterns and steers research direction

23 Research Domains

Model architecture, fine-tuning, post-training, distributed training, inference, optimization, evaluation, interpretability, safety, multimodal, and more

Autonomous Continuity

Claude Code /loop and OpenClaw cron jobs keep research running continuously across sessions without manual intervention

Structured Research Memory

research-state.yaml, findings.md, and research-log.md maintain project state and discoveries; git commits record each experiment protocol before its results exist, so the history shows the protocol was fixed in advance

End-to-End Skill Routing

Autoresearch routes to domain skills for data processing, training, inference, evaluation, and interpretability based on what the research phase requires

Tools in this Blueprint

Claude
4.7 (315 reviews)
OpenCode
Cursor
Codex CLI

About This Blueprint

License
MIT
Industry
Technology
Skills
1 workflow, 0 sub-skills, 97 standalone