Orchestrator API Reference

The Orchestrator is the central component of MacScrape that coordinates the web scraping, AI analysis, and data visualization processes.

Class: OrchestratorModel

Initialization

```python
from models.orchestrator import OrchestratorModel

orchestrator = OrchestratorModel()
```

Methods

orchestrate(urls: List[str], prompt: Optional[str] = None) -> Dict

Orchestrates the entire process of web scraping, analysis, and visualization.

Parameters:

  • urls: List of URLs to analyze
  • prompt: Optional custom prompt for AI analysis

Returns:

  • A dictionary containing the analysis results and generated website content

```python
# Inside an async function / running event loop:
results = await orchestrator.orchestrate(["https://example.com"], "Analyze the main topics")
```

clear_cache() -> None

Clears the internal cache of the orchestrator.

```python
orchestrator.clear_cache()
```

update_user_styles(styles: Dict) -> None

Updates the user-defined styles for the generated website.

Parameters:

  • styles: A dictionary mapping CSS selectors to style declarations

```python
orchestrator.update_user_styles({
    "body": {
        "background-color": "#f0f0f0",
        "font-family": "Arial, sans-serif"
    }
})
```

Internal Workflow

```mermaid
sequenceDiagram
    participant User
    participant Orchestrator
    participant Forager
    participant AIRegenerator
    participant Visualizer

    User->>Orchestrator: orchestrate(urls, prompt)
    Orchestrator->>Forager: sniff_data(urls)
    Forager-->>Orchestrator: raw_data
    Orchestrator->>AIRegenerator: analyze_content(raw_data, prompt)
    AIRegenerator-->>Orchestrator: analysis_results
    Orchestrator->>Visualizer: generate_visualizations(analysis_results)
    Visualizer-->>Orchestrator: visualizations
    Orchestrator->>Orchestrator: generate_website(analysis_results, visualizations)
    Orchestrator-->>User: final_results
```
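The sequence above can be sketched in plain Python. This is a minimal illustration only: the `Forager`, `AIRegenerator`, and `Visualizer` stand-ins below are hypothetical stubs, and the real components have richer interfaces.

```python
import asyncio

# Hypothetical stand-ins for the real MacScrape components.
class Forager:
    async def sniff_data(self, urls):
        return {url: f"<html>content of {url}</html>" for url in urls}

class AIRegenerator:
    async def analyze_content(self, raw_data, prompt):
        return {"topics": list(raw_data), "prompt": prompt}

class Visualizer:
    async def generate_visualizations(self, analysis_results):
        return ["chart-1"]

class OrchestratorSketch:
    """Illustrates the call order from the sequence diagram."""

    def __init__(self):
        self.forager = Forager()
        self.regenerator = AIRegenerator()
        self.visualizer = Visualizer()

    async def orchestrate(self, urls, prompt=None):
        raw_data = await self.forager.sniff_data(urls)
        analysis = await self.regenerator.analyze_content(raw_data, prompt)
        visualizations = await self.visualizer.generate_visualizations(analysis)
        # The real Orchestrator also generates website content at this step.
        return {"analysis": analysis, "visualizations": visualizations}

results = asyncio.run(
    OrchestratorSketch().orchestrate(["https://example.com"], "Analyze the main topics")
)
```

Each hand-off in the diagram corresponds to one awaited call, with the Orchestrator holding all intermediate results.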

Error Handling

The Orchestrator uses a layered error-handling strategy:

  1. Individual component errors are logged and don't halt the entire process
  2. If a critical error occurs, an exception is raised with detailed information
  3. Timeouts are implemented to prevent indefinite hanging
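The pattern behind these three points might look like the following sketch. It assumes `asyncio.wait_for` for timeouts and a module logger; `scrape_one` is a hypothetical per-URL task, not part of the actual API.

```python
import asyncio
import logging

logger = logging.getLogger("orchestrator")

async def scrape_one(url, timeout=10.0):
    """Hypothetical per-URL scrape; a sleep stands in for real I/O."""
    await asyncio.sleep(0)
    if url.startswith("bad://"):
        raise ValueError(f"unsupported scheme: {url}")
    return f"data from {url}"

async def scrape_all(urls):
    results = {}
    for url in urls:
        try:
            # A timeout prevents any single URL from hanging the whole run.
            results[url] = await asyncio.wait_for(scrape_one(url), timeout=10.0)
        except Exception as exc:
            # An individual failure is logged but does not halt the process.
            logger.warning("skipping %s: %s", url, exc)
    if not results:
        # Only total failure is treated as critical.
        raise RuntimeError("all URLs failed")
    return results

results = asyncio.run(scrape_all(["https://example.com", "bad://x"]))
```

Here the bad URL is logged and skipped, while a run in which every URL fails raises a single exception carrying the failure context.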

Performance Considerations

  • The Orchestrator uses asynchronous programming for improved performance
  • Caching is employed to reduce redundant processing
  • Consider using the `max_concurrency` parameter to control resource usage
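Concurrency limiting of the kind a `max_concurrency` parameter implies is commonly built on an `asyncio.Semaphore`; whether the Orchestrator uses this exact mechanism internally is an assumption, and `fetch` below is a hypothetical stand-in for real network I/O.

```python
import asyncio

async def fetch(url):
    await asyncio.sleep(0)  # stand-in for network I/O
    return f"data from {url}"

async def fetch_all(urls, max_concurrency=5):
    # The semaphore allows at most max_concurrency fetches to run at once.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_fetch(url):
        async with semaphore:
            return await fetch(url)

    # gather preserves the input order of results.
    return await asyncio.gather(*(bounded_fetch(u) for u in urls))

data = asyncio.run(
    fetch_all([f"https://example.com/{i}" for i in range(10)], max_concurrency=3)
)
```

Raising the limit trades memory and network pressure for throughput; a small limit keeps the scraper polite toward target hosts.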

Next Steps

Explore the Forager API to understand the web scraping process in detail.