How does the AI Agent retrieve insights?
You submit a complete question. The system uses an LLM to understand what you want, then calls five Agents in sequence:
Dataset Search → Analytics Planning → Filter Decision → Analytics Execution → Interpretation
This end-to-end flow is built on RAG (retrieve first, then generate) and vector-based semantic search.
The Five Agents (what they do & what they expect)
- Dataset Search Agent
  - Uses semantic retrieval across multiple Data Planets (e.g., SDG, UN) to find relevant datasets.
  - Output: candidate datasets + field metadata.
- Analytics Planning Agent
  - Decides the x-axis, y metrics (with calculations like avg/sum), filters (years/geo), and field formats (e.g., year, admin_level_4).
- Filter Decision Agent
  - Finalizes years/geo and other conditions into a concrete, executable parameter set.
- Analytics Execution Agent
  - Queries the selected datasets, retrieves the data, and aligns time and geography.
- Interpretation Agent
  - Sends the aligned data to the LLM and produces a concise written insight (about 300 words) that states the conclusion and limitations.
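To make the hand-off between the five Agents concrete, here is a minimal sketch in plain Python. It is not the tutorial's actual implementation: every function name, the shape of the shared `state` dictionary, and the angle-bracket placeholder values are assumptions made for illustration only. The point is simply that each step reads what the previous step wrote and adds its own output.

```python
# Hypothetical sketch: five stub "agents" run in a fixed order over a shared state.
# None of these names or fields come from the real system.

def dataset_search(state):
    # A real agent would run semantic retrieval across Data Planets here.
    return {**state, "datasets": ["<candidate dataset>"], "fields": ["<field metadata>"]}

def analytics_planning(state):
    # Chooses the x-axis, y metrics, draft filters, and field formats.
    plan = {"x": "<x-axis field>", "y": [("<metric>", "avg")], "filters": {}}
    return {**state, "plan": plan}

def filter_decision(state):
    # Turns the draft plan into a concrete, executable parameter set.
    params = {**state["plan"], "filters": {"years": "<years>", "geo": "<geo>"}}
    return {**state, "params": params}

def analytics_execution(state):
    # A real agent would query the datasets and align time and geography.
    return {**state, "aligned_data": []}

def interpretation(state):
    # A real agent would send the aligned data to an LLM.
    return {**state, "insight": "<written insight>"}

PIPELINE = [
    dataset_search,
    analytics_planning,
    filter_decision,
    analytics_execution,
    interpretation,
]

def run_pipeline(question):
    """Run every step in order, recording which steps were executed."""
    state = {"question": question, "trace": []}
    for step in PIPELINE:
        state = step(state)
        state["trace"].append(step.__name__)
    return state

if __name__ == "__main__":
    result = run_pipeline("<your complete question>")
    print(" -> ".join(result["trace"]))
```

Because each step depends on the keys written by the step before it, the order matters: `filter_decision` would fail without the `plan` produced by `analytics_planning`.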
Plain-English Glossary
- LLM (Large Language Model): A "text assistant" that reads your question, understands context, and writes a response.
- RAG (Retrieval-Augmented Generation): First retrieve documents/data, then let the LLM answer based on what was found.
- Vectors / Semantic Search: Convert text into numerical vectors so the system can find items by meaning, not just keywords.
- LangGraph: A way to model multi-step flows as nodes and edges, both visual and executable.
- Data Planet: Aralia's open data universe. Multiple Planets host datasets curated by different data providers, each sharing domain expertise.
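The "Vectors / Semantic Search" idea can be illustrated with a toy example. The sketch below ranks items by cosine similarity between vectors. The three-dimensional vectors are hand-made stand-ins, not the output of a real embedding model, and the dataset titles are invented for illustration; a real system would obtain vectors from an embedding model and search a much larger index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up vectors for made-up dataset titles (illustration only).
TOY_INDEX = {
    "income inequality by state": [0.9, 0.1, 0.0],
    "GDP growth by state": [0.1, 0.9, 0.0],
    "rainfall by month": [0.0, 0.1, 0.9],
}

def search(query_vector, index, top_k=2):
    """Return the top_k titles whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), title) for title, vec in index.items()]
    scored.sort(reverse=True)
    return [title for _, title in scored[:top_k]]

if __name__ == "__main__":
    # Pretend this is the vector for the query "Gini coefficient".
    query = [0.8, 0.2, 0.0]
    print(search(query, TOY_INDEX))
    # ['income inequality by state', 'GDP growth by state']
```

Note that the query text shares no keywords with the top result; the match comes entirely from the vectors pointing in a similar direction, which is the behavior the glossary entry describes.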
Pro tip: Ask a complete question
- The demo Agent will not ask follow-up questions. Clearly scoped questions help the Agent find the right insight.
- When unsure about terminology, use well-known, precise metric names (e.g., Gini coefficient); avoid vague abbreviations.
- Specify how you define averages and comparison points from the start.
Completeness checklist (copy and tick)
- Metric / Definition: e.g., average GDP growth at purchasers' prices (average? growth rate? nominal/real?)
- Time: explicit year or range (e.g., 2021–2024; 2024)
- Geography level: country / state / county / region + scope (e.g., Malaysia by state)
- Relationship / Comparison: correlation, ranking, gap, grouped comparison, etc.
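If you prefer to check a question mechanically before submitting it, the checklist can be expressed as a tiny validator. This is only a sketch: the slot names (`metric`, `time`, `geography`, `relationship`) are shorthand chosen here for the four checklist items, and the demo Agent itself does not expose such a function.

```python
# Hypothetical slot names mirroring the four checklist items above.
REQUIRED_SLOTS = ("metric", "time", "geography", "relationship")

def missing_slots(spec):
    """Return the checklist items that are absent or blank in spec."""
    return [slot for slot in REQUIRED_SLOTS if not str(spec.get(slot) or "").strip()]

if __name__ == "__main__":
    complete = {
        "metric": "average GDP growth at purchasers' prices",
        "time": "2021 to 2024",
        "geography": "Malaysia by state",
        "relationship": "correlation",
    }
    vague = {"metric": "inequality"}

    print(missing_slots(complete))  # []
    print(missing_slots(vague))     # ['time', 'geography', 'relationship']
```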
Question template (edit in place)
Is there a relationship between <Metric A> in <Time Range A> and <Metric B> in <Time/Year B> across <Geography Level/Scope>?
Example (ready to use):
Is there a relationship between the average GDP growth at purchasers' prices from 2021 to 2024 and the Gini coefficient of each state in Malaysia in 2024?
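The template can also be filled programmatically, which helps when you want to try several metric or year combinations. The sketch below uses Python's built-in `str.format`; the slot names (`metric_a`, `time_a`, and so on) are assumptions chosen to mirror the placeholders in the template, not part of the Agent's interface.

```python
# The question template, with the angle-bracket placeholders renamed to format fields.
TEMPLATE = (
    "Is there a relationship between {metric_a} in {time_a} "
    "and {metric_b} in {time_b} across {geography}?"
)

def build_question(**slots):
    """Fill the template; str.format raises KeyError if any slot is missing."""
    return TEMPLATE.format(**slots)

if __name__ == "__main__":
    question = build_question(
        metric_a="the average GDP growth at purchasers' prices",
        time_a="2021 to 2024",
        metric_b="the Gini coefficient",
        time_b="2024",
        geography="the states of Malaysia",
    )
    print(question)
```

Leaving out a slot raises a `KeyError`, which is a useful reminder that the demo Agent will not ask a follow-up question to fill the gap.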
Non-examples (the Agent can't answer these)
- "Is there a relationship between GDP and Gini in Malaysia?" (missing time and geography level)
- "Compare inequality." (metric definition and years not specified)
In the next chapter, we'll run the Agent hands-on and unpack how each step works.
P.S. You do not need to prepare your own environment; we'll use Google Colab so you can run everything in the browser.
← Previous: [Quick Start (3 minutes)](https://deciduous-centipede-9d7.notion.site/Quick-Start-3-minutes-264ddf94fd14808d9b83c9da2cf9efb4) | Next: Run in Colab (No Code) →