NAME
Hyper-Extract — Transform unstructured text into structured knowledge with LLMs. Graphs, hypergraphs, and spatio-temporal extractions…
SYNOPSIS
pip install hyperextract
DESCRIPTION
Transform unstructured text into structured knowledge with LLMs. Graphs, hypergraphs, and spatio-temporal extractions — with one command.
README
Smart Knowledge Extraction CLI
Transform documents into structured knowledge with one command.
"Stop reading. Start understanding."
"Say goodbye to document anxiety and see the information at a glance."
Hyper-Extract is an intelligent, LLM-powered knowledge extraction and evolution framework. It radically simplifies transforming highly unstructured texts into persistent, predictable, and strongly-typed Knowledge Abstracts. It effortlessly extracts information into a wide spectrum of formats—ranging from simple Collections (Lists/Sets) and Pydantic Models, to complex Knowledge Graphs, Hypergraphs, and even Spatio-Temporal Graphs.
✨ Core Features
- 🔷 8 Auto-Types: From basic AutoModel/AutoList to advanced AutoGraph, AutoHypergraph, and AutoSpatioTemporalGraph.
- 🧠 10+ Extraction Engines: Out-of-the-box support for cutting-edge retrieval paradigms like GraphRAG, LightRAG, Hyper-RAG, and KG-Gen.
- 📝 Declarative YAML Templates: Zero-code extraction definition. Includes 80+ presets across 6 domains.
- 🔄 Incremental Evolution: Feed new documents on the fly to continuously map out and expand the extracted knowledge.
⚡ Quick Start
1. Installation
For CLI Users (installs the he command globally):
uv tool install hyperextract
For Python Developers (use as library):
uv pip install hyperextract
2. The Command Line Way
Extract, search, and manage directly from CLI.
By default, the CLI uses gpt-4o-mini and text-embedding-3-small.
# Configure OpenAI API Key
he config init -k YOUR_OPENAI_API_KEY

# Extract knowledge
he parse examples/en/tesla.md -t general/biography_graph -o ./output/ -l en

# Query the knowledge abstract
he search ./output/ "What are Tesla's major achievements?"

# Visualize the knowledge graph
he show ./output/

# Incrementally supplement knowledge
he feed ./output/ examples/en/tesla_question.md

# Show the updated knowledge graph
he show ./output/
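The same workflow can be scripted. The sketch below only assembles the `he parse` and `he feed` invocations shown above and prints them. The helper names `build_pipeline` and `run_pipeline` are hypothetical, and running with `dry_run=False` assumes the `he` command is installed and on `PATH`.

```python
import shlex
import subprocess


def build_pipeline(doc, extra_doc, template, out_dir, lang="en"):
    """Return argv lists for `he parse` followed by `he feed` (flags as in the commands above)."""
    return [
        ["he", "parse", doc, "-t", template, "-o", out_dir, "-l", lang],
        ["he", "feed", out_dir, extra_doc],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True raises CalledProcessError if the command exits non-zero
            subprocess.run(argv, check=True)


commands = build_pipeline(
    "examples/en/tesla.md",
    "examples/en/tesla_question.md",
    template="general/biography_graph",
    out_dir="./output/",
)
run_pipeline(commands)  # dry run: prints the two commands without executing them
```

Passing each command as an argument list rather than a single shell string avoids quoting problems with paths that contain spaces.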
🐍 The Python API Way
Installation
# Clone the repository
git clone https://github.com/yifanfeng97/hyper-extract.git
cd hyper-extract

# Install dependencies
uv sync
Configuration
# Copy the example env file
cp .env.example .env

# Edit .env with your API key and base URL
OPENAI_API_KEY=your-api-key
OPENAI_BASE_URL=https://api.openai.com/v1
Usage
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

from hyperextract import Template

# Create a template
ka = Template.create("general/biography_graph")

# Parse a document
with open("examples/en/tesla.md", "r", encoding="utf-8") as f:
    text = f.read()
result = ka.parse(text)

# Visualize the knowledge graph
ka.show(result)

# Incrementally supplement knowledge
with open("examples/en/tesla_question.md", "r", encoding="utf-8") as f:
    new_text = f.read()
ka.feed(result, new_text)

# Show the updated knowledge graph
ka.show(result)
🔗 For complete examples, see examples/en
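When several follow-up documents arrive, the parse-then-feed pattern above can be wrapped in a loop. This is a minimal sketch: `build_knowledge` and `read_text` are hypothetical helpers, it calls only the `parse` and `feed` methods used above, and it assumes (as that usage suggests) that `feed` updates `result` in place.

```python
from pathlib import Path


def read_text(path):
    """Read a document as UTF-8 text."""
    return Path(path).read_text(encoding="utf-8")


def build_knowledge(ka, first_doc, later_docs=()):
    """Parse the first document, then feed each later document into the same result.

    `ka` is any object exposing parse(text) and feed(result, text),
    such as the template object created in the usage above.
    """
    result = ka.parse(read_text(first_doc))
    for path in later_docs:
        # Assumption: feed() mutates `result` rather than returning a new object.
        ka.feed(result, read_text(path))
    return result


# Usage, assuming hyperextract is installed and configured:
#   from hyperextract import Template
#   ka = Template.create("general/biography_graph")
#   result = build_knowledge(ka, "examples/en/tesla.md",
#                            ["examples/en/tesla_question.md"])
#   ka.show(result)
```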
Installation Comparison:
| Use Case | Command | Purpose |
|---|---|---|
| CLI Tool | uv tool install hyperextract | Install he command globally |
| Python Library | uv pip install hyperextract | Use in Python code |
🧩 Deep Dive: The 8 Auto-Types
Our framework embraces complexity without making you write boilerplate code.
Example: AutoGraph Visualization
[Figure: knowledge graph visualization after AutoGraph extraction]
🛠️ Architecture Overview
Hyper-Extract follows a three-layer architecture:
Auto-Types define the data structures for knowledge extraction. The 8 strongly typed structures (AutoModel, AutoList, AutoSet, AutoGraph, AutoHypergraph, AutoTemporalGraph, AutoSpatialGraph, AutoSpatioTemporalGraph) serve as the output format for all extractions.
Methods provide extraction algorithms built on Auto-Types. This includes Typical methods (KG-Gen, iText2KG, iText2KG*) and RAG-based methods (GraphRAG, LightRAG, Hyper-RAG, HypergraphRAG, Cog-RAG).
Templates offer domain-specific configurations with ready-to-use prompts and data structures. Covering 6 domains (Finance, Legal, Medical, TCM, Industry, General) with 80+ preset templates, users can extract knowledge without dealing with Auto-Types or Methods directly.
Use it via the CLI (he parse, he search, he show...) or the Python API (Template.create()).
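To make the difference between the graph and hypergraph structures concrete: an ordinary graph edge links exactly two entities, while a hyperedge can link any number of them. The sketch below illustrates that idea with plain dataclasses. `Edge`, `Hyperedge`, and `pairwise` are illustrative names, not Hyper-Extract's actual Auto-Type classes, and the entity names are placeholders.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Edge:
    """A binary relation between exactly two entities."""
    source: str
    target: str
    type: str


@dataclass(frozen=True)
class Hyperedge:
    """A relation that joins any number of entities at once."""
    members: frozenset
    type: str


def pairwise(h):
    """Flatten a hyperedge into binary edges, one per unordered pair of members."""
    return [Edge(a, b, h.type) for a, b in combinations(sorted(h.members), 2)]


meeting = Hyperedge(frozenset({"alice", "bob", "carol"}), "collaboration")
edges = pairwise(meeting)
print(len(edges))  # 3
```

Flattening yields three separate pairwise edges, and the fact that all three entities took part in one joint relation is no longer recorded. A hypergraph keeps that information.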
📚 Related Documentation
- Preset Templates: Browse 80+ ready-to-use templates across 6 domains
- Design Guide: Learn how to create custom templates
📋 Template Structure Example (Graph Type)
Here's a complete YAML template example for Graph type extraction (entity-relationship extraction):
language: en

name: Knowledge Graph
type: graph
tags: [general]

description: 'Extract entities and their relationships to construct a knowledge graph.'

output:
  entities:
    fields:
      - name: name
        type: str
        description: 'Entity name'
      - name: type
        type: str
        description: 'Entity type: e.g., person, organization, event'
      - name: description
        type: str
        description: 'Entity description'
  relations:
    fields:
      - name: source
        type: str
        description: 'Source entity'
      - name: target
        type: str
        description: 'Target entity'
      - name: type
        type: str
        description: 'Relation type: e.g., invention, collaboration, competition'
      - name: description
        type: str
        description: 'Relation description'

guideline:
  target: 'Extract entities and their relationships from the text.'
  rules_for_entities:
    - 'Extract meaningful entities'
    - 'Maintain consistent naming'
  rules_for_relations:
    - 'Create relations only when explicitly expressed in the text'

identifiers:
  entity_id: name
  relation_id: '{source}|{type}|{target}'
  relation_members:
    source: source
    target: target

display:
  entity_label: '{name} ({type})'
  relation_label: '{type}'
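The `identifiers` and `display` entries read like format strings over the extracted fields. The sketch below assumes they behave like Python `str.format` placeholders (the library's actual resolution logic may differ) and shows how a `relation_id` pattern could be used to drop duplicate relations when merging new extractions. The sample records and the helpers `relation_key` and `merge_relations` are hypothetical.

```python
# Patterns copied from the template above.
IDENTIFIERS = {"entity_id": "name", "relation_id": "{source}|{type}|{target}"}
DISPLAY = {"entity_label": "{name} ({type})", "relation_label": "{type}"}


def relation_key(relation):
    """Build a relation's identifier by filling the relation_id pattern."""
    # Extra keys such as "description" are ignored by str.format.
    return IDENTIFIERS["relation_id"].format(**relation)


def merge_relations(existing, incoming):
    """Keep one relation per identifier; earlier records win over later ones."""
    merged = {relation_key(r): r for r in existing}
    for r in incoming:
        merged.setdefault(relation_key(r), r)
    return list(merged.values())


# Illustrative records only, not real extraction output.
existing = [
    {"source": "Nikola Tesla", "target": "AC motor",
     "type": "invention", "description": "first mention"},
]
incoming = [
    {"source": "Nikola Tesla", "target": "AC motor",
     "type": "invention", "description": "repeated mention"},
    {"source": "Nikola Tesla", "target": "Westinghouse",
     "type": "collaboration", "description": "new relation"},
]

merged = merge_relations(existing, incoming)
print([relation_key(r) for r in merged])
print(DISPLAY["entity_label"].format(name="Nikola Tesla", type="person"))
# second print: Nikola Tesla (person)
```

Keying relations on source, type, and target means the same pair of entities can still carry several relations of different types without being collapsed into one.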
📈 Comparison with Other Libraries
| Feature | GraphRAG | LightRAG | KG-Gen | ATOM | Hyper-Extract |
|---|---|---|---|---|---|
| Knowledge Graph | ✅ | ✅ | ✅ | ✅ | ✅ |
| Temporal Graph | ✅ | ❌ | ❌ | ✅ | ✅ |
| Spatial Graph | ❌ | ❌ | ❌ | ❌ | ✅ |
| Hypergraph | ❌ | ❌ | ❌ | ❌ | ✅ |
| Domain Templates | ❌ | ❌ | ❌ | ❌ | ✅ |
| CLI Tool | ✅ | ❌ | ❌ | ❌ | ✅ |
| Multi-language | ✅ | ❌ | ❌ | ❌ | ✅ |
📚 Related Documentation
- Documentation - Complete documentation site
- CLI Guide - Command-line interface
- Template Gallery - Available templates
- Example Code - Working examples
🤝 Contributing & License
Contributions are welcome! Please submit Issues and PRs. Licensed under Apache-2.0.