HYPER-EXTRACT(1)

NAME

Hyper-Extract - Transform unstructured text into structured knowledge with LLMs. Graphs, hypergraphs, and spatio-temporal extractions…

SYNOPSIS

$ pip install hyperextract

INFO

917 stars
104 forks
0 views
Python
AI & LLM

DESCRIPTION

Transform unstructured text into structured knowledge with LLMs. Graphs, hypergraphs, and spatio-temporal extractions — with one command.

README

Smart Knowledge Extraction CLI

Transform documents into structured knowledge with one command.


"Stop reading. Start understanding."
"Say goodbye to document anxiety and make information clear at a glance."


[Figure: Hero & Workflow]

Hyper-Extract is an LLM-powered framework for knowledge extraction and evolution. It transforms highly unstructured text into persistent, predictable, strongly typed Knowledge Abstracts. Output formats range from simple Collections (Lists/Sets) and Pydantic Models to Knowledge Graphs, Hypergraphs, and Spatio-Temporal Graphs.
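To make the difference between these output shapes concrete, the sketch below models a binary graph edge, a hyperedge, and a spatio-temporal edge with plain dataclasses. All class names, field names, and sample values are illustrative assumptions; they are not Hyper-Extract's actual Auto-Type classes.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative shapes only: these names are assumptions, not Hyper-Extract's API.


@dataclass(frozen=True)
class Relation:
    """Ordinary graph edge: exactly one source and one target."""
    source: str
    target: str
    type: str


@dataclass(frozen=True)
class HyperEdge:
    """Hypergraph edge: a single relation joining any number of entities."""
    members: frozenset
    type: str


@dataclass(frozen=True)
class SpatioTemporalRelation:
    """Graph edge annotated with when and where it holds."""
    source: str
    target: str
    type: str
    time: Optional[str] = None
    place: Optional[str] = None


# Made-up sample values, used only to show the three shapes side by side.
edge = Relation("Alice", "Acme", "works_at")
hyper = HyperEdge(frozenset({"Alice", "Bob", "Carol"}), "project_meeting")
st = SpatioTemporalRelation("Alice", "Bob", "met", time="2024-05-01", place="Berlin")

print(edge)
print(sorted(hyper.members))
print(st.time, st.place)
```

A hyperedge differs from a graph edge only in how many entities it can join, which is why a group event fits in one record here instead of several pairwise relations.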

✨ Core Features

  • 🔷 8 Auto-Types: From basic AutoModel/AutoList to advanced AutoGraph, AutoHypergraph, and AutoSpatioTemporalGraph.
  • 🧠 10+ Extraction Engines: Out-of-the-box support for cutting-edge retrieval paradigms like GraphRAG, LightRAG, Hyper-RAG, and KG-Gen.
  • 📝 Declarative YAML Templates: Zero-code extraction definition. Includes 80+ presets across 6 domains.
  • 🔄 Incremental Evolution: Feed new documents on the fly to continuously map out and expand the extracted knowledge.
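The incremental evolution idea in the last bullet can be pictured as a merge that grows an existing structure without duplicating what is already known. The following is a hand-written sketch under that assumption; the `feed` function, the dictionary layout, and the sample data are hypothetical and do not reflect how Hyper-Extract implements its own feed step.

```python
def feed(graph: dict, new_entities: list, new_relations: list) -> dict:
    """Merge newly extracted items into an existing graph, in place.

    Entities are keyed by name; relations by (source, type, target).
    """
    for ent in new_entities:
        known = graph["entities"].get(ent["name"])
        if known is None:
            graph["entities"][ent["name"]] = dict(ent)
            continue
        # Known entity: extend its description instead of adding a duplicate.
        extra = ent.get("description", "")
        if extra and extra not in known.get("description", ""):
            known["description"] = (known.get("description", "") + " " + extra).strip()
    for rel in new_relations:
        key = (rel["source"], rel["type"], rel["target"])
        # setdefault keeps the first copy of a relation and ignores repeats.
        graph["relations"].setdefault(key, dict(rel))
    return graph


# Made-up sample data.
graph = {
    "entities": {"Alice": {"name": "Alice", "type": "person", "description": "Engineer."}},
    "relations": {},
}
new_entities = [
    {"name": "Alice", "type": "person", "description": "Founded Acme."},
    {"name": "Acme", "type": "organization", "description": "A company."},
]
new_relations = [{"source": "Alice", "type": "founded", "target": "Acme"}]

feed(graph, new_entities, new_relations)
feed(graph, new_entities, new_relations)  # feeding the same items again changes nothing

print(len(graph["entities"]), len(graph["relations"]))
print(graph["entities"]["Alice"]["description"])
```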

⚡ Quick Start

1. Installation

For CLI Users (install the he command globally):

uv tool install hyperextract

For Python Developers (use as a library):

uv pip install hyperextract

2. The Command Line Way

Extract, search, and manage directly from CLI.

By default, the CLI uses gpt-4o-mini and text-embedding-3-small.

# Configure OpenAI API Key
he config init -k YOUR_OPENAI_API_KEY

# Extract knowledge
he parse examples/en/tesla.md -t general/biography_graph -o ./output/ -l en

# Query the knowledge abstract
he search ./output/ "What are Tesla's major achievements?"

# Visualize the knowledge graph
he show ./output/

# Incrementally supplement knowledge
he feed ./output/ examples/en/tesla_question.md

# Show the updated knowledge graph
he show ./output/
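The same CLI sequence can be driven from a script. The sketch below only assembles the argument lists for the parse, search, and show steps shown above, using the flags that appear there (-t, -o, -l); the helper names `he_commands` and `run_all` are hypothetical. Nothing is executed unless `run_all` is called, and it returns without running anything if the he command is not on PATH.

```python
import shutil
import subprocess


def he_commands(doc: str, template: str, out_dir: str, lang: str, question: str) -> list:
    """Build the argv lists for the parse, search, and show steps."""
    return [
        ["he", "parse", doc, "-t", template, "-o", out_dir, "-l", lang],
        ["he", "search", out_dir, question],
        ["he", "show", out_dir],
    ]


def run_all(commands: list) -> bool:
    """Run each command in order; return False without running if `he` is missing."""
    if shutil.which("he") is None:
        return False
    for argv in commands:
        # check=True raises CalledProcessError if a step exits with an error.
        subprocess.run(argv, check=True)
    return True


cmds = he_commands(
    "examples/en/tesla.md",
    "general/biography_graph",
    "./output/",
    "en",
    "What are Tesla's major achievements?",
)
for argv in cmds:
    print(" ".join(argv))
```

Passing each command as a list avoids shell quoting problems with the question string.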

🐍 The Python API Way

Installation

# Clone the repository
git clone https://github.com/yifanfeng97/hyper-extract.git
cd hyper-extract

# Install dependencies
uv sync

Configuration

# Copy the example env file
cp .env.example .env

# Edit .env with your API key and base URL
OPENAI_API_KEY=your-api-key
OPENAI_BASE_URL=https://api.openai.com/v1
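Before running an extraction, it can help to confirm that both settings are actually present. The sketch below parses simple KEY=VALUE text and reports which of the two variables named above are missing. It is a simplified stand-in written with the standard library only; the helper names are hypothetical, and the python-dotenv package used in the usage example handles quoting and other cases that this sketch ignores.

```python
import os


def read_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(values: dict, required=("OPENAI_API_KEY", "OPENAI_BASE_URL")) -> list:
    """Return required names that are empty in both `values` and os.environ."""
    return [name for name in required if not (values.get(name) or os.environ.get(name))]


sample = """
# Edit .env with your API key and base URL
OPENAI_API_KEY=your-api-key
OPENAI_BASE_URL=https://api.openai.com/v1
"""
settings = read_env_text(sample)
print(sorted(settings))
print(missing_settings(settings))
```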

Usage

import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

from hyperextract import Template

# Create a template
ka = Template.create("general/biography_graph")

# Parse a document
with open("examples/en/tesla.md", "r", encoding="utf-8") as f:
    text = f.read()
result = ka.parse(text)

# Visualize the knowledge graph
ka.show(result)

# Incrementally supplement knowledge
with open("examples/en/tesla_question.md", "r", encoding="utf-8") as f:
    new_text = f.read()
ka.feed(result, new_text)

# Show the updated knowledge graph
ka.show(result)

🔗 For complete examples, see examples/en


Installation Comparison:

Use Case        Command                       Purpose
CLI Tool        uv tool install hyperextract  Install he command globally
Python Library  uv pip install hyperextract   Use in Python code

🧩 Deep Dive: The 8 Auto-Types

Our framework embraces complexity without making you write boilerplate code.

[Figure: Knowledge Structures Matrix]

Example: AutoGraph Visualization

Here is the knowledge graph visualization after AutoGraph extraction:

[Figure: AutoGraph Visualization]

🛠️ Architecture Overview

Hyper-Extract follows a three-layer architecture:

  • Auto-Types define the data structures for knowledge extraction. The 8 strongly typed structures (AutoModel, AutoList, AutoSet, AutoGraph, AutoHypergraph, AutoTemporalGraph, AutoSpatialGraph, AutoSpatioTemporalGraph) serve as the output format for all extractions.

  • Methods provide extraction algorithms built on Auto-Types. These include Typical methods (KG-Gen, iText2KG, iText2KG*) and RAG-based methods (GraphRAG, LightRAG, Hyper-RAG, HypergraphRAG, Cog-RAG).

  • Templates offer domain-specific configurations with ready-to-use prompts and data structures. With 80+ preset templates covering 6 domains (Finance, Legal, Medical, TCM, Industry, General), users can extract knowledge without dealing with Auto-Types or Methods directly.

Use via CLI (he parse, he search, he show...) or Python API (Template.create()).
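The layering described above can be pictured with a toy model in which a template bundles a method, and a method fills a typed result. This is a conceptual sketch only: `GraphResult`, `toy_method`, `ToyTemplate`, and the template name "general/toy_graph" are invented for illustration, and the toy method uses a trivial capitalization rule where a real method would call an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable

# Conceptual sketch only: none of these names are Hyper-Extract's real classes.


@dataclass
class GraphResult:
    """Layer 1 (Auto-Type stand-in): the typed container that extraction fills."""
    entities: list = field(default_factory=list)
    relations: list = field(default_factory=list)


# Layer 2 (Method stand-in): any callable that turns text into a GraphResult.
Method = Callable[[str], GraphResult]


def toy_method(text: str) -> GraphResult:
    """Placeholder for an LLM-backed method: treats capitalized words as entities."""
    names = sorted({w.strip(".,") for w in text.split() if w[:1].isupper()})
    return GraphResult(entities=names)


@dataclass
class ToyTemplate:
    """Layer 3 (Template stand-in): bundles a name with a preconfigured method."""
    name: str
    method: Method

    def parse(self, text: str) -> GraphResult:
        return self.method(text)


template = ToyTemplate(name="general/toy_graph", method=toy_method)
result = template.parse("Alice founded Acme in Berlin.")
print(result.entities)
```

The caller touches only the template, which mirrors the claim above that users need not deal with Auto-Types or Methods directly.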

[Figure: Architecture]

📚 Related Documentation

📋 Template Structure Example (Graph Type)

Here's a complete YAML template example for Graph type extraction (entity-relationship extraction):

language: en

name: Knowledge Graph
type: graph
tags: [general]

description: 'Extract entities and their relationships to construct a knowledge graph.'

output:
  entities:
    fields:
      - name: name
        type: str
        description: 'Entity name'
      - name: type
        type: str
        description: 'Entity type: e.g., person, organization, event'
      - name: description
        type: str
        description: 'Entity description'
  relations:
    fields:
      - name: source
        type: str
        description: 'Source entity'
      - name: target
        type: str
        description: 'Target entity'
      - name: type
        type: str
        description: 'Relation type: e.g., invention, collaboration, competition'
      - name: description
        type: str
        description: 'Relation description'

guideline:
  target: 'Extract entities and their relationships from the text.'
  rules_for_entities:
    - 'Extract meaningful entities'
    - 'Maintain consistent naming'
  rules_for_relations:
    - 'Create relations only when explicitly expressed in the text'

identifiers:
  entity_id: name
  relation_id: '{source}|{type}|{target}'
  relation_members:
    source: source
    target: target

display:
  entity_label: '{name} ({type})'
  relation_label: '{type}'
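The identifiers and display entries in the template above read like format patterns over the extracted fields. The sketch below shows one plausible interpretation, assuming the `{field}` placeholders behave like Python `str.format` fields; that assumption, the helper functions, and the sample records are illustrative and not confirmed by the template itself.

```python
# Field names and patterns are copied from the template above; the helpers are
# illustrative and not part of Hyper-Extract.
IDENTIFIERS = {"entity_id": "name", "relation_id": "{source}|{type}|{target}"}
DISPLAY = {"entity_label": "{name} ({type})", "relation_label": "{type}"}


def entity_key(entity: dict) -> str:
    """An entity is identified by the single field named in entity_id."""
    return entity[IDENTIFIERS["entity_id"]]


def relation_key(relation: dict) -> str:
    """A relation is identified by filling the relation_id pattern with its fields."""
    return IDENTIFIERS["relation_id"].format(**relation)


# Made-up sample records with the fields declared under `output`.
entity = {"name": "Alice", "type": "person", "description": "Sample entity"}
relation = {
    "source": "Alice",
    "target": "Acme",
    "type": "collaboration",
    "description": "Sample relation",
}

print(entity_key(entity))
print(relation_key(relation))
print(DISPLAY["entity_label"].format(**entity))
print(DISPLAY["relation_label"].format(**relation))
```

Under this reading, two relations with the same source, type, and target produce the same key, so they would count as one relation even if their descriptions differ.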

📈 Comparison with Other Libraries

Feature           GraphRAG  LightRAG  KG-Gen  ATOM  Hyper-Extract
Knowledge Graph
Temporal Graph
Spatial Graph
Hypergraph
Domain Templates
CLI Tool
Multi-language

🤝 Contributing & License

Contributions are welcome! Please submit Issues and PRs. Licensed under Apache-2.0.

SEE ALSO

clihub    4/11/2026    HYPER-EXTRACT(1)