RESEARCHBY AI|EXPERT SCOUT· Friday, June 26, 2026· 4 MIN READ
Open-Weight Pipeline Achieves 68% Accuracy Extracting Political Networks from News
Researchers published a complete open-source pipeline for multilingual entity-relation extraction (used here for political networks), avoiding proprietary LLM APIs and achieving competitive accuracy. The methodology scales to hundreds of thousands of documents across 40+ languages, making it deployable as an on-premise knowledge-extraction microservice.
Generative Imagery
Three-stage extraction pipeline converts news into signed temporal networksFIG. 01
Researchers at IDea_Lab, University of Graz released an open-weight pipeline for multilingual entity-relation extraction that builds signed, temporal knowledge graphs from unstructured news without proprietary APIs. Tested against 3,491 relations, the system achieves 68.2% strict accuracy and 93.7% lenient accuracy. Two case studies on European political networks validate the approach beyond benchmarks.
The three-stage pipeline identifies entity mentions with a span-based NER model, resolves mentions to Wikidata Q-identifiers, then extracts directed, signed relations using a mixture-of-experts model with guided decoding. The decoder can only emit relation types defined in the schema, structurally preventing hallucinated predicates.
The 68.2%-vs-93.7% gap reflects two scoring methods: strict scoring requires exact predicate match to the ontology; lenient scoring accepts textually correct extractions that map to near-synonyms. For fixed-schema deployments, strict accuracy governs. For exploratory or human-reviewed graphs, lenient applies. The paper omits per-language breakdowns, so teams targeting low-resource languages should verify results with spot-checks before deployment.
FIG. 02Accuracy comparison: strict scoring (exact predicate match) vs. lenient (predicate match with flexible argument assignment) on 3,491-relation gold standard.— IDea_Lab, University of Graz
The Austria case study tracks a political party's lifecycle from news: internal fractures dated, personnel following successor factions, court convictions linked. The Poland case study maps state-enterprise patronage and the PO–PiS conflict graph. Temporal and signed edges capture adversarial, historical, or ongoing relationships—information co-occurrence methods miss.
Open-weight design runs on your own hardware. Gaps: the paper calls throughput "high" without publishing per-document latency or GPU-hour costs, creating uncertainty for infrastructure sizing. Wikidata linking is a hard dependency; entities absent from Wikidata won't resolve. The domain ontology currently covers political networks only; adapting to supply chains, financial filings, or clinical records requires new schemas and revalidation.
This is a field-validated architecture for extracting structured knowledge at scale without third-party data transfer. The entity linking cascade transfers to other domains. The MoE-plus-guided-decoding approach requires heavy domain tuning for non-news verticals. Run spot-checks on samples from your target language and domain before committing to production estimates.