Artificial intelligence for libraries, archives and heritage
Eight AI tools covering the eight strategic objectives published by BNElab, designed to integrate into a single ecosystem. Built in Europe, with data on European soil, by Videoconversion Digital Lab, with over 20 years in documentary digitization.
BNElab's 8 AI objectives, covered today
BNElab publishes eight strategic AI work areas. For each one, we have an operational tool, presented alongside comparable international references.
Natural language querying
Multilingual RAG on documentary corpora with intent detection, parallel sub-queries and mandatory page-referenced citations. 9 native languages.
OCR and manuscript transcription
Classic OCR + neural HTR with Gemini Vision. Automatic detection of document type, archival series and geographic location.
Image element identification and extraction
Detection of seals, maps, tables, engravings and miniatures. Visual embeddings with HNSW index for similarity search. Upscaling and restoration with dedicated GPU.
Text entity extraction
NER with confidence score: people, places, organizations, dates, events. Ready to integrate specialized models for historical Spanish.
Music score recognition
Functional H-OMR prototype — one of the BNElab objectives that very few institutions worldwide address operationally.
Assistant chatbots
Natural language conversation over the document corpus with context memory, mandatory source citations and 24/7 availability.
Automated cataloguing
Automatic metadata generation: Dublin Core (15 elements), MODS, METS 2.x, PREMIS v3, MIX. Suggestions for document series and type, with human-in-the-loop review.
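As an illustration of what a generated record can look like, here is a minimal sketch that serializes suggested fields as simple Dublin Core XML. The function name `to_dublin_core`, the `metadata` wrapper element and the sample values are assumptions made for this example, not the product's actual schema or API.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 elements of the Dublin Core Metadata Element Set, version 1.1.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_dublin_core(fields: dict[str, list[str]]) -> str:
    """Serialize suggested metadata as a simple Dublin Core XML record.

    `fields` maps a Dublin Core element name to one or more values.
    The <metadata> wrapper is a hypothetical container chosen for this sketch.
    """
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"Not Dublin Core elements: {sorted(unknown)}")

    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")
    for name in DC_ELEMENTS:  # keep a stable element order
        for value in fields.get(name, []):
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical suggestion awaiting human review.
record = to_dublin_core({
    "title": ["Libro de actas"],
    "type": ["Text"],
    "language": ["spa"],
})
print(record)
```

Rejecting unknown element names keeps the output within the 15-element set, so a reviewer only ever sees fields that map cleanly to the standard.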
Automated classification
Classification by diplomatic type, series, language, period and geography. Human review of low-confidence labels.
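A minimal sketch of confidence-based review routing, assuming that labels scoring below a configurable threshold are the ones queued for a human reviewer. The `Label` structure, the function name and the 0.85 threshold are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    document_id: str
    facet: str         # e.g. "series", "language", "period"
    value: str
    confidence: float  # model score between 0.0 and 1.0


def split_for_review(labels: list[Label], threshold: float = 0.85):
    """Return (accepted, needs_review); the threshold is an assumed default."""
    accepted: list[Label] = []
    needs_review: list[Label] = []
    for label in labels:
        if label.confidence >= threshold:
            accepted.append(label)
        else:
            needs_review.append(label)
    # Least certain labels first, so reviewers see the hardest cases early.
    needs_review.sort(key=lambda item: item.confidence)
    return accepted, needs_review


labels = [
    Label("doc-001", "language", "spa", 0.98),
    Label("doc-001", "period", "18th century", 0.61),
    Label("doc-002", "series", "Notarial protocols", 0.74),
]
accepted, needs_review = split_for_review(labels)
print([item.value for item in needs_review])
```

Accepted labels can still be shown with their visible confidence score and corrected later; the split only decides what lands in the review queue first.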
What makes us different
Advantages no international provider combines under one roof.
20+ years of experience
Videoconversion Digital Lab (Barcelona) has been digitizing documentary heritage for more than two decades. We didn't arrive at AI from a startup — we apply it to a craft we already master.
100% data on European soil
Neon PostgreSQL and Vercel hosted in the EU. Native GDPR and Spanish ENS compliance. No transatlantic transfers, no legal surprises.
Integration with MarIA (BNE)
Designed to incorporate MarIA as an optional engine for NER and linguistic analysis. MarIA is the Spanish language model trained by the Barcelona Supercomputing Center on 135 billion words from BNE's web archive, using the MareNostrum supercomputer.
8 material types
Text, manuscript, image, plan, map, audio, video and score — all processable in one coherent ecosystem. Other providers cover one or two.
Multi-LLM validation
Architecture with Gemini + GPT-4o + Claude voting on critical results. The goal is to reduce hallucinations and keep output quality at or above the 95% F1 threshold used by the Library of Congress.
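The voting step can be pictured with a minimal sketch, assuming each model has already returned a short answer for the same field. The function name, the normalization rule and the two-vote minimum are assumptions for this example; no model API is called here.

```python
from collections import Counter
from typing import Optional


def majority_vote(answers: dict[str, str], min_agreement: int = 2) -> tuple[Optional[str], int]:
    """Pick the answer most models agree on.

    `answers` maps a model name to its answer. Returns (winner, votes);
    winner is None when fewer than `min_agreement` models agree, which
    is the case to escalate to a human reviewer.
    """
    if not answers:
        raise ValueError("at least one answer is required")
    # Compare answers ignoring case and surrounding whitespace.
    normalized = [text.strip().casefold() for text in answers.values()]
    winner, votes = Counter(normalized).most_common(1)[0]
    if votes < min_agreement:
        return None, votes
    return winner, votes


# Hypothetical answers for the "document type" field of one record.
result = majority_vote({
    "gemini": "Carta de pago",
    "gpt-4o": "carta de pago ",
    "claude": "Testamento",
})
print(result)  # ('carta de pago', 2)
```

Exact-match voting only suits short, categorical outputs such as a document type or a date; free-text results would need a similarity measure instead.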
Music score recognition
H-OMR prototype in early production — one of BNElab's objectives that very few institutions worldwide have addressed at product level.
Flexibility by institution size
Same ecosystem, three deployment sizes. From a municipal library to a national one.
Municipal · University
Up to 100,000 documents
- SaaS on the mediasolam platform
- Immediate deployment, no in-house infrastructure
- Monthly payment, no commitment
- Municipal libraries, diocesan archives, foundations
Regional · Provincial
Up to 2 million documents
- Dedicated EU deployment with exclusive database
- SLA-backed support
- Integration with existing catalogue (Koha, Alma)
- Regional libraries, provincial archives
National · BNElab Edition
No operational limit
- Hybrid cloud + on-premises architecture
- Integration with MarIA and the institutional OAIS repository
- AI policy co-drafted with the institution
- Dedicated L3 support and joint roadmap
The 3 cross-cutting principles BNElab requires
Ethical, sustainable and transparent use of AI — not as a promise, but as architecture.
Ethical
Mandatory human-in-the-loop review for all automated decisions. Visible confidence scores. Right to explanation and human correction at all times. Formal responsible AI policy, inspired by the British Library's FRAIM project.
Sustainable
Energy consumption measured per model call. Aggressive embedding caching. Smaller models used when sufficient. Quarterly footprint dashboard. Compatible with on-premises MarIA to reduce data transport.
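Embedding caching can be sketched as a content-addressed lookup: the same text is never sent to the embedding model twice. The class below is a minimal in-memory illustration; `EmbeddingCache`, the `embed` callable and the stand-in embedder are hypothetical, and a real deployment would presumably persist the cache.

```python
import hashlib
from typing import Callable


class EmbeddingCache:
    """Cache embeddings by a hash of the text, so repeats cost no model call."""

    def __init__(self, embed: Callable[[str], list[float]]):
        self._embed = embed                      # the (expensive) model call
        self._store: dict[str, list[float]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, text: str) -> list[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._embed(text)
        return self._store[key]


def fake_embed(text: str) -> list[float]:
    # Stand-in for a real embedding model, for demonstration only.
    return [float(len(text)), float(text.count(" "))]


cache = EmbeddingCache(fake_embed)
cache.get("Libro de actas capitulares")
cache.get("Libro de actas capitulares")  # served from the cache
print(cache.hits, cache.misses)  # 1 1
```

The hit and miss counters give a simple figure for how many model calls were avoided, which is the kind of number a footprint dashboard could report.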
Transparent
Complete audit trail: which model processed which document, when and with which prompt. Versioned prompts. Documented reproducibility. Generated metadata in open standards (Dublin Core, MODS, METS, MARC21).
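A minimal sketch of one audit-trail entry, recording which model processed which document, when, and with which prompt version. The field names and the choice of a SHA-256 prompt fingerprint are assumptions for this example, not the platform's actual log format.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def audit_record(document_id: str, model: str, prompt_version: str,
                 prompt_text: str, when: Optional[datetime] = None) -> dict:
    """Build one audit entry; all field names are hypothetical."""
    when = when or datetime.now(timezone.utc)
    return {
        "document_id": document_id,
        "model": model,
        "prompt_version": prompt_version,
        # The hash lets an auditor verify the exact prompt text that was used.
        "prompt_sha256": hashlib.sha256(prompt_text.encode("utf-8")).hexdigest(),
        "processed_at": when.isoformat(),
    }


entry = audit_record(
    document_id="doc-001",
    model="example-model",          # placeholder model identifier
    prompt_version="ner-v3",        # placeholder version label
    prompt_text="Extract people, places and dates from the text.",
)
# One JSON object per line is easy to append to a log and to replay later.
print(json.dumps(entry, ensure_ascii=False))
```

Storing the prompt version together with its hash supports reproducibility: rerunning a document with the same model and the same versioned prompt can be checked against the original entry.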
Aligned with the world's references
The leading national libraries and European networks applying AI to documentary heritage are our points of reference.
CENL
AI in Libraries working group of the Conference of European National Librarians. BNE is an active participant.
AI4LAM
International community of AI applied to libraries, archives and museums. Organizers of the Fantastic Futures conferences.
ALT-EDIC
European alliance for language technologies, coordinated by France, with 17 member states including Spain.
Library of Congress: Exploring Computational Description, an experiment applying transformers to over 23,000 ebooks with a 95% F1 quality threshold.
Computer vision model for historical maps, developed with the Alan Turing Institute.
KB-BERT, the Swedish National Library's language model, trained on an on-premises Nvidia DGX system using 500 years of Swedish text.
Automated subject indexing microservice in production at the National Library of Finland.
The ecosystem live
Eight applications in production. You can try them.
Let's talk about your institution
All apps are operational and can be tested with your own digitized collections. Arrange an institutional demo with our team.