Some Ongoing KBS/Ontology Projects and Groups
If you are working on or know of
related work not mentioned here, please email me.
Other useful Ontology and KR collections include:
Knowledge-Base Projects, Groups, and Related Material
- Carter - (Postscript paper).
A (now defunct) tool for assisting
experts in agreeing on what should go into a consensus
knowledge base. This paper was eventually published as:
Trice A, Davis R, "Heuristics for reconciling independent knowledge bases",
Information Systems Research, 4 (3), pp262-288, Sept 1993.
- The Cell Cycle Ontology -
An integrated ontology of cell cycle knowledge, in OBO and OWL
(Computational Biology Division, VIB/Ghent University, Belgium).
- a software system that supports users in creating and maintaining ontologies,
in particular merging
multiple ontologies together and
diagnosing individual or multiple ontologies (KSL, Stanford).
- CIA World Factbook - a popular
resource for building a world-savvy KB.
- CKML -
Conceptual Knowledge Markup Language.
An application of XML and an extension of OML.
- CLIPS -
A public domain, expert system shell written in C. Also see the
(non-public-domain) successor JESS.
- CommonSense Reasoning - See
the Common Sense Problem Page. Also
see Cyc, OpenMind.
Common Sense Problem Page - A list of benchmark problems in
- CommonKADS - A comprehensive knowledge engineering methodology. See also:
- Integral Solutions Ltd -
development of AI/KBS tools, including sale and support of the
CommonKADS WorkBench, and involvement in the KACTUS project.
- Common Logic
(CL) Standard - a new (2003) attempt to define an (ISO) standard language and
semantics for use in Ontology and Knowledge Engineering. Also see earlier
work on KIF.
- Common Logic Controlled English - a formal language with an English-like syntax
(by John Sowa).
- The Component Library (CLib) - a knowledge base of formally represented,
general, domain-independent concepts (Univ Texas at Austin), represented
in the KM knowledge representation language.
- ConceptNet (also browse the
data here) -
A freely available commonsense knowledge base and natural language processing
toolkit, generated automatically from the 1,000,000 sentences (as of 2010) of the
Open Mind Common Sense Project.
- Concept Maps -
A graphical notation for organizing and representing knowledge in an informal
way. Also see this introduction to Concept Maps.
For some examples, see the IHMC CMap
Tools page (view with Internet Explorer) and the
Graphic Organizer Concept
Maps page. Also see Topic Maps and
- Conceptual Dependency - A summary
of Roger Schank's conceptual primitives for representing simple sentences.
- Conceptual Grammar (TM) -
A hierarchical knowledge base management system (KBMS) bundled with the
VisualText (R) integrated development environment (IDE) and NLP++ (R)
programming language (from
Text Analysis International).
- Conceptual Graphs -
Researchers using Peirce/Sowa's
conceptual graphs theory for knowledge representation.
- Contexts -
Univ. Maine's collection of pointers.
- Controlled Languages -
Rolf Schwitter's collection of pointers.
- CRACK -
A description logic with an on-line, Web-based interface
(from IRST, Italy).
- CYC -
A massive ongoing effort to formalize
common-sense knowledge. An open source version is now available as
OpenCyc, including (parts of) the
ontology, axioms, inference engine, and knowledge acquisition tools.
- CycL - the representation language of CYC.
- ECOR - the European Centre for Ontological Research (Saarland University, Germany)
- The eHealth Ontology Project - promoting ways in which scientific methods in ontology can bring benefits to healthcare (by a partnership of IFOMIS, ECOR, and RIDE).
- Electronic Lexical Knowledge Base - software for accessing and exploring Roget's thesaurus, including tools for detecting lexical chains; calculating semantic distance; and clustering words.
- The Enterprise Ontology - a collection of terms and definitions
relevant to business enterprises (AIAI, Edinburgh, UK).
- EPILOG -
(Episodic Logic). Len Schubert et al.'s knowledge representation language,
for use in natural language processing.
- EXPECT -
an environment for developing knowledge-based systems that includes
knowledge acquisition tools to extend and modify KBs (ISI, Los Angeles).
- FaCT -
An optimized, tableaux-based DL (Ian Horrocks, Manchester, UK)
- F-Logic -
A novel formalism that accounts in a clean and declarative fashion
for most of the structural aspects of object-oriented and frame-based
- FLORA -
An object-oriented KB language and application development platform.
The programming language supported by FLORA is a dialect of
F-logic with numerous extensions, which include a
natural way to do meta-programming in the style of HiLog and logical
updates in the style of Transaction Logic.
- FLORID - A deductive,
object-oriented database system employing
F-logic as data definition and query language;
FLORID has been extended for handling semistructured data in the
context of Information Integration from the Web.
- FrameNet -
"an online lexical resource for English". Contains an extensive,
semantic analysis of verbs and their case-frame representations.
Also see (independent project) WordNet.
- FRDCSA (also see here) - An organization supporting programmers of the Free and Open Source Software (FLOSS) revolution. FRDCSA's goal is to assemble the most comprehensive ontology of FLOSS applications, and make packages available for every free operating system, distribution and hardware platform.
- FreeBase - "an open shared database of the world's knowledge," and "a massive, collaboratively-edited database of cross-linked data." Developed by MetaWeb Technologies.
- GandrKB -
(Gene Annotation Data Representation KB, available free of charge) -
an ontological framework for laboratory-specific gene annotation.
Gandr uses Protege 2000 for editing, querying, and visualizing
microarray data and annotations. Genes can be annotated with provided,
newly created, or imported ontological concepts. Annotated genes can
inherit assigned concept-properties and can be related to each other.
The resulting knowledge base can be visualized as an interactive network of
nodes and edges representing genes and their functional relationships.
(Daniel Schober, Max Delbruck Center for Molecular Medicine, Germany)
- The Gene Ontology (GO) -
a controlled vocabulary that can be applied to all organisms even as
knowledge of gene and protein roles in cells is accumulating and
- GALEN -
Generalized Architecture for Languages,
Encyclopedias and Nomenclatures in Medicine. GALEN is a
medical terminology server for supporting the development of
clinical coding schemes.
- Generalized Upper Model -- a
linguistically motivated ontology that supports NL processing;
a multilingual outgrowth of the Penman Upper Model.
- The Generic Frame Protocol
- not a language, but a standardized set of low-level access
functions for frame-based KR systems. GFP has recently been renamed
OKBC (the Open Knowledge Base Connectivity Protocol).
- Generic Knowledge-Base Editor -
a graphical KB editor, implemented in Lisp+CLIM, from SRI.
- The Groningen Meaning Bank -
A free, semantically annotated corpus, consisting of public domain English texts with
corresponding syntactic and semantic representations.
- Halo - a
long-term, knowledge-based research program aimed at developing
inference-capable applications which can acquire knowledge directly
from domain experts and other users, answer novel questions, and solve
advanced problems (Vulcan Inc.). Also see AURA,
developed under this program.
- HIRO - the Health Incident and Reporting Ontology, developed by
the company Language and Computing.
- HOMER - An "intelligent agent" integrating many aspects of AI,
operating in a (simulated) microworld for unmanned submarines.
- HowNet - An on-line, common-sense KB
for multilingual Natural Language Processing. It contains inter-conceptual relations
and inter-attribute relations as connoted in Chinese lexicons and their English equivalents.
(Also see this overview).
- HPKB -
DARPA's High-Performance Knowledge Base program, an initiative
in constructing large-scale, knowledge-based systems. This
program finished in 2000, with the follow-up RKF
program now underway.
- HTW - The How Things Work project at Stanford;
use of a KB and interactive explanation facilities (through WWW) to
assist in understanding and managing engineered products.
- ICOM - A
CASE tool for Intelligent Conceptual Modeling (Univ Manchester, UK).
It includes handling of multiple E-R diagrams,
and logical specification verification.
- IEEE Standard Upper Ontology (SUO) Working Group -
see Standard Upper Ontology (SUO) Working Group.
- IFOMIS -
The Institute for Formal Ontology and Medical Information Science (Saarland University, Germany).
- Inference Engines: See Languages
and Inference Engines for knowledge representation
- InfoQuilt - A research project that supports human-directed
knowledge discovery in a multi-ontology, multi-agent environment.
- Internet Business Logic -
Writing and running English business rules with unlimited vocabulary, using a browser.
Automatic generation and execution of SQL. (from Reengineering LLC).
- JESS -
Java Expert System Shell. JESS is a rule engine and scripting environment,
originally inspired by the CLIPS expert system
- JOE - Java Ontology Editor (Center for IT, Univ South
Carolina). Based on entity-relation diagrams. Will read ontologies
stored in KIF format.
- Jumper - a set of semantic
tools for aggregating and managing biological data, supporting
knowledge base development, semantic integration, linked data,
query routing, and auditing & tracking. From Jumper Networks Inc.
- KA2 - The Knowledge Annotation Initiative by (and of) the Knowledge Acquisition Community:
- KACTUS - An interactive environment for browsing, editing and managing
ontologies (Univ Amsterdam).
- KADS: See CommonKADS.
- KAON -
The KArlsruhe ONtology and Semantic Web infrastructure,
an open-source ontology management infrastructure
targeted for business applications. It includes
a comprehensive tool suite allowing easy ontology
creation and management, as well as building ontology-based
- Knowledge Acquisition Tools: See
and additional systems under
Ontology Building Tools.
- Knowledge Extraction Systems: There are many, but some interesting ones performing large-scale extraction from text include:
- KIF -
Knowledge Interchange Format: A proposed, logic-based format for
exchanging knowledge between computer programs (more details
Also see the newer effort on the
ISO Common Logic standard.
- KM - the Knowledge Machine.
A frame-based knowledge representation language from
the University of Texas at Austin,
the Boeing Company,
used for several projects including Project Halo.
- KNEXT -
Knowledge Extraction from Text. Large-scale extraction of general world knowledge by abstracting sentence parses
- Knowledge Management -
"A range of practices used in an organization to identify, create,
represent, distribute, and enable adoption of insights and experiences"
(Wikipedia). For some pointers, see
Best of the Web - Knowledge Management.
The Knowledge Sharing Effort (DARPA)
and related work.
A large, collaborative project to research and promote
email archive (email exchanges)
- KIF -
Knowledge Interchange Format: A proposed, logic-based format for
exchanging knowledge between computer programs.
- KSL -
Stanford's Knowledge Systems Lab; doing extensive,
pioneering research in knowledge representation,
reuse, and sharing.
- Laboratory for
Applied Ontology (Italy) - a new (2002) laboratory resulting
from the fusion of two former groups LADSEB-CNR and ITBM-CNR.
- Language and Computing - A technology company active in healthcare and pharmaceuticals, developing solutions combining Natural Language Processing and the use of ontologies.
The company has developed an expansive medical knowledge base called LinKBase, and is developing
the Health Incident and Reporting Ontology.
- Languages and Inference Engines for
- For description logic languages, see
- For conceptual graph tools, see
- For other languages, see
- In addition, the theorem proving/
automated reasoning community have developed numerous theorem provers
operating on representations expressed in standard first-order logic syntax
(e.g., see here).
- LOOM -
a description logic (ISI, CA). Also see its successor
PowerLOOM, and Bob MacGregor's
Retrospective on LOOM
- The Ontosaurus - a Web-based interface to LOOM.
- SHELTER - "SHELTER is intended to offer an
environment for collaborative, team development of large knowledge based
systems in a manner facilitating sharing and reuse." Uses LOOM as
its underlying representation language.
- LSDIS Lab -
The Large-Scale Distributed Information Systems Lab (Univ Georgia). Doing
extensive research and education on the Semantic Web.
- Marine Metadata Interoperability Project (MMI) - "Promoting the exchange, integration and use of marine data through enhanced data publishing, discovery, documentation and accessibility." Also see MMI's activities:
- Medical Terminologies and Ontologies: See
- Mental Causation - See The New Ontology of the Mental Causation Debate.
- MeSH - The
National Library of Medicine's subject thesaurus, also a
principal component vocabulary of UMLS.
Try out the
- The Meteor Project - researching
Semantic Web services and their
An in-depth, broad coverage ontology for multilingual NLP from the
Computing Research Laboratory, New Mexico State University.
- Mind Maps -
Informal graphical notation for representing
information (e.g., see How to Create a Mind Map).
Some similarities to
Concept Maps and
Also see Nelements.
- MindNet -
Microsoft's knowledge representation project that uses a broad-coverage parser to build semantic networks from dictionaries, encyclopedias, and free text. (Browse a MindNet KB built from word definitions).
- Mizoguchi Lab - Japan. Some interesting papers and ontologies.
- Moby -
A public domain suite of lexical resources
(word lists, part-of-speech lists, thesaurus, etc.).
for a quick summary.
- Multiple Ontologies -
see Ontologies - Dealing with
- NASA's Thesaurus - covering aerospace (available for a nominal fee).
- Natural Language Processing tools - By no means a complete list.
- NLTK - Natural Language Toolkit (Open Source Licence)
- Building/using KBs via natural-language processing (NLP) techniques: See
- Nelements -
a generic 3-dimensional knowledge representation/knowledge management system
(you can think of it as a 3-dimensional
- ODE - The Ontology Design Environment (Univ Madrid, Spain). See
- OIL -
Ontology Interchange Language. A proposed standard for specifying and exchanging
ontologies, drawing together ideas from Web languages (e.g., XML, RDF),
Description Logics, and frame-based systems.
- OKBC - The Open Knowledge
Base Connectivity Protocol (formerly called GFP, the Generic Frame Protocol).
Not a language, but a standardized set of low-level access
functions for frame-based KR systems.
- OK Station (in French)
- The Ontological Knowledge Station, a commercial modeling tool
dedicated to the acquisition, definition and manipulation of knowledge bases
and ontologies, based on the OK ontological model (Univ Savoie, France).
- OMCSNet - A semantic network derived from the
OpenMind commonsense database, now superseded
by the more recent OpenMind derivative knowledge base
- ON9 -
The ontology library at
the Laboratory for Applied Ontology (Italy),
developed using the
ONIONS methodology and implemented in
- a methodology for building a library of generic ontologies, and generating
domain ontologies (the Laboratory for Applied Ontology,
- OntoBroker -
A system using ontologies for both annotating Web documents, and providing
ontology-based answering services. OntoBroker is being
used as part of the
- OntoEdit - An Ontology
Engineering Workbench. OntoEdit is a development environment
for design, adaptation and import of knowledge models for application
- ONTOGEO -
The Geospatial Ontology Research Group in the National Technical University
of Athens. This is a repository for the research and teaching
activities of the OntoGeo Group in Geographic Information Science,
addressed primarily to researchers.
- Ontolingua - a language and set of tools for ontology
development (from Stanford). Includes a
Sharable Ontology Library.
- ONTOLOG - an open,
international, virtual community of practice on ontologies.
- Ontology - Definitions and perspectives
- Ontologies: History and Philosophy - See:
The ANSI Ad Hoc Group on Ontology Standards
- Ontologies - examples and collections (also see Thesauri). See
- Ontology Building Tools -
For building and managing ontologies.
See Michael Denny's survey of ontology editors (2002), and also the following
and additional systems under
Knowledge Acquisition Tools.
- Ontology Learning Tools - Automated/assisted techniques for
building an ontology. Also see a good survey of ontology learning methods and techniques (OntoWeb deliverable 1.5, A. Gomez-Perez, D. Manzano-Macho).
- The New Ontology of the Mental Causation Debate - an AHRC (Arts & Humanities Research Council) funded research project, attempting to frame the debate with more metaphysical precision, and explore the consequences of that reframing (Univ Durham, UK).
- Ontology Merging Tools - See
Chimera and PROMPT.
Also see Carter, a tool for helping experts build
a consensus KB.
- Ontologies - Dealing with
multiple ontologies - See
- Ontology Works - Ontology Works is a leading source of ontology construction
software, ontology-based database software, and ontology-based information
integration software. The Ontology Works IODE is software designed to produce
ontologies - true-to-the-world information models. These models may be
implemented in the Knowledge Server and High-Performance Knowledge Server.
(Also see their page
- OntologyStream Inc. -
a Virginia-based company focused on incorporating knowledge science and
technology into eGovernment and intelligence applications.
- Ontosaurus -
a Web-based interface to LOOM.
- OntoWeb -
Ontology-based information exchange for
knowledge management and electronic commerce. A collaborative
network of European researchers and industrials,
which aims to strengthen the European influence on
standardisation efforts such as those based on RDF and XML.
- OpenCyc -
The open source version of the Cyc technology
(including selected parts of the ontology, axioms,
inference engine, and knowledge acquisition tools).
Public OpenCyc servers.
- The OpenMind Initiative - a collection of projects to develop "intelligent" software. Note
in particular OpenMind CommonSense (below).
- OpenMind CommonSense -
A web site for collecting "commonsense" knowledge, as English sentences,
en masse from people on the Web.
The database can be downloaded for free. Also, a semantic network/knowledge base
mined from this corpus is available. (Also see
Push Singh's homepage).
- Pangloss and Penman -
natural language projects
at USC Information Sciences Institute. The creation and use of
large ontologies forms a key component of this research.
- Panlingua - A universal
theory of linguistic structure (by Chaumant Devin).
- The PARKA Project -
a frame-based AI system which claims to scale to extremely
large KB applications (Univ Maryland at College Park).
- PharmGKB -
The Pharmacogenetics and Pharmacogenomics Knowledge Base - an integrated
resource about how variation in human genes leads to variation in our
response to drugs.
- PowerLOOM -
A KL-ONE-style knowledge representation language
(the successor of LOOM).
- Probase - Large-scale ontology and fact extraction from the Web, including measures of certainty (Microsoft Research, China)
- PROMPT - An interactive ontology merging tool. Part of Protege-2000.
- Protege -
An extensible KR tool for constructing ontologies,
customizing knowledge-acquisition forms, and entering domain knowledge.
- PSL -
The Process Specification Language Ontology.
- QuALiM - A corpus-based QA system at Edinburgh, searching Wikipedia for answers (not just page hits).
- QUDT - Quantities, Units, Dimensions, and Data-Types - A unified model of measurable quantities, units, and their values (Being developed by TopQuadrant and NASA)
- Question-Answering Systems: See
Halo (knowledge-based question-answering),
TRAINS and TRIPS
(NLP + interactive planning),
AskJeeves (Web search),
QuALiM (Wikipedia text search),
TextMap (TREC/Web search, ISI), and
Webclopedia (NLP and information retrieval).
- RacerPro - an OWL reasoner and inference server for the Semantic Web.
- RCC-8 - Region Connection Calculus - An ontology for qualitative spatial reasoning.
- RDF -
Resource Description Framework: A lightweight ontology system to
support the exchange of knowledge on the
- Read The Web (NELL - Never-Ending Language Learning) - continuous crawling of the Web to extract an ontology of concepts, facts about those concepts, and confidences in those facts. From Carnegie Mellon University.
- RIDE - A Roadmap for Interoperability of eHealth Systems (a European-funded project).
- RKF -
DARPA's Rapid Knowledge Formation project, developing
methods to allow Subject Matter Experts (SMEs) to construct
knowledge bases directly (1999-2004).
- Roget's Thesaurus
- The Semantic Business Process Management (SBPM) Working Group - working towards the mechanization of Business
Process Management by using Semantic Web techniques, especially Semantic
- The Semantic
Enterprise - (Amit Sheth's graduate course).
- The Semantic Web - A
vision of having data on the web defined and linked in a way that it can be
used not just for display purposes, but also in various applications by
machines. Also see:
- SENSUS from ISI, Los Angeles. A large concept taxonomy
for natural language processing and other applications.
- Shaken - A knowledge
acquisition and reasoning tool built by the SRI Team, as part of DARPA's
Rapid Knowledge Formation project.
(Contact firstname.lastname@example.org for access to the software itself).
- SHOE -
HTML extensions to allow knowledge-representation semantics to
be added to Web pages, using explicit user-defined ontologies.
See also its successor CKML.
Knowledge Engineering Tool - An environment for
developing, viewing and debugging theories in first order
- Simple Dictionary - A
dictionary using a 430 word defining language, or Interlingua, which also serves
as the foundation for SIMPLE, a semantically-based NLP system. (Inspired by
Ogden's Basic English).
- SMDF - The Shared Meanings Design
Framework (in a very early stage, as of August 2000). An HCI-centered
methodology for supporting e-commerce developers,
by focussing on the semantics which an interface transmits. Uses SMML (Shared
Meanings Markup Language).
- SNePS -
The Semantic Network Processing System,
a knowledge representation and reasoning system from
Univ at Buffalo, NY.
- Snobase -
IBM Ontology Management System (also known as SNOBASE, for Semantic
Network Ontology Base) is a framework for loading ontologies from files
and via the Internet and for locally creating, modifying, querying, and
- The IEEE
Standard Upper Ontology (SUO) Working Group -
working to specify a standardized upper ontology. Also see
SUMO (the Suggested Upper Merged Ontology).
- Stanford KSL -
Stanford's Knowledge Systems Lab. Doing extensive,
pioneering research in knowledge representation,
reuse, and sharing.
- The START natural language
question-answering system (Infolab, MIT). Matches questions with hand-written annotations
- Statistical NLP -
A web site for a course, including reports, books, tools, and data collections.
- STEP/PDES -
Standard for Exchange of Product Data. A huge, international
effort to create an interlingua for exchanging manufacturing
See also STEP Tools, Inc.
- The Suggested Upper
Merged Ontology, developed within the IEEE
Standard Upper Ontology (SUO) Working Group. Contains (as of December
2003) about 1000 terms and over 4200 assertions for general ontological
concepts such as temporal relations, spatial relations, activities and
roles. Also see:
- TAMBIS -
(Transparent Access to Multiple Biological Information Sources). TAMBIS
is an integration and retrieval system for bioinformatics resources,
using an extensive ontology.
- TextRunner -
Large scale knowledge (taxonomy and facts) extraction from the Web - (Univ Washington).
- TCM - Toolkit for Conceptual Modelling.
A collection of software tools to present conceptual models of
software systems in the form of diagrams, tables, trees, and the like.
- Topic Maps: A graphical representation of the topics an information
set is about, their interrelationships, and which parts of an
information set are relevant to which topics. Also see
Concept Maps and
- The TOVE
project and ontologies, from the
Enterprise Integration Lab
, Univ Toronto.
- Trellis - An interactive environment allowing users to add their
observations, viewpoints, and conclusions as they analyze information
(ISI). Also see this paper.
- TRIPLE -
an RDF query, inference, and transformation language for the Semantic
- Twente Ontology Collection --
(Univ Twente, Netherlands). On Ceramics, substances, and engineering design.
This page no longer seems to be available.
- UMLS - Unified Medical Language System (UMLS) of the National
Library of Medicine (NLM). A collection of around 60 different
biomedical vocabularies, unified/aligned into a single
thesaurus, lexicon, and semantic network. Also see MESH, one of UMLS's
principal component vocabularies.
- UT Austin - KBS Group - Work on large-scale, multifunctional
- The VAC System - A complex, knowledge-based strategy and software tool for vehicle development (by the automotive company IAV). An animated visualization is available.
- VerbNet - A semantically rich, class-based verb lexicon
- The Verb Semantics Ontology - built on IBM's Verb Ontology, a database of senses, subcategorizations, roles etc. for high-frequency verbs.
- VINE -
Vocabulary INtegration Environment - A tool to map ontologies (from the Marine Metadata Interoperability Project)
- VOC2OWL - a tool to create OWL ontologies from ASCII files (from the Marine Metadata Interoperability Project)
- Webclopedia - an automated, retrieval-based question-answering system (ISI).
A Web-based knowledge elicitation tool, implementing repertory grids,
from the Knowledge Science
Institute, Univ Calgary, Canada.
- WebKB-2 - A
large-scale, lexically-oriented (WordNet-like) knowledge-base + Web-based
interface. The WebKB-2 knowledge-base includes a "tidied up" version of
WordNet 1.7 (see here for details). The WebKB-2 interface allows users to
retrieve, re-use, complement, annotate and be guided by other users' knowledge.
- WebOnto -
a Web-based knowledge modeling tool, allowing users to browse
and edit knowledge models over the Web
(Knowledge Media Institute, UK).
Uses OCML as the underlying language.
- WikiTaxonomy - The Wiki ontology (taxonomy) extracted from Wikipedia (by EML Research, Germany).
- WonderTools -
for helping select an ontology-building tool
(SWI, Univ Amsterdam, Netherlands).
- WordMap -
A commercial taxonomy management system for building and
deploying taxonomies quickly and easily.
- WordNet -
a large, on-line lexical reference system. Also see:
- YAGO - a huge semantic knowledge base. Currently, YAGO knows more than 2 million entities (like persons, organizations, cities, etc.). It knows 20 million facts about these entities. Unlike many other automatically assembled knowledge bases, YAGO has a manually confirmed accuracy of 95%. YAGO is part of the
YAGO-NAGA project at the Max-Planck Institute for Informatics in Saarbrucken, Germany.
Peter Clark (email@example.com)