The NanoParticle Ontology (NPO) is an ontology being developed to represent the knowledge underlying the description, preparation, and characterization of nanomaterials in the area of cancer nanotechnology. The NPO is designed and developed within the framework of the Basic Formal Ontology. Its development started with defining the terms and relationships used for describing the chemical composition and the physicochemical and functional/biological characterization of nanomaterials (e.g., nanoparticles, nanodevices, nanostructures) that are formulated and tested for applications in cancer diagnostics and therapeutics.
Existing ontologies and controlled vocabularies (e.g., Gene Ontology (GO), Chemical Entities of Biological Interest (ChEBI), NCI Thesaurus (EVS)) represent some parts of the knowledge within the domain of cancer nanotechnology. Nevertheless, an ontology that provides a unifying knowledge framework for cancer nanotechnology research has to be developed for the following purposes:
An ontology is a formal, explicit representation of knowledge belonging to a subject area: the knowledge is encoded and represented as a hierarchy of concepts (terms / classes) that are described using attributes (e.g., metadata such as preferred name, definition, and synonyms), related using associative relations, and formalized using logical axioms in a machine-interpretable language (e.g., the Web Ontology Language, or OWL).
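The ingredients named above (concepts, descriptive attributes, and associative relations) can be pictured with a minimal Python sketch. The `Concept` class, the `build_label_index` helper, the relation name `has_component_part`, and the two example entries with their definitions are all hypothetical and chosen only for illustration; they are not taken from the NPO itself, and logical axioms are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Concept:
    """One ontology class together with its descriptive metadata."""
    preferred_name: str
    definition: str
    synonyms: List[str] = field(default_factory=list)
    # Associative relations: relation name -> preferred names of target concepts.
    relations: Dict[str, List[str]] = field(default_factory=dict)


def build_label_index(concepts):
    """Map every preferred name and synonym (lower-cased) to a preferred name."""
    index = {}
    for concept in concepts:
        for label in [concept.preferred_name, *concept.synonyms]:
            index[label.lower()] = concept.preferred_name
    return index


# Hypothetical example entries, for illustration only.
concepts = [
    Concept(
        preferred_name="nanoparticle",
        definition="A particle with dimensions on the nanometre scale.",
        synonyms=["nano-particle", "NP"],
        relations={"has_component_part": ["chemical component"]},
    ),
    Concept(
        preferred_name="chemical component",
        definition="A chemical constituent of a material.",
    ),
]

index = build_label_index(concepts)
print(index["np"])  # looks up a synonym and prints its preferred name
```

Resolving synonyms to one preferred name is what lets different groups annotate the same entity consistently, even when their source texts use different labels.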
Ontologies have applications as common vocabularies, which researchers from different disciplines can share for annotating data in texts as well as in databases. There are several advantages to using ontologies:
Cancer nanotechnology is an intrinsically interdisciplinary field of research devoted to the development and application of nanotechnology-based methods in the treatment, diagnosis, and detection of cancer.
Experimental data from cancer nanotechnology research are diverse and large in volume. Informatics methods are needed to use these data efficiently and to facilitate the realization of nanotechnology applications in personalized treatment methods.
Most of these data characterize the physicochemical and functional properties related to the in vitro / in vivo behavior of nanoparticles that are formulated for applications in cancer diagnostics and therapeutics. Small changes in chemical composition can cause drastic changes in the properties of nanoparticles. Since the chemical composition can be modified in many combinatorial ways, one can formulate diverse types of nanoparticles with varying properties and applications. Each new formulation requires experimental characterization, which in turn adds more volume and diversity to the data. Additionally, the data and the underlying knowledge in cancer nanotechnology are complex because they integrate information from multidisciplinary areas such as chemistry, materials science, biology, and cancer medicine.
Most of the experimental results are found in disparate sources such as journal articles. Large amounts of information from textual sources are difficult to process manually, and this difficulty limits the effective use of the information for advancing research. Therefore, in order to process and utilize the information effectively, there is a need for informatics methods (e.g., grid-based database technologies, text-mining techniques, information models, controlled vocabularies, ontologies) that facilitate 1) integration, sharing and searching of data from disparate sources, 2) semantic integration of data, and 3) unambiguous interpretation of data.
An important application of informatics is in the area of data mining and knowledge discovery. Cancer nanotechnology data sets are rich in information that can be mined for structure-activity relationships and for correlations between different characteristic nanoparticle properties (e.g., between in vitro and in vivo properties). Mining of existing literature data can provide useful information to guide the re-purposing or de novo design of nanoparticles. Database resources such as caNanoLab are being developed for storing, searching and sharing data generated from characterization experiments, with the goal of enabling knowledge discovery. However, databases must be complemented by a common vocabulary to facilitate semantic interoperability among them.
To construct the NPO, we created an initial list of terms using the descriptions of nanoparticle formulations in the literature. These terms were obtained using information related to the:
Specifically, for each type of information, we identified the header terms and relationships associating these terms. These terms and relationships provided a structure for organizing the information content in the literature, based on which we collected more terms and organized them in the form of a taxonomic “is_a” hierarchy.
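As a rough illustration of organizing collected terms into a taxonomic "is_a" hierarchy, the Python sketch below builds a child-to-parent map and walks it upward. The term pairs and the function names `build_parent_map` and `ancestors` are hypothetical, and the sketch assumes each term has a single parent, which is a simplification rather than a property of the NPO.

```python
# Hypothetical (child, parent) "is_a" pairs, for illustration only.
IS_A_PAIRS = [
    ("nanoparticle", "nanomaterial"),
    ("liposome", "nanoparticle"),
    ("dendrimer", "nanoparticle"),
    ("nanodevice", "nanomaterial"),
]


def build_parent_map(pairs):
    """Return a dict mapping each child term to its single is_a parent."""
    parents = {}
    for child, parent in pairs:
        if child in parents and parents[child] != parent:
            raise ValueError(f"{child!r} has conflicting parents")
        parents[child] = parent
    return parents


def ancestors(term, parents):
    """List the terms reached from `term` by following is_a links upward."""
    chain = []
    seen = {term}
    while term in parents:
        term = parents[term]
        if term in seen:
            raise ValueError("cycle detected in is_a hierarchy")
        seen.add(term)
        chain.append(term)
    return chain


parents = build_parent_map(IS_A_PAIRS)
print(ancestors("liposome", parents))
```

Checking for conflicting parents and cycles while the hierarchy is still a simple term list is a cheap way to catch organizational mistakes before any formal encoding is attempted.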
For formal development of the NPO, we re-factored this hierarchy of terms using terms from the Basic Formal Ontology (BFO) at the upper level of the NPO, and constructed the NPO in the Web Ontology Language (OWL) using well-defined design principles. Terms that are found in other relevant ontologies / controlled vocabularies such as GO, ChEBI, and NCI Thesaurus are re-used in the NPO.
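To give a feel for what placing domain terms under an upper-level class looks like in OWL, the following Python sketch emits a few class declarations in Turtle syntax using plain string formatting. Only the `owl:` and `rdfs:` namespaces are real; the `ex:` and `upper:` namespaces, the class names, and the helper functions are hypothetical stand-ins and are not the actual NPO or BFO identifiers. Labels are assumed to contain no characters that need escaping.

```python
PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    # Hypothetical namespaces, NOT the real NPO or BFO IRIs.
    "ex": "http://example.org/npo-sketch#",
    "upper": "http://example.org/upper-sketch#",
}


def class_to_turtle(name, parent, label):
    """Render one OWL class declaration with a single named superclass."""
    return (
        f"{name} a owl:Class ;\n"
        f"    rdfs:subClassOf {parent} ;\n"
        f'    rdfs:label "{label}" .\n'
    )


def ontology_to_turtle(classes):
    """Render prefix declarations followed by all class declarations."""
    header = "".join(
        f"@prefix {prefix}: <{iri}> .\n" for prefix, iri in PREFIXES.items()
    )
    body = "\n".join(class_to_turtle(*entry) for entry in classes)
    return header + "\n" + body


# Hypothetical (class, superclass, label) triples: an upper-level class
# followed by two domain classes placed beneath it.
classes = [
    ("upper:MaterialEntity", "owl:Thing", "material entity"),
    ("ex:Nanomaterial", "upper:MaterialEntity", "nanomaterial"),
    ("ex:Nanoparticle", "ex:Nanomaterial", "nanoparticle"),
]

turtle = ontology_to_turtle(classes)
print(turtle)
```

In practice an ontology editor or an RDF library would normally handle serialization; string formatting is used here only to keep the sketch free of dependencies.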
The BFO (Basic Formal Ontology) was selected as the upper-level ontology for developing a structured classification of NPO terms. BFO is a formal ontology based on tested principles for biomedical ontologies. The reasons for using BFO as the upper-level ontology are as follows:
For a comprehensive account of the BFO, the reader is directed to the BFO manual. Only relevant BFO terms are currently used in the NPO; however, other top-level BFO terms may be added if needed.
We have selected OWL-DL as the language for encoding the NPO. This is because
The main design principles used in developing the NPO are listed below. These design precepts are based on BFO and Open Biomedical Ontologies (OBO) Foundry principles (http://www.obofoundry.org/crit.shtml) as well as our review of other OWL-encoded ontologies and controlled vocabularies:
The NPO is developed to represent knowledge underlying the chemical composition, preparation, physicochemical and functional/biological characterization of nanomaterials in the cancer nanotechnology domain.
Public releases of the NPO are made available through the BioPortal web site, maintained by the National Center for Biomedical Ontology.
The NPO is now included in the NCI Metathesaurus (NCIm), which can be accessed at http://ncimeta.nci.nih.gov/.
The NCI Metathesaurus contains about 3,600,000 terms from over 76 vocabularies, and these terms are mapped to about 1,400,000 biomedical concepts. Because terms from multiple vocabularies are mapped to a single biomedical concept, the user can choose among those vocabularies when annotating data, and can also discover vocabularies that were previously unknown to them. With the inclusion of the NPO in the NCI Metathesaurus, we expect that NPO accessibility and usage will be extended within the NCIm, that the NPO will add semantics to the NCIm, and that NCIm users will be able to take advantage of the knowledge provided by the NPO.
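The many-terms-to-one-concept mapping described above can be sketched in a few lines of Python. The vocabulary names, terms, concept identifiers, and function names below are hypothetical and do not reflect actual NCIm content or its interface; the sketch only shows how looking up one term surfaces equivalent terms from other vocabularies.

```python
from collections import defaultdict

# Hypothetical (vocabulary, term, concept_id) rows; not real NCIm content.
ROWS = [
    ("VOCAB_A", "nanoparticle", "C0001"),
    ("VOCAB_B", "nano-particle", "C0001"),
    ("VOCAB_C", "nanoscale particle", "C0001"),
    ("VOCAB_A", "liposome", "C0002"),
]


def index_by_concept(rows):
    """Group (vocabulary, term) pairs under their shared concept identifier."""
    by_concept = defaultdict(list)
    for vocabulary, term, concept_id in rows:
        by_concept[concept_id].append((vocabulary, term))
    return dict(by_concept)


def alternatives(term, rows):
    """Return every (vocabulary, term) pair mapped to the same concept as `term`."""
    for entries in index_by_concept(rows).values():
        if any(candidate == term for _, candidate in entries):
            return entries
    return []


print(alternatives("nano-particle", ROWS))
```

A user who knows only the term from one vocabulary would, under this scheme, also see the entries from the other two and could pick whichever suits the annotation task.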