Thursday, March 17, 2022

Conceptual Data Modeling: An Examination of Trends

John Singer wonders if Conceptual Data Modeling can save IT from itself. “I definitely think that we need a little bit of saving. We need a little help in terms of how we build systems, especially from a data perspective.”

Singer spoke at DATAVERSITY®’s Enterprise Data World Conference about Data Modeling, current gaps in the field, and how the future of modeling might look. Singer is the founder of NodeEra open-source Property Graph Modeling Software.


There’s no question that people are doing amazing things with machine learning and business analytics, Singer said. It’s not that today’s systems don’t produce good results, but at the end of the day, we’re really still building unit record processing systems: they’re just faster and better at what they do. “And I don’t think we can move forward until we address that issue.”

Current Data Modeling Tools

Often, a data modeler is assigned a project where the stated product is a data model, but in reality, what the project owners are asking for is a physical database design. The methodology that modelers are taught is to first build a conceptual data model, then derive a logical data model from that, and then refine that into a physical data model.

Conceptual Data Modeling is business-oriented, technology independent, and abstract. The logical model adds specific properties and technical elements, and the physical model includes DDL and super/sub types specific to the database, he said.
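As a rough illustration of the three layers described above, here is a minimal Python sketch. The entity, its definition, the attribute names, and the type mapping are all hypothetical and are not taken from Singer's talk. It shows a business definition at the conceptual level, attributes at the logical level, and generated DDL at the physical level.

```python
# Conceptual level: business-oriented and technology independent.
# (Hypothetical entity and definition, for illustration only.)
conceptual = {
    "Customer": "A person or organization that buys from the business",
}

# Logical level: adds specific properties (name, logical type, is_key).
logical = {
    "Customer": [
        ("customer_id", "integer", True),
        ("name", "string", False),
    ],
}

# Physical level: an assumed mapping from logical types to SQL column types.
TYPE_MAP = {"integer": "INTEGER", "string": "VARCHAR(255)"}


def to_ddl(entity, attributes):
    """Render one logical entity as a CREATE TABLE statement."""
    columns = []
    for name, logical_type, is_key in attributes:
        column = f"  {name} {TYPE_MAP[logical_type]}"
        if is_key:
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {entity.lower()} (\n" + ",\n".join(columns) + "\n);"


ddl = to_ddl("Customer", logical["Customer"])
print(ddl)
```

Note that nothing from the `conceptual` dictionary reaches the generated DDL. In this sketch, the business definition simply has no place to go once the design becomes a table.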

The Problem with the Model

But Singer has a problem with the conceptual data model because it’s usually defined in such broad-brush strokes. Ask what a conceptual data model is, and the answer is often: “It’s more abstract.” To Singer, that’s not good enough. “It’s really not what we need to accomplish, but it’s all we have.” Another issue is with the polyglot persistence layer. Organizations have so many different target databases that an Entity/Relationship model doesn’t really apply to a lot of the databases in use today.

Current modeling tools support the creation of these different models, and they can be linked, but the maintenance is a big problem, he said. “You can create the greatest conceptual model in the world, but nobody cares about it, because it’s just not impactful to anyone other than the data modeler.” Although he has no complaint with the process, it’s just not enough for conceptual models.

A Desperate Need for Conceptual Data Modeling

Singer pointed out that most of EDW addresses topics that exist to fix the lack of a good conceptual data design: governance, data catalogs, data glossary, lineage, strategy, and quality. These are all important, but the design at the front end of the system gets lost because the data model can’t capture it. “And when we persist the data into the database, it sure doesn’t get captured there.” This leads to his assertion that there’s a critical need for a conceptual data model.

Solution Requirements for the Conceptual Database

Singer’s three-step solution, which he calls a “conceptual database,” includes both the model and the persistence.

  • The model and the data are defined using the same language, so that the model equals the data.

  • The model must map easily back and forth to and from existing systems and databases.

  • The model should be intuitive, more closely mirroring human behavior, because humans excel at defining and discussing concepts, he said. “Language is really the missing piece.”

Current Conceptual Data Modeling Approaches

In 1976, Peter Pin-Shan Chen published a paper titled The Entity-Relationship Model: Toward a Unified View of Data. His goal was to unify the different data models in use at the time.

“The relational model is based on relational theory,” said Chen, “but it may lose some important semantic information about the real world.” We can create a conceptual model that’s more semantically rich, Singer added, “but as soon as we put that data in a relational database, we lose all the context.”

Early Linguistic-Based Modeling: NIAM/ORM

In the 1990s, another conceptually oriented modeling technique, NIAM, emerged. An acronym for Nijssen’s Information Analysis Methodology (after G.M. Nijssen, one of the researchers who developed it), it was later renamed Natural Language Information Analysis Model to make clear that the model was a team effort. The technique eventually became known as Object-Role Modeling (ORM).

ORM was designed to better mirror the human language used to describe the concepts in the model. It’s a more semantically rich way to model data, he said. It doesn’t persist in this form in a database, so although a relational design could be built from it, all the semantic detail would be lost.

Toward a New Database Management System (DBMS)

Newer technologies like property graphs and the semantic web provide some, but not all, of what’s needed.

  • Property Graphs

To understand property graphs, it’s necessary to let go of the assumptions inherent in a relational database structure. An extremely flexible model, the property graph is very simple: “It’s nodes and relationships, and you put properties on them. You can really do anything you want with it,” he said, and modelers will often naturally gravitate toward a Chen- or an ORM-style model. The conceptual data model is not predefined, and because it’s not created until runtime, the modeler can just intuitively start modeling the data, treating every property as an entity. The downside, he said, is that “The semantics are just all in your head. And the underlying database doesn’t really have any understanding of the semantics.”
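To make the “nodes and relationships, and you put properties on them” description concrete, here is a minimal in-memory sketch in Python. The class names, labels, and relationship types are hypothetical and do not represent the API of any particular graph database. It is meant only to show how little the structure itself knows about meaning.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: str
    rel_type: str
    end: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical in-memory property graph: nodes, relationships, properties."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, *labels, **properties):
        self.nodes[node_id] = Node(node_id, set(labels), properties)
        return self.nodes[node_id]

    def relate(self, start, rel_type, end, **properties):
        # Only the existence of both endpoints is checked. Nothing states
        # which labels a given relationship type is allowed to connect.
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must exist")
        rel = Relationship(start, rel_type, end, properties)
        self.relationships.append(rel)
        return rel

    def neighbors(self, node_id, rel_type):
        """Return the ids of nodes reached from node_id via rel_type."""
        return [r.end for r in self.relationships
                if r.start == node_id and r.rel_type == rel_type]


graph = PropertyGraph()
graph.add_node("c1", "Customer", name="Alice")
graph.add_node("o1", "Order", total=99.5)
graph.relate("c1", "PLACED", "o1", channel="web")

# The graph accepts a reversed, meaningless relationship just as readily.
graph.relate("o1", "PLACED", "c1")

print(graph.neighbors("c1", "PLACED"))
```

The second `relate` call succeeds even though an order placing a customer makes no business sense, which is one way to read Singer's remark that the semantics live in the modeler's head rather than in the database.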

  • Semantic Web Technologies

Distributed by its very nature, the goal of the semantic web is that “anyone anywhere can say anything about anything.” Users can publish data, and that data can be linked to any other published data. As with property graphs, the semantic web differs from the relational database structure, describing things using a form of logic. The basic unit, called an “RDF triple” (RDF stands for Resource Description Framework), is an assertion of some fact, a relationship that exists between the subject and the object, expressed as three parts of a sentence in the form: subject-predicate-object. The combination of all RDF assertions is called the RDF Graph. Unlike earlier models, there is no loss of semantics when persisting data, he said.
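The subject-predicate-object form can be sketched with nothing more than tuples. The following Python example is a simplified illustration, not a real RDF store: the `ex:` names are hypothetical, and identifiers are plain strings rather than full IRIs. In practice a library such as rdflib would typically be used.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# An "RDF graph" here is simply the set of all asserted triples.
# All ex: identifiers are hypothetical, for illustration only.
triples = {
    Triple("ex:Chen", "rdf:type", "ex:Person"),
    Triple("ex:Chen", "ex:wrote", "ex:ERPaper"),
    Triple("ex:ERPaper", "ex:describes", "ex:EntityRelationshipModel"),
}


def match(graph, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    )


# "What do we know about ex:Chen?"
for t in match(triples, subject="ex:Chen"):
    print(t.subject, t.predicate, t.obj)
```

Because type statements such as `rdf:type` are themselves triples, descriptions of the model and the instance data sit in the same structure and can be queried the same way.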

Differences from a Relational Database

In a relational database, the table type must be defined before data can be added to it. With the semantic web, instance data can be collected and the database can classify it for you, or it can determine what class it belongs to.

Everything is expressed using the physical data model (the triple), but the conceptual data model is rigorously defined, as opposed to the property graph, where the conceptual model is defined just by convention.

“Here, it’s specifically called out.” Singer calls the semantic web’s inferencing engine its “superpower,” because it can infer new facts or types from given facts, and it can classify things independently. “The ‘kryptonite’ part is that it’s hard to understand. Really smart people get the logic and the rest of us all kind of struggle.”
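A small example may help show what inferring new facts or types means in practice. The Python sketch below applies two RDFS-style rules (a property's domain implies the subject's type, and membership in a subclass implies membership in the superclass) until no new triples appear. It is a toy forward-chaining loop under those two assumed rules, not a full reasoner, and every `ex:` name is hypothetical.

```python
def infer(triples):
    """Apply two RDFS-style rules repeatedly until nothing new is derived."""
    known = set(triples)
    while True:
        new = set()
        for s, p, o in known:
            # Domain rule: if property p has domain C, then any subject
            # that uses p is an instance of C.
            for prop, rel, cls in known:
                if rel == "rdfs:domain" and prop == p:
                    new.add((s, "rdf:type", cls))
            # Subclass rule: an instance of a class is also an instance
            # of each of its superclasses.
            if p == "rdf:type":
                for sub, rel, sup in known:
                    if rel == "rdfs:subClassOf" and sub == o:
                        new.add((s, "rdf:type", sup))
        if new <= known:
            return known
        known |= new


facts = {
    # Model-level statements.
    ("ex:placedOrder", "rdfs:domain", "ex:Customer"),
    ("ex:Customer", "rdfs:subClassOf", "ex:Party"),
    # Instance data: nobody has said what kind of thing ex:alice is.
    ("ex:alice", "ex:placedOrder", "ex:order42"),
}

closure = infer(facts)
for triple in sorted(closure - facts):
    print(triple)
```

No one asserted that `ex:alice` is a customer. The classification follows from the fact that she placed an order, which is the kind of independent classification described above.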

Semantic web databases seem to fulfill some of the requirements of a conceptual database, he said. Most importantly, the “model = data” requirement is clearly there, but the real issue is ease of use. How can this be made easier to use and accessible to business users, not just IT experts?

Formal Semantics

The concept of formal semantics grew out of the study of linguistics. Formal semantics uses techniques from mathematics and logic to form theories about human or computer languages.

The basic unit in formal semantics is the sentence, which, as in human language, is a grammatically sound string of words. Each sentence has meaning, and that meaning is called a “proposition.” Propositions are converted into a logical meta-language using a form of logic called predicate calculus. Propositions are matched with a set of values about the world and, based on how well they match, can be determined to be true or not.
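The idea of checking a proposition against a set of values about the world can be sketched as follows. This is a simplified illustration in Python, assuming a tiny hypothetical world of three individuals and two predicates. The translation from English sentence to logical form is done by hand in the comments rather than by any parser.

```python
# A tiny "world": a domain of individuals plus the tuples for which
# each predicate holds. All names are hypothetical.
world = {
    "domain": {"alice", "bob", "order42"},
    "Customer": {("alice",), ("bob",)},
    "Placed": {("alice", "order42")},
}


def holds(world, predicate, *args):
    """An atomic proposition is true if its argument tuple is in the world."""
    return tuple(args) in world.get(predicate, set())


# Sentence: "Alice placed order 42."
# Logical form: Placed(alice, order42)
alice_placed = holds(world, "Placed", "alice", "order42")

# Sentence: "Every customer placed some order."
# Logical form: for all x, Customer(x) implies there exists y with Placed(x, y)
every_customer_placed = all(
    (not holds(world, "Customer", x))
    or any(holds(world, "Placed", x, y) for y in world["domain"])
    for x in world["domain"]
)

print(alice_placed, every_customer_placed)
```

In this world the first proposition is true, while the second is false because `bob` is a customer who has placed nothing.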

Toward a Language-Based API

The way data concepts are modeled must evolve to an easily understood form that survives persistence to a database, he said, “And the only way I can see how this can happen is by going to a more language-based API.”

Language processing occurs in the unconscious mind. The system should be able to explain itself when asked: “What is the definition of that?” or “Which part of the business cares about this?” “We should be able to capture and maintain all this business context in a way that stays with the data.”

Conceptual Database Future

The challenge is to bridge from the logic to the language. “We need to do this in a way that more mirrors human behavior,” and Singer believes that language is the way to accomplish that. People are definitely doing amazing things with machine learning and business analytics, he said, “but at the end of the day, we’re really still building unit record processing systems: they’re just faster and better at what they do. And I don’t think we can move forward until we address that issue.”
