Discover the collections in meet-Alex.
And create your own.


Not one model fits all

Ĉu tomato estas frukto aŭ legomo?

Is it a fruit or a vegetable? Opinions may differ. From a biological perspective it is a fruit. But from a culinary and legal perspective it may be considered a vegetable, as the US Supreme Court declared in 1893.

Does it matter? Not really, as long as we have the same understanding of it: a nice, healthy, red thing you can eat and may or may not like. No matter whether we name it a "tomato", or "Solanum lycopersicum", or even a "love apple".

What's in a name? that which we call a rose
By any other name would smell as sweet

Shakespeare in Romeo and Juliet

The sentence "Ĉu tomato estas frukto aŭ legomo?" means "Is a tomato a fruit or a vegetable?" in Esperanto, the artificial language created to become a universal common language bridging cultural and political differences across the world. However, despite its elegance and easy grammar, most people do not speak or understand the language.

Similarly, there is not one single model or ontology to describe and structure all data that can efficiently be used and understood by everyone. Different models and languages will always remain.

What has a tomato to do with data?

Suppose you want to know the colour of a tomato. That is data. Or you want to know how many calories a tomato has. That is also data. Maybe it was described by someone else, in his/her own words, with his/her own interpretation. Nevertheless, you can use that data, as long as you both actually mean the same thing.

And if you are not interested in tomatoes, you may be interested in data about customers (or do you prefer the term "clients"?), products, loans, transactions, houses, vehicles, medicines, food, planets, stars, equipment, hobbies, etc.

Data is useless without context

An example

Suppose someone asks you how many calories a tomato contains. The answer depends on context and interpretation. Questions you should consider are:

What type?

What type of tomato is meant? A cherry tomato? A roma tomato? A beefsteak tomato? A salad tomato? The tomato size and amount of calories may differ per type.

Which origin?

Where was the tomato grown? When was the tomato harvested? Age and origin may impact the sugar level in the tomato, and thus the amount of calories.

What is exactly meant?

"Calorie" has two distinct definitions:
  • The energy needed to raise the temperature of 1 gram of water by 1°C (also known as "small calorie", cal).
  • The energy needed to raise the temperature of 1 kilogram of water by 1°C (also known as "large calorie", Cal).
Your answer (your data) may be a factor of 1000 higher or lower than expected...
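
To make the factor of 1000 concrete, here is a minimal sketch in Python. The function name, the unit labels and the sample number are illustrative assumptions, not part of meet-Alex. It shows that the same bare number stands for very different amounts of energy depending on which definition the author had in mind.

```python
# Hypothetical helper for illustration only; not part of meet-Alex.
CAL_PER_KCAL = 1000  # 1 large calorie (Cal, kcal) = 1000 small calories (cal)


def to_small_calories(value: float, unit: str) -> float:
    """Convert an energy value to small calories (cal)."""
    if unit == "cal":
        return value
    if unit in ("Cal", "kcal"):
        return value * CAL_PER_KCAL
    raise ValueError(f"unknown calorie unit: {unit!r}")


# A bare number without context: which calorie is meant?
reading = 22  # illustrative number only, not a measured value

as_small = to_small_calories(reading, "cal")
as_large = to_small_calories(reading, "Cal")

# The two interpretations differ by a factor of 1000.
print(as_large / as_small)
```

Storing the unit next to the value is one simple way to carry this piece of context along with the data.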

Often this context is not immediately available, leading to different interpretations, hampering efficient data use. We want to change that.

Connected collections to describe it all

Instead of one overall common model or ontology, we developed a "multi-model approach" to describe and structure data.


All data, your data

Every second of every day, huge quantities of data are created. Do you know how the data you need is described and interpreted by others? And how your data matches the data needs of others?

The size of the digital universe will double every two years at least, a 50-fold growth from 2010 to 2020.

insideBIGDATA in The Exponential Growth of Data, February 16, 2017

In one place

There must be millions of data models to manage all data. Where are these models? Can you connect to these data models to understand and locate the data you need?

Now there is one place where you can store, share, find, and reuse data models: any collection of descriptions and structures of data, connected to the interpretation of others.


These connected collections require a solution that is highly scalable.


Crowd sourcing


Crowd sourcing is key to the solution. The millions of data models can never be collected and maintained by a small group of people. To capture all these descriptions and structures, a very large group of people is needed, contributing and working together on interconnected collections to achieve a common understanding.

The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations

James Surowiecki

Ease of use

Everyone can use the solution. It is as easy as drawing a model on a whiteboard. But now the model is not wiped out; it is stored for later reuse by you and others. Just as easy.

Open source

We developed the solution as open source, and provide it as open data. This allows you to contribute to developing new exciting functionality. And it allows you to host the solution in your own environment. Connected to meet-Alex. This is key for scaling up, and for the success of the solution.


Everything you need to describe your data

Term

The basic unit in meet-Alex. It's a word or a few words combined, used to name something. For example "tomato", "table", or "dinner table", "customer", "name", "colour", "calorie". All information is basically organised by terms.

Collection

The mechanism by which a user groups a number of terms. Currently, terms can only be created through a collection. But you can also use terms from other collections inside your collection.

Collections enable management of terms and organise collaboration on terms.

Relation

The mechanism to structure data, by connecting two terms inside a collection. Relations are always stored in the context of a collection, even though some of the terms in the relation might also be used in other collections.

Description

The mechanism to communicate the meaning of a term inside a collection. Here you can describe what the term represents (for example "tomato is a fruit"), and how it differs from other terms (for example, which characteristic clearly distinguishes a "tomato" from an "apple"?).
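
The four building blocks above can be sketched as a small data structure. This is only an illustration in Python under stated assumptions: the class, field and method names (including the "origin" field and the "label" on a relation) are hypothetical and do not reflect the actual meet-Alex implementation or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Term:
    """A word or a few words used to name something."""
    name: str
    origin: str  # assumption: name of the collection the term was created in


@dataclass(frozen=True)
class Relation:
    """Connects two terms inside a collection."""
    source: Term
    target: Term
    label: str  # assumption: a free-text label such as "is a"


@dataclass
class Collection:
    """Groups terms and holds relations and descriptions in its own context."""
    name: str
    terms: set[Term] = field(default_factory=set)
    relations: list[Relation] = field(default_factory=list)
    descriptions: dict[Term, str] = field(default_factory=dict)

    def add_term(self, name: str) -> Term:
        """Create a new term through this collection."""
        term = Term(name=name, origin=self.name)
        self.terms.add(term)
        return term

    def use_term(self, term: Term) -> None:
        """Use a term that was created in another collection."""
        self.terms.add(term)

    def relate(self, source: Term, target: Term, label: str) -> Relation:
        """Store a relation between two terms of this collection."""
        if source not in self.terms or target not in self.terms:
            raise ValueError("both terms must be part of the collection")
        relation = Relation(source, target, label)
        self.relations.append(relation)
        return relation

    def describe(self, term: Term, text: str) -> None:
        """Record what a term means inside this collection."""
        if term not in self.terms:
            raise ValueError("the term must be part of the collection")
        self.descriptions[term] = text


# Two collections share the term "tomato" but relate it differently.
biology = Collection("biology")
tomato = biology.add_term("tomato")
fruit = biology.add_term("fruit")
biology.relate(tomato, fruit, "is a")
biology.describe(tomato, "tomato is a fruit")

cooking = Collection("cooking")
cooking.use_term(tomato)
vegetable = cooking.add_term("vegetable")
cooking.relate(tomato, vegetable, "is a")
```

In this sketch the relation and the description live in the collection, not in the term, so the same term can carry a different meaning in each context.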

Why did we start this journey?

There is great value in effectively managing data

We firmly believe that in order to get in control of data, you need to be able to communicate effectively about data. Making models of the data enables and facilitates managing it.

Scale up the thinking about data models

The use of data models to unlock the true potential of data has been known for decades. We learned the hard way that these decades-old techniques are known to only a very limited number of people. We started looking for tools in this space and found that they are primarily designed for expert users. This resulted in a lack of scalability. In the current world, everybody works with data, so this is a wasted opportunity to speed up and become more effective within and between organisations in exchanging, analysing and using data.

None of this stops organisations from exploiting data. Most organisations and people already build systems every day that serve many customers. All of these systems have models implemented, so how is it possible that this subject did not scale?

We learned that this is because people who implement systems use modelling techniques that are close to the implementation. This means you see the results quickly and adapt accordingly. But it also means that the results rarely get published for others in the organisation to reuse and connect to. Often, the models even get lost again.

Crowd sourcing

This led us to the belief that "Crowd Sourcing" is key to achieving success in this area: to enable a place where models can be shared, stored and reused. But to achieve this, we need to keep things simple for people to get started. We also need to be able to accommodate several types of model structures, from simple to complex.

Make models understandable by both business and IT

Additionally, we have found that data models are mostly used in the IT-related parts of organisations. They are rarely discussed with users in other parts of the organisation. This is one of the key factors in the inefficiency of requirement setting between users of IT and developers of IT. There is almost no discussion about the data that should be in a system. Thus one of our core beliefs is that we need to enable the dialogue on data between IT people and non-IT people. Again modelling helps, but only when we make it simple for non-IT people to read, use and even build a model at a business level. This puts demands on our usability.

Any collection of terms and relations

Models at business level require participation of non-IT people. We found that the term "model", and especially "data model", is often recognised by IT people, but less so by non-IT people. So we need terminology that is more intuitive to non-IT people.

A "data model" is basically a collection of terms and relations between these terms, that are relevant to people in a specific context. The terms are usually explained via a definition or description.

We have found that most people have an intuitive understanding of this explanation, allowing for increased participation. Thus, we refer to "collections" instead of "data models" in meet-Alex.

Cross the organisation boundary

Next to this, we also realise that there are data models ("collections") which are authoritative and common for every individual and organisation (e.g. utility models, legal frameworks, governmental or regulatory frameworks). So Crowd Sourcing needs to cross the organisation boundary. We therefore concluded that we should open-source the code, allow others to host the solution, provide it as Open Data, and later build a network of connected data models. Future plans may also include the use of techniques like blockchain.

Artificial Intelligence

Why can't Artificial Intelligence (AI) solve this problem? AI can solve part of the problem in this field, but not all of it, since it does not replace human intelligence. Supervised learning requires sufficient data, and maybe our approach for meet-Alex will provide that data in the future, so that AI technology can assist in the scalability.