In this article, I will try to demystify the concept of metadata, explain it in everyday words, and show why it is so important in today’s data world.
Let’s first try to define metadata. According to the National Information Standards Organization (NISO), metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data or information about information. In other words, it is information that we create to enable the discovery, accessibility, and usability of data.
If we look back in history, we can find traces of the use of metadata in the ancient world. At that time, metadata mainly took the form of library catalogues that referenced all the works stored. Imagine having to retrieve any information in the Library of Alexandria without any organization! Imagine today’s history museums trying to retrieve a bone or any other item of their collection without labelling and referencing them.
When libraries became digital, metadata took on a more complex role. Here is a definition of a digital library and, by extension, of the information that metadata should carry:
“Digital libraries are organizations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity, and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities.” http://www.diglib.org/
Today, metadata is found everywhere. Netflix uses it to retrieve its videos, and a company uses it to classify its employees. Amazon uses metadata to classify the incredible amount of goods it delivers. Almost every person has a social security number that allows their country to identify and classify them.
Metadata is created, stored and used just as data is. Indeed, the distinction between data and metadata tends to be blurry and is often only one of semantics. However, unlike data, metadata has to be structured and is collected to fulfil a given purpose, whereas data need not be structured and is often collected for no specific purpose. Furthermore, metadata provides context for non-textual materials, such as datasets or images, that could not be used without this additional information.
Many metadata properties are useful to display to users, to help in identification or understanding of a resource. Interoperability, the effective exchange of content between systems, relies on metadata describing that content so that the systems involved can effectively profile incoming material and match it to their internal structures.
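As a rough illustration of these two ideas, the sketch below describes an image with a small structured record and then maps that record onto the field names of another system. The field names, the `FIELD_MAP` crosswalk and the `to_target_schema` helper are all hypothetical, invented for this example. Real systems would more likely agree on a shared metadata standard.

```python
# A hypothetical metadata record describing an image file.
# The image itself is only pixels; this record supplies the context.
photo_metadata = {
    "title": "Lighthouse at dusk",
    "creator": "Jane Doe",
    "date_created": "2019-06-14",
    "format": "image/jpeg",
}

# Hypothetical crosswalk: our field names -> field names used by another system.
FIELD_MAP = {
    "title": "name",
    "creator": "author",
    "date_created": "created",
    "format": "mime_type",
}


def to_target_schema(record, field_map):
    """Rename the fields of `record` according to `field_map`.

    Fields with no known equivalent are dropped rather than guessed.
    """
    return {
        field_map[key]: value
        for key, value in record.items()
        if key in field_map
    }


exported = to_target_schema(photo_metadata, FIELD_MAP)
print(exported)
```

Because the record is structured, the receiving system can match each incoming field to its own internal structure without a human having to look at the image.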
Different types of metadata
Descriptive metadata is historically the first kind of metadata. As explained before, in the very beginning it was the only kind used in libraries to retrieve books and other writings.
- For finding or understanding a resource
Administrative metadata is, in a word, anything needed to understand, decode, or use the data, such as hashing algorithms, Creative Commons rights, or any other information of this kind.
- Technical metadata: for decoding and rendering files
- Preservation metadata: for the long-term management of files
- Rights metadata: intellectual property rights attached to content
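To make technical, preservation and rights metadata a little more concrete, here is a minimal sketch that builds one record of each kind for a file held in memory. The `describe_file` and `is_intact` helpers, the record layout and the sample file are assumptions made for this example, not an established schema. The SHA-256 checksum stands in for the hashing algorithms mentioned above.

```python
import hashlib
import mimetypes


def describe_file(filename, content, license_name):
    """Build a small, hypothetical record for a file given as bytes."""
    # Guess the MIME type from the file extension (may be None if unknown).
    mime_type, _ = mimetypes.guess_type(filename)
    return {
        # Technical metadata: what is needed to decode and render the file.
        "technical": {"mime_type": mime_type, "size_bytes": len(content)},
        # Preservation metadata: a checksum to detect corruption over time.
        "preservation": {"sha256": hashlib.sha256(content).hexdigest()},
        # Rights metadata: the licence attached to the content.
        "rights": {"license": license_name},
    }


def is_intact(content, record):
    """Return True if `content` still matches the stored checksum."""
    current = hashlib.sha256(content).hexdigest()
    return current == record["preservation"]["sha256"]


original = b"Quarterly report, first draft."
record = describe_file("report.txt", original, "CC BY 4.0")

print(record["technical"]["size_bytes"])
print(is_intact(original, record))
print(is_intact(original + b" (edited)", record))
```

Storing the checksum next to the file is what lets someone check, years later, that the bytes have not silently changed.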
Structural metadata describes the relationships between several pieces of data. How are they related to each other? This matters for data produced by modern workflows, where we need to keep track of how each piece was generated and which are its data sisters, parents or children.
- Relationships of parts of resources to one another, and the techniques used to produce, or required to use, a piece of data.
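The parent, child and sister relationships can be sketched with a simple lookup table. The dataset names and the `derived_from` structure below are hypothetical, chosen only to show the idea. Real workflow tools record this lineage in their own formats.

```python
# Hypothetical relationship metadata: each dataset lists the datasets
# it was derived from (its parents).
derived_from = {
    "raw_survey": [],
    "cleaned_survey": ["raw_survey"],
    "summary_table": ["cleaned_survey"],
    "charts": ["cleaned_survey"],
}


def parents(name):
    """Datasets that `name` was generated from."""
    return list(derived_from[name])


def children(name):
    """Datasets that were generated from `name`."""
    return [other for other, sources in derived_from.items() if name in sources]


def sisters(name):
    """Other datasets that share at least one parent with `name`."""
    found = []
    for parent in derived_from[name]:
        for child in children(parent):
            if child != name and child not in found:
                found.append(child)
    return found


print(parents("summary_table"))
print(children("cleaned_survey"))
print(sisters("summary_table"))
```

Only the parents are stored here. Children and sisters are computed from them, so each relationship is recorded in a single place.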
I hope this has helped you understand a little better why metadata is necessary. In a future article, I will talk a little more about the different languages used to organise metadata.