
Table 1 GBIF Metadata Profile (GMP) implemented in the GBIF Integrated Publishing Toolkit for authoring metadata documents

From: The data paper: a mechanism to incentivize data publishing in biodiversity science

GBIF Metadata Profile (GMP) elements



A brief overview describing the dataset.


Any information that is not characterized by the other resource metadata fields.


A flexible field for including any other relevant metadata that pertains to the resource being described. This field allows EML to be extensible in that any XML-based metadata can be included in this element.


A container for multiple subfields that describe the physical or electronic address of the responsible party for a resource.


The equivalent of a 'state' in the US or province in Canada. This field is intended to accommodate the many types of international administrative areas.


This is the only identifier issued by the IPT for the metadata document; it is a persistent identifier.


A party associated with the resource. Parties have particular roles.


A single time stamp signifying the beginning of some time period.


The lower value in a range of numbers. To represent an exact number, use this field and omit the 'endRange' value.


A list of citations that form a bibliography on literature related to or used in the dataset.


The four margins (N, S, E, W) of a bounding box, or when considered in latitude-longitude pairs, the corners of the box.


Used to express a date, giving the year, month and day. The format should be one that complies with ISO standard 8601. The recommended format for EML is YYYY-MM-DD, where Y is the four-digit year, M is the two-digit month code (01-12, where January = 01), and D is the two-digit day of the month (01-31). This field can also be used to enter just the year portion of a date.


Contains the name of the character encoding. This is typically ASCII, UTF-8 or one of the other common encodings.


A single citation to use when citing the dataset.


Used for the city name of the contact associated with a particular resource.


A container element for other elements associated with collections (for example collectionIdentifier, collectionName).


The URI (LSID or URL) of the collection. In RDF, used as URI of the collection resource.


Official name of the collection in the local language.


Applicable common names, which may be general descriptions of a group of organisms, if appropriate, for example invertebrates, waterfowl.


Contains contact information for the dataset. This is the person or institution to contact with questions about the use or interpretation of the dataset.


Used for the name of the contact's country.


Describes the extent of the coverage of the resource in terms of its spatial, temporal and taxonomic extent.


The person who created the resource (not necessarily the author of this metadata about the resource).


A container element for other elements that describe the internal physical characteristics of the data object.


A wrapper for all other elements relating to a single dataset.


Used for the physical address for postal communication, for example, GBIF Secretariat, Universitetsparken 15.


Contains general textual descriptions.


Used to document domains (themes) of interest, such as climate, geology, soils or disturbances.


Contains a general description, either thematic or geographic, of the study area.


Contains general textual descriptions of research design. It can include detailed accounts of goals, motivations, theory, hypotheses, strategy, statistical design and actual work.


Provides information on how the resource is distributed. When used at the resource level, this element can provide only general information, but elements for describing connections to online systems are provided.


Defines the longitude of the eastern-most point of the bounding box that is being described.


The email address for the party. It is intended to be an internet SMTP email address, which should consist of a username followed by the @ symbol followed by the email server domain name address.


A single time stamp signifying the end of some time period.


The upper value in a range of numbers.


Information about a non-text or proprietary formatted object.


Name of the format of the data object, for example, ESRI Shapefile.


Version of the format of the data object.


Text description of the time period during which the collection was assembled, for example, 'Victorian', '1922-1932' or 'c. 1750'.


Used to provide information about funding sources for the project, such as grant and contract numbers or names and addresses of funding sources.


A general description of the range of taxa addressed in the dataset or collection.


A container for spatial information about a resource; allows a bounding box for the overall coverage (in latitude and longitude), and also allows description of arbitrary polygons with exclusions.


A short text description of a dataset's geographic areal domain. A text description is especially important to provide a geographic setting when the extent of the dataset cannot be well described by the 'boundingCoordinates'.


Can be used for the first name of the individual associated with the resource, or for any other names that are not intended to be alphabetized, as appropriate.


Dataset level to which the metadata applies; default value is 'dataset'.


Contains subfields so that a person's name can be broken down into parts.


Contains a rights management statement for the resource, or a reference to a service providing such information.


A quantitative descriptor (number of specimens, samples or batches).


A range of numbers (x to x), with the lower value representing an exact number when the higher value is omitted.


A general description of the unit of curation, for example, 'jar containing plankton sample'.


The exact number of units within the collection.


A keyword or key phrase that concisely describes the resource or is related to the resource. Each keyword field should contain one and only one keyword.


A wrapper element for the keyword and keywordThesaurus elements.


The name of the official keyword thesaurus from which the keyword was derived.


The language in which the resource (not the metadata document) is written.


Time period during which biological material was alive (for paleontological collections).


Contains the additional metadata to be included in the document. This element should be used for extending EML to include metadata that is not already available in another part of the EML specification.


The language in which the metadata (as opposed to the resource being described by the metadata) is written.


The party responsible for the creation of the metadata document.


Allows for repeated sets of elements that document a series of procedures followed to produce a data object, including text descriptions of the procedures, relevant literature, software, instrumentation, source data and any quality control measures taken.


Documents scientific methods used in the collection of this dataset. It includes information on items such as tools, instrument calibration and software.


Defines the latitude of the northern-most point of the bounding box that is being described.


The name of the data object. This often is the filename of a file in a file system or that is accessible on the network.


Contains information for accessing the resource online represented as a URL connection.


A link to associated online information, usually a website. When the party represents an organization, this is the URL to a website or other online information about the organization. If the party is an individual, it might be their personal website or other related online information about the party.


The full name of the organization that is associated with the resource. This field is intended to describe which institution or overall organization is associated with the resource being described.


Allows for text blocks to be included in EML.


Identifier for the parent collection for this sub-collection. Enables a hierarchy of collections and sub-collections to be built.


Extends associatedParty with role information and is used to document people involved in a research project by providing contact information and their role in the project.


Describes information about the responsible party's telephone (voice or fax) number.


A container element for all of the elements that allow description of the internal/external characteristics and distribution of a data object (for example, dataObject, dataFormat, distribution).


Intended to be used instead of a particular person or full organization name. If the associated person who holds the role changes frequently, then positionName would be used for consistency; for example, GBIF Data Manager.


Equivalent to a US zip code or the number used for routing to an address in other countries.


Contains information on the project in which the dataset was collected. It includes information such as project personnel, funding, study area, project design and related projects.


The date on which the resource was published.


A description of the purpose of the resource/dataset.


Provides a location for the description of actions taken to either control or assess the quality of data resulting from the associated method step.


Intended to be used for describing a range of dates and/or times. It can be used multiple times to document multiple date ranges. It allows for two 'singleDateTime' fields, the first to be used as the beginning dateTime and the second to be used as the ending dateTime of the range.


URL of the logo associated with a resource.


Used to describe the role the party had with respect to the resource. Some potential roles include technician, reviewer and principal investigator.


Description of sampling procedures, including the geographic, temporal and taxonomic coverage of the study.


Allows a text-based/human-readable description of the sampling procedures used in the research project. The content of this element would be similar to a description of sampling procedures found in the methods section of a journal article.


Intended to describe a single date and time for an event.


Defines the latitude of the southern-most point of the bounding box that is being described.


Picklist keyword indicating the process or technique used to prevent physical deterioration of non-living collections. Expected to contain an instance from the Specimen Preservation Method Type Term vocabulary.


Documents the physical area associated with the research project. It can include descriptions of the geographic, temporal and taxonomic coverage of the research location and descriptions of domains (themes) of interest, such as climate, geology, soils or disturbances.


Represents both a specific sampling area and the sampling frequency (temporal boundaries, frequency of occurrence). The geographic studyExtent is usually a surrogate for (that is, a representative area of) the larger area documented in 'studyAreaDescription'.


Used for the last name of the individual associated with the resource. This is typically the family name of an individual, for example, the name by which s/he is referred to in citations.


The name of the taxonomic rank for which the taxon rank value is provided, for example, phylum, class, genus, species.


The name representing the taxonomic rank of the taxon being described.


Information about the range of taxa addressed in the dataset or collection.


A container for taxonomic information about a resource. It includes a list of species names (or higher level ranks) from one or more classification systems.


Specifies temporal coverage, and allows coverages to be a single point in time, multiple points in time, or a range of dates.


Provides a description of the resource that is being documented that is long enough to differentiate it from other similar resources. Multiple titles may be provided, particularly when trying to express the title in more than one language (use the 'xml:lang' attribute to indicate the language if not English).


The URL of the resource that is available online.


Defines the longitude of the western-most point of the bounding box that is being described.

  1. The definitions of the elements are taken from [64, 85, 86]. Mandatory elements when authoring metadata through IPT 2.0.2+ are in bold.
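
Two of the definitions above state rules that can be checked mechanically: the recommended date format (YYYY-MM-DD, or the year alone) and the four margins of a bounding box. The following is a minimal Python sketch of such checks, not part of the GMP or the IPT. The function names `is_valid_calendar_date` and `is_valid_bounding_box` are hypothetical, and the use of decimal degrees with latitude in [-90, 90] and longitude in [-180, 180] is an assumption not stated in the table.

```python
import re
from datetime import date

# Matches either a bare four-digit year or a full YYYY-MM-DD date.
_DATE_RE = re.compile(r"(\d{4})(?:-(\d{2})-(\d{2}))?")


def is_valid_calendar_date(text: str) -> bool:
    """Return True for 'YYYY' or for 'YYYY-MM-DD' naming a real calendar day."""
    match = _DATE_RE.fullmatch(text)
    if match is None:
        return False
    year, month, day = match.groups()
    if month is None:
        return True  # year-only form, which the definition allows
    try:
        date(int(year), int(month), int(day))  # rejects e.g. 30 February
    except ValueError:
        return False
    return True


def is_valid_bounding_box(north: float, south: float,
                          east: float, west: float) -> bool:
    """Check the four margins of a bounding box (assumed decimal degrees)."""
    lat_ok = -90 <= south <= north <= 90
    # West may exceed east when a box crosses the antimeridian,
    # so only the individual longitude ranges are checked here.
    lon_ok = -180 <= west <= 180 and -180 <= east <= 180
    return lat_ok and lon_ok
```

A metadata author could run checks of this kind before submitting a document. The bounding-box check deliberately does not require west to be less than east, because a box that spans the 180° meridian legitimately violates that ordering.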