Table 1 GBIF Metadata Profile (GMP) implemented in the GBIF Integrated Publishing Toolkit for authoring metadata documents

From: The data paper: a mechanism to incentivize data publishing in biodiversity science

GBIF Metadata Profile (GMP) elements

Description

abstract

A brief overview describing the dataset.

additionalInfo

Any information that is not characterized by the other resource metadata fields.

additionalMetadata

A flexible field for including any other relevant metadata that pertains to the resource being described. This field allows EML to be extensible in that any XML-based metadata can be included in this element.

address

A container for multiple subfields that describe the physical or electronic address of the responsible party for a resource.

administrativeArea

The equivalent of a 'state' in the US or province in Canada. This field is intended to accommodate the many types of international administrative areas.

alternateIdentifier

This is the only identifier issued by the IPT for the metadata document; it is a persistent identifier.

associatedParty

A party associated with the resource. Parties have particular roles.

beginDate

A single time stamp signifying the beginning of some time period.

beginRange

The lower value in a range of numbers. To represent an exact number, omit the 'endRange' value.

bibliography

A list of citations that form a bibliography on literature related to or used in the dataset.

boundingCoordinates

The four margins (N, S, E, W) of a bounding box, or when considered in latitude-longitude pairs, the corners of the box.

calendarDate

Used to express a date, giving the year, month and day. The format should be one that complies with ISO standard 8601. The recommended format for EML is YYYY-MM-DD, where Y is the four-digit year, M is the two-digit month code (01-12, where January = 01), and D is the two-digit day of the month (01-31). This field can also be used to enter just the year portion of a date.

characterEncoding

Contains the name of the character encoding. This is typically ASCII, UTF-8 or one of the other common encodings.

citation

A single citation to use when citing the dataset.

city

Used for the city name of the contact associated with a particular resource.

collection

A container element for other elements associated with collections (for example collectionIdentifier, collectionName).

collectionIdentifier

The URI (LSID or URL) of the collection. In RDF, used as URI of the collection resource.

collectionName

Official name of the collection in the local language.

commonName

Applicable common names, which may be general descriptions of a group of organisms, if appropriate, for example invertebrates, waterfowl.

contact

Contains contact information for the dataset. This is the person or institution to contact with questions about the use or interpretation of a dataset.

country

Used for the name of the contact's country.

coverage

Describes the extent of the coverage of the resource in terms of its spatial, temporal and taxonomic extent.

creator

The person who created the resource (not necessarily the author of this metadata about the resource).

dataFormat

A container element for other elements that describe the internal physical characteristics of the data object.

dataset

A wrapper for all other elements relating to a single dataset.

deliveryPoint

Used for the physical address for postal communication, for example, GBIF Secretariat, Universitetsparken 15.

description

Contains general textual descriptions.

descriptor

Used to document domains (themes) of interest, such as climate, geology, soils or disturbances.

descriptorValue

Contains a general description, either thematic or geographic, of the study area.

designDescription

Contains general textual descriptions of research design. It can include detailed accounts of goals, motivations, theory, hypotheses, strategy, statistical design and actual work.

distribution

Provides information on how the resource is distributed. When used at the resource level, this element can provide only general information, but elements for describing connections to online systems are provided.

eastBoundingCoordinate

Defines the longitude of the eastern-most point of the bounding box that is being described.

electronicMailAddress

The email address for the party. It is intended to be an internet SMTP email address, which should consist of a username followed by the @ symbol followed by the email server domain name address.

endDate

A single time stamp signifying the end of some time period.

endRange

The upper value in a range of numbers.

externallyDefinedFormat

Information about a non-text or proprietary formatted object.

formatName

Name of the format of the data object, for example, ESRI Shapefile.

formatVersion

Version of the format of the data object.

formationPeriod

Text description of the time period during which the collection was assembled, for example, 'Victorian', '1922-1932' or 'c. 1750'.

funding

Used to provide information about funding sources for the project, such as grant and contract numbers or names and addresses of funding sources.

generalTaxonomicCoverage

A general description of the range of taxa addressed in the dataset or collection.

geographicCoverage

A container for spatial information about a resource; allows a bounding box for the overall coverage (in latitude and longitude), and also allows description of arbitrary polygons with exclusions.

geographicDescription

A short text description of a dataset's geographic areal domain. A text description is especially important to provide a geographic setting when the extent of the dataset cannot be well described by the 'boundingCoordinates'.

givenName

Can be used for the first name of the individual associated with the resource, or for any other names that are not intended to be alphabetized, as appropriate.

hierarchyLevel

Dataset level to which the metadata applies; default value is 'dataset'.

individualName

Contains subfields so that a person's name can be broken down into parts.

intellectualRights

Contains a rights management statement for the resource, or a reference to a service providing such information.

jgtiCuratorialUnit

A quantitative descriptor (number of specimens, samples or batches).

jgtiUnitRange

A range of numbers (x to x), with the lower value representing an exact number when the higher value is omitted.

jgtiUnitType

A general description of the unit of curation, for example, 'jar containing plankton sample'.

jgtiUnits

The exact number of units within the collection.

keyword

A keyword or key phrase that concisely describes the resource or is related to the resource. Each keyword field should contain one and only one keyword.

keywordSet

A wrapper element for the keyword and keywordThesaurus elements.

keywordThesaurus

The name of the official keyword thesaurus from which the keyword was derived.

language

The language in which the resource (not the metadata document) is written.

livingTimePeriod

Time period during which biological material was alive (for paleontological collections).

metadata

Contains the additional metadata to be included in the document. This element should be used for extending EML to include metadata that is not already available in another part of the EML specification.

metadataLanguage

The language in which the metadata (as opposed to the resource being described by the metadata) is written.

metadataProvider

The party responsible for the creation of the metadata document.

methodStep

Allows for repeated sets of elements that document a series of procedures followed to produce a data object, including text descriptions of the procedures, relevant literature, software, instrumentation, source data and any quality control measures taken.

methods

Documents scientific methods used in the collection of this dataset. It includes information on items such as tools, instrument calibration and software.

northBoundingCoordinate

Defines the latitude of the northern-most point of the bounding box that is being described.

objectName

The name of the data object. This is often the name of a file in a file system or of a file that is accessible on the network.

online

Contains information for accessing the resource online represented as a URL connection.

onlineUrl

A link to associated online information, usually a website. When the party represents an organization, this is the URL to a website or other online information about the organization. If the party is an individual, it might be their personal website or other related online information about the party.

organisationName

The full name of the organization that is associated with the resource. This field is intended to describe which institution or overall organization is associated with the resource being described.

para

Allows for text blocks to be included in EML.

parentCollectionIdentifier

Identifier for the parent collection for this sub-collection. Enables a hierarchy of collections and sub-collections to be built.

personnel

Extends associatedParty with role information and is used to document people involved in a research project by providing contact information and their role in the project.

phone

Describes information about the responsible party's telephone (voice or fax) number.

physical

A container element for all of the elements that allow description of the internal/external characteristics and distribution of a data object (for example, dataObject, dataFormat, distribution).

positionName

Intended to be used instead of a particular person or full organization name. If the associated person who holds the role changes frequently, then positionName would be used for consistency; for example, GBIF Data Manager.

postalCode

Equivalent to a US zip code or the number used for routing to an address in other countries.

project

Contains information on the project in which the dataset was collected. It includes information such as project personnel, funding, study area, project design and related projects.

pubDate

The date on which the resource was published.

purpose

A description of the purpose of the resource/dataset.

qualityControl

Provides a location for the description of actions taken to either control or assess the quality of data resulting from the associated method step.

rangeOfDates

Intended to be used for describing a range of dates and/or times. It can be used multiple times to document multiple date ranges. It allows for two 'singleDateTime' fields, the first to be used as the beginning dateTime and the second to be used as the ending dateTime of the range.

resourceLogoUrl

URL of the logo associated with a resource.

role

Used to describe the role the party had with respect to the resource. Some potential roles include technician, reviewer and principal investigator.

sampling

Description of sampling procedures, including the geographic, temporal and taxonomic coverage of the study.

samplingDescription

Allows a text-based/human-readable description of the sampling procedures used in the research project. The content of this element would be similar to a description of sampling procedures found in the methods section of a journal article.

singleDateTime

Intended to describe a single date and time for an event.

southBoundingCoordinate

Defines the latitude of the southern-most point of the bounding box that is being described.

specimenPreservationMethod

Picklist keyword indicating the process or technique used to prevent physical deterioration of non-living collections. Expected to contain an instance from the Specimen Preservation Method Type Term vocabulary.

studyAreaDescription

Documents the physical area associated with the research project. It can include descriptions of the geographic, temporal and taxonomic coverage of the research location and descriptions of domains (themes) of interest, such as climate, geology, soils or disturbances.

studyExtent

Represents both a specific sampling area and the sampling frequency (temporal boundaries, frequency of occurrence). The geographic studyExtent is usually a surrogate for (that is, a representative area of) the larger area documented in 'studyAreaDescription'.

surName

Used for the last name of the individual associated with the resource. This is typically the family name of an individual, for example, the name by which s/he is referred to in citations.

taxonRankName

The name of the taxonomic rank for which the taxon rank value is provided, for example, phylum, class, genus, species.

taxonRankValue

The name representing the taxonomic rank of the taxon being described.

taxonomicClassification

Information about the range of taxa addressed in the dataset or collection.

taxonomicCoverage

A container for taxonomic information about a resource. It includes a list of species names (or higher level ranks) from one or more classification systems.

temporalCoverage

Specifies temporal coverage, and allows coverages to be a single point in time, multiple points in time, or a range of dates.

title

Provides a description of the resource being documented that is long enough to differentiate it from other similar resources. Multiple titles may be provided, particularly when trying to express the title in more than one language (use the 'xml:lang' attribute to indicate the language if not English).

url

The URL of the resource that is available online.

westBoundingCoordinate

Defines the longitude of the western-most point of the bounding box that is being described.

  1. The definitions of the elements are taken from [64, 85, 86]. Mandatory elements when authoring metadata through IPT 2.0.2+ are in bold.
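
To make the coverage-related entries in Table 1 more concrete, the following is a minimal Python sketch of how the 'boundingCoordinates', 'rangeOfDates' and 'calendarDate' elements might be assembled into a 'coverage' fragment. It is an illustration under stated assumptions, not the IPT's own implementation: the helper names `is_calendar_date` and `build_coverage` are hypothetical, the nesting follows the container relationships described in the table, the coordinate range checks are an added assumption, the sample values are made up, and the output is not validated against the GMP schema.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date

# The two calendarDate forms named in Table 1: YYYY-MM-DD, or the year alone.
_CALENDAR_DATE = re.compile(r"\d{4}(-\d{2}-\d{2})?")


def is_calendar_date(text: str) -> bool:
    """Hypothetical helper: True for 'YYYY' or a real 'YYYY-MM-DD' date."""
    if not _CALENDAR_DATE.fullmatch(text):
        return False
    if len(text) == 4:  # year-only form
        return True
    try:
        date.fromisoformat(text)  # rejects impossible dates such as 2011-02-30
    except ValueError:
        return False
    return True


def build_coverage(description, west, east, north, south, begin, end):
    """Hypothetical helper: build a 'coverage' element from Table 1 element names."""
    # Assumed sanity checks; west > east is allowed for boxes that cross
    # the antimeridian, so only the value ranges are tested for longitude.
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitude must lie in [-180, 180]")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitude must satisfy -90 <= south <= north <= 90")
    for value in (begin, end):
        if not is_calendar_date(value):
            raise ValueError(f"not a calendarDate: {value!r}")

    coverage = ET.Element("coverage")

    # geographicCoverage: free-text description plus the bounding box.
    geo = ET.SubElement(coverage, "geographicCoverage")
    ET.SubElement(geo, "geographicDescription").text = description
    box = ET.SubElement(geo, "boundingCoordinates")
    for name, value in (
        ("westBoundingCoordinate", west),
        ("eastBoundingCoordinate", east),
        ("northBoundingCoordinate", north),
        ("southBoundingCoordinate", south),
    ):
        ET.SubElement(box, name).text = str(value)

    # temporalCoverage: a rangeOfDates with a begin and an end calendarDate.
    temporal = ET.SubElement(coverage, "temporalCoverage")
    date_range = ET.SubElement(temporal, "rangeOfDates")
    ET.SubElement(ET.SubElement(date_range, "beginDate"), "calendarDate").text = begin
    ET.SubElement(ET.SubElement(date_range, "endDate"), "calendarDate").text = end
    return coverage


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    fragment = build_coverage("Example study area", 12.0, 13.0, 56.0, 55.0,
                              "2010", "2011-06-30")
    print(ET.tostring(fragment, encoding="unicode"))
```

Keeping the date check separate from the XML assembly means the same validation could be reused for other date-bearing elements, such as 'pubDate', if they follow the same format.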