A detailed guide to metadata

What goes into each field, how to correctly code it and why. See also our section on PDF metadata.

Key elements used by HSE

Metadata element | HTML format, with completed examples | eGIF status
Title | <title>Title of the page</title> | Mandatory
DC Title | <meta name="DC.title" content="Title of the page" /> | Mandatory
HSE.Longtitle | <meta name="HSE.longtitle" content="wordy title text, for titles containing more than 65 characters" /> | Optional
Keywords | <meta name="keywords" content="web page, internet..." /> | Optional
Description | <meta name="description" content="This document is about..." /> | Optional
Creator | <meta name="DC.creator" content="Division, Group, A Surname" /> | Mandatory
Subject category (Integrated Public Service Vocabulary (IPSV) heading) | <meta name="DC.subject" scheme="eGMS.IPSV" content="Health and safety at work; farming" /> | Mandatory
Date | <meta name="DC.date.issued" scheme="DCTERMS.W3CDTF" content="2004-03-08" /> | Mandatory
Disposal date | <meta name="DC.disposal.review" scheme="DCTERMS.W3CDTF" content="2005-03-08" /> | Optional
Modification date | <meta name="DC.date.modified" scheme="DCTERMS.W3CDTF" content="YYYY-MM-DD" /> | Optional
HSE checked | <meta name="HSE.checked" content="2006-03-25" /> | Optional
Accessibility scheme | <meta name="eGMS.accessibility" scheme="eGMS.WCAG10" content="Double-A" /> | Mandatory
Accessibility content | <meta name="eGMS.accessibility" content='(pics-1.1 "http://www.icra.org/ratingsv02.html" l gen true for "http://www.hse.gov.uk" r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true for "http://www.hse.gov.uk" r (n 0 s 0 v 0 l 0))' /> | Mandatory (this is in the template)
Identifier | <meta name="DC.identifier" content="http://www.hse.gov.uk/folder name/file name" /> | Mandatory
Publisher | <meta name="DC.publisher" content="Health and Safety Executive" /> | Mandatory
Language | <meta name="DC.language" scheme="ISO 639-2/T" content="Eng" /> | Recommended
Coverage | <meta name="DC.coverage" content="Britain" /> | Recommended
Type | <meta name="DC.type" scheme=" " content=" " /> | Optional

Instruction manual

Purpose

This document sets out the standards for applying metadata to each file on the websites. It explains what each element means and gives examples of how metadata should be written.

Important

The website rebranding exercise, which began at the end of 2004, introduced a set of standard metadata fields at the top of each file. There are four important points to note:

  1. Some files will have been missed or not yet rebranded. The standard metadata fields must be pasted in at the top of all HTML files.
  2. Where metadata fields are present, check that all standard fields are listed and paste in any missing fields.
  3. Where metadata fields are present, these must be checked to ensure any data they contain is complete and conforms with the standards listed below.
  4. The standard metadata fields are being applied to HTML files only. For PDF files, the ‘Document information’ boxes are used instead. See the PDF metadata guide.
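
The check described in points 1 to 3 can be sketched in Python. This is a minimal illustration, not an HSE tool: the function and class names are hypothetical, and the list of mandatory names is an assumption read off the ‘Key elements used by HSE’ table. It only reports which mandatory fields are absent; it does not check that their contents conform to the standards.

```python
from html.parser import HTMLParser

# Assumption: mandatory meta names as listed in the 'Key elements used by HSE'
# table, lower-cased so the comparison ignores case.
MANDATORY_META = {
    "dc.title", "dc.creator", "dc.subject", "dc.date.issued",
    "egms.accessibility", "dc.identifier", "dc.publisher",
}


class MetaCollector(HTMLParser):
    """Collects the name attribute of every <meta> tag and notes any <title>."""

    def __init__(self):
        super().__init__()
        self.names = set()
        self.has_title = False

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are also routed here.
        if tag == "meta":
            name = dict(attrs).get("name")
            if name:
                self.names.add(name.lower())
        elif tag == "title":
            self.has_title = True


def missing_mandatory_fields(html_text):
    """Return the mandatory fields that do not appear in html_text."""
    collector = MetaCollector()
    collector.feed(html_text)
    missing = sorted(MANDATORY_META - collector.names)
    if not collector.has_title:
        missing.insert(0, "title")
    return missing
```

Calling `missing_mandatory_fields()` on the text of a file returns a list of the field names that still need to be pasted in; an empty list means every mandatory field is present.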

So that we can check which files in the site have a full set of metadata applied to them, add the following tag to each file: <meta name="HSE.checked" content="yyyy-mm-dd" /> where "yyyy-mm-dd" is the date you add the tag to the file.

A. Fixed fields

The following ‘Fixed’ fields must be present at the top of each file. The format used must be entered precisely as shown:

HTML Version

This sets out the version of HTML we are currently using. It must be entered in this format:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Language of code

NB: This is part of the standard template and will already be present. It specifies that the language in use is English. The DC.language field can be used to specify other languages where appropriate. See ‘Language of content’ below for further information. The language code should be expressed as follows:

<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">

Navigation

NB: This is part of the standard template and will already be present. These are links to key navigational aids: respectively the home page, the copyright page, the search engine and the Acronyms List. They must be entered in this format:

<link rel="home" href="../index.htm" />
<link rel="copyright" href="../copyright.htm" />
<link rel="search" href="../search.htm" />
<link rel="glossary" href="../acronym/index.htm" />

Accessibility

These are standard statements which show how our site is rated by the Internet Content Rating Association (ICRA), an “international, non-profit organization of internet leaders working to make the internet safer for children, while respecting the rights of content providers”. There are two statements and they must be entered in this format:

<meta http-equiv="PICS-Label" content='(pics-1.1 "http://www.icra.org/ratingsv02.html" l gen true for "http://www.hse.gov.uk" r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true for "http://www.hse.gov.uk" r (n 0 s 0 v 0 l 0))' />
<meta name="eGMS.accessibility" content='(pics-1.1 "http://www.icra.org/ratingsv02.html" l gen true for "http://www.hse.gov.uk" r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true for "http://www.hse.gov.uk" r (n 0 s 0 v 0 l 0))' />

Integrated Public Service Vocabulary (IPSV) heading

This is a list of Government-wide subject headings. It can be found in various formats at http://www.esd.org.uk/standards/ipsv/ and the best version to use is at http://www0.esd.org.uk/standards/ipsv/viewer/viewer.aspx

It is a mandatory field, but it is very general and not much use for occupational health and safety information. The most useful terms are under the ‘Safety’ area of ‘Health, well-being and care’. The ‘Keywords’ element allows for greater flexibility and creativity (see ‘Keywords’ below).

It should be specified in this format:
<meta name="DC.subject" scheme="eGMS.IPSV" content="Occupational health and safety" />

B. Variable fields

Title

This is the title of the document. It is probably the most important meta tag. It should be specified as it appears on the document.

The field should be expressed in this format:
<title>Wave Slap Loading on FPSO Bows</title>

Most search engines and some validation tools, such as SiteMorse, cut off or disregard long titles. A long title must therefore be truncated or reworded so that all of it is used. There are three thresholds.

Best
Less than 65 characters. Always aim for this target in the first instance.
Good
Less than 80 characters. If you can't quite squeeze the perfect title into less than 65 characters try to make it less than 80.
Acceptable
Less than 120 characters. Always ensure the title is less than 120 characters - no exceptions!
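
The three thresholds can be expressed as a simple length check. The sketch below is illustrative only; the function name and the returned labels are hypothetical. It counts every character, including spaces, and assumes an ellipsis typed as three dots counts as three characters.

```python
def classify_title_length(title):
    """Classify a page title against the 65, 80 and 120 character thresholds."""
    length = len(title)  # spaces and dots count as characters
    if length < 65:
        return "best"
    if length < 80:
        return "good"
    if length < 120:
        return "acceptable"
    return "too long"
```

For example, `classify_title_length("Wave Slap Loading on FPSO Bows")` returns `"best"`, while any title of 120 characters or more returns `"too long"` and must be shortened.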

Spaces between words and dots to indicate missing words (…) also count as characters.

When you need to truncate or summarise a title, please copy the original full title into our own specific long title tag, HSE.longtitle, as in the example below.

An example of how to code a long title:
<title>Female form manikin used to test the fire protection of personal protective equipment: RR475</title>
<meta name="HSE.longtitle" content="RR475 - The development of a ‘female’ form manikin as part of a test facility to assess the fire protection afforded by personal protective equipment" />

Summary:

Keep it short. Remove fluff. Front load keywords.

HSE.Longtitle

The problem:
Search engines discount the content of titles over 65 to 80 or so characters. Many HSE reports have very long titles, so we need to truncate them to deliver meaningful and concise content to search engines. Doing so loses valuable bibliographic information.
The solution:
The Online Team have developed an HSE meta tag called "HSE.longtitle". Put the full title in this tag when a document has an 'official' title longer than 65 characters.
Please note: Google, Yahoo and other search engines will ignore this data. We are simply storing it for when we have a custom search facility that will use it.

Keywords

These are natural language keywords and should accurately represent the content of the file. They should include synonyms, acronyms where appropriate and broader terms where appropriate. The following sources should also be checked for appropriate keywords:

It is not necessary to use the terms ‘HSE’, ‘safety’, ‘health’ or ‘occupational health’ within the keywords as these are implicit for all documents. There is no limit to the size of this metadata element, but a maximum of 25 keywords should be sufficient. Keep keywords in lower case for the sake of consistency. Certain categories of files require additional standard keywords. These files are as follows:

HSE standard keywords
File type | Standard keywords
Case studies | case study, case studies
Consultative documents | consultative document, cd series no.
Discussion documents | discussion document, dd series no.
Factory inspectorate minutes | factory inspectorate minute, fim series no.
Factory inspectorate minutes circulars | factory inspectorate minute circular, fic series no.
HSC Board agendas | hsc board agenda
HSC Board minutes | hsc board minutes
HSC Board papers | hsc board paper
HSE Board agendas | hse board agenda
HSE Board minutes | hse board minutes
HSE Board papers | hse board paper
Industry advisory committee papers | industry advisory committee paper, iac
Industry advisory committee agendas | industry advisory committee agenda, iac
Industry advisory committee minutes | industry advisory committee minutes, iac
Leaflets | include series code
Local authority circulars | local authority circular, lac series no.
Memorandum of Understanding documents | memorandum of understanding, mou
NSD Local Liaison Committee reports | local liaison committee, llc
Operational circulars | operational circular, oc series no.
Operational minutes | operational minute, om series no.
Press releases | press release, series no. for press release
Research reports | include rr, crr or hsl report no as appropriate
Sector industry minutes | sector industry minute, sim series no.

The keywords should be recorded in this format:
<meta name="Keywords" content="railways" />
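
The keyword rules above (lower case, no implicit terms, around 25 keywords at most) can be applied mechanically. The sketch below is one possible approach; the function name is hypothetical, and treating 25 as a hard cut-off is an assumption, since the guidance only says that 25 should be sufficient.

```python
import html

# Terms the guidance describes as implicit for all documents.
IMPLICIT_TERMS = {"hse", "safety", "health", "occupational health"}
MAX_KEYWORDS = 25  # assumption: treat the suggested maximum as a hard limit


def build_keywords_tag(keywords):
    """Return a Keywords meta tag built from a list of keyword strings."""
    cleaned = []
    for keyword in keywords:
        keyword = keyword.strip().lower()
        # Skip empty entries, implicit terms and duplicates.
        if keyword and keyword not in IMPLICIT_TERMS and keyword not in cleaned:
            cleaned.append(keyword)
    content = html.escape(", ".join(cleaned[:MAX_KEYWORDS]), quote=True)
    return f'<meta name="Keywords" content="{content}" />'
```

For example, `build_keywords_tag(["Railways", "HSE", " level crossings "])` returns `<meta name="Keywords" content="railways, level crossings" />`.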

Description

The description allows the user to decide if the web page is relevant for his or her needs. It is typically displayed in a list of search results. As search engines have different criteria for displaying search results, it is important to ensure that the description is kept short (maximum of 25 words) and that the key message and important keywords appear first. An example of a good description for a document on ‘Evidence demonstrating the impact of worker involvement and consultation’ is:

“effective worker involvement and consultation on health and safety has a positive impact on individual workers as a whole”

The format for the description is:
<meta name="Description" content="effective worker involvement and consultation on health and safety has a positive impact on individual workers as a whole" />
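
The 25-word limit is easy to check before a description is pasted into a file. This is a minimal sketch with hypothetical function names; it assumes a "word" is any run of characters separated by whitespace.

```python
def description_word_count(description):
    """Count whitespace-separated words in a description."""
    return len(description.split())


def is_description_short_enough(description, max_words=25):
    """Return True if the description is within the 25-word limit."""
    return description_word_count(description) <= max_words
```

The example description above contains 19 words, so it passes the check.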

eGov metadata

DC Title

The instructions for the ‘Title’ field described above apply here. The only reason for including this ‘DC’ field is that some search engines will not recognise the field without the ‘DC.’ prefix. The format to be used is:
<meta name="DC.title" content="Wave Slap Loading on FPSO Bows" />

Creator

This records the Directorate and Unit responsible for the content. It is recorded in preference to the author’s name because, when it is time to review the document, we are more likely to be able to trace the Unit than the author, who may have moved on. However, the author’s name is a useful addition if it is available. The data should be recorded in this format:
<meta name="DC.creator" content="RPD, CDS1, Morris, J" />

Date issued

This is the date the file was published on the website. It should be recorded in this format:
<meta name="DC.date.issued" scheme="W3CDTF" content="YYYY-MM-DD" />

Date modified

This is the date the file was last edited. It should be recorded in this format:
<meta name="DC.date.modified" scheme="W3CDTF" content="YYYY-MM-DD" />
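
Dates in the YYYY-MM-DD format can be generated and checked programmatically. The sketch below is illustrative; the function names are hypothetical, and the tag it produces simply copies the format shown above.

```python
from datetime import date, datetime


def is_valid_date_string(text):
    """Return True if text is a real calendar date written as YYYY-MM-DD."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    # strptime also accepts unpadded months and days (for example 2004-3-8),
    # so insist on the full ten-character form.
    return len(text) == 10


def date_modified_tag(modified):
    """Build a DC.date.modified tag from a datetime.date value."""
    return f'<meta name="DC.date.modified" scheme="W3CDTF" content="{modified.isoformat()}" />'
```

For example, `date_modified_tag(date(2004, 3, 8))` gives a tag whose content is `2004-03-08`, and `is_valid_date_string("08/03/2004")` returns `False`.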

Disposal review

This records the date the document should be removed from the website. It is usually determined by the author/owner. It should be given in this format:
<meta name="DC.disposal.review" content="YYYY-MM-DD" />

HSE checked

A custom HSE tag that records the date on which a page's metadata was last given a complete overhaul. If this tag is present, we can assume the page has all the e-GIF mandatory and recommended fields. The format to use is:
<meta name="HSE.checked" content="YYYY-MM-DD" />

Identifier

This is a unique reference which identifies the document. It can be an ISBN for a book, a press release number or a leaflet series code. Express in this format:
<meta name="DC.identifier" scheme="" content="indg401" />
<meta name="DC.identifier" scheme="" content="ISBN 0 7176 2726 8" />

If there is no unique reference, it is permissible to quote the website URL as follows:
<meta name="DC.identifier" scheme="" content="http://www.hse.gov.uk/" />

Accessibility

This element denotes the availability and usability of the resource to specific groups. On the HSE website this is expressed as:
<meta name="eGMS.accessibility" scheme="WCAG" content="Double-A" />

Publisher

Record ‘Health and Safety Executive’ as the publisher, unless the document in question has been produced by the Health and Safety Commission. The format to use is:
<meta name="DC.publisher" content="Health and Safety Executive" />

Language of content

This specifies the language that is used in the file. This is usually English, but we do have an increasing number of foreign-language publications on the site. The appropriate three-letter language codes are available in ISO 639-2/T at http://www.evertype.com/standards/iso639/iso639-en.html
The format to use is:
<meta name="DC.language" scheme="ISO 639-2/T" content="Eng" />

Type

This provides an opportunity to describe the form the file takes. The e-GMS list at http://dublincore.org/documents/dcmi-type-vocabulary/ lists a variety of possibilities. If there is nothing suitable in this list, you can use “website facility” or devise a suitable term yourself. The format for this field is as follows:
<meta name="DC.type" scheme="e-GMSTES" content="Website facility" />

Coverage

This is geographical coverage only and is recorded as:
<meta name="DC.coverage" content="Britain" />

Scheme

This is the metadata scheme in use and is always recorded as:
<meta name="DC.type" scheme="e-GMS.TES" content="Text" />
