Metadata

Metadata is required for all web content at EPA. 

Metadata should provide succinct, descriptive information about the specific page or document. Good metadata can influence and improve search results, and it gives you greater control over your content.

How do I create good metadata?

  • Write good titles and descriptions first, then add keywords as needed.
  • Use metadata fields to succinctly describe the current page or document. Do not generalize; do not describe the whole web area or the general concept of your site.
  • Know your analytics and your audiences. Look at your search logs, site traffic reports, and metadata error reports. Identify your users' current search terms and the search terms in your current page or document. Make sure the metadata repeats these important terms from your content.

What are the required metadata fields?

The required metadata fields are:

  • Title*
  • Description
  • Keywords
  • Publisher*
  • Type
  • Channel

*Title and Publisher are not listed as required metadata fields when editing content in Drupal WebCMS.  The title metadata is automatically created from the page title. However, it is still important to create good, descriptive titles as they directly influence search results. Descriptive titles also promote better link text, another important factor in search results. Publisher metadata is created when you set up your web area.  
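
As an illustration only, the required fields listed above can be checked with a short script. This is a minimal sketch, not part of Drupal WebCMS: the lowercase field names, the dictionary layout, and the sample values are all assumptions made for the example.

```python
# Hypothetical field names based on the required fields listed above.
REQUIRED_FIELDS = ("title", "description", "keywords", "publisher", "type", "channel")


def missing_fields(metadata):
    """Return the required fields that are absent or blank, in list order."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Sample record with made-up values; "keywords" is blank and two fields are absent.
page = {
    "title": "Example Page Title",
    "description": "A short summary of this specific page.",
    "keywords": "",
    "type": "example-type",
}
print(missing_fields(page))  # ['keywords', 'publisher', 'channel']
```

A check like this only confirms that fields are filled in. It cannot tell whether a title or description is actually descriptive.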

How do I create metadata for foreign language documents?

As a best practice, you should always try to create metadata in the language of the document itself. However, there are some exceptions:

  • Since most systems do not allow special characters, documents in non-Western languages such as Chinese, Japanese, or Arabic require English metadata. We suggest adding a note to your title and description fields.
    • Title example: "Document Title (Chinese Translation)"
    • Description example: "Chinese translation of..."
  • Because of the same limitation on special characters, omit accent marks and other diacritical marks, such as those used in Spanish or French.
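
One way to remove accent marks before pasting text into a metadata field is sketched below. It is an illustration using Python's standard `unicodedata` module, not an EPA tool; the function name `strip_diacritics` and the sample phrases are assumptions.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining accent marks, keeping the base letters."""
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("Protección del medio ambiente"))  # Proteccion del medio ambiente
print(strip_diacritics("Qualité de l'air"))  # Qualite de l'air
```

This approach only handles letters that decompose into a base letter and an accent. Characters with no such decomposition, such as "ß" or "ø", pass through unchanged and need to be replaced by hand.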

Why do I need to add channels and types?

Channels and types can be used to create on-demand lists of content or to filter search results, allowing users to refine and target their searches. When we reach a critical mass of good quality metadata, we can turn on new search filters for EPA content. Type metadata will also be used to create automatic lists of content and syndication feeds such as RSS, and potentially to identify ROT (redundant, outdated, or trivial) content.
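
To make the idea of an on-demand list concrete, here is a minimal sketch of filtering pages by their type and channel metadata. The function `filter_pages`, the record layout, and the placeholder values ("type-a", "channel-1") are hypothetical; they do not reflect EPA's actual taxonomy terms or search implementation.

```python
def filter_pages(pages, **criteria):
    """Return pages whose metadata matches every given field=value pair."""
    return [
        page for page in pages
        if all(page.get(field) == value for field, value in criteria.items())
    ]


# Placeholder records; the type and channel values are invented for illustration.
pages = [
    {"title": "Page one", "type": "type-a", "channel": "channel-1"},
    {"title": "Page two", "type": "type-b", "channel": "channel-1"},
    {"title": "Page three", "type": "type-a", "channel": "channel-2"},
]

matches = filter_pages(pages, type="type-a", channel="channel-1")
print([page["title"] for page in matches])  # ['Page one']
```

A page with a missing or mistyped type or channel never appears in such a list, which is why consistent tagging matters.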

In Template 4 and in One EPA Web templates, the four channels identified in the Information Architecture are now used as global navigation. Web editors for these channels will continue to improve the content of the four Channel areas as more EPA content is tagged. See: Information Architecture

Content Type is also tied to the content review schedule. Even outside of WebCMS, web owners are required to follow this procedure for reviewing, updating, and removing web content. See: Web Content Types and Review Procedure

Do I need metadata for other file types (PNG, GIF, JPG, DOC, XLS, etc.)?  

No. You are only required to complete metadata for web area homepages, basic pages, document pages, and webforms.

When you upload an image or file via the Add files and images window, keywords you enter in that window are not metadata keywords.  See: Why do I have to enter keywords twice?

However, to comply with Section 508 requirements, you should add metadata to document properties fields (where applicable) and add alt text to image files. See: GSA 508 Tutorials, Guidance, Checklists.

Do I need metadata for PDF files?

In Drupal WebCMS, PDFs can only be uploaded to document pages, which have required metadata fields. See: Metadata for Document Pages.

When you upload a PDF to the document page, keywords you enter in the upload window are not metadata keywords.  See: Why do I have to enter keywords twice?

To comply with Section 508 requirements and improve search results outside of EPA's search engine, you should add metadata to the Adobe PDF document properties.  See: GSA 508 Tutorials, Guidance, Checklists.

Outside of Drupal WebCMS, PDFs require metadata.  See: Required Metadata for PDFs (Legacy).  

Figure: Screenshot of the keywords field and page field in the PDF upload window, with the keywords "final" and "report" entered. The keyword field in the upload window does not create metadata for your document page.

Why do I have to enter keywords twice?

The keywords you enter when you upload an image file, a document file (.xls, .doc), or a PDF are not metadata keywords. They are tags that help you find and retrieve your files in Drupal WebCMS.

If I use a specific keyword will my page come up first for that search term?

Metadata does not translate directly into first place in search results. It can influence and improve search results, but it does not guarantee them.

Absolutely critical terms should be in your title or description fields. The keyword field reinforces important terms from the content of your document. If users are searching by terms that are not in your document, simply adding those terms to the keywords field has no effect. Adding too many keywords, or keywords not found in the actual content, can negatively affect your results in outside search engines, which try to filter out spam and irrelevant results.
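
The rule that keywords should be backed by the document's own wording can be checked mechanically. The sketch below is one possible approach, using a simple case-insensitive phrase match; the function name `unsupported_keywords` and the sample text are assumptions, and real search engines use more sophisticated matching.

```python
import re


def unsupported_keywords(keywords, content):
    """Return keywords that never appear in the content (case-insensitive)."""
    # Collapse runs of whitespace so phrases match across line breaks.
    normalized = re.sub(r"\s+", " ", content).lower()
    return [kw for kw in keywords if kw.strip().lower() not in normalized]


# Invented sample content and keywords, for illustration only.
content = "This final report summarizes air quality monitoring results."
keywords = ["final report", "air quality", "drinking water"]
print(unsupported_keywords(keywords, content))  # ['drinking water']
```

Any keyword this returns is a candidate for removal, or a sign that the page content should use the term your users actually search for.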

For more about influencing search results in EPA's search, see Improving Relevance Ranking in EPA Search Results.

What is search engine optimization (SEO) and how can it help me?

At a basic level, SEO is about creating good content, written for your specific user, that is highly linked to by other sites. This means understanding the search terms your users employ and using that same language in your HTML or PDF document. Your page title and metadata description should include critical search terms.

Not only is it important that your page is linked to, but it also matters that the keywords you target are used in the link text. You should write good, meaningful links on your page, and give external websites a leg up in linking to you by creating smart, content-rich titles for your pages.

SEO is not about including every possible keyword in your metadata. In fact, very few search engines include meta tags in their ranking algorithms because of this kind of spamming. Some unscrupulous SEO services promise good results by creating fake pages that flood the internet with links to your site or they "hide" text and links in the code, falsifying search results for those keywords. Meta tags are, however, very important for EPA's search engine and will affect your search results.

Why do we need metadata?

Metadata gives you more control over your content, tying it to EPA's Information Architecture (IA) and content review schedules.

It also improves your page rank in EPA's Google Search Appliance (GSA). The GSA will use the metadata to make connections between your document and the search query. However, metadata must be supported by the terms in your document to get the full benefit.

Metadata is required for any content published through the WebCMS and supports the OneEPA Web restructuring effort. Read the April 2010 Memo: Restructuring EPA's Web Site.

PDF metadata review and certification is required for pages outside of the OneEPA structure.

What are the main issues with metadata at EPA?

  • Missing: there is no metadata at all.
  • Incomplete: some fields are filled out, usually just the HTML title tag, but others are left empty.
  • Incorrect:
    • Most commonly, the title and description are the same within a single page or document.
    • The metadata is repeated, exactly, for every page or document within the entire web area or TSSMS.
    • Metadata tags are being used incorrectly, such as the subject field being used for keywords in PDF documents or non-WCMS pages using web taxonomy fields ("DC.Subject.xxxx") as open keyword fields.
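
The first three kinds of problem in this list lend themselves to an automated check. The following is a rough sketch under stated assumptions: the `audit` function, the issue labels, the three fields examined, and the sample pages are all hypothetical, and this is not EPA's metadata error report.

```python
from collections import Counter

FIELDS = ("title", "description", "keywords")


def audit(pages):
    """Map each page path to a list of issue labels."""
    def values_for(meta):
        # Treat absent and None values as empty strings.
        return [(meta.get(field) or "").strip() for field in FIELDS]

    # Count identical (title, description, keywords) sets across the web area.
    signatures = Counter(
        tuple(v.lower() for v in values_for(meta)) for meta in pages.values()
    )

    report = {}
    for path, meta in pages.items():
        values = values_for(meta)
        title, description = values[0], values[1]
        issues = []
        if not any(values):
            issues.append("missing")
        elif not all(values):
            issues.append("incomplete")
        if title and title.lower() == description.lower():
            issues.append("title equals description")
        if any(values) and signatures[tuple(v.lower() for v in values)] > 1:
            issues.append("repeated across pages")
        report[path] = issues
    return report


# Invented sample pages, for illustration only.
pages = {
    "/a": {"title": "Air Data", "description": "Air Data", "keywords": "air"},
    "/b": {"title": "Water Data", "description": "", "keywords": ""},
    "/c": {},
    "/d": {"title": "Air Data", "description": "Air Data", "keywords": "air"},
}
for path, issues in audit(pages).items():
    print(path, issues)
```

Misused tags, such as a subject field filled with keywords, are harder to detect automatically and still need a manual review.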

Check your metadata!

How are EPA's information architecture, web taxonomy, and metadata related?

Information architecture is the structure and organization of the content. The web taxonomy is how content is tagged for search and retrieval in the WebCMS and the web site.  The web taxonomy includes the metadata specification.  Read more on EPA's information architecture and web taxonomy.

In the web taxonomy, why is there no "general public" term in the audience facet?

Labeling content for the "general public" does not add value to EPA content. All of our public access pages, unless otherwise specified, are for the general public.

What we want to do is note any pages that are for a specific subset of the general public, such as Kids or Businesses. Having that metadata will allow us to create content collections segmented to those audiences.
