Metadata for Document Pages

Metadata is required for any page published through Drupal WebCMS.  

Creating Document Pages

  • Document pages are intended to have a 1:1 relationship with PDFs: one PDF per document page. Metadata for the document page is written for the PDF file.
  • If your PDFs are closely related as a collection, you can put more than one on the page. Metadata for a PDF collection describes the set of documents.
  • Do not create document pages for multiple PDFs that have no relationship to each other. Doing so makes search engines return confusing and inaccurate results, creating a poor experience for our audience.

See: Document Pages - How We Publish PDFs in Drupal

Title

Drupal WebCMS automatically creates the metadata title from the page title, so think carefully about your page title. Provide context and write good, descriptive titles, such as "Green Chill Webinar" rather than "Webinar." Titles directly influence search results. Descriptive titles also promote better link text, another important factor in search results. See: Improving Page Rank in the Google Search Appliance

Page Title appears in:

  • Drupal URLs
  • Microsite breadcrumbs
  • the HTML title tag (<title>)
  • dynamic lists of related content

Do:

  • For complete PDFs with official titles, write the complete, official document title in the page title.
  • For segmented PDFs with official titles, write the complete, official document title in the page title. Include chapter or section information, as needed.
  • For PDF collections or PDFs without official titles, write descriptive titles incorporating critical search terms.
    • Briefly summarize the document content.
    • Include details such as the type of document (letter, memo, etc.), the date of the document, and/or its purpose and identification numbers (permit, review, etc.).
    • Use the body field to provide additional, critical text describing the document's content, as needed.

Do Not:

  • Do not title document pages with generic, general titles such as "[Web Area] Publications" or "[Web Area] Resources".
    • Do not create PDF collection pages of all documents or all publications for your web area.
  • Do not use acronyms in titles unless you spell them out in the body field and/or in the metadata keywords.

Title Examples

| PDF type | Title | Why this is good |
| --- | --- | --- |
| Complete PDF with official document title | Donna Reservoir and Canal Site Update | This is the official title of a factsheet. The body field will include the location (city, state), the date of the factsheet, and a brief summary. |
| Segmented PDFs with official titles | Guidelines for Water Reuse, September 2004, Chapters 1-8 | Includes the official title "Guidelines for Water Reuse" as well as important information about the published edition and the PDF segments. |
| PDF Collections | Region 3 (Mid-Atlantic) Greenhouse Gas Inventory Reports | Is specific about the type of reports on the document page. Includes critical search terms "greenhouse gases", "region 3" and "mid-atlantic." |
| PDF Collections | Memos and Technical Documents for Prasa Adjuntas Wastewater Treatment Plant, NPDES Application No. PR0020214 | Is specific about the types of documents on the page, "memos" and "technical". Includes critical search terms: the company name and the identification number. |
| PDFs without official titles | Kickoff of FY 2014 National Program Manager (NPM) Guidance Development: Memo from the Chief Financial Officer | Uses text from the actual document, "kickoff of FY 2014", to describe the content. Describes the document type and its author: "memo", "chief financial officer". Includes critical search terms: "national program manager guidance", "NPM". |
| PDFs without official titles | Superfund Site Update Presentation: Aircraft Components Superfund Site, Benton Harbor, MI | Describes the document type: "presentation". Includes critical search terms: the site name and the location. |
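
The title guidance above can be turned into a rough pre-publication check. The following is a minimal sketch, not part of Drupal WebCMS: the function name `title_warnings`, the word-count thresholds, and the capital-letter acronym heuristic are all assumptions made for illustration.

```python
import re

# Endings the guidance calls out as generic, as in "[Web Area] Publications".
GENERIC_ENDINGS = {"publications", "resources"}


def title_warnings(title: str) -> list[str]:
    """Return warnings for a proposed document page title (heuristic only)."""
    warnings = []
    words = re.findall(r"[A-Za-z0-9']+", title)

    # One- or two-word titles such as "Webinar" rarely carry enough context.
    if len(words) < 3:
        warnings.append("Title is very short; add context and search terms.")

    # Short titles ending in a generic word, e.g. "Air Publications".
    if words and len(words) <= 3 and words[-1].lower() in GENERIC_ENDINGS:
        warnings.append("Title looks generic; describe the specific documents.")

    # Treat runs of two or more capital letters as possible acronyms
    # (an assumption; it will also flag state abbreviations such as "MI").
    acronyms = [w for w in words if re.fullmatch(r"[A-Z]{2,}", w)]
    if acronyms:
        warnings.append(
            "Spell out in the body field or keywords: " + ", ".join(acronyms)
        )
    return warnings
```

With this sketch, "Green Chill Webinar" and "Donna Reservoir and Canal Site Update" produce no warnings, while "Webinar" is flagged as too short. A human editor still has to judge whether the title is actually descriptive.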

Description

Do:

  • Write a short statement, one to two sentences long, describing the document's content, highlighting key concepts or issues.
    • "This PDF is..." is a poor description; "Includes information about water pollution..." is much better (includes two key search terms).
  • Include important search terms not already in your title.
  • Spell out any acronyms used in the title.

Do Not:

  • Do not exactly copy the title or write the same description for every document in a set.

Description Examples:

| Rating | Description | Why |
| --- | --- | --- |
| Good | The 2004 edition provides explanations of major water reuse application types: urban, industrial, agricultural, environmental, recreational, groundwater recharge, and augmentation of potable supplies. | Short, yet provides context to the document and also incorporates critical search terms pulled from the chapter subheadings ("Chapter 2.2 Industrial Reuse"). |
| Bad | Guidelines for Water Reuse. | Copies the title field. |
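
The description rules lend themselves to a similar automated check. This is a hedged sketch only: `description_warnings` is a hypothetical helper, and the sentence splitter is a naive assumption that will miscount text containing abbreviations such as "e.g.".

```python
import re


def _normalize(text: str) -> str:
    """Lowercase and collapse punctuation so near-identical strings compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def description_warnings(title: str, description: str) -> list[str]:
    """Return warnings for a metadata description (heuristic only)."""
    warnings = []

    # The description should not exactly copy the title.
    if _normalize(description) == _normalize(title):
        warnings.append("Description copies the title.")

    # "This PDF is..." wastes space that could hold search terms.
    if _normalize(description).startswith("this pdf"):
        warnings.append("Start with the content, not 'This PDF is...'.")

    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", description.strip()) if s]
    if len(sentences) > 2:
        warnings.append("Description is longer than two sentences.")
    return warnings
```

Applied to the examples in the table, the "Bad" description is flagged for copying the title and the "Good" description passes without warnings.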

Keywords

[Screenshot: the keyword field on the PDF upload window, including the words "final" and "report", which create search and retrieval tags in Drupal WebCMS. Caption: The keyword field on the upload window does not create metadata for your document page.]

When you upload a PDF to your document page, the keyword field in the upload window is not the metadata keywords field. Keywords entered there act as tags that help you find and retrieve your PDF file in Drupal WebCMS. Metadata keywords, by contrast, are the important terms your users employ to search for your document.

Do:

  • Pull keywords from the actual text of your PDF document.
    • Look for terms in the headings, table of contents, introductory paragraphs, etc.
    • Remember, your most important terms belong in the title and/or description.
  • Be selective with keywords. In most cases, ten or fewer keywords per document are sufficient.
  • Use commas to separate keywords.

Do Not:

  • Do not repeat terms from the title or description.
  • Do not create keywords for every possible combination of terms, or for capitalization, plurals, etc.
  • Do not use the same keywords for every document in a web area.
  • Do not use general terms, such as “EPA” and “environment”.

Keyword Examples:

| Rating | Keywords | Why? |
| --- | --- | --- |
| Good | water reclamation, water supplies, freshwater, effluent discharges, EPA-625-R-04-108 | Includes important search terms not used in the title or description, such as "freshwater" and the EPA publication number. |
| Bad | EPA, water, environment | Uses general terms that do not help users searching for this specific document. |
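
The keyword rules (comma-separated, ten or fewer, no general terms, no repeats from the title or description) can be sketched as a simple check. The helper name `keyword_warnings` and the substring comparison are assumptions for illustration; the generic-term list contains only the two examples given in the guidance.

```python
# General terms named in the guidance; extend as needed.
GENERIC_TERMS = {"epa", "environment"}
MAX_KEYWORDS = 10


def keyword_warnings(keywords: str, title: str, description: str) -> list[str]:
    """Check a comma-separated keyword string against the guidance (heuristic)."""
    terms = [k.strip() for k in keywords.split(",") if k.strip()]
    warnings = []

    if len(terms) > MAX_KEYWORDS:
        warnings.append(f"{len(terms)} keywords; ten or fewer is usually enough.")

    # Simple substring comparison against the title and description text.
    context = f"{title} {description}".lower()
    for term in terms:
        if term.lower() in GENERIC_TERMS:
            warnings.append(f"'{term}' is too general.")
        elif term.lower() in context:
            warnings.append(f"'{term}' already appears in the title or description.")
    return warnings
```

For the "Bad" row above, checked against the water reuse title and description, this flags all three keywords: "EPA" and "environment" as too general, and "water" as already present in the title.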

Type

Metadata tag: “DC.type”

Indicates the type of information that your page contains and ties it to EPA’s content review schedule. Read more about Type in Metadata.

Do:

  • Choose only one type for each page.
  • Read the scope notes below to determine the type best describing the majority of content on your page.
    • Landing pages, index pages, and/or home pages are most likely to be “Collections & Lists”. These pages typically provide very little in-depth content.
    • If your page is a factsheet that includes some contact information, the type is "Overviews & Factsheets" not "Contact Information".
    • If you have written guidance that includes a short introduction or overview of the issue, your type is “Policies & Guidance” not “Overviews & Factsheets.”

Do Not:

  • Do not apply "Overviews & Factsheets" to every page in a web area.

Types:

| Type | Scope Note |
| --- | --- |
| Announcements & Schedules | News, news releases, calendars, comment schedules, meeting agendas, Requests for Proposals, job announcements, etc. |
| Collections & Lists | Lists of links, bibliographies, recommended resource lists, hubs, etc. |
| Contact Information | A list of the addresses, phone/FAX numbers, and affiliations of a specific individual, groups of people, companies, organizations, publications, etc. May include additional information such as professional titles or credentials. |
| Data & Tools | Models, methods, maps, data files, databases, glossaries, software, tutorials, etc. |
| Overviews & Factsheets | Factsheets, Frequent Questions pages, Basic Information pages, etc. |
| Policies & Guidance | Internal and external policies, guidance and guidelines related to agency operations and/or regulatory compliance & enforcement. Includes proposed rules, MOUs, Judicial Decisions, International Agreements, etc. |
| Reports & Assessments | In-depth information, toxicity assessments, budgets, strategic plans, conference proceedings, etc. |
| Speeches, Testimony & Transcripts | A written record of dictated or recorded speech. Includes correspondence. |
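
To illustrate how a single type value might end up in the page's "DC.type" metadata tag, here is a minimal sketch. The exact markup that Drupal WebCMS emits is an assumption, and `type_meta_tag` is a hypothetical helper, not a WebCMS function; only the tag name and the eight type names come from this page.

```python
from html import escape

# The eight types listed in the table above.
PAGE_TYPES = (
    "Announcements & Schedules",
    "Collections & Lists",
    "Contact Information",
    "Data & Tools",
    "Overviews & Factsheets",
    "Policies & Guidance",
    "Reports & Assessments",
    "Speeches, Testimony & Transcripts",
)


def type_meta_tag(page_type: str) -> str:
    """Render one DC.type meta tag; a page takes exactly one type."""
    if page_type not in PAGE_TYPES:
        raise ValueError(f"Unknown type: {page_type!r}")
    # escape() turns "&" into "&amp;" so the attribute value is valid HTML.
    return f'<meta name="DC.type" content="{escape(page_type)}" />'
```

Accepting a single string rather than a list mirrors the rule that only one type can be chosen for each page.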

Channel (Metadata tag: “DC.Subject.epachannel”)

Channels are content distribution and publication channels for the top level of EPA’s Information Architecture. Find more information about Channels in Metadata or view EPA Channels in the Web Taxonomy.

Do:

  • Read the scope notes below to determine the channel that best fits your content.
  • Select at least one channel for every page. You can select multiple channels.
  • Apply the channel that best describes the majority of your content.
    • If your page has both scientific and regulatory content, apply both channels. If you have educational content, scientific content, and information about the Agency, apply three channels.
    • If all four channels apply, you may want to re-think the content on your page.

Do Not:

  • Do not apply "Learn the Issues" to every page unless that content is actually appropriate to that channel. Content that is specific to "Laws & Regulations" should not also be tagged "Learn the Issues."

EPA Channels

| Channel Name | Scope Notes |
| --- | --- |
| Laws & Regulations | Materials and content related to the legal and regulatory responsibilities and programs of the agency. Including, but not limited to, compliance and enforcement activities, guidance, regulatory development, permitting programs, etc. |
| Science & Technology | Materials, tools, and content related to the scientific, technical and research activities of the agency. Including, but not limited to, methods, models, research programs and plans, laboratories, software and databases, science products, etc. |
| Learn the Issues | Educational and consumer information as well as general or basic information related to all topics. Including, but not limited to, health and safety information, environmental emergency information and contacts, household management information (e.g. energy efficiency, recycling and waste reduction, chemical use and storage info, etc.), local information, etc. |
| About EPA | Information about the agency itself. Including, but not limited to, information about its leadership, its organization, its budget, its strategic plans, etc. |
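
The channel selection rules (at least one channel, several allowed, all four is a warning sign) can be sketched as follows. `check_channels` is a hypothetical helper written for illustration; it does not reflect how Drupal WebCMS validates the field.

```python
# The four channels listed in the table above.
CHANNELS = (
    "Laws & Regulations",
    "Science & Technology",
    "Learn the Issues",
    "About EPA",
)


def check_channels(selected: list[str]) -> list[str]:
    """Return warnings for a page's channel selection."""
    unknown = [c for c in selected if c not in CHANNELS]
    if unknown:
        raise ValueError(f"Unknown channel(s): {unknown}")

    warnings = []
    chosen = set(selected)  # ignore accidental duplicates
    if not chosen:
        warnings.append("Select at least one channel.")
    if len(chosen) == len(CHANNELS):
        warnings.append("All four channels apply; consider re-thinking the page content.")
    return warnings
```

A page with both scientific and regulatory content, tagged "Science & Technology" and "Laws & Regulations", passes without warnings.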

Taxonomy Topics and Facets (multiple metadata tags)

The EPA Web Taxonomy allows audiences easy access to relevant information from EPA programs, by using a common vocabulary to describe EPA web content. The web taxonomy is organized into multiple facets, arranged hierarchically (see table below).

Do:

  • Look up terms and descriptions first in the Web Taxonomy or search across all facets and topics in the Web Taxonomy Search.
  • Choose terms that are as broad or narrow as the content dictates.
  • Choose terms that describe a significant portion of the content.

Do Not:

  • Do not leave all topics and facets blank. At least one topic or one facet should apply to the page.
  • Do not choose terms that are only somewhat related to the page content or are about the web area in general.

Table of Taxonomy Topics and Facets

| Topic/Facet | Name | Metadata Tag | Subtopics and facets |
| --- | --- | --- | --- |
| Topic | Cooperation and Assistance | DC.Subject.epacat | Advising & consulting, community assistance, environmental justice, financial assistance, international cooperation, partnerships |
| Topic | Emergencies and Cleanup | DC.Subject.epaect | Cleanup processes, cleanup sites, accidents, emergency management, natural disasters |
| Topic | Environmental Media | DC.Subject.epaemt | Air, soils & land, species, water, wastewater, water pollution |
| Topic | Health | DC.Subject.epahealth | Human health conditions or concerns, food safety, health effects, special populations |
| Topic | Pollution Prevention | DC.Subject.epappt | Conservation, energy efficiency, fuel economy, pollution prevention, renewable energy, sustainable development, waste reduction |
| Topic | Regulatory and Industrial | DC.Subject.eparit | Compliance & enforcement, permitting programs, regulated facilities, regulatory development, substances management |
| Topic | Research, Analysis and Technology | DC.Subject.eparat | Environmental technology, research & analysis |
| Topic | EPA Operations | DC.Subject.epaopt | Budget, facilities management, human resources management, information management, legal services, legislative & intergovernmental relations, standards for government conduct, technology management, travel |
| Facet | Audience | DC.audience | Community organizers & educators, concerned citizens & students, kids, regulated community, research & technology community |
| Facet | Geographic Locations | DC.coverage | International regions, United States, Territories, Water Bodies |
| Facet | Substances | DC.Subject.epasubstance | Chemicals, consumer products, fuels, human health disruptors, munitions, pesticides, pollutants & contaminants, radiation & radioactive substances, wastes |
| Facet | Environmental Laws, Regulations and Treaties | DC.Subject.eparegulation | Executive orders, judicial decisions, regulations, statutes, treaties & agreements |
| Facet | Industries | DC.Subject.epaindustry | Agriculture; banking; construction; manufacturing; mining, quarrying, and oil and gas extraction; real estate; service industries; transportation and warehousing; utilities; waste management & remediation |
| Facet | Agency Function | DC.subject.epabrm | Management of government resources, mode of delivery, services for citizens, support delivery of services |
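
Because each topic and facet maps to its own metadata tag, the selections for a page can be thought of as a lookup from name to tag. The sketch below shows this for a few rows of the table only. The helper `taxonomy_meta_tags`, the comma-joined `content` value, and the rendered markup are assumptions for illustration, not the documented WebCMS output.

```python
from html import escape

# Partial mapping taken from the table above (a subset, for brevity).
TAXONOMY_TAGS = {
    "Environmental Media": "DC.Subject.epaemt",
    "Health": "DC.Subject.epahealth",
    "Substances": "DC.Subject.epasubstance",
    "Audience": "DC.audience",
    "Geographic Locations": "DC.coverage",
}


def taxonomy_meta_tags(selections: dict[str, list[str]]) -> list[str]:
    """Render one meta tag per topic or facet that has terms selected."""
    # At least one topic or facet should apply to the page.
    if not any(selections.values()):
        raise ValueError("Select at least one topic or facet term.")

    tags = []
    for name, terms in selections.items():
        if not terms:
            continue  # skip topics or facets left blank
        tag = TAXONOMY_TAGS[name]  # raises KeyError for names not in the mapping
        content = escape(", ".join(terms))  # separator is an assumption
        tags.append(f'<meta name="{tag}" content="{content}" />')
    return tags
```

For example, selecting "water" and "wastewater" under Environmental Media yields a single tag named DC.Subject.epaemt, and topics left blank produce no tag at all.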
