Metadata for Document Pages
Metadata is required for any page published through Drupal WebCMS.
On this page:
- Creating Document Pages
- Title
- Required metadata fields:
- Taxonomy
Creating Document Pages
- Document pages are intended to have a 1:1 relationship with PDFs: one PDF per document page. Metadata for the document page is written for the PDF file.
- If your PDFs are closely related and form a collection, you can put more than one on the page. Metadata for a PDF collection describes the set of documents.
- Do not create document pages for multiple PDFs that have no relationship to each other. That would make search engines return confusing and inaccurate results, creating a poor experience for our audience.
See: Document Pages - How We Publish PDFs in Drupal
Title
Title is not a required metadata field when you edit content in Drupal WebCMS. Drupal WebCMS automatically creates the metadata title from the page title. The page title is repeated in microsite breadcrumbs, in the HTML title tag (<title>), and in URLs.
However, it is still important to create good, descriptive titles as they directly influence search results. Descriptive titles also promote better link text, another important factor in search results. See: Improving Page Rank in the Google Search Appliance
Do:
- For complete PDFs with official titles, write the complete, official document title in the page title.
- For segmented PDFs with official titles, write the complete, official document title in the page title. Include chapter or section information, as needed.
- For PDF collections or PDFs without official titles, write descriptive titles incorporating critical search terms.
- Briefly summarize the document content.
- Include details such as the type of document (letter, memo, etc.), the date of the document, and/or its purpose and identification numbers (permit, review, etc.).
- Use the body field to provide additional, critical text describing the document's content, as needed.
Do Not:
- Do not give document pages generic titles such as "[Web Area] Publications" or "[Web Area] Resources."
- Do not create PDF collection pages of all documents or all publications for your web area.
- Do not use acronyms in titles unless you spell them out in the body field and/or in the metadata keywords.
Title Examples:
| PDF type | Title | Why this is good |
|---|---|---|
| Complete PDF with official document title | Donna Reservoir and Canal Site Update | This is the official title of a factsheet. The body field will include the location (city, state), the date of the factsheet, and a brief summary. |
| Segmented PDFs with official titles | Guidelines for Water Reuse, September 2004, Chapters 1-8. | Includes the official title "Guidelines for Water Reuse" as well as important information about the published edition and the PDF segments. |
| PDF Collections | Region 3 (Mid-Atlantic) Greenhouse Gas Inventory Reports | Is specific about the type of reports on the document page. Includes critical search terms "greenhouse gases", "region 3" and "mid-atlantic." |
| PDF Collections | Memos and Technical Documents for Prasa Adjuntas Wastewater Treatment Plant, NPDES Application No. PR0020214 | Is specific about the types of documents on the page, "memos" and "technical". Includes critical search terms: the company name and the identification number. |
| PDFs without official titles | Kickoff of FY 2014 National Program Manager (NPM) Guidance Development: Memo from the Chief Financial Officer | Uses text from the actual document "kickoff of FY 2014" to describe the content. Describes the document type and its author: "memo", "chief financial officer". Includes critical search terms: "national program manager guidance", "NPM". |
| PDFs without official titles | Superfund Site Update Presentation: Aircraft Components Superfund Site, Benton Harbor, MI | Describes the document type: "presentation". Includes critical search terms: the site name and the location. |
Description
Do:
- Write a short statement, one to two sentences long, describing the document's content, highlighting key concepts or issues.
- "This PDF is..." is a poor description; "Includes information about water pollution..." is much better (includes two key search terms).
- Include important search terms not already in your title.
- Spell out any acronyms used in the title.
Do Not:
- Do not exactly copy the title or write the same description for every document in a set.
Description Examples:
| Rating | Description | Why |
|---|---|---|
| Good | The 2004 edition provides explanations of major water reuse application types: urban, industrial, agricultural, environmental, recreational, groundwater recharge, and augmentation of potable supplies. | Short, yet provides context to the document and also incorporates critical search terms pulled from the chapter subheadings ("Chapter 2.2 Industrial Reuse"). |
| Bad | Guidelines for Water Reuse. | Copies the title field. |
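
The description rules above can be turned into a rough pre-publication check. The following is a minimal sketch, not part of Drupal WebCMS: the function name `check_description` and its messages are hypothetical, and the sentence count is a simple heuristic that abbreviations such as "e.g." can confuse.

```python
import re


def check_description(title: str, description: str) -> list[str]:
    """Return the problems found in a description; an empty list means none were detected."""
    problems = []
    # Compare case-insensitively and ignore a trailing period.
    normalized_title = title.strip().rstrip(".").lower()
    normalized_description = description.strip().rstrip(".").lower()

    if normalized_description == normalized_title:
        problems.append("copies the title")
    if normalized_description.startswith("this pdf is"):
        problems.append('opens with "This PDF is..."')

    # Rough sentence count: split on end punctuation followed by whitespace.
    sentences = [s for s in re.split(r"[.!?]+\s+", description.strip()) if s]
    if len(sentences) > 2:
        problems.append("longer than two sentences")
    return problems
```

Run against the "Bad" example above, `check_description("Guidelines for Water Reuse", "Guidelines for Water Reuse.")` reports that the description copies the title.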
Keywords
When you upload a PDF to your Document page, the keyword field in the upload window is not the metadata keywords field. Use the upload-window keywords as tags to help find and retrieve your PDF file in Drupal WebCMS. Metadata keywords are the important terms your users employ to search for your document.
Do:
- Pull keywords from the actual text of your PDF document.
- Look for terms in the headings, table of contents, introductory paragraphs, etc.
- Remember, your most important terms belong in the title and/or description.
- Be selective with keywords. In most cases, ten or fewer keywords per document are sufficient.
- Use commas to separate keywords.
Do Not:
- Do not repeat terms from the title or description.
- Do not create keywords for every possible combination of terms, or for capitalization, plurals, etc.
- Do not use the same keywords for every document in a web area.
- Do not use general terms, such as “EPA” and “environment.”
Keyword Examples:
| Rating | Keywords | Why? |
|---|---|---|
| Good | water reclamation, water supplies, freshwater, effluent discharges, EPA-625-R-04-108 | Includes important search terms not used in the title or description, such as "freshwater" and the EPA publication number. |
| Bad | EPA, water, environment | Uses general terms that do not help users searching for this specific document. |
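
The keyword rules above (comma-separated, ten or fewer, no general terms, no repeats of the title or description) can be sketched as a small check. This is an illustration only: `check_keywords`, `GENERIC_TERMS`, and `MAX_KEYWORDS` are hypothetical names, the generic-term list holds just the two examples named in this guidance, and the repeat test is a plain substring match, which is deliberately crude.

```python
GENERIC_TERMS = {"epa", "environment"}  # the two examples from the guidance; extend as needed
MAX_KEYWORDS = 10  # "ten or fewer" is a guideline, not a hard limit


def parse_keywords(raw: str) -> list[str]:
    """Split a comma-separated keyword string and drop empty entries."""
    return [k.strip() for k in raw.split(",") if k.strip()]


def check_keywords(raw: str, title: str, description: str) -> list[str]:
    """Return the problems found in a keyword string; an empty list means none were detected."""
    keywords = parse_keywords(raw)
    problems = []
    if len(keywords) > MAX_KEYWORDS:
        problems.append(f"{len(keywords)} keywords; ten or fewer is usually enough")

    context = f"{title} {description}".lower()
    for keyword in keywords:
        lowered = keyword.lower()
        if lowered in GENERIC_TERMS:
            problems.append(f"'{keyword}' is too general")
        elif lowered in context:
            # Substring match: flags any keyword already present in the title or description.
            problems.append(f"'{keyword}' repeats the title or description")
    return problems
```

With the title and description from the water reuse examples, the "Good" keyword row passes cleanly, while the "Bad" row is flagged for two general terms and for repeating "water" from the title.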
Type
Metadata tag: “DC.type”
Indicates the type of information that your page contains and ties it to EPA’s content review schedule. Read more about Type in the Metadata FAQs.
Do:
- You can choose only one type for each page.
- Read the scope notes below to determine the type best describing the majority of content on your page.
- Landing pages, index pages, and/or home pages are most likely to be “Collections & Lists”. These pages typically provide very little in-depth content.
- If your page is a factsheet that includes some contact information, the type is "Overviews & Factsheets", not "Contact Information".
- If you have written guidance that includes a short introduction or overview of the issue, your type is “Policies & Guidance” not “Overviews & Factsheets.”
Do Not:
- Do not apply "Overviews & Factsheets" to every page in a web area.
Types:
| Type | Scope Note |
|---|---|
| Announcements & Schedules | News, news releases, calendars, comment schedules, meeting agendas, Requests for Proposals, job announcements, etc. |
| Collections & Lists | Lists of links, bibliographies, recommended resource lists, hubs, etc. |
| Contact Information | A list of the addresses, phone/FAX numbers, and affiliations of a specific individual, groups of people, companies, organizations, publications, etc. May include additional information such as professional titles or credentials. |
| Data & Tools | Models, methods, maps, data files, databases, glossaries, software, tutorials, etc. |
| Overviews & Factsheets | Factsheets, Frequent Questions pages, Basic Information pages, etc. |
| Policies & Guidance | Internal and external policies, guidance and guidelines related to agency operations and/or regulatory compliance & enforcement. Includes proposed rules, MOUs, Judicial Decisions, International Agreements, etc. |
| Reports & Assessments | In-depth information, toxicity assessments, budgets, strategic plans, conference proceedings, etc. |
| Speeches, Testimony & Transcripts | A written record of dictated or recorded speech. Includes correspondence. |
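
Because each page takes exactly one type from a fixed list, the choice is easy to validate before it is published. The sketch below is an assumption-laden illustration: the function name `type_meta_tag` is hypothetical, and the `<meta>` markup it returns is a generic HTML form for the "DC.type" tag, not necessarily the exact markup Drupal WebCMS emits.

```python
from html import escape

# The eight types listed in the table above.
DOCUMENT_TYPES = (
    "Announcements & Schedules",
    "Collections & Lists",
    "Contact Information",
    "Data & Tools",
    "Overviews & Factsheets",
    "Policies & Guidance",
    "Reports & Assessments",
    "Speeches, Testimony & Transcripts",
)


def type_meta_tag(page_type: str) -> str:
    """Return a DC.type meta element for a single, recognized type."""
    if page_type not in DOCUMENT_TYPES:
        raise ValueError(f"Unknown type: {page_type!r}")
    # Assumed markup: a plain HTML meta element with the ampersand escaped.
    return f'<meta name="DC.type" content="{escape(page_type, quote=True)}" />'


print(type_meta_tag("Policies & Guidance"))
# prints: <meta name="DC.type" content="Policies &amp; Guidance" />
```

A misspelled value such as "Overview & Factsheets" raises `ValueError` instead of silently producing a tag that matches no scope note.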
Channel (Metadata tag: “DC.Subject.epachannel”)
Channels are content distribution and publication channels for the top level of EPA’s Information Architecture. Find more information about Channels in the Metadata FAQs (being revised) or view EPA Channels in the Web Taxonomy.
Do:
- Read the scope notes below to determine the channel that best fits your content.
- Select at least one channel for every page. You can select multiple channels.
- Apply the channel that best describes the majority of your content.
- If your page has both scientific and regulatory content, apply both channels. If you have educational content, scientific content, and information about the Agency, apply three channels.
- If all four channels apply, you may want to re-think the content on your page.
Do Not:
- Do not apply "Learn the Issues" to every page unless that content is actually appropriate to that channel. Content that is specific to "Laws & Regulations" should not also be tagged "Learn the Issues."
EPA Channels:
| Channel Name | Scope Notes |
|---|---|
| Laws & Regulations | Materials and content related to the legal and regulatory responsibilities and programs of the agency. Including, but not limited to, compliance and enforcement activities, guidance, regulatory development, permitting programs, etc. |
| Science & Technology | Materials, tools, and content related to the scientific, technical and research activities of the agency. Including, but not limited to, methods, models, research programs and plans, laboratories, software and databases, science products, etc. |
| Learn the Issues | Educational and consumer information as well as general or basic information related to all topics. Including, but not limited to, health and safety information, environmental emergency information and contacts, household management information (e.g. energy efficiency, recycling and waste reduction, chemical use and storage info, etc.), local information, etc. |
| About EPA | Information about the agency itself. Including, but not limited to, information about its leadership, its organization, its budget, its strategic plans, etc. |
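
The channel rules above (at least one channel, several allowed, all four a warning sign) can be sketched as a simple check. The function name `check_channels` and its messages are hypothetical and only illustrate the rules; they do not describe how Drupal WebCMS validates the field.

```python
# The four channels listed in the table above.
CHANNELS = ("Laws & Regulations", "Science & Technology", "Learn the Issues", "About EPA")


def check_channels(selected: list[str]) -> list[str]:
    """Return the problems found in a channel selection; an empty list means none were detected."""
    problems = []
    unknown = [name for name in selected if name not in CHANNELS]
    if unknown:
        problems.append("unknown channel(s): " + ", ".join(unknown))

    chosen = {name for name in selected if name in CHANNELS}
    if not chosen:
        problems.append("select at least one channel")
    elif len(chosen) == len(CHANNELS):
        problems.append("all four channels selected; consider re-thinking the page content")
    return problems
```

A page with both scientific and regulatory content, `check_channels(["Laws & Regulations", "Science & Technology"])`, returns an empty list.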
Taxonomy Topics and Facets (multiple metadata tags)
The EPA Web Taxonomy allows audiences easy access to relevant information from EPA programs, by using a common vocabulary to describe EPA web content. The web taxonomy is organized into multiple facets, arranged hierarchically (see table below).
Do:
- Look up terms and descriptions first in the Web Taxonomy, or search across all facets and topics in the Web Taxonomy Search.
- Choose terms that are as broad or narrow as the content dictates.
- Choose terms that describe a significant portion of the content.
Do Not:
- Do not leave all topics and facets blank. At least one topic or one facet should apply to the page.
- Do not choose terms that are only somewhat related to the page content or are about the web area in general.
Table of Taxonomy Topics and Facets
| Topic/Facet | Name and Link | Metadata Tag | Subtopics and facets |
|---|---|---|---|
| Topic | | DC.Subject.epacat | Advising & consulting, community assistance, environmental justice, financial assistance, international cooperation, partnerships |
| Topic | | DC.Subject.epaect | Cleanup processes, cleanup sites, accidents, emergency management, natural disasters |
| Topic | | DC.Subject.epaemt | Air, soils & land, species, water, wastewater, water pollution |
| Topic | | DC.Subject.epahealth | Human health conditions or concerns, food safety, health effects, special populations |
| Topic | | DC.Subject.epappt | Conservation, energy efficiency, fuel economy, pollution prevention, renewable energy, sustainable development, waste reduction |
| Topic | | DC.Subject.eparit | Compliance & enforcement, permitting programs, regulated facilities, regulatory development, substances management |
| Topic | | DC.Subject.eparat | Environmental technology, research & analysis |
| Topic | | DC.Subject.epaopt | Budget, facilities management, human resources management, information management, legal services, legislative & intergovernmental relations, standards for government conduct, technology management, travel |
| Facet | | DC.audience | Community organizers & educators, concerned citizens & students, kids, regulated community, research & technology community |
| Facet | | DC.coverage | International regions, United States, Territories, Water Bodies |
| Facet | | DC.Subject.epasubstance | Chemicals, consumer products, fuels, human health disruptors, munitions, pesticides, pollutants & contaminants, radiation & radioactive substances, wastes |
| Facet | | DC.Subject.eparegulation | Executive orders, judicial decisions, regulations, statutes, treaties & agreements |
| Facet | | DC.Subject.epaindustry | Agriculture, banking, construction, manufacturing, mining, quarrying, and oil and gas extraction, real estate, service industries, transportation and warehousing, utilities, waste management & remediation |
| Facet | | DC.Subject.epabrm | Management of government resources, mode of delivery, services for citizens, support delivery of services |
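
The rule that at least one topic or facet must be filled in can be sketched as a one-line check. This is only an illustration: the function name `has_taxonomy_term` and the dictionary layout (metadata tag mapped to a list of chosen terms) are assumptions, not the Drupal WebCMS data model.

```python
def has_taxonomy_term(selected: dict[str, list[str]]) -> bool:
    """Return True if at least one topic or facet tag carries a non-blank term."""
    return any(term.strip() for terms in selected.values() for term in terms)


# Hypothetical page: one topic term chosen, one facet left empty.
page_terms = {
    "DC.Subject.epaemt": ["water pollution"],
    "DC.coverage": [],
}
print(has_taxonomy_term(page_terms))           # prints: True
print(has_taxonomy_term({"DC.audience": []}))  # prints: False
```

The check only confirms that something was selected; whether the chosen terms describe a significant portion of the content still needs an editor's judgment.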