The View Text option in Historic Oregon Newspapers displays machine-generated text produced by Optical Character Recognition (OCR) software. OCR is a fully automated process that converts the visual image of numbers and letters into computer-readable numbers and letters. Computer software can then search the OCR-generated text for words, phrases, numbers, or other characters. However, OCR is not 100 percent accurate. Particularly if the original item has extraneous markings on the page, unusual text styles, or very small fonts, the searchable text that OCR generates will contain errors that cannot be corrected by automated means. Digitization of microfilmed newspapers inherently involves a wide range of image quality, which depends on the condition of the original newspaper, any deterioration by the time it was microfilmed, and the quality of the film itself.
Although errors in the process are unavoidable, OCR is still a powerful tool for making text-based items accessible to searching. For example, important concept words often appear more than once within an article. Therefore, if OCR misreads one instance of a key word in a passage, but correctly reads the second instance, the passage will still be found in a full-text search.
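The effect of repeated key words can be illustrated with a minimal Python sketch. It is not the site's search engine; the sample text, the misreading "rai1road", and the function names are all hypothetical and only show why one correct instance is enough for a passage to be found.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def passage_matches(passage, term):
    """Return True if any token in the passage equals the search term."""
    return term.lower() in tokenize(passage)

# Hypothetical OCR output: the first "railroad" was misread as "rai1road",
# but the second occurrence was recognized correctly.
ocr_text = (
    "The new rai1road will reach Portland by spring. "
    "The railroad company confirmed the plan."
)

# The passage is still found because one instance was read correctly.
print(passage_matches(ocr_text, "railroad"))

# A passage containing only the misread instance would be missed.
print(passage_matches("The new rai1road will reach Portland.", "railroad"))
```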
The Newspaper Directory provides access to newspaper title records cataloged according to standard bibliographic rules. Until recently, most non-English language characters were difficult to represent in library records and so Romanization - or standard rules for transliterating other alphabets to the Roman alphabet - was used to convey phonetic pronunciations of non-English words.
Historic Oregon Newspapers supports persistent links to newspaper directory records and pages by providing a predictable URL, displayed in the descriptive information for that object. Using the proposed URI Template syntax the links will use the pattern:
When describing Historic Oregon Newspapers as the source of content, please use the URL and a website citation, such as "from Oregon Digital Newspapers Initiative".
Images of historic newspaper pages, as well as uncorrected page text, are displayed through your web browser. However, Historic Oregon Newspapers also contains high-resolution images (JPEG2000) and enhanced text (PDF) that may require special viewers. Most viewers can be downloaded free from vendor sites. The links below explain the various formats used and how to access them.
PDF (Portable Document Format, .pdf)
- Used for page images
- Viewer: Adobe Acrobat Reader (Adobe text-only download page)
- Sample PDF
- About this sample
- Wavelet compression technology
- Tiling supports decompression of only that portion of the image requested by the user
- Compression ratio is approximately 20:1, depending on image content and color depth
- ERDAS ER Viewer
- IrfanView with JPEG2000 plug-in
OS X: Preview supports baseline JP2 only; commercial software may be needed to view tiled JP2 files, such as those in Historic Oregon Newspapers.
- Sample JPEG2000 page
- About this sample
Some Web browsers incorrectly assume that QuickTime (automatically included with the browser software) can display a JPEG2000 image. (JPEG2000, or .jp2, is not a "native" Web format.) To work around this, download the JPEG2000 (JP2) image by right-clicking the image link, e.g., "JP2 (4.0 Mb)". In the dialog box that appears, you will see "Save Link As..." or "Save Target As..." (depending on the Web browser used). Selecting this option downloads the image to your desktop for further review.
To view the JPEG2000 (.jp2) file, you will need JPEG2000-compatible software, such as the viewers listed above.
Historic Oregon Newspapers provides access to historic newspaper pages digitized under the NEH/LC National Digital Newspaper Program (NDNP). For more information on the scope and content of the program, see http://www.neh.gov/projects/ndnp.html.
Search Historic Oregon Newspapers to find
Users of Historic Oregon Newspapers have the option of performing basic or advanced searches. The basic search box is designated as the blue Search Pages tab and is found on many of the pages of the site. Basic search options are limited to state, time period, and key words located near each other. The basic search returns all supported languages.
For basic searches, results listed first are most likely to be relevant to your search. Results will appear higher in the list when they contain
Your searches will yield better results if you keep the following points in mind:
The Historic Oregon Newspapers search engine uses language-specific dictionaries to include word variants for your search terms. This is often called stemming. For example, the search term house, when stemmed in English, would also return words like houses and housing.
For more search options, see Advanced Searching in Historic Oregon Newspapers below. For information about language support in Historic Oregon Newspapers, see Searching by Language in Historic Oregon Newspapers.
To make the most of searching this text, take advantage of the search options provided on the Search page.
Too Many Results - If a search generates too many results, try using more specific terms and/or limiting to a specific State of publication or a particular newspaper title. Use the search box options in combination to narrow your results. For example, use "President Roosevelt" as phrase and "Roosevelt conservation" within 10 words to narrow results to text about only President Roosevelt's conservation policies.
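Combining a phrase match with a proximity match can be sketched in Python as follows. This is only an illustration under stated assumptions: "within 10 words" is taken to mean that the two word positions differ by at most 10, and the function names and sample sentence are hypothetical rather than part of the site's search engine.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def contains_phrase(text, phrase):
    """Return True if the words of the phrase occur consecutively in the text."""
    tokens, words = tokenize(text), tokenize(phrase)
    if not words:
        return False
    return any(
        tokens[i:i + len(words)] == words
        for i in range(len(tokens) - len(words) + 1)
    )

def within_n_words(text, term_a, term_b, n=10):
    """Return True if the two terms occur at positions at most n words apart."""
    tokens = tokenize(text)
    pos_a = [i for i, t in enumerate(tokens) if t == term_a.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b.lower()]
    return any(abs(i - j) <= n for i in pos_a for j in pos_b)

# Hypothetical page text used only for illustration.
page = "President Roosevelt spoke today about conservation of the western forests."

# A page is kept only if it satisfies both conditions.
keep = contains_phrase(page, "President Roosevelt") and within_n_words(
    page, "Roosevelt", "conservation", n=10
)
print(keep)
```

Requiring both conditions narrows the result set more than either condition alone, which is the point of using the search box options in combination.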
Too Few Results - If a search generates too few results, try alternate terms or broader subjects and relax any limiting criteria (date ranges, state limitations, etc.).
Because language changes, be sure to use search terms used at the time the materials were created, even if those terms are now obsolete. For example, the following historic terms will produce more results than their modern-day counterparts:
Modern Usage vs. Historic Usage
| Modern Usage | Historic Usage |
| gas, service station | filling station |
| African American | Afro American, Negro |
Use the names of towns, landmarks, bridges, buildings, and other geographic features that were current when the materials you are searching were created. For instance, the state of Oklahoma was referred to as both "Indian Territory" and "Oklahoma Territory" prior to its admission as a state, so searching for "Indian Territory" may produce more search results if searching on topics related to Oklahoma.
Matching a phrase can be useful for searching place names or when common words have a particular sense used in combination.
For example, the term "normal school" was used in the early twentieth century to describe schools for training teachers. Searching for the phrase may eliminate results containing the words "normal" and "school" in unrelated ways.
Note: Some very common words, such as and, of, the, a, and to, are ignored even when matching exact phrases.
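One way to picture phrase matching that ignores very common words is the Python sketch below. It assumes that ignored words are simply dropped from both the query and the page text before consecutive words are compared; the actual behavior of the site's search engine may differ, and the function names are hypothetical.

```python
import re

# The common words named in the note above; the real list is likely longer.
STOP_WORDS = {"and", "of", "the", "a", "to"}

def content_tokens(text):
    """Lowercase, split into words, and drop the ignored common words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

def phrase_match(text, phrase):
    """Return True if the phrase's remaining words occur consecutively in the text."""
    tokens, words = content_tokens(text), content_tokens(phrase)
    if not words:
        return False
    return any(
        tokens[i:i + len(words)] == words
        for i in range(len(tokens) - len(words) + 1)
    )

# The phrase is found where the two words appear together...
print(phrase_match("She enrolled at the normal school in Monmouth.", "normal school"))

# ...but not where they appear in unrelated positions.
print(phrase_match("It was a normal day at the school.", "normal school"))
```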
A good and historically significant example of missing issues is in the San Francisco Call, where the April 19th and April 20th issues from 1906 are missing due to the devastating San Francisco earthquake that prevented the newspaper from publishing on those days.
Historic Oregon Newspapers supports language-specific searching in English, French, German, Italian, and Spanish (although not all languages may be represented at this time). By default, in both Basic and Advanced Search, all content is searched together regardless of language. To limit searches to a specific language, conduct an Advanced Search and choose the appropriate language from the Language drop-down menu. For additional technical information on how languages are encoded and identified for search, see current NDNP Technical Guidelines at http://www.loc.gov/ndnp/guidelines/.
The Historic Oregon Newspapers search engine uses language-specific dictionaries to include word variants for your search terms. This is often called "stemming." For example, the search term house, when stemmed, would also return words like houses and housing. In Spanish, a search for hermano would also return variants such as hermanos. By default, exact (unstemmed) matches are ranked higher than stemmed matches.
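The idea of stemming, with exact matches ranked ahead of stemmed ones, can be sketched in Python. The suffix-stripping rule below is a deliberately naive stand-in for the language-specific dictionaries mentioned above, and the function names and sample headlines are hypothetical.

```python
import re

def naive_stem(word):
    """Strip one common suffix; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word

def rank(documents, term):
    """Return matching documents, exact matches first, then stemmed matches."""
    term = term.lower()
    stem = naive_stem(term)
    exact, stemmed = [], []
    for doc in documents:
        tokens = re.findall(r"[a-z]+", doc.lower())
        if term in tokens:
            exact.append(doc)
        elif any(naive_stem(t) == stem for t in tokens):
            stemmed.append(doc)
    return exact + stemmed

# Hypothetical headlines used only for illustration.
docs = [
    "New housing planned downtown",
    "The house on Main Street burned",
    "Weather report for Tuesday",
]

for doc in rank(docs, "house"):
    print(doc)
```

In this sketch, house, houses, and housing all reduce to the same stem, and hermano and hermanos likewise share one, so a search on one form retrieves the others while exact matches still appear first.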
Other reasons for language-specific search may be more content related. For example, reporting in Spanish about the building of the Panama Canal may convey a different perspective than reporting in the mainstream English-speaking press.