Searchable Database of Visual Content in Historic Newspapers
For more than a decade, educators and students have explored historic newspapers through the Chronicling America website produced by the National Digital Newspaper Program, a partnership between the National Endowment for the Humanities and the Library of Congress. Through the database, the text of more than 17 million historic newspaper pages is made searchable by character recognition technology, but users looking for specific images have had to page through individual issues. Now the latest machine learning experience from Library of Congress Labs allows users to search visual content in American newspapers dated 1789–1963. Called the Newspaper Navigator, the site is simple to search: The user begins by entering a keyword that returns a selection of photos. Then the user chooses photos to search against, allowing the discovery of related images that were previously undetectable by search engines.
Plus: Although image-search techniques from tech companies are not new, Newspaper Navigator marries cultural heritage with computer science. Users encounter a real-time demonstration of how algorithms are trained to scan millions of pieces of data in seconds. All code used in the project is open sourced and placed in the public domain for unrestricted reuse. The dataset code can be accessed online.
Oxford University’s History of Science Museum hosts a leading collection of scientific instruments from the Middle Ages to the nineteenth century. The museum’s virtual tours allow visitors to explore exhibits and artifacts connected to some of the most important discoveries in the history of science.