Tuesday, March 8, 2016

Archaeology’s Information Revolution - The Atlantic

In the near future, every archaeological artifact could be digitally connected to every other artifact.

The Temple of Bel in the historical city of Palmyra, Syria, photographed in August 2010. (Sandra Auger / Reuters)

"We're collecting billions of those data points," he told me. "And then we sort of mesh them all together and we have not only a 3-D model of the actual excavation from this Biblical period, but we also have a kind of digital data-scaffold in which to embed all the archaeological data points."

Thanks to satellite data, those data points can now be embedded within a topography of the entire planet. For instance, Sarah Parcak, a space archaeologist, analyzes satellite imagery of Earth, looking for telltale features that might signal a long-lost historical site. Here's how Wired described her process:

When looking for new archaeological sites, Parcak orders satellite imagery for parcels of land ranging from 65×65 to 165×165 feet square. Then she applies filters to highlight different portions of the electromagnetic spectrum in each image. She's looking for features that may hint at what's buried underground. A hallmark clue is the condition of surface vegetation. An architectural structure buried underground can stunt the growth of the flora above it, creating a dead zone—invisible to the naked eye, but detectable in short wave infrared images—in the shape of the underlying infrastructure. In places like Egypt, where vegetation is scarce, satellite imagery can help Parcak distinguish between natural and man-made materials like the mud bricks many tombs are made of.
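To make the vegetation clue concrete, here is a rough sketch of how a "dead zone" might be flagged in multispectral imagery. It is not Parcak's actual workflow, which the passage above does not spell out. It swaps in a common remote-sensing measure, the normalized difference of near-infrared and short-wave infrared reflectance (often called a moisture index), and marks cells that fall well below the scene's median. The grid values, the function names, and the 0.15 threshold are all made up for illustration.

```python
from statistics import median


def moisture_index(nir, swir):
    """Normalized difference of near-infrared and short-wave infrared reflectance."""
    total = nir + swir
    return 0.0 if total == 0 else (nir - swir) / total


def flag_stressed_cells(nir_grid, swir_grid, drop=0.15):
    """Return (row, col) cells whose index sits more than `drop` below the scene median.

    `drop` is an arbitrary illustrative threshold, not a published value.
    """
    index = [
        [moisture_index(n, s) for n, s in zip(nir_row, swir_row)]
        for nir_row, swir_row in zip(nir_grid, swir_grid)
    ]
    baseline = median(v for row in index for v in row)
    return [
        (r, c)
        for r, row in enumerate(index)
        for c, v in enumerate(row)
        if v < baseline - drop
    ]


# Hypothetical reflectance values for a 4x4 patch of healthy vegetation.
nir = [[0.50] * 4 for _ in range(4)]
swir = [[0.20] * 4 for _ in range(4)]

# Simulate stunted growth over a buried wall: two adjacent cells in row 1.
nir[1][1] = nir[1][2] = 0.35
swir[1][1] = swir[1][2] = 0.30

flagged = flag_stressed_cells(nir, swir)
print(flagged)
```

In this toy patch the two altered cells are the only ones flagged, and their straight-line arrangement is the kind of shape an analyst would look at more closely. Real imagery would need calibration, cloud masking, and far more careful statistics.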

It's mind-boggling to think of the amount of data now flowing into the annals of archaeology. But the same thing that makes all this data useful—the sheer volume of information—presents difficult new challenges. Archaeologists aren't yet sure about the best way to preserve these datasets, and they don't know how, and in what format, they should be shared across networks.

Lots of people are looking for answers. Levy and his colleagues at several California universities are building a network that contains information from tens of thousands of archaeological sites. And there are other resources, like the Mediterranean Archaeology Network, which contains a series of linked archaeological nodes—which in turn contain regional databases for researchers to query. But the more important question of how to be a steward for massive, scholarly datasets is part of a larger conversation among information scientists that could end up redefining the library as we know it.

All this reflects a profound shift in how human knowledge will be contextualized, stored, and shared as reams of data continue to grow. Increasingly, people are looking for ways to classify and connect datasets. The Library of Congress, for example, is designing a new cataloguing system—for the first time in 40 years—optimized for the semantic web. Today, most of the world's biggest libraries use an electronic filing system called MARC records, the standard that replaced physical card catalogues in the 1970s. The idea for the next generation of organizing library collections is to have a system that recognizes many more fields of metadata than ever before—and finds connections to other resources both within and outside of any individual institution.

So instead of just listing books and documents by "title," "author," "key words," "genre," and other basic fields, libraries are thinking about how to be far more descriptive about individual titles and far more comprehensive about how resources connect to one another. They're also trying to figure out how to handle huge digital assets like datasets—everything from historic climate records to census data to satellite images to geospatial coordinates from archaeological excavations, and so on. I've interviewed several librarians who are seriously thinking about how to make this kind of information accessible to those who need it—these are people who are reshaping institutions like the Library of Congress, and Oxford, and Yale, and Harvard—and they all say that huge datasets will transform the fundamental functions libraries serve.
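The difference between a flat catalogue entry and a linked one can be sketched in a few lines. This is only an illustration of the general idea, not the Library of Congress design or the MARC format. The `Record` class, the identifiers, and the relation names below are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A catalogue entry with open-ended metadata and typed links to other resources."""
    identifier: str
    title: str
    fields: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # (relation, target identifier) pairs


def related(catalog, start, max_hops=2):
    """Breadth-first walk over links; returns identifiers reachable within max_hops."""
    seen = {start}
    frontier = [start]
    for _ in range(max_hops):
        next_frontier = []
        for ident in frontier:
            record = catalog.get(ident)
            if record is None:
                continue  # target is held by another institution; nothing to follow locally
            for _relation, target in record.links:
                if target not in seen:
                    seen.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    seen.discard(start)
    return sorted(seen)


# Hypothetical holdings: a book that cites a dataset, which derives from an external image.
catalog = {
    "book:1": Record(
        "book:1",
        "Excavation Report",
        fields={"author": "Example Author", "genre": "monograph"},
        links=[("cites", "dataset:7")],
    ),
    "dataset:7": Record(
        "dataset:7",
        "Site Coordinates",
        fields={"format": "geospatial points"},
        links=[("derivedFrom", "external:image:42")],
    ),
}

print(related(catalog, "book:1"))
```

Starting from the book, the walk reaches both the local dataset and the satellite image catalogued elsewhere, which is the kind of cross-institution connection a title-and-author listing cannot express.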

"A library is not a big box filled with books," said Catherine Murray-Rust, the dean of libraries at Georgia Tech. "It is not just a study hall. Going back to the notion of a library from the past, it is really a space—and today a physical and virtual space—in which people can appreciate the scholarship of the past while they create the scholarship of the future."

Georgia Tech is in the midst of a major renovation of its library system, an overhaul that will include removing many of the books from public spaces. (Print materials that are removed will still be retrievable upon request.) As the project has moved forward, Murray-Rust says the team working on the new library system has gotten "more radical in our thinking about what a library should be."

"The huge issue now is data," she said. "It's probably more important than text. We have traditional reading rooms where there actually are a few books. Books are a tremendous visual cue to people about the seriousness of the space. We love the book, as technology, but we also know it is not the only—and in some fields not the best—vessel for content. This is particularly true with data: The book doesn't work terribly well."

Murray-Rust calls data "the new frontier" of human knowledge. She and others agree that data is changing entire industries and academic specialties so quickly that key information is bound to be lost before best practices are standardized. This is perhaps inevitable, but it represents more than just a missing piece of knowledge. People often talk about data points as if they're conjured from thin air, somehow non-existent until they're part of a larger set. And though it's true that meaning arises from assembling great constellations of data, the data itself usually begins in the material world.

Among archaeologists, the datasets collected today—and the visualizations made from that data—may be all that exists after great structures have crumbled.

"Having Palmyra real is much more important than having 3-D models of it, obviously," Levy told me, referring to the ancient city where several historic sites have been destroyed by ISIS in recent months. "But in a world where we have so much intentional destruction of cultural heritage, we're in a position now to record it in ways that were impossible even a decade ago."