Imagine an antique book with fragile pages and a delicate cover. Museums and collectors need to balance the desire to inspect such a volume against the damage an examination could cause. Often they decide to forgo an examination, but this means the contents remain concealed.
New research from MIT’s Camera Culture group could provide a solution. The researchers designed a computational imaging system that can read text through multiple layers of paper: the system can read a closed book. This research was recently published in Nature Communications.
“The Metropolitan Museum in New York showed a lot of interest in this, because they want to, for example, look into some antique books that they don’t even want to touch,” says Barmak Heshmat, a research scientist at the MIT Media Lab and corresponding author on the new paper.
According to the CBS News article “New tech could read books without opening them”, this system could potentially be used for business applications, enabling scanning through large numbers of documents without physically separating the pages. Banks could process a stack of checks with a single scan.
The prototype system was able to identify letters on the top nine sheets in a stack of pages. The pages were scanned with terahertz radiation, or T-rays. The system also uses a sophisticated algorithm, developed by researchers from Georgia Tech, to decipher the reflected T-rays as they return to the scanner.
Three steps to read a closed book
- The first step in the process is to find a specific page in the stack of paper by sending a pulse of radiation through the layers. The paper must be slightly transparent in the selected frequency range. This research used terahertz radiation since it could penetrate the stack of paper. The frequency ranged from 100 GHz to 3 THz.
Time resolution is used to distinguish between different pages. The time resolution must be fine enough to resolve very small features, such as the air gap between pages in a closed book. For this, the MIT team turned to femto-photography. According to MIT Media Lab’s Camera Culture Group, with femto-photography, “The effective exposure time of each frame is two trillionths of a second …”
As the signal passed through the pages, each air gap reflected a portion of the T-ray pulse. The time resolution enabled the system to use the reflected pulses to identify when the radiation passed through one page and traveled to the next. The team achieved a spatial resolution of approximately 30 microns, which was sufficient to separate the pages of a closed book. The “pages” used in the study were 300 microns thick. This process, probabilistic pulse extraction (PPEX), was used to find a specific page.
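To get a feel for the timing involved, the distances above can be converted into round-trip delays. This is only a back-of-the-envelope sketch in Python, not the researchers' code (their processing was done in MATLAB). The function name is made up for illustration, and the refractive index of 1.5 for paper is an assumed value, not a figure from the study.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def round_trip_delay_ps(thickness_um, refractive_index=1.0):
    """Extra round-trip travel time, in picoseconds, through a layer.

    A reflected pulse crosses the layer twice (in and back out), and
    light travels slower by the factor `refractive_index` inside it.
    """
    thickness_m = thickness_um * 1e-6
    return 2.0 * refractive_index * thickness_m / C * 1e12


# 30 microns of air: the depth resolution quoted in the article.
gap_delay = round_trip_delay_ps(30.0)

# One 300-micron page; n = 1.5 is an ASSUMED index for paper.
page_delay = round_trip_delay_ps(300.0, refractive_index=1.5)

print(f"30 um of air : {gap_delay:.2f} ps")
print(f"300 um page  : {page_delay:.2f} ps")
```

Under these assumptions, resolving 30 microns of depth means telling apart echoes that arrive about 0.2 picoseconds apart, while echoes from successive pages are separated by roughly 3 picoseconds.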
- Once the page is found, the printed letter must be located on the page. For this, the team used the spectral information associated with the ink used in printing. Ink absorbs various frequencies of terahertz radiation to different degrees, depending on its chemical composition. While X-rays can penetrate paper, their frequency profiles cannot distinguish between ink and blank paper.
When the T-rays passed through the pages, a portion of the radiation was reflected back towards the device. The returning pulse had a different frequency profile when it bounced off a blank spot on the page versus a location with ink. Exploiting this difference is the second step in the process: time-gated spectral imaging (TGSI).
- Once the pages have been scanned and the reflected pulses recorded, the system needs to decipher the characters. For this, researchers from Georgia Tech developed an algorithm that accurately identifies the letters, even when the ink from previous pages casts a “shadow” over the shapes. They devised a convex cardinal shape composition (CCSC) algorithm that automatically compares the reflected shapes against combinations of the templates of various letters in the alphabet. The CCSC was successful despite considerable shadowing on the deeper layers. CCSC is the final step of the process: identifying the text.
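The idea of comparing a shadowed shape against letter templates can be illustrated with a much simpler technique than CCSC: plain template matching by cosine similarity. The sketch below is not the Georgia Tech algorithm. The 5x5 bitmaps, the function names, and the shadow strength of 0.4 are all hypothetical choices made for illustration.

```python
import math

# Hypothetical 5x5 letter bitmaps, for illustration only.
TEMPLATES = {
    "T": ["11111", "00100", "00100", "00100", "00100"],
    "H": ["10001", "10001", "11111", "10001", "10001"],
    "Z": ["11111", "00010", "00100", "01000", "11111"],
    "C": ["11111", "10000", "10000", "10000", "11111"],
    "G": ["11111", "10000", "10111", "10001", "11111"],
}


def to_vector(rows):
    """Flatten a bitmap (list of '0'/'1' strings) into a list of floats."""
    return [float(ch) for row in rows for ch in row]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def with_shadow(letter, shadow_letter, strength):
    """Simulate a page: the true letter plus a fainter copy of the
    letter printed on the page above it (the 'shadow')."""
    main = to_vector(TEMPLATES[letter])
    shadow = to_vector(TEMPLATES[shadow_letter])
    return [m + strength * s for m, s in zip(main, shadow)]


def best_match(observed):
    """Return the best-scoring template name and all scores."""
    scores = {name: cosine(observed, to_vector(rows))
              for name, rows in TEMPLATES.items()}
    return max(scores, key=scores.get), scores


observed = with_shadow("T", "H", 0.4)  # a "T" with an "H" shadow on top
letter, scores = best_match(observed)
print(letter)
```

In this toy case the "T" template still scores highest even though an "H" from the previous page bleeds into the image. Real terahertz data is far noisier and the shadows are stronger on deeper pages, which is why the study needed the more capable CCSC approach.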
The data captured by the system was processed using MATLAB. Processing includes the application of PPEX, k-means clustering and CCSC extraction. The researchers also used Wavelet Toolbox to denoise the input signal. To learn more about the fundamentals behind wavelet transforms, check out this tech talk.
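The article does not say what the k-means step clusters, so the following is only a guess at the general idea: grouping per-pixel contrast values into "paper" and "ink". It is a minimal one-dimensional k-means written in plain Python for illustration (the researchers worked in MATLAB). The contrast values and the function name are invented.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Tiny 1-D k-means (k >= 2) with deterministic initial centers.

    Returns (centers, labels), where labels[i] is the index of the
    center closest to values[i].
    """
    lo, hi = min(values), max(values)
    # Spread the initial centers evenly between the smallest and largest value.
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j]))
                  for v in values]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members
                               else centers[j])
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, labels


# Invented contrast values: low = blank paper, high = ink.
contrast = [0.10, 0.12, 0.09, 0.80, 0.85, 0.11, 0.78]
centers, labels = kmeans_1d(contrast, k=2)
print(labels)
```

With these values the low-contrast pixels fall into cluster 0 and the high-contrast pixels into cluster 1, giving a simple ink mask that a later recognition step could work from.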
The three main steps are shown in the short animation below: PPEX to find the page, TGSI to detect the ink, and CCSC to identify the letter.
So what did the letters on the 9 sheets of paper spell?
THZ = Terahertz
CCG = Camera Culture Group
This is an exciting first step, but the technology does have some limitations. The prototype system can only count pages to a depth of 20, and can accurately identify characters only on the first 9 pages. Although most of the T-ray radiation is either absorbed by the ink and paper or bounces straight back to the sensor, a small portion bounces back and forth between the pages before reaching the sensor. This creates interference that limits the number of pages that can be accurately processed.
With more powerful radiation sources and improved T-ray detectors, the methodology could process the entire contents of a book without opening the cover. Someday, this system could non-intrusively examine ancient texts, page by page. Other potential applications are more likely to interest James Bond: The system could be used by spies to examine the contents of documents and envelopes without opening them.