High-throughput instruments such as those at national laboratories can produce terabytes of experimental data in seconds, and this rate keeps increasing. However, the software tools available to organize and retrieve images cover only a small fraction of our needs. Given significant improvements in image processing and the availability of large data repositories, methods to query and retrieve images are fundamental to supporting activities ranging from cataloguing to complex research, such as synthesizing new materials. These requirements motivated our team to develop pyCBIR, a new Python tool for content-based image retrieval (CBIR) capable of searching for relevant items in large datasets, given unseen pictures. While much work in CBIR has targeted ads and recommendation systems, pyCBIR allows general-purpose investigation across image domains. Preliminary results indicate promising directions toward ranking scientific data using Convolutional Neural Networks. We will conclude by illustrating some applications of pyCBIR to biomedical and geological data.
See more about this video at www.microsoft.com/en-us/research/video/searching-images-images-characterization-retrieval-ranking/