Outline of the MLFilms Project

Thesis

To test the application of machine learning techniques to experimental image analysis.

Statement 1| The machine learning renaissance has given machines the ability to analyze complicated data sets (such as images) and extract useful information (facial recognition is one example). The success of these methods depends on having large sets of training data, that is, samples drawn from the overall dataset and used to train an algorithm.

Statement 2| Experimental results from table-top physical experiments are rarely large or robust enough to form self-consistent training sets, so experimental analysis in physics remains a ‘bespoke’ process, requiring a craftsperson (graduate student) to analyze each set of data by hand.

Statement 3| Physical systems can often be simulated to high accuracy using modern computing resources.

Proposal| For certain physical systems, training sets can be generated by physical simulations of the system of interest. These training sets can be fed into modern machine learning algorithms to produce robust analysis models, increasing the throughput of experimental analysis.

Science Plan

Experimental system of interest

Two-dimensional topological films…

Stage 0: Implementation

  • establish machine learning pipeline: simulation (generate images) -> training (train machine learning program) -> verification (verify on human-annotated experimental data)
    • In Progress: generate simulated training sets
    • Completed: choose machine learning paradigm
    • Completed: set up basic ML framework to work with simulation and experimental data
    • In Progress: collect data from experimental system
  • provide basic documentation for developed tools and concepts (README.md, website)
  • In Progress: dockerize codebase for reproducibility and simplification
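
The verification step of the pipeline could be scored along the following lines. This is a minimal sketch, not the project's actual method: it assumes that both the model output and the human annotations are lists of (x, y) pixel coordinates, and that a prediction counts as correct when it lies within a chosen tolerance of an unmatched annotation. The function name `match_detections`, the tolerance value, and the sample coordinates are all hypothetical.

```python
import math


def match_detections(predicted, annotated, tolerance=5.0):
    """Greedily pair each predicted point with the nearest unused annotation.

    Returns (precision, recall). The point representation and the
    tolerance are assumptions made for this illustration.
    """
    unused = list(annotated)
    true_positives = 0
    for px, py in predicted:
        best_index, best_dist = None, tolerance
        for i, (ax, ay) in enumerate(unused):
            dist = math.hypot(px - ax, py - ay)
            if dist <= best_dist:
                best_index, best_dist = i, dist
        if best_index is not None:
            # Each annotation can confirm at most one prediction.
            unused.pop(best_index)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(annotated) if annotated else 0.0
    return precision, recall


# Hypothetical detections from the model and from a human annotator.
predicted = [(10.0, 10.0), (30.5, 29.0), (80.0, 80.0)]
annotated = [(11.0, 10.0), (30.0, 30.0), (55.0, 55.0), (60.0, 12.0)]
precision, recall = match_detections(predicted, annotated)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Greedy matching is simple but depends on the order of the predictions; an optimal assignment method may be preferable when detections are densely packed.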

Stage 1: Refinement

  • alter simulated images to imitate experimental images (camera noise, blurring)
  • experiment with algorithm parameters to optimize the model
  • implement filtering to sanitize inconsistent detections
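
The image alteration step in the list above might look something like the following. This is a sketch under stated assumptions: simulated frames are taken to be 2D NumPy arrays with intensities in [0, 1], blurring is approximated with a simple box kernel, and camera noise is modeled as additive Gaussian noise. The function names `box_blur` and `degrade`, the parameter values, and the test frame are hypothetical; a real camera model may call for a Gaussian point-spread function or Poisson (shot) noise instead.

```python
import numpy as np


def box_blur(image, radius=1):
    """Blur a 2D image with a (2*radius+1)^2 box kernel, replicating edges."""
    if radius < 1:
        return image.astype(float)
    size = 2 * radius + 1
    height, width = image.shape
    padded = np.pad(image.astype(float), radius, mode="edge")
    out = np.zeros((height, width), dtype=float)
    # Sum every shifted copy of the image that falls under the kernel.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + height, dx:dx + width]
    return out / size**2


def degrade(image, blur_radius=1, noise_sigma=0.05, seed=None):
    """Blur a simulated frame, add Gaussian noise, and clip to [0, 1]."""
    rng = np.random.default_rng(seed)
    blurred = box_blur(image, blur_radius)
    noisy = blurred + rng.normal(0.0, noise_sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)


# Hypothetical simulated frame: one bright square on a dark background.
frame = np.zeros((32, 32))
frame[12:20, 12:20] = 1.0
degraded = degrade(frame, blur_radius=2, noise_sigma=0.05, seed=0)
print(degraded.shape, float(degraded.min()), float(degraded.max()))
```

Fixing `seed` makes a degraded training set reproducible; leaving it as `None` gives a different noise realization for each frame.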

Stage 2: Application

  • apply pipeline to system of interest to generate novel results