Facebook’s AI learns the relationships between physical places from first-person video footage
Facebook | January 21, 2020
Computer vision systems generally excel at detecting objects but struggle to make sense of the environments in which those objects are used. That’s because they separate observed actions from their physical context. Even systems that do model environments fail to discriminate between elements that are relevant to actions and those that aren’t (e.g., a cutting board on the counter versus a random patch of floor).