The project uses computer vision, powered by a model trained on the Common Objects in Context (COCO) dataset, to identify glasses and people through a webcam. Video is sent to a Google Colab notebook for cloud-based processing, where relative location data and drink ownership are derived. Drink locations are then sent via Google Cloud Pub/Sub to a Raspberry Pi, which displays lights corresponding to the drinks.
A computer-vision-enabled bar top that highlights drink locations, their ownership, and patron interactions.
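As a sketch of the Colab-to-Pi data flow described above, the drink location could be serialized to JSON and published over Pub/Sub roughly as follows. The payload fields, project ID, and topic name here are illustrative assumptions, not the project's actual schema:

```python
import json

def build_drink_message(drink_id, x, y, owner):
    """Serialize a detected drink's location and owner for Pub/Sub.

    Pub/Sub message bodies are raw bytes, so the payload is JSON-encoded.
    The field names here are hypothetical, not the project's real schema.
    """
    payload = {"drink_id": drink_id, "x": x, "y": y, "owner": owner}
    return json.dumps(payload).encode("utf-8")

def decode_drink_message(data):
    """Decode the bytes back into a dict on the Raspberry Pi side."""
    return json.loads(data.decode("utf-8"))

# On the Colab side, publishing might look roughly like this
# (requires the google-cloud-pubsub package and credentials):
#   from google.cloud import pubsub_v1
#   publisher = pubsub_v1.PublisherClient()
#   topic_path = publisher.topic_path("my-project", "drink-locations")
#   publisher.publish(topic_path, build_drink_message("d1", 0.42, 0.77, "alice"))
```

On the Pi, the subscriber callback would pass each received message's `data` through `decode_drink_message` and light the LEDs at the decoded coordinates.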