Case Study: APIs & AI for a Cornell Nutrition Dashboard

We conceptualized and prototyped an AI-powered dashboard and chatbot interface to manage and present nutritional survey data from 11 countries in various languages and formats. These reports come primarily as PDFs, each with different tabular layouts and units.

To learn more, feel free to reach out to Dave and say hi, or check out our case studies for examples of our work.

The Challenge

Manual Data Extraction from Complex PDFs: Researchers were spending significant time pulling data from nutrition reports in multiple languages, often embedded in inconsistent, hard-to-parse PDF tables.

Lack of Standardization Across Reports: Incoming data used varying units, formats, and structures, making it difficult to compare or analyze values across sources or regions.

No Centralized, Interactive Reporting Tool: Researchers lacked a real-time, self-serve dashboard to explore, filter, and export key nutrition metrics from multiple datasets.

Limited Access to Insights for Non-Technical Users: Without data science skills, many users couldn’t access the insights buried in reports, creating barriers to analysis and collaboration.

Our Solution

AI-Powered PDF Parsing Engine: Built a robust backend using OpenAI to extract and interpret tabular data from multilingual PDFs, regardless of format inconsistencies.

Normalization Engine for Unit Harmonization: Converted all data into a unified schema with SI-standard units, enabling accurate comparisons across regions and sources.

Interactive, Searchable Dashboard: Created a Laravel-based interface that lets researchers browse, filter, and visualize nutrition data with export options for further analysis.

Natural Language AI Chatbot: Integrated a text-based chatbot that allows users to ask plain-language questions and receive data-driven responses directly from the structured dataset.
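To make the Normalization Engine idea more concrete, here is a minimal sketch of unit harmonization. It is an illustration under stated assumptions, not the project's actual code: the `Measurement` type, the `normalize` function, and the conversion table are hypothetical names chosen for this example, and the choice of grams and kilojoules as target units is assumed.

```python
from dataclasses import dataclass

# Illustrative conversion factors to one target unit per quantity kind.
# The table contents are an assumption, not the project's real unit list.
_TO_TARGET = {
    "mass": {"kg": 1e3, "g": 1.0, "mg": 1e-3, "ug": 1e-6, "mcg": 1e-6},
    "energy": {"kj": 1.0, "j": 1e-3, "kcal": 4.184},  # 1 kcal = 4.184 kJ
}
_TARGET_UNIT = {"mass": "g", "energy": "kJ"}


@dataclass(frozen=True)
class Measurement:
    """One extracted value, e.g. Measurement("iron", 18, "mg")."""
    nutrient: str
    value: float
    unit: str


def normalize(m: Measurement) -> Measurement:
    """Return the same measurement expressed in the target unit for its kind."""
    key = m.unit.strip().lower()
    for kind, factors in _TO_TARGET.items():
        if key in factors:
            return Measurement(m.nutrient, m.value * factors[key], _TARGET_UNIT[kind])
    # Fail loudly rather than silently mixing units in the unified dataset.
    raise ValueError(f"Unknown unit: {m.unit!r}")
```

Calling `normalize(Measurement("energy", 100, "kcal"))` would yield a measurement in kJ. Raising on unknown units is a deliberate choice in this sketch, since a silently passed-through value would undermine comparisons across regions.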

Tech Stack

The Cornell University AI Nutrition Survey project is powered by a backend built with PHP, Python, and OpenAI for NLP and data extraction, a Laravel frontend and dashboard for visualization and user interaction, and a data pipeline that harmonizes multilingual inputs into a unified schema. For more information, please contact Dave.
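As a rough illustration of what harmonizing multilingual inputs into a unified schema can involve, the sketch below maps localized table headers onto canonical field names. Everything here is assumed for the example: the `HEADER_SYNONYMS` table, the `to_unified` function, and the sample field names are hypothetical and do not describe the project's actual schema.

```python
from __future__ import annotations

import unicodedata

# Hypothetical synonym table: accent-folded, lowercased header -> canonical field.
HEADER_SYNONYMS = {
    "protein": "protein", "proteines": "protein", "proteinas": "protein",
    "energy": "energy", "energie": "energy", "energia": "energy",
    "iron": "iron", "fer": "iron", "hierro": "iron",
}


def _fold(text: str) -> str:
    """Lowercase and strip accents so 'Protéines' matches 'proteines'."""
    decomposed = unicodedata.normalize("NFKD", text.strip().lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_unified(row: dict[str, str]) -> dict[str, float]:
    """Map one extracted table row onto canonical field names."""
    unified: dict[str, float] = {}
    for header, raw in row.items():
        field = HEADER_SYNONYMS.get(_fold(header))
        if field is None:
            continue  # columns outside the schema are skipped in this sketch
        # Simplification: treats a comma as a decimal separator and does not
        # handle thousands separators such as "1,234.5".
        unified[field] = float(raw.strip().replace(",", "."))
    return unified
```

For example, a French-language row such as `{"Protéines": "12,5", "Fer": "3.2"}` would come out keyed by `protein` and `iron`. In practice the synonym table would need review by someone who knows each source language.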

Results

Fully Automated Report Generation: Staff no longer need to manually extract or prepare nutrition data, freeing up time for higher-value research activities.

100% Elimination of Manual Reporting Tasks: Every previously manual data task has been replaced with automated workflows, dramatically improving efficiency.

Real-Time Field-to-Dashboard Insights: Nutrition data moves seamlessly from source PDFs to a live dashboard, supporting rapid decision-making and exploration.

Increased Researcher Engagement and Trust: Personalized insights, fast access to clean data, and intuitive tools have boosted platform usage and user retention among researchers.
