Nourish: A Knowledge-Driven Recommender System for Food Entrepreneurs
Scripts
File Size |
|
File Format |
|
Scope And Content | The repository contains the scripts used to create the knowledge graph and integrate it with an LLM. It also includes code for a user-facing website where users log in, as well as the front end for the chat application. Each directory and its contents are described below.
1. ArcGIS
   * Location: `arc_gis`
   * README: `arc_gis/README.md`
   * Description: Code used to create spatial feature layers, perform EDA and querying, and run geoenrichment.
2. Chatbot
   * Location: `chatbot_apps`
   * README: `README.md`
   * Dependencies
     * **PYTHON VERSION: 3.9.X**
     * `./requirements.txt`
   * Description: Tools used to interface with OpenAI and the knowledge graph. The code for the user-facing chat application is also located here, in `nourish_chatbot_app.py`.
3. Front End
   * Location: `front_end`
   * README: `front_end/README.md`
   * Description: Front-end JavaScript that creates the user login and prompts users for information.
4. Models
   * Location: `models`
   * README: None; this was one-off clustering
   * Description: Scripts used to cluster food products by nutrition and to calculate HPF scores.
5. Neo4j
   * Location: `neo4j`
   * README: None; this was preliminary EDA
   * Description: Initial EDA code that connects to the Neo4j server and updates it directly.
6. Ontologies
   * Location: `ontologies`
   * README: `ontologies/README.md`
   * Dependencies
     * **PYTHON VERSION: 3.10.X**
     * `ontologies/requirements.txt`
   * Description: Code used to convert OWL files into Neo4j-ready CSV files. |
Technical Details |
Python dependencies are provided in the requirements.txt files within the Scripts component file. |
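Because the Chatbot and Ontologies components list different Python versions (3.9.X and 3.10.X), it may help to confirm the interpreter before installing a component's requirements. The following is a minimal sketch, not part of the repository; the `REQUIRED` mapping and the `python_matches` helper are hypothetical names introduced here for illustration.

```python
import sys

# Required interpreter series per component, taken from the record above.
# The dictionary itself is an illustrative assumption, not a repository file.
REQUIRED = {
    "chatbot_apps": (3, 9),
    "ontologies": (3, 10),
}


def python_matches(component, version_info=None):
    """Return True if the (major, minor) version matches the component's requirement.

    version_info defaults to the running interpreter; pass a tuple such as
    (3, 9, 18) to check a different version.
    """
    major_minor = tuple((version_info or sys.version_info)[:2])
    return major_minor == REQUIRED[component]


if __name__ == "__main__":
    for name in REQUIRED:
        status = "ok" if python_matches(name) else "different Python version expected"
        print(f"{name}: {status}")
```

A separate virtual environment per component is one way to satisfy both requirements on the same machine.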
Input file
File Size |
|
File Format |
|
Scope And Content |
* arcgis_feature_layers_link.txt: Link to public ArcGIS feature layers
* lexmapr_branded_foodon.csv: LexMapr results mapping branded foods to ontological entities
* nodes.csv: Nodes used in the Cypher database
* rels.csv: Relationships made from the ontology file and loaded into Cypher
* usda_2022_hpf_component.csv.gz: HPF clustering results on the USDA branded food data |
Output file
File Size |
|
File Format |
|
Scope And Content |
* FoodData_Central_branded_food_csv_2022-10-28.zip: USDA branded food data downloaded and used
* nourish_merged_ontology.owl: Merged ontology file used to create nodes.csv and rels.csv
* registrants.csv: Example user and business profiles for which the chat application generates recommendations. This was the registrant table in the Nourish PostgreSQL server. |
- Collection
- Cite This Work
-
Allen, Jessica K.; Henry, Ramona T.; Kale, Amol T.; Michael, Garrett J.; Stickle, Matthew P.F. (2023). Nourish: A Knowledge-Driven Recommender System for Food Entrepreneurs. In Data Science & Engineering Master of Advanced Study (DSE MAS) Capstone Projects. UC San Diego Library Digital Collections. https://doi.org/10.6075/J01N81B7
- Description
-
Operating a food-related business is complex and requires in-depth knowledge of many factors, including local policies and regulations, supply chains, sources of funding, and more. These factors make the food business a difficult field to work in for both new and existing professionals. To help, a comprehensive food business knowledge graph and chat-based user interface, Nourish, was created to lower the barrier to finding the information needed to operate successfully in the food industry. The knowledge graph was developed by integrating multiple data sources to address each component of the food industry. These data sources included geographic, document, ontological, and relational data. For geographic data, the knowledge graph used ArcGIS Online via Esri to pinpoint optimal business locations (based on an opportunity scoring calculation) that could be suggested to the end user. For document data, document stores were created to provide users with funding information from government institutions such as the Small Business Administration (SBA) and the United States Department of Agriculture (USDA). For ontological data, foundational ontologies including FoodOn and FIBO were integrated into a graph database. For relational data, the USDA FoodData Central database was incorporated to better understand nutritional alternatives and increase accessibility to healthy options. To enable widespread access to the knowledge graph, Nourish was connected to OpenAI's large language model (LLM) GPT-3.5, which provided a user-friendly way to query the knowledge graph. To transform user queries into responses with GPT-3.5, the ReAct conversational agent chain was implemented using the LangChain framework as an interface. The agent was composed of several tools to address user inputs. For example, the location tool was used to suggest optimal locations for a business.
Another tool queried the document indexes created on the document stores to address loan eligibility and general loan inquiries. The Nourish chatbot was displayed through a Dash app to facilitate conversation with the user. Overall, the Nourish chatbot effectively queried the integrated knowledge graph to give users personalized recommendations on vital information such as where to open their business, what loans they could apply for, and where they could find additional support.
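The idea of an agent composed of several tools can be illustrated with a small sketch. This is not the project's LangChain code: it uses only the Python standard library, and a keyword match stands in for the reasoning step that GPT-3.5 performs in a ReAct agent. The `Tool` class, the tool names, the keywords, and the placeholder return strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Tool:
    name: str
    description: str
    keywords: Tuple[str, ...]
    run: Callable[[str], str]


def location_tool(query: str) -> str:
    # Placeholder: a real tool would query the geographic data for candidate sites.
    return "location tool: would look up candidate business locations"


def loan_tool(query: str) -> str:
    # Placeholder: a real tool would search the funding document indexes.
    return "loan tool: would search the funding document indexes"


TOOLS: List[Tool] = [
    Tool("location", "Suggest locations for a business",
         ("where", "location", "open"), location_tool),
    Tool("loan", "Answer loan eligibility and general loan questions",
         ("loan", "funding", "grant"), loan_tool),
]


def choose_tool(query: str, tools: List[Tool]) -> Optional[Tool]:
    """Stand-in for the LLM's tool choice: pick the tool with the most keyword hits."""
    text = query.lower()
    scored = [(sum(k in text for k in t.keywords), t) for t in tools]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best if best_score > 0 else None


def answer(query: str) -> str:
    """Route the query to the chosen tool, or report that no tool applies."""
    tool = choose_tool(query, TOOLS)
    if tool is None:
        return "no tool matched; a real agent would answer directly or ask a follow-up"
    return tool.run(query)


if __name__ == "__main__":
    print(answer("Where should I open my bakery?"))
    print(answer("Am I eligible for an SBA loan?"))
```

In a ReAct-style agent the language model reads each tool's description, decides which tool to call, observes the result, and repeats until it can respond; the keyword scorer above only mimics the selection step.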
- Scope And Content
-
Because the project revolved around building and using a knowledge base, the data do not follow a clean input-to-output relationship. Typically, the output of one step fed directly into the next. For example, much of the data ultimately became part of the knowledge base used by the chatbot application.
- Date Collected
- 2023-01-03 to 2023-06-09
- Date Issued
- 2023
- Advisors
- Contributors
- Note
-
This project relies on external software packages, modules/libraries, or programs, use of which may carry specific license requirements. Users should comply with any licenses specified within the contents of this project.
- Series
- Topics
- Language
- English
- Related Resources
- Nourish Group ARCGIS Content (arcg.is/1nSziL0)
- Ontobee Ontologies (ontobee.org)
- USDA Food Data Central (fdc.nal.usda.gov)
- ArcGIS Online (UC San Diego affiliates): https://ucsdonline.maps.arcgis.com/home/index.html
- ArcGIS Online: https://www.arcgis.com/index.html
- LexMapr: A Lexicon and Rule-Based Tool for Translating Short Biomedical Specimen Descriptions into Semantic Web Ontology Terms: https://github.com/cidgoh/LexMapr
Source data
Software
- License
-
Creative Commons Attribution 4.0 International Public License
- Rights Holder
- Allen, Jessica K.; Henry, Ramona T.; Kale, Amol T.; Michael, Garrett J.; Stickle, Matthew P.F.
- Copyright
-
Under copyright (US)
Use: This work is available from the UC San Diego Library. This digital copy of the work is intended to support research, teaching, and private study.
Constraint(s) on Use: This work is protected by the U.S. Copyright Law (Title 17, U.S.C.). Use of this work beyond that allowed by "fair use" or any license applied to this work requires written permission of the copyright holder(s). Responsibility for obtaining permissions and any use and distribution of this work rests exclusively with the user and not the UC San Diego Library. Inquiries can be made to the UC San Diego Library program having custody of the work.
- Digital Object Made Available By
-
Research Data Curation Program, UC San Diego, La Jolla, 92093-0175 (https://lib.ucsd.edu/rdcp)
- Last Modified
2023-08-11