# DSE Capstone - Social Media Image Analysis

This repo contains code, notebooks, and reports for the MAS DSE Capstone project at UCSD.

## AWS Console (Personal, need to get one from UCSD):

User: ucsdprojects@gmail.com

## Pipeline Integration:

1. A CSV file triggers the pipeline.
2. Clean up the CSV file.
3. Extract key attributes.
4. Save them into a pre-modeled PostgreSQL database.
5. Extract image URLs.
6. Download and standardize images.
7. Save them to S3.
8. The learning model should pull the images from this S3 location.
9. Clustered images become the output.
10. Visualize the image clusters. Tag them with hashtags and text messages?

## Technical Considerations:

1. Use Python classes and functions for code reusability.
2. Independent work can be done in Jupyter notebooks, which will be integrated into the main codebase periodically.
3. Run the code in Docker containers so that it is portable and can be run on individual local machines, without needing to install anything.
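
As a rough illustration of the early pipeline steps (clean up the CSV, extract key attributes, extract image URLs) written in the class-based style suggested under Technical Considerations, a minimal sketch might look like the following. The column names (`post_id`, `text`, `image_url`), the `Post` and `CsvPostExtractor` classes, and the sample data are all hypothetical; the real CSV schema is not specified here. The PostgreSQL and S3 steps are left out because they depend on credentials and a data model not described in this README.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Post:
    """Key attributes kept from one CSV row (hypothetical schema)."""
    post_id: str
    text: str
    image_url: str


class CsvPostExtractor:
    """Cleans raw CSV rows and extracts posts (pipeline steps 2 and 3)."""

    # Assumed column names; adjust to the real CSV header.
    REQUIRED = ("post_id", "text", "image_url")

    def clean(self, rows: Iterable[dict]) -> List[dict]:
        cleaned = []
        seen = set()
        for raw in rows:
            # Keep only the required columns, stripping stray whitespace
            # and treating missing values as empty strings.
            row = {k: (raw.get(k) or "").strip() for k in self.REQUIRED}
            # Drop rows with no id and rows whose id was already seen.
            if not row["post_id"] or row["post_id"] in seen:
                continue
            # Drop rows without an http(s) image URL.
            if not row["image_url"].startswith(("http://", "https://")):
                continue
            seen.add(row["post_id"])
            cleaned.append(row)
        return cleaned

    def extract(self, csv_text: str) -> List[Post]:
        reader = csv.DictReader(io.StringIO(csv_text))
        return [Post(**row) for row in self.clean(reader)]


def image_urls(posts: List[Post]) -> List[str]:
    """Pipeline step 5: collect the image URLs to download later."""
    return [p.image_url for p in posts]


# Made-up sample data for illustration only.
SAMPLE_CSV = """post_id,text,image_url
1, sunset at the beach #sunset ,https://example.com/a.jpg
2,no image here,
1,duplicate row,https://example.com/a.jpg
3,city lights,https://example.com/b.jpg
"""

if __name__ == "__main__":
    posts = CsvPostExtractor().extract(SAMPLE_CSV)
    print(image_urls(posts))
```

Keeping the cleaning rules in one class means the same logic can be imported from a Jupyter notebook during exploration and from the main codebase once it is integrated.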