Deciphering Amazon’s User-based Recommendation System
This project offers a step-by-step guide to building a movie recommender system from Amazon’s movie ratings dataset. By the end of the guide, you’ll have gained not just technical know-how but also insight into the practical applications of such models, particularly in shaping personalized educational tools that blend data science with educational innovation. The recommender was built with the SVD algorithm from the Surprise library.
(Completed in October 2023)
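
To illustrate the idea behind an SVD-style recommender, here is a minimal sketch of biased matrix factorization trained by stochastic gradient descent. It is written in plain Python and is not the Surprise implementation; the function name `train_mf`, the hyperparameter values, and the toy ratings are all assumptions made for illustration.

```python
import random

def train_mf(ratings, n_factors=2, n_epochs=200, lr=0.01, reg=0.02, seed=0):
    """Fit biased matrix factorization on (user, item, rating) triples via SGD.

    Returns a predict(user, item) function. All defaults are illustrative.
    """
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    bu = {u: 0.0 for u in users}                       # user biases
    bi = {i: 0.0 for i in items}                       # item biases
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}

    for _ in range(n_epochs):
        for u, i, r in ratings:
            pred = mu + bu[u] + bi[i] + sum(a * b for a, b in zip(p[u], q[i]))
            err = r - pred
            # Gradient steps with L2 regularization
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                puf, qif = p[u][f], q[i][f]
                p[u][f] += lr * (err * qif - reg * puf)
                q[i][f] += lr * (err * puf - reg * qif)

    def predict(user, item):
        # Unknown users or items fall back to the global mean plus known biases
        est = mu + bu.get(user, 0.0) + bi.get(item, 0.0)
        if user in p and item in q:
            est += sum(a * b for a, b in zip(p[user], q[item]))
        return min(5.0, max(1.0, est))  # clip to an assumed 1-5 rating scale

    return predict

# Hypothetical toy ratings, not taken from the Amazon dataset
toy_ratings = [
    ("alice", "m1", 5), ("alice", "m2", 4),
    ("bob", "m1", 4), ("bob", "m3", 2),
    ("carol", "m2", 5), ("carol", "m3", 1),
]
predict = train_mf(toy_ratings)
print(predict("alice", "m3"))  # estimated rating for a movie alice has not rated
```

A real project would tune the number of factors, learning rate, and regularization by cross-validation rather than use the fixed values assumed here.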


Deploying Models and Interactive Data Visualization with R Shiny
Developing, visualising, and deploying models with R Shiny allows data scientists and statisticians to create powerful, interactive, and user-friendly charts and web applications. By building a college GPA calculator as an example, I demonstrate the process of building a Shiny app from scratch, highlight its importance for educational institutions, and outline its benefits over traditional applications. Shiny is thus a valuable tool in the modern data scientist’s toolkit.
(Completed in January 2024)


Leveraging Analytical Tools for Barratt Developments PLC’s Strategic Investment Decision
This report advises prospective investors on whether Barratt is a sound real estate company to invest in, based on existing data, and gives them an understanding of the company’s overall financial performance. To accomplish this, descriptive, time series, and fundamental analyses were carried out.
The analysis showed that Barratt had the higher mean return and the higher standard deviation, which implies both higher return and higher volatility/risk. The regression line fitted the relationship between Barratt and Berkeley returns well. The time series forecasting analysis points to Simple Exponential Smoothing as the most suitable forecasting model, owing to its lower Moving Average Error and the highest number of forecast points (four) matching the test price line (the ground truth).
(Completed in April 2022)
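
As a rough illustration of the forecasting method named above, the following sketch implements simple exponential smoothing in plain Python. The smoothing factor `alpha`, the price values, and the use of mean absolute error as the accuracy measure are assumptions for illustration; they are not taken from the report.

```python
def simple_exp_smoothing(series, alpha):
    """Return (one-step-ahead forecasts for series[1:], final smoothed level)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]          # initialize the level with the first observation
    one_step = []
    for y in series[1:]:
        one_step.append(level)                    # forecast made before seeing y
        level = alpha * y + (1 - alpha) * level   # update the level with y
    return one_step, level

def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical closing prices, split into a training and a test window
train_prices = [512.0, 518.5, 509.0, 515.0, 521.5, 519.0]
test_prices = [520.0, 523.5, 517.0, 522.0]

_, final_level = simple_exp_smoothing(train_prices, alpha=0.3)
# Simple exponential smoothing produces a flat forecast at the final level
forecast = [final_level] * len(test_prices)
print(mean_absolute_error(test_prices, forecast))
```

Because the method forecasts a single flat level, it suits series without a clear trend or seasonality; comparing its error against other candidate models on the same test window is what makes a ranking of models meaningful.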


Student Success Prediction with SAT Scores and Attendance
For educators, it’s sad to see students struggle and drop out. The goal of this analytics project was to predict college students’ GPA from their SAT scores and attendance. A regression model was fitted with the statsmodels library in Python. Educators can use the findings to identify poorly performing students at risk of dropping out early enough to provide targeted support and personalized learning.
(First completed in April 2020; updated in July 2023)
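
The kind of regression described above can be sketched as an ordinary least squares fit with two predictors. The project itself used statsmodels; this dependency-free version solves the normal equations directly so it runs anywhere. The helper name `fit_ols`, the student records, and the encoding of attendance as a fraction of classes attended are hypothetical.

```python
def fit_ols(rows, targets):
    """Fit y = b0 + b1*x1 + ... by least squares; returns [b0, b1, ...]."""
    X = [[1.0] + [float(v) for v in row] for row in rows]  # prepend intercept
    k = len(X[0])
    # Augmented normal equations [X'X | X'y]
    A = [
        [sum(x[a] * x[b] for x in X) for b in range(k)]
        + [sum(x[a] * y for x, y in zip(X, targets))]
        for a in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical records: (SAT score, attendance fraction) and GPA
features = [(1200, 0.90), (1350, 0.95), (1100, 0.70), (1500, 0.85), (1000, 0.60)]
gpas = [3.1, 3.5, 2.6, 3.6, 2.2]

intercept, b_sat, b_attendance = fit_ols(features, gpas)
# Predicted GPA for a hypothetical student with SAT 1250 and 80% attendance
predicted_gpa = intercept + b_sat * 1250 + b_attendance * 0.80
print(predicted_gpa)
```

With statsmodels, the same fit would also report standard errors and p-values, which matter when deciding whether attendance adds predictive value beyond SAT scores.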

Exploration Of Walmart Retail Sales Data – A Descriptive Approach
The goal of this project was to analyze and visualize Walmart retail sales descriptively in Tableau using six metrics: average sales, profit, and discount; distribution of average profit by state in the United States; percentage of profit per region; sales profit per month; average sales profit by age; and average profit per sub-category.
The six key metrics were framed by three solution themes: decision support, customer retention, and inventory management. Furthermore, the protocols used in this study could be applied to the analysis of other services, such as banking and financial institutions.
Similarly, the use of other big data analysis and visualisation technologies in retail, such as R, Python, and artificial intelligence/machine learning, opens up exciting new possibilities for prediction.
(Completed in March 2022)
