My Story
Having completed a Master's in Data Science and Business Analytics at a top business school (ESSEC Business School) and engineering school (CentraleSupélec) in Europe, I am a Data Scientist with business acumen. I currently work as a Data Science Consultant in Switzerland and am interested in applying Data Science across multiple domains.
Data Science Projects
Neighboring Restaurants Influence
- Built a model to estimate how neighboring restaurants influence the rating (success) and number of reviews (attention) of a new restaurant in the same neighborhood
- Created a new data set from raw Yelp data by grouping restaurants within a given radius into neighborhoods
- Support Vector Regression was used to estimate the rating and number of reviews for a potential restaurant
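
The radius-based grouping described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the haversine distance, the record layout, the coordinates, and the 0.5 km radius are all assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def neighbors_within(restaurants, target, radius_km):
    """Names of the restaurants within radius_km of target, excluding target itself."""
    return [
        r["name"]
        for r in restaurants
        if r["name"] != target["name"]
        and haversine_km(target["lat"], target["lon"], r["lat"], r["lon"]) <= radius_km
    ]


# Hypothetical records; real Yelp data carries many more fields.
restaurants = [
    {"name": "A", "lat": 48.8566, "lon": 2.3522},
    {"name": "B", "lat": 48.8570, "lon": 2.3530},
    {"name": "C", "lat": 48.9000, "lon": 2.4000},
]
nearby = neighbors_within(restaurants, restaurants[0], radius_km=0.5)
```

Aggregates over each neighborhood (for example the mean rating of the nearby restaurants) could then serve as input features for the regression model.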
Smart Grids
- Worked on part of a live industry project, Smart Grids, for the Canadian province of Ontario at Mahindra Ecole Centrale
- Developed a model to forecast the electricity demand on a local power grid using Python and C++
- A series of Artificial Neural Networks was used, first to estimate the temperature and humidity and then to estimate the electricity load on a local grid
Building Energy Model Calibration
- There are several metrics which influence the energy consumption in a building
- Building Energy Model Calibration is the process of tuning parameters of a building model to minimize the difference between simulation metrics and collected metrics
- This calibration was carried out using Machine Learning techniques in Python
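
The idea of calibration as error minimization can be illustrated with a toy sketch. The `simulate` function below is a hypothetical stand-in for a real building simulator, and a plain grid search replaces the Machine Learning techniques used in the project, which are not detailed here.

```python
import math


def simulate(insulation, hours):
    """Toy stand-in for a building simulator: hourly energy use in kWh."""
    # Hypothetical model: a fixed base load plus a term that shrinks as insulation improves.
    return [2.0 + (1.0 / insulation) * abs(h - 12) for h in hours]


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


def calibrate(observed, hours, candidates):
    """Return the candidate parameter whose simulation is closest to the observations."""
    return min(candidates, key=lambda value: rmse(simulate(value, hours), observed))


hours = list(range(24))
observed = simulate(4.0, hours)               # synthetic "measurements" with a known true value
candidates = [0.5 * k for k in range(1, 21)]  # 0.5, 1.0, ..., 10.0
best = calibrate(observed, hours, candidates)
```

With several parameters and an expensive simulator, exhaustive search quickly becomes impractical, which is one reason to use learned surrogate models or smarter optimizers instead.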
Aspect-Based Sentiment Analysis
- A single sentence can have different polarities with respect to different aspects
- Aspect-Based Sentiment Analysis helps in assigning a polarity to an opinion with respect to a particular aspect
- Natural Language Processing was applied and 5 different classifiers were implemented using scikit-learn in Python: SVM, Naive Bayes, Logistic Regression, Decision Tree and Random Forest. Logistic Regression was selected as the final model
Music Genre Classification
- The goal was to classify songs into genres based on information about each song, such as beat, tempo and frequency, around 50 features in total
- After the song data processing steps, this multi-class classification problem was solved in Python using Logistic Regression, K Nearest Neighbors and Random Forests
- After parameter tuning of each model, Random Forests performed best
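
As an illustration of one of the models listed above, here is a minimal K Nearest Neighbors classifier in plain Python. The two features, their values, and the genre labels are hypothetical; the project itself used around 50 features and its implementation is not shown here.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict a label for query by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical songs described by (tempo in BPM / 100, scaled dominant frequency).
train_features = [
    (0.70, 0.20), (0.75, 0.25), (0.72, 0.22),
    (1.60, 0.80), (1.70, 0.85), (1.65, 0.90),
]
train_labels = ["classical", "classical", "classical", "metal", "metal", "metal"]

predicted = knn_predict(train_features, train_labels, query=(1.62, 0.82), k=3)
```

The value of `k` is the kind of parameter that would be tuned, typically by cross-validation, before comparing models.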
Stock Prices Estimation
- The historical stock data, news and tweets of a company were used to estimate its future stock price
- The pattern of a particular stock was captured using a Recurrent Neural Network with Long Short-Term Memory
- Natural Language Processing and Artificial Neural Networks were used to incorporate the insights from news and tweets
New York Crime Network Analysis
- New York City crime records were analyzed and crime network patterns were identified using Network Science Analytics
- Multiple network graphs were created to analyze the various relations between the crimes and the locations within the city
- Attempted to estimate the presence of organized crime groups or gangs around New York City using centrality measures such as the degree, betweenness and eigenvector centrality of the nodes in the network
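
Two of the centrality measures named above can be sketched in plain Python. The graph below is a hypothetical toy network, not the crime data, and betweenness centrality is left out for brevity. The eigenvector scores are computed by power iteration, an assumption about method rather than a description of the project's code.

```python
import math


def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def eigenvector_centrality(graph, iterations=100):
    """Power-iteration estimate of eigenvector centrality for an undirected graph."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        # Keeping the node's own score (iterating with A + I) avoids oscillation
        # on bipartite graphs and does not change the leading eigenvector.
        updated = {
            node: scores[node] + sum(scores[nb] for nb in graph[node])
            for node in graph
        }
        norm = math.sqrt(sum(v * v for v in updated.values()))
        scores = {node: v / norm for node, v in updated.items()}
    return scores


# Hypothetical undirected network stored as adjacency sets.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}
degree = degree_centrality(graph)
eigen = eigenvector_centrality(graph)
most_central = max(eigen, key=eigen.get)
```

Nodes that score highly on several measures at once are natural candidates for closer inspection.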
Support Vector Machines
- Literature survey on Support Vector Machines
- Several modules for Support Vector Machines were developed for non-linear classification and regression
- This project was done using Python
Marketing Analytics
- Made a database insights report for a manager with relevant visualizations
- Estimated who is likely to donate to a charity's fundraising campaign, and how much, based on the customer data held by the charity
- Constructed a solicitation strategy to decide how to select the customers to target for donations to the charity
Cryptocurrency Dependence on Historical Price Movements
- The goal was to find out which cryptocurrencies can be estimated with the most certainty using only the historical price movements
- 10 cryptocurrencies were chosen based on popularity, and the initial visualization of the data was done using Tableau
- Recurrent Neural Networks with Long Short-Term Memory were used to model the price of each cryptocurrency; 3 of the 10 chosen cryptocurrencies showed a high dependence on the historical data
Survival from Titanic Incident
- This is a Kaggle project in which passenger survival in the Titanic shipwreck was estimated from information about the passengers
- A lot of importance was given to feature engineering and the creation of new features
- This binary classification problem was solved using 5 different algorithms: Logistic Regression, Decision Tree, SVM, KNN and Random Forest, of which Random Forest gave the best results
Supply Chain Analytics
- The goal was to establish an optimal plan for allocating a product between several sources and destinations
- This was done using linear programming under constraints that include meeting the demand at each port and using the entire available inventory
Education
Master
Data Science and Business Analytics
ESSEC Business School and CentraleSupélec, Paris, FRANCE
Aug 2018 – Feb 2020
Bachelor of Technology
Mechanical Engineering
Mahindra Ecole Centrale, Hyderabad, INDIA
Aug 2014 – June 2018
Experience
Data Scientist
Natural Language Processing, Computer Vision and Time Series
L2F SA (Giotto AI), Lausanne, SWITZERLAND
Sept 2020 – Present
Data Scientist
Face Recognition and Identity Verification
Facedapter, Geneva, SWITZERLAND
Sept 2020 – July 2021
Data Scientist
Building Energy Model Calibration
Airboxlab, Esch-sur-Alzette, LUXEMBOURG
July 2019 – Dec 2019
Data Science Intern
Support Vector Machines
Mahindra Ecole Centrale, Hyderabad, INDIA
Feb 2016 – Aug 2016
Data Science Intern
Electricity Demand Prediction in Smart Grids
Mahindra Ecole Centrale, Hyderabad, INDIA
Aug 2015 – Feb 2016
Contact
j.veeramaneni@gmail.com
+33 605781966
France
