Hey! I am Sai Vivek

Senior Machine Learning Engineer @ Realtor.com


I am a Computer Science graduate of UT Dallas with 3+ years of professional experience developing Big Data and ML applications.
I actively work on Big Data applications involving Deep Learning, Machine Learning, and Natural Language Processing, as well as software development and web applications.

Profile

FULL NAME
Sai Vivek Kanaparthy
RESUME
Vivek Resume
Master's in Computer Science
(Specialization: Data Science)
University of Texas at Dallas

August 2016 – December 2018

Coursework: Machine Learning, Natural Language Processing, Big Data Analytics, Database Design, Web Programming Languages, Implementation of Advanced Data Structures and Algorithms, Design and Analysis of Algorithms, Statistical Methods for Data Science

Integrated Post Graduate in Information Technology
(Bachelors and Masters)
Indian Institute of Information Technology and Management Gwalior

August 2011 – May 2016

Coursework: Design and Analysis of Algorithms, Data Structures, Operating Systems, Database Management Systems, Object-Oriented Programming

Realtor.com
Senior Machine Learning Engineer

January 2019 – Present

- Productionized and scaled personalized recommender systems in the real estate domain, serving 16M users daily; improved user engagement by 6% and cut processing time by 20% using PySpark, AWS Athena, and Elastic MapReduce
- Led, designed, and deployed the architecture of large-scale machine learning data pipelines built on terabytes of user behavior data, using AWS Glue for ETL transformations and SageMaker for hosting ML inference endpoints, orchestrated with Step Functions
- Developed a platform that tracks the performance of the ML models to analyze business impact, user engagement, and retention
- Developed personalized notifications and email campaigns for new listings on the market and for listings recommended from the user's activity, and A/B tested multiple model variants across different audience segments
- Researched, designed, and implemented ML modules on the website to improve the buyer experience, such as Similar Homes, which suggests homes similar to the one the user is viewing, and Affordability Calculator, which recommends affordable neighborhoods
- Developed Real profile, which builds a profile of the user from their past activity, and deployed it as a candidate generation model. Developed Match Score, a robust LightGBM-based model that gives the relative importance of any listing for a user, and deployed it as a candidate ranking model for multiple use cases

Mavenir
Software Engineer Intern

January 2018 – August 2018

- Designed a framework to create State Chart XML (SCXML) documents and developed its front-end graphical user interface using Angular 4
- Developed the framework's back end in Python, which builds the SCXML documents from the flow charts drawn in the GUI

VOMO
Jr. Developer Intern

September 2017 – December 2017

- Developed the accounts application that handles sign-up and sign-in for users and registered organizations on the VOMO platform using Node.js, and deployed it on AWS EC2
- Analyzed data to recommend volunteer projects to users using machine learning models, including content-based and collaborative recommender systems

Securonix
Big Data Developer Intern

May 2017 – August 2017

- Developed an integrated framework and sample application using Apache SOLR, Hadoop, and Spark to improve the quality of big data solutions on large-scale datasets.
- Optimized the indexing and search performance of Apache SOLR using Java, Spark, and web services built on Spring MVC in a Cloudera environment running on the Hadoop Distributed File System (HDFS).

University of Texas at Dallas
Web Developer

November 2016 – January 2017

- Designed web pages for Prof. Yvo Desmedt using HTML, CSS, and JavaScript, and gained exposure to tools such as LaTeX.

Java, Python (NLTK, Flask, NumPy, Pandas, Matplotlib, Seaborn, Scikit-learn), R
90%
Big Data

Hadoop, Spark, Kafka, SOLR, Elastic Search, Kibana, Scala, TensorFlow, AWS EMR, Glue, Sagemaker, SQS

90%
Web Technologies

Spring MVC, REST Web Services, Angular 4, Node.js, HTML, CSS, JavaScript, jQuery

70%
Databases

SQL (AWS Athena, MySQL), NoSQL (MongoDB, DynamoDB)

80%
Misc.

Git, Shell, Bash, UNIX/Linux, Excel, Agile

75%
Implementation of Advanced Data Structures and Algorithms
Technology used: Java

- Implemented BFS, Dijkstra, and A* algorithms on real-time Google Maps data to compute the shortest distance between two locations, and extended the work by implementing Euler tours on these maps.
- Implemented multi-dimensional search, modeled on an Amazon-style catalog of tens of thousands of products, for adding, removing, and fetching items with similar prices and descriptions and for generating invoices, using TreeMap, HashMap, and HashSet.
- Implemented big integers and their arithmetic operations in Java, along with several versions of Merge Sort (avoiding copying, combining with insertion sort) and Quick Sort (dual partition, random pivot partition) on generic arrays.

Online Banking Application
Technology used: Angular 4, Java, Spring Boot, REST Services, JPA, MySQL

- Implemented an application modeled on a real-life banking application using the Spring Boot framework. Developed the front end using Angular 4 and the back-end REST services using Java. Integrated MySQL using JPA.

Electronic Auction
Technology used: Java, JSP, Spring MVC, REST Services, Hibernate, Oracle DB

- Developed an electronic bidding website for buyers and sellers, where products are sold by auction. Built the back-end functionality using REST services, Spring MVC, and Hibernate, and the front end using HTML, CSS, JavaScript, and jQuery.

Messenger Application
Technology used: MongoDB, Express, Angular, Node.js

- Implemented a messenger application on the MEAN stack where users can exchange messages. Developed the front end using Angular 2 and the back end using Node.js. Integrated MongoDB to store users and messages.

Opinion Mining of US President
Technology used: Python, Apache Spark, Apache Kafka, Elastic Search, Kibana

- Predicted the sentiment of tweets using the NLTK library in a Spark environment, streaming the tweets through a Kafka messaging server, and visualized them on a world map in Kibana, with the tweet objects indexed in Elastic Search.

Near Real Time Event Coding
Technology used: Python, Apache Spark, Apache Kafka, CoreNLP, SEMAFOR, MongoDB

- Encoded real-time news feeds into event types using SVM and multinomial Naive Bayes in a Spark environment, with the feeds queued on a Kafka server. Performed text analysis on the news feeds using CoreNLP, SEMAFOR, and the WordNet knowledge base.

Stylometric Analysis of E-mail Content for Author Identification
Technology used: Python, MongoDB, Stanford NLP Libraries

- Identified the author of an arbitrary e-mail using customized writing-style features (lexical, syntactic, and idiosyncratic attributes, plus syntactic n-grams) with Stanford CoreNLP and scikit-learn, applying Support Vector Machines, Random Forest, Neural Networks, and boosting classifiers.

Bosch Production Line Performance
Technology used: Python, Kaggle Datasets, XGBoost

- Predicted which Bosch parts fail quality control along the production lines using ensemble methods (boosting, bagging, and random forest classifiers) in Python. (Competition hosted on Kaggle.com)

Insights of US Elections 2016
Technology used: Python, Kaggle Datasets, Topic Modelling - MALLET

- Applied topic modelling with MALLET to the presidential debates to detect each candidate's main agenda. Detected audience reactions and sentiment for each intervention using NLTK.

Advanced Data Structures in Java (Coursera - UC San Diego)
Object Oriented Programming in Java (Coursera - UC San Diego)
Java Programming: Solving Problems with Software (Coursera - Duke University)
Algorithms: Design and Analysis, Part 1 (Coursera - Stanford University)
M101P - Introduction to MongoDB for Developers (MongoDB University)
Machine Learning (Coursera - Stanford University)
Neural Networks and Deep Learning (Coursera - deeplearning.ai)
Big Data Foundations - Level 1 (IBM)
Hadoop Foundations - Level 1 (IBM)
Jonsson School Graduate Scholarship
$1,000 graduate scholarship awarded to top students selected from those admitted in Fall 2016, for the 2016-2017 academic year.
Participation in the ACM-ICPC Regional Contest 2014
Certificate of Accomplishment for participation in the onsite round of the 2014 ACM-ICPC Asia Amritapuri Regional Contest, hosted by Amrita University.