
Hello, World.

This is Shoban K.
Software & Machine Learning Engineer
Purdue alumnus, works @ YAHOO!

About

More About Me.

Software engineer with over 4 years of experience and expertise in developing backend services for distributed systems and machine learning applications with high scalability and low latency.

Shoban finished his Master's in Computer Science (class of 2020) at Purdue University. During his time at Purdue, he served as a Graduate Research Assistant and Graduate Teaching Assistant for the Computer Science Department. He also interned at the Yahoo! office in Champaign during Summer 2020, focusing on augmented analysis and forecasting for the entire Yahoo advertising ecosystem across the globe.

Shoban's current interests lie in the fields of information retrieval and data analytics.


Current Work @YAHOO! [Mar 2021 - Present]

What I Do.

I work as part of the Financial Data Warehouse team, which consolidates the revenue and payment data of the entire Yahoo ecosystem and manages it as a central source of truth for revenue forecasting and growth across multiple platforms.

  • Aggregation Framework

    Due to the high volume and velocity of the data, we provide and manage pre-aggregation on all the real-time events captured across YAHOO!. We came up with a re-aggregation framework that re-aggregates the data to save storage and namespace, which in turn makes maintenance easier and cuts costs without losing granularity.

  • Analytics and Insights

    I work on a big data analytics platform that delivers real-time business insights to the marketing and sales teams, improving targeted revenue for customers by tens of thousands of dollars. This also helped uncover potential revenue and data quality impacts.

  • On-call

    Part of the on-call cycle, responsible for managing and monitoring customer-impacting issues while maintaining the highest quality.

  • Report Summarization Framework (Internship)

    I designed and developed a new framework to provide automatic insights during report generation and summarization. This involved real-time anomaly detection and trend forecasting using a statistical approach, and it reduced the manual effort of Business Analysts and Product Managers by more than 30% during report investigation and analysis.
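To give a flavor of the statistical anomaly detection mentioned in the last item above, here is a minimal Python sketch. It is not the framework built during the internship: the function name `find_anomalies`, the trailing-window z-score rule, the default threshold, and the sample revenue figures are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return indices of points that deviate from the trailing-window mean
    by more than `threshold` sample standard deviations (a z-score rule).

    Hypothetical helper for illustration only.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]  # the `window` points before index i
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change as anomalous.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up daily figures with one obvious spike at index 6.
daily_revenue = [100, 102, 98, 101, 99, 100, 250, 101, 100]
print(find_anomalies(daily_revenue))
```

A trailing window keeps the check cheap enough to run as new data points arrive, at the cost of being slow to recover right after a spike, since the spike inflates the window's standard deviation.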

Skillset

I've Got Some Skills.

Since my Bachelor's days, I have developed a diverse skillset that has helped me work on interesting and complicated technical problems and come up with innovative solutions. I continue to develop and harness my skills to build software that impacts billions of lives.

  • Java: 90%
  • Python: 85%
  • Machine Learning: 80%
  • Data Analytics: 85%
  • Big Data Technologies: 75%

Career and Education

More of My Credentials.

I enjoy learning and developing new skills both for my career and personal growth.

Experience

Mar 2019 - Dec 2020

Agricultural & Biological Engineering

Research Assistant

  • Worked on machine learning algorithms using Classification Tree Analysis to improve the mapping between plant gene species by 50%. This involved developing statistical and heuristic algorithms on large-scale, high-dimensional data to enable parallelism in the data pipeline, reducing data processing runtime by more than 40%
  • Proposed a novel algorithm to help understand the relationship between the plant gene species, which will help biologists reduce the time and cost of their experiments
  • Created jobs and schedulers in a clustered environment to help multiple post-docs and graduate students analyse their data and infer results using the latest data mining and ML techniques

Nov 2015 - Jan 2019

SAP LABS INDIA

Software Developer

  • Worked on the Document Processing pipeline team, which processed unstructured data from external systems into meaningful insights and delivered the results to the Clinical Data Warehouse.
  • Developed the Document Processing Workbench, which simplifies the ingestion of unstructured data from external systems, helps visualize the resulting insights, and reduced manual effort by more than 40%
  • Created and productionized the Document Viewer, which highlights the text entities extracted from the unstructured data in plain text format and provides sophisticated visualization for ease of use and improved model performance.
  • Maintained production code by writing unit tests and scenario tests and fixing issues in the acceptance testing phase
  • Helped and mentored multiple junior software developers to get up to speed with the product and technical knowledge

Education

Jan 2019 - Dec 2020

Purdue University

Master's Degree in Computer Science

I specialized in the Machine Learning domain by taking courses including Statistical Machine Learning, Deep Learning, Cloud Computing, Design and Analysis of Algorithms, etc.

Aug 2011 - May 2015

Anna University

Bachelor's Degree in Information Technology

I was introduced to multiple programming languages (C, C++, Java, C#) through courses including Data Structures and Algorithms, Cloud Computing, Web Development, Operating Systems, etc.

See My Latest Projects.

ResNet

A microservice-based framework that allows users to upload millions of images to the database and classify them effectively using ResNet. The serverless architecture was designed for low latency, fault tolerance, and secure access gateways.

Fake News Detection

Classified news headlines from 2015 U.S. Presidential Elections as real or fake news by designing an LSTM and evaluating it against common classification and regression algorithms from scikit-learn. Developed a weight-based technique to find the factors affecting fake news (the model achieved an accuracy of 96%).

Remaining Useful Life Prediction for Lithium-Ion Batteries

A hybrid approach to diagnose battery life using a Bayesian inference technique. Devised a kernel-based variance approach to find the remaining useful life of the batteries. Verified with the RMSE score that our prediction is better than other approaches, with accuracy close to 95%.

Adversarial detection against Neural Networks

A deep learning approach that generates adversarial samples from the MNIST dataset and checks the robustness of multiple convolutional neural network models. We showed that our novel approach can detect the adversarial samples with better accuracy.

ChatBots

Built a deep learning NLP model using a hybrid of generative and retrieval-based approaches to generate automated responses to questions. The text corpus was created from scratch for 2 major domains: Machine Learning and Sports. The response is framed grammatically using a combination of Bag of Words, TF-IDF vectorization, and cosine transformations.
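For readers curious about the retrieval-based half of such a chatbot, here is a minimal Python sketch of TF-IDF vectorization with cosine similarity matching. It is not the project's code and leaves out the generative model entirely; the function names, the smoothed IDF formula, and the two-entry FAQ are all assumptions made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def vectorize(tokens, idf):
    """Map tokens to a sparse TF-IDF vector, ignoring unknown terms."""
    counts = Counter(t for t in tokens if t in idf)
    return {term: count * idf[term] for term, count in counts.items()}


def build_tfidf(docs):
    """Compute smoothed IDF weights and one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return idf, [vectorize(tokens, idf) for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_answer(question, faq):
    """Return the stored answer whose question is most similar to `question`."""
    idf, vectors = build_tfidf([q for q, _ in faq])
    query = vectorize(tokenize(question), idf)
    scores = [cosine(query, v) for v in vectors]
    best = max(range(len(faq)), key=scores.__getitem__)
    return faq[best][1]


# Hypothetical two-domain corpus (Machine Learning and Sports).
faq = [
    ("what is overfitting in machine learning",
     "Overfitting is when a model memorizes its training data."),
    ("how many players are on a cricket team",
     "A cricket team fields eleven players."),
]
print(best_answer("explain overfitting", faq))
```

Retrieval of this kind can only return answers that already exist in the corpus, which is one reason to pair it with a generative model for questions that have no close match.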

Contact

Preferred Location

Bay Area
Seattle
Austin

Email Me At

shobankumarrv@gmail.com
rvshobankumar23@gmail.com

Call Me At

Phone: (+1) 765 772 5154