Dev is available for hire

Dev Sharma

Verified Expert in Engineering

Data Scientist and AI Developer

Location

New York, NY, United States

Toptal Member Since

November 25, 2020

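Dev is a versatile data scientist and developer who specializes in building exceptionally accurate predictive AI models. He focuses on using statistics, deep learning, and data engineering to strategize and optimize the role of data in organizations. Dev's expertise and hands-on experience are backed by a master's degree in applied analytics from Columbia University in New York, where he also teaches almost all facets of data science at the graduate level.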

Portfolio

Columbia University

GPT, Natural Language Processing (NLP)...

Insight Data Science

Amazon Web Services (AWS), Python, PyTorch, Scikit-learn, Word Embedding...

Dotin

LSTM, Python, PyTorch, Classification, Neural Networks, Model Tuning...

Experience

SQL - 6 years, Pandas - 3 years, Python - 3 years, PyTorch - 2 years, Deep Learning - 2 years, Amazon Web Services (AWS) - 1 year, JavaScript - 1 year

Availability

Part-time

Preferred Environment

Teams, Linux, PyCharm, Visual Studio Code (VS Code), Slack App, Jupyter Notebook, Slack, macOS

The most amazing...

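...project I am proud to put my name on is the collaboration with Infosys and Stanford labs that reached the global leaderboard for natural language understanding models.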

Work Experience

Instructor

2020 - PRESENT

Columbia University

Instructed graduate students in programming, statistics, databases, front-end development, business intelligence tools, hypothesis testing, machine learning, and other analytical skills.
Led and built a collaborative culture in which every member of our four-person teaching staff is fully committed to each student's success.
Consistently achieved high student satisfaction scores (4.5+/5).

Technologies: GPT, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Python, Deep Learning, Machine Learning

Artificial Intelligence Researcher

2020 - 2020

Insight Data Science

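Built an intelligent search product for textbooks that uses ALBERT, a lightweight deep learning model, to convert students' search queries into results 100 times faster than the traditional table-of-contents method. I was the sole developer.
Served the model and information retriever by building a containerized web application (textbookqa.com) in Docker and AWS.
Delivered the MVP within a four-week deadline and presented the product to stakeholders.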

Technologies: Amazon Web Services (AWS), Python, PyTorch, Scikit-learn, Word Embedding, Neural Networks, Learning to Rank, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), GPT, Deep Learning, Model Tuning, Machine Learning, Models

Data Scientist (Capstone)

2019 - 2020

Dotin

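Predicted the validity of paid surveys with around 76% accuracy by building a long short-term memory (LSTM)-based architecture that uses respondents' mouse movements, helping to identify and recover unfair survey costs.
Completed a peer-reviewed publication of our team's research on validating survey responses (arxiv.org/abs/2006.14054). Commercialization of the survey validation product is in progress.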
Worked within an Agile framework in a team of eight.

Technologies: LSTM, Python, PyTorch, Classification, Neural Networks, Model Tuning, Machine Learning, Models, Consulting, Data Science

Machine Learning Intern

2019 - 2019

Infosys

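Integrated a state-of-the-art NLP model (RoBERTa) with Stanford's slicing functions to achieve top results on SuperGLUE, a leading NLP benchmark for evaluating general natural language understanding models.
Placed as the first runner-up out of 32 teams in the Annual InStep Hackathon by implementing an innovative sequential recommendation system for educational content that personalizes users' learning journeys.
Detected fraudulent healthcare providers with 95% accuracy and 90% recall by implementing a neural network architecture (PyTorch), outperforming the firm's existing rule-based classifier by around 46%.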

Technologies: Transformers, Python, Natural Language Toolkit (NLTK), PyTorch, Word Embedding, Classification, Neural Networks, Natural Language Generation (NLG), Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), GPT, Deep Learning, Models, Machine Learning

Data Science Intern

2018 - 2019

Byteflow Dynamics

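Built machine learning models to classify future stock price performance using news and time-series data, with 61% accuracy.
Developed a Python crawler to extract around 5,500 financial news articles on a weekly basis for 100 tickers.
Performed sentiment analysis on stocks by cleaning raw data with regex and a rule-based financial lexicon.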

Technologies: Natural Language Toolkit (NLTK), PyTorch, Neural Networks, Consulting, Data Science, Python, Models, Machine Learning
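
The regex cleaning and lexicon-based sentiment step described for this role could look roughly like the sketch below. It is only an illustration: the `LEXICON` entries, the cleaning rules, and the sample headline are hypothetical and are not taken from the original project.

```python
import re

# Hypothetical mini lexicon; a real financial lexicon would be far larger.
LEXICON = {"beat": 1, "surge": 1, "upgrade": 1,
           "miss": -1, "plunge": -1, "downgrade": -1}


def clean(text):
    """Strip HTML tags, URLs, and non-letter characters, then normalize spaces."""
    text = re.sub(r"<[^>]+>", " ", text)            # remove HTML tags
    text = re.sub(r"https?://\S+", " ", text)       # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text.lower())   # keep lowercase letters only
    return re.sub(r"\s+", " ", text).strip()


def sentiment_score(text):
    """Sum the lexicon weights of every token in the cleaned text."""
    return sum(LEXICON.get(token, 0) for token in clean(text).split())


headline = "<b>ACME shares surge</b> after earnings beat; see https://example.com/acme"
print(clean(headline))
print(sentiment_score(headline))
```

A positive score suggests bullish wording and a negative score bearish wording; exact-match lookup means inflected forms such as "surges" would need stemming or extra lexicon entries.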

Co-founder | Vice President

2016 - 2018

Ummid A Hope Foundation

Raised $75,000+ to benefit abandoned girls in Udaipur, India, helping to build the core team and a global network of 1,000+ donors.
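Coordinated team meetings and the team's technology stack to facilitate the organization's global outreach.
Organized several local fundraising events to retain existing donors and attract new ones.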

Technologies: Nonprofits, Business Management

Business Analyst

2014 - 2018

Zodiac21 Solutions

Managed datasets with SQL, Excel, and Tableau to track KPIs, present dashboards, and discover actionable insights.
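Increased average customer retention from 35% to 64% by leading a cross-functional team of five to develop web and kiosk applications that enable instant customer feedback on employees.
Implemented the latest automation tools for digital reporting, cloud-based time tracking, and task management, and trained 50+ employees to use them.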

Technologies: Tableau, SQL

Experience

AskAi

http://github.com/devkosal/askai

A complete question answering application for extracting answers from textbooks. Modern information retrieval techniques succeed at retrieving information from smaller documents. However, when it comes to larger documents, current options fall short.

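This repository attempts to solve the problem of performing question answering on large documents. This requires a two-part approach. In one part, ALBERT is trained on the Stanford Question Answering Dataset (SQuAD). In the other, we use a rule-based approach to split the textbook into sections. We can then compare the user's question embedding with the section embeddings to find the most relevant section.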
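
A minimal sketch of the retrieval half of this approach, under loud assumptions: the heading rule (`Chapter N` or `Section N`) is hypothetical, and bag-of-words term counts stand in for the learned embeddings the repository would use. It only illustrates splitting a textbook into sections and picking the section closest to the question.

```python
import math
import re
from collections import Counter


def split_into_sections(text):
    """Rule-based split: start a new section at each line that looks like a heading."""
    heading = re.compile(r"^(?:chapter|section)\s+\d+", re.IGNORECASE)  # assumed rule
    sections, current = [], []
    for line in text.splitlines():
        if heading.match(line.strip()) and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())
    return [s for s in sections if s]


def embed(text):
    """Stand-in embedding: term counts instead of a neural sentence embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def most_relevant_section(question, sections):
    """Return the section whose embedding is closest to the question embedding."""
    q = embed(question)
    return max(sections, key=lambda s: cosine(q, embed(s)))


textbook = """Chapter 1 Cells
Cells are the basic unit of life. The nucleus stores DNA.
Chapter 2 Plants
Plants use photosynthesis to turn sunlight into energy."""

sections = split_into_sections(textbook)
best = most_relevant_section("How do plants make energy from sunlight?", sections)
print(best.splitlines()[0])
```

In the full pipeline, the selected section would then be passed, together with the question, to the SQuAD-trained ALBERT reader to extract the answer span.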

I am the sole contributor, from product conceptualization to deployment, and the repository is currently in an MVP state.

RoBERTa with Fast.ai

http://medium.com/analytics-vidhya/using-roberta-with-fastai-for-nlp-7ed3fed21f6c

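Implementing the current state-of-the-art NLP model in fast.ai. The concept of transfer learning is still new to NLP and is evolving at a very rapid pace. Models like RoBERTa perform very well across several different NLP tasks on the SuperGLUE benchmark. This project facilitates the usage of RoBERTa with fast.ai.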

I am the sole developer, from conceptualization to completing the cross-integration, and the integrated model is available for use.

Survey Validation With Mouse Movements

http://github.com/dachosen1/Dotin-Columbia-Capstone-Team-Alpha-

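Thirty percent of users fill out psychometric surveys falsely. This is a problem for organizations that pay for surveys expecting valid results. This project creates multiple models to find an approach that outperforms current validation methods. In the end, the aim is to reduce survey costs by at least 30%.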

This project was built by a team of eight. I took ownership of building the complete pipeline for our LSTM approach, which yielded 80% accuracy and an F1 score of 0.76 on the validation set. The end deliverables are model weights that can be used locally to test predictions. A future goal for this project is an API for the LSTM model that accepts requests to identify false survey responses.

Fight Detection

http://github.com/devkosal/fight_detection

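A deep learning computer vision model to detect fights in videos. Using five to ten frames from a two-second sample, we extract features with a residual neural network (ResNet) model and then pass the extracted features to an LSTM (trained from scratch) to classify whether a fight occurred in the video. My model predicts whether a fight occurred with over 90% accuracy on a balanced dataset. The accuracy is 71% on surveillance camera footage.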
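
The frame selection step might be sketched as follows. This covers only the sampling of five to ten frames from a two-second clip; the function name, the even spacing, and the default values are assumptions, and the ResNet feature extractor and LSTM classifier that would consume these frames are omitted.

```python
def sample_frame_indices(fps, clip_seconds=2.0, num_frames=8):
    """Return evenly spaced frame indices covering one clip.

    Even spacing is an assumption; the project may select frames differently.
    """
    if not 5 <= num_frames <= 10:
        raise ValueError("num_frames should be between five and ten")
    total = int(fps * clip_seconds)  # number of frames in the clip
    if total < num_frames:
        raise ValueError("clip has fewer frames than requested")
    step = total / num_frames
    return [int(i * step) for i in range(num_frames)]


indices = sample_frame_indices(fps=30, num_frames=6)
print(indices)  # [0, 10, 20, 30, 40, 50]
```

Each selected frame would be encoded into a feature vector, and the resulting sequence of vectors would form the LSTM's input.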

I am the sole contributor. The core development phase is complete and the next step is deployment.

Text Generator Web App

A text generator web app built in under 50 lines of Python, using PyTorch. First, we use the transformers library to import the pretrained OpenGPT-2 model in PyTorch. Next, the PyViz Panel library is used to create the web app entirely in Python.

I am the sole contributor to this app. It is complete and is intended to teach others how to build a complete text generation application.

Education

2018 - 2020

Master's Degree in Applied Analytics

Columbia University - New York, NY, USA

2009 - 2013

Bachelor's Degree in Business Administration

University of Memphis - Memphis, TN, USA

Certifications

SEPTEMBER 2020 - PRESENT

SQL Aptitude Test (http://app.testdome.com/cert/6a938ba738ac4fd587aa1808cc2de863)

TestDome

SEPTEMBER 2020 - PRESENT

Python Aptitude Test (http://app.testdome.com/cert/98109584b10e44f68312e8114cdad0fd)

TestDome

AUGUST 2018 - PRESENT

Introduction to Computer Science and Programming Using Python

Massachusetts Institute of Technology | via edX

Skills

Libraries/APIs

PyTorch, Pandas, Matplotlib, SQLAlchemy, Beautiful Soup, Node.js, React, Scikit-learn, Natural Language Toolkit (NLTK), LSTM, Fast.ai

Tools

NGINX, Tableau

Frameworks

Selenium

Languages

Python, JavaScript, R, Visual Basic for Applications (VBA), SQL, HTML

Platforms

Google Cloud Platform (GCP), Docker, Amazon Web Services (AWS)

Paradigms

Business Intelligence (BI), Data Science

Other

Regular Expressions, Gunicorn, Version Control, Neural Networks, Transformers, BERT, Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNN), Regression, Clustering, SVMs, Models, Model Tuning, Deep Learning, Natural Language Processing (NLP), Learning to Rank, Classification, Word Embedding, Natural Language Generation (NLG), Computer Vision, Computer Science, Business Management, Nonprofits, Teams, Consulting, Machine Learning, GPT, Generative Pre-trained Transformers (GPT)
