Data Scientist

CloseFactor - Remote
Full Time
Entry (1-3 years)

Pay Range


$115,000 - $135,000

No equity




**Your responsibilities will include:**

- Defining Machine Learning, Natural Language Processing, and Knowledge Graph pipelines and models.
- Designing end-to-end pipelines for document information extraction solutions, leveraging semantic parsing, named entity recognition, and relationship extraction.
- Tackling real-world problems using machine learning algorithms, including but not limited to regression, clustering, and natural language processing techniques, and translating these into production code using Python, PyTorch, and TensorFlow.
- Maintaining customer and company data in production databases and search engines, and updating and querying that data to support customer workflows, using Google Cloud Platform.
- Implementing fast, reusable, and testable software to extract information from data collected from disparate sources; building features with methods such as text classification and sentiment analysis; ranking relevant information according to customer sales plays; and presenting the results to customers through web application interfaces.
- Building a distributed, self-learning, and self-updating software system that stores collected information in a constantly updating knowledge graph.
- Working in close collaboration with customers and product users to train Machine Learning / AI system features for existing products and keep them up to date.
- Implementing the latest Machine Learning and Natural Language Processing algorithms to reduce the human effort required for product offerings through effective labeling, training, and keeping models up to date.
- Conducting cutting-edge research and experiments in machine learning, text classification, entity linking, entity extraction, and other related projects.
- Scaling machine learning and NLP projects to run against large datasets in virtualized environments.


**Our ideal candidate will meet the following requirements:**

- Experience implementing Machine Learning and Natural Language Processing solutions using a variety of modern techniques, including statistical models, linear regression, logistic regression, KNN, SVD, TF-IDF, Random Forest, SVM, Naive Bayes, and Decision Trees
- Experience with topic modeling, clustering, and general data wrangling to build training sets and/or look for patterns in data
- Self-learner, hacker, and technology advocate who can work on anything
- Bachelor’s degree or foreign equivalent in Computer Science, CIS, MIS, Engineering, or a related field required
- Minimum of 2 years of experience with the same job duties required
- Occasional travel required
- Proficient in Python and Jupyter as well as related data science libraries (such as scikit-learn, NLTK, spaCy, TensorFlow, fastText, BERT)
- Experience with MySQL, Elasticsearch, ESB, Hadoop, Spark, or other related data processing/database technologies
- Experience with web services using REST
- Coding experience in large distributed environments with multiple endpoints and complex interactions