Data Scientist Solution Architect / Data Analyst

Location: McLean, VA 22102

Duration: 6+ months (with possible extension)

Description:

• The candidate for this position will provide analytical support to the Data Science Division in the Cyber, Cloud and Data Science Service line.

• The successful candidate will support the enterprise by designing solutions for data collection, preparation, and model building, and by developing end-to-end analytic lifecycles that synthesize actionable information.

• The candidate will determine appropriate tools and methods for specific projects to design the analytics solution, either as a standalone system or as analytics embedded inside an overall solution.

• The candidate will be working in teams that include enterprise architects, intelligence analysts, data and visualization experts, software developers, and system engineers, and will have an excellent opportunity to broaden skills.

• The candidate must have experience in developing and deploying solutions for customers.


Applicant must have skills applicable to one or more of the following areas:

• Data wrangling, cleansing, and analytics

• The data science process

• Presenting work to both technical and non-technical audiences

• Statistical evaluation

• Machine Learning, Predictive Modeling

Applicant should have skills in one or more of the following areas:

• Machine Learning technologies, such as Natural Language Processing (NLP) - e.g., Jaro-Winkler, Damerau-Levenshtein, Metaphone, string manipulation, etc.

• R libraries: base, MASS, plyr, rpart, randomForest, maps/mapproj/rworldmap, zoo, adabag, animation, ggplot, igraph, jsonlite, mclust, pROC, hexbin 


• Python libraries: numpy, scipy, matplotlib, scikit-learn, etc.

• SPSS, Oracle Data Miner, SAS Base, DataMiner, Dataflux, STAT

• Entity Resolution - Basis Technology Rosette Name Indexer (RNI), Global Name Recognition (GNR), Probabilistic Matching Engine, Trillium Software (TS) Quality

• Apache Hadoop 2.x, MapReduce, Elastic Search 1.4.x, Sqoop, Pig

• Familiarity with libraries such as ATS SSO, ATS-common framework, Highchart, jersey, jtidy, one2team, iText, Spring/Spring STS, JSON, Network Markov Clustering, Topic Modeling Tool, Naïve Bayes, Apache Commons, Google’s Guava, Apache Log4j, Open CSV, SecondString

• Working in interdisciplinary teams.

To discuss this in more detail, please contact:

Himanshu Prajapat

himanshu.prajapat(at)collabera.com

973-606-3290
