Data Scientist








Smarsh is the leader in communications compliance, archiving, and analytics. We provide compliance across the broadest set of communications channels, with insights on what’s being captured. Smarsh customers manage over 500 million daily conversations across 80 channels and growing. Customers include the top 10 U.S., top 8 European, top 5 Canadian, and top 3 Asian banks. The Smarsh advantage is that customers stay ahead of compliance and uncover patterns and relationships hidden within their data.

At Smarsh, we’ve been helping our customers manage new forms of communication since 1998. We work closely with regulators, including the SEC, FINRA, IIROC, PRA, and FCA, and with our customers to ensure they understand the capabilities of today’s technology and that our platform meets their most stringent requirements. Our products include Connected Capture, Connected Archive, Web Archive & Business Solutions.

We recently acquired Digital Reasoning, the global leader in natural language processing (NLP), artificial intelligence (AI), and machine learning (ML). The addition of Digital Reasoning’s capabilities and expertise will further enable Smarsh customers to spot risks before they happen, maximize the scalability of supervision teams, and uncover strategic insights from large volumes of data in real time.


About the role

As a Data Scientist, you will play a key role in designing and building POC and production solutions that leverage state-of-the-art machine learning for communication intelligence. Working within our industry-renowned Applied Machine Learning group, you will have the opportunity to work with leading AI researchers to bring the cutting edge of machine learning to our clients through rigorous, scalable, and innovative data science practice. As a senior team member, you will be a company ambassador and representative during our client engagements, where you will provide expert guidance to junior team members and clients alike. You will also be responsible for identifying opportunities for data science process optimisation and machine learning innovation. The ideal candidate will possess a rare set of skills, including exceptional technical ability in data science and engineering, deep knowledge of machine learning concepts, and excellent communication skills.


Your Responsibilities 


  • Design and execution of the data science and analytics portion of client-facing POC and production delivery projects
  • Application of state-of-the-art machine learning methodology to unstructured data
  • Development of annotated data sets for supervised machine learning model development
  • Supervised machine learning model development, including training data development, exploratory data analysis, hyperparameter optimisation, and threshold analysis, all following data science best practice
  • Machine learning model failure-state analysis
  • Working with clients to define project success criteria in the context of machine learning 
  • Working with product management to design and innovate machine learning solutions for emerging commercial opportunities 
  • Rigorous validation of emerging machine learning research for application within our domain, strictly adhering to the scientific method, in which well-defined hypotheses are core to delivering actionable insights
  • Mentorship of and collaboration with other team members in the pursuit of data science excellence 


Your Competencies


  • Deep knowledge of Natural Language Processing and Machine Learning (supervised and unsupervised) 
  • Extensive experience in the use of data mining tools and techniques on unstructured and structured data, including dimensionality reduction, clustering, classification, regression, (deep) neural networks and transformers 
  • Strong knowledge of core statistics (probability distributions, statistical tests, permutation testing, etc.)
  • Professional use of statistical testing for validation of solution performance (P/R, CI, AUROC etc.) 
  • Knowledge of Neural Networks and Machine Learning principles 
  • Excellent written and verbal English communication skills 
  • Strong collaborative mindset, with the ability to motivate and drive group progress toward a common objective 


Required Experience and Education 


  • Bachelor’s degree in Computer Science, Applied Math, Statistics, or a scientific field. Post-graduate (Masters or Doctorate) degree or strong industry track record preferred 
  • Strong programming skills in Python
  • Minimum of 5 years of experience as a professional Data Scientist (credit for relevant academic experience will be given)
  • Minimum of 2 years of experience working with NLP and text analytics
  • Minimum of 1 year of experience working in a client-facing role
  • Experience communicating technical material to non-technical audiences


Preferred Experience and Education 


  • Programming experience in one or more of the following languages: R, C++, Java, Scala (Spark), Groovy
  • Familiarity with one or more data science and machine/deep learning frameworks and tools, including scikit-learn, H2O, Keras, PyTorch, TensorFlow, pandas, and NumPy
  • Comfort working within a Linux environment and with shell scripting
  • Knowledge of “Big Data” frameworks such as Hadoop, Spark, and Kafka
  • Knowledge of NLP toolkits and libraries such as NLTK, spaCy, Gensim, and Hugging Face
  • Knowledge of NLP transfer learning, including word embedding models (GloVe, fastText, word2vec, etc.) and transformer models (BERT, SBERT, GPT-x, etc.)
  • Knowledge of MLOps and related technologies
  • Knowledge of microservices architecture and continuous delivery concepts in machine learning, and of related technologies such as Helm, Docker, and Kubernetes
  • Experience working with ODBC and NoSQL database technologies
  • Experience working in cloud computing environments, including GCP, AWS, and Azure

Hiring Process
Our hiring process is designed to understand your core competencies and fit for our team, without making you jump through arbitrary hoops. We are dedicated to making the process as efficient for you as possible.

1. Application and CV review
2. Phone interview with Hiring Manager (45 min)
3. NLP code test
4. Panel interview (1 hr)
5. Decision

Why Smarsh?

Ready to join a thriving tech company that’s redefining digital archiving and business intelligence?

Smarsh is the leading comprehensive archiving platform. Recognized as one of today’s fastest-growing companies in the U.S., Smarsh delivers innovative cloud-based solutions that help organizations manage and enforce flexible and secure records retention and compliance strategies for electronic communications, including social media and enterprise social networks (Yammer, Chatter, Facebook, LinkedIn, and more).

Our motto is ‘People First. Inspire Confidence. Embrace the Impossible.’ We hire lifelong learners who have a passion for their discipline and a track record of excellence. To learn more about us, visit