Imagine a data pipeline that ingests millions of daily events, creating and updating hundreds of machine learning models a day. Sound like a challenge? It is, and we need your help!
At Meeshkan, we're hiring our first data-pipeline engineer to ensure the smooth delivery of our flagship machine-learning models to our clients. Your role will be to create, monitor, and enhance a serverless, Python-based data processing pipeline. The pipeline is deployed in production, has seven independent components, uses modern CI/CD practices, boasts a suite of end-to-end tests with ephemeral staging environments for each branch, and successfully processes millions of daily information packets with high throughput and low latency. At the end of the pipeline, we use scikit-learn to create models and deploy them across the various services that power our application.
Improve the testing, automation, deployment, and throughput of our data pipeline.
Help create and enforce uniform project quality standards, including configuration, security, linting, typing and formatting.
Ensure the smooth running and delivery of automated Selenium tests for our clients.
Spot blockages in the data pipeline, such as missing credentials or incorrect configuration, and proactively correct them.
3-4 hour working overlap with UTC +1
Experience in a scripting language: ideally Python, but bash, Perl or JavaScript experience is ok as well.
Experience maintaining, testing and debugging a high-throughput, low-latency data pipeline.
Optionally, experience working with ML models and toolkits like scikit-learn and spaCy.
Enjoy meeting and learning from clients and anchoring your work on their success!
We're Meeshkan, the world's first testing tool that uses AI to create end-to-end tests of software by analyzing real user behavior. Product managers use Meeshkan to make sure that they ship the best possible features while preserving existing functionality.
Your day-to-day work will be fairly independent, but the entire team is always available to collaborate or act as a sounding board. We are low on formalities and high on substance, and in addition to our operational team, we have a broad network of advisors and investors who are always willing to help out with ideation and introductions. We have generous office hours and vacation policies, but we are one of the most high-intensity and high-output companies in town due to the fast pace of our industry. Because of this, we are also hyper-sensitive to stress and wellbeing, and offer employees ample opportunities to recharge and feel good.
We are committed to fostering a diverse workplace, and we make it a special point to welcome candidates from underrepresented groups in tech companies. We know that, if you are part of one of these groups, you may have suffered from systemic industry-wide biases that make it more difficult for you to shine than other folks. We don't like that, and we are committed to evaluating every candidate on their own merits given their unique experiences.
Anyone doing machine learning in production knows that authoring and training models is 10% of the work. A solid pipeline that provides business value in a dynamic, rapidly changing environment is the other 90%. We currently have a team of four engineers splitting the responsibility of maintaining this pipeline, and it has become such a crucial part of our infrastructure that we're hiring a new full-time team member to have ownership over it. We'd love to see you apply!
This job comes with several perks and benefits