(also available as a 3-6 month traineeship)
In 2014, Nexedi developed a technology called Wendelin.core, which provides out-of-core Python ndarrays that can be shared transparently across the nodes of a cluster of Python runtimes. With Wendelin.core, Python can be used natively for big data without relying on other languages or runtimes. Wendelin.core is already used in production, for example to monitor offshore wind turbines and detect anomalies. Because most use cases of Wendelin.core involve third-party libraries such as NumPy or scikit-learn, whose methods are “not aware” of available memory, a key challenge for us is to ensure that the libraries we deploy in production perform well under heavy data loads.
Nexedi is looking for a candidate interested in improving the libraries used by Wendelin and wendelin.core (mostly NumPy, scikit-learn to some degree, and others depending on the specific implementation) to reduce the number of memory allocations and copies they make internally. This task may require modifying default algorithms that allocate arrays, or replacing them with algorithms that modify data in place. It may also require explicitly allocating out-of-core ndarrays whenever there is no better way, and contributing any changes back upstream so that the improved libraries benefit the community.
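To give a rough idea of the kind of change involved, here is a minimal sketch using plain NumPy rather than wendelin.core itself. The function names are hypothetical, and the example assumes a floating-point array. It contrasts an expression that allocates full-size temporaries with an equivalent that writes results back into the input through the `out=` argument of NumPy ufuncs.

```python
import numpy as np


def normalize_allocating(x):
    # (x - mean) allocates one full-size temporary, and the division
    # allocates a second full-size array for the result.
    return (x - x.mean()) / x.std()


def normalize_in_place(x):
    # Assumes x is a floating-point array; the input is overwritten.
    # The statistics are computed first, because they must describe the
    # original data. Note that x.std() may itself allocate temporaries
    # internally, which is the kind of hidden cost this work targets.
    mean = x.mean()
    std = x.std()
    np.subtract(x, mean, out=x)  # result written into x, no new array
    np.divide(x, std, out=x)     # result written into x, no new array
    return x


data = np.arange(10, dtype=np.float64)
normalize_in_place(data)  # data now holds the normalized values
```

The in-place version changes the calling contract, since the caller's array is modified, so a change like this would normally be offered as an option rather than as a silent replacement of the default behaviour.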