Job details
Internship Artificial Language Processing Python Developer
Published at 03.04.2017 - Viewed: 1073 times - Nexedi SA in Lille, France
Nexedi is looking for a trainee for a six-month internship to develop an application for searching quickly and intelligently through a large codebase hosted on GitHub or GitLab, such as the more than 10 million lines of code maintained by Nexedi. The minimum outcome should be a code crawler that replicates the functionality of our old gitweb repository, with a number of additional features.
In addition, we are curious to know whether Natural Language Processing models can be applied to our codebase and what the results might be (we’ll call this “Artificial Language Processing”). We want to know whether it is possible to find some sort of structure or grammar in a codebase: for example, can a model identify properties, categories and data types instead of people, locations and organisations? The exact scope of this part of the traineeship is to be defined.
- Develop a search engine for code.
- Apply Natural Language Processing models to a programming language corpus.
- Try to find a grammar applicable to our codebase.
- Understand how to work with our Wendelin Big Data/Machine Learning stack.
- Learn how to analyse a programming language with natural language processing criteria.
- Passionate, self-driven
- Willingness to contribute to an open source ecosystem and the Free Software community
- Very good programming skills in Python
- Good software development skills (version control, testing, debugging)
- Good command of English