Finanziamento della Ricerca di Ateneo (FRA) 2020 (Tipo B)
Software security plays a crucial role in today's society. Consider how software developers constantly patch and update their products, sometimes releasing updates mere days after the initial release. They do so because virtually every piece of software contains vulnerabilities that cybercriminals can find and take advantage of or, in other words, exploit. The development of exploits is not only of interest to malicious actors. Well-intentioned actors, such as penetration testers, ethical hackers, researchers, and computer security teams, also develop exploits, referred to as proof-of-concept (PoC) exploits, not to cause harm but to reveal security weaknesses in the software. These exploits enable them to understand how attackers could take advantage of a weakness, and they help vendors and users patch vulnerabilities and protect themselves against attacks. The goal of this project is to automatically generate a specific type of software exploit, named shellcodes, from natural language (i.e., from descriptions written in English prose) by leveraging Neural Machine Translation (NMT). This goal is very ambitious: to the best of our knowledge, no existing work in the literature brings together NMT and the field of software security in order to generate software exploits. Alongside the project proposal, we carried out a series of preliminary experiments showing that the objective of the project, besides being ambitious, is feasible and could open the way to a new generation of offensive security methods.