This front page adapts content from our legacy website http://TrustworthyMachineLearning.org/ and introduces updates to a suite of tools we have designed for making deep learning secure and trustworthy. The project provides toolboxes for five main tasks (organized as entries in the navigation menu). Please feel free to email me if you find any typos.



Scope of problems our tools aim to tackle

Deep learning-based natural language processing (deep NLP) plays a crucial role in many security-critical domains, advancing information understanding and analysis in healthcare, legal justice, e-commerce, social media platforms, and more. Consequently, it is essential to understand the robustness of deep NLP systems against adaptive adversaries. We introduce techniques to automatically evaluate and improve the adversarial robustness of deep NLP frameworks. This is a new and exciting area requiring expertise and close collaboration across multiple disciplines, including adversarial machine learning, natural language processing, and software testing.

Important tasks

At the junction of NLP, deep learning, and computer security, we build toolboxes for five main tasks, as shown in the table below. Our system aims to let an NLP designer understand how their NLP system's performance degrades under evasion attacks, enabling better-informed and more secure design choices. The framework is general and scalable, and takes advantage of the latest advances in deep NLP and computer security.

(Figure: project timeline)

We categorize these topics into subtasks and list our selected works in the following table:

| No. | Tool Category | Paper Title | Venue | Software |
|-----|---------------|-------------|-------|----------|
| 1 | Evade NLP Machine Learning | TextAttack: A Framework for Adversarial Attacks in Natural Language Processing | EMNLP 2020 | GitHub |
| 2 | Evade Machine Learning | Automatically Evading Classifiers: A Case Study on PDF Malware Classifiers | NDSS 2016 | GitHub |
| 3 | Evade NLP Machine Learning | Black-box Generation of Adversarial Text Sequences to Fool Deep Learning Classifiers | DeepSecure Workshop 2018 | GitHub |
| 4 | Detect Adversarial Attacks | Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks | NDSS 2018 | GitHub |
| 5 | Defend against Adversarial Attacks | DeepCloak: Masking Deep Neural Network Models for Robustness against Adversarial Samples | ICLR Workshop 2017 | GitHub |
| 6 | Visualize Adversarial Attacks | Adversarial-Playground: A Visualization Suite for Adversarial Samples | VizSec 2017 | GitHub |
| 7 | Theorems of Adversarial Examples | A Theoretical Framework for Robustness of (Deep) Classifiers Against Adversarial Samples | ICLR Workshop 2017 | |
| 8 | Trustworthy via Interpretation | Deep Motif Dashboard: Visualizing and Understanding Genomic Sequences Using Deep Neural Networks | ICLR Workshop 2017 | |
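To make the evaluation workflow concrete, here is a minimal sketch of running TextAttack (row 1 above) against a sentiment classifier with the TextFooler attack recipe. It assumes the `textattack` and `transformers` packages are installed; the checkpoint name `textattack/bert-base-uncased-imdb` and the specific recipe and arguments are illustrative choices, not the only supported ones.

```python
# pip install textattack transformers
import transformers
from textattack import Attacker, AttackArgs
from textattack.attack_recipes import TextFoolerJin2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

# Load a BERT sentiment classifier fine-tuned on IMDB (an illustrative
# checkpoint; any sequence classification model can be wrapped the same way).
name = "textattack/bert-base-uncased-imdb"
model = transformers.AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = transformers.AutoTokenizer.from_pretrained(name)
model_wrapper = HuggingFaceModelWrapper(model, tokenizer)

# Build the TextFooler attack recipe and run it on a handful of test examples;
# the Attacker reports clean vs. under-attack accuracy and attack success rate.
attack = TextFoolerJin2019.build(model_wrapper)
dataset = HuggingFaceDataset("imdb", split="test")
attacker = Attacker(attack, dataset, AttackArgs(num_examples=10))
attacker.attack_dataset()
```

The same experiment roughly corresponds to the one-line CLI form `textattack attack --recipe textfooler --model bert-base-uncased-imdb --num-examples 10`.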
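Similarly, here is a rough sketch of the core idea behind Feature Squeezing (row 4): compare a model's predictions on an input before and after "squeezing" it (for example, reducing color bit depth) and flag inputs whose predictions move too much. The `model_probs` callable and the `threshold` value below are hypothetical placeholders; the paper combines several squeezers and selects the threshold on validation data.

```python
import numpy as np

def bit_depth_squeeze(x: np.ndarray, bits: int = 4) -> np.ndarray:
    """One squeezer from the paper: reduce color bit depth.
    Assumes pixel values lie in [0, 1]."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def squeezing_score(model_probs, x: np.ndarray) -> float:
    """L1 distance between the model's softmax outputs on the original
    and the squeezed input; adversarial examples tend to score high."""
    return float(np.abs(model_probs(x) - model_probs(bit_depth_squeeze(x))).sum())

def is_adversarial(model_probs, x: np.ndarray, threshold: float = 1.0) -> bool:
    """Flag x as adversarial when the score exceeds a threshold
    (hypothetical value here; chosen on validation data in the paper)."""
    return squeezing_score(model_probs, x) > threshold
```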

Contact

Have questions or suggestions? Feel free to reach out on Twitter or by email.

Thanks for reading!

Dr. Qi’s Invited Talks on TextAttack


On June 24th, 2021, I gave an invited talk at the Science Academy Machine Learning Summer School on “TextAttack: Generalizing Adversarial Examples to Natural...

Best Paper Award for Deep Motif Dashboard


Jack’s DeepMotif paper (Deep Motif Dashboard: Visualizing and Understanding Genomic Sequences Using Deep Neural Networks) has received the “best paper awar...