Unblind Your Apps: Predicting Natural-Language Labels for Mobile GUI Components by Deep Learning

Jieshan Chen, Chunyang Chen, Zhenchang Xing, Xiwei Xu, Liming Zhu, Guoqiang Li, Jinshui Wang
2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE)
According to the World Health Organization (WHO), approximately 1.3 billion people worldwide live with some form of vision impairment, of whom 36 million are blind. Because of their disability, including this minority in society is a challenging problem. The recent rise of smartphones provides a new solution by giving blind users convenient access to information and services for understanding the world. Users with vision impairment can adopt the screen reader…

Towards Better Semantic Understanding of Mobile Interfaces

This dataset augments images and view hierarchies from RICO, a large dataset of mobile UIs, with annotations for icons based on their shapes and semantics, and associations between different elements and their corresponding text labels, resulting in a significant increase in the number of UI elements and the categories assigned to them.

Data-driven accessibility repair revisited: on the effectiveness of generating labels for icons in Android apps

It is found that icon images are insufficient in representing icon labels, while other sources of information from the icon usage context can enrich images in determining proper tokens for labels, and the first context-aware label generation approach, called COALA, is proposed.

Towards Complete Icon Labeling in Mobile Applications

The icon types supported by this work cover 99.5% of collected icons, improving on the previously highest 78% coverage in icon classification work and verifying the usefulness of the generated icon labels.

UI Obfuscation and Its Effects on Automated UI Analysis for Android Apps

  • Hao Zhou, Ting Chen, Wei Zhang
  • Computer Science
  • 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)
  • 2020
This work points out the weaknesses in existing automated UI analysis methods, designs nine UI obfuscation approaches, and implements them in a new tool named UIObfuscator, which reveals limitations of automated UI analysis and sheds light on app protection techniques.

SemCluster: a semi-supervised clustering tool for crowdsourced test reports with deep image understanding

This paper proposes SemCluster, a semi-supervised clustering tool for crowdsourced test reports with deep image understanding, which makes the most of the semantic connection between textual descriptions and screenshots by constructing semantic binding rules and performing semi-supervised clustering.

Cross-device record and replay for Android apps

This paper demonstrates that cross-device record and replay can be made simple and practical with a one-pass, greedy algorithm in the Rx framework that leverages the least-surprise principle in GUI design.


  • Sidong Feng, Chunyang Chen
  • Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings
  • 2022

Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding

This work presents Pix2Struct, a pretrained image-to-text model for purely visual language understanding, which can be used on tasks containing visually-situated language, and shows that a single pretrained model can achieve state-of-the-art results in six out of nine tasks across four domains: documents, illustrations, user interfaces, and natural images.

Too Much Accessibility is Harmful! Automated Detection and Analysis of Overly Accessible Elements in Mobile Apps

OverSight, an automated framework that leverages these conditions to detect overly accessible elements and verifies their accessibility dynamically using an assistive technology (AT), is presented; it proves effective in detecting previously unknown security threats, workflow violations, and accessibility issues.

Groundhog: An Automated Accessibility Crawler for Mobile Apps

An automated accessibility crawler for mobile apps, Groundhog, is proposed that explores an app with the purpose of finding accessibility issues without any manual effort from developers and is highly effective in detecting accessibility barriers that existing techniques cannot discover.



Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics

The results show that automatic evaluation using unigram co-occurrences between summary pairs correlates surprisingly well with human evaluations, based on various statistical metrics, while direct application of the BLEU evaluation procedure does not always give good results.
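
To make "unigram co-occurrences between summary pairs" concrete, here is a minimal sketch of a unigram-overlap recall score between a candidate summary and a reference summary. It assumes lowercased whitespace tokenization and a single reference; the function name `unigram_recall` is hypothetical, and this is a simplified illustration rather than the exact procedure evaluated in the paper.

```python
from collections import Counter


def unigram_recall(candidate: str, reference: str) -> float:
    """Fraction of reference unigrams that also occur in the candidate.

    Counts are clipped: a word is credited at most as many times as it
    appears in the reference. Tokenization is a plain whitespace split,
    which is an assumption made for this sketch.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    if not ref_counts:
        return 0.0  # no reference words to match against
    overlap = sum(min(n, cand_counts[word]) for word, n in ref_counts.items())
    return overlap / sum(ref_counts.values())


if __name__ == "__main__":
    score = unigram_recall("the cat sat on the mat", "the cat is on the mat")
    print(f"unigram recall: {score:.3f}")
```

The score is recall-oriented: it asks how much of the reference summary the candidate covers, whereas BLEU is built around n-gram precision.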

Seenomaly: Vision-Based Linting of GUI Animation Effects Against Design-Don't Guidelines

This work proposes an unsupervised, computer-vision-based adversarial autoencoder that learns to group similar GUI animations by “seeing” lots of unlabeled real-application GUI animations and learning to generate them, and builds datasets of synthetic and real-world GUI animations.

GUI-Squatting Attack: Automated Generation of Android Phishing Apps

This article proposes a new attack technique, the GUI-Squatting attack, which automatically and effectively generates phishing apps (phapps) on the Android platform and adopts image processing and deep learning algorithms to enable powerful, large-scale attacks.

Gallery D.C.: Design Search and Knowledge Discovery through Auto-created GUI Component Gallery

Through a process of invisible crowdsourcing, Gallery D.C. supports novel ways for designers to collect, analyze, search, summarize and compare GUI designs on a massive scale and offers additional support for design sharing and knowledge discovery beyond existing platforms.

MobiDroid: A Performance-Sensitive Malware Detection System on Mobile Platform

MobiDroid, an effective Android malware detection system that leverages deep learning to provide a real-time, secure, and fast-response environment on Android devices, is proposed, and its performance with various feature categories is evaluated.

Domain-specific machine translation with recurrent neural network for software localization

The results show that the proposed neural-network-based translation model outperforms the general machine translation tool, Google Translate, and generates more acceptable translations for software localization with less need for human revision.

On Information and Sufficiency

Mining Likely Analogical APIs Across Third-Party Libraries via Large-Scale Unsupervised API Semantics Embedding

This work presents an unsupervised deep learning based approach to embed both API usage semantics and API description semantics into vector space for inferring likely analogical API mappings between libraries.

A Neural Model for Method Name Generation from Functional Description

A neural network is proposed that directly generates readable method names from natural-language descriptions, handling the explosion of vocabulary in large repositories and transferring the knowledge learned from large repositories to a specific project.