ShamFinder

@article{Suzuki2019ShamFinder,
  title={ShamFinder},
  author={Hiroaki Suzuki and Daiki Chiba and Yoshiro Yoneya and Tatsuya Mori and Shigeki Goto},
  journal={Proceedings of the Internet Measurement Conference},
  year={2019}
}
1 Citation

Mis-shapes, Mistakes, Misfits: An Analysis of Domain Classification Services

TLDR
This study empirically explores popular domain classification services, their methodologies, scalability limitations, label constellations, and their suitability to academic research as well as other practical applications such as content filtering, and concludes with actionable recommendations on their usage.

References


Detection Method of Homograph Internationalized Domain Names with OCR

TLDR
This work proposes a new method for detecting homograph IDNs using optical character recognition (OCR), focusing on the idea that homographs are visually similar to legitimate domain names, and leverages OCR techniques to recognize such similarities automatically.

Funny Accents: Exploring Genuine Interest in Internationalized Domain Names

TLDR
This paper explores IDNs that hold genuine interest, i.e. that owners of brands with diacritical marks may want to register and use, and sees that application behavior toward these IDNs remains inconsistent, hindering user experience and therefore widespread uptake of IDNs.

Clustering and the Weekend Effect: Recommendations for the Use of Top Domain Lists in Security Research

TLDR
It is found that the weekend effect in Alexa and Umbrella causes these rankings to change their geographical diversity between the workweek and the weekend, and up to 91% of ranked domains appear in alphabetically sorted clusters containing up to 87k domains of presumably equivalent popularity.

Needle in a Haystack: Tracking Down Elite Phishing Domains in the Wild

TLDR
A novel machine learning classifier is built to detect phishing pages from both the web and mobile pages under the squatting domains where the websites impersonate trusted entities not only at the page content level but also at the web domain level.

A Long Way to the Top: Significance, Structure, and Stability of Internet Top Lists

TLDR
It is found that top lists generally overestimate results compared to the general population by a significant margin, often even an order of magnitude, and some top lists have surprising change characteristics, causing high day-to-day fluctuation and leading to result instability.

Internationalizing Domain Names in Applications (IDNA)

TLDR
This document defines internationalized domain names (IDNs) and a mechanism called Internationalizing Domain Names in Applications (IDNA) for handling them in a standard fashion and allows the non-ASCII characters to be represented using only the ASCII characters already allowed in so-called host names today.

Parking Sensors: Analyzing and Detecting Parked Domains

TLDR
It is shown that users who land on parked websites are exposed to malware, inappropriate content, and elaborate scams, such as fake antivirus warnings and costly remote “technicians”.

Seven Months' Worth of Mistakes: A Longitudinal Study of Typosquatting Abuse

TLDR
It is revealed that, even though 95% of the popular domains the authors investigated are actively targeted by typosquatters, only a few trademark owners protect themselves against this practice by proactively registering their own typosquatting domains.

The homograph attack

Computing veterans remember an old habit of crossing zeros (Ø) in program listings to avoid confusing them with the letter O, in order to make sure the operator would type the program correctly into