The software development and cybersecurity communities have become painfully aware that modern software package registries, repositories of free (for the user) source code such as the Python Package Index (PyPI), are high-value targets susceptible to typosquatting, one form of software supply chain attack. A 2016 undergraduate thesis by Nikolai Tschacher demonstrates the viability of this attack vector. After creating software packages with names that mimic popular package names (i.e., typosquatting) and uploading the ersatz packages to popular package repositories including PyPI, Tschacher observed over 17,000 different computers downloading and executing his code, code that could have been malicious. (Military readers: this is not just a civilian hazard. [...] .mil domain!) More recent analysis by HashiCorp's William Bengston, who has defensively typosquatted thousands of PyPI package names to prevent typosquatting against popular packages, offers an even more cautionary tale: his anti-typosquatting packages were downloaded over 540,000 times over the past couple of years, downloads that, once again, could have caused widespread harm.

A relatively less researched area (except one related analysis) concerns patterns of actual typosquatting examples on PyPI.

How many total known instances of typosquatting on PyPI are there?
Do typosquatting packages only squat on the most downloaded packages?
To what extent can edit distance algorithms detect typosquatting?

To answer these questions, this post uses a novel dataset of typosquatting attacks found on PyPI from 2017 to 2020 and, borrowing a page from the information security metrics community, presents an analysis of the frequency and nature of typosquatting on PyPI. We hope that answers to these questions aid the ecosystem integrity and namespace management efforts of the PyPI package manager community, along with parties interested in open source software supply chain security, such as the Linux Foundation.

And for those who simply want to know the main finding: typosquatting attacks are about much more than typos! Typosquatters appear to prey both on users who misspell a package name and on users who are confused about which package they want to download. While initial PyPI typosquatting defenses should probably focus on misspelling attacks, defenders will eventually need to address this second, arguably more devious, form of typosquatting.

How Many Typosquatting Attacks Have There Been On PyPI?

Drawing on public reporting and our own efforts at finding typosquatters, we found 40 typosquatting attacks against PyPI users between 2017 and 2020 (Figure 1). We define typosquatting as a package uploaded to PyPI that has a name similar to another existing package, [...]. The actual number of typosquatters is likely higher given that this definition relies on known instances of typosquatting.

Are There Sub-types of Typosquatting?

An examination of the 40 PyPI typosquatting attacks suggests that there are at least two broad attack categories. The most obvious sub-type is misspelling attacks. These attacks take advantage of typos made by users when they try to download a package. For example, a package called ‘urlib3’ sought to mimic the popular ‘urllib3’ package. Confusion attacks, in contrast, do not depend on the victim misspelling a package name. Instead, confusion attacks prey on user uncertainty about the correct name of the desired package.
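To make the edit distance question concrete, here is a minimal Python sketch of how a misspelling such as ‘urlib3’ could be flagged against a list of popular package names. It is an illustration under stated assumptions, not the method used in this post: the watchlist `POPULAR_PACKAGES`, the helper `possible_misspellings`, the threshold of one edit, and the name ‘python-urllib3’ are all hypothetical choices made for the example.

```python
from __future__ import annotations


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


# Hypothetical watchlist; a real check would draw on actual download rankings.
POPULAR_PACKAGES = ["urllib3", "requests", "numpy"]


def possible_misspellings(candidate: str, max_distance: int = 1) -> list[str]:
    """Popular names within max_distance edits of candidate (exact matches excluded)."""
    name = candidate.lower()
    return [p for p in POPULAR_PACKAGES if 0 < levenshtein(name, p) <= max_distance]


print(possible_misspellings("urlib3"))          # ['urllib3']
print(possible_misspellings("python-urllib3"))  # []
```

The second call hints at why edit distance alone may fall short: a made-up name like ‘python-urllib3’ sits many edits away from ‘urllib3’, so a small threshold never flags it, even though a user unsure of the correct name might plausibly install it.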