Honors Thesis Archive

Author: Lily Pederson
Title: Machine Learning and Classifying Phishing Emails
Department: Mathematics & Computer Science
Advisor: Sunday Ngwobia
Year: 2024
Honors: University Honors
Full Text: View Thesis (510 KB)
Abstract: This thesis investigates whether machine learning can correctly classify phishing emails and separate them from regular emails. The Enron corpus was cleaned using several data-processing methods, including regular expressions, lemmatization, and stop word removal. Applying term frequency-inverse document frequency (TF-IDF) weighting and KMeans clustering yielded successful results.
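
The pipeline named in the abstract (regular-expression cleaning, stop word removal, TF-IDF weighting, KMeans clustering) can be illustrated with a minimal standard-library sketch. This is not the thesis's implementation: the four sample emails, the tiny stop word list, the function names, and the hand-picked initial centroids are all assumptions made for illustration, and lemmatization is omitted because it needs an external library such as NLTK. A real experiment on the Enron corpus would more likely use scikit-learn's TfidfVectorizer and KMeans.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "at", "for", "in", "is", "of", "on", "the", "to", "your"}


def preprocess(text):
    """Lowercase, strip URLs and non-letters with regular expressions, drop stop words."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    text = re.sub(r"[^a-z\s]", " ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalized TF-IDF vectors) for tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1 for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = [counts[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def kmeans(vectors, k, init_indices, iterations=20):
    """Plain Lloyd's algorithm; initial centroids are copies of the chosen vectors."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, centroids[c])))
            for vec in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [vec for vec, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up example emails (not taken from the Enron corpus).
emails = [
    "Verify your account password now at http://example.com/login",
    "Urgent: verify your password to restore account access",
    "Meeting agenda for the quarterly budget review",
    "Quarterly budget meeting moved to Friday",
]

docs = [preprocess(e) for e in emails]
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2, init_indices=(0, 2))
print(labels)  # [0, 0, 1, 1]
```

In this toy data the two phishing-style messages share no vocabulary with the two business messages, so they separate cleanly. KMeans is unsupervised, so the cluster numbers carry no meaning on their own; deciding which cluster corresponds to phishing still requires inspecting the members or comparing against labeled examples.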
