CS 561 Final Project
AI vs Human Writing Detection
Find out whether your text was written by a human or by AI
Project Context
Background + Problem
This project uses NLP text classification to identify whether a text was written by a human or generated by an AI model, addressing concerns around education, academic integrity, and content authenticity.
Dataset
Kaggle ai-and-human-text-dataset with 6,069 samples. Labels: 0 = Human and 1 = AI.
Methods
Text is lowercased, tokenized, and cleaned with stop-word removal. TF-IDF then converts the cleaned text into numeric features for the baseline, and the TF-IDF baseline is compared against BERT.
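The preprocessing and TF-IDF steps can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual code: the stop-word list, the regex tokenizer, the smoothed IDF formula, and the two sample sentences are all assumptions made for the example. In practice a library vectorizer such as scikit-learn's TfidfVectorizer would likely be used instead.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the project's list is not specified).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "of", "and", "or", "to", "in", "by", "this"}


def preprocess(text):
    """Lowercase the text, tokenize on alphabetic runs, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf(docs):
    """Return (sorted vocabulary, one {term: weight} dict per document).

    tf  = term count / document length
    idf = ln((1 + N) / (1 + df)) + 1   (a smoothed variant, chosen for this sketch)
    """
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return sorted(idf), vectors


# Hypothetical sample texts, not taken from the dataset.
docs = [
    "The essay was written by a student.",
    "This essay is generated by a language model.",
]
vocab, vectors = tfidf(docs)
```

Terms that appear in every document, such as "essay" here, receive a lower weight than terms unique to one document, which is what lets a downstream classifier focus on distinguishing vocabulary.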
Evaluation
We evaluate each model with accuracy, precision, recall, and F1 to compare the performance of the baseline ML approach against the more advanced BERT model.
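For reference, the four metrics can be computed from predicted and true labels as in the sketch below. It assumes the dataset's label convention (0 = Human, 1 = AI) and treats AI as the positive class, which is an assumption of this example. The function name and the sample labels are hypothetical and are not project results.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only (0 = Human, 1 = AI).
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Reporting precision and recall alongside accuracy matters here because the two error types have different costs: a false positive flags human writing as AI, while a false negative lets AI text pass as human.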