# Scholscan

Filters academic articles using TF-IDF on titles plus logistic regression.

## Build

```
go build -o scholscan .
```

## Usage

```
# Train model from articles you like
./scholscan train positives.jsonl --rss-feeds feeds.txt > model.json

# Score new RSS feed
./scholscan scan --url RSS_URL --model model.json > results.jsonl

# Run web server
./scholscan serve --port 8080 --model model.json --rss-world rss_world.txt
```

## Endpoints

- GET `/` - redirect to live feed
- GET `/live-feed` - filtered articles web UI
- GET `/tools` - score individual articles
- POST `/score` - API for scoring titles
- POST `/scan` - API for scanning RSS
- GET `/api/filtered/feed` - JSON feed
- GET `/api/filtered/rss` - RSS feed
- GET `/api/health` - health check

## Model settings

- TF-IDF: unigrams + bigrams, MinDF=2, MaxDF=0.8
- Logistic regression: λ=0.001, L2 regularization
- Class balancing: downsample majority to 1:1 ratio
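
To make the model settings concrete, here is a minimal Go sketch of how a title could be turned into unigram + bigram TF-IDF features and scored with a logistic function. It is not Scholscan's actual code: the function names (`ngrams`, `buildIDF`, `score`), the whitespace tokenizer, the smoothed IDF formula, the reading of MaxDF as a fraction of the corpus, and the example weights are all assumptions made for illustration. Training (including the L2 penalty and class balancing) is not shown.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// ngrams lowercases a title and returns its unigrams followed by its bigrams.
// Splitting on whitespace is an assumption; the real tokenizer may differ.
func ngrams(title string) []string {
	words := strings.Fields(strings.ToLower(title))
	terms := append([]string{}, words...)
	for i := 0; i+1 < len(words); i++ {
		terms = append(terms, words[i]+" "+words[i+1])
	}
	return terms
}

// buildIDF keeps terms that appear in at least minDF titles and in at most
// a maxDF fraction of all titles, and returns a smoothed IDF for each one.
// The smoothing formula is an assumed, commonly used variant.
func buildIDF(titles []string, minDF int, maxDF float64) map[string]float64 {
	df := map[string]int{}
	for _, t := range titles {
		seen := map[string]bool{}
		for _, term := range ngrams(t) {
			if !seen[term] {
				seen[term] = true
				df[term]++
			}
		}
	}
	n := float64(len(titles))
	idf := map[string]float64{}
	for term, count := range df {
		if count < minDF || float64(count)/n > maxDF {
			continue
		}
		idf[term] = math.Log((1+n)/(1+float64(count))) + 1
	}
	return idf
}

// score returns sigmoid(bias + sum of weight * tf * idf) for one title.
// Terms outside the vocabulary are ignored; vectors are not length-normalized.
func score(title string, idf, weights map[string]float64, bias float64) float64 {
	tf := map[string]float64{}
	for _, term := range ngrams(title) {
		if _, ok := idf[term]; ok {
			tf[term]++
		}
	}
	z := bias
	for term, count := range tf {
		z += weights[term] * count * idf[term]
	}
	return 1 / (1 + math.Exp(-z))
}

func main() {
	titles := []string{
		"Deep learning for protein folding",
		"Protein folding with graph networks",
		"Deep learning survey",
		"Market report for investors",
	}
	idf := buildIDF(titles, 2, 0.8) // MinDF=2, MaxDF=0.8 as listed above

	// Hypothetical weights; a trained model.json would supply real ones.
	weights := map[string]float64{"protein folding": 1.5, "deep learning": 0.5}
	fmt.Printf("%.3f\n", score("Protein folding benchmarks", idf, weights, -1.0))
}
```

With MinDF=2, a term seen in only one training title (such as "survey" here) never enters the vocabulary, so it cannot influence the score of a new title.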